Computing the similarity of adjacent video frames with the dHash perceptual hash

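A friend wanted to use Python to read video frames and spot signs of editing from the similarity between consecutive frames; in the end it turned out not to be much use......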

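The idea is to walk through the video frame by frame: preprocess each frame (convert it to grayscale and shrink it to 9x8), compute its d_hash, take the Hamming distance between the hashes of adjacent frames, and turn that distance into a similarity score. If the similarity falls below a threshold, the position is recorded (the timestamp, for a video).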
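To make the hash, distance, and similarity steps concrete, here is a minimal pure-Python sketch on hand-made 8x9 grayscale grids. The grids (`falling`, `rising`, `mostly_falling`) and the `similarity` helper are made up for illustration; real frames would come from the video after resizing.

```python
# Illustrative sketch: dHash on tiny hand-made grayscale grids (no video needed).

def d_hash(pixels):
    """Row-wise difference hash: 1 where a pixel is brighter than its right neighbour."""
    return [
        1 if row[j] > row[j + 1] else 0
        for row in pixels
        for j in range(len(row) - 1)
    ]

def hamming_distance(hash1, hash2):
    """Number of positions where the two hashes differ."""
    return sum(1 for x, y in zip(hash1, hash2) if x != y)

def similarity(hash1, hash2):
    """Map the Hamming distance to a score in [0, 1]."""
    return 1.0 - hamming_distance(hash1, hash2) / len(hash1)

# Hypothetical 8x9 frames: brightness falling left to right, and rising left to right
falling = [[90 - 10 * j for j in range(9)] for _ in range(8)]
rising = [[10 * j for j in range(9)] for _ in range(8)]
# Same as `falling`, except that the first row is reversed
mostly_falling = [list(reversed(falling[0]))] + falling[1:]

h_falling = d_hash(falling)
h_rising = d_hash(rising)
h_mostly = d_hash(mostly_falling)

print(similarity(h_falling, h_falling))  # 1.0   (identical frames)
print(similarity(h_falling, h_rising))   # 0.0   (all 64 bits differ)
print(similarity(h_falling, h_mostly))   # 0.875 (8 of 64 bits differ)
```

Each 8x9 grid yields 8 x 8 = 64 bits, which is why the scripts below divide the distance by 64.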

Then it occurred to me: hey, this might also be handy for GIF steganography challenges.

So I'm putting the code here.

1. Video version

import cv2
import numpy as np
from PIL import Image

# Constants
VIDEO_FILE = "faker.mp4"
FRAME_PREFIX = "frame"  # not used by this script
FAIL_THRESHOLD = 0.77   # flag frame pairs whose similarity is below this

def d_hash(image):
    """Calculate the difference hash for the given image."""
    hash_bits = []
    for i in range(8):
        for j in range(8):
            hash_bits.append(1 if image[i, j] > image[i, j + 1] else 0)
    return hash_bits

def hamming_distance(hash1, hash2):
    """Calculate the Hamming distance between two hashes."""
    return sum(1 for x, y in zip(hash1, hash2) if x != y)

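def process_frame(frame):
    """Convert a BGR frame from OpenCV to a 9x8 grayscale float array."""
    # OpenCV returns BGR, so convert to grayscale here rather than letting
    # PIL interpret the channels as RGB.
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    return np.array(Image.fromarray(gray).resize((9, 8), Image.LANCZOS), 'f')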

def main():
    video = cv2.VideoCapture(VIDEO_FILE)
    edit_detected = False
    previous_hash = None

    while True:
        # Read one frame at a time so that every adjacent pair is compared
        # (reading two frames per iteration would skip pairs 1-2, 3-4, ...)
        success, frame = video.read()
        if not success:
            break

        # Process the frame and calculate its hash
        current_hash = d_hash(process_frame(frame))

        if previous_hash is not None:
            # Calculate distance and similarity to the previous frame
            distance = hamming_distance(previous_hash, current_hash)
            similarity = 1.0 - distance / 64.0

            # Check similarity against the threshold
            if similarity < FAIL_THRESHOLD:
                msec = video.get(cv2.CAP_PROP_POS_MSEC)
                minute, second = divmod(int(msec // 1000), 60)
                print(f"{minute} minute {second} second detected with similarity {similarity}")
                edit_detected = True

        previous_hash = current_hash

    video.release()

    if not edit_detected:
        print("No edit detected.")

if __name__ == "__main__":
    main()

2. GIF version

For a GIF, a frame whose similarity to the previous frame is too low is reported by its frame index rather than by a timestamp.

from PIL import Image, ImageSequence
import numpy as np

# Constants
GIF_FILE = "aaa.gif"
FRAME_PREFIX = "frame"  # not used by this script
FAIL_THRESHOLD = 0.77   # flag frames whose similarity to the previous one is below this

def d_hash(image):
    """Calculate the difference hash for the given image."""
    hash_bits = []
    for i in range(8):
        for j in range(8):
            hash_bits.append(1 if image[i, j] > image[i, j + 1] else 0)
    return hash_bits

def hamming_distance(hash1, hash2):
    """Calculate the Hamming distance between two hashes."""
    return sum(1 for x, y in zip(hash1, hash2) if x != y)

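def process_frame(frame):
    """Convert a GIF frame to a 9x8 grayscale float array."""
    # Convert to grayscale before resizing so that LANCZOS works on real
    # pixel values rather than on palette indices of a 'P'-mode frame.
    return np.array(frame.convert('L').resize((9, 8), Image.LANCZOS), 'f')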

def main():
    gif = Image.open(GIF_FILE)
    frames = [frame.copy() for frame in ImageSequence.Iterator(gif)]
    edit_detected = False
    previous_hash = None

    for index, frame in enumerate(frames):
        # Process frame
        frame_processed = process_frame(frame)

        # Calculate hash
        current_hash = d_hash(frame_processed)

        if previous_hash is not None:
            # Calculate distance and similarity
            distance = hamming_distance(previous_hash, current_hash)
            similarity = 1.0 - distance / 64.0

            # Check similarity against the threshold
            if similarity < FAIL_THRESHOLD:
                print(f"Frame {index} detected with similarity {similarity}")
                edit_detected = True
        
        previous_hash = current_hash

    if not edit_detected:
        print("No edit detected.")

if __name__ == "__main__":
    main()

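FAIL_THRESHOLD controls the sensitivity: the higher it is, the more frame pairs get flagged. Since the similarity moves in steps of 1/64, the default of 0.77 flags a pair once at least 15 of the 64 hash bits differ.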
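A quick way to get a feel for the threshold is to sweep it over a list of similarity scores. The sketch below assumes the scores have already been collected; the values in `similarities` and the helper `flagged_pairs` are made up for illustration and are not part of the scripts above.

```python
def flagged_pairs(similarities, threshold):
    """Return the indices i whose similarity (frame i vs. frame i + 1) is below threshold."""
    return [i for i, s in enumerate(similarities) if s < threshold]

# Hypothetical similarities between consecutive frames
similarities = [0.98, 0.95, 0.80, 0.97, 0.60, 0.92]

for threshold in (0.70, 0.77, 0.85, 0.96):
    print(threshold, flagged_pairs(similarities, threshold))
# 0.7 [4]
# 0.77 [4]
# 0.85 [2, 4]
# 0.96 [1, 2, 4, 5]
```

Raising the threshold only ever adds pairs to the flagged set, so starting low and increasing it until the output becomes too noisy is one reasonable way to tune it.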

That said, it still feels pretty useless...... oh well...... it's written now.

Maybe it will come in handy someday.
