【OpenCV】 中使用 Lucas-Kanade 光流进行对象跟踪和路径映射

文章目录

  • 一、说明
  • [二、什么是Lucas-Kanade 方法](#二、什么是Lucas-Kanade 方法)
  • [三、Lucas-Kanade 原理](#三、Lucas-Kanade 原理)
  • 四、代码实现
    • [4.1 第 1 步:用户在第一帧绘制一个矩形](#4.1 第 1 步:用户在第一帧绘制一个矩形)
    • [4.2 第 2 步:从图像中提取关键点](#4.2 第 2 步:从图像中提取关键点)
    • [4.3 第 3 步:跟踪每一帧的关键点](#4.3 第 3 步:跟踪每一帧的关键点)

一、说明

本文针对基于光流法的目标追踪进行叙述,首先介绍Lucas-Kanade 方法的引进,以及基本推导,然后演示如何实现光流法的运动跟踪。并以OpenCV实现一个基本项目。

二、什么是Lucas-Kanade 方法

在计算机视觉领域,Lucas-Kanade 方法是 Bruce D. Lucas 和Takeo Kanade开发的一种广泛使用的光流估计差分方法。该方法假设所考虑像素局部邻域中的光流基本恒定,并根据最小二乘准则求解该邻域中所有像素的基本光流方程。

通过结合来自多个邻近像素的信息,Lucas-Kanade 方法通常可以解决光流方程固有的模糊性。与逐点方法相比,该方法对图像噪声的敏感度也较低。另一方面,由于它是一种纯局部方法,因此无法提供图像均匀区域内部的流信息。

三、Lucas-Kanade 原理

在理论上,初始时间为 t 0 t_0 t0 时刻,经历过 Δ t \Delta t Δt时段后,点p会移动到另一个位置 p ′ p′ p′ ,并且 p ′ p′ p′ 本身和周围都有着与p相似的亮度值。朴素的LK光流法是直接用灰度值代替RGB作为亮度。根据上面的描述,对于点p而言,假设p 的坐标值是( x , y ),有
I ( x , y , t ) = I ( x + Δ x , y + Δ y , t + Δ t ) I(x, y, t) = I(x+\Delta x,y+\Delta y, t+\Delta t) I(x,y,t)=I(x+Δx,y+Δy,t+Δt)

根据泰勒公式:在这里把x 、y 看做是t 的函数,把公式(1)看做单变量t 的等式,只需对t进行展开)
I ( x , y , t ) = I ( x , y , t ) + ∂ I ∂ x ∂ x ∂ t + ∂ I ∂ y ∂ y ∂ t + ∂ I ∂ t + o ( Δ t ) I(x,y,t)=I(x,y,t)+\frac{∂I} {∂x}\frac{∂x}{∂t}+\frac{∂I} {∂y}\frac{∂y}{∂t}+\frac{∂I} {∂t}+o(Δt) I(x,y,t)=I(x,y,t)+∂x∂I∂t∂x+∂y∂I∂t∂y+∂t∂I+o(Δt)

对于一个像素区域:
I x ( q 1 ) V x + I y ( q 1 ) V x = − I t ( q 1 ) I x ( q 2 ) V x + I y ( q 2 ) V x = − I t ( q 2 ) . . . I x ( q n ) V x + I y ( q n ) V x = − I t ( q n ) I_x(q_1)V_x+I_y(q_1)V_x=-I_t(q_1)\\I_x(q_2)V_x+I_y(q_2)V_x=-I_t(q_2)\\...\\I_x(q_n)V_x+I_y(q_n)V_x=-I_t(q_n) Ix(q1)Vx+Iy(q1)Vx=−It(q1)Ix(q2)Vx+Iy(q2)Vx=−It(q2)...Ix(qn)Vx+Iy(qn)Vx=−It(qn)

在这里: q 1 , q 2 , . . . q n q_1,q_2,...q_n q1,q2,...qn是窗口内点的标号, I x ( q i ) I_x(q_i) Ix(qi), I y ( q i ) I_y(q_i) Iy(qi), I t ( q i ) I_t(q_i) It(qi)是图像的灰度偏导数,

这些方程可以写成矩阵形式:
A v = b Av=b Av=b

这个系统的方程多于未知数,因此它通常是过度确定的。Lucas-Kanade方法通过最小二乘原理得到折衷解。也就是说,它解决了2×2系统:


因此

四、代码实现

4.1 第 1 步:用户在第一帧绘制一个矩形

bash 复制代码
# Path to video  
video_path="videos/bicycle1.mp4" 
video = cv2.VideoCapture(video_path)

# read only the first frame for drawing a rectangle for the desired object
ret,frame = video.read()

# I am giving  big random numbers for x_min and y_min because if you initialize them as zeros whatever coordinate you go minimum will be zero 
x_min,y_min,x_max,y_max=36000,36000,0,0


def coordinat_chooser(event,x,y,flags,param):
    global go , x_min , y_min, x_max , y_max

    # when you click the right button, it will provide coordinates for variables
    if event==cv2.EVENT_RBUTTONDOWN:
        
        # if current coordinate of x lower than the x_min it will be new x_min , same rules apply for y_min 
        x_min=min(x,x_min) 
        y_min=min(y,y_min)

         # if current coordinate of x higher than the x_max it will be new x_max , same rules apply for y_max
        x_max=max(x,x_max)
        y_max=max(y,y_max)

        # draw rectangle
        cv2.rectangle(frame,(x_min,y_min),(x_max,y_max),(0,255,0),1)


    """
        if you didn't like your rectangle (maybe if you made some misclicks),  reset the coordinates with the middle button of your mouse
        if you press the middle button of your mouse coordinates will reset and you can give a new 2-point pair for your rectangle
    """
    if event==cv2.EVENT_MBUTTONDOWN:
        print("reset coordinate  data")
        x_min,y_min,x_max,y_max=36000,36000,0,0

cv2.namedWindow('coordinate_screen')
# Set mouse handler for the specified window, in this case, "coordinate_screen" window
cv2.setMouseCallback('coordinate_screen',coordinat_chooser)


while True:
    cv2.imshow("coordinate_screen",frame) # show only first frame 
    
    k = cv2.waitKey(5) & 0xFF # after drawing rectangle press ESC   
    if k == 27:
        cv2.destroyAllWindows()
        break


cv2.destroyAllWindows()

4.2 第 2 步:从图像中提取关键点

bash 复制代码
# take region of interest ( take inside of rectangle )
roi_image=frame[y_min:y_max,x_min:x_max]

# convert roi to grayscale
roi_gray=cv2.cvtColor(roi_image,cv2.COLOR_BGR2GRAY) 

# Params for corner detection
feature_params = dict(maxCorners=20,  # We want only one feature
                      qualityLevel=0.2,  # Quality threshold 
                      minDistance=7,  # Max distance between corners, not important in this case because we only use 1 corner
                      blockSize=7)

first_gray = cv2.cvtColor(frame,cv2.COLOR_BGR2GRAY)

# Harris Corner detection
points = cv2.goodFeaturesToTrack(first_gray, mask=None, **feature_params)


# Filter the detected points to find one within the bounding box
for point in points:
    x, y = point.ravel()
    if y_min <= y <= y_max and x_min <= x <= x_max:
        selected_point = point
        break

# If a point is found, convert it to the correct shape
if selected_point is not None:
    p0 = np.array([selected_point], dtype=np.float32)

plt.imshow(roi_gray,cmap="gray")

将从此图像中提取关键点

4.3 第 3 步:跟踪每一帧的关键点

bash 复制代码
############################ Parameters ####################################

""" 
winSize --> size of the search window at each pyramid level
Smaller windows can more precisely track small, detailed features -->   slow or subtle movements and where fine detail tracking is crucial.
Larger windows is better for larger displacements between frames ,  more robust to noise and small variations in pixel intensity --> require more computations
"""

# Parameters for Lucas-Kanade optical flow
lk_params = dict(winSize=(7, 7),  # Window size
                 maxLevel=2,  # Number of pyramid levels
                 criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03))


############################ Algorithm ####################################

# Read video
cap = cv2.VideoCapture(video_path)

# Take first frame and find corners in it
ret, old_frame = cap.read()

width = old_frame.shape[1]
height = old_frame.shape[0]

# Create a mask image for drawing purposes
mask = np.zeros_like(old_frame)

frame_count = 0
start_time = time.time()

old_gray = first_gray

while True:
    ret, frame = cap.read()
    if not ret:
        break

    frame_gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)

    if p0 is not None:
        # Calculate optical flow
        p1, st, err = cv2.calcOpticalFlowPyrLK(old_gray, frame_gray, p0, None, **lk_params)  
        good_new = p1[st == 1]  # st==1 means found point
        good_old = p0[st == 1]


        if len(good_new) > 0:
            # Calculate movement
            a, b = good_new[0].ravel()
            c, d = good_old[0].ravel()
 
            # Draw the tracks
            mask = cv2.line(mask, (int(a), int(b)), (int(c), int(d)), (0, 255, 0), 2)
            frame = cv2.circle(frame, (int(a), int(b)), 5, (0, 255, 0), -1)

            img = cv2.add(frame, mask)

            # Calculate and display FPS
            elapsed_time = time.time() - start_time
            fps = frame_count / elapsed_time if elapsed_time > 0 else 0
            cv2.putText(img, f"FPS: {fps:.2f}", (width - 200, 30), cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 0, 0), 2, cv2.LINE_AA)

            cv2.imshow('frame', img)

            # Update previous frame and points
            old_gray = frame_gray.copy()
            p0 = good_new.reshape(-1, 1, 2)

        else:
            p0 = None

        # Check if the tracked point is out of frame
        if not (25 <= a < width):
            p0 = None  # Reset p0 to None to detect new feature in the next iteration
            selected_point_distance = 0  # Reset selected point distance when new point is detected


    # Redetect features if necessary
    if p0 is None:
        p0 = cv2.goodFeaturesToTrack(frame_gray, mask=None, **feature_params)
        mask = np.zeros_like(frame)
        selected_point_distance=0
 
    frame_count += 1

    k = cv2.waitKey(25)
    if k == 27:
        break

 
cv2.destroyAllWindows()
cap.release()

结果

相关推荐
社会零时工13 分钟前
【OpenCV】相机标定之利用棋盘格信息标定
人工智能·数码相机·opencv
像素工坊可视化14 分钟前
监控升级:可视化如何让每一个细节 “说话”
运维·人工智能·安全
后端小肥肠19 分钟前
新店3天爆100单!我用零代码Coze搭客服,竟成出单神器?(附喂饭级教程)
人工智能·aigc·coze
AI大模型知识23 分钟前
Qwen3 Embeding模型Lora微调实战
人工智能·低代码·llm
Coovally AI模型快速验证1 小时前
SFTrack:面向警务无人机的自适应多目标跟踪算法——突破小尺度高速运动目标的追踪瓶颈
人工智能·神经网络·算法·yolo·计算机视觉·目标跟踪·无人机
Brduino脑机接口技术答疑1 小时前
脑机新手指南(七):OpenBCI_GUI:从环境搭建到数据可视化(上)
人工智能·算法·脑机接口·新手入门
jndingxin1 小时前
OPenCV CUDA模块光流处理------利用Nvidia GPU的硬件加速能力来计算光流类cv::cuda::NvidiaHWOpticalFlow
人工智能·opencv·计算机视觉
计算机小手1 小时前
开源大模型网关:One API实现主流AI模型API的统一管理与分发
人工智能·语言模型·oneapi
kk5792 小时前
保姆级教程:在无网络无显卡的Windows电脑的vscode本地部署deepseek
人工智能·windows·vscode·chatgpt
一勺汤2 小时前
YOLO12 改进|融入 大 - 小卷积LS Convolution 捕获全局上下文与小核分支提取局部细节,提升目标检测中的多尺度
yolo·计算机视觉·多尺度·yolo12·yolo12改进·lsconv·小目标