Please credit the source when reposting: http://blog.csdn.net/wzmsltw/article/details/53221179

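The previous note, on the iDT algorithm, gave a brief introduction to how iDT works. Because the authors released their source code and I have run quite a few experiments with it, this note explains how to use the code and walks through parts of the source. The code can be downloaded from the author's personal homepage; the original post also links a direct download ("iDT source code").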

Besides this post, the article "iDT用法及源码剖析" (iDT usage and source code analysis) is also a good introduction and worth consulting.

Basic functionality

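The full iDT pipeline also includes Fisher Vector (FV) encoding and SVM classification, but the released code stops at outputting iDT features; the later stages need other code or tools. Someone has written a C++ program specifically for FV-encoding DT features: DTFV. For the SVM, liblinear is an option and is fairly fast on high-dimensional data.
The code discussed here therefore takes a video as input and outputs a list of iDT features, one feature per line, each corresponding to one trajectory in the video. Each feature has 426 dimensions: Trajectory 30, HOG 96, HOF 108, MBH 192. How each of these dimensions is derived is explained in the iDT algorithm post.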
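To make the layout concrete, here is a minimal sketch of splitting one output line into its parts. It assumes the default parameters (trajectory length 15, which gives the 30/96/108/192 split above) and relies on the printf calls in the main program further down, which write 10 trajectory-info values before the 426 feature values. The names `IdtFeature` and `parseIdtLine` are hypothetical and not part of the iDT code.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for one parsed output line (not part of the iDT code).
struct IdtFeature {
    std::vector<float> info;        // 10 values: frame, mean_x, mean_y, var_x, var_y,
                                    // length, scale, x_pos, y_pos, t_pos
    std::vector<float> trajectory;  // 30 values
    std::vector<float> hog;         // 96 values
    std::vector<float> hof;         // 108 values
    std::vector<float> mbh;         // 192 values (MBHx followed by MBHy)
};

// Splits one whitespace-separated line into the five parts.
// Returns false unless the line holds exactly 10 + 426 numeric values.
bool parseIdtLine(const std::string& line, IdtFeature& out) {
    std::istringstream in(line);
    std::vector<float> values;
    float v;
    while (in >> v) values.push_back(v);
    if (values.size() != 436) return false;

    const std::size_t sizes[5] = {10, 30, 96, 108, 192};
    std::vector<float>* parts[5] = {&out.info, &out.trajectory, &out.hog, &out.hof, &out.mbh};
    std::size_t pos = 0;
    for (int k = 0; k < 5; ++k) {
        parts[k]->assign(values.begin() + pos, values.begin() + pos + sizes[k]);
        pos += sizes[k];
    }
    return true;
}
```

If the trajectory length or the cell layout is changed on the command line, the sizes above no longer hold and would need to be recomputed.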

Compilation and basic usage

See the README in the code folder for details.

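The iDT code depends on two libraries:

OpenCV: the README recommends 2.4.2; in practice the newer 2.4.13 also works. I have not tried OpenCV 3, so I do not know whether it works.
ffmpeg: the README recommends 0.11.1; in practice newer versions also work.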

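Both libraries are widely used and installation guides are easy to find online, so I will not cover installation here.

Once both libraries are installed, the code can be compiled by running make in the code folder. The compiled executables are placed in ./release/.

To run it, pass the path of the video file as the argument:
./release/DenseTrackStab ./test_sequences/person01_boxing_d1_uncomp.avi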

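Code structure

The iDT code consists mainly of the following files:

DenseTrackStab.cpp: the main program of the iDT algorithm
DenseTrackStab.h: trajectory-tracking parameters and the definitions of several data structures
Descriptors.h: functions related to the descriptors
Initialize.h: functions related to initialization
OpticalFlow.h: functions related to optical flow
Video.cpp: unrelated to the iDT algorithm itself; the author provides it only to test whether the two dependencies were installed correctly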

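Bounding boxes

Bounding boxes supply the location of the human boxes in each video frame. When the projective transformation (homography) between consecutive frames is computed, matched point pairs that fall inside a human box are not used. This keeps human motion from disturbing the estimate, so the camera motion is estimated more accurately.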
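The idea of discarding matches on the human body can be sketched without OpenCV as follows. This is a simplification under my own assumptions: the actual code builds a mask image (`human_mask`) and passes it to the SURF detector and to the flow-based matching, whereas the sketch filters an already matched point list. `Pt`, `Box`, `insideAnyBox`, and `dropHumanMatches` are hypothetical names.

```cpp
#include <cstddef>
#include <vector>

struct Pt  { float x, y; };
struct Box { float x1, y1, x2, y2; };  // top-left and bottom-right corners

// True when p lies inside (or on the border of) at least one box.
bool insideAnyBox(const Pt& p, const std::vector<Box>& boxes) {
    for (std::size_t i = 0; i < boxes.size(); ++i) {
        const Box& b = boxes[i];
        if (p.x >= b.x1 && p.x <= b.x2 && p.y >= b.y1 && p.y <= b.y2)
            return true;
    }
    return false;
}

// Keeps only the matched pairs whose previous-frame point lies outside every box,
// so that only background points contribute to the camera-motion estimate.
void dropHumanMatches(std::vector<Pt>& prev, std::vector<Pt>& cur,
                      const std::vector<Box>& boxes) {
    std::size_t kept = 0;
    for (std::size_t i = 0; i < prev.size() && i < cur.size(); ++i) {
        if (!insideAnyBox(prev[i], boxes)) {
            prev[kept] = prev[i];
            cur[kept] = cur[i];
            ++kept;
        }
    }
    prev.resize(kept);
    cur.resize(kept);
}
```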

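The files provided by the author do not document the format of bb_file, and the code has no interface for reading one; if you need it, a statement that reads the file has to be added to the code (the walkthrough below already includes it). The format of bb_file is:

frame_id a1 a2 a3 a4 a5 b1 b2 b3 b4 b5

Here frame_id is the frame index, starting from 0. The code also checks that the number of entries in bb_file equals the number of frames in the video.

The remaining values come in groups of five, one group per human box. In order, they are: x of the top-left corner, y of the top-left corner, x of the bottom-right corner, y of the bottom-right corner, and a confidence score. Note that although a confidence must be supplied, the code does not appear to use it, so any value will do.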
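A minimal sketch of reading one line in this format is shown below. It is not the `LoadBoundBox` function from the iDT code; `BoxRecord`, `FrameBoxes`, and `parseBoxLine` are hypothetical names, and treating a line that contains only a frame_id as "no boxes in this frame" is my own assumption.

```cpp
#include <sstream>
#include <string>
#include <vector>

// One human box: corner coordinates plus the (apparently unused) confidence.
struct BoxRecord { float x1, y1, x2, y2, confidence; };

// All boxes listed for one frame.
struct FrameBoxes {
    int frame_id;
    std::vector<BoxRecord> boxes;
};

// Parses "frame_id x1 y1 x2 y2 conf x1 y1 x2 y2 conf ...".
// Returns false when the line has no leading frame_id.
// A trailing incomplete group of fewer than five values is ignored.
bool parseBoxLine(const std::string& line, FrameBoxes& out) {
    std::istringstream in(line);
    if (!(in >> out.frame_id)) return false;
    out.boxes.clear();
    BoxRecord b;
    while (in >> b.x1 >> b.y1 >> b.x2 >> b.y2 >> b.confidence)
        out.boxes.push_back(b);
    return true;
}
```

When writing such a file from a detector's output, remember that it must contain one line per video frame, even for frames without detections, because of the length check mentioned above.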

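As for how to obtain the bounding-box data, the brute-force option is manual annotation, but that is far too laborious. In our project we used the SSD (Single Shot MultiBox Detector) algorithm to detect the human boxes.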

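Main program walkthrough

The overall flow of the iDT code is:

1. Read a new frame
2. Compute the projective transformation matrix between the current frame and the previous frame from SURF features and optical flow
3. Warp the current frame with that matrix to remove the effect of camera motion
4. Compute optical flow between the warped current frame and the previous frame
5. Track trajectories and compute descriptors at every image scale
6. Save the information of the current frame and go back to step 1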
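For steps 2 and 3, it may help to see what a projective transformation does to a single pixel coordinate. The sketch below only illustrates that mapping; it is not the author's warp routine, which operates on whole images with OpenCV. `Point2` and `applyHomography` are hypothetical names, and the matrix is assumed to be stored row-major.

```cpp
struct Point2 { double x, y; };

// Maps p through the 3x3 homography H (row-major):
//   [x', y', w]^T = H * [x, y, 1]^T, result = (x'/w, y'/w).
// The caller must ensure w is non-zero for the given point.
Point2 applyHomography(const double H[9], Point2 p) {
    const double w = H[6] * p.x + H[7] * p.y + H[8];
    Point2 q;
    q.x = (H[0] * p.x + H[1] * p.y + H[2]) / w;
    q.y = (H[3] * p.x + H[4] * p.y + H[5]) / w;
    return q;
}
```

With the identity matrix every point maps to itself, which corresponds to assuming no camera motion between the two frames.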

The code is explained below through brief comments.

#include "DenseTrackStab.h"
#include "Initialize.h"
#include "Descriptors.h"
#include "OpticalFlow.h"

#include <time.h>

using namespace cv;

// Set show_track to 1 to visualize the trajectories
int show_track = 0;

int main(int argc, char** argv)
{
    // Read and open the video file
    VideoCapture capture;
    char* video = argv[1];
    int flag = arg_parse(argc, argv);
    capture.open(video);

    if(!capture.isOpened()) {
        fprintf(stderr, "Could not initialize capturing..\n");
        return -1;
    }

    // I added this line myself; the original source has no input interface for bb_file.
    // argv[2] is NULL when only the video path is given, which disables the box logic.
    char* bb_file = argv[2];

    int frame_num = 0;
    TrackInfo trackInfo;
    DescInfo hogInfo, hofInfo, mbhInfo;

    // Initialize the trajectory and descriptor parameters
    InitTrackInfo(&trackInfo, track_length, init_gap);
    InitDescInfo(&hogInfo, 8, false, patch_size, nxy_cell, nt_cell);
    InitDescInfo(&hofInfo, 9, true, patch_size, nxy_cell, nt_cell);
    InitDescInfo(&mbhInfo, 8, false, patch_size, nxy_cell, nt_cell);

    SeqInfo seqInfo;
    InitSeqInfo(&seqInfo, video);

    // Initialize the bounding-box data: load the contents of bb_file into bb_list
    std::vector<Frame> bb_list;
    if(bb_file) {
        LoadBoundBox(bb_file, bb_list);
        assert(bb_list.size() == seqInfo.length);
    }

    if(flag)
        seqInfo.length = end_frame - start_frame + 1;

    if(show_track == 1)
        namedWindow("DenseTrackStab", 0);

    // Initialize the SURF feature detector.
    // 200 is the threshold: a smaller value yields more feature points for matching,
    // which may (but need not) improve results, and is slower.
    SurfFeatureDetector detector_surf(200);
    SurfDescriptorExtractor extractor_surf(true, true);

    std::vector<Point2f> prev_pts_flow, pts_flow;
    std::vector<Point2f> prev_pts_surf, pts_surf;
    std::vector<Point2f> prev_pts_all, pts_all;

    std::vector<KeyPoint> prev_kpts_surf, kpts_surf;
    Mat prev_desc_surf, desc_surf;
    Mat flow, human_mask;

    Mat image, prev_grey, grey;

    std::vector<float> fscales(0);
    std::vector<Size> sizes(0);

    std::vector<Mat> prev_grey_pyr(0), grey_pyr(0), flow_pyr(0), flow_warp_pyr(0);
    std::vector<Mat> prev_poly_pyr(0), poly_pyr(0), poly_warp_pyr(0);

    std::vector<std::list<Track> > xyScaleTracks;
    int init_counter = 0; // records when new feature points should be sampled

    while(true) {
        Mat frame;
        int i, j, c;

        // Read a new frame
        capture >> frame;
        if(frame.empty())
            break;

        if(frame_num < start_frame || frame_num > end_frame) {
            frame_num++;
            continue;
        }

        /*----------------------- Handle the first frame -------------------------*/
        // Optical flow needs two frames, so no flow is computed for the first frame
        if(frame_num == start_frame) {
            image.create(frame.size(), CV_8UC3);
            grey.create(frame.size(), CV_8UC1);
            prev_grey.create(frame.size(), CV_8UC1);

            InitPry(frame, fscales, sizes);

            BuildPry(sizes, CV_8UC1, prev_grey_pyr);
            BuildPry(sizes, CV_8UC1, grey_pyr);
            BuildPry(sizes, CV_32FC2, flow_pyr);
            BuildPry(sizes, CV_32FC2, flow_warp_pyr);

            BuildPry(sizes, CV_32FC(5), prev_poly_pyr);
            BuildPry(sizes, CV_32FC(5), poly_pyr);
            BuildPry(sizes, CV_32FC(5), poly_warp_pyr);

            xyScaleTracks.resize(scale_num);

            frame.copyTo(image);
            cvtColor(image, prev_grey, CV_BGR2GRAY);

            // Densely sample feature points at each image scale
            for(int iScale = 0; iScale < scale_num; iScale++) {
                if(iScale == 0)
                    prev_grey.copyTo(prev_grey_pyr[0]);
                else
                    resize(prev_grey_pyr[iScale-1], prev_grey_pyr[iScale], prev_grey_pyr[iScale].size(), 0, 0, INTER_LINEAR);

                // Dense sampling of feature points
                std::vector<Point2f> points(0);
                DenseSample(prev_grey_pyr[iScale], points, quality, min_distance);

                // Save the feature points
                std::list<Track>& tracks = xyScaleTracks[iScale];
                for(i = 0; i < points.size(); i++)
                    tracks.push_back(Track(points[i], trackInfo, hogInfo, hofInfo, mbhInfo));
            }

            // compute polynomial expansion
            my::FarnebackPolyExpPyr(prev_grey, prev_poly_pyr, fscales, 7, 1.5);

            // human_mask marks pixels outside the human boxes as 1 and pixels inside as 0.
            // SURF features are not computed inside the boxes
            // (i.e. feature points on the human body are not used for matching).
            human_mask = Mat::ones(frame.size(), CV_8UC1);
            if(bb_file)
                InitMaskWithBox(human_mask, bb_list[frame_num].BBs);

            detector_surf.detect(prev_grey, prev_kpts_surf, human_mask);
            extractor_surf.compute(prev_grey, prev_kpts_surf, prev_desc_surf);

            frame_num++;
            continue;
        }

        /*----------------------- Handle subsequent frames -------------------------*/
        init_counter++;
        frame.copyTo(image);
        cvtColor(image, grey, CV_BGR2GRAY);

        // Compute the SURF features of the new frame and match them against the previous frame.
        // SURF features are computed only at the original image scale.
        if(bb_file)
            InitMaskWithBox(human_mask, bb_list[frame_num].BBs);
        detector_surf.detect(grey, kpts_surf, human_mask);
        extractor_surf.compute(grey, kpts_surf, desc_surf);
        ComputeMatch(prev_kpts_surf, kpts_surf, prev_desc_surf, desc_surf, prev_pts_surf, pts_surf);

        // Compute optical flow at all scales and derive frame-to-frame matches from the flow
        my::FarnebackPolyExpPyr(grey, poly_pyr, fscales, 7, 1.5);
        my::calcOpticalFlowFarneback(prev_poly_pyr, poly_pyr, flow_pyr, 10, 2);
        MatchFromFlow(prev_grey, flow_pyr[0], prev_pts_flow, pts_flow, human_mask);

        // Merge the SURF matches and the flow matches
        MergeMatch(prev_pts_flow, pts_flow, prev_pts_surf, pts_surf, prev_pts_all, pts_all);

        // Use the matched points to compute the projective transformation matrix H between the two frames.
        // To avoid a wrong H when there are too few matches, fall back to the identity matrix in that case.
        Mat H = Mat::eye(3, 3, CV_64FC1);
        if(pts_all.size() > 50) {
            std::vector<unsigned char> match_mask;
            Mat temp = findHomography(prev_pts_all, pts_all, RANSAC, 1, match_mask);
            if(countNonZero(Mat(match_mask)) > 25)
                H = temp;
        }

        // Warp the current frame with H to remove the motion caused by the camera
        Mat H_inv = H.inv();
        Mat grey_warp = Mat::zeros(grey.size(), CV_8UC1);
        MyWarpPerspective(prev_grey, grey, grey_warp, H_inv); // warp the second frame

        // Recompute the optical flow at every scale using the warped image
        my::FarnebackPolyExpPyr(grey_warp, poly_warp_pyr, fscales, 7, 1.5);
        my::calcOpticalFlowFarneback(prev_poly_pyr, poly_warp_pyr, flow_warp_pyr, 10, 2);

        // Compute the descriptors at each scale separately
        for(int iScale = 0; iScale < scale_num; iScale++) {
            // Scale 0 is not resized; the other scales are resized by interpolation
            if(iScale == 0)
                grey.copyTo(grey_pyr[0]);
            else
                resize(grey_pyr[iScale-1], grey_pyr[iScale], grey_pyr[iScale].size(), 0, 0, INTER_LINEAR);

            int width = grey_pyr[iScale].cols;
            int height = grey_pyr[iScale].rows;

            // compute the integral histograms
            DescMat* hogMat = InitDescMat(height+1, width+1, hogInfo.nBins);
            HogComp(prev_grey_pyr[iScale], hogMat->desc, hogInfo);

            DescMat* hofMat = InitDescMat(height+1, width+1, hofInfo.nBins);
            HofComp(flow_warp_pyr[iScale], hofMat->desc, hofInfo);

            DescMat* mbhMatX = InitDescMat(height+1, width+1, mbhInfo.nBins);
            DescMat* mbhMatY = InitDescMat(height+1, width+1, mbhInfo.nBins);
            MbhComp(flow_warp_pyr[iScale], mbhMatX->desc, mbhMatY->desc, mbhInfo);

            // Track the feature points at the current scale and compute their descriptors
            std::list<Track>& tracks = xyScaleTracks[iScale];
            for (std::list<Track>::iterator iTrack = tracks.begin(); iTrack != tracks.end();) {
                int index = iTrack->index;
                Point2f prev_point = iTrack->point[index];
                int x = std::min<int>(std::max<int>(cvRound(prev_point.x), 0), width-1);
                int y = std::min<int>(std::max<int>(cvRound(prev_point.y), 0), height-1);

                Point2f point;
                point.x = prev_point.x + flow_pyr[iScale].ptr<float>(y)[2*x];
                point.y = prev_point.y + flow_pyr[iScale].ptr<float>(y)[2*x+1];

                if(point.x <= 0 || point.x >= width || point.y <= 0 || point.y >= height) {
                    iTrack = tracks.erase(iTrack);
                    continue;
                }

                iTrack->disp[index].x = flow_warp_pyr[iScale].ptr<float>(y)[2*x];
                iTrack->disp[index].y = flow_warp_pyr[iScale].ptr<float>(y)[2*x+1];

                // get the descriptors for the feature point
                RectInfo rect;
                GetRect(prev_point, rect, width, height, hogInfo);
                GetDesc(hogMat, rect, hogInfo, iTrack->hog, index);
                GetDesc(hofMat, rect, hofInfo, iTrack->hof, index);
                GetDesc(mbhMatX, rect, mbhInfo, iTrack->mbhX, index);
                GetDesc(mbhMatY, rect, mbhInfo, iTrack->mbhY, index);
                iTrack->addPoint(point);

                // Visualize the trajectories at the original scale
                if(show_track == 1 && iScale == 0)
                    DrawTrack(iTrack->point, iTrack->index, fscales[iScale], image);

                // Once the trajectory reaches the preset length (which should be 15 in iDT),
                // its features can be output
                if(iTrack->index >= trackInfo.length) {
                    std::vector<Point2f> trajectory(trackInfo.length+1);
                    for(int i = 0; i <= trackInfo.length; ++i)
                        trajectory[i] = iTrack->point[i]*fscales[iScale];

                    std::vector<Point2f> displacement(trackInfo.length);
                    for (int i = 0; i < trackInfo.length; ++i)
                        displacement[i] = iTrack->disp[i]*fscales[iScale];

                    float mean_x(0), mean_y(0), var_x(0), var_y(0), length(0);
                    if(IsValid(trajectory, mean_x, mean_y, var_x, var_y, length) && IsCameraMotion(displacement)) {
                        // output the trajectory
                        printf("%d\t%f\t%f\t%f\t%f\t%f\t%f\t", frame_num, mean_x, mean_y, var_x, var_y, length, fscales[iScale]);

                        // for spatio-temporal pyramid
                        printf("%f\t", std::min<float>(std::max<float>(mean_x/float(seqInfo.width), 0), 0.999));
                        printf("%f\t", std::min<float>(std::max<float>(mean_y/float(seqInfo.height), 0), 0.999));
                        printf("%f\t", std::min<float>(std::max<float>((frame_num - trackInfo.length/2.0 - start_frame)/float(seqInfo.length), 0), 0.999));

                        // output the trajectory
                        for (int i = 0; i < trackInfo.length; ++i)
                            printf("%f\t%f\t", displacement[i].x, displacement[i].y);

                        // In practice the trajectory feature is only moderately useful and can be dropped,
                        // leaving just the descriptors below.
                        // To save the output features to a file, modify the PrintDesc function.
                        PrintDesc(iTrack->hog, hogInfo, trackInfo);
                        PrintDesc(iTrack->hof, hofInfo, trackInfo);
                        PrintDesc(iTrack->mbhX, mbhInfo, trackInfo);
                        PrintDesc(iTrack->mbhY, mbhInfo, trackInfo);
                        printf("\n");
                    }

                    iTrack = tracks.erase(iTrack);
                    continue;
                }
                ++iTrack;
            }
            ReleDescMat(hogMat);
            ReleDescMat(hofMat);
            ReleDescMat(mbhMatX);
            ReleDescMat(mbhMatY);

            if(init_counter != trackInfo.gap)
                continue;

            // detect new feature points every gap frames
            std::vector<Point2f> points(0);
            for(std::list<Track>::iterator iTrack = tracks.begin(); iTrack != tracks.end(); iTrack++)
                points.push_back(iTrack->point[iTrack->index]);

            DenseSample(grey_pyr[iScale], points, quality, min_distance);
            // save the new feature points
            for(i = 0; i < points.size(); i++)
                tracks.push_back(Track(points[i], trackInfo, hogInfo, hofInfo, mbhInfo));
        }

        // There are many copyTo prev_xxx calls here.
        // Optical flow, SURF matching and so on all need data from the previous frame,
        // so the current frame's data is saved after each frame for use with the next one.
        init_counter = 0;
        grey.copyTo(prev_grey);
        for(i = 0; i < scale_num; i++) {
            grey_pyr[i].copyTo(prev_grey_pyr[i]);
            poly_pyr[i].copyTo(prev_poly_pyr[i]);
        }

        prev_kpts_surf = kpts_surf;
        desc_surf.copyTo(prev_desc_surf);

        frame_num++;

        if( show_track == 1 ) {
            imshow( "DenseTrackStab", image);
            c = cvWaitKey(3);
            if((char)c == 27) break;
        }
    }

    if( show_track == 1 )
        destroyWindow("DenseTrackStab");

    return 0;
}
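The tracking step in the inner loop can be isolated into a small, OpenCV-free sketch. It mirrors what the code above does for one point: round and clamp the previous position to a pixel, read the flow vector stored there (two floats per pixel, dx then dy), add it to the position, and report whether the new point left the image, in which case the track is erased. The flat `std::vector<float>` flow buffer, `P2`, and `advancePoint` are my own stand-ins, and `std::lround` replaces `cvRound`, so exact ties may round differently.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

struct P2 { float x, y; };

// flow holds width*height pixels in row-major order, two floats (dx, dy) per pixel.
// Writes the advanced position to next and returns false when it falls on or
// outside the image border, which is the condition for dropping the track.
bool advancePoint(const std::vector<float>& flow, int width, int height,
                  const P2& prev, P2& next) {
    // Nearest pixel to the previous position, clamped into the image
    const int x = std::min(std::max(static_cast<int>(std::lround(prev.x)), 0), width - 1);
    const int y = std::min(std::max(static_cast<int>(std::lround(prev.y)), 0), height - 1);

    const std::size_t idx = 2 * (static_cast<std::size_t>(y) * width + x);
    next.x = prev.x + flow[idx];
    next.y = prev.y + flow[idx + 1];

    return !(next.x <= 0 || next.x >= width || next.y <= 0 || next.y >= height);
}
```

Note that in the main program the position is advanced with the unwarped flow (`flow_pyr`), while the stored displacement used for the trajectory feature comes from the warped flow (`flow_warp_pyr`).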

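The above is only a brief walkthrough of the code. Anyone who needs to use the iDT code should still study it carefully; this post is just my own set of notes. In my view the design of iDT is a classic with much worth borrowing, and the code is well written and can be adapted for use elsewhere.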
