Ph.D. Application: Research Proposal

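When I applied to Ph.D. programs abroad, I wrote a research proposal (RP). Because the topic was not the research direction of my master's degree, the proposal stays fairly shallow. I am posting it here for reference; it may be helpful to some fellow students.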

Why I Want a Ph.D.

About Myself

Background

My current research direction is multimodal language generation, where the goal is to integrate the multimodal information acquired by a robot into a well-formed sentence. So far I have designed a basic framework for this research by drawing on methods from image captioning and machine translation. I plan to pursue a Ph.D. degree at your school of computer science starting in the fall of 20XX. In the future, I am determined to devote myself to research on object tracking.

Research Motivation

Object tracking is the process of locating an object of interest in a sequence of images so as to reconstruct the moving object’s trajectory. Given a bounding box defining the object of interest in a single frame, the goal of tracking is to automatically determine the object’s bounding box in every subsequent frame, or to indicate that the object is not visible.

As a mid-level task in computer vision, object tracking underpins high-level tasks such as pose estimation, action recognition, and behavior analysis. It has numerous practical applications, such as visual surveillance, human-computer interaction, and virtual reality. Although object tracking has been studied for several decades, it remains challenging because of factors such as abrupt appearance changes and severe object occlusions. Beyond these practical applications, which appeal to me deeply, I am curious about how to address these challenges.

Object Tracking

Introduction

There are two main forms of object tracking: single-object tracking (SOT) and multi-object tracking (MOT). SOT primarily focuses on designing sophisticated appearance and/or motion models to cope with challenging factors such as object deformation, occlusion, illumination changes, motion blur, and background clutter. MOT additionally requires solving two tasks: determining the number of objects, which typically varies over time, and maintaining their identities. Beyond the challenges common to SOT and MOT, key issues that further complicate MOT include 1) frequent occlusions, 2) initialization and termination of tracks, 3) similar appearance, and 4) interactions among multiple objects.

A wide range of solutions to these issues has been proposed over the past decades. Most tracking algorithms fall into one of two classes according to their representation scheme: generative or discriminative. Generative models typically learn an appearance model and search for the image region with the minimal reconstruction error as the tracking result. Typical generative algorithms are sparse representation methods, which represent the object with a set of target templates and trivial templates to handle partial occlusion, illumination change, and pose variation. Discriminative models pose tracking as a detection problem, in which a classifier is learned to separate the target object from its surrounding background within a local region. Unlike generative methods, discriminative approaches use both target and background information to find a decision boundary between the two. This idea underlies tracking-by-detection methods, in which a discriminative classifier is trained online using sample patches of the target and the surrounding background.
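To make the generative idea concrete, here is a minimal sketch of selecting the candidate region with the smallest reconstruction error. It is an illustration under stated assumptions, not a method from the cited literature: it uses a plain least-squares fit over target templates instead of the L1-regularized sparse coding with trivial templates described above, and the function names (`reconstruction_error`, `select_candidate`) and the synthetic 64-dimensional patches are hypothetical.

```python
import numpy as np


def reconstruction_error(templates, candidate):
    """Residual norm after fitting the candidate with the template columns.

    Assumption: ordinary least squares stands in for sparse coding here.
    """
    coeffs, *_ = np.linalg.lstsq(templates, candidate, rcond=None)
    return float(np.linalg.norm(candidate - templates @ coeffs))


def select_candidate(templates, candidates):
    """Return the index of the candidate with the smallest reconstruction error."""
    errors = [reconstruction_error(templates, c) for c in candidates]
    return int(np.argmin(errors)), errors


# Synthetic data: 5 target templates, each a flattened 64-dimensional patch.
rng = np.random.default_rng(0)
templates = rng.normal(size=(64, 5))

# One candidate is close to the span of the templates; the rest are background.
mix = np.array([0.5, 0.2, 0.1, 0.1, 0.1])
target_like = templates @ mix + 0.01 * rng.normal(size=64)
background = [rng.normal(size=64) for _ in range(3)]
candidates = background[:2] + [target_like] + background[2:]

best_index, errors = select_candidate(templates, candidates)
print(best_index, [round(e, 3) for e in errors])
```

The target-like candidate is reconstructed almost perfectly, while background patches leave a large residual, which is the signal a generative tracker uses to pick its result.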

Related Work

A class of techniques called “tracking-by-detection” emerged after Andriluka et al. combined the advantages of detection and tracking in a single framework. These methods train a discriminative classifier online to separate the object from the background. The classifier bootstraps itself by using the current tracker state to extract positive and negative examples from the current frame. However, slight inaccuracies in the tracker can lead to incorrectly labeled training examples, which degrade the classifier and can cause drift. Babenko et al. show that using Multiple Instance Learning (MIL) instead of traditional supervised learning avoids these problems and can lead to a more robust tracker with fewer parameter tweaks.

The particle filter (PF) realizes recursive Bayesian estimation with the Monte Carlo method, using a set of random particles to represent the posterior probability density function (PDF) of the object state in discrete form. Particle filters perform very well on non-linear, non-Gaussian dynamic state estimation problems and are widely used in object tracking. Since the invention of the particle filter, several types of appearance models have been proposed for this framework, including color, contour, edge, and saliency. However, the particle filter itself is a computationally expensive algorithm because each particle must be processed separately. Complex appearance models can dramatically increase the overall execution time of a particle filter framework, making it impractical for real-life applications. In addition, as a generative approach, the particle filter tends to perform worse in complex visual scenarios than discriminative approaches such as correlation filters and deep learning.
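The predict, update, and resample cycle described above can be sketched as follows. This is a minimal bootstrap particle filter on a toy problem, written under assumptions that are not in the proposal: the state is a 2-D position, the motion model is a random walk, the likelihood is a Gaussian around a noisy position measurement (a real tracker would score an appearance model instead), and the parameter values `motion_std` and `obs_std` are arbitrary.

```python
import numpy as np


def particle_filter_step(particles, weights, observation, rng,
                         motion_std=2.0, obs_std=1.0):
    """One predict/update/resample cycle of a bootstrap particle filter."""
    # Predict: propagate each particle with a random-walk motion model.
    particles = particles + rng.normal(scale=motion_std, size=particles.shape)

    # Update: reweight by a Gaussian likelihood of the observed position.
    sq_dist = np.sum((particles - observation) ** 2, axis=1)
    log_lik = -0.5 * sq_dist / obs_std ** 2
    weights = weights * np.exp(log_lik - log_lik.max())  # shift for stability
    weights = weights / weights.sum()

    # Estimate: the weighted mean approximates the posterior mean.
    estimate = weights @ particles

    # Resample: draw particles in proportion to their weights, reset weights.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    particles = particles[idx]
    weights = np.full(len(particles), 1.0 / len(particles))
    return particles, weights, estimate


rng = np.random.default_rng(0)
num_particles = 500
particles = rng.normal(scale=5.0, size=(num_particles, 2))
weights = np.full(num_particles, 1.0 / num_particles)

# Simulated target moving with constant velocity, observed with noise.
true_pos = np.zeros(2)
velocity = np.array([1.0, 0.5])
for _ in range(30):
    true_pos = true_pos + velocity
    observation = true_pos + rng.normal(scale=1.0, size=2)
    particles, weights, estimate = particle_filter_step(
        particles, weights, observation, rng)

print("true position:", true_pos, "estimate:", estimate)
```

Every step touches every particle, so the cost grows linearly with the number of particles and with the cost of evaluating the likelihood, which is why expensive appearance models slow the framework down.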

Traditional correlation filters such as ASEF and UMACE are trained offline and used for object detection or target identification. However, their training requirements are poorly suited to tracking, which needs robust filters that can be trained from a single frame and dynamically adapted as the appearance of the target changes. Bolme et al. introduce a regularized variant of ASEF named Minimum Output Sum of Squared Error (MOSSE), which is suitable for visual tracking; a tracker based on MOSSE filters is robust and effective. Because correlation filters can be interpreted as linear classifiers, a natural question is whether they can take advantage of the kernel trick to classify in richer non-linear feature spaces. Several researchers have investigated this question: Henriques et al. derived the Kernelized Correlation Filter (KCF), and Patnaik et al. proposed kernel SDF filters.
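As a rough illustration of how a MOSSE-style filter is trained from a single frame and adapted online, the sketch below keeps a running numerator and denominator in the Fourier domain and locates the target at the peak of the correlation response. It is a simplified sketch, not the authors' implementation: it omits the preprocessing and training-set augmentation that a practical tracker would need, and the class name `MosseFilter`, the Gaussian width, the learning rate, and the regularizer `eps` are assumed values.

```python
import numpy as np


def gaussian_response(shape, center, sigma=2.0):
    """Desired correlation output: a Gaussian peak at the target center."""
    rows, cols = np.mgrid[0:shape[0], 0:shape[1]]
    sq_dist = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-sq_dist / (2.0 * sigma ** 2))


class MosseFilter:
    """Simplified MOSSE-style correlation filter (illustrative only)."""

    def __init__(self, patch, center, learning_rate=0.125, eps=1e-3):
        self.learning_rate = learning_rate
        self.eps = eps  # regularizer added to the denominator
        F = np.fft.fft2(patch)
        G = np.fft.fft2(gaussian_response(patch.shape, center))
        self.A = G * np.conj(F)  # numerator of the conjugate filter
        self.B = F * np.conj(F)  # denominator (energy spectrum of the input)

    def respond(self, patch):
        """Correlation response of the current filter on a new patch."""
        F = np.fft.fft2(patch)
        H_conj = self.A / (self.B + self.eps)
        return np.real(np.fft.ifft2(F * H_conj))

    def locate(self, patch):
        """Return the (row, col) of the response peak."""
        response = self.respond(patch)
        row, col = np.unravel_index(np.argmax(response), response.shape)
        return int(row), int(col)

    def update(self, patch, center):
        """Blend in a new training sample with a running average."""
        lr = self.learning_rate
        F = np.fft.fft2(patch)
        G = np.fft.fft2(gaussian_response(patch.shape, center))
        self.A = lr * G * np.conj(F) + (1.0 - lr) * self.A
        self.B = lr * F * np.conj(F) + (1.0 - lr) * self.B


rng = np.random.default_rng(0)
patch = rng.normal(size=(32, 32))          # synthetic target patch
tracker = MosseFilter(patch, center=(16, 16))

# Circularly shift the patch to simulate target motion between frames.
shifted = np.roll(patch, shift=(3, -2), axis=(0, 1))
peak_same = tracker.locate(patch)
peak_shifted = tracker.locate(shifted)
tracker.update(shifted, center=peak_shifted)
print(peak_same, peak_shifted)
```

Because training and detection are element-wise operations on FFTs, each frame costs only a few transforms, which is the main reason correlation filter trackers are fast.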

In the past few years, deep learning architectures have produced very promising results on complicated tasks, including image classification and speech recognition. The key to their success is the use of deep architectures to learn richer invariant features through multiple nonlinear transformations. Wang et al. argue that visual tracking can benefit from deep learning for the same reasons, and they propose a deep learning tracker (DLT) for robust visual tracking. DLT uses a stacked denoising autoencoder (SDAE) to learn generic image features from a large image dataset as auxiliary data and then transfers the learned features to the online tracking task. In later work, they bring the biologically inspired convolutional neural network (CNN) framework to visual tracking to address the challenge of limited labeled training data. Subsequently, Nam and Han propose a CNN architecture, referred to as the Multi-Domain Network (MDNet), that learns a shared representation of targets from multiple annotated video sequences, where each video is regarded as a separate domain. In addition, Milan et al. present an approach based on recurrent neural networks (RNNs) to address the challenging problems of data association and trajectory estimation, and they show that an RNN-based approach can learn complex motion models in realistic environments.

Tracking System

A tracking system generally consists of four basic components:

Motion Model. It relates the locations of the object over time. Based on the estimation from the previous frame, the motion model generates a set of candidate regions or bounding boxes which may contain the target in the current frame.
Feature Extraction. The features extracted from candidate regions or bounding boxes are used to represent the object. Common features include histogram features, texture features, color features, Haar-like features, and deep convolutional features.
Appearance Model. An appearance model evaluates the likelihood that the object of interest is located in each candidate region. For object tracking, local appearance models are generally more robust than holistic ones.
Online Update Mechanism. This mechanism controls the strategy and frequency of updating the appearance model. It has to strike a balance between model adaptation and drift.
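The four components above can be wired together as in the following toy tracker. It is only a sketch under simplifying assumptions: the motion model is an exhaustive translation search within a fixed radius, the feature is an intensity histogram, the appearance model is the Bhattacharyya coefficient against a template, and the update rule is a thresholded running average. All function names, parameter values, and the synthetic frames are hypothetical.

```python
import numpy as np


def motion_model(box, radius, frame_shape):
    """Propose candidate boxes by translating the previous box."""
    r, c, h, w = box
    candidates = []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            nr, nc = r + dr, c + dc
            if 0 <= nr and nr + h <= frame_shape[0] and 0 <= nc and nc + w <= frame_shape[1]:
                candidates.append((nr, nc, h, w))
    return candidates


def extract_features(frame, box, bins=8):
    """Normalized intensity histogram of the region inside the box."""
    r, c, h, w = box
    hist, _ = np.histogram(frame[r:r + h, c:c + w], bins=bins, range=(0.0, 1.0))
    return hist / hist.sum()


def appearance_score(template, features):
    """Bhattacharyya coefficient between the template and candidate histograms."""
    return float(np.sum(np.sqrt(template * features)))


def update_template(template, features, score, rate=0.1, min_score=0.8):
    """Adapt the template only when the match is confident, to limit drift."""
    if score < min_score:
        return template
    return (1.0 - rate) * template + rate * features


def track(frames, init_box, radius=4):
    box = init_box
    template = extract_features(frames[0], box)
    trajectory = [box]
    for frame in frames[1:]:
        candidates = motion_model(box, radius, frame.shape)
        scored = [(appearance_score(template, extract_features(frame, cand)), cand)
                  for cand in candidates]
        score, box = max(scored, key=lambda item: item[0])
        template = update_template(template, extract_features(frame, box), score)
        trajectory.append(box)
    return trajectory


def make_frame(rng, texture, top_left, shape=(40, 40)):
    """Dark noisy background with a bright textured object pasted in."""
    frame = rng.uniform(0.0, 0.3, size=shape)
    r, c = top_left
    frame[r:r + texture.shape[0], c:c + texture.shape[1]] = texture
    return frame


rng = np.random.default_rng(0)
texture = rng.uniform(0.7, 1.0, size=(8, 8))
positions = [(10, 10), (13, 12), (15, 15)]
frames = [make_frame(rng, texture, p) for p in positions]
trajectory = track(frames, init_box=(10, 10, 8, 8))
print(trajectory)
```

The `min_score` gate in `update_template` is one simple way to trade adaptation against drift: a low threshold adapts quickly but can absorb background, while a high threshold keeps the template stable but may fail to follow appearance changes.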

Future Work

Several directions could be pursued in future work:

Reducing the search scope of the motion model.
Applying the visual selective attention mechanism to object tracking.
Studying object tracking in discontinuous video with the aid of person re-identification (re-ID).

Bibliography

[1] Bolme D S, Beveridge J R, Draper B A, et al. Visual object tracking using adaptive correlation filters[C]. IEEE Conference on Computer Vision and Pattern Recognition, 2010: 2544-2550.

[2] Babenko B, Yang M, Belongie S J. Robust Object Tracking with Online Multiple Instance Learning[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2011, 33(8): 1619-1632.

[3] Yilmaz A, Javed O, Shah M. Object tracking: A survey[J]. ACM Computing Surveys, 2006, 38(4).

[4] Andriluka M, Roth S, Schiele B. People-tracking-by-detection and people-detection-by-tracking[C]. IEEE Conference on Computer Vision and Pattern Recognition, 2008: 1-8.

[5] Kalal Z, Mikolajczyk K, Matas J. Tracking-Learning-Detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(7): 1409-1422.

[6] Truong M T, Pak M, Kim S, et al. Single object tracking using particle filter framework and saliency-based weighted color histogram[J]. Multimedia Tools and Applications, 2018, 77(22): 30067-30088.

[7] Zhou T, Ouyang Y, Wang R, et al. Particle filter based on real-time Compressive Tracking[C]. International Conference on Audio, Language and Image Processing, 2016: 754-759.

[8] Wang N, Yeung D Y. Learning a Deep Compact Image Representation for Visual Tracking[C]. Advances in Neural Information Processing Systems, 2013: 809-817.

[9] Choi J, Chang H J, Yun S, et al. Attentional Correlation Filter Network for Adaptive Visual Tracking[C]. IEEE Conference on Computer Vision and Pattern Recognition, 2017: 4828-4837.

[10] Milan A, Rezatofighi S H, Dick A R, et al. Online Multi-Target Tracking Using Recurrent Neural Networks[C]. AAAI Conference on Artificial Intelligence, 2016: 4225-4232.

[11] Wang N, Shi J, Yeung D Y, et al. Understanding and Diagnosing Visual Tracking Systems[C]. IEEE International Conference on Computer Vision, 2015: 3101-3109.

[12] Wang N, Li S, Gupta A, et al. Transferring Rich Feature Hierarchies for Robust Visual Tracking[J]. arXiv: Computer Vision and Pattern Recognition, 2015.

[13] Wei J, Hongjuan L, Wei S, et al. A new particle filter object tracking algorithm based on dynamic transition model[C]. International Conference on Information and Automation, 2016: 1832-1835.

[14] Huang L, Ma B, Shen J, et al. Visual Tracking by Sampling in Part Space[J]. IEEE Transactions on Image Processing, 2017, 26(12): 5800-5810.

[15] Zhang K, Zhang L, Liu Q, et al. Fast Visual Tracking via Dense Spatio-Temporal Context Learning[C]. European Conference on Computer Vision, 2014: 127-141.

[16] Nam H, Han B. Learning Multi-domain Convolutional Neural Networks for Visual Tracking[C]. IEEE Conference on Computer Vision and Pattern Recognition, 2016: 4293-4302.

[17] Henriques J F, Caseiro R, Martins P, et al. High-Speed Tracking with Kernelized Correlation Filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(3): 583-596.

[18] Patnaik R, Casasent D. Fast FFT-based distortion-invariant kernel filters for general object recognition[C]. Proceedings of SPIE, 2009, 7252.

[19] Henriques J F, Caseiro R, Martins P, et al. Exploiting the circulant structure of tracking-by-detection with kernels[C]. European Conference on Computer Vision, 2012: 702-715.
