NCA Feature Selection for Classification

Consider a multi-class classification problem with a training set containing n observations:

$$S = \{(\mathbf{x}_i, y_i),\ i = 1, 2, \ldots, n\},$$

where $\mathbf{x}_i \in \mathbb{R}^p$ are the feature vectors, $y_i \in \{1, 2, \ldots, c\}$ are the class labels, and $c$ is the number of classes. The aim is to learn a classifier $f : \mathbb{R}^p \to \{1, 2, \ldots, c\}$ that accepts a feature vector and makes a prediction $f(\mathbf{x})$ for the true label $y$ of $\mathbf{x}$.

Consider a randomized classifier that:

1. Randomly picks a point, $\mathrm{Ref}(\mathbf{x})$, from $S$ as the 'reference point' for $\mathbf{x}$.

2. Labels $\mathbf{x}$ using the label of the reference point $\mathrm{Ref}(\mathbf{x})$.

This scheme is similar to that of a 1-NN classifier, where the reference point is chosen to be the nearest neighbor of the new point $\mathbf{x}$. In NCA, the reference point is chosen randomly and all points in $S$ have some probability of being selected as the reference point. The probability $P(\mathrm{Ref}(\mathbf{x}) = \mathbf{x}_j \mid S)$ that point $\mathbf{x}_j$ is picked from $S$ as the reference point for $\mathbf{x}$ is higher if $\mathbf{x}_j$ is closer to $\mathbf{x}$ as measured by the distance function $d_w$, where

$$d_w(\mathbf{x}_i, \mathbf{x}_j) = \sum_{r=1}^{p} w_r^2 \, |x_{ir} - x_{jr}|,$$

and $w_r$ are the feature weights. Assume that

$$P(\mathrm{Ref}(\mathbf{x}) = \mathbf{x}_j \mid S) \propto k(d_w(\mathbf{x}, \mathbf{x}_j)),$$

where $k$ is a kernel or similarity function that assumes large values when $d_w(\mathbf{x}, \mathbf{x}_j)$ is small. Suppose $k$ is

$$k(z) = \exp\left(-\frac{z}{\sigma}\right),$$

as suggested in [1]. The reference point for $\mathbf{x}$ is chosen from $S$, so the sum of $P(\mathrm{Ref}(\mathbf{x}) = \mathbf{x}_j \mid S)$ over all $j$ must equal 1. Therefore, it is possible to write

$$P(\mathrm{Ref}(\mathbf{x}) = \mathbf{x}_j \mid S) = \frac{k(d_w(\mathbf{x}, \mathbf{x}_j))}{\sum_{j=1}^{n} k(d_w(\mathbf{x}, \mathbf{x}_j))}.$$
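For concreteness, here is a minimal MATLAB sketch of these definitions. The helper name and inputs are hypothetical, and this is not the fscnca implementation; it simply evaluates the weighted distance, the kernel, and the normalized reference-point probabilities for a query point $\mathbf{x}$:

```matlab
function P = refPointProbs(x, X, w, sigma)
% Reference-point probabilities P(Ref(x) = x_j | S) for a query point x.
%   x     : 1-by-p query feature vector
%   X     : n-by-p training feature matrix (row j is x_j)
%   w     : 1-by-p feature weights
%   sigma : kernel width

% Weighted distance d_w(x, x_j) = sum_r w_r^2 * |x_r - x_jr|, for every j
d = sum(abs(X - x) .* (w.^2), 2);      % n-by-1 (implicit expansion, R2016b+)

% Kernel k(z) = exp(-z / sigma)
kvals = exp(-d / sigma);

% Normalize so that the probabilities sum to 1 over all points in S
P = kvals / sum(kvals);
end
```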

Now consider the leave-one-out application of this randomized classifier, that is, predicting the label of $\mathbf{x}_i$ using the data in $S^{-i}$, the training set $S$ excluding the point $(\mathbf{x}_i, y_i)$. The probability that point $\mathbf{x}_j$ is picked as the reference point for $\mathbf{x}_i$ is

$$p_{ij} = P(\mathrm{Ref}(\mathbf{x}_i) = \mathbf{x}_j \mid S^{-i}) = \frac{k(d_w(\mathbf{x}_i, \mathbf{x}_j))}{\sum_{j=1,\, j \neq i}^{n} k(d_w(\mathbf{x}_i, \mathbf{x}_j))}.$$

The leave-one-out probability of correct classification for observation $i$ is the probability $p_i$ that the randomized classifier correctly classifies observation $i$ using $S^{-i}$:

$$p_i = \sum_{j=1,\, j \neq i}^{n} P(\mathrm{Ref}(\mathbf{x}_i) = \mathbf{x}_j \mid S^{-i}) \, I(y_i = y_j) = \sum_{j=1,\, j \neq i}^{n} p_{ij} \, y_{ij},$$

where

$$y_{ij} = I(y_i = y_j) = \begin{cases} 1 & \text{if } y_i = y_j, \\ 0 & \text{otherwise.} \end{cases}$$

The average leave-one-out probability of correct classification using the randomized classifier can be written as

$$F(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} p_i.$$
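These quantities translate directly into code. The following MATLAB sketch (hypothetical helper name, not the fscnca implementation) computes the leave-one-out probabilities $p_{ij}$, the per-observation probabilities $p_i$, and $F(\mathbf{w})$ for a labeled training set with numeric labels:

```matlab
function [F, pLOO] = ncaObjective(X, y, w, sigma)
% Unregularized NCA objective F(w) = (1/n) * sum_i p_i (leave-one-out).
%   X : n-by-p feature matrix,  y : n-by-1 numeric class labels
%   w : 1-by-p feature weights, sigma : kernel width

n    = size(X, 1);
pLOO = zeros(n, 1);                              % p_i for each observation
for i = 1:n
    d        = sum(abs(X - X(i,:)) .* (w.^2), 2);  % d_w(x_i, x_j) for all j
    kvals    = exp(-d / sigma);
    kvals(i) = 0;                                % exclude j = i (leave one out)
    pij      = kvals / sum(kvals);               % p_ij; sums to 1 over j ~= i
    pLOO(i)  = sum(pij .* (y == y(i)));          % p_i = sum_j p_ij * y_ij
end
F = mean(pLOO);                                  % F(w) = (1/n) * sum_i p_i
end
```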

The right-hand side of $F(\mathbf{w})$ depends on the weight vector $\mathbf{w}$. The goal of neighborhood component analysis is to maximize $F(\mathbf{w})$ with respect to $\mathbf{w}$. fscnca uses the regularized objective function introduced in [1]:

$$F(\mathbf{w}) = \frac{1}{n} \sum_{i=1}^{n} p_i - \lambda \sum_{r=1}^{p} w_r^2 = \frac{1}{n} \sum_{i=1}^{n} \underbrace{\left[ \sum_{j=1,\, j \neq i}^{n} p_{ij} \, y_{ij} - \lambda \sum_{r=1}^{p} w_r^2 \right]}_{F_i(\mathbf{w})} = \frac{1}{n} \sum_{i=1}^{n} F_i(\mathbf{w}),$$

where λ is the regularization parameter. The regularization term drives many of the weights in w to 0.

After choosing the kernel parameter $\sigma$ in $p_{ij}$ as 1, finding the weight vector $\mathbf{w}$ can be expressed as the following minimization problem for a given $\lambda$:

$$\hat{\mathbf{w}} = \operatorname*{argmin}_{\mathbf{w}} f(\mathbf{w}) = \operatorname*{argmin}_{\mathbf{w}} \frac{1}{n} \sum_{i=1}^{n} f_i(\mathbf{w}),$$

where $f(\mathbf{w}) = -F(\mathbf{w})$ and $f_i(\mathbf{w}) = -F_i(\mathbf{w})$.
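Put together, the regularized objective $f(\mathbf{w})$ with $\sigma = 1$ can be written as a function handle built on the hypothetical ncaObjective helper sketched above. Purely as a sanity check for a problem with only a few features, it could be minimized with a general-purpose derivative-free optimizer such as fminsearch; this illustrates the objective and is not how fscnca performs the optimization:

```matlab
% Regularized objective f(w) = -F(w) + lambda * sum_r w_r^2, with sigma = 1.
% Uses the hypothetical ncaObjective helper sketched above.
lambda = 0.1;                                % example regularization value
sigma  = 1;
fobj   = @(w) -ncaObjective(X, y, w(:).', sigma) + lambda * sum(w.^2);

% Derivative-free sanity check for a problem with only a few features:
w0   = ones(size(X, 2), 1);                  % initial feature weights
wHat = fminsearch(fobj, w0);                 % minimizes f(w) over w
```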

Note that

$$\frac{1}{n} \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} p_{ij} = 1,$$

and the argument of the minimum does not change if you add a constant to an objective function. Therefore, you can rewrite the objective function by adding the constant 1.

$$\begin{aligned}
\hat{\mathbf{w}} &= \operatorname*{argmin}_{\mathbf{w}} \{1 + f(\mathbf{w})\} \\
&= \operatorname*{argmin}_{\mathbf{w}} \left\{ \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} p_{ij} - \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} p_{ij} \, y_{ij} + \lambda \sum_{r=1}^{p} w_r^2 \right\} \\
&= \operatorname*{argmin}_{\mathbf{w}} \left\{ \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} p_{ij} (1 - y_{ij}) + \lambda \sum_{r=1}^{p} w_r^2 \right\} \\
&= \operatorname*{argmin}_{\mathbf{w}} \left\{ \frac{1}{n} \sum_{i=1}^{n} \sum_{j=1,\, j \neq i}^{n} p_{ij} \, l(y_i, y_j) + \lambda \sum_{r=1}^{p} w_r^2 \right\},
\end{aligned}$$

where the loss function is defined as

$$l(y_i, y_j) = \begin{cases} 1 & \text{if } y_i \neq y_j, \\ 0 & \text{otherwise.} \end{cases}$$

The argument of the minimum is the weight vector that minimizes the classification error. You can specify a custom loss function using the LossFunction name-value pair argument in the call to fscnca.
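In practice, you do not minimize this objective by hand; you call fscnca from Statistics and Machine Learning Toolbox. A brief usage sketch, assuming X is an n-by-p predictor matrix and y an n-by-1 label vector, with an example (not prescribed) Lambda value:

```matlab
% Fit an NCA feature selection model for classification with the default
% 0-1 classification loss ('classiferror'); the Lambda value is an example.
lambda = 0.5 / size(X, 1);
mdl = fscnca(X, y, 'Lambda', lambda, 'Solver', 'lbfgs');

% Fitted feature weights; weights driven toward zero by the regularization
% term indicate features that contribute little and can be discarded.
w = mdl.FeatureWeights;
bar(w);
xlabel('Feature index');
ylabel('Feature weight');
```

A custom loss can be supplied through the LossFunction name-value pair; see the fscnca documentation for the required form of the function handle.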
