Original link: https://joselynzhao.top/2019/04/15/A-Survey-of-Zero-Shot-Learning_-Settings,-Methods,-and-Applications-reading-notes/

文章目录

  • A Survey of Zero-Shot Learning: Settings, Methods, and Applications
    • 1 Introduction
        • Restrictions
        • Existing approaches
        • Some popular application scenarios
      • 1.1 Overview of Zero-Shot Learning
        • the definition of zero-shot learning
        • transfer learning
        • Auxiliary information
      • 1.2 Learning Settings
      • 1.3 Contributions and Article Organization
        • Our contributions
        • Article organization
    • 2 SEMANTIC SPACES
      • 2.1 Engineered Semantic Spaces
        • Attribute spaces
        • Lexical spaces
        • Text-keyword spaces
        • Summary of engineered semantic spaces
      • 2.2 Learned Semantic Spaces
        • Label-embedding spaces
        • Text-embedding spaces
        • Image-representation spaces
        • Summary of learned semantic spaces
    • 3 METHODS
      • 3.1 Classifier-Based Methods
        • 3.1.1 Correspondence Methods

A Survey of Zero-Shot Learning: Settings, Methods, and Applications


Zero-shot learning is a powerful and promising learning paradigm, in which the classes covered by training instances and the classes we aim to classify are disjoint.

In this paper, the authors:

  1. provide an overview of zero-shot learning and classify it into three learning settings;
  2. describe the different semantic spaces adopted in existing zero-shot learning works;
  3. categorize existing zero-shot learning methods and introduce representative methods under each category;
  4. discuss different applications of zero-shot learning;
  5. highlight promising future research directions of zero-shot learning.

1 Introduction

Restrictions

In supervised classification:

  • sufficient labeled instances are needed;
  • the learned classifier can only classify instances belonging to the classes covered by the training data.

Existing approaches

open set recognition methods
Lalit P. Jain, Walter J. Scheirer, and Terrance E. Boult. 2014. Multi-class open set recognition using probability of inclusion. In European Conference on Computer Vision (ECCV’14). 393–409.

However, such a method cannot determine which specific unseen class an instance belongs to.

That is, the classifier can judge whether a testing instance belongs to one of the classes in the training data, but it cannot identify the unseen class itself.

For methods under the above learning paradigms, if the testing instances belong to unseen classes with no labeled instances available during model learning (or adaptation), the learned classifier cannot determine their class labels.

Some popular application scenarios

The following scenarios require the classifier to determine class labels for instances even when no labeled training instances are available for their classes.

  • The number of target classes is large
    collecting sufficient labeled instances for such a large number of classes is challenging.

  • Target classes are rare.
    An example is fine-grained object classification.
    For many rare breeds, we cannot find the corresponding labeled instances.

  • Target classes change over time.
    for some new products, it is difficult to find corresponding labeled instances

  • In some particular tasks, it is expensive to obtain labeled instances.
    For example, in the image semantic segmentation problem

To solve this problem, zero-shot learning (also known as zero-data learning [81]) is proposed.

The aim of zero-shot learning:
classify instances belonging to the classes that have no labeled instances.

range of applications:

  • computer vision
  • natural language processing
  • ubiquitous computing

1.1 Overview of Zero-Shot Learning


Each instance is usually assumed to belong to one class.

the definition of zero-shot learning

Denote $S=\{c_i^s \mid i=1,2,\dots,N_s\}$ as the set of seen classes.
Denote $U=\{c_i^u \mid i=1,2,\dots,N_u\}$ as the set of unseen classes.
Note that $S \cap U = \emptyset$.

That is, the seen and unseen classes are mutually exclusive, with no intersection.

Denote $X$ as the feature space, which is $D$-dimensional.
Denote $D^{tr}=\{(x_i^{tr}, y_i^{tr}) \in X \times S\}_{i=1}^{N_{tr}}$ as the set of labeled training instances belonging to seen classes.

Denote $X^{te}=\{x_i^{te} \in X\}_{i=1}^{N_{te}}$ as the set of testing instances.

Definition 1.1 (Zero-Shot Learning).
Given labeled training instances $D^{tr}$ belonging to the seen classes $S$, zero-shot learning aims to learn a classifier $f^u(\cdot): X \rightarrow U$ that can classify testing instances $X^{te}$ (i.e., predict $Y^{te}$) belonging to the unseen classes $U$.
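As a minimal sketch of this setup (the class names and feature vectors below are hypothetical toy data, not from the survey):

```python
# Toy illustration of the zero-shot setting: training data covers only
# the seen classes S; testing instances come from the unseen classes U.
seen_classes = {"horse", "dog", "cat"}    # S
unseen_classes = {"zebra", "fox"}         # U

# S ∩ U = ∅: no unseen class appears in training.
assert seen_classes.isdisjoint(unseen_classes)

# Labeled training set D_tr: (feature vector, seen-class label) pairs.
D_tr = [([0.9, 0.1], "horse"), ([0.2, 0.8], "dog")]
assert all(y in seen_classes for _, y in D_tr)
```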

Zero-shot learning is a subfield of transfer learning.

transfer learning


In homogeneous transfer learning:
the feature spaces and the label spaces are the same
in heterogeneous transfer learning:
the feature spaces and/or the label spaces are different.

In zero-shot learning:
the same feature spaces, but different label spaces.

so zero-shot learning belongs to heterogeneous transfer learning.

Note: heterogeneous transfer learning with different label spaces (HTL-DLS).

HTL-DLS vs. zero-shot learning: the difference lies in whether there are labeled instances for the classes in the target label space. HTL-DLS has some; zero-shot learning does not.

Auxiliary information

Such auxiliary information should contain information about all of the unseen classes.
Meanwhile, the auxiliary information should be related to the instances in the feature space.
the auxiliary information involved by existing zero-shot learning methods is usually some semantic information.
It forms a space that contains both the seen and the unseen classes.

We denote $\tau$ as the semantic space. Suppose $\tau$ is $M$-dimensional.
Denote $t_i^s \in \tau$ as the class prototype for seen class $c_i^s$.
Denote $t_i^u \in \tau$ as the class prototype for unseen class $c_i^u$.

Denote $T^s=\{t_i^s\}_{i=1}^{N_s}$ as the set of prototypes for seen classes.
Denote $T^u=\{t_i^u\}_{i=1}^{N_u}$ as the set of prototypes for unseen classes.

Denote $\pi(\cdot): S \cup U \rightarrow \tau$ as a class prototyping function that takes a class label as input and outputs the corresponding class prototype.


We summarise the key notations used throughout this article in Table 1.

1.2 Learning Settings

Based on the degree of transduction, we categorise zero-shot learning into three learning settings.

Definition 1.2 (Class-Inductive Instance-Inductive (CIII) Setting). Only labeled training instances $D^{tr}$ and seen class prototypes $T^s$ are used in model learning.

Definition 1.3 (Class-Transductive Instance-Inductive (CTII) Setting). Labeled training instances $D^{tr}$, seen class prototypes $T^s$, and unseen class prototypes $T^u$ are used in model learning.

Definition 1.4 (Class-Transductive Instance-Transductive (CTIT) Setting). Labeled training instances $D^{tr}$, seen class prototypes $T^s$, unlabeled testing instances $X^{te}$, and unseen class prototypes $T^u$ are used in model learning.
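The three settings differ only in which information is available during model learning; a toy summary (the names below are illustrative shorthand, not from the survey):

```python
# Which information each learning setting may use during model learning,
# per Definitions 1.2-1.4.
SETTINGS = {
    "CIII": {"D_tr", "T_s"},                  # fully inductive
    "CTII": {"D_tr", "T_s", "T_u"},           # unseen prototypes available
    "CTIT": {"D_tr", "T_s", "T_u", "X_te"},   # plus unlabeled test instances
}

# The settings are nested: each is strictly more transductive than the last.
assert SETTINGS["CIII"] < SETTINGS["CTII"] < SETTINGS["CTIT"]
```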

From Fig. 1, we can see that the classifier $f^u(\cdot)$ is learned with increasingly specific information about the testing instances.

That is, from CIII to CTIT, the amount of information about the testing instances used in model learning gradually increases.

the performance of the model learned with the training instances will decrease when applied to the testing instances.
In zero-shot learning, this phenomenon is usually referred to as domain shift
(Yanwei Fu, Timothy M. Hospedales, Tao Xiang, and Shaogang Gong. 2015. Transductive multi-view zero-shot learning. IEEE Transactions on Pattern Analysis and Machine Intelligence 37, 11 (2015), 2332–2345.)

1.3 Contributions and Article Organization

Based on how the feature space and the semantic space are related, this article categorises the zero-shot learning methods into three categories.

In existing articles, the emphasis has been on the evaluation of zero-shot learning methods; a comprehensive survey of zero-shot learning that covers a systematic categorisation of learning settings, methods, semantic spaces, and applications is still needed.

Our contributions

  1. As shown in Figure 2(b), we provide a hierarchical categorisation of existing methods in zero-shot learning.
  2. We provide a formal classification and definition of different learning settings in zero-shot learning.
  3. As shown in Figure 2(a), we provide a categorisation of existing semantic spaces in zero-shot learning.

Article organization

2 SEMANTIC SPACES

According to how a semantic space is constructed, semantic spaces can be divided as follows:

2.1 Engineered Semantic Spaces

Attribute spaces

Attribute spaces are constructed from a set of attributes.
In an attribute space, a list of terms describing various properties of the classes is defined as attributes.

Each attribute is usually a word or a phrase corresponding to one property of these classes.

These attributes are then used to form the semantic space, with each dimension being one attribute.

For each class, the value of each dimension of the corresponding prototype is determined by whether this class has the corresponding attribute.

The prototypes of all classes have the same dimensionality; the value of each dimension is determined by whether the class has the corresponding attribute. For example, for the attribute "fur is red", a white rabbit lacks this attribute, so the corresponding dimension of its prototype would be 0.


In this case the attribute values are binary (i.e., 0/1), and the resulting attribute space is referred to as a binary attribute space.

there also exist relative attribute spaces, which measure the relative degree of having an attribute among different classes.
(Devi Parikh and Kristen Grauman. 2011. Relative attributes. In Proceedings of the IEEE International Conference on Computer Vision (ICCV’11). 503–510)
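A minimal sketch of a binary attribute space, with made-up attributes and classes:

```python
# Each dimension is one attribute; a prototype entry is 1 if the class
# has that attribute, 0 otherwise.
ATTRIBUTES = ["has_stripes", "has_fur", "lives_in_water"]

CLASS_ATTRIBUTES = {
    "zebra":   {"has_stripes", "has_fur"},
    "dolphin": {"lives_in_water"},
}

def prototype(cls):
    """Binary attribute prototype for a class."""
    has = CLASS_ATTRIBUTES[cls]
    return [1 if a in has else 0 for a in ATTRIBUTES]

print(prototype("zebra"))    # [1, 1, 0]
```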

Lexical spaces

Lexical spaces are constructed by a set of lexical items.

Lexical spaces are based on the labels of the classes and datasets that can provide semantic information.

The lexical items can be adjectives of various kinds, likewise describing properties of the classes.

Text-keyword spaces

Text-keyword spaces are constructed by a set of keywords extracted from the text descriptions of each class.

Both the Plant Database and Plant Encyclopedia (which are specific to plants) are used to obtain the text descriptions for each flower class.

To achieve good fine-grained classification, relevant text descriptions are gathered from various websites.

In zero-shot video event detection, the text descriptions of the events can be obtained from the event kits provided in the dataset.

After obtaining the text descriptions for each class, the next step is to construct the semantic space and generate class prototypes from these descriptions, with each dimension corresponding to a keyword.
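A toy sketch of building a keyword prototype from a class's text description (the keywords and description below are invented; real systems extract keywords from sources such as the Plant Database):

```python
# Each dimension is a keyword; a class prototype records how often the
# keyword appears in that class's text description.
KEYWORDS = ["petal", "stem", "yellow", "climbing"]

def keyword_prototype(description):
    """Keyword-occurrence vector for one class's text description."""
    words = description.lower().split()
    return [words.count(k) for k in KEYWORDS]

desc = "a yellow flower with a long stem and broad yellow petal"
print(keyword_prototype(desc))   # [1, 1, 2, 0]
```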

Summary of engineered semantic spaces

The advantage of engineered semantic spaces:
the flexibility to encode human domain knowledge through the construction of the semantic space and class prototypes.

The disadvantage of engineered semantic spaces:
the heavy reliance on humans to perform the semantic space and class prototype engineering.

2.2 Learned Semantic Spaces

In learned semantic spaces, the class prototypes are the output of machine-learning models; the semantic information is contained in the whole prototype rather than in any individual dimension.

Label-embedding spaces

In label-embedding spaces, the class prototypes are obtained through the embedding of class labels.

In word embedding:
words or phrases are embedded into a real number space as vectors.
In this space, semantically similar words or phrases are embedded as nearby vectors.

In zero-shot learning, the class label of each class is a word or a phrase.

In addition to generating one prototype for each class, there are also works [103, 125] that generate more than one prototype per class in the label-embedding space.
In these works, the prototypes of a class are usually multiple vectors following a Gaussian distribution.
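A toy sketch of nearest-prototype classification in a label-embedding space (the hand-made 3-d vectors below stand in for real word embeddings such as word2vec or GloVe):

```python
import math

# Hypothetical label embeddings for two unseen classes.
PROTOTYPES = {
    "zebra": [0.9, 0.1, 0.2],
    "fox":   [0.1, 0.8, 0.3],
}

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.hypot(*u) * math.hypot(*v))

def classify(x):
    """Assign x to the unseen class with the most similar prototype."""
    return max(PROTOTYPES, key=lambda c: cosine(x, PROTOTYPES[c]))

print(classify([0.85, 0.15, 0.2]))   # zebra
```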

Text-embedding spaces

In text-embedding spaces, the class prototypes are obtained by embedding the text descriptions of each class (similar to text-keyword spaces).

The major difference between the two: a text-keyword space is constructed by extracting keywords and using each of them as one dimension of the constructed space, whereas a text-embedding space is constructed through learning models.

Image-representation spaces

In image-representation spaces, the class prototypes are obtained from images belonging to each class (similar to text-embedding spaces).

Summary of learned semantic spaces

The advantage:
the process of generating learned semantic spaces is relatively less labor-intensive, and the generated spaces contain information that can be easily overlooked by humans.

The disadvantage:
the class prototypes are obtained from machine-learning models, so the semantics of each dimension are implicit.

3 METHODS


For a zero-shot learning task, we consider one semantic space and represent each class with one prototype in that space.

3.1 Classifier-Based Methods

Existing classifier-based methods usually take a one-versus-rest solution for learning the multiclass zero-shot classifier $f^u(\cdot)$.

That is, for each unseen class $c_i^u$, a binary one-versus-rest classifier is learned (deciding whether an instance belongs to that class or not).

Denote $f_i^u(\cdot): \mathbb{R}^D \rightarrow \{0, 1\}$ as the binary one-versus-rest classifier for class $c_i^u \in U$.
The eventual zero-shot classifier $f^u(\cdot)$ for the unseen classes consists of $N_u$ binary one-versus-rest classifiers $\{f_i^u(\cdot) \mid i = 1, 2, \dots, N_u\}$.
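A minimal sketch of composing the binary classifiers into the multiclass classifier (the toy distance-based scorers below are stand-ins for learned one-versus-rest classifiers):

```python
# Compose N_u binary one-versus-rest scorers into the multiclass
# zero-shot classifier f^u: pick the class whose scorer fires strongest.
def make_one_vs_rest(prototype):
    """Toy binary scorer: negative squared distance to the prototype."""
    def score(x):
        return -sum((a - b) ** 2 for a, b in zip(x, prototype))
    return score

binary_classifiers = {
    "zebra": make_one_vs_rest([1.0, 0.0]),
    "fox":   make_one_vs_rest([0.0, 1.0]),
}

def f_u(x):
    """Multiclass zero-shot classifier built from the binary scorers."""
    return max(binary_classifiers, key=lambda c: binary_classifiers[c](x))

print(f_u([0.9, 0.2]))   # zebra
```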

3.1.1 Correspondence Methods
