2017 A brief introduction to weakly supervised learning

Zhi-Hua Zhou (Nanjing University), National Science Review (IF 17.3), 2017 (Citations 815)

ABSTRACT

This article reviews research progress in weakly supervised learning, focusing on three typical types of weak supervision (see the Introduction for a more detailed explanation):

  • incomplete supervision, where only a subset of training data is given with labels;
  • inexact supervision, where the training data are given with only coarse-grained labels;
  • inaccurate supervision, where the given labels are not always ground-truth.

INTRODUCTION

Typically, there are three types of weak supervision.

  • incomplete supervision, i.e. only a (usually small) subset of training data is given with labels while the other data remain unlabeled.

For example, in image categorization the ground-truth labels are given by human annotators; it is easy to get a huge number of images from the Internet, whereas only a small subset of images can be annotated due to the human cost.

  • inexact supervision, i.e. only coarse-grained labels are given.

It is desirable to have every object in the images annotated; however, usually we only have image-level labels rather than object-level labels.

  • inaccurate supervision, i.e. the given labels are not always ground-truth.

Such a situation occurs, e.g. when the image annotator is careless or weary, or some images are difficult to categorize.

INCOMPLETE SUPERVISION

Incomplete supervision concerns the situation in which we are given a small amount of labeled data, which is insufficient to train a good learner, while abundant unlabeled data are available.

Formally, the task is to learn $f: \mathcal{X} \mapsto \mathcal{Y}$ from a training data set $D = \{(x_1, y_1), \ldots, (x_l, y_l), x_{l+1}, \ldots, x_m\}$.

There are two major techniques for this purpose:
$$\text{incomplete supervision}\left\{ \begin{aligned} &\text{active learning (with human intervention)}\\ &\text{semi-supervised learning (without human intervention)} \end{aligned} \right.$$

  • active learning;

Active learning assumes that there is an 'oracle', such as a human expert, that can be queried to get ground-truth labels for selected unlabeled instances.

$$\text{selection criteria of active learning}\left\{ \begin{aligned} &\text{informativeness}\\ &\text{representativeness} \end{aligned} \right.$$
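To make the informativeness criterion concrete, here is a minimal sketch of uncertainty sampling, one common informativeness-based strategy: query the unlabeled instances on which the current model is least certain. The function names, the toy model `toy_predict_proba`, and the pool values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import math

def entropy(probs):
    """Shannon entropy of a class-probability vector (higher = more uncertain)."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_queries(predict_proba, unlabeled, budget):
    """Return indices of the `budget` unlabeled instances with the highest
    predictive entropy; these are the ones we would send to the oracle."""
    scored = [(entropy(predict_proba(x)), i) for i, x in enumerate(unlabeled)]
    # Sort by decreasing entropy; break ties by original index.
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [i for _, i in scored[:budget]]

# Hypothetical current model: a logistic curve on a 1-D feature.
def toy_predict_proba(x):
    p = 1.0 / (1.0 + math.exp(-x))
    return [1.0 - p, p]

pool = [-3.0, -0.1, 2.5, 0.4, -1.2]   # unlabeled instances (toy data)
queried = select_queries(toy_predict_proba, pool, budget=2)
print(queried)  # indices of the instances closest to the decision boundary
```

In this toy setting the selected instances are the ones nearest the decision boundary at 0. A representativeness-based criterion would instead favour instances that reflect the structure of the unlabeled pool, for example cluster centres.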

  • semi-supervised learning.

In contrast, semi-supervised learning attempts to automatically exploit unlabeled data in addition to labeled data to improve learning performance, where no human intervention is assumed.

$$\text{semi-supervised learning}\left\{ \begin{aligned} &\text{(pure) semi-supervised learning}\\ &\text{transductive learning} \end{aligned} \right.$$

Actually, in semi-supervised learning there are two basic assumptions, i.e. the cluster assumption and the manifold assumption; both are about data distribution. The former assumes that data have inherent cluster structure, and thus, instances falling into the same cluster have the same class label. The latter assumes that data lie on a manifold, and thus, nearby instances have similar predictions. The essence of both assumptions lies in the belief that similar data points should have similar outputs, whereas unlabeled data can be helpful to disclose which data points are similar.

$$\text{four major categories of semi-supervised learning}\left\{ \begin{aligned} &\text{generative methods}\\ &\text{graph-based methods}\\ &\text{low-density separation methods}\\ &\text{disagreement-based methods} \end{aligned} \right.$$
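As an illustration of the idea that similar data points should have similar outputs, the following is a minimal label-propagation-style sketch in the spirit of graph-based methods. It assumes 1-D points, a fully connected graph with Gaussian similarity weights, and labels in {+1, -1}; the function name, the `sigma` parameter, and the toy data are all hypothetical, and this is not presented as any specific algorithm from the paper.

```python
import math

def propagate_labels(points, labels, sigma=1.0, iters=200):
    """points: list of floats; labels: dict {index: +1.0 or -1.0} for labeled points.
    Returns one real-valued score per point; its sign is the predicted class."""
    n = len(points)
    # Gaussian similarity between every pair of distinct points (graph edge weights).
    w = [[math.exp(-((points[i] - points[j]) ** 2) / (2 * sigma ** 2)) if i != j else 0.0
          for j in range(n)] for i in range(n)]
    f = [labels.get(i, 0.0) for i in range(n)]
    for _ in range(iters):
        new_f = []
        for i in range(n):
            if i in labels:
                new_f.append(labels[i])  # labeled points stay clamped to their label
            else:
                # Unlabeled point takes the weighted average of its neighbours' scores.
                new_f.append(sum(w[i][j] * f[j] for j in range(n)) / sum(w[i]))
        f = new_f
    return f

# Two well-separated groups; only one point in each group is labeled.
points = [0.0, 0.5, 1.0, 5.0, 5.5, 6.0]
labels = {0: 1.0, 5: -1.0}
scores = propagate_labels(points, labels)
predicted = [1 if s > 0 else -1 for s in scores]
print(predicted)
```

Here the unlabeled points inherit the label of the labeled point in their own group, because edges within a group carry far more weight than edges between groups. This is how unlabeled data help disclose which data points are similar.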

INEXACT SUPERVISION

Formally, the task is to learn $f: \mathcal{X} \mapsto \mathcal{Y}$ from a training data set $D = \{(X_1, y_1), \ldots, (X_m, y_m)\}$, where $X_i = \{x_{i1}, \ldots, x_{i m_i}\} \subseteq \mathcal{X}$ is called a bag, $x_{ij} \in \mathcal{X}$ ($j \in \{1, \ldots, m_i\}$) is an instance, $m_i$ is the number of instances in $X_i$, and $y_i \in \mathcal{Y} = \{Y, N\}$. $X_i$ is a positive bag, i.e. $y_i = Y$, if there exists $x_{ip}$ that is positive, while $p \in \{1, \ldots, m_i\}$ is unknown. The goal is to predict labels for unseen bags. This is called multi-instance learning.
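The bag-labeling rule in this definition can be sketched in a few lines: a bag is labeled Y exactly when at least one of its instances is positive. The instance-level classifier `toy_instance_classifier`, its 0.8 threshold, and the example bags are hypothetical stand-ins; in practice the instance-level predictor is what has to be learned from bag-level labels alone.

```python
def bag_label(bag, instance_is_positive):
    """Return 'Y' if any instance in the bag is positive, otherwise 'N'."""
    return "Y" if any(instance_is_positive(x) for x in bag) else "N"

def witness_indices(bag, instance_is_positive):
    """Indices p of the instances that make the bag positive (unknown at training time)."""
    return [p for p, x in enumerate(bag) if instance_is_positive(x)]

# Hypothetical instance-level classifier on a 1-D feature.
def toy_instance_classifier(x):
    return x > 0.8

bags = [
    [0.1, 0.3, 0.9],         # one positive instance  -> positive bag
    [0.2, 0.4],              # no positive instance   -> negative bag
    [0.85, 0.95, 0.1, 0.0],  # two positive instances -> positive bag
]
predictions = [bag_label(b, toy_instance_classifier) for b in bags]
print(predictions)
```

Note that bags may have different sizes $m_i$, and a positive bag may contain more than one positive instance; the definition only requires that at least one exists.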
