Table of Contents

  • Overview
  • What is meta learning?
  • Why meta learning?
  • How and what to do meta learning?
    • Categories
      • Model name glossary
      • Taxonomy
  • Datasets
  • Models
    • Taxonomy
    • Black-box
    • Optimization / Gradient based
      • Problems of MAML
      • Other improvements to MAML
    • Metric-based / non-parametric
      • Problems of metric-based
    • Hybrid
    • Bayesian meta-learning

BY: TA 陈建层
Besides machine learning,
"ML" now has a second meaning: Meta Learning.
This lecture covers a lot of ground, so these notes only record an outline.

Overview

Outline
● What is meta learning?
● Why meta learning?
● How and what to do meta learning?
● Categories
● Datasets
● Models
For formula input, refer to: online LaTeX formulas

What is meta learning?

This was covered in the previous lecture, so no need to repeat it at length:
meta learning is simply learning to learn.
It is usually used to achieve few-shot learning (but is not limited to that).

Why meta learning?

  1. Too many tasks to learn → we want to learn more efficiently
    ○ Faster learning methods (adaptation)
    ○ Better hyper-parameters / learning algorithms
    ○ Related to:
    ■ transfer learning
    ■ domain adaptation
    ■ multi-task learning
    ■ life-long learning
  2. Too little data → we want to fit more accurately
    (a better learner that fits more quickly)
    ○ Traditional supervised learning may not work

How and what to do meta learning?

Categories

Model name glossary

● MAML (Model Agnostic Meta-Learning)
● Reptile (???) — presumably named because the training trajectory crawls along like a reptile
● SNAIL (Simple Neural AttentIve Learner)
● PLATIPUS (Probabilistic LATent model for Incorporating Priors and Uncertainty in few-Shot learning) — platypus
● LLAMA (Lightweight Laplace Approximation for Meta-Adaptation) — llama
● ALPaCA (Adaptive Learning for Probabilistic Connectionist Architectures) — alpaca
● CAML (Conditional class-Aware Meta Learning) — camel
● LEO (Latent Embedding Optimization) — lion (Latin)
● LEOPARD (Learning to generate softmax parameters for diverse classification) — leopard
● CAVIA (Context Adaptation via meta-learning) (not CAML) — guinea pig
● R2-D2 (Ridge Regression Differentiable Discriminator) — the droid

Taxonomy

The many models above can be grouped by what they learn:

  1. Model Parameters (suitable for Few-shot framework)
    ○ Initializations
    ○ Embeddings / Representations / Metrics
    ○ Optimizers
    ○ Reinforcement learning (Policies / other settings)
  2. Hyperparameters (e.g. AutoML) — automatic hyperparameter tuning
    (beyond the scope of today, but can be viewed as a kind of meta learning)
    ○ Hyperparameter search ((training) settings) — see Prof. Lee's hyperparameter-tuning lectures
    ○ Network architectures → Network architecture search (NAS)
    (related to: evolutionary strategies, genetic algorithms…)
  3. Others
    ○ The algorithm itself (literally, not a network)…… (more in DLHLP)

Datasets

A quick look at the datasets commonly used in meta learning:

  1. Omniglot (omni = all, glot = language)
    ○ Launched by linguist Simon Ager in 1998
    ○ Introduced as a dataset by Lake et al. (2015, Science)
    ○ Concept learning
    Besides real-world scripts, the dataset also contains invented, fictional scripts;
    one could even imagine adding scripts from anime (figures omitted).

  2. miniImageNet
    ○ a few-shot benchmark built from ImageNet

  3. CUB (Caltech-UCSD Birds)
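
To make the few-shot (N-way K-shot) setting on these datasets concrete, here is a minimal sketch of episode sampling. It is plain Python; the `dataset` is assumed to be a dict mapping class names to lists of examples, which is an illustrative assumption rather than how the lecture loads the data.

```python
import random

def sample_episode(dataset, n_way=5, k_shot=1, q_query=15):
    """Sample one N-way K-shot episode: a support set for adaptation
    and a query set for evaluating the adapted model."""
    classes = random.sample(list(dataset.keys()), n_way)
    support, query = [], []
    for label, cls in enumerate(classes):
        examples = random.sample(dataset[cls], k_shot + q_query)
        support += [(x, label) for x in examples[:k_shot]]
        query += [(x, label) for x in examples[k_shot:]]
    return support, query
```

Each meta-training iteration then treats one such episode as a "training example" for the meta-learner.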

Models

Taxonomy

In the lecture's color-coded figure, Meta-LSTM can be seen as a fusion of the yellow and green categories.

There are also some other hybrid approaches:

The four color-coded categories are briefly explained below.

Black-box

The idea is that every task corresponds to its own $f_\theta$.
Treating tasks as data, we feed them into an RNN and hope the RNN can predict the parameters for a new task.
Some variants use an LSTM instead, together with an attention mechanism.
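
A minimal PyTorch sketch of this black-box idea (the class name, dimensions, and the way the task summary is combined with the query are illustrative assumptions, not taken from the lecture): the support set is read sequentially by an LSTM, and its final hidden state conditions the prediction for a query example.

```python
import torch
import torch.nn as nn

class BlackBoxMetaLearner(nn.Module):
    """Hypothetical sketch: an LSTM reads the support set (x, y) pairs and
    its final hidden state acts as a task summary for classifying queries."""
    def __init__(self, x_dim, n_classes, hidden=128):
        super().__init__()
        self.n_classes = n_classes
        self.encoder = nn.LSTM(x_dim + n_classes, hidden, batch_first=True)
        self.head = nn.Linear(hidden + x_dim, n_classes)

    def forward(self, support_x, support_y, query_x):
        # support_x: (B, N*K, x_dim); support_y: (B, N*K) integer labels
        y_onehot = nn.functional.one_hot(support_y, self.n_classes).float()
        _, (h, _) = self.encoder(torch.cat([support_x, y_onehot], dim=-1))
        task_state = h[-1]                      # (B, hidden): task summary
        # query_x: (B, x_dim) -> class logits conditioned on the task summary
        return self.head(torch.cat([task_state, query_x], dim=-1))
```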

Optimization / Gradient based

Learn model initialization — models that learn the initialization:
● MAML (Model Agnostic Meta Learning)
● Reptile
● Meta-LSTM (can also be viewed as an RNN black-box)
Improvements of MAML — models that improve on MAML:
● Meta-SGD
● MAML++
● AlphaMAML
● DEML
● CAVIA
Different meta-parameters — models that learn other meta-parameters:
● iMAML
● R2-D2 / LR-D2
● ALPaCA
● MetaOptNet
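
As background for the problems discussed next, here is a minimal sketch of the MAML inner/outer loop. It assumes PyTorch ≥ 2.0 (for torch.func.functional_call); the model, loss function, and task-batch format are placeholders rather than the lecture's implementation.

```python
import torch
from torch.func import functional_call

def maml_meta_step(model, tasks, loss_fn, meta_opt, inner_lr=0.01):
    """One MAML meta-update over a batch of tasks.

    For each task: take one gradient step on the support set (inner loop),
    then evaluate the adapted parameters on the query set. The summed query
    losses are backpropagated through the inner step (second-order by default).
    """
    params = dict(model.named_parameters())
    meta_loss = 0.0
    for support_x, support_y, query_x, query_y in tasks:
        # Inner loop: adapt a copy of the shared initialization on the support set.
        support_loss = loss_fn(model(support_x), support_y)
        grads = torch.autograd.grad(support_loss, tuple(params.values()),
                                    create_graph=True)   # keep 2nd-order terms
        adapted = {name: p - inner_lr * g
                   for (name, p), g in zip(params.items(), grads)}
        # Outer loss: how well the adapted parameters do on the query set.
        query_logits = functional_call(model, adapted, (query_x,))
        meta_loss = meta_loss + loss_fn(query_logits, query_y)
    meta_opt.zero_grad()
    meta_loss.backward()    # gradient w.r.t. the shared initialization
    meta_opt.step()
    return meta_loss.item()
```

Passing create_graph=False instead treats the inner gradients as constants, which corresponds to the first-order approximation discussed below.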

Problems of MAML

● Learning rate → Meta-SGD, MAML++: using one shared learning rate for every parameter $\theta$ and every task is not ideal.
Meta-SGD is an "adaptive learning rate" version of MAML: it meta-learns an extra parameter $\alpha$ so that different parameters get their own learning rates.
● Second-order derivatives (instability) → MAML++: the second-order partial derivatives dropped in MAML's first-order approximation make the result less accurate.
● Batch Normalization → MAML++: handle batch normalization properly during training.
These issues give MAML the following problems:

  1. Training Instability — the outer-loop gradients are unstable and prone to exploding or vanishing
    ○ Gradient issues
  2. Second Order Derivative Cost
    ○ Expensive to compute
    ○ First-order → harmful to performance
  3. Batch Normalization Statistics
    ○ No accumulation
    ○ Shared bias
  4. Shared (across step and across parameter) inner loop learning rate
    ○ Not well scaled
  5. Fixed outer loop learning rate

Solutions proposed

  1. Training Instability ⇒ Multi-Step Loss Optimization (MSL): take the query-set loss after every inner-loop update step and minimize their weighted sum, rather than only the loss after the final step (the author asks: isn't this a bit like Reptile?); see the weighting sketch after this list
    ○ Gradient issues
  2. Second Order Derivative Cost ⇒ Derivative-Order Annealing (DA): ignore the second-order terms (first-order updates) in the early phase of training, then switch to full second-order derivatives later
    ○ Expensive to compute
    ○ First-order → harmful to performance
  3. Batch Normalization Statistics
    ○ No accumulation ⇒ Per-Step Batch Normalization Running Statistics
    ○ Shared bias ⇒ Per-Step Batch Normalization Weights & Biases
  4. Shared (across step and across parameter) inner loop learning rate
    ⇒ Learning Per-Layer Per-Step Learning Rates & Gradient Directions (LSLR)
  5. Fixed outer loop learning rate
    ⇒ Cosine Annealing of Meta-Optimizer Learning Rate (CA)
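
For item 1, a minimal sketch of how the per-step loss weights might be annealed (the function name and the exact schedule are assumptions; MAML++ defines its own annealing schedule):

```python
def multi_step_loss(per_step_query_losses, epoch, anneal_epochs=100):
    """MSL sketch: weight the query loss obtained after every inner step.

    Early in training every step contributes equally; as training proceeds,
    the weight shifts entirely onto the loss after the final inner step.
    """
    n = len(per_step_query_losses)
    decay = min(epoch / anneal_epochs, 1.0)
    weights = [(1.0 - decay) / n] * (n - 1) + [(1.0 - decay) / n + decay]
    return sum(w * l for w, l in zip(weights, per_step_query_losses))
```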

Other improvements to MAML

There are also two models that learn the initialization with architectures quite different from MAML's:
● Implicit gradients → iMAML
(In the lecture figure: left is the original MAML, middle is MAML with the second-order derivatives ignored, right is iMAML.)

● Closed-form on feature extraction → R2-D2: replace the final FC classification layer of the CNN with an L2-regularized (ridge regression) classifier that can be solved in closed form.
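
A minimal sketch of such a closed-form ridge-regression head in PyTorch (illustrative; the actual R2-D2 solves an equivalent dual form via the Woodbury identity for efficiency):

```python
import torch

def ridge_head(support_feat, support_onehot, query_feat, lam=1.0):
    """Per-task closed-form classifier on top of the shared feature extractor.

    support_feat: (n_support, d), support_onehot: (n_support, n_classes).
    The solution is differentiable w.r.t. the features, so the feature
    extractor can still be meta-trained end to end.
    """
    d = support_feat.shape[1]
    gram = support_feat.T @ support_feat + lam * torch.eye(d, dtype=support_feat.dtype)
    # W = (X^T X + lam I)^{-1} X^T Y
    w = torch.linalg.solve(gram, support_feat.T @ support_onehot)
    return query_feat @ w    # (n_query, n_classes) class scores
```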

Metric-based / non-parametric

Learn to compare!
The earlier view of meta learning was to learn an F that selects a suitable f for each task; that task-specific f has parameters $\hat\theta$.

For the classification model above we can switch perspective: instead of learning an f to decide whether the test example is a cat or a dog,
we directly measure how similar the test example is to the labelled cats and dogs on the left and assign it to whichever class it resembles most. The architecture then becomes the one shown below.

The model then simply extracts features, turning every example into a vector, and similarity is measured with k-NN, L2 distance, and so on.
常见模型:
• Siamese network

• Prototypical network: each class has a known prototype representation; extracted query features are compared with the prototype vectors (a minimal sketch appears after this list)


• Matching network: on top of the model above, it also takes the relationships between the different classes into account, using a BiLSTM to encode them

• Relation network

Two further approaches:

  1. IMP (Infinite Mixture Prototypes)
    • Modified from prototypical networks
    • The number of mixture components is determined from the data via Bayesian nonparametric methods
  2. GNN
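
A minimal PyTorch sketch of the prototypical-network classification step (function and variable names are illustrative; the embedding network itself is assumed to exist):

```python
import torch

def prototypical_logits(support_emb, support_y, query_emb, n_way):
    """Each class prototype is the mean embedding of its support examples;
    query examples are scored by negative squared Euclidean distance."""
    prototypes = torch.stack([support_emb[support_y == c].mean(dim=0)
                              for c in range(n_way)])        # (n_way, d)
    dists = torch.cdist(query_emb, prototypes) ** 2          # (n_query, n_way)
    return -dists    # a softmax over these gives class probabilities
```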

Problems of metric-based

• When the K in N-way K-shot is large → difficult to scale (hard to handle when the support set per class gets large)
• Limited to classification (only learning to compare)

Hybrid

Optimization-based model + metric-based embedding (RelationNet z)

Here an encoder and a decoder are used: a training task is mapped by the encoder to a latent code z, and the decoder maps z back to the task-specific parameters $\theta$.
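
A very loose sketch of this encoder/decoder idea (all module names and shapes are assumptions; the actual method, e.g. LEO, uses a relation-network encoder and a probabilistic decoder, and adapts the latent code z with gradient steps rather than the weights directly):

```python
import torch
import torch.nn as nn

class LatentParamGenerator(nn.Module):
    """Encode the support set into a per-class latent code z, then decode z
    into task-specific classifier weights (adaptation would act on z)."""
    def __init__(self, emb_dim, z_dim=64):
        super().__init__()
        self.encoder = nn.Linear(emb_dim, z_dim)
        self.decoder = nn.Linear(z_dim, emb_dim)   # one weight vector per class

    def forward(self, support_emb, support_y, n_way):
        z = torch.stack([self.encoder(support_emb[support_y == c]).mean(dim=0)
                         for c in range(n_way)])      # (n_way, z_dim)
        class_weights = self.decoder(z)               # (n_way, emb_dim)
        return class_weights   # query logits: query_emb @ class_weights.T
```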

Bayesian meta-learning

One extra model: Bayesian meta-learning.
PS: the slides even feature Lucious Lion from the TV series Empire.

On the left the data have three features; if, as in the figure below, only two features are available, how should a new point be classified?


Models that currently address this uncertainty problem include:
Black-box:
• VERSA
Optimization:
• PLATIPUS
• Bayesian MAML (BMAML)
• Probabilistic MAML (PMAML)
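
A minimal sketch of the common ingredient in these methods (names are illustrative; `decoder` stands for any network applied with sampled task parameters): instead of a single adapted parameter vector, keep a distribution over it, sample several hypotheses, and average their predictions to expose uncertainty.

```python
import torch

def predict_with_uncertainty(mu, log_sigma, decoder, query_x, n_samples=10):
    """Sample task parameters theta ~ N(mu, sigma^2) with the
    reparameterization trick and average the resulting predictions."""
    preds = []
    for _ in range(n_samples):
        theta = mu + torch.exp(log_sigma) * torch.randn_like(mu)
        preds.append(torch.softmax(decoder(query_x, theta), dim=-1))
    return torch.stack(preds).mean(dim=0)   # averaged class probabilities
```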
