P.S. Brief notes from the ICML 2018 paper seminar.

2018.7.23

Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations. http://proceedings.mlr.press/v80/wang18d/wang18d.pdf
- Zero-sum games (which also inspired GANs) and inverse reinforcement learning
Learning to Explore via Meta-Policy Gradient. http://proceedings.mlr.press/v80/xu18d/xu18d.pdf
- Meta-policy gradient
QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning. http://proceedings.mlr.press/v80/rashid18a/rashid18a.pdf
- Value function factorization
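
To make the "monotonic value function factorisation" in the QMIX title concrete, here is a minimal sketch of a mixing function that combines per-agent Q-values into a joint value using only non-negative weights. It is a simplification under stated assumptions: the weights are fixed random numbers rather than being produced from the global state by hypernetworks, ReLU replaces the paper's activation, and all names (`monotonic_mix`, `w1`, `b1`, `w2`, `b2`) are hypothetical.

```python
import random

def monotonic_mix(agent_qs, w1, b1, w2, b2):
    """Mix per-agent Q-values into one joint value.

    Taking abs() of every weight keeps the output non-decreasing in each
    agent's Q-value; the biases are left unconstrained.
    """
    hidden = []
    for j in range(len(b1)):
        pre = sum(abs(w1[i][j]) * q for i, q in enumerate(agent_qs)) + b1[j]
        hidden.append(max(pre, 0.0))  # ReLU is itself monotone
    return sum(abs(w) * h for w, h in zip(w2, hidden)) + b2

# Toy setup (assumption): 3 agents, 4 hidden units, fixed random weights.
rng = random.Random(0)
n_agents, n_hidden = 3, 4
w1 = [[rng.gauss(0, 1) for _ in range(n_hidden)] for _ in range(n_agents)]
b1 = [rng.gauss(0, 1) for _ in range(n_hidden)]
w2 = [rng.gauss(0, 1) for _ in range(n_hidden)]
b2 = rng.gauss(0, 1)

qs = [0.1, -0.5, 2.0]
q_tot = monotonic_mix(qs, w1, b1, w2, b2)
```

Because the joint value never decreases when any single agent's Q-value increases, each agent can pick its own greedy action and the combination is still greedy for the joint value.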

2018.7.24

Ray: A Distributed Framework for Emerging AI Applications. Arxiv. https://arxiv.org/pdf/1712.05889.pdf
- Berkeley's distributed computing framework; the presenter did not clearly explain how to deploy it in a distributed setting
Katyusha X: Practical Momentum Method for Stochastic Sum-of-Nonconvex Optimization. ICML 2018. http://proceedings.mlr.press/v80/allen-zhu18a/allen-zhu18a.pdf
- Non-convex optimization
Self-Imitation Learning. ICML 2018. http://proceedings.mlr.press/v80/oh18b/oh18b.pdf
- Self-imitation learning

2018.7.27

Mix & Match - Agent Curricula for Reinforcement learning. ICML 2018. http://proceedings.mlr.press/v80/czarnecki18a/czarnecki18a.pdf
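- Transfer learning applied to reinforcement learning
- The larger k is, the more the model absorbs from the preceding models, and the higher the training complexity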
Self-Consistent Trajectory Autoencoder: Hierarchical Reinforcement Learning with Trajectory Embeddings. ICML 2018. http://proceedings.mlr.press/v80/co-reyes18a/co-reyes18a.pdf
- Similar to a VAE, applied to hierarchical reinforcement learning
State Abstractions for Lifelong Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/abel18a/abel18a.pdf
- Lifelong reinforcement learning amounts to making knowledge transferable across tasks

2018.7.28

Efficient Neural Architecture Search via Parameter Sharing. ICML 2018. http://proceedings.mlr.press/v80/pham18a/pham18a.pdf
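- Improves on NAS: for a given set of neural network modules it builds a DAG; the specific algorithm still needs further study
- Google Brain's insight is good, but the work is still weak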
Implicit Quantile Networks for Distributional Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/dabney18a/dabney18a.pdf
- The paper divides into a Quantile part and a Distributional part; the authors' two earlier papers are worth reading first

2018.7.29

Bayesian Optimization of Combinatorial Structures. ICML 2018. http://proceedings.mlr.press/v80/baptista18a/baptista18a.pdf
- Did not follow this one; it only makes partial progress
Visualizing and Understanding Atari Agents. ICML 2018. http://proceedings.mlr.press/v80/greydanus18a/greydanus18a.pdf
- Apply a Gaussian blur to one patch of the frame and observe how that region affects the Q-value
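
The blur-and-compare idea in the note above can be sketched as follows. This is a toy version under stated assumptions, not the paper's code: the frame is a small grid of floats, `toy_value` is a hypothetical stand-in for the agent's Q-value, the blurred copy is blended in with a Gaussian mask around each location, and the score is half the squared change in the value.

```python
import math

def gaussian_kernel(sigma, radius):
    weights = [math.exp(-(k * k) / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]

def gaussian_blur(img, sigma=1.0, radius=2):
    """Separable Gaussian blur with clamped borders."""
    h, w = len(img), len(img[0])
    kernel = gaussian_kernel(sigma, radius)
    offsets = range(-radius, radius + 1)
    clamp = lambda v, size: max(0, min(v, size - 1))
    horiz = [[sum(kernel[d + radius] * img[y][clamp(x + d, w)] for d in offsets)
              for x in range(w)] for y in range(h)]
    return [[sum(kernel[d + radius] * horiz[clamp(y + d, h)][x] for d in offsets)
             for x in range(w)] for y in range(h)]

def perturb(img, blurred, cy, cx, mask_sigma=1.5):
    """Blend the blurred image into the original around (cy, cx)."""
    out = []
    for y, row in enumerate(img):
        new_row = []
        for x, v in enumerate(row):
            m = math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * mask_sigma ** 2))
            new_row.append((1.0 - m) * v + m * blurred[y][x])
        out.append(new_row)
    return out

def saliency_map(img, value_fn):
    """Score each location by how much blurring it changes value_fn."""
    blurred = gaussian_blur(img)
    base = value_fn(img)
    return [[0.5 * (base - value_fn(perturb(img, blurred, y, x))) ** 2
             for x in range(len(img[0]))] for y in range(len(img))]

# Toy example (assumption): the "Q-value" only looks at the centre pixel.
frame = [[0.0] * 7 for _ in range(7)]
frame[3][3] = 1.0
toy_value = lambda img: img[3][3]
sal = saliency_map(frame, toy_value)
```

In this toy case the saliency peaks at the centre pixel, the only location the value function reads.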
Policy Optimization with Demonstrations. ICML 2018. http://proceedings.mlr.press/v80/kang18a/kang18a.pdf
- Did not pay much attention to this talk

2018.7.30

Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents. ICML 2018. http://proceedings.mlr.press/v80/zhang18n/zhang18n.pdf
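- Communication among agents over randomly selected subgraphs; the theory was already worked out beforehand, and the experiments are simple in design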
Structured Evolution with Compact Architectures for Scalable Policy Optimization. ICML 2018. http://proceedings.mlr.press/v80/choromanski18a/choromanski18a.pdf
- From Google Brain; covered a lot of matrix concepts; the theory was not explained clearly, but the experiments are thorough
Using Reward Machines for High-Level Task Specification and Decomposition in Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/icarte18a/icarte18a.pdf
- With a state machine, a single interaction with the environment is enough to compute the rewards for multiple tasks
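
A minimal sketch of that idea: each task gets its own small state machine over high-level events, and one recorded trajectory of events is replayed through every machine to obtain per-task rewards. The class, the event labels, and the two example tasks are hypothetical, and this does not reproduce the paper's learning algorithm.

```python
class RewardMachine:
    """Finite-state machine mapping (state, event) to (next_state, reward)."""

    def __init__(self, transitions, initial="u0"):
        self.transitions = transitions
        self.initial = initial

    def step(self, state, event):
        # Unlisted (state, event) pairs leave the state unchanged with reward 0.
        return self.transitions.get((state, event), (state, 0.0))

def relabel(events, machines):
    """Replay one event trajectory through every task's machine."""
    rewards = {}
    for name, rm in machines.items():
        state, task_rewards = rm.initial, []
        for event in events:
            state, r = rm.step(state, event)
            task_rewards.append(r)
        rewards[name] = task_rewards
    return rewards

# Two toy tasks (assumption): fetch an item, then reach the office.
machines = {
    "deliver_coffee": RewardMachine({("u0", "coffee"): ("u1", 0.0),
                                     ("u1", "office"): ("u2", 1.0)}),
    "deliver_mail": RewardMachine({("u0", "mail"): ("u1", 0.0),
                                   ("u1", "office"): ("u2", 1.0)}),
}
events = ["coffee", None, "office"]  # events observed in one episode
rewards = relabel(events, machines)
```

The same three environment steps yield a reward of 1 for the coffee task and 0 for the mail task, so one rollout provides a learning signal for both.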

2018.7.31

Essentially No Barriers in Neural Network Energy Landscape. ICML 2018. http://proceedings.mlr.press/v80/draxler18a/draxler18a.pdf
- Paths connecting local minima
Time Limits in Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/pardo18a/pardo18a.pdf
- Takes the finite number of steps per episode (the time limit) into account
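
Two common ways of handling a time limit can be sketched as below: appending the remaining time to the observation, and continuing to bootstrap when an episode ends only because the step budget ran out. This is a hedged illustration; the function names and the `done` / `timed_out` flags are assumptions, not an API from the paper.

```python
def time_aware_observation(obs, t, time_limit):
    """Append the fraction of time remaining to the observation."""
    return list(obs) + [1.0 - t / time_limit]

def td_target(reward, next_value, done, timed_out, gamma=0.99):
    """One-step TD target that treats timeouts differently from real ends.

    done:      the episode ended for any reason
    timed_out: the episode ended only because the step budget ran out
    """
    if done and not timed_out:
        return reward  # genuine terminal state: nothing to bootstrap from
    return reward + gamma * next_value  # ongoing, or cut off by the limit

obs_t = time_aware_observation([0.5, 0.2], t=25, time_limit=100)
target_terminal = td_target(1.0, 10.0, done=True, timed_out=False, gamma=0.9)
target_timeout = td_target(1.0, 10.0, done=True, timed_out=True, gamma=0.9)
```

Treating a timeout as a genuine terminal state would wrongly teach the value function that nothing follows the cut-off state.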
Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. ICML 2018. http://proceedings.mlr.press/v80/athalye18a/athalye18a.pdf
- Best paper; the discussion covered which of the 7 attacked defenses were not broken

2018.8.1

Learning with Abandonment. ICML 2018. http://proceedings.mlr.press/v80/schmit18a/schmit18a.pdf
- Uses reinforcement learning in a recommender system, with a user tolerance threshold theta
Latent Space Policies for Hierarchical Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/haarnoja18a/haarnoja18a.pdf
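- Hierarchical reinforcement learning mainly addresses sparse rewards or complex settings
- The hierarchical reinforcement learning in this paper does not really match what the title suggests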
Coordinated Exploration in Concurrent Reinforcement Learning. ICML 2018. http://proceedings.mlr.press/v80/dimakopoulou18a/dimakopoulou18a.pdf
- Proposes seed sampling and compares it with UCB and Thompson sampling; did not clearly explain how the concurrent agents coordinate

2018.8.2

Clipped Action Policy Gradient. ICML 2018. http://proceedings.mlr.press/v80/fujita18a/fujita18a.pdf
- Clips the action at alpha and beta when computing the policy gradient; the estimator remains unbiased
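
A minimal sketch of the quantity involved, assuming a one-dimensional Gaussian policy whose sampled action is clipped to [alpha, beta] before it is executed: every sample at or below alpha has the same effect as alpha itself, so its log-probability can be replaced by the log of the total mass below alpha, and likewise above beta. The function names are hypothetical; the policy gradient would be the derivative of `clipped_logprob` with respect to the policy parameters.

```python
import math

def normal_logpdf(a, mu, sigma):
    z = (a - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)

def normal_cdf(x, mu, sigma):
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))

def clipped_logprob(a, mu, sigma, alpha, beta):
    """Log-probability of the action actually executed after clipping."""
    if a <= alpha:
        # All samples below alpha are executed as alpha.
        return math.log(normal_cdf(alpha, mu, sigma))
    if a >= beta:
        # All samples above beta are executed as beta.
        return math.log(1.0 - normal_cdf(beta, mu, sigma))
    return normal_logpdf(a, mu, sigma)

low_far = clipped_logprob(-5.0, 0.0, 1.0, -1.0, 1.0)
low_near = clipped_logprob(-2.0, 0.0, 1.0, -1.0, 1.0)
inside = clipped_logprob(0.3, 0.0, 1.0, -1.0, 1.0)
```

Samples at -5 and -2 get the same value here, which removes the variation that the unclipped log-density would add for actions the environment cannot tell apart.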
An Inference-Based Policy Gradient Method for Learning Options. ICML 2018. http://proceedings.mlr.press/v80/smith18a/smith18a.pdf
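- A paper in the hierarchical reinforcement learning area
- The algorithm is similar to A Laplacian Framework for Option Discovery in Reinforcement Learning (ICML 2017), and the experiments include a comparison with it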

2018.8.3

Universal Planning Networks: Learning Generalizable Representations for Visuomotor Control. ICML 2018. http://proceedings.mlr.press/v80/srinivas18b/srinivas18b.pdf
- Introduces an abstraction of the state and builds a model-based component on it, combining model-based and model-free learning
Investigating Human Priors for Playing Video Games. ICML 2018. http://proceedings.mlr.press/v80/dubey18a/dubey18a.pdf

2018.8.4

Obfuscated Gradients Give a False Sense of Security: Circumventing Defenses to Adversarial Examples. ICML 2018. http://proceedings.mlr.press/v80/athalye18a/athalye18a.pdf
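- ICML 2018 best paper
- The earlier gradient-based defenses from ICLR fall into three main types: shattered gradients, stochastic gradients, and gradients that explode or vanish after many iterations. Types one and three are attacked by substituting a differentiable approximation for the problematic part; type two is attacked with Expectation over Transformation (EOT)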
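
The two attack ideas in the note above can be illustrated on a scalar toy problem. This is a sketch under stated assumptions, not the paper's code: the "defense" is either a quantization step (whose true gradient is zero almost everywhere) or additive random noise, the loss is a squared distance to a target, and all function names are hypothetical.

```python
import random

def quantize(x, levels=4):
    """Non-differentiable preprocessing: gradient is zero almost everywhere."""
    return round(x * levels) / levels

def loss_grad(y, target):
    """Derivative of (y - target)**2 with respect to y."""
    return 2.0 * (y - target)

def bpda_grad(x, target):
    """Forward pass through quantize; backward pass pretends it is identity."""
    return loss_grad(quantize(x), target) * 1.0

def eot_grad(x, target, noise_std=0.5, samples=1000, rng=None):
    """Average the gradient over draws of the defense's randomness."""
    rng = rng or random.Random(0)
    total = 0.0
    for _ in range(samples):
        total += loss_grad(x + rng.gauss(0.0, noise_std), target)
    return total / samples

# Gradient descent through the quantizer using the approximate gradient.
x = 0.0
for _ in range(50):
    x -= 0.1 * bpda_grad(x, target=1.0)

noisy_grad = eot_grad(0.0, target=1.0)
```

With the identity substitute the optimizer still reaches an input that quantizes to the target, and averaging over the noise recovers a gradient close to the noise-free value of -2.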
Addressing Function Approximation Error in Actor-Critic Methods. ICML 2018. http://proceedings.mlr.press/v80/fujimoto18a/fujimoto18a.pdf
Efficient Model-Based Deep Reinforcement Learning with Variational State Tabulation. ICML 2018. http://proceedings.mlr.press/v80/corneil18a/corneil18a.pdf
- State abstraction, similar to a VAE: the state is abstracted and then reconstructed, and the loss proposed by the authors is minimized

To be continued...
