ICLR 2020 || Roundup of 106 Deep Reinforcement Learning Papers
Reported by the Deep Reinforcement Learning Lab
Reprinted from: EndtoEnd.ai
Editor: DeepRL
[Overview] This year's ICLR was held online. Among the submissions, a paper from DeepMind and Harvard researchers on controlling a virtual rat model with neural networks was a standout. Chinese scholars were involved in nearly 60% of the papers, Google again made a strong showing with more than 80 accepted papers, and research teams from China kept pace, with several full-score papers. ICLR 2020 received 2,594 submissions and accepted 687: 48 orals, 108 spotlights, and 531 posters, for an acceptance rate of 26.5%, slightly down from last year's 31.4%. Reinforcement learning has long been a hot topic among ICLR submissions; in recent years RL and deep RL have repeatedly set new records against human play in video games and board/card games, and the widely reported result in which Google researchers completed AI chip placement in six hours also used deep RL — its power should not be underestimated. This post lists 106 papers in the deep reinforcement learning area.
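The "Avg Score" and "Variance" columns in the table below are consistent with the mean and the population variance of each paper's per-reviewer scores, rounded to two decimals. A minimal sketch of that computation (the score lists come from the table; the function name is my own):

```python
def score_stats(scores):
    """Return (mean, population variance) of a list of reviewer scores,
    each rounded to two decimals — matching the table's columns."""
    mean = sum(scores) / len(scores)
    # Population variance: divide by n, not n - 1.
    var = sum((s - mean) ** 2 for s in scores) / len(scores)
    return round(mean, 2), round(var, 2)

print(score_stats([8, 8, 6]))  # (7.33, 0.89) — the rank-2 entries
print(score_stats([3, 8, 8]))  # (6.33, 5.56) — the rank-5 entries
```

Equivalently, `statistics.mean` and `statistics.pvariance` from the standard library give the same numbers.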
| Rank | Avg Score | Title | Scores | Variance | Decision | Paper |
|---|---|---|---|---|---|---|
| 1 | 8 | Dynamics-aware Unsupervised Skill Discovery | 8 8 8 | 0 | Accept (Talk) | https://openreview.net/forum?id=HJgLZR4KvH |
| 1 | 8 | Contrastive Learning Of Structured World Models | 8 8 8 | 0 | Accept (Talk) | https://openreview.net/forum?id=H1gax6VtDB |
| 1 | 8 | Implementation Matters In Deep Rl: A Case Study On Ppo And Trpo | 8 8 8 | 0 | Accept (Talk) | https://openreview.net/forum?id=r1etN1rtPB |
| 1 | 8 | Gendice: Generalized Offline Estimation Of Stationary Values | 8 8 8 | 0 | Accept (Talk) | https://openreview.net/forum?id=HkxlcnVFwB |
| 1 | 8 | Causal Discovery With Reinforcement Learning | 8 8 8 | 0 | Accept (Talk) | https://openreview.net/forum?id=S1g2skStPB |
| 2 | 7.33 | Is A Good Representation Sufficient For Sample Efficient Reinforcement Learning? | 8 8 6 | 0.89 | Accept (Spotlight) | https://openreview.net/forum?id=r1genAVKPB |
| 2 | 7.33 | Harnessing Structures For Value-based Planning And Reinforcement Learning | 6 8 8 | 0.89 | Accept (Talk) | https://openreview.net/forum?id=rklHqRVKvH |
| 2 | 7.33 | Explain Your Move: Understanding Agent Actions Using Focused Feature Saliency | 6 8 8 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=SJgzLkBKPB |
| 2 | 7.33 | Meta-q-learning | 8 8 6 | 0.89 | Accept (Talk) | https://openreview.net/forum?id=SJeD3CEFPH |
| 2 | 7.33 | Discriminative Particle Filter Reinforcement Learning For Complex Partial Observations | 8 6 8 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=HJl8_eHYvS |
| 2 | 7.33 | Disagreement-regularized Imitation Learning | 6 8 8 | 0.89 | Accept (Spotlight) | https://openreview.net/forum?id=rkgbYyHtwB |
| 2 | 7.33 | Doubly Robust Bias Reduction In Infinite Horizon Off-policy Estimation | 6 8 8 | 0.89 | Accept (Spotlight) | https://openreview.net/forum?id=S1glGANtDr |
| 2 | 7.33 | Seed Rl: Scalable And Efficient Deep-rl With Accelerated Central Inference | 8 6 8 | 0.89 | Accept (Talk) | https://openreview.net/forum?id=rkgvXlrKwH |
| 2 | 7.33 | The Ingredients Of Real World Robotic Reinforcement Learning | 6 8 8 | 0.89 | Accept (Spotlight) | https://openreview.net/forum?id=rJe2syrtvS |
| 2 | 7.33 | Watch The Unobserved: A Simple Approach To Parallelizing Monte Carlo Tree Search | 8 6 8 | 0.89 | Accept (Talk) | https://openreview.net/forum?id=BJlQtJSKDB |
| 2 | 7.33 | Meta-learning Acquisition Functions For Transfer Learning In Bayesian Optimization | 8 6 8 | 0.89 | Accept (Spotlight) | https://openreview.net/forum?id=ryeYpJSKwr |
| 2 | 7.33 | A Closer Look At Deep Policy Gradients | 8 6 8 | 0.89 | Accept (Talk) | https://openreview.net/forum?id=ryxdEkHtPS |
| 2 | 7.33 | Fast Task Inference With Variational Intrinsic Successor Features | 8 6 8 | 0.89 | Accept (Talk) | https://openreview.net/forum?id=BJeAHkrYDS |
| 2 | 7.33 | Learning To Plan In High Dimensions Via Neural Exploration-exploitation Trees | 8 8 6 | 0.89 | Accept (Spotlight) | https://openreview.net/forum?id=rJgJDAVKvB |
| 3 | 7 | Dream To Control: Learning Behaviors By Latent Imagination | 8 6 6 8 | 1 | Accept (Spotlight) | https://openreview.net/forum?id=S1lOTC4tDS |
| 4 | 6.67 | Making Efficient Use Of Demonstrations To Solve Hard Exploration Problems | 6 8 6 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=SygKyeHKDH |
| 4 | 6.67 | Intrinsic Motivation For Encouraging Synergistic Behavior | 6 8 6 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=SJleNCNtDH |
| 4 | 6.67 | Sqil: Imitation Learning Via Reinforcement Learning With Sparse Rewards | 8 6 6 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=S1xKd24twB |
| 4 | 6.67 | Reinforcement Learning With Competitive Ensembles Of Information-constrained Primitives | 8 6 6 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=ryxgJTEYDr |
| 4 | 6.67 | Multi-agent Interactions Modeling With Correlated Policies | 6 6 8 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=B1gZV1HYvS |
| 4 | 6.67 | Influence-based Multi-agent Exploration | 6 6 8 | 0.89 | Accept (Spotlight) | https://openreview.net/forum?id=BJgy96EYvr |
| 4 | 6.67 | Learning The Arrow Of Time For Problems In Reinforcement Learning | 6 6 8 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=rylJkpEtwS |
| 4 | 6.67 | Amrl: Aggregated Memory For Reinforcement Learning | 6 6 8 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=Bkl7bREtDr |
| 4 | 6.67 | Model Based Reinforcement Learning For Atari | 6 8 6 | 0.89 | Accept (Spotlight) | https://openreview.net/forum?id=S1xCPJHtDB |
| 4 | 6.67 | Variational Recurrent Models For Solving Partially Observable Control Tasks | 6 6 8 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=r1lL4a4tDB |
| 4 | 6.67 | Sample Efficient Policy Gradient Methods With Recursive Variance Reduction | 6 8 6 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=HJlxIJBFDr |
| 4 | 6.67 | Exploring Model-based Planning With Policy Networks | 6 8 6 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=H1exf64KwH |
| 4 | 6.67 | Reinforcement Learning Based Graph-to-sequence Model For Natural Question Generation | 6 6 8 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=HygnDhEtvr |
| 4 | 6.67 | Ride: Rewarding Impact-driven Exploration For Procedurally-generated Environments | 6 6 8 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=rkg-TJBFPB |
| 4 | 6.67 | Learning Expensive Coordination: An Event-based Deep Rl Approach | 6 8 6 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=ryeG924twB |
| 4 | 6.67 | Evolutionary Population Curriculum For Scaling Multi-agent Reinforcement Learning | 6 8 6 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=SJxbHkrKDH |
| 4 | 6.67 | Making Sense Of Reinforcement Learning And Probabilistic Inference | 6 6 8 | 0.89 | Accept (Spotlight) | https://openreview.net/forum?id=S1xitgHtvS |
| 4 | 6.67 | Reinforced Genetic Algorithm Learning For Optimizing Computation Graphs | 8 6 6 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=rkxDoJBYPB |
| 4 | 6.67 | Never Give Up: Learning Directed Exploration Strategies | 6 6 8 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=Sye57xStvB |
| 4 | 6.67 | Robust Reinforcement Learning For Continuous Control With Model Misspecification | 6 6 8 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=HJgC60EtwB |
| 4 | 6.67 | Synthesizing Programmatic Policies That Inductively Generalize | 6 8 6 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=S1l8oANFDH |
| 4 | 6.67 | Adaptive Correlated Monte Carlo For Contextual Categorical Sequence Generation | 6 6 8 | 0.89 | Accept (Poster) | https://openreview.net/forum?id=r1lOgyrKDS |
| 4 | 6.67 | Improving Generalization In Meta Reinforcement Learning Using Neural Objectives | 6 6 8 | 0.89 | Accept (Spotlight) | https://openreview.net/forum?id=S1evHerYPr |
| 5 | 6.33 | Single Episode Transfer For Differing Environmental Dynamics In Reinforcement Learning | 3 8 8 | 5.56 | Accept (Poster) | https://openreview.net/forum?id=rJeQoCNYDS |
| 5 | 6.33 | Decentralized Distributed Ppo: Mastering Pointgoal Navigation | 3 8 8 | 5.56 | Accept (Poster) | https://openreview.net/forum?id=H1gX8C4YPr |
| 6 | 6.25 | Geometric Insights Into The Convergence Of Nonlinear Td Learning | 8 3 6 8 | 4.19 | Accept (Poster) | https://openreview.net/forum?id=SJezGp4YPr |
| 6 | 6.25 | Dynamics-aware Embeddings | 3 8 6 8 | 4.19 | Accept (Poster) | https://openreview.net/forum?id=BJgZGeHFPH |
| 7 | 6.2 | Reanalysis Of Variance Reduced Temporal Difference Learning | 8 8 6 3 6 | 3.36 | Accept (Poster) | https://openreview.net/forum?id=S1ly10EKDS |
| 8 | 6 | Q-learning With Ucb Exploration Is Sample Efficient For Infinite-horizon Mdp | 6 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=BkglSTNFDB |
| 8 | 6 | Automated Curriculum Generation Through Setter-solver Interactions | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=H1e0Wp4KvH |
| 8 | 6 | Optimistic Exploration Even With A Pessimistic Initialisation | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=r1xGP6VYwH |
| 8 | 6 | Multi-agent Reinforcement Learning For Networked System Control | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=Syx7A3NFvH |
| 8 | 6 | A Learning-based Iterative Method For Solving Vehicle Routing Problems | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=BJe1334YDH |
| 8 | 6 | Sharing Knowledge In Multi-task Deep Reinforcement Learning | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=rkgpv2VFvr |
| 8 | 6 | Rtfm: Generalising To New Environment Dynamics Via Reading | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=SJgob6NKvH |
| 8 | 6 | Meta Reinforcement Learning With Autonomous Inference Of Subtask Dependencies | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=HkgsWxrtPB |
| 8 | 6 | Projection Based Constrained Policy Optimization | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=rke3TJrtPS |
| 8 | 6 | Graph Constrained Reinforcement Learning For Natural Language Action Spaces | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=B1x6w0EtwH |
| 8 | 6 | V-mpo: On-policy Maximum A Posteriori Policy Optimization For Discrete And Continuous Control | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=SylOlp4FvH |
| 8 | 6 | Thinking While Moving: Deep Reinforcement Learning With Concurrent Control | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=SJexHkSFPS |
| 8 | 6 | Keep Doing What Worked: Behavior Modelling Priors For Offline Reinforcement Learning | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=rke7geHtwH |
| 8 | 6 | Imitation Learning Via Off-policy Distribution Matching | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=Hyg-JC4FDr |
| 8 | 6 | Adversarial Autoaugment | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=ByxdUySKvS |
| 8 | 6 | Option Discovery Using Deep Skill Chaining | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=B1gqipNYwH |
| 8 | 6 | State-only Imitation With Transition Dynamics Mismatch | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=HJgLLyrYwB |
| 8 | 6 | The Gambler's Problem And Beyond | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=HyxnMyBKwB |
| 8 | 6 | Structured Object-aware Physics Prediction For Video Modeling And Planning | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=B1e-kxSKDH |
| 8 | 6 | Dynamical Distance Learning For Semi-supervised And Unsupervised Skill Discovery | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=H1lmhaVtvr |
| 8 | 6 | Exploration In Reinforcement Learning With Deep Covering Options | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=SkeIyaVtwB |
| 8 | 6 | Cm3: Cooperative Multi-goal Multi-stage Multi-agent Reinforcement Learning | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=S1lEX04tPr |
| 8 | 6 | Learning To Coordinate Manipulation Skills Via Skill Behavior Diversification | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=ryxB2lBtvH |
| 8 | 6 | Composing Task-agnostic Policies With Deep Reinforcement Learning | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=H1ezFREtwH |
| 8 | 6 | Frequency-based Search-control In Dyna | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=B1gskyStwr |
| 8 | 6 | Black-box Off-policy Estimation For Infinite-horizon Reinforcement Learning | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=S1ltg1rFDS |
| 8 | 6 | Action Semantics Network: Considering The Effects Of Actions In Multiagent Systems | 6 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=ryg48p4tPH |
| 8 | 6 | Caql: Continuous Action Q-learning | 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=BkxXe0Etwr |
| 8 | 6 | Reinforced Active Learning For Image Segmentation | 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=SkgC6TNFvr |
| 8 | 6 | The Variational Bandwidth Bottleneck: Stochastic Evaluation On An Information Budget | 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=Hye1kTVFDS |
| 8 | 6 | Hierarchical Foresight: Self-supervised Learning Of Long-horizon Tasks Via Visual Subgoal Generation | 6 6 | 0 | Accept (Poster) | https://openreview.net/forum?id=H1gzR2VKDH |
| 9 | 5.75 | Maximum Likelihood Constraint Inference For Inverse Reinforcement Learning | 8 6 3 6 | 3.19 | Accept (Spotlight) | https://openreview.net/forum?id=BJliakStvH |
| 9 | 5.75 | Autoq: Automated Kernel-wise Neural Network Quantization | 6 6 8 3 | 3.19 | Accept (Poster) | https://openreview.net/forum?id=rygfnn4twS |
| 9 | 5.75 | Varibad: A Very Good Method For Bayes-adaptive Deep Rl Via Meta-learning | 8 6 8 1 | 8.19 | Accept (Poster) | https://openreview.net/forum?id=Hkl9JlBYvr |
| 10 | 5.67 | Watch, Try, Learn: Meta-learning From Demonstrations And Rewards | 8 3 6 | 4.22 | Accept (Poster) | https://openreview.net/forum?id=SJg5J6NtDr |
| 10 | 5.67 | Population-guided Parallel Policy Search For Reinforcement Learning | 6 8 3 | 4.22 | Accept (Poster) | https://openreview.net/forum?id=rJeINp4KwH |
| 10 | 5.67 | A Simple Randomization Technique For Generalization In Deep Reinforcement Learning | 8 3 6 | 4.22 | Accept (Poster) | https://openreview.net/forum?id=HJgcvJBFvB |
| 10 | 5.67 | On The Weaknesses Of Reinforcement Learning For Neural Machine Translation | 8 6 3 | 4.22 | Accept (Poster) | https://openreview.net/forum?id=H1eCw3EKvH |
| 10 | 5.67 | State Alignment-based Imitation Learning | 6 8 3 | 4.22 | Accept (Poster) | https://openreview.net/forum?id=rylrdxHFDr |
| 10 | 5.67 | Finding And Visualizing Weaknesses Of Deep Reinforcement Learning Agents | 8 6 3 | 4.22 | Accept (Poster) | https://openreview.net/forum?id=rylvYaNYDH |
| 10 | 5.67 | Model-augmented Actor-critic: Backpropagating Through Paths | 3 6 8 | 4.22 | Accept (Poster) | https://openreview.net/forum?id=Skln2A4YDB |
| 10 | 5.67 | Behaviour Suite For Reinforcement Learning | 8 3 6 | 4.22 | Accept (Spotlight) | https://openreview.net/forum?id=rygf-kSYwH |
| 10 | 5.67 | Learning Heuristics For Quantified Boolean Formulas Through Reinforcement Learning | 6 8 3 | 4.22 | Accept (Poster) | https://openreview.net/forum?id=BJluxREKDB |
| 10 | 5.67 | Maxmin Q-learning: Controlling The Estimation Bias Of Q-learning | 8 6 3 | 4.22 | Accept (Poster) | https://openreview.net/forum?id=Bkg0u3Etwr |
| 10 | 5.67 | Hypermodels For Exploration | 8 3 6 | 4.22 | Accept (Poster) | https://openreview.net/forum?id=ryx6WgStPB |
| 11 | 5.5 | Sub-policy Adaptation For Hierarchical Reinforcement Learning | 3 8 | 6.25 | Accept (Poster) | https://openreview.net/forum?id=ByeWogStDS |
| 11 | 5.5 | Svqn: Sequential Variational Soft Q-learning Networks | 3 8 | 6.25 | Accept (Poster) | https://openreview.net/forum?id=r1xPh2VtPB |
| 12 | 5.25 | Impact: Importance Weighted Asynchronous Architectures With Clipped Target Networks | 6 3 6 6 | 1.69 | Accept (Poster) | https://openreview.net/forum?id=BJeGlJStPr |
| 13 | 5 | Ranking Policy Gradient | 6 3 6 | 2 | Accept (Poster) | https://openreview.net/forum?id=rJld3hEYvS |
| 13 | 5 | Model-based Reinforcement Learning For Biological Sequence Design | 6 3 6 | 2 | Accept (Poster) | https://openreview.net/forum?id=HklxbgBKvr |
| 13 | 5 | Learning Nearly Decomposable Value Functions Via Communication Minimization | 6 6 3 | 2 | Accept (Poster) | https://openreview.net/forum?id=HJx-3grYDB |
| 13 | 5 | Implementing Inductive Bias For Different Navigation Tasks Through Diverse Rnn Attrractors | 3 6 6 | 2 | Accept (Poster) | https://openreview.net/forum?id=Byx4NkrtDS |
| 13 | 5 | Toward Evaluating Robustness Of Deep Reinforcement Learning With Continuous Control | 6 3 6 | 2 | Accept (Poster) | https://openreview.net/forum?id=SylL0krYPS |
| 13 | 5 | Learning Efficient Parameter Server Synchronization Policies For Distributed Sgd | 6 3 6 | 2 | Accept (Poster) | https://openreview.net/forum?id=rJxX8T4Kvr |
| 13 | 5 | Episodic Reinforcement Learning With Associative Memory | 6 3 6 | 2 | Accept (Poster) | https://openreview.net/forum?id=HkxjqxBYDB |
| 14 | 4.67 | Logic And The 2-simplicial Transformer | 8 3 3 | 5.56 | Accept (Poster) | https://openreview.net/forum?id=rkecJ6VFvr |
| 15 | 4 | Exploratory Not Explanatory: Counterfactual Analysis Of Saliency Maps For Deep Rl | 1 3 8 | 8.67 | Accept (Poster) | https://openreview.net/forum?id=rkl3m1BFDB |
| 15 | 4 | Playing The Lottery With Rewards And Multiple Languages: Lottery Tickets In Rl And Nlp | 3 3 6 | 2 | Accept (Poster) | https://openreview.net/forum?id=S1xnXRVFwH |