Source | 深度强化学习实验室 (Deep Reinforcement Learning Lab, ID: Deep-RL)

Author | DeepRL

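AAAI 2020 received more than 8,800 valid submissions, of which 7,737 entered review and 1,591 were accepted, an acceptance rate of 20.6%. At least 52 of the accepted papers are on reinforcement learning, roughly 3% of all accepted papers. By institution, Google Brain, DeepMind, Tsinghua University, UCL, Tencent AI Lab, Peking University, IBM, and Facebook are all well represented. By author, the list includes a paper co-authored by Richard Sutton, a founding figure of reinforcement learning (No. 48), alongside work from up-and-coming researchers.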
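The two percentages above can be checked with a quick calculation. The sketch below assumes that the "about 3%" figure means the share of reinforcement learning papers among all accepted papers, and it uses 52 as the count even though the text says "52+"; the variable and function names are illustrative only.

```python
# Figures quoted in the paragraph above.
reviewed = 7737      # submissions that entered review
accepted = 1591      # papers accepted overall
rl_accepted = 52     # RL papers in this list (assumption: exactly 52)


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


acceptance_rate = percent(accepted, reviewed)
rl_share = percent(rl_accepted, accepted)

print(f"Overall acceptance rate: {acceptance_rate}%")
print(f"RL share of accepted papers: {rl_share}%")
```

Under these assumptions the overall rate matches the quoted 20.6%, and the reinforcement learning share comes out slightly above the rounded 3%.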

The papers cover environments, theory and algorithms, applications, and multi-agent systems. The full list follows:

[1]. Google Research Football: A Novel Reinforcement Learning Environment

Karol Kurach (Google Brain)*; Anton Raichuk (Google); Piotr Stańczyk (Google Brain); Michał Zając (Google Brain); Olivier Bachem (Google Brain); Lasse Espeholt (DeepMind); Carlos Riquelme (Google Brain); Damien Vincent (Google Brain); Marcin Michalski (Google); Olivier Bousquet (Google); Sylvain Gelly (Google Brain)

[2]. Reinforcement Learning from Imperfect Demonstrations under Soft Expert Guidance

Xiaojian Ma (University of California, Los Angeles)*; Mingxuan Jing (Tsinghua University); Wenbing Huang (Tsinghua University); Chao Yang (Tsinghua University); Fuchun Sun (Tsinghua); Huaping Liu (Tsinghua University); Bin Fang (Tsinghua University)

[3]. Proximal Distilled Evolutionary Reinforcement Learning

Cristian Bodnar (University of Cambridge)*; Ben Day (University of Cambridge); Pietro Lió (University of Cambridge)

[4]. Tree-Structured Policy based Progressive Reinforcement Learning for Temporally Language Grounding in Video

Jie Wu (Sun Yat-sen University)*; Guanbin Li (Sun Yat-sen University); Si Liu (Beihang University); Liang Lin (DarkMatter AI)

[5]. RL-Duet: Online Music Accompaniment Generation Using Deep Reinforcement Learning

Nan Jiang (Tsinghua University)*; Sheng Jin (Tsinghua University); Zhiyao Duan (University of Rochester); Changshui Zhang (Tsinghua University)

[6]. Mastering Complex Control in MOBA Games with Deep Reinforcement Learning

Deheng Ye (Tencent)*; Zhao Liu (Tencent); Mingfei Sun (Tencent); Bei Shi (Tencent AI Lab); Peilin Zhao (Tencent AI Lab); Hao Wu (Tencent); Hongsheng Yu (Tencent); Shaojie Yang (Tencent); Xipeng Wu (Tencent); Qingwei Guo (Tsinghua University); Qiaobo Chen (Tencent); Yinyuting Yin (Tencent); Hao Zhang (Tencent); Tengfei Shi (Tencent); Liang Wang (Tencent); Qiang Fu (Tencent AI Lab); Wei Yang (Tencent AI Lab); Lanxiao Huang (Tencent)

[7]. Partner Selection for the Emergence of Cooperation in Multi‐Agent Systems using Reinforcement Learning

Nicolas Anastassacos (The Alan Turing Institute)*; Steve Hailes (University College London); Mirco Musolesi (UCL)

[8]. Uncertainty-Aware Action Advising for Deep Reinforcement Learning Agents

Felipe Leno da Silva (University of Sao Paulo)*; Pablo Hernandez-Leal (Borealis AI); Bilal Kartal (Borealis AI); Matthew Taylor (Borealis AI)

[9]. MetaLight: Value-based Meta-reinforcement Learning for Traffic Signal Control

Xinshi Zang (Shanghai Jiao Tong University)*; Huaxiu Yao (Pennsylvania State University); Guanjie Zheng (Pennsylvania State University); Nan Xu (University of Southern California); Kai Xu (Shanghai Tianrang Intelligent Technology Co., Ltd); Zhenhui (Jessie) Li (Penn State University)

[10]. Adaptive Quantitative Trading: an Imitative Deep Reinforcement Learning Approach

Yang Liu (University of Science and Technology of China)*; Qi Liu (University of Science and Technology of China); Hongke Zhao (Tianjin University); Zhen Pan (University of Science and Technology of China); Chuanren Liu (The University of Tennessee Knoxville)

[11]. Neighborhood Cognition Consistent Multi‐Agent Reinforcement Learning

Hangyu Mao (Peking University)*; Wulong Liu (Huawei Noah's Ark Lab); Jianye Hao (Tianjin University); Jun Luo (Huawei Technologies Canada Co. Ltd.); Dong Li (Huawei Noah's Ark Lab); Zhengchao Zhang (Peking University); Jun Wang (UCL); Zhen Xiao (Peking University)

[12]. SMIX(λ): Enhancing Centralized Value Functions for Cooperative Multi-Agent Reinforcement Learning

Chao Wen (Nanjing University of Aeronautics and Astronautics)*; Xinghu Yao (Nanjing University of Aeronautics and Astronautics); Yuhui Wang (Nanjing University of Aeronautics and Astronautics, China); Xiaoyang Tan (Nanjing University of Aeronautics and Astronautics, China)

[13]. Unpaired Image Enhancement Featuring Reinforcement-Learning-Controlled Image Editing Software

Satoshi Kosugi (The University of Tokyo)*; Toshihiko Yamasaki (The University of Tokyo)

[14]. Crowdfunding Dynamics Tracking: A Reinforcement Learning Approach

Jun Wang (University of Science and Technology of China)*; Hefu Zhang (University of Science and Technology of China); Qi Liu (University of Science and Technology of China); Zhen Pan (University of Science and Technology of China); Hanqing Tao (University of Science and Technology of China)

[15]. Model and Reinforcement Learning for Markov Games with Risk Preferences

Wenjie Huang (Shenzhen Research Institute of Big Data)*; Hai Pham Viet (Department of Computer Science, School of Computing, National University of Singapore); William Benjamin Haskell (Supply Chain and Operations Management Area, Krannert School of Management, Purdue University)

[16]. Finding Needles in a Moving Haystack: Prioritizing Alerts with Adversarial Reinforcement Learning

Liang Tong (Washington University in St. Louis)*; Aron Laszka (University of Houston); Chao Yan (Vanderbilt University); Ning Zhang (Washington University in St. Louis); Yevgeniy Vorobeychik (Washington University in St. Louis)

[17]. Toward A Thousand Lights: Decentralized Deep Reinforcement Learning for Large‐Scale Traffic Signal Control

Chacha Chen (Pennsylvania State University)*; Hua Wei (Pennsylvania State University); Nan Xu (University of Southern California); Guanjie Zheng (Pennsylvania State University); Ming Yang (Shanghai Tianrang Intelligent Technology Co., Ltd); Yuanhao Xiong (Zhejiang University); Kai Xu (Shanghai Tianrang Intelligent Technology Co., Ltd); Zhenhui (Jessie) Li (Penn State University)

[18]. Deep Reinforcement Learning for Active Human Pose Estimation

Erik Gärtner (Lund University)*; Aleksis Pirinen (Lund University); Cristian Sminchisescu (Lund University)

[19]. Be Relevant, Non‐redundant, Timely: Deep Reinforcement Learning for Real‐time Event Summarization

Min Yang (Chinese Academy of Sciences)*; Chengming Li (Chinese Academy of Sciences); Fei Sun (Alibaba Group); Zhou Zhao (Zhejiang University); Ying Shen (Peking University Shenzhen Graduate School); Chenglin Wu (fuzhi.ai)

[20]. A Tale of Two‐Timescale Reinforcement Learning with the Tightest Finite‐Time Bound

Gal Dalal (Technion)*; Balazs Szorenyi (Yahoo Research); Gugan Thoppe (Duke University)

[21]. Reinforcement Learning with Perturbed Rewards

Jingkang Wang (University of Toronto); Yang Liu (UCSC); Bo Li (University of Illinois at Urbana–Champaign)*

[22]. Exploratory Combinatorial Optimization with Reinforcement Learning

Thomas Barrett (University of Oxford)*; William Clements (Unchartech); Jakob Foerster (Facebook AI Research); Alexander Lvovsky (Oxford University)

[23]. Algorithmic Improvements for Deep Reinforcement Learning applied to Interactive Fiction

Vishal Jain (Mila, McGill University)*; Liam Fedus (Google); Hugo Larochelle (Google); Doina Precup (McGill University); Marc G. Bellemare (Google Brain)

[24]. Spatiotemporally Constrained Action Space Attacks on Deep Reinforcement Learning Agents

Xian Yeow Lee (Iowa State University)*; Sambit Ghadai (Iowa State University); Kai Liang Tan (Iowa State University); Chinmay Hegde (New York University); Soumik Sarkar (Iowa State University)

[25]. Modelling Sentence Pairs via Reinforcement Learning: An Actor‐Critic Approach to Learn the Irrelevant Words

Mahtab Ahmed (The University of Western Ontario)*; Robert Mercer (The University of Western Ontario)

[26]. Transfer Reinforcement Learning using Output-Gated Working Memory

Arthur Williams (Middle Tennessee State University)*; Joshua Phillips (Middle Tennessee State University)

[27]. Reinforcement-Learning based Portfolio Management with Augmented Asset Movement Prediction States

Yunan Ye (Zhejiang University)*; Hengzhi Pei (Fudan University); Boxin Wang (University of Illinois at Urbana–Champaign); Pin-Yu Chen (IBM Research); Yada Zhu (IBM Research); Jun Xiao (Zhejiang University); Bo Li (University of Illinois at Urbana–Champaign)

[28]. Deep Reinforcement Learning for General Game Playing

Adrian Goldwaser (University of New South Wales)*; Michael Thielscher (University of New South Wales)

[29]. Stealthy and Efficient Adversarial Attacks against Deep Reinforcement Learning

Jianwen Sun (Nanyang Technological University)*; Tianwei Zhang (Nanyang Technological University); Xiaofei Xie (Nanyang Technological University); Lei Ma (Kyushu University); Yan Zheng (Tianjin University); Kangjie Chen (Tianjin University); Yang Liu (Nanyang Technological University, Singapore)

[30]. LeDeepChef: Deep Reinforcement Learning Agent for Families of Text-Based Games

Leonard Adolphs (ETH Zurich)*; Thomas Hofmann (ETH Zurich)

[31]. Induction of Subgoal Automata for Reinforcement Learning

Daniel Furelos-Blanco (Imperial College London)*; Mark Law (Imperial College London); Alessandra Russo (Imperial College London); Krysia Broda (Imperial College London); Anders Jonsson (Universitat Pompeu Fabra)

[32]. MRI Reconstruction with Interpretable Pixel-Wise Operations Using Reinforcement Learning

Wentian Li (Tsinghua University)*; Xidong Feng (Department of Automation, Tsinghua University); Haotian An (Tsinghua University); Xiang Yao Ng (Tsinghua University); Yu-Jin Zhang (Tsinghua University)

[33]. Explainable Reinforcement Learning Through a Causal Lens

Prashan Madumal (University of Melbourne)*; Tim Miller (University of Melbourne); Liz Sonenberg (University of Melbourne); Frank Vetere (University of Melbourne)

[34]. Reinforcement Learning based Metapath Discovery in Large-scale Heterogeneous Information Networks

Guojia Wan (Wuhan University); Bo Du (School of Computer Science, Wuhan University)*; Shirui Pan (Monash University); Reza Haffari (Monash University, Australia)

[35]. Reinforcement Learning When All Actions are Not Always Available

Yash Chandak (University of Massachusetts Amherst)*; Georgios Theocharous (Adobe Research, USA); Blossom Metevier (University of Massachusetts Amherst); Philip Thomas (University of Massachusetts Amherst)

[36]. Reinforcement Mechanism Design: With Applications to Dynamic Pricing in Sponsored Search Auctions

Weiran Shen (Carnegie Mellon University)*; Binghui Peng (Columbia University); Hanpeng Liu (Tsinghua University); Michael Zhang (Chinese University of Hong Kong); Ruohan Qian (Baidu Inc.); Yan Hong (Baidu Inc.); Zhi Guo (Baidu Inc.); Zongyao Ding (Baidu Inc.); Pengjun Lu (Baidu Inc.); Pingzhong Tang (Tsinghua University)

[37]. Metareasoning in Modular Software Systems: On-the-Fly Configuration Using Reinforcement Learning with Rich Contextual Representations

Aditya Modi (University of Michigan, Ann Arbor)*; Debadeepta Dey (Microsoft); Alekh Agarwal (Microsoft); Adith Swaminathan (Microsoft Research); Besmira Nushi (Microsoft Research); Sean Andrist (Microsoft Research); Eric Horvitz (Microsoft Research)

[38]. Joint Entity and Relation Extraction with a Hybrid Transformer and Reinforcement Learning Based Model

Ya Xiao (Tongji University)*; Chengxiang Tan (Tongji University); Zhijie Fan (The Third Research Institute of the Ministry of Public Security); Qian Xu (Tongji University); Wenye Zhu (Tongji University)

[39]. Reinforcement Learning of Risk-Constrained Policies in Markov Decision Processes

Tomas Brazdil (Masaryk University); Krishnendu Chatterjee (IST Austria); Petr Novotný (Masaryk University)*; Jiří Vahala (Masaryk University)

[40]. Deep Model-Based Reinforcement Learning via Estimated Uncertainty and Conservative Policy Optimization

Qi Zhou (University of Science and Technology of China); Houqiang Li (University of Science and Technology of China); Jie Wang (University of Science and Technology of China)*

[41]. Reinforcement Learning with Non-Markovian Rewards

Maor Gaon (Ben-Gurion University); Ronen Brafman (Ben-Gurion University)*

[42]. Modular Robot Design Synthesis with Deep Reinforcement Learning

Julian Whitman (Carnegie Mellon University)*; Raunaq Bhirangi (Carnegie Mellon University); Matthew Travers (Carnegie Mellon University); Howie Choset (Carnegie Mellon University)

[43]. BAR - A Reinforcement Learning Agent for Bounding-Box Automated Refinement

Morgane Ayle (American University of Beirut - AUB)*; Jimmy Tekli (BMW Group / Université de Franche-Comté - UFC); Julia Zini (American University of Beirut - AUB); Boulos El Asmar (BMW Group / Karlsruher Institut für Technologie - KIT); Mariette Awad (American University of Beirut - AUB)

[44]. Hierarchical Reinforcement Learning for Open-Domain Dialog

Abdelrhman Saleh (Harvard University)*; Natasha Jaques (MIT); Asma Ghandeharioun (MIT); Judy Hanwen Shen (MIT); Rosalind Picard (MIT Media Lab)

[45]. Copy or Rewrite: Hybrid Summarization with Hierarchical Reinforcement Learning

Liqiang Xiao (Artificial Intelligence Institute, SJTU)*; Lu Wang (Khoury College of Computer Science, Northeastern University); Hao He (Shanghai Jiao Tong University); Yaohui Jin (Artificial Intelligence Institute, SJTU)

[46]. Generalizable Resource Allocation in Stream Processing via Deep Reinforcement Learning

Xiang Ni (IBM Research); Jing Li (NJIT); Wang Zhou (IBM Research); Mo Yu (IBM T. J. Watson)*; Kun-Lung Wu (IBM Research)

[47]. Actor Critic Deep Reinforcement Learning for Neural Malware Control

Yu Wang (Microsoft)*; Jack Stokes (Microsoft Research); Mady Marinescu (Microsoft Corporation)

[48]. Fixed-Horizon Temporal Difference Methods for Stable Reinforcement Learning

Kristopher De Asis (University of Alberta)*; Alan Chan (University of Alberta); Silviu Pitis (University of Toronto); Richard Sutton (University of Alberta); Daniel Graves (Huawei)

[49]. Sequence Generation with Optimal-Transport-Enhanced Reinforcement Learning

Liqun Chen (Duke University)*; Ke Bai (Duke University); Chenyang Tao (Duke University); Yizhe Zhang (Microsoft Research); Guoyin Wang (Duke University); Wenlin Wang (Duke University); Ricardo Henao (Duke University); Lawrence Carin (Duke University)

[50]. Scaling All-Goals Updates in Reinforcement Learning Using Convolutional Neural Networks

Fabio Pardo (Imperial College London)*; Vitaly Levdik (Imperial College London); Petar Kormushev (Imperial College London)

[51]. Parameterized Indexed Value Function for Efficient Exploration in Reinforcement Learning

Tian Tan (Stanford University)*; Zhihan Xiong (Stanford University); Vikranth Dwaracherla (Stanford University)

[52]. Solving Online Threat Screening Games using Constrained Action Space Reinforcement Learning

Sanket Shah (Singapore Management University)*; Arunesh Sinha (Singapore Management University); Pradeep Varakantham (Singapore Management University); Andrew Perrault (Harvard University); Milind Tambe (Harvard University)

For detailed write-ups of the papers, see GitHub:

https://github.com/NeuronDance/DeepRL/tree/master/DRL-ConferencePaper/AAAI/2020

(*This article is reposted by AI科技大本营. To reprint it, contact WeChat 1092722531.)
