This post collects recently released papers on deep reinforcement learning (DRL). The papers are grouped into manually defined categories and sorted by date within each category, newest first. We hope the list is useful, and readers are welcome to submit papers they have read.

Contents

•    Value Functions
•    Policies
•    Discrete Control
•    Continuous Control
•    Text Processing
•    Computer Vision
•    Robotics
•    Games
•    Monte-Carlo Tree Search
•    Inverse Reinforcement Learning
•    Multi-Task and Transfer Learning
•    Exploration
•    Multi-Agent
•    Hierarchical Learning

Value Functions

  • Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.
  • Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv, 2016.
  • Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.
  • Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.
  • Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.
  • Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.
  • Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.
  • Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
  • Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML, 2016.
  • Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.
  • Value Iteration Networks, A. Tamar et al., arXiv, 2016.
  • Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.
  • Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
  • Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
  • Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.
  • How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.
  • Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.
  • Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop, 2015.
  • Learning Simple Algorithms from Examples, W. Zaremba et al., arXiv, 2015.
  • Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.
  • Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.
  • Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.
  • Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.
  • Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.
  • Generating Text with Deep Reinforcement Learning, H. Guo, arXiv, 2015.
  • Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.
  • Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., arXiv, 2015.
  • Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.
  • Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.
  • Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.
  • Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.
  • Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.
  • Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.
  • Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.
  • Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Policies

  • Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.
  • Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML, 2016.
  • Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.
  • Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.
  • Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
  • Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
  • Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.
  • MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv, 2016.
  • ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv, 2015.
  • Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.
  • Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.
  • High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.
  • End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.
  • Deterministic Policy Gradient Algorithms, D. Silver et al., ICML, 2014.
  • Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.

Discrete Control

  • Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.
  • Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv, 2016.
  • Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.
  • Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.
  • Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.
  • Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.
  • Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.
  • Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
  • Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.
  • Value Iteration Networks, A. Tamar et al., arXiv, 2016.
  • Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.
  • Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
  • Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
  • Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.
  • How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.
  • Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.
  • Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop, 2015.
  • Learning Simple Algorithms from Examples, W. Zaremba et al., arXiv, 2015.
  • Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.
  • Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.
  • Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.
  • Policy Distillation, A. A. Rusu et al., ICLR, 2016.
  • Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.
  • Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.
  • Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.
  • Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.
  • Generating Text with Deep Reinforcement Learning, H. Guo, arXiv, 2015.
  • ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv, 2015.
  • Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.
  • Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.
  • Recurrent Reinforcement Learning: A Hybrid Approach, X. Li et al., arXiv, 2015.
  • Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.
  • Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv, 2015.
  • Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.
  • Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.
  • Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., arXiv, 2015.
  • Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.
  • Universal Value Function Approximators, T. Schaul et al., ICML, 2015.
  • Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.
  • Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.
  • Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.
  • Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Continuous Control

  • Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.
  • Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML, 2016.
  • Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.
  • Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML, 2016.
  • Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.
  • Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
  • Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.
  • Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.
  • Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.
  • Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.
  • Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., arXiv, 2015.
  • High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.
  • End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.
  • DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., RSS, 2015.
  • Deterministic Policy Gradient Algorithms, D. Silver et al., ICML, 2014.
  • Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.

Text Processing

  • Strategic Dialogue Management via Deep Reinforcement Learning, H. Cuayáhuitl et al., NIPS Workshop, 2015.
  • MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv, 2016.
  • Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.
  • Generating Text with Deep Reinforcement Learning, H. Guo, arXiv, 2015.
  • Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.
  • Listen, Attend, and Walk: Neural Mapping of Navigational Instructions to Action Sequences, H. Mei et al., arXiv, 2015.

Computer Vision

  • Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.
  • Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.
  • Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.
  • Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.
  • Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.
  • Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.
  • Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
  • Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.
  • Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.
  • Value Iteration Networks, A. Tamar et al., arXiv, 2016.
  • Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
  • Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
  • Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.
  • Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.
  • How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.
  • Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.
  • Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.
  • Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.
  • Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.
  • Policy Distillation, A. A. Rusu et al., ICLR, 2016.
  • Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.
  • Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.
  • Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.
  • Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.
  • Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.
  • Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.
  • Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv, 2015.
  • Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.
  • Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.
  • Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.
  • Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.
  • High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.
  • End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.
  • Universal Value Function Approximators, T. Schaul et al., ICML, 2015.
  • Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.
  • Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.
  • Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.
  • Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.
  • Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Robotics

  • Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.
  • Benchmarking Deep Reinforcement Learning for Continuous Control, Y. Duan et al., ICML, 2016.
  • Learning Hand-Eye Coordination for Robotic Grasping with Deep Learning and Large-Scale Data Collection, S. Levine et al., arXiv, 2016.
  • Continuous Deep Q-Learning with Model-based Acceleration, S. Gu et al., ICML, 2016.
  • Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.
  • Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
  • Memory-based control with recurrent neural networks, N. Heess et al., NIPS Workshop, 2015.
  • Towards Vision-Based Deep Reinforcement Learning for Robotic Motion Control, F. Zhang et al., arXiv, 2015.
  • Learning Continuous Control Policies by Stochastic Value Gradients, N. Heess et al., NIPS, 2015.
  • Learning Deep Neural Network Policies with Continuous Memory States, M. Zhang et al., arXiv, 2015.
  • High-Dimensional Continuous Control Using Generalized Advantage Estimation, J. Schulman et al., ICLR, 2016.
  • End-to-End Training of Deep Visuomotor Policies, S. Levine et al., arXiv, 2015.
  • DeepMPC: Learning Deep Latent Features for Model Predictive Control, I. Lenz et al., RSS, 2015.
  • Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.

Games

  • Model-Free Episodic Control, C. Blundell et al., arXiv, 2016.
  • Safe and Efficient Off-Policy Reinforcement Learning, R. Munos et al., arXiv, 2016.
  • Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.
  • Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.
  • Control of Memory, Active Perception, and Action in Minecraft, J. Oh et al., ICML, 2016.
  • Dynamic Frame skip Deep Q Network, A. S. Lakshminarayanan et al., IJCAI Deep RL Workshop, 2016.
  • Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.
  • Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
  • Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.
  • Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.
  • Asynchronous Methods for Deep Reinforcement Learning, V. Mnih et al., arXiv, 2016.
  • Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
  • Increasing the Action Gap: New Operators for Reinforcement Learning, M. G. Bellemare et al., AAAI, 2016.
  • How to Discount Deep Reinforcement Learning: Towards New Dynamic Strategies, V. François-Lavet et al., NIPS Workshop, 2015.
  • Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.
  • MazeBase: A Sandbox for Learning from Games, S. Sukhbaatar et al., arXiv, 2016.
  • Dueling Network Architectures for Deep Reinforcement Learning, Z. Wang et al., arXiv, 2015.
  • Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.
  • Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.
  • Policy Distillation, A. A. Rusu et al., ICLR, 2016.
  • Prioritized Experience Replay, T. Schaul et al., ICLR, 2016.
  • Deep Reinforcement Learning with an Action Space Defined by Natural Language, J. He et al., arXiv, 2015.
  • Deep Reinforcement Learning in Parameterized Action Space, M. Hausknecht et al., ICLR, 2016.
  • Variational Information Maximisation for Intrinsically Motivated Reinforcement Learning, S. Mohamed and D. J. Rezende, arXiv, 2015.
  • Deep Reinforcement Learning with Double Q-learning, H. van Hasselt et al., arXiv, 2015.
  • Continuous control with deep reinforcement learning, T. P. Lillicrap et al., ICLR, 2016.
  • Language Understanding for Text-based Games Using Deep Reinforcement Learning, K. Narasimhan et al., EMNLP, 2015.
  • Giraffe: Using Deep Reinforcement Learning to Play Chess, M. Lai, arXiv, 2015.
  • Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.
  • Deep Recurrent Q-Learning for Partially Observable MDPs, M. Hausknecht and P. Stone, arXiv, 2015.
  • Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.
  • Universal Value Function Approximators, T. Schaul et al., ICML, 2015.
  • Massively Parallel Methods for Deep Reinforcement Learning, A. Nair et al., ICML Workshop, 2015.
  • Trust Region Policy Optimization, J. Schulman et al., ICML, 2015.
  • Human-level control through deep reinforcement learning, V. Mnih et al., Nature, 2015.
  • Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.
  • Playing Atari with Deep Reinforcement Learning, V. Mnih et al., NIPS Workshop, 2013.

Monte-Carlo Tree Search

  • Mastering the game of Go with deep neural networks and tree search, D. Silver et al., Nature, 2016.
  • Better Computer Go Player with Neural Network and Long-term Prediction, Y. Tian et al., ICLR, 2016.
  • Deep Learning for Real-Time Atari Game Play Using Offline Monte-Carlo Tree Search Planning, X. Guo et al., NIPS, 2014.

Inverse Reinforcement Learning

  • Guided Cost Learning: Deep Inverse Optimal Control via Policy Optimization, C. Finn et al., arXiv, 2016.
  • Maximum Entropy Deep Inverse Reinforcement Learning, M. Wulfmeier et al., arXiv, 2015.

Multi-Task and Transfer Learning

  • Actor-Mimic: Deep Multitask and Transfer Reinforcement Learning, E. Parisotto et al., ICLR, 2016.
  • Policy Distillation, A. A. Rusu et al., ICLR, 2016.
  • ADAAPT: A Deep Architecture for Adaptive Policy Transfer from Multiple Sources, J. Rajendran et al., arXiv, 2015.
  • Universal Value Function Approximators, T. Schaul et al., ICML, 2015.

Exploration

  • Unifying Count-Based Exploration and Intrinsic Motivation, M. G. Bellemare et al., arXiv, 2016.
  • Curiosity-driven Exploration in Deep Reinforcement Learning via Bayesian Neural Networks, R. Houthooft et al., arXiv, 2016.
  • Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.
  • Deep Exploration via Bootstrapped DQN, I. Osband et al., arXiv, 2016.
  • Action-Conditional Video Prediction using Deep Networks in Atari Games, J. Oh et al., NIPS, 2015.
  • Incentivizing Exploration In Reinforcement Learning With Deep Predictive Models, B. C. Stadie et al., arXiv, 2015.

Multi-Agent

  • Learning to Communicate to Solve Riddles with Deep Distributed Recurrent Q-Networks, J. N. Foerster et al., arXiv, 2016.
  • Multiagent Cooperation and Competition with Deep Reinforcement Learning, A. Tampuu et al., arXiv, 2015.

Hierarchical Learning

  • Deep Successor Reinforcement Learning, T. D. Kulkarni et al., arXiv, 2016.
  • Hierarchical Reinforcement Learning using Spatio-Temporal Abstractions and Deep Neural Networks, R. Krishnamurthy et al., arXiv, 2016.
  • Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation, T. D. Kulkarni et al., arXiv, 2016.

Original source: https://github.com/junhyukoh/deep-reinforcement-learning-papers
