Software Configuration (5) [Reinforcement Learning Framework: Installing Google Dopamine on Ubuntu 16.04 and Initial Testing]

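Dopamine is a framework for fast prototyping of reinforcement learning algorithms. It aims to fill the need for a small, easily accessible codebase in which users can conveniently build experiments to test ideas from their own research.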

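Its design principles are:

Easy experimentation: make it easy for users to run their own experiments.
Flexible development: make it easy for new users to try out their research ideas.
Compact and reliable: provide implementations of a small set of well-tested, classic algorithms.
Reproducible: facilitate reproducibility of results.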

In the spirit of these principles, this first version focuses on supporting the state-of-the-art, single-GPU Rainbow agent (Hessel et al., 2018) applied to Atari 2600 game-playing (Bellemare et al., 2013). Specifically, our Rainbow agent implements the three components identified as most important by Hessel et al.:

n-step Bellman updates (see e.g. Mnih et al., 2016)
Prioritized experience replay (Schaul et al., 2015)
Distributional reinforcement learning (C51; Bellemare et al., 2017)

For completeness, we also provide an implementation of DQN (Mnih et al., 2015). For additional details, please see our documentation.
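To make the n-step Bellman update listed above concrete, here is a minimal sketch of how an n-step target can be computed. It is not taken from Dopamine's code; the function name `n_step_target` and its arguments are hypothetical, and `bootstrap_value` stands for whatever value estimate the agent uses for the state reached after n steps.

```python
def n_step_target(rewards, bootstrap_value, gamma):
    """Return the n-step target: discounted rewards plus a discounted bootstrap.

    rewards: list of the n rewards observed after the starting state.
    bootstrap_value: value estimate for the state reached after n steps.
    gamma: discount factor in [0, 1].
    """
    target = 0.0
    for k, reward in enumerate(rewards):
        # The k-th reward is discounted by gamma^k.
        target += (gamma ** k) * reward
    # The bootstrap term is discounted by gamma^n, where n = len(rewards).
    target += (gamma ** len(rewards)) * bootstrap_value
    return target


if __name__ == "__main__":
    # Illustrative numbers only: three rewards of 1.0, bootstrap 10.0, gamma 0.5.
    print(n_step_target([1.0, 1.0, 1.0], 10.0, 0.5))
```

With a single reward (n = 1) this reduces to the familiar one-step target, reward plus gamma times the bootstrap value.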

Installation:

First, set up a virtual environment:

sudo apt-get install virtualenv
virtualenv --python=python2.7 dopamine-env
source dopamine-env/bin/activate

The commands above set up a Python 2.7 environment; setting up a Python 3 environment requires some additional steps.

This creates a directory named dopamine-env that contains your virtual environment. The last command activates the environment.
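If you are unsure whether the environment is actually active, a small check like the following may help. It is only a sketch: the helper name `in_virtualenv` is hypothetical, and it relies on the usual convention that older virtualenv releases set `sys.real_prefix` while venv and newer virtualenv releases make `sys.base_prefix` differ from `sys.prefix`.

```python
import sys


def in_virtualenv(sys_module=sys):
    """Return True if the interpreter appears to run inside a virtual environment."""
    # Older virtualenv releases (common with Python 2.7) set sys.real_prefix.
    if hasattr(sys_module, "real_prefix"):
        return True
    # venv and newer virtualenv releases set base_prefix to the system interpreter.
    base_prefix = getattr(sys_module, "base_prefix", sys_module.prefix)
    return base_prefix != sys_module.prefix


if __name__ == "__main__":
    print("virtual environment active:", in_virtualenv())
    print("interpreter prefix:", sys.prefix)
```

Run it with the same `python` you intend to use for Dopamine; the printed prefix should point inside the dopamine-env directory.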

Next, install the dependencies that Dopamine requires:

sudo apt-get install cmake zlib1g-dev
pip install absl-py atari-py gin-config gym opencv-python tensorflow

During installation you can ignore the following message: tensorflow 1.10.1 has requirement numpy<=1.14.5,>=1.13.3, but you'll have numpy 1.15.1 which is incompatible.
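The message simply reports that the installed numpy version lies outside the range that tensorflow declares. The sketch below reproduces that comparison for the numbers quoted in the message. The helper names are hypothetical, and the parser assumes plain dotted numeric versions such as 1.15.1; it is not how pip itself evaluates requirements.

```python
def parse_version(text):
    """Turn a dotted numeric version such as '1.15.1' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def within_range(version, lower, upper):
    """Return True if lower <= version <= upper (both bounds inclusive)."""
    return parse_version(lower) <= parse_version(version) <= parse_version(upper)


if __name__ == "__main__":
    # Bounds and installed version are taken from the pip message above.
    print(within_range("1.15.1", "1.13.3", "1.14.5"))
    print(within_range("1.14.5", "1.13.3", "1.14.5"))
```

Comparing tuples of integers rather than raw strings avoids ordering mistakes such as treating "1.9" as newer than "1.13".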

Finally, download the Dopamine source, e.g.

git clone https://github.com/google/dopamine.git

Testing

You can check whether the installation succeeded by running:

cd dopamine
export PYTHONPATH=${PYTHONPATH}:.
python tests/atari_init_test.py
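If the test fails with an import error, it can help to check which dependency is missing before digging further. The following is a minimal sketch, not part of Dopamine; the module names in `REQUIRED_MODULES` are the import names assumed to correspond to the pip packages installed earlier (for example, opencv-python is imported as cv2 and gin-config as gin).

```python
import importlib

# Assumed import names for: absl-py, atari-py, gin-config, gym, opencv-python, tensorflow.
REQUIRED_MODULES = ["absl", "atari_py", "gin", "gym", "cv2", "tensorflow"]


def missing_modules(module_names):
    """Return the names from module_names that cannot be imported."""
    missing = []
    for name in module_names:
        try:
            importlib.import_module(name)
        except ImportError:
            # Covers both absent packages and packages whose import fails to load.
            missing.append(name)
    return missing


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules: " + ", ".join(missing))
    else:
        print("All required modules can be imported.")
```

Run it inside the activated dopamine-env environment so that it inspects the same packages Dopamine will use.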

The entry point for the standard Atari 2600 experiment is dopamine/atari/train.py. Run the basic DQN agent with the following command:

python -um dopamine.atari.train \
  --agent_name=dqn \
  --base_dir=/tmp/dopamine \
  --gin_files='dopamine/agents/dqn/configs/dqn.gin'

To get finer-grained information about the process, you can adjust the experiment parameters in dopamine/agents/dqn/configs/dqn.gin, in particular by reducing Runner.training_steps and Runner.evaluation_steps, which together determine the total number of steps needed to complete an iteration. This is useful if you want to inspect log files or checkpoints, which are generated at the end of each iteration.

More generally, the whole of Dopamine is easily configured using the gin configuration framework.
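As a rough illustration of shortening an iteration, the sketch below computes the steps per iteration and assembles a training command with reduced values. Two things here are assumptions rather than facts from this post: that train.py accepts a repeatable `--gin_bindings` flag for overriding gin values (check `python -um dopamine.atari.train --help` for your version, or edit dqn.gin directly instead), and the step counts 1000 and 500, which are arbitrary small values chosen for illustration.

```python
def steps_per_iteration(training_steps, evaluation_steps):
    """Total steps in one iteration: the training phase plus the evaluation phase."""
    return training_steps + evaluation_steps


def build_train_command(base_dir, training_steps, evaluation_steps):
    """Build the argument list for a DQN run with shortened iterations."""
    return [
        "python", "-um", "dopamine.atari.train",
        "--agent_name=dqn",
        "--base_dir={}".format(base_dir),
        "--gin_files=dopamine/agents/dqn/configs/dqn.gin",
        # Assumed flag: overrides values from the gin file on the command line.
        "--gin_bindings=Runner.training_steps={}".format(training_steps),
        "--gin_bindings=Runner.evaluation_steps={}".format(evaluation_steps),
    ]


if __name__ == "__main__":
    # Hypothetical small values so that logs and checkpoints appear quickly.
    command = build_train_command("/tmp/dopamine", 1000, 500)
    print(steps_per_iteration(1000, 500))
    print(" ".join(command))
```

The printed command can be pasted into the shell from the dopamine directory; logs and checkpoints are then written after each shortened iteration.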

Original source: https://github.com/google/dopamine
