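Contents

  • Basic concepts of reinforcement learning
    • 1. Steps of a reinforcement learning algorithm
    • 2. How reinforcement learning differs from other machine learning paradigms
    • 3. Elements of reinforcement learning
      • a. Agent
      • b. Policy function
      • c. Value function
      • d. Model
    • 4. Types of reinforcement learning environments
      • (1) Deterministic environments
      • (2) Stochastic environments
      • (3) Fully observable environments
      • (4) Partially observable environments
      • (5) Discrete environments
      • (6) Continuous environments
      • (7) Episodic and non-episodic environments
      • (8) Single-agent and multi-agent environments
    • 5. Reinforcement learning platforms
    • 6. An introductory reinforcement learning example
      • Program 1: the cart-pole balancing game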

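Basic concepts of reinforcement learning

1. Steps of a reinforcement learning algorithm:

  1. First, the agent interacts with the environment by performing an action.
  2. After performing the action, the agent moves from one state to another.
  3. The agent then receives a reward based on the action it performed.
  4. From the reward, the agent learns whether the action was good or bad.
  5. If the action was good, that is, if the agent received a positive reward, the agent will tend to perform that action again; otherwise it will try other actions in search of a positive reward. The method is therefore essentially a trial-and-error learning process.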
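The steps above can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions: the environment `pull`, its payoff probabilities, and the exploration rate are all hypothetical, and the environment has a single state, so step 2 (the state transition) is omitted.

```python
import random

def pull(action, rng):
    """Hypothetical environment: action 1 is rewarded (+1) 80% of the time,
    action 0 only 20% of the time; otherwise the reward is -1."""
    win_prob = 0.8 if action == 1 else 0.2
    return 1 if rng.random() < win_prob else -1

def run_trial_and_error(steps=2000, explore=0.1, seed=0):
    """Try actions, observe rewards, and favor the action that paid off."""
    rng = random.Random(seed)
    totals = [0.0, 0.0]  # summed reward per action
    counts = [0, 0]      # how often each action was tried
    for _ in range(steps):
        if 0 in counts or rng.random() < explore:
            action = rng.randrange(2)  # try something, possibly new
        else:
            averages = [totals[a] / counts[a] for a in (0, 1)]
            action = averages.index(max(averages))  # repeat what worked
        reward = pull(action, rng)  # the environment answers with a reward
        totals[action] += reward
        counts[action] += 1
    return counts

if __name__ == "__main__":
    print(run_trial_and_error())  # [times action 0 was tried, times action 1 was tried]
```

Because action 1 earns positive rewards more often, the agent ends up choosing it far more frequently, which is the "tend to perform that action again" behavior described in step 5.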

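2. How reinforcement learning differs from other machine learning paradigms

In supervised learning, the machine (agent) learns from a training set of labeled input/output pairs. The goal is for the model to generalize what it has learned so that it performs well on unseen data. This requires an external supervisor with complete knowledge of the environment who supervises the agent as it carries out the task.

In unsupervised learning, the model is given only a set of input training data and learns to identify hidden patterns in it. Reinforcement learning is commonly mistaken for a form of unsupervised learning, but it is not. In unsupervised learning the model learns hidden structure, whereas in reinforcement learning the model learns by maximizing reward. Suppose we want to recommend new movies to a user. Unsupervised learning would recommend movies similar to those the user has already watched, while reinforcement learning would continually receive feedback from the user, learn which movies the user prefers, build a knowledge base from that feedback, and use it to recommend a new movie.

There is also semi-supervised learning, which is essentially a combination of supervised and unsupervised learning. It involves function estimation from both labeled and unlabeled data, whereas reinforcement learning is essentially about the interaction between an agent and its environment. Reinforcement learning is therefore distinct from all of these other machine learning paradigms.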

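3. Elements of reinforcement learning

a. Agent

An agent is a software program that makes intelligent decisions; in reinforcement learning it is usually the learner. The agent acts by interacting with the environment and receives rewards according to its actions, for example Super Mario moving through a video game.

b. Policy function

A policy defines the agent's behavior in the environment: which action the agent takes depends on its policy. Suppose there are several routes from home to the office, some short and some relatively long. These routes can be called policies, because each represents a way of acting chosen to reach the goal. A policy is usually denoted by the symbol π. A policy can take the form of a lookup table or of a complex search process.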
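As a rough illustration of a policy in lookup-table form, the sketch below maps each state to an action for the home-to-office example. The state names, action names, and the `NEXT_STATE` transition table are hypothetical and exist only to make the example runnable.

```python
# Hypothetical lookup-table policy: one action per state.
POLICY = {
    "home": "take_highway",
    "highway": "exit_downtown",
    "downtown": "park_at_office",
}

# Hypothetical deterministic transitions: (state, action) -> next state.
NEXT_STATE = {
    ("home", "take_highway"): "highway",
    ("highway", "exit_downtown"): "downtown",
    ("downtown", "park_at_office"): "office",
}

def pi(state):
    """The policy: look up which action to take in the given state."""
    return POLICY[state]

def follow_policy(start, goal="office"):
    """Apply the policy step by step until the goal state is reached."""
    state, actions = start, []
    while state != goal:
        action = pi(state)
        actions.append(action)
        state = NEXT_STATE[(state, action)]
    return actions

if __name__ == "__main__":
    print(follow_policy("home"))
```

A different table would describe a different route, which is the sense in which each route in the example corresponds to a policy.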

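c. Value function

A value function indicates how good it is for the agent to be in a particular state. It depends on the policy and is usually written v(s). The value of a state equals the total expected reward the agent collects starting from that state. Value functions come in several forms. The optimal value function is the one that has the highest value in every state compared with all other value functions. Likewise, an optimal policy is a policy that attains the optimal value function.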
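In the standard textbook notation, these definitions can be written as follows. The discount factor $\gamma \in [0, 1]$ and the time-indexed symbols $s_t$ and $r_t$ are assumptions added here; the text above does not introduce them. Setting $\gamma = 1$ gives the plain total expected reward.

```latex
% Value of state s under policy \pi: expected (discounted) sum of future rewards
v_\pi(s) = \mathbb{E}_\pi\!\left[\, \sum_{k=0}^{\infty} \gamma^{k}\, r_{t+k+1} \;\middle|\; s_t = s \right]

% Optimal value function: the largest value achievable in every state
v_*(s) = \max_{\pi} v_\pi(s) \quad \text{for all } s

% Optimal policy: a policy whose value function equals v_*
v_{\pi_*}(s) = v_*(s) \quad \text{for all } s
```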

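d. Model

A model is the agent's representation of the environment. Learning can be model-based or model-free. In model-based learning the agent uses previously learned information to carry out the task, whereas in model-free learning the agent relies only on trial-and-error experience to find the right action. Suppose you want to get from home to the office faster. With model-based learning you simply use previously learned experience (a map) to reach the office quickly, while with model-free learning you use no prior experience: you try all the different routes and pick the faster one.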
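The home-to-office example can be sketched as below. Everything here is hypothetical: the route names, the travel times, and the noise range are made up, and averaging a fixed number of trips is just one simple way to learn by trial and error.

```python
import random

# Hypothetical true average travel times in minutes.
# The model-based agent is given these as its "map";
# the model-free agent never sees them and must drive to find out.
TRUE_MINUTES = {"highway": 25.0, "back_roads": 35.0, "city_center": 45.0}

def drive(route, rng):
    """Environment: one trip along a route, with up to 4 minutes of noise."""
    return TRUE_MINUTES[route] + rng.uniform(-4.0, 4.0)

def model_based_choice(road_map):
    """Plan directly from the model (the map) without driving at all."""
    return min(road_map, key=road_map.get)

def model_free_choice(routes, trips_per_route=20, seed=0):
    """Try every route several times and keep the one that was fastest on average."""
    rng = random.Random(seed)
    averages = {}
    for route in routes:
        times = [drive(route, rng) for _ in range(trips_per_route)]
        averages[route] = sum(times) / len(times)
    return min(averages, key=averages.get)

if __name__ == "__main__":
    print(model_based_choice(TRUE_MINUTES))
    print(model_free_choice(list(TRUE_MINUTES)))
```

Both approaches settle on the same route here; the difference is that the model-free agent has to pay for its knowledge with actual trips.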

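4. Types of reinforcement learning environments

Everything the agent interacts with is called the environment. The environment is the outside world: everything other than the agent. There are different types of environment:

(1) Deterministic environments

An environment is deterministic if the outcome is known from the current state. In chess, for example, we know the exact result of moving any piece.

(2) Stochastic environments

An environment is stochastic if the outcome cannot be determined from the current state. Such environments carry a greater degree of uncertainty. For example, when rolling a die we never know which number will come up.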
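The contrast between the two can be sketched with two toy transition functions. Both functions and the board-position state are hypothetical and are not part of any library.

```python
import random

def deterministic_step(position, move):
    """Deterministic: the same state and action always give the same next state."""
    return position + move

def stochastic_step(position, rng):
    """Stochastic: the next state depends on a die roll that cannot be known in advance."""
    return position + rng.randint(1, 6)

if __name__ == "__main__":
    rng = random.Random()
    print(deterministic_step(3, 2), deterministic_step(3, 2))  # always identical
    print(stochastic_step(3, rng), stochastic_step(3, rng))    # may differ
```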

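(3) Fully observable environments

An environment is fully observable if the agent can determine the state of the system at all times. In chess, for example, the state of the system, namely the positions of all the pieces on the board, is always available, so the player can make an optimal decision.

(4) Partially observable environments

An environment is partially observable if the agent cannot determine the state of the system at all times. In poker, for example, we do not know the opponent's cards.

(5) Discrete environments

An environment is discrete if there is only a finite set of actions for moving from one state to another. In chess, for example, there is only a finite set of possible moves.

(6) Continuous environments

An environment is continuous if there is an infinite set of actions for moving from one state to another. For example, a journey from a starting point to a destination can follow many different routes.

(7) Episodic and non-episodic environments

An episodic environment is also called a non-sequential environment. In an episodic environment the agent's current action does not affect future actions, whereas in a non-episodic environment, also called a sequential environment, the current action does affect future actions. In other words, the agent performs independent tasks in an episodic environment, while in a non-episodic environment all of the agent's actions are related.

(8) Single-agent and multi-agent environments

A single-agent environment contains only one agent, while a multi-agent environment can contain several. Multi-agent environments are commonly used for complex tasks. Different agents act in entirely different environments, and agents in different environments can communicate with each other. Multi-agent environments are mostly stochastic, because they carry a greater degree of uncertainty.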

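5. Reinforcement learning platforms

OpenAI Gym is a toolkit for building, evaluating, and comparing reinforcement learning algorithms. It is compatible with algorithms written in any framework, such as TensorFlow, Theano, or Keras. The toolkit is simple and easy to understand, makes no assumptions about the structure of the agent, and provides an interface to all reinforcement learning tasks.

OpenAI Universe is an extension of OpenAI Gym that supports training and evaluating agents in environments ranging from simple ones to complex real-time ones. It gives unlimited access to many game environments. With Universe, any program can be turned into a Gym environment without access to the program's internals, source code, or API, because Universe launches the program automatically behind a virtual network computing (VNC) remote desktop.

DeepMind Lab is another excellent platform for AI agent research. It provides a rich set of simulated environments and serves as a lab for running many kinds of reinforcement learning algorithms. It is highly customizable and extensible, and its visuals are rich, science-fiction themed, and realistic.

There are also platforms such as RL-Glue, Project Malmo, and VizDoom; I have not tried them, so they are not covered here.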

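6. An introductory reinforcement learning example

Installation guides for Gym are easy to find online (for example via Baidu); one reference is the article 三大操作系统中的环境配置方法 ("Environment setup on the three major operating systems"). Once Gym is installed, run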

from gym import envs
print(envs.registry.all())

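to list the supported environments. (This call belongs to the older Gym API used throughout this post; newer Gym releases may expose the registry differently.)
The output is as follows: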

dict_values([EnvSpec(KangarooNoFrameskip-v4), EnvSpec(SkiingDeterministic-v0), EnvSpec(ChopperCommand-ramNoFrameskip-v0), EnvSpec(NameThisGame-ramDeterministic-v4), EnvSpec(Gopher-ramNoFrameskip-v4), EnvSpec(MontezumaRevenge-ramNoFrameskip-v0), EnvSpec(Skiing-ramDeterministic-v0), EnvSpec(Pooyan-ramNoFrameskip-v4), EnvSpec(ChopperCommand-ramDeterministic-v0), EnvSpec(Assault-ramNoFrameskip-v4), EnvSpec(ZaxxonNoFrameskip-v4), EnvSpec(JourneyEscape-ram-v0), EnvSpec(BankHeist-v4), EnvSpec(Gravitar-ramNoFrameskip-v0), EnvSpec(Berzerk-ramNoFrameskip-v0), EnvSpec(Pong-ram-v0), EnvSpec(DemonAttackNoFrameskip-v4), EnvSpec(BankHeist-v0), EnvSpec(AtlantisNoFrameskip-v0), EnvSpec(MsPacmanNoFrameskip-v0), EnvSpec(Bowling-v0), EnvSpec(ElevatorAction-v4), EnvSpec(Breakout-ramDeterministic-v4), EnvSpec(FishingDerbyDeterministic-v4), EnvSpec(BipedalWalkerHardcore-v2), EnvSpec(KungFuMaster-ramDeterministic-v0), EnvSpec(DemonAttackDeterministic-v4), EnvSpec(CentipedeDeterministic-v0), EnvSpec(TimePilot-ramDeterministic-v4), EnvSpec(SeaquestDeterministic-v4), EnvSpec(SkiingNoFrameskip-v0), EnvSpec(MountainCarContinuous-v0), EnvSpec(BattleZone-v4), EnvSpec(Krull-ramDeterministic-v4), EnvSpec(GravitarDeterministic-v4), EnvSpec(SpaceInvaders-ramNoFrameskip-v0), EnvSpec(HandManipulateEggFull-v0), EnvSpec(BowlingDeterministic-v4), EnvSpec(InvertedPendulum-v2), EnvSpec(Skiing-ram-v0), EnvSpec(Tutankham-ramDeterministic-v0), EnvSpec(Gopher-ramDeterministic-v4), EnvSpec(KellyCoinflipGeneralized-v0), EnvSpec(Asterix-ramDeterministic-v0), EnvSpec(BattleZone-ramNoFrameskip-v4), EnvSpec(DemonAttack-ramDeterministic-v4), EnvSpec(AirRaid-ramDeterministic-v4), EnvSpec(Robotank-ramDeterministic-v4), EnvSpec(Gravitar-v0), EnvSpec(VentureDeterministic-v0), EnvSpec(AsterixDeterministic-v0), EnvSpec(DemonAttackDeterministic-v0), EnvSpec(BeamRiderDeterministic-v0), EnvSpec(IceHockey-ramDeterministic-v4), EnvSpec(HandManipulateEggDense-v0), EnvSpec(Boxing-ramDeterministic-v0), 
EnvSpec(HandManipulatePenFullDense-v0), EnvSpec(KungFuMasterDeterministic-v4), EnvSpec(ElevatorAction-ram-v4), EnvSpec(SolarisDeterministic-v0), EnvSpec(JamesbondNoFrameskip-v0), EnvSpec(FrostbiteDeterministic-v0), EnvSpec(AmidarDeterministic-v4), EnvSpec(Enduro-ramDeterministic-v0), EnvSpec(CarnivalNoFrameskip-v4), EnvSpec(Tutankham-v4), EnvSpec(HeroNoFrameskip-v0), EnvSpec(MountainCar-v0), EnvSpec(YarsRevenge-ram-v4), EnvSpec(Centipede-ramNoFrameskip-v0), EnvSpec(VideoPinball-ramDeterministic-v4), EnvSpec(BattleZone-v0), EnvSpec(BankHeistNoFrameskip-v0), EnvSpec(MsPacman-ramNoFrameskip-v4), EnvSpec(Centipede-ramDeterministic-v0), EnvSpec(NameThisGameDeterministic-v0), EnvSpec(Swimmer-v2), EnvSpec(Skiing-ram-v4), EnvSpec(JourneyEscape-ramNoFrameskip-v4), EnvSpec(Assault-v4), EnvSpec(MsPacman-v0), EnvSpec(Frostbite-ramNoFrameskip-v4), EnvSpec(Breakout-ram-v4), EnvSpec(FrostbiteDeterministic-v4), EnvSpec(Qbert-ramNoFrameskip-v0), EnvSpec(CliffWalking-v0), EnvSpec(TimePilotNoFrameskip-v0), EnvSpec(BeamRiderNoFrameskip-v0), EnvSpec(Phoenix-ramDeterministic-v4), EnvSpec(IceHockey-ramNoFrameskip-v0), EnvSpec(IceHockeyDeterministic-v0), EnvSpec(Kangaroo-ram-v0), EnvSpec(HandManipulatePen-v0), EnvSpec(VideoPinball-v0), EnvSpec(ElevatorAction-ramNoFrameskip-v4), EnvSpec(KellyCoinflip-v0), EnvSpec(Centipede-v4), EnvSpec(CrazyClimber-ram-v4), EnvSpec(IceHockey-ramNoFrameskip-v4), EnvSpec(HandManipulateBlockRotateParallelDense-v0), EnvSpec(NameThisGameNoFrameskip-v0), EnvSpec(VideoPinballDeterministic-v0), EnvSpec(Krull-ram-v0), EnvSpec(Pong-ramDeterministic-v0), EnvSpec(IceHockeyNoFrameskip-v4), EnvSpec(TimePilot-ram-v0), EnvSpec(Enduro-ramDeterministic-v4), EnvSpec(CentipedeDeterministic-v4), EnvSpec(KangarooDeterministic-v4), EnvSpec(CrazyClimber-v4), EnvSpec(Bowling-ramNoFrameskip-v0), EnvSpec(Pendulum-v0), EnvSpec(BankHeistDeterministic-v0), EnvSpec(Phoenix-ramNoFrameskip-v4), EnvSpec(Alien-v0), EnvSpec(ChopperCommand-ram-v4), EnvSpec(FrostbiteNoFrameskip-v0), 
EnvSpec(PongDeterministic-v4), EnvSpec(Amidar-v4), EnvSpec(DoubleDunk-ramNoFrameskip-v4), EnvSpec(BankHeistDeterministic-v4), EnvSpec(Solaris-ramDeterministic-v0), EnvSpec(TimePilot-ramDeterministic-v0), EnvSpec(Freeway-ramNoFrameskip-v0), EnvSpec(Alien-ramNoFrameskip-v0), EnvSpec(CubeCrash-v0), EnvSpec(BankHeistNoFrameskip-v4), EnvSpec(BeamRider-v0), EnvSpec(CubeCrashSparse-v0), EnvSpec(PrivateEyeDeterministic-v0), EnvSpec(Pooyan-ramNoFrameskip-v0), EnvSpec(Bowling-ram-v4), EnvSpec(FetchPushDense-v1), EnvSpec(Pong-v0), EnvSpec(AirRaidNoFrameskip-v4), EnvSpec(DemonAttack-ramDeterministic-v0), EnvSpec(Riverraid-ram-v4), EnvSpec(ZaxxonDeterministic-v0), EnvSpec(FetchSlideDense-v1), EnvSpec(Skiing-ramNoFrameskip-v4), EnvSpec(Gopher-v0), EnvSpec(SpaceInvaders-ramDeterministic-v4), EnvSpec(Gravitar-v4), EnvSpec(BattleZoneDeterministic-v0), EnvSpec(SolarisDeterministic-v4), EnvSpec(BattleZone-ramDeterministic-v0), EnvSpec(Breakout-v0), EnvSpec(AlienDeterministic-v4), EnvSpec(Centipede-v0), EnvSpec(Freeway-ramNoFrameskip-v4), EnvSpec(Seaquest-ramDeterministic-v0), EnvSpec(AlienNoFrameskip-v0), EnvSpec(MontezumaRevenge-v0), EnvSpec(Robotank-v4), EnvSpec(VideoPinball-v4), EnvSpec(Hero-ramNoFrameskip-v4), EnvSpec(AsteroidsDeterministic-v0), EnvSpec(AirRaid-v0), EnvSpec(RoadRunner-v4), EnvSpec(Hero-ramDeterministic-v4), EnvSpec(SpaceInvaders-ramDeterministic-v0), EnvSpec(BeamRider-ramNoFrameskip-v4), EnvSpec(Gravitar-ramDeterministic-v4), EnvSpec(InvertedDoublePendulum-v2), EnvSpec(MontezumaRevenge-ram-v4), EnvSpec(Breakout-v4), EnvSpec(HandManipulateBlock-v0), EnvSpec(TennisDeterministic-v0), EnvSpec(WizardOfWorNoFrameskip-v4), EnvSpec(Berzerk-ramDeterministic-v0), EnvSpec(YarsRevengeNoFrameskip-v0), EnvSpec(FreewayDeterministic-v0), EnvSpec(MontezumaRevenge-ramNoFrameskip-v4), EnvSpec(Zaxxon-ramDeterministic-v0), EnvSpec(Hopper-v2), EnvSpec(TennisNoFrameskip-v4), EnvSpec(UpNDown-ram-v0), EnvSpec(PrivateEye-ramNoFrameskip-v4), EnvSpec(AssaultDeterministic-v0), 
EnvSpec(RobotankNoFrameskip-v0), EnvSpec(BoxingDeterministic-v0), EnvSpec(BattleZoneNoFrameskip-v4), EnvSpec(Breakout-ramNoFrameskip-v4), EnvSpec(Blackjack-v0), EnvSpec(BipedalWalker-v2), EnvSpec(ChopperCommandDeterministic-v4), EnvSpec(Robotank-ram-v0), EnvSpec(NameThisGameDeterministic-v4), EnvSpec(StarGunnerDeterministic-v0), EnvSpec(Solaris-ramDeterministic-v4), EnvSpec(UpNDown-ramDeterministic-v4), EnvSpec(RoadRunnerNoFrameskip-v0), EnvSpec(PrivateEye-ram-v0), EnvSpec(Pooyan-v4), EnvSpec(Qbert-v4), EnvSpec(DoubleDunk-ramDeterministic-v0), EnvSpec(MontezumaRevengeDeterministic-v0), EnvSpec(ElevatorAction-ramNoFrameskip-v0), EnvSpec(Copy-v0), EnvSpec(Gravitar-ramNoFrameskip-v4), EnvSpec(Alien-v4), EnvSpec(Walker2d-v2), EnvSpec(EnduroNoFrameskip-v0), EnvSpec(JourneyEscape-ram-v4), EnvSpec(Robotank-ram-v4), EnvSpec(Venture-v4), EnvSpec(BankHeist-ram-v4), EnvSpec(Amidar-v0), EnvSpec(NameThisGame-ram-v0), EnvSpec(AsteroidsDeterministic-v4), EnvSpec(Asteroids-ramDeterministic-v0), EnvSpec(TimePilotDeterministic-v4), EnvSpec(Centipede-ram-v4), EnvSpec(FishingDerby-ram-v4), EnvSpec(HandManipulateBlockRotateParallel-v0), EnvSpec(HeroDeterministic-v4), EnvSpec(AirRaid-ramNoFrameskip-v4), EnvSpec(UpNDown-ramDeterministic-v0), EnvSpec(RobotankNoFrameskip-v4), EnvSpec(FrostbiteNoFrameskip-v4), EnvSpec(WizardOfWor-ramNoFrameskip-v0), EnvSpec(GopherNoFrameskip-v4), EnvSpec(DoubleDunk-v4), EnvSpec(YarsRevengeNoFrameskip-v4), EnvSpec(HalfCheetah-v2), EnvSpec(CrazyClimberNoFrameskip-v4), EnvSpec(AsterixNoFrameskip-v0), EnvSpec(DoubleDunkNoFrameskip-v4), EnvSpec(BoxingNoFrameskip-v0), EnvSpec(NameThisGame-ramNoFrameskip-v4), EnvSpec(GravitarDeterministic-v0), EnvSpec(BattleZone-ramDeterministic-v4), EnvSpec(Pong-ramDeterministic-v4), EnvSpec(FishingDerby-ramDeterministic-v0), EnvSpec(PitfallDeterministic-v4), EnvSpec(Berzerk-ramNoFrameskip-v4), EnvSpec(RoadRunner-ramDeterministic-v0), EnvSpec(Atlantis-ramNoFrameskip-v0), EnvSpec(SpaceInvaders-ramNoFrameskip-v4), 
EnvSpec(PooyanNoFrameskip-v0), EnvSpec(Berzerk-ram-v4), EnvSpec(VideoPinball-ramDeterministic-v0), EnvSpec(UpNDown-ramNoFrameskip-v0), EnvSpec(Asteroids-ramNoFrameskip-v4), EnvSpec(PhoenixDeterministic-v0), EnvSpec(CrazyClimber-ramDeterministic-v0), EnvSpec(Jamesbond-ramDeterministic-v4), EnvSpec(CarnivalDeterministic-v0), EnvSpec(PrivateEyeDeterministic-v4), EnvSpec(Enduro-ram-v4), EnvSpec(Phoenix-ramNoFrameskip-v0), EnvSpec(NameThisGame-v0), EnvSpec(ElevatorActionNoFrameskip-v0), EnvSpec(MontezumaRevenge-v4), EnvSpec(Pong-v4), EnvSpec(Tennis-ramDeterministic-v4), EnvSpec(Krull-v4), EnvSpec(AmidarDeterministic-v0), EnvSpec(Pitfall-v0), EnvSpec(Gopher-ramDeterministic-v0), EnvSpec(RoadRunner-ram-v4), EnvSpec(Alien-ram-v4), EnvSpec(CentipedeNoFrameskip-v0), EnvSpec(Riverraid-ram-v0), EnvSpec(DoubleDunk-ram-v4), EnvSpec(Phoenix-v4), EnvSpec(Pitfall-ramNoFrameskip-v0), EnvSpec(Asterix-ramNoFrameskip-v4), EnvSpec(PhoenixNoFrameskip-v0), EnvSpec(Jamesbond-ramNoFrameskip-v4), EnvSpec(Berzerk-ramDeterministic-v4), EnvSpec(YarsRevenge-ramNoFrameskip-v4), EnvSpec(Riverraid-ramDeterministic-v4), EnvSpec(SolarisNoFrameskip-v0), EnvSpec(Kangaroo-v0), EnvSpec(IceHockeyDeterministic-v4), EnvSpec(Thrower-v2), EnvSpec(WizardOfWorDeterministic-v4), EnvSpec(Freeway-v4), EnvSpec(Zaxxon-ramNoFrameskip-v4), EnvSpec(Assault-ramDeterministic-v0), EnvSpec(Bowling-v4), EnvSpec(UpNDownDeterministic-v4), EnvSpec(Frostbite-v4), EnvSpec(TutankhamDeterministic-v0), EnvSpec(MsPacman-ramDeterministic-v0), EnvSpec(Centipede-ram-v0), EnvSpec(CartPole-v0), EnvSpec(VentureNoFrameskip-v4), EnvSpec(TennisDeterministic-v4), EnvSpec(Reverse-v0), EnvSpec(HandManipulateBlockFull-v0), EnvSpec(HumanoidStandup-v2), EnvSpec(Frostbite-ramDeterministic-v0), EnvSpec(DemonAttack-ramNoFrameskip-v0), EnvSpec(IceHockeyNoFrameskip-v0), EnvSpec(Gravitar-ram-v0), EnvSpec(RobotankDeterministic-v4), EnvSpec(RoadRunner-ramNoFrameskip-v0), EnvSpec(Pooyan-ramDeterministic-v0), EnvSpec(AtlantisNoFrameskip-v4), 
EnvSpec(Alien-ramDeterministic-v4), EnvSpec(Seaquest-ramNoFrameskip-v0), EnvSpec(HeroDeterministic-v0), EnvSpec(Robotank-ramNoFrameskip-v4), EnvSpec(Tutankham-ramNoFrameskip-v0), EnvSpec(MsPacman-ramNoFrameskip-v0), EnvSpec(PitfallNoFrameskip-v4), EnvSpec(Pong-ram-v4), EnvSpec(DoubleDunk-v0), EnvSpec(Pooyan-ram-v0), EnvSpec(Tennis-v0), EnvSpec(Solaris-v4), EnvSpec(KungFuMasterNoFrameskip-v4), EnvSpec(Carnival-v0), EnvSpec(BankHeist-ramDeterministic-v0), EnvSpec(HandManipulatePenDense-v0), EnvSpec(Seaquest-v4), EnvSpec(AirRaidDeterministic-v4), EnvSpec(VideoPinball-ram-v4), EnvSpec(SkiingDeterministic-v4), EnvSpec(CrazyClimber-v0), EnvSpec(BreakoutNoFrameskip-v4), EnvSpec(HandReachDense-v0), EnvSpec(Qbert-ramDeterministic-v0), EnvSpec(Hero-ramDeterministic-v0), EnvSpec(SpaceInvaders-v0), EnvSpec(FishingDerby-ramNoFrameskip-v0), EnvSpec(CrazyClimberDeterministic-v0), EnvSpec(AirRaid-ramDeterministic-v0), EnvSpec(YarsRevenge-v0), EnvSpec(MontezumaRevengeNoFrameskip-v4), EnvSpec(Freeway-ramDeterministic-v0), EnvSpec(IceHockey-v4), EnvSpec(KrullDeterministic-v4), EnvSpec(PrivateEye-ramNoFrameskip-v0), EnvSpec(JourneyEscape-ramDeterministic-v0), EnvSpec(KungFuMaster-ramNoFrameskip-v4), EnvSpec(Seaquest-ramNoFrameskip-v4), EnvSpec(FishingDerby-v0), EnvSpec(Gopher-ram-v4), EnvSpec(SeaquestNoFrameskip-v4), EnvSpec(ChopperCommandNoFrameskip-v4), EnvSpec(KungFuMaster-ramNoFrameskip-v0), EnvSpec(Enduro-v0), EnvSpec(Enduro-v4), EnvSpec(PooyanDeterministic-v0), EnvSpec(SpaceInvaders-ram-v0), EnvSpec(Boxing-ramDeterministic-v4), EnvSpec(Hero-ram-v0), EnvSpec(VentureDeterministic-v4), EnvSpec(RiverraidNoFrameskip-v0), EnvSpec(QbertDeterministic-v4), EnvSpec(BattleZone-ramNoFrameskip-v0), EnvSpec(Riverraid-v0), EnvSpec(Frostbite-ramDeterministic-v4), EnvSpec(NameThisGame-ramDeterministic-v0), EnvSpec(Solaris-ram-v4), EnvSpec(UpNDown-v4), EnvSpec(HandManipulateBlockRotateXYZDense-v0), EnvSpec(Assault-ram-v4), EnvSpec(Solaris-ramNoFrameskip-v0), EnvSpec(Boxing-v0), 
EnvSpec(Phoenix-v0), EnvSpec(Seaquest-ram-v0), EnvSpec(SpaceInvadersDeterministic-v4), EnvSpec(FetchSlide-v1), EnvSpec(Skiing-v0), EnvSpec(GravitarNoFrameskip-v0), EnvSpec(FetchPickAndPlace-v1), EnvSpec(StarGunner-ramNoFrameskip-v0), EnvSpec(PooyanDeterministic-v4), EnvSpec(Tutankham-ramDeterministic-v4), EnvSpec(MontezumaRevenge-ram-v0), EnvSpec(Skiing-v4), EnvSpec(KungFuMasterDeterministic-v0), EnvSpec(Hero-v0), EnvSpec(StarGunner-ramDeterministic-v0), EnvSpec(StarGunner-ram-v0), EnvSpec(EnduroDeterministic-v0), EnvSpec(SpaceInvaders-ram-v4), EnvSpec(Amidar-ramNoFrameskip-v0), EnvSpec(Boxing-v4), EnvSpec(AirRaid-ram-v0), EnvSpec(TimePilot-ram-v4), EnvSpec(WizardOfWor-ramNoFrameskip-v4), EnvSpec(RiverraidDeterministic-v0), EnvSpec(MemorizeDigits-v0), EnvSpec(RoadRunnerNoFrameskip-v4), EnvSpec(AlienNoFrameskip-v4), EnvSpec(UpNDown-ram-v4), EnvSpec(BerzerkDeterministic-v0), EnvSpec(Phoenix-ramDeterministic-v0), EnvSpec(Kangaroo-ramDeterministic-v4), EnvSpec(Humanoid-v2), EnvSpec(BankHeist-ramNoFrameskip-v4), EnvSpec(Qbert-ramNoFrameskip-v4), EnvSpec(BattleZoneNoFrameskip-v0), EnvSpec(DoubleDunk-ramNoFrameskip-v0), EnvSpec(WizardOfWor-v4), EnvSpec(BreakoutNoFrameskip-v0), EnvSpec(KungFuMaster-v0), EnvSpec(BeamRiderDeterministic-v4), EnvSpec(Hero-ramNoFrameskip-v0), EnvSpec(HandManipulateBlockRotateXYZ-v0), EnvSpec(Zaxxon-ram-v0), EnvSpec(RoadRunnerDeterministic-v0), EnvSpec(FetchPickAndPlaceDense-v1), EnvSpec(BreakoutDeterministic-v0), EnvSpec(Jamesbond-v0), EnvSpec(HandManipulateEgg-v0), EnvSpec(BankHeist-ramNoFrameskip-v0), EnvSpec(Bowling-ramDeterministic-v0), EnvSpec(Tennis-ramNoFrameskip-v0), EnvSpec(Kangaroo-ram-v4), EnvSpec(HotterColder-v0), EnvSpec(Atlantis-v4), EnvSpec(HandManipulateEggRotateDense-v0), EnvSpec(BeamRider-ramDeterministic-v4), EnvSpec(Ant-v2), EnvSpec(RoadRunner-ram-v0), EnvSpec(PhoenixNoFrameskip-v4), EnvSpec(Assault-ram-v0), EnvSpec(BankHeist-ramDeterministic-v4), EnvSpec(CrazyClimberDeterministic-v4), EnvSpec(Asteroids-ram-v0), 
EnvSpec(Krull-v0), EnvSpec(Amidar-ram-v4), EnvSpec(Bowling-ramDeterministic-v4), EnvSpec(FishingDerbyNoFrameskip-v4), EnvSpec(YarsRevenge-ram-v0), EnvSpec(RiverraidDeterministic-v4), EnvSpec(PongNoFrameskip-v0), EnvSpec(Hero-v4), EnvSpec(Seaquest-ramDeterministic-v4), EnvSpec(KangarooDeterministic-v0), EnvSpec(JamesbondDeterministic-v4), EnvSpec(Venture-ramDeterministic-v0), EnvSpec(HandManipulateBlockRotateZDense-v0), EnvSpec(GopherDeterministic-v4), EnvSpec(StarGunner-ramNoFrameskip-v4), EnvSpec(QbertNoFrameskip-v4), EnvSpec(Kangaroo-v4), EnvSpec(Carnival-ramNoFrameskip-v0), EnvSpec(HandManipulateBlockRotateZ-v0), EnvSpec(EnduroNoFrameskip-v4), EnvSpec(UpNDownNoFrameskip-v4), EnvSpec(FishingDerbyDeterministic-v0), EnvSpec(PrivateEye-ramDeterministic-v0), EnvSpec(UpNDown-ramNoFrameskip-v4), EnvSpec(AirRaidDeterministic-v0), EnvSpec(CrazyClimber-ramNoFrameskip-v4), EnvSpec(Venture-ramNoFrameskip-v4), EnvSpec(AssaultDeterministic-v4), EnvSpec(IceHockey-v0), EnvSpec(HandManipulateBlockDense-v0), EnvSpec(YarsRevenge-ramDeterministic-v0), EnvSpec(Pitfall-ramDeterministic-v4), EnvSpec(LunarLanderContinuous-v2), EnvSpec(Alien-ramNoFrameskip-v4), EnvSpec(ChopperCommand-ramNoFrameskip-v4), EnvSpec(JourneyEscape-ramNoFrameskip-v0), EnvSpec(RepeatCopy-v0), EnvSpec(TimePilot-ramNoFrameskip-v4), EnvSpec(FishingDerbyNoFrameskip-v0), EnvSpec(Asteroids-ram-v4), EnvSpec(AirRaid-ram-v4), EnvSpec(NameThisGameNoFrameskip-v4), EnvSpec(Seaquest-v0), EnvSpec(Freeway-v0), EnvSpec(VideoPinball-ramNoFrameskip-v0), EnvSpec(JamesbondNoFrameskip-v4), EnvSpec(Venture-ramDeterministic-v4), EnvSpec(Krull-ram-v4), EnvSpec(ChopperCommand-ramDeterministic-v4), EnvSpec(BoxingNoFrameskip-v4), EnvSpec(Tutankham-ram-v0), EnvSpec(Pitfall-ramDeterministic-v0), EnvSpec(YarsRevenge-ramNoFrameskip-v0), EnvSpec(VideoPinball-ram-v0), EnvSpec(Tennis-ramDeterministic-v0), EnvSpec(HandManipulateBlockFullDense-v0), EnvSpec(CarnivalNoFrameskip-v0), EnvSpec(Carnival-ramNoFrameskip-v4), 
EnvSpec(Frostbite-ramNoFrameskip-v0), EnvSpec(Carnival-ramDeterministic-v4), EnvSpec(Qbert-ramDeterministic-v4), EnvSpec(AsteroidsNoFrameskip-v0), EnvSpec(HandManipulatePenFull-v0), EnvSpec(DoubleDunkDeterministic-v4), EnvSpec(FreewayDeterministic-v4), EnvSpec(Qbert-ram-v0), EnvSpec(FetchReach-v1), EnvSpec(CarRacing-v0), EnvSpec(NameThisGame-ramNoFrameskip-v0), EnvSpec(GravitarNoFrameskip-v4), EnvSpec(Boxing-ramNoFrameskip-v0), EnvSpec(ZaxxonDeterministic-v4), EnvSpec(Zaxxon-ramDeterministic-v4), EnvSpec(KungFuMaster-v4), EnvSpec(BerzerkDeterministic-v4), EnvSpec(TimePilot-v4), EnvSpec(SeaquestDeterministic-v0), EnvSpec(AtlantisDeterministic-v0), EnvSpec(IceHockey-ram-v4), EnvSpec(PongNoFrameskip-v4), EnvSpec(YarsRevengeDeterministic-v4), EnvSpec(BreakoutDeterministic-v4), EnvSpec(JourneyEscapeNoFrameskip-v0), EnvSpec(UpNDown-v0), EnvSpec(Striker-v2), EnvSpec(Venture-v0), EnvSpec(Riverraid-v4), EnvSpec(ElevatorActionDeterministic-v0), EnvSpec(Frostbite-v0), EnvSpec(HandReach-v0), EnvSpec(Asterix-ramNoFrameskip-v0), EnvSpec(BattleZoneDeterministic-v4), EnvSpec(MsPacman-v4), EnvSpec(RobotankDeterministic-v0), EnvSpec(TimePilotDeterministic-v0), EnvSpec(JamesbondDeterministic-v0), EnvSpec(DemonAttack-ram-v0), EnvSpec(BeamRider-ram-v4), EnvSpec(Assault-ramDeterministic-v4), EnvSpec(PrivateEye-ram-v4), EnvSpec(SpaceInvaders-v4), EnvSpec(PitfallDeterministic-v0), EnvSpec(Asterix-ram-v0), EnvSpec(RoadRunner-ramDeterministic-v4), EnvSpec(VideoPinballNoFrameskip-v4), EnvSpec(Riverraid-ramDeterministic-v0), EnvSpec(AssaultNoFrameskip-v0), EnvSpec(CartPole-v1), EnvSpec(PrivateEye-ramDeterministic-v4), EnvSpec(AirRaid-ramNoFrameskip-v0), EnvSpec(Pooyan-ramDeterministic-v4), EnvSpec(Krull-ramDeterministic-v0), EnvSpec(Kangaroo-ramNoFrameskip-v4), EnvSpec(WizardOfWor-ramDeterministic-v4), EnvSpec(StarGunner-ram-v4), EnvSpec(Pong-ramNoFrameskip-v4), EnvSpec(Gopher-ram-v0), EnvSpec(Frostbite-ram-v4), EnvSpec(Jamesbond-ramNoFrameskip-v0), EnvSpec(CarnivalDeterministic-v4), 
EnvSpec(GuessingGame-v0), EnvSpec(PongDeterministic-v0), EnvSpec(Breakout-ramNoFrameskip-v0), EnvSpec(HandManipulatePenRotateDense-v0), EnvSpec(Tennis-ram-v4), EnvSpec(AirRaid-v4), EnvSpec(Assault-v0), EnvSpec(Berzerk-v0), EnvSpec(Riverraid-ramNoFrameskip-v4), EnvSpec(MsPacmanDeterministic-v4), EnvSpec(MsPacman-ram-v0), EnvSpec(AssaultNoFrameskip-v4), EnvSpec(KungFuMaster-ram-v0), EnvSpec(ElevatorAction-ramDeterministic-v4), EnvSpec(BeamRider-v4), EnvSpec(RoadRunner-ramNoFrameskip-v4), EnvSpec(TimePilot-ramNoFrameskip-v0), EnvSpec(SpaceInvadersNoFrameskip-v4), EnvSpec(Atlantis-ramDeterministic-v0), EnvSpec(MontezumaRevengeNoFrameskip-v0), EnvSpec(Seaquest-ram-v4), EnvSpec(PrivateEye-v4), EnvSpec(ElevatorAction-ramDeterministic-v0), EnvSpec(Tennis-ram-v0), EnvSpec(Pitfall-v4), EnvSpec(CrazyClimber-ramDeterministic-v4), EnvSpec(Enduro-ramNoFrameskip-v0), EnvSpec(Robotank-v0), EnvSpec(KangarooNoFrameskip-v0), EnvSpec(WizardOfWor-v0), EnvSpec(BerzerkNoFrameskip-v0), EnvSpec(ChopperCommandDeterministic-v0), EnvSpec(WizardOfWor-ram-v4), EnvSpec(HandManipulateEggRotate-v0), EnvSpec(GopherDeterministic-v0), EnvSpec(DemonAttackNoFrameskip-v0), EnvSpec(PrivateEyeNoFrameskip-v0), EnvSpec(Pong-ramNoFrameskip-v0), EnvSpec(Freeway-ramDeterministic-v4), EnvSpec(MsPacman-ramDeterministic-v4), EnvSpec(AsterixDeterministic-v4), EnvSpec(TutankhamDeterministic-v4), EnvSpec(Jamesbond-ramDeterministic-v0), EnvSpec(Skiing-ramNoFrameskip-v0), EnvSpec(Solaris-ram-v0), EnvSpec(Breakout-ram-v0), EnvSpec(SeaquestNoFrameskip-v0), EnvSpec(Jamesbond-ram-v4), EnvSpec(RoadRunner-v0), EnvSpec(VideoPinball-ramNoFrameskip-v4), EnvSpec(Bowling-ram-v0), EnvSpec(Venture-ram-v4), EnvSpec(FishingDerby-ram-v0), EnvSpec(YarsRevenge-v4), EnvSpec(Atlantis-ram-v0), EnvSpec(Frostbite-ram-v0), EnvSpec(Pooyan-v0), EnvSpec(AsterixNoFrameskip-v4), EnvSpec(Pitfall-ram-v4), EnvSpec(Roulette-v0), EnvSpec(Tutankham-ramNoFrameskip-v4), EnvSpec(Centipede-ramDeterministic-v4), EnvSpec(ChopperCommandNoFrameskip-v0), 
EnvSpec(Freeway-ram-v4), EnvSpec(Tennis-v4), EnvSpec(NameThisGame-v4), EnvSpec(BowlingNoFrameskip-v0), EnvSpec(HandManipulatePenRotate-v0), EnvSpec(KungFuMasterNoFrameskip-v0), EnvSpec(Solaris-v0), EnvSpec(Enduro-ramNoFrameskip-v4), EnvSpec(NameThisGame-ram-v4), EnvSpec(StarGunnerDeterministic-v4), EnvSpec(AmidarNoFrameskip-v4), EnvSpec(Gravitar-ram-v4), EnvSpec(CrazyClimberNoFrameskip-v0), EnvSpec(Riverraid-ramNoFrameskip-v0), EnvSpec(DemonAttack-v0), EnvSpec(MontezumaRevengeDeterministic-v4), EnvSpec(Krull-ramNoFrameskip-v0), EnvSpec(Amidar-ramDeterministic-v0), EnvSpec(Gopher-v4), EnvSpec(Amidar-ramDeterministic-v4), EnvSpec(TutankhamNoFrameskip-v0), EnvSpec(VentureNoFrameskip-v0), EnvSpec(StarGunner-v0), EnvSpec(Breakout-ramDeterministic-v0), EnvSpec(BeamRider-ram-v0), EnvSpec(FetchPush-v1), EnvSpec(ElevatorAction-v0), EnvSpec(ElevatorActionNoFrameskip-v4), EnvSpec(UpNDownNoFrameskip-v0), EnvSpec(FishingDerby-ramNoFrameskip-v4), EnvSpec(Alien-ramDeterministic-v0), EnvSpec(StarGunnerNoFrameskip-v4), EnvSpec(YarsRevenge-ramDeterministic-v4), EnvSpec(PrivateEyeNoFrameskip-v4), EnvSpec(BattleZone-ram-v4), EnvSpec(CentipedeNoFrameskip-v4), EnvSpec(DoubleDunkDeterministic-v0), EnvSpec(DoubleDunk-ramDeterministic-v4), EnvSpec(Zaxxon-ram-v4), EnvSpec(WizardOfWor-ram-v0), EnvSpec(AlienDeterministic-v0), EnvSpec(AtlantisDeterministic-v4), EnvSpec(KrullNoFrameskip-v0), EnvSpec(Phoenix-ram-v4), EnvSpec(TimePilotNoFrameskip-v4), EnvSpec(MsPacmanDeterministic-v0), EnvSpec(Carnival-ram-v4), EnvSpec(Asterix-v0), EnvSpec(TimePilot-v0), EnvSpec(EnduroDeterministic-v4), EnvSpec(Pitfall-ram-v0), EnvSpec(BeamRider-ramNoFrameskip-v0), EnvSpec(CrazyClimber-ram-v0), EnvSpec(Atlantis-ram-v4), EnvSpec(QbertNoFrameskip-v0), EnvSpec(MsPacmanNoFrameskip-v4), EnvSpec(SkiingNoFrameskip-v4), EnvSpec(VideoPinballNoFrameskip-v0), EnvSpec(FrozenLake8x8-v0), EnvSpec(Boxing-ramNoFrameskip-v4), EnvSpec(MontezumaRevenge-ramDeterministic-v0), EnvSpec(Zaxxon-v0), EnvSpec(ChopperCommand-v0), 
EnvSpec(DuplicatedInput-v0), EnvSpec(StarGunnerNoFrameskip-v0), EnvSpec(JourneyEscapeDeterministic-v0), EnvSpec(Berzerk-v4), EnvSpec(Skiing-ramDeterministic-v4), EnvSpec(ChopperCommand-v4), EnvSpec(RoadRunnerDeterministic-v4), EnvSpec(BeamRiderNoFrameskip-v4), EnvSpec(ZaxxonNoFrameskip-v0), EnvSpec(Pooyan-ram-v4), EnvSpec(Berzerk-ram-v0), EnvSpec(Zaxxon-v4), EnvSpec(KrullNoFrameskip-v4), EnvSpec(Asteroids-ramDeterministic-v4), EnvSpec(IceHockey-ram-v0), EnvSpec(Tutankham-v0), EnvSpec(PrivateEye-v0), EnvSpec(Venture-ram-v0), EnvSpec(Enduro-ram-v0), EnvSpec(KungFuMaster-ramDeterministic-v4), EnvSpec(Taxi-v2), EnvSpec(JourneyEscapeDeterministic-v4), EnvSpec(FreewayNoFrameskip-v0), EnvSpec(MontezumaRevenge-ramDeterministic-v4), EnvSpec(StarGunner-ramDeterministic-v4), EnvSpec(KungFuMaster-ram-v4), EnvSpec(CrazyClimber-ramNoFrameskip-v0), EnvSpec(Alien-ram-v0), EnvSpec(BowlingNoFrameskip-v4), EnvSpec(Kangaroo-ramDeterministic-v0), EnvSpec(BeamRider-ramDeterministic-v0), EnvSpec(JourneyEscape-v0), EnvSpec(FetchReachDense-v1), EnvSpec(ElevatorAction-ram-v0), EnvSpec(Zaxxon-ramNoFrameskip-v0), EnvSpec(ReversedAddition3-v0), EnvSpec(KrullDeterministic-v0), EnvSpec(AsteroidsNoFrameskip-v4), EnvSpec(StarGunner-v4), EnvSpec(Boxing-ram-v0), EnvSpec(JourneyEscape-ramDeterministic-v4), EnvSpec(VideoPinballDeterministic-v4), EnvSpec(Pusher-v2), EnvSpec(PooyanNoFrameskip-v4), EnvSpec(Centipede-ramNoFrameskip-v4), EnvSpec(Amidar-ram-v0), EnvSpec(TutankhamNoFrameskip-v4), EnvSpec(QbertDeterministic-v0), EnvSpec(WizardOfWorDeterministic-v0), EnvSpec(Atlantis-v0), EnvSpec(YarsRevengeDeterministic-v0), EnvSpec(Asteroids-v0), EnvSpec(Jamesbond-ram-v0), EnvSpec(Atlantis-ramNoFrameskip-v4), EnvSpec(FrozenLake-v0), EnvSpec(IceHockey-ramDeterministic-v0), EnvSpec(Tennis-ramNoFrameskip-v4), EnvSpec(SpaceInvadersDeterministic-v0), EnvSpec(MsPacman-ram-v4), EnvSpec(PitfallNoFrameskip-v0), EnvSpec(PhoenixDeterministic-v4), EnvSpec(Krull-ramNoFrameskip-v4), EnvSpec(Reacher-v2), 
EnvSpec(LunarLander-v2), EnvSpec(Asteroids-v4), EnvSpec(ElevatorActionDeterministic-v4), EnvSpec(Gopher-ramNoFrameskip-v0), EnvSpec(HeroNoFrameskip-v4), EnvSpec(ReversedAddition-v0), EnvSpec(FishingDerby-v4), EnvSpec(BoxingDeterministic-v4), EnvSpec(WizardOfWorNoFrameskip-v0), EnvSpec(Asteroids-ramNoFrameskip-v0), EnvSpec(Pitfall-ramNoFrameskip-v4), EnvSpec(Qbert-v0), EnvSpec(Bowling-ramNoFrameskip-v4), EnvSpec(JourneyEscapeNoFrameskip-v4), EnvSpec(Asterix-ramDeterministic-v4), EnvSpec(Assault-ramNoFrameskip-v0), EnvSpec(Robotank-ramDeterministic-v0), EnvSpec(AirRaidNoFrameskip-v0), EnvSpec(Atlantis-ramDeterministic-v4), EnvSpec(Phoenix-ram-v0), EnvSpec(HandManipulateEggFullDense-v0), EnvSpec(BowlingDeterministic-v0), EnvSpec(Carnival-ramDeterministic-v0), EnvSpec(SolarisNoFrameskip-v4), EnvSpec(Carnival-v4), EnvSpec(Jamesbond-v4), EnvSpec(BerzerkNoFrameskip-v4), EnvSpec(Robotank-ramNoFrameskip-v0), EnvSpec(SpaceInvadersNoFrameskip-v0), EnvSpec(Amidar-ramNoFrameskip-v4), EnvSpec(ChopperCommand-ram-v0), EnvSpec(Boxing-ram-v4), EnvSpec(Tutankham-ram-v4), EnvSpec(Gravitar-ramDeterministic-v0), EnvSpec(NChain-v0), EnvSpec(Venture-ramNoFrameskip-v0), EnvSpec(BankHeist-ram-v0), EnvSpec(DoubleDunkNoFrameskip-v0), EnvSpec(TennisNoFrameskip-v0), EnvSpec(DemonAttack-ram-v4), EnvSpec(DoubleDunk-ram-v0), EnvSpec(Kangaroo-ramNoFrameskip-v0), EnvSpec(JourneyEscape-v4), EnvSpec(CubeCrashScreenBecomesBlack-v0), EnvSpec(AmidarNoFrameskip-v0), EnvSpec(Solaris-ramNoFrameskip-v4), EnvSpec(GopherNoFrameskip-v0), EnvSpec(Hero-ram-v4), EnvSpec(WizardOfWor-ramDeterministic-v0), EnvSpec(UpNDownDeterministic-v0), EnvSpec(Qbert-ram-v4), EnvSpec(FishingDerby-ramDeterministic-v4), EnvSpec(DemonAttack-v4), EnvSpec(DemonAttack-ramNoFrameskip-v4), EnvSpec(Acrobot-v1), EnvSpec(BattleZone-ram-v0), EnvSpec(RiverraidNoFrameskip-v4), EnvSpec(FreewayNoFrameskip-v4), EnvSpec(Asterix-v4), EnvSpec(Asterix-ram-v4), EnvSpec(Carnival-ram-v0), EnvSpec(Freeway-ram-v0)])

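Program 1: the cart-pole balancing game

Cart pole is a simple game: a cart carries an upright pole, and the cart must move left and right to keep the pole vertical. The game ends if the pole tilts too far from vertical (15° according to the environment's description; the CartPole-v0 implementation itself uses a threshold of about 12°). The cart must also stay within a fixed range (2.4 units to either side of the center).
In Gym's CartPole environment (env), every action that pushes the cart left or right makes the env return a reward of +1. An episode also ends once its step limit is reached (200 steps for CartPole-v0); the script below additionally caps each episode at 100 steps.

import gym

# Uses the classic Gym API: reset() returns the observation
# and step() returns four values.
env = gym.make('CartPole-v0')
for i_episode in range(100):
    observation = env.reset()
    for t in range(100):
        env.render()  # refresh the animation
        action = env.action_space.sample()  # pick a random action
        observation, reward, done, info = env.step(action)  # advance one step
        if done:
            break  # episode over; the outer loop resets the environment
env.close()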
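The loop above picks actions at random. As a small next step, here is a hand-written policy that could replace `env.action_space.sample()`. It is a sketch that assumes the usual CartPole conventions: the observation is `[cart position, cart velocity, pole angle, pole angular velocity]`, action 0 pushes the cart left, and action 1 pushes it right. The function name `lean_policy` is hypothetical.

```python
def lean_policy(observation):
    """Push the cart toward the side the pole is leaning to.

    Assumes observation[2] is the pole angle (positive when the pole
    leans right) and that action 0 = push left, action 1 = push right.
    """
    pole_angle = observation[2]
    return 1 if pole_angle > 0 else 0

if __name__ == "__main__":
    print(lean_policy([0.0, 0.0, 0.05, 0.0]))   # pole leans right
    print(lean_policy([0.0, 0.0, -0.05, 0.0]))  # pole leans left
```

In the loop, this would be used as `action = lean_policy(observation)`. It involves no learning, but it usually keeps the pole up longer than random actions and gives a baseline to compare a learned policy against.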

Output: running the script opens a render window that animates the cart and pole.
