Contents

  • I. Paper Information
  • II. Paper Structure
  • III. Paper Content
    • Abstract
    • Summary

I. Paper Information

Title: Search-Based Testing Approach for Deep Reinforcement Learning Agents

Year: 2022

Venue: arXiv

Paper link: http://arxiv.org/abs/2206.07813

Authors: Amirhossein Zolfagharian, Manel Abdellatif, Lionel Briand, Mojtaba Bagherzadeh, and Ramesh S

II. Paper Structure

1. Introduction
2. Background
   2.1 Definitions
   2.2 State Abstraction
3. Problem Definition
   3.1 RL Agent Testing Challenges
   3.2 Assumptions
4. Approach
   4.1 Reformulation as a Search Problem
   4.2 Overview of the Approach
   4.3 Initial Population
   4.4 Fitness Computations
   4.5 Search Operators
   4.6 Execution of Final Results
5. Empirical Evaluation
   5.1 Research Questions
   5.2 Case Study
   5.3 Implementation
   5.4 Evaluation and Results
6. Discussions
7. Threats to Validity
8. Related Work
9. Conclusion

III. Paper Content

Abstract

Deep Reinforcement Learning (DRL) algorithms have been increasingly employed during the last decade to solve various decision-making problems such as autonomous driving and robotics. However, these algorithms have faced great challenges when deployed in safety-critical environments since they often exhibit erroneous behaviors that can lead to potentially critical errors.

One way to assess the safety of DRL agents is to test them to detect possible faults leading to critical failures during their execution. This raises the question of how we can efficiently test DRL policies to ensure their correctness and adherence to safety requirements.

Most existing works on testing DRL agents use adversarial attacks that perturb states or actions of the agent. However, such attacks often lead to unrealistic states of the environment. Their main goal is to test the robustness of DRL agents rather than testing the compliance of agents’ policies with respect to requirements.

Due to the huge state space of DRL environments, the high cost of test execution, and the black-box nature of DRL algorithms, the exhaustive testing of DRL agents is impossible. In this paper, we propose a Search-based Testing Approach of Reinforcement Learning Agents (STARLA) to test the policy of a DRL agent by effectively searching for failing executions of the agent within a limited testing budget. We use machine learning models and a dedicated genetic algorithm to narrow the search towards faulty episodes.

We apply STARLA on a Deep-Q-Learning agent which is widely used as a benchmark and show that it significantly outperforms Random Testing by detecting more faults related to the agent’s policy. We also investigate how to extract rules that characterize faulty episodes of the DRL agent using our search results. Such rules can be used to understand the conditions under which the agent fails and thus assess its deployment risks.

Summary

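Over the last decade, Deep Reinforcement Learning (DRL) algorithms have been increasingly used to solve decision-making problems such as autonomous driving, trading decisions, and robotics. Deploying them in safety-critical environments remains a major challenge, however, because they often exhibit erroneous behaviors that can lead to critical errors.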

One way to assess the safety of a DRL agent is to test it in order to detect faults that could cause critical failures during execution. This raises the question of how to test DRL policies efficiently so as to ensure their correctness and adherence to safety requirements.

Most existing work on testing DRL agents relies on adversarial attacks that perturb the agent's states or actions. Such attacks, however, often lead to unrealistic states of the environment. Moreover, their main goal is to test the robustness of DRL agents rather than the compliance of the agents' policies with requirements.

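Exhaustive testing of DRL agents is impossible because of the huge state space of DRL environments, the high cost of test execution, and the black-box nature of DRL algorithms. The paper proposes STARLA, a Search-based Testing Approach for Reinforcement Learning Agents, which tests a DRL agent's policy by effectively searching for failing executions of the agent within a limited testing budget. It relies on machine learning models and a dedicated genetic algorithm to narrow the search towards faulty episodes (i.e., sequences of states and actions produced by the DRL agent). Applied to a widely used Deep Q-Learning agent as a benchmark, STARLA significantly outperforms random testing by detecting more faults related to the agent's policy.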
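To make the idea of a budget-limited genetic search for failing executions more concrete, here is a minimal Python sketch. It is not STARLA itself: the abstract does not spell out the encoding, fitness, or operators, so everything below is an assumption made for illustration. The toy 1-D environment, the fixed `policy`, the `(start, wind)` individuals, and the reward-based fitness are all hypothetical, and the sketch omits the machine learning models the paper uses to guide the search.

```python
import random

GOAL, MAX_STEPS, N_VALUES = 10, 30, 10  # toy constants, for illustration only

def policy(pos):
    """Hypothetical stand-in for a trained agent: always step forward."""
    return 1

def run_episode(start, wind):
    """Execute one episode and return (total_reward, failed)."""
    pos, reward = start, 0.0
    for _ in range(MAX_STEPS):
        pos += policy(pos) - wind // 4  # strong wind pushes the agent back
        reward -= 0.1                   # small cost per step
        if pos < 0:                     # fell off the track: a failure
            return reward - 10.0, True
        if pos >= GOAL:                 # reached the goal: a success
            return reward + 10.0, False
    return reward, False                # timed out without failing

def search_failures(budget, pop_size=8, generations=20, seed=0):
    """Genetic search for failing (start, wind) pairs within `budget` runs."""
    rng = random.Random(seed)
    cache, failures = {}, set()

    def evaluate(ind):
        # Fitness is the episode reward (lower = closer to a failure).
        # Each distinct individual costs one execution from the budget.
        if ind not in cache:
            if len(cache) >= budget:
                return float("inf")     # budget spent: rank it last
            cache[ind], failed = run_episode(*ind)
            if failed:
                failures.add(ind)
        return cache[ind]

    def pick():
        # Binary tournament: keep the lower-reward of two random individuals.
        a, b = rng.choice(pop), rng.choice(pop)
        return a if evaluate(a) <= evaluate(b) else b

    pop = [(rng.randrange(N_VALUES), rng.randrange(N_VALUES))
           for _ in range(pop_size)]
    for _gen in range(generations):
        if len(cache) >= budget:
            break
        children = []
        for _child in range(pop_size):
            (start, _w), (_s, wind) = pick(), pick()
            child = [start, wind]       # crossover: one gene per parent
            if rng.random() < 0.3:      # mutation: resample one gene
                child[rng.randrange(2)] = rng.randrange(N_VALUES)
            children.append(tuple(child))
        # Survivor selection: keep the distinct individuals with lowest reward.
        pop = sorted(dict.fromkeys(pop + children), key=evaluate)[:pop_size]
    return failures, len(cache)

failures, runs = search_failures(budget=40)
print(f"{len(failures)} failing configurations found in {runs} executions")
```

The budget is counted in episode executions, since test execution is the expensive part; already-seen individuals are served from the cache and cost nothing. A random-testing baseline under the same assumptions would simply sample 40 `(start, wind)` pairs and run each once.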

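The authors also investigate how to extract, from the search results, rules that characterize the faulty episodes of the DRL agent. These rules can be used to understand the conditions under which the agent fails and thus to assess the risks of deploying it.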
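As a rough illustration of what a rule characterizing faulty episodes can look like, the Python sketch below fits a single-condition threshold rule to labelled episodes. The abstract does not say how the rules are extracted, so this technique (a one-feature decision stump), the feature names `start` and `wind`, and the hard-coded records are assumptions, not the paper's method or data.

```python
def best_threshold_rule(records, feature_names):
    """Return (feature, threshold, accuracy) for the most accurate rule of
    the form `feature >= threshold => episode fails`."""
    best = None
    for i, name in enumerate(feature_names):
        # Candidate thresholds are the values observed for this feature.
        for threshold in sorted({feats[i] for feats, _ in records}):
            correct = sum((feats[i] >= threshold) == failed
                          for feats, failed in records)
            accuracy = correct / len(records)
            if best is None or accuracy > best[2]:
                best = (name, threshold, accuracy)
    return best

# Hypothetical labelled episodes: ((start, wind), failed)
records = [
    ((3, 9), True), ((0, 8), True), ((7, 8), True),
    ((5, 4), False), ((9, 0), False), ((2, 7), False), ((8, 3), False),
]

rule = best_threshold_rule(records, ["start", "wind"])
print("IF %s >= %d THEN episode fails (accuracy %.2f)" % rule)
# prints: IF wind >= 8 THEN episode fails (accuracy 1.00)
```

A rule of this shape is easy to read as a deployment-risk statement: in this made-up data, the agent is only at risk when `wind` is 8 or higher.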
