Assignment 1: Bait Game Lab Report

151220129, Computer Science, Wu Zhengyi

Task 1: Depth-First Search

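Variables

Type	Name	Meaning
ArrayList	closeList	States already visited during the search
boolean	isCalculated	Whether a path that reaches the goal has been found
ArrayList	depthFirstAction	The actions of the DFS path, one per step
int	nowStep	Index of the current step in depthFirstAction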

Functions

Return type	Name	Parameters	Description
boolean	isInCloseList	StateObservation obs	Checks whether obs is among the visited states
boolean	getDepthFirst	StateObservation stateObs, ElapsedCpuTimer elapsedTimer	Computes a depth-first path starting from stateObs; returns true if one is found
Types.ACTIONS	act	StateObservation stateObs, ElapsedCpuTimer elapsedTimer	Calls getDepthFirst for the state stateObs and returns the action for this turn
void	debugPrint	Types.ACTIONS act	Prints the action act
void	debugPrintAllAction	ArrayList actions	Prints every action in actions

Core code

boolean getDepthFirst(StateObservation stateObs, ElapsedCpuTimer elapsedTimer) {
    if (stateObs in closeList) return false;
    closeList.add(stateObs);
    for (all actions in stateObs) {
        stCopy = stateObs.copy();            // fresh copy for every action
        try this action in stCopy;
        depthFirstAction.add(action);
        if (win) return true;                // all actions are in 'depthFirstAction'
        else if (stCopy in closeList || game over) {
            depthFirstAction.delete(action); // dead end or repeated state: backtrack
            continue;
        } else {                             // a new state
            if (getDepthFirst(stCopy, elapsedTimer)) return true;
            depthFirstAction.delete(action); // subtree failed: backtrack
        }
    }
    return false;
}
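To make the backtracking pattern above concrete, here is a minimal, self-contained sketch on a toy grid rather than on the real game. The class `GridDfs`, the grid encoding ('S' start, 'G' goal, '#' wall) and the `solve` helper are assumptions made for illustration; only the names `closeList`, `depthFirstAction` and `getDepthFirst` mirror the report.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical toy stand-in for the game: '#' is a wall, 'S' the start, 'G' the goal.
class GridDfs {
    static final int[][] MOVES = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};
    static final String[] NAMES = {"UP", "DOWN", "LEFT", "RIGHT"};

    final String[] grid;
    final Set<Integer> closeList = new HashSet<>();           // visited cells
    final List<String> depthFirstAction = new ArrayList<>();  // actions on the current path

    GridDfs(String[] grid) {
        this.grid = grid;
    }

    // Returns true once the goal is reached; the path is then in depthFirstAction.
    boolean getDepthFirst(int row, int col) {
        if (!closeList.add(row * grid[0].length() + col)) {
            return false;                                      // state already visited
        }
        if (grid[row].charAt(col) == 'G') {
            return true;                                       // win
        }
        for (int i = 0; i < MOVES.length; i++) {
            int r = row + MOVES[i][0];
            int c = col + MOVES[i][1];
            boolean inside = r >= 0 && r < grid.length && c >= 0 && c < grid[0].length();
            if (!inside || grid[r].charAt(c) == '#') {
                continue;                                      // illegal move, skip it
            }
            depthFirstAction.add(NAMES[i]);
            if (getDepthFirst(r, c)) {
                return true;
            }
            depthFirstAction.remove(depthFirstAction.size() - 1); // backtrack
        }
        return false;
    }

    // Finds 'S' and runs the search; returns null when no path exists.
    static List<String> solve(String[] grid) {
        for (int r = 0; r < grid.length; r++) {
            int c = grid[r].indexOf('S');
            if (c >= 0) {
                GridDfs dfs = new GridDfs(grid);
                return dfs.getDepthFirst(r, c) ? dfs.depthFirstAction : null;
            }
        }
        return null;
    }
}
```

Calling `GridDfs.solve(new String[]{"S.#", "#.G"})` walks right, down, right. In the real agent the visited check would compare full game states instead of grid cells.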

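Task 2: Depth-Limited Depth-First Search

Variables

Type	Name	Meaning
ArrayList	closeList	States already visited during the search
ArrayList	stateDepth	Depth of each corresponding entry in closeList
ArrayList	limitDepthFirstAction	The actions of the depth-limited DFS path, one per step
double	bestCost	Distance between the avatar and its target in the best state found so far
Vector2d	goalpos	Position of the door
Vector2d	keypos	Position of the key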

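Functions

Return type	Name	Parameters	Description
void	initAgent	none	Initializes the agent
boolean	isInCloseList	StateObservation obs	Checks whether obs is among the visited states
void	limitDepthFirst	StateObservation stateObs, ElapsedCpuTimer elapsedTimer, int depth	Computes a depth-limited depth-first path starting from stateObs, with depth tracking the search depth
Types.ACTIONS	act	StateObservation stateObs, ElapsedCpuTimer elapsedTimer	Calls limitDepthFirst for the state stateObs and returns the action for this turn
void	debugPrint	Types.ACTIONS act	Prints the action act
void	debugPrintAllAction	ArrayList actions	Prints every action in actions
double	getDistance	Vector2d vec1, Vector2d vec2	Returns the Manhattan distance between vec1 and vec2
boolean	avatarGetKey	StateObservation stateObs	Checks whether the avatar has picked up the key
double	heuristic	StateObservation stateObs	Heuristic function; returns a score for the state
void	debugPos	Vector2d vec, String head	Prints the position stored in vec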
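The report does not spell out how heuristic combines getDistance and avatarGetKey, so the following is only one plausible sketch: before the key is picked up, the cost is the distance to the key plus the distance from the key to the door; afterwards it is the distance to the door. The class name `BaitHeuristic`, the array-based positions and the "lower is better" convention are all assumptions.

```java
// Hypothetical sketch; positions are {x, y} arrays instead of Vector2d.
class BaitHeuristic {
    // Manhattan distance, mirroring the role of getDistance.
    static double getDistance(double[] a, double[] b) {
        return Math.abs(a[0] - b[0]) + Math.abs(a[1] - b[1]);
    }

    // Assumed scoring (lower is better): remaining walking distance to the door,
    // going through the key first if the avatar does not hold it yet.
    static double heuristic(double[] avatar, double[] key, double[] goal, boolean hasKey) {
        if (hasKey) {
            return getDistance(avatar, goal);
        }
        return getDistance(avatar, key) + getDistance(key, goal);
    }
}
```

A heuristic of this shape ignores boxes and holes, so it is only a rough guide for ranking states at the depth limit.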

Core code

protected void limitDepthFirst(StateObservation stateObs, ElapsedCpuTimer elapsedTimer, int depth) {
    if (depth has reached the limit) {
        nowStateScore = heuristic(stateObs);
        if (nowStateScore better than bestCost) { bestCost = nowStateScore; bestAction = now actions; }
        return;
    } else if (stateObs is not game start) {
        if (stateObs in closeList && depth same) return;
        closeList.add(stateObs);
        stateDepth.add(depth);
    } else {                                     // at the beginning of limitdfs, init all
        closeList.clear();
        stateDepth.clear();
    }
    for (all actions in stateObs) {
        stCopy = stateObs.copy();                // fresh copy for every action
        try this action in stCopy;
        limitDepthFirstAction.add(action);
        if (game win) {
            nowStateScore = heuristic(stCopy);
            if (nowStateScore better than bestCost) { bestCost = nowStateScore; bestAction = now actions; }
        } else {
            limitDepthFirst(stCopy, elapsedTimer, depth + 1); // go one level deeper
        }
        limitDepthFirstAction.delete(action);    // backtrack
    }
}
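The same idea can be tried on a toy grid. The sketch below is not the agent's code: `DepthLimitedDfs`, the grid encoding and the `run` helper are hypothetical, the heuristic is a plain Manhattan distance to 'G', and the closeList/stateDepth pruning is left out to keep the example short. It shows how the search stops at the depth limit and remembers the action sequence whose end state scores best.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy grid: '#' wall, 'S' start, 'G' goal.
class DepthLimitedDfs {
    static final int[][] MOVES = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};
    static final String[] NAMES = {"UP", "DOWN", "LEFT", "RIGHT"};

    final String[] grid;
    final int limit;
    int goalRow, goalCol;
    final List<String> limitDepthFirstAction = new ArrayList<>(); // current path
    List<String> bestAction = new ArrayList<>();                  // best path found so far
    double bestCost = Double.MAX_VALUE;                           // lower is better

    DepthLimitedDfs(String[] grid, int limit) {
        this.grid = grid;
        this.limit = limit;
        for (int r = 0; r < grid.length; r++) {
            int c = grid[r].indexOf('G');
            if (c >= 0) { goalRow = r; goalCol = c; }
        }
    }

    double heuristic(int row, int col) {
        return Math.abs(row - goalRow) + Math.abs(col - goalCol);
    }

    void limitDepthFirst(int row, int col, int depth) {
        boolean win = row == goalRow && col == goalCol;
        if (win || depth == limit) {              // leaf: score it and stop
            double cost = heuristic(row, col);
            if (cost < bestCost) {
                bestCost = cost;
                bestAction = new ArrayList<>(limitDepthFirstAction);
            }
            return;
        }
        for (int i = 0; i < MOVES.length; i++) {
            int r = row + MOVES[i][0];
            int c = col + MOVES[i][1];
            boolean inside = r >= 0 && r < grid.length && c >= 0 && c < grid[0].length();
            if (!inside || grid[r].charAt(c) == '#') continue;
            limitDepthFirstAction.add(NAMES[i]);
            limitDepthFirst(r, c, depth + 1);     // go one level deeper
            limitDepthFirstAction.remove(limitDepthFirstAction.size() - 1); // backtrack
        }
    }

    // Finds 'S', runs the search from depth 0 and returns the searcher for inspection.
    static DepthLimitedDfs run(String[] grid, int limit) {
        DepthLimitedDfs search = new DepthLimitedDfs(grid, limit);
        for (int r = 0; r < grid.length; r++) {
            int c = grid[r].indexOf('S');
            if (c >= 0) search.limitDepthFirst(r, c, 0);
        }
        return search;
    }
}
```

With a limit of 2 on the grid `{"S..", ".#.", "..G"}` the goal is out of reach, so the search settles for the sequence that ends closest to it. An agent built this way would execute the first action of `bestAction` and search again on the next turn.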

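Task 3: A* Search

In my own tests, the agent completes the search for levels 2 and 3 within the time limit and clears both levels.

Variables

Type	Name	Meaning
ArrayList	closeList	States already visited during the search
PriorityQueue	openList	States that have been discovered but not yet expanded
boolean	getAnswer	Whether a path that reaches the goal has been found
ArrayList	aStarAction	The actions of the A* path, one per step
Vector2d	goalpos	Position of the door
Vector2d	keypos	Position of the key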

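Functions

Return type	Name	Parameters	Description
void	initAgent	none	Initializes the agent
boolean	isInCloseList	StateObservation obs	Checks whether obs is among the visited states
boolean	isInOpenList	StateObservation obs	Checks whether obs is among the discovered but not yet expanded states
void	aStar	StateObservation stateObs, ElapsedCpuTimer elapsedTimer, int depth	Computes a time-limited A* path starting from stateObs
Types.ACTIONS	act	StateObservation stateObs, ElapsedCpuTimer elapsedTimer	Calls aStar for the state stateObs and returns the action for this turn
double	getDistance	Vector2d vec1, Vector2d vec2	Returns the Manhattan distance between vec1 and vec2
boolean	avatarGetKey	StateObservation stateObs	Checks whether the avatar has picked up the key
double	heuristic	StateObservation stateObs	Heuristic function; returns a score for the state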

Core code

protected void aStar(StateObservation stateObs, ElapsedCpuTimer elapsedTimer) {
    initAgent();
    openList.add(startNode);                     // put the initial state into openList
    while (!openList.isEmpty()) {
        Node tmp = openList.poll();              // take the best-scoring node
        aStarAction = copy of tmp.actions;       // all actions from the start to tmp
        closeList.add(tmp.stateObs.copy());      // mark tmp's state as visited
        for (all actions in tmp.stateObs) {
            stCopy = tmp.stateObs.copy();
            try this action in stCopy;
            aStarAction.add(action);
            if (game win) {
                getAnswer = true;
                return;                          // the final sequence is in aStarAction
            } else if (game over || stCopy in closeList) {
                // dead end or already visited: nothing to add
            } else if (stCopy in openList) {     // state is already waiting in openList
                if (new actions better than old) refresh openList; // keep the better route
            } else {                             // a new state
                openList.add(new Node(stCopy, heuristic(stCopy), copy of aStarAction));
            }
            aStarAction.delete(action);          // backtrack
        }
    }
}
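As a rough illustration of the openList/closeList bookkeeping, here is a self-contained A* sketch on a toy grid. It is not the agent's code: `GridAStar`, `Node`, `find` and the grid encoding are hypothetical, each move costs 1, the heuristic is the Manhattan distance to 'G', and the path is rebuilt from parent links instead of storing an action list in every node. Rather than refreshing entries inside the queue, a better route is simply pushed again and outdated entries are skipped when polled.

```java
import java.util.Comparator;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.Set;

// Hypothetical toy grid: '#' wall, 'S' start, 'G' goal.
class GridAStar {
    static final int[][] MOVES = {{-1, 0}, {1, 0}, {0, -1}, {0, 1}};
    static final String[] NAMES = {"UP", "DOWN", "LEFT", "RIGHT"};

    static final class Node {
        final int row, col, g;      // g = steps taken so far
        final double f;             // f = g + heuristic
        final Node parent;
        final String action;        // action that led from parent to this node

        Node(int row, int col, int g, double f, Node parent, String action) {
            this.row = row; this.col = col; this.g = g; this.f = f;
            this.parent = parent; this.action = action;
        }
    }

    static int[] find(String[] grid, char ch) {
        for (int r = 0; r < grid.length; r++) {
            int c = grid[r].indexOf(ch);
            if (c >= 0) return new int[]{r, c};
        }
        throw new IllegalArgumentException("grid has no '" + ch + "'");
    }

    // Returns the action sequence from 'S' to 'G', or null when none exists.
    static List<String> solve(String[] grid) {
        int[] start = find(grid, 'S');
        int[] goal = find(grid, 'G');
        int width = grid[0].length();
        PriorityQueue<Node> openList =
                new PriorityQueue<>(Comparator.comparingDouble((Node n) -> n.f));
        Set<Integer> closeList = new HashSet<>();
        Map<Integer, Integer> bestG = new HashMap<>();   // cheapest known cost per cell
        int h0 = Math.abs(start[0] - goal[0]) + Math.abs(start[1] - goal[1]);
        openList.add(new Node(start[0], start[1], 0, h0, null, null));
        bestG.put(start[0] * width + start[1], 0);
        while (!openList.isEmpty()) {
            Node tmp = openList.poll();                  // lowest f first
            if (!closeList.add(tmp.row * width + tmp.col)) continue; // outdated entry
            if (tmp.row == goal[0] && tmp.col == goal[1]) {
                LinkedList<String> path = new LinkedList<>();
                for (Node n = tmp; n.parent != null; n = n.parent) path.addFirst(n.action);
                return path;
            }
            for (int i = 0; i < MOVES.length; i++) {
                int r = tmp.row + MOVES[i][0];
                int c = tmp.col + MOVES[i][1];
                boolean inside = r >= 0 && r < grid.length && c >= 0 && c < width;
                if (!inside || grid[r].charAt(c) == '#') continue;
                int key = r * width + c;
                int cost = tmp.g + 1;
                if (closeList.contains(key)
                        || bestG.getOrDefault(key, Integer.MAX_VALUE) <= cost) continue;
                bestG.put(key, cost);                    // new or better route to this cell
                int h = Math.abs(r - goal[0]) + Math.abs(c - goal[1]);
                openList.add(new Node(r, c, cost, cost + h, tmp, NAMES[i]));
            }
        }
        return null;
    }
}
```

Re-inserting a node with a better cost avoids the linear scan that an in-place update of a `PriorityQueue` would need; the closeList check on poll discards the superseded copies.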

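Task 4: Monte Carlo Tree Search

Algorithm outline

while (within the time limit) {
    use treePolicy to pick a currently reachable state, selected;
    run rollOut on that child state to obtain a score;
    run backUp starting from selected;
}
return the most-visited action, chosen by mostVisitedAction, as the result;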

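Functions

Name	Parameters	Description
rollOut	none	Keeps searching downward at random; when the game ends or the depth limit is reached, scores the state, updates the agent's score bounds, and returns the score.
backUp	SingleTreeNode node, double result	Takes a node and its score; for that node and all of its ancestors, increments the visit count by 1 and adds the score to the total.
treePolicy	none	If the current node has an unvisited child, returns one of them; otherwise calls uct to choose among all children.
uct	none	Computes a node score from each child's total score, its visit count, and the agent's score bounds, then returns the child with the highest score.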
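The backUp step and the final choice of the most-visited action are small enough to sketch in full. The class `TreeNodeSketch` below is a hypothetical, stripped-down stand-in for SingleTreeNode that keeps only a parent link, the children, a visit count and a running total; it is not the framework's class.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical minimal tree node; the real SingleTreeNode also carries game state.
class TreeNodeSketch {
    final TreeNodeSketch parent;
    final List<TreeNodeSketch> children = new ArrayList<>();
    int visits;
    double totalValue;

    TreeNodeSketch(TreeNodeSketch parent) {
        this.parent = parent;
        if (parent != null) {
            parent.children.add(this);
        }
    }

    // Adds one visit and the rollout result to the node and every ancestor.
    static void backUp(TreeNodeSketch node, double result) {
        for (TreeNodeSketch n = node; n != null; n = n.parent) {
            n.visits++;
            n.totalValue += result;
        }
    }

    // Index of the child with the most visits, or -1 if there are no children.
    int mostVisitedAction() {
        int best = -1;
        int bestVisits = -1;
        for (int i = 0; i < children.size(); i++) {
            if (children.get(i).visits > bestVisits) {
                bestVisits = children.get(i).visits;
                best = i;
            }
        }
        return best;
    }
}
```

Because every rollout result is propagated to the root, the root's visit count equals the number of iterations, and the child visit counts show how the search budget was distributed.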

Core code

The uct algorithm is the key part. It scores each child node as follows:

childValue (average value) = $normalize\left(\frac{childTotalValue}{childVisitTimes + \epsilon}\right)$, with $\epsilon = 1 \times 10^{-6}$
uctValue (node score) = $childValue + \sqrt{\frac{2\ln(parentVisitTimes + 1)}{childVisitTimes + \epsilon}} + \xi$, where $\xi$ is noise

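In short, child nodes with fewer visits and a higher average value receive a higher node score and are therefore more likely to be selected for rollOut. Because act returns the most-visited child as the chosen action, uct favors the children that are more likely to succeed.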
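The two formulas translate almost directly into code. In the sketch below, `UctSketch` and its parameter names are hypothetical, and `normalize` is assumed to be min-max scaling against the agent's score bounds (the report only says that the bounds are used). The noise term is passed in explicitly so that the result is deterministic.

```java
// Hypothetical sketch of the node score; not the framework's implementation.
class UctSketch {
    static final double EPSILON = 1e-6;

    // Assumed normalization: min-max scaling with the agent's score bounds.
    static double normalize(double value, double min, double max) {
        return max > min ? (value - min) / (max - min) : value;
    }

    static double uctValue(double childTotalValue, int childVisitTimes,
                           int parentVisitTimes, double min, double max, double noise) {
        // Exploitation term: normalized average value of the child.
        double childValue = normalize(childTotalValue / (childVisitTimes + EPSILON), min, max);
        // Exploration term: grows for rarely visited children.
        double exploration =
                Math.sqrt(2 * Math.log(parentVisitTimes + 1) / (childVisitTimes + EPSILON));
        return childValue + exploration + noise;
    }
}
```

With equal averages, `uctValue(1, 2, 100, 0, 1, 0)` exceeds `uctValue(5, 10, 100, 0, 1, 0)` because the first child has been visited less often, which is the exploration effect described above.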

Results
