Machine Learning in Action: Decision Trees (Part 2)

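Here we mainly rely on matplotlib's annotate function, which was originally designed to label a specific point in a figure.

- Syntax

annotate signature: annotate(s='str', xy=(x, y), xytext=(l1, l2), ...)

s is the annotation text (the parameter is named text in matplotlib 3.3 and later)
xy is the point being annotated, i.e. the end of the arrow away from the text
xytext is the position where the annotation text is placed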
xycoords selects the coordinate system of xy and accepts the following values:

| Value | Coordinate system |
| --- | --- |
| 'figure points' | points from the lower-left corner of the figure |
| 'figure pixels' | pixels from the lower-left corner of the figure |
| 'figure fraction' | fraction of the figure: (0, 0) is the lower-left corner, (1, 1) the upper-right |
| 'axes points' | points from the lower-left corner of the axes |
| 'axes pixels' | pixels from the lower-left corner of the axes |
| 'axes fraction' | fraction of the axes: (0, 0) is the lower-left corner, (1, 1) the upper-right |
| 'data' | the data coordinate system of the object being annotated (default) |
| 'polar' | (theta, r), if not native 'data' coordinates |

textcoords sets the coordinate system of xytext and defaults to the value of xycoords. In addition to the values above, it accepts 'offset points' and 'offset pixels', which place the text at an offset from xy.

arrowprops: arrow properties, passed as a dict. When the dict does not contain an arrowstyle key, the following keys apply:

width: width of the arrow body, in points
headwidth: width of the base of the arrow head, in points
headlength: length of the arrow head, in points
shrink: fraction of the total length to shrink from both ends
facecolor: arrow color

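bbox draws a box around the annotation text. Commonly used keys:

boxstyle: shape of the box
facecolor (short form fc): background color
edgecolor (short form ec): border color
linewidth (short form lw): border line width

bbox=dict(boxstyle='round,pad=0.5', fc='yellow', ec='k', lw=1, alpha=0.5)  # fc is facecolor, ec is edgecolor, lw is linewidth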
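
To make the parameters above concrete, here is a minimal sketch that annotates one point of a simple curve. The data, label text, and positions are made up for illustration, and the Agg backend is chosen only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line to use plt.show()
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [0, 1, 4, 9])

# The arrow ends at the data point (2, 4); the label sits at 20% / 80% of the axes.
note = ax.annotate(
    "y = 4",
    xy=(2, 4), xycoords="data",
    xytext=(0.2, 0.8), textcoords="axes fraction",
    va="center", ha="center",
    bbox=dict(boxstyle="round,pad=0.5", fc="yellow", ec="k", lw=1, alpha=0.5),
    arrowprops=dict(arrowstyle="->"),
)

fig.canvas.draw()  # render once; call fig.savefig(...) or plt.show() to view the result
```

Mixing xycoords="data" with textcoords="axes fraction" keeps the label in a fixed spot of the plot even if the data limits change.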

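import matplotlib as mpl
import matplotlib.pyplot as plt

# Use a font with CJK glyphs so that Chinese text is not rendered as empty boxes
# (only needed when the node text contains Chinese characters)
mpl.rcParams['font.sans-serif'] = ['SimHei']

# The decision tree as a nested dict
my_tree = {'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}}}
# Feature labels
labels = ["no surfacing", "flippers"]

# Node styles; fc is the fill color of the box, given here as a gray level
decision_node = dict(boxstyle="sawtooth", fc="0.8")  # decision node: sawtooth box
leaf_node = dict(boxstyle="round4", fc="0.8")        # leaf node: rounded box
arrow_args = dict(arrowstyle="<-")                   # arrow style

# Draw one node.
# center_point is the center of the text box, where the arrow head ends;
# parent_point is the point the arrow starts from.
# va is the vertical alignment, ha the horizontal alignment.
def plot_node(note_txt, center_point, parent_point, node_type):
    create_plot.ax1.annotate(note_txt, xy=parent_point, xycoords="axes fraction",
                             xytext=center_point, textcoords="axes fraction",
                             va="center", ha="center",
                             bbox=node_type, arrowprops=arrow_args)

# First version of the figure-creating function; it is extended later.
# Left commented out: plot_node draws on create_plot.ax1, which is only set
# by the final create_plot defined further down.
# def create_plot_1():
#     fig = plt.figure(1, facecolor="white")
#     fig.clf()
#     create_plot_1.ax1 = plt.subplot(111, frameon=False)
#     # Draw a decision node and a leaf node
#     plot_node("decision node", (0.5, 0.1), (0.1, 0.5), decision_node)
#     plot_node("leaf node", (0.8, 0.1), (0.3, 0.8), leaf_node)
#     plt.show()

# create_plot_1()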

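We can now draw a parent node connected to a child node. The next question is:

How do we lay out the decision tree inside a 1 x 1 unit coordinate system?

For that we need two key properties of the tree: its depth and its total number of leaf nodes.

The tree we have already built:

{'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}}}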

# Compute the depth of the tree
def get_tree_depth(my_tree):
    depth = 0
    keys_list = list(my_tree.keys())
    first_str = keys_list[0]           # the first decision feature
    second_dict = my_tree[first_str]
    for key in second_dict.keys():
        # If the subset produced by this feature value is still a dict,
        # keep splitting: recurse into get_tree_depth()
        if isinstance(second_dict[key], dict):
            this_depth = 1 + get_tree_depth(second_dict[key])
        # Otherwise no further split is needed and this branch has depth 1
        else:
            this_depth = 1
        if this_depth > depth:
            depth = this_depth
    return depth

# Count the leaf nodes of the tree; the idea mirrors get_tree_depth
def get_leafs_num(my_tree):
    nums = 0
    keys_list = list(my_tree.keys())
    first_str = keys_list[0]           # the first decision feature
    second_dict = my_tree[first_str]
    for key in second_dict.keys():
        if isinstance(second_dict[key], dict):
            nums += get_leafs_num(second_dict[key])
        else:
            nums += 1
    return nums

print(my_tree)
print("Number of leaf nodes: " + str(get_leafs_num(my_tree)))
print("Tree depth: " + str(get_tree_depth(my_tree)))

{'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}}}
Number of leaf nodes: 3
Tree depth: 2

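### Next, we use the tree's depth and leaf count to control how the nodes are distributed in the figure, so the plotting functions above are modified slightly

For a detailed walkthrough of the code, see this blog post. Like its author, I could not make sense of this part while reading the source; after reading that post it finally clicked :)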

# Write the feature value at the midpoint of the arrow
def plot_mid_text(center_point, parent_point, txt_string):
    x_mid = (parent_point[0] - center_point[0]) / 2.0 + center_point[0]
    y_mid = (parent_point[1] - center_point[1]) / 2.0 + center_point[1]
    create_plot.ax1.text(x_mid, y_mid, txt_string)

# Draw the decision tree: the layout logic, again implemented recursively
def plot_tree(my_tree, parent_point, node_txt):
    # Width (leaf count) and height (depth) of this subtree
    leafs_num = get_leafs_num(my_tree)
    depth = get_tree_depth(my_tree)
    keys_list = list(my_tree.keys())
    first_str = keys_list[0]
    # Position of the current node. Note: the root has no parent node
    center_point = (plot_tree.xOff + (1 + float(leafs_num)) / 2.0 / plot_tree.totalW,
                    plot_tree.yOff)
    # Write the feature value at the midpoint of the arrow
    plot_mid_text(center_point, parent_point, node_txt)
    # first_str is always a decision feature, so draw a decision node
    plot_node(first_str, center_point, parent_point, decision_node)
    # Move down one level: decrease yOff accordingly
    second_dict = my_tree[first_str]
    plot_tree.yOff -= 1.0 / plot_tree.totalD
    for key in second_dict.keys():
        # If the branch needs further splitting, recurse with updated arguments
        if isinstance(second_dict[key], dict):
            plot_tree(second_dict[key], center_point, str(key))
        # If it is a leaf node, draw it directly
        else:
            plot_tree.xOff += 1.0 / plot_tree.totalW
            plot_node(second_dict[key], (plot_tree.xOff, plot_tree.yOff), center_point, leaf_node)
            plot_mid_text((plot_tree.xOff, plot_tree.yOff), center_point, str(key))
    # Move back up one level before returning to the caller
    plot_tree.yOff += 1.0 / plot_tree.totalD

# The entry point that actually creates the figure (no layout logic here)
def create_plot(my_tree):
    fig = plt.figure(1, facecolor='white')
    fig.clf()
    axprops = dict(xticks=[], yticks=[])
    create_plot.ax1 = plt.subplot(111, frameon=False, **axprops)    # no ticks
    # totalW is the leaf count of the whole tree, totalD is its depth
    plot_tree.totalW = float(get_leafs_num(my_tree))
    plot_tree.totalD = float(get_tree_depth(my_tree))
    plot_tree.xOff = -0.5 / plot_tree.totalW
    plot_tree.yOff = 1.0
    # The root needs no incoming arrow, so its parent point must coincide with
    # its own position, which the center_point formula in plot_tree gives as (0.5, 1.0)
    plot_tree(my_tree, (0.5, 1.0), '')
    plt.show()

# test
my_tree_1={'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}},2:'maybe'}}
create_plot(my_tree_1)
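
To see which coordinates this scheme produces, the sketch below repeats the same arithmetic without drawing anything. The helper names leaf_count, tree_depth, and layout are made up for this illustration and are not part of the original code; the formulas themselves are the ones used by plot_tree and create_plot.

```python
def leaf_count(tree):
    """Number of leaves in a tree of the form {feature: {value: subtree_or_label}}."""
    branches = next(iter(tree.values()))
    return sum(leaf_count(b) if isinstance(b, dict) else 1 for b in branches.values())


def tree_depth(tree):
    """Number of decision levels in the tree."""
    branches = next(iter(tree.values()))
    return max(1 + tree_depth(b) if isinstance(b, dict) else 1 for b in branches.values())


def layout(tree):
    """Return (label, x, y) for every node, in the order plot_tree would draw them."""
    total_w = float(leaf_count(tree))
    total_d = float(tree_depth(tree))
    state = {"x_off": -0.5 / total_w}  # half a leaf slot to the left of the axes
    positions = []

    def visit(subtree, y):
        name = next(iter(subtree))
        # A decision node is centered above the leaf slots of its own subtree.
        x = state["x_off"] + (1.0 + leaf_count(subtree)) / 2.0 / total_w
        positions.append((name, x, y))
        child_y = y - 1.0 / total_d  # one level down
        for child in subtree[name].values():
            if isinstance(child, dict):
                visit(child, child_y)
            else:
                state["x_off"] += 1.0 / total_w  # each leaf takes the next slot
                positions.append((child, state["x_off"], child_y))

    visit(tree, 1.0)
    return positions


tree = {'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}, 2: 'maybe'}}
for name, x, y in layout(tree):
    print(name, x, y)
```

With 4 leaves and depth 2, each leaf occupies a slot of width 1/4 and each level a height of 1/2, so the root lands at (0.5, 1.0) and the leaves at x = 0.125, 0.375, 0.625, and 0.875.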

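We have now turned the decision tree from a nested dict into an actual tree drawing. In fact, it does not matter if we never build such a tree "explicitly": once the nested dict above exists, it can already be used to classify the test set.

So the most important part is still the code in decision_tree.py.

Finally, I ran it on the contact lens dataset, and the result looked quite good.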
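
As an illustration of that point, here is a minimal sketch of classifying a sample by walking the nested dict. The function classify and the choice of passing a sample as a feature-name-to-value dict are assumptions made for this example; decision_tree.py is not shown here, so this is not necessarily how the author implemented it.

```python
def classify(tree, sample):
    """Follow the branches of a nested-dict decision tree until a leaf label is reached.

    sample maps each feature name to its value (a hypothetical input format).
    A feature value that has no branch in the tree raises KeyError.
    """
    while isinstance(tree, dict):
        feature = next(iter(tree))             # the decision feature at this node
        tree = tree[feature][sample[feature]]  # descend along the matching branch
    return tree


my_tree = {'no surfacing': {0: 'no', 1: {'flippers': {0: 'no', 1: 'yes'}}}}
print(classify(my_tree, {'no surfacing': 1, 'flippers': 1}))  # yes
print(classify(my_tree, {'no surfacing': 0, 'flippers': 1}))  # no
```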
