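Linear Classifiers: Batch Perceptron, Ho-Kashyap, and MSE (Principles and Code Implementation)
Contents
- Data Used
- Batch Perceptron Algorithm
  - Principle
  - Code Implementation
- Ho-Kashyap Algorithm
  - Principle
  - Code Implementation
- MSE Algorithm
  - Principle
  - Code Implementation
- Code Summary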
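Data Used
The four sample sets w1, w2, w3, and w4 (ten two-dimensional samples each) are defined at the top of the full code listing at the end of this post.
Batch Perceptron Algorithm
Principle
Suppose we have a set of samples $y_1, y_2, \dots, y_n$, all in normalized augmented form (each sample is extended with a constant 1, and the samples of the second class are negated). Our goal is to find a solution vector $a$ such that $a^T y_i > 0$ for every $i$. When the data are linearly separable, there are infinitely many such $a$, so we introduce a criterion function to optimize. The basic idea of the criterion is to make the misclassified samples as few as possible; it sums $-a^T y$ over them:
$$J(a)=\sum_{y\in Y}(-a^{T}y)$$
Here $Y$ is the set of samples misclassified by the current $a$. We optimize the criterion with gradient descent:
$$\frac{\partial J}{\partial a}=\sum_{y\in Y}(-y)$$
This gives the update rule:
$$a_{k+1}=a_{k}-\eta\sum_{y\in Y}(-y)=a_{k}+\eta\sum_{y\in Y}y$$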
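The update rule can be tried on a tiny example. The following is a minimal sketch, not the code from this post: the helper name `perceptron_step` and the four toy points are made up for illustration, and the samples are assumed to be already augmented and normalized (class-2 rows negated).

```python
import numpy as np

def perceptron_step(a, Y, eta=1.0):
    """One batch update: a <- a + eta * (sum of the misclassified rows of Y)."""
    misclassified = Y[Y @ a <= 0]              # rows with a^T y <= 0
    return a + eta * misclassified.sum(axis=0), len(misclassified)

# Toy data (hypothetical): two points per class, augmented with a trailing 1.
class1 = np.array([[2.0, 2.0], [3.0, 1.0]])
class2 = np.array([[-1.0, -2.0], [-2.0, -1.0]])
Y = np.vstack([np.hstack([class1, np.ones((2, 1))]),
               -np.hstack([class2, np.ones((2, 1))])])  # negate class 2

a = np.zeros(3)
for _ in range(100):                           # iteration cap, in case the data are not separable
    a_next, n_wrong = perceptron_step(a, Y)
    if n_wrong == 0:                           # every sample satisfies a^T y > 0
        break
    a = a_next

print(a, (Y @ a > 0).all())
```

With $a_0 = 0$ every sample counts as misclassified, so the first step simply adds up all rows of `Y`; on this toy set that single step is already enough.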
Code Implementation
# Define the batch perceptron algorithm
# (uses np and trans_sample from the full listing at the end)
# Input: w1, w2
#   w1: samples in class 1
#   w2: samples in class 2
# Output: a, n
#   a: the parameters
#   n: number of iterations
def batch_perception(w1, w2):
    # Generate the normalized augmented samples
    w = trans_sample(w1, w2)
    # Initialization
    a = np.zeros_like(w[1])
    eta = 1  # Learning rate
    theta = np.zeros_like(w[1]) + 1e-6  # Termination threshold
    n = 0  # Number of iterations
    # Implement the algorithm
    while True:
        y = np.zeros_like(w[1])
        for sample in w:
            if np.matmul(a.T, sample) <= 0:
                # The sample is misclassified
                y += sample
        eta_y = eta * y
        if all(np.abs(eta_y) <= theta):
            # The termination condition is satisfied, stop iterating
            break
        a += eta_y
        n += 1
    print("The decision surface a is {}\nThe number of iterations is {}.".format(a, n))
    return a, n
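Ho-Kashyap Algorithm
Principle
The criterion above looks only at the misclassified samples and ignores the correctly classified ones. The MSE criterion turns the problem from a set of inequalities into a set of equalities: find a weight vector satisfying $a^T y_i=b_i$, where the $b_i$ are arbitrarily chosen positive constants.
Let $Y\in \mathbb{R}^{n\times d}$ be the matrix whose rows are the samples $y_i^T$, and let $b=[b_1,b_2,\dots,b_n]^T$. The problem can then be stated as finding $a$ and $b$ such that $Ya=b>0$. To solve it, we use the MSE criterion:
$$J=\|Ya-b\|^2=\sum_{i=1}^{n}(a^T y_i-b_i)^2$$
We optimize the criterion with gradient descent:
$$\frac{\partial J}{\partial a}=2Y^T(Ya-b)$$
$$\frac{\partial J}{\partial b}=-2(Ya-b)$$
Since $b$ must stay positive, only the negative components of $\partial J/\partial b$ are used, so that $b$ never decreases, while $a$ is set to the value that makes $\partial J/\partial a$ vanish. This gives:
$$b_{k+1}=b_k-\eta_{k}\frac{1}{2}\left(\frac{\partial J}{\partial b}-\left|\frac{\partial J}{\partial b}\right|\right)=b_k+2\eta_k e_k^{+}$$
$$a_{k}=Y^{+}b_k$$
where $e_k=Ya_k-b_k$, $e_k^{+}=\frac{1}{2}(e_k+|e_k|)$, and $Y^{+}=(Y^TY)^{-1}Y^T$ is the pseudo-inverse of $Y$.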
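The two update equations can be sketched compactly as follows. This is an illustrative version under stated assumptions, not the listing from this post: the function name `ho_kashyap`, its default arguments, and the four toy rows of `Y` are hypothetical, and `np.linalg.pinv` stands in for the explicit formula $(Y^TY)^{-1}Y^T$.

```python
import numpy as np

def ho_kashyap(Y, eta=0.5, b0=0.5, tol=1e-6, max_iter=10000):
    """Ho-Kashyap iteration on normalized augmented samples Y (one row per sample)."""
    Y_pinv = np.linalg.pinv(Y)             # pseudo-inverse Y^+
    b = np.full(Y.shape[0], b0)            # positive initial margins
    a = Y_pinv @ b                         # a_0 = Y^+ b_0
    for _ in range(max_iter):
        e = Y @ a - b                      # error vector e_k = Y a_k - b_k
        if np.all(np.abs(e) <= tol):       # Y a = b > 0 up to tolerance
            return a, b, True
        e_plus = 0.5 * (e + np.abs(e))     # positive part of e_k
        b = b + 2 * eta * e_plus           # b can only grow
        a = Y_pinv @ b                     # a_{k+1} = Y^+ b_{k+1}
    return a, b, False                     # no solution found within max_iter

# Toy data (hypothetical): rows are already augmented, class-2 rows negated.
Y = np.array([[2.0, 2.0, 1.0],
              [3.0, 1.0, 1.0],
              [1.0, 2.0, -1.0],
              [3.0, 1.0, -1.0]])
a, b, converged = ho_kashyap(Y)
print(converged, a, Y @ a)
```

Because $b$ starts positive and is only ever increased, a run that stops with $|e|$ below the tolerance yields $Ya\approx b>0$, that is, a separating vector.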
Code Implementation
# Define the Ho-Kashyap algorithm
# (uses np and trans_sample from the full listing at the end)
# Input: w1, w2
#   w1: samples in class 1
#   w2: samples in class 2
# Output: a, b, e
#   a, b: the parameters
#   e: training errors
def HK_algorithm(w1, w2):
    # Generate the normalized augmented samples
    w = trans_sample(w1, w2)
    # Initialization
    a = np.zeros_like(w[1])
    b = np.zeros(w.shape[0]) + 0.5
    yita = 0.5  # Learning rate
    th_b = np.zeros(w.shape[0]) + 1e-6  # Termination threshold on the errors
    th_n = 10000  # Maximum number of iterations
    n = 0  # Number of iterations
    # Implement the algorithm
    while n <= th_n:
        e = np.matmul(w, a) - b
        e_ = 0.5 * (e + np.abs(e))  # Positive part of e
        b += 2 * yita * e_
        # a = Y^+ b with Y^+ = (Y^T Y)^{-1} Y^T
        a = np.matmul(np.matmul(np.linalg.inv(np.matmul(w.T, w)), w.T), b)
        n += 1
        if all(np.abs(e) == 0):
            # The errors are all 0, stop iterating
            print("The decision surface a is {}.".format(a))
            print("The decision errors are {}.".format(e))
            return a, b, e
        if all(np.abs(e) <= th_b):
            # The termination condition is satisfied, stop iterating
            print("The decision surface a is {}.".format(a))
            print("The decision errors are {}.".format(e))
            return a, b, e
    # The termination condition was not met within th_n iterations
    print("Iteration exceeded the maximum number! No solution found!")
    print("The current decision surface a is {}.".format(a))
    print("The current decision errors are {}.".format(e))
    return a, b, e
MSE Algorithm
Principle
For a problem with $c$ classes, we combine $c$ two-class classifiers into a single linear transformation:
$$y=W^{T}x+b,\quad W\in \mathbb{R}^{d\times c},\quad b\in \mathbb{R}^{c}$$
Decision rule:
If $j=\arg\max_k\,(W^{T}x+b)_k$, then $x\in \omega_j$.
Let
$$\hat{W}=\begin{pmatrix}W\\b^T\end{pmatrix}\in \mathbb{R}^{(d+1)\times c},\quad \hat{x}=\begin{pmatrix}x\\1\end{pmatrix}\in \mathbb{R}^{d+1},\quad \hat{X}=(\hat{x}_1,\hat{x}_2,\dots,\hat{x}_n)\in \mathbb{R}^{(d+1)\times n}$$
Objective function:
$$\min_{W,b}\sum_{i=1}^{n}\|W^{T}x_i+b-y_i\|_2^2$$
$$\sum_{i=1}^{n}\|W^{T}x_i+b-y_i\|_2^2=\|\hat{W}^{T}\hat{X}-Y\|_F^2$$
Here $y_i\in \mathbb{R}^{c}$ is the target vector of sample $x_i$ (a one-hot class indicator in the code), and $Y=(y_1,y_2,\dots,y_n)\in \mathbb{R}^{c\times n}$.
Setting the gradient with respect to $\hat{W}$ to zero, and adding a small regularization term $\lambda I$ so that the matrix is invertible, we obtain:
$$\hat{W}=(\hat{X}\hat{X}^{T}+\lambda I)^{-1}\hat{X}Y^{T}\in \mathbb{R}^{(d+1)\times c}$$
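The closed-form solution and the argmax decision rule can be sketched on a small example. This is a minimal illustration under assumptions: the helper names `fit_mse` and `predict`, the value `lam=0.01`, and the three toy clusters are made up here, and `np.linalg.solve` is used instead of forming the inverse explicitly.

```python
import numpy as np

def fit_mse(X, labels, n_classes, lam=0.01):
    """Return W_hat of shape (d+1, c) for samples X (n, d) and integer labels (n,)."""
    n, d = X.shape
    X_hat = np.vstack([X.T, np.ones((1, n))])      # augmented samples as columns, (d+1, n)
    Y = np.zeros((n_classes, n))
    Y[labels, np.arange(n)] = 1.0                  # one-hot target columns, (c, n)
    A = X_hat @ X_hat.T + lam * np.eye(d + 1)      # X_hat X_hat^T + lambda I
    return np.linalg.solve(A, X_hat @ Y.T)         # solves A W_hat = X_hat Y^T

def predict(W_hat, X):
    """Assign each row of X to the class with the largest discriminant value."""
    X_aug = np.hstack([X, np.ones((X.shape[0], 1))])   # (n, d+1)
    return np.argmax(X_aug @ W_hat, axis=1)

# Toy data (hypothetical): three small clusters at the corners of a triangle.
X = np.array([[0.0, 4.0], [0.0, 5.0],
              [-4.0, -3.0], [-5.0, -3.0],
              [4.0, -3.0], [5.0, -3.0]])
labels = np.array([0, 0, 1, 1, 2, 2])

W_hat = fit_mse(X, labels, n_classes=3)
print(predict(W_hat, X))
```

Note that, as in the formula, the regularization term here also acts on the bias row of $\hat{W}$; leaving the bias unregularized would be an equally reasonable choice.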
Code Implementation
# Define the multi-class test function
# (uses np and copy, imported in the full listing at the end)
# Input: w, a
#   w: [[class1], [class2], ...]
#   a: the parameters (sample_d+1, class_n)
# Output: f_ratio
#   f_ratio: error rate
def multi_class_mse_test(w, a):
    w_ = copy.deepcopy(w)
    cnt = 0    # Number of misclassified samples
    total = 0  # Number of tested samples
    for class_i, sample in enumerate(w_):
        print(sample)
        for test_sample in sample:
            test_sample.append(1)  # Augment the sample
            test_sample = np.array(test_sample)
            scores = np.matmul(a.T, test_sample)
            print(scores)
            test_result = np.argmax(scores)
            if test_result != class_i:
                cnt += 1
            total += 1
    f_ratio = cnt / total
    print("The false rate is {} ".format(f_ratio))
    return f_ratio
Code Summary
import copy

import numpy as np
import matplotlib.pyplot as plt

# Sample w1, w2, w3, w4
w1 = [[0.1, 1.1], [6.8, 7.1], [-3.5, -4.1], [2.0, 2.7], [4.1, 2.8], [3.1, 5.0], [-0.8, -1.3], [0.9, 1.2], [5.0, 6.4], [3.9, 4.0]]
w2 = [[7.1, 4.2], [-1.4, -4.3], [4.5, 0.0], [6.3, 1.6], [4.2, 1.9], [1.4, -3.2], [2.4, -4.0], [2.5, -6.1], [8.4, 3.7], [4.1, -2.2]]
w3 = [[-3.0, -2.9], [0.5, 8.7], [2.9, 2.1], [-0.1, 5.2], [-4.0, 2.2], [-1.3, 3.7], [-3.4, 6.2], [-4.1, 3.4], [-5.1, 1.6], [1.9, 5.1]]
w4 = [[-2.0, -8.4], [-8.9, 0.2], [-4.2, -7.7], [-8.5, -3.2], [-6.7, -4.0], [-0.5, -9.2], [-5.3, -6.7], [-8.7, -6.4], [-7.1, -9.7], [-8.0, -6.3]]

# Generate the normalized augmented samples
# Input: w1, w2
#   w1: samples in class 1
#   w2: samples in class 2
# Output: w
#   w: normalized augmented samples of class 1 and class 2
def trans_sample(w1, w2):
    w_1 = copy.deepcopy(w1)
    w_2 = copy.deepcopy(w2)
    # Augmentation: append a constant 1 to every sample
    for sample in w_1:
        sample.append(1)
    for sample in w_2:
        sample.append(1)
    # Normalization: negate the samples of class 2
    w_1 = np.array(w_1)
    w_2 = -np.array(w_2)
    w = np.concatenate([w_1, w_2])
    return w

# Define the batch perceptron algorithm
# Input: w1, w2
#   w1: samples in class 1
#   w2: samples in class 2
# Output: a, n
#   a: the parameters
#   n: number of iterations
def batch_perception(w1, w2):
    # Generate the normalized augmented samples
    w = trans_sample(w1, w2)
    # Initialization
    a = np.zeros_like(w[1])
    eta = 1  # Learning rate
    theta = np.zeros_like(w[1]) + 1e-6  # Termination threshold
    n = 0  # Number of iterations
    # Implement the algorithm
    while True:
        y = np.zeros_like(w[1])
        for sample in w:
            if np.matmul(a.T, sample) <= 0:
                # The sample is misclassified
                y += sample
        eta_y = eta * y
        if all(np.abs(eta_y) <= theta):
            # The termination condition is satisfied, stop iterating
            break
        a += eta_y
        n += 1
    print("The decision surface a is {}\nThe number of iterations is {}.".format(a, n))
    return a, n

# Define the Ho-Kashyap algorithm
# Input: w1, w2
#   w1: samples in class 1
#   w2: samples in class 2
# Output: a, b, e
#   a, b: the parameters
#   e: training errors
def HK_algorithm(w1, w2):
    # Generate the normalized augmented samples
    w = trans_sample(w1, w2)
    # Initialization
    a = np.zeros_like(w[1])
    b = np.zeros(w.shape[0]) + 0.5
    yita = 0.5  # Learning rate
    th_b = np.zeros(w.shape[0]) + 1e-6  # Termination threshold on the errors
    th_n = 10000  # Maximum number of iterations
    n = 0  # Number of iterations
    # Implement the algorithm
    while n <= th_n:
        e = np.matmul(w, a) - b
        e_ = 0.5 * (e + np.abs(e))  # Positive part of e
        b += 2 * yita * e_
        # a = Y^+ b with Y^+ = (Y^T Y)^{-1} Y^T
        a = np.matmul(np.matmul(np.linalg.inv(np.matmul(w.T, w)), w.T), b)
        n += 1
        if all(np.abs(e) == 0):
            # The errors are all 0, stop iterating
            print("The decision surface a is {}.".format(a))
            print("The decision errors are {}.".format(e))
            return a, b, e
        if all(np.abs(e) <= th_b):
            # The termination condition is satisfied, stop iterating
            print("The decision surface a is {}.".format(a))
            print("The decision errors are {}.".format(e))
            return a, b, e
    # The termination condition was not met within th_n iterations
    print("Iteration exceeded the maximum number! No solution found!")
    print("The current decision surface a is {}.".format(a))
    print("The current decision errors are {}.".format(e))
    return a, b, e

# Define the multi-class MSE algorithm
# Input: w
#   w: [[class1], [class2], ...]
# Output: a
#   a: the parameters (sample_d+1, class_n)
def multi_class_mse(w_i):
    w_ = copy.deepcopy(w_i)
    class_n = len(w_)
    sample_n = len(w_[0])
    # Initialization
    w = []
    y = np.zeros((class_n, class_n * sample_n))
    for class_idx, class_sample in enumerate(w_):
        for sample in class_sample:
            sample.append(1)  # Augment the sample
            w.append(sample)
        # One-hot targets for the samples of this class
        y[class_idx, class_idx * sample_n:(class_idx + 1) * sample_n] = 1
    print(y)
    w = np.array(w).T
    # w: (sample_d+1, class_n*sample_n)
    # y: (class_n, class_n*sample_n)
    # a = (w w^T + lambda*I)^{-1} w y^T, with lambda = 0.01
    reg = 0.01 * np.eye(w.shape[0])
    a = np.matmul(np.matmul(np.linalg.inv(np.matmul(w, w.T) + reg), w), y.T)
    print("The decision surface a is {}.".format(a))
    return a

# Define the multi-class test function
# Input: w, a
#   w: [[class1], [class2], ...]
#   a: the parameters (sample_d+1, class_n)
# Output: f_ratio
#   f_ratio: error rate
def multi_class_mse_test(w, a):
    w_ = copy.deepcopy(w)
    cnt = 0    # Number of misclassified samples
    total = 0  # Number of tested samples
    for class_i, sample in enumerate(w_):
        print(sample)
        for test_sample in sample:
            test_sample.append(1)  # Augment the sample
            test_sample = np.array(test_sample)
            scores = np.matmul(a.T, test_sample)
            print(scores)
            test_result = np.argmax(scores)
            if test_result != class_i:
                cnt += 1
            total += 1
    f_ratio = cnt / total
    print("The false rate is {} ".format(f_ratio))
    return f_ratio

# Show the classification results
# Input: w1, w2, a
#   w1: samples in class 1
#   w2: samples in class 2
#   a: the parameters
def show_result(w1, w2, a):
    # Samples in class 1
    w_1 = np.array(w1)
    x1 = w_1[:, 0]
    y1 = w_1[:, 1]
    plt.scatter(x1, y1, marker='.', color='red')
    # Samples in class 2
    w_2 = np.array(w2)
    x2 = w_2[:, 0]
    y2 = w_2[:, 1]
    plt.scatter(x2, y2, marker='.', color='blue')
    # Decision surface: a[0]*x + a[1]*y + a[2] = 0
    x = np.arange(-10, 10, 0.1)
    y = -a[0] / a[1] * x - a[2] / a[1]
    plt.plot(x, y)
    plt.xlabel('x_1')
    plt.ylabel('x_2')
    plt.title('Classification Result')
    plt.show()

# Show the multi-class classification results
# Input: w, a
#   w: [[class1], [class2], ...]
#   a: the parameters
def multi_class_show_result(w, a):
    for class_idx, class_sample in enumerate(w):
        # Samples in class_idx
        w_i = np.array(class_sample)
        x = w_i[:, 0]
        y = w_i[:, 1]
        plt.scatter(x, y)
        # Line on which the discriminant of class_idx equals zero
        x = np.arange(-10, 10, 0.1)
        y = -a[0][class_idx] / a[1][class_idx] * x - a[2][class_idx] / a[1][class_idx]
        plt.plot(x, y)
    plt.xlabel('x_1')
    plt.ylabel('x_2')
    plt.title('Classification Result')
    plt.show()

if __name__ == "__main__":
    print("The result of the batch perceptron algorithm is shown below")
    print("w1 and w2")
    a, n = batch_perception(w1, w2)
    show_result(w1, w2, a)
    print("w3 and w2")
    a, n = batch_perception(w3, w2)
    show_result(w3, w2, a)
    print("-------------------------------------")
    print("The result of the Ho-Kashyap algorithm is shown below")
    print("w1 and w3")
    a, b, e = HK_algorithm(w1, w3)
    show_result(w1, w3, a)
    print("w2 and w4")
    a, b, e = HK_algorithm(w2, w4)
    show_result(w2, w4, a)
    print("-------------------------------------")
    print("The result of the multi-class MSE is shown below")
    # Train on the first 8 samples of each class, test on the remaining 2
    a = multi_class_mse([w1[:8], w2[:8], w3[:8], w4[:8]])
    multi_class_show_result([w1[:8], w2[:8], w3[:8], w4[:8]], a)
    multi_class_mse_test([w1[8:], w2[8:], w3[8:], w4[8:]], a)