Data Used

Batch Perceptron Algorithm

Principle

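Suppose we have a set of samples $y_1, y_2, \dots, y_n$, all in normalized augmented form (class-2 samples are augmented and then negated, as in trans_sample). The goal is to find a solution vector $a$ such that $a^T y_i > 0$ for every $i$. When the samples are linearly separable, infinitely many vectors $a$ satisfy this condition, so we introduce a criterion function to optimize. The idea is to drive the number of misclassified samples to zero by penalizing each of them:
$$J(a)=\sum_{y\in Y}(-a^{T}y)$$
where $Y$ is the set of misclassified samples, that is, the samples with $a^T y \le 0$. We optimize this criterion with gradient descent:
$$\frac{\partial J}{\partial a}=\sum_{y\in Y}(-y)$$
which gives the update rule:
$$a_{k+1}=a_{k}-\eta\sum_{y\in Y}(-y)=a_{k}+\eta\sum_{y\in Y}y$$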
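The update rule can be illustrated with a minimal vectorized sketch. The function name `perceptron_step` and the four toy samples are hypothetical and not part of the original post; the samples are assumed to be already augmented and negated for class 2.

```python
import numpy as np

def perceptron_step(a, Y, eta=1.0):
    """One batch perceptron update on normalized augmented samples Y (n x d)."""
    scores = Y @ a                      # a^T y_i for every sample
    mis = Y[scores <= 0]                # misclassified set (a^T y <= 0)
    J = -(mis @ a).sum()                # criterion J(a) over the misclassified set
    a_next = a + eta * mis.sum(axis=0)  # a_{k+1} = a_k + eta * sum of misclassified y
    return a_next, J, len(mis)

# Hypothetical toy data: class 1 = (1,2), (2,1); class 2 = (-1,-1), (-2,-2).
# Rows are augmented with 1, and class-2 rows are negated.
Y = np.array([[1.0, 2.0,  1.0],
              [2.0, 1.0,  1.0],
              [1.0, 1.0, -1.0],
              [2.0, 2.0, -1.0]])

a = np.zeros(3)
for k in range(10):
    a_next, J, n_mis = perceptron_step(a, Y)
    if n_mis == 0:                      # nothing misclassified: a is a solution
        break
    a = a_next
print(k, a)
```

Starting from $a=0$, every sample counts as misclassified, so the first update adds the sum of all rows; the loop stops as soon as the misclassified set is empty.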

Code Implementation

# Define batch perceptron algorithm
# Input: w1, w2
#   w1: Samples in class 1
#   w2: Samples in class 2
# Output: a, n
#   a: the parameters
#   n: number of iterations
def batch_perception(w1, w2):
    # Generate the normalized augmented samples
    w = trans_sample(w1, w2)
    # Initialization
    a = np.zeros_like(w[1])
    eta = 1                               # Learning rate
    theta = np.zeros_like(w[1]) + 1e-6    # Termination threshold
    n = 0                                 # Number of iterations
    # Implement the algorithm
    while True:
        y = np.zeros_like(w[1])
        for sample in w:
            if np.matmul(a.T, sample) <= 0:
                # the sample is misclassified
                y += sample
        eta_y = eta * y
        if all(np.abs(eta_y) <= theta):
            # if the termination condition is satisfied, stop iterating
            break
        a += eta_y
        n += 1
    print("The decision surface a is {}\nThe number of iterations is {}.".format(a, n))
    return a, n

Ho-Kashyap Algorithm

Principle

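The criterion function above only looks at misclassified samples and ignores the correctly classified ones. The MSE criterion replaces the inequalities with equalities: find a weight vector satisfying $a^T y_i=b_i$, where the $b_i$ are arbitrarily chosen positive constants.

Let $Y\in R^{n\times d}$ be the matrix whose rows are the samples $y_i^T$, and let $b=[b_1,b_2,\dots,b_n]^T$. The problem is then to find $a$ and $b$ such that $Ya=b>0$. To solve it, we use the MSE criterion function:
$$J=\|Ya-b\|^2=\sum_{i=1}^{n}(a^T y_i-b_i)^2$$

We optimize this criterion with gradient descent:
$$\frac{\partial J}{\partial a}=2Y^T(Ya-b)$$
$$\frac{\partial J}{\partial b}=-2(Ya-b)$$
Because $b$ must stay positive, only the components of $b$ whose gradient is negative are updated, so $b$ never decreases. Writing $e_k=Ya_k-b_k$ for the error vector, this gives:
$$b_{k+1}=b_k-\eta_{k}\frac{1}{2}\left(\frac{\partial J}{\partial b}-\left|\frac{\partial J}{\partial b}\right|\right)=b_k+2\eta_k e_k^{+},\qquad e_k^{+}=\frac{1}{2}(e_k+|e_k|)$$
$$a_{k}=Y^{+}b_k$$
where $Y^{+}=(Y^TY)^{-1}Y^T$ is the pseudoinverse of $Y$, obtained by setting $\partial J/\partial a=0$.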
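A single Ho-Kashyap iteration can be sketched as follows. This is an illustrative sketch, not the author's implementation: the function name `ho_kashyap_step` and the toy matrix are hypothetical, and `np.linalg.pinv` is used for $Y^{+}$ instead of forming $(Y^TY)^{-1}Y^T$ explicitly.

```python
import numpy as np

def ho_kashyap_step(Y, b, eta=0.5):
    """One Ho-Kashyap iteration: solve for a, then grow b where the error is positive."""
    a = np.linalg.pinv(Y) @ b          # a_k = Y^+ b_k (least-squares solution)
    e = Y @ a - b                      # error vector e_k = Y a_k - b_k
    e_plus = 0.5 * (e + np.abs(e))     # positive part of the error
    b_next = b + 2 * eta * e_plus      # b_{k+1} = b_k + 2 * eta * e_k^+
    return a, b_next, e

# Hypothetical normalized augmented samples (class-2 rows already negated)
Y = np.array([[1.0, 2.0,  1.0],
              [2.0, 1.0,  1.0],
              [1.0, 1.0, -1.0],
              [2.0, 2.0, -1.0]])
b = np.ones(4)

a, b_next, e = ho_kashyap_step(Y, b)
print("a      =", a)
print("Y a    =", Y @ a)               # all entries positive means a separates the data
print("b_next =", b_next)
```

Only the positive part of the error is added to $b$, so its components can grow but never shrink, which keeps $b>0$ throughout the iteration.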

Code Implementation

# Define Ho-Kashyap algorithm
# Input: w1, w2
#   w1: Samples in class 1
#   w2: Samples in class 2
# Output: a, b, e
#   a, b: the parameters
#   e: training errors
def HK_algorithm(w1, w2):
    # Generate the normalized augmented samples
    w = trans_sample(w1, w2)
    # Initialization
    a = np.zeros_like(w[1])
    b = np.zeros(w.shape[0]) + 0.5
    yita = 0.5                          # learning rate
    th_b = np.zeros(w.shape[0]) + 1e-6  # Termination condition of b
    th_n = 10000                        # Termination condition of n
    n = 0                               # Number of iterations
    # Implement the algorithm
    while n <= th_n:
        e = np.matmul(w, a) - b
        e_ = 0.5 * (e + np.abs(e))
        b += 2 * (yita) * e_
        a = np.matmul(np.matmul(np.linalg.inv(np.matmul(w.T, w)), w.T), b)
        n += 1
        if all(np.abs(e) == 0):
            # if the errors are all 0, terminate the iteration
            print("The decision surface a is {}.".format(a))
            print("The decision errors are {}.".format(e))
            return a, b, e
        if all(np.abs(e) <= th_b):
            # if the termination conditions are satisfied, terminate the iteration
            print("The decision surface a is {}.".format(a))
            print("The decision errors are {}.".format(e))
            return a, b, e
    # if the termination conditions are not satisfied when n = th_n, terminate the iteration
    print("Iteration exceeded the maximum number! No solution found!")
    print("The current decision surface a is {}.".format(a))
    print("The current decision errors are {}.".format(e))
    return a, b, e

MSE Algorithm

Principle

For the multi-class case we combine $c$ two-class classifiers through the linear transformation:
$$y=W^{T}x+b,\quad W\in R^{d\times c},\quad b\in R^{c}$$
Decision rule:
if $j=\arg\max_{k}\,(W^{T}x+b)_k$, then $x\in \omega_j$.

Let
$$\hat{W}=\begin{pmatrix}W\\b^T\end{pmatrix},\quad \hat{x}=\begin{pmatrix}x\\1\end{pmatrix}\in R^{d+1},\quad \hat{X}=(\hat{x}_1,\hat{x}_2,\dots,\hat{x}_n)\in R^{(d+1)\times n}$$
Objective function:
$$\min_{W,b}\sum_{i=1}^{n}\|W^{T}x_i+b-y_i\|^2_2$$
$$\sum_{i=1}^{n}\|W^{T}x_i+b-y_i\|^2_2=\|\hat{W}^{T}\hat{X}-Y\|_F^2$$
where $Y=(y_1,y_2,\dots,y_n)\in R^{c\times n}$ holds the target vectors as columns (one-hot class indicators in the accompanying code). Adding a small regularization term $\lambda\|\hat{W}\|_F^2$ so that the matrix to be inverted is never singular, and setting the gradient to zero, we obtain:
$$\hat{W}=(\hat{X}\hat{X}^{T}+\lambda I)^{-1}\hat{X}Y^{T}\in R^{(d+1)\times c}$$
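The closed-form solution and the argmax decision rule can be sketched in a few lines of NumPy. The names `fit_mse` and `predict` and the four toy points are hypothetical; the sketch assumes one-hot target columns and uses `np.linalg.solve` rather than an explicit inverse.

```python
import numpy as np

def fit_mse(X, labels, n_classes, lam=0.01):
    """Regularized multi-class MSE fit. X: (n, d) rows are samples; returns W_hat (d+1, c)."""
    n = X.shape[0]
    X_hat = np.vstack([X.T, np.ones(n)])                 # augmented samples, (d+1, n)
    Y = np.zeros((n_classes, n))
    Y[labels, np.arange(n)] = 1.0                        # one-hot target columns, (c, n)
    A = X_hat @ X_hat.T + lam * np.eye(X_hat.shape[0])   # X_hat X_hat^T + lambda I
    return np.linalg.solve(A, X_hat @ Y.T)               # solves A W_hat = X_hat Y^T

def predict(W_hat, X):
    """Decision rule: pick the class with the largest linear response."""
    X_hat = np.vstack([X.T, np.ones(X.shape[0])])
    return np.argmax(W_hat.T @ X_hat, axis=0)

# Hypothetical toy data: class 0 on the left, class 1 on the right
X = np.array([[-2.0, 1.0], [-1.0, -1.0], [1.0, -1.0], [2.0, 1.0]])
labels = np.array([0, 0, 1, 1])

W_hat = fit_mse(X, labels, n_classes=2)
print(W_hat.shape)
print(predict(W_hat, X))
```

Note that the regularizer here is $\lambda I$, added only to the diagonal, which matches the formula above.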

Code Implementation

# Define the multi-class test function
# Input: w, a
#   w: [[class1], [class2], ...]
#   a: the parameters (sample_d+1, class_n)
# Output: f_ratio
#   f_ratio: false ratio
def multi_class_mse_test(w, a):
    w_ = copy.deepcopy(w)
    cnt = 0
    for class_i, sample in enumerate(w_):
        print(sample)
        for sample_i, test_sample in enumerate(sample):
            test_sample.append(1)
            test_sample = np.array(test_sample)
            print(np.matmul(a.T, test_sample))
            test_result = np.argmax(np.matmul(a.T, test_sample))
            if test_result != class_i: cnt += 1
    # Assumes every class holds the same number of test samples
    f_ratio = cnt / ((class_i+1)*(sample_i+1))
    print("The false rate is {} ".format(f_ratio))
    return f_ratio
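The denominator `(class_i+1)*(sample_i+1)` in the test function is only correct when every class contains the same number of test samples. A variant that counts samples explicitly might look like the sketch below; the name `error_rate` and the hand-made weight matrix are hypothetical and serve only to illustrate the counting.

```python
import numpy as np

def error_rate(classes, W_hat):
    """Fraction of misclassified samples; classes[i] lists the samples of class i."""
    wrong = 0
    total = 0
    for class_i, samples in enumerate(classes):
        X = np.asarray(samples, dtype=float)
        X_hat = np.hstack([X, np.ones((len(samples), 1))])  # append the bias term
        pred = np.argmax(X_hat @ W_hat, axis=1)             # argmax decision rule
        wrong += int(np.sum(pred != class_i))
        total += len(samples)
    return wrong / total

# Hand-made weights: class 0 scores -x1, class 1 scores +x1
W_demo = np.array([[-1.0, 1.0],
                   [ 0.0, 0.0],
                   [ 0.0, 0.0]])
# Unequal class sizes: three samples in class 0, one in class 1
classes_demo = [[[-1.0, 0.0], [2.0, 0.0], [-3.0, 1.0]],
                [[1.0, 0.0]]]
rate = error_rate(classes_demo, W_demo)
print(rate)
```

Here one of the four samples, the class-0 point at $x_1=2$, falls on the wrong side, so the rate is computed over the true total rather than a product of loop indices.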

Full Code

import numpy as np
import matplotlib.pyplot as plt
import copy

# Sample w1, w2, w3, w4
w1 = [[0.1, 1.1],   [6.8, 7.1],   [-3.5, -4.1], [2.0, 2.7],   [4.1, 2.8],   [3.1, 5.0],   [-0.8, -1.3], [0.9, 1.2],   [5.0, 6.4],   [3.9, 4.0]]
w2 = [[7.1, 4.2],   [-1.4, -4.3], [4.5, 0.0],   [6.3, 1.6],   [4.2, 1.9],   [1.4, -3.2],  [2.4, -4.0],  [2.5, -6.1],  [8.4, 3.7],   [4.1, -2.2]]
w3 = [[-3.0, -2.9], [0.5, 8.7],   [2.9, 2.1],   [-0.1, 5.2],  [-4.0, 2.2],  [-1.3, 3.7],  [-3.4, 6.2],  [-4.1, 3.4],  [-5.1, 1.6],  [1.9, 5.1]]
w4 = [[-2.0, -8.4], [-8.9, 0.2],  [-4.2, -7.7], [-8.5, -3.2], [-6.7, -4.0], [-0.5, -9.2], [-5.3, -6.7], [-8.7, -6.4], [-7.1, -9.7], [-8.0, -6.3]]

# Generate the normalized augmented samples
# Input: w1, w2
#   w1: samples in class 1
#   w2: samples in class 2
# Output: w
#   w: normalized augmented samples of class 1 and class 2 (class 2 negated)
def trans_sample(w1, w2):
    w_1 = copy.deepcopy(w1)
    w_2 = copy.deepcopy(w2)
    # Augmentation
    for sample in w_1: sample.append(1)
    for sample in w_2: sample.append(1)
    # Normalization: negate the class-2 samples
    w_1 = np.array(w_1)
    w_2 = -np.array(w_2)
    w = np.concatenate([w_1, w_2])
    return w

# Define batch perceptron algorithm
# Input: w1, w2
#   w1: Samples in class 1
#   w2: Samples in class 2
# Output: a, n
#   a: the parameters
#   n: number of iterations
def batch_perception(w1, w2):
    # Generate the normalized augmented samples
    w = trans_sample(w1, w2)
    # Initialization
    a = np.zeros_like(w[1])
    eta = 1                               # Learning rate
    theta = np.zeros_like(w[1]) + 1e-6    # Termination threshold
    n = 0                                 # Number of iterations
    # Implement the algorithm
    while True:
        y = np.zeros_like(w[1])
        for sample in w:
            if np.matmul(a.T, sample) <= 0:
                # the sample is misclassified
                y += sample
        eta_y = eta * y
        if all(np.abs(eta_y) <= theta):
            # if the termination condition is satisfied, stop iterating
            break
        a += eta_y
        n += 1
    print("The decision surface a is {}\nThe number of iterations is {}.".format(a, n))
    return a, n

# Define Ho-Kashyap algorithm
# Input: w1, w2
#   w1: Samples in class 1
#   w2: Samples in class 2
# Output: a, b, e
#   a, b: the parameters
#   e: training errors
def HK_algorithm(w1, w2):
    # Generate the normalized augmented samples
    w = trans_sample(w1, w2)
    # Initialization
    a = np.zeros_like(w[1])
    b = np.zeros(w.shape[0]) + 0.5
    yita = 0.5                          # learning rate
    th_b = np.zeros(w.shape[0]) + 1e-6  # Termination condition of b
    th_n = 10000                        # Termination condition of n
    n = 0                               # Number of iterations
    # Implement the algorithm
    while n <= th_n:
        e = np.matmul(w, a) - b
        e_ = 0.5 * (e + np.abs(e))
        b += 2 * (yita) * e_
        a = np.matmul(np.matmul(np.linalg.inv(np.matmul(w.T, w)), w.T), b)
        n += 1
        if all(np.abs(e) == 0):
            # if the errors are all 0, terminate the iteration
            print("The decision surface a is {}.".format(a))
            print("The decision errors are {}.".format(e))
            return a, b, e
        if all(np.abs(e) <= th_b):
            # if the termination conditions are satisfied, terminate the iteration
            print("The decision surface a is {}.".format(a))
            print("The decision errors are {}.".format(e))
            return a, b, e
    # if the termination conditions are not satisfied when n = th_n, terminate the iteration
    print("Iteration exceeded the maximum number! No solution found!")
    print("The current decision surface a is {}.".format(a))
    print("The current decision errors are {}.".format(e))
    return a, b, e

# Define multiple class MSE algorithm
# Input: w_i
#   w_i: [[class1], [class2], ...]
# Output: a
#   a: the parameters (sample_d+1, class_n)
def multi_class_mse(w_i):
    w_ = copy.deepcopy(w_i)
    class_n = len(w_)
    sample_n = len(w_[0])
    # Initialization
    w = []
    y = np.zeros((class_n, class_n*sample_n))
    for class_idx, class_sample in enumerate(w_):
        for sample in class_sample:
            sample.append(1)
            w.append(sample)
        y[class_idx, class_idx*sample_n:(class_idx+1)*sample_n] = 1
    print(y)
    w = np.array(w).T
    # w: (sample_d+1, class_n*sample_n)
    # y: (class_n, class_n*sample_n)
    # Regularize with lambda*I (lambda = 0.01), as in the closed-form solution
    reg = 0.01 * np.eye(w.shape[0])
    a = np.matmul(np.matmul(np.linalg.inv(np.matmul(w, w.T) + reg), w), y.T)
    print("The decision surface a is {}.".format(a))
    return a

# Define the multi-class test function
# Input: w, a
#   w: [[class1], [class2], ...]
#   a: the parameters (sample_d+1, class_n)
# Output: f_ratio
#   f_ratio: false ratio
def multi_class_mse_test(w, a):
    w_ = copy.deepcopy(w)
    cnt = 0
    for class_i, sample in enumerate(w_):
        print(sample)
        for sample_i, test_sample in enumerate(sample):
            test_sample.append(1)
            test_sample = np.array(test_sample)
            print(np.matmul(a.T, test_sample))
            test_result = np.argmax(np.matmul(a.T, test_sample))
            if test_result != class_i: cnt += 1
    # Assumes every class holds the same number of test samples
    f_ratio = cnt / ((class_i+1)*(sample_i+1))
    print("The false rate is {} ".format(f_ratio))
    return f_ratio

# Show the classification results
# Input: w1, w2
#   w1: Samples in class 1
#   w2: Samples in class 2
#   a: the parameters
def show_result(w1, w2, a):
    # Samples in class 1
    w_1 = np.array(w1)
    x1 = w_1[:, 0]
    y1 = w_1[:, 1]
    plt.scatter(x1, y1, marker = '.', color = 'red')
    # Samples in class 2
    w_2 = np.array(w2)
    x2 = w_2[:, 0]
    y2 = w_2[:, 1]
    plt.scatter(x2, y2, marker = '.', color = 'blue')
    # Decision surface: a[0]*x + a[1]*y + a[2] = 0
    x = np.arange(-10, 10, 0.1)
    y = -a[0]/a[1]*x - a[2]/a[1]
    plt.plot(x, y)
    plt.xlabel('x_1')
    plt.ylabel('x_2')
    plt.title('Classification Result')
    plt.show()

# Show the multiple classification results
# Input: w, a
#   w: [[class1], [class2], ...]
#   a: the parameters
def multi_class_show_result(w, a):
    class_n = len(w)
    for class_idx, class_sample in enumerate(w):
        # Samples in class_idx
        w_i = np.array(class_sample)
        x = w_i[:, 0]
        y = w_i[:, 1]
        plt.scatter(x, y)
        # Decision surface
        x = np.arange(-10, 10, 0.1)
        y = -a[0][class_idx]/a[1][class_idx]*x - a[2][class_idx]/a[1][class_idx]
        plt.plot(x, y)
    plt.xlabel('x_1')
    plt.ylabel('x_2')
    plt.title('Classification Result')
    plt.show()

if __name__ == "__main__":
    print("The result of the batch perceptron algorithm is shown below")
    print("w1 and w2")
    a, n = batch_perception(w1, w2)
    show_result(w1, w2, a)
    print("w3 and w2")
    a, n = batch_perception(w3, w2)
    show_result(w3, w2, a)
    print("-------------------------------------")
    print("The result of the Ho-Kashyap algorithm is shown below")
    print("w1 and w3")
    a, b, e = HK_algorithm(w1, w3)
    show_result(w1, w3, a)
    print("w2 and w4")
    a, b, e = HK_algorithm(w2, w4)
    show_result(w2, w4, a)
    print("-------------------------------------")
    print("The result of the multiple class MSE is shown below")
    a = multi_class_mse([w1[:8], w2[:8], w3[:8], w4[:8]])
    multi_class_show_result([w1[:8], w2[:8], w3[:8], w4[:8]], a)
    multi_class_mse_test([w1[8:], w2[8:], w3[8:], w4[8:]], a)

