Experiment 3: Linear Gaussian Classification with an SVM

hw3.1

Perceptron:
Consider running the Perceptron algorithm on some sequence of examples S (an example is a data point and its label). Let S′ be the same set of examples as S, but presented in a different order.
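a) Does the Perceptron algorithm necessarily make the same number of mistakes on S as it does on S′?

Conclusion: not necessarily. What S and S′ share is the upper bound on the number of mistakes, because that bound depends only on the set of examples and not on their order. The exact count can differ.

First, recall the perceptron error criterion:

$$E_P(w, b) = -\sum_{i \in M} y_i \, (w \cdot x_i + b)$$

where $M$ denotes the set of all misclassified samples. A particular misclassified sample contributes a term that is a linear function of $w$ in the region where the sample is misclassified, and correctly classified samples contribute zero. The total error function is therefore piecewise linear. Applying stochastic gradient descent with learning rate $\eta$ to this error function, the change in the weight $w$ for a misclassified sample $(x_i, y_i)$ is:

$$w \leftarrow w + \eta \, y_i x_i, \qquad b \leftarrow b + \eta \, y_i$$

The mistake bound is proved on this basis. Take a separating hyperplane

$$\hat{w}_{opt} \cdot \hat{x} = w_{opt} \cdot x + b_{opt} = 0$$

with

$$\|\hat{w}_{opt}\| = 1$$

(two things are done here: the bias $b$ is appended to the vector $w$, as in regression, giving the augmented vectors $\hat{w} = (w^T, b)^T$ and $\hat{x} = (x^T, 1)^T$; and the combined vector is normalized). Since for the finitely many $i = 1, \dots, N$ we have

$$y_i \, (\hat{w}_{opt} \cdot \hat{x}_i) > 0$$

there exists

$$\gamma = \min_i \; y_i \, (\hat{w}_{opt} \cdot \hat{x}_i)$$

such that

$$y_i \, (\hat{w}_{opt} \cdot \hat{x}_i) \ge \gamma \qquad (1)$$

The perceptron algorithm starts from $\hat{w}_0 = 0$ and updates the weights whenever an input instance is misclassified. Let $\hat{w}_{k-1}$ be the augmented weight vector before the $k$-th misclassified instance, i.e.

$$\hat{w}_{k-1} = (w_{k-1}^T, b_{k-1})^T$$

Then the condition for the $k$-th misclassified instance is

$$y_i \, (\hat{w}_{k-1} \cdot \hat{x}_i) = y_i \, (w_{k-1} \cdot x_i + b_{k-1}) \le 0 \qquad (2)$$

We also know that if $(x_i, y_i)$ is misclassified by $\hat{w}_{k-1}$, then $w$ and $b$ are updated as

$$w_k = w_{k-1} + \eta \, y_i x_i, \qquad b_k = b_{k-1} + \eta \, y_i$$

that is,

$$\hat{w}_k = \hat{w}_{k-1} + \eta \, y_i \hat{x}_i \qquad (3)$$

Two inequalities are now proved. The first is

$$\hat{w}_k \cdot \hat{w}_{opt} \ge k \eta \gamma \qquad (4)$$

From (1) and (3):

$$\hat{w}_k \cdot \hat{w}_{opt} = \hat{w}_{k-1} \cdot \hat{w}_{opt} + \eta \, y_i \, (\hat{w}_{opt} \cdot \hat{x}_i) \ge \hat{w}_{k-1} \cdot \hat{w}_{opt} + \eta \gamma$$

and (4) follows by recursion. The second is

$$\|\hat{w}_k\|^2 \le k \eta^2 R^2, \qquad R = \max_i \|\hat{x}_i\| \qquad (5)$$

From (2) and (3):

$$\|\hat{w}_k\|^2 = \|\hat{w}_{k-1}\|^2 + 2 \eta \, y_i \, (\hat{w}_{k-1} \cdot \hat{x}_i) + \eta^2 \|\hat{x}_i\|^2 \le \|\hat{w}_{k-1}\|^2 + \eta^2 R^2 \le \dots \le k \eta^2 R^2$$

Combining (4) and (5) with the Cauchy-Schwarz inequality:

$$k \eta \gamma \le \hat{w}_k \cdot \hat{w}_{opt} \le \|\hat{w}_k\| \, \|\hat{w}_{opt}\| \le \sqrt{k} \, \eta R \quad \Rightarrow \quad k \le (R / \gamma)^2$$

Neither $R$ nor $\gamma$ depends on the order of presentation, so the same bound holds for S and S′.

b) If so, why? If not, show such an S and S′ where the Perceptron algorithm makes a different number of mistakes on S′ than it does on S.

The bound proved above is shared, but the count itself is not. Take three examples, all labeled +1, with no bias term, $w_0 = 0$, $\eta = 1$, and a mistake counted whenever $y \, (w \cdot x) \le 0$: $A = (1, 0)$, $B = (0, 1)$, $C = (1, 1)$. On S = (A, B, C) the perceptron errs on A (giving $w = (1, 0)$) and on B (giving $w = (1, 1)$), then classifies C correctly: 2 mistakes. On S′ = (C, A, B) it errs only on C (giving $w = (1, 1)$) and then classifies A and B correctly: 1 mistake.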
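The effect of presentation order can be checked numerically. The sketch below is a minimal single-pass perceptron without a bias term; the function name `count_mistakes` and the three toy examples are illustrative assumptions, not part of the homework statement. A point is counted as a mistake when $y \, (w \cdot x) \le 0$.

```python
import numpy as np

def count_mistakes(examples, eta=1.0):
    """Run one pass of the perceptron (no bias) and count the updates."""
    w = np.zeros(len(examples[0][0]))
    mistakes = 0
    for x, y in examples:
        x = np.asarray(x, dtype=float)
        if y * np.dot(w, x) <= 0:      # misclassified (or on the boundary)
            w += eta * y * x           # perceptron update
            mistakes += 1
    return mistakes

# Three examples, all with label +1 (hypothetical toy data)
A, B, C = ((1, 0), 1), ((0, 1), 1), ((1, 1), 1)
S = [A, B, C]
S_prime = [C, A, B]                    # same examples, different order

print(count_mistakes(S), count_mistakes(S_prime))
```

The two orderings give different counts on this data, while the bound $(R/\gamma)^2$ is the same for both because it only depends on the set of examples.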
Some questions of my own:
1. When using the perceptron, why is the dual form more efficient than the primal form?
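One commonly given answer is that the dual form touches the training inputs only through inner products $x_i \cdot x_j$, which can be computed once and stored in a Gram matrix. This mainly pays off when the feature dimension is much larger than the number of samples, or when a kernel replaces the inner product. The following is a minimal sketch under those assumptions; `dual_perceptron` and the three data points are illustrative choices.

```python
import numpy as np

def dual_perceptron(X, y, eta=1.0, max_epochs=100):
    """Dual-form perceptron: learns one coefficient alpha_i per sample."""
    n = len(X)
    gram = X @ X.T                       # all inner products, computed once
    alpha = np.zeros(n)
    b = 0.0
    for _ in range(max_epochs):
        updated = False
        for i in range(n):
            # decision value for sample i using only Gram-matrix entries
            score = np.sum(alpha * y * gram[:, i]) + b
            if y[i] * score <= 0:        # sample i is misclassified
                alpha[i] += eta
                b += eta * y[i]
                updated = True
        if not updated:                  # a full pass with no mistakes
            break
    w = (alpha * y) @ X                  # recover the primal weights
    return w, b, alpha

X = np.array([[3.0, 3.0], [4.0, 3.0], [1.0, 1.0]])
y = np.array([1.0, 1.0, -1.0])
w, b, alpha = dual_perceptron(X, y)
print(w, b, alpha)
```

Inside the loop the feature vectors are never touched again: every check reads one column of the precomputed Gram matrix.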

hw3.2

A proposed kernel. Consider the following kernel function:

$$K(x, x') = \begin{cases} 1 & \text{if } x = x' \\ 0 & \text{otherwise} \end{cases}$$

a) Prove this is a legal kernel. That is, describe an implicit mapping Φ : X → R^m such that K(x, x′) = Φ(x) · Φ(x′). (You may assume the instance space X is finite.)
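To prove that the kernel is legal, it suffices to show that the kernel matrix $K$ is symmetric and positive semidefinite.

From the problem statement:

$$K(x, x') = 1 \text{ if } x = x', \qquad K(x, x') = 0 \text{ otherwise}$$

Consider the mapping

$$\Phi : X \to \mathbb{R}^m, \qquad \Phi(x_i) = e_i$$

where $X = \{x_1, \dots, x_m\}$ and $e_i$ is the $i$-th standard basis vector. If $x_i = x_j$ then $\Phi(x_i) \cdot \Phi(x_j) = 1$, and otherwise it is 0. So, assuming the input space X is finite,

$$K(x_i, x_j) = \Phi(x_i) \cdot \Phi(x_j)$$

Clearly the matrix K satisfies

$$K_{ij} = K_{ji}$$

so it is symmetric.
Without loss of generality, take K to be a square matrix of order 10 built from ten distinct points:

$$K = I_{10}$$

Computing its eigenvalues in MATLAB gives all ones, so K is positive semidefinite. The kernel matrix K is therefore legal.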
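The eigenvalue check can also be reproduced outside MATLAB. The following is a minimal NumPy sketch, assuming the kernel is the indicator of equality (1 when the two points are identical, 0 otherwise) and using ten arbitrary distinct points; `delta_kernel` is a hypothetical helper name.

```python
import numpy as np

def delta_kernel(a, b):
    """Return 1.0 when the two points are identical, else 0.0."""
    return float(np.array_equal(a, b))

# Ten distinct points (arbitrary illustrative choice)
points = [np.array([i, i ** 2]) for i in range(10)]
K = np.array([[delta_kernel(p, q) for q in points] for p in points])

eigenvalues = np.linalg.eigvalsh(K)    # eigvalsh is meant for symmetric matrices
print(np.allclose(K, K.T), eigenvalues)
```

For distinct points the Gram matrix is the identity, so symmetry and non-negative eigenvalues hold for any number of points, not only for the 10 by 10 case.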

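b) In this kernel space, any labeling of points in X will be linearly separable. Justify this claim.
This follows directly: the kernel equals 1 only for identical points, so every point is mapped by Φ to its own orthogonal axis in feature space. For any labeling $y_1, \dots, y_m \in \{-1, +1\}$, the weight vector $w = \sum_i y_i \Phi(x_i)$ gives $w \cdot \Phi(x_j) = y_j$, so whatever labels are chosen, the points can always be separated.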
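The separability claim can be verified exhaustively on a small instance space. This sketch assumes the feature map sends the $i$-th point to the $i$-th standard basis vector (so the feature matrix is the identity) and tries every labeling of four points; the size `m = 4` is an arbitrary choice.

```python
import itertools
import numpy as np

m = 4                              # size of the (finite) instance space
Phi = np.eye(m)                    # row i is the feature vector of x_i

all_separable = True
for labels in itertools.product([-1, 1], repeat=m):
    y = np.array(labels)
    w = y @ Phi                    # w = sum_i y_i * Phi(x_i)
    predictions = np.sign(Phi @ w)
    all_separable &= bool(np.array_equal(predictions, y))

print(all_separable)
```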
c) Since all labelings are linearly separable, this kernel seems perfect for learning any target function. Why is this actually a bad idea?
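Suppose the training instances have two features $[x_1, x_2]$, and consider a landmark $l^{(1)}$ together with different values of $\sigma$ (figure: surface plots of the similarity feature $f$ for several values of $\sigma$).

In the figure the horizontal axes are $x_1$ and $x_2$ and the vertical axis is $f$. It can be seen that $f$ reaches its maximum only when $x$ coincides with $l^{(1)}$, and the rate at which $f$ changes as $x$ moves is controlled by $\sigma^2$. In the next figure (figure: three landmarks with magenta, green, and cyan test points), when an instance lies at the magenta point it is close to $l^{(1)}$ but far from $l^{(2)}$ and $l^{(3)}$, so $f_1$ is close to 1 while $f_2$ and $f_3$ are close to 0. Hence $h_\theta(x) = \theta_0 + \theta_1 f_1 + \theta_2 f_2 + \theta_3 f_3 > 0$, and the prediction is $y = 1$. Likewise, the green point close to $l^{(2)}$ is also predicted as $y = 1$, whereas the cyan point, which is far from all three landmarks, is predicted as $y = 0$.

The conclusion is that the proposed kernel is pointless for learning. Its similarity is nonzero only when a point coincides exactly with a training point, so unless the training set covers every x, the labeled data are insufficient, and the model cannot make a meaningful prediction for any unseen point.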
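To make the generalization failure concrete, the sketch below evaluates a kernel expansion $f(x) = \sum_i \alpha_i y_i K(x_i, x) + b$ with the equality-indicator kernel (assumed here to be the kernel under discussion). The coefficients `alpha`, the bias `b`, and the data points are hand-picked for illustration rather than obtained from an SVM solver.

```python
import numpy as np

def delta_kernel(a, b):
    """Return 1.0 when the two points are identical, else 0.0."""
    return float(np.array_equal(a, b))

train_X = [np.array([0, 0]), np.array([1, 1]), np.array([2, 0])]
train_y = np.array([1.0, -1.0, 1.0])
alpha = np.ones(len(train_X))      # hand-picked positive coefficients
b = 0.0                            # hand-picked bias

def decision(x):
    """Kernel expansion f(x) = sum_i alpha_i y_i K(x_i, x) + b."""
    k = np.array([delta_kernel(xi, x) for xi in train_X])
    return float(np.sum(alpha * train_y * k) + b)

train_scores = [decision(x) for x in train_X]
unseen_scores = [decision(np.array(p)) for p in [(0, 1), (5, 5), (1.0001, 1)]]
print(train_scores, unseen_scores)
```

Every training point is reproduced exactly, while every unseen point, even one that is almost identical to a training point, receives only the bias term. The model memorizes and does not generalize.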

hw3.3 Support vector machine

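Definition of linear separability:

Let $D_0$ and $D_1$ be two sets of points in n-dimensional Euclidean space. If there exist an n-dimensional vector $w$ and a real number $b$ such that every point $x_i$ in $D_0$ satisfies $w \cdot x_i + b > 0$, while every point $x_j$ in $D_1$ satisfies $w \cdot x_j + b < 0$, then $D_0$ and $D_1$ are said to be linearly separable.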
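The definition translates directly into a check for a given candidate hyperplane. In this minimal sketch the helper name `separates` and the two small point sets are illustrative assumptions; it only tests a supplied $(w, b)$ and does not search for one.

```python
import numpy as np

def separates(w, b, D0, D1):
    """True if w.x + b > 0 on all of D0 and w.x + b < 0 on all of D1."""
    return bool(np.all(D0 @ w + b > 0) and np.all(D1 @ w + b < 0))

D0 = np.array([[2.0, 2.0], [3.0, 1.0]])
D1 = np.array([[-1.0, -1.0], [0.0, -2.0]])

good = separates(np.array([1.0, 1.0]), 0.0, D0, D1)    # x + y = 0 separates
bad = separates(np.array([1.0, -1.0]), 0.0, D0, D1)    # x - y = 0 passes through (2, 2)
print(good, bad)
```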

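Definition of the maximum-margin hyperplane:
1. The hyperplane that separates the two classes of samples with the largest margin, also called the maximum-margin hyperplane.
Definition of support vectors:
1. The samples closest to the hyperplane; these points are called support vectors.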

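SVM optimization process:
1. Find the support vectors, the points closest to the hyperplane

$$w^T x + b = 0$$

The distance from a point $x$ to this hyperplane is $|w^T x + b| / \|w\|$. Writing $d$ for the distance from the support vectors to the hyperplane, every sample satisfies

$$\frac{w^T x_i + b}{\|w\|} \ge d \;\; (y_i = 1), \qquad \frac{w^T x_i + b}{\|w\|} \le -d \;\; (y_i = -1)$$

Rescaling $w$ and $b$ so that $\|w\| \, d = 1$ gives:

$$y_i \, (w^T x_i + b) \ge 1$$

2. This yields the two hyperplanes above and below the maximum-margin hyperplane:

$$w^T x + b = 1, \qquad w^T x + b = -1$$

The resulting optimization problem is:

$$\min_{w, b} \; \frac{1}{2} \|w\|^2 \quad \text{s.t.} \quad y_i \, (w^T x_i + b) \ge 1, \quad i = 1, \dots, N$$

Solving via the dual problem:
1. Introduce the method of Lagrange multipliers.
2. Introduce slack variables $a_i$ to convert the inequality constraints into equalities:

$$L(w, b, \lambda, a) = \frac{1}{2} \|w\|^2 + \sum_i \lambda_i \left[ 1 - y_i \, (w^T x_i + b) + a_i^2 \right]$$

3. Use the KKT conditions to simplify the problem:

$$\lambda_i \ge 0, \qquad 1 - y_i \, (w^T x_i + b) \le 0, \qquad \lambda_i \left[ 1 - y_i \, (w^T x_i + b) \right] = 0$$

4. Strong duality: the optimum of the primal problem equals the optimum of the dual problem $\max_{\lambda} \min_{w, b} L$.
Putting it together for SVM optimization:
1. Substitute the solution of the dual problem above into the support-vector model.
2. Apply soft-margin optimization to the samples by introducing slack variables $\xi_i$:

$$\min_{w, b, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_i \xi_i \quad \text{s.t.} \quad y_i \, (w^T x_i + b) \ge 1 - \xi_i, \quad \xi_i \ge 0$$

3. Repeat the first-stage procedure on this problem.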
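The relation between the dual solution and the separating hyperplane can be inspected with scikit-learn. This is a minimal sketch on six hand-picked, linearly separable points; the very large `C` is an assumption used to approximate the hard-margin problem, since `SVC` always solves the soft-margin formulation.

```python
import numpy as np
from sklearn.svm import SVC

X = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0],
              [3.0, 3.0], [4.0, 3.0], [3.0, 4.0]])
y = np.array([-1, -1, -1, 1, 1, 1])

# A very large C approximates the hard-margin problem on separable data.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

# dual_coef_ stores alpha_i * y_i for the support vectors,
# so w = sum_i alpha_i y_i x_i can be rebuilt from the dual solution.
w_from_dual = (clf.dual_coef_ @ clf.support_vectors_).ravel()
w = clf.coef_.ravel()
margin_width = 2.0 / np.linalg.norm(w)

# Support vectors should lie on the planes w.x + b = +1 or -1.
sv_scores = clf.decision_function(clf.support_vectors_)
print(w, clf.intercept_, margin_width, sv_scores)
```

Only the support vectors carry nonzero dual coefficients, which is the practical content of the complementary-slackness part of the KKT conditions.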
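On kernel functions:
1. Is a kernel in effect a prior assumption about how the sample points are distributed?
2. For problems that are not linearly separable, a kernel maps (folds) the data into a higher-dimensional space in which they become separable.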
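The second point can be illustrated with an explicit feature map instead of an implicit kernel. The sketch below uses four hypothetical 1-D points that no single threshold can split, and the map $\phi(x) = (x, x^2)$, under which a horizontal line separates them; the threshold 2.5 is chosen by hand.

```python
import numpy as np

x = np.array([-2.0, -1.0, 1.0, 2.0])
y = np.array([1, -1, -1, 1])          # outer points vs inner points

# In 1-D no single threshold works: try every split between neighbouring
# points, with either orientation of the two half-lines.
thresholds = [-1.5, 0.0, 1.5]
separable_1d = any(
    np.all(np.sign(s * (x - t)) == y) for t in thresholds for s in (1, -1)
)

# Explicit feature map phi(x) = (x, x^2); the line x^2 = 2.5 separates.
phi = np.column_stack([x, x ** 2])
w, b = np.array([0.0, 1.0]), -2.5
separable_2d = bool(np.all(np.sign(phi @ w + b) == y))

print(separable_1d, separable_2d)
```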

Code:

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

# Toy example: three 3-D points
x = [[2, 0, 1], [1, 1, 2], [2, 3, 3]]
y = [0, 0, 1]  # class labels
clf = svm.SVC(kernel='linear')  # SVC with a linear kernel
clf.fit(x, y)
print(clf)
print(clf.support_vectors_)  # support vectors
print(clf.support_)  # indices of the support vectors
print(clf.n_support_)  # number of support vectors per class
x_new = [2, 0, 3]
x_new = np.array(x_new).reshape(1, -1)
print(clf.predict(x_new))  # prediction

# Two Gaussian blobs shifted to (-2, -2) and (2, 2)
np.random.seed(0)
x = np.r_[np.random.randn(300, 2) - [2, 2], np.random.randn(200, 2) + [2, 2]]
y = [0] * 300 + [1] * 200
clf = svm.SVC(kernel='linear')
clf.fit(x, y)
w = clf.coef_[0]  # weight vector w
a = -w[0] / w[1]  # slope of the decision line
# Draw the separating line
xx = np.linspace(-5, 5)  # x values in (-5, 5)
yy = a * xx - (clf.intercept_[0]) / w[1]  # y values on the decision line
# Parallel lines through the first and last support vectors
b = clf.support_vectors_[0]
yy_down = a * xx + (b[1] - a * b[0])
b = clf.support_vectors_[-1]
yy_up = a * xx + (b[1] - a * b[0])
print("W:", w)
print("a:", a)
print("support_vectors_:", clf.support_vectors_)
print("clf.coef_:", clf.coef_)
plt.figure(figsize=(8, 4))
plt.plot(xx, yy)
plt.plot(xx, yy_down)
plt.plot(xx, yy_up)
plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=80)
plt.scatter(x[:, 0], x[:, 1], c=y, cmap=plt.cm.Paired)  # [:, 0] selects column 0
plt.axis('tight')
plt.show()

Before optimization:

Code after optimization:

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

# Toy example: three 3-D points
x = [[2, 0, 1], [1, 1, 2], [2, 3, 3]]
y = [0, 0, 1]  # class labels
clf = svm.SVC(kernel='linear')  # SVC with a linear kernel
clf.fit(x, y)
print(clf)
print(clf.support_vectors_)  # support vectors
print(clf.support_)  # indices of the support vectors
print(clf.n_support_)  # number of support vectors per class
x_new = [2, 0, 3]
x_new = np.array(x_new).reshape(1, -1)
print(clf.predict(x_new))  # prediction

# Fix the seed so the random draw is reproducible, which makes evaluation easier
np.random.seed(0)
mean = (0, 0)
cov = [[1, 0], [0, 1]]
x1 = np.random.multivariate_normal(mean, cov, 300, check_valid='raise')
mean = (1, 2)
cov = [[1, 0], [0, 2]]
x2 = np.random.multivariate_normal(mean, cov, 200, check_valid='raise')
x = np.r_[x1, x2]
y = [0] * 300 + [1] * 200
clf = svm.SVC(kernel='linear')
clf.fit(x, y)
w = clf.coef_[0]  # weight vector w
a = -w[0] / w[1]  # slope of the decision line
# Draw the separating line
xx = np.linspace(-5, 5)  # x values in (-5, 5)
yy = a * xx - (clf.intercept_[0]) / w[1]  # y values on the decision line
# Parallel lines through the first and last support vectors
b = clf.support_vectors_[0]
yy_down = a * xx + (b[1] - a * b[0])
b = clf.support_vectors_[-1]
yy_up = a * xx + (b[1] - a * b[0])
print("W:", w)
print("a:", a)
print("support_vectors_:", clf.support_vectors_)
print("clf.coef_:", clf.coef_)
plt.figure(figsize=(8, 4))
plt.plot(xx, yy)
plt.plot(xx, yy_down)
plt.plot(xx, yy_up)
plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=80)
plt.scatter(x[:, 0], x[:, 1], c=y, cmap=plt.cm.Paired)  # [:, 0] selects column 0
plt.axis('tight')
plt.show()

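Final result (after optimization):

Evaluation:

Best parameters for the linear kernel: {'C': 0.3020408163265306}

Test-set accuracy with the best linear-kernel parameters: 85.0 %

Best parameters for the polynomial kernel: {'C': 0.7827586206896552, 'degree': 1, 'gamma': 0.3889111111111111}

Test-set accuracy with the best polynomial-kernel parameters: 89.0 %

Best parameters for the rbf kernel: {'C': 4.196551724137931, 'gamma': 0.44445555555555555}

Test-set accuracy with the best rbf parameters: 87.0 %

Best parameters for the sigmoid kernel: {'C': 0.1, 'gamma': 0.16673333333333332}

Test-set accuracy with the best sigmoid-kernel parameters: 74.0 %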

Implementation with other kernels:

import numpy as np
import matplotlib.pyplot as plt
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV


# Generate the two Gaussian-distributed classes
def init_data():
    # class centers
    mean_a = [0, 0]
    mean_b = [1, 2]
    # covariance matrices
    cov_a = [[1, 0], [0, 1]]
    cov_b = [[1, 0], [0, 2]]
    # draw the Gaussian samples
    point_a = np.random.multivariate_normal(mean_a, cov_a, 300)
    point_b = np.random.multivariate_normal(mean_b, cov_b, 200)
    # stack the two classes row-wise
    data = np.append(point_a, point_b, 0)
    # labels: class A is 0, class B is 1
    labels = np.array([0] * 500)
    labels[300:] = 1
    return np.round(data, 3), labels


# dataSet holds the coordinates, labels holds the classes
dataSet, labels = init_data()
trainingSet = np.vstack((dataSet[0:240], dataSet[300:460]))
testSet = np.vstack((dataSet[240:300], dataSet[460:500]))
trainingLabels = list(labels[0:240]) + list(labels[300:460])
testLabels = list(labels[240:300]) + list(labels[460:500])

# Scatter plot of the generated A and B points
plt.figure()
x1, y1 = dataSet[0:300].T  # all class-A points
x2, y2 = dataSet[300:500].T  # all class-B points
plt.scatter(x1, y1, c='y', marker='o', alpha=0.5)
plt.scatter(x2, y2, c='r', marker='o', alpha=0.5)
plt.title("Distribution of class A and class B points")
plt.xlabel("x")
plt.ylabel("y")

# Linear kernel
# C: penalty coefficient on the error term. A larger C tolerates fewer margin
# violations and overfits more easily; C is tied to the slack variables.
parameters = {'C': np.linspace(0.1, 10, 50)}
# Search for the best penalty coefficient
clf1 = GridSearchCV(SVC(kernel='linear'), parameters, scoring='f1')
clf1.fit(trainingSet, trainingLabels)  # training
print('Best parameters for the linear kernel: ', clf1.best_params_)
clf1 = SVC(kernel='linear', C=clf1.best_params_['C'])
clf1.fit(trainingSet, trainingLabels)
print("Test-set accuracy with the best linear-kernel parameters: ",
      clf1.score(testSet, testLabels) * 100, "%")


# Shared plotting routine: data, decision boundary, margins, support vectors
def _plot_svm(clf, title, **contour_kwargs):
    plt.figure()
    x1, y1 = dataSet[0:300].T
    x2, y2 = dataSet[300:500].T
    plt.scatter(x1, y1, c='y', marker='o', alpha=0.5)
    plt.scatter(x2, y2, c='r', marker='o', alpha=0.5)
    plt.title(title)
    plt.xlabel("x")
    plt.ylabel("y")
    # Circle the support vectors of the classifier being plotted
    # (the original always used clf1 here; edgecolors makes the hollow markers visible)
    plt.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1],
                s=300, linewidth=1, facecolors='none', edgecolors='k')
    # Evaluate the decision function on a grid
    ax = plt.gca()
    xlim = ax.get_xlim()
    ylim = ax.get_ylim()
    xx = np.linspace(xlim[0], xlim[1], 30)
    yy = np.linspace(ylim[0], ylim[1], 30)
    XX, YY = np.meshgrid(xx, yy)
    xy = np.vstack([XX.ravel(), YY.ravel()]).T
    Z = clf.decision_function(xy).reshape(XX.shape)
    # Draw the decision boundary and the margins
    ax.contour(XX, YY, Z, levels=[-1, 0, 1], alpha=0.5,
               linestyles=['--', '-', '--'], **contour_kwargs)
    # Mark the support vectors
    ax.scatter(clf.support_vectors_[:, 0], clf.support_vectors_[:, 1], s=30)


# Hyperplane plot for the SVM with a linear kernel
def plot_linear_hyperplane(clf, title='hyperplane'):
    _plot_svm(clf, title, colors='k')


# Hyperplane plot for the SVMs with nonlinear kernels
def plot_hyperplane(clf, title='hyperplane'):
    _plot_svm(clf, title, cmap=plt.cm.winter)


plot_linear_hyperplane(clf1, title='linear kernel hyperplane')


def nonlinearityKernel(name):
    if name == "poly":
        parameters = {'C': np.linspace(0.1, 10, 30),
                      'gamma': np.linspace(0.0001, 0.5, 10),
                      'degree': [1, 2]}
    else:
        parameters = {'C': np.linspace(0.1, 10, 30),
                      'gamma': np.linspace(0.001, 1, 10)}
    clf2 = GridSearchCV(SVC(kernel=name), parameters, scoring='f1')  # parameter search
    clf2.fit(trainingSet, trainingLabels)
    print('Best parameters for the {} kernel: '.format(name), clf2.best_params_)
    # Pass every tuned parameter to the final model
    # (the original dropped 'degree' for the poly kernel)
    clf2 = SVC(kernel=name, **clf2.best_params_)
    clf2.fit(trainingSet, trainingLabels)
    print("Test-set accuracy with the best {} parameters: ".format(name),
          clf2.score(testSet, testLabels) * 100, "%")
    plot_hyperplane(clf2, title='{} kernel hyperplane'.format(name))


nonlinearityKernel("rbf")
nonlinearityKernel("sigmoid")
nonlinearityKernel("poly")
plt.show()