Contents

Logistic Regression

The Logistic Regression Hypothesis

Cost Function

Vectorized Cost Function (Matrix Form)

Minimizing the Cost Function (Gradient Descent)

Prediction

Plotting the Decision Boundary

Regularized Logistic Regression

data


Logistic Regression

# %load ../../standard_import.txt
import pandas as pd
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

from scipy.optimize import minimize
from sklearn.preprocessing import PolynomialFeatures

pd.set_option('display.notebook_repr_html', False)
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', 150)
pd.set_option('display.max_seq_items', None)

#%config InlineBackend.figure_formats = {'pdf',}
%matplotlib inline

import seaborn as sns
sns.set_context('notebook')
sns.set_style('white')

def loaddata(file, delimeter):
    data = np.loadtxt(file, delimiter=delimeter)
    print('Dimensions: ', data.shape)
    print(data[1:6,:])
    return(data)

def plotData(data, label_x, label_y, label_pos, label_neg, axes=None):
    # Get the indices of the positive and negative examples
    neg = data[:,2] == 0
    pos = data[:,2] == 1

    # If no axes object was passed in, use the current axes
    if axes == None:
        axes = plt.gca()

    axes.scatter(data[pos][:,0], data[pos][:,1], marker='+', c='k', s=60, linewidth=2, label=label_pos)
    axes.scatter(data[neg][:,0], data[neg][:,1], c='y', s=60, label=label_neg)
    axes.set_xlabel(label_x)
    axes.set_ylabel(label_y)
    axes.legend(frameon=True, fancybox=True);

# Load the exam-scores data (data1.txt, listed at the end of this post)
data = loaddata('data1.txt', ',')

X = np.c_[np.ones((data.shape[0],1)), data[:,0:2]]
y = np.c_[data[:,2]]

The r in np.r_ stands for row: it stacks two arrays by rows, concatenating them vertically (one on top of the other). This requires the arrays to have the same number of columns, similar to concat(axis=0) in pandas.

The c in np.c_ stands for column: it stacks two arrays by columns, concatenating them horizontally (side by side). This requires the arrays to have the same number of rows, similar to concat(axis=1) in pandas.
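A quick illustration of the difference (the arrays here are just made up for the example):

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

np.r_[a, b]    # stacked vertically: shape (4, 2)
np.c_[a, b]    # stacked horizontally: shape (2, 4)

np.c_[np.ones(3), np.arange(3)]   # prepend a column of ones, as done for X above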

The Logistic Regression Hypothesis
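The hypothesis applies the sigmoid (logistic) function $g$ to a linear combination of the features:

$$h_\theta(x) = g(\theta^T x), \qquad g(z) = \frac{1}{1 + e^{-z}}$$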

# Define the sigmoid function
def sigmoid(z):
    return(1 / (1 + np.exp(-z)))

SciPy actually has a function that does exactly the same thing:

https://docs.scipy.org/doc/scipy/reference/generated/scipy.special.expit.html#scipy.special.expit
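A quick sanity check that the two agree:

from scipy.special import expit

z = np.array([-2.0, 0.0, 2.0])
print(np.allclose(sigmoid(z), expit(z)))   # True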

Cost Function

Vectorized Cost Function (Matrix Form)
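This is the cost the code below computes, written in matrix form, with $h = g(X\theta)$ the vector of predictions:

$$J(\theta) = -\frac{1}{m}\Bigl(y^T \log(h) + (1-y)^T \log(1-h)\Bigr)$$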

# Define the cost function
def costFunction(theta, X, y):
    m = y.size
    h = sigmoid(X.dot(theta))

    J = -1*(1/m)*(np.log(h).T.dot(y) + np.log(1-h).T.dot(1-y))

    if np.isnan(J[0]):
        return(np.inf)
    return(J[0])

# Compute the gradient
def gradient(theta, X, y):
    m = y.size
    h = sigmoid(X.dot(theta.reshape(-1,1)))

    grad = (1/m)*X.T.dot(h-y)

    return(grad.flatten())
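The function above implements the vectorized gradient of the cost:

$$\frac{\partial J(\theta)}{\partial \theta} = \frac{1}{m} X^T \bigl(h - y\bigr)$$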
initial_theta = np.zeros(X.shape[1])
cost = costFunction(initial_theta, X, y)
grad = gradient(initial_theta, X, y)
print('Cost: \n', cost)
print('Grad: \n', grad)

Minimizing the Cost Function (Gradient Descent)

# Taking a shortcut here: instead of writing gradient descent by hand, call scipy's minimize function to do the optimization for us
res = minimize(costFunction, initial_theta, args=(X,y), method=None, jac=gradient, options={'maxiter':400})
res
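With a gradient (jac) supplied and no bounds or constraints, method=None lets SciPy choose an unconstrained solver (BFGS in current versions). The fitted parameters and final cost are available on the result object:

theta_opt = res.x                   # fitted parameters
print('Optimized cost:', res.fun)   # minimized value of J(theta)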

Prediction

def predict(theta, X, threshold=0.5):
    p = sigmoid(X.dot(theta.T)) >= threshold
    return(p.astype('int'))
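As a sanity check, take the point (45, 85) that gets marked on the plot below. With the fitted parameters, the pass probability should come out around 0.776 on this dataset, so the hard prediction is 1:

# Probability and prediction for exam scores (45, 85)
print(sigmoid(np.array([1, 45, 85]).dot(res.x)))     # ~0.776
print(predict(res.x, np.array([[1, 45, 85]])))       # [1]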

Plotting the Decision Boundary

plt.scatter(45, 85, s=60, c='r', marker='v', label='(45, 85)')
plotData(data, 'Exam 1 score', 'Exam 2 score', 'Pass', 'Failed')
x1_min, x1_max = X[:,1].min(), X[:,1].max()
x2_min, x2_max = X[:,2].min(), X[:,2].max()
xx1, xx2 = np.meshgrid(np.linspace(x1_min, x1_max), np.linspace(x2_min, x2_max))
h = sigmoid(np.c_[np.ones((xx1.ravel().shape[0],1)), xx1.ravel(), xx2.ravel()].dot(res.x))
h = h.reshape(xx1.shape)
plt.contour(xx1, xx2, h, [0.5], linewidths=1, colors='b');
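The contour at 0.5 is the decision boundary: since $g(z) = 0.5$ exactly when $z = 0$, the boundary here is the straight line

$$\theta_0 + \theta_1 x_1 + \theta_2 x_2 = 0$$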

Regularized Logistic Regression
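The regularized cost adds an L2 penalty on every parameter except the intercept $\theta_0$, which is why the gradient code below pins the first component of the penalty term to zero:

$$J(\theta) = -\frac{1}{m}\Bigl(y^T \log(h) + (1-y)^T \log(1-h)\Bigr) + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$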

def costFunctionReg(theta, reg, *args):
    m = y.size
    h = sigmoid(XX.dot(theta))

    J = -1*(1/m)*(np.log(h).T.dot(y) + np.log(1-h).T.dot(1-y)) + (reg/(2*m))*np.sum(np.square(theta[1:]))

    if np.isnan(J[0]):
        return(np.inf)
    return(J[0])

def gradientReg(theta, reg, *args):
    m = y.size
    h = sigmoid(XX.dot(theta.reshape(-1,1)))

    grad = (1/m)*XX.T.dot(h-y) + (reg/m)*np.r_[[[0]], theta[1:].reshape(-1,1)]

    return(grad.flatten())
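The plotting loop below references data2, X, XX, poly, and initial_theta, which the notebook defines before this cell. A minimal setup consistent with those calls (using the data2.txt listed at the end of this post) would be:

data2 = loaddata('data2.txt', ',')
y = np.c_[data2[:,2]]
X = data2[:,0:2]

# Map the two features onto all polynomial terms up to degree 6
poly = PolynomialFeatures(6)
XX = poly.fit_transform(X)

initial_theta = np.zeros(XX.shape[1])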
fig, axes = plt.subplots(1,3, sharey = True, figsize=(17,5))

# Decision boundaries: let's see what happens when the regularization
# coefficient lambda is too small and when it is too large
# Lambda = 0   : no regularization at all, so we overfit
# Lambda = 1   : this is the right way to use it
# Lambda = 100 : the penalty is far too aggressive, and we barely fit a decision boundary at all
for i, C in enumerate([0, 1, 100]):
    # Minimize costFunctionReg
    res2 = minimize(costFunctionReg, initial_theta, args=(C, XX, y), method=None, jac=gradientReg, options={'maxiter':3000})

    # Accuracy
    accuracy = 100*sum(predict(res2.x, XX) == y.ravel())/y.size

    # Scatter plot of X, y
    plotData(data2, 'Microchip Test 1', 'Microchip Test 2', 'y = 1', 'y = 0', axes.flatten()[i])

    # Plot the decision boundary
    x1_min, x1_max = X[:,0].min(), X[:,0].max()
    x2_min, x2_max = X[:,1].min(), X[:,1].max()
    xx1, xx2 = np.meshgrid(np.linspace(x1_min, x1_max), np.linspace(x2_min, x2_max))
    h = sigmoid(poly.fit_transform(np.c_[xx1.ravel(), xx2.ravel()]).dot(res2.x))
    h = h.reshape(xx1.shape)
    axes.flatten()[i].contour(xx1, xx2, h, [0.5], linewidths=1, colors='g');
    axes.flatten()[i].set_title('Train accuracy {}% with Lambda = {}'.format(np.round(accuracy, decimals=2), C))

data

data1.txt

34.62365962451697,78.0246928153624,0
30.28671076822607,43.89499752400101,0
35.84740876993872,72.90219802708364,0
60.18259938620976,86.30855209546826,1
79.0327360507101,75.3443764369103,1
45.08327747668339,56.3163717815305,0
61.10666453684766,96.51142588489624,1
75.02474556738889,46.55401354116538,1
76.09878670226257,87.42056971926803,1
84.43281996120035,43.53339331072109,1
95.86155507093572,38.22527805795094,0
75.01365838958247,30.60326323428011,0
82.30705337399482,76.48196330235604,1
69.36458875970939,97.71869196188608,1
39.53833914367223,76.03681085115882,0
53.9710521485623,89.20735013750205,1
69.07014406283025,52.74046973016765,1
67.94685547711617,46.67857410673128,0
70.66150955499435,92.92713789364831,1
76.97878372747498,47.57596364975532,1
67.37202754570876,42.83843832029179,0
89.67677575072079,65.79936592745237,1
50.534788289883,48.85581152764205,0
34.21206097786789,44.20952859866288,0
77.9240914545704,68.9723599933059,1
62.27101367004632,69.95445795447587,1
80.1901807509566,44.82162893218353,1
93.114388797442,38.80067033713209,0
61.83020602312595,50.25610789244621,0
38.78580379679423,64.99568095539578,0
61.379289447425,72.80788731317097,1
85.40451939411645,57.05198397627122,1
52.10797973193984,63.12762376881715,0
52.04540476831827,69.43286012045222,1
40.23689373545111,71.16774802184875,0
54.63510555424817,52.21388588061123,0
33.91550010906887,98.86943574220611,0
64.17698887494485,80.90806058670817,1
74.78925295941542,41.57341522824434,0
34.1836400264419,75.2377203360134,0
83.90239366249155,56.30804621605327,1
51.54772026906181,46.85629026349976,0
94.44336776917852,65.56892160559052,1
82.36875375713919,40.61825515970618,0
51.04775177128865,45.82270145776001,0
62.22267576120188,52.06099194836679,0
77.19303492601364,70.45820000180959,1
97.77159928000232,86.7278223300282,1
62.07306379667647,96.76882412413983,1
91.56497449807442,88.69629254546599,1
79.94481794066932,74.16311935043758,1
99.2725269292572,60.99903099844988,1
90.54671411399852,43.39060180650027,1
34.52451385320009,60.39634245837173,0
50.2864961189907,49.80453881323059,0
49.58667721632031,59.80895099453265,0
97.64563396007767,68.86157272420604,1
32.57720016809309,95.59854761387875,0
74.24869136721598,69.82457122657193,1
71.79646205863379,78.45356224515052,1
75.3956114656803,85.75993667331619,1
35.28611281526193,47.02051394723416,0
56.25381749711624,39.26147251058019,0
30.05882244669796,49.59297386723685,0
44.66826172480893,66.45008614558913,0
66.56089447242954,41.09209807936973,0
40.45755098375164,97.53518548909936,1
49.07256321908844,51.88321182073966,0
80.27957401466998,92.11606081344084,1
66.74671856944039,60.99139402740988,1
32.72283304060323,43.30717306430063,0
64.0393204150601,78.03168802018232,1
72.34649422579923,96.22759296761404,1
60.45788573918959,73.09499809758037,1
58.84095621726802,75.85844831279042,1
99.82785779692128,72.36925193383885,1
47.26426910848174,88.47586499559782,1
50.45815980285988,75.80985952982456,1
60.45555629271532,42.50840943572217,0
82.22666157785568,42.71987853716458,0
88.9138964166533,69.80378889835472,1
94.83450672430196,45.69430680250754,1
67.31925746917527,66.58935317747915,1
57.23870631569862,59.51428198012956,1
80.36675600171273,90.96014789746954,1
68.46852178591112,85.59430710452014,1
42.0754545384731,78.84478600148043,0
75.47770200533905,90.42453899753964,1
78.63542434898018,96.64742716885644,1
52.34800398794107,60.76950525602592,0
94.09433112516793,77.15910509073893,1
90.44855097096364,87.50879176484702,1
55.48216114069585,35.57070347228866,0
74.49269241843041,84.84513684930135,1
89.84580670720979,45.35828361091658,1
83.48916274498238,48.38028579728175,1
42.2617008099817,87.10385094025457,1
99.31500880510394,68.77540947206617,1
55.34001756003703,64.9319380069486,1
74.77589300092767,89.52981289513276,1

data2.txt

0.051267,0.69956,1
-0.092742,0.68494,1
-0.21371,0.69225,1
-0.375,0.50219,1
-0.51325,0.46564,1
-0.52477,0.2098,1
-0.39804,0.034357,1
-0.30588,-0.19225,1
0.016705,-0.40424,1
0.13191,-0.51389,1
0.38537,-0.56506,1
0.52938,-0.5212,1
0.63882,-0.24342,1
0.73675,-0.18494,1
0.54666,0.48757,1
0.322,0.5826,1
0.16647,0.53874,1
-0.046659,0.81652,1
-0.17339,0.69956,1
-0.47869,0.63377,1
-0.60541,0.59722,1
-0.62846,0.33406,1
-0.59389,0.005117,1
-0.42108,-0.27266,1
-0.11578,-0.39693,1
0.20104,-0.60161,1
0.46601,-0.53582,1
0.67339,-0.53582,1
-0.13882,0.54605,1
-0.29435,0.77997,1
-0.26555,0.96272,1
-0.16187,0.8019,1
-0.17339,0.64839,1
-0.28283,0.47295,1
-0.36348,0.31213,1
-0.30012,0.027047,1
-0.23675,-0.21418,1
-0.06394,-0.18494,1
0.062788,-0.16301,1
0.22984,-0.41155,1
0.2932,-0.2288,1
0.48329,-0.18494,1
0.64459,-0.14108,1
0.46025,0.012427,1
0.6273,0.15863,1
0.57546,0.26827,1
0.72523,0.44371,1
0.22408,0.52412,1
0.44297,0.67032,1
0.322,0.69225,1
0.13767,0.57529,1
-0.0063364,0.39985,1
-0.092742,0.55336,1
-0.20795,0.35599,1
-0.20795,0.17325,1
-0.43836,0.21711,1
-0.21947,-0.016813,1
-0.13882,-0.27266,1
0.18376,0.93348,0
0.22408,0.77997,0
0.29896,0.61915,0
0.50634,0.75804,0
0.61578,0.7288,0
0.60426,0.59722,0
0.76555,0.50219,0
0.92684,0.3633,0
0.82316,0.27558,0
0.96141,0.085526,0
0.93836,0.012427,0
0.86348,-0.082602,0
0.89804,-0.20687,0
0.85196,-0.36769,0
0.82892,-0.5212,0
0.79435,-0.55775,0
0.59274,-0.7405,0
0.51786,-0.5943,0
0.46601,-0.41886,0
0.35081,-0.57968,0
0.28744,-0.76974,0
0.085829,-0.75512,0
0.14919,-0.57968,0
-0.13306,-0.4481,0
-0.40956,-0.41155,0
-0.39228,-0.25804,0
-0.74366,-0.25804,0
-0.69758,0.041667,0
-0.75518,0.2902,0
-0.69758,0.68494,0
-0.4038,0.70687,0
-0.38076,0.91886,0
-0.50749,0.90424,0
-0.54781,0.70687,0
0.10311,0.77997,0
0.057028,0.91886,0
-0.10426,0.99196,0
-0.081221,1.1089,0
0.28744,1.087,0
0.39689,0.82383,0
0.63882,0.88962,0
0.82316,0.66301,0
0.67339,0.64108,0
1.0709,0.10015,0
-0.046659,-0.57968,0
-0.23675,-0.63816,0
-0.15035,-0.36769,0
-0.49021,-0.3019,0
-0.46717,-0.13377,0
-0.28859,-0.060673,0
-0.61118,-0.067982,0
-0.66302,-0.21418,0
-0.59965,-0.41886,0
-0.72638,-0.082602,0
-0.83007,0.31213,0
-0.72062,0.53874,0
-0.59389,0.49488,0
-0.48445,0.99927,0
-0.0063364,0.99927,0
0.63265,-0.030612,0
