DL_C1_week_2_2(Logistic Regression)
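Logistic Regression (a very simple Neural Network)
**Notice**: please credit the source when reposting
Main contents:
- Implement a logistic regression classifier that recognizes whether an image contains a cat.
- Parameter initialization
- Computing the cost and the gradient of the cost function
- Using an optimization algorithm, gradient descent, to update the weights (parameters)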
1. Use Packages
- numpy
- h5py : the training data is stored in h5 files
- matplotlib
- PIL (on Python 3.x pillow has replaced PIL; installing pillow is enough)
- scipy
import h5py
import scipy
import numpy as np
from PIL import Image
from scipy import ndimage
import matplotlib.pyplot as plt
%matplotlib inline
2. Load and Overview data set
The dataset is stored in h5 files
- Each image in the training data is labeled 0/1:
- cat (y = 1)
- non-cat (y = 0)
- The training set has 209 images and the test set has 50 images
- m_train = 209
- m_test = 50
- Each image has size (64, 64, 3): height 64, width 64, channels 3 (RGB)
- image_shape = (64, 64, 3)
2.1 Load Dataset
# Function: load data
def load_data():
    # Training set: an h5 file that is read like a dict
    train_dataset = h5py.File('datasets/train_catvnoncat.h5', 'r')
    orig_train_x = np.array(train_dataset['train_set_x'][:])  # training features
    train_y = np.array(train_dataset['train_set_y'][:])       # training labels
    # Test set
    test_dataset = h5py.File('datasets/test_catvnoncat.h5', 'r')
    orig_test_x = np.array(test_dataset['test_set_x'][:])     # test features
    test_y = np.array(test_dataset['test_set_y'][:])          # test labels
    # List of class names
    classes = np.array(test_dataset['list_classes'][:])
    # Reshape the labels to (1, m)
    train_y = train_y.reshape((1, len(train_y)))
    test_y = test_y.reshape((1, len(test_y)))
    return orig_train_x, train_y, orig_test_x, test_y, classes
The "orig_" prefix: orig_train_x and orig_test_x hold the raw pixel values of the images. The data is preprocessed further before it goes into the model, e.g. Standardize (section 2.5).
orig_train_x, train_y , orig_test_x, test_y, classes = load_data()
2.2 Overview dataset
m_train = orig_train_x.shape[0]
m_test = orig_test_x.shape[0]
print ("Number of training examples: m_train = " + str(m_train))
print ("Number of testing examples: m_test = " + str(m_test))
print ("Each image is of size: "+str(orig_train_x[0].shape))
print ("train_set_x shape: " + str(orig_train_x.shape))
print ("train_set_y shape: " + str(train_y.shape))
print ("test_set_x shape: " + str(orig_test_x.shape))
print ("test_set_y shape: " + str(test_y.shape))
print ('classes :' + str(classes))
Number of training examples: m_train = 209
Number of testing examples: m_test = 50
Each image is of size: (64, 64, 3)
train_set_x shape: (209, 64, 64, 3)
train_set_y shape: (1, 209)
test_set_x shape: (50, 64, 64, 3)
test_set_y shape: (1, 50)
classes :[b'non-cat' b'cat']
2.3 Visualize image
- orig_train_x is a multi-dimensional array; orig_train_x[i] is the 3D array (height, width, channels) of a single image
# visualize an image
index = 19
image_label = train_y[0][index]
plt.imshow(orig_train_x[index])
print ("y = "+ str(image_label) +", it's a '"+classes[image_label].decode('utf-8') +"' picture.")
y = 1, it's a 'cat' picture.
index = 10
image_label = train_y[0][index]
plt.imshow(orig_train_x[index])
print ("y = "+ str(image_label) +", it's a '"+ classes[image_label].decode('utf-8') +"' picture.")
y = 0, it's a 'non-cat' picture.
2.4 Reshape orig_data
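- Flatten each image array into a vector of 64x64x3 = 12288 values
- After transposing with .T, the columns of the matrix correspond to the samples and the rows to the features
flatten_train_x = orig_train_x.reshape(orig_train_x.shape[0],-1).T
# -1 : inferred as x.shape[1]*x.shape[2]*x.shape[3] = 12288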
flatten_test_x = orig_test_x.reshape(orig_test_x.shape[0],-1).T
flatten_train_x.shape, flatten_test_x.shape
((12288, 209), (12288, 50))
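The reshape-then-transpose step can be checked on a tiny made-up batch. The sketch below uses a hypothetical array `toy_images` (2 images of shape (2, 2, 3), not the real dataset) and also shows why reshaping straight to `(-1, m)` would be a mistake: it produces the right shape but mixes pixels from different images.

```python
import numpy as np

# Hypothetical toy batch: 2 "images" of shape (2, 2, 3) instead of (64, 64, 3).
toy_images = np.arange(2 * 2 * 2 * 3).reshape(2, 2, 2, 3)

# Flatten every image to one row, then transpose so each column is one image.
toy_flat = toy_images.reshape(toy_images.shape[0], -1).T
print(toy_flat.shape)  # (12, 2): 12 features per image, 2 images

# Column j holds exactly the pixels of image j.
print(np.array_equal(toy_flat[:, 1], toy_images[1].ravel()))  # True

# Reshaping directly to (features, m) has the same shape but scrambles the data.
wrong = toy_images.reshape(-1, toy_images.shape[0])
print(np.array_equal(wrong[:, 1], toy_images[1].ravel()))  # False
```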
2.5 Standardize
- Every element of the data is an image pixel value in the range [0, 255]; normalization helps speed up training
- $normalized\_pixel = \frac{pixel - pixel_{min}}{pixel_{max} - pixel_{min}}$
- It is known that $pixel_{min} = 0, pixel_{max} = 255$
- so $normed\_pixel = \frac{pixel - 0}{255.0}$
normed_train_x = flatten_train_x/255.0
normed_test_x = flatten_test_x/255.0
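A small sketch of the scaling step on made-up pixel values. The array `raw` is hypothetical and only stands in for a few uint8 pixels; the point is that dividing by 255.0 converts the integers to floats in [0, 1].

```python
import numpy as np

# Hypothetical raw pixels, stored as uint8 like image data usually is.
raw = np.array([[0], [64], [128], [255]], dtype=np.uint8)

pixel_min, pixel_max = 0.0, 255.0
scaled = (raw - pixel_min) / (pixel_max - pixel_min)

print(scaled.dtype)                # float64
print(scaled.min(), scaled.max())  # 0.0 1.0
```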
3. General Architecture of the learning algorithm
- Implement a Logistic Regression (a simple Neural Network)
- Using a Neural Network mindset.
- Train the model to classify images
Mathematical expression of the algorithm
$$z^{(i)} = wx^{(i)} + b$$
$$\hat{y}^{(i)} = a^{(i)} = sigmoid(z^{(i)})$$
$$\mathcal{L}(\hat{y}^{(i)}, y^{(i)}) = -y^{(i)}\log(\hat{y}^{(i)}) - (1-y^{(i)})\log(1-\hat{y}^{(i)})$$
$x^{(i)}$ : column vector of one image, shape = (len($x^{(i)}$), 1)
$w$ : weight matrix connecting the input layer to the output layer, shape = (1, len($x^{(i)}$))
$b$ : bias
$\hat{y}^{(i)}, a^{(i)}$ : output after activation
$sigmoid(z^{(i)})$ :
$$sigmoid(z^{(i)}) = \frac{1}{1 + e^{-z^{(i)}}}$$
Average loss over the m samples:
$$J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)})$$
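The per-sample loss and the averaged cost can be written in a few lines of NumPy. This is a minimal sketch on made-up activations and labels; the helper name `cross_entropy` and the example values are assumptions for illustration only.

```python
import numpy as np

def cross_entropy(a, y):
    """Per-sample loss L(a, y) = -y*log(a) - (1-y)*log(1-a)."""
    return -y * np.log(a) - (1 - y) * np.log(1 - a)

# Hypothetical activations and true labels for m = 3 samples.
a = np.array([0.9, 0.2, 0.5])
y = np.array([1, 0, 1])

losses = cross_entropy(a, y)  # one loss value per sample
J = losses.mean()             # J = (1/m) * sum of the per-sample losses

print(losses)
print(J)
```

A confident correct prediction (a = 0.9 for y = 1) contributes a small loss, while an undecided one (a = 0.5) contributes ln 2.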
4. Building algorithm
Steps for building a Neural Network:
- 1. Define the network structure (number of neurons in the input and output layers; this example has no hidden layer)
- 2. Initialize the model parameters $w$, $b$
- 3. Loop:
- (forward propagation) compute the loss $L$
- (backward propagation) compute the gradients $d_w, d_b$
- (gradient descent) update the parameters: $w = w - \alpha d_w$
- $b = b - \alpha d_b$
4.1 Sigmoid function
$sigmoid(wx+b) = \frac{1}{1+e^{-(wx+b)}}$
def sigmoid(z):
    # z : scalar or numpy array
    a = 1./(1.+np.exp(-z))
    return a
sigmoid(np.array([1,2,3,4]))
array([0.73105858, 0.88079708, 0.95257413, 0.98201379])
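For very negative inputs, `np.exp(-z)` in the formula above overflows to `inf` and NumPy emits a RuntimeWarning, although the result (0.0) is still right. The sketch below is one common way to avoid the overflow; `stable_sigmoid` is a hypothetical helper that is not part of the original notebook, and it always returns an array.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid evaluated without overflowing exp (illustrative variant)."""
    z = np.atleast_1d(np.asarray(z, dtype=float))
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so the usual formula is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)), where exp(z) < 1.
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

print(stable_sigmoid([1, 2, 3, 4]))
print(stable_sigmoid([-1000.0, 0.0, 1000.0]))
```

On ordinary inputs it agrees with `sigmoid`; the difference only matters for inputs of large magnitude.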
4.2 Initializing parameters
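- The weights of logistic regression can be initialized to all zeros
- They can also be initialized randomly
def initialize_w(dim, s=0):
    # dim : number of input features
    # s = 0 : initialize w with zeros
    # s = 1 : initialize w randomly (np.random.rand, uniform on [0, 1))
    if s:
        w = np.random.rand(1, dim)
    else:
        w = np.zeros((1, dim))
    b = 0
    return w, b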
initialize_w(5)
(array([[0., 0., 0., 0., 0.]]), 0)
initialize_w(5, 1)
(array([[0.7968392 , 0.14891652, 0.06809079, 0.9028713 , 0.11452771]]), 0)
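Zero initialization works for logistic regression because there are no hidden units whose symmetry would need breaking: with $w = 0, b = 0$ every activation is 0.5, the cost is $\ln 2$, and the gradient is still non-zero, so training can proceed. A minimal sketch on made-up data (the arrays `X` and `Y` here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical data: 4 features, 6 samples, labels 0/1.
X = rng.random((4, 6))
Y = np.array([[1, 0, 1, 1, 0, 0]])
m = X.shape[1]

w = np.zeros((1, 4))
b = 0.0
A = 1.0 / (1.0 + np.exp(-(w @ X + b)))  # every entry is 0.5
cost = -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))
dw = (A - Y) @ X.T / m                  # gradient at the zero point

print(np.allclose(A, 0.5))          # True
print(np.isclose(cost, np.log(2)))  # True
print(dw)                           # non-zero entries
```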
4.3 Forward and Backward propagation
- Forward Propagation
- X: training data matrix, $Z = wX + b$
- A: activated output, $A = \sigma(Z) = (a^{(1)}, a^{(2)}, \dots, a^{(m)})$, where m is the number of samples
- Compute the cost function $J = -\frac{1}{m}\sum_{i=1}^m \left[ y^{(i)}\log(a^{(i)}) + (1-y^{(i)})\log(1-a^{(i)}) \right]$
- Backward Propagation (steps (1) to (5) are written for a single sample, with $\mathcal{L}$ the per-sample loss)
(1)
$$d_w = \frac{\partial \mathcal{L}}{\partial w} = \frac{\partial \mathcal{L}}{\partial a} \cdot \frac{\partial a}{\partial z} \cdot \frac{\partial z}{\partial w}$$
$$d_b = \frac{\partial \mathcal{L}}{\partial b} = \frac{\partial \mathcal{L}}{\partial a} \cdot \frac{\partial a}{\partial z} \cdot \frac{\partial z}{\partial b}$$
(2)
$$\frac{\partial \mathcal{L}}{\partial a} = \frac{-y}{a} + \frac{1-y}{1-a}$$
(3)
$$\frac{\partial a}{\partial z} = a(1-a)$$
(4)
$$\frac{\partial z}{\partial w} = x, \quad \frac{\partial z}{\partial b} = 1$$
(5)
$$d_w = \left(\frac{-y}{a} + \frac{1-y}{1-a}\right) \cdot a(1-a) \cdot x = (a - y)x$$
$$d_b = a - y$$
- Vectorized form (averaged over the m samples; with $w$ of shape (1, n) the gradient has the same shape)
$$\frac{\partial J}{\partial w} = \frac{1}{m}(A-Y)X^T$$
$$\frac{\partial J}{\partial b} = \frac{1}{m}\sum_{i=1}^m (a^{(i)} - y^{(i)})$$
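One way to gain confidence in the vectorized gradient formulas is a numerical gradient check: compare them against central finite differences of the cost. The sketch below is self-contained and uses made-up data; `cost_fn` and `analytic_grads` are hypothetical helper names that mirror the formulas above rather than the notebook's own functions.

```python
import numpy as np

def cost_fn(w, b, X, Y):
    A = 1.0 / (1.0 + np.exp(-(w @ X + b)))
    return -np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A))

def analytic_grads(w, b, X, Y):
    m = X.shape[1]
    A = 1.0 / (1.0 + np.exp(-(w @ X + b)))
    dw = (A - Y) @ X.T / m   # (1/m) (A - Y) X^T
    db = np.sum(A - Y) / m   # (1/m) sum(a - y)
    return dw, db

rng = np.random.default_rng(1)
X = rng.normal(size=(3, 5))      # 3 features, 5 samples
Y = np.array([[1, 0, 0, 1, 1]])
w = rng.normal(size=(1, 3))
b = 0.3

dw, db = analytic_grads(w, b, X, Y)

# Central differences: perturb one parameter at a time.
eps = 1e-6
num_dw = np.zeros_like(w)
for j in range(w.shape[1]):
    step = np.zeros_like(w)
    step[0, j] = eps
    num_dw[0, j] = (cost_fn(w + step, b, X, Y) - cost_fn(w - step, b, X, Y)) / (2 * eps)
num_db = (cost_fn(w, b + eps, X, Y) - cost_fn(w, b - eps, X, Y)) / (2 * eps)

print(np.max(np.abs(dw - num_dw)), abs(db - num_db))  # both should be tiny
```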
# Function: propagate
# Computes the cost and its gradients
def propagate(w, b, X, Y):
    """
    Arguments:
    w : weights, a numpy array of shape (1, n_features)
    b : bias, a scalar
    X : dataset, a numpy array of shape (n_features, m)
    Y : true labels, a numpy array of shape (1, m)
    Returns:
    cost, grads (a dict holding dw and db)
    """
    # Number of samples = number of columns of X.
    # (len(X) would return the number of features instead.)
    m = X.shape[1]
    # forward propagation
    Z = np.dot(w, X) + b
    A = sigmoid(Z)
    # cost
    cost = -1/m * np.sum(Y*np.log(A) + (1-Y)*np.log(1-A))
    # backward propagation
    dw = np.dot((A-Y), X.T)/m
    db = np.sum(A-Y)/m
    grads = {'dw': dw, 'db': db}
    return cost, grads
Check that the function computing the gradients and the cost is correct
w, b, X, Y = np.array([[1,2]]),2,np.array([[1,2],[3,4]]),np.array([[1,0]])
cost, grads = propagate(w, b, X, Y)
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))
print ("cost = " + str(cost))
dw = [[0.99993216 1.99980262]]
db = 0.49993523062470574
cost = 6.000064773192205
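Expected result (the reference lists dw as a column; here w has shape (1, n), so dw prints as a row with the same values):
| name | value |
|---|---|
| **dw** | [[ 0.99993216] [ 1.99980262]] |
| **db** | 0.499935230625 |
| **cost** | 6.000064773192205 |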
5. Optimization
Work completed so far:
- Parameter initialization: initialize_w(dim)
- Activation function: sigmoid(z)
- Computing the cost and gradients: propagate(w, b, X, Y)
- Next, use gradient descent to update the parameters w and b; $\alpha$ is the learning rate, which controls the step size of gradient descent:
- $w = w - \alpha d_w$
- $b = b - \alpha d_b$
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=0):
    # num_iterations : number of loop iterations
    # learning_rate : step size of gradient descent
    # print_cost : if 1, print the cost every 10 iterations
    costs = []
    for i in range(num_iterations):
        cost, grads = propagate(w, b, X, Y)
        w = w - learning_rate*grads['dw']
        b = b - learning_rate*grads['db']
        costs.append(cost)
        if print_cost and i%10 == 0:
            print ('Cost after iteration %i: %f'%(i, cost))
    params = {'w': w, 'b': b}
    return params, grads, costs
Check that the optimize function works properly
# test data
w, b, X, Y = np.array([[1,2]]),2,np.array([[1,2],[3,4]]),np.array([[1,0]])
params , grads, costs = optimize(w, b, X, Y, 100, 0.009, 1)
Cost after iteration 0: 6.000065
Cost after iteration 10: 5.527691
Cost after iteration 20: 5.055445
Cost after iteration 30: 4.583458
Cost after iteration 40: 4.112002
Cost after iteration 50: 3.641644
Cost after iteration 60: 3.173575
Cost after iteration 70: 2.710307
Cost after iteration 80: 2.257084
Cost after iteration 90: 1.824430
plt.plot(costs)
print ("w = " + str(params["w"]))
print ("b = " + str(params["b"]))
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))
w = [[0.1124579 0.23106775]]
b = 1.5593049248448891
dw = [[0.90158428 1.76250842]]
db = 0.4304620716786828
Expected result:
| name | value |
|---|---|
| **w** | [[ 0.1124579 ] [ 0.23106775]] |
| **b** | 1.55930492484 |
| **dw** | [[ 0.90158428][ 1.76250842]] |
| **db** | 0.430462071679 |
6. Predict
- After the 100 iterations above, optimize() returns the params at the end of training
- Use the trained model parameters to predict:
- 1. Compute $\hat{Y} = A = \sigma(wX+b)$
- 2. If $\hat{y}^{(i)} > 0.5$, the label of $x^{(i)}$ is 1,
- if $\hat{y}^{(i)} \le 0.5$, the label of $x^{(i)}$ is 0.
def predict(w, b, X):
    Z = np.dot(w, X) + b
    A = sigmoid(Z)
    predicted = (A > 0.5)*1.
    return predicted
Check that predict() is correct
# test data
w, b, X= np.array([[1,2]]),2,np.array([[1,2],[3,4]])
print ('predicted :' + str(predict(w, b, X)))
predicted :[[1. 1.]]
Expected output:
| name | value |
|---|---|
| **predicted** | [[ 1. 1.]] |
7. Merge all functions into a model
def model(X_train, Y_train, X_test, Y_test, num_iterations=2000, learning_rate=0.5, print_cost=0, flag=0):
    w, b = initialize_w(X_train.shape[0], flag)
    params, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)
    test_predicted = predict(params['w'], params['b'], X_test)
    train_predicted = predict(params['w'], params['b'], X_train)
    print("train accuracy: {} %".format(100 - np.mean(np.abs(train_predicted - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(test_predicted - Y_test)) * 100))
    d = {
        'costs': costs,
        'train_predicted': train_predicted,
        'test_predicted': test_predicted,
        'w': params['w'],
        'b': params['b'],
        'lr': learning_rate,
        'num_iterations': num_iterations,
    }
    return d
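The accuracy expression inside `model` relies on predictions and labels both being 0/1: `|predicted - label|` is 1 exactly where a prediction is wrong, so its mean is the error rate. A small sketch with made-up arrays (`predicted` and `labels` here are hypothetical) shows that it matches the more direct "fraction of equal entries" form:

```python
import numpy as np

# Hypothetical predictions and labels, both of shape (1, m) with entries 0 or 1.
predicted = np.array([[1., 0., 1., 1., 0.]])
labels = np.array([[1, 0, 0, 1, 1]])

# |predicted - labels| is 1 where the prediction is wrong, 0 where it is right.
error_rate = np.mean(np.abs(predicted - labels))
accuracy_a = 100 - error_rate * 100

# Equivalent formulation: percentage of matching entries.
accuracy_b = np.mean(predicted == labels) * 100

print(accuracy_a, accuracy_b)  # 2 of 5 predictions are wrong, so both are about 60
```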
Run model
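- flag = 0: initialize w and b with zeros
- flag = 1: initialize w randomly; np.random.rand draws uniformly from [0, 1)
flag = 0, zero initialization
Note on the recorded costs below: they were produced with m = len(X) in propagate, which is 12288 (the number of features) rather than 209 (the number of samples). Cost and gradients are therefore scaled by 209/12288, which is why the first cost is 0.011789 ≈ ln 2 × 209/12288 rather than ln 2 ≈ 0.693.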
d = model(normed_train_x, train_y, normed_test_x, test_y, print_cost=1)
Cost after iteration 0: 0.011789
Cost after iteration 10: 0.028177
Cost after iteration 20: 0.011830
Cost after iteration 30: 0.019752
Cost after iteration 40: 0.017107
Cost after iteration 50: 0.008967
Cost after iteration 60: 0.008224
Cost after iteration 70: 0.008620
Cost after iteration 80: 0.012857
.................................
Cost after iteration 1900: 0.001477
Cost after iteration 1910: 0.001471
Cost after iteration 1920: 0.001464
Cost after iteration 1930: 0.001458
Cost after iteration 1940: 0.001452
Cost after iteration 1950: 0.001446
Cost after iteration 1960: 0.001440
Cost after iteration 1970: 0.001434
Cost after iteration 1980: 0.001429
Cost after iteration 1990: 0.001423
train accuracy: 99.52153110047847 %
test accuracy: 70.0 %
flag = 1, random initialization
d2 = model(normed_train_x, train_y, normed_test_x, test_y, print_cost=0, flag=1)
train accuracy: 98.08612440191388 %
test accuracy: 72.0 %
cost curve
plt.figure(figsize=(14,6))
plt.plot(d['costs'], label='flag=0')
plt.plot(d2['costs'], label='flag=1')
plt.legend()
plt.title('cost curve lr=0.5')
Text(0.5,1,'cost curve lr=0.5')
The learning rate is 0.5, so the curve oscillates strongly in the early iterations
Model predictions on the test data
Correctly classified
index = 1
pred_label = int(d['test_predicted'][0][index])
true_label = test_y[0][index]  # the recorded output below printed the stale image_label from section 2.3 instead
plt.imshow(orig_test_x[index])
print ("y = "+ str(true_label) +", it's a '"+ classes[pred_label].decode('utf-8') +"' picture.")
y = 0, it's a 'cat' picture.
Misclassified
index = 5
pred_label = int(d['test_predicted'][0][index])
true_label = test_y[0][index]  # true label of this test image
plt.imshow(orig_test_x[index])
print ("y = "+ str(true_label) +", it's a '"+ classes[pred_label].decode('utf-8') +"' picture.")
y = 0, it's a 'cat' picture.
8. Choice of learning rate
learning_rates = [0.005, 0.01, 0.05, 0.1, 0.3, 1.0]
models = {}
for i in learning_rates:
    print ("learning rate is: " + str(i))
    models[str(i)] = model(normed_train_x, train_y, normed_test_x, test_y, num_iterations = 40000, learning_rate = i, print_cost = False)
    print ('\n' + "-------------------------------------------------------" + '\n')
learning rate is: 0.005
train accuracy: 94.73684210526316 %
test accuracy: 74.0 %
-------------------------------------------------------
learning rate is: 0.01
train accuracy: 97.60765550239235 %
test accuracy: 70.0 %
-------------------------------------------------------
learning rate is: 0.05
train accuracy: 100.0 %
test accuracy: 70.0 %
-------------------------------------------------------
learning rate is: 0.1
train accuracy: 100.0 %
test accuracy: 70.0 %
-------------------------------------------------------
learning rate is: 0.3
train accuracy: 100.0 %
test accuracy: 72.0 %
-------------------------------------------------------
learning rate is: 1.0
train accuracy: 100.0 %
test accuracy: 72.0 %
-------------------------------------------------------
plt.figure(figsize=(15,8))
plt.grid(True)
plt.ylabel('cost')
plt.ylim(0, 0.012)  # fits the scale of the recorded costs; widen it if your costs are on a different scale
plt.xlabel('iterations')
plt.title('zero initialization')
for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label= str(models[str(i)]["lr"]))
legend = plt.legend(loc='best', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()
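The larger the learning_rate, the faster the cost curve falls.
If the learning_rate is too large, the cost curve oscillates in the early iterations, but it still settles down in the end.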
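The effect of the step size can be seen in isolation on a toy problem. The sketch below runs gradient descent on $f(x) = x^2$ (not on the cat model; the function `descend` and the step sizes are assumptions chosen for illustration). Each update multiplies $x$ by $(1 - 2\alpha)$, so a small $\alpha$ shrinks $x$ smoothly, while a large $\alpha$ overshoots the minimum and flips the sign on every step even though the magnitude still decreases.

```python
def descend(alpha, steps=5, x0=1.0):
    """Gradient descent on f(x) = x**2, whose gradient is 2*x."""
    xs = [x0]
    for _ in range(steps):
        grad = 2.0 * xs[-1]
        xs.append(xs[-1] - alpha * grad)
    return xs

small = descend(0.1)  # factor 0.8 per step: monotone decrease
large = descend(0.9)  # factor -0.8 per step: oscillates around 0 while shrinking

print(small)
print(large)
```

With $\alpha > 1$ the factor exceeds 1 in magnitude and this toy iteration diverges; how large a rate the real model tolerates depends on the data.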
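predict picture
- Pick a model with relatively high test accuracy, models['0.3']
def predict_my_image(fname, model):
    # Load with PIL: ndimage.imread and scipy.misc.imresize are no longer available in recent SciPy releases
    image = Image.open(fname).convert('RGB')
    # Resize to 64x64, flatten to a (12288, 1) column and scale to [0, 1] like the training data
    input_x = np.array(image.resize((64, 64))).reshape((1, 64*64*3)).T / 255.0
    pred_label = int(predict(model['w'], model['b'], input_x)[0, 0])
    plt.imshow(image)
    print('y = ' + str(pred_label) + ',model predict is a ' + classes[pred_label].decode('utf-8'))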
predict_my_image('images/my_image2.jpg',models['0.3'])
y = 1,model predict is a cat
predict_my_image('images/my_image.jpg', models['0.3'])
y = 1,model predict is a non-cat
predict_my_image('images/cat_in_iran.jpg',models['0.3'])
y = 1,model predict is a cat