Logistic Regression Model: A Python Implementation

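Table of contents

  • Logistic Regression Model: A Python Implementation
    • Note
    • 0. Theory
      • The logistic regression model
      • Formulas
    • 1. Import the required packages
      • lr_utils.py
    • 2. Load the dataset
      • Check the dataset shapes
      • View an image (optional)
      • Inspect one image's data (optional)
      • Reshape the dataset
      • Normalize the data
    • 3. Build the model
      • Create the basic functions
      • Test the basic functions (optional)
      • Create the optimization function
      • Test the optimization function (optional)
      • Create the prediction function
      • Test the prediction function (optional)
      • Assemble the model
    • 4. Test the model
      • View predictions on test samples (optional)
      • Plot the cost curve (optional)
      • Choosing the learning rate (optional)
      • Predict with your own image (optional)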

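Note:

The dataset and a detailed explanation can be found in Week 2 of Course 1 of Andrew Ng's Deep Learning specialization; this post is a summary of that programming assignment.
GitHub materials: https://github.com/TangZhaoXiang/deeplearning.ai.git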

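0. Theory

In this post we build a logistic regression model and train it on a small dataset from the neural-network course to predict whether an image shows a cat.
As the figure illustrates, logistic regression is really a very simple neural network.

Both logistic regression and neural networks use activation functions. The difference is that a neural network adds hidden layers, whereas logistic regression has only an input layer and an output layer. Generally, the more hidden layers a model has, the stronger its capacity to learn abstract representations; networks with more than two hidden layers are referred to here as deep neural networks (DNNs). We will not go further into this; the original course is recommended for a careful understanding, since this post focuses on practice.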

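The logistic regression model


In brief: the inputs are the dataset x, the weights w and the bias b. First compute Z (analogous to the prediction of a linear regression), then apply the σ function so that the prediction $\hat{y}$ lies between 0 and 1, because y takes only two values, 0 and 1. The loss function L measures the gap between the prediction and the true label. Averaging the loss over all samples gives the model's cost J. The optimization goal of logistic regression is to minimize J, and the method used is gradient descent.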

Formulas

Viewing logistic regression as a neural network, for a single sample $x^{(i)}$:
Forward propagation
$$z^{(i)} = w^T x^{(i)} + b \tag{1}$$
$$\hat{y}^{(i)} = a^{(i)} = \mathrm{sigmoid}(z^{(i)}) \tag{2}$$
$$\mathcal{L}(a^{(i)}, y^{(i)}) = -y^{(i)} \log(a^{(i)}) - (1-y^{(i)}) \log(1-a^{(i)}) \tag{3}$$

The cost is the sum of all the losses divided by the number of samples m:
$$J = \frac{1}{m} \sum_{i=1}^m \mathcal{L}(a^{(i)}, y^{(i)}) \tag{6}$$
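As a quick illustration of equations (1) to (3), the forward pass for one sample can be traced by hand. This is only a sketch: the values of w, b, x and y below are made-up toy numbers, not taken from the cat dataset.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy values, chosen only for illustration.
w = np.array([[0.5], [-0.25]])   # weights, shape (2, 1)
b = 0.1                          # bias
x = np.array([[2.0], [4.0]])     # one sample, shape (2, 1)
y = 1                            # its true label

z = (w.T @ x).item() + b                          # equation (1): 0.5*2 - 0.25*4 + 0.1
a = sigmoid(z)                                    # equation (2): prediction in (0, 1)
loss = -y * np.log(a) - (1 - y) * np.log(1 - a)   # equation (3): cross-entropy loss

print("z =", z, " a =", a, " loss =", loss)
```

Because y = 1 here, only the first term of the loss is active, so the loss reduces to -log(a): the closer a gets to 1, the smaller the loss.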

Backward propagation:

$$\frac{\partial J}{\partial w} = \frac{1}{m} X (A-Y)^T \tag{7}$$
$$\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^m (a^{(i)}-y^{(i)}) \tag{8}$$
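One way to convince yourself that equations (7) and (8) are the gradients of the cost J is a numerical gradient check. The sketch below is not part of the original assignment; the random toy data, the helper name cost_fn and the step size eps are all assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cost_fn(w, b, X, Y):
    """Cost J from equation (6), written as a mean over samples."""
    A = sigmoid(w.T @ X + b)
    return float(-np.mean(Y * np.log(A) + (1 - Y) * np.log(1 - A)))

# Hypothetical toy problem: 3 features, 5 samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 5))
Y = (rng.random((1, 5)) > 0.5).astype(float)
w = rng.normal(size=(3, 1)) * 0.1
b = 0.05
m = X.shape[1]

# Analytic gradients, equations (7) and (8).
A = sigmoid(w.T @ X + b)
dw = X @ (A - Y).T / m
db = float(np.sum(A - Y) / m)

# Central-difference approximations of the same gradients.
eps = 1e-6
dw_num = np.zeros_like(w)
for j in range(w.shape[0]):
    step = np.zeros_like(w)
    step[j, 0] = eps
    dw_num[j, 0] = (cost_fn(w + step, b, X, Y) - cost_fn(w - step, b, X, Y)) / (2 * eps)
db_num = (cost_fn(w, b + eps, X, Y) - cost_fn(w, b - eps, X, Y)) / (2 * eps)

max_dw_err = float(np.max(np.abs(dw - dw_num)))
db_err = abs(db - db_num)
print("max |dw - dw_num| =", max_dw_err)
print("|db - db_num|     =", db_err)
```

Both differences should be tiny (limited only by floating-point error), which is what you expect when the analytic formulas match the cost function.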

For the whole dataset X:
Vectorized form

  • $Z = w^T X + b$, in NumPy: Z = np.dot(w.T, X) + b
  • $A = \sigma(Z)$
  • J = np.sum((Y * np.log(A) + (1 - Y) * np.log(1 - A)), axis=1) / (-m)
  • dw = (np.dot(X, (A - Y).T)) / m
  • db = np.sum(A - Y) / m
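To make the link between the per-sample formulas and the vectorized expressions concrete, the following sketch computes the cost and gradients twice, once with an explicit loop over samples and once with the vectorized NumPy expressions listed above, and checks that they agree. The random toy data is an assumption for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy problem: n features, m samples.
rng = np.random.default_rng(1)
n, m = 4, 6
X = rng.normal(size=(n, m))
Y = (rng.random((1, m)) > 0.5).astype(float)
w = rng.normal(size=(n, 1)) * 0.1
b = -0.2

# Explicit loop over samples, following equations (1)-(3) and (6)-(8).
J_loop, dw_loop, db_loop = 0.0, np.zeros((n, 1)), 0.0
for i in range(m):
    x_i = X[:, i:i + 1]                      # column vector for sample i
    y_i = Y[0, i]
    a_i = sigmoid((w.T @ x_i).item() + b)
    J_loop += -y_i * np.log(a_i) - (1 - y_i) * np.log(1 - a_i)
    dw_loop += x_i * (a_i - y_i)
    db_loop += a_i - y_i
J_loop, dw_loop, db_loop = J_loop / m, dw_loop / m, db_loop / m

# Vectorized form: no Python loop over samples.
A = sigmoid(np.dot(w.T, X) + b)
J_vec = float(np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / (-m))
dw_vec = np.dot(X, (A - Y).T) / m
db_vec = float(np.sum(A - Y) / m)

print(np.isclose(J_loop, J_vec), np.allclose(dw_loop, dw_vec), np.isclose(db_loop, db_vec))
```

The two versions compute the same quantities; the vectorized one simply hands the loop over to NumPy, which matters once m and n grow to the sizes used later in this post.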

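1. Import the required packages

import numpy as np
import matplotlib.pyplot as plt   # plotting library
import h5py                       # read HDF5 (.h5) files
import scipy                      # SciPy (image handling in the original assignment)
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset   # load_dataset() is defined in lr_utils.py

# Jupyter/IPython magic: render matplotlib figures inline in the notebook.
# Remove this line when running the code as a plain Python script.
%matplotlib inline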

lr_utils.py

import numpy as np
import h5py


def load_dataset():
    # Read the training set
    train_dataset = h5py.File('datasets/train_catvnoncat.h5', "r")
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # your train set features
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # your train set labels
    # print(train_set_x_orig.shape, train_set_y_orig.shape)
    # (209, 64, 64, 3) (209,): 209 images of 64*64*3, where 3 is the RGB channels

    # Read the test set
    test_dataset = h5py.File('datasets/test_catvnoncat.h5', "r")
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # your test set features
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # your test set labels
    # print(test_set_x_orig.shape, test_set_y_orig.shape)
    # (50, 64, 64, 3) (50,): 50 images

    classes = np.array(test_dataset["list_classes"][:])  # the list of classes
    # print(classes, classes.shape)
    # [b'non-cat' b'cat'] (2,): only two classes

    # Reshape the 1-D label arrays into row vectors
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))
    # print(train_set_y_orig.shape, test_set_y_orig.shape)
    # (1, 209) (1, 50)

    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

2. Load the dataset

train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()

Check the dataset shapes

### START CODE HERE ### (≈ 3 lines of code)
m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]
### END CODE HERE ###

print ("Number of training examples: m_train = " + str(m_train))
print ("Number of testing examples: m_test = " + str(m_test))
print ("Height/Width of each image: num_px = " + str(num_px))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_set_x shape: " + str(train_set_x_orig.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x shape: " + str(test_set_x_orig.shape))
print ("test_set_y shape: " + str(test_set_y.shape))

Number of training examples: m_train = 209
Number of testing examples: m_test = 50
Height/Width of each image: num_px = 64
Each image is of size: (64, 64, 3)
train_set_x shape: (209, 64, 64, 3)
train_set_y shape: (1, 209)
test_set_x shape: (50, 64, 64, 3)
test_set_y shape: (1, 50)

View an image (optional)

index = 1  # change index to view a different image
plt.imshow(train_set_x_orig[index])
print ("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") +  "' picture.")

Inspect one image's data (optional)

# Print the raw data of one image
print(train_set_x_orig[index],train_set_x_orig[index].shape)

[[[196 192 190]
[193 186 182]
[188 179 174]

[ 90 142 200]
[ 90 142 201]
[ 90 142 201]]

[[ 45 43 39]
[ 61 59 54]
[ 81 78 74]

[ 83 82 81]
[ 84 82 82]
[ 82 80 81]]] (64, 64, 3)

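Explanation:
The data of one image is shown above. In the printed output, the first column holds the R channel values, the second the G channel and the third the B channel, so [196 192 190] is the RGB value of a single pixel. The block
[[196 192 190]
[193 186 182]

[ 90 142 201]
[ 90 142 201]]
is the first row of pixels in the image, because the first axis of the array indexes rows. 64 such rows make up the whole image, and plt.imshow() renders the array as a picture.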

Reshape the dataset

Flatten each image so that (209, 64, 64, 3) becomes (12288, 209), with one image per column.

### START CODE HERE ### (≈ 2 lines of code)
train_set_x_flatten = train_set_x_orig.reshape(m_train,num_px * num_px * 3).T  # each image becomes one column
test_set_x_flatten = test_set_x_orig.reshape(m_test,num_px * num_px * 3).T
### END CODE HERE ###

print ("train_set_x_flatten shape: " + str(train_set_x_flatten.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x_flatten shape: " + str(test_set_x_flatten.shape))
print ("test_set_y shape: " + str(test_set_y.shape))
print ("sanity check after reshaping: " + str(train_set_x_flatten[0:5,0]))

train_set_x_flatten shape: (12288, 209)
train_set_y shape: (1, 209)
test_set_x_flatten shape: (12288, 50)
test_set_y shape: (1, 50)
sanity check after reshaping: [17 31 56 22 33]
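The order of operations in the flattening step matters. The sketch below uses a made-up array of two tiny 2x2 "images" (not the cat dataset) to show that reshaping to (m, -1) and then transposing keeps each image in its own column, whereas reshaping directly to (-1, m) yields the same shape but mixes pixels from different images.

```python
import numpy as np

# Hypothetical toy data: 2 images, each 2x2 pixels with 3 channels.
imgs = np.arange(2 * 2 * 2 * 3).reshape(2, 2, 2, 3)
m = imgs.shape[0]

good = imgs.reshape(m, -1).T   # shape (12, 2): column i holds image i
bad = imgs.reshape(-1, m)      # shape (12, 2) as well, but the columns mix both images

print(good.shape, bad.shape)
print(np.array_equal(good[:, 0], imgs[0].ravel()))  # True
print(np.array_equal(bad[:, 0], imgs[0].ravel()))   # False
```

This is why the code above reshapes to (m_train, num_px * num_px * 3) first and only then applies .T.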

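Normalize the data

Pixel values range from 0 to 255, so dividing by 255 scales every feature into [0, 1].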

train_set_x = train_set_x_flatten / 255.
test_set_x = test_set_x_flatten / 255.
print(train_set_x,test_set_x)

3. Build the model

Create the basic functions

# GRADED FUNCTION: sigmoid
def sigmoid(z):
    s = 1 / (1 + np.exp(-z))
    return s


# GRADED FUNCTION: initialize_with_zeros
def initialize_with_zeros(dim):
    w = np.zeros((dim, 1))
    b = 0

    assert(w.shape == (dim, 1))
    assert(isinstance(b, float) or isinstance(b, int))

    return w, b


# GRADED FUNCTION: propagate
def propagate(w, b, X, Y):
    m = X.shape[1]

    # FORWARD PROPAGATION (FROM X TO COST)
    A = sigmoid(np.dot(w.T, X) + b)
    cost = np.sum((Y * np.log(A) + (1 - Y) * np.log(1 - A)), axis=1) / (-m)

    # BACKWARD PROPAGATION (TO FIND GRAD)
    dw = (np.dot(X, (A - Y).T)) / m
    db = np.sum(A - Y) / m
    # print(dw, db)

    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())

    grads = {"dw": dw,
             "db": db}

    return grads, cost

Test the basic functions (optional)

print ("sigmoid([0, 2]) = " + str(sigmoid(np.array([0,2]))))

sigmoid([0, 2]) = [0.5 0.88079708]

dim = 2
w, b = initialize_with_zeros(dim)
print ("w = " + str(w))
print ("b = " + str(b))

w = [[0.]
[0.]]
b = 0

w, b, X, Y = np.array([[1],[2]]), 2, np.array([[1,2],[3,4]]), np.array([[1,0]])
grads, cost = propagate(w, b, X, Y)
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))
print ("cost = " + str(cost))

dw = [[0.99993216]
[1.99980262]]
db = 0.49993523062470574
cost = 6.000064773192205

Create the optimization function

# GRADED FUNCTION: optimize
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):
    costs = []

    for i in range(num_iterations):
        # Cost and gradient calculation (≈ 1-4 lines of code)
        grads, cost = propagate(w, b, X, Y)

        # Retrieve derivatives from grads
        dw = grads["dw"]
        db = grads["db"]

        # update rule (≈ 2 lines of code)
        w = w - learning_rate * dw
        b = b - learning_rate * db

        # Record the costs
        if i % 100 == 0:
            costs.append(float(cost))

        # Print the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" %(i, cost))

    params = {"w": w,
              "b": b}

    grads = {"dw": dw,
             "db": db}

    return params, grads, costs

Test the optimization function (optional)

params, grads, costs = optimize(w, b, X, Y, num_iterations= 100, learning_rate = 0.009, print_cost = False)

print ("w = " + str(params["w"]))
print ("b = " + str(params["b"]))
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))
print (np.squeeze(costs))

w = [[0.1124579 ]
[0.23106775]]
b = 1.5593049248448891
dw = [[0.90158428]
[1.76250842]]
db = 0.4304620716786828
6.000064773192205

Create the prediction function

# GRADED FUNCTION: predict
def predict(w, b, X):
    m = X.shape[1]
    Y_prediction = np.zeros((1,m))
    w = w.reshape(X.shape[0], 1)
    # print(w)

    # Compute vector "A" predicting the probabilities of a cat being present in the picture
    A = sigmoid(np.dot(w.T, X) + b)
    # print(A)

    for i in range(A.shape[1]):
        # Convert probabilities A[0,i] to actual predictions p[0,i]
        if (A[0][i] > 0.5):
            Y_prediction[0][i] = 1
        else:
            Y_prediction[0][i] = 0

    assert(Y_prediction.shape == (1, m))

    return Y_prediction

Test the prediction function (optional)

print ("predictions = " + str(predict(w, b, X)))

predictions = [[1. 1.]]
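The thresholding loop inside predict can also be written without a Python loop. The sketch below is one possible alternative, not the assignment's reference solution; predict_vectorized is a hypothetical name, and it reuses the same toy w, b and X as the test above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_vectorized(w, b, X):
    """Hypothetical loop-free variant of predict(): threshold all probabilities at once."""
    w = w.reshape(X.shape[0], 1)
    A = sigmoid(np.dot(w.T, X) + b)   # probabilities, shape (1, m)
    return (A > 0.5).astype(float)    # boolean mask converted to 0.0 / 1.0

# Same toy values as in the test above.
w = np.array([[1], [2]])
b = 2
X = np.array([[1, 2], [3, 4]])
print("predictions = " + str(predict_vectorized(w, b, X)))
```

On these inputs it yields the same predictions as the loop version, since the comparison A > 0.5 is applied element-wise to the whole row.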

Assemble the model

# GRADED FUNCTION: model
def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
    # initialize parameters with zeros (≈ 1 line of code)
    dim = X_train.shape[0]
    w, b = initialize_with_zeros(dim)

    # Gradient descent (≈ 1 line of code)
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)

    # Retrieve parameters w and b from dictionary "parameters"
    w = parameters["w"]
    b = parameters["b"]

    # Predict test/train set examples (≈ 2 lines of code)
    Y_prediction_train = predict(w, b, X_train)
    Y_prediction_test = predict(w, b, X_test)

    # Print train/test accuracy
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))

    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train" : Y_prediction_train,
         "w" : w,
         "b" : b,
         "learning_rate" : learning_rate,
         "num_iterations": num_iterations}

    return d

4. Test the model

d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = True)

Cost after iteration 0: 0.693147
Cost after iteration 100: 0.584508
Cost after iteration 200: 0.466949
Cost after iteration 300: 0.376007
Cost after iteration 400: 0.331463
Cost after iteration 500: 0.303273
Cost after iteration 600: 0.279880
Cost after iteration 700: 0.260042
Cost after iteration 800: 0.242941
Cost after iteration 900: 0.228004
Cost after iteration 1000: 0.214820
Cost after iteration 1100: 0.203078
Cost after iteration 1200: 0.192544
Cost after iteration 1300: 0.183033
Cost after iteration 1400: 0.174399
Cost after iteration 1500: 0.166521
Cost after iteration 1600: 0.159305
Cost after iteration 1700: 0.152667
Cost after iteration 1800: 0.146542
Cost after iteration 1900: 0.140872
train accuracy: 99.04306220095694 %
test accuracy: 70.0 %
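Note:
The results above show that the model is already overfitting: training accuracy is about 99% while test accuracy is only 70%. Increasing the number of iterations overfits further. For example, with 3000 iterations the train accuracy rises while the test accuracy drops to 68.0%, so regularization is worth considering to reduce overfitting.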
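The post does not implement regularization, but as a rough sketch of what it could look like, the variant of propagate below adds an L2 penalty on w to the cost and the matching term to dw. The function name propagate_l2 and the strength lambd are assumptions, not part of the original assignment, and no claim is made here about how much this would change the accuracies above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate_l2(w, b, X, Y, lambd=0.1):
    """Hypothetical variant of propagate() with an L2 penalty on w.

    lambd is an assumed regularization strength, not a tuned value.
    The bias b is left unregularized, which is a common convention.
    """
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)
    cross_entropy = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m
    cost = cross_entropy + (lambd / (2 * m)) * np.sum(np.square(w))
    dw = np.dot(X, (A - Y).T) / m + (lambd / m) * w   # extra term pulls w toward zero
    db = np.sum(A - Y) / m
    return {"dw": dw, "db": db}, float(cost)

# Same toy values as in the propagate test earlier in the post.
w, b = np.array([[1.0], [2.0]]), 2.0
X, Y = np.array([[1.0, 2.0], [3.0, 4.0]]), np.array([[1.0, 0.0]])

grads_plain, cost_plain = propagate_l2(w, b, X, Y, lambd=0.0)   # lambd = 0 recovers propagate()
grads_reg, cost_reg = propagate_l2(w, b, X, Y, lambd=0.1)
print(cost_plain, cost_reg)
print(grads_reg["dw"] - grads_plain["dw"])
```

With lambd = 0 the function reduces to the unregularized propagate; with lambd > 0 each gradient step additionally shrinks w, which limits how closely the model can fit the training set.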

View predictions on test samples (optional)

# Example of a picture that was wrongly classified.
index = 5   # change index to view other samples
plt.imshow(test_set_x[:,index].reshape((num_px, num_px, 3)))
print ("y = " + str(test_set_y[0,index]) + ", you predicted that it is a \"" + classes[int(d["Y_prediction_test"][0,index])].decode("utf-8") +  "\" picture.")

Plot the cost curve (optional)

# Plot learning curve (with costs)
costs = np.squeeze(d['costs'])  # remove any singleton dimensions
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()

Choosing the learning rate (optional)

learning_rates = [0.01, 0.001, 0.0001]
models = {}
for i in learning_rates:
    print ("learning rate is: " + str(i))
    models[str(i)] = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 1500, learning_rate = i, print_cost = False)
    print ('\n' + "-------------------------------------------------------" + '\n')

for i in learning_rates:
    plt.plot(np.squeeze(models[str(i)]["costs"]), label= str(models[str(i)]["learning_rate"]))

plt.ylabel('cost')
plt.xlabel('iterations (hundreds)')

legend = plt.legend(loc='upper center', shadow=True)
frame = legend.get_frame()
frame.set_facecolor('0.90')
plt.show()

Predict with your own image (optional)

## START CODE HERE ## (PUT YOUR IMAGE NAME)
my_image = "cat_in_iran.jpg"   # change this to the name of your image file
## END CODE HERE ##

# We preprocess the image to fit your algorithm.
# ndimage.imread and scipy.misc.imresize have been removed from recent SciPy releases,
# so Pillow (imported above as Image) is used here instead.
fname = "images/" + my_image
image = Image.open(fname).convert("RGB")
resized = np.array(image.resize((num_px, num_px)))
# Scale to [0, 1] like the training data, then flatten to a (num_px*num_px*3, 1) column.
my_image = (resized / 255.).reshape((1, num_px*num_px*3)).T
my_predicted_image = predict(d["w"], d["b"], my_image)

plt.imshow(image)
print("y = " + str(np.squeeze(my_predicted_image)) + ", your algorithm predicts a \"" + classes[int(np.squeeze(my_predicted_image))].decode("utf-8") +  "\" picture.")
