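Part 2: Logistic Regression with a Neural Network mindset

You will learn to:
- Build the general architecture of a learning algorithm
- Initialize parameters
- Calculate the cost function and its gradient
- Use an optimization algorithm (gradient descent)
- Gather the three functions above into a main model function, in the right order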

1 - Packages

First, let’s run the cell below to import all the packages that you will need during this assignment.
- numpy is the fundamental package for scientific computing with Python.
- h5py is a common package to interact with a dataset that is stored on an H5 file.
- matplotlib is a famous library to plot graphs in Python.
- PIL and scipy are used here to test your model with your own picture at the end.

import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
#from lr_utils import load_dataset
%matplotlib inline

2 - Overview of the Problem set

Problem Statement: You are given a dataset (“data.h5”) containing:
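- a training set of m_train images labeled as cat (y=1) or non-cat (y=0)
- a test set of m_test images labeled as cat or non-cat
- each image is of shape (num_px, num_px, 3), where 3 is for the three RGB channels. Thus, each image is square (height = num_px) and (width = num_px).
You will build a simple image-classification algorithm that can correctly classify pictures as cat or non-cat.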

Let’s get more familiar with the dataset. Load the data by running the following code.
The following code is from the blogger Bin Weber.

import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
#from lr_utils import load_dataset
# %matplotlib inline is a Jupyter notebook command: it renders matplotlib
# figures inside the page instead of opening a separate window.

def load_dataset():
    train_dataset = h5py.File("train_catvnoncat.h5", "r")  # training data, 209 images
    test_dataset = h5py.File("test_catvnoncat.h5", "r")  # test data, 50 images
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # raw training images (209, 64, 64, 3)
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # raw training labels (y=0 non-cat, y=1 cat), 209 entries
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # raw test images (50, 64, 64, 3)
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # raw test labels (y=0 non-cat, y=1 cat), 50 entries
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))  # training labels reshaped to (1, 209)
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))  # test labels reshaped to (1, 50)
    classes = np.array(test_dataset["list_classes"][:])
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

def image_show(index, dataset):
    # Uses the module-level arrays returned by load_dataset() below.
    if dataset == "train":
        plt.imshow(train_set_x_orig[index])
        print ("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") + "' picture.")
    elif dataset == "test":
        plt.imshow(test_set_x_orig[index])
        print ("y = " + str(test_set_y[:, index]) + ", it's a '" + classes[np.squeeze(test_set_y[:, index])].decode("utf-8") + "' picture.")

train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
image_show(10, "test")

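Many bugs in deep learning come from matrix/vector dimensions that do not fit. If you keep your matrix/vector dimensions straight, you will avoid most of them.

Exercise: Find the values for: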
- m_train (number of training examples)
- m_test (number of test examples)
- num_px (= height = width of a training image)
Remember that train_set_x_orig is a numpy-array of shape (m_train, num_px, num_px, 3). For instance, you can access m_train by writing train_set_x_orig.shape[0].

### START CODE HERE ### (≈ 3 lines of code)
m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]
### END CODE HERE ###

print ("Number of training examples: m_train = " + str(m_train))
print ("Number of testing examples: m_test = " + str(m_test))
print ("Height/Width of each image: num_px = " + str(num_px))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_set_x shape: " + str(train_set_x_orig.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x shape: " + str(test_set_x_orig.shape))
print ("test_set_y shape: " + str(test_set_y.shape))

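For convenience, you should now reshape each image of shape (num_px, num_px, 3) into a numpy array of shape (num_px ∗ num_px ∗ 3, 1). After this, the training (and test) set is a numpy array in which each column represents one flattened image. The training array should have m_train columns, and the test array m_test columns.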

Exercise: Reshape the training and test data sets so that images of size (num_px, num_px, 3) are flattened into single vectors of shape (num_px ∗ num_px ∗ 3, 1).

### START CODE HERE ### (≈ 2 lines of code)
# Flatten each image into a row, then transpose so that rows become columns.
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
# Each column now holds the pixel values of one image.
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T
### END CODE HERE ###

print ("train_set_x_flatten shape: " + str(train_set_x_flatten.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x_flatten shape: " + str(test_set_x_flatten.shape))
print ("test_set_y shape: " + str(test_set_y.shape))
print ("sanity check after reshaping: " + str(train_set_x_flatten[0:5,0]))
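The reshape-then-transpose step can be checked on a tiny array. The sketch below uses a made-up batch of two 2×2 RGB "images" (the names `images`, `flat`, and `wrong` are illustrative, not part of the assignment) to show that each column of the result holds exactly one image, and that reshaping straight to the target shape does not.

```python
import numpy as np

# Hypothetical toy batch: 2 "images" of shape (2, 2, 3), filled with 0..23.
m, num_px = 2, 2
images = np.arange(m * num_px * num_px * 3).reshape(m, num_px, num_px, 3)

# Flatten every image into a row, then transpose so each image is a column.
flat = images.reshape(images.shape[0], -1).T
print(flat.shape)  # (12, 2)

# Column i holds exactly the pixels of image i, in row-major order.
print(np.array_equal(flat[:, 0], images[0].ravel()))  # True
print(np.array_equal(flat[:, 1], images[1].ravel()))  # True

# A common mistake: reshaping directly to (num_px*num_px*3, m) gives the
# right shape but interleaves pixels from different images in each column.
wrong = images.reshape(-1, images.shape[0])
print(wrong.shape)  # (12, 2)
print(np.array_equal(wrong[:, 0], images[0].ravel()))  # False
```

Checking the shape alone would not catch the second case, which is why the sanity check prints actual pixel values.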

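To represent a color image, the red, green, and blue (RGB) channels must be specified for each pixel, so a pixel value is actually a vector of three numbers, each ranging from 0 to 255.

A common preprocessing step in machine learning is to center and standardize the dataset: you subtract the mean of the whole numpy array from each example, and then divide each example by the standard deviation of the whole numpy array. For picture datasets it is simpler and more convenient to just divide every row of the dataset by 255 (the maximum value of a pixel channel).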
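The two preprocessing options can be compared on synthetic data. This is a minimal sketch in which `x_flatten` is a made-up stand-in for the flattened training set, not the real cat dataset.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical stand-in for a flattened image set: 12 pixel values x 5 images.
x_flatten = rng.integers(0, 256, size=(12, 5), dtype=np.uint8)

# Option used in this assignment: scale pixel values into [0, 1].
x_scaled = x_flatten / 255.0

# General alternative: center and standardize with whole-array statistics.
x_float = x_flatten.astype(np.float64)
x_standardized = (x_float - x_float.mean()) / x_float.std()

print(x_scaled.min(), x_scaled.max())               # both within [0, 1]
print(x_standardized.mean(), x_standardized.std())  # approximately 0 and 1
```

Dividing by 255 needs no statistics from the data, so the same operation can be applied to the test set without any bookkeeping.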

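What you need to remember:
Common steps for preprocessing a new dataset are:
- Figure out the dimensions and shapes of the problem (m_train, m_test, num_px, ...)
- Reshape the datasets so that each example is a vector of size (num_px ∗ num_px ∗ 3, 1)
- Standardize the data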

3 - General Architecture of the learning algorithm

It’s time to design a simple algorithm to distinguish cat images from non-cat images.
You will build a Logistic Regression, using a Neural Network mindset.

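Key steps:
- Initialize the parameters of the model
- Learn the parameters by minimizing the cost
- Use the learned parameters to make predictions
- Analyse the results and conclude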

4 - Building the parts of our algorithm

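The main steps for building a neural network are:
1. Define the model structure (such as the number of input features)
2. Initialize the model's parameters
3. Loop:
- Calculate the current loss (forward propagation)
- Calculate the current gradient (backward propagation)
- Update the parameters (gradient descent)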

4.1 - Helper functions

Exercise: Using your code from "Python Basics", implement sigmoid().

def sigmoid(z):
    s = 1 / (1 + np.exp(-z))
    return s

print ("sigmoid([0, 2]) = " + str(sigmoid(np.array([0,2]))))
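The direct formula is fine for this assignment, but `np.exp(-z)` overflows for very negative `z` and NumPy then emits a runtime warning. The following is one possible numerically stable variant, offered as a sketch; the name `stable_sigmoid` is an assumption and is not part of the assignment.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that never exponentiates a large positive number.

    Accepts a scalar or array and returns an array with at least one dimension.
    """
    z = np.atleast_1d(np.asarray(z, dtype=float))
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so 1 / (1 + exp(-z)) is safe.
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, use the equivalent form exp(z) / (1 + exp(z)), where exp(z) < 1.
    ez = np.exp(z[~pos])
    out[~pos] = ez / (1.0 + ez)
    return out

print(stable_sigmoid(np.array([0.0, 2.0])))        # approximately [0.5, 0.88079708]
print(stable_sigmoid(np.array([-1000.0, 1000.0]))) # [0. 1.] with no overflow warning
```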

4.2 - Initializing parameters

Exercise: Implement parameter initialization. You have to initialize w as a vector of zeros.

def initialize_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.

    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)

    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """
    ### START CODE HERE ### (≈ 1 line of code)
    w, b = np.zeros((dim, 1)), 0
    ### END CODE HERE ###
    assert(w.shape == (dim, 1))
    assert(isinstance(b, float) or isinstance(b, int))
    return w, b

dim = 2
w,b=initialize_with_zeros(dim)
print("w="+str(w))
print("b="+str(b))

4.3 - Forward and Backward propagation

Now that your parameters are initialized, you can do the “forward” and “backward” propagation steps for learning the parameters.
Exercise: Implement a function propagate() that computes the cost function and its gradient.
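As a sketch of what propagate() has to compute: the forward pass produces the activations A = sigmoid(w.T X + b) and the cross-entropy cost averaged over the m examples, and the backward pass produces dw = X (A − Y).T / m and db = sum(A − Y) / m. The toy inputs below are made up for illustration, and the finite-difference check is an addition that is not part of the assignment.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def propagate(w, b, X, Y):
    """Return (grads, cost) for logistic regression.

    w: (n, 1) weights, b: scalar bias, X: (n, m) data, Y: (1, m) labels.
    """
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)                               # forward: activations, (1, m)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m   # cross-entropy cost
    dw = np.dot(X, (A - Y).T) / m                                 # backward: dJ/dw, (n, 1)
    db = np.sum(A - Y) / m                                        # backward: dJ/db, scalar
    return {"dw": dw, "db": db}, float(cost)

# Toy inputs, chosen only for illustration.
w = np.array([[0.1], [-0.2]])
b = 0.3
X = np.array([[1.0, 2.0, -1.0], [3.0, 0.5, -2.0]])
Y = np.array([[1.0, 0.0, 1.0]])

grads, cost = propagate(w, b, X, Y)

# Central finite differences as an independent check of dw[0] and db.
eps = 1e-6
w_plus, w_minus = w.copy(), w.copy()
w_plus[0, 0] += eps
w_minus[0, 0] -= eps
num_dw0 = (propagate(w_plus, b, X, Y)[1] - propagate(w_minus, b, X, Y)[1]) / (2 * eps)
num_db = (propagate(w, b + eps, X, Y)[1] - propagate(w, b - eps, X, Y)[1]) / (2 * eps)

print(abs(num_dw0 - grads["dw"][0, 0]) < 1e-6)  # True
print(abs(num_db - grads["db"]) < 1e-6)         # True
```

Agreement between the analytic and numerical gradients is a quick way to catch a sign error or a missing 1/m factor.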

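4.4 - Optimization

- You have initialized your parameters.
- You are also able to compute a cost function and its gradient.
- Now, you want to update the parameters using gradient descent.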

Exercise: Write down the optimization function. The goal is to learn w and b by minimizing the cost function J. For a parameter θ, the update rule is θ=θ−α dθ, where α is the learning rate.

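def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    costs = []
    for i in range(num_iterations):
        # Cost and gradients at the current parameters
        grads, cost = propagate(w, b, X, Y)
        dw = grads["dw"]
        db = grads["db"]
        # Gradient descent update
        w = w - learning_rate * dw
        b = b - learning_rate * db
        # Record (and optionally print) the cost every 100 iterations
        if i % 100 == 0:
            costs.append(cost)
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))
    params = {"w": w, "b": b}
    grads = {"dw": dw, "db": db}
    return params, grads, costs

# Assumes propagate() (listed in section 5) and test inputs w, b, X, Y are already defined.
params, grads, costs = optimize(w, b, X, Y, num_iterations=100, learning_rate=0.009, print_cost=False)
print ("w = " + str(params["w"]))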
print ("b = " + str(params["b"]))
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))
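The update rule θ = θ − α dθ can be seen in isolation on a one-parameter problem. The sketch below minimizes a made-up cost J(θ) = (θ − 3)², whose gradient is 2(θ − 3); the helper `gradient_descent` is hypothetical and only mirrors the loop inside optimize().

```python
def gradient_descent(theta, grad_fn, learning_rate, num_iterations):
    """Repeatedly apply theta = theta - learning_rate * d_theta."""
    for _ in range(num_iterations):
        d_theta = grad_fn(theta)
        theta = theta - learning_rate * d_theta
    return theta

# Toy cost J(theta) = (theta - 3)^2, with gradient 2 * (theta - 3).
theta_final = gradient_descent(
    theta=0.0,
    grad_fn=lambda t: 2.0 * (t - 3.0),
    learning_rate=0.1,
    num_iterations=100,
)
print(theta_final)  # very close to the minimizer 3
```

With this learning rate the distance to the minimum shrinks by a factor of 0.8 per step. On this particular cost, a learning rate above 1.0 would make the iterates diverge instead.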

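Exercise: The previous function, optimize(), outputs the learned w and b. We can use w and b to predict the labels for a dataset X. Implement the predict() function. There are two steps to computing predictions:
1. Calculate Yhat = A = sigmoid(w.T X + b)
2. Convert the entries of A into 0 (if activation <= 0.5) or 1 (if activation > 0.5), and store the predictions in a vector Y_prediction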

# GRADED FUNCTION: predict

def predict(w, b, X):
    '''
    Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)

    Returns:
    Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X
    '''
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)

    # Compute vector "A" predicting the probabilities of a cat being present in the picture
    ### START CODE HERE ### (≈ 1 line of code)
    A = sigmoid(np.dot(w.T, X) + b)
    ### END CODE HERE ###

    for i in range(A.shape[1]):
        # Convert probabilities A[0,i] to actual predictions p[0,i]
        ### START CODE HERE ### (≈ 4 lines of code)
        if A[0, i] > 0.5:
            Y_prediction[0, i] = 1
        else:
            Y_prediction[0, i] = 0
        ### END CODE HERE ###

    assert(Y_prediction.shape == (1, m))
    return Y_prediction

w = np.array([[0.1124579],[0.23106775]])
b = -0.3
X = np.array([[1.,-1.1,-3.2],[1.2,2.,0.1]])
print ("predictions = " + str(predict(w, b, X)))
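The thresholding loop can also be written as a single array comparison. This is a sketch of an equivalent vectorized form; `predict_vectorized` is an illustrative name, and the inputs are the same test values used above.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_vectorized(w, b, X):
    """Return a (1, m) array of 0/1 predictions without an explicit Python loop."""
    w = w.reshape(X.shape[0], 1)
    A = sigmoid(np.dot(w.T, X) + b)   # probabilities, shape (1, m)
    return (A > 0.5).astype(float)    # True/False converted to 1.0/0.0

w = np.array([[0.1124579], [0.23106775]])
b = -0.3
X = np.array([[1.0, -1.1, -3.2], [1.2, 2.0, 0.1]])
print("predictions = " + str(predict_vectorized(w, b, X)))  # predictions = [[1. 1. 0.]]
```

The comparison `A > 0.5` applies the same rule as the loop: activations exactly equal to 0.5 map to 0.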

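What to remember:
You have implemented several functions that:
- Initialize (w, b)
- Optimize the loss iteratively to learn the parameters (w, b):
- compute the cost and its gradient
- update the parameters using gradient descent
- Use the learned (w, b) to predict the labels for a given set of examples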

5 - Merge all functions into a model

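You will now see how the overall model is structured by putting all the building blocks (the functions implemented in the previous parts) together, in the right order.
Exercise: Implement the model function. Use the following notation:
- Y_prediction_test for your predictions on the test set
- Y_prediction_train for your predictions on the train set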

# Import the required packages
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage

# Load the data
def load_dataset():
    train_dataset = h5py.File("train_catvnoncat.h5", "r")  # training data, 209 images
    test_dataset = h5py.File("test_catvnoncat.h5", "r")  # test data, 50 images
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # raw training images (209, 64, 64, 3)
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # raw training labels (y=0 non-cat, y=1 cat), 209 entries
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # raw test images (50, 64, 64, 3)
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # raw test labels (y=0 non-cat, y=1 cat), 50 entries
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))  # training labels reshaped to (1, 209)
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))  # test labels reshaped to (1, 50)
    classes = np.array(test_dataset["list_classes"][:])
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

# Display an image
def image_show(index, dataset):
    # Uses the module-level arrays returned by load_dataset() below.
    if dataset == "train":
        plt.imshow(train_set_x_orig[index])
        print ("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") + "' picture.")
    elif dataset == "test":
        plt.imshow(test_set_x_orig[index])
        print ("y = " + str(test_set_y[:, index]) + ", it's a '" + classes[np.squeeze(test_set_y[:, index])].decode("utf-8") + "' picture.")

# Sigmoid function
def sigmoid(z):
    s = 1 / (1 + np.exp(-z))
    return s

# Initialize the parameters w and b
def initialize_with_zeros(dim):
    w = np.zeros((dim, 1))  # w is a (dim, 1) matrix
    b = 0
    return w, b

# Compute Y_hat, the cost J, and the gradients dw and db
def propagate(w, b, X, Y):
    m = X.shape[1]  # number of examples
    Y_hat = sigmoid(np.dot(w.T, X) + b)
    cost = -(np.sum(np.dot(Y, np.log(Y_hat).T) + np.dot((1 - Y), np.log(1 - Y_hat).T))) / m  # cost function
    dw = (np.dot(X, (Y_hat - Y).T)) / m
    db = (np.sum(Y_hat - Y)) / m
    cost = np.squeeze(cost)  # drop singleton dimensions
    grads = {"dw": dw, "db": db}  # gradients
    return grads, cost

# Gradient descent to find the optimal parameters
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):
    # num_iterations: number of gradient descent steps; learning_rate: the step size α
    costs = []  # recorded cost values
    for i in range(num_iterations):  # gradient descent loop
        grads, cost = propagate(w, b, X, Y)
        dw = grads["dw"]
        db = grads["db"]
        w = w - learning_rate * dw
        b = b - learning_rate * db
        if i % 100 == 0:  # record the cost every 100 iterations
            costs.append(cost)
        if print_cost and i % 100 == 0:  # print the cost
            print ("Cost after iteration %i: %f" % (i, cost))
    params = {"w": w, "b": b}  # final parameter values
    grads = {"dw": dw, "db": db}  # final gradient values
    return params, grads, costs

# Predict the labels
def predict(w, b, X):
    m = X.shape[1]  # number of examples
    Y_prediction = np.zeros((1, m))  # initialize the prediction output
    w = w.reshape(X.shape[0], 1)  # make sure w is a column vector
    Y_hat = sigmoid(np.dot(w.T, X) + b)  # plug the learned parameters into the model
    for i in range(Y_hat.shape[1]):
        if Y_hat[0, i] > 0.5:
            Y_prediction[0, i] = 1
        else:
            Y_prediction[0, i] = 0
    return Y_prediction

# Build the complete prediction model
def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
    # num_iterations: number of gradient descent steps; learning_rate: the step size α
    w, b = initialize_with_zeros(X_train.shape[0])  # initialize w and b
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)  # learn the parameters by gradient descent
    w = parameters["w"]
    b = parameters["b"]
    Y_prediction_train = predict(w, b, X_train)  # predictions on the training set
    Y_prediction_test = predict(w, b, X_test)  # predictions on the test set
    train_accuracy = 100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100  # training accuracy
    test_accuracy = 100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100  # test accuracy
    print("train accuracy: {} %".format(train_accuracy))
    print("test accuracy: {} %".format(test_accuracy))
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}
    return d

train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
m_train = train_set_x_orig.shape[0]  # number of training examples
m_test = test_set_x_orig.shape[0]  # number of test examples
num_px = test_set_x_orig.shape[1]  # height and width of each image in pixels
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T  # flattened training set (12288, 209)
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T  # flattened test set (12288, 50)
train_set_x = train_set_x_flatten / 255.  # scale the training set to [0, 1]
test_set_x = test_set_x_flatten / 255.  # scale the test set to [0, 1]
d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = True)

# Plot the learning curve
costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()

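Comment: The training accuracy is close to 100%. This is a good sanity check: your model has enough capacity to fit the training data. The test accuracy is 70%, which is actually not bad for this simple model, given the small dataset and the fact that logistic regression is a linear classifier.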
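The accuracy expression used in model(), `100 - np.mean(np.abs(Y_prediction - Y)) * 100`, works because predictions and labels are both 0 or 1. A small sketch with made-up predictions and labels shows that it matches the fraction of correct predictions.

```python
import numpy as np

# Hypothetical predictions and labels, shape (1, m), values in {0, 1}.
Y_prediction = np.array([[1.0, 0.0, 1.0, 1.0, 0.0]])
Y_true = np.array([[1.0, 0.0, 0.0, 1.0, 1.0]])

# |prediction - label| is 1 for a wrong prediction and 0 for a correct one,
# so its mean is the error rate.
accuracy_via_abs = 100 - np.mean(np.abs(Y_prediction - Y_true)) * 100

# Equivalent formulation: percentage of positions where the two arrays agree.
accuracy_via_eq = np.mean(Y_prediction == Y_true) * 100

print(accuracy_via_abs, accuracy_via_eq)  # both are 60 percent (3 of 5 correct)
```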

Learning curve plotted with matplotlib (figure not shown): cost versus iterations, in hundreds.
