Part 2: Logistic Regression with a Neural Network mindset


1 - Packages

First, let’s run the cell below to import all the packages that you will need during this assignment.
- numpy is the fundamental package for scientific computing with Python.
- h5py is a common package for interacting with datasets stored in H5 files.
- matplotlib is a famous library to plot graphs in Python.
- PIL and scipy are used here to test your model with your own picture at the end.

import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
# from lr_utils import load_dataset
%matplotlib inline

2 - Overview of the Problem set

Problem Statement: You are given a dataset (“data.h5”) containing:
- a training set of m_train images labeled as cat (y=1) or non-cat (y=0)
- each image has shape (num_px, num_px, 3) and is an RGB color image. Thus, each image is square (height = num_px) and (width = num_px).

Let’s get more familiar with the dataset. Load the data by running the following code.
The following code is from the blogger Bin Weber.

# from lr_utils import load_dataset
# %matplotlib inline is a Jupyter notebook command

def load_dataset():
    train_dataset = h5py.File("train_catvnoncat.h5", "r")  # read the training data, 209 images
    test_dataset = h5py.File("test_catvnoncat.h5", "r")  # read the test data, 50 images
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # original training set (209, 64, 64, 3)
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # training labels (y=0 non-cat, y=1 cat), 209 entries
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # original test set (50, 64, 64, 3)
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # test labels (y=0 non-cat, y=1 cat), 50 entries
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))  # reshape training labels to (1, 209)
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))  # reshape test labels to (1, 50)
    classes = np.array(test_dataset["list_classes"][:])
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

def image_show(index, dataset):
    if dataset == "train":
        plt.imshow(train_set_x_orig[index])
        print ("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") + "' picture.")
    elif dataset == "test":
        plt.imshow(test_set_x_orig[index])
        print ("y = " + str(test_set_y[:, index]) + ", it's a '" + classes[np.squeeze(test_set_y[:, index])].decode("utf-8") + "' picture.")

train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()


Exercise: Find the values for:
- m_train (number of training examples)
- m_test (number of test examples)
- num_px (= height = width of a training image)
Remember that train_set_x_orig is a numpy-array of shape (m_train, num_px, num_px, 3). For instance, you can access m_train by writing train_set_x_orig.shape[0].

### START CODE HERE ### (≈ 3 lines of code)
m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]
### END CODE HERE ###

print ("Number of training examples: m_train = " + str(m_train))
print ("Number of testing examples: m_test = " + str(m_test))
print ("Height/Width of each image: num_px = " + str(num_px))
print ("Each image is of size: (" + str(num_px) + ", " + str(num_px) + ", 3)")
print ("train_set_x shape: " + str(train_set_x_orig.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x shape: " + str(test_set_x_orig.shape))
print ("test_set_y shape: " + str(test_set_y.shape))

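For convenience, you should now reshape each image of shape (num_px, num_px, 3) into a numpy array of shape (num_px ∗ num_px ∗ 3, 1). After this, the training and test sets are numpy arrays in which each column represents one flattened image. The training set should have m_train columns.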

Exercise: Reshape the training and test data sets so that images of size (num_px, num_px, 3) are flattened into single vectors of shape (num_px ∗ num_px ∗ 3, 1).

### START CODE HERE ### (≈ 2 lines of code)
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
# After the transpose, rows become columns and columns become rows
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T  # each column holds the pixels of one image
### END CODE HERE ###

print ("train_set_x_flatten shape: " + str(train_set_x_flatten.shape))
print ("train_set_y shape: " + str(train_set_y.shape))
print ("test_set_x_flatten shape: " + str(test_set_x_flatten.shape))
print ("test_set_y shape: " + str(test_set_y.shape))
print ("sanity check after reshaping: " + str(train_set_x_flatten[0:5,0]))
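The reshape-then-transpose pattern used above can be checked on a tiny array. The following is a minimal sketch with made-up data (`toy_images` is a hypothetical stand-in for train_set_x_orig); it also shows why reshaping directly to (pixels, m) is not equivalent, even though the resulting shape is the same.

```python
import numpy as np

# Toy stand-in for train_set_x_orig: 2 "images" of shape (2, 2, 3).
# The values are made up purely to make the layout visible.
toy_images = np.arange(2 * 2 * 2 * 3).reshape(2, 2, 2, 3)

# Same pattern as above: keep the example axis, flatten the rest, transpose.
toy_flatten = toy_images.reshape(toy_images.shape[0], -1).T

print(toy_flatten.shape)  # (12, 2): 12 pixel values per image, one column per image

# Column i is exactly image i unrolled in row-major order.
assert np.array_equal(toy_flatten[:, 0], toy_images[0].ravel())
assert np.array_equal(toy_flatten[:, 1], toy_images[1].ravel())

# A tempting but wrong alternative: reshaping straight to (pixels, m)
# gives the same shape but mixes pixels from different images in each column.
wrong = toy_images.reshape(-1, toy_images.shape[0])
print(np.array_equal(wrong, toy_flatten))  # False
```

Keeping the example axis first and transposing afterwards is what guarantees that each column corresponds to exactly one image.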




3 - General Architecture of the learning algorithm

It’s time to design a simple algorithm to distinguish cat images from non-cat images.
You will build a Logistic Regression, using a Neural Network mindset.
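
For reference, the model that the following sections implement can be written out as below. This is the standard logistic-regression formulation rather than something quoted from this post; the symbols $z^{(i)}$, $\sigma$ and $\mathcal{L}$ are notation assumed here, while $w$, $b$, $m$ and $J$ match the names used in the code.

```latex
% For one example x^{(i)} with label y^{(i)}:
\begin{align}
z^{(i)} &= w^{T} x^{(i)} + b \\
\hat{y}^{(i)} &= \sigma\!\left(z^{(i)}\right) = \frac{1}{1 + e^{-z^{(i)}}} \\
\mathcal{L}\!\left(\hat{y}^{(i)}, y^{(i)}\right) &= -\,y^{(i)} \log \hat{y}^{(i)} - \left(1 - y^{(i)}\right) \log\!\left(1 - \hat{y}^{(i)}\right)
\end{align}
% The cost is the average loss over all m training examples:
\begin{equation}
J = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}\!\left(\hat{y}^{(i)}, y^{(i)}\right)
\end{equation}
```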

Key Step:

4 - Building the parts of our algorithm


4.1 - Helper functions

Exercise: Using your code from "Python Basics", implement the sigmoid function.

def sigmoid(z):
    s = 1 / (1 + np.exp(-z))
    return s

print ("sigmoid([0, 2]) = " + str(sigmoid(np.array([0,2]))))

4.2 - Initializing parameters


def initialize_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.

    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)

    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """
    ### START CODE HERE ### (≈ 1 line of code)
    w, b = np.zeros((dim, 1)), 0
    ### END CODE HERE ###

    assert(w.shape == (dim, 1))
    assert(isinstance(b, float) or isinstance(b, int))

    return w, b

dim = 2

4.3 - Forward and Backward propagation

Now that your parameters are initialized, you can do the “forward” and “backward” propagation steps for learning the parameters.
Exercise: Implement a function propagate() that computes the cost function and its gradient.
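
One way to sanity-check a propagate() implementation is to compare its analytic gradients, dw = X(A − Y)ᵀ / m and db = mean(A − Y), against finite differences of the cost. The sketch below assumes the usual cross-entropy cost for logistic regression; the input values are made up for illustration and are not taken from the assignment.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def propagate(w, b, X, Y):
    """Cost and gradients of logistic regression for weights w (n, 1),
    bias b, data X (n, m) and labels Y (1, m)."""
    m = X.shape[1]
    A = sigmoid(np.dot(w.T, X) + b)                              # forward pass, shape (1, m)
    cost = -np.sum(Y * np.log(A) + (1 - Y) * np.log(1 - A)) / m  # cross-entropy cost J
    dw = np.dot(X, (A - Y).T) / m                                # dJ/dw, shape (n, 1)
    db = np.sum(A - Y) / m                                       # dJ/db, scalar
    return {"dw": dw, "db": db}, float(cost)

# Made-up inputs, chosen only so that the sigmoid does not saturate.
w = np.array([[0.1], [-0.2]])
b = 0.05
X = np.array([[1.0, -0.5, 2.0], [0.3, 1.5, -1.0]])
Y = np.array([[1, 0, 1]])

grads, cost = propagate(w, b, X, Y)

# Central finite differences as an independent check of dw and db.
eps = 1e-5
num_dw = np.zeros_like(w)
for k in range(w.shape[0]):
    step = np.zeros_like(w)
    step[k, 0] = eps
    num_dw[k, 0] = (propagate(w + step, b, X, Y)[1]
                    - propagate(w - step, b, X, Y)[1]) / (2 * eps)
num_db = (propagate(w, b + eps, X, Y)[1]
          - propagate(w, b - eps, X, Y)[1]) / (2 * eps)

print(np.allclose(grads["dw"], num_dw, atol=1e-6))
print(np.isclose(grads["db"], num_db, atol=1e-6))
```

If either check fails, the usual suspects are a missing transpose in the dw product or a forgotten division by m.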

d) Optimization


Exercise: Write down the optimization function. The goal is to learn w and b by minimizing the cost function J. For a parameter θ, the update rule is θ=θ−α dθ, where α is the learning rate.
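
To see the update rule in isolation, here is a minimal sketch on a one-parameter toy problem. The cost J(θ) = (θ − 3)², the starting point and the learning rate are all made up for illustration; only the update line itself mirrors the rule stated above.

```python
# Toy cost J(theta) = (theta - 3)^2, whose gradient is dtheta = 2 * (theta - 3).
# Both the cost and the starting point are made up for illustration.
theta = 0.0
alpha = 0.1  # learning rate

for i in range(100):
    dtheta = 2 * (theta - 3)        # gradient of J at the current theta
    theta = theta - alpha * dtheta  # the update rule from the text

print(theta)  # approaches the minimizer theta = 3
```

The optimization function for logistic regression applies this same step to w and b, with the gradients supplied by propagate().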

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost=False):
    costs = []
    for i in range(num_iterations):
        grads, cost = propagate(w, b, X, Y)
        dw = grads["dw"]
        db = grads["db"]
        w = w - learning_rate * dw
        b = b - learning_rate * db
        if i % 100 == 0:
            costs.append(cost)
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i:%f" % (i, cost))
    params = {"w": w, "b": b}
    grads = {"dw": dw, "db": db}
    return params, grads, costs

params, grads, costs = optimize(w, b, X, Y, num_iterations= 100, learning_rate = 0.009, print_cost = False)
print ("w = " + str(params["w"]))
print ("b = " + str(params["b"]))
print ("dw = " + str(grads["dw"]))
print ("db = " + str(grads["db"]))


# GRADED FUNCTION: predict

def predict(w, b, X):
    '''
    Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)

    Returns:
    Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X
    '''
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)

    # Compute vector "A" predicting the probabilities of a cat being present in the picture
    ### START CODE HERE ### (≈ 1 line of code)
    A = sigmoid(np.dot(w.T, X) + b)
    ### END CODE HERE ###

    for i in range(A.shape[1]):
        # Convert probabilities A[0,i] to actual predictions p[0,i]
        ### START CODE HERE ### (≈ 4 lines of code)
        if A[0, i] > 0.5:
            Y_prediction[0, i] = 1
        else:
            Y_prediction[0, i] = 0
        ### END CODE HERE ###

    assert(Y_prediction.shape == (1, m))

    return Y_prediction

w = np.array([[0.1124579],[0.23106775]])
b = -0.3
X = np.array([[1.,-1.1,-3.2],[1.2,2.,0.1]])
print ("predictions = " + str(predict(w, b, X)))
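
The per-example loop in predict() can also be written as a single vectorized comparison. The function below is a hypothetical variant (`predict_vectorized` is a name introduced here, not part of the assignment) that is meant to return the same result on the test inputs above.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def predict_vectorized(w, b, X):
    """Hypothetical loop-free variant of predict(); same inputs and output shape."""
    w = w.reshape(X.shape[0], 1)
    A = sigmoid(np.dot(w.T, X) + b)  # probabilities, shape (1, m)
    return (A > 0.5).astype(float)   # 1.0 where A > 0.5, else 0.0

w = np.array([[0.1124579], [0.23106775]])
b = -0.3
X = np.array([[1., -1.1, -3.2], [1.2, 2., 0.1]])
print("predictions = " + str(predict_vectorized(w, b, X)))  # predictions = [[1. 1. 0.]]
```

The boolean comparison is applied to the whole (1, m) array at once, which avoids a Python-level loop over the examples.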


5 - Merge all functions into a model



import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage

def load_dataset():
    train_dataset = h5py.File("train_catvnoncat.h5", "r")  # read the training data, 209 images
    test_dataset = h5py.File("test_catvnoncat.h5", "r")  # read the test data, 50 images
    train_set_x_orig = np.array(train_dataset["train_set_x"][:])  # original training set (209, 64, 64, 3)
    train_set_y_orig = np.array(train_dataset["train_set_y"][:])  # training labels (y=0 non-cat, y=1 cat), 209 entries
    test_set_x_orig = np.array(test_dataset["test_set_x"][:])  # original test set (50, 64, 64, 3)
    test_set_y_orig = np.array(test_dataset["test_set_y"][:])  # test labels (y=0 non-cat, y=1 cat), 50 entries
    train_set_y_orig = train_set_y_orig.reshape((1, train_set_y_orig.shape[0]))  # reshape training labels to (1, 209)
    test_set_y_orig = test_set_y_orig.reshape((1, test_set_y_orig.shape[0]))  # reshape test labels to (1, 50)
    classes = np.array(test_dataset["list_classes"][:])
    return train_set_x_orig, train_set_y_orig, test_set_x_orig, test_set_y_orig, classes

# Display an image
def image_show(index, dataset):
    if dataset == "train":
        plt.imshow(train_set_x_orig[index])
        print ("y = " + str(train_set_y[:, index]) + ", it's a '" + classes[np.squeeze(train_set_y[:, index])].decode("utf-8") + "' picture.")
    elif dataset == "test":
        plt.imshow(test_set_x_orig[index])
        print ("y = " + str(test_set_y[:, index]) + ", it's a '" + classes[np.squeeze(test_set_y[:, index])].decode("utf-8") + "' picture.")

# Sigmoid function
def sigmoid(z):
    s = 1 / (1 + np.exp(-z))
    return s

# Initialize the parameters w and b
def initialize_with_zeros(dim):
    w = np.zeros((dim, 1))  # w is a dim x 1 matrix
    b = 0
    return w, b

# Compute Y_hat, the cost function J, and the gradients dw, db
def propagate(w, b, X, Y):
    m = X.shape[1]  # number of examples
    Y_hat = sigmoid(np.dot(w.T, X) + b)
    cost = -(np.sum(np.dot(Y, np.log(Y_hat).T) + np.dot((1 - Y), np.log(1 - Y_hat).T))) / m  # cost function
    dw = (np.dot(X, (Y_hat - Y).T)) / m
    db = (np.sum(Y_hat - Y)) / m
    cost = np.squeeze(cost)  # drop singleton dimensions
    grads = {"dw": dw, "db": db}  # gradients
    return grads, cost

# Gradient descent to find the optimal parameters
def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):
    # num_iterations: number of gradient descent steps; learning_rate: the learning rate, i.e. the parameter α
    costs = []  # recorded cost values
    for i in range(num_iterations):  # gradient descent loop
        grads, cost = propagate(w, b, X, Y)
        dw = grads["dw"]
        db = grads["db"]
        w = w - learning_rate * dw
        b = b - learning_rate * db
        if i % 100 == 0:  # record the cost every 100 iterations
            costs.append(cost)
        if print_cost and i % 100 == 0:  # print the cost
            print ("Cost after %i iterations: %f" % (i, cost))
    params = {"w": w, "b": b}  # final parameter values
    grads = {"dw": dw, "db": db}  # final gradient values
    return params, grads, costs

# Predict the labels
def predict(w, b, X):
    m = X.shape[1]  # number of examples
    Y_prediction = np.zeros((1, m))  # initialize the prediction output
    w = w.reshape(X.shape[0], 1)  # make sure w is a column vector
    Y_hat = sigmoid(np.dot(w.T, X) + b)  # plug the learned parameters into the model
    for i in range(Y_hat.shape[1]):
        if Y_hat[:, i] > 0.5:
            Y_prediction[:, i] = 1
        else:
            Y_prediction[:, i] = 0
    return Y_prediction

# Build the full prediction model
def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
    # num_iterations: number of gradient descent steps; learning_rate: the learning rate, i.e. the parameter α
    w, b = initialize_with_zeros(X_train.shape[0])  # initialize the parameters w and b
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)  # gradient descent to find the optimal parameters
    w = parameters["w"]
    b = parameters["b"]
    Y_prediction_train = predict(w, b, X_train)  # predictions on the training set
    Y_prediction_test = predict(w, b, X_test)  # predictions on the test set
    train_accuracy = 100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100  # training set accuracy
    test_accuracy = 100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100  # test set accuracy
    print("Training set accuracy: {} %".format(train_accuracy))
    print("Test set accuracy: {} %".format(test_accuracy))
    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}
    return d

train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()
m_train = train_set_x_orig.shape[0]  # number of training examples
m_test = test_set_x_orig.shape[0]  # number of test examples
num_px = test_set_x_orig.shape[1]  # height/width of each image in pixels
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T  # flatten the training set to (12288, 209)
test_set_x_flatten = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T  # flatten the test set to (12288, 50)
train_set_x = train_set_x_flatten / 255.  # scale training pixel values to [0, 1]
test_set_x = test_set_x_flatten / 255.  # scale test pixel values to [0, 1]
d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = True)

# Plot the learning curve
costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate =" + str(d["learning_rate"]))
plt.show()




