This article is excerpted from the programming assignments of Andrew Ng's Deep Learning Specialization; thanks to the course for the material.

Course link: https://www.deeplearning.ai/deep-learning-specialization/

You will learn to:

  • Build the general architecture of a learning algorithm, including:

    • Initializing parameters
    • Calculating the cost function and its gradient
    • Using an optimization algorithm (gradient descent)
  • Gather all three functions above into a main model function, in the right order.

Table of Contents

1 - Packages

2 - Overview of the Problem set

3 - General Architecture of the learning algorithm (master this)

4 - Building the parts of our algorithm

4.1 - Helper functions

4.2 - Initializing parameters

4.3 - Forward and Backward propagation (master this)

4.4 - Optimization

4.5 - Predict

5 - Merge all functions into a model


1 - Packages

First, let's run the cell below to import all the packages that you will need during this assignment.

  • numpy is the fundamental package for scientific computing with Python.
  • h5py is a common package to interact with a dataset that is stored on an H5 file.
  • matplotlib is a famous library to plot graphs in Python.
  • PIL and scipy are used here to test your model with your own picture at the end.
import numpy as np
import matplotlib.pyplot as plt
import h5py
import scipy
from PIL import Image
from scipy import ndimage
from lr_utils import load_dataset  # this helper lives in a .py file in the assignment folder

%matplotlib inline

2 - Overview of the Problem set

Problem Statement: You are given a dataset ("data.h5") containing:

- a training set of m_train images labeled as cat (y=1) or non-cat (y=0)
- a test set of m_test images labeled as cat or non-cat
- each image is of shape (num_px, num_px, 3) where 3 is for the 3 channels (RGB). Thus, each image is square (height = num_px) and (width = num_px).

You will build a simple image-recognition algorithm that can correctly classify pictures as cat or non-cat.

Let's get more familiar with the dataset. Load the data by running the following code.
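A minimal loading cell, assuming load_dataset() from lr_utils returns the original train/test arrays plus the class names (the _orig suffix marks arrays that will be reshaped and standardized later):

# Loading the data (cat/non-cat)
train_set_x_orig, train_set_y, test_set_x_orig, test_set_y, classes = load_dataset()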

Many software bugs in deep learning come from having matrix/vector dimensions that don't fit. If you can keep your matrix/vector dimensions straight you will go a long way toward eliminating many bugs.

Exercise: Find the values for:

- m_train (number of training examples)
- m_test (number of test examples)
- num_px (= height = width of a training image)

Remember that train_set_x_orig is a numpy-array of shape (m_train, num_px, num_px, 3). For instance, you can access m_train by writing train_set_x_orig.shape[0].

m_train = train_set_x_orig.shape[0]
m_test = test_set_x_orig.shape[0]
num_px = train_set_x_orig.shape[1]
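A quick sanity check on the quantities just derived (the printed numbers depend on the contents of data.h5):

print("Number of training examples: m_train = " + str(m_train))
print("Number of testing examples: m_test = " + str(m_test))
print("Height/Width of each image: num_px = " + str(num_px))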

For convenience, you should now reshape images of shape (num_px, num_px, 3) into a numpy array of shape (num_px * num_px * 3, 1). After this, our training (and test) dataset is a numpy array where each column represents a flattened image. There should be m_train (respectively m_test) columns.

Exercise: Reshape the training and test data sets so that images of size (num_px, num_px, 3) are flattened into single vectors of shape (num_px * num_px * 3, 1).

A trick when you want to flatten a matrix X of shape (a, b, c, d) to a matrix X_flatten of shape (b*c*d, a) is to use:

X_flatten = X.reshape(X.shape[0], -1).T      # X.T is the transpose of X
train_set_x_flatten = train_set_x_orig.reshape(train_set_x_orig.shape[0], -1).T
test_set_x_flatten  = test_set_x_orig.reshape(test_set_x_orig.shape[0], -1).T
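To verify the trick worked, check the resulting shapes; each column should now be one flattened image:

print("train_set_x_flatten shape: " + str(train_set_x_flatten.shape))  # (num_px * num_px * 3, m_train)
print("test_set_x_flatten shape: " + str(test_set_x_flatten.shape))    # (num_px * num_px * 3, m_test)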

Let's standardize our dataset. For picture datasets it is simpler, and works almost as well as full mean/variance normalization, to just divide every row of the dataset by 255 (the maximum value of a pixel channel).

train_set_x = train_set_x_flatten/255.
test_set_x = test_set_x_flatten/255.

**What you need to remember:**

Common steps for pre-processing a new dataset are:

  • Figure out the dimensions and shapes of the problem (m_train, m_test, num_px, ...)
  • Reshape the datasets such that each example is now a vector of size (num_px * num_px * 3, 1)
  • "Standardize" the data

3 - General Architecture of the learning algorithm (master this)

You will build a Logistic Regression, using a Neural Network mindset. The following Figure explains why Logistic Regression is actually a very simple Neural Network!

Mathematical expression of the algorithm:

For one example $x^{(i)}$:

$$z^{(i)} = w^T x^{(i)} + b$$

$$\hat{y}^{(i)} = a^{(i)} = \sigma(z^{(i)})$$

$$\mathcal{L}(a^{(i)}, y^{(i)}) = -y^{(i)} \log(a^{(i)}) - (1 - y^{(i)}) \log(1 - a^{(i)})$$

The cost is then computed by summing over all training examples:

$$J = \frac{1}{m} \sum_{i=1}^{m} \mathcal{L}(a^{(i)}, y^{(i)})$$

Key steps: In this exercise, you will carry out the following steps:

  • Initialize the parameters of the model
  • Learn the parameters for the model by minimizing the cost
  • Use the learned parameters to make predictions (on the test set)
  • Analyse the results and conclude

4 - Building the parts of our algorithm

The main steps for building a Neural Network are:

  1. Define the model structure (such as number of input features)
  2. Initialize the model's parameters
  3. Loop:
    • Calculate current loss (forward propagation)
    • Calculate current gradient (backward propagation)
    • Update parameters (gradient descent)

You often build 1-3 separately and integrate them into one function we call model().

4.1 - Helper functions

Exercise: Using your code from "Python Basics", implement sigmoid(). Recall that $\sigma(z) = \frac{1}{1 + e^{-z}}$.

def sigmoid(z):
    """
    Compute the sigmoid of z

    Arguments:
    z -- A scalar or numpy array of any size.

    Return:
    s -- sigmoid(z)
    """
    s = 1 / (1 + np.exp(-z))
    return s
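A quick check of the implementation (the expected values follow directly from the formula):

print("sigmoid([0, 2]) = " + str(sigmoid(np.array([0, 2]))))  # ≈ [0.5, 0.88079708]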

4.2 - Initializing parameters

def initialize_with_zeros(dim):
    """
    This function creates a vector of zeros of shape (dim, 1) for w and initializes b to 0.

    Argument:
    dim -- size of the w vector we want (or number of parameters in this case)

    Returns:
    w -- initialized vector of shape (dim, 1)
    b -- initialized scalar (corresponds to the bias)
    """
    w = np.zeros((dim, 1))
    b = 0

    assert(w.shape == (dim, 1))
    assert(isinstance(b, float) or isinstance(b, int))

    return w, b
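Zero initialization is fine for logistic regression (unlike deeper networks, which need random initialization to break symmetry). A quick check:

w, b = initialize_with_zeros(2)
print("w = " + str(w))  # [[0.], [0.]]
print("b = " + str(b))  # 0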

4.3 - Forward and Backward propagation (master this)

Now that your parameters are initialized, you can do the "forward" and "backward" propagation steps for learning the parameters.

Exercise: Implement a function propagate() that computes the cost function and its gradient.

Hints:

Forward Propagation:

  • You get X
  • You compute $A = \sigma(w^T X + b) = (a^{(1)}, a^{(2)}, \ldots, a^{(m)})$
  • You calculate the cost function: $J = -\frac{1}{m} \sum_{i=1}^{m} \left[ y^{(i)} \log(a^{(i)}) + (1 - y^{(i)}) \log(1 - a^{(i)}) \right]$

Here are the two formulas you will be using:

$$\frac{\partial J}{\partial w} = \frac{1}{m} X (A - Y)^T$$

$$\frac{\partial J}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} (a^{(i)} - y^{(i)})$$

# GRADED FUNCTION: propagate

def propagate(w, b, X, Y):
    """
    Implement the cost function and its gradient for the propagation explained above

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat) of size (1, number of examples)

    Return:
    cost -- negative log-likelihood cost for logistic regression
    dw -- gradient of the loss with respect to w, thus same shape as w
    db -- gradient of the loss with respect to b, thus same shape as b

    Tips:
    - Write your code step by step for the propagation. np.log(), np.dot()
    """
    m = X.shape[1]

    # FORWARD PROPAGATION (FROM X TO COST)
    A = sigmoid(np.dot(w.T, X) + b)
    cost = -1/m * np.sum(Y*np.log(A) + (1-Y)*np.log(1-A))

    # BACKWARD PROPAGATION (TO FIND GRAD)
    dw = 1/m * np.dot(X, (A-Y).T)
    db = 1/m * np.sum(A-Y)

    assert(dw.shape == w.shape)
    assert(db.dtype == float)
    cost = np.squeeze(cost)
    assert(cost.shape == ())

    grads = {"dw": dw,
             "db": db}

    return grads, cost
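A smoke test on tiny hand-made inputs (the values below are arbitrary, chosen only for illustration):

w, b = np.array([[1.], [2.]]), 2.
X = np.array([[1., 2., -1.], [3., 4., -3.2]])
Y = np.array([[1, 0, 1]])
grads, cost = propagate(w, b, X, Y)
print("dw = " + str(grads["dw"]))  # ≈ [[0.998], [2.395]]
print("db = " + str(grads["db"]))  # ≈ 0.00146
print("cost = " + str(cost))       # ≈ 5.80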

4.4 - Optimization

Exercise: Write down the optimization function. The goal is to learn $w$ and $b$ by minimizing the cost function $J$. For a parameter $\theta$, the update rule is $\theta = \theta - \alpha \, d\theta$, where $\alpha$ is the learning rate.

def optimize(w, b, X, Y, num_iterations, learning_rate, print_cost = False):
    """
    This function optimizes w and b by running a gradient descent algorithm

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of shape (num_px * num_px * 3, number of examples)
    Y -- true "label" vector (containing 0 if non-cat, 1 if cat), of shape (1, number of examples)
    num_iterations -- number of iterations of the optimization loop
    learning_rate -- learning rate of the gradient descent update rule
    print_cost -- True to print the loss every 100 steps

    Returns:
    params -- dictionary containing the weights w and bias b
    grads -- dictionary containing the gradients of the weights and bias with respect to the cost function
    costs -- list of all the costs computed during the optimization, this will be used to plot the learning curve.

    Tips:
    You basically need to write down two steps and iterate through them:
    1) Calculate the cost and the gradient for the current parameters. Use propagate().
    2) Update the parameters using the gradient descent rule for w and b.
    """
    costs = []

    for i in range(num_iterations):
        # Cost and gradient calculation (≈ 1-4 lines of code)
        grads, cost = propagate(w, b, X, Y)

        # Retrieve derivatives from grads
        dw = grads["dw"]
        db = grads["db"]

        # update rule (≈ 2 lines of code)
        w = w - learning_rate * dw
        b = b - learning_rate * db

        # Record the costs
        if i % 100 == 0:
            costs.append(cost)

        # Print the cost every 100 training iterations
        if print_cost and i % 100 == 0:
            print("Cost after iteration %i: %f" % (i, cost))

    params = {"w": w,
              "b": b}
    grads = {"dw": dw,
             "db": db}

    return params, grads, costs
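Continuing the smoke test from propagate() above (the hyperparameters here are arbitrary):

params, grads, costs = optimize(w, b, X, Y, num_iterations=100, learning_rate=0.009, print_cost=False)
print("w = " + str(params["w"]))
print("b = " + str(params["b"]))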

4.5 - Predict

Exercise: The previous function will output the learned w and b. We are able to use w and b to predict the labels for a dataset X. Implement the predict() function. There are two steps to computing predictions:

1. Calculate $\hat{Y} = A = \sigma(w^T X + b)$

2. Convert the entries of A into 0 (if activation <= 0.5) or 1 (if activation > 0.5), and store the predictions in a vector `Y_prediction`. If you wish, you can use an `if`/`else` statement in a `for` loop (though there is also a way to vectorize this, sketched after the implementation below).

def predict(w, b, X):
    '''
    Predict whether the label is 0 or 1 using learned logistic regression parameters (w, b)

    Arguments:
    w -- weights, a numpy array of size (num_px * num_px * 3, 1)
    b -- bias, a scalar
    X -- data of size (num_px * num_px * 3, number of examples)

    Returns:
    Y_prediction -- a numpy array (vector) containing all predictions (0/1) for the examples in X
    '''
    m = X.shape[1]
    Y_prediction = np.zeros((1, m))
    w = w.reshape(X.shape[0], 1)

    # Compute vector "A" predicting the probabilities of a cat being present in the picture
    A = sigmoid(np.dot(w.T, X) + b)

    for i in range(A.shape[1]):
        # Convert probabilities A[0,i] to actual predictions Y_prediction[0,i]
        if A[0, i] > 0.5:
            Y_prediction[0, i] = 1
        else:
            Y_prediction[0, i] = 0

    assert(Y_prediction.shape == (1, m))

    return Y_prediction
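The vectorized alternative mentioned in step 2 replaces the whole for loop with a single elementwise threshold (a sketch, equivalent to the loop above):

Y_prediction = (A > 0.5).astype(float)  # boolean mask cast to 0./1., same (1, m) shape as A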

**What to remember:** You've implemented several functions that:

  • Initialize (w, b)
  • Optimize the loss iteratively to learn parameters (w, b):
    • computing the cost and its gradient
    • updating the parameters using gradient descent
  • Use the learned (w, b) to predict the labels for a given set of examples

5 - Merge all functions into a model

You will now see how the overall model is structured by putting all the building blocks (functions implemented in the previous parts) together, in the right order.

Exercise: Implement the model function. Use the following notation:

- Y_prediction_test for your predictions on the test set
- Y_prediction_train for your predictions on the train set
- w, costs, grads for the outputs of optimize()

def model(X_train, Y_train, X_test, Y_test, num_iterations = 2000, learning_rate = 0.5, print_cost = False):
    """
    Builds the logistic regression model by calling the function you've implemented previously

    Arguments:
    X_train -- training set represented by a numpy array of shape (num_px * num_px * 3, m_train)
    Y_train -- training labels represented by a numpy array (vector) of shape (1, m_train)
    X_test -- test set represented by a numpy array of shape (num_px * num_px * 3, m_test)
    Y_test -- test labels represented by a numpy array (vector) of shape (1, m_test)
    num_iterations -- hyperparameter representing the number of iterations to optimize the parameters
    learning_rate -- hyperparameter representing the learning rate used in the update rule of optimize()
    print_cost -- Set to true to print the cost every 100 iterations

    Returns:
    d -- dictionary containing information about the model.
    """
    # initialize parameters with zeros (≈ 1 line of code)
    w, b = initialize_with_zeros(X_train.shape[0])

    # Gradient descent (≈ 1 line of code)
    parameters, grads, costs = optimize(w, b, X_train, Y_train, num_iterations, learning_rate, print_cost)

    # Retrieve parameters w and b from dictionary "parameters"
    w = parameters["w"]
    b = parameters["b"]

    # Predict test/train set examples (≈ 2 lines of code)
    Y_prediction_test = predict(w, b, X_test)
    Y_prediction_train = predict(w, b, X_train)

    # Print train/test Errors
    print("train accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_train - Y_train)) * 100))
    print("test accuracy: {} %".format(100 - np.mean(np.abs(Y_prediction_test - Y_test)) * 100))

    d = {"costs": costs,
         "Y_prediction_test": Y_prediction_test,
         "Y_prediction_train": Y_prediction_train,
         "w": w,
         "b": b,
         "learning_rate": learning_rate,
         "num_iterations": num_iterations}

    return d

d = model(train_set_x, train_set_y, test_set_x, test_set_y, num_iterations = 2000, learning_rate = 0.005, print_cost = True)
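After training, you can plot the learning curve from the costs recorded every 100 iterations; a steadily decreasing curve indicates that gradient descent is converging:

# Plot the learning curve from the dictionary returned by model()
costs = np.squeeze(d['costs'])
plt.plot(costs)
plt.ylabel('cost')
plt.xlabel('iterations (per hundreds)')
plt.title("Learning rate = " + str(d["learning_rate"]))
plt.show()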
