Multiclass Classification in Practice: Handwritten Digit Classification

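1. Import packages
2. Review: the softmax function
3. Neural network
- 3.1 Problem statement
- 3.2 Dataset
  - 3.2.1 Inspect the variables
  - 3.2.2 Check the dimensions
  - 3.2.3 Visualize the data
- 3.3 Model representation
- 3.4 Model construction
- 3.5 Prediction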

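The previous sections covered multiclass classification and the softmax function; this section puts them into practice.

1. Import packages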

import numpy as np
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.activations import linear, relu, sigmoid
%matplotlib widget
import matplotlib.pyplot as plt
plt.style.use('./deeplearning.mplstyle')

import logging
logging.getLogger("tensorflow").setLevel(logging.ERROR)
tf.autograph.set_verbosity(0)

from public_tests import *
from autils import *
from lab_utils_softmax import plt_softmax
np.set_printoptions(precision=2)

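2. Review: the softmax function

A multiclass neural network produces N outputs, and one of them is selected as the predicted answer. In the output layer, a vector $\mathbf{z}$ is generated by a linear function and fed into the softmax function. The softmax function converts $\mathbf{z}$ into a probability distribution, as described below. After softmax is applied, each output lies between 0 and 1 and the outputs sum to 1, so they can be interpreted as probabilities. Larger inputs to softmax correspond to larger output probabilities.

The softmax function can be written:
$$a_j = \frac{e^{z_j}}{ \sum_{k=0}^{N-1}{e^{z_k} }} \tag{1}$$

where $z_j = \mathbf{w}_j \cdot \mathbf{x} + b_j$ and N is the number of outputs (categories) in the output layer.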

Implementation:

# UNQ_C1
# GRADED CELL: my_softmax

def my_softmax(z):
    """ Softmax converts a vector of values to a probability distribution.
    Args:
      z (ndarray (N,))  : input data, N features
    Returns:
      a (ndarray (N,))  : softmax of z
    """
    ### START CODE HERE ###
    ez = np.exp(z)           # element-wise exponential
    a = ez / np.sum(ez)      # normalize so the outputs sum to 1
    ### END CODE HERE ###
    return a

Compare with TensorFlow's built-in softmax:

z = np.array([1., 2., 3., 4.])
a = my_softmax(z)
atf = tf.nn.softmax(z)
print(f"my_softmax(z):         {a}")
print(f"tensorflow softmax(z): {atf}")

Output:

my_softmax(z):         [0.03 0.09 0.24 0.64]
tensorflow softmax(z): [0.03 0.09 0.24 0.64]
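
The direct implementation above exponentiates z as-is, which can overflow for large inputs. The sketch below shows one common workaround: subtracting the maximum entry before exponentiating. This leaves the result unchanged because softmax is invariant to adding a constant to every input. The helper name `stable_softmax` and the test vectors are assumptions made for illustration, and this is not claimed to be how `tf.nn.softmax` is implemented internally.

```python
import numpy as np

def stable_softmax(z):
    """Softmax that subtracts max(z) before exponentiating (hypothetical helper)."""
    z = np.asarray(z, dtype=float)
    shifted = z - np.max(z)      # largest entry becomes 0, so exp() cannot overflow
    ez = np.exp(shifted)
    return ez / np.sum(ez)

z_small = np.array([1., 2., 3., 4.])
z_large = np.array([1000., 1001., 1002., 1003.])   # np.exp(1000.) would overflow to inf

p_small = stable_softmax(z_small)
p_large = stable_softmax(z_large)   # same spacing between entries, so same probabilities
print(p_small)
print(p_large)
```

Both calls print the same distribution, since the two inputs differ only by a constant offset.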

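3. Neural network

3.1 Problem statement

In this exercise, you will use a neural network to recognize the ten handwritten digits 0-9. This is a multiclass classification task in which one of n choices is selected. Automated handwritten digit recognition is widely used today, from reading postal codes on mail envelopes to reading the amounts written on bank checks.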

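3.2 Dataset

The "load_data()" function shown below loads the data into the variables "X" and "y".
The dataset contains 5000 training examples of handwritten digits $^1$.
- Each training example is a 20-pixel x 20-pixel grayscale image of a digit.
- Each pixel is represented by a floating-point number indicating the grayscale intensity at that location.
- The 20 x 20 grid of pixels is "unrolled" into a 400-dimensional vector.
- Each training example becomes a single row in the data matrix "X".
- This gives a 5000 x 400 matrix "X" in which every row is a training example of a handwritten digit image.

$$X = \left(\begin{array}{c} --- (x^{(1)}) --- \\ --- (x^{(2)}) --- \\ \vdots \\ --- (x^{(m)}) --- \end{array}\right)$$

The second part of the training set is a 5000 x 1 vector "y" that contains the labels for the training set:
"y = 0" if the image is of the digit 0, "y = 4" if the image is of the digit 4, and so on.

$^1$ This is a subset of the MNIST handwritten digit dataset (http://yann.lecun.com/exdb/mnist/)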
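
The unrolling described above can be sketched with NumPy. The image here is a made-up array rather than real data, and `X_demo` is a hypothetical stand-in for the data matrix. The sketch uses NumPy's default row-major order; the lab's plotting code transposes after reshaping, which suggests its images were unrolled column by column, so treat the ordering here as an assumption.

```python
import numpy as np

# Hypothetical stand-in for one 20x20 grayscale image (values are arbitrary).
image = np.arange(400, dtype=float).reshape(20, 20)

# "Unroll" the pixel grid into a 400-dimensional vector.
row = image.reshape(400)

# Stack several unrolled images to form a data matrix, one example per row.
X_demo = np.stack([row, row * 2.0, row * 3.0])

# Recover the 20x20 grid from one row of the matrix.
restored = X_demo[0].reshape(20, 20)

print(row.shape, X_demo.shape, restored.shape)
```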

# load dataset
X, y = load_data()

3.2.1 Inspect the variables

First, print the variables.

print ('The first element of X is: ', X[0])

print ('The first element of y is: ', y[0,0])
print ('The last element of y is: ', y[-1,0])

3.2.2 Check the dimensions

print ('The shape of X is: ' + str(X.shape))
print ('The shape of y is: ' + str(y.shape))

Output:

The shape of X is: (5000, 400)
The shape of y is: (5000, 1)

3.2.3 Visualize the data

The code in the cell below randomly selects 64 rows from "X", maps each row back to a 20-pixel by 20-pixel grayscale image, and displays the images together.
The label for each image is displayed above the image.

import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
# You do not need to modify anything in this cell

m, n = X.shape

fig, axes = plt.subplots(8,8, figsize=(5,5))
fig.tight_layout(pad=0.13,rect=[0, 0.03, 1, 0.91]) #[left, bottom, right, top]

#fig.tight_layout(pad=0.5)
widgvis(fig)
for i,ax in enumerate(axes.flat):
    # Select random indices
    random_index = np.random.randint(m)

    # Select rows corresponding to the random indices and
    # reshape the image
    X_random_reshaped = X[random_index].reshape((20,20)).T

    # Display the image
    ax.imshow(X_random_reshaped, cmap='gray')

    # Display the label above the image
    ax.set_title(y[random_index,0])
    ax.set_axis_off()

fig.suptitle("Label, image", fontsize=14)

3.3 Model representation

The neural network used in this exercise is structured as follows.

It has two dense layers with ReLU activations followed by an output layer with a linear activation.
- Since the images are of size $20\times20$, this gives $400$ inputs.
- The parameters have dimensions that are sized for a neural network with $25$ units in layer 1, $15$ units in layer 2 and $10$ output units in layer 3, one for each digit.
- Recall that the dimensions of these parameters are determined as follows:
  - If a network has $s_{in}$ units in a layer and $s_{out}$ units in the next layer, then
    - $W$ will be of dimension $s_{in} \times s_{out}$.
    - $b$ will be a vector with $s_{out}$ elements.
- Therefore, the shapes of W and b are
  - layer1: The shape of W1 is (400, 25) and the shape of b1 is (25,)
  - layer2: The shape of W2 is (25, 15) and the shape of b2 is (15,)
  - layer3: The shape of W3 is (15, 10) and the shape of b3 is (10,)
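
The sizing rule above can be checked with a few lines of plain Python. The layer sizes come from the text; the variable names and the total parameter count are added here for illustration only.

```python
# Layer sizes taken from the text: 400 inputs, then 25, 15 and 10 units.
layer_sizes = [400, 25, 15, 10]

total_params = 0
shapes = []
for s_in, s_out in zip(layer_sizes[:-1], layer_sizes[1:]):
    w_shape = (s_in, s_out)   # W has one column per unit in the next layer
    b_shape = (s_out,)        # b has one entry per unit in the next layer
    shapes.append((w_shape, b_shape))
    total_params += s_in * s_out + s_out

for i, (w_shape, b_shape) in enumerate(shapes, start=1):
    print(f"layer{i}: W{i} {w_shape}, b{i} {b_shape}")
print(f"total trainable parameters: {total_params}")
```

The total is 10025 + 390 + 160 = 10575 parameters.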

Note: The bias vector b could be represented as a 1-D (n,) or 2-D (n,1) array. Tensorflow utilizes a 1-D representation and this lab will maintain that convention.

Tensorflow models are built layer by layer. A layer's input dimensions ($s_{in}$ above) are calculated for you. You specify a layer's output dimensions and this determines the next layer's input dimension. The input dimension of the first layer is derived from the size of the input data specified in the model.fit statement below.

Note: It is also possible to add an input layer that specifies the input dimension of the first layer. For example:
tf.keras.Input(shape=(400,)), #specify input shape
We will include that here to illuminate some model sizing.

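3.4 Model construction

A note on using softmax:
Numerical stability improves if the softmax is grouped with the loss function during training rather than placed in the output layer. This has implications both when building the model and when using it.

Building the neural network:

The final Dense layer should use a "linear" activation, which is effectively no activation.
The model.compile statement will indicate this by including from_logits=True.
loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
This does not affect the form of the target. In the case of SparseCategoricalCrossentropy, the target is the expected digit, 0-9.
Note that the model outputs are not probabilities. If output probabilities are desired, apply a softmax function to them.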
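
To make the stability point concrete, here is a minimal NumPy sketch comparing two ways of computing the cross-entropy loss for a single example: softmax first and then the log, versus working directly from the logits with the log-sum-exp trick. The function names and the input values are assumptions made for illustration; this shows the general idea and is not TensorFlow's actual implementation of `from_logits=True`.

```python
import numpy as np

def loss_from_probs(z, y):
    """Naive route: softmax first, then -log of the target class probability."""
    ez = np.exp(z)
    a = ez / np.sum(ez)
    return -np.log(a[y])

def loss_from_logits(z, y):
    """Combined route: log(sum(exp(z))) - z[y], using the log-sum-exp trick."""
    z_max = np.max(z)
    log_sum_exp = z_max + np.log(np.sum(np.exp(z - z_max)))
    return log_sum_exp - z[y]

z = np.array([1., 2., 3., 4.])
y = 2                                    # hypothetical target class
print(loss_from_probs(z, y), loss_from_logits(z, y))   # agree for moderate logits

z_big = np.array([1000., 0., -1000.])    # extreme logits chosen to force overflow
with np.errstate(over="ignore", invalid="ignore", divide="ignore"):
    naive = loss_from_probs(z_big, 0)    # exp(1000) overflows, so the result is nan
stable = loss_from_logits(z_big, 0)      # stays finite
print(naive, stable)
```

For moderate logits the two routes agree; for extreme logits only the combined route returns a usable number.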

Model code:

# UNQ_C2
# GRADED CELL: Sequential model
tf.random.set_seed(1234) # for consistent results
model = Sequential(
    [
        ### START CODE HERE ###
        tf.keras.Input(shape=(400,)),
        Dense(25, activation='relu', name="L1"),
        Dense(15, activation='relu', name="L2"),
        Dense(10, activation='linear', name="L3"),
        ### END CODE HERE ###
    ], name = "my_model"
)

Let's examine the weights further to verify that Tensorflow produced the same dimensions as we calculated above.

[layer1, layer2, layer3] = model.layers

#### Examine Weights shapes
W1,b1 = layer1.get_weights()
W2,b2 = layer2.get_weights()
W3,b3 = layer3.get_weights()
print(f"W1 shape = {W1.shape}, b1 shape = {b1.shape}")
print(f"W2 shape = {W2.shape}, b2 shape = {b2.shape}")
print(f"W3 shape = {W3.shape}, b3 shape = {b3.shape}")

Output:

W1 shape = (400, 25), b1 shape = (25,)
W2 shape = (25, 15), b2 shape = (15,)
W3 shape = (15, 10), b3 shape = (10,)

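Fitting the model:

The following code defines a loss function, SparseCategoricalCrossentropy, and indicates that the softmax should be included in the loss calculation by adding from_logits=True.
It also defines an optimizer. A popular choice is Adaptive Moment estimation (Adam), which was described in the lecture.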
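
For readers who want a reminder of what Adam does, the following is a minimal NumPy sketch of a single Adam update applied to a toy problem. The helper `adam_step`, the toy objective, and the hyperparameter values other than the learning rate are assumptions for illustration (the beta and epsilon values are the commonly cited defaults; the Keras optimizer's own defaults and internals may differ).

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for parameters w given their gradient (illustrative helper)."""
    m = beta1 * m + (1 - beta1) * grad          # running mean of gradients
    v = beta2 * v + (1 - beta2) * grad ** 2     # running mean of squared gradients
    m_hat = m / (1 - beta1 ** t)                # bias correction for early steps
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy problem (an assumption for illustration): minimize f(w) = sum(w**2).
w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
start_cost = np.sum(w ** 2)
for t in range(1, 2001):
    grad = 2 * w                                # gradient of sum(w**2)
    w, m, v = adam_step(w, grad, m, v, t, lr=0.01)
end_cost = np.sum(w ** 2)
print(start_cost, end_cost)
```

On the very first step the update size is close to the learning rate for every parameter, regardless of the gradient's magnitude, which is one reason Adam is less sensitive to gradient scale than plain gradient descent.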

model.compile(
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
)

history = model.fit(
    X,y,
    epochs=40
)

# Plot the loss to check whether training has converged
def plot_loss_tf(history):
    fig,ax = plt.subplots(1,1, figsize = (4,3))
    widgvis(fig)
    ax.plot(history.history['loss'], label='loss')
    ax.set_ylim([0, 2])
    ax.set_xlabel('Epoch')
    ax.set_ylabel('loss (cost)')
    ax.legend()
    ax.grid(True)
    plt.show()

plot_loss_tf(history)

3.5 Prediction

image_of_two = X[1015]
display_digit(image_of_two)

prediction = model.predict(image_of_two.reshape(1,400))  # prediction

print(f" predicting a Two: \n{prediction}")
# The largest output corresponds to the most probable class
print(f" Largest Prediction index: {np.argmax(prediction)}")

Use softmax to convert the prediction into probabilities:

prediction_p = tf.nn.softmax(prediction)

print(f" predicting a Two. Probability vector: \n{prediction_p}")

Output:

 predicting a Two. Probability vector:
[[2.01e-06 1.35e-02 8.98e-01 6.44e-02 1.14e-07 7.06e-06 5.35e-05 2.37e-02
  1.00e-04 2.77e-05]]

Alternatively, use argmax to return the index of the largest value:

yhat = np.argmax(prediction_p)

print(f"np.argmax(prediction_p): {yhat}")
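
Because softmax preserves the ordering of its inputs, taking argmax of the raw model outputs gives the same index as taking argmax of the probabilities. The sketch below illustrates this with made-up logits; the array values and variable names are hypothetical and are not outputs of the trained model.

```python
import numpy as np

# Hypothetical logits for one example with 10 classes (values are made up).
logits = np.array([[-6.1, 2.7, 6.9, 4.3, -9.0, -4.9, -2.8, 3.3, -2.2, -3.5]])

# Row-wise softmax, shifted by the row maximum for numerical safety.
shifted = logits - logits.max(axis=1, keepdims=True)
probs = np.exp(shifted) / np.exp(shifted).sum(axis=1, keepdims=True)

idx_from_logits = np.argmax(logits, axis=1)   # index of the largest raw output
idx_from_probs = np.argmax(probs, axis=1)     # index of the largest probability
print(idx_from_logits, idx_from_probs)
```

So when only the predicted class is needed, the softmax step can be skipped.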

Compare the predictions with the true labels:

import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
# You do not need to modify anything in this cell

m, n = X.shape

fig, axes = plt.subplots(8,8, figsize=(5,5))
fig.tight_layout(pad=0.13,rect=[0, 0.03, 1, 0.91]) #[left, bottom, right, top]
widgvis(fig)
for i,ax in enumerate(axes.flat):
    # Select random indices
    random_index = np.random.randint(m)

    # Select rows corresponding to the random indices and
    # reshape the image
    X_random_reshaped = X[random_index].reshape((20,20)).T

    # Display the image
    ax.imshow(X_random_reshaped, cmap='gray')

    # Predict using the Neural Network
    prediction = model.predict(X[random_index].reshape(1,400))
    prediction_p = tf.nn.softmax(prediction)
    yhat = np.argmax(prediction_p)

    # Display the label above the image
    ax.set_title(f"{y[random_index,0]},{yhat}",fontsize=10)
    ax.set_axis_off()
fig.suptitle("Label, yhat", fontsize=14)
plt.show()

Display the errors:

def display_errors(model,X,y):
    f = model.predict(X)
    yhat = np.argmax(f, axis=1)
    idxs = np.where(yhat != y[:,0])[0]   # indices of misclassified examples
    if len(idxs) == 0:
        print("no errors found")
    else:
        cnt = min(8, len(idxs))
        fig, ax = plt.subplots(1,cnt, figsize=(5,1.2))
        fig.tight_layout(pad=0.13,rect=[0, 0.03, 1, 0.80]) #[left, bottom, right, top]
        widgvis(fig)

        for i in range(cnt):
            j = idxs[i]
            X_reshaped = X[j].reshape((20,20)).T

            # Display the image
            ax[i].imshow(X_reshaped, cmap='gray')

            # Predict using the Neural Network
            prediction = model.predict(X[j].reshape(1,400))
            prediction_p = tf.nn.softmax(prediction)
            yhat = np.argmax(prediction_p)

            # Display the label above the image
            ax[i].set_title(f"{y[j,0]},{yhat}",fontsize=10)
            ax[i].set_axis_off()
            fig.suptitle("Label, yhat", fontsize=12)
    return(len(idxs))

print( f"{display_errors(model,X,y)} errors out of {len(X)} images")
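
The error count used above comes from comparing the argmax of each output row with the label column. The following self-contained sketch reproduces that comparison on a tiny made-up example; `y_demo` and `f_demo` are hypothetical stand-ins for the labels and the model outputs, and the accuracy figure is added for illustration.

```python
import numpy as np

# Hypothetical labels and model outputs for 5 examples and 3 classes.
y_demo = np.array([[0], [2], [1], [1], [2]])          # shape (m, 1), as in the lab
f_demo = np.array([[ 2.0, 0.1, -1.0],
                   [ 0.3, 0.2,  1.5],
                   [ 0.9, 0.4,  0.1],                  # predicted 0, label 1
                   [-0.5, 1.2,  0.0],
                   [ 0.0, 0.7,  0.6]])                 # predicted 1, label 2

yhat = np.argmax(f_demo, axis=1)          # predicted class per row
wrong = yhat != y_demo[:, 0]              # boolean mask of misclassified rows
n_errors = int(np.sum(wrong))
accuracy = 1.0 - n_errors / len(y_demo)
print(n_errors, accuracy, np.where(wrong)[0])
```

Note that `y_demo[:, 0]` flattens the (m, 1) label column first; comparing a (m,) array against a (m, 1) array directly would broadcast to an (m, m) matrix instead.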
