TensorFlow Study Notes, Part 2: Implementing a Neural Network with TensorFlow

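Contents

Forward propagation in a fully connected network
- A single neuron
- Fully connected network structure
- Worked example
Code for the sample neural network
Building a neural network: preparation, forward propagation, backpropagation, iteration
- Preparation
- Forward propagation: define the input, parameters, and output
- Backpropagation: define the loss function and the backpropagation method
- Iteration: create a session and train for STEPS rounds
Summary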

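Forward propagation in a fully connected network

A single neuron

As the figure shows, a single neuron has multiple inputs and one output. The structure of a neural network is the pattern of connections between its neurons. In the simplified model used here (no bias term and no activation function), a neuron's output is the weighted sum of all its inputs, and the neuron's parameters are the input weights ω. Optimizing a neural network means optimizing the values of these parameters.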
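
The weighted sum described above can be sketched in a few lines of plain Python. The function name `neuron_output` and the sample inputs and weights are illustrative assumptions, not part of the original notes; the sketch leaves out bias and activation to match the simplified neuron described here.

```python
def neuron_output(inputs, weights):
    """Return the weighted sum of the inputs (no bias, no activation)."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    return sum(x * w for x, w in zip(inputs, weights))


# Hypothetical example: two inputs and their weights.
out = neuron_output([0.7, 0.9], [0.2, 0.3])
print(round(out, 2))  # 0.7*0.2 + 0.9*0.3
```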

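Fully connected network structure

"Fully connected" means that every node in one layer is connected to every node in the adjacent layer.

The figure shows a simple three-layer fully connected network that decides whether a part is qualified, together with its forward-propagation process.

Scenario:
The network takes a part's length and mass as input and decides whether the part is qualified.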
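
Because every node in one layer connects to every node in the next, the number of weights between two adjacent layers is the product of their sizes. A minimal sketch, assuming no bias terms; the helper name `count_weights` is hypothetical, and the layer sizes 2, 3, 1 are those of the part-inspection network in these notes:

```python
def count_weights(layer_sizes):
    """Count the connections (weights) in a fully connected network without biases."""
    return sum(n_in * n_out for n_in, n_out in zip(layer_sizes, layer_sizes[1:]))


# 2 inputs -> 3 hidden nodes -> 1 output
print(count_weights([2, 3, 1]))  # 2*3 + 3*1 = 9
```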

The network's input is:
$$X = \begin{bmatrix} x_{1} & x_{2} \end{bmatrix}$$
where x1 is the part's length and x2 is its mass.

The network's parameters are:
$$W^{(1)} = \begin{bmatrix} w^{(1)}_{1,1} & w^{(1)}_{1,2} & w^{(1)}_{1,3} \\ w^{(1)}_{2,1} & w^{(1)}_{2,2} & w^{(1)}_{2,3} \end{bmatrix}$$
$$W^{(2)} = \begin{bmatrix} w^{(2)}_{1,1} \\ w^{(2)}_{2,1} \\ w^{(2)}_{3,1} \end{bmatrix}$$

Computation (both products are matrix products):
$$\begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \end{bmatrix} = \begin{bmatrix} x_{1} & x_{2} \end{bmatrix} \begin{bmatrix} w^{(1)}_{1,1} & w^{(1)}_{1,2} & w^{(1)}_{1,3} \\ w^{(1)}_{2,1} & w^{(1)}_{2,2} & w^{(1)}_{2,3} \end{bmatrix}$$

$$\begin{bmatrix} y \end{bmatrix} = \begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \end{bmatrix} \begin{bmatrix} w^{(2)}_{1,1} \\ w^{(2)}_{2,1} \\ w^{(2)}_{3,1} \end{bmatrix}$$
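
The two matrix products above extend unchanged to a batch of parts: stacking N input rows gives an N×2 matrix, and the output becomes N×1. The NumPy sketch below illustrates this. It is not the TensorFlow code from these notes, and the function name `forward`, the sample inputs, and the all-ones weights are assumptions chosen only to show the shapes.

```python
import numpy as np


def forward(x, w1, w2):
    """Forward pass of the 2-3-1 network: no bias, no activation."""
    a = x @ w1   # hidden layer, shape (N, 3)
    y = a @ w2   # output layer, shape (N, 1)
    return a, y


# Hypothetical batch of 4 parts, each described by (length, mass).
x = np.array([[0.7, 0.9],
              [0.1, 0.2],
              [0.5, 0.5],
              [0.9, 0.3]])
w1 = np.ones((2, 3))   # dummy weights, used only to demonstrate shapes
w2 = np.ones((3, 1))

a, y = forward(x, w1, w2)
print(a.shape, y.shape)  # (4, 3) (4, 1)
```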

Worked example

Suppose the part length is 0.7 and the part mass is 0.9, with weights
$$W^{(1)} = \begin{bmatrix} 0.2 & 0.1 & 0.4 \\ 0.3 & -0.5 & 0.2 \end{bmatrix}$$
$$W^{(2)} = \begin{bmatrix} 0.6 \\ 0.1 \\ -0.2 \end{bmatrix}$$
The intermediate values are:
$$\begin{bmatrix} a_{1,1} & a_{1,2} & a_{1,3} \end{bmatrix} = \begin{bmatrix} 0.7 & 0.9 \end{bmatrix} \begin{bmatrix} 0.2 & 0.1 & 0.4 \\ 0.3 & -0.5 & 0.2 \end{bmatrix} = \begin{bmatrix} 0.41 & -0.38 & 0.46 \end{bmatrix}$$
The output y is then:
$$\begin{bmatrix} y \end{bmatrix} = \begin{bmatrix} 0.41 & -0.38 & 0.46 \end{bmatrix} \begin{bmatrix} 0.6 \\ 0.1 \\ -0.2 \end{bmatrix} = \begin{bmatrix} 0.116 \end{bmatrix}$$
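
As a quick check of the arithmetic, the stated inputs and weights can be run through NumPy. This is a sketch outside TensorFlow; the variable names simply mirror the notation above.

```python
import numpy as np

x = np.array([[0.7, 0.9]])                 # part length, part mass
w1 = np.array([[0.2, 0.1, 0.4],
               [0.3, -0.5, 0.2]])
w2 = np.array([[0.6], [0.1], [-0.2]])

a = x @ w1    # intermediate values a_{1,1}, a_{1,2}, a_{1,3}
y = a @ w2    # network output

print(np.round(a, 2))  # approximately 0.41, -0.38, 0.46
print(np.round(y, 3))  # approximately 0.116
```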

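Code for the sample neural network

The complete program below trains a neural network to solve a binary classification problem.
Dataset: generated from random numbers.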

# Note: this program uses the TensorFlow 1.x API (tf.placeholder, tf.Session).
# Under TensorFlow 2.x, import tensorflow.compat.v1 as tf and call tf.disable_v2_behavior().
import tensorflow as tf
import numpy as np

BATCH_SIZE = 8
seed = 23455

# Generate random numbers from the seed
rng = np.random.RandomState(seed=seed)
X = rng.rand(32, 2)
Y = [[int(x0 + x1 < 1)] for (x0, x1) in X]

print("X:\n", X)
print("Shape of X:", X.shape)
print("Y:\n", Y)
print("Length of Y:", len(Y))

# 1. Define the network's input, parameters, and output, and the forward pass
x = tf.placeholder(tf.float32, shape=(None, 2))   # two features per sample; None leaves the number of samples open
y_ = tf.placeholder(tf.float32, shape=(None, 1))  # ground-truth labels; 1 means qualified

w1 = tf.Variable(tf.random_normal([2, 3], stddev=1, seed=1))
w2 = tf.Variable(tf.random_normal([3, 1], stddev=1, seed=1))

a = tf.matmul(x, w1)
y = tf.matmul(a, w2)

# 2. Define the loss function and the backpropagation method
loss = tf.reduce_mean(tf.square(y - y_))  # mean squared error
train_step = tf.train.GradientDescentOptimizer(0.001).minimize(loss)  # learning rate 0.001

# 3. Create a session and train for STEPS rounds
with tf.Session() as sess:
    # 3.1 Initialize the parameters
    init_op = tf.global_variables_initializer()
    sess.run(init_op)
    print("w1:\n", sess.run(w1))
    print("w2:\n", sess.run(w2))
    print("\n")

    # 3.2 Train the model
    STEPS = 3000
    for i in range(STEPS):
        # 3.2.1 Compute this round's cursor into the dataset
        start = (i * BATCH_SIZE) % 32
        end = start + BATCH_SIZE
        # 3.2.2 Feed the batch and run one training step
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y[start:end]})
        # 3.2.3 Print the loss on the full dataset every 500 rounds
        if i % 500 == 0:
            total_loss = sess.run(loss, feed_dict={x: X, y_: Y})
            print("After %d training step(s), loss on all data is %g" % (i, total_loss))

    print("w1:\n", sess.run(w1))
    print("w2:\n", sess.run(w2))

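Building a neural network: preparation, forward propagation, backpropagation, iteration

Preparation

Import the required libraries
Define constants
Generate the dataset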
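
The preparation stage can be sketched on its own with NumPy. The constants follow the program in these notes (seed 23455, 32 samples, batch size 8), and the labeling rule marks a sample as 1 when its two features sum to less than 1. The names `DATASET_SIZE` and `SEED` are introduced here only for readability and are not in the original.

```python
import numpy as np

BATCH_SIZE = 8
DATASET_SIZE = 32   # hypothetical name; the original hard-codes 32
SEED = 23455

rng = np.random.RandomState(SEED)
X = rng.rand(DATASET_SIZE, 2)                  # features drawn uniformly from [0, 1)
Y = [[int(x0 + x1 < 1)] for (x0, x1) in X]     # label 1 if x0 + x1 < 1, else 0

print(X.shape, len(Y))  # (32, 2) 32
```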

Forward propagation: define the input, parameters, and output

x=
y_=

w1=
w2=

a=
y=

Backpropagation: define the loss function and the backpropagation method

loss=
train_step=
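
In TensorFlow the optimizer derives the gradients automatically. To make explicit what minimizing the loss involves for this network, here is a NumPy sketch of the mean squared error, its gradients with respect to `w1` and `w2`, and one gradient-descent update. The function names `mse_loss` and `gradients`, the seed, and the sample data are assumptions for illustration; this is not TensorFlow's internal implementation.

```python
import numpy as np


def mse_loss(x, y_true, w1, w2):
    """Mean squared error of the 2-3-1 linear network."""
    y = x @ w1 @ w2
    return np.mean((y - y_true) ** 2)


def gradients(x, y_true, w1, w2):
    """Analytic gradients of the MSE with respect to w1 and w2."""
    a = x @ w1
    y = a @ w2
    dy = 2.0 * (y - y_true) / y.size   # d(loss)/dy, shape (N, 1)
    grad_w2 = a.T @ dy                 # shape (3, 1)
    grad_w1 = x.T @ (dy @ w2.T)        # shape (2, 3)
    return grad_w1, grad_w2


rng = np.random.RandomState(0)         # arbitrary seed for this sketch
x = rng.rand(8, 2)
y_true = (x.sum(axis=1, keepdims=True) < 1).astype(float)
w1 = rng.randn(2, 3)
w2 = rng.randn(3, 1)

learning_rate = 0.001
grad_w1, grad_w2 = gradients(x, y_true, w1, w2)
w1_new = w1 - learning_rate * grad_w1  # one gradient-descent step
w2_new = w2 - learning_rate * grad_w2

print(mse_loss(x, y_true, w1, w2), mse_loss(x, y_true, w1_new, w2_new))
```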

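Iteration: create a session and train for STEPS rounds

# 3. Create a session and train for STEPS rounds
with tf.Session() as sess:
    # 3.1 Initialize the parameters
    init_op = tf.global_variables_initializer()
    sess.run(init_op)

    # 3.2 Train the model
    STEPS = 3000
    for i in range(STEPS):
        # 3.2.1 Compute this round's cursor into the dataset
        start = (i * BATCH_SIZE) % 32
        end = start + BATCH_SIZE
        # 3.2.2 Feed the batch and run one training step
        sess.run(train_step, feed_dict={x: X[start:end], y_: Y[start:end]})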
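
The modulo in the cursor computation makes the loop cycle through the 32 samples in consecutive batches of `BATCH_SIZE`. A small pure-Python sketch prints the slice bounds used in the first few steps; the helper name `batch_bounds` and its `dataset_size` parameter are hypothetical.

```python
BATCH_SIZE = 8


def batch_bounds(step, batch_size=BATCH_SIZE, dataset_size=32):
    """Return the (start, end) slice bounds used at a given training step."""
    start = (step * batch_size) % dataset_size
    end = start + batch_size
    return start, end


for i in range(5):
    print(i, batch_bounds(i))
# 0 (0, 8)
# 1 (8, 16)
# 2 (16, 24)
# 3 (24, 32)
# 4 (0, 8)
```

Because 32 is a multiple of 8, every batch is full and the fifth step wraps back to the start of the dataset.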

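Summary

Building the neural network breaks down into four parts:

1. Preparation: set up the dataset and the other constants.
2. Forward propagation: define the network structure.
3. Backpropagation: optimize the network parameters.
4. Iteration: repeat steps 2 and 3 to keep optimizing the two parameters, w1 and w2.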