TensorFlow Neural Network Framework

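Earlier we covered the basics of neural networks, but implementing one from scratch is a large and time-consuming job, so we take a shortcut: a neural network framework. I think of a framework as a toolkit. Suppose we want to build a vehicle (a neural network model). Depending on whether it is a car or a bus (CNN, RNN, ...), our own job is to assemble tires, steering wheels and engines of various models (hidden layers, activation functions, gradient descent and so on) according to a blueprint (the network architecture). The framework is the warehouse that stocks those tires, steering wheels and engines. We take parts from the warehouse when we need them, instead of mining iron ore and smelting steel every time we build a vehicle. This is why many companies have released their own neural network frameworks. Some common frameworks are listed below.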

Tool	Main maintainer (person or group)	Supported languages	Supported systems
Caffe	Berkeley Vision and Learning Center, UC Berkeley	C++, Python, MATLAB	Linux, Mac OS X, Windows
Deeplearning4j	Skymind	Java, Scala, Clojure	Linux, Windows, Mac OS X, Android
Microsoft Cognitive Toolkit (CNTK)	Microsoft Research	Python, C++, BrainScript	Linux, Windows
MXNet	Distributed Machine Learning Community (DMLC)	C++, Python, Julia, MATLAB, Go, R, Scala	Linux, Mac OS X, Windows, Android, iOS
PaddlePaddle	Baidu	C++, Python	Linux, Mac OS X
TensorFlow	Google	C++, Python	Linux, Mac OS X, Android, iOS
Theano	Université de Montréal	Python	Linux, Mac OS X, Windows
Torch	Facebook, Twitter, Google	Lua, LuaJIT, C	Linux, Mac OS X, Windows, Android, iOS
PyTorch	Adam Paszke, Sam Gross, et al.	Python	Linux, Mac OS X

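Different frameworks have different strengths. Because these are open-source projects, how actively a framework is maintained weighs heavily in the choice. At the time of writing, TensorFlow was the most visited of these frameworks on GitHub and the most active in the community, so we use TensorFlow to implement neural networks.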

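The order is as follows: first the basic building blocks of a neural network and its visualization, then convolutional and recurrent neural networks. Most of the content is an explanation of code.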

Example -> hand-built three-layer neural network -> neural network visualization -> convolutional neural network -> recurrent neural network

We start with a small example to get a feel for neural networks: learning a simple linear model, y = 0.1x + 0.3, by gradient descent.

Code 1

import tensorflow as tf
import numpy as np

# --------------- 1. Create the dataset ----------------------
# create data
# x_data is 100 random numbers in [0, 1) generated with NumPy. np.random is
# NumPy's random module, rand() returns an ndarray, and astype() converts
# the values to 32-bit floats.
x_data = np.random.rand(100).astype(np.float32)
y_data = x_data*0.1+0.3

# ------------- 2. Define the network structure --------------
# ..................... 2.1 .....................
# create tensorflow structure start ##
# Weight: a tf.Variable of shape [1], initialized uniformly between -1 and 1.
Weights = tf.Variable(tf.random_uniform([1],-1.0,1.0))
# Bias: shape [1], initialized to 0.
biases = tf.Variable(tf.zeros([1]))

# ..................... 2.2 .....................
# Define the model
y = Weights*x_data+biases

# ..................... 2.3 .....................
# Loss (cost) function: mean squared error
loss = tf.reduce_mean(tf.square(y-y_data))
# Optimizer: gradient descent with learning rate 0.5
optimizer = tf.train.GradientDescentOptimizer(0.5)
# A training step minimizes the loss with the optimizer.
train = optimizer.minimize(loss)
# Initialize all global variables
init = tf.global_variables_initializer()
# create tensorflow structure end ###

# ---------- 3. Train the model -----------
# Create a session and use it to run the model
sess = tf.Session()
sess.run(init)
# Train for 201 steps
for step in range(201):
    sess.run(train)
    if step % 20 == 0:
        print(step,sess.run(Weights),sess.run(biases))
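
What the optimizer in Code 1 does can be imitated without TensorFlow. The following is a minimal sketch, assuming plain full-batch gradient descent on the mean squared error. The function name `fit_line` and the seeds are hypothetical choices for illustration, and the gradients are written out by hand instead of being derived automatically as TensorFlow does.

```python
import random

def fit_line(xs, ys, lr=0.5, steps=201, seed=0):
    """Fit y = w*x + b by full-batch gradient descent on the mean squared error."""
    rng = random.Random(seed)
    w = rng.uniform(-1.0, 1.0)   # same initial range as tf.random_uniform([1], -1.0, 1.0)
    b = 0.0                      # same initial value as tf.zeros([1])
    n = len(xs)
    for _ in range(steps):
        errors = [w * x + b - y for x, y in zip(xs, ys)]
        # Derivatives of mean((w*x + b - y)^2) with respect to w and b
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

data_rng = random.Random(1)
xs = [data_rng.random() for _ in range(100)]
ys = [0.1 * x + 0.3 for x in xs]
w, b = fit_line(xs, ys)
print(w, b)  # w should approach 0.1 and b should approach 0.3
```

The learning rate and step count are the ones used in Code 1. The update rule is the only part that TensorFlow hides behind `optimizer.minimize(loss)`.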

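TensorFlow is the neural network framework open-sourced by Google. Its computation model is the computation graph, its data model is the tensor, and its execution model is the session.

(1) Framework

Variables

The code below illustrates TensorFlow's data model.

Code 2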

import tensorflow as tf

# Variable exercise
# A tf variable
state = tf.Variable(0,name='counter')
# A tf constant
one = tf.constant(1)

# Define the addition
new_value = tf.add(state , one)
# Update state: assign the result of each run back to state
update = tf.assign(state,new_value)

# The type of each object
print("type state:",type(state))
print(state)
print("\ntype one:",type(one))
print(one)
print("\ntype new_value",type(new_value))
print(new_value)
print("\ntype update",type(update))
print(update)

# must have if define variable
init = tf.global_variables_initializer()

# Run through a session
with tf.Session() as sess:
    sess.run(init)
    print(sess.run(state))
    print(sess.run(one))
    print(sess.run(update))
    # Prints a separator line
    print('-----分割线-----'*5)
with tf.Session() as sess:
    sess.run(init)
    for _ in range(3):
        print(sess.run(update))
        print(sess.run(state))

Result

type state: <class 'tensorflow.python.ops.variables.Variable'>
<tf.Variable 'counter:0' shape=() dtype=int32_ref>

type one: <class 'tensorflow.python.framework.ops.Tensor'>
Tensor("Const:0", shape=(), dtype=int32)

type new_value <class 'tensorflow.python.framework.ops.Tensor'>
Tensor("Add:0", shape=(), dtype=int32)

type update <class 'tensorflow.python.framework.ops.Tensor'>
Tensor("Assign:0", shape=(), dtype=int32_ref)
0
1
1
-----分割线----------分割线----------分割线----------分割线----------分割线-----
2020-01-21 11:57:01.826930: I C:\tf_jenkins\home\workspace\rel-win\M\windows\PY\36\tensorflow\core\platform\cpu_feature_guard.cc:137] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
1
1
2
2
3
3

Process finished with exit code 0

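The result shows that state is a Variable, while one, new_value and update are all Tensors. The tensor is TensorFlow's data model. Printing these objects only shows their description. To obtain actual values they must be run in a session, and if any variables are defined, all of them must be initialized (init) before running.

Session

Variables, constants and operations in a network are not computed at the point where they are written. They must be defined first and then run through a session.

Code 2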

import tensorflow as tf

# Define two matrices: the first is 1x2 (one row, two columns), the second is 2x1.
matrix1 = tf.constant([[3,3]])
matrix2 = tf.constant([[2],[2]])

# matrix multiply, equivalent to np.dot(m1,m2)
product = tf.matmul(matrix1,matrix2)

# method 1
# sess = tf.Session()
# result = sess.run(product)
# print(result)
# sess.close()

# method 2
with tf.Session() as sess:
    result2 = sess.run(product)
    print(result2)
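
The define-first, run-later idea behind sessions can be imitated in a few lines of plain Python. This is only a toy sketch under that assumption: `Node`, `constant`, `matmul` and `run` are hypothetical helpers written for illustration, not TensorFlow APIs, and real graphs carry far more information.

```python
class Node:
    """A graph node: stores an operation and its inputs, computes nothing yet."""
    def __init__(self, op, inputs=(), value=None):
        self.op = op
        self.inputs = inputs
        self.value = value

def constant(value):
    return Node("const", value=value)

def matmul(a, b):
    return Node("matmul", inputs=(a, b))

def run(node):
    """Plays the role of Session.run: evaluate a node by evaluating its inputs first."""
    if node.op == "const":
        return node.value
    if node.op == "matmul":
        left, right = (run(n) for n in node.inputs)
        # Row-by-column products; zip(*right) iterates over the columns of right
        return [[sum(l * r for l, r in zip(row, col)) for col in zip(*right)]
                for row in left]
    raise ValueError("unknown op: " + node.op)

matrix1 = constant([[3, 3]])
matrix2 = constant([[2], [2]])
product = matmul(matrix1, matrix2)   # only builds the graph, nothing is multiplied
result = run(product)                # the multiplication happens here
print(result)
```

Building `product` only records what should be computed, which is why printing a tensor in TensorFlow shows a description rather than a number.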

(2) Basics

placeholder() and feed_dict

placeholder source code

def placeholder(dtype, shape=None, name=None):
  """Inserts a placeholder for a tensor that will be always fed.

  **Important**: This tensor will produce an error if evaluated. Its value must
  be fed using the `feed_dict` optional argument to `Session.run()`,
  `Tensor.eval()`, or `Operation.run()`.

  For example:

  '''python
  x = tf.placeholder(tf.float32, shape=(1024, 1024))
  y = tf.matmul(x, x)

  with tf.Session() as sess:
    print(sess.run(y))  # ERROR: will fail because x was not fed.

    rand_array = np.random.rand(1024, 1024)
    print(sess.run(y, feed_dict={x: rand_array}))  # Will succeed.
  '''

  Args:
    dtype: The type of elements in the tensor to be fed.
    shape: The shape of the tensor to be fed (optional). If the shape is not
      specified, you can feed a tensor of any shape.
    name: A name for the operation (optional).

  Returns:
    A `Tensor` that may be used as a handle for feeding a value, but not
    evaluated directly.
  """
  return gen_array_ops._placeholder(dtype=dtype, shape=shape, name=name)

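placeholder is TensorFlow's way of reserving a slot for data. With it we do not need to define the training data up front. We can concentrate on the computation, define it first, and feed in the training data at the end. As the source shows, a placeholder must be paired with feed_dict: the placeholder reserves the slot in advance, and feed_dict supplies the actual data at run time.

Code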

import tensorflow as tf

# placeholder()
input1 = tf.placeholder(tf.float32)
input2 = tf.placeholder(tf.float32)

output = tf.multiply(input1,input2)

with tf.Session() as sess:
    print(sess.run(output,feed_dict={input1:[7.],input2:[2.]}))

This defines a multiplication through placeholders.
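
The pairing of placeholder and feed_dict can be sketched in plain Python. This is a toy model under stated assumptions: the `Placeholder` and `Multiply` classes and the `run` function are hypothetical stand-ins, not TensorFlow's implementation. The point is only that an unfed placeholder cannot be evaluated.

```python
class Placeholder:
    """A named slot that has no value of its own."""
    def __init__(self, name):
        self.name = name

class Multiply:
    """Element-wise product of two nodes, evaluated only inside run()."""
    def __init__(self, a, b):
        self.a, self.b = a, b

def run(node, feed_dict=None):
    feed_dict = feed_dict or {}
    if isinstance(node, Placeholder):
        if node not in feed_dict:
            raise ValueError("placeholder %r was not fed" % node.name)
        return feed_dict[node]
    if isinstance(node, Multiply):
        left = run(node.a, feed_dict)
        right = run(node.b, feed_dict)
        return [x * y for x, y in zip(left, right)]
    raise TypeError("unsupported node")

input1 = Placeholder("input1")
input2 = Placeholder("input2")
output = Multiply(input1, input2)
fed = run(output, feed_dict={input1: [7.0], input2: [2.0]})
print(fed)
```

Calling `run(output)` without a `feed_dict` raises an error in this sketch, which mirrors the warning in the docstring above.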

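Activation functions

Turning a linear model into a nonlinear one requires a nonlinear transformation, which is obtained by combining an affine transformation with an activation function (for details see Part 1, neural network basics).
The functions built into TensorFlow's tf.nn module are listed below. Besides the activation functions, the list also contains convolution, pooling, normalization, loss and RNN ops: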

@@relu
@@relu6
@@crelu
@@elu
@@leaky_relu
@@selu
@@softplus
@@softsign
@@dropout
@@bias_add
@@sigmoid
@@log_sigmoid
@@tanh
@@convolution
@@conv2d
@@depthwise_conv2d
@@depthwise_conv2d_native
@@separable_conv2d
@@atrous_conv2d
@@atrous_conv2d_transpose
@@conv2d_transpose
@@conv1d
@@conv3d
@@conv3d_transpose
@@conv2d_backprop_filter
@@conv2d_backprop_input
@@conv3d_backprop_filter_v2
@@depthwise_conv2d_native_backprop_filter
@@depthwise_conv2d_native_backprop_input
@@avg_pool
@@max_pool
@@max_pool_with_argmax
@@avg_pool3d
@@max_pool3d
@@fractional_avg_pool
@@fractional_max_pool
@@pool
@@dilation2d
@@erosion2d
@@with_space_to_batch
@@l2_normalize
@@local_response_normalization
@@sufficient_statistics
@@normalize_moments
@@moments
@@weighted_moments
@@fused_batch_norm
@@batch_normalization
@@batch_norm_with_global_normalization
@@l2_loss
@@log_poisson_loss
@@sigmoid_cross_entropy_with_logits
@@softmax
@@log_softmax
@@softmax_cross_entropy_with_logits
@@sparse_softmax_cross_entropy_with_logits
@@weighted_cross_entropy_with_logits
@@embedding_lookup
@@embedding_lookup_sparse
@@dynamic_rnn
@@bidirectional_dynamic_rnn
@@raw_rnn
@@static_rnn
@@static_state_saving_rnn
@@static_bidirectional_rnn
@@ctc_loss
@@ctc_greedy_decoder
@@ctc_beam_search_decoder
@@top_k
@@in_top_k
@@nce_loss
@@sampled_softmax_loss
@@uniform_candidate_sampler
@@log_uniform_candidate_sampler
@@learned_unigram_candidate_sampler
@@fixed_unigram_candidate_sampler
@@compute_accidental_hits
@@quantized_conv2d
@@quantized_relu_x
@@quantized_max_pool
@@q

The commonly used activation functions among them are relu, sigmoid, softplus and tanh.
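
For reference, the four functions named above have short closed forms. The sketch below writes them for single numbers with the standard library. These are the textbook definitions, not TensorFlow's implementations, which operate on whole tensors and are numerically more careful for large inputs.

```python
import math

def relu(x):
    """max(0, x)"""
    return max(0.0, x)

def sigmoid(x):
    """1 / (1 + e^-x), squashes to (0, 1)"""
    return 1.0 / (1.0 + math.exp(-x))

def softplus(x):
    """log(1 + e^x), a smooth version of relu"""
    return math.log(1.0 + math.exp(x))

def tanh(x):
    """Hyperbolic tangent, squashes to (-1, 1)"""
    return math.tanh(x)

for name, fn in [("relu", relu), ("sigmoid", sigmoid),
                 ("softplus", softplus), ("tanh", tanh)]:
    print(name, [round(fn(x), 4) for x in (-2.0, 0.0, 2.0)])
```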

(3) Neural network methods

1. First, build a simple three-layer neural network by hand

Code 3.1

import tensorflow as tf
import numpy as np

# -----------------------1--------------------------------
# Define a function that adds a layer
def add_layer(inputs,in_size,out_size,activation_function=None):
    # Weight matrix variable
    Weight = tf.Variable(tf.random_normal([in_size,out_size]),name='W')
    # A nonzero initial bias is recommended
    biases = tf.Variable(tf.zeros([1,out_size])+0.1,name='b')
    Wx_plus_b = tf.matmul(inputs,Weight)+biases
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs

#------------------------2------------------------------------
# Create the training set
x_data = np.linspace(-1,1,300)[:,np.newaxis]
# Noise with mean 0 and standard deviation 0.05
noise = np.random.normal(0,0.05,x_data.shape)
y_data = np.square(x_data)-0.5+noise

#-----------------------3----------------------------------------
# Input layer: one unit per input feature
xs = tf.placeholder(tf.float32,[None,1],name='x_input')
ys = tf.placeholder(tf.float32,[None,1],name='y_input')
# Hidden layer with 10 units
l1 = add_layer(xs,1,10,activation_function=tf.nn.relu)
# Output layer
predition = add_layer(l1,10,1,activation_function=None)

#-----------------------4-----------------------------------------
# Loss function
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys-predition),reduction_indices=[1]))

#-----------------------5------------------------------------------
# Train with gradient descent
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)  # learning rate 0.1; a value below 1 is fine
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

for i in range(1000):
    sess.run(train_step,feed_dict={xs:x_data,ys:y_data})
    if i%50 == 0:
        print(sess.run(loss,feed_dict={xs:x_data,ys:y_data}))

Output

0.15612674
0.01370755
0.011248722
0.009909462
0.008916998
0.00815641
0.007569391
0.0069189603
0.00624108
0.0056113983
0.005094583
0.004754079
0.004478738
0.004224704
0.004011159
0.0038203541
0.0036806199
0.0035679673
0.0034924054
0.0034372727

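Part 1 defines a function that adds a layer. Part 2 builds a training set by hand with NumPy. Part 3 defines the layers: input, hidden and output. Part 4 defines the cost function, the mean squared error. Part 5 trains the model by minimizing the cost with gradient descent. The output shows the error shrinking step by step, which is exactly the model learning.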
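
The computation inside add_layer, a matrix product plus a bias followed by an optional activation, can be traced by hand on tiny inputs. The sketch below assumes nested Python lists instead of tensors. `dense_forward` and the hand-picked weights are hypothetical and chosen only so the arithmetic is easy to follow.

```python
def dense_forward(inputs, weights, biases, activation=None):
    """inputs: [n_samples][in_size], weights: [in_size][out_size], biases: [out_size]."""
    outputs = []
    for row in inputs:
        # One weighted sum per output unit; zip(*weights) iterates over weight columns
        z = [sum(x * w for x, w in zip(row, col)) + b
             for col, b in zip(zip(*weights), biases)]
        outputs.append([activation(v) for v in z] if activation else z)
    return outputs

def relu(v):
    return max(0.0, v)

# Hypothetical parameters: 1 input -> 2 hidden units (relu) -> 1 linear output.
hidden = dense_forward([[1.0], [-1.0]], [[1.0, -1.0]], [0.0, 0.0], activation=relu)
prediction = dense_forward(hidden, [[1.0], [1.0]], [-0.5])
print(hidden)
print(prediction)
```

With these weights the two hidden units compute relu(x) and relu(-x), so the output is |x| - 0.5. This illustrates how a relu hidden layer builds a piecewise-linear approximation to a curve such as x^2 - 0.5.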

Code 3.2 Visualization (1)
Building on the code above, visualize the training process with matplotlib.

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# -----------------------1--------------------------------
# Define a function that adds a layer
def add_layer(inputs,in_size,out_size,activation_function=None):
    # Weight matrix variable
    Weight = tf.Variable(tf.random_normal([in_size,out_size]),name='W')
    # A nonzero initial bias is recommended
    biases = tf.Variable(tf.zeros([1,out_size])+0.1,name='b')
    Wx_plus_b = tf.matmul(inputs,Weight)+biases
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs

#------------------------2------------------------------------
# Create the training set
x_data = np.linspace(-1,1,300)[:,np.newaxis]
# Noise with mean 0 and standard deviation 0.05
noise = np.random.normal(0,0.05,x_data.shape)
y_data = np.square(x_data)-0.5+noise

#-----------------------3----------------------------------------
# Input layer: one unit per input feature
xs = tf.placeholder(tf.float32,[None,1],name='x_input')
ys = tf.placeholder(tf.float32,[None,1],name='y_input')
# Hidden layer with 10 units
l1 = add_layer(xs,1,10,activation_function=tf.nn.relu)
# Output layer
predition = add_layer(l1,10,1,activation_function=None)

#-----------------------4-----------------------------------------
# Loss function
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys-predition),reduction_indices=[1]))

#-----------------------5------------------------------------------
# Train with gradient descent
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)  # learning rate 0.1; a value below 1 is fine
init = tf.global_variables_initializer()
sess = tf.Session()
sess.run(init)

# Create a figure
fig = plt.figure()
# Add a single subplot
ax = fig.add_subplot(1,1,1)
ax.scatter(x_data,y_data)
plt.ion()
for i in range(1000):
    sess.run(train_step,feed_dict={xs:x_data,ys:y_data})
    if i%50 == 0:
        print(sess.run(loss,feed_dict={xs:x_data,ys:y_data}))
        # Remove the previously drawn fit (there is none on the first pass)
        try:
            lines[0].remove()
        except Exception:
            pass
        predition_value = sess.run(predition,feed_dict={xs:x_data})
        lines = ax.plot(x_data,predition_value,'r-',lw=5)
        plt.pause(0.1)
plt.ioff()
plt.show()

Code 3.3 Visualization (2)
Use matplotlib's animation module to create an animation and save it.

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import animation

# -----------------------1--------------------------------
# Define a function that adds a layer
def add_layer(inputs, in_size, out_size, activation_function=None):
    global Weight,biases
    Weight = tf.Variable(tf.random_normal([in_size, out_size]), name='W')
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1, name='b')
    Wx_plus_b = tf.matmul(inputs, Weight) + biases
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b)
    return outputs

# ------------------------2------------------------------------
# Create the training set
x_data = np.linspace(-1, 1, 300)[:,np.newaxis]
noise = np.random.normal(0, 0.05, x_data.shape)
y_data = np.square(x_data) - 0.5 + noise

# -----------------------3----------------------------------------
# Input layer: one unit per input feature
xs = tf.placeholder(tf.float32, [None, 1], name='x_input')
ys = tf.placeholder(tf.float32, [None, 1], name='y_input')
# Hidden layer
l1 = add_layer(xs, 1, 10, activation_function=tf.nn.relu)
# Output layer
predition = add_layer(l1, 10, 1, activation_function=None)

# -----------------------4-----------------------------------------
# Loss function
loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys - predition), reduction_indices=[1]))

# ----------------------5------------------------------------------
# Train with gradient descent
train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

# Loss and prediction recorded every 50 steps, used as animation frames
c_trace = []
activation_trace = []

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for i in range(1000):
        sess.run(train_step,feed_dict={xs:x_data,ys:y_data})
        if i%50 == 0:
            c_tmp = sess.run(loss,feed_dict={xs:x_data,ys:y_data})
            activation_tmp = sess.run(predition,feed_dict={xs:x_data,ys:y_data})
            c_trace.append(c_tmp)
            activation_trace.append(activation_tmp)

fig,ax = plt.subplots()
l11 = ax.scatter(x_data,y_data,color='red',label=r'$Original\ data$')
ax.set_xlabel(r'$X\ data$')
ax.set_ylabel(r'$y\ data$')

def update(i):
    # Remove the line drawn for the previous frame, if any
    try:
        ax.lines[0].remove()
    except Exception:
        pass
    line, = ax.plot(x_data,activation_trace[i],'g--',label=r'$Fitting\ line$',lw=2)
    return line,

ani = animation.FuncAnimation(fig,update,frames=len(activation_trace),interval=100)
ani.save('linearregression.gif',writer='imagemagick')
plt.show()

After the run finishes, a file named linearregression.gif is saved in the same directory as the .py file.

Result figure

Code 3.4 Visualization (3)
Visualize the neural network with the TensorBoard tool.

'''
1. Define the activation function
2. Add layers
3. Weights and biases
Three-layer neural network
'''
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

def add_layer(inputs,in_size,out_size,n_layer,activation_function=None):
    layer_name = 'layer%s'%n_layer
    with tf.name_scope('layer'):
        with tf.name_scope('weight'):
            Weight = tf.Variable(tf.random_normal([in_size,out_size]),name='W')
            tf.summary.histogram(layer_name+'/weights',Weight)
        with tf.name_scope('biases'):
            biases = tf.Variable(tf.zeros([1,out_size])+ 0.1,name='b')
            tf.summary.histogram(layer_name + '/biases', biases)
        with tf.name_scope('Wx_plus_b'):
            Wx_plus_b = tf.matmul(inputs,Weight)+biases
        if activation_function is None:
            outputs = Wx_plus_b
            tf.summary.histogram(layer_name + '/outputs', outputs)
        else:
            outputs = activation_function(Wx_plus_b)
        return outputs

# Training set
x_data = np.linspace(-1,1,300)[:,np.newaxis]
noise = np.random.normal(0,0.05,x_data.shape)
y_data = np.square(x_data)-0.5+noise

# Input layer: one unit per input feature
with tf.name_scope('inputs'):
    xs = tf.placeholder(tf.float32,[None,1],name='x_input')
    ys = tf.placeholder(tf.float32,[None,1],name='y_input')
# Hidden layer
l1 = add_layer(xs,1,10,n_layer=1,activation_function=tf.nn.relu)
# Output layer
predition = add_layer(l1,10,1,n_layer=2,activation_function=None)

# Loss function
with tf.name_scope('loss'):
    loss = tf.reduce_mean(tf.reduce_sum(tf.square(ys-predition),reduction_indices=[1]))
    tf.summary.scalar('loss',loss)

with tf.name_scope('train'):
    train_step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)  # learning rate, below 1

init = tf.global_variables_initializer()
sess = tf.Session()
# Merge all summaries and write them, together with the graph, to tmp/
merged = tf.summary.merge_all()
writer = tf.summary.FileWriter('tmp/',sess.graph)
sess.run(init)

for i in range(1000):
    sess.run(train_step,feed_dict={xs:x_data,ys:y_data})
    if i%50 == 0:
        result = sess.run(merged,feed_dict={xs:x_data,ys:y_data})
        writer.add_summary(result,i)

After running the code above, a file named events.out.tfevents… appears in the tmp folder.
Then enter the following in a terminal window:

(tensorflow) E:\untitled\venv\untitled\venv\tensorflow>tensorboard --logdir=tmp/

A link like the following is then printed:
http://DESKTOP-MOE4SA6:6006
Copy the link into a browser to see the visualization of the network.

Note
Make sure the TensorBoard and TensorFlow versions match. With the newest versions errors may occur. A screenshot of the visualization follows.

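2. Overfitting

(1) Theory

Underfitting: the model is too simple, so the fitted function cannot capture the training set and the error is large.

Overfitting: the model hypothesis is too complex, with too many parameters and too little training data, so the fitted function has a very small error on the training set but a very large error on the test set. Its ability to generalize is poor.

Remedies:

1. Reduce the feature dimension and add training data
2. Regularize, which keeps parameter values small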
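
The second remedy can be made concrete with a one-parameter model. The sketch below assumes an L2 penalty `lam * w**2` added to the mean squared error. The data, the function names and the value of `lam` are hypothetical and only serve to show that the penalty pulls the fitted weight toward zero.

```python
def mse(w, xs, ys):
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

def l2_regularized_loss(w, xs, ys, lam):
    """Data loss plus lam * w**2; lam is a hypothetical regularization strength."""
    return mse(w, xs, ys) + lam * w ** 2

def fit(xs, ys, lam, lr=0.1, steps=500):
    """Gradient descent on the regularized loss for the model y = w * x."""
    w = 0.0
    n = len(xs)
    for _ in range(steps):
        grad = 2.0 / n * sum((w * x - y) * x for x, y in zip(xs, ys)) + 2.0 * lam * w
        w -= lr * grad
    return w

xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w_plain = fit(xs, ys, lam=0.0)   # no penalty: recovers the slope of the data
w_reg = fit(xs, ys, lam=1.0)     # with penalty: a smaller weight
print(w_plain, w_reg)
```

A larger `lam` trades a worse fit on the training data for smaller parameters, which is the mechanism that limits overfitting in models with many weights.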

In Figure 1, the first plot shows underfitting, the second is "just right", and the third shows overfitting.

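(2) TensorFlow

Dropout means that during the training of a deep network, each unit is temporarily dropped from the network with some probability. With stochastic gradient descent, because units are dropped at random, every mini-batch effectively trains a different network.
The mathematical model of dropout is:

Network without dropout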

z_i^{(l+1)} = W_i^{(l+1)} y^{(l)} + b_i^{(l+1)}, \quad y_i^{(l+1)} = f(z_i^{(l+1)})

Network with dropout

r_j^{(l)} \sim \mathrm{Bernoulli}(p)
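
The Bernoulli mask can be sketched in a few lines. In this sketch each activation is multiplied by its own r drawn with keep probability `keep_prob`, and the survivors are divided by `keep_prob` so the expected value is unchanged. That rescaling is the "inverted dropout" convention and is an assumption added here, not part of the formula above. The function name and the sample activations are hypothetical.

```python
import random

def dropout(values, keep_prob, rng):
    """Keep each unit with probability keep_prob and rescale the survivors."""
    outputs = []
    for v in values:
        r = 1.0 if rng.random() < keep_prob else 0.0   # r ~ Bernoulli(keep_prob)
        outputs.append(r * v / keep_prob)
    return outputs

rng = random.Random(0)
activations = [0.5, 1.0, 1.5, 2.0]
dropped = dropout(activations, keep_prob=0.5, rng=rng)
untouched = dropout(activations, keep_prob=1.0, rng=rng)
print(dropped)     # each entry is either 0.0 or twice the original value
print(untouched)   # keep_prob = 1.0 leaves the activations unchanged
```

Setting `keep_prob` to 1.0 switches dropout off, which is the setting normally used when evaluating a trained network.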

Code

# dropout to address overfitting
import tensorflow as tf
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelBinarizer

# load data
digits = load_digits()
X = digits.data
y = digits.target
y = LabelBinarizer().fit_transform(y)
X_train,X_test,y_train,y_test = train_test_split(X,y,test_size=.3)

def add_layer(inputs,in_size,out_size,layer_name,activation_function=None,):
    # add one more layer and return the output of this layer
    Weights = tf.Variable(tf.random_normal([in_size,out_size]))
    biases = tf.Variable(tf.zeros([1,out_size])+0.1,)
    Wx_plus_b = tf.matmul(inputs,Weights)+biases
    # Note: this listing does not apply dropout yet. One way to add it would be
    # tf.nn.dropout(Wx_plus_b, keep_prob) with a keep_prob placeholder, as in the
    # CNN code later in this article.
    if activation_function is None:
        outputs = Wx_plus_b
    else:
        outputs = activation_function(Wx_plus_b,)
    tf.summary.histogram(layer_name + '/outputs',outputs)
    return outputs

# define placeholder for inputs to network
xs = tf.placeholder(tf.float32,[None,64]) # 8x8
ys = tf.placeholder(tf.float32,[None,10])

# add hidden layer and output layer
l1 = add_layer(xs,64,100,'l1',activation_function=tf.nn.tanh)
prediction = add_layer(l1,100,10,'l2',activation_function=tf.nn.softmax)

# the loss between prediction and real data
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys*tf.log(prediction),reduction_indices=[1])) # loss
tf.summary.scalar('loss',cross_entropy)
train_step = tf.train.GradientDescentOptimizer(0.6).minimize(cross_entropy)

sess = tf.Session()
merged = tf.summary.merge_all()
# summary writer goes in here
train_writer = tf.summary.FileWriter('logs/train',sess.graph)
test_writer = tf.summary.FileWriter('logs/test',sess.graph)

sess.run(tf.global_variables_initializer())

for i in range(500):
    sess.run(train_step,feed_dict={xs:X_train,ys:y_train})
    if i%50 ==0:
        # record loss
        train_result = sess.run(merged,feed_dict={xs:X_train,ys:y_train})
        test_result = sess.run(merged,feed_dict={xs:X_test,ys:y_test})
        train_writer.add_summary(train_result,i)
        test_writer.add_summary(test_result,i)

After the run finishes, the results can be viewed in TensorBoard from a terminal.

3. Convolutional neural network

For the theory, see Part 1.

Code

# CNN: convolutional neural network
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data
import numpy as np
import matplotlib.pyplot as plt

tf.set_random_seed(1)
np.random.seed(1)

BATCH_SIZE = 50
LR = 0.001

# MNIST data: handwritten digits 0 to 9
mnist = input_data.read_data_sets('MNIST_data',one_hot=True)

def compute_accuracy(v_xs,v_ys):
    global prediction
    y_pre = sess.run(prediction,feed_dict={xs:v_xs,keep_prob:1})
    correct_prediction = tf.equal(tf.argmax(y_pre,1),tf.argmax(v_ys,1))
    accuary = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
    result = sess.run(accuary,feed_dict={xs:v_xs,ys:v_ys,keep_prob:1})
    return result

def weight_variable(shape):
    # Random initial values
    inital = tf.truncated_normal(shape,stddev=0.1)
    return tf.Variable(inital)

def bias_variable(shape):
    initial = tf.constant(0.1,shape=shape)
    return tf.Variable(initial)

# Convolution layer: x is the input, W the weights
def conv2d(x,W):
    # strides = [1, x_movement, y_movement, 1]
    # must have strides[0] = strides[3] = 1
    # padding is one of two modes, 'SAME' or 'VALID'
    return tf.nn.conv2d(x,W,strides=[1,1,1,1],padding='SAME')

def max_pool_2X2(x):
    return tf.nn.max_pool(x,ksize=[1,2,2,1],strides=[1,2,2,1],padding='SAME')

# define placeholder for inputs to network
# 28x28
xs = tf.placeholder(tf.float32,[None,784])
ys = tf.placeholder(tf.float32,[None,10])
keep_prob = tf.placeholder(tf.float32)
# The images are grayscale, so the last dimension (channels) is 1
x_image = tf.reshape(xs,[-1,28,28,1])
# print(x_image.shape) # [n_samples,28,28,1]

## conv1 layer ##
W_conv1 = weight_variable([5,5,1,32]) # patch 5x5, in size 1, out size 32
b_conv1 = bias_variable([32])
h_conv1 = tf.nn.relu(conv2d(x_image,W_conv1)+b_conv1) # output size 28x28x32
h_pool = max_pool_2X2(h_conv1)                  # output size 14x14x32

## conv2 layer ##
W_conv2 = weight_variable([5,5,32,64]) # patch 5x5, in size 32, out size 64
b_conv2 = bias_variable([64])
h_conv2 = tf.nn.relu(conv2d(h_pool,W_conv2)+b_conv2) # output size 14x14x64
h_poo2 = max_pool_2X2(h_conv2)                  # output size 7x7x64

## fc1 layer ##
W_fcl = weight_variable([7*7*64,1024])
b_fcl = bias_variable([1024])
# [n_samples,7,7,64] --> [n_samples,7*7*64]
h_poo2_flat = tf.reshape(h_poo2,[-1,7*7*64])
h_fc1 = tf.nn.relu(tf.matmul(h_poo2_flat,W_fcl)+b_fcl)
h_fc1_drop = tf.nn.dropout(h_fc1,keep_prob)

## fc2 layer ##
W_fc2 = weight_variable([1024,10])
b_fc2 = bias_variable([10])
prediction = tf.nn.softmax(tf.matmul(h_fc1_drop,W_fc2)+b_fc2)

# the error between prediction and real data
cross_entropy = tf.reduce_mean(-tf.reduce_sum(ys*tf.log(prediction),reduction_indices=[1])) # loss
train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)

sess = tf.Session()
# important step
sess.run(tf.global_variables_initializer())

for i in range(1000):
    batch_xs,batch_ys = mnist.train.next_batch(100)
    # keep_prob 1 keeps every unit, i.e. dropout is switched off here
    sess.run(train_step,feed_dict={xs:batch_xs,ys:batch_ys,keep_prob:1})
    if i % 50 ==0:
        print(compute_accuracy(mnist.test.images,mnist.test.labels))

Output

The accuracy is far higher than that of the simple neural network.
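
The layer sizes written in the code comments (28x28, 14x14, 7x7 and the 7*7*64 inputs of the first fully connected layer) follow from a simple rule: with padding='SAME' the spatial output size is the input size divided by the stride, rounded up. The sketch below traces that arithmetic. The helper name `same_padding_output` is hypothetical.

```python
import math

def same_padding_output(size, stride):
    """Spatial output size for padding='SAME': ceil(size / stride)."""
    return int(math.ceil(size / float(stride)))

size = 28
size = same_padding_output(size, 1)   # conv1, stride 1
after_conv1 = size
size = same_padding_output(size, 2)   # 2x2 max pool, stride 2
after_pool1 = size
size = same_padding_output(size, 1)   # conv2, stride 1
size = same_padding_output(size, 2)   # 2x2 max pool, stride 2
after_pool2 = size
flat_features = after_pool2 * after_pool2 * 64   # 64 channels after conv2
print(after_conv1, after_pool1, after_pool2, flat_features)
```

The last number is the first dimension of W_fcl, so changing the image size, the strides or the number of channels requires changing that weight shape as well.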

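3.1 Saving a neural network

With the approach shown here, only the variables (weights and biases) are saved, so the variables must be redefined with the same shape and type before restoring.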

# Saving and restoring a neural network
import tensorflow as tf
import numpy as np

## Save to file
# # remember to define the same dtype and shape when restoring
# W = tf.Variable([[1,2,3],[3,4,5]], dtype = tf.float32,name='weights')
# b = tf.Variable([[1,2,3]],dtype = tf.float32,name = 'biases')
#
# init = tf.global_variables_initializer()
#
# saver = tf.train.Saver()
# with tf.Session() as sess:
#     sess.run(init)
#     save_path = saver.save(sess,'my_net/save_net.ckpt')
#     print("Save to path:",save_path)

# restore variables
# redefine the same shape and same type for your variables
W = tf.Variable(np.arange(6).reshape((2,3)),dtype=tf.float32,name='weights')
b = tf.Variable(np.arange(3).reshape((1,3)),dtype=tf.float32,name='biases')

# no init step needed
saver = tf.train.Saver()
with tf.Session() as sess:
    saver.restore(sess,'my_net/save_net.ckpt')
    print('weights:',sess.run(W))
    print('biases:',sess.run(b))
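
Why the variables have to be redeclared before restoring can be seen in a toy checkpoint. The sketch below assumes a JSON file that stores only name-to-value pairs. It is not the format tf.train.Saver uses, and `save_variables` and `restore_variables` are hypothetical helpers.

```python
import json
import os
import tempfile

def save_variables(variables, path):
    """Write only name -> value pairs; nothing about the network structure is stored."""
    with open(path, "w") as f:
        json.dump(variables, f)

def restore_variables(variables, path):
    """Fill already-declared variables by name, checking that the shapes match."""
    with open(path) as f:
        saved = json.load(f)
    for name, value in variables.items():
        stored = saved[name]
        if (len(stored), len(stored[0])) != (len(value), len(value[0])):
            raise ValueError("shape mismatch for " + name)
        variables[name] = stored
    return variables

path = os.path.join(tempfile.mkdtemp(), "save_net.json")
save_variables({"weights": [[1, 2, 3], [3, 4, 5]], "biases": [[1, 2, 3]]}, path)

# Redeclare variables with the same names and shapes, then restore the values.
declared = {"weights": [[0, 0, 0], [0, 0, 0]], "biases": [[0, 0, 0]]}
restored = restore_variables(declared, path)
print(restored)
```

The names act as the lookup keys, which is why the restoring code above passes the same name arguments ('weights', 'biases') as the saving code.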

4. Recurrent neural network

4.1 RNN

Code

# Recurrent neural network: RNN with an LSTM cell
# RNN for classification
import tensorflow as tf
from tensorflow.examples.tutorials.mnist import input_data

# this is data
mnist = input_data.read_data_sets('MNIST_data',one_hot=True)

# hyperparameters
lr = 0.001
training_iters = 100000
batch_size = 128

n_inputs = 28 # MNIST data input (img shape: 28*28)
n_steps = 28 # time steps
n_hidden_unis = 128 # neurons in hidden layer
n_classes = 10 # MNIST classes (0-9 digits)

# tf Graph input
x = tf.placeholder(tf.float32,[None,n_steps,n_inputs])
y = tf.placeholder(tf.float32,[None,n_classes])

# define weights
weights = {
    # (28,128)
    'in': tf.Variable(tf.random_normal([n_inputs,n_hidden_unis])),
    # (128,10)
    'out': tf.Variable(tf.random_normal([n_hidden_unis,n_classes]))
}
biases = {
    # (128,)
    'in':tf.Variable(tf.constant(0.1,shape=[n_hidden_unis,])),
    # (10,)
    'out':tf.Variable(tf.constant(0.1,shape=[n_classes,]))
}

def RNN(X,weights,biases):
    # hidden layer for input to cell
    ############################
    # X (128 batch, 28 steps, 28 inputs)
    # ==> (128*28, 28 inputs)
    X = tf.reshape(X,[-1,n_inputs])
    # X_in ==> (128 batch * 28 steps, 128 hidden)
    X_in = tf.matmul(X,weights['in'])+biases['in']
    # X_in ==> (128 batch, 28 steps, 128 hidden)
    X_in = tf.reshape(X_in,[-1,n_steps,n_hidden_unis])

    # cell
    lstm_cell = tf.nn.rnn_cell.BasicLSTMCell(n_hidden_unis,forget_bias=1.0,state_is_tuple=True)
    # lstm cell is divided into two parts (c_state,m_state)
    _init_state = lstm_cell.zero_state(batch_size,dtype=tf.float32)
    outputs,states = tf.nn.dynamic_rnn(lstm_cell, X_in,initial_state=_init_state,time_major=False)

    # hidden layer for output as the final results
    results = tf.matmul(states[1],weights['out'])+biases['out']
    # method 2
    # or
    # unpack to list (batch,outputs)
    # outputs = tf.stack(tf.transpose(outputs,[1,0,2])) # states is the last outputs
    # results = tf.matmul(outputs[-1],weights['out'])+biases['out']
    return results

pred = RNN(x,weights,biases)
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y,logits=pred))
train_op = tf.train.AdamOptimizer(lr).minimize(cost)

correct_pred = tf.equal(tf.argmax(pred,1),tf.argmax(y,1))
accuracy = tf.reduce_mean(tf.cast(correct_pred,tf.float32))

init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    step = 0
    while step*batch_size < training_iters:
        batch_xs,batch_ys = mnist.train.next_batch(batch_size)
        # Each image becomes a sequence of 28 rows with 28 pixels each
        batch_xs = batch_xs.reshape([batch_size,n_steps,n_inputs])
        sess.run([train_op],feed_dict={x:batch_xs,y:batch_ys,})
        if step%20 == 0:
            print(sess.run(accuracy,feed_dict={x:batch_xs,y:batch_ys,}))
        step += 1
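
The idea of reading an image row by row and carrying a hidden state from one row to the next can be sketched without TensorFlow. The sketch below uses a plain tanh recurrence, which is simpler than the LSTM cell in the code above and is named here as a substitute, not as the same model. The weights, the stand-in image and the small hidden size are all hypothetical.

```python
import math

def rnn_last_state(sequence, w_in, w_rec, bias):
    """Run h_t = tanh(x_t W_in + h_{t-1} W_rec + b) over a sequence, return the last h."""
    hidden_size = len(bias)
    h = [0.0] * hidden_size
    for x_t in sequence:
        h = [math.tanh(sum(x * w_in[i][j] for i, x in enumerate(x_t))
                       + sum(p * w_rec[k][j] for k, p in enumerate(h))
                       + bias[j])
             for j in range(hidden_size)]
    return h

n_steps, n_inputs, n_hidden = 28, 28, 4   # the code above uses 128 hidden units
# Stand-in for one flattened 784-pixel MNIST image
flat_image = [((i * 7) % 10) / 10.0 for i in range(n_steps * n_inputs)]
# Reshape to 28 time steps of 28 inputs, as batch_xs.reshape does above
rows = [flat_image[t * n_inputs:(t + 1) * n_inputs] for t in range(n_steps)]
w_in = [[0.01 * ((i + j) % 3 - 1) for j in range(n_hidden)] for i in range(n_inputs)]
w_rec = [[0.1 if k == j else 0.0 for j in range(n_hidden)] for k in range(n_hidden)]
last_h = rnn_last_state(rows, w_in, w_rec, [0.0] * n_hidden)
print(len(rows), len(rows[0]), last_h)
```

Only the final hidden state is passed to the output layer, which corresponds to using states[1] in the RNN function above.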

5. Unsupervised learning

Code

# Autoencoder: unsupervised learning with a neural network
# An autoencoder compresses the input, decompresses it, and learns by
# comparing the reconstruction with the original
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data',one_hot = False)

# Visualize decoder setting
# Parameters
learning_rate = 0.01
training_epochs = 5
batch_size = 256
display_step = 1
examples_to_show = 10

# Network Parameters
n_input = 784 # MNIST data input (img shape: 28*28)

# tf Graph input (only pictures)
X = tf.placeholder('float',[None,n_input])

# hidden layer setting
n_hidden_1 = 256 # 1st layer num features
n_hidden_2 = 128 # 2nd layer num features
weight = {
    'encoder_h1':tf.Variable(tf.random_normal([n_input,n_hidden_1])),
    'encoder_h2':tf.Variable(tf.random_normal([n_hidden_1,n_hidden_2])),
    'decoder_h1':tf.Variable(tf.random_normal([n_hidden_2,n_hidden_1])),
    'decoder_h2':tf.Variable(tf.random_normal([n_hidden_1,n_input]))
}
biases = {
    'encoder_b1':tf.Variable(tf.random_normal([n_hidden_1])),
    'encoder_b2':tf.Variable(tf.random_normal([n_hidden_2])),
    'decoder_b1':tf.Variable(tf.random_normal([n_hidden_1])),
    'decoder_b2':tf.Variable(tf.random_normal([n_input])),
}

# Building the encoder
def encoder(x):
    # Encoder hidden layer with sigmoid activation #1
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x,weight['encoder_h1']),biases['encoder_b1']))
    # Encoder hidden layer with sigmoid activation #2
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1,weight['encoder_h2']),biases['encoder_b2']))
    return layer_2

# Building the decoder
def decoder(x):
    # Decoder hidden layer with sigmoid activation #1
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x,weight['decoder_h1']),biases['decoder_b1']))
    # Decoder hidden layer with sigmoid activation #2
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1,weight['decoder_h2']),biases['decoder_b2']))
    return layer_2

# Construct model
encoder_op = encoder(X)
decoder_op = decoder(encoder_op)

# Prediction
y_pred = decoder_op
# Targets (labels) are the input data
y_true = X

# Define loss and optimizer, minimize the squared error
cost = tf.reduce_mean(tf.pow(y_true-y_pred,2))
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    total_batch = int(mnist.train.num_examples/batch_size)
    # Training cycle
    for epoch in range(training_epochs):
        # Loop over all batches
        for i in range(total_batch):
            batch_xs,batch_ys = mnist.train.next_batch(batch_size) # max(x) = 1, min(x) = 0
            # Run optimization op (backprop) and cost op (to get loss value)
            _,c=sess.run([optimizer,cost],feed_dict={X:batch_xs})
        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:",'%04d'%(epoch+1),"cost=","{:.9f}".format(c))
    print("Optimization Finished")

    # Applying encode and decode over test set
    encode_decode = sess.run(y_pred,feed_dict={X:mnist.test.images[:examples_to_show]})
    # Compare original images with their reconstructions
    f,a = plt.subplots(2,10,figsize=(10,2))
    for i in range(examples_to_show):
        a[0][i].imshow(np.reshape(mnist.test.images[i],(28,28)))
        a[1][i].imshow(np.reshape(encode_decode[i],(28,28)))
    plt.show()

Method 2

# Autoencoder: unsupervised learning with a neural network
# An autoencoder compresses the input, decompresses it, and learns by
# comparing the reconstruction with the original
import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

# Import MNIST data
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('MNIST_data',one_hot = False)

# examples_to_show = 10

# method2 Parameters
learning_rate=0.001
training_epochs=20
batch_size = 256
display_step = 1

# Network Parameters
n_input = 784 # MNIST data input (img shape: 28*28)

# tf Graph input (only pictures)
X = tf.placeholder('float',[None,n_input])

# method2 hidden layer setting
n_hidden_1 = 128
n_hidden_2 = 64
n_hidden_3 = 10
n_hidden_4 = 2

# method2
weight = {
    'encoder_h1':tf.Variable(tf.truncated_normal([n_input,n_hidden_1],)),
    'encoder_h2':tf.Variable(tf.truncated_normal([n_hidden_1,n_hidden_2],)),
    'encoder_h3':tf.Variable(tf.truncated_normal([n_hidden_2,n_hidden_3],)),
    'encoder_h4':tf.Variable(tf.truncated_normal([n_hidden_3,n_hidden_4],)),
    'decoder_h1':tf.Variable(tf.truncated_normal([n_hidden_4,n_hidden_3],)),
    'decoder_h2':tf.Variable(tf.truncated_normal([n_hidden_3,n_hidden_2],)),
    'decoder_h3':tf.Variable(tf.truncated_normal([n_hidden_2,n_hidden_1],)),
    'decoder_h4':tf.Variable(tf.truncated_normal([n_hidden_1,n_input],)),
}
biases = {
    'encoder_b1':tf.Variable(tf.random_normal([n_hidden_1])),
    'encoder_b2':tf.Variable(tf.random_normal([n_hidden_2])),
    'encoder_b3':tf.Variable(tf.random_normal([n_hidden_3])),
    'encoder_b4':tf.Variable(tf.random_normal([n_hidden_4])),
    'decoder_b1':tf.Variable(tf.random_normal([n_hidden_3])),
    'decoder_b2':tf.Variable(tf.random_normal([n_hidden_2])),
    'decoder_b3':tf.Variable(tf.random_normal([n_hidden_1])),
    'decoder_b4':tf.Variable(tf.random_normal([n_input])),
}

# method2 Building the encoder
def encoder(x):
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x,weight['encoder_h1']),biases['encoder_b1']))
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1,weight['encoder_h2']),biases['encoder_b2']))
    layer_3 = tf.nn.sigmoid(tf.add(tf.matmul(layer_2,weight['encoder_h3']),biases['encoder_b3']))
    # The 2-dimensional code layer is linear (no activation)
    layer_4 = tf.add(tf.matmul(layer_3,weight['encoder_h4']),biases['encoder_b4'])
    return layer_4

# Building the decoder
def decoder(x):
    layer_1 = tf.nn.sigmoid(tf.add(tf.matmul(x,weight['decoder_h1']),biases['decoder_b1']))
    layer_2 = tf.nn.sigmoid(tf.add(tf.matmul(layer_1,weight['decoder_h2']),biases['decoder_b2']))
    layer_3 = tf.nn.sigmoid(tf.add(tf.matmul(layer_2,weight['decoder_h3']),biases['decoder_b3']))
    layer_4 = tf.nn.sigmoid(tf.add(tf.matmul(layer_3,weight['decoder_h4']),biases['decoder_b4']))
    return layer_4

# Construct model
encoder_op = encoder(X)
decoder_op = decoder(encoder_op)

# Prediction
y_pred = decoder_op
# Targets (labels) are the input data
y_true = X

# Define loss and optimizer, minimize the squared error
cost = tf.reduce_mean(tf.pow(y_true-y_pred,2))
optimizer = tf.train.AdamOptimizer(learning_rate).minimize(cost)

# Initializing the variables
init = tf.global_variables_initializer()

# Launch the graph
with tf.Session() as sess:
    sess.run(init)
    total_batch = int(mnist.train.num_examples/batch_size)
    # Training cycle
    for epoch in range(training_epochs):
        # Loop over all batches
        for i in range(total_batch):
            batch_xs,batch_ys = mnist.train.next_batch(batch_size) # max(x) = 1, min(x) = 0
            # Run optimization op (backprop) and cost op (to get loss value)
            _,c=sess.run([optimizer,cost],feed_dict={X:batch_xs})
        # Display logs per epoch step
        if epoch % display_step == 0:
            print("Epoch:",'%04d'%(epoch+1),"cost=","{:.9f}".format(c))
    print("Optimization Finished")

    # Plot the 2-dimensional codes of the test images, colored by digit label
    encoder_result = sess.run(encoder_op,feed_dict={X:mnist.test.images})
    plt.scatter(encoder_result[:,0],encoder_result[:,1],c = mnist.test.labels)
    plt.show()
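
Two pieces of bookkeeping in the second autoencoder are easy to check by hand: the decoder's weight shapes are the encoder's in reverse, and the cost is the mean of the squared differences between input and reconstruction. The sketch below assumes plain Python lists. `layer_shapes` and `mean_squared_error` are hypothetical helpers, and the four-pixel example is made up.

```python
def layer_shapes(sizes):
    """Weight shapes (in_size, out_size) for consecutive fully connected layers."""
    return [(a, b) for a, b in zip(sizes[:-1], sizes[1:])]

def mean_squared_error(original, reconstruction):
    """Same quantity as tf.reduce_mean(tf.pow(y_true - y_pred, 2)) for one flat example."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstruction)) / len(original)

encoder_sizes = [784, 128, 64, 10, 2]      # n_input, n_hidden_1 .. n_hidden_4
decoder_sizes = encoder_sizes[::-1]        # the decoder mirrors the encoder
encoder_shapes = layer_shapes(encoder_sizes)
decoder_shapes = layer_shapes(decoder_sizes)
reconstruction_error = mean_squared_error([0.0, 1.0, 1.0, 0.0], [0.0, 0.5, 1.0, 0.5])
print(encoder_shapes)
print(decoder_shapes)
print(reconstruction_error)
```

Because the narrowest layer has only two units, its output can be drawn directly as a scatter plot, which is what the final lines of the second method do.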

name_scope / variable_scope

When different RNN structures should share the same variables, reuse_variables is needed.

# Similarities and differences between name_scope and variable_scope
from __future__ import print_function
import tensorflow as tf
tf.set_random_seed(1) # reproducible

'''
# name_scope has no effect on get_variable()
with tf.name_scope("a_name_scope"):
    initialiver = tf.constant_initializer(value=1)
    var1 = tf.get_variable(name = 'var1',shape=[1],dtype=tf.float32,initializer=initialiver)
    var2 = tf.Variable(name='var2',initial_value=[2],dtype=tf.float32)
    var21 = tf.Variable(name='var2',initial_value=[2.1],dtype=tf.float32)
    var22 = tf.Variable(name='var2',initial_value=[2.2],dtype=tf.float32)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(var1.name)
    print(sess.run(var1))
    print(var2.name)
    print(sess.run(var2))
    print(var21.name)
    print(sess.run(var21))
    print(var22.name)
    print(sess.run(var22))
'''

with tf.variable_scope("a_variable_scope") as scope:
    initializer = tf.constant_initializer(value=3)
    var3 =tf.get_variable(name='var3',shape=[1],dtype=tf.float32,initializer=initializer)
    var4 = tf.Variable(name='var4',initial_value=[4],dtype=tf.float32)
    var4_reuse = tf.Variable(name='var4',initial_value=[4],dtype=tf.float32)
    # var3_resue = tf.get_variable(name="var3")  # error: calling get_variable again with the same name fails
    # To reuse a variable, call scope.reuse_variables() first
    scope.reuse_variables()
    var3_resue = tf.get_variable(name="var3")

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(var3.name)
    print(sess.run(var3))
    print(var4.name)
    print(sess.run(var4))
    print(var4_reuse.name)
    print(sess.run(var4_reuse))
    print(var3_resue)
    print(sess.run(var3_resue))

'''
reuse variable
Used when different RNN structures share the same parameters
'''
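
The behavior of get_variable with and without reuse can be modeled as a dictionary lookup. The class below is a toy sketch under that assumption. `VariableScope` here is a hypothetical stand-in written for illustration, not TensorFlow's class, and it ignores shapes, dtypes and initializers.

```python
class VariableScope:
    """A toy variable store keyed by 'scope/name'."""
    def __init__(self, name):
        self.name = name
        self.reuse = False
        self._store = {}

    def reuse_variables(self):
        self.reuse = True

    def get_variable(self, name, initial_value=None):
        full_name = self.name + "/" + name
        if full_name in self._store:
            if not self.reuse:
                raise ValueError(full_name + " already exists; call reuse_variables() first")
            return self._store[full_name]
        if self.reuse:
            raise ValueError(full_name + " does not exist, so it cannot be reused")
        self._store[full_name] = {"name": full_name, "value": initial_value}
        return self._store[full_name]

scope = VariableScope("a_variable_scope")
var3 = scope.get_variable("var3", initial_value=[3.0])
try:
    scope.get_variable("var3")          # same name again without reuse
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
scope.reuse_variables()
var3_reuse = scope.get_variable("var3")  # now returns the existing variable
print(duplicate_rejected, var3_reuse is var3, var3_reuse["name"])
```

In this model the reused variable is the same object as the original, which is what lets two network structures train one shared set of parameters.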