Advanced TensorFlow (work in progress)

Contents

1. ConfigProto & GPU
2. embedding_lookup()
3. tf.nn.dropout
4. tf.tile()
5. tf.reshape()
6. tf.nn.sparse_softmax_cross_entropy_with_logits()
- Description
- Parameters
- Example code
7. tf.train.exponential_decay
8. Convolution functions: tf.nn.conv1d

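1. ConfigProto & GPU

tf.ConfigProto is typically passed when creating a session; it configures the session's options.

with tf.Session(config=tf.ConfigProto(...), ...) as sess:
    ...

# Parameters of tf.ConfigProto():
#   log_device_placement=True : log which device each operation is placed on
#   allow_soft_placement=True : if the requested device does not exist, let TF pick a device automatically
config = tf.ConfigProto(log_device_placement=True, allow_soft_placement=True)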

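Controlling GPU memory usage

# allow_growth
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
session = tf.Session(config=config, ...)
# With allow_growth, TensorFlow allocates only a small amount of GPU memory at first and
# grows the allocation as needed. Memory that has been allocated is not released afterwards,
# because releasing it could fragment GPU memory.

# per_process_gpu_memory_fraction
gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=0.7)
config = tf.ConfigProto(gpu_options=gpu_options)
session = tf.Session(config=config, ...)
# Sets the fraction of each GPU's memory that the process may use:
# 0.4 would mean 40%; the 0.7 above means 70%.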

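Controlling which GPU is used

~/ CUDA_VISIBLE_DEVICES=0 python your.py      # use GPU 0
~/ CUDA_VISIBLE_DEVICES=0,1 python your.py    # use GPUs 0 and 1
# Be careful not to misspell the variable name.

# Alternatively, set it at the top of the program, before TensorFlow initializes the GPUs:
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'    # use GPU 0
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1'  # use GPUs 0 and 1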

2. embedding_lookup()

Usage of embedding_lookup()
tf.nn.embedding_lookup() comes up in Udacity's word2vec material; this section explains it in plain terms.
Start with a simple demo:

#!/usr/bin/env python
# coding=utf-8
import tensorflow as tf
import numpy as np

# Placeholder for the row indices to look up.
input_ids = tf.placeholder(dtype=tf.int32, shape=[None])

# The table being looked up: a 5x5 identity matrix.
embedding = tf.Variable(np.identity(5, dtype=np.int32))
input_embedding = tf.nn.embedding_lookup(embedding, input_ids)

sess = tf.InteractiveSession()
sess.run(tf.global_variables_initializer())
print(embedding.eval())
print(sess.run(input_embedding, feed_dict={input_ids: [1, 2, 3, 0, 3, 2, 1]}))

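The code first uses a placeholder to define input_ids, which holds the indices and receives its value at run time, and then defines a variable embedding with a known value: a 5x5 identity matrix.
The output is:

embedding = [[1 0 0 0 0]
             [0 1 0 0 0]
             [0 0 1 0 0]
             [0 0 0 1 0]
             [0 0 0 0 1]]
input_embedding = [[0 1 0 0 0]
                   [0 0 1 0 0]
                   [0 0 0 1 0]
                   [1 0 0 0 0]
                   [0 0 0 1 0]
                   [0 0 1 0 0]
                   [0 1 0 0 0]]

Put simply, the function looks up the rows of embedding that correspond to the ids in input_ids. For example, with input_ids=[1,3,5] it gathers the vectors at indices 1, 3, and 5 of embedding and returns them stacked as a matrix (this needs an embedding with at least six rows; the 5x5 demo above only has indices 0 to 4).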

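If input_ids is instead fed a two-dimensional value, as below:

# The placeholder must accept rank-2 input; shape=[None] from the demo above would reject it.
input_ids = tf.placeholder(dtype=tf.int32, shape=[None, None])
input_embedding = tf.nn.embedding_lookup(embedding, input_ids)
print(sess.run(input_embedding, feed_dict={input_ids: [[1, 2], [2, 1], [3, 3]]}))

the output takes the following form:

[[[0 1 0 0 0]
  [0 0 1 0 0]]
 [[0 0 1 0 0]
  [0 1 0 0 0]]
 [[0 0 0 1 0]
  [0 0 0 1 0]]]

Comparing the two results shows that the operation is equivalent to indexing an np.array directly with an array of indices. One detail to note: the dtype of the returned tensor matches the dtype of the tensor being looked up and is independent of the dtype of the ids.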
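The indexing behavior described above can be sketched without TensorFlow. The following is a minimal plain-Python illustration, not the library's implementation; the helper name embedding_lookup_sketch is hypothetical and the table is the same 5x5 identity matrix used in the demo.

```python
def embedding_lookup_sketch(table, ids):
    """Gather rows of `table` by integer ids; `ids` may be nested to any depth."""
    if isinstance(ids, int):
        # A single id selects one row (copied so callers cannot mutate the table).
        return list(table[ids])
    # A list of ids produces one result per entry, preserving the nesting.
    return [embedding_lookup_sketch(table, i) for i in ids]


# 5x5 identity matrix, as in the demo.
identity = [[1 if r == c else 0 for c in range(5)] for r in range(5)]

flat = embedding_lookup_sketch(identity, [1, 2, 3, 0, 3, 2, 1])
nested = embedding_lookup_sketch(identity, [[1, 2], [2, 1], [3, 3]])

print(flat)
print(nested)
```

The output has the shape of the ids with one extra trailing dimension for the row length, which is why a 3x2 array of ids yields a 3x2x5 result.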

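3. tf.nn.dropout

Preventing overfitting
tf.nn.dropout is the TensorFlow function used to prevent or reduce overfitting; it is generally applied to fully connected layers.

Dropout randomly discards a subset of neurons on each training pass. Each neuron's activation is switched off with some probability p: for that pass the neuron takes no part in the network's computation and its weights are not updated. Its weights are kept, however (they are simply not updated for now), because the neuron may be active again for the next input sample. Note that tf.nn.dropout takes keep_prob, the probability of keeping an element, which is 1 - p. [Figure: schematic of dropout.]

tf.nn.dropout(
    x,                 # A floating point tensor.
    keep_prob,         # A scalar Tensor with the same type as x. The probability that each element is kept.
    noise_shape=None,  # A 1-D Tensor of type int32, representing the shape for randomly generated keep/drop flags.
    seed=None,         # A Python integer. Used to create random seeds. See tf.set_random_seed for behavior.
    name=None          # A name for this operation (optional).
)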
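To make the keep/drop behavior concrete, here is a minimal plain-Python sketch of dropout on a flat list. It is an illustration under assumptions rather than TensorFlow's implementation: the name dropout_sketch is hypothetical, and it mirrors the documented behavior that kept elements are scaled by 1 / keep_prob so that the expected sum is unchanged.

```python
import random


def dropout_sketch(x, keep_prob, seed=None):
    """Keep each element with probability keep_prob, scaling kept values by 1/keep_prob."""
    rng = random.Random(seed)
    scale = 1.0 / keep_prob
    # rng.random() is uniform on [0, 1), so an element is kept with probability keep_prob.
    return [v * scale if rng.random() < keep_prob else 0.0 for v in x]


x = [1.0] * 10
y = dropout_sketch(x, keep_prob=0.5, seed=0)
unchanged = dropout_sketch(x, keep_prob=1.0, seed=0)

print(y)          # every entry is either 0.0 (dropped) or 2.0 (kept and rescaled)
print(unchanged)  # keep_prob=1.0 keeps everything
```

At inference time the usual practice is to feed keep_prob=1.0, which turns the operation into the identity.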

4. tf.tile()

tf.tile replicates a tensor along its own dimensions, for example repeating a matrix along its rows or along its columns.

Function signature:

tile(
    input,
    multiples,
    name=None
)

import tensorflow as tf

temp = tf.tile([1, 2, 3], [2])
temp2 = tf.tile([[1, 2], [3, 4], [5, 6]], [2, 3])
with tf.Session() as sess:
    print(sess.run(temp))
    print(sess.run(temp2))

Result:
[1 2 3 1 2 3]
[[1 2 1 2 1 2]
 [3 4 3 4 3 4]
 [5 6 5 6 5 6]
 [1 2 1 2 1 2]
 [3 4 3 4 3 4]
 [5 6 5 6 5 6]]
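The meaning of multiples in the two-dimensional case can be sketched in plain Python. This is only an illustration of the semantics for a list of lists; the helper name tile_2d is hypothetical.

```python
def tile_2d(matrix, multiples):
    """Repeat a 2-D list multiples[0] times along rows and multiples[1] times along columns."""
    row_reps, col_reps = multiples
    # Each row is repeated col_reps times horizontally; the whole block is
    # then repeated row_reps times vertically.
    return [list(row) * col_reps for _ in range(row_reps) for row in matrix]


tiled = tile_2d([[1, 2], [3, 4], [5, 6]], (2, 3))
for row in tiled:
    print(row)
```

A 3x2 input tiled with multiples (2, 3) therefore has shape 6x6: each output dimension is the input dimension times the corresponding multiple.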

Suppose you have a tensor of shape [width, height] and need a tensor of shape [batch_size, width, height] in which every batch entry is identical to the original. Give the tensor a leading dimension of size 1 and tile along that dimension (batch_size is 2 here):

import tensorflow as tf

raw = tf.Variable(tf.random_normal(shape=(1, 3, 2)))
multi = tf.tile(raw, multiples=[2, 1, 1])
with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    print(raw.eval())
    print('-----------------------------')
    print(sess.run(multi))

Result:
[[[-0.50027871 -0.48475555]
  [-0.52617502 -0.2396145 ]
  [ 1.74173343 -0.20627949]]]
-----------------------------
[[[-0.50027871 -0.48475555]
  [-0.52617502 -0.2396145 ]
  [ 1.74173343 -0.20627949]]
 [[-0.50027871 -0.48475555]
  [-0.52617502 -0.2396145 ]
  [ 1.74173343 -0.20627949]]]

5. tf.reshape()

tf.reshape rearranges a tensor into a new shape.
Straight to the demo:

# tensor 't' is [1, 2, 3, 4, 5, 6, 7, 8, 9]
# tensor 't' has shape [9]
reshape(t, [3, 3]) ==> [[1, 2, 3],
                        [4, 5, 6],
                        [7, 8, 9]]

# tensor 't' is [[[1, 1], [2, 2]],
#                [[3, 3], [4, 4]]]
# tensor 't' has shape [2, 2, 2]
reshape(t, [2, 4]) ==> [[1, 1, 2, 2],
                        [3, 3, 4, 4]]

# tensor 't' is [[[1, 1, 1],
#                 [2, 2, 2]],
#                [[3, 3, 3],
#                 [4, 4, 4]],
#                [[5, 5, 5],
#                 [6, 6, 6]]]
# tensor 't' has shape [3, 2, 3]
# pass '[-1]' to flatten 't'
reshape(t, [-1]) ==> [1, 1, 1, 2, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5, 6, 6, 6]

# -1 can also be used to infer the shape

# -1 is inferred to be 9:
reshape(t, [2, -1]) ==> [[1, 1, 1, 2, 2, 2, 3, 3, 3],
                         [4, 4, 4, 5, 5, 5, 6, 6, 6]]
# -1 is inferred to be 2:
reshape(t, [-1, 9]) ==> [[1, 1, 1, 2, 2, 2, 3, 3, 3],
                         [4, 4, 4, 5, 5, 5, 6, 6, 6]]
# -1 is inferred to be 3:
reshape(t, [ 2, -1, 3]) ==> [[[1, 1, 1],
                              [2, 2, 2],
                              [3, 3, 3]],
                             [[4, 4, 4],
                              [5, 5, 5],
                              [6, 6, 6]]]

# tensor 't' is [7]
# shape `[]` reshapes to a scalar
reshape(t, []) ==> 7
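How the -1 entry is resolved can be sketched in a few lines of plain Python. This is an illustrative helper (the name infer_shape is hypothetical, not a TensorFlow function): the unknown dimension is the total element count divided by the product of the known dimensions.

```python
from functools import reduce
from operator import mul


def infer_shape(num_elements, shape):
    """Replace a single -1 in `shape` so that the product equals num_elements."""
    if shape.count(-1) > 1:
        raise ValueError("at most one dimension may be -1")
    # Product of the known dimensions (1 for an empty shape, i.e. a scalar).
    known = reduce(mul, (d for d in shape if d != -1), 1)
    if -1 in shape:
        if known == 0 or num_elements % known != 0:
            raise ValueError("element count is not divisible by the known dimensions")
        return [num_elements // known if d == -1 else d for d in shape]
    if known != num_elements:
        raise ValueError("shape does not match the element count")
    return list(shape)


# The [3, 2, 3] tensor from the demo has 18 elements.
print(infer_shape(18, [-1]))        # flatten
print(infer_shape(18, [2, -1]))
print(infer_shape(18, [-1, 9]))
print(infer_shape(18, [2, -1, 3]))
print(infer_shape(1, []))           # a one-element tensor reshaped to a scalar
```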

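6. tf.nn.sparse_softmax_cross_entropy_with_logits()

sparse_softmax_cross_entropy_with_logits(_sentinel=None, labels=None, logits=None, name=None)

Description

This function computes roughly the same thing as tf.nn.softmax_cross_entropy_with_logits.
It applies when the classes are mutually exclusive: an image belongs to exactly one class and cannot, say, contain both a dog and an elephant.

The difference lies in how labels are handled. This function requires labels of shape [batch_size],
where labels[i] is an index in [0, num_classes) of type int32 or int64. In other words, labels must be a rank-1 tensor
whose values lie within the number of classes, so each example belongs to exactly one class.

Parameters

_sentinel: effectively unused; leave it unset
labels: shape [batch_size], type int32 or int64, each value an index in [0, num_classes)
logits: shape [batch_size, num_classes], type float32 or float64
name: name of the operation (optional)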

Example code

import tensorflow as tf

input_data = tf.Variable([[0.2, 0.1, 0.9], [0.3, 0.4, 0.6]], dtype=tf.float32)
output = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=input_data, labels=[0, 2])
with tf.Session() as sess:
    init = tf.global_variables_initializer()
    sess.run(init)
    print(sess.run(output))
# [ 1.36573195  0.93983102]
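The numbers above can be checked by hand. The sketch below recomputes the per-example loss in plain Python as log(sum(exp(logits))) minus the logit of the labeled class; the function name is hypothetical and this is a reference calculation, not TensorFlow's kernel.

```python
import math


def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross-entropy between softmax(logits) and integer class labels."""
    losses = []
    for row, label in zip(logits, labels):
        # Log-sum-exp with the row maximum subtracted for numerical stability.
        m = max(row)
        log_sum_exp = m + math.log(sum(math.exp(v - m) for v in row))
        # -log(softmax(row)[label]) == log_sum_exp - row[label]
        losses.append(log_sum_exp - row[label])
    return losses


losses = sparse_softmax_cross_entropy([[0.2, 0.1, 0.9], [0.3, 0.4, 0.6]], [0, 2])
print(losses)  # approximately [1.3657, 0.9398]
```

Because each label is a single index, the loss only needs the logit at that index, which is why no one-hot encoding is required.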

7. tf.train.exponential_decay

tf.train.exponential_decay(learning_rate, global_, decay_steps, decay_rate, staircase=True/False)

This is TensorFlow's built-in exponential learning-rate decay.

Calling the function alone does not make the learning rate decay, however; it has to be wired to the optimizer in a particular way. A calling example:

# data_type(), BATCH_SIZE, train_size and loss are assumed to be defined elsewhere.
# Optimizer: set up a variable that's incremented once per batch and
# controls the learning rate decay.
batch = tf.Variable(0, dtype=data_type())
# Decay once per epoch, using an exponential schedule starting at 0.01.
learning_rate = tf.train.exponential_decay(
    0.01,                # Base learning rate.
    batch * BATCH_SIZE,  # Current index into the dataset.
    train_size,          # Decay step.
    0.95,                # Decay rate.
    staircase=True)
# Use simple momentum for the optimization.
optimizer = tf.train.MomentumOptimizer(learning_rate, 0.9).minimize(loss, global_step=batch)

Note in particular that global_step must be passed to the optimizer's minimize() call; otherwise batch is not incremented as the optimizer updates the parameters.

import tensorflow as tf
import numpy as np
import matplotlib.pyplot as plt

learning_rate = 0.1
decay_rate = 0.96
global_steps = 1000
decay_steps = 100

global_ = tf.Variable(tf.constant(0))
c = tf.train.exponential_decay(learning_rate, global_, decay_steps, decay_rate, staircase=True)
d = tf.train.exponential_decay(learning_rate, global_, decay_steps, decay_rate, staircase=False)

T_C = []  # values with staircase=True
F_D = []  # values with staircase=False

with tf.Session() as sess:
    for i in range(global_steps):
        T_c = sess.run(c, feed_dict={global_: i})
        T_C.append(T_c)
        F_d = sess.run(d, feed_dict={global_: i})
        F_D.append(F_d)

plt.figure(1)
plt.plot(range(global_steps), F_D, 'r-')
plt.plot(range(global_steps), T_C, 'b-')
plt.show()
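The two curves plotted above follow the schedule learning_rate * decay_rate ^ (global_step / decay_steps), where staircase=True uses integer division so the rate drops in discrete steps. A plain-Python sketch of that formula, with a hypothetical function name and the same constants as the plotting example:

```python
def exponential_decay_sketch(learning_rate, global_step, decay_steps, decay_rate, staircase=False):
    """Decayed learning rate: learning_rate * decay_rate ** (global_step / decay_steps)."""
    if staircase:
        # Integer division: the exponent only changes every decay_steps steps.
        exponent = global_step // decay_steps
    else:
        exponent = global_step / decay_steps
    return learning_rate * decay_rate ** exponent


for step in (0, 50, 99, 100, 150):
    smooth = exponential_decay_sketch(0.1, step, 100, 0.96)
    stepped = exponential_decay_sketch(0.1, step, 100, 0.96, staircase=True)
    print(step, smooth, stepped)
```

With staircase=True the rate stays at 0.1 for steps 0 to 99 and drops to 0.096 at step 100; without it the rate shrinks a little on every step.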

Full usage:

# Based on the loss, update the parameters (the optimizer selected here is Adam).
learning_rate = tf.train.exponential_decay(learning_rate_init, global_step, decay_steps, decay_rate, staircase=True)
train_op = tf.contrib.layers.optimize_loss(loss, global_step=global_step, learning_rate=learning_rate, optimizer="Adam")

8. Convolution functions: tf.nn.conv1d

import tensorflow as tf
import numpy as np

# np.arange(1, 30) yields only 29 values for a 30-element shape; TF 1.x tf.constant fills
# the remaining entry with the last value, hence the final input row [28. 29. 29.] below.
inputs = tf.constant(np.arange(1, 30, dtype=np.int32), tf.float32, shape=[2, 5, 3])
w = tf.constant(np.arange(1, 13, dtype=np.int32), tf.float32, (2, 3, 2))
with tf.Session() as sess:
    print(sess.run(inputs), "\n===========")
    print(sess.run(w), "\n===========")
    print(sess.run(tf.nn.conv1d(inputs, w, 1, 'SAME')))

Result:

[[[ 1.  2.  3.]
  [ 4.  5.  6.]
  [ 7.  8.  9.]
  [10. 11. 12.]
  [13. 14. 15.]]
 [[16. 17. 18.]
  [19. 20. 21.]
  [22. 23. 24.]
  [25. 26. 27.]
  [28. 29. 29.]]]
===========
[[[ 1.  2.]
  [ 3.  4.]
  [ 5.  6.]]
 [[ 7.  8.]
  [ 9. 10.]
  [11. 12.]]]
===========
[[[ 161.  182.]
  [ 269.  308.]
  [ 377.  434.]
  [ 485.  560.]
  [ 130.  172.]]
 [[ 701.  812.]
  [ 809.  938.]
  [ 917. 1064.]
  [1014. 1178.]
  [ 260.  346.]]]

For n-grams in NLP, the dimensions are:
input = [batch, max_sentence_length, embedding_size]
w = [filter_size, embedding_size, filter_number]
conv = [batch, max_sentence_length, filter_number]  (with stride 1 and 'SAME' padding, as in the example above)
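To see where the numbers in the result come from, the convolution can be recomputed in plain Python. The sketch below is an illustration under assumptions, not TensorFlow's kernel: the name conv1d_same is hypothetical, and it assumes the 'SAME' convention of putting the smaller half of the padding on the left. The input repeats 29 in the last slot to match the printed input tensor.

```python
def conv1d_same(inputs, filters, stride=1):
    """1-D convolution with 'SAME' zero padding.

    inputs:  [batch][width][in_channels]
    filters: [filter_width][in_channels][out_channels]
    returns: [batch][out_width][out_channels]
    """
    k = len(filters)
    out_ch = len(filters[0][0])
    outputs = []
    for seq in inputs:
        width = len(seq)
        out_width = -(-width // stride)  # ceil(width / stride)
        pad_total = max((out_width - 1) * stride + k - width, 0)
        pad_left = pad_total // 2        # any extra padding goes on the right
        rows = []
        for o in range(out_width):
            acc = [0.0] * out_ch
            for j in range(k):
                pos = o * stride + j - pad_left
                if 0 <= pos < width:     # positions outside the input are zeros
                    for c, x in enumerate(seq[pos]):
                        for f in range(out_ch):
                            acc[f] += x * filters[j][c][f]
            rows.append(acc)
        outputs.append(rows)
    return outputs


# Same data as the example: values 1..29 with the last value repeated, shape [2, 5, 3].
values = list(range(1, 30)) + [29]
inputs = [[values[b * 15 + t * 3: b * 15 + t * 3 + 3] for t in range(5)] for b in range(2)]
# Filter values 1..12, shape [2, 3, 2].
w_vals = list(range(1, 13))
filters = [[w_vals[j * 6 + c * 2: j * 6 + c * 2 + 2] for c in range(3)] for j in range(2)]

result = conv1d_same(inputs, filters)
for batch in result:
    print(batch)
```

With a filter width of 2 and stride 1, one zero row is padded on the right, so the last output position in each sequence only sees the final input row; that is why [130, 172] and [260, 346] are so much smaller than their neighbors.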