Getting started with TensorFlow

by Daniel Deutsch

Get started with TensorFlow on law and statistics

  • What this is about
  • What we will use
  • Get started
  • Shell commands for installing everything you need
  • Get data and draw a plot
  • Import everything you need
  • Create and plot some numbers
  • Build a TensorFlow model
  • Prepare data
  • Set up variables and operations for TensorFlow
  • Start the calculations with a TensorFlow session
  • Visualize the result and process

What this is about

As I am exploring TensorFlow, I wanted to build a beginner example and document it. This is a very basic example that uses gradient descent optimization to train parameters with TensorFlow. The key variables are evidence and convictions. It will illustrate:

  • how the number of convictions depends on the number of pieces of evidence
  • how to predict the number of convictions using a regression model

The Python file is in my repository on GitHub.

See the article in better formatting on GitHub.

What we will use

1. TensorFlow (as tf)

Tensors

  • tf.placeholder
  • tf.Variable

Helper function

  • tf.global_variables_initializer

Math Operations

  • tf.add
  • tf.multiply
  • tf.reduce_sum
  • tf.pow

Building a graph

  • tf.train.GradientDescentOptimizer

Session

  • tf.Session

2. NumPy (as np)

  • np.random.seed
  • np.zeros
  • np.random.randint
  • np.random.randn
  • np.asanyarray

3. Matplotlib

4. Math

Getting started

Install TensorFlow with virtualenv. See the guide on the TF website.

Shell commands for installing everything you need

sudo easy_install pip
pip3 install --upgrade virtualenv
virtualenv --system-site-packages <targetDirectory>
cd <targetDirectory>
source ./bin/activate
easy_install -U pip
pip3 install tensorflow
pip3 install matplotlib

Get data and draw a plot

Import everything you need

import tensorflow as tf
import numpy as np
import math
import matplotlib
matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
import matplotlib.animation as animation

As you can see, I am using the “TkAgg” backend from matplotlib. This allows me to debug with my VS Code and macOS setup without any further complicated installation steps.

Create and plot some numbers

# Generate evidence numbers between 10 and 50
# Generate a number of convictions from the evidence with some random noise added
np.random.seed(42)
sampleSize = 200
numEvid = np.random.randint(low=10, high=50, size=sampleSize)
numConvict = numEvid * 10 + np.random.randint(low=200, high=400, size=sampleSize)

# Plot the data to get a feeling for it
plt.title("Number of convictions based on evidence")
plt.plot(numEvid, numConvict, "bx")
plt.xlabel("Number of Evidence")
plt.ylabel("Number of Convictions")
plt.show(block=False)  # use the keyword 'block' to override the blocking behavior

I am creating random values for the evidence. The number of convictions depends on the amount of evidence, with some random noise added. Of course those numbers are made up, but they are only used to prove a point.

Build a TensorFlow model

To build a basic machine learning model, we need to prepare the data. Then we make predictions, measure the loss, and optimize by minimizing the loss.

Prepare data

# create a function for normalizing values
def normalize(array):
    return (array - array.mean()) / array.std()

# use 70% of the data for training (the remaining 30% will be used for testing)
numTrain = math.floor(sampleSize * 0.7)

# convert the lists to arrays and normalize the arrays
trainEvid = np.asanyarray(numEvid[:numTrain])
trainConvict = np.asanyarray(numConvict[:numTrain])
trainEvidNorm = normalize(trainEvid)
trainConvictdNorm = normalize(trainConvict)

testEvid = np.asanyarray(numEvid[numTrain:])
testConvict = np.asanyarray(numConvict[numTrain:])
testEvidNorm = normalize(testEvid)
testConvictdNorm = normalize(testConvict)

We are splitting the data into training and testing portions. Afterwards, we normalize the values, as this is necessary for machine learning projects. (See also “feature scaling”.)
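As a quick sanity check (reusing the arrays defined above), the normalized training data should end up with a mean close to 0 and a standard deviation close to 1:

print("mean/std before:", trainEvid.mean(), trainEvid.std())         # depends on the random data
print("mean/std after: ", trainEvidNorm.mean(), trainEvidNorm.std())  # close to 0.0 and 1.0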

Set up variables and operations for TensorFlow

# define placeholders and variables
tfEvid = tf.placeholder(tf.float32, name="Evid")
tfConvict = tf.placeholder(tf.float32, name="Convict")
tfEvidFactor = tf.Variable(np.random.randn(), name="EvidFactor")
tfConvictOffset = tf.Variable(np.random.randn(), name="ConvictOffset")

# define the operation for predicting the convictions from the evidence (factor * evidence + offset)
# and a loss function (mean squared error)
tfPredict = tf.add(tf.multiply(tfEvidFactor, tfEvid), tfConvictOffset)
tfCost = tf.reduce_sum(tf.pow(tfPredict - tfConvict, 2)) / (2 * numTrain)

# set a learning rate and a gradient descent optimizer
learningRate = 0.1
gradDesc = tf.train.GradientDescentOptimizer(learningRate).minimize(tfCost)

The pragmatic differences between tf.placeholder and tf.Variable are (a short sketch follows this list):

  • placeholders are allocated storage for data, and initial values are not required
  • variables are used for the parameters to learn, and initial values are required; the values are then adapted during training
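A minimal standalone sketch (separate from the model above, TensorFlow 1.x API) that shows the difference in practice:

x = tf.placeholder(tf.float32, name="x")  # no initial value; filled via feed_dict at run time
w = tf.Variable(0.5, name="w")            # initial value required; updated by an optimizer

y = tf.multiply(w, x)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())  # initializes w, does nothing for x
    print(sess.run(y, feed_dict={x: 4.0}))       # prints 2.0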

I use the TensorFlow operators explicitly, as in tf.add(…), instead of the + operator, because it makes clear which library is used for the calculation.
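For comparison, the prediction operation defined above could also be written with the overloaded Python operators; TensorFlow builds the same kind of graph nodes either way, the explicit form just makes the library visible:

tfPredict = tf.add(tf.multiply(tfEvidFactor, tfEvid), tfConvictOffset)
# equivalent, because TensorFlow overloads * and + for tensors:
# tfPredict = tfEvidFactor * tfEvid + tfConvictOffset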

Start the calculations with a TensorFlow session

# initialize the variables
init = tf.global_variables_initializer()

with tf.Session() as sess:
    sess.run(init)

    # set up iteration parameters
    displayEvery = 2
    numTrainingSteps = 50

    # calculate the number of lines for the animation
    # and define variables for updating during the animation
    numPlotsAnim = math.floor(numTrainingSteps / displayEvery)
    evidFactorAnim = np.zeros(numPlotsAnim)
    convictOffsetAnim = np.zeros(numPlotsAnim)
    plotIndex = 0

    # iterate through the training data
    for i in range(numTrainingSteps):

        # start training by running the session and feeding the gradDesc
        for (x, y) in zip(trainEvidNorm, trainConvictdNorm):
            sess.run(gradDesc, feed_dict={tfEvid: x, tfConvict: y})

        # print the status of the learning
        if (i + 1) % displayEvery == 0:
            cost = sess.run(
                tfCost, feed_dict={tfEvid: trainEvidNorm, tfConvict: trainConvictdNorm}
            )
            print(
                "iteration #:",
                "%04d" % (i + 1),
                "cost=",
                "{:.9f}".format(cost),
                "evidFactor=",
                sess.run(tfEvidFactor),
                "convictOffset=",
                sess.run(tfConvictOffset),
            )

            # store the result of each step in the animation variables
            evidFactorAnim[plotIndex] = sess.run(tfEvidFactor)
            convictOffsetAnim[plotIndex] = sess.run(tfConvictOffset)
            plotIndex += 1

    # log the optimized result
    print("Optimized!")
    trainingCost = sess.run(
        tfCost, feed_dict={tfEvid: trainEvidNorm, tfConvict: trainConvictdNorm}
    )
    print(
        "Trained cost=",
        trainingCost,
        "evidFactor=",
        sess.run(tfEvidFactor),
        "convictOffset=",
        sess.run(tfConvictOffset),
        "\n",
    )

Now we come to the actual training and the most interesting part.

The graph is now executed in a tf.Session. I am using "feeding" as it lets you inject data into any Tensor in a computation graph. You can see more on reading data here.
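Feeding is not limited to placeholders: feed_dict can substitute a value for almost any tensor in the graph. A small standalone sketch (separate from the model above):

a = tf.constant(2.0)
b = tf.constant(3.0)
c = tf.add(a, b)

with tf.Session() as sess:
    print(sess.run(c))                       # 5.0, computed from the constants
    print(sess.run(c, feed_dict={a: 10.0}))  # 13.0, the fed value overrides a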

tf.Session() is used to create a session that is automatically closed on exiting the context. The session also closes when an uncaught exception is raised.

The tf.Session.run method is the main mechanism for running a tf.Operation or evaluating a tf.Tensor. You can pass one or more tf.Operation or tf.Tensor objects to tf.Session.run, and TensorFlow will execute the operations that are needed to compute the result.
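For example, the three separate sess.run calls in the status printout above could be combined into one, because run accepts a list of fetches and evaluates them in a single pass over the graph (a sketch reusing the names defined above):

cost, factor, offset = sess.run(
    [tfCost, tfEvidFactor, tfConvictOffset],
    feed_dict={tfEvid: trainEvidNorm, tfConvict: trainConvictdNorm},
)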

First, we run the gradient descent training while feeding it the normalized training data. After that, we calculate the loss.

We repeat this process until the improvement per step is very small. Keep in mind that the tf.Variables (the parameters) have been adapted throughout and now reflect an optimum.
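The loop above simply runs a fixed number of steps. A hedged sketch of what an explicit convergence check could look like instead (the threshold is arbitrary and not part of the original example):

prevCost = float("inf")
for i in range(numTrainingSteps):
    for (x, y) in zip(trainEvidNorm, trainConvictdNorm):
        sess.run(gradDesc, feed_dict={tfEvid: x, tfConvict: y})
    cost = sess.run(tfCost, feed_dict={tfEvid: trainEvidNorm, tfConvict: trainConvictdNorm})
    if abs(prevCost - cost) < 1e-6:  # improvement per step is negligible -> stop
        break
    prevCost = cost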

Visualize the result and process

    # de-normalize the variables to make them plottable again
    trainEvidMean = trainEvid.mean()
    trainEvidStd = trainEvid.std()
    trainConvictMean = trainConvict.mean()
    trainConvictStd = trainConvict.std()
    xNorm = trainEvidNorm * trainEvidStd + trainEvidMean
    yNorm = (
        sess.run(tfEvidFactor) * trainEvidNorm + sess.run(tfConvictOffset)
    ) * trainConvictStd + trainConvictMean

    # plot the result graph
    plt.figure()
    plt.xlabel("Number of Evidence")
    plt.ylabel("Number of Convictions")
    plt.plot(trainEvid, trainConvict, "go", label="Training data")
    plt.plot(testEvid, testConvict, "mo", label="Testing data")
    plt.plot(xNorm, yNorm, label="Learned Regression")
    plt.legend(loc="upper left")
    plt.show()

    # plot an animated graph that shows the process of optimization
    fig, ax = plt.subplots()
    line, = ax.plot(numEvid, numConvict)

    plt.rcParams["figure.figsize"] = (10, 8)  # fixed size parameters to keep the animation in scale
    plt.title("Gradient Descent Fitting Regression Line")
    plt.xlabel("Number of Evidence")
    plt.ylabel("Number of Convictions")
    plt.plot(trainEvid, trainConvict, "go", label="Training data")
    plt.plot(testEvid, testConvict, "mo", label="Testing data")

    # define an animation function that changes the ydata
    def animate(i):
        line.set_xdata(xNorm)
        line.set_ydata(
            (evidFactorAnim[i] * trainEvidNorm + convictOffsetAnim[i]) * trainConvictStd
            + trainConvictMean
        )
        return (line,)

    # initialize the animation with zeros for y
    def initAnim():
        line.set_ydata(np.zeros(shape=numConvict.shape[0]))
        return (line,)

    # call the animation
    ani = animation.FuncAnimation(
        fig,
        animate,
        frames=np.arange(0, plotIndex),
        init_func=initAnim,
        interval=200,
        blit=True,
    )

    plt.show()

To visualize the process, it is helpful to plot the result and maybe even the optimization process.

Check out this Pluralsight course, which helped me a lot to get started. :)

Thanks for reading my article! Feel free to leave any feedback!

Daniel is an LL.M. student in business law, working as a software engineer and organizer of tech-related events in Vienna. His current personal learning efforts focus on machine learning.

Connect on:

  • LinkedIn
  • Github
  • Medium
  • Twitter
  • Steemit
  • Hashnode

Translated from: https://www.freecodecamp.org/news/tensorflow-starter-on-law-and-statistics-646072b93b5a/
