专业英语翻译（二）Deep Learning（上）（词组+生词+段落翻译+全文翻译）

11/11

原文：
Deep learning allows computational（计算的） models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. These methods have dramatically（显著地，引人注目地） improved the state-of-the-art in speech recognition, visual object recognition, object detection and many other domains such as drug discovery and genomics. Deep learning discovers intricate（复杂多样的） structure in large data sets by using the backpropagation（反向传播） algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer. Deep convolutional（卷积的） nets have brought about breakthroughs in processing images, video, speech and audio, whereas recurrent nets have shone light on sequential data such as text and speech.

英文	中文
state-of-the-art	最先进的
speech recognition	语音识别技术
backpropagation algorithm	反向传播算法
deep convolutional nets	深度卷积网络
intricate structure	复杂结构
bring about	带来，引起
visual object recognition	视觉物体识别
recurrent nets	循环网络
shine light on	在……方面表现突出；阐明
sequential data	序列数据

翻译：
深度学习使得由多个处理层组成的计算模型能够学习具有多层抽象的数据表示。这些方法极大地提升了语音识别、视觉物体识别、物体检测以及药物发现、基因组学等许多其他领域的最高水平。深度学习利用反向传播算法（BP）来指出机器应当如何调整其内部参数，从而发现大数据集中的复杂结构；这些参数用于根据前一层的表示计算每一层的表示。深度卷积网络在图像、视频、语音和音频处理方面带来了突破，而循环网络则在文本、语音等序列数据上表现突出。
对于长从句的翻译：。。。improved the state-of-the-art in speech recognition, visual object recognition, object detection 。。。先将这些个单个样例概括一下，给一个总称，然后再说“这个总称包括。。。”这样显得更加通顺一点
长定语从句翻译：
- Deep learning discovers intricate structure in large data sets {短句} by using the backpropagation algorithm {断句} to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer.
- 深度学习能够发现大数据中的复杂结构，它是利用BP算法来发现这个过程的。BP算法能够知道机器如何从前一层获取误差而改变本层的内部参数，这些内部参数可以用来计算表示。
- 需要将断句补成一个完整的句子，”。。使用BP算法“，主语是前者，用它代指一下
我认为这段翻译需要一定的人工智能的基本的常识，对所谓的层概念有一定的了解，不然不知道定语修饰的具体对象。

11/12

原文：Machine-learning technology powers many aspects of modern society: from web searches to content filtering（过滤） on social networks to recommendations on e-commerce websites, and it is increasingly present in consumer products such as cameras and smartphones.
翻译：机器学习技术广泛应用于现代社会的各个方面：从网络搜索，到社交网络上的内容过滤，再到电商网站的商品推荐。而且，它在相机、智能手机等消费产品中也出现得越来越频繁。
原文：Machine-learning systems are used to identify objects in images, transcribe speech into text, match news items, posts or products with users’ interests, and select relevant results of search. Increasingly, these applications make use of a class of techniques called deep learning. Conventional （常见的，传统的）machine-learning techniques were limited in their ability to process natural data in their raw form. For decades, constructing a pattern-recognition or machine-learning system required careful engineering and considerable domain expertise（专门知识） to design a feature extractor（萃取器） that transformed the raw data (such as the pixel values of an image) into a suitable internal representation or feature vector from which the learning subsystem, often a classifier,（分类器） could detect（检测） or classify patterns in the input.

英文	中文
match news items, posts or products with users’ interests	根据用户的兴趣匹配新闻、帖子或产品
process natural data in their raw form	处理原始形式的自然数据
construct a pattern-recognition or machine-learning system	构建一个模式识别或者机器学习系统
pattern	模式、图案、样品

译文：机器学习系统被用于识别图像中的物体、把语音转写成文字、根据用户的兴趣匹配新闻、帖子或产品，以及筛选相关的搜索结果。这些应用越来越多地使用一类叫做深度学习的技术。传统的机器学习技术在处理原始形式的自然数据方面能力有限。几十年来，构建一个模式识别或机器学习系统，都需要细致的工程设计和相当多的领域专业知识，才能设计出一个特征提取器，把原始数据（比如图像的像素值）转换成合适的内部表示或特征向量；学习子系统（通常是一个分类器）再根据这个表示来检测输入中的模式或对其分类。

11/15

原文：Representation learning is a set of methods that allows a machine to be fed with raw data and to automatically discover the representations needed for detection or classification. Deep-learning methods are representation-learning methods with multiple levels of representation, obtained by composing（生成，组成，撰写，编制） simple but non-linear modules that each transform the representation at one level (starting with the raw input) into a representation at a higher, slightly more abstract level. With the composition of enough such transformations, very complex functions can be learned. For classification tasks, higher layers of representation amplify aspects of the input that are important for discrimination（分辨力） and suppress（镇压，抑制，废止） irrelevant variations. An image, for example, comes in the form of an array of pixel values, and the learned features in the first layer of representation typically represent the presence or absence of edges at particular orientations（定位，定向，情况介绍） and locations in the image. The second layer typically detects（侦测，发觉，探测） motifs（图案，动机） by spotting（认出） particular arrangements of edges, regardless of small variations in the edge positions. The third layer may assemble（聚集、聚合） motifs into larger combinations that correspond （符合一致）to parts of familiar objects, and subsequent layers would detect objects as combinations of these parts. The key aspect of deep learning is that these layers of features are not designed by human engineers: they are learned from data using a general-purpose（通用的） learning procedure.

英文	中文
representation learning	表示学习（表征学习）
suppress irrelevant variations	抑制无关的变化
come in the form of	以……的形式出现
particular arrangements of	……的特定排列
a general-purpose learning procedure	一个通用的学习过程
assemble … into larger combinations	将……组合成更大的组合

译文：表达学习，是这样的一组方法，它给机器灌入原始的数据，（错误：他能的让机器接收最原始的数据），并自动的发现能够被检测和分类的表达方式。深度学习就是有多层表达的表达学习法，而这些多层次的表达通过的生成简单但是非线性的模型来获得，这些模型能够将某一层次的表达转换成更高层次，更能抽象的表达。伴随着足够的类似的转变的生成，一个十分复杂的功能也可以被学习。对于分类任务而言，更高层次的表达能够的扩大输入的各个方面，这对于分辨力和的抑制无关变量而言是十分重要的。比如，对于一个图片而言，原始数据是一个像素组（错误：是由一系列的像素值组成的），那么在第一层上的学习特征表达通常是指在图像的特定位置和方向上是由否边的存在。第二层便是，通过识别出边界的特殊排列，忽略在边界位置的一些小的变量，来识别除图案。第三层，或许会将图案聚集成更大的组合，进而使其对应于某个熟悉目标的某部分（错误：这个组合与某个熟悉物体的某部分相类似）。后面的几层，将几个部分再次组合来辨识物体。深度学习的关键方面就是上述几个特征层（不恰当：这几个特征层），不是由人类工程师设计的，他们使用一种通用的学习流程，从数据中学到的。
定语从句错位，导致的逻辑错误：Deep-learning methods are representation-learning methods with multiple levels of representation, obtained by composing（生成，组成，撰写，编制） simple but non-linear modules that each transform the representation at one level (starting with the raw input) into a representation at a higher, slightly more abstract level.
- 搜索的相关翻译：深度学习就是一种特征学习方法，把原始数据通过一些简单的但是非线性的模型转变成为更高层次的，更加抽象的表达。
- 实际意义是相同的，不过我认为他的有点歧义，是这些非线性的模型，将这些数据转换为的更高层次的表达，其实也就是深度学习，不过那是更能高一层的概念，个人认为作者是想强调这些模型的作用，而且，定语从句是紧跟在模型后的面的。
correspond 和定语从句的翻译： The third layer may assemble motifs into larger combinations that correspond to parts of familiar objects
- 第三层将这些图案组合，从而使其对应于熟悉目标的某个部分。
- 虽然是定语从句，不过可以意译。

11/16

原文：Deep learning is making major advances in solving problems that have resisted（抵抗，忍耐，抗拒） the best attempts of the artificial intelligence community for many years. It has turned out to be very good at discovering intricate structures in high-dimensional data and is therefore applicable to many domains of science, business and government. In addition to beating records in image recognition and speech recognition , it has beaten other machine-learning techniques at predicting the activity of potential drug molecules（分子，微粒）, analysing particle（极小的微粒，质点） accelerator data, reconstructing brain circuits（回路）, and predicting the effects of mutations（突变） in non-coding DNA on gene expression and disease. Perhaps more surprisingly, deep learning has produced extremely promising results for various tasks in natural language understanding, particularly topic classification, sentiment（感情，情绪） analysis, question answering and language translation

英文	中文
resist the best attempts of the artificial intelligence community	人工智能社区没有解决的问题
image and speech recognition	图像和语音识别
predict the activity of potential drug molecules	预测潜在药物活性
particle accelerator	粒子加速器
reconstruct brain circuits	重建大脑回路
sentiment analysis	情感分析
the effects of mutations in non-coding DNA on gene expression and disease	非编码DNA突变对基因表达和疾病的影响

翻译：深度学习正在一些人工智能领域多年来尽了最大努力也未能解决的问题上取得重大进展。事实证明，深度学习非常擅长发现高维数据中的复杂结构，因此可以应用于科学、商业和政府的许多领域。除了刷新图像识别和语音识别的记录之外，它还在许多方面打败了其他机器学习技术，包括预测潜在药物分子的活性、分析粒子加速器的数据、重建大脑回路，以及预测非编码DNA的突变对基因表达和疾病的影响。也许更让人惊讶的是，深度学习在自然语言理解的各种任务上取得了非常有前景的结果，尤其是话题分类、情感分析、自动问答和语言翻译。
原文：We think that deep learning will have many more successes in the near future because it requires very little engineering by hand, so it can easily take advantage of increases in the amount of available computation and data. New learning algorithms and architectures that are currently being developed for deep neural networks will only accelerate this progress.

英文	中文
engineering by hand	手工工程
the amount of available computation and data	可用计算和数据量

译文：我们认为深度学习在不远的将来会取得更多的成功，因为它几乎不需要手工工程，所以可以很容易地受益于可用计算量和数据量的增加。目前正在为深度神经网络开发的新的学习算法和架构，只会加快这一进程。
Note on "will only accelerate this progress": "only" here does not restrict the effect to acceleration (an upper bound); it means the new algorithms can do nothing but speed things up (a lower bound). 只会加快 carries the intended sense; 仅仅只会加快 does not.

下面的隔得有点久了，因为中途要换去大作业相关的论文

2020年1月20日：假期开始了，要开始翻译了

原文：
Supervised learning
The most common form of machine learning, deep or not, is supervised learning. Imagine that we want to build a system that can classify images as containing, say, a house, a car, a person or a pet. We first collect a large data set of images of houses, cars, people and pets, each labelled with its category. During training, the machine is shown an image and produces an output in the form of a vector of scores, one for each category. We want the desired category to have the highest score of all categories, but this is unlikely to happen before training. We compute an objective function that measures the error (or distance) between the output scores and the desired pattern of scores.

英文	中文
category	种类、分类
objective	n 目标，攻击目标 adj 客观的；客观存在的
compute an objective function	计算目标函数
supervised learning	监督学习
classify images as containing	按照内容对图片进行分类
the error between the output scores and the desired pattern of scores	输出分数与期望分数模式之间的误差

译文
无论深度与否，机器学习最常见的形式都是监督学习。设想我们要建立一个系统，它能够按照图片所包含的内容对图片进行分类，比如房子、汽车、人或者宠物。我们首先收集一个由房子、汽车、人和宠物的图片组成的大型数据集，每张图片都标注了它的类别。在训练期间，给机器输入一张图片，机器以分数向量的形式输出结果，每个类别对应一个分数。我们希望期望的类别在所有类别中得分最高，但是在训练之前这不太可能发生。于是我们计算一个目标函数，用来衡量输出分数与期望分数模式之间的误差（或距离）。
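Summary:
imagine that … 设想……
The machine is given an image, processes it, and outputs one vector of scores with one score per category. We want the desired category to receive the highest score. An objective function then measures the error between the output scores and the desired pattern of scores.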
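To make the idea of a score vector and an objective function concrete, here is a minimal sketch. The score values, the one-hot target and the choice of squared error are illustrative assumptions; the source text does not fix a particular error measure.

```python
# The four categories from the example in the text.
categories = ["house", "car", "person", "pet"]

def predicted_category(scores):
    """Return the category with the highest score."""
    return categories[scores.index(max(scores))]

def squared_error(scores, target):
    """Objective: half the squared distance between output and desired scores."""
    return 0.5 * sum((s - t) ** 2 for s, t in zip(scores, target))

# Desired pattern of scores for an image of a pet (assumed one-hot encoding).
target = [0.0, 0.0, 0.0, 1.0]

# Hypothetical outputs before and after training.
scores_before = [0.2, 0.5, 0.1, 0.2]
scores_after = [0.05, 0.05, 0.1, 0.8]

error_before = squared_error(scores_before, target)
error_after = squared_error(scores_after, target)
print(predicted_category(scores_before), error_before)
print(predicted_category(scores_after), error_after)
```

Before training the highest score goes to the wrong category; training aims to lower the objective until the desired category wins.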
原文：
The machine then modifies its internal adjustable parameters to reduce this error. These adjustable parameters, often called weights, are real numbers that can be seen as ‘knobs’ that define the input–output function of the machine. In a typical deep-learning system, there may be hundreds of millions of these adjustable weights, and hundreds of millions of labelled examples with which to train the machine.
生词和词组

英文	中文
knobs	旋钮
labelled examples	带标签的样例
modify	修改、更改
adjustable parameters	可调节参数
real numbers	实数

译文：
然后，机器通过修改其内部的可调节参数来减小这个误差。这些可调节参数通常被称为权重，它们都是实数，可以看作是定义机器输入-输出函数的“旋钮”。在一个典型的深度学习系统中，可能有上亿个这样的可调节权重，以及上亿个用来训练机器的带标签样例。
分析与总结：
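Note: the long sentence "These adjustable parameters, often called weights, are real numbers that can be seen as ‘knobs’ that define the input–output function of the machine." should not be rendered clause by clause, as in my first attempt: 对这些可适应误差，常常称之为权重，是真正可被视为旋钮的数字。这些调整旋钮被视为机器的输入和输出函数。 That attempt misreads "real numbers" (实数) as 真正的数字. Regroup the clauses, reorder them, and add short sentences with explicit subjects where needed.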

1月21日

原文：
To properly adjust the weight vector, the learning algorithm computes a gradient vector that, for each weight, indicates by what amount the error would increase or decrease if the weight were increased by a tiny amount. The weight vector is then adjusted in the opposite direction to the gradient vector. The objective function, averaged over all the training examples, can be seen as a kind of hilly landscape in the high-dimensional space of weight values. The negative gradient vector indicates the direction of steepest descent in this landscape, taking it closer to a minimum, where the output error is low on average
译文：
为了正确地调整权重向量，学习算法会计算一个梯度向量。对于每一个权重，梯度向量都给出一个分量，表示如果该权重增加一个微小的量，误差将会增加或减少多少。然后，权重向量朝着与梯度向量相反的方向调整。对所有训练样例取平均后的目标函数，可以看作是权重值所构成的高维空间中的一种丘陵地形。负的梯度向量指出了这个地形中下降最陡的方向，使权重向量更接近一个最小值，在那里输出误差平均而言较低。
生词和词组

英文	中文
compute a gradient vector	计算梯度向量
be averaged over all the training examples	平均所有的样例
hilly landscape	多梯度山地起伏地形
increase by	增加了多少

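Analysis and summary:
The gradient vector has one component per weight; each component says how much the error would change if that weight were increased slightly. The weights are moved in the direction opposite to the gradient, the direction of steepest descent, which brings the weight vector closer to a minimum where the average output error is low.
This short passage took a long time to translate, probably because I have forgotten most of the maths, but I understood the gist, so today's goal is met. I also worked through two difficult sentences.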
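The description of the gradient vector can be tried out on a toy problem. The sketch below estimates each gradient component exactly as the text words it, by increasing one weight by a tiny amount and observing the change in error, and then steps in the opposite direction. The bowl-shaped `objective`, the learning rate and the step count are assumptions chosen for illustration.

```python
def objective(w):
    # Hypothetical error surface with a single minimum at w = (1, -2).
    return (w[0] - 1.0) ** 2 + (w[1] + 2.0) ** 2

def numerical_gradient(f, w, eps=1e-6):
    """One component per weight: change in error per tiny increase of that weight."""
    grad = []
    for i in range(len(w)):
        bumped = list(w)
        bumped[i] += eps
        grad.append((f(bumped) - f(w)) / eps)
    return grad

def descent_step(f, w, lr=0.1):
    """Move the weight vector in the direction opposite to the gradient."""
    g = numerical_gradient(f, w)
    return [wi - lr * gi for wi, gi in zip(w, g)]

w = [0.0, 0.0]
for _ in range(100):
    w = descent_step(objective, w)
print(w, objective(w))
```

Real systems compute the gradient analytically with backpropagation rather than by bumping each weight, which would be far too slow for millions of weights.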

1月22日

第一段
原文：
In practice, most practitioners use a procedure called stochastic gradient descent (SGD). This consists of showing the input vector for a few examples, computing the outputs and the errors, computing the average gradient for those examples, and adjusting the weights accordingly. The process is repeated for many small sets of examples from the training set until the average of the objective function stops decreasing. It is called stochastic because each small set of examples gives a noisy estimate of the average gradient over all examples. This simple procedure usually finds a good set of weights surprisingly quickly when compared with far more elaborate optimization techniques. After training, the performance of the system is measured on a different set of examples called a test set. This serves to test the generalization ability of the machine — its ability to produce sensible answers on new inputs that it has never seen during training
译文
在实践中，大部分从业者都使用一个叫做随机梯度下降（SGD）的过程。它包括：给出少量样例的输入向量，计算输出和误差，计算这些样例的平均梯度，并据此调整权重。对于从训练集中取出的许多小样例集，不断重复这个过程，直到目标函数的平均值停止下降。之所以称为随机，是因为每一个小样例集都只给出全部样例平均梯度的一个带噪声的估计。与复杂得多的优化技术相比，这个简单的过程通常能够出人意料地快速找到一组较好的权重。训练之后，系统的性能会在另一组不同的样例上进行测量，这组样例称为测试集。这用来检验机器的泛化能力，也就是对训练中从未见过的新输入给出合理答案的能力。
生词和词组

英文	中文
In practice	在实践中
stochastic gradient descent	随机梯度下降法
elaborate optimization techniques	精心设计的优化技术
elaborate	精心设计的，复杂的
optimization	优化
the generalization ability	泛化能力

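Analysis and summary
Stochastic gradient descent (SGD): take a small set of examples from the training set, compute the average gradient of the error over them, and adjust the weights accordingly; repeat with many small sets until the average of the objective function stops decreasing. Each small set gives only a noisy estimate of the full gradient, yet the procedure usually finds a good set of weights surprisingly quickly.
Generalization ability: producing sensible answers on inputs never seen during training.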
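The SGD loop described above can be sketched in a few lines. This is only an illustration under stated assumptions: the model is a straight line, the training set is noise-free synthetic data generated from y = 2x + 1, and the function name `sgd_fit_line`, the batch size and the learning rate are all hypothetical choices.

```python
import random

def sgd_fit_line(examples, lr=0.1, batch_size=4, epochs=200, seed=0):
    """Fit y = w*x + b by SGD on the squared error 0.5*(prediction - y)**2."""
    rng = random.Random(seed)
    examples = list(examples)  # copy so the caller's list is not reordered
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(examples)
        for start in range(0, len(examples), batch_size):
            batch = examples[start:start + batch_size]
            # Average gradient over this small set: a noisy estimate of the
            # average gradient over all examples.
            grad_w = sum((w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum((w * x + b - y) for x, y in batch) / len(batch)
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b

# Synthetic training set (assumption): points on the line y = 2x + 1.
training_set = [(x / 10.0, 2.0 * (x / 10.0) + 1.0) for x in range(-10, 11)]
w, b = sgd_fit_line(training_set)
print(w, b)
```

A fixed number of epochs stands in for the stopping rule in the text (stop when the average objective no longer decreases), and a separate test set would be needed to measure generalization.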
第二段
原文
Many of the current practical applications of machine learning use linear classifiers on top of hand-engineered features. A two-class linear classifier computes a weighted sum of the feature vector components. If the weighted sum is above a threshold, the input is classified as belonging to a particular category.
译文
当前实际应用中的许多的机器学习使用的是线性分类器来对人工提取的特征进行处理。一个二类线性分类器计算一个特征向量所有分量的加权总和。如果加权总和在临界值之上，输入的值将会被分类为一个特定的类别。
生词和词组

英文	中文
linear classifiers	线性分类器
threshold	阈值，临界值，极限
hand-engineered features	手工提取特征
on top of	我认为这里理解为处理更加准确
a weighted sum	加权总和
the feature vector components	特征向量分量
be classified as belonging to	被分类为

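Analysis and summary:
A linear classifier computes a weighted sum of the components of the input feature vector and assigns a category by comparing that sum with a threshold.
The technical terms here are worth memorizing.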
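A two-class linear classifier as described here fits in a few lines. The weights, feature values and threshold below are made-up numbers for illustration only.

```python
def weighted_sum(weights, features):
    """Weighted sum of the feature vector components."""
    return sum(w * x for w, x in zip(weights, features))

def belongs_to_category(weights, features, threshold):
    """True if the weighted sum is above the threshold."""
    return weighted_sum(weights, features) > threshold

weights = [0.5, -1.0, 2.0]   # hypothetical learned weights
threshold = 1.5              # hypothetical threshold

in_category = belongs_to_category(weights, [2.0, 1.0, 1.0], threshold)
not_in_category = belongs_to_category(weights, [1.0, 2.0, 0.0], threshold)
print(in_category, not_in_category)
```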

1月23日

第一段
原文：
Since the 1960s we have known that linear classifiers can only carve their input space into very simple regions, namely half-spaces separated by a hyperplane. But problems such as image and speech recognition require the input–output function to be insensitive to irrelevant variations of the input, such as variations in position, orientation or illumination of an object, or variations in the pitch or accent of speech, while being very sensitive to particular minute variations (for example, the difference between a white wolf and a breed of wolf-like white dog called a Samoyed). At the pixel level, images of two Samoyeds in different poses and in different environments may be very different from each other, whereas two images of a Samoyed and a wolf in the same position and on similar backgrounds may be very similar to each other.
译文
自20世纪60年代以来，我们就已经知道，线性分类器只能把输入空间划分成非常简单的区域，也就是被一个超平面分开的两个半空间。但是，图像识别和语音识别这类问题，要求输入-输出函数对输入中不相关的变化不敏感，比如物体位置、方向或者光照的变化，以及语音的音高或者口音的变化；同时又要对某些特定的细微差异非常敏感（比如白狼和一种长得像狼的白色犬种萨摩伊犬之间的区别）。在像素层面上，两只萨摩伊犬在不同姿势、不同环境下的图片可能彼此差异很大，而处于相同位置、背景相似的一只萨摩伊犬和一只狼的两张图片却可能非常相似。
生词和词组

英文	中文
carve	v 切开，雕刻
hyperplane	超平面
orientation	n.方向、定向
illumination	n 照明，启发，阐明
pitch	n 程度，音高 v 投掷、触地
a breed of	一种
minute variations	细微的变化
half-spaces separated by a hyperplane	被一个超平面分开的两个半空间

分析与总结
对于namely的理解：翻译成也就是说
如果是翻译类似 namely half-spaces separated by a hyperplane，将之补全，补为一个完整的句子去翻译，不然很难理解
第二段
原文
Multilayer neural networks and backpropagation
A multi-layer neural network (shown by the connected dots) can distort the input space to make the classes of data (examples of which are on the red and blue lines) linearly separable. Note how a regular grid (shown on the left) in input space is also transformed (shown in the middle panel) by hidden units. This is an illustrative example with only two input units, two hidden units and one output unit, but the networks used for object recognition or natural language processing contain tens or hundreds of thousands of units.
译文
一个多层神经网络（用相互连接的点表示）能够扭曲输入空间，使各类数据（红线和蓝线上的点是其中的样例）变得线性可分。注意输入空间中的规则网格（左图）是如何被隐藏单元变换的（中图）。这只是一个说明性的例子，仅有两个输入单元、两个隐藏单元和一个输出单元；而用于物体识别或者自然语言处理的网络则包含数万乃至数十万个单元。
生词和词组

英文	中文
backpropagation	反向传播算法
distort	扭曲，曲解，使变形
illustrative	说明的、做例证的，解说性的
a regular grid	规则网格

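Analysis and summary:
A neural network achieves better classification by transforming the input space. Networks used for object recognition or natural language processing contain tens or hundreds of thousands of such units.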

1月24日

第一段
原文
Reproduced with permission from C. Olah (http://colah.github.io/). b, The chain rule of derivatives tells us how two small effects (that of a small change of x on y, and that of y on z) are composed. A small change Δx in x gets transformed first into a small change Δy in y by getting multiplied by ∂y/∂x (that is, the definition of partial derivative). Similarly, the change Δy creates a change Δz in z. Substituting one equation into the other gives the chain rule of derivatives — how Δx gets turned into Δz through multiplication by the product of ∂y/∂x and ∂z/∂y. It also works when x, y and z are vectors (and the derivatives are Jacobian matrices).
译文
导数的链式法则告诉我们，两个微小的影响（x的微小变化对y的影响，以及y的微小变化对z的影响）是如何组合到一起的。x的微小变化Δx乘以∂y/∂x（也就是偏导数的定义）之后，首先转变成y的微小变化Δy。类似地，变化Δy又会给z带来变化Δz。把一个等式代入另一个等式，就得到了导数的链式法则，即Δx如何通过乘以∂y/∂x与∂z/∂y的乘积转变为Δz。当x、y和z都是向量时（此时导数是雅可比矩阵），这一法则同样成立。
生词和词组

英文	中文
reproduce	复制，再生
substitute A into B	将A代入B中
derivatives	导数
Jacobian matrices	雅可比矩阵
the chain rule of derivatives	导数的链式法则
be composed	被组合（原文 are composed 指两个影响如何组合到一起，不是 be composed of）
get/be transformed into	转变成，转成
partial derivative	偏导
through multiplication by the product of	通过乘以。。。。。

分析与总结：
数学学得真差，忘得差不多了，我很惭愧！
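The chain rule in this caption can be checked numerically. In the sketch below the two functions y(x) = 3x + x² and z(y) = y² are arbitrary choices made for illustration; the point is only that a small Δx turns into Δz through multiplication by the product of ∂y/∂x and ∂z/∂y.

```python
def y_of_x(x):
    return 3.0 * x + x * x

def z_of_y(y):
    return y * y

def dy_dx(x):
    return 3.0 + 2.0 * x

def dz_dy(y):
    return 2.0 * y

x = 1.0
delta_x = 1e-6

# Actual small changes produced by nudging x.
delta_y = y_of_x(x + delta_x) - y_of_x(x)
delta_z = z_of_y(y_of_x(x + delta_x)) - z_of_y(y_of_x(x))

# Chain rule: delta_z is approximately (dz/dy) * (dy/dx) * delta_x.
chain_rule_prediction = dz_dy(y_of_x(x)) * dy_dx(x) * delta_x

print(delta_z, chain_rule_prediction)
```

The two printed values agree up to terms of order Δx², which is what "small change" means in the caption.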

1月25日

第一段

原文：
c, The equations used for computing the forward pass in a neural net with two hidden layers and one output layer, each constituting a module through which one can backpropagate gradients. At each layer, we first compute the total input z to each unit, which is a weighted sum of the outputs of the units in the layer below. Then a non-linear function f(.) is applied to z to get the output of the unit. For simplicity, we have omitted bias terms. The non-linear functions used in neural networks include the rectified linear unit (ReLU) f(z) = max(0,z), commonly used in recent years, as well as the more conventional sigmoids, such as the hyperbolic tangent, f(z) = (exp(z) − exp(−z))/(exp(z) + exp(−z)) and the logistic function, f(z) = 1/(1 + exp(−z)).
译文
c是用于计算神经网络正向传递的方程式。图中的神经网络有两个隐藏层和一个输出层，每一层都构成一个模块，梯度可以通过这些模块反向传播。在每一层，我们首先计算每个单元的总输入z，它是图中位于下方的前一层各单元输出的加权和。然后对z施加一个非线性函数f(.)，得到该单元的输出。为了简单起见，我们省略了偏置项。神经网络中使用的非线性函数包括近年来常用的修正线性单元(ReLU) f(z) = max(0,z)，以及更传统的S型函数，比如双曲正切f(z) = (exp(z) − exp(−z))/(exp(z) + exp(−z)) 和logistic函数f(z) = 1/(1 + exp(−z))。
生词和词组

英文	中文
for simplicity	为了简单起见
bias terms	偏置项
rectified linear unit	修正线性单元
conventional sigmoid	传统的S型函数
conventional	adj 符合习俗的，传统的，常见的
hyperbolic tangent	双曲正切
forward pass	正向传递
equations	方程式、等式

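Analysis and summary
The total input to every unit in a layer is the weighted sum of the outputs of all units in the layer below. Study the three groups of formulas carefully.
Supplement: sigmoid functions
A sigmoid is an S-shaped, monotonically increasing function whose inverse is also monotonically increasing; it is often used as the activation function of a neural network. The logistic function maps its input to values between 0 and 1, and the hyperbolic tangent to values between −1 and 1.
Lesson learned:
When translating logical, deductive or mathematical English like the passage above, understand it first. If, like me, you do not understand it, parse the sentence structure carefully and work out which subject performs which verb; otherwise the translation gets more and more muddled.
It may sound funny that someone in computer science is this weak at calculus, but the whole article left me dizzy.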
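The forward pass in caption c can be sketched directly from its wording: each unit takes a weighted sum z of the outputs of the layer below and applies a non-linear function f. The network shape matches the caption (two hidden layers, one output layer, no bias terms), but every weight value and the input vector below are invented for illustration.

```python
import math

def relu(z):
    return max(0.0, z)

def tanh(z):
    return (math.exp(z) - math.exp(-z)) / (math.exp(z) + math.exp(-z))

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer_forward(inputs, weights, f):
    """weights[j][i] connects unit i in the layer below to unit j in this layer."""
    z = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    return [f(zj) for zj in z]

# Hypothetical weights for a 2-2-2-1 network.
w_hidden1 = [[1.0, 0.5], [-1.0, 0.25]]
w_hidden2 = [[0.5, 1.0], [-0.5, 2.0]]
w_output = [[2.0, 3.0]]

x = [1.0, 2.0]
hidden1 = layer_forward(x, w_hidden1, relu)
hidden2 = layer_forward(hidden1, w_hidden2, relu)
output = layer_forward(hidden2, w_output, logistic)
print(hidden1, hidden2, output)
```

Swapping `relu` for `tanh` or `logistic` in the hidden layers changes only the function f, not the structure of the computation.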
第二段
原文：
d, The equations used for computing the backward pass. At each hidden layer we compute the error derivative with respect to the output of each unit, which is a weighted sum of the error derivatives with respect to the total inputs to the units in the layer above. We then convert the error derivative with respect to the output into the error derivative with respect to the input by multiplying it by the gradient of f(z). At the output layer, the error derivative with respect to the output of a unit is computed by differentiating the cost function. This gives y_l − t_l if the cost function for unit l is 0.5(y_l − t_l)^2, where t_l is the target value. Once the ∂E/∂z_k is known, the error-derivative for the weight w_jk on the connection from unit j in the layer below is just y_j ∂E/∂z_k.
译文
d是用于计算反向传递的方程式。在每一个隐藏层，我们计算误差对每个单元输出的导数，它是误差对上一层（图中位于上方、更靠近输出的一层）各单元总输入的导数的加权和。然后，把误差对输出的导数乘以f(z)的梯度，就得到误差对输入的导数。在输出层，误差对单元输出的导数通过对代价函数求导得到。如果单元l的代价函数是0.5(y_l − t_l)^2，求导结果就是y_l − t_l，其中t_l是目标值。一旦知道了∂E/∂z_k，来自下方一层单元j的连接上的权重w_jk的误差导数就是y_j ∂E/∂z_k。
生词和词组

英文	中文
cost function	代价函数

分析与总结
每个单词拆开是什么意思，我都知道，但是合并在一块，是什么意思，我就一脸懵逼了。
我收获的不多，主要这里的方向，向前指的是逻辑的运算方向，向后是与逻辑的运算方向相反，是反向的，不要弄混。
不过吧，我又觉得，数学原理应该是挺简单的，但是我并不懂。
很多只知道那么些，但是不知道什么意思。
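For readers who, like the author, find the backward pass hard to picture, the sketch below applies the rules from caption d to a tiny network and checks one result against a finite-difference estimate. It assumes logistic units, one hidden layer, a single output unit and made-up weights; the caption itself describes a network with two hidden layers.

```python
import math

def logistic(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    # w_hidden[j][i]: input i -> hidden unit j; w_out[j]: hidden unit j -> output.
    y_hidden = [logistic(sum(w * xi for w, xi in zip(row, x))) for row in w_hidden]
    y_out = logistic(sum(w * yj for w, yj in zip(w_out, y_hidden)))
    return y_hidden, y_out

def cost(x, t, w_hidden, w_out):
    _, y_out = forward(x, w_hidden, w_out)
    return 0.5 * (y_out - t) ** 2

def backward(x, t, w_hidden, w_out):
    y_hidden, y_out = forward(x, w_hidden, w_out)
    dE_dy_out = y_out - t                              # derivative of the cost
    dE_dz_out = dE_dy_out * y_out * (1.0 - y_out)      # times f'(z) of logistic
    grad_w_out = [yj * dE_dz_out for yj in y_hidden]   # y_j * dE/dz_k
    dE_dy_hidden = [w * dE_dz_out for w in w_out]      # weighted sum from above
    dE_dz_hidden = [d * yj * (1.0 - yj) for d, yj in zip(dE_dy_hidden, y_hidden)]
    grad_w_hidden = [[xi * dz for xi in x] for dz in dE_dz_hidden]
    return grad_w_hidden, grad_w_out

# Hypothetical input, target and weights.
x, t = [0.5, -1.0], 1.0
w_hidden = [[0.1, -0.2], [0.4, 0.3]]
w_out = [0.7, -0.5]
grad_w_hidden, grad_w_out = backward(x, t, w_hidden, w_out)

# Finite-difference check of one hidden weight.
eps = 1e-5
plus = [row[:] for row in w_hidden]
minus = [row[:] for row in w_hidden]
plus[0][1] += eps
minus[0][1] -= eps
numeric = (cost(x, t, plus, w_out) - cost(x, t, minus, w_out)) / (2 * eps)
print(grad_w_hidden[0][1], numeric)
```

"Backward" refers to the order of computation: derivatives are obtained for the output layer first and then for the layers below it, the reverse of the forward pass.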

1月26日

第一段

原文
A linear classifier, or any other ‘shallow’ classifier operating on raw pixels could not possibly distinguish the latter two, while putting the former two in the same category. This is why shallow classifiers require a good feature extractor that solves the selectivity–invariance dilemma — one that produces representations that are selective to the aspects of the image that are important for discrimination, but that are invariant to irrelevant aspects such as the pose of the animal. To make classifiers more powerful, one can use generic non-linear features, as with kernel methods, but generic features such as those arising with the Gaussian kernel do not allow the learner to generalize well far from the training examples.
译文
（这里对应的是1月23日的萨摩伊犬的例子）线性分类器，或者其他任何直接作用在原始像素上的“浅层”分类器，不可能在把前两者（不同姿势、不同环境下的两只萨摩伊犬）归为同一类的同时，把后两者（相同位置、相似背景下的白狼和萨摩伊犬）区分开。这就是为什么浅层分类器需要一个好的特征提取器来解决“选择性-不变性”两难问题：这个特征提取器产生的表示，要对图像中那些对分辨很重要的方面有选择性，同时对无关的方面（比如动物的姿势）保持不变。为了让分类器更强大，可以像核方法那样使用通用的非线性特征，但是诸如高斯核所产生的那些通用特征，无法让学习器在远离训练样例的地方很好地泛化。
生词和词组

英文	中文
representation	表达、表示
selectivity	n.选择性、分离性
selectivity–invariance dilemma	选择性-不变性两难问题
discrimination	分辨，区分
kernel method	核方法
generic	adj. 类的，一般的，属的，通用的
generic non-linear features	通用的非线性特征
gaussian kernel	高斯核
generalize	vt 概括、推广、一般化
as with	正如、与…一样，就…来说

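Analysis and summary
Mistake 1: check what each attributive clause and participle phrase modifies.
- any other ‘shallow’ classifier operating on raw pixels should be rendered as 作用在原始像素上的浅层分类器.
- The participle phrase should not be attached to every classifier.
"This is why" introduces a result, not a cause.
- Correct: 这就是为什么
- Wrong: 这是因为
For pronouns such as "one", find the noun they stand for.
- Original: … a good feature extractor … — one that produces representations …
- I skipped "one" in my first draft, which left the sentence ambiguous; that was a mistake.
- It refers to the feature extractor, so translate it as 特征提取器能够…
I think my reading here is sound: for a conventional shallow or linear classifier to tell a white wolf from a Samoyed, a feature extractor has to be designed specially, whereas deep learning does not need one. The translation I found on CSDN seems off at this point.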

第二段

原文
The conventional option is to hand design good feature extractors, which requires a considerable amount of engineering skill and domain expertise. But this can all be avoided if good features can be learned automatically using a general-purpose learning procedure. This is the key advantage of deep learning.
译文
传统的观点就是手动设计比较好的特征提取器，这就需要很多的工程技术和一定领域的专业知识了。不过，如果使用了通用目的学习程序，这一切都是可以避免的，因为一些明确的特征是可以自动学习生成的。这就是深度学习的关键优势。
生词和词组

英文	中文
general-purpose	通用目的
hand design	手工设计
domain expertise	领域知识，领域专长
engineering skill	工程技术

分析与总结
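This passage forms one logical unit: it contrasts conventional machine learning with deep learning. Conventional machine learning does not generalize well; for objects that are hard to tell apart it cannot easily produce effective features, and doing it well requires specialists to design them by hand. Deep learning is different: it can learn good features automatically.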

第三段

原文
A deep-learning architecture is a multilayer stack of simple modules, all (or most) of which are subject to learning, and many of which compute non-linear input–output mappings. Each module in the stack transforms its input to increase both the selectivity and the invariance of the representation. With multiple non-linear layers, say a depth of 5 to 20, a system can implement extremely intricate functions of its inputs that are simultaneously sensitive to minute details — distinguishing Samoyeds from white wolves — and insensitive to large irrelevant variations such as the background, pose, lighting and surrounding objects.
译文
深度学习架构是由简单模块组成的多层堆叠，其中所有（或大部分）模块都要经过学习，而且许多模块计算的是非线性的输入-输出映射。堆叠中的每一个模块都对其输入进行变换，以同时提高表示的选择性和不变性。有了多个非线性层（比如5到20层的深度），系统就可以实现关于输入的极其复杂的函数，这些函数既对细节非常敏感（能够区分萨摩伊犬和白狼），又对背景、姿势、光照和周围物体等大幅度的无关变化不敏感。
生词和词组

英文	中文
architecture	建筑，架构，工程
be subject to	受支配，从属于，隶属，目标是
intricate	复杂的，错综的
simultaneous	adj . 同时的、联立的、同时发生的

分析与总结
对于复杂的句子，可以一段一段的去翻译，然后再判定短语段之间的归属，使得整个句子变得合理通顺。
不过不得不说，翻译类似的文章，是一定要具备一定的专业知识，不然理解起来，很费劲，而且，很大程度上会出错。

1月27日

第一段

原文
Backpropagation to train multilayer architectures
From the earliest days of pattern recognition, the aim of researchers has been to replace hand-engineered features with trainable multilayer networks, but despite its simplicity, the solution was not widely understood until the mid 1980s. As it turns out, multilayer architectures can be trained by simple stochastic gradient descent. As long as the modules are relatively smooth functions of their inputs and of their internal weights, one can compute gradients using the backpropagation procedure. The idea that this could be done, and that it worked, was discovered independently by several different groups during the 1970s and 1980s.
译文
反向传播来训练多层神经网络
从模式识别的早期开始，研究者的目标就是用可训练的多层网络取代手工设计的特征。但是，尽管这个方案很简单，直到20世纪80年代中期它才被广泛理解。事实证明，多层架构可以用简单的随机梯度下降来训练。只要各个模块是其输入和内部权重的相对平滑的函数，就可以用反向传播过程来计算梯度。这种做法可行并且有效，是20世纪70年代和80年代由几个不同的研究小组各自独立发现的。
生词和词组

英文	中文
as it turns out	事实证明
smooth function	平滑函数

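Analysis and summary
Two consecutive "of" phrases: smooth functions of their inputs and of their internal weights
- There is only one function here: each module is a smooth function of both its inputs and its internal weights, so translate it as 模块是其输入和内部权重的相对平滑的函数.
A free translation can read better, but only once the logic is understood: but despite its simplicity, the solution was not widely understood until the mid 1980s. As it turns out, multilayer architectures can be trained by simple stochastic gradient descent.
- The solution (trainable multilayer networks) was simple, yet it was not widely understood until the mid 1980s; it turned out that multilayer architectures can be trained by simple stochastic gradient descent.
- "widely understood" is best kept literal (被广泛理解); the text says nothing about earlier results being poor.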

第二段

原文
The backpropagation procedure to compute the gradient of an objective function with respect to the weights of a multilayer stack of modules is nothing more than a practical application of the chain rule for derivatives. The key insight is that the derivative (or gradient) of the objective with respect to the input of a module can be computed by working backwards from the gradient with respect to the output of that module (or the input of the subsequent module) .The backpropagation equation can be applied repeatedly to propagate gradients through all modules, starting from the output at the top (where the network produces its prediction) all the way to the bottom (where the external input is fed). Once these gradients have been computed, it is straightforward to compute the gradients with respect to the weights of each module.
译文
利用反向传播过程来计算目标函数关于多层模块堆叠的权重的梯度，不过是导数链式法则的一个实际应用。关键的见解在于，目标函数关于某个模块输入的导数（或梯度），可以由目标函数关于该模块输出（也就是后继模块的输入）的梯度反向推算出来。反向传播等式可以反复应用，使梯度传过所有模块：从顶端的输出（网络产生预测的地方）开始，一直传到底端（接收外部输入的地方）。一旦算出了这些梯度，再计算关于每个模块权重的梯度就很直接了。
生词和词组

英文	中文
the backpropagation procedure	反向传播算法（BP）
nothing more than	不过是，无非是
the key insight	核心观点，关键见解
subsequent module	后继模块
propagate	vt. 传播、传送，繁殖、增殖
prediction	预测
straightforward	adj. 简单的、坦率地、明确的 adv直接了当地，坦率地
a multilayer stack of modules	多层神经网络（这个应该算是另外一种说法吧）

分析与总结
看不懂的关键在于对于深度学习自己缺少一个深入的了解或者说推导的过程，并不知道梯度是如何向后传递的，并不知道梯度的作用，真的是翻译也是要一定的数学基础和理论基础，数学这个东西真的得去好好学习，真的很差劲。
超级长的注意翻译过程：The backpropagation procedure to compute the gradient of an objective function with respect to the weights of a multilayer stack of modules
- 断句翻译：反向传递算法，计算目标函数的梯度，多层神经网络的权重，
- 逻辑连接：多层神经网络的权重的目标函数的梯度，计算目标函数梯度的反向传递的算法
- 最终的结果：计算多层神经网络的权重的目标函数的梯度的反向传递算法

1月28日

第一段

原文
Many applications of deep learning use feedforward neural network architectures , which learn to map a fixed-size input (for example, an image) to a fixed-size output (for example, a probability for each of several categories). To go from one layer to the next, a set of units compute a weighted sum of their inputs from the previous layer and pass the result through a non-linear function. At present, the most popular non-linear function is the rectified linear unit (ReLU), which is simply the half-wave rectifier f(z) = max(z, 0). In past decades, neural nets used smoother non-linearities, such as tanh(z) or 1/(1 + exp(−z)), but the ReLU typically learns much faster in networks with many layers, allowing training of a deep supervised network without unsupervised pretraining. Units that are not in the input or output layer are conventionally called hidden units. The hidden layers can be seen as distorting the input in a non-linear way so that categories become linearly separable by the last layer .
译文
深度学习的很多应用都使用前馈神经网络架构，它学习把固定大小的输入（比如图片）映射到固定大小的输出（比如若干类别中每一个类别的概率）。从一层到下一层时，一组单元计算来自前一层的输入的加权和，并把结果传给一个非线性函数。目前最常用的非线性函数是修正线性单元（ReLU），它就是半波整流函数 f(z) = max(z, 0)。在过去几十年中，神经网络使用的是更平滑的非线性函数，比如 tanh(z) 或 1/(1 + exp(−z))，但是ReLU在多层网络中通常学习得快得多，使得深度监督网络不需要无监督预训练就可以直接训练。不在输入层或输出层的单元通常被称为隐藏单元。隐藏层可以看作是以非线性的方式扭曲输入，从而使各个类别在最后一层变得线性可分。
生词和词组

英文	中文
conventionally	adv. 按照惯例地、照常地
feedforward neural networks	前馈神经网络
a probability for	……的概率
the half-wave rectifier	半波整流器
typically	代表性的，极具特色，通常（可别翻译成典型的，多不通顺）

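Analysis and summary:
I could follow the text, but many questions remain; they are mathematical rather than linguistic.
- What is a half-wave rectifier?
- I know that supervised and unsupervised learning differ in whether the data are labelled, but how do the two differ in training results?
- What do linearly separable and not linearly separable mean? How is the input space distorted?
- What is pre-training for?
Key points of this passage:
- Units that are in neither the input nor the output layer are called hidden units; the hidden layers distort the input so that the categories become linearly separable by the last layer.
- The most popular non-linear function is the rectified linear unit (ReLU); its advantage is that it typically learns much faster in networks with many layers.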
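One way to see what "linearly separable" means, and how hidden units can distort the input to achieve it, is the XOR problem. The sketch below is an illustration under assumptions: the two hidden ReLU units use hand-picked weights and a bias (the source omits bias terms for simplicity), and linear separability is tested by a brute-force search over a small grid of candidate lines rather than by training.

```python
from itertools import product

def relu(z):
    # The half-wave rectifier: negative inputs become zero.
    return max(0.0, z)

def hidden(x1, x2):
    """Two hand-picked hidden ReLU units (hypothetical weights and biases)."""
    return (relu(x1 + x2), relu(x1 + x2 - 1.0))

def find_separating_line(points, labels):
    """Search a coarse grid for w1, w2, b with (w1*p + w2*q + b > 0) == label."""
    grid = [i * 0.5 for i in range(-6, 7)]
    for w1, w2, b in product(grid, repeat=3):
        predictions = [int(w1 * p + w2 * q + b > 0) for p, q in points]
        if predictions == labels:
            return (w1, w2, b)
    return None

inputs = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
labels = [0, 1, 1, 0]  # XOR of the two inputs

line_in_input_space = find_separating_line(inputs, labels)
hidden_points = [hidden(x1, x2) for x1, x2 in inputs]
line_in_hidden_space = find_separating_line(hidden_points, labels)
print(line_in_input_space, line_in_hidden_space)
```

No straight line separates the two classes in the original input space, but after the hidden layer has moved the points, a single line does.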

第二段

原文
In the late 1990s, neural nets and backpropagation were largely forsaken by the machine-learning community and ignored by the computer-vision and speech-recognition communities. It was widely thought that learning useful, multistage, feature extractors with little prior knowledge was infeasible. In particular, it was commonly thought that simple gradient descent would get trapped in poor local minima — weight configurations for which no small change would reduce the average error.
译文
在20世纪90年代末，神经网络和反向传播算法在很大程度上被机器学习界抛弃，也被计算机视觉和语音识别领域忽视。人们普遍认为，在几乎没有先验知识的情况下学习有用的、多级的特征提取器是不可行的。尤其是，人们通常认为简单的梯度下降会陷入较差的局部极小值，也就是这样的权重配置：任何微小的改变都无法降低平均误差。
生词和词组

英文	中文
speech-recognition communities	语音识别
forsake	放弃，断念
multistage	adj. 多级的，多阶段的，多节的 vt. 使多级的
infeasible	adj. 不可实行的
minima	n 极小值（minimum的复数），最小值
in the late 1990s	在20世纪90年代末
community	n. 社区，群落，共同体（不要翻译成社区了）

分析与总结
learning useful, multistage, feature extractors ：很明显都是形容词，但是这些形容词可能很奇怪，还是要调整一下。
- 学习有用的，多层级的，特征提取器
感觉如果要对机器学习更有一个跟好的理解，还是要学一遍的他的历史。
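In particular, it was commonly thought that simple gradient descent would get trapped in poor local minima — weight configurations for which no small change would reduce the average error. I was unsure whether the dash joins two complete sentences. It does not: what follows the dash is an appositive that explains "poor local minima", namely weight configurations that no small change can improve. My translation read oddly because I had not understood this.
I wanted to compare with the CSDN translation, but it simply omits the part after the dash.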

1月30日

第一段

原文
In practice, poor local minima are rarely a problem with large networks. Regardless of the initial conditions, the system nearly always reaches solutions of very similar quality. Recent theoretical and empirical results strongly suggest that local minima are not a serious issue in general. Instead, the landscape is packed with a combinatorially large number of saddle points where the gradient is zero, and the surface curves up in most dimensions and curves down in the remainder. The analysis seems to show that saddle points with only a few downward curving directions are present in very large numbers, but almost all of them have very similar values of the objective function. Hence, it does not much matter which of these saddle points the algorithm gets stuck at.
译文
在实践中，对于大型网络，较差的局部极小值很少成为问题。无论初始条件如何，系统几乎总能得到质量非常相近的解。最近的理论和实验结果有力地表明，局部极小值在一般情况下并不是一个严重的问题。相反，这个地形中布满了数量呈组合级增长的鞍点，这些点上梯度为零，曲面在大多数维度上向上弯曲，在其余维度上向下弯曲。分析似乎表明，只有少数几个向下弯曲方向的鞍点数量非常多，但它们的目标函数值几乎都非常接近。因此，算法最终陷在哪一个鞍点上并不太重要。
Vocabulary and phrases

English	Chinese
empirical	adj. 经验的，实证的
combinatorially	adv. 组合地
saddle	n. 鞍，鞍状物（saddle point：鞍点） vt. 使负担
get stuck at	被困在
remainder	n. 余数，剩余部分 adj. 剩余的
be packed with	塞满，挤满
curve up	向上弯曲
curve down	向下弯曲
in the remainder	在其余的（维度上）
landscape	解空间 (do not translate it as 风景 here)

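Analysis and summary
The logic looked slightly off to me. The text first says that local minima are not a problem and then continues with "Instead", so I expected the next part to say what the real problem is, or that local minima are useful, or that they do not matter at all. In what follows, however, I could not see any clear logical turn.
I think the CSDN translation avoids the hard parts: it drops many passages and only reaches the level of a loose paraphrase. It also does not explain the difference between saddle points and local minima, so it stays on the surface. For anything deeper you have to work it out yourself.
Key point: for large multilayer networks, poor local minima are not a big problem.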
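The turn marked by "Instead" may become clearer once a saddle point is compared with a minimum: both have zero gradient, but at a saddle point the surface still curves down along some directions. The sketch below is a minimal illustration, assuming a hypothetical three-variable function `f` chosen only for this note. It checks the gradient and the curvature along each axis with finite differences. Reading curvature off the axes works here only because `f` has no cross terms; a general function would need the eigenvalues of its Hessian.

```python
def f(p):
    # Hypothetical objective: curves up along x and y, down along z.
    x, y, z = p
    return x * x + y * y - z * z


def gradient(func, p, h=1e-5):
    # Central-difference estimate of each partial derivative.
    g = []
    for i in range(len(p)):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        g.append((func(hi) - func(lo)) / (2 * h))
    return g


def axis_curvatures(func, p, h=1e-3):
    # Second-difference estimate of the curvature along each coordinate axis.
    base = func(p)
    c = []
    for i in range(len(p)):
        hi, lo = list(p), list(p)
        hi[i] += h
        lo[i] -= h
        c.append((func(hi) - 2 * base + func(lo)) / (h * h))
    return c


origin = [0.0, 0.0, 0.0]
grad_at_origin = gradient(f, origin)
curv_at_origin = axis_curvatures(f, origin)
directions = ["up" if c > 0 else "down" for c in curv_at_origin]

print("gradient:", grad_at_origin)
print("curvature direction per axis:", directions)
# A small step along z lowers the objective, so the origin is not a minimum.
print("f after a small step along z:", f([0.0, 0.0, 0.1]))
```

With only three dimensions the example has two upward directions and one downward direction. The paragraph describes the same picture in very many dimensions, where most directions curve up and only a few curve down.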

Paragraph 2

Original text
Interest in deep feedforward networks was revived around 2006 (refs 31–34) by a group of researchers brought together by the Canadian Institute for Advanced Research (CIFAR). The researchers introduced unsupervised learning procedures that could create layers of feature detectors without requiring labelled data. The objective in learning each layer of feature detectors was to be able to reconstruct or model the activities of feature detectors (or raw inputs) in the layer below. By ‘pre-training’ several layers of progressively more complex feature detectors using this reconstruction objective, the weights of a deep network could be initialized to sensible values. A final layer of output units could then be added to the top of the network and the whole deep system could be fine-tuned using standard backpropagation. This worked remarkably well for recognizing handwritten digits or for detecting pedestrians, especially when the amount of labelled data was very limited.
Translation
大约在2006年，由加拿大高等研究院（CIFAR）召集的一组研究人员重新唤起了人们对深度前馈网络的兴趣（参考文献31–34）。研究人员提出了无监督学习方法，可以在不需要标注数据的情况下创建多层特征检测器。学习每一层特征检测器的目标，是能够重构或建模其下方一层特征检测器（或原始输入）的活动。利用这一重构目标“预训练”若干层逐渐复杂的特征检测器，就可以把深度网络的权重初始化为合理的值。随后可以在网络顶部加上最后一层输出单元，并用标准的反向传播对整个深度系统进行微调。这种方法在手写数字识别和行人检测上效果非常好，尤其是在标注数据非常有限的情况下。
Vocabulary and phrases

English	Chinese
deep feedforward networks	深度前馈网络
the Canadian Institute for Advanced Research (CIFAR)	加拿大高等研究院
detector	n. 探测器，检测器
model	塑造，建模，模仿 (after thinking for a long time, 建模 was the only word I could come up with, which says a lot about my vocabulary)
progressively	渐进地，逐步地
fine-tune	vt. 微调
unsupervised learning procedures	无监督学习方法 (过程 does not read well; 方法 is better)
layers of feature detectors	特征检测器层 reads awkwardly; use 由特征检测器构成的层 or 特征检测层

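Analysis and summary
To some extent it is fine to add verbs or pronouns so that the translation is easier to follow. There is no need to translate strictly word for word. This applies only to popular-science writing, where the goal is to explain rather than to do research.
Main idea: the method introduced by the researchers brought together by CIFAR learns feature detectors without labelled data and gives a deep network a sensible starting point, which mattered a great deal for deep learning.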
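The layer-by-layer procedure described in this paragraph can be sketched in a few lines. This is a rough illustration, not the method of the cited papers: it assumes a plain linear autoencoder trained by stochastic gradient descent, and the helper names `pretrain_layer` and `matvec`, the toy data and all hyperparameters are hypothetical. Each layer is trained only to reconstruct the activities of the layer below it, and no labels are used. The final step from the paragraph, adding an output layer and fine-tuning with backpropagation, is not shown.

```python
import random


def matvec(m, v):
    # Multiply matrix m (a list of rows) by vector v.
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def pretrain_layer(data, n_hidden, epochs=200, lr=0.05, seed=0):
    """Train one linear autoencoder layer on unlabelled data.

    Returns the encoder weights and the mean reconstruction error per epoch.
    """
    rng = random.Random(seed)
    n_in = len(data[0])
    enc = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_hidden)]
    dec = [[rng.uniform(-0.1, 0.1) for _ in range(n_hidden)] for _ in range(n_in)]
    history = []
    for _ in range(epochs):
        total = 0.0
        for x in data:
            h = matvec(enc, x)          # activities of this layer
            x_hat = matvec(dec, h)      # reconstruction of the layer below
            err = [a - b for a, b in zip(x_hat, x)]
            total += sum(e * e for e in err)
            # Gradient of the reconstruction error with respect to h.
            grad_h = [sum(dec[i][j] * err[i] for i in range(n_in))
                      for j in range(n_hidden)]
            for i in range(n_in):
                for j in range(n_hidden):
                    dec[i][j] -= lr * err[i] * h[j]
            for j in range(n_hidden):
                for k in range(n_in):
                    enc[j][k] -= lr * grad_h[j] * x[k]
        history.append(total / len(data))
    return enc, history


# Toy unlabelled data: 4-dimensional inputs built from two hidden factors.
rng = random.Random(1)
inputs = []
for _ in range(20):
    a, b = rng.uniform(-1, 1), rng.uniform(-1, 1)
    inputs.append([0.5 * a, 0.5 * b, 0.5 * (a + b), 0.5 * (a - b)])

# Layer 1 learns to reconstruct the raw inputs.
enc1, hist1 = pretrain_layer(inputs, n_hidden=3)
# Layer 2 learns to reconstruct the activities of layer 1.
layer1_out = [matvec(enc1, x) for x in inputs]
enc2, hist2 = pretrain_layer(layer1_out, n_hidden=2)

print("layer 1 reconstruction error:", hist1[0], "->", hist1[-1])
print("layer 2 reconstruction error:", hist2[0], "->", hist2[-1])
```

The encoder weights `enc1` and `enc2` play the role of the "sensible values" with which a deep network could then be initialized.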

January 31

Paragraph 1

Original text
The first major application of this pre-training approach was in speech recognition, and it was made possible by the advent of fast graphics processing units (GPUs) that were convenient to program and allowed researchers to train networks 10 or 20 times faster. In 2009, the approach was used to map short temporal windows of coefficients extracted from a sound wave to a set of probabilities for the various fragments of speech that might be represented by the frame in the centre of the window. It achieved record-breaking results on a standard speech recognition benchmark that used a small vocabulary and was quickly developed to give record-breaking results on a large vocabulary task. By 2012, versions of the deep net from 2009 were being developed by many of the major speech groups and were already being deployed in Android phones. For smaller data sets, unsupervised pre-training helps to prevent overfitting, leading to significantly better generalization when the number of labelled examples is small, or in a transfer setting where we have lots of examples for some ‘source’ tasks but very few for some ‘target’ tasks. Once deep learning had been rehabilitated, it turned out that the pre-training stage was only needed for small data sets.
Translation
这种预训练方法的第一个重要应用是在语音识别领域，而这得益于快速图形处理器（GPU）的出现：GPU便于编程，使研究人员训练网络的速度提高了10到20倍。2009年，该方法被用来把从声波中提取的系数的短时间窗口映射为一组概率，这些概率对应于窗口中央那一帧可能表示的各种语音片段。它在一个小词汇量的标准语音识别基准测试上取得了破纪录的结果，并很快发展到在大词汇量任务上也取得了破纪录的结果。到2012年，许多主要的语音团队都在开发2009年那个深度网络的各种版本，并且这些版本已经部署在Android手机上。对于较小的数据集，无监督预训练有助于防止过拟合，在标注样本数量较少时，或者在迁移场景下（一些“源”任务有大量样本，而一些“目标”任务只有很少样本），都能显著提高泛化能力。深度学习重新得到认可之后，人们发现预训练阶段只有在数据集较小时才需要。
Free translation:
预训练方法的第一个重要应用是语音识别，这得益于GPU的出现：GPU便于编程，训练速度提高了10到20倍。2009年，这种方法被用来把声波系数的短时间窗口映射为窗口中央一帧对应各语音片段的概率。它先在小词汇量的标准基准测试上破了纪录，很快又在大词汇量任务上破了纪录。到2012年，许多主要的语音团队都在开发这种深度网络的各种版本，并已部署到Android手机上。数据集较小时，无监督预训练可以防止过拟合，在标注样本较少或迁移学习（源任务样本多、目标任务样本少）的情况下都能明显改善泛化。后来发现，预训练只有在数据集较小时才需要。
Vocabulary and phrases

English	Chinese
the advent of fast graphics processing units	快速图形处理器的问世
temporal	adj. 时间的，时序的 (in this text it does not mean 暂时的)
coefficients	n. 系数
fragment	n. 碎片，片段 vt. 使裂开
benchmark	n. 基准 vt. 用基准问题测试
overfitting	n./v. 过拟合
rehabilitate	vt. 使康复；使恢复名誉；使复兴
graphics processing unit	GPU，图形处理器
a standard speech recognition benchmark	标准的语音识别基准测试

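Analysis and summary
Difficult sentence: the approach was used to map short temporal windows of coefficients extracted from a sound wave to a set of probabilities for the various fragments of speech that might be represented by the frame in the centre of the window. (The modifier is very long and hard to follow.)
- The translations found online are frankly evasive: they skip or shorten whatever they can and take the easiest route.
- I asked a classmate, and it seems some deeper background is needed. I am not at that level yet, and the online translations look wrong to me, so I will not guess here.
Key insight: unsupervised pre-training of neural networks mainly helps when the data set is small, and its first major application was speech recognition.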
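The difficult sentence describes a mapping whose shape is easier to see in code. The sketch below is an assumption-laden illustration, not the 2009 system: the frames are random numbers standing in for acoustic coefficients, a single random linear layer with a softmax stands in for the deep network, and the names `context_windows` and `fragment_probabilities` are hypothetical. What it does show is the data flow: a short window of neighbouring frames goes in, and one probability per candidate speech fragment comes out for the frame in the centre of the window.

```python
import math
import random


def context_windows(frames, radius):
    """Concatenate each frame with `radius` neighbours on both sides.

    Frames near the edges are padded by repeating the first or last frame.
    """
    windows = []
    last = len(frames) - 1
    for centre in range(len(frames)):
        window = []
        for offset in range(-radius, radius + 1):
            idx = min(max(centre + offset, 0), last)
            window.extend(frames[idx])
        windows.append(window)
    return windows


def softmax(scores):
    # Turn arbitrary scores into probabilities that sum to one.
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fragment_probabilities(window, weights):
    # One score per speech fragment, then a softmax over the scores.
    scores = [sum(w * x for w, x in zip(row, window)) for row in weights]
    return softmax(scores)


rng = random.Random(0)
n_frames, n_coef, radius, n_fragments = 6, 3, 2, 4

# Stand-in for coefficients extracted from a sound wave: one vector per frame.
frames = [[rng.uniform(-1, 1) for _ in range(n_coef)] for _ in range(n_frames)]
windows = context_windows(frames, radius)

# Stand-in for a trained network: a single layer with random weights.
weights = [[rng.uniform(-1, 1) for _ in range(len(windows[0]))]
           for _ in range(n_fragments)]

# Probabilities for the fragments that frame 2 (the window centre) might represent.
probs = fragment_probabilities(windows[2], weights)
print(len(windows[2]), "inputs ->", len(probs), "probabilities:", probs)
```

Read this way, "short temporal windows of coefficients" is the input `windows[2]`, "a set of probabilities for the various fragments of speech" is the output `probs`, and "the frame in the centre of the window" is `frames[2]`.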

Paragraph 2

Original text
There was, however, one particular type of deep, feedforward network that was much easier to train and generalized much better than networks with full connectivity between adjacent layers. This was the convolutional neural network (ConvNet). It achieved many practical successes during the period when neural networks were out of favour and it has recently been widely adopted by the computer-vision community.
Translation
然而，有一种特殊类型的深度前馈网络，比相邻层之间全连接的网络更容易训练，泛化能力也好得多。这就是卷积神经网络（ConvNet）。在神经网络不受青睐的那段时期，它取得了许多实际应用上的成功，最近已被计算机视觉界广泛采用。
Vocabulary and phrases

English	Chinese
convolutional	卷积的
out of favour	失宠，不受欢迎

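Analysis and summary
Deep learning is one technique for implementing machine learning, and it covers both supervised and unsupervised learning algorithms. The familiar convolutional neural network is usually trained as a supervised learner, while the generative adversarial network is an example of unsupervised learning. Whether learning counts as supervised depends only on whether the training samples are labelled.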
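The paragraph contrasts ConvNets with "networks with full connectivity between adjacent layers" but does not say here what the difference is. One commonly cited difference is that a convolutional layer reuses the same small set of weights at every position, while a fully connected layer has a separate weight for every pair of input and output units. The sketch below only counts parameters and applies a one-dimensional filter. The function names and layer sizes are hypothetical, and the parameter count alone is not offered as the full explanation of why ConvNets train and generalize better.

```python
def dense_params(n_in, n_out):
    # Fully connected layer: one weight per input-output pair, plus biases.
    return n_in * n_out + n_out


def conv1d_params(kernel_size, in_channels, out_channels):
    # Convolutional layer: one small kernel per channel pair, shared across
    # all positions, plus one bias per output channel.
    return kernel_size * in_channels * out_channels + out_channels


def conv1d_valid(signal, kernel):
    # Slide the same kernel over the signal (no padding, stride 1).
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k))
            for i in range(len(signal) - k + 1)]


print("dense 1000 -> 1000:", dense_params(1000, 1000), "parameters")
print("conv, kernel 5, 1 -> 1 channel:", conv1d_params(5, 1, 1), "parameters")
print("filtered signal:", conv1d_valid([1, 2, 3, 4], [1, 0, -1]))
```

The convolutional layer keeps the same six parameters however long the input signal is, whereas the fully connected layer grows with the product of the input and output sizes.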
