Image Recognition Demystified

by gk_

Nothing in machine learning captivates the imagination quite like the ability to recognize images. Identifying imagery must connote “intelligence,” right? Let’s demystify.

The ability to “see,” when it comes to software, begins with the ability to classify. Classification is pattern matching with data. Images are data in the form of 2-dimensional matrices.

Image recognition is classifying data into one bucket out of many. This is useful work: you can classify an entire image or things within an image.

One of the classic and quite useful applications of image classification is optical character recognition (OCR): going from images of written language to structured text.

This can be done for any alphabet and a wide variety of writing styles.

Steps in the process

We’ll build code to recognize numerical digits in images and show how this works. This will take 3 steps:

gather and organize data to work with (85% of the effort)
build and test a predictive model (10% of the effort)
use the model to recognize images (5% of the effort)

Preparing the data is by far the largest part of our work, and this is true of most data science work. There’s a reason it’s called DATA science!

The building of our predictive model and its use in predicting values is all math. We’re using software to iterate through data, to iteratively forge “weights” within mathematical equations, and to work with data structures. The software isn’t “intelligent”; it works mathematical equations to do narrow knowledge work, in this case recognizing images of digits.

In practice, most of what people label “AI” is really just software performing knowledge work.

Our predictive model and data

We’ll be using one of the simplest predictive models: the “k-nearest neighbors” or “kNN” classifier, first described by E. Fix and J.L. Hodges in 1951.

A simple explanation of this algorithm is here and a video of its math here. And also here for those who want to build the algorithm from scratch.

Here’s how it works: imagine a graph of labeled data points. To classify a new point, draw a circle around it just large enough to capture its k nearest neighbors, and give it the label most of those neighbors share. Each candidate value of k is validated against your data.

The validation error for k in your data has a minimum which can be determined.

Given the ‘best’ value for k you can classify other points with some measure of precision.
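To make the idea concrete, here is a minimal from-scratch sketch in plain Python. It is not the code from the original article: the toy points, the labels "a" and "b", and the helper names `knn_predict` and `validation_error` are all made up for illustration. One training point is deliberately mislabeled so that the choice of k matters.

```python
from collections import Counter
import math

def knn_predict(train_points, train_labels, query, k):
    """Label `query` by majority vote among its k nearest training points."""
    # Sort training indices by Euclidean distance to the query point.
    order = sorted(range(len(train_points)),
                   key=lambda i: math.dist(train_points[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def validation_error(train_points, train_labels, val_points, val_labels, k):
    """Fraction of validation points that are misclassified for this k."""
    wrong = sum(knn_predict(train_points, train_labels, p, k) != y
                for p, y in zip(val_points, val_labels))
    return wrong / len(val_points)

# Hypothetical 2-D data: two clusters, plus one noisy "b" inside the "a" area.
train_points = [(1, 1), (1, 2), (2, 1), (2, 2), (2.5, 2.5),
                (6, 6), (6, 7), (7, 6), (7, 7)]
train_labels = ["a", "a", "a", "a", "b", "b", "b", "b", "b"]
val_points = [(1.5, 1.5), (3, 3), (6.5, 6.5), (5, 5)]
val_labels = ["a", "a", "b", "b"]

# Try several values of k and keep the one with the lowest validation error.
errors = {k: validation_error(train_points, train_labels,
                              val_points, val_labels, k)
          for k in (1, 3, 5)}
best_k = min(errors, key=errors.get)
print(errors, best_k)
print(knn_predict(train_points, train_labels, (2, 3), best_k))
```

With k = 1 the noisy point misleads the vote for its closest neighbor; a slightly larger k outvotes it, which is the kind of minimum in validation error described above.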

We’ll use scikit-learn’s kNN algorithm to avoid building the math ourselves. Conveniently, this library also provides our image data.

Let’s begin.

The code is here. We’re using an IPython notebook, which is a productive way of working on data science projects. The code is Python and our example is borrowed from scikit-learn.

Start by importing the necessary libraries:
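The notebook cell for this step did not survive in this copy of the article. A plausible minimal version, assuming only scikit-learn is needed for the data and the model, might look like this:

```python
# scikit-learn ships both the digits dataset and a kNN implementation.
from sklearn import datasets, neighbors
```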

Next we organize our data:
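The original cell is missing here as well, so the following is a sketch rather than the author's code. It assumes the split is a simple slice of scikit-learn's digits dataset and that `fraction` is 0.85, which is consistent with the 1527 training images reported; the variable names are assumptions, and the test count may differ by one from the printout depending on how the original slice was written.

```python
from sklearn import datasets

digits = datasets.load_digits()
X, y = digits.data, digits.target   # X: one row of 64 pixel values per image

fraction = 0.85                     # share of the images used for training
split = int(fraction * len(X))

X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

print("training images: %d, test images: %d" % (len(X_train), len(X_test)))
```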

training images: 1527, test images: 269

You can change the fraction of the data used for training to have more or less test data; we’ll see shortly how this impacts our model’s accuracy.

By now you’re probably wondering: how are the digit images organized? They are arrays of values, one for each pixel in an 8x8 image. Let’s inspect one.

# one-dimension
[  0.   1.  13.  16.  15.   5.   0.   0.   0.   4.  16.   7.  14.  12.   0.   0.
   0.   3.  12.   2.  11.  10.   0.   0.   0.   0.   0.   0.  14.   8.   0.   0.
   0.   0.   0.   3.  16.   4.   0.   0.   0.   0.   1.  11.  13.   0.   0.   0.
   0.   0.   9.  16.  14.  16.   7.   0.   0.   1.  16.  16.  15.  12.   5.   0.]

# two-dimensions
[[  0.   1.  13.  16.  15.   5.   0.   0.]
 [  0.   4.  16.   7.  14.  12.   0.   0.]
 [  0.   3.  12.   2.  11.  10.   0.   0.]
 [  0.   0.   0.   0.  14.   8.   0.   0.]
 [  0.   0.   0.   3.  16.   4.   0.   0.]
 [  0.   0.   1.  11.  13.   0.   0.   0.]
 [  0.   0.   9.  16.  14.  16.   7.   0.]
 [  0.   1.  16.  16.  15.  12.   5.   0.]]

The same image data is shown as a flat (one-dimensional) array and again as an array of 8 arrays (two-dimensional). Think of each row of the image as an array of 8 pixels; there are 8 rows. We could ignore the gray-scale (the values) and work with 0’s and 1’s, which would simplify the math a bit.
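The relationship between the two views can be sketched with NumPy, using the 64 values printed above. The names `flat`, `grid` and `binary` are ours, and treating any non-zero pixel as 1 is just one possible way to drop the gray-scale.

```python
import numpy as np

# The 64 pixel values from the article, as a flat (one-dimensional) array.
flat = np.array([
    0, 1, 13, 16, 15, 5, 0, 0,
    0, 4, 16, 7, 14, 12, 0, 0,
    0, 3, 12, 2, 11, 10, 0, 0,
    0, 0, 0, 0, 14, 8, 0, 0,
    0, 0, 0, 3, 16, 4, 0, 0,
    0, 0, 1, 11, 13, 0, 0, 0,
    0, 0, 9, 16, 14, 16, 7, 0,
    0, 1, 16, 16, 15, 12, 5, 0,
], dtype=float)

grid = flat.reshape(8, 8)         # 8 rows of 8 pixels each
binary = (grid > 0).astype(int)   # drop the gray-scale: any ink becomes 1

print(grid.shape)
print(binary)
```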

We can ‘plot’ this to see this array in its ‘pixelated’ form.
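The plot itself is not reproduced in this copy. As a dependency-free stand-in, the sketch below renders the same 8x8 values as text, with darker characters for larger pixel values. The `shades` palette and the `render` helper are our own invention, not part of the original notebook.

```python
rows = [
    [0, 1, 13, 16, 15, 5, 0, 0],
    [0, 4, 16, 7, 14, 12, 0, 0],
    [0, 3, 12, 2, 11, 10, 0, 0],
    [0, 0, 0, 0, 14, 8, 0, 0],
    [0, 0, 0, 3, 16, 4, 0, 0],
    [0, 0, 1, 11, 13, 0, 0, 0],
    [0, 0, 9, 16, 14, 16, 7, 0],
    [0, 1, 16, 16, 15, 12, 5, 0],
]

shades = " .:*#"   # light to dark; pixel values run from 0 to 16

def render(rows):
    """Map each pixel value to a character so the digit shows up as text."""
    lines = []
    for row in rows:
        # Each character is doubled so the image looks roughly square.
        lines.append("".join(shades[int(v) * (len(shades) - 1) // 16] * 2
                             for v in row))
    return "\n".join(lines)

print(render(rows))
```

In a notebook, matplotlib's `imshow` on the 8x8 array is the more usual way to get an actual picture.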

What digit is this? Let’s ask our model, but first we need to build it.
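The model-building cell is also missing from this copy. A sketch of what it might look like with scikit-learn's `KNeighborsClassifier` follows; the 85% slice and the default number of neighbors are assumptions on our part, so the score you see may not match the article's to the last digit.

```python
from sklearn import datasets, neighbors

digits = datasets.load_digits()
X, y = digits.data, digits.target

# About 85% of the images for training, in line with the 1527 reported.
split = int(0.85 * len(X))
X_train, y_train = X[:split], y[:split]
X_test, y_test = X[split:], y[split:]

knn = neighbors.KNeighborsClassifier()   # scikit-learn's default is k = 5
knn.fit(X_train, y_train)

# score() returns the fraction of test images labeled correctly.
score = knn.score(X_test, y_test)
print("KNN score: %f" % score)
```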

KNN score: 0.951852

Against our test data our nearest-neighbor model had an accuracy score of 95%. Not bad. Go back and change the ‘fraction’ value to see how this impacts the score.
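Now we can ask the model about the image we inspected earlier. The sketch below feeds the 64 printed pixel values to a freshly trained classifier; as before, the split and the default settings are our assumptions rather than the original notebook's code.

```python
import numpy as np
from sklearn import datasets, neighbors

digits = datasets.load_digits()
split = int(0.85 * len(digits.data))
knn = neighbors.KNeighborsClassifier()
knn.fit(digits.data[:split], digits.target[:split])

# The 64 pixel values printed earlier in the article, as one flat row.
sample = np.array([
    0, 1, 13, 16, 15, 5, 0, 0,
    0, 4, 16, 7, 14, 12, 0, 0,
    0, 3, 12, 2, 11, 10, 0, 0,
    0, 0, 0, 0, 14, 8, 0, 0,
    0, 0, 0, 3, 16, 4, 0, 0,
    0, 0, 1, 11, 13, 0, 0, 0,
    0, 0, 9, 16, 14, 16, 7, 0,
    0, 1, 16, 16, 15, 12, 5, 0,
], dtype=float)

# predict() expects a 2-D array: one row per image.
prediction = knn.predict(sample.reshape(1, -1))
print(repr(prediction))
```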

array([2])

The model predicts that the array shown above is a ‘2’, which looks correct.

Let’s try a few more. Remember these are digits from our test data; we did not use these images to build our model (very important).

Not bad.

We can create a fictional digit and see what our model thinks about it.
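The fictional digit used in the original is not shown in this copy, so the 8x8 array below is entirely made up: a thick vertical bar using the same 0 to 16 value range as the dataset. The rest of the sketch repeats the assumed training setup.

```python
import numpy as np
from sklearn import datasets, neighbors

digits = datasets.load_digits()
split = int(0.85 * len(digits.data))
knn = neighbors.KNeighborsClassifier()
knn.fit(digits.data[:split], digits.target[:split])

# A hypothetical hand-drawn "digit": a two-pixel-wide vertical stroke.
fictional = np.zeros((8, 8))
fictional[:, 3:5] = 16

# Flatten to one row of 64 values before asking the model.
guess = knn.predict(fictional.reshape(1, -1))
print(guess)
```

Whatever we draw, the model can only answer with one of the ten labels it was trained on.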

If we had a collection of nonsensical digit images we could add those to our training data with a non-numeric label: just another classification.

So how does image recognition work?

image data is organized: both training and test, with labels (X, y)

Training data is kept separate from test data, which also means we remove duplicates (or near-duplicates) between them.

a model is built using one of several mathematical models (kNN, logistic regression, convolutional neural network, etc.)

Which type of model you choose depends on your data and the type and complexity of the classification work.

new data is put into the model to generate a prediction

This is lightning fast: the result of a single mathematical calculation.

If you have a collection of pictures with and without cats, you can build a model to classify whether a picture contains a cat. Notice you need training images that are devoid of any cats for this to work.

Of course you can apply multiple models to a picture and identify several things.

Large Data

A significant challenge in all of this is the size of each image. 8x8 is not a reasonable image size for anything but small digits; it’s not uncommon to be dealing with images of 500x500 pixels or larger. That’s 250,000 pixels per image, so 10,000 training images means doing math on 2.5 billion values to build a model. And the math isn’t just addition or multiplication: we’re multiplying matrices, multiplying by floating-point weights, calculating derivatives. This is why processing power (and memory) is key in certain machine learning applications.
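The arithmetic is easy to check. The memory estimate at the end is our own addition and assumes one 4-byte float per pixel value, which is only one possible storage choice.

```python
width, height = 500, 500
n_images = 10_000

pixels_per_image = width * height            # 250,000
total_values = pixels_per_image * n_images   # 2,500,000,000

# Assumption for illustration: one 4-byte (32-bit) float per value.
bytes_needed = total_values * 4

print(f"{pixels_per_image:,} pixels per image")
print(f"{total_values:,} values in the training set")
print(f"{bytes_needed / 1e9:.0f} GB as 32-bit floats")
```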

There are strategies to deal with this image size problem:

use graphics processing units (GPUs) to speed up the math
reduce images to smaller dimensions, without losing clarity
reduce colors to gray-scale and gradients (you can still see the cat)
look at sections of an image to find what you’re looking for
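Two of these strategies, gray-scale conversion and shrinking, can be sketched with NumPy. This is an illustration under simple assumptions: a random array stands in for a real photo, gray-scale is a plain average of the color channels, and shrinking is done by averaging square blocks, which does lose some fine detail. The helper names are hypothetical.

```python
import numpy as np

def to_grayscale(rgb):
    """Average the three color channels into a single brightness value."""
    return rgb.mean(axis=2)

def downsample(gray, factor):
    """Shrink an image by averaging each factor x factor block of pixels."""
    h, w = gray.shape
    h, w = h - h % factor, w - w % factor   # trim edges that don't fill a block
    blocks = gray[:h, :w].reshape(h // factor, factor, w // factor, factor)
    return blocks.mean(axis=(1, 3))

# A random 500x500 RGB array stands in for a real photo.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(500, 500, 3)).astype(float)

gray = to_grayscale(image)     # 750,000 values -> 250,000
small = downsample(gray, 5)    # 250,000 values -> 10,000

print(image.size, gray.size, small.size)
```

Going from 750,000 values per image to 10,000 shrinks the math for every image in the training set by the same factor.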

The good news is that once a model is built, no matter how laborious that was, the prediction is fast. Image processing is used in applications ranging from facial recognition to OCR to self-driving cars.

Now you understand the basics of how this works.

Translated from: https://www.freecodecamp.org/news/image-recognition-demystified-fc9c89b894ce/