Creating Van Gogh-Style Images with OpenCV

OpenCV is a library with 20 years of continuous development behind it. That is an age for introspection and for pondering one's destiny. Are there projects built on the library that have made someone's life better and happier? Can you build one yourself? While looking for answers and exploring new OpenCV modules, I would like to collect applications that produce striking visual effects, so that the "wow" comes first and only afterwards the realization that computer vision is at work.

The style transfer experiment deserves to be described first. The artistic styles of great masters are transferred to photographs. This article explains the gist of the procedure and introduces a newer flavor of the OpenCV library: OpenCV.js, the JavaScript version.

Style transfer

I regret to inform the machine-learning naysayers that a deep convolutional network is the core component of this article. Because it works. OpenCV does not let you train neural networks, but it can run existing models. We are going to use pre-trained CycleGAN networks. Thanks to the authors, the models can be downloaded for free and used to turn images of apples into oranges, horses into zebras, satellite images into maps, winter pictures into summer ones, and so on. Moreover, the training procedure yields two generator models, one for each direction: train a network to turn winter into summer and you also get a model that paints winter scenery onto summer pictures. It would be a shame to pass up such an opportunity. In our example we use models that convert photographs into paintings in the style of particular artists, namely Vincent van Gogh, Claude Monet, and Paul Cezanne, as well as a whole genre of Japanese prints called Ukiyo-e. That gives us four separate networks. It is worth noting that each network was trained on a large number of works by the artist in question, because the authors wanted the network to absorb the artistic style as a whole rather than transfer the style of one particular painting.

OpenCV.js

OpenCV is a C++ library, and for most of its functionality wrappers that call the native methods can be generated automatically. Python and Java wrappers are supported officially. On top of that, community solutions exist for Go and PHP. If you have used OpenCV from other languages, it would be great to hear about your experience and who made it possible. OpenCV.js is a project implemented in 2017 thanks to Google Summer of Code. The OpenCV deep learning module, incidentally, was also created and significantly improved within that program. Unlike the bindings for other languages, OpenCV.js is currently not a JavaScript wrapper around native methods but a full compilation of the library with Emscripten, which uses LLVM and Clang. Emscripten lets you convert a C or C++ application or library into a .js file that can be run in a browser.

For instance,

#include <iostream>

int main(int argc, char** argv) {
    std::cout << "Hello, world!" << std::endl;
    return 0;
}

is compiled into asm.js with

emcc main.cpp -s WASM=0 -o main.js

and then launched from a page:

<!DOCTYPE html>
<html>
<head>
<script src="main.js" type="text/javascript"></script>
</head>
</html>

OpenCV.js can be included in a page as follows (nightly build):

<script src="https://docs.opencv.org/master/opencv.js" type="text/javascript"></script>

Image Upload

In OpenCV.js, images can be read from elements such as canvas or img. This means that image files have to be supplied by the user. For convenience, the helper function addFileInputHandler automatically loads the image into the specified canvas element as soon as a file has been chosen on disk.

var utils = new Utils('');
utils.addFileInputHandler('fileInput', 'canvasInput');
var img = cv.imread('canvasInput');

where

<input type="file" id="fileInput" name="file" accept="image/*" />
<canvas id="canvasInput"></canvas>

Note that img will be a 4-channel RGBA image, which differs from the typical behavior of cv::imread, which creates a BGR image. Keep this in mind, for instance, when porting algorithms from other languages. Rendering is simple: a single call to imshow with the id of the target canvas is enough (both RGB and RGBA images are accepted).

cv.imshow("canvasOutput", img);
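To make the channel-order difference concrete, here is a minimal sketch in plain JavaScript of what an RGBA-to-BGR conversion does to the raw pixel bytes. The function name rgbaToBgr and the sample data are hypothetical and only for illustration; in a page you would normally let cv.cvtColor do this work.

```javascript
// Hypothetical helper (not part of OpenCV.js): drops the alpha channel
// and swaps the red and blue channels of an interleaved pixel buffer.
function rgbaToBgr(rgba) {
  if (rgba.length % 4 !== 0) {
    throw new Error('expected 4 values per pixel');
  }
  const pixelCount = rgba.length / 4;
  const bgr = new Uint8Array(pixelCount * 3);
  for (let i = 0; i < pixelCount; i++) {
    bgr[3 * i] = rgba[4 * i + 2];     // B
    bgr[3 * i + 1] = rgba[4 * i + 1]; // G
    bgr[3 * i + 2] = rgba[4 * i];     // R
  }
  return bgr;
}

// Two pixels: opaque red, then half-transparent blue.
const sampleRgba = new Uint8ClampedArray([255, 0, 0, 255, 0, 0, 255, 128]);
const sampleBgr = rgbaToBgr(sampleRgba);
console.log(Array.from(sampleBgr)); // [0, 0, 255, 255, 0, 0]
```

The alpha value is simply discarded here, which is fine for photographs but would lose information for images with real transparency.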

Algorithm

The whole image-processing algorithm essentially boils down to running a neural network. Let its inner workings remain a mystery; all we have to do is prepare the proper input and interpret the prediction (the output of the network) correctly.

In this example the network takes a four-dimensional tensor of float values in the interval [-1, 1]. Its dimensions, ordered from the slowest-varying to the fastest-varying, are the image index, the channel, the height, and the width. This layout is called NCHW, and the tensor itself is called a blob (binary large object). Pre-processing converts an OpenCV image, whose channel intensities are interleaved and stored as unsigned char values in [0, 255], into an NCHW blob with values in [-1, 1].

A piece of the Nizhny Novgorod Kremlin (as a human sees it)

Interleaved representation (as it is stored in OpenCV)

Planar representation
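The move from the interleaved to the planar layout, together with the rescaling to [-1, 1], can be sketched in plain JavaScript for a single image (N = 1). The function interleavedToBlob is a hypothetical illustration of the arithmetic, not the OpenCV implementation; the constants 127.5 and 1/127.5 are the ones used later in this article.

```javascript
// Hypothetical helper: converts an interleaved (HWC) 8-bit image into a
// planar (CHW) Float32Array with values rescaled from [0, 255] to [-1, 1].
function interleavedToBlob(pixels, height, width, channels) {
  const mean = 127.5;
  const scale = 1.0 / 127.5;
  const planeSize = height * width;
  const blob = new Float32Array(channels * planeSize);
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      for (let c = 0; c < channels; c++) {
        const src = (y * width + x) * channels + c; // interleaved index
        const dst = c * planeSize + y * width + x;  // planar index
        blob[dst] = (pixels[src] - mean) * scale;
      }
    }
  }
  return blob;
}

// A 1x2 image with 3 channels: pixel 0 = (0, 51, 255), pixel 1 = (255, 204, 0).
const tinyImage = new Uint8Array([0, 51, 255, 255, 204, 0]);
const tinyBlob = interleavedToBlob(tinyImage, 1, 2, 3);
// Plane 0 holds the first channel of both pixels, plane 1 the second, and so on.
console.log(Array.from(tinyBlob));
```

After the conversion, all values of one channel sit next to each other in memory, which is the order the network expects.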

Post-processing requires the inverse transformation: the network returns an NCHW blob with values in the interval [-1, 1], which has to be repacked into an image, rescaled to [0, 255], and converted to unsigned char. Taking into account the specifics of reading and displaying images in OpenCV.js, the following steps take shape:

imread -> RGBA -> BGR [0, 255] -> NCHW [-1, 1] -> [network]
[network] -> NCHW [-1, 1] -> RGB [0, 255] -> imshow

Looking at the resulting pipeline, a few questions arise: why can't the network take RGBA directly and return RGB? Why are the extra steps of mean subtraction and normalization needed? The answer is that a neural network is a mathematical object that performs computations on input data drawn from a specific distribution. Here it was trained to receive data of exactly this kind, so to get the desired results one has to reproduce the pre-processing the authors used during training.
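The output half of the pipeline can be sketched in plain JavaScript as well. The helper blobToRgba below is hypothetical: it assumes a single-image blob whose three planes are stored in B, G, R order (as in the pipeline above) and produces RGBA bytes of the kind a canvas works with.

```javascript
// Hypothetical helper: converts a planar 3-channel blob (planes in B, G, R
// order, values in [-1, 1]) into interleaved RGBA bytes in [0, 255].
function blobToRgba(blob, height, width) {
  const planeSize = height * width;
  // Uint8ClampedArray rounds to the nearest integer and saturates to [0, 255].
  const rgba = new Uint8ClampedArray(planeSize * 4);
  for (let i = 0; i < planeSize; i++) {
    rgba[4 * i] = blob[2 * planeSize + i] * 127.5 + 127.5; // R from plane 2
    rgba[4 * i + 1] = blob[planeSize + i] * 127.5 + 127.5; // G from plane 1
    rgba[4 * i + 2] = blob[i] * 127.5 + 127.5;             // B from plane 0
    rgba[4 * i + 3] = 255;                                 // fully opaque
  }
  return rgba;
}

// A 1x2 output: B plane = [-1, 1], G plane = [0.2, -0.2], R plane = [1, -1].
const tinyOutput = new Float32Array([-1, 1, 0.2, -0.2, 1, -1]);
const tinyRgba = blobToRgba(tinyOutput, 1, 2);
console.log(Array.from(tinyRgba)); // [255, 153, 0, 255, 0, 102, 255, 255]
```

Values that fall slightly outside [-1, 1] are saturated rather than wrapped around, so small overshoots do not produce color artifacts.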

Implementation

The neural network that we are going to run is stored as a binary file, which first has to be loaded into the local file system.

var net;
var url = 'style_vangogh.t7';
utils.createFileFromUrl('style_vangogh.t7', url, () => {
    net = cv.readNet('style_vangogh.t7');
});

By the way, url is a regular, fully functional link. Here we simply load the file stored next to the current HTML page, but it can be replaced with the original source (in which case the download may take longer).

var imgRGBA = cv.imread('canvasInput');
var imgBGR = new cv.Mat(imgRGBA.rows, imgRGBA.cols, cv.CV_8UC3);
cv.cvtColor(imgRGBA, imgBGR, cv.COLOR_RGBA2BGR);

A 4D blob is created: blobFromImage converts the data to float and applies the normalization constants. Then the network is run.

var blob = cv.blobFromImage(imgBGR, 1.0 / 127.5,  // scale factor
                            {width: imgBGR.cols, height: imgBGR.rows},  // size
                            [127.5, 127.5, 127.5, 0]);  // mean to subtract
net.setInput(blob);
var out = net.forward();

The result is converted back to an image of the required type with values in the interval [0, 255]:

// Normalization of values from the interval [-1, 1] to [0, 255]
var outNorm = new cv.Mat();
out.convertTo(outNorm, cv.CV_8U, 127.5, 127.5);

// Creation of an interleaved image from the planar blob
var outHeight = out.matSize[2];
var outWidth = out.matSize[3];
var planeSize = outHeight * outWidth;

var data = outNorm.data;
var b = cv.matFromArray(outHeight, outWidth, cv.CV_8UC1, data.slice(0, planeSize));
var g = cv.matFromArray(outHeight, outWidth, cv.CV_8UC1, data.slice(planeSize, 2 * planeSize));
var r = cv.matFromArray(outHeight, outWidth, cv.CV_8UC1, data.slice(2 * planeSize, 3 * planeSize));

var vec = new cv.MatVector();
vec.push_back(r);
vec.push_back(g);
vec.push_back(b);
var rgb = new cv.Mat();
cv.merge(vec, rgb);

// Result rendering
cv.imshow("canvasOutput", rgb);

At the moment, the OpenCV.js bindings are generated semi-automatically, which means that not every module and method gets a corresponding JavaScript signature. For the dnn module, for instance, the list of exposed functions is defined as follows:

dnn = {'dnn_Net': ['setInput', 'forward'],
       '': ['readNetFromCaffe', 'readNetFromTensorflow',
            'readNetFromTorch', 'readNetFromDarknet',
            'readNetFromONNX', 'readNet', 'blobFromImage']}

The last conversion, which splits the blob into three channels and merges them into an image, can in fact be performed by a single method, imagesFromBlob, which has not been added to the list above yet. That could be your first contribution to the development of OpenCV, couldn't it?
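Because the set of exposed functions depends on the build, it can be handy to check at run time which ones are present before relying on them. The sketch below uses a hypothetical helper, missingBindings, and a stand-in object instead of the real cv module so that it runs anywhere; in a page you would pass cv itself.

```javascript
// Hypothetical helper: returns the names that the given module object
// does not expose as functions.
function missingBindings(moduleObject, names) {
  return names.filter((name) => typeof moduleObject[name] !== 'function');
}

// Stand-in for the real `cv` object, for illustration only.
const fakeCv = { readNet() {}, blobFromImage() {} };
const missing = missingBindings(fakeCv, ['readNet', 'blobFromImage', 'imagesFromBlob']);
console.log(missing); // only 'imagesFromBlob' is reported for this stand-in
```

If a function turns out to be missing, the code can fall back to a manual implementation such as the plane-by-plane merge shown earlier.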