Julia could possibly be the biggest threat to Python. For a variety of applications, Julia is hands-down faster than Python and is almost as fast as C. Julia also offers features like multiple dispatch and metaprogramming that give it an edge over Python.

At the same time, Python is established, widely used, and has a variety of time-tested packages. Whether to switch to Julia is a hard question to answer. Often the answer is a frustrating “It depends.”

To help showcase Julia and to address the question of whether to use it, I’ve taken samples of deep learning code from both languages and placed them side by side for easy comparison. I will walk through training a VGG19 model on the CIFAR10 dataset.

Models

Deep learning models can be huge and often take a lot of work to define, especially when they contain specialized layers like those in ResNet [1]. We will use a medium-sized model (no pun intended), VGG19, for this comparison [2].

VGG19 in Python


I’ve chosen Keras for our Python implementation because its lightweight and flexible design is competitive with Julia.


from keras.models import Sequential
from keras.layers import Dense, Conv2D, MaxPool2D, Flatten

vgg19 = Sequential()

# Block 1 (CIFAR10 images are 32x32 RGB)
vgg19.add(Conv2D(input_shape=(32, 32, 3), filters=64, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=64, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))

# Block 2
vgg19.add(Conv2D(filters=128, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=128, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))

# Block 3
vgg19.add(Conv2D(filters=256, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=256, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=256, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))

# Block 4
vgg19.add(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))

# Block 5
vgg19.add(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(Conv2D(filters=512, kernel_size=(3, 3), padding="same", activation="relu"))
vgg19.add(MaxPool2D(pool_size=(2, 2), strides=(2, 2)))

# Classifier head: 10 outputs, one per CIFAR10 class
vgg19.add(Flatten())
vgg19.add(Dense(units=4096, activation="relu"))
vgg19.add(Dense(units=4096, activation="relu"))
vgg19.add(Dense(units=10, activation="softmax"))

# Code from Rohit Thakur on GitHub

The task here is to stack 22 layers of deep learning machinery. Python handles this well. The syntax is simple and easy to understand. While the repeated .add() calls might be a little ugly, it is obvious what they do. Furthermore, the code makes clear what each layer does (convolves, pools, flattens, and so on).

VGG19 In Julia


using Flux

# Named vgg19 so that it matches the vgg19() call in the training code
vgg19() = Chain(
    Conv((3, 3), 3 => 64, relu, pad=(1, 1), stride=(1, 1)),
    Conv((3, 3), 64 => 64, relu, pad=(1, 1), stride=(1, 1)),
    MaxPool((2,2)),
    Conv((3, 3), 64 => 128, relu, pad=(1, 1), stride=(1, 1)),
    Conv((3, 3), 128 => 128, relu, pad=(1, 1), stride=(1, 1)),
    MaxPool((2,2)),
    Conv((3, 3), 128 => 256, relu, pad=(1, 1), stride=(1, 1)),
    Conv((3, 3), 256 => 256, relu, pad=(1, 1), stride=(1, 1)),
    Conv((3, 3), 256 => 256, relu, pad=(1, 1), stride=(1, 1)),
    MaxPool((2,2)),
    Conv((3, 3), 256 => 512, relu, pad=(1, 1), stride=(1, 1)),
    Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
    Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
    MaxPool((2,2)),
    Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
    Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
    Conv((3, 3), 512 => 512, relu, pad=(1, 1), stride=(1, 1)),
    BatchNorm(512),
    MaxPool((2,2)),
    flatten,
    Dense(512, 4096, relu),
    Dropout(0.5),
    Dense(4096, 4096, relu),
    Dropout(0.5),
    Dense(4096, 10),
    softmax)

# Code from Flux Model Zoo on GitHub



At a glance, Julia looks slightly less cluttered than Python. The import statements are a little cleaner and the code is a little easier to read. Like Python, it is clear what each layer does. The Chain type is a little ambiguous, but it is pretty clear that it concatenates the layers together.
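
One detail that is easy to miss in the Julia listing is that Dense(512, 4096, relu) hard-codes the size of the flattened input. The sketch below, in plain Python, traces the image shape through the convolution and pooling layers to show where 512 comes from. It assumes 3x3 convolutions with "same" padding and stride 1, and 2x2 pooling with stride 2, as in both listings; trace_shapes and flattened_size are hypothetical helper names, not part of Keras or Flux.

```python
def trace_shapes(height, width, channels, layers):
    """Follow a (height, width, channels) shape through conv and pool layers.

    Each layer is ("conv", out_channels) for a 3x3 convolution with "same"
    padding and stride 1, or ("pool",) for a 2x2 max pool with stride 2.
    """
    shapes = [(height, width, channels)]
    for layer in layers:
        if layer[0] == "conv":
            channels = layer[1]  # "same" padding keeps height and width
        elif layer[0] == "pool":
            height, width = height // 2, width // 2  # pooling halves each side
        else:
            raise ValueError(f"unknown layer kind: {layer[0]}")
        shapes.append((height, width, channels))
    return shapes


# Convolutional part of the network: (number of convs, filters) per block.
BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]
LAYERS = []
for n_convs, filters in BLOCKS:
    LAYERS += [("conv", filters)] * n_convs + [("pool",)]


def flattened_size(height, width, channels=3):
    """Number of values handed to the first Dense layer after flattening."""
    h, w, c = trace_shapes(height, width, channels, LAYERS)[-1]
    return h * w * c


print(flattened_size(32, 32))    # CIFAR10-sized input   -> 512
print(flattened_size(224, 224))  # ImageNet-sized input  -> 25088
```

A 32x32 CIFAR10 image shrinks to 1x1x512 after five pooling steps, which matches the 512 inputs of the first Dense layer. A 224x224 image would flatten to 25088 values instead, so the input size and the first Dense layer have to be chosen together.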

Something to notice is that there is no model class. In fact, Julia is not object oriented in the class-based sense, so each layer is a type (a struct) instead of a class. This is worth noting because it emphasizes how lightweight the Julia model is. Each of these layers was defined independently and then chained together without any class structure to control how they interact.
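
The idea of chaining independent layers without a model class can be sketched as plain function composition. The Python below is only an illustration of the concept, not how Flux implements Chain; the chain helper and the toy layers are hypothetical names introduced for this example.

```python
from functools import reduce


def chain(*layers):
    """Return a function that applies each layer in order (hypothetical helper)."""
    def model(x):
        return reduce(lambda value, layer: layer(value), layers, x)
    return model


# Toy "layers": ordinary functions on a list of numbers.
def scale(factor):
    return lambda xs: [factor * x for x in xs]


def relu(xs):
    return [max(0.0, x) for x in xs]


def total(xs):
    return sum(xs)


toy_model = chain(scale(2.0), relu, total)
print(toy_model([1.0, -3.0, 0.5]))  # 2.0 + 0.0 + 1.0 = 3.0
```

Nothing here knows about the other layers. Each one only needs to accept the output of the one before it, which is the property the paragraph above describes.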

However, avoiding a little clutter doesn’t really matter when training giant models. The advantage for Python here is that Python has a huge amount of support for troubleshooting and working through bugs. The documentation is excellent and there are hundreds of VGG19 examples online. Contrast this with Julia where there are five unique VGG19 examples online (maybe).

Data Processing

For data processing we will look at CIFAR10, a dataset of 32x32 color images in 10 classes that is commonly paired with VGG19.


Data Processing In Python


from keras.datasets import cifar10
from keras.utils import to_categorical

(X, Y), (tsX, tsY) = cifar10.load_data()

# Use a one-hot encoding
Y = to_categorical(Y)
tsY = to_categorical(tsY)

# Change datatype to float
X = X.astype('float32')
tsX = tsX.astype('float32')

# Scale X and tsX so each entry is between 0 and 1
X = X / 255.0
tsX = tsX / 255.0

In order to train the model on image data, images must be put into the correct format. It only takes a few lines of code to do this. Images are loaded into variables along with image labels. To make classification easier, the labels are translated into a one-hot encoding. This is relatively straightforward in Python.
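
To make the two preprocessing steps concrete, here is a minimal sketch in plain Python of what one-hot encoding and pixel scaling do to the data. It is an illustration only: one_hot and scale_pixels are hypothetical helpers written for this example, and the real to_categorical works on NumPy arrays rather than lists.

```python
def one_hot(labels, num_classes):
    """Turn integer class labels into one-hot rows."""
    rows = []
    for label in labels:
        if not 0 <= label < num_classes:
            raise ValueError(f"label {label} outside 0..{num_classes - 1}")
        row = [0.0] * num_classes
        row[label] = 1.0  # a single 1 marks the true class
        rows.append(row)
    return rows


def scale_pixels(pixels, max_value=255.0):
    """Map 8-bit pixel intensities into the range [0, 1]."""
    return [p / max_value for p in pixels]


print(one_hot([0, 2, 1], num_classes=3))
# [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
print(scale_pixels([0, 51, 255]))
# [0.0, 0.2, 1.0]
```

With one-hot labels, each target has the same length as the 10-way softmax output of the model, so the loss can compare them entry by entry.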

Data Processing In Julia


using MLDatasets: CIFAR10
using Flux: onehotbatch

# Data comes pre-normalized in Julia
trainX, trainY = CIFAR10.traindata(Float64)
testX, testY = CIFAR10.testdata(Float64)

# One hot encode labels
trainY = onehotbatch(trainY, 0:9)
testY = onehotbatch(testY, 0:9)

Julia requires the same kind of image processing as Python to prepare images for the training process. The code looks extremely similar and does not appear to favor either language.

Training

Next we will look at the model training loop.


Training in Python


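from keras.optimizers import SGD

# Stochastic gradient descent with momentum
optimizer = SGD(learning_rate=0.001, momentum=0.9)
vgg19.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

# fit() returns a History object with per-epoch loss and accuracy
history = vgg19.fit(X, Y, epochs=100, batch_size=64, validation_data=(tsX, tsY), verbose=0)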

Training In Julia


using Flux
using Flux: crossentropy, @epochs
using Flux.Data: DataLoader

model = vgg19()
opt = Momentum(.001, .9)
loss(x, y) = crossentropy(model(x), y)
data = DataLoader(trainX, trainY, batchsize=64)
@epochs 100 Flux.train!(loss, params(model), data, opt)

The code here is about equally verbose, but the differences between the languages show. In Python, model.fit returns a History object whose history attribute is a dictionary of loss and accuracy values for each epoch. It also has keyword arguments that automate parts of the training process for you. Julia is much more bare-bones. The training function requires the user to provide their own loss function, optimizer, and an iterable of data batches, along with the model’s parameters.
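
Both snippets minimize categorical cross-entropy on the softmax output of the model. As a rough illustration of what that loss measures, here is a plain-Python sketch for a single sample. The libraries work on whole batches of arrays and differ in details such as averaging, so treat this as the idea rather than either library’s implementation; the eps guard is an assumption added here to avoid log(0).

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, one_hot_label, eps=1e-12):
    """Categorical cross-entropy for a single sample."""
    return -sum(y * math.log(p + eps) for p, y in zip(probs, one_hot_label))


probs = softmax([2.0, 1.0, 0.1])
loss_if_class_0 = cross_entropy(probs, [1.0, 0.0, 0.0])
loss_if_class_2 = cross_entropy(probs, [0.0, 0.0, 1.0])
print(loss_if_class_0 < loss_if_class_2)  # True: class 0 has the highest score
```

The loss is small when the model puts high probability on the true class and grows as that probability shrinks, which is what the optimizer is pushing against in both languages.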

The Python implementation is much more user friendly. The training process is easy and produces useful output. Julia requires a little more from the user. At the same time, Julia is more abstract, and allows any optimizer and loss function. The user can define a loss function any way that they want without needing to consult a list of built-in loss functions. This kind of abstraction is typical of Julia developers, who work to make code as abstract and generic as possible.
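
The bare-bones style of training, where the caller hands over a loss function, parameters, data and optimizer settings, can be sketched in a few lines of plain Python. This is a toy under stated assumptions: train and numerical_grad are hypothetical names, a finite-difference gradient stands in for the automatic differentiation a real framework would use, and the momentum update is the textbook rule rather than a copy of any library’s code.

```python
def numerical_grad(loss, params, data_point, h=1e-6):
    """Central finite-difference gradient of loss(params, x, y) w.r.t. params."""
    x, y = data_point
    grad = []
    for i in range(len(params)):
        up = params[:i] + [params[i] + h] + params[i + 1:]
        down = params[:i] + [params[i] - h] + params[i + 1:]
        grad.append((loss(up, x, y) - loss(down, x, y)) / (2 * h))
    return grad


def train(loss, params, data, lr=0.001, momentum=0.9, epochs=300):
    """Bare-bones loop: the caller supplies the loss, the parameters and the data."""
    params = list(params)
    velocity = [0.0] * len(params)
    for _ in range(epochs):
        for point in data:
            grad = numerical_grad(loss, params, point)
            # Momentum: keep a running velocity and move the parameters along it.
            velocity = [momentum * v - lr * g for v, g in zip(velocity, grad)]
            params = [p + v for p, v in zip(params, velocity)]
    return params


# A user-defined loss: squared error of the line y = w * x + b.
def squared_error(params, x, y):
    w, b = params
    return (w * x + b - y) ** 2


data = [(x, 3.0 * x + 1.0) for x in (-1.0, 0.0, 1.0, 2.0)]
w, b = train(squared_error, [0.0, 0.0], data)
print(round(w, 3), round(b, 3))  # should be close to 3.0 and 1.0
```

Swapping in a different loss only means passing a different function to train, which is the flexibility the paragraph above attributes to the Julia approach.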

For this reason, Keras is more practical for implementing known techniques and standard model training, while Flux is better suited for developing new techniques.


Speed

Unfortunately, there is no available benchmark comparing Flux and Keras on the internet. There are a few resources that give us an idea and we can use TensorFlow speed as a reference.

One benchmark found that on the GPU and on the CPU, Flux is barely slower than TensorFlow. It’s been shown that Keras is slightly slower than TensorFlow on the GPU as well. Unfortunately this doesn’t give us a clear winner, but it suggests that the speeds of the two packages are similar.

The Flux benchmark above was done before a major rework of Flux’s automatic differentiation package. The new package, Zygote.jl, sped up computations considerably. A more recent benchmark of Flux on the CPU found that the improved Flux is faster than TensorFlow on the CPU. This suggests that Flux could be faster on the GPU as well, but winning on the CPU doesn’t necessarily imply a victory on the GPU. At the same time, this is still good evidence that Flux would beat Keras on the CPU.

Who Wins?

Both languages perform well in every category. Differences between the two are largely matters of taste. However, there are two places where one language has an edge over the other.

Python’s Edge

Python has a huge support community and offers time-tested libraries. It is reliable and standard. Deep learning in Python is much more common. Developers who use Python for deep learning will fit in well in the deep learning community.

Julia’s Edge

Julia is cleaner and more abstract. The deep learning code could definitely be faster, and improvements are in the works. Julia has an edge on potential. Deep learning libraries in Python are much more complete, and so have less room to grow and develop. Julia, with its richer base language, has potential for many new ideas and much faster code in the future. Developers who adopt Julia will be closer to the frontier of programming, but will have to forge their own path.

Winner

Deep learning is difficult and requires a lot of troubleshooting. It can be very difficult to reach state-of-the-art accuracy. For this reason, Python wins this comparison. Julia does not have a strong level of online support for deep learning troubleshooting, which can make writing complicated deep learning scripts very difficult. Julia is excellent for many applications, but for deep learning, I would recommend Python.


