In this tutorial, we will explore how to build and train deep autoencoders using Keras and TensorFlow.


The primary reason I decided to write this tutorial is that most of the tutorials out there, including the official Keras and TensorFlow ones, use the MNIST dataset for training. I have been asked numerous times to show how to train autoencoders on our own images, which may be large in number.

I will try to keep this tutorial brief and will not go into the details of how autoencoders work. A basic knowledge of autoencoders is therefore a prerequisite for understanding the code presented here (and, needless to say, you must know how to program in Python with Keras and TensorFlow).

Autoencoders

Autoencoders are unsupervised neural networks that learn to reconstruct their input. Denoising an image is one of the uses of autoencoders, and it is very useful for OCR. Autoencoders are also used for image compression.

As shown in Figure 1, an autoencoder consists of:


  1. Encoder: The encoder takes an image as input and generates an output with much smaller dimensions than the original image. The output of the encoder is also called the latent representation of the input image.
  2. Decoder: The decoder takes the output of the encoder (the latent representation of the input image) and reconstructs the input image.

Both the encoder and the decoder are convolutional neural networks. The difference is that the encoder's dimensions shrink with each layer, while the decoder's dimensions grow with each layer until the output layer, where they match the dimensions of the original image.
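To make the shrinking and growing dimensions concrete, here is a small sketch in plain Python. It assumes stride-2 convolutions with "same" padding, a 1000 x 500 single-channel input, and filters of (32, 64), which are the values used later in this tutorial. The helper names are made up for illustration.

```python
import math

def conv_same_output(size, stride):
    """Output length along one axis for a convolution with 'same' padding."""
    return math.ceil(size / stride)

def encoder_shapes(height, width, depth, filters=(32, 64), stride=2):
    """Return the (height, width, channels) shape after each encoder layer."""
    shapes = [(height, width, depth)]
    h, w = height, width
    for f in filters:
        h = conv_same_output(h, stride)
        w = conv_same_output(w, stride)
        shapes.append((h, w, f))
    return shapes

shapes = encoder_shapes(1000, 500, 1)
for shape in shapes:
    print(shape)
```

The decoder walks the same list backwards: each stride-2 transposed convolution doubles the height and width until the original image size is reached.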


Training Autoencoders

We will use our own images for training and testing the autoencoders. For the purpose of this tutorial, we will use a dataset that contains scanned images of restaurant receipts. The dataset is freely available under the MIT License.

Although this dataset does not have a large number of images, we will write code that will work for both small and large datasets.


The code below is divided into 4 parts.


  1. Data preparation: Images will be read from a directory and fed as inputs to the encoder block.
  2. Neural network configuration: We will write a function that takes certain parameters and returns the encoder, decoder and autoencoder convolutional neural networks.
  3. Training the neural networks: The code that triggers the training, monitors the progress and saves the trained models.
  4. Prediction: The code block that uses the trained models and predicts the output.

I will use Google Colaboratory to execute the code. You can use your favorite IDE to write and run the code. The code below works on both CPUs and GPUs; I will use a GPU-based machine to speed up the training. Google Colab offers a free GPU-based virtual machine for education and learning.
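Before training, it can help to confirm whether TensorFlow actually sees a GPU. The sketch below is one way to check; the function name available_gpus is made up for illustration, and the guarded import is only there so the snippet also runs on a machine without TensorFlow.

```python
def available_gpus():
    """Return the GPUs TensorFlow can see, or an empty list if there are none."""
    try:
        import tensorflow as tf
    except ImportError:
        # TensorFlow is not installed in this environment.
        return []
    return list(tf.config.list_physical_devices("GPU"))

gpus = available_gpus()
print("GPUs visible to TensorFlow:", len(gpus))
```

In Colab, a count of zero usually means that the runtime type has not been switched to a GPU runtime.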

If you use a Jupyter notebook, the steps below will look very similar.


First we create a notebook project, AE Demo for example.

Before we start the actual code, let’s import all dependencies that we need for our project. Here is a list of imports that we will need.

# Import the necessary packages
import tensorflow as tf
from google.colab.patches import cv2_imshow
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import Conv2DTranspose
from tensorflow.keras.layers import LeakyReLU
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Dense
from tensorflow.keras.layers import Reshape
from tensorflow.keras.layers import Input
from tensorflow.keras.models import Model
from tensorflow.keras import backend as K
from tensorflow.keras.optimizers import Adam
import numpy as np


Listing 1.1: Import the necessary packages.


Data Preparation

Our receipt images are in a directory. We will use the ImageDataGenerator class, provided by the Keras API, to create training and validation iterators as shown in Listing 1.2 below.

# parent directory that holds the image subdirectory
training_img_dir = "inputs"
# target image size, number of channels and batch size
height = 1000
width = 500
channel = 1
batch_size = 8

# rescale pixel values to [0, 1] and hold out 20% of the images for validation
datagen = tf.keras.preprocessing.image.ImageDataGenerator(validation_split=0.2, rescale=1. / 255.)

train_it = datagen.flow_from_directory(
    training_img_dir,
    target_size=(height, width),
    color_mode='grayscale',
    class_mode='input',
    batch_size=batch_size,
    subset='training')  # set as training data

val_it = datagen.flow_from_directory(
    training_img_dir,
    target_size=(height, width),
    color_mode='grayscale',
    class_mode='input',
    batch_size=batch_size,
    subset='validation')  # set as validation data

Listing 1.2: Image input preparation. Load images in batches from a directory.

Important notes about Listing 1.2:


  1. training_img_dir = "inputs" is the parent directory that contains the receipt images. In other words, the receipts are in a subdirectory under the "inputs" directory.
  2. color_mode='grayscale' is important if you want to convert your input images into grayscale.

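Note also that class_mode='input' makes each batch use the images themselves as the targets, which is what an autoencoder needs. All other parameters are self-explanatory.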
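The directory layout is easy to get wrong, so here is a small sketch that builds a throwaway layout and counts the images per subdirectory. The subdirectory name "receipts", the file names, and the count of 50 are made up for illustration. The split sizes are a rough estimate for validation_split=0.2; the exact rounding is up to Keras.

```python
import math
import tempfile
from pathlib import Path

IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg", ".bmp"}

def count_images(parent_dir):
    """Count image files in each immediate subdirectory of parent_dir."""
    counts = {}
    for sub in sorted(Path(parent_dir).iterdir()):
        if sub.is_dir():
            counts[sub.name] = sum(
                1 for f in sub.iterdir() if f.suffix.lower() in IMAGE_SUFFIXES
            )
    return counts

# Build a throwaway layout: <tmp>/inputs/receipts/*.jpg ("receipts" is hypothetical).
with tempfile.TemporaryDirectory() as tmp:
    receipts = Path(tmp) / "inputs" / "receipts"
    receipts.mkdir(parents=True)
    for i in range(50):
        (receipts / f"receipt_{i:03d}.jpg").touch()
    counts = count_images(Path(tmp) / "inputs")

total = sum(counts.values())
n_val = int(total * 0.2)                # rough size of the validation subset
n_train = total - n_val                 # the rest is used for training
train_batches = math.ceil(n_train / 8)  # batches per epoch with batch_size = 8
print(counts, n_train, n_val, train_batches)
```

If count_images() returns an empty dictionary for your own "inputs" directory, the images are probably sitting directly in the parent directory instead of in a subdirectory.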


Configure Autoencoder Neural Networks

As shown in Listing 1.3 below, we have created an AutoencoderBuilder class that provides a function build_ae(). This function takes the following arguments:

  • height of the input images,
  • width of the input images,
  • depth (or the number of channels) of the input images,
  • filters as a tuple, with (32, 64) as the default,
  • latentDim, which represents the dimension of the latent vector.

class AutoencoderBuilder:
    @staticmethod
    def build_ae(height, width, depth, filters=(32, 64), latentDim=16):
        # initialize the input shape (channels-last ordering)
        inputShape = (height, width, depth)
        chanDim = -1

        # define the input to the encoder
        inputs = Input(shape=inputShape)
        x = inputs

        # loop over the filters
        for f in filters:
            # apply a strided CONV => LeakyReLU => BatchNormalization block
            x = Conv2D(f, (3, 3), strides=2, padding="same")(x)
            x = LeakyReLU(alpha=0.2)(x)
            x = BatchNormalization(axis=chanDim)(x)

        # flatten the network and then construct the latent vector
        volumeSize = K.int_shape(x)
        x = Flatten()(x)
        latent = Dense(latentDim)(x)

        # build the encoder model
        encoder = Model(inputs, latent, name="encoder")

        # build the decoder model, which takes the output of the encoder as its input
        latentInputs = Input(shape=(latentDim,))
        x = Dense(np.prod(volumeSize[1:]))(latentInputs)
        x = Reshape((volumeSize[1], volumeSize[2], volumeSize[3]))(x)

        # loop over the filters again, but in reverse order
        for f in filters[::-1]:
            # apply a CONV_TRANSPOSE => LeakyReLU => BatchNormalization block
            x = Conv2DTranspose(f, (3, 3), strides=2, padding="same")(x)
            x = LeakyReLU(alpha=0.2)(x)
            x = BatchNormalization(axis=chanDim)(x)

        # recover the original depth of the image with a single CONV_TRANSPOSE layer
        x = Conv2DTranspose(depth, (3, 3), padding="same")(x)
        outputs = Activation("sigmoid")(x)

        # build the decoder model
        decoder = Model(latentInputs, outputs, name="decoder")

        # finally, the autoencoder is the encoder + decoder
        autoencoder = Model(inputs, decoder(encoder(inputs)), name="autoencoder")

        # return a tuple of the encoder, decoder, and autoencoder models
        return (encoder, decoder, autoencoder)


Listing 1.3: Builder class to create autoencoder networks.
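With 1000 x 500 inputs, most of the weights in the network of Listing 1.3 sit in the two Dense layers around the latent vector. The following back-of-the-envelope sketch counts them in plain Python, assuming the default filters of (32, 64) and latentDim of 16. The helper names are made up for illustration, and BatchNormalization and the decoder's transposed convolutions are left out to keep it short.

```python
def conv_params(kernel, in_ch, out_ch):
    """Weights plus biases of a 2D convolution with a square kernel."""
    return kernel * kernel * in_ch * out_ch + out_ch

def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

height, width, depth = 1000, 500, 1
filters, latent_dim = (32, 64), 16

h, w, ch = height, width, depth
conv_total = 0
for f in filters:
    conv_total += conv_params(3, ch, f)
    # a stride-2 convolution with "same" padding halves each side (rounding up)
    h, w, ch = -(-h // 2), -(-w // 2), f

flat = h * w * ch                               # size of the flattened volume
encoder_dense = dense_params(flat, latent_dim)  # Flatten -> Dense(latentDim)
decoder_dense = dense_params(latent_dim, flat)  # Dense back to the volume size

print("encoder conv parameters:", conv_total)
print("flattened volume size:", flat)
print("encoder Dense parameters:", encoder_dense)
print("decoder Dense parameters:", decoder_dense)
```

If memory becomes a problem, reducing the input size or adding another entry to filters shrinks the flattened volume, and with it both Dense layers.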


Training Autoencoders

Listing 1.4 below starts the autoencoder training and saves the trained model.


import os

# initialize the number of epochs to train for and the output directory
EPOCHS = 300
MODEL_OUT_DIR = "ae_model_dir"

# construct our convolutional autoencoder
print("[INFO] building autoencoder...")
(encoder, decoder, autoencoder) = AutoencoderBuilder().build_ae(height, width, channel)
opt = Adam(learning_rate=1e-3)
autoencoder.compile(loss="mse", optimizer=opt)

# train the convolutional autoencoder; the iterators from Listing 1.2
# already yield batches, so batch_size is not passed to fit()
history = autoencoder.fit(
    train_it,
    validation_data=val_it,
    epochs=EPOCHS)

# save the trained model, creating the output directory if needed
os.makedirs(MODEL_OUT_DIR, exist_ok=True)
autoencoder.save(MODEL_OUT_DIR + "/ae_model.h5")

Listing 1.4: Training autoencoder model.
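The "mse" loss used in Listing 1.4 is the mean squared error between the input pixels and the reconstructed pixels. As a rough illustration, here it is in plain Python on four made-up pixel values in the [0, 1] range:

```python
def mse(original, reconstructed):
    """Mean squared error over two equally sized sequences of pixel values."""
    if len(original) != len(reconstructed):
        raise ValueError("inputs must have the same length")
    squared = [(o - r) ** 2 for o, r in zip(original, reconstructed)]
    return sum(squared) / len(squared)

original = [0.0, 0.5, 1.0, 1.0]       # made-up input pixels
reconstructed = [0.1, 0.5, 0.8, 1.0]  # made-up autoencoder output
loss = mse(original, reconstructed)
print(round(loss, 4))
```

A perfect reconstruction gives a loss of 0, and larger pixel errors are penalized more heavily because the differences are squared.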


Visualizing the Training Metrics

Listing 1.5 shows how to plot the training and validation loss per epoch. Figure 2 shows a sample output of Listing 1.5.

# import matplotlib and render figures inline in the notebook
import matplotlib
import matplotlib.pyplot as plt
%matplotlib inline

# construct a plot that displays the training history
N = np.arange(0, EPOCHS)
plt.style.use("ggplot")
plt.figure()
plt.plot(N, history.history["loss"], label="train_loss")
plt.plot(N, history.history["val_loss"], label="val_loss")
plt.title("Training and Validation Loss")
plt.xlabel("Epoch #")
plt.ylabel("Loss")
plt.legend(loc="lower left")
# plt.savefig(plot)
plt.show()

Listing 1.5: Display a plot of training and validation loss vs. epochs.


Figure 2: Plot of loss vs. epoch
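Besides reading the plot, you can pick out the epoch with the lowest validation loss directly from the history. The sketch below uses a made-up list in place of history.history["val_loss"], and the helper name best_epoch is hypothetical.

```python
def best_epoch(val_losses):
    """Return (epoch_number, loss) for the lowest validation loss, 1-indexed."""
    idx = min(range(len(val_losses)), key=val_losses.__getitem__)
    return idx + 1, val_losses[idx]

# Made-up values standing in for history.history["val_loss"].
val_loss = [0.091, 0.054, 0.037, 0.033, 0.035, 0.041]
epoch, loss = best_epoch(val_loss)
print(f"lowest validation loss {loss} at epoch {epoch}")
```

A validation loss that starts rising again while the training loss keeps falling is a common sign of overfitting, and a hint that fewer epochs may be enough.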


Make Predictions

Now that we have a trained autoencoder model, we will use it to make predictions. Listing 1.6 shows how to load the model from the directory where it was saved. We call the predict() function and pass it the validation image iterator that we created earlier. Ideally, we should use a separate set of images for prediction and testing.

Here is the code to do the prediction and display.


from google.colab.patches import cv2_imshow

# use the convolutional autoencoder to make predictions on the
# validation images, then display the predicted images
print("[INFO] making predictions...")
autoencoder_model = tf.keras.models.load_model(MODEL_OUT_DIR + "/ae_model.h5")
decoded = autoencoder_model.predict(val_it)
examples = 10

# loop over a few samples to display the predicted images
for i in range(0, min(examples, len(decoded))):
    # scale the [0, 1] output back to 8-bit pixel values
    predicted = (decoded[i] * 255).astype("uint8")
    cv2_imshow(predicted)

Listing 1.6: Code to predict and display the images
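The conversion (decoded[i] * 255).astype("uint8") in Listing 1.6 maps the sigmoid output from the [0, 1] range back to 8-bit pixel values, truncating the fractional part rather than rounding. Here is a plain-Python sketch of the same arithmetic on a few made-up values. The helper name to_uint8 is hypothetical, and the clamping is a defensive extra that the listing does not need, since a sigmoid never leaves [0, 1].

```python
def to_uint8(pixels):
    """Scale floats in [0, 1] to integers in [0, 255], truncating the fraction."""
    clamped = (min(max(p, 0.0), 1.0) for p in pixels)
    return [int(p * 255) for p in clamped]

row = to_uint8([0.0, 0.25, 0.5, 0.999, 1.0])
print(row)
```

Because of the truncation, only an output of exactly 1.0 becomes pure white (255); anything slightly below it lands on 254.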


In the above code listing, I have used cv2_imshow, a helper that is specific to Google Colab. If you are using Jupyter or any other IDE, you may simply have to import the cv2 package and display the image with the cv2.imshow() function.

Conclusion

In this tutorial, we built autoencoder models using our own images. We also explored how to save the model. We loaded the saved model and made the predictions. We finally displayed the predicted images.
