[He Zhiyuan, "21 Projects for Mastering Deep Learning"] Chapter 3, Section 3.2 Data Preparation: Converting Image Data to TFRecord Format

Reposted from: https://blog.csdn.net/c20081052/article/details/81325394

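Before training your own model, you need to prepare a dataset. TFRecord is a widely used data format in TensorFlow, so we need to build TFRecord data from the image samples we already have. Readers can follow the file layout below and run the two .py files shown in this post to create their own TFRecord files.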

The data layout provided by the book's author, He Zhiyuan, is as follows:

data_prepare/
    pic/
        train/
            wood/
            water/
            rock/
            wetland/
            glacier/
            urban/
        validation/
            wood/
            water/
            rock/
            wetland/
            glacier/
            urban/
    src/
        tfrecord.py
    data_convert.py

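The data_prepare folder contains a pic folder, which in turn contains a train folder and a validation folder. The train folder contains six folders (wood, water, rock, wetland, glacier, and urban), each holding 800 images of its class, roughly 256x256 in size.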

Likewise, validation contains the same six folders, each holding 200 images.
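Before converting, it can help to confirm that the folders really look like this. The following is a minimal sketch and not part of the author's scripts: `count_images` is a hypothetical helper that counts the files in each class folder of one split, so the 800 and 200 figures above can be checked quickly. It assumes the script is run from data_prepare/.

```python
import os


def count_images(split_dir):
    """Return {class_name: number_of_files} for one split folder such as pic/train."""
    counts = {}
    for entry in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, entry)
        if os.path.isdir(class_dir):
            # Count regular files only; nested folders are ignored.
            counts[entry] = sum(
                1 for name in os.listdir(class_dir)
                if os.path.isfile(os.path.join(class_dir, name))
            )
    return counts


if __name__ == '__main__':
    for split in ('train', 'validation'):
        split_dir = os.path.join('pic', split)
        if os.path.isdir(split_dir):
            print(split, count_images(split_dir))
```

A class folder with an unexpected count is worth inspecting before running the conversion.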

Run data_convert.py from the data_prepare/ directory with the following command:

python data_convert.py -t pic/ \
--train-shards 2 \
--validation-shards 2 \
--num-threads 2 \
--dataset-name satellite

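The arguments are explained below:

-t pic/ means the images to be converted are stored in the pic folder.

--train-shards 2 splits the TFRecord output for the training images into 2 files (sharding is for storage convenience; look up what shard count is reasonable for your data. The default is 2).

--validation-shards 2 splits the TFRecord output for the validation images into 2 files (default 2).

--num-threads 2 sets the number of threads (default 2). Note that the thread count must evenly divide both train-shards and validation-shards, so that every thread handles the same number of shards.

--dataset-name satellite sets the dataset name, default satellite. Change it to suit your own dataset. The author uses satellite aerial images, so the dataset is named "satellite" here, and the generated file names start with satellite_train and satellite_validation.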
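To make the divisibility rule and the output names concrete, here is a minimal sketch. `shard_plan` is a hypothetical helper introduced for illustration; the file-name pattern is copied from tfrecord.py shown later in this post, and the assignment of shards to threads follows the same arithmetic as that script.

```python
def shard_plan(dataset_name, split, num_shards, num_threads):
    """Return, per thread, the list of shard file names that thread would write."""
    if num_shards % num_threads != 0:
        raise ValueError('num_threads must evenly divide the number of shards')
    shards_per_thread = num_shards // num_threads
    plan = []
    for thread_index in range(num_threads):
        names = []
        for s in range(shards_per_thread):
            shard = thread_index * shards_per_thread + s
            # Same naming pattern as tfrecord.py.
            names.append('%s_%s_%.5d-of-%.5d.tfrecord'
                         % (dataset_name, split, shard, num_shards))
        plan.append(names)
    return plan


if __name__ == '__main__':
    for split in ('train', 'validation'):
        for thread_index, names in enumerate(shard_plan('satellite', split, 2, 2)):
            print(split, 'thread', thread_index, names)
```

With 2 shards and 2 threads, each thread writes exactly one file per split; 3 shards with 2 threads would be rejected.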

The code of data_convert.py is as follows:

# coding:utf-8
from __future__ import absolute_import
import argparse
import os
import logging
from src.tfrecord import main


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('-t', '--tensorflow-data-dir', default='pic/')
    parser.add_argument('--train-shards', default=2, type=int)
    parser.add_argument('--validation-shards', default=2, type=int)
    parser.add_argument('--num-threads', default=2, type=int)
    parser.add_argument('--dataset-name', default='satellite', type=str)
    return parser.parse_args()


if __name__ == '__main__':
    logging.basicConfig(level=logging.INFO)
    args = parse_args()
    args.tensorflow_dir = args.tensorflow_data_dir
    args.train_directory = os.path.join(args.tensorflow_dir, 'train')
    args.validation_directory = os.path.join(args.tensorflow_dir, 'validation')
    args.output_directory = args.tensorflow_dir
    args.labels_file = os.path.join(args.tensorflow_dir, 'label.txt')
    if os.path.exists(args.labels_file) is False:
        logging.warning('Can\'t find label.txt. Now create it.')
        all_entries = os.listdir(args.train_directory)
        dirnames = []
        for entry in all_entries:
            if os.path.isdir(os.path.join(args.train_directory, entry)):
                dirnames.append(entry)
        with open(args.labels_file, 'w') as f:
            for dirname in dirnames:
                f.write(dirname + '\n')
    main(args)
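One detail that is easy to miss: argparse turns the hyphens in long option names into underscores, which is why the script reads `args.tensorflow_data_dir` and why tfrecord.py can read `command_args.train_shards`. A minimal sketch with a subset of the options (`build_parser` is a name introduced here for illustration; the option names and defaults are the ones from data_convert.py):

```python
import argparse


def build_parser():
    """Build a parser with a subset of the options used by data_convert.py."""
    parser = argparse.ArgumentParser()
    parser.add_argument('-t', '--tensorflow-data-dir', default='pic/')
    parser.add_argument('--train-shards', default=2, type=int)
    parser.add_argument('--num-threads', default=2, type=int)
    return parser


if __name__ == '__main__':
    # Parse an explicit list instead of sys.argv so the sketch runs anywhere.
    args = build_parser().parse_args(['-t', 'my_pics/', '--train-shards', '4'])
    print(args.tensorflow_data_dir, args.train_shards, args.num_threads)
```

Options that are not passed keep their defaults, so `args.num_threads` stays at 2 in this run.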

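Readers can store their data following the author's directory layout and then change the dataset name to match their own data. The .py file above calls tfrecord.py in the src folder (its source is shown below):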

# coding:utf-8
# Copyright 2016 Google Inc. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
"""Converts image data to TFRecords file format with Example protos.

The image data set is expected to reside in JPEG files located in the
following directory structure.

  data_dir/label_0/image0.jpeg
  data_dir/label_0/image1.jpg
  ...
  data_dir/label_1/weird-image.jpeg
  data_dir/label_1/my-image.jpeg
  ...

where the sub-directory is the unique label associated with these images.

This TensorFlow script converts the training and evaluation data into
a sharded data set consisting of TFRecord files

  train_directory/train-00000-of-01024
  train_directory/train-00001-of-01024
  ...
  train_directory/train-00127-of-01024

and

  validation_directory/validation-00000-of-00128
  validation_directory/validation-00001-of-00128
  ...
  validation_directory/validation-00127-of-00128

where we have selected 1024 and 128 shards for each data set. Each record
within the TFRecord file is a serialized Example proto. The Example proto
contains the following fields:

  image/encoded: string containing JPEG encoded image in RGB colorspace
  image/height: integer, image height in pixels
  image/width: integer, image width in pixels
  image/colorspace: string, specifying the colorspace, always 'RGB'
  image/channels: integer, specifying the number of channels, always 3
  image/format: string, specifying the format, always 'JPEG'

  image/filename: string containing the basename of the image file
    e.g. 'n01440764_10026.JPEG' or 'ILSVRC2012_val_00000293.JPEG'
  image/class/label: integer specifying the index in a classification layer.
    start from "class_label_base"
  image/class/text: string specifying the human-readable version of the label
    e.g. 'dog'

If your data set involves bounding boxes, please look at build_imagenet_data.py.
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

from datetime import datetime
import os
import random
import sys
import threading

import numpy as np
import tensorflow as tf
import logging


def _int64_feature(value):
    """Wrapper for inserting int64 features into Example proto."""
    if not isinstance(value, list):
        value = [value]
    return tf.train.Feature(int64_list=tf.train.Int64List(value=value))


def _bytes_feature(value):
    """Wrapper for inserting bytes features into Example proto."""
    value = tf.compat.as_bytes(value)
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))


def _convert_to_example(filename, image_buffer, label, text, height, width):
    """Build an Example proto for an example.

    Args:
      filename: string, path to an image file, e.g., '/path/to/example.JPG'
      image_buffer: string, JPEG encoding of RGB image
      label: integer, identifier for the ground truth for the network
      text: string, unique human-readable, e.g. 'dog'
      height: integer, image height in pixels
      width: integer, image width in pixels
    Returns:
      Example proto
    """
    colorspace = 'RGB'
    channels = 3
    image_format = 'JPEG'

    example = tf.train.Example(features=tf.train.Features(feature={
        'image/height': _int64_feature(height),
        'image/width': _int64_feature(width),
        'image/colorspace': _bytes_feature(colorspace),
        'image/channels': _int64_feature(channels),
        'image/class/label': _int64_feature(label),
        'image/class/text': _bytes_feature(text),
        'image/format': _bytes_feature(image_format),
        'image/filename': _bytes_feature(os.path.basename(filename)),
        'image/encoded': _bytes_feature(image_buffer)}))
    return example


class ImageCoder(object):
    """Helper class that provides TensorFlow image coding utilities."""

    def __init__(self):
        # Create a single Session to run all image coding calls.
        self._sess = tf.Session()

        # Initializes function that converts PNG to JPEG data.
        self._png_data = tf.placeholder(dtype=tf.string)
        image = tf.image.decode_png(self._png_data, channels=3)
        self._png_to_jpeg = tf.image.encode_jpeg(image, format='rgb', quality=100)

        # Initializes function that decodes RGB JPEG data.
        self._decode_jpeg_data = tf.placeholder(dtype=tf.string)
        self._decode_jpeg = tf.image.decode_jpeg(self._decode_jpeg_data, channels=3)

    def png_to_jpeg(self, image_data):
        return self._sess.run(self._png_to_jpeg,
                              feed_dict={self._png_data: image_data})

    def decode_jpeg(self, image_data):
        image = self._sess.run(self._decode_jpeg,
                               feed_dict={self._decode_jpeg_data: image_data})
        assert len(image.shape) == 3
        assert image.shape[2] == 3
        return image


def _is_png(filename):
    """Determine if a file contains a PNG format image.

    Args:
      filename: string, path of the image file.
    Returns:
      boolean indicating if the image is a PNG.
    """
    return '.png' in filename


def _process_image(filename, coder):
    """Process a single image file.

    Args:
      filename: string, path to an image file e.g., '/path/to/example.JPG'.
      coder: instance of ImageCoder to provide TensorFlow image coding utils.
    Returns:
      image_buffer: string, JPEG encoding of RGB image.
      height: integer, image height in pixels.
      width: integer, image width in pixels.
    """
    # Read the image file.
    with open(filename, 'rb') as f:  # changed from 'r' to 'rb'
        image_data = f.read()

    # Convert any PNG to JPEG's for consistency.
    if _is_png(filename):
        logging.info('Converting PNG to JPEG for %s' % filename)
        image_data = coder.png_to_jpeg(image_data)

    # Decode the RGB JPEG.
    image = coder.decode_jpeg(image_data)

    # Check that image converted to RGB
    assert len(image.shape) == 3
    height = image.shape[0]
    width = image.shape[1]
    assert image.shape[2] == 3

    return image_data, height, width


def _process_image_files_batch(coder, thread_index, ranges, name, filenames,
                               texts, labels, num_shards, command_args):
    """Processes and saves list of images as TFRecord in 1 thread.

    Args:
      coder: instance of ImageCoder to provide TensorFlow image coding utils.
      thread_index: integer, unique batch to run index is within [0, len(ranges)).
      ranges: list of pairs of integers specifying ranges of each batches to
        analyze in parallel.
      name: string, unique identifier specifying the data set
      filenames: list of strings; each string is a path to an image file
      texts: list of strings; each string is human readable, e.g. 'dog'
      labels: list of integer; each integer identifies the ground truth
      num_shards: integer number of shards for this data set.
    """
    # Each thread produces N shards where N = int(num_shards / num_threads).
    # For instance, if num_shards = 128, and the num_threads = 2, then the first
    # thread would produce shards [0, 64).
    num_threads = len(ranges)
    assert not num_shards % num_threads
    num_shards_per_batch = int(num_shards / num_threads)

    shard_ranges = np.linspace(ranges[thread_index][0],
                               ranges[thread_index][1],
                               num_shards_per_batch + 1).astype(int)
    num_files_in_thread = ranges[thread_index][1] - ranges[thread_index][0]

    counter = 0
    for s in range(num_shards_per_batch):  # xrange exists only in Python 2.x, so use range instead (by csq)
        # Generate a sharded version of the file name, e.g. 'train-00002-of-00010'
        shard = thread_index * num_shards_per_batch + s
        output_filename = '%s_%s_%.5d-of-%.5d.tfrecord' % (
            command_args.dataset_name, name, shard, num_shards)
        output_file = os.path.join(command_args.output_directory, output_filename)
        writer = tf.python_io.TFRecordWriter(output_file)

        shard_counter = 0
        files_in_shard = np.arange(shard_ranges[s], shard_ranges[s + 1], dtype=int)
        for i in files_in_shard:
            filename = filenames[i]
            label = labels[i]
            text = texts[i]

            image_buffer, height, width = _process_image(filename, coder)

            example = _convert_to_example(filename, image_buffer, label,
                                          text, height, width)
            writer.write(example.SerializeToString())
            shard_counter += 1
            counter += 1

            if not counter % 1000:
                logging.info('%s [thread %d]: Processed %d of %d images in thread batch.' %
                             (datetime.now(), thread_index, counter, num_files_in_thread))
                sys.stdout.flush()

        writer.close()
        logging.info('%s [thread %d]: Wrote %d images to %s' %
                     (datetime.now(), thread_index, shard_counter, output_file))
        sys.stdout.flush()
        shard_counter = 0
    logging.info('%s [thread %d]: Wrote %d images to %d shards.' %
                 (datetime.now(), thread_index, counter, num_files_in_thread))
    sys.stdout.flush()


def _process_image_files(name, filenames, texts, labels, num_shards, command_args):
    """Process and save list of images as TFRecord of Example protos.

    Args:
      name: string, unique identifier specifying the data set
      filenames: list of strings; each string is a path to an image file
      texts: list of strings; each string is human readable, e.g. 'dog'
      labels: list of integer; each integer identifies the ground truth
      num_shards: integer number of shards for this data set.
    """
    assert len(filenames) == len(texts)
    assert len(filenames) == len(labels)

    # Break all images into batches with a [ranges[i][0], ranges[i][1]].
    # The np.int alias was removed in NumPy 1.24; the builtin int behaves the same here.
    spacing = np.linspace(0, len(filenames), command_args.num_threads + 1).astype(int)
    ranges = []
    for i in range(len(spacing) - 1):  # xrange exists only in Python 2.x, so use range instead (by csq)
        ranges.append([spacing[i], spacing[i + 1]])

    # Launch a thread for each batch.
    logging.info('Launching %d threads for spacings: %s' % (command_args.num_threads, ranges))
    sys.stdout.flush()

    # Create a mechanism for monitoring when all threads are finished.
    coord = tf.train.Coordinator()

    # Create a generic TensorFlow-based utility for converting all image codings.
    coder = ImageCoder()

    threads = []
    for thread_index in range(len(ranges)):  # xrange exists only in Python 2.x, so use range instead (by csq)
        args = (coder, thread_index, ranges, name, filenames,
                texts, labels, num_shards, command_args)
        t = threading.Thread(target=_process_image_files_batch, args=args)
        t.start()
        threads.append(t)

    # Wait for all the threads to terminate.
    coord.join(threads)
    logging.info('%s: Finished writing all %d images in data set.' %
                 (datetime.now(), len(filenames)))
    sys.stdout.flush()


def _find_image_files(data_dir, labels_file, command_args):
    """Build a list of all images files and labels in the data set.

    Args:
      data_dir: string, path to the root directory of images.

        Assumes that the image data set resides in JPEG files located in
        the following directory structure.

          data_dir/dog/another-image.JPEG
          data_dir/dog/my-image.jpg

        where 'dog' is the label associated with these images.

      labels_file: string, path to the labels file.

        The list of valid labels are held in this file. Assumes that the file
        contains entries as such:

          dog
          cat
          flower

        where each line corresponds to a label. We map each label contained in
        the file to an integer starting with the integer 0 corresponding to the
        label contained in the first line.

    Returns:
      filenames: list of strings; each string is a path to an image file.
      texts: list of strings; each string is the class, e.g. 'dog'
      labels: list of integer; each integer identifies the ground truth.
    """
    logging.info('Determining list of input files and labels from %s.' % data_dir)
    unique_labels = [l.strip() for l in
                     tf.gfile.FastGFile(labels_file, 'r').readlines()]

    labels = []
    filenames = []
    texts = []

    # Leave label index 0 empty as a background class.
    """Very important: here we make the labels start from 0 to match the definition."""
    label_index = command_args.class_label_base

    # Construct the list of JPEG files and labels.
    for text in unique_labels:
        jpeg_file_path = '%s/%s/*' % (data_dir, text)
        matching_files = tf.gfile.Glob(jpeg_file_path)

        labels.extend([label_index] * len(matching_files))
        texts.extend([text] * len(matching_files))
        filenames.extend(matching_files)

        if not label_index % 100:
            logging.info('Finished finding files in %d of %d classes.' % (
                label_index, len(labels)))
        label_index += 1

    # Shuffle the ordering of all image files in order to guarantee
    # random ordering of the images with respect to label in the
    # saved TFRecord files. Make the randomization repeatable.
    shuffled_index = list(range(len(filenames)))  # list() added (by ciky)
    random.seed(12345)
    random.shuffle(shuffled_index)

    filenames = [filenames[i] for i in shuffled_index]
    texts = [texts[i] for i in shuffled_index]
    labels = [labels[i] for i in shuffled_index]

    logging.info('Found %d JPEG files across %d labels inside %s.' %
                 (len(filenames), len(unique_labels), data_dir))
    # print(labels)
    return filenames, texts, labels


def _process_dataset(name, directory, num_shards, labels_file, command_args):
    """Process a complete data set and save it as a TFRecord.

    Args:
      name: string, unique identifier specifying the data set.
      directory: string, root path to the data set.
      num_shards: integer number of shards for this data set.
      labels_file: string, path to the labels file.
    """
    filenames, texts, labels = _find_image_files(directory, labels_file, command_args)
    _process_image_files(name, filenames, texts, labels, num_shards, command_args)


def check_and_set_default_args(command_args):
    if not(hasattr(command_args, 'train_shards')) or command_args.train_shards is None:
        command_args.train_shards = 5
    if not(hasattr(command_args, 'validation_shards')) or command_args.validation_shards is None:
        command_args.validation_shards = 5
    if not(hasattr(command_args, 'num_threads')) or command_args.num_threads is None:
        command_args.num_threads = 5
    if not(hasattr(command_args, 'class_label_base')) or command_args.class_label_base is None:
        command_args.class_label_base = 0
    if not(hasattr(command_args, 'dataset_name')) or command_args.dataset_name is None:
        command_args.dataset_name = ''
    assert not command_args.train_shards % command_args.num_threads, (
        'Please make the command_args.num_threads commensurate with command_args.train_shards')
    assert not command_args.validation_shards % command_args.num_threads, (
        'Please make the command_args.num_threads commensurate with '
        'command_args.validation_shards')
    assert command_args.train_directory is not None
    assert command_args.validation_directory is not None
    assert command_args.labels_file is not None
    assert command_args.output_directory is not None


def main(command_args):
    """
    command_args must have the following attributes:
    command_args.train_directory: folder containing the training set. Each
        subfolder name is a label name, and the images are inside it.
    command_args.validation_directory: folder containing the validation set.
        Each subfolder name is a label name, and the images are inside it.
    command_args.labels_file: a file in which each line is one label name.
    command_args.output_directory: a folder where the output is written.
    command_args.train_shards: number of shards for the training set.
    command_args.validation_shards: number of shards for the validation set.
    command_args.num_threads: number of threads. Must be a divisor of the two
        shard counts above.
    command_args.class_label_base: very important! The number at which class
        labels start in the actual TFRecord. Defaults to 0 (models/slim also
        starts from 0).
    command_args.dataset_name: string, prefix of the output file names.

    Images must not be corrupted; otherwise the thread will exit early.
    """
    check_and_set_default_args(command_args)
    logging.info('Saving results to %s' % command_args.output_directory)

    # Run it!
    _process_dataset('validation', command_args.validation_directory,
                     command_args.validation_shards, command_args.labels_file, command_args)
    _process_dataset('train', command_args.train_directory,
                     command_args.train_shards, command_args.labels_file, command_args)
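The docstring of main warns that a corrupted image makes a worker thread exit early, and `_is_png` decides the format from the file name alone. A cheap pre-check is to look at the first bytes of each file. The sketch below is an illustration only and not part of the author's code: `sniff_image_type` and `find_suspect_files` are hypothetical helpers, and they only catch files that are empty or not JPEG/PNG at all, not images that are truncated further in.

```python
JPEG_MAGIC = b'\xff\xd8'            # every JPEG file starts with these two bytes
PNG_MAGIC = b'\x89PNG\r\n\x1a\n'    # the 8-byte PNG signature


def sniff_image_type(path):
    """Return 'jpeg', 'png', or None based on the first bytes of the file."""
    with open(path, 'rb') as f:
        header = f.read(8)
    if header.startswith(JPEG_MAGIC):
        return 'jpeg'
    if header.startswith(PNG_MAGIC):
        return 'png'
    return None


def find_suspect_files(paths):
    """Return the paths whose leading bytes are neither JPEG nor PNG."""
    return [p for p in paths if sniff_image_type(p) is None]
```

Running `find_suspect_files` over the image paths before the conversion gives a list of files to inspect or remove by hand.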

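This source differs from the version the author provides. I am using Python 3 (the author was presumably using Python 2), so running the original code unchanged raises some errors.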

Running the following directly:

python data_convert.py -t pic/ \
--train-shards 2 \
--validation-shards 2 \
--num-threads 2 \
--dataset-name satellite

may produce errors like these:

\data_prepare\src\tfrecord.py", line 341, in _find_image_files
random.shuffle(shuffled_index)
File "F:\Python36\lib\random.py", line 275, in shuffle
x[i], x[j] = x[j], x[i]
TypeError: 'range' object does not support item assignment

UnicodeDecodeError: 'gbk' codec can't decode byte 0xff in position 0: illegal multibyte sequence

The fix is to make the following changes (the tfrecord.py code given above already includes them):

# Fix 1
def _bytes_feature(value):
    """Wrapper for inserting bytes features into Example proto."""
    value = tf.compat.as_bytes(value)  # add this line (it is missing from the author's code)
    return tf.train.Feature(bytes_list=tf.train.BytesList(value=[value]))
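Why this line is needed: in Python 3, values such as 'RGB', the label text, and the file name are str, while a BytesList holds bytes. The image buffer read in binary mode is already bytes and must pass through untouched. The sketch below is a stand-in written for illustration, based on my understanding of what tf.compat.as_bytes does; it is not TensorFlow's implementation, and the local function name `as_bytes` is only borrowed for readability.

```python
def as_bytes(value, encoding='utf-8'):
    """Stand-in for illustration: return bytes for either str or bytes input."""
    if isinstance(value, bytes):
        return value                   # e.g. the raw JPEG buffer, returned unchanged
    if isinstance(value, str):
        return value.encode(encoding)  # e.g. 'RGB', 'wood', 'image0.jpg'
    raise TypeError('Expected bytes or str, got %r' % (value,))


if __name__ == '__main__':
    print(as_bytes('RGB'))
    print(as_bytes(b'\xff\xd8'))
```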

# Fix 2
def _process_image(filename, coder):
    with open(filename, 'rb') as f:  # add the b here (the author's source uses 'r')
        image_data = f.read()
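This is the cause of the UnicodeDecodeError shown earlier: in Python 3, mode 'r' decodes the file as text using the platform's default encoding (GBK on a Chinese-locale Windows), and a JPEG file begins with the byte 0xFF, which is not valid GBK. The sketch below reproduces the effect on any platform by passing the encoding explicitly; `read_text_mode` and `read_binary_mode` are hypothetical helpers, and the four-byte file is a fake stand-in for a real JPEG.

```python
import os
import tempfile


def read_text_mode(path, encoding='gbk'):
    """Read the file as text, as open(path, 'r') would under a GBK default encoding."""
    with open(path, 'r', encoding=encoding) as f:
        return f.read()


def read_binary_mode(path):
    """Read the raw bytes, which is what the JPEG decoder needs."""
    with open(path, 'rb') as f:
        return f.read()


if __name__ == '__main__':
    # A fake file that starts with the JPEG signature bytes 0xFF 0xD8.
    fd, path = tempfile.mkstemp(suffix='.jpg')
    with os.fdopen(fd, 'wb') as f:
        f.write(b'\xff\xd8\xff\xe0')
    try:
        read_text_mode(path)
    except UnicodeDecodeError as err:
        print('text mode failed:', err)
    print('binary mode read', len(read_binary_mode(path)), 'bytes')
    os.remove(path)
```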

# Fix 3
Replace every xrange with range.

# Fix 4
# In _find_image_files:
shuffled_index = list(range(len(filenames)))  # list() added here (in Python 3, range returns a range object rather than a list)
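The TypeError in the traceback comes from random.shuffle trying to swap items in place, which a Python 3 range object does not allow. The sketch below shows the failure and the shuffle-by-index idea used in `_find_image_files`. `shuffle_together` is a hypothetical helper; unlike the script, which calls random.seed(12345) on the global generator, it uses a private random.Random instance, so the resulting order is repeatable but not necessarily the same as the script's.

```python
import random


def shuffle_together(filenames, texts, labels, seed=12345):
    """Shuffle three parallel lists with one shared, repeatable permutation."""
    index = list(range(len(filenames)))  # list() is required in Python 3
    random.Random(seed).shuffle(index)
    return ([filenames[i] for i in index],
            [texts[i] for i in index],
            [labels[i] for i in index])


if __name__ == '__main__':
    try:
        random.shuffle(range(5))  # a range object cannot be modified in place
    except TypeError as err:
        print('shuffling a range fails:', err)
    print(shuffle_together(['a.jpg', 'b.jpg', 'c.jpg'],
                           ['wood', 'water', 'rock'],
                           [5, 3, 1]))
```

Because all three lists are reordered by the same index list, each file name stays paired with its own text and label.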

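# Fix 5
Your project path should preferably contain no Chinese characters. Chinese paths cause all sorts of problems, and even pinyin is better than Chinese.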

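At this point, running the command generates five files under data_prepare/pic/ (shown in a screenshot in the original post): label.txt and four .tfrecord shards, two for training and two for validation.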

The contents of label.txt are

glacier
rock
urban
water
wetland
wood
that is, the names of the six class labels.
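The order of these lines matters, because tfrecord.py assigns integer labels in the order the names appear in label.txt, starting from class_label_base (0 by default). A minimal sketch of that mapping follows. `build_label_map` is a hypothetical helper; skipping blank lines is an assumption of this sketch, not something tfrecord.py does.

```python
def build_label_map(label_lines, class_label_base=0):
    """Map each label name to an integer, in file order, starting at class_label_base."""
    names = [line.strip() for line in label_lines if line.strip()]
    return {name: class_label_base + i for i, name in enumerate(names)}


if __name__ == '__main__':
    lines = ['glacier\n', 'rock\n', 'urban\n', 'water\n', 'wetland\n', 'wood\n']
    for name, index in build_label_map(lines).items():
        print(index, name)
```

Note that data_convert.py writes the names in whatever order os.listdir returns them, so it is worth keeping the generated label.txt next to the .tfrecord files and reusing it rather than regenerating it.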

The .tfrecord files are binary files that store the image data together with the labels.
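As a quick sanity check that does not need TensorFlow, the records in a shard can be counted by walking the file's framing. The sketch below assumes the on-disk layout described in TensorFlow's TFRecord documentation: for each record, an 8-byte little-endian length, a 4-byte CRC of the length, the payload, and a 4-byte CRC of the payload. `count_tfrecord_records` is a hypothetical helper, it does not verify the CRCs, and it assumes an uncompressed file.

```python
import struct


def count_tfrecord_records(path):
    """Count records by walking the length-prefixed framing; CRCs are not verified."""
    count = 0
    with open(path, 'rb') as f:
        while True:
            header = f.read(8)
            if not header:
                break  # clean end of file
            if len(header) < 8:
                raise ValueError('truncated record header')
            (length,) = struct.unpack('<Q', header)
            f.seek(4, 1)            # skip the CRC of the length field
            payload = f.read(length)
            footer = f.read(4)      # CRC of the payload, read but not checked
            if len(payload) < length or len(footer) < 4:
                raise ValueError('truncated record')
            count += 1
    return count
```

With the image counts described earlier (6 classes of 800 training images, split into 2 shards), each training shard would be expected to hold 2400 records if every image was written.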

(For how to use TFRecord files, see: https://blog.csdn.net/c20081052/article/details/81315774)

References:

https://blog.csdn.net/u010412719/article/details/47088095

https://blog.csdn.net/shijing_0214/article/details/51971734

https://blog.csdn.net/dillon2015/article/details/52987792

https://github.com/hzy46/Deep-Learning-21-Examples/issues/28
