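How to generate raw image data (.png or .jpg) for common image datasets

Introduction

Work in computer vision constantly calls for image datasets. A dataset as famous as ImageNet, however, is far more than our hard disks of a hundred-odd gigabytes can hold. This post presents some "Hello world"-level ways of generating image datasets, together with a roundup of related image data resources.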

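This post covers:

How to generate MNIST, CIFAR-10, CIFAR-100 and similar datasets in .png or .jpg format;
How to write a script that generates the image data and sorts it into classes automatically from a label file;
How to generate these datasets with the DIGITS tool;
How to inspect the structure of .h5 data files;
Download addresses for the datasets.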

Generating image data

MNIST, CIFAR-10, CIFAR-100 and similar datasets are all described on their official sites. The official addresses are:

MNIST: http://yann.lecun.com/exdb/mnist/
CIFAR series: https://www.cs.toronto.edu/~kriz/cifar.html

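As the official pages show, the datasets are mostly distributed in binary form, plus Python and MATLAB versions. Sometimes what we need is the raw image files, and then we have to generate them ourselves, either with code or with the help of other tools.

There are many code-based approaches online, but their quality varies. Most require you to decode the data yourself according to the format described on the official site. I will not cover them in detail here, since plenty of write-ups exist. Instead, this post introduces a few simpler and quicker ways of getting the raw images.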
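For orientation only, decoding an official file by hand can look roughly like the sketch below. It assumes the IDX3 image layout described on the MNIST page (four big-endian 32-bit header fields: magic number 2051, image count, rows, columns, followed by one unsigned byte per pixel). The function names parse_idx_images, encode_gray_png and idx_to_pngs are made up for this illustration, and the PNG writer is a deliberately minimal 8-bit grayscale encoder built on the standard library rather than an imaging package.

```python
import gzip
import os
import struct
import zlib


def parse_idx_images(data):
    """Parse IDX3 image bytes; return (rows, cols, list of per-image pixel bytes)."""
    magic, count, rows, cols = struct.unpack(">IIII", data[:16])
    if magic != 2051:
        raise ValueError("not an IDX3 image file")
    size = rows * cols
    images = [data[16 + i * size:16 + (i + 1) * size] for i in range(count)]
    return rows, cols, images


def png_chunk(tag, payload):
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(tag + payload) & 0xFFFFFFFF
    return struct.pack(">I", len(payload)) + tag + payload + struct.pack(">I", crc)


def encode_gray_png(pixels, width, height):
    """Encode raw 8-bit grayscale pixels (row-major bytes) as a PNG file."""
    header = struct.pack(">IIBBBBB", width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b"".join(b"\x00" + pixels[y * width:(y + 1) * width]
                   for y in range(height))
    return (b"\x89PNG\r\n\x1a\n" + png_chunk(b"IHDR", header)
            + png_chunk(b"IDAT", zlib.compress(raw)) + png_chunk(b"IEND", b""))


def idx_to_pngs(idx_path, out_dir):
    """Write every image of an IDX3 file (plain or .gz) to out_dir/<index>.png."""
    opener = gzip.open if idx_path.endswith(".gz") else open
    with opener(idx_path, "rb") as f:
        rows, cols, images = parse_idx_images(f.read())
    os.makedirs(out_dir, exist_ok=True)
    for i, pixels in enumerate(images):
        with open(os.path.join(out_dir, "{}.png".format(i)), "wb") as out:
            out.write(encode_gray_png(pixels, cols, rows))
    return len(images)
```

A call such as idx_to_pngs("train-images-idx3-ubyte.gz", "mnist_png") would then write one file per image. Sorting the files into class directories would additionally need the matching label file, which this sketch does not read.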

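CIFAR-10 image data

I generated this part from the CIFAR-10 dataset provided on the Kaggle CIFAR-10 competition page. The original data is already in .png format, which suits our needs, but there is one problem: the images all sit unsorted in the train directory instead of being split into the 10 original classes. Fortunately a label mapping file, trainLabels.csv, is provided. So the first problem to solve is to split the images into 10 classes automatically according to this mapping file and store them in 10 directories.

Here is my code: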

# coding: utf-8
import csv
import os
import shutil


# Return the file name without its ".png" suffix
def getImageFilePre(filename):
    if filename.endswith(".png"):
        return filename.split(".")[0]
    return None


# Rename the images in every class directory to "<class>.<n>.png"
def fileRename(dirPath):
    for dirname in os.listdir(dirPath):
        classPath = os.path.join(dirPath, dirname)
        if not os.path.isdir(classPath):
            continue
        count = 1
        for curFile in os.listdir(classPath):
            if curFile.endswith(".png"):
                newName = dirname + "." + str(count) + ".png"
                count = count + 1
                shutil.move(os.path.join(classPath, curFile),
                            os.path.join(classPath, newName))
                print(curFile + " -> " + newName + " ------> OK!")


def main():
    # Read the label file into a list of [id, label] rows (row 0 is the header)
    with open("trainLabels.csv", newline="") as csvfile:
        reader = list(csv.reader(csvfile))

    # List the image files, sorted numerically by file name
    dirPath = "F:\\xxxxx\\data_origin\\train_200"
    baseDirPath = "F:\\xxxxx\\data_origin\\train_with_class"
    dirContents = [f for f in os.listdir(dirPath) if f.endswith(".png")]
    dirContents.sort(key=lambda x: int(x[:-4]))

    # Process every image that has a label row (50000 for the full train set)
    totalFiles = min(len(dirContents), len(reader) - 1)
    for num in range(1, totalFiles + 1):
        labelID = reader[num][0]
        labelName = reader[num][1]
        imageFilename = dirContents[num - 1]
        tmpFilePre = getImageFilePre(imageFilename)

        # Copy the image only when the CSV row and the file name agree
        if int(labelID) == int(tmpFilePre):
            new_dir_path = os.path.join(baseDirPath, labelName)
            if not os.path.isdir(new_dir_path):
                os.makedirs(new_dir_path)
                print(new_dir_path + " created")
            shutil.copy(os.path.join(dirPath, imageFilename), new_dir_path)

    # Rename the copied files inside each class directory
    fileRename(baseDirPath)


if __name__ == '__main__':
    main()

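This splits the images into 10 classes, stored in separate directories by class, with 5000 images per class. On my Windows machine the run took 1.5 hours (including the file renaming), which is admittedly rather slow. The final result is shown in the figure: [Figure: final result]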
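If the run time matters, one possible variation is to look labels up in a dictionary and copy each image straight to its final name, so that the separate renaming pass and the per-file printing disappear. The sketch below is untested against the real Kaggle data and its speed has not been measured. It assumes trainLabels.csv has a header row followed by id,label rows and that the images are named <id>.png; the function name sort_by_label and its parameters are invented for this example.

```python
import csv
import os
import shutil


def sort_by_label(csv_path, src_dir, dst_dir):
    """Copy src_dir/<id>.png to dst_dir/<label>/<label>.<n>.png.

    Returns a dict mapping each label to the number of images copied.
    """
    with open(csv_path, newline="") as f:
        rows = csv.reader(f)
        next(rows)  # skip the header row (assumed to be present)
        labels = {row[0]: row[1] for row in rows}  # image id -> class name

    # Keep only images that have a label, in numeric order of their id.
    ids = sorted((name[:-4] for name in os.listdir(src_dir)
                  if name.endswith(".png") and name[:-4] in labels), key=int)

    counts = {}
    for image_id in ids:
        label = labels[image_id]
        counts[label] = counts.get(label, 0) + 1
        class_dir = os.path.join(dst_dir, label)
        os.makedirs(class_dir, exist_ok=True)
        new_name = "{}.{}.png".format(label, counts[label])
        shutil.copy(os.path.join(src_dir, image_id + ".png"),
                    os.path.join(class_dir, new_name))
    return counts
```

Because the lookup is keyed by image id, this version does not depend on the directory listing and the CSV rows being in the same order.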

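Generating image datasets with DIGITS, the graphical tool for Caffe

For detailed usage, see this blog post: http://www.cnblogs.com/denny402/p/5136155.html

This requires installing Caffe and DIGITS. The tool can directly generate image data that is already sorted by class, and it is fast, so it is worth a try.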

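.h5 file structure viewer

When working with convolutional neural networks we often save data as .h5 files, and sometimes we later need to make use of those files. For transfer learning, for example, layers have to be frozen according to the structure of the .h5 file.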
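One way to look at that structure from Python is the third-party h5py package. The following is a minimal sketch, not the method used in this post: describe and print_h5_structure are names chosen for the example, and describe relies only on duck typing (datasets expose shape and dtype, groups expose items() and attrs), which is an assumption that matches h5py objects.

```python
def describe(node, name="/", depth=0):
    """Return text lines describing a group/dataset tree, one entry per line."""
    pad = "    " * depth
    if hasattr(node, "shape"):  # dataset-like object: report shape and dtype
        return ["{}Dataset '{}'  shape={} dtype={}".format(
            pad, name, tuple(node.shape), node.dtype)]
    lines = ["{}Group '{}'".format(pad, name)]
    for key, value in getattr(node, "attrs", {}).items():
        lines.append("{}    attr '{}': {}".format(pad, key, value))
    for key, child in node.items():
        lines.extend(describe(child, key, depth + 1))
    return lines


def print_h5_structure(path):
    """Open an HDF5 file read-only and print its group/dataset tree."""
    import h5py  # third-party; imported here so describe() works without it
    with h5py.File(path, "r") as f:
        print("\n".join(describe(f)))
```

Calling print_h5_structure("vgg16_weights.h5") would list every group and dataset in the file. Note that h5py reports shapes in C (row-major) order, so they may appear reversed relative to what MATLAB prints for the same dataset.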

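Besides inspecting the structure of an .h5 file with your own code, is there a quicker tool? There is: MATLAB provides a ready-made function for it. The documentation is here: http://cn.mathworks.com/help/matlab/ref/h5disp.html

For example, we can use a MATLAB command to view the weight structure of the VGG16 model:

>> h5disp('vgg16_weights.h5')

The output is as follows: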

>> h5disp('vgg16_weights.h5')

HDF5 vgg16_weights.h5
Group '/'
    Attributes:
        'nb_layers': 37
    Group '/layer_0'
        Attributes:
            'nb_params': 0
    Group '/layer_1'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x3x64
            MaxSize: 3x3x3x64
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 64
            MaxSize: 64
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_10'
        Attributes:
            'nb_params': 0
    Group '/layer_11'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x128x256
            MaxSize: 3x3x128x256
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 256
            MaxSize: 256
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_12'
        Attributes:
            'nb_params': 0
    Group '/layer_13'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x256x256
            MaxSize: 3x3x256x256
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 256
            MaxSize: 256
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_14'
        Attributes:
            'nb_params': 0
    Group '/layer_15'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x256x256
            MaxSize: 3x3x256x256
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 256
            MaxSize: 256
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_16'
        Attributes:
            'nb_params': 0
    Group '/layer_17'
        Attributes:
            'nb_params': 0
    Group '/layer_18'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x256x512
            MaxSize: 3x3x256x512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 512
            MaxSize: 512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_19'
        Attributes:
            'nb_params': 0
    Group '/layer_2'
        Attributes:
            'nb_params': 0
    Group '/layer_20'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x512x512
            MaxSize: 3x3x512x512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 512
            MaxSize: 512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_21'
        Attributes:
            'nb_params': 0
    Group '/layer_22'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x512x512
            MaxSize: 3x3x512x512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 512
            MaxSize: 512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_23'
        Attributes:
            'nb_params': 0
    Group '/layer_24'
        Attributes:
            'nb_params': 0
    Group '/layer_25'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x512x512
            MaxSize: 3x3x512x512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 512
            MaxSize: 512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_26'
        Attributes:
            'nb_params': 0
    Group '/layer_27'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x512x512
            MaxSize: 3x3x512x512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 512
            MaxSize: 512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_28'
        Attributes:
            'nb_params': 0
    Group '/layer_29'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x512x512
            MaxSize: 3x3x512x512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 512
            MaxSize: 512
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_3'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x64x64
            MaxSize: 3x3x64x64
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 64
            MaxSize: 64
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_30'
        Attributes:
            'nb_params': 0
    Group '/layer_31'
        Attributes:
            'nb_params': 0
    Group '/layer_32'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 4096x25088
            MaxSize: 4096x25088
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 4096
            MaxSize: 4096
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_33'
        Attributes:
            'nb_params': 0
    Group '/layer_34'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 4096x4096
            MaxSize: 4096x4096
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 4096
            MaxSize: 4096
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_35'
        Attributes:
            'nb_params': 0
    Group '/layer_36'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 1000x4096
            MaxSize: 1000x4096
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 1000
            MaxSize: 1000
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_4'
        Attributes:
            'nb_params': 0
    Group '/layer_5'
        Attributes:
            'nb_params': 0
    Group '/layer_6'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x64x128
            MaxSize: 3x3x64x128
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 128
            MaxSize: 128
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_7'
        Attributes:
            'nb_params': 0
    Group '/layer_8'
        Attributes:
            'nb_params': 2
        Dataset 'param_0'
            Size: 3x3x128x128
            MaxSize: 3x3x128x128
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
        Dataset 'param_1'
            Size: 128
            MaxSize: 128
            Datatype: H5T_IEEE_F32LE (single)
            ChunkSize: []
            Filters: none
            FillValue: 0.000000
    Group '/layer_9'
        Attributes:
            'nb_params': 0
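A listing this long is easier to use once it is summarized. As a rough sketch, the text of the h5disp output above could be parsed to total the number of stored values per layer group, which shows at a glance which layers carry weights (and are therefore candidates for freezing). The function name count_params is hypothetical, and the parser assumes only the "Group '...'" and "Size: AxBx..." line shapes seen in this listing.

```python
import re
from math import prod


def count_params(dump_text):
    """Return {group name: total number of values across its datasets}."""
    totals = {}
    group = None
    for line in dump_text.splitlines():
        line = line.strip()
        match = re.match(r"Group '(.+)'", line)
        if match:
            group = match.group(1)
            totals[group] = 0
        elif line.startswith("Size:") and group is not None:
            parts = line.split(":", 1)[1].strip().split("x")
            # Skip anything that is not a plain list of dimensions.
            if all(p.isdigit() for p in parts):
                totals[group] += prod(int(p) for p in parts)
    return totals
```

For the '/layer_1' group shown above, this adds 3*3*3*64 values from param_0 and 64 from param_1. Groups whose total is zero hold no parameters.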

References

https://absentm.github.io/2016/07/12/%E5%B8%B8%E7%94%A8%E5%9B%BE%E5%83%8F%E6%95%B0%E6%8D%AE%E9%9B%86%E5%8E%9F%E5%A7%8B%E6%95%B0%E6%8D%AE-png%E6%88%96-jpg%E6%A0%BC%E5%BC%8F-%E7%94%9F%E6%88%90%E6%96%B9%E6%B3%95/

[1]. https://www.cs.toronto.edu/~kriz/cifar.html
[2]. https://www.kaggle.com/c/cifar-10/data
[3]. http://cn.mathworks.com/help/matlab/ref/h5disp.html
[4]. http://www.cnblogs.com/denny402/p/5136155.html