Butterfly Image Classification with PaddlePaddle 2.0 — Tuning the Parameters of a Pretrained ResNet101 Residual Network, plus Data Augmentation

This project is based on work by teacher Lu Ping.

Background

First, ResNet101 is a member of the ResNet family, which is arguably the most widely used CNN feature-extraction backbone today. It was introduced in 2015, and its authors include the well-known Kaiming He, Shaoqing Ren, and Jian Sun. The numbers in ResNet-18, ResNet-50, and ResNet-101 refer to the network depth: 18, 50, and 101 weighted layers respectively.

The following introduction to ResNet101 is quoted from Jianshu author manofmountain; the attribution is given below, and you can read the full piece at the link.

Author: manofmountain
Link: https://www.jianshu.com/p/93990a641066
Source: Jianshu (简书)
Copyright belongs to the author. For commercial reuse, contact the author for authorization; for non-commercial reuse, credit the source.

Residual Networks

Residual networks are convolutional neural networks proposed by four researchers from Microsoft Research; they won the image-classification and object-detection tasks of the 2015 ImageNet Large Scale Visual Recognition Challenge (ILSVRC). Residual networks are easy to optimize and can gain accuracy from considerably increased depth. Their internal residual blocks use skip connections, which alleviate the vanishing-gradient problem that comes with adding depth to deep neural networks.
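The core idea behind the skip connection can be sketched in a few lines of plain Python. This is a toy scalar version, not the actual convolutional blocks: a residual block computes y = f(x) + x, so even when the learned transform f contributes almost nothing, the block still passes its input through unchanged, and gradients can flow through the identity path.

```python
def residual_block(x, f):
    """Toy residual block: output = learned transform + identity shortcut."""
    return f(x) + x

# A 'learned' transform that has collapsed to (near) zero...
zero_transform = lambda x: 0.0
# ...still leaves the block acting as the identity, so stacking many such
# blocks cannot make the network worse than a shallower one.
print(residual_block(3.0, zero_transform))       # 3.0

# A useful transform simply adds a correction on top of the identity.
print(residual_block(3.0, lambda x: 0.5 * x))    # 4.5
```

This is why very deep ResNets remain trainable: each block only has to learn a residual correction to the identity, rather than the full mapping.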

Getting Down to Business

Portal to teacher Lu Ping's butterfly project:
https://aistudio.baidu.com/aistudio/projectdetail/1617603
For day one, my goal was just to get it running myself; the key part is Section 6, "Training the model with the high-level API".

6. Training the Model with the High-Level API (Day 1: run the teacher's notebook as-is)

First, define the shape and data type of the input.

Second, instantiate the model. To use the high-level API, the model must be wrapped with paddle.Model(), as in model = paddle.Model(model, inputs=input_define, labels=label_define).

Third, define the optimizer. Here we use Adam with a learning rate of 0.0001. The optimizer's learning_rate parameter is important: if the accuracy oscillates during training, jumping up and down, try lowering it further.

Fourth, prepare the model, using the high-level API model.prepare().

Fifth, train the model, using the high-level API model.fit(). The meaning of each parameter is explained in the code comments below.

# Define the inputs
input_define = paddle.static.InputSpec(shape=[-1, 3, 224, 224], dtype="float32", name="img")
label_define = paddle.static.InputSpec(shape=[-1, 1], dtype="int64", name="label")

# Instantiate the network and define the training logic (optimizer, loss, metrics)
model = MyNet()
model = paddle.Model(model, inputs=input_define, labels=label_define)  # wrap the model with paddle.Model()
optimizer = paddle.optimizer.Adam(learning_rate=0.0001, parameters=model.parameters())
# The learning_rate above is important: if the training accuracy oscillates,
# jumping up and down, try lowering it further.
model.prepare(optimizer=optimizer,                # optimizer
              loss=paddle.nn.CrossEntropyLoss(),  # loss function
              metrics=paddle.metric.Accuracy())   # evaluation metric
model.fit(train_data=train_dataset,       # training dataset
          eval_data=eval_dataset,         # evaluation dataset
          batch_size=64,                  # samples per batch
          epochs=50,                      # number of epochs
          save_dir="/home/aistudio/lup",  # folder for model/optimizer checkpoints
          save_freq=20,                   # save parameters every N epochs
          log_freq=100                    # logging frequency
          )
100%|██████████| 151272/151272 [00:03<00:00, 45656.75it/s]
The loss value printed in the log is the current step, and the metric is the average value of previous step.
Epoch 1/50
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/fluid/layers/utils.py:77: DeprecationWarning: Using or importing the ABCs from 'collections' instead of from 'collections.abc' is deprecated, and in 3.8 it will stop working
  return (isinstance(seq, collections.Sequence) and
/opt/conda/envs/python35-paddle120-env/lib/python3.7/site-packages/paddle/nn/layer/norm.py:636: UserWarning: When training, we now always track global mean and variance.
  "When training, we now always track global mean and variance.")
step 24/24 - loss: 1.0453 - acc: 0.5023 - 714ms/step
save checkpoint at /home/aistudio/lup/0
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.7725 - acc: 0.7185 - 452ms/step
Eval samples: 373
Epoch 2/50
step 24/24 - loss: 0.0602 - acc: 0.9625 - 469ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4693 - acc: 0.8204 - 458ms/step
Eval samples: 373
Epoch 3/50
step 24/24 - loss: 0.0137 - acc: 0.9953 - 565ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4770 - acc: 0.8123 - 454ms/step
Eval samples: 373
Epoch 4/50
step 24/24 - loss: 0.0062 - acc: 0.9967 - 511ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.5013 - acc: 0.8123 - 524ms/step
Eval samples: 373
Epoch 5/50
step 24/24 - loss: 0.0063 - acc: 0.9967 - 471ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4859 - acc: 0.8150 - 568ms/step
Eval samples: 373
Epoch 6/50
step 24/24 - loss: 0.0059 - acc: 0.9967 - 479ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4518 - acc: 0.8177 - 445ms/step
Eval samples: 373
Epoch 7/50
step 24/24 - loss: 0.0050 - acc: 0.9953 - 503ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4636 - acc: 0.8177 - 459ms/step
Eval samples: 373
Epoch 8/50
step 24/24 - loss: 0.0058 - acc: 0.9946 - 477ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4556 - acc: 0.8204 - 446ms/step
Eval samples: 373
Epoch 9/50
step 24/24 - loss: 0.0113 - acc: 0.9953 - 501ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4382 - acc: 0.8257 - 533ms/step
Eval samples: 373
Epoch 10/50
step 24/24 - loss: 0.0044 - acc: 0.9973 - 484ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4378 - acc: 0.8284 - 529ms/step
Eval samples: 373
Epoch 11/50
step 24/24 - loss: 0.0032 - acc: 0.9973 - 472ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4394 - acc: 0.8231 - 455ms/step
Eval samples: 373
Epoch 12/50
step 24/24 - loss: 0.0049 - acc: 0.9973 - 484ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4287 - acc: 0.8338 - 471ms/step
Eval samples: 373
Epoch 13/50
step 24/24 - loss: 0.0086 - acc: 0.9953 - 465ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4383 - acc: 0.8257 - 442ms/step
Eval samples: 373
Epoch 14/50
step 24/24 - loss: 6.3811e-04 - acc: 0.9960 - 518ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4707 - acc: 0.8177 - 497ms/step
Eval samples: 373
Epoch 15/50
step 24/24 - loss: 0.0048 - acc: 0.9967 - 482ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4612 - acc: 0.8338 - 454ms/step
Eval samples: 373
Epoch 16/50
step 24/24 - loss: 0.0438 - acc: 0.9967 - 465ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4372 - acc: 0.8177 - 497ms/step
Eval samples: 373
Epoch 17/50
step 24/24 - loss: 0.0018 - acc: 0.9967 - 507ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4751 - acc: 0.8177 - 504ms/step
Eval samples: 373
Epoch 18/50
step 24/24 - loss: 0.0027 - acc: 0.9973 - 461ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4662 - acc: 0.8231 - 520ms/step
Eval samples: 373
Epoch 19/50
step 24/24 - loss: 6.9153e-04 - acc: 0.9973 - 497ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4607 - acc: 0.8204 - 493ms/step
Eval samples: 373
Epoch 20/50
step 24/24 - loss: 0.0030 - acc: 0.9973 - 491ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4461 - acc: 0.8204 - 468ms/step
Eval samples: 373
Epoch 21/50
step 24/24 - loss: 0.0014 - acc: 0.9973 - 451ms/step
save checkpoint at /home/aistudio/lup/20
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4457 - acc: 0.8177 - 453ms/step
Eval samples: 373
Epoch 22/50
step 24/24 - loss: 0.0052 - acc: 0.9967 - 469ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4341 - acc: 0.8204 - 482ms/step
Eval samples: 373
Epoch 23/50
step 24/24 - loss: 0.0020 - acc: 0.9967 - 467ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4730 - acc: 0.8231 - 453ms/step
Eval samples: 373
Epoch 24/50
step 24/24 - loss: 0.0011 - acc: 0.9946 - 504ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4594 - acc: 0.8231 - 527ms/step
Eval samples: 373
Epoch 25/50
step 24/24 - loss: 0.0021 - acc: 0.9960 - 514ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4580 - acc: 0.8177 - 458ms/step
Eval samples: 373
Epoch 26/50
step 24/24 - loss: 0.0021 - acc: 0.9953 - 493ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4362 - acc: 0.8231 - 458ms/step
Eval samples: 373
Epoch 27/50
step 24/24 - loss: 8.8354e-04 - acc: 0.9967 - 476ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4431 - acc: 0.8150 - 459ms/step
Eval samples: 373
Epoch 28/50
step 24/24 - loss: 8.2119e-04 - acc: 0.9967 - 468ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4147 - acc: 0.8231 - 451ms/step
Eval samples: 373
Epoch 29/50
step 24/24 - loss: 4.7179e-04 - acc: 0.9973 - 562ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4441 - acc: 0.8204 - 495ms/step
Eval samples: 373
Epoch 30/50
step 24/24 - loss: 9.7508e-04 - acc: 0.9960 - 462ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4218 - acc: 0.8204 - 458ms/step
Eval samples: 373
Epoch 31/50
step 24/24 - loss: 9.0738e-04 - acc: 0.9967 - 511ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4271 - acc: 0.8231 - 510ms/step
Eval samples: 373
Epoch 32/50
step 24/24 - loss: 0.0554 - acc: 0.9960 - 513ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4144 - acc: 0.8257 - 572ms/step
Eval samples: 373
Epoch 33/50
step 24/24 - loss: 0.0016 - acc: 0.9967 - 547ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4410 - acc: 0.8150 - 780ms/step
Eval samples: 373
Epoch 34/50
step 24/24 - loss: 5.2188e-04 - acc: 0.9973 - 591ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4245 - acc: 0.8177 - 552ms/step
Eval samples: 373
Epoch 35/50
step 24/24 - loss: 0.0012 - acc: 0.9967 - 472ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4272 - acc: 0.8231 - 519ms/step
Eval samples: 373
Epoch 36/50
step 24/24 - loss: 0.0010 - acc: 0.9967 - 481ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4180 - acc: 0.8284 - 448ms/step
Eval samples: 373
Epoch 37/50
step 24/24 - loss: 0.0016 - acc: 0.9967 - 493ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4372 - acc: 0.8204 - 444ms/step
Eval samples: 373
Epoch 38/50
step 24/24 - loss: 0.0012 - acc: 0.9960 - 453ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4159 - acc: 0.8311 - 446ms/step
Eval samples: 373
Epoch 39/50
step 24/24 - loss: 6.6552e-04 - acc: 0.9953 - 478ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4017 - acc: 0.8231 - 463ms/step
Eval samples: 373
Epoch 40/50
step 24/24 - loss: 0.0012 - acc: 0.9953 - 462ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4211 - acc: 0.8177 - 592ms/step
Eval samples: 373
Epoch 41/50
step 24/24 - loss: 0.0010 - acc: 0.9967 - 487ms/step
save checkpoint at /home/aistudio/lup/40
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4018 - acc: 0.8177 - 540ms/step
Eval samples: 373
Epoch 42/50
step 24/24 - loss: 5.5304e-04 - acc: 0.9960 - 514ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4034 - acc: 0.8204 - 488ms/step
Eval samples: 373
Epoch 43/50
step 24/24 - loss: 0.0034 - acc: 0.9967 - 459ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4180 - acc: 0.8150 - 460ms/step
Eval samples: 373
Epoch 44/50
step 24/24 - loss: 9.1367e-04 - acc: 0.9973 - 541ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4253 - acc: 0.8284 - 486ms/step
Eval samples: 373
Epoch 45/50
step 24/24 - loss: 4.2939e-04 - acc: 0.9967 - 487ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4468 - acc: 0.8231 - 468ms/step
Eval samples: 373
Epoch 46/50
step 24/24 - loss: 4.1652e-04 - acc: 0.9960 - 478ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4426 - acc: 0.8257 - 533ms/step
Eval samples: 373
Epoch 47/50
step 24/24 - loss: 0.0021 - acc: 0.9973 - 545ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4339 - acc: 0.8257 - 491ms/step
Eval samples: 373
Epoch 48/50
step 24/24 - loss: 0.0011 - acc: 0.9973 - 492ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4490 - acc: 0.8204 - 471ms/step
Eval samples: 373
Epoch 49/50
step 24/24 - loss: 1.9973e-04 - acc: 0.9967 - 474ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4515 - acc: 0.8177 - 454ms/step
Eval samples: 373
Epoch 50/50
step 24/24 - loss: 0.0047 - acc: 0.9953 - 482ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4507 - acc: 0.8257 - 476ms/step
Eval samples: 373
save checkpoint at /home/aistudio/lup/final

Here we can see that the current accuracy is 82.57%.

Day 2: Tuning the Parameters

Without changing the model, I tried tuning the parameters. learning_rate=0.0001 sets the learning rate. batch_size=64 is the number of samples per batch; changing it changes the number of steps per epoch. epochs=50 is the number of training epochs; more is not always better, because once the model's parameters have reached their limit, adding epochs no longer helps. save_freq=20 sets how many epochs elapse between saves of the model and optimizer parameters. I then adjusted my script to:
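The relationship between batch_size and the step count printed in the logs is just a ceiling division. As a quick sanity check, the sample counts below are inferred from the logs in this post (373 eval images, so roughly 1493 training images before augmentation):

```python
import math

def steps_per_epoch(num_samples, batch_size):
    """Number of batches (steps) the trainer runs per epoch."""
    return math.ceil(num_samples / batch_size)

# With batch_size=64 and ~1493 training images, the log shows 'step 24/24'.
print(steps_per_epoch(1493, 64))  # 24
# The eval set of 373 images gives the 'step 6/6' lines.
print(steps_per_epoch(373, 64))   # 6
# Halving the batch size would double the steps per epoch.
print(steps_per_epoch(1493, 32))  # 47
```

This is why changing batch_size changes the step counts in the log while leaving the number of samples seen per epoch the same.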

# Define the inputs
input_define = paddle.static.InputSpec(shape=[-1, 3, 224, 224], dtype="float32", name="img")
label_define = paddle.static.InputSpec(shape=[-1, 1], dtype="int64", name="label")

# Instantiate the network and define the training logic (optimizer, loss, metrics)
model = MyNet()
model = paddle.Model(model, inputs=input_define, labels=label_define)  # wrap the model with paddle.Model()
optimizer = paddle.optimizer.Adam(learning_rate=0.0002, parameters=model.parameters())
# The learning_rate above is important: if the training accuracy oscillates,
# jumping up and down, try lowering it further.
model.prepare(optimizer=optimizer,                # optimizer
              loss=paddle.nn.CrossEntropyLoss(),  # loss function
              metrics=paddle.metric.Accuracy())   # evaluation metric
model.fit(train_data=train_dataset,       # training dataset
          eval_data=eval_dataset,         # evaluation dataset
          batch_size=64,                  # samples per batch
          epochs=10,                      # number of epochs
          save_dir="/home/aistudio/lup",  # folder for model/optimizer checkpoints
          save_freq=10,                   # save parameters every N epochs
          log_freq=100                    # logging frequency
          )
The loss value printed in the log is the current step, and the metric is the average value of previous step.
Epoch 1/10
step 24/24 - loss: 0.8407 - acc: 0.5687 - 498ms/step
save checkpoint at /home/aistudio/lup/0
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.8831 - acc: 0.7534 - 501ms/step
Eval samples: 373
Epoch 2/10
step 24/24 - loss: 0.2330 - acc: 0.9431 - 474ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.3127 - acc: 0.8391 - 486ms/step
Eval samples: 373
Epoch 3/10
step 24/24 - loss: 0.0126 - acc: 0.9826 - 454ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.2984 - acc: 0.8579 - 481ms/step
Eval samples: 373
Epoch 4/10
step 24/24 - loss: 0.0604 - acc: 0.9926 - 455ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.2291 - acc: 0.8820 - 462ms/step
Eval samples: 373
Epoch 5/10
step 24/24 - loss: 0.0827 - acc: 0.9900 - 463ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.2617 - acc: 0.8633 - 466ms/step
Eval samples: 373
Epoch 6/10
step 24/24 - loss: 0.0158 - acc: 0.9940 - 483ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.2667 - acc: 0.8633 - 486ms/step
Eval samples: 373
Epoch 7/10
step 24/24 - loss: 0.0029 - acc: 0.9960 - 488ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.1697 - acc: 0.8847 - 449ms/step
Eval samples: 373
Epoch 8/10
step 24/24 - loss: 0.0293 - acc: 0.9960 - 454ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.1741 - acc: 0.8874 - 461ms/step
Eval samples: 373
Epoch 9/10
step 24/24 - loss: 0.1015 - acc: 0.9953 - 449ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.4361 - acc: 0.8740 - 483ms/step
Eval samples: 373
Epoch 10/10
step 24/24 - loss: 0.0527 - acc: 0.9926 - 485ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 6/6 - loss: 0.6901 - acc: 0.8686 - 591ms/step
Eval samples: 373
save checkpoint at /home/aistudio/lup/final

With this tuning, the model's accuracy got up to 0.8686, but however I adjusted things, it seemed hard to push any higher, and the loss was also fairly large.

Day 3: Data Augmentation

Without changing the model, parameter tuning alone can only take the accuracy so far, which is where data augmentation comes in. Data augmentation takes the existing images and applies operations such as rotation, translation, random cropping, random erasing, color changes, and flipping, saving the results as new samples; this enlarges the dataset and can in turn improve accuracy. Given that these are butterfly photos, I decided to start with flipping: each original image is flipped horizontally and then added back to the dataset.

# The following code builds the mapping between sample file paths and labels
import os
import random
import cv2
import numpy as np
from matplotlib import pyplot as plt

data_list = []  # list holding each sample's file path and label

# The species names are strings, but the model takes integers, so build a
# dictionary mapping each species name to an integer: keys are species
# names, values are integers.
label_list = []
with open("/home/aistudio/data/species.txt") as f:
    for line in f:
        a, b = line.strip("\n").split(" ")
        label_list.append([b, int(a) - 1])
label_dic = dict(label_list)
# print(label_dic)

# Collect all subdirectory names under Butterfly20 into a list
class_list = os.listdir("/home/aistudio/data/Butterfly20")
class_list.remove('.DS_Store')  # .DS_Store contains no samples, so drop it
# print(class_list)

for each in class_list:
    for f in os.listdir("/home/aistudio/data/Butterfly20/" + each):
        filename = "/home/aistudio/data/Butterfly20/" + each + '/' + f
        img = cv2.imread(filename)
        dst = cv2.flip(img, 1)  # flip the image horizontally
        # plt.imshow(dst)
        cv2.imwrite("/home/aistudio/data/Butterfly20/" + each + '/new_' + f, dst)  # add the flipped image to the dataset

1,866 samples instantly become 3,732.
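To see exactly what cv2.flip(img, 1) does, here is a minimal pure-Python equivalent on a nested-list "image" (a list of pixel rows): a horizontal flip just reverses each row. The label-preserving assumption is that a mirrored butterfly is still the same species.

```python
def hflip(img):
    """Horizontally flip an image stored as a list of rows, like cv2.flip(img, 1)."""
    return [row[::-1] for row in img]

img = [[1, 2, 3],
       [4, 5, 6]]
print(hflip(img))  # [[3, 2, 1], [6, 5, 4]]

# Flipping twice recovers the original, so no information is lost.
assert hflip(hflip(img)) == img
```

Since every original image gains exactly one mirrored copy, the dataset size doubles, which matches the jump from 1,866 to 3,732 samples.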

# Define the inputs
input_define = paddle.static.InputSpec(shape=[-1, 3, 224, 224], dtype="float32", name="img")
label_define = paddle.static.InputSpec(shape=[-1, 1], dtype="int64", name="label")

# Instantiate the network and define the training logic (optimizer, loss, metrics)
model = MyNet()
model = paddle.Model(model, inputs=input_define, labels=label_define)  # wrap the model with paddle.Model()
optimizer = paddle.optimizer.Adam(learning_rate=0.0002, parameters=model.parameters())
# The learning_rate above is important: if the training accuracy oscillates,
# jumping up and down, try lowering it further.
model.prepare(optimizer=optimizer,                # optimizer
              loss=paddle.nn.CrossEntropyLoss(),  # loss function
              metrics=paddle.metric.Accuracy())   # evaluation metric
model.fit(train_data=train_dataset,       # training dataset
          eval_data=eval_dataset,         # evaluation dataset
          batch_size=64,                  # samples per batch
          epochs=10,                      # number of epochs
          save_dir="/home/aistudio/lup",  # folder for model/optimizer checkpoints
          save_freq=10,                   # save parameters every N epochs
          log_freq=100                    # logging frequency
          )
100%|██████████| 151272/151272 [00:02<00:00, 68032.16it/s]
The loss value printed in the log is the current step, and the metric is the average value of previous step.
Epoch 1/10
step 47/47 - loss: 0.4685 - acc: 0.7251 - 535ms/step
save checkpoint at /home/aistudio/lup/0
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 12/12 - loss: 0.2460 - acc: 0.8606 - 513ms/step
Eval samples: 746
Epoch 2/10
step 47/47 - loss: 0.1171 - acc: 0.9618 - 527ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 12/12 - loss: 0.2140 - acc: 0.9088 - 512ms/step
Eval samples: 746
Epoch 3/10
step 47/47 - loss: 0.0063 - acc: 0.9916 - 529ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 12/12 - loss: 0.1285 - acc: 0.9491 - 515ms/step
Eval samples: 746
Epoch 4/10
step 47/47 - loss: 0.0035 - acc: 0.9953 - 533ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 12/12 - loss: 0.0696 - acc: 0.9571 - 512ms/step
Eval samples: 746
Epoch 5/10
step 47/47 - loss: 0.0038 - acc: 0.9973 - 544ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 12/12 - loss: 0.0627 - acc: 0.9598 - 512ms/step
Eval samples: 746
Epoch 6/10
step 47/47 - loss: 0.0023 - acc: 0.9967 - 529ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 12/12 - loss: 0.0980 - acc: 0.9558 - 511ms/step
Eval samples: 746
Epoch 7/10
step 47/47 - loss: 9.3179e-04 - acc: 0.9973 - 534ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 12/12 - loss: 0.0914 - acc: 0.9625 - 512ms/step
Eval samples: 746
Epoch 8/10
step 47/47 - loss: 2.5623e-04 - acc: 0.9980 - 529ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 12/12 - loss: 0.0767 - acc: 0.9638 - 513ms/step
Eval samples: 746
Epoch 9/10
step 47/47 - loss: 6.1076e-04 - acc: 0.9977 - 525ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 12/12 - loss: 0.0780 - acc: 0.9625 - 515ms/step
Eval samples: 746
Epoch 10/10
step 47/47 - loss: 1.9949e-04 - acc: 0.9983 - 530ms/step
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 12/12 - loss: 0.0857 - acc: 0.9651 - 518ms/step
Eval samples: 746
save checkpoint at /home/aistudio/lup/final

Training with Day 2's parameters, the accuracy rose to 0.9651 with a loss of 0.0857; overall, I think the result is quite good.

Summary

The pretrained ResNet101 residual network performs quite well overall. Appropriate parameter tuning can raise accuracy by 4-5%, but without changing the model, tuning alone has limited power. The data therefore becomes all the more important: both its quality and its quantity. That is exactly what data augmentation addresses; enlarging the dataset lets the trained model reach a higher accuracy.
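For later experiments, the other augmentations mentioned above (rotation, cropping, flipping) could be folded into the data pipeline instead of writing flipped files to disk. The following is only a sketch of such a pipeline configuration, assuming the paddle.vision.transforms module of PaddlePaddle 2.0+; the exact set of available transforms and their signatures depend on the Paddle version, so check the documentation before use.

```python
import paddle.vision.transforms as T

# On-the-fly augmentation: each epoch sees a freshly transformed copy of
# every image, so no extra image files need to be written to disk.
train_transforms = T.Compose([
    T.RandomHorizontalFlip(0.5),            # the same flip as above, applied randomly
    T.RandomRotation(15),                   # small random rotations
    T.RandomCrop(224, pad_if_needed=True),  # random 224x224 crops
    T.Transpose(),                          # HWC -> CHW for the model input
])
```

A pipeline like this would be passed to the dataset so that augmentation happens during loading, with a plain (non-random) resize/transpose pipeline used for the eval set.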
