This post is a record of a task completed through self-study and much trial and error; it has no real reference value.

Preface: this article is a hands-on exercise from following *Dive into Deep Learning* (Li Mu). It started with classifying a leaf dataset, but being a complete beginner I spent two or three weeks getting it to work. It still felt worth writing down as a summary of everywhere I got stuck along the way. Because the leaf-classification problem left me somewhat confused and I did not feel I had really understood it, I went on to the ImageNet Dog Breed Classification competition on Kaggle; this article records the leaf classification as well.

Summary:

  1. Spend more time on forums and on the competition pages reading other people's code; read more, practice more.
  2. When something goes wrong, investigate it immediately: step through it in the IDE debugger, or fall back on print statements. One key mistake this time: in the leaf-classification task I passed the leaf class names straight into a tensor without first converting them to integers, which of course was doomed to fail. Also, when extracting the unique classes I used a for loop to walk through the elements one by one, which wasted a lot of time, while the baseline solved it with a set (which cannot contain duplicates), which is far more efficient. So again: read more, practice more.
  3. There was a custom Dataset class in which something went wrong that I could not locate: the data fed into the network was wrong, the validation accuracy was zero from the very start while the training accuracy climbed slowly up from zero. Only later did I realize the Dataset was probably mis-pairing labels with images. So debug in a targeted way instead of poking around blindly.
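The set trick from point 2 can be sketched in a few lines; the toy label list below is made up purely for illustration:

```python
# Extracting the unique class names: element-by-element loop vs. the set-based way
labels = ["maple", "oak", "maple", "pine", "oak"]

# Slow: walk the list and check membership one element at a time
classes_loop = []
for lab in labels:
    if lab not in classes_loop:
        classes_loop.append(lab)

# Fast and idiomatic: a set cannot contain duplicates
classes_set = sorted(set(labels))
print(classes_set)  # ['maple', 'oak', 'pine']

# Map each class name to an integer so string labels can become tensors
class_to_num = dict(zip(classes_set, range(len(classes_set))))
print(class_to_num["pine"])  # 2
```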

Contents

  • ImageNet Dogs Breed Classification
    • Custom Dataset
    • Training
  • Dive into Deep Learning in practice: leaf classification
    • Dataset
    • Training

ImageNet Dogs Breed Classification

Kaggle dog breed classification competition page

Custom Dataset

from torch.utils.data import Dataset, DataLoader
import numpy as np
from PIL import Image
import pandas as pd
from torchvision import transforms

# Read the labels
dog_dataframe = pd.read_csv('../input/dog-breed-identification/labels.csv')
# Convert the labels to a set to get all unique classes, then sort them as a list
dog_labels = sorted(list(set(dog_dataframe['breed'])))
# Number of dog breeds
n_class = len(dog_labels)
# String labels must be mapped to integers before they can become tensors
class_to_num = dict(zip(dog_labels, range(n_class)))


class DogsData(Dataset):
    def __init__(self, csv_path, file_path, mode='train', valid_ratio=0.1,
                 resize_height=224, resize_width=224):
        """
        Args:
            csv_path (string): path to the csv file
            file_path (string): directory containing the images
            mode (string): training mode or validation mode
            valid_ratio (float): fraction of data used for validation
        """
        # Target image size
        self.resize_height = resize_height
        self.resize_width = resize_width
        # Directory containing the images
        self.file_path = file_path
        self.valid_ratio = valid_ratio
        # Whether this dataset is used for training or validation
        self.mode = mode
        # header=None keeps the header row as data, so real samples start at row 1
        self.data_info = pd.read_csv(csv_path, header=None)
        # Number of samples (minus the header row)
        self.data_len = len(self.data_info.index) - 1
        self.train_len = int(self.data_len * (1 - valid_ratio))
        # Split the data into a training set and a validation set
        if mode == 'train':
            self.train_image = np.asarray(self.data_info.iloc[1:self.train_len, 0])
            self.train_label = np.asarray(self.data_info.iloc[1:self.train_len, 1])
            self.image_arr = self.train_image
            self.label_arr = self.train_label
        elif mode == 'valid':
            self.valid_image = np.asarray(self.data_info.iloc[self.train_len:, 0])
            self.valid_label = np.asarray(self.data_info.iloc[self.train_len:, 1])
            self.image_arr = self.valid_image
            self.label_arr = self.valid_label
        self.real_len = len(self.image_arr)
        print('Finished reading the {} set of Dogs Dataset ({} samples found)'.format(mode, self.real_len))

    def __getitem__(self, index):
        single_image_name = self.image_arr[index]
        # Open the image by its full path
        image = Image.open(self.file_path + single_image_name + '.jpg')
        # The train transform applies augmentation: random resized crop,
        # horizontal flip, plus normalization
        if self.mode == 'train':
            transform = transforms.Compose([
                transforms.RandomResizedCrop(224, scale=(0.64, 1), ratio=(0.8, 1.0)),
                transforms.RandomHorizontalFlip(),
                transforms.ToTensor(),
                transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
            ])
        else:
            transform = transforms.Compose([
                transforms.Resize(256),
                transforms.CenterCrop(224),
                transforms.ToTensor(),
                transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
            ])
        image = transform(image)
        label = self.label_arr[index]
        number_label = class_to_num[label]
        return image, number_label

    # Return the length of the dataset
    def __len__(self):
        return self.real_len
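As a sanity check for the label/image alignment issue mentioned in the summary, here is a stripped-down sketch of the same Dataset pattern; `ToyData` and its in-memory names and labels are made up for illustration, standing in for the image files and csv:

```python
import torch
from torch.utils.data import Dataset, DataLoader

# Hypothetical stand-ins for the image files and csv labels
names = ["a", "b", "c", "d"]
labels = ["husky", "poodle", "husky", "corgi"]
class_to_num = {c: i for i, c in enumerate(sorted(set(labels)))}

class ToyData(Dataset):
    def __init__(self, names, labels):
        self.names = names
        self.labels = labels

    def __len__(self):
        return len(self.names)

    def __getitem__(self, index):
        # A fake 3-channel "image"; real code would load and transform a file here
        image = torch.zeros(3, 4, 4)
        # Index the label array with the SAME index as the image
        return image, class_to_num[self.labels[index]]

ds = ToyData(names, labels)
img, lab = ds[1]
assert lab == class_to_num["poodle"]  # sample 1 really carries the "poodle" label

loader = DataLoader(ds, batch_size=2, shuffle=False)
imgs, labs = next(iter(loader))
print(imgs.shape, labs.tolist())  # torch.Size([2, 3, 4, 4]) [1, 2]
```

If the assertion fails, `__getitem__` is pairing images with the wrong labels, which is exactly the "validation accuracy stuck at zero" symptom described above.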

Training

Training uses cosine learning-rate annealing, the AdamW optimizer, and a ResNeXt50_32x4d network.
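The scheduler/optimizer combination can be seen in isolation before the full training code; the `lr`, `weight_decay`, and dummy parameter below are placeholders, not the values used in the actual run:

```python
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR

# A single dummy parameter stands in for the network
param = torch.nn.Parameter(torch.zeros(1))
optimizer = torch.optim.AdamW([param], lr=1e-4, weight_decay=1e-3)
# The learning rate follows half a cosine wave over T_max scheduler steps
scheduler = CosineAnnealingLR(optimizer, T_max=10)

lrs = []
for _ in range(10):
    optimizer.step()                              # normally: after loss.backward()
    lrs.append(optimizer.param_groups[0]["lr"])   # record the current lr
    scheduler.step()                              # called once per epoch

print([round(lr, 8) for lr in lrs])
```

The recorded values start at the base rate and decay smoothly toward zero, which is why `scheduler.step()` is called once per epoch in the training loop below.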


import torchvision
import sys
import torch
from torch.optim.lr_scheduler import CosineAnnealingLR
import pandas as pd
# This is for the progress bar.
from tqdm import tqdm
import torch.nn as nn

def train(lr, weight_decay, num_epochs):
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print("using {} device.\n".format(device))
    train_path = '../input/dog-breed-identification/labels.csv'
    file_path = '../input/dog-breed-identification/train/'
    train_dataset = DogsData(train_path, file_path, mode='train')
    valid_dataset = DogsData(train_path, file_path, mode='valid')
    train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=8,
                                               shuffle=True, num_workers=2)
    valid_loader = torch.utils.data.DataLoader(dataset=valid_dataset, batch_size=8,
                                               shuffle=False, num_workers=2)
    net = torchvision.models.resnext50_32x4d(pretrained=True)
    in_channel = net.fc.in_features
    net.fc = nn.Sequential(nn.Linear(in_channel, n_class))
    net.to(device)
    loss_function = nn.CrossEntropyLoss()
    optimizer = torch.optim.AdamW(net.parameters(), lr=lr, weight_decay=weight_decay)
    scheduler = CosineAnnealingLR(optimizer, T_max=10)
    model_path = './dog_resnext'
    best_acc = 0.0
    for epoch in range(num_epochs):
        # ---------- Training ----------
        # Make sure the model is in train mode before training.
        net.train()
        # These are used to record information in training.
        train_loss = []
        train_accs = []
        # Iterate the training set by batches.
        for batch in tqdm(train_loader):
            # A batch consists of image data and corresponding labels.
            imgs, labels = batch
            imgs = imgs.to(device)
            labels = labels.to(device)
            # Forward the data. (Make sure data and model are on the same device.)
            logits = net(imgs)
            # Calculate the cross-entropy loss.
            # We don't need to apply softmax before computing cross-entropy as it is done automatically.
            loss = loss_function(logits, labels)
            # Gradients stored in the parameters in the previous step should be cleared out first.
            optimizer.zero_grad()
            # Compute the gradients for parameters.
            loss.backward()
            # Update the parameters with computed gradients.
            optimizer.step()
            # Compute the accuracy for current batch.
            acc = (logits.argmax(dim=-1) == labels).float().mean()
            # Record the loss and accuracy.
            train_loss.append(loss.item())
            train_accs.append(acc)
        # The average loss and accuracy of the training set is the average of the recorded values.
        train_loss = sum(train_loss) / len(train_loss)
        train_acc = sum(train_accs) / len(train_accs)
        # Print the information.
        print(f"[ Train | {epoch + 1:03d}/{num_epochs:03d} ] loss = {train_loss:.5f}, acc = {train_acc:.5f}")
        scheduler.step()
        # ---------- Validation ----------
        # Make sure the model is in eval mode so that some modules like dropout are disabled and work normally.
        net.eval()
        # These are used to record information in validation.
        valid_loss = []
        valid_accs = []
        # Iterate the validation set by batches.
        for batch in tqdm(valid_loader):
            imgs, labels = batch
            # We don't need gradient in validation.
            # Using torch.no_grad() accelerates the forward process.
            with torch.no_grad():
                logits = net(imgs.to(device))
            # We can still compute the loss (but not the gradient).
            loss = loss_function(logits, labels.to(device))
            # Compute the accuracy for current batch.
            acc = (logits.argmax(dim=-1) == labels.to(device)).float().mean()
            # Record the loss and accuracy.
            valid_loss.append(loss.item())
            valid_accs.append(acc)
        # The average loss and accuracy for entire validation set is the average of the recorded values.
        valid_loss = sum(valid_loss) / len(valid_loss)
        valid_acc = sum(valid_accs) / len(valid_accs)
        # Print the information.
        print(f"[ Valid | {epoch + 1:03d}/{num_epochs:03d} ] loss = {valid_loss:.5f}, acc = {valid_acc:.5f}")
        # if the model improves, save a checkpoint at this epoch
        if valid_acc > best_acc:
            best_acc = valid_acc
            torch.save(net.state_dict(), model_path)
            print('saving model with acc {:.3f}\n\n'.format(best_acc))

After training, the best validation accuracy reached 87%.

Dive into Deep Learning in practice: leaf classification

The Kaggle leaf classification exercise comes from Li Mu's course, Dive into Deep Learning.
If you are interested, just search for Li Mu on Bilibili.
If you are really interested, go read this baseline: the author's approach is very clear, and reading it was genuinely eye-opening.
Others have even pushed the accuracy above 99%; it is worth looking at how they did it.

Dataset

import torch
import torch.nn as nn
import pandas as pd
import numpy as np
from torch.utils.data import Dataset, DataLoader
from torchvision import transforms
from PIL import Image
import sys
import os
import matplotlib.pyplot as plt
import torchvision.models as models
# This is for the progress bar.
from tqdm import tqdm

labels_dataframe = pd.read_csv('../input/classify-leaves/train.csv')
leaves_labels = sorted(list(set(labels_dataframe['label'])))
num_classes = len(leaves_labels)
class_to_num = dict(zip(leaves_labels, range(num_classes)))


class LeavesData(Dataset):
    def __init__(self, csv_path, file_path, mode='train', valid_ratio=0.2,
                 resize_height=256, resize_width=256):
        """
        Args:
            csv_path (string): path to the csv file
            file_path (string): directory containing the images
            mode (string): 'train', 'valid' or 'test'
            valid_ratio (float): fraction of data used for validation
        """
        # Target image size; the raw images come in different sizes
        self.resize_height = resize_height
        self.resize_width = resize_width
        self.file_path = file_path
        self.mode = mode
        # Read the csv file with pandas;
        # header=None keeps the header row as data, so real samples start at row 1
        self.data_info = pd.read_csv(csv_path, header=None)
        # Compute the length (minus the header row)
        self.data_len = len(self.data_info.index) - 1
        self.train_len = int(self.data_len * (1 - valid_ratio))
        if mode == 'train':
            # The first column holds the image file names
            # (iloc[1:train_len, 0] reads column 0 from row 1 up to train_len)
            self.train_image = np.asarray(self.data_info.iloc[1:self.train_len, 0])
            # The second column holds the image labels
            self.train_label = np.asarray(self.data_info.iloc[1:self.train_len, 1])
            self.image_arr = self.train_image
            self.label_arr = self.train_label
        elif mode == 'valid':
            self.valid_image = np.asarray(self.data_info.iloc[self.train_len:, 0])
            self.valid_label = np.asarray(self.data_info.iloc[self.train_len:, 1])
            self.image_arr = self.valid_image
            self.label_arr = self.valid_label
        elif mode == 'test':
            self.test_image = np.asarray(self.data_info.iloc[1:, 0])
            self.image_arr = self.test_image
        self.real_len = len(self.image_arr)
        print('Finished reading the {} set of Leaves Dataset ({} samples found)'.format(mode, self.real_len))

    def __getitem__(self, index):
        # Get the file name for this index from image_arr
        single_image_name = self.image_arr[index]
        # Read the image file
        img_as_img = Image.open(self.file_path + single_image_name)
        # To convert an RGB image to grayscale, the following two lines would work:
        # if img_as_img.mode != 'L':
        #     img_as_img = img_as_img.convert('L')
        # Set up the transforms; normalization and other operations go here as well
        if self.mode == 'train':
            transform = transforms.Compose([
                transforms.Resize((224, 224)),
                transforms.RandomResizedCrop(224, scale=(0.08, 1.0), ratio=(3.0 / 4.0, 4.0 / 3.0)),
                # Random horizontal flip with the given probability
                transforms.RandomHorizontalFlip(p=0.5),
                transforms.ToTensor(),
                transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
            ])
        else:
            # No data augmentation for valid and test
            transform = transforms.Compose([
                transforms.Resize((224, 224)),
                transforms.ToTensor(),
                transforms.Normalize([0.485, 0.456, 0.406], [0.229, 0.224, 0.225])
            ])
        img_as_img = transform(img_as_img)
        if self.mode == 'test':
            return img_as_img
        else:
            # Get the string label for the image
            label = self.label_arr[index]
            # Map it to an integer label
            number_label = class_to_num[label]
            # Return the image tensor and its label for this index
            return img_as_img, number_label

    def __len__(self):
        return self.real_len
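One subtle point in the Dataset above: reading with `header=None` keeps the header row as row 0, so the real samples only start at index 1, which is why the code slices with `iloc[1:, ...]` and subtracts 1 from the length. A minimal pandas sketch with a made-up two-row csv illustrates the difference:

```python
import io
import pandas as pd

# Hypothetical csv contents with a header line and two samples
csv_text = "image,label\nimg_0.jpg,oak\nimg_1.jpg,maple\n"

# header=None: the header line becomes data row 0, samples start at iloc[1:]
raw = pd.read_csv(io.StringIO(csv_text), header=None)
print(raw.iloc[0, 0])                 # 'image'  <- the header text, not a sample
print(raw.iloc[1:, 0].tolist())       # ['img_0.jpg', 'img_1.jpg']

# Default header=0 parses the header, so row 0 is already the first sample
parsed = pd.read_csv(io.StringIO(csv_text))
print(parsed['image'].tolist())       # ['img_0.jpg', 'img_1.jpg']
```

With the default `header=0` the manual `-1` and `iloc[1:]` offsets would not be needed; the code above simply follows the baseline's convention.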

Training

from torch.optim.lr_scheduler import CosineAnnealingLR

def train(lr, weight_decay, num_epochs):
    device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
    print("using {} device.\n".format(device))
    train_path = '../input/classify-leaves/train.csv'
    img_path = '../input/classify-leaves/'
    train_dataset = LeavesData(train_path, img_path, mode='train')
    val_dataset = LeavesData(train_path, img_path, mode='valid')
    train_loader = torch.utils.data.DataLoader(dataset=train_dataset, batch_size=16,
                                               shuffle=False, num_workers=2)
    val_loader = torch.utils.data.DataLoader(dataset=val_dataset, batch_size=16,
                                             shuffle=False, num_workers=2)
    net = models.resnext50_32x4d(pretrained=True)
    # model_weight_path = "../input/resnet-module/resnet34.pth"  # weights for transfer learning
    # assert os.path.exists(model_weight_path), "file {} does not exist.".format(model_weight_path)
    # net.load_state_dict(torch.load(model_weight_path, map_location='cpu'))
    # for param in net.parameters():
    #     param.requires_grad = False
    # change fc layer structure
    in_channel = net.fc.in_features
    net.fc = nn.Sequential(nn.Linear(in_channel, num_classes))
    net.to(device)
    loss_function = nn.CrossEntropyLoss()
    optimizer = torch.optim.AdamW(net.parameters(), lr=lr, weight_decay=weight_decay)
    scheduler = CosineAnnealingLR(optimizer, T_max=10)
    best_acc = 0.0
    save_path = './my_leaves_resnext50_32x4d.pth'
    train_steps = len(train_loader)
    val_steps = len(val_loader)
    for epoch in range(num_epochs):
        net.train()
        running_loss = 0.0
        train_bar = tqdm(train_loader, file=sys.stdout)
        train_acc, train_total = 0.0, 0
        for step, (images, label) in enumerate(train_bar):
            optimizer.zero_grad()
            images = images.to(device)
            label = label.to(device)
            y_hat = net(images)
            loss = loss_function(y_hat, label)
            loss.backward()
            optimizer.step()
            running_loss += loss.item()
            train_predict_y = torch.max(y_hat, dim=1)[1]
            train_total += len(label)
            train_acc += torch.eq(train_predict_y, label).sum().item()
            train_bar.desc = "train epoch[{}/{}] loss:{:.3f}".format(epoch + 1, num_epochs, loss)
            del loss, images, label, y_hat
            torch.cuda.empty_cache()
        train_accurate = train_acc / train_total
        scheduler.step()
        net.eval()
        val_acc = 0.0  # accumulate accurate number / epoch
        val_loss = 0.0
        val_total = 0
        with torch.no_grad():
            val_bar = tqdm(val_loader, file=sys.stdout)
            for val_data in val_bar:
                val_images, val_labels = val_data
                val_images = val_images.to(device)
                val_labels = val_labels.to(device)
                outputs = net(val_images)
                loss = loss_function(outputs, val_labels)
                val_predict_y = torch.max(outputs, dim=1)[1]
                val_total += len(val_labels)
                val_loss += loss.item()
                val_acc += torch.eq(val_predict_y, val_labels).sum().item()
                val_bar.desc = "valid epoch[{}/{}]".format(epoch + 1, num_epochs)
                del outputs, val_images, val_labels
                torch.cuda.empty_cache()
        val_accurate = val_acc / val_total
        print('[epoch %d] train_loss: %.3f  train_accuracy: %.3f  val_loss: %.3f val_accuracy: %.3f \n' %
              (epoch + 1, running_loss / train_steps, train_accurate, val_loss / val_steps, val_accurate))
        if val_accurate > best_acc:
            best_acc = val_accurate
            torch.save(net.state_dict(), save_path)
            print('best accuracy: {:.3f}\n'.format(best_acc))
    print('Finished Training')
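The accuracy bookkeeping inside both loops boils down to the pattern below; the toy logits and labels are made up for illustration:

```python
import torch

# Fake logits for a batch of 4 samples over 3 classes
y_hat = torch.tensor([[2.0, 0.1, 0.3],
                      [0.2, 1.5, 0.1],
                      [0.1, 0.2, 3.0],
                      [1.0, 0.9, 0.8]])
labels = torch.tensor([0, 1, 2, 1])

# torch.max over dim=1 returns (values, indices); [1] takes the predicted class
predict_y = torch.max(y_hat, dim=1)[1]
correct = torch.eq(predict_y, labels).sum().item()
accuracy = correct / len(labels)
print(predict_y.tolist(), accuracy)  # [0, 1, 2, 0] 0.75
```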

The best final validation accuracy was 93%, with the training accuracy slightly lower (because of the strong image augmentation). A later run actually reached 95% validation accuracy, but its training accuracy hit 99%, which I took as a sign of overfitting, so I discarded that run and kept the earlier one.
