@Author:Runsen

Last time I introduced the Faster-RCNN model; today we will train our first Faster-RCNN model.

This article shows how to use a Faster-RCNN model on a fruit image dataset.

The code is adapted from the PyTorch documentation tutorial and this Kaggle notebook:

  • https://pytorch.org/tutorials/intermediate/torchvision_tutorial.html
  • https://www.kaggle.com/yerramvarun/fine-tuning-faster-rcnn-using-pytorch/

These are the best Faster-RCNN tutorials I have seen so far.

Dataset source: https://www.kaggle.com/mbkinaci/fruit-images-for-object-detection

Much of the object-detection boilerplate (training loop, evaluation, collate function) is the same across projects and has to be written anyway, so we simply clone the reference implementations that torchvision provides and copy the helper files into the working directory.

git clone https://github.com/pytorch/vision.git
cd vision
git checkout v0.3.0
cd ..
cp vision/references/detection/utils.py ./
cp vision/references/detection/transforms.py ./
cp vision/references/detection/coco_eval.py ./
cp vision/references/detection/engine.py ./
cp vision/references/detection/coco_utils.py ./
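
Later in the notebook these helpers are imported directly, following the same pattern as the torchvision tutorial:

import utils                                   # provides collate_fn for the detection DataLoader
from engine import train_one_epoch, evaluate   # reference training and evaluation loops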

In the downloaded dataset, the train and test folders contain pairs of .jpg images and .xml annotation files.
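
Each .xml file is a Pascal VOC style annotation with one <object> entry per fruit. As a quick sanity check you can parse one of them with ElementTree (the file name below is only an example, pick any .xml from the train folder):

from xml.etree import ElementTree as et

# parse a single annotation file and print its objects (hypothetical file name)
tree = et.parse('../input/fruit-images-for-object-detection/train_zip/train/apple_1.xml')
root = tree.getroot()
for member in root.findall('object'):
    name = member.find('name').text
    bndbox = member.find('bndbox')
    xmin, ymin = int(bndbox.find('xmin').text), int(bndbox.find('ymin').text)
    xmax, ymax = int(bndbox.find('xmax').text), int(bndbox.find('ymax').text)
    print(name, xmin, ymin, xmax, ymax)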

import os
import numpy as np
import cv2
import torch
import matplotlib.patches as patches
import albumentations as A
from albumentations.pytorch.transforms import ToTensorV2
from matplotlib import pyplot as plt
from torch.utils.data import Dataset
from xml.etree import ElementTree as et
from torchvision import transforms as torchtrans


class FruitImagesDataset(torch.utils.data.Dataset):

    def __init__(self, files_dir, width, height, transforms=None):
        self.transforms = transforms
        self.files_dir = files_dir
        self.height = height
        self.width = width
        self.imgs = [image for image in sorted(os.listdir(files_dir))
                     if image[-4:] == '.jpg']
        # index 0 ('_') is the background class required by Faster-RCNN
        self.classes = ['_', 'apple', 'banana', 'orange']

    def __getitem__(self, idx):
        img_name = self.imgs[idx]
        image_path = os.path.join(self.files_dir, img_name)

        # read the image and convert it to the correct size and color
        img = cv2.imread(image_path)
        img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB).astype(np.float32)
        img_res = cv2.resize(img_rgb, (self.width, self.height),
                             interpolation=cv2.INTER_AREA)
        # divide by 255 so pixel values are in [0, 1]
        img_res /= 255.0

        # annotation file
        annot_filename = img_name[:-4] + '.xml'
        annot_file_path = os.path.join(self.files_dir, annot_filename)

        boxes = []
        labels = []
        tree = et.parse(annot_file_path)
        root = tree.getroot()

        # cv2 gives the image size as height x width
        wt = img.shape[1]
        ht = img.shape[0]

        # box coordinates in the xml files are extracted and rescaled to the resized image
        for member in root.findall('object'):
            labels.append(self.classes.index(member.find('name').text))

            # bounding box
            xmin = int(member.find('bndbox').find('xmin').text)
            xmax = int(member.find('bndbox').find('xmax').text)
            ymin = int(member.find('bndbox').find('ymin').text)
            ymax = int(member.find('bndbox').find('ymax').text)

            xmin_corr = (xmin / wt) * self.width
            xmax_corr = (xmax / wt) * self.width
            ymin_corr = (ymin / ht) * self.height
            ymax_corr = (ymax / ht) * self.height

            boxes.append([xmin_corr, ymin_corr, xmax_corr, ymax_corr])

        # convert boxes into a torch.Tensor
        boxes = torch.as_tensor(boxes, dtype=torch.float32)

        # getting the areas of the boxes
        area = (boxes[:, 3] - boxes[:, 1]) * (boxes[:, 2] - boxes[:, 0])

        # suppose all instances are not crowd
        iscrowd = torch.zeros((boxes.shape[0],), dtype=torch.int64)

        labels = torch.as_tensor(labels, dtype=torch.int64)

        target = {}
        target["boxes"] = boxes
        target["labels"] = labels
        target["area"] = area
        target["iscrowd"] = iscrowd
        # image_id
        image_id = torch.tensor([idx])
        target["image_id"] = image_id

        if self.transforms:
            sample = self.transforms(image=img_res,
                                     bboxes=target['boxes'],
                                     labels=labels)
            img_res = sample['image']
            target['boxes'] = torch.Tensor(sample['bboxes'])

        return img_res, target

    def __len__(self):
        return len(self.imgs)


def torch_to_pil(img):
    return torchtrans.ToPILImage()(img).convert('RGB')


def plot_img_bbox(img, target):
    fig, a = plt.subplots(1, 1)
    fig.set_size_inches(5, 5)
    a.imshow(img)
    for box in target['boxes']:
        x, y, width, height = box[0], box[1], box[2] - box[0], box[3] - box[1]
        rect = patches.Rectangle((x, y),
                                 width, height,
                                 linewidth=2,
                                 edgecolor='r',
                                 facecolor='none')
        a.add_patch(rect)
    plt.show()


def get_transform(train):
    if train:
        return A.Compose([A.HorizontalFlip(0.5),
                          ToTensorV2(p=1.0)],
                         bbox_params={'format': 'pascal_voc', 'label_fields': ['labels']})
    else:
        return A.Compose([ToTensorV2(p=1.0)],
                         bbox_params={'format': 'pascal_voc', 'label_fields': ['labels']})


files_dir = '../input/fruit-images-for-object-detection/train_zip/train'
test_dir = '../input/fruit-images-for-object-detection/test_zip/test'

dataset = FruitImagesDataset(files_dir, 480, 480)
img, target = dataset[78]
print(img.shape, '\n', target)
plot_img_bbox(torch_to_pil(img), target)

The output is as follows: the resized image has shape (480, 480, 3), the target dictionary contains the boxes, labels, area, iscrowd, and image_id tensors, and plot_img_bbox draws the ground-truth boxes on the image in red.

In torchvision, the Faster-RCNN box-predictor head is imported with: from torchvision.models.detection.faster_rcnn import FastRCNNPredictor. We use it to replace the COCO-pretrained head with one sized for our own classes.

import torchvision
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor


def get_object_detection_model(num_classes):
    # load a model pre-trained on COCO (this downloads the corresponding weights)
    model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
    # get the number of input features of the classifier
    in_features = model.roi_heads.box_predictor.cls_score.in_features
    # replace the pre-trained head with a new one
    model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
    return model

Augmentation for object detection differs from ordinary image augmentation: we must make sure the bounding boxes stay correctly aligned with the objects after each transformation.

Here a random horizontal flip is added for training; because the transform is composed with bbox_params, Albumentations adjusts the bounding boxes accordingly.

def get_transform(train):
    if train:
        return A.Compose([A.HorizontalFlip(0.5),
                          ToTensorV2(p=1.0)],
                         bbox_params={'format': 'pascal_voc', 'label_fields': ['labels']})
    else:
        return A.Compose([ToTensorV2(p=1.0)],
                         bbox_params={'format': 'pascal_voc', 'label_fields': ['labels']})
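
To verify that the flip really keeps the boxes aligned with the fruit, you can apply the training transform to one sample and plot it (a quick sanity check, not part of the original notebook):

# sanity check: the flipped image should still have its boxes on the fruit
dataset_check = FruitImagesDataset(files_dir, 480, 480, transforms=get_transform(train=True))
img_t, target_t = dataset_check[78]
plot_img_bbox(torch_to_pil(img_t), target_t)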

Now let us prepare the datasets and data loaders for training and testing.

dataset = FruitImagesDataset(files_dir, 480, 480, transforms=get_transform(train=True))
dataset_test = FruitImagesDataset(files_dir, 480, 480, transforms=get_transform(train=False))

# split the dataset into train and test set
torch.manual_seed(1)
indices = torch.randperm(len(dataset)).tolist()

# train test split
test_split = 0.2
tsize = int(len(dataset) * test_split)
dataset = torch.utils.data.Subset(dataset, indices[:-tsize])
dataset_test = torch.utils.data.Subset(dataset_test, indices[-tsize:])

# define training and validation data loaders
data_loader = torch.utils.data.DataLoader(
    dataset, batch_size=10, shuffle=True, num_workers=4,
    collate_fn=utils.collate_fn)

data_loader_test = torch.utils.data.DataLoader(
    dataset_test, batch_size=10, shuffle=False, num_workers=4,
    collate_fn=utils.collate_fn)
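
The custom collate_fn is needed because each image can have a different number of boxes, so images and targets cannot simply be stacked into one tensor. In the torchvision reference utils.py it is essentially the following (a minimal sketch, shown here only for reference):

# keep images and targets as tuples instead of stacking variable-sized targets
def collate_fn(batch):
    return tuple(zip(*batch))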

Prepare the model:

# to train on gpu if available
device = torch.device('cuda') if torch.cuda.is_available() else torch.device('cpu')

num_classes = 4

# get the model using our helper function
model = get_object_detection_model(num_classes)

# move model to the right device
model.to(device)

# construct an optimizer
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.SGD(params, lr=0.005,
                            momentum=0.9, weight_decay=0.0005)

# and a learning rate scheduler which decreases the learning rate by
# 10x every 3 epochs
lr_scheduler = torch.optim.lr_scheduler.StepLR(optimizer,
                                               step_size=3,
                                               gamma=0.1)

# training for 10 epochs
num_epochs = 10

for epoch in range(num_epochs):
    # training for one epoch
    train_one_epoch(model, optimizer, data_loader, device, epoch, print_freq=10)
    # update the learning rate
    lr_scheduler.step()
    # evaluate on the test dataset
    evaluate(model, data_loader_test, device=device)
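
After training, it is usually worth saving the fine-tuned weights so the model does not have to be retrained every time; a minimal sketch (the file name is arbitrary):

# save the fine-tuned weights (the file name is just an example)
torch.save(model.state_dict(), 'fasterrcnn_fruit.pth')

# later, reload them into a freshly built model:
# model = get_object_detection_model(num_classes)
# model.load_state_dict(torch.load('fasterrcnn_fruit.pth', map_location=device))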


Torchvision provides a utility for applying non-maximum suppression (NMS) to our predictions; let us build an apply_nms function around it.

def apply_nms(orig_prediction, iou_thresh=0.3):
    # torchvision returns the indices of the bboxes to keep
    keep = torchvision.ops.nms(orig_prediction['boxes'], orig_prediction['scores'], iou_thresh)

    final_prediction = orig_prediction
    final_prediction['boxes'] = final_prediction['boxes'][keep]
    final_prediction['scores'] = final_prediction['scores'][keep]
    final_prediction['labels'] = final_prediction['labels'][keep]

    return final_prediction


# function to convert a torch tensor back to a PIL image
def torch_to_pil(img):
    return torchtrans.ToPILImage()(img).convert('RGB')

Let us take one image from the test dataset and see how the model performs.

First we will see how many bounding boxes the model predicts compared with the ground truth.

# pick one image from the test set
img, target = dataset_test[5]
# put the model in evaluation mode
model.eval()
with torch.no_grad():
    prediction = model([img.to(device)])[0]

print('predicted #boxes: ', len(prediction['labels']))
print('real #boxes: ', len(target['labels']))

predicted #boxes: 14
real #boxes: 1

Ground truth:

print('EXPECTED OUTPUT')
plot_img_bbox(torch_to_pil(img), target)

print('MODEL OUTPUT')
plot_img_bbox(torch_to_pil(img), prediction)

You can see that the model predicts many overlapping bounding boxes for each apple. Let us apply NMS to the prediction and look at the final output.

nms_prediction = apply_nms(prediction, iou_thresh=0.2)
print('NMS APPLIED MODEL OUTPUT')
plot_img_bbox(torch_to_pil(img), nms_prediction)
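
NMS only removes overlapping boxes; low-confidence detections can still survive it. A common complement (not part of the original notebook, just a sketch) is to also drop boxes below a score threshold:

# keep only detections whose confidence exceeds a threshold (0.5 is an arbitrary choice)
score_thresh = 0.5
keep = nms_prediction['scores'] > score_thresh
filtered_prediction = {
    'boxes': nms_prediction['boxes'][keep],
    'scores': nms_prediction['scores'][keep],
    'labels': nms_prediction['labels'][keep],
}
plot_img_bbox(torch_to_pil(img), filtered_prediction)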

The algorithm and code logic here follow what is, in my view, the best Faster-RCNN tutorial I have seen so far:

https://www.kaggle.com/yerramvarun/fine-tuning-faster-rcnn-using-pytorch/

This Faster-RCNN is quite demanding on hardware; even on the GPU at my company it can run out of memory.

A large part of the load comes from the DataLoader running multiple worker processes with num_workers=4.

How to fine-tune the Faster-RCNN model and its ResNet-50 backbone, how to change the training configuration (image size, optimizer, learning rate), and how to make better use of Albumentations are all worth exploring further; a small sketch of such tweaks follows.
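
As a starting point, here is a minimal sketch of possible tweaks (illustrative values only, not tuned for this dataset):

# 1. reduce memory pressure: smaller batches and fewer DataLoader workers
data_loader = torch.utils.data.DataLoader(
    dataset, batch_size=4, shuffle=True, num_workers=2,
    collate_fn=utils.collate_fn)

# 2. a richer Albumentations training pipeline (bbox_params keeps the boxes in sync)
def get_transform(train):
    if train:
        return A.Compose([
            A.HorizontalFlip(p=0.5),
            A.RandomBrightnessContrast(p=0.2),
            ToTensorV2(p=1.0),
        ], bbox_params={'format': 'pascal_voc', 'label_fields': ['labels']})
    return A.Compose([ToTensorV2(p=1.0)],
                     bbox_params={'format': 'pascal_voc', 'label_fields': ['labels']})

# 3. an alternative optimizer and learning rate
params = [p for p in model.parameters() if p.requires_grad]
optimizer = torch.optim.AdamW(params, lr=1e-4, weight_decay=1e-4)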

Finally, here is the full network structure of the Faster-RCNN model (the output of print(model)):

FasterRCNN((transform): GeneralizedRCNNTransform(Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])Resize(min_size=(800,), max_size=1333, mode='bilinear'))(backbone): BackboneWithFPN((body): IntermediateLayerGetter((conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)(bn1): FrozenBatchNorm2d(64, eps=0.0)(relu): ReLU(inplace=True)(maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)(layer1): Sequential((0): Bottleneck((conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(64, eps=0.0)(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(64, eps=0.0)(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(256, eps=0.0)(relu): ReLU(inplace=True)(downsample): Sequential((0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)(1): FrozenBatchNorm2d(256, eps=0.0)))(1): Bottleneck((conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(64, eps=0.0)(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(64, eps=0.0)(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(256, eps=0.0)(relu): ReLU(inplace=True))(2): Bottleneck((conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(64, eps=0.0)(conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(64, eps=0.0)(conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(256, eps=0.0)(relu): ReLU(inplace=True)))(layer2): Sequential((0): Bottleneck((conv1): Conv2d(256, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(128, eps=0.0)(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(128, eps=0.0)(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(512, eps=0.0)(relu): ReLU(inplace=True)(downsample): Sequential((0): Conv2d(256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False)(1): FrozenBatchNorm2d(512, eps=0.0)))(1): Bottleneck((conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(128, eps=0.0)(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(128, eps=0.0)(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(512, eps=0.0)(relu): ReLU(inplace=True))(2): Bottleneck((conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(128, eps=0.0)(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(128, eps=0.0)(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(512, eps=0.0)(relu): ReLU(inplace=True))(3): Bottleneck((conv1): Conv2d(512, 128, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(128, eps=0.0)(conv2): Conv2d(128, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(128, eps=0.0)(conv3): Conv2d(128, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(512, eps=0.0)(relu): ReLU(inplace=True)))(layer3): Sequential((0): Bottleneck((conv1): Conv2d(512, 256, kernel_size=(1, 1), 
stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(256, eps=0.0)(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(256, eps=0.0)(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(1024, eps=0.0)(relu): ReLU(inplace=True)(downsample): Sequential((0): Conv2d(512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False)(1): FrozenBatchNorm2d(1024, eps=0.0)))(1): Bottleneck((conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(256, eps=0.0)(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(256, eps=0.0)(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(1024, eps=0.0)(relu): ReLU(inplace=True))(2): Bottleneck((conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(256, eps=0.0)(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(256, eps=0.0)(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(1024, eps=0.0)(relu): ReLU(inplace=True))(3): Bottleneck((conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(256, eps=0.0)(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(256, eps=0.0)(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(1024, eps=0.0)(relu): ReLU(inplace=True))(4): Bottleneck((conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(256, eps=0.0)(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(256, eps=0.0)(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(1024, eps=0.0)(relu): ReLU(inplace=True))(5): Bottleneck((conv1): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(256, eps=0.0)(conv2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(256, eps=0.0)(conv3): Conv2d(256, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(1024, eps=0.0)(relu): ReLU(inplace=True)))(layer4): Sequential((0): Bottleneck((conv1): Conv2d(1024, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(512, eps=0.0)(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(512, eps=0.0)(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(2048, eps=0.0)(relu): ReLU(inplace=True)(downsample): Sequential((0): Conv2d(1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False)(1): FrozenBatchNorm2d(2048, eps=0.0)))(1): Bottleneck((conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(512, eps=0.0)(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(512, eps=0.0)(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(2048, eps=0.0)(relu): ReLU(inplace=True))(2): Bottleneck((conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn1): FrozenBatchNorm2d(512, eps=0.0)(conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), 
padding=(1, 1), bias=False)(bn2): FrozenBatchNorm2d(512, eps=0.0)(conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)(bn3): FrozenBatchNorm2d(2048, eps=0.0)(relu): ReLU(inplace=True))))(fpn): FeaturePyramidNetwork((inner_blocks): ModuleList((0): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))(1): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))(2): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))(3): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1)))(layer_blocks): ModuleList((0): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))(1): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))(2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))(3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)))(extra_blocks): LastLevelMaxPool()))(rpn): RegionProposalNetwork((anchor_generator): AnchorGenerator()(head): RPNHead((conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))(cls_logits): Conv2d(256, 3, kernel_size=(1, 1), stride=(1, 1))(bbox_pred): Conv2d(256, 12, kernel_size=(1, 1), stride=(1, 1))))(roi_heads): RoIHeads((box_roi_pool): MultiScaleRoIAlign(featmap_names=['0', '1', '2', '3'], output_size=(7, 7), sampling_ratio=2)(box_head): TwoMLPHead((fc6): Linear(in_features=12544, out_features=1024, bias=True)(fc7): Linear(in_features=1024, out_features=1024, bias=True))(box_predictor): FastRCNNPredictor((cls_score): Linear(in_features=1024, out_features=4, bias=True)(bbox_pred): Linear(in_features=1024, out_features=16, bias=True)))
)
