1. SSD source code walkthrough: how data augmentation is implemented

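Contents

Basic information contained in the data for an object detection task
Two types of data augmentation
Converting an integer image to float
Mean subtraction
Converting x, y, w, h from relative values in 0-1 to absolute pixel values
Converting x, y, w, h from absolute pixel values to relative values in 0-1
Multiplicative noise
Color space conversion (RGB, HSV, BGR)
Adjusting image contrast by a random factor
Adjusting image brightness by a random factor
Random sample crop
Mean padding
Full source code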

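Basic information contained in the data for an object detection task

Every image carries three kinds of information:

The image itself, for example a 300*300*3 matrix or a 512*512*3 matrix
boxes
The label corresponding to each box

For example, we might say that the image 1.jpg has size 300*300*3 and contains a cat at [x, y, w, h].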

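Two types of data augmentation

If a transform does not deform the image, for example mean subtraction, the cat stays where it is: neither [x, y, w, h] nor the label "cat" changes. Conceptually, an augmentation function of this kind only needs to take an image and return an image.

If we resize or crop the image, however, the cat's position changes, so [x, y, w, h] and the labels must be updated accordingly. An augmentation function of this kind takes the image, boxes, and labels and returns the modified image, boxes, and labels. In the source, both kinds share the same __call__(image, boxes, labels) signature so that they can be chained; the first kind simply passes boxes and labels through unchanged.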
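
As a minimal sketch of the two kinds of function described above: the function names, the class id 7, the mean values, and the crop offset are illustrative assumptions, and boxes are assumed to be in corner form [xmin, ymin, xmax, ymax], which is what the cropping code in the source works with.

```python
import numpy as np


def subtract_mean(image, mean):
    """Photometric transform: pixel values change, geometry does not."""
    return image.astype(np.float32) - np.asarray(mean, dtype=np.float32)


def crop_left(image, boxes, labels, offset):
    """Geometric transform: cutting `offset` columns off the left edge
    shifts every x coordinate by -offset, so the boxes must follow.
    Assumes every box lies to the right of `offset`."""
    cropped = image[:, offset:]
    shifted = boxes.copy()
    shifted[:, 0::2] -= offset  # columns 0 and 2 hold xmin and xmax
    return cropped, shifted, labels


image = np.zeros((300, 300, 3), dtype=np.uint8)
boxes = np.array([[30.0, 40.0, 130.0, 200.0]])  # one cat, corner form
labels = np.array([7])                          # hypothetical class id

centered = subtract_mean(image, (104, 117, 123))  # boxes are untouched
cropped, shifted, same_labels = crop_left(image, boxes, labels, offset=20)
print(cropped.shape, shifted)
```

The first function never needs to see the boxes; the second cannot be correct without them.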

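Converting an integer image to float

    import numpy as np

    class ConvertFromInts(object):
        def __call__(self, image, boxes=None, labels=None):
            # cast to float32 so that later arithmetic is not done in uint8
            return image.astype(np.float32), boxes, labels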
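
The reason for converting first can be seen in a small sketch (the pixel values are made up for illustration): arithmetic on uint8 arrays wraps around modulo 256, whereas float32 keeps the true result.

```python
import numpy as np

pixels = np.array([250, 10], dtype=np.uint8)
ten = np.array([10, 10], dtype=np.uint8)
twenty = np.array([20, 20], dtype=np.uint8)

wrapped_sum = pixels + ten      # uint8 arithmetic wraps modulo 256
wrapped_diff = pixels - twenty  # 10 - 20 wraps to 246

as_float = pixels.astype(np.float32)
float_sum = as_float + 10.0     # float32 keeps the true value
float_diff = as_float - 20.0

print(wrapped_sum, wrapped_diff)
print(float_sum, float_diff)
```

The later brightness, contrast, and mean-subtraction steps all rely on the image already being float.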

Mean subtraction

    import numpy as np

    class SubtractMeans(object):
        def __init__(self, mean):
            # per-channel mean, broadcast over the H x W x C image
            self.mean = np.array(mean, dtype=np.float32)

        def __call__(self, image, boxes=None, labels=None):
            image = image.astype(np.float32)
            image -= self.mean
            return image.astype(np.float32), boxes, labels

Converting x, y, w, h from relative values in 0-1 to absolute pixel values

    class ToAbsoluteCoords(object):
        def __call__(self, image, boxes=None, labels=None):
            height, width, channels = image.shape
            # x coordinates scale with the width, y coordinates with the height
            boxes[:, 0] *= width
            boxes[:, 2] *= width
            boxes[:, 1] *= height
            boxes[:, 3] *= height
            return image, boxes, labels

Converting x, y, w, h from absolute pixel values to relative values in 0-1

    class ToPercentCoords(object):
        def __call__(self, image, boxes=None, labels=None):
            height, width, channels = image.shape
            # boxes is modified in place, so it must be a float array
            boxes[:, 0] /= width
            boxes[:, 2] /= width
            boxes[:, 1] /= height
            boxes[:, 3] /= height
            return image, boxes, labels
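
The two conversions are inverses of each other. A minimal round-trip sketch follows; the helper names `to_absolute` and `to_percent` and the sample numbers are assumptions for illustration. Unlike the classes above, which modify `boxes` in place, these helpers work on a copy.

```python
import numpy as np


def to_absolute(boxes, height, width):
    out = boxes.astype(np.float32)  # astype returns a copy by default
    out[:, 0::2] *= width           # columns 0 and 2 are x coordinates
    out[:, 1::2] *= height          # columns 1 and 3 are y coordinates
    return out


def to_percent(boxes, height, width):
    out = boxes.astype(np.float32)
    out[:, 0::2] /= width
    out[:, 1::2] /= height
    return out


height, width = 200, 400
relative = np.array([[0.1, 0.2, 0.5, 0.8]], dtype=np.float32)

absolute = to_absolute(relative, height, width)
restored = to_percent(absolute, height, width)
print(absolute)
print(restored)
```

Because the original classes use in-place `/=`, passing an integer box array to `ToPercentCoords` would raise a casting error, so boxes are best kept as float32 throughout.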

Multiplicative noise

    from numpy import random

    class RandomSaturation(object):
        def __init__(self, lower=0.5, upper=1.5):
            self.lower = lower
            self.upper = upper
            assert self.upper >= self.lower, "contrast upper must be >= lower."
            assert self.lower >= 0, "contrast lower must be non-negative."

        def __call__(self, image, boxes=None, labels=None):
            if random.randint(2):
                # expects a float HSV image: channel 1 is saturation,
                # which is scaled by a random factor
                image[:, :, 1] *= random.uniform(self.lower, self.upper)
            return image, boxes, labels
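
To see what scaling the saturation channel does to a color, here is a single-pixel sketch. It uses Python's standard `colorsys` module as a stand-in for the OpenCV conversion, with all components in 0-1, so the value ranges differ from the OpenCV float images the class operates on. The helper name and the clamp are assumptions; the source transform does not clamp.

```python
import colorsys


def scale_saturation(rgb, factor):
    """Scale the S channel of one RGB pixel (components in 0-1)."""
    h, s, v = colorsys.rgb_to_hsv(*rgb)
    s = min(1.0, s * factor)  # clamp added for this sketch only
    return colorsys.hsv_to_rgb(h, s, v)


pixel = (0.8, 0.4, 0.4)                    # a muted red
washed_out = scale_saturation(pixel, 0.5)  # closer to gray
vivid = scale_saturation(pixel, 1.5)       # closer to pure red
print(washed_out, vivid)
```

A factor below 1 pulls the color toward gray and a factor above 1 pushes it toward the pure hue, while hue and value stay the same.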

Color space conversion (RGB, HSV, BGR)

    import cv2

    class ConvertColor(object):
        def __init__(self, current, transform):
            self.transform = transform
            self.current = current

        def __call__(self, image, boxes=None, labels=None):
            if self.current == 'BGR' and self.transform == 'HSV':
                image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
            elif self.current == 'RGB' and self.transform == 'HSV':
                image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
            elif self.current == 'BGR' and self.transform == 'RGB':
                image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            elif self.current == 'HSV' and self.transform == 'BGR':
                image = cv2.cvtColor(image, cv2.COLOR_HSV2BGR)
            elif self.current == 'HSV' and self.transform == "RGB":
                image = cv2.cvtColor(image, cv2.COLOR_HSV2RGB)
            else:
                # any other (current, transform) pair is unsupported
                raise NotImplementedError
            return image, boxes, labels

Adjusting image contrast by a random factor

    from numpy import random

    class RandomContrast(object):
        def __init__(self, lower=0.5, upper=1.5):
            self.lower = lower
            self.upper = upper
            assert self.upper >= self.lower, "contrast upper must be >= lower."
            assert self.lower >= 0, "contrast lower must be non-negative."

        # expects float image
        def __call__(self, image, boxes=None, labels=None):
            if random.randint(2):
                alpha = random.uniform(self.lower, self.upper)
                image *= alpha
            return image, boxes, labels
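
The `if random.randint(2):` gate used by the random transforms deserves a note. In the full source, `random` is `numpy.random` (imported with `from numpy import random`), whose `randint(2)` draws 0 or 1 with the upper bound excluded, so each transform fires roughly half the time. Python's built-in `random.randint`, by contrast, requires two arguments and includes the upper bound. A small sketch, with the seed and sample count chosen arbitrarily:

```python
from numpy import random

random.seed(0)  # only to make the sketch repeatable

# randint(2) is randint(low=0, high=2) with `high` excluded: 0 or 1
draws = [int(random.randint(2)) for _ in range(1000)]
applied = sum(draws)

print(sorted(set(draws)), applied)
```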

Adjusting image brightness by a random factor

    from numpy import random

    class RandomBrightness(object):
        def __init__(self, delta=32):
            assert delta >= 0.0
            assert delta <= 255.0
            self.delta = delta

        def __call__(self, image, boxes=None, labels=None):
            if random.randint(2):
                # add the same random offset to every pixel (float image)
                delta = random.uniform(-self.delta, self.delta)
                image += delta
            return image, boxes, labels
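
Neither the brightness nor the contrast transform clips its result, so after both are applied a float image can leave the 0-255 range. The sketch below uses fixed values (delta = +32, alpha = 1.5) instead of random ones; the final `np.clip` is an optional step added here as an assumption and is not part of the source.

```python
import numpy as np

image = np.array([[[10.0, 128.0, 250.0]]], dtype=np.float32)  # one pixel

brighter = image + 32.0     # what RandomBrightness does for delta = +32
stretched = brighter * 1.5  # what RandomContrast does for alpha = 1.5

# optional and NOT in the source: bring values back into 0-255
clipped = np.clip(stretched, 0.0, 255.0)

print(stretched.ravel(), clipped.ravel())
```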

Random sample crop

    import numpy as np
    from numpy import random

    class RandomSampleCrop(object):
        """Crop
        Arguments:
            img (Image): the image being input during training
            boxes (Tensor): the original bounding boxes in pt form
            labels (Tensor): the class labels for each bbox
            mode (float tuple): the min and max jaccard overlaps
        Return:
            (img, boxes, classes)
                img (Image): the cropped image
                boxes (Tensor): the adjusted bounding boxes in pt form
                labels (Tensor): the class labels for each bbox
        """
        def __init__(self):
            self.sample_options = (
                # using entire original input image
                None,
                # sample a patch s.t. MIN jaccard w/ obj in .1, .3, .7, .9
                (0.1, None),
                (0.3, None),
                (0.7, None),
                (0.9, None),
                # randomly sample a patch
                (None, None),
            )

        def __call__(self, image, boxes=None, labels=None):
            # guard against no boxes
            if boxes is not None and boxes.shape[0] == 0:
                return image, boxes, labels
            height, width, _ = image.shape
            while True:
                # randomly choose a mode; the tuple is indexed directly because
                # np.random.choice on this ragged tuple may fail in recent NumPy
                mode = self.sample_options[random.randint(len(self.sample_options))]
                if mode is None:
                    return image, boxes, labels

                min_iou, max_iou = mode
                if min_iou is None:
                    min_iou = float('-inf')
                if max_iou is None:
                    max_iou = float('inf')

                # max trials (50)
                for _ in range(50):
                    current_image = image

                    w = random.uniform(0.3 * width, width)
                    h = random.uniform(0.3 * height, height)

                    # aspect ratio constraint b/t .5 & 2
                    if h / w < 0.5 or h / w > 2:
                        continue

                    # single-argument form: NumPy reads this as
                    # low=width - w, high=1.0
                    left = random.uniform(width - w)
                    top = random.uniform(height - h)

                    # convert to integer rect x1,y1,x2,y2
                    rect = np.array([int(left), int(top), int(left + w), int(top + h)])

                    # calculate IoU (jaccard overlap) b/t the cropped and gt boxes
                    overlap = jaccard_numpy(boxes, rect)

                    # is min and max overlap constraint satisfied? if not try again
                    if overlap.max() < min_iou or overlap.min() > max_iou:
                        continue

                    # cut the crop from the image
                    current_image = current_image[rect[1]:rect[3], rect[0]:rect[2], :]

                    # keep overlap with gt box IF center in sampled patch
                    centers = (boxes[:, :2] + boxes[:, 2:]) / 2.0

                    # mask in all gt boxes that above and to the left of centers
                    m1 = (rect[0] < centers[:, 0]) * (rect[1] < centers[:, 1])

                    # mask in all gt boxes that under and to the right of centers
                    m2 = (rect[2] > centers[:, 0]) * (rect[3] > centers[:, 1])

                    # mask in that both m1 and m2 are true
                    mask = m1 * m2

                    # have any valid boxes? try again if not
                    if not mask.any():
                        continue

                    # take only matching gt boxes
                    current_boxes = boxes[mask, :].copy()

                    # take only matching gt labels
                    current_labels = labels[mask]

                    # should we use the box left and top corner or the crop's
                    current_boxes[:, :2] = np.maximum(current_boxes[:, :2], rect[:2])
                    # adjust to crop (by subtracting crop's left,top)
                    current_boxes[:, :2] -= rect[:2]

                    current_boxes[:, 2:] = np.minimum(current_boxes[:, 2:], rect[2:])
                    # adjust to crop (by subtracting crop's left,top)
                    current_boxes[:, 2:] -= rect[:2]

                    return current_image, current_boxes, current_labels
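
The three steps inside the trial loop (IoU against the candidate crop, keeping only boxes whose center falls inside the crop, then clipping and shifting the survivors) can be followed on concrete numbers. This is a sketch with a fixed, hypothetical crop rectangle and two made-up boxes; `iou_with_rect` is a compact stand-in for `jaccard_numpy` from the full source.

```python
import numpy as np


def iou_with_rect(boxes, rect):
    """IoU between each box in `boxes` ([N, 4]) and one `rect` ([4])."""
    top_left = np.maximum(boxes[:, :2], rect[:2])
    bottom_right = np.minimum(boxes[:, 2:], rect[2:])
    wh = np.clip(bottom_right - top_left, 0, None)
    inter = wh[:, 0] * wh[:, 1]
    area_boxes = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    area_rect = (rect[2] - rect[0]) * (rect[3] - rect[1])
    return inter / (area_boxes + area_rect - inter)


boxes = np.array([[20.0, 20.0, 60.0, 60.0],
                  [80.0, 80.0, 120.0, 120.0]])
rect = np.array([10, 10, 100, 100])  # hypothetical crop: x1, y1, x2, y2

# step 1: overlap of every ground-truth box with the crop
overlap = iou_with_rect(boxes, rect)

# step 2: keep a box only if its center is strictly inside the crop
centers = (boxes[:, :2] + boxes[:, 2:]) / 2.0
inside = np.all((rect[:2] < centers) & (centers < rect[2:]), axis=1)

# step 3: clip the kept boxes to the crop, then move the origin to the
# crop's top-left corner
kept = boxes[inside].copy()
kept[:, :2] = np.maximum(kept[:, :2], rect[:2]) - rect[:2]
kept[:, 2:] = np.minimum(kept[:, 2:], rect[2:]) - rect[:2]

print(overlap, inside, kept)
```

The second box overlaps the crop but its center (100, 100) sits exactly on the crop border, so the strict comparison drops it, just as `m1 * m2` does in the class.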

Mean padding

    import numpy as np
    from numpy import random

    class Expand(object):
        def __init__(self, mean):
            self.mean = mean

        def __call__(self, image, boxes, labels):
            # skip the expansion half of the time
            if random.randint(2):
                return image, boxes, labels

            height, width, depth = image.shape
            ratio = random.uniform(1, 4)
            left = random.uniform(0, width * ratio - width)
            top = random.uniform(0, height * ratio - height)

            # larger canvas filled with the mean, original image pasted inside
            expand_image = np.zeros(
                (int(height * ratio), int(width * ratio), depth),
                dtype=image.dtype)
            expand_image[:, :, :] = self.mean
            expand_image[int(top):int(top + height),
                         int(left):int(left + width)] = image
            image = expand_image

            # shift the boxes by the paste offset
            boxes = boxes.copy()
            boxes[:, :2] += (int(left), int(top))
            boxes[:, 2:] += (int(left), int(top))

            return image, boxes, labels
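
A deterministic sketch of the same idea, with the random draws replaced by fixed arguments: the function name `expand`, the mean values, and the chosen ratio and offsets are assumptions for illustration. The image is pasted onto a larger mean-filled canvas and the boxes are shifted by the paste offset.

```python
import numpy as np


def expand(image, boxes, mean, ratio, left, top):
    height, width, depth = image.shape
    canvas = np.empty((int(height * ratio), int(width * ratio), depth),
                      dtype=image.dtype)
    canvas[:, :, :] = mean                               # fill with the mean
    canvas[top:top + height, left:left + width] = image  # paste the original

    shifted = boxes.copy()
    shifted[:, :2] += (left, top)  # xmin, ymin
    shifted[:, 2:] += (left, top)  # xmax, ymax
    return canvas, shifted


image = np.full((100, 200, 3), 255.0, dtype=np.float32)
boxes = np.array([[50.0, 20.0, 150.0, 80.0]], dtype=np.float32)
mean = (104.0, 117.0, 123.0)  # assumed per-channel mean

big, moved = expand(image, boxes, mean, ratio=2.0, left=30, top=10)
print(big.shape, moved)
```

The box keeps its pixel size while the canvas grows, so the object occupies a smaller fraction of the image, which has the effect of generating smaller training objects.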

Full source code

    # from https://github.com/amdegroot/ssd.pytorch
    import torch
    from torchvision import transforms
    import cv2
    import numpy as np
    import types
    from numpy import random


    def intersect(box_a, box_b):
        max_xy = np.minimum(box_a[:, 2:], box_b[2:])
        min_xy = np.maximum(box_a[:, :2], box_b[:2])
        inter = np.clip((max_xy - min_xy), a_min=0, a_max=np.inf)
        return inter[:, 0] * inter[:, 1]


    def jaccard_numpy(box_a, box_b):
        """Compute the jaccard overlap of two sets of boxes.  The jaccard overlap
        is simply the intersection over union of two boxes.
        E.g.:
            A ∩ B / A ∪ B = A ∩ B / (area(A) + area(B) - A ∩ B)
        Args:
            box_a: Multiple bounding boxes, Shape: [num_boxes,4]
            box_b: Single bounding box, Shape: [4]
        Return:
            jaccard overlap: Shape: [box_a.shape[0]]
        """
        inter = intersect(box_a, box_b)
        area_a = ((box_a[:, 2] - box_a[:, 0]) *
                  (box_a[:, 3] - box_a[:, 1]))  # [num_boxes]
        area_b = ((box_b[2] - box_b[0]) *
                  (box_b[3] - box_b[1]))  # scalar
        union = area_a + area_b - inter
        return inter / union  # [num_boxes]


    class Compose(object):
        """Composes several augmentations together.
        Args:
            transforms (List[Transform]): list of transforms to compose.
        Example:
            >>> augmentations.Compose([
            >>>     transforms.CenterCrop(10),
            >>>     transforms.ToTensor(),
            >>> ])
        """

        def __init__(self, transforms):
            self.transforms = transforms

        def __call__(self, img, boxes=None, labels=None):
            for t in self.transforms:
                img, boxes, labels = t(img, boxes, labels)
            return img, boxes, labels


    class Lambda(object):
        """Applies a lambda as a transform."""

        def __init__(self, lambd):
            assert isinstance(lambd, types.LambdaType)
            self.lambd = lambd

        def __call__(self, img, boxes=None, labels=None):
            return self.lambd(img, boxes, labels)


    class ConvertFromInts(object):
        def __call__(self, image, boxes=None, labels=None):
            return image.astype(np.float32), boxes, labels


    class SubtractMeans(object):
        def __init__(self, mean):
            self.mean = np.array(mean, dtype=np.float32)

        def __call__(self, image, boxes=None, labels=None):
            image = image.astype(np.float32)
            image -= self.mean
            return image.astype(np.float32), boxes, labels


    class ToAbsoluteCoords(object):
        def __call__(self, image, boxes=None, labels=None):
            height, width, channels = image.shape
            boxes[:, 0] *= width
            boxes[:, 2] *= width
            boxes[:, 1] *= height
            boxes[:, 3] *= height
            return image, boxes, labels


    class ToPercentCoords(object):
        def __call__(self, image, boxes=None, labels=None):
            height, width, channels = image.shape
            boxes[:, 0] /= width
            boxes[:, 2] /= width
            boxes[:, 1] /= height
            boxes[:, 3] /= height
            return image, boxes, labels


    class Resize(object):
        def __init__(self, size=300):
            self.size = size

        def __call__(self, image, boxes=None, labels=None):
            # boxes are returned unchanged
            image = cv2.resize(image, (self.size, self.size))
            return image, boxes, labels


    class RandomSaturation(object):
        def __init__(self, lower=0.5, upper=1.5):
            self.lower = lower
            self.upper = upper
            assert self.upper >= self.lower, "contrast upper must be >= lower."
            assert self.lower >= 0, "contrast lower must be non-negative."

        def __call__(self, image, boxes=None, labels=None):
            if random.randint(2):
                image[:, :, 1] *= random.uniform(self.lower, self.upper)
            return image, boxes, labels


    class RandomHue(object):
        def __init__(self, delta=18.0):
            assert delta >= 0.0 and delta <= 360.0
            self.delta = delta

        def __call__(self, image, boxes=None, labels=None):
            if random.randint(2):
                image[:, :, 0] += random.uniform(-self.delta, self.delta)
                # wrap the hue back into [0, 360]
                image[:, :, 0][image[:, :, 0] > 360.0] -= 360.0
                image[:, :, 0][image[:, :, 0] < 0.0] += 360.0
            return image, boxes, labels


    class RandomLightingNoise(object):
        def __init__(self):
            self.perms = ((0, 1, 2), (0, 2, 1),
                          (1, 0, 2), (1, 2, 0),
                          (2, 0, 1), (2, 1, 0))

        def __call__(self, image, boxes=None, labels=None):
            if random.randint(2):
                swap = self.perms[random.randint(len(self.perms))]
                shuffle = SwapChannels(swap)  # shuffle channels
                image = shuffle(image)
            return image, boxes, labels


    class ConvertColor(object):
        def __init__(self, current, transform):
            self.transform = transform
            self.current = current

        def __call__(self, image, boxes=None, labels=None):
            if self.current == 'BGR' and self.transform == 'HSV':
                image = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
            elif self.current == 'RGB' and self.transform == 'HSV':
                image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
            elif self.current == 'BGR' and self.transform == 'RGB':
                image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
            elif self.current == 'HSV' and self.transform == 'BGR':
                image = cv2.cvtColor(image, cv2.COLOR_HSV2BGR)
            elif self.current == 'HSV' and self.transform == "RGB":
                image = cv2.cvtColor(image, cv2.COLOR_HSV2RGB)
            else:
                raise NotImplementedError
            return image, boxes, labels


    class RandomContrast(object):
        def __init__(self, lower=0.5, upper=1.5):
            self.lower = lower
            self.upper = upper
            assert self.upper >= self.lower, "contrast upper must be >= lower."
            assert self.lower >= 0, "contrast lower must be non-negative."

        # expects float image
        def __call__(self, image, boxes=None, labels=None):
            if random.randint(2):
                alpha = random.uniform(self.lower, self.upper)
                image *= alpha
            return image, boxes, labels


    class RandomBrightness(object):
        def __init__(self, delta=32):
            assert delta >= 0.0
            assert delta <= 255.0
            self.delta = delta

        def __call__(self, image, boxes=None, labels=None):
            if random.randint(2):
                delta = random.uniform(-self.delta, self.delta)
                image += delta
            return image, boxes, labels


    class ToCV2Image(object):
        def __call__(self, tensor, boxes=None, labels=None):
            # C x H x W tensor -> H x W x C array
            return tensor.cpu().numpy().astype(np.float32).transpose((1, 2, 0)), boxes, labels


    class ToTensor(object):
        def __call__(self, cvimage, boxes=None, labels=None):
            # H x W x C array -> C x H x W tensor
            return torch.from_numpy(cvimage.astype(np.float32)).permute(2, 0, 1), boxes, labels


    class RandomSampleCrop(object):
        """Crop
        Arguments:
            img (Image): the image being input during training
            boxes (Tensor): the original bounding boxes in pt form
            labels (Tensor): the class labels for each bbox
            mode (float tuple): the min and max jaccard overlaps
        Return:
            (img, boxes, classes)
                img (Image): the cropped image
                boxes (Tensor): the adjusted bounding boxes in pt form
                labels (Tensor): the class labels for each bbox
        """
        def __init__(self):
            self.sample_options = (
                # using entire original input image
                None,
                # sample a patch s.t. MIN jaccard w/ obj in .1, .3, .7, .9
                (0.1, None),
                (0.3, None),
                (0.7, None),
                (0.9, None),
                # randomly sample a patch
                (None, None),
            )

        def __call__(self, image, boxes=None, labels=None):
            # guard against no boxes
            if boxes is not None and boxes.shape[0] == 0:
                return image, boxes, labels
            height, width, _ = image.shape
            while True:
                # randomly choose a mode; the tuple is indexed directly because
                # np.random.choice on this ragged tuple may fail in recent NumPy
                mode = self.sample_options[random.randint(len(self.sample_options))]
                if mode is None:
                    return image, boxes, labels

                min_iou, max_iou = mode
                if min_iou is None:
                    min_iou = float('-inf')
                if max_iou is None:
                    max_iou = float('inf')

                # max trials (50)
                for _ in range(50):
                    current_image = image

                    w = random.uniform(0.3 * width, width)
                    h = random.uniform(0.3 * height, height)

                    # aspect ratio constraint b/t .5 & 2
                    if h / w < 0.5 or h / w > 2:
                        continue

                    # single-argument form: NumPy reads this as
                    # low=width - w, high=1.0
                    left = random.uniform(width - w)
                    top = random.uniform(height - h)

                    # convert to integer rect x1,y1,x2,y2
                    rect = np.array([int(left), int(top), int(left + w), int(top + h)])

                    # calculate IoU (jaccard overlap) b/t the cropped and gt boxes
                    overlap = jaccard_numpy(boxes, rect)

                    # is min and max overlap constraint satisfied? if not try again
                    if overlap.max() < min_iou or overlap.min() > max_iou:
                        continue

                    # cut the crop from the image
                    current_image = current_image[rect[1]:rect[3], rect[0]:rect[2], :]

                    # keep overlap with gt box IF center in sampled patch
                    centers = (boxes[:, :2] + boxes[:, 2:]) / 2.0

                    # mask in all gt boxes that above and to the left of centers
                    m1 = (rect[0] < centers[:, 0]) * (rect[1] < centers[:, 1])

                    # mask in all gt boxes that under and to the right of centers
                    m2 = (rect[2] > centers[:, 0]) * (rect[3] > centers[:, 1])

                    # mask in that both m1 and m2 are true
                    mask = m1 * m2

                    # have any valid boxes? try again if not
                    if not mask.any():
                        continue

                    # take only matching gt boxes
                    current_boxes = boxes[mask, :].copy()

                    # take only matching gt labels
                    current_labels = labels[mask]

                    # should we use the box left and top corner or the crop's
                    current_boxes[:, :2] = np.maximum(current_boxes[:, :2], rect[:2])
                    # adjust to crop (by subtracting crop's left,top)
                    current_boxes[:, :2] -= rect[:2]

                    current_boxes[:, 2:] = np.minimum(current_boxes[:, 2:], rect[2:])
                    # adjust to crop (by subtracting crop's left,top)
                    current_boxes[:, 2:] -= rect[:2]

                    return current_image, current_boxes, current_labels


    class Expand(object):
        def __init__(self, mean):
            self.mean = mean

        def __call__(self, image, boxes, labels):
            if random.randint(2):
                return image, boxes, labels

            height, width, depth = image.shape
            ratio = random.uniform(1, 4)
            left = random.uniform(0, width * ratio - width)
            top = random.uniform(0, height * ratio - height)

            expand_image = np.zeros(
                (int(height * ratio), int(width * ratio), depth),
                dtype=image.dtype)
            expand_image[:, :, :] = self.mean
            expand_image[int(top):int(top + height),
                         int(left):int(left + width)] = image
            image = expand_image

            boxes = boxes.copy()
            boxes[:, :2] += (int(left), int(top))
            boxes[:, 2:] += (int(left), int(top))

            return image, boxes, labels


    class RandomMirror(object):
        def __call__(self, image, boxes, classes):
            _, width, _ = image.shape
            if random.randint(2):
                image = image[:, ::-1]
                boxes = boxes.copy()
                # new (xmin, xmax) = width - old (xmax, xmin)
                boxes[:, 0::2] = width - boxes[:, 2::-2]
            return image, boxes, classes


    class SwapChannels(object):
        """Transforms a tensorized image by swapping the channels in the order
         specified in the swap tuple.
        Args:
            swaps (int triple): final order of channels
                eg: (2, 1, 0)
        """

        def __init__(self, swaps):
            self.swaps = swaps

        def __call__(self, image):
            """
            Args:
                image (Tensor): image tensor to be transformed
            Return:
                a tensor with channels swapped according to swap
            """
            # if torch.is_tensor(image):
            #     image = image.data.cpu().numpy()
            # else:
            #     image = np.array(image)
            image = image[:, :, self.swaps]
            return image


    class PhotometricDistort(object):
        def __init__(self):
            self.pd = [
                RandomContrast(),  # RGB
                ConvertColor(current="RGB", transform='HSV'),  # HSV
                RandomSaturation(),  # HSV
                RandomHue(),  # HSV
                ConvertColor(current='HSV', transform='RGB'),  # RGB
                RandomContrast()  # RGB
            ]
            self.rand_brightness = RandomBrightness()
            self.rand_light_noise = RandomLightingNoise()

        def __call__(self, image, boxes, labels):
            im = image.copy()
            im, boxes, labels = self.rand_brightness(im, boxes, labels)
            # apply contrast either before or after the HSV distortions
            if random.randint(2):
                distort = Compose(self.pd[:-1])
            else:
                distort = Compose(self.pd[1:])
            im, boxes, labels = distort(im, boxes, labels)
            return self.rand_light_noise(im, boxes, labels)
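
Because every transform takes and returns the same (image, boxes, labels) triple, `Compose` can chain any of them. The following is a minimal, self-contained sketch of that chaining: `Compose` is copied from the source, while `to_float` and `mirror` are simplified, always-applied stand-ins for `ConvertFromInts` and `RandomMirror`, and the sample image, box, and label are made up.

```python
import numpy as np


class Compose(object):
    def __init__(self, transforms):
        self.transforms = transforms

    def __call__(self, img, boxes=None, labels=None):
        for t in self.transforms:
            img, boxes, labels = t(img, boxes, labels)
        return img, boxes, labels


def to_float(image, boxes=None, labels=None):
    return image.astype(np.float32), boxes, labels


def mirror(image, boxes=None, labels=None):
    width = image.shape[1]
    boxes = boxes.copy()
    # boxes[:, 2::-2] picks columns (2, 0), i.e. (xmax, xmin), so the
    # new (xmin, xmax) becomes (width - xmax, width - xmin)
    boxes[:, 0::2] = width - boxes[:, 2::-2]
    return image[:, ::-1], boxes, labels


image = np.arange(4 * 10 * 3, dtype=np.uint8).reshape(4, 10, 3)
boxes = np.array([[1.0, 0.0, 4.0, 4.0]])
labels = np.array([0])  # hypothetical class id

pipeline = Compose([to_float, mirror])
out_image, out_boxes, out_labels = pipeline(image, boxes, labels)
print(out_image.dtype, out_boxes)
```

Any callable with this signature works in the list, which is why plain functions can be mixed with the classes from the source.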
