Table of Contents

  • Why Image Tampering Detection
  • Types of Image Tampering
  • Training and Test Results for Different Tampering Types
    • Code Framework and Common Experimental Parameters
    • Gaussian Blur
    • Gaussian Noise
    • Median Filtering
    • Double JPEG Compression
    • Brightness
    • Contrast
    • Experiment Summary

Why Image Tampering Detection

In the security and judicial fields, images are an important source of leads and evidence. But in an era when Photoshop is everywhere, not just any image can serve that role; generally, the image must not have been tampered with. After all, nobody wants their face to show up at a crime scene for no reason under abnormal circumstances, or even on a suspect's body; nor for the key text in a photographed contract to be altered to their disadvantage.

Also, in an era when beauty filters are everywhere, perhaps some people have a need for "anti-beautification"? After all, some people would rather not be deceived by retouched photos.

Types of Image Tampering

In practice, image tampering comes in at least the following types:

  • 1. Modification of image content, such as the face swapping or contract-text changes via Photoshop mentioned above
  • 2. Operations that indirectly suggest type-1 tampering. Examples include the median filtering, smoothing, blurring, noise addition, and so on applied to cover up type-1 traces, as well as the double JPEG compression produced by re-saving the image. These are all traditional digital image processing operations; beyond them, there is one cover-up method that is very hard to detect: re-photographing, i.e., opening the modified image on a monitor and photographing it again, which leaves no obvious digital-processing "traces"
  • 3. Arguably "beautification" is also a kind of "tampering", but beauty filters rarely seem to show up in judicial settings, so this document does not address that case

Generally speaking, the ultimate goal is to detect type-1 tampering, but that is very hard and requires deep expertise in forensics, photography, and imaging. In traditional image forensics, for example, judgments are made via noise consistency, geometric consistency, lighting consistency, and so on. In practice, you must traverse every suspicious region, and every suspicious region must be checked with every method, which is extremely time-consuming and labor-intensive.
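
To make the noise-consistency idea concrete, here is a minimal sketch (my own illustration, not part of the experiments below): estimate the noise level of each image block with a fast Laplacian-based estimator, then flag blocks whose estimate deviates strongly from the rest, since a spliced-in region often carries a different noise level than its surroundings. The block size and the outlier rule here are arbitrary assumptions.

# -*- coding: utf-8 -*-
# Illustrative noise-consistency check; block size and threshold are
# arbitrary assumptions, and the estimator is only approximate.
import cv2
import numpy as np

def estimate_noise_std(gray):
    # Fast noise std estimate via a Laplacian-like kernel (Immerkaer, 1996)
    h, w = gray.shape
    kernel = np.array([[1, -2, 1], [-2, 4, -2], [1, -2, 1]], dtype=np.float64)
    response = cv2.filter2D(gray.astype(np.float64), -1, kernel)
    return np.sum(np.abs(response)) * np.sqrt(0.5 * np.pi) / (6.0 * w * h)

def noise_consistency_map(gray, block=64):
    # Estimate the noise std block by block; spliced regions tend to
    # deviate from the median estimate over the whole image
    rows, cols = gray.shape[0] // block, gray.shape[1] // block
    stds = np.zeros((rows, cols))
    for r in range(rows):
        for c in range(cols):
            patch = gray[r * block:(r + 1) * block, c * block:(c + 1) * block]
            stds[r, c] = estimate_noise_std(patch)
    return stds  # inspect blocks far away from np.median(stds)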

With deep learning, you can use image segmentation to directly segment out the tampered regions. But once you actually try to train such a model, you discover how hard it really is, because the training data is very difficult to produce. There are at least two ways to make this kind of data, each with its own pros and cons:

  • One is to splice images together algorithmically, more or less at random, aided by some data augmentation. This can generate unlimited data, but it all looks obviously fake, and the resulting models usually cannot handle carefully Photoshopped images
  • The other is to produce data through manual Photoshop work. The data quality may be higher, but the efficiency is far too low; for model training it is practically infeasible.

For these reasons, we usually first check whether an image has undergone type-2 tampering, which is technically easier to detect; if it has, the image probably deserves more careful treatment.

The rest of this document focuses on type-2 tampering. Type 1 is hard to crack; it is not something one person at home with a 1050 Ti can solve casually, so this document will not attempt it... If you are genuinely interested in type-1 tampering, you can refer to Adobe's 2019 creativity conference.

Detecting the type-2 tampering described above with traditional methods is actually somewhat difficult, especially for a highly nonlinear operation like median filtering. With deep learning, however, brute force really does work wonders: it is handled almost trivially. Some related experiments follow.

Training and Test Results for Different Tampering Types

Before reading further, be aware that image tampering detection is a cat-and-mouse game against humans, so it is hard to define what "solved" means.

This differs from general CV tasks such as license plate recognition or face recognition, where once a certain standard is reached the technology can be deployed commercially at scale. License plates can be counterfeited and faces raise liveness and mask issues, but the problem types are few, and established solutions or techniques exist for them.

So this document merely scratches the surface with some simple experiments on the type-2 tampering described above.

Code Framework and Common Experimental Parameters

The code consists of three files:

1. util.py contains helper functions, including some of the tampering operations and the routine for randomly extracting image patches for training. For the underlying details, see the following two documents:

  • Image degradation methods and their Python implementation
  • Randomly cropping multiple patches from an image

2. generate_train_test_data.py produces the training data. Since this is only a simple experiment, the amount of data is small and can be loaded into memory all at once, so it is saved in numpy's .npy format. The data consists of 60 training images and 30 test images, all shot casually with phones (beauty mode off); three phone models were used: Honor 10, Honor 30, and Mate 30. The training patch size is 28*28; about 300k patches were cropped for the training set and about 150k for the test set. The part of this file that generates tampered_image can be modified to test the various tampering types. Unless otherwise noted, the logs in the experiments below reflect the tampering parameters as given in the code.

3. train.py performs the training, covering Dataset creation, model definition, and the train and test loops. Since the experiment is simple, everything is written in one file. The hyperparameters are:

  • Network: 6 convolutional layers in a very simple VGG style, with one pooling layer every 2 conv layers
  • Optimizer: Adam
  • Number of epochs: 10
  • Learning rate: unless otherwise noted, 1e-4 for the first 5 epochs and 1e-5 for the last 5
  • batch_size: 50

The contents of each code file follow. If you would like to run the code, note two things:

  • Shoot the data yourself with a phone and put the original full-size images into two folders
  • Modify the path-related variables

util.py

# -*- coding: utf-8 -*-
import os
import cv2
import numpy as np


def uniform_random(low, high, shape=None):
    """Get uniform random number(s) between low and high

    Parameters
    ----------
    low: low limit of random number(s)
    high: high limit of random number(s)
    shape: shape of output array. A single number is returned if shape is None

    Returns
    -------
    Uniform random number(s) between low and high
    """
    return np.random.random(shape) * (high - low) + low


def add_gaussian_noise(image, mean_ratio, std_ratio, noise_num_ratio=1.0):
    """Add Gaussian noise to image.

    Parameters
    ----------
    image: image data read by opencv, shape is [H, W, C]
    mean_ratio: ratio with respect to image_mean for mean of gaussian random
        numbers
    std_ratio: ratio with respect to image_mean for std (scale) of gaussian
        random numbers
    noise_num_ratio: ratio of noise number with respect to the total number of
        pixels, between [0, 1]

    Returns
    -------
    noisy_image: image after adding noise
    """
    if std_ratio < 0:
        raise ValueError('std_ratio must be >= 0.0')
    if not 0.0 <= noise_num_ratio <= 1.0:
        raise ValueError('noise_num_ratio must be between [0, 1]')
    # get noise shape and channel number
    noise_shape = get_noise_shape(image)
    channel = noise_shape[2]
    # compute channel-wise mean and std
    image_mean = np.array(cv2.mean(image)[:channel])
    mean = image_mean * mean_ratio
    std = image_mean * std_ratio
    # generate noise
    noise = np.random.normal(mean, std, noise_shape)
    noisy_image = image.copy().astype(np.float32)
    if noisy_image.ndim == 2:
        noisy_image = noisy_image[..., np.newaxis]  # add channel axis
    # add noise according to noise_num_ratio
    if noise_num_ratio >= 1.0:
        noisy_image[:, :, :channel] += noise
    else:
        row, col = get_noise_index(image, noise_num_ratio)
        noisy_image[row, col, :channel] += noise[row, col, ...]
    # post processing
    noisy_image = float_to_uint8(noisy_image, scale=1.0)
    noisy_image = np.squeeze(noisy_image)
    return noisy_image


def float_to_uint8(image, scale=255.0):
    """Convert image from float type to uint8, clipping to [0, 255] in the
    process.

    Parameters
    ----------
    image: numpy array image data of float type
    scale: a scale factor for image data

    Returns
    -------
    image_uint8: numpy array image data of uint8 type
    """
    image_uint8 = np.clip(np.round(image * scale), 0, 255).astype(np.uint8)
    return image_uint8


def get_noise_index(image, noise_num_ratio):
    """Get noise index for a certain ratio of noise number

    Parameters
    ----------
    image: numpy array image data
    noise_num_ratio: ratio of noise number with respect to the total number of
        pixels, between [0, 1]

    Returns
    -------
    row: row indexes
    col: column indexes
    """
    image_height, image_width = image.shape[0:2]
    noise_num = int(np.round(image_height * image_width * noise_num_ratio))
    row = np.random.randint(0, image_height, noise_num)
    col = np.random.randint(0, image_width, noise_num)
    return row, col


def get_noise_shape(image):
    """Get noise shape according to image shape.

    Parameters
    ----------
    image: numpy array image data

    Returns
    -------
    noise_shape: a tuple whose length is 3
        The shape of noise. Let height, width be the image height and width.
        If image.ndim is 2, output noise_shape will be (height, width, 1),
        else (height, width, 3)
    """
    if not (image.ndim == 2 or image.ndim == 3):
        raise ValueError('image ndim must be 2 or 3')
    height, width = image.shape[:2]
    if image.ndim == 2:
        channel = 1
    else:
        channel = image.shape[2]
        if channel >= 4:
            channel = 3
    noise_shape = (height, width, channel)
    return noise_shape


def jpeg_compression(image, quality_factor):
    """Apply jpeg compression to image without saving it to disk.

    Parameters
    ----------
    image: image data read by opencv, shape is [H, W, C]
    quality_factor: jpeg quality factor, between [0, 100]. Higher value means
        higher quality image

    Returns
    -------
    jpeg_image: jpeg compressed image
    """
    compression_factor = int(quality_factor)
    compression_param = [cv2.IMWRITE_JPEG_QUALITY, compression_factor]
    image_encode = cv2.imencode('.jpg', image, compression_param)[1]
    jpeg_image = cv2.imdecode(image_encode, -1)
    return jpeg_image


def get_random_patch_bboxes(image, bbox_size, stride, jitter, roi_bbox=None):
    """Generate random patch bounding boxes for an image around ROI region

    Parameters
    ----------
    image: image data read by opencv, shape is [H, W, C]
    bbox_size: size of patch bbox, one digit or a list/tuple containing two
        digits, defined by (width, height)
    stride: stride between adjacent bboxes (before jitter), one digit or a
        list/tuple containing two digits, defined by (x, y)
    jitter: jitter size for evenly distributed bboxes, one digit or a
        list/tuple containing two digits, defined by (x, y)
    roi_bbox: roi region, defined by [xmin, ymin, xmax, ymax], default is
        whole image region

    Returns
    -------
    patch_bboxes: randomly distributed patch bounding boxes, n x 4 numpy
        array. Each bounding box is defined by [xmin, ymin, xmax, ymax]
    """
    height, width = image.shape[:2]
    bbox_size = _process_geometry_param(bbox_size, min_value=1)
    stride = _process_geometry_param(stride, min_value=1)
    jitter = _process_geometry_param(jitter, min_value=0)
    if bbox_size[0] > width or bbox_size[1] > height:
        raise ValueError('box_size must be <= image size')
    if roi_bbox is None:
        roi_bbox = [0, 0, width, height]
    # tl is for top-left, br is for bottom-right
    tl_x, tl_y = _get_top_left_points(roi_bbox, bbox_size, stride, jitter)
    br_x = tl_x + bbox_size[0]
    br_y = tl_y + bbox_size[1]
    # shrink bottom-right points to avoid exceeding image border
    br_x[br_x > width] = width
    br_y[br_y > height] = height
    # shrink top-left points to avoid exceeding image border
    tl_x = br_x - bbox_size[0]
    tl_y = br_y - bbox_size[1]
    tl_x[tl_x < 0] = 0
    tl_y[tl_y < 0] = 0
    # compute bottom-right points again
    br_x = tl_x + bbox_size[0]
    br_y = tl_y + bbox_size[1]
    patch_bboxes = np.concatenate((tl_x, tl_y, br_x, br_y), axis=1)
    return patch_bboxes


def _process_geometry_param(param, min_value):
    """Process and check param, which must be one digit or a list/tuple
    containing two digits, and its value must be >= min_value

    Parameters
    ----------
    param: parameter to be processed
    min_value: min value for param

    Returns
    -------
    param: param after processing
    """
    if isinstance(param, (int, float)) or \
            isinstance(param, np.ndarray) and param.size == 1:
        param = int(np.round(param))
        param = [param, param]
    else:
        if len(param) != 2:
            raise ValueError('param must be one digit or two digits')
        param = [int(np.round(param[0])), int(np.round(param[1]))]
    # check data range using min_value
    if not (param[0] >= min_value and param[1] >= min_value):
        raise ValueError('param must be >= min_value (%d)' % min_value)
    return param


def _get_top_left_points(roi_bbox, bbox_size, stride, jitter):
    """Generate top-left points for bounding boxes

    Parameters
    ----------
    roi_bbox: roi region, defined by [xmin, ymin, xmax, ymax]
    bbox_size: size of patch bbox, a list/tuple containing two digits,
        defined by (width, height)
    stride: stride between adjacent bboxes (before jitter), a list/tuple
        containing two digits, defined by (x, y)
    jitter: jitter size for evenly distributed bboxes, a list/tuple
        containing two digits, defined by (x, y)

    Returns
    -------
    tl_x: x coordinates of top-left points, n x 1 numpy array
    tl_y: y coordinates of top-left points, n x 1 numpy array
    """
    xmin, ymin, xmax, ymax = roi_bbox
    roi_width = xmax - xmin
    roi_height = ymax - ymin
    # get the offset between the first top-left point of patch box and the
    # top-left point of roi_bbox
    offset_x = np.arange(0, roi_width, stride[0])[-1] + bbox_size[0]
    offset_y = np.arange(0, roi_height, stride[1])[-1] + bbox_size[1]
    offset_x = (offset_x - roi_width) // 2
    offset_y = (offset_y - roi_height) // 2
    # get the coordinates of all top-left points
    tl_x = np.arange(xmin, xmax, stride[0]) - offset_x
    tl_y = np.arange(ymin, ymax, stride[1]) - offset_y
    tl_x, tl_y = np.meshgrid(tl_x, tl_y)
    tl_x = np.reshape(tl_x, [-1, 1])
    tl_y = np.reshape(tl_y, [-1, 1])
    # jitter the coordinates of all top-left points
    tl_x += np.random.randint(-jitter[0], jitter[0] + 1, size=tl_x.shape)
    tl_y += np.random.randint(-jitter[1], jitter[1] + 1, size=tl_y.shape)
    return tl_x, tl_y

generate_train_test_data.py

# -*- coding: utf-8 -*-
import os
import cv2
import numpy as np

from util import uniform_random
from util import get_random_patch_bboxes
from util import jpeg_compression
from util import add_gaussian_noise

ROOT_FOLDER_TRAIN = r'F:\Forensic\train'
ROOT_FOLDER_TEST = r'F:\Forensic\test'
OUTPUT_FOLDER = r'F:\Forensic\noise'

PATCH_SHAPE = (28, 28)
STRIDE = (64, 64)
JITTER = (32, 32)


def make_data(root_folder, phase='train'):
    """Make image patches and the corresponding labels, and then save them to
    disk. Half of the patches are original, the other half are tampered.

    Parameters
    ----------
    root_folder: root_folder of original full image
    phase: 'train' or 'test'
    """
    files = os.listdir(root_folder)
    # make data
    real_patches = []
    tampered_patches = []
    for i, file in enumerate(files):
        print(i + 1, file)
        image = cv2.imread(os.path.join(root_folder, file))

        # the following part can be modified to generate other types
        # of tampered_image
        ''' Gaussian blur '''
        ksize = np.random.choice([3, 5, 7, 9], size=2)
        ksize = tuple(ksize)
        tampered_image = cv2.GaussianBlur(
            image, ksize,
            sigmaX=uniform_random(1.0, 3.0),
            sigmaY=uniform_random(1.0, 3.0))

        ''' Gaussian noise '''
        # tampered_image = add_gaussian_noise(
        #     image,
        #     mean_ratio=0.0,
        #     std_ratio=uniform_random(0.01, 0.3))

        ''' median blur '''
        # ksize = np.random.choice([3, 5, 7, 9])
        # tampered_image = cv2.medianBlur(image, ksize=ksize)

        ''' JPEG compression '''
        # tampered_image = jpeg_compression(image, uniform_random(50, 95))

        ''' brightness '''
        # brightness = uniform_random(-25, 25)
        # tampered_image = np.float64(image) + brightness
        # tampered_image = np.clip(np.round(tampered_image), 0, 255)
        # tampered_image = np.uint8(tampered_image)

        ''' contrast '''
        # contrast = uniform_random(0.75, 1.33)
        # tampered_image = np.float64(image) * contrast
        # tampered_image = np.clip(np.round(tampered_image), 0, 255)
        # tampered_image = np.uint8(tampered_image)

        patch_bboxes = get_random_patch_bboxes(
            image, PATCH_SHAPE, STRIDE, JITTER)
        blur_patch_bboxes = get_random_patch_bboxes(
            image, PATCH_SHAPE, STRIDE, JITTER)
        for bbox in patch_bboxes:
            xmin, ymin, xmax, ymax = bbox
            real_patches.append(image[ymin:ymax, xmin:xmax])
        for bbox in blur_patch_bboxes:
            xmin, ymin, xmax, ymax = bbox
            tampered_patches.append(tampered_image[ymin:ymax, xmin:xmax])

    real_patches = np.array(real_patches)
    tampered_patches = np.array(tampered_patches)
    real_labels = np.ones(shape=real_patches.shape[0], dtype=np.int64)
    tampered_labels = np.zeros(shape=tampered_patches.shape[0],
                               dtype=np.int64)
    patches = np.concatenate((real_patches, tampered_patches), axis=0)
    patches = patches.transpose([0, 3, 1, 2])
    labels = np.concatenate((real_labels, tampered_labels))

    # save data
    os.makedirs(OUTPUT_FOLDER, exist_ok=True)
    np.save(os.path.join(OUTPUT_FOLDER, '%s_data.npy' % phase), patches)
    np.save(os.path.join(OUTPUT_FOLDER, '%s_label.npy' % phase), labels)
    print('Total number of %s samples is %d' % (phase, labels.shape[0]))


if __name__ == '__main__':
    make_data(ROOT_FOLDER_TRAIN, 'train')
    make_data(ROOT_FOLDER_TEST, 'test')

train.py

# -*- coding: utf-8 -*-
import os
import time
import numpy as np

import torch
import torch.nn as nn
from torch.utils.data import Dataset
from torch.utils.data import DataLoader
import torchsummary

DEVICE = torch.device("cuda" if torch.cuda.is_available() else "cpu")
EPOCH = 10
TRAIN_BATCH_SIZE = 50
TEST_BATCH_SIZE = 32
BASE_CHANNEL = 32
INPUT_CHANNEL = 3
INPUT_SIZE = 28

TRAIN_DATA_FILE = r'F:\Forensic\noise\train_data.npy'
TRAIN_LABEL_FILE = r'F:\Forensic\noise\train_label.npy'
TEST_DATA_FILE = r'F:\Forensic\noise\test_data.npy'
TEST_LABEL_FILE = r'F:\Forensic\noise\test_label.npy'
MODEL_FOLDER = r'.\saved_model'


def update_learning_rate(optimizer, epoch):
    """Update learning rate stepwise for optimizer

    Parameters
    ----------
    optimizer: pytorch optimizer
    epoch: epoch
    """
    learning_rate = 1e-4
    if epoch > 5:
        learning_rate = 1e-5
    for param_group in optimizer.param_groups:
        param_group['lr'] = learning_rate


class Model(nn.Module):
    """6 layers plain model for forensic classification"""

    def __init__(self, input_ch, num_classes, base_ch):
        super(Model, self).__init__()
        self.num_classes = num_classes
        self.base_ch = base_ch
        self.feature_length = base_ch * 4
        self.net = nn.Sequential(
            nn.Conv2d(input_ch, base_ch, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(base_ch, base_ch, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(base_ch, base_ch * 2, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(base_ch * 2, base_ch * 2, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(kernel_size=2, stride=2),
            nn.Conv2d(base_ch * 2, self.feature_length, kernel_size=3,
                      padding=1),
            nn.ReLU(),
            nn.Conv2d(self.feature_length, self.feature_length, kernel_size=3,
                      padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(output_size=(1, 1)))
        self.fc = nn.Linear(in_features=self.feature_length,
                            out_features=num_classes)

    def forward(self, input):
        output = self.net(input)
        output = output.view(-1, self.feature_length)
        output = self.fc(output)
        return output


class ForensicDataset(Dataset):
    """Pytorch dataset for train and test"""

    def __init__(self, data, label):
        super(ForensicDataset, self).__init__()
        self.data = data
        self.label = label
        self.num = len(label)

    def __len__(self):
        return self.num

    def __getitem__(self, index):
        data = self.data[index]
        label = self.label[index]
        return data, label


def load_dataset():
    """Load train and test dataset"""
    # load train dataset
    data = np.load(TRAIN_DATA_FILE).astype(np.float32)
    label = np.load(TRAIN_LABEL_FILE).astype(np.int64)
    data = torch.from_numpy(data)
    label = torch.from_numpy(label)
    train_dataset = ForensicDataset(data, label)
    # load test dataset
    data = np.load(TEST_DATA_FILE).astype(np.float32)
    label = np.load(TEST_LABEL_FILE).astype(np.int64)
    data = torch.from_numpy(data)
    label = torch.from_numpy(label)
    test_dataset = ForensicDataset(data, label)
    return train_dataset, test_dataset


if __name__ == '__main__':
    time_beg = time.time()

    train_dataset, test_dataset = load_dataset()
    train_loader = DataLoader(dataset=train_dataset,
                              batch_size=TRAIN_BATCH_SIZE,
                              shuffle=True)
    test_loader = DataLoader(dataset=test_dataset,
                             batch_size=TEST_BATCH_SIZE,
                             shuffle=False)

    model = Model(input_ch=INPUT_CHANNEL, num_classes=2,
                  base_ch=BASE_CHANNEL).to(DEVICE)
    torchsummary.summary(model,
                         input_size=(INPUT_CHANNEL, INPUT_SIZE, INPUT_SIZE),
                         device=DEVICE.type)
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters())

    train_loss = []
    for ep in range(1, EPOCH + 1):
        update_learning_rate(optimizer, ep)

        # ----------------- train -----------------
        model.train()
        time_beg_epoch = time.time()
        loss_recorder = []
        for data, classes in train_loader:
            data, classes = data.to(DEVICE), classes.to(DEVICE)
            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, classes)
            loss.backward()
            optimizer.step()
            loss_recorder.append(loss.item())
            time_cost = time.time() - time_beg_epoch
            print('\rEpoch: %d, Loss: %0.4f, Time cost (s): %0.2f' % (
                ep, loss_recorder[-1], time_cost), end='')

        # print train info after one epoch
        train_loss.append(loss_recorder)
        mean_loss_epoch = torch.mean(torch.Tensor(loss_recorder))
        time_cost_epoch = time.time() - time_beg_epoch
        print('\rEpoch: %d, Mean loss: %0.4f, Epoch time cost (s): %0.2f' % (
            ep, mean_loss_epoch.item(), time_cost_epoch), end='')

        # save model
        os.makedirs(MODEL_FOLDER, exist_ok=True)
        model_filename = os.path.join(MODEL_FOLDER, 'epoch_%d.pth' % ep)
        torch.save(model.state_dict(), model_filename)

        # ----------------- test -----------------
        model.eval()
        correct = 0
        total = 0
        with torch.no_grad():
            for data, classes in test_loader:
                data, classes = data.to(DEVICE), classes.to(DEVICE)
                output = model(data)
                _, predicted = torch.max(output.data, 1)
                total += classes.size(0)
                correct += (predicted == classes).sum().item()
        print(', Test accuracy: %0.4f' % (correct / total))

    print('Total time cost: ', time.time() - time_beg)
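
Note that train.py only reports patch-level accuracy. To apply the trained classifier to a full suspect image, one natural approach is to scan the image with 28*28 patches and let the patch predictions vote; the following is a minimal sketch of that idea (my own addition, not part of the original experiments; the suspect image path is hypothetical).

# -*- coding: utf-8 -*-
# Sketch: patch-wise inference over a whole image with the trained model
import cv2
import numpy as np
import torch

from train import Model, DEVICE, BASE_CHANNEL, INPUT_CHANNEL, INPUT_SIZE

model = Model(input_ch=INPUT_CHANNEL, num_classes=2, base_ch=BASE_CHANNEL)
model.load_state_dict(torch.load(r'.\saved_model\epoch_10.pth',
                                 map_location=DEVICE))
model.to(DEVICE).eval()

image = cv2.imread(r'F:\Forensic\suspect.jpg')  # hypothetical path
patches = []
for y in range(0, image.shape[0] - INPUT_SIZE + 1, INPUT_SIZE):
    for x in range(0, image.shape[1] - INPUT_SIZE + 1, INPUT_SIZE):
        patch = image[y:y + INPUT_SIZE, x:x + INPUT_SIZE]
        patches.append(patch.transpose(2, 0, 1))  # HWC -> CHW as in training

data = torch.from_numpy(np.array(patches).astype(np.float32)).to(DEVICE)
with torch.no_grad():
    predicted = torch.argmax(model(data), dim=1)  # 1 = real, 0 = tampered

tampered_ratio = (predicted == 0).float().mean().item()
print('Ratio of patches classified as tampered: %.4f' % tampered_ratio)

A high ratio suggests the image has been globally processed (e.g., blurred or re-compressed) and deserves a closer look.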

Gaussian Blur

As the log shows, a Gaussian-blurred image is very easy to detect: 0.99+ accuracy comes almost effortlessly.

The log is as follows:

Epoch: 1, Mean loss: 0.3753, Epoch time cost (s): 59.67, Test accuracy: 0.9501
Epoch: 2, Mean loss: 0.0936, Epoch time cost (s): 58.75, Test accuracy: 0.9768
Epoch: 3, Mean loss: 0.0380, Epoch time cost (s): 58.66, Test accuracy: 0.9874
Epoch: 4, Mean loss: 0.0254, Epoch time cost (s): 58.72, Test accuracy: 0.9902
Epoch: 5, Mean loss: 0.0217, Epoch time cost (s): 58.69, Test accuracy: 0.9735
Epoch: 6, Mean loss: 0.0116, Epoch time cost (s): 58.67, Test accuracy: 0.9929
Epoch: 7, Mean loss: 0.0091, Epoch time cost (s): 60.25, Test accuracy: 0.9935
Epoch: 8, Mean loss: 0.0082, Epoch time cost (s): 62.64, Test accuracy: 0.9934
Epoch: 9, Mean loss: 0.0076, Epoch time cost (s): 62.41, Test accuracy: 0.9933
Epoch: 10, Mean loss: 0.0071, Epoch time cost (s): 59.13, Test accuracy: 0.9940

Gaussian Noise

Gaussian noise is extremely easy to detect; accuracy climbs past 0.99 with hardly any effort.

The log is as follows:

Epoch: 1, Mean loss: 0.1213, Epoch time cost (s): 58.44, Test accuracy: 0.9740
Epoch: 2, Mean loss: 0.0447, Epoch time cost (s): 58.80, Test accuracy: 0.9562
Epoch: 3, Mean loss: 0.0272, Epoch time cost (s): 58.91, Test accuracy: 0.9867
Epoch: 4, Mean loss: 0.0170, Epoch time cost (s): 59.00, Test accuracy: 0.9885
Epoch: 5, Mean loss: 0.0071, Epoch time cost (s): 58.94, Test accuracy: 0.9760
Epoch: 6, Mean loss: 0.0014, Epoch time cost (s): 58.97, Test accuracy: 0.9942
Epoch: 7, Mean loss: 0.0006, Epoch time cost (s): 59.03, Test accuracy: 0.9928
Epoch: 8, Mean loss: 0.0005, Epoch time cost (s): 58.99, Test accuracy: 0.9933
Epoch: 9, Mean loss: 0.0004, Epoch time cost (s): 59.05, Test accuracy: 0.9952
Epoch: 10, Mean loss: 0.0004, Epoch time cost (s): 58.71, Test accuracy: 0.9968

Median Filtering

Median filtering is also fairly easy to detect. The best accuracy did not quite reach 0.99 but came close; with a bit more data and a few more training runs for luck, reaching it would not be very hard.

Median filtering is a strongly nonlinear operation, and detecting it with traditional methods is actually quite difficult, yet a neural network handles it almost casually.
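
As a quick illustration of that nonlinearity (a toy example, not from the experiments): the median of a sum is not the sum of the medians, so the analysis tools built around linear filters do not transfer.

import numpy as np

a = np.array([1.0, 2.0, 100.0])
b = np.array([5.0, 3.0, -50.0])

# median(a + b) = median([6, 5, 50]) = 6, but median(a) + median(b) = 2 + 3
# = 5, so the median filter cannot be treated as a linear system
print(np.median(a + b))             # 6.0
print(np.median(a) + np.median(b))  # 5.0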

The log is as follows:

Epoch: 1, Mean loss: 0.4308, Epoch time cost (s): 59.61, Test accuracy: 0.8943
Epoch: 2, Mean loss: 0.1859, Epoch time cost (s): 58.92, Test accuracy: 0.9280
Epoch: 3, Mean loss: 0.1213, Epoch time cost (s): 59.03, Test accuracy: 0.9467
Epoch: 4, Mean loss: 0.0848, Epoch time cost (s): 59.04, Test accuracy: 0.9460
Epoch: 5, Mean loss: 0.0587, Epoch time cost (s): 59.03, Test accuracy: 0.9645
Epoch: 6, Mean loss: 0.0269, Epoch time cost (s): 59.00, Test accuracy: 0.9813
Epoch: 7, Mean loss: 0.0209, Epoch time cost (s): 59.27, Test accuracy: 0.9822
Epoch: 8, Mean loss: 0.0185, Epoch time cost (s): 59.06, Test accuracy: 0.9857
Epoch: 9, Mean loss: 0.0170, Epoch time cost (s): 59.00, Test accuracy: 0.9854
Epoch: 10, Mean loss: 0.0156, Epoch time cost (s): 59.02, Test accuracy: 0.9763

Double JPEG Compression

JPEG compression is comparatively a bit harder to detect. The learning rate schedule here differs from the other experiments: I warmed up with 1e-4 for 2 epochs, used 1e-3 for epochs 3-7 and 1e-4 for epochs 8-9, and finished the last epoch at 1e-5. After several runs I found that using only 1e-4 and 1e-5 caps accuracy at around 0.90+. (There is no deep principle behind this, just trial and error, though one rule of thumb applies: early in training we want the largest learning rate that does not diverge, so the network can cover a wider search space.)
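
Expressed in code, the schedule above amounts to swapping the step function in train.py for something like the following sketch (written to match the description; it is not in the code listing above):

def update_learning_rate(optimizer, epoch):
    # Warm up, peak, then step down; epoch is 1-based as in train.py
    if epoch <= 2:
        learning_rate = 1e-4  # warm-up, epochs 1-2
    elif epoch <= 7:
        learning_rate = 1e-3  # epochs 3-7
    elif epoch <= 9:
        learning_rate = 1e-4  # epochs 8-9
    else:
        learning_rate = 1e-5  # epoch 10
    for param_group in optimizer.param_groups:
        param_group['lr'] = learning_rate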

Although double JPEG compression is slightly harder to detect, accuracy still reached 0.95+, which is acceptable.

The log is as follows:

Epoch: 1, Mean loss: 0.6933, Epoch time cost (s): 58.97, Test accuracy: 0.5056
Epoch: 2, Mean loss: 0.5764, Epoch time cost (s): 58.88, Test accuracy: 0.7660
Epoch: 3, Mean loss: 0.3430, Epoch time cost (s): 58.83, Test accuracy: 0.7949
Epoch: 4, Mean loss: 0.1980, Epoch time cost (s): 58.88, Test accuracy: 0.8683
Epoch: 5, Mean loss: 0.1609, Epoch time cost (s): 58.88, Test accuracy: 0.9193
Epoch: 6, Mean loss: 0.1489, Epoch time cost (s): 58.85, Test accuracy: 0.9333
Epoch: 7, Mean loss: 0.1268, Epoch time cost (s): 58.81, Test accuracy: 0.9380
Epoch: 8, Mean loss: 0.0825, Epoch time cost (s): 58.95, Test accuracy: 0.9528
Epoch: 9, Mean loss: 0.0744, Epoch time cost (s): 59.06, Test accuracy: 0.9536
Epoch: 10, Mean loss: 0.0626, Epoch time cost (s): 58.83, Test accuracy: 0.9545

Brightness

Brightness and contrast can be discussed together. There are many ways to modify them: it can be done directly in RGB space, although it is more common to convert to a space such as YUV or Lab first. Here we simply operate in RGB space, using the following formula:
tampered_image = α * image + β

where α modifies the contrast and β modifies the brightness.

We ran experiments with two value ranges for β and found the detection accuracy very low in both. Keep in mind that this is a binary classification problem, so 50% accuracy means pure guessing, i.e., no detection ability at all. The accuracies in the logs below are only slightly above 50%. No visualization analysis was done here, but based on the two runs, the margin above 50% most likely comes from pixels entering the saturated region of the uint8 type: when β is very small or very large, many values are truncated to 0 or 255, which makes those patches detectable. In such cases the tampering is easy to spot with the naked eye anyway, so we can essentially conclude that the network is powerless against brightness tampering.
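
Here is a toy demonstration of the saturation hypothesis (my own check, not from the experiments): a brightness shift is lossless until values clip at the ends of the uint8 range, at which point it leaves a spike in the histogram that a network (or the eye) can pick up.

import numpy as np

patch = np.random.randint(0, 256, size=(28, 28), dtype=np.uint8)
beta = 50.0

shifted = np.clip(np.round(patch.astype(np.float64) + beta), 0, 255)
saturated_ratio = np.mean(shifted == 255)

# Pixels clipped at 255 cannot be recovered by subtracting beta; the
# resulting spike at 255 is the obvious statistical trace the shift
# leaves behind
print('Ratio of saturated pixels: %.4f' % saturated_ratio)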

The log for β in [-50, 50] is as follows:

Epoch: 1, Mean loss: 0.6558, Epoch time cost (s): 59.06, Test accuracy: 0.5207
Epoch: 2, Mean loss: 0.6231, Epoch time cost (s): 58.89, Test accuracy: 0.5444
Epoch: 3, Mean loss: 0.6063, Epoch time cost (s): 58.95, Test accuracy: 0.5833
Epoch: 4, Mean loss: 0.5933, Epoch time cost (s): 58.97, Test accuracy: 0.5988
Epoch: 5, Mean loss: 0.5839, Epoch time cost (s): 58.95, Test accuracy: 0.5981
Epoch: 6, Mean loss: 0.5628, Epoch time cost (s): 58.88, Test accuracy: 0.6009
Epoch: 7, Mean loss: 0.5582, Epoch time cost (s): 58.95, Test accuracy: 0.6037
Epoch: 8, Mean loss: 0.5556, Epoch time cost (s): 58.92, Test accuracy: 0.6018
Epoch: 9, Mean loss: 0.5535, Epoch time cost (s): 59.17, Test accuracy: 0.6007
Epoch: 10, Mean loss: 0.5515, Epoch time cost (s): 60.49, Test accuracy: 0.6016

The log for β in [-25, 25] is as follows:

Epoch: 1, Mean loss: 0.6765, Epoch time cost (s): 59.06, Test accuracy: 0.5201
Epoch: 2, Mean loss: 0.6618, Epoch time cost (s): 58.56, Test accuracy: 0.5219
Epoch: 3, Mean loss: 0.6505, Epoch time cost (s): 58.81, Test accuracy: 0.5259
Epoch: 4, Mean loss: 0.6425, Epoch time cost (s): 58.94, Test accuracy: 0.5289
Epoch: 5, Mean loss: 0.6350, Epoch time cost (s): 58.85, Test accuracy: 0.5378
Epoch: 6, Mean loss: 0.6199, Epoch time cost (s): 58.75, Test accuracy: 0.5464
Epoch: 7, Mean loss: 0.6157, Epoch time cost (s): 58.70, Test accuracy: 0.5483
Epoch: 8, Mean loss: 0.6135, Epoch time cost (s): 58.75, Test accuracy: 0.5475
Epoch: 9, Mean loss: 0.6117, Epoch time cost (s): 58.88, Test accuracy: 0.5478
Epoch: 10, Mean loss: 0.6100, Epoch time cost (s): 58.41, Test accuracy: 0.5498

Contrast

The conclusion is the same as for brightness: the network is powerless to detect this type of tampering.

The log is as follows:

Epoch: 1, Mean loss: 0.6914, Epoch time cost (s): 59.21, Test accuracy: 0.4888
Epoch: 2, Mean loss: 0.6782, Epoch time cost (s): 58.85, Test accuracy: 0.5637
Epoch: 3, Mean loss: 0.6682, Epoch time cost (s): 58.85, Test accuracy: 0.5439
Epoch: 4, Mean loss: 0.6622, Epoch time cost (s): 58.89, Test accuracy: 0.5502
Epoch: 5, Mean loss: 0.6562, Epoch time cost (s): 58.78, Test accuracy: 0.5383
Epoch: 6, Mean loss: 0.6400, Epoch time cost (s): 58.87, Test accuracy: 0.5725
Epoch: 7, Mean loss: 0.6361, Epoch time cost (s): 58.92, Test accuracy: 0.5743
Epoch: 8, Mean loss: 0.6335, Epoch time cost (s): 58.80, Test accuracy: 0.5721
Epoch: 9, Mean loss: 0.6312, Epoch time cost (s): 58.78, Test accuracy: 0.5781
Epoch: 10, Mean loss: 0.6293, Epoch time cost (s): 58.81, Test accuracy: 0.5798

Experiment Summary

The experiments above suggest some subjective impressions; they may hold some truth or may be wrong, so take them with a grain of salt:

  • A CNN is by nature an operation on image regions: it has a receptive field and therefore processes a neighborhood of a certain size. If the tampering also involves neighborhood operations, such as blurring or JPEG compression, it is easily detected by a CNN.
  • Again because of neighborhoods, if an image patch carries a distinct statistical signature, such as a noise distribution, it is also easily detected.
  • But if the tampering is per-pixel and its result carries no statistical signature, as with contrast and brightness changes, a CNN is likely powerless.
