Preface

Convolutional neural networks, with their strong feature-extraction ability, shine in pattern recognition. This post shows how to use a CNN to recognize faces from the olivettifaces database; think of it as an entry-level image-recognition demo. If you have no CNN background yet, I strongly recommend first reading the earlier post 卷积神经网络(CNN)原理 (CNN fundamentals); if you already know the basics, dive right in!

-----------------------------------------------------------------------------------------------------------------------------------------

Notes:

1. For a detailed introduction to the olivettifaces face database you can follow this link and read up on it yourself; this post will not go into depth, but the necessary basics are given below.

2. I use Python 3.5 with PyCharm as the IDE, and two deep learning frameworks: TensorFlow and Keras. (Keras runs on top of TensorFlow; its code simply looks cleaner. Installing Keras is straightforward and is not covered here.)

3. The lower-level (non-Keras) implementation is more involved, so this post mainly walks through the Keras code; both versions of the source code are given at the end.

4. All of the code for this post has been uploaded; you can find it here.
-----------------------------------------------------------------------------------------------------------------------------------------

I. A brief introduction to the olivettifaces face database


1. The olivettifaces face database is a fairly small face database built by AT&T Laboratories Cambridge (the former Olivetti Research Laboratory). It contains 40 people with 10 images each, combined into one large image holding all 400 faces.

2. Pixel grayscale values lie in [0, 255]. The whole image is 1190*942 pixels, laid out as 20 rows * 20 columns of faces, so each face crop used by the code is 57*47 pixels (roughly 1190/20 by 942/20). A quick sanity check of this layout is sketched after this list.

3. The program requires h5py, installed with: python -m pip install h5py
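
Before training anything, a minimal sketch (assuming olivettifaces.gif is in the working directory and Pillow and matplotlib are installed) to verify the 20x20 grid of 57*47 faces:

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt

# Load the 400-face mosaic and normalize grayscale values to [0, 1]
img = np.asarray(Image.open('olivettifaces.gif'), dtype='float64') / 255
print(img.shape)              # expected: (1190, 942)

# Each face occupies a 57x47 block; crop the face at row 0, column 0
face = img[0:57, 0:47]
plt.imshow(face, cmap='gray')
plt.title('first face (person 0)')
plt.show()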

-----------------------------------------------------------------------------------------------------------------------------------------

II. Walking through the code

1. Reading the data and assigning labels

# Read the whole mosaic image and build the corresponding labels
def get_load_data(dataset_path):
    img = Image.open(dataset_path)
    # Normalize the data. asarray converts to np.ndarray reusing the original memory
    img_ndarray = np.asarray(img, dtype='float64') / 255
    # 400 pictures, size: 57*47 = 2679
    faces_data = np.empty((400, 2679))
    for row in range(20):
        for column in range(20):
            # flatten collapses the 2D face crop into a 1D vector
            faces_data[row * 20 + column] = np.ndarray.flatten(
                img_ndarray[row * 57:(row + 1) * 57, column * 47:(column + 1) * 47])

    # Build the labels
    label = np.empty(400)
    for i in range(40):
        label[i * 10:(i + 1) * 10] = i
    label = label.astype(np.int)

    # Split the dataset: for each person the first 8 images are for training,
    # the 9th for validation and the 10th for testing -> train:320, valid:40, test:40
    train_data = np.empty((320, 2679))
    train_label = np.empty(320)
    valid_data = np.empty((40, 2679))
    valid_label = np.empty(40)
    test_data = np.empty((40, 2679))
    test_label = np.empty(40)
    for i in range(40):
        train_data[i * 8:i * 8 + 8] = faces_data[i * 10:i * 10 + 8]   # training data
        train_label[i * 8:i * 8 + 8] = label[i * 10:i * 10 + 8]       # training labels
        valid_data[i] = faces_data[i * 10 + 8]                        # validation data
        valid_label[i] = label[i * 10 + 8]                            # validation labels
        test_data[i] = faces_data[i * 10 + 9]                         # test data
        test_label[i] = label[i * 10 + 9]                             # test labels

    train_data = train_data.astype('float32')
    valid_data = valid_data.astype('float32')
    test_data = test_data.astype('float32')

    result = [(train_data, train_label), (valid_data, valid_label), (test_data, test_label)]
    return result

Given the image path, this function reads the image data; the comments above should make each step clear. A few points worth highlighting:

  1. Setting the labels: every 10 consecutive images share the same label, i.e. one label per person.
  2. Splitting the dataset: of each person's 10 images, the first 8 are used for training, the 9th for validation and the 10th for testing; the split is done directly by index into the face array.
  3. The function returns three tuples: the training set, the validation set and the test set (a quick shape check is sketched below).
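
As that quick check of the split, a minimal sketch (assuming get_load_data defined above and olivettifaces.gif in the working directory):

(train_data, train_label), (valid_data, valid_label), (test_data, test_label) = \
    get_load_data('olivettifaces.gif')

# Expected: 320 / 40 / 40 samples, each a flattened 57*47 = 2679-dimensional vector
print(train_data.shape, train_label.shape)   # (320, 2679) (320,)
print(valid_data.shape, valid_label.shape)   # (40, 2679)  (40,)
print(test_data.shape,  test_label.shape)    # (40, 2679)  (40,)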

2. Building the CNN

# CNN body
def get_set_model(lr=0.005, decay=1e-6, momentum=0.9):
    model = Sequential()
    # Convolution 1 + pooling 1
    if K.image_data_format() == 'channels_first':
        model.add(Conv2D(nb_filters1, kernel_size=(3, 3), input_shape=(1, img_rows, img_cols)))
    else:
        model.add(Conv2D(nb_filters1, kernel_size=(2, 2), input_shape=(img_rows, img_cols, 1)))
    model.add(Activation('tanh'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    # Convolution 2 + pooling 2
    model.add(Conv2D(nb_filters2, kernel_size=(3, 3)))
    model.add(Activation('tanh'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    # Fully connected layer + classifier layer
    model.add(Flatten())
    model.add(Dense(1000))        # full connection
    model.add(Activation('tanh'))
    model.add(Dropout(0.5))
    model.add(Dense(40))
    model.add(Activation('softmax'))

    # Configure the SGD optimizer
    sgd = SGD(lr=lr, decay=decay, momentum=momentum, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd)
    return model

Doesn't the Keras code look concise? Its core vocabulary is contained in the following imports:

from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD    # gradient descent optimizer
  • Sequential: initializes the model as a linear stack of layers
  • Dense: fully connected layer
  • Flatten: flattens the multi-dimensional feature maps into a one-dimensional vector
  • SGD: the stochastic gradient descent optimizer from keras.optimizers
  • Dropout, Activation, Conv2D and MaxPooling2D are not described here; for their parameters please refer to 卷积神经网络(CNN)原理 (CNN fundamentals). A quick way to inspect the resulting architecture is sketched below.
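
As that inspection sketch, a minimal example (it assumes the constants img_rows, img_cols, nb_filters1 and nb_filters2 from the full source in the next section have already been defined):

img_rows, img_cols = 57, 47
nb_filters1, nb_filters2 = 20, 40

model = get_set_model()
model.summary()   # prints each layer with its output shape and parameter count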

3. Training and saving the weights

# Training: fit the model and save the weights
def get_train_model(model, X_train, Y_train, X_val, Y_val):
    model.fit(X_train, Y_train, batch_size=batch_size, epochs=epochs,
              verbose=1, validation_data=(X_val, Y_val))
    # Save the weights
    model.save_weights('model_weights.h5', overwrite=True)
    return model

4. Testing with the saved weights

# Testing: load the saved weights and evaluate
def get_test_model(model, X, Y):
    model.load_weights('model_weights.h5')
    score = model.evaluate(X, Y, verbose=0)
    return score

-----------------------------------------------------------------------------------------------------------------------------------------
III. Keras source code and results

# -*- coding:utf-8 -*-
# -*- author:zzZ_CMing  CSDN address:https://blog.csdn.net/zzZ_CMing
# -*- 2018/06/05;11:41
# -*- python3.5
"""
olivetti Faces is a fairly small face database built by AT&T Laboratories Cambridge (the former Olivetti
Research Laboratory). It contains 40 people with 10 images each, combined into one large image of 400 faces.
Pixel grayscale values lie in [0, 255]. The whole image is 1190*942, arranged as 20 rows * 20 columns,
so each face crop is 57*47 pixels (roughly 1190/20 by 942/20).
The program requires h5py: python -m pip install h5py
Blog: https://blog.csdn.net/zzZ_CMing, more machine learning source code there
"""
import numpy as np
from PIL import Image
from keras.models import Sequential
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.optimizers import SGD    # gradient descent optimizer
from keras.utils import np_utils
from keras import backend as K


# Read the whole mosaic image and build the corresponding labels
def get_load_data(dataset_path):
    img = Image.open(dataset_path)
    # Normalize the data. asarray converts to np.ndarray reusing the original memory
    img_ndarray = np.asarray(img, dtype='float64') / 255
    # 400 pictures, size: 57*47 = 2679
    faces_data = np.empty((400, 2679))
    for row in range(20):
        for column in range(20):
            # flatten collapses the 2D face crop into a 1D vector
            faces_data[row * 20 + column] = np.ndarray.flatten(
                img_ndarray[row * 57:(row + 1) * 57, column * 47:(column + 1) * 47])

    # Build the labels
    label = np.empty(400)
    for i in range(40):
        label[i * 10:(i + 1) * 10] = i
    label = label.astype(np.int)

    # Split the dataset: for each person the first 8 images are for training,
    # the 9th for validation and the 10th for testing -> train:320, valid:40, test:40
    train_data = np.empty((320, 2679))
    train_label = np.empty(320)
    valid_data = np.empty((40, 2679))
    valid_label = np.empty(40)
    test_data = np.empty((40, 2679))
    test_label = np.empty(40)
    for i in range(40):
        train_data[i * 8:i * 8 + 8] = faces_data[i * 10:i * 10 + 8]   # training data
        train_label[i * 8:i * 8 + 8] = label[i * 10:i * 10 + 8]       # training labels
        valid_data[i] = faces_data[i * 10 + 8]                        # validation data
        valid_label[i] = label[i * 10 + 8]                            # validation labels
        test_data[i] = faces_data[i * 10 + 9]                         # test data
        test_label[i] = label[i * 10 + 9]                             # test labels

    train_data = train_data.astype('float32')
    valid_data = valid_data.astype('float32')
    test_data = test_data.astype('float32')

    result = [(train_data, train_label), (valid_data, valid_label), (test_data, test_label)]
    return result


# CNN body
def get_set_model(lr=0.005, decay=1e-6, momentum=0.9):
    model = Sequential()
    # Convolution 1 + pooling 1
    if K.image_data_format() == 'channels_first':
        model.add(Conv2D(nb_filters1, kernel_size=(3, 3), input_shape=(1, img_rows, img_cols)))
    else:
        model.add(Conv2D(nb_filters1, kernel_size=(2, 2), input_shape=(img_rows, img_cols, 1)))
    model.add(Activation('tanh'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    # Convolution 2 + pooling 2
    model.add(Conv2D(nb_filters2, kernel_size=(3, 3)))
    model.add(Activation('tanh'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Dropout(0.25))

    # Fully connected layer + classifier layer
    model.add(Flatten())
    model.add(Dense(1000))        # full connection
    model.add(Activation('tanh'))
    model.add(Dropout(0.5))
    model.add(Dense(40))
    model.add(Activation('softmax'))

    # Configure the SGD optimizer
    sgd = SGD(lr=lr, decay=decay, momentum=momentum, nesterov=True)
    model.compile(loss='categorical_crossentropy', optimizer=sgd)
    return model


# Training: fit the model and save the weights
def get_train_model(model, X_train, Y_train, X_val, Y_val):
    model.fit(X_train, Y_train, batch_size=batch_size, epochs=epochs,
              verbose=1, validation_data=(X_val, Y_val))
    # Save the weights
    model.save_weights('model_weights.h5', overwrite=True)
    return model


# Testing: load the saved weights and evaluate
def get_test_model(model, X, Y):
    model.load_weights('model_weights.h5')
    score = model.evaluate(X, Y, verbose=0)
    return score


# [start]
epochs = 35          # number of training epochs
batch_size = 40      # 40 samples per batch, i.e. 320/40 = 8 batches per epoch
img_rows, img_cols = 57, 47         # size of each face image
nb_filters1, nb_filters2 = 20, 40   # number of kernels (output channels) of the two conv layers

if __name__ == '__main__':
    # Split each person's 10 images 8:1:1 into training, validation and test sets
    (X_train, y_train), (X_val, y_val), (X_test, y_test) = get_load_data('olivettifaces.gif')

    if K.image_data_format() == 'channels_first':    # 1 is the image depth (grayscale)
        X_train = X_train.reshape(X_train.shape[0], 1, img_rows, img_cols)
        X_val = X_val.reshape(X_val.shape[0], 1, img_rows, img_cols)
        X_test = X_test.reshape(X_test.shape[0], 1, img_rows, img_cols)
        input_shape = (1, img_rows, img_cols)
    else:
        X_train = X_train.reshape(X_train.shape[0], img_rows, img_cols, 1)
        X_val = X_val.reshape(X_val.shape[0], img_rows, img_cols, 1)
        X_test = X_test.reshape(X_test.shape[0], img_rows, img_cols, 1)
        input_shape = (img_rows, img_cols, 1)
    print('X_train shape:', X_train.shape)

    # convert class vectors to binary class matrices
    Y_train = np_utils.to_categorical(y_train, 40)
    Y_val = np_utils.to_categorical(y_val, 40)
    Y_test = np_utils.to_categorical(y_test, 40)

    # Training: fit the model and save the weights
    model = get_set_model()
    get_train_model(model, X_train, Y_train, X_val, Y_val)
    score = get_test_model(model, X_test, Y_test)

    # Testing: load the weights, get the accuracy and the predicted classes
    model.load_weights('model_weights.h5')
    classes = model.predict_classes(X_test, verbose=0)
    test_accuracy = np.mean(np.equal(y_test, classes))
    print("last accuracy:", test_accuracy)
    for i in range(0, 40):
        if y_test[i] != classes[i]:
            print(y_test[i], 'was misclassified as', classes[i])

All of the code for this post has been uploaded; you can find it here.
-----------------------------------------------------------------------------------------------------------------------------------------

IV. Theano source code

**Note:** this code comes from a predecessor online, applied to the olivettifaces face database, and is reproduced here with due respect; as its own docstring states, it is built on Theano. It is fairly old code and some of its calls no longer match current APIs; I have tidied it up, so it should run and produce results. Only the source is attached here; interested readers can study it via the link above.

4.1: The training program

Create a train_CNN.py file, put olivettifaces.gif in the same directory, and write the following code into train_CNN.py:

# -*- coding:utf-8 -*-
"""
This program is built on python + numpy + theano + PIL and uses a LeNet5-like CNN model applied to the
olivettifaces face database to perform face recognition; the model's error drops below 5%.
It is only a toy implementation from a personal learning process; the model may overfit, and with so few
samples there is no way to verify that.
Still, the program is meant to lay out the concrete steps of developing a CNN model, especially for image
recognition: from obtaining an image database to building a CNN model for it. I think it is a useful
reference for that workflow.
@author:wepon(http://2hwp.com)
Article explaining this code: http://blog.csdn.net/u012162613/article/details/43277187
"""
import os
import sys
import time

import numpy as np
from PIL import Image

import theano
import theano.tensor as T
from theano.tensor.signal.pool import pool_2d
from theano.tensor.nnet import conv

"""
Loads the olivettifaces image data; dataset_path is the path to the olivettifaces image.
After loading, the data is split into train_data, valid_data and test_data,
and the function returns those three sets together with their labels.
"""
def get_data(dataset_path):
    img = Image.open(dataset_path)
    img_ndarray = np.asarray(img, dtype='float64') / 256
    faces = np.empty((400, 2679))
    for row in range(20):
        for column in range(20):
            faces[row * 20 + column] = np.ndarray.flatten(
                img_ndarray[row * 57:(row + 1) * 57, column * 47:(column + 1) * 47])

    label = np.empty(400)
    for i in range(40):
        label[i * 10:i * 10 + 10] = i
    label = label.astype(np.int)

    # Split into training, validation and test sets of the sizes below
    train_data = np.empty((320, 2679))
    train_label = np.empty(320)
    valid_data = np.empty((40, 2679))
    valid_label = np.empty(40)
    test_data = np.empty((40, 2679))
    test_label = np.empty(40)
    for i in range(40):
        train_data[i * 8:i * 8 + 8] = faces[i * 10:i * 10 + 8]
        train_label[i * 8:i * 8 + 8] = label[i * 10:i * 10 + 8]
        valid_data[i] = faces[i * 10 + 8]
        valid_label[i] = label[i * 10 + 8]
        test_data[i] = faces[i * 10 + 9]
        test_label[i] = label[i * 10 + 9]

    # Wrap the datasets in theano shared variables so they can be copied to the GPU
    def shared_dataset(data_x, data_y, borrow=True):
        shared_x = theano.shared(np.asarray(data_x, dtype=theano.config.floatX), borrow=borrow)
        shared_y = theano.shared(np.asarray(data_y, dtype=theano.config.floatX), borrow=borrow)
        return shared_x, T.cast(shared_y, 'int32')

    train_set_x, train_set_y = shared_dataset(train_data, train_label)
    valid_set_x, valid_set_y = shared_dataset(valid_data, valid_label)
    test_set_x, test_set_y = shared_dataset(test_data, test_label)
    rval = [(train_set_x, train_set_y), (valid_set_x, valid_set_y), (test_set_x, test_set_y)]
    return rval


# Classifier, i.e. the last CNN layer: logistic regression (softmax)
class LogisticRegression(object):
    def __init__(self, input, n_in, n_out):
        self.W = theano.shared(
            value=np.zeros((n_in, n_out), dtype=theano.config.floatX),
            name='W', borrow=True)
        self.b = theano.shared(
            value=np.zeros((n_out,), dtype=theano.config.floatX),
            name='b', borrow=True)
        self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)
        self.y_pred = T.argmax(self.p_y_given_x, axis=1)
        self.params = [self.W, self.b]

    def negative_log_likelihood(self, y):
        return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])

    def errors(self, y):
        if y.ndim != self.y_pred.ndim:
            raise TypeError('y should have the same shape as self.y_pred',
                            ('y', y.type, 'y_pred', self.y_pred.type))
        if y.dtype.startswith('int'):
            return T.mean(T.neq(self.y_pred, y))
        else:
            raise NotImplementedError()


# Fully connected layer, the layer before the classifier
class HiddenLayer(object):
    def __init__(self, rng, input, n_in, n_out, W=None, b=None, activation=T.tanh):
        self.input = input
        if W is None:
            W_values = np.asarray(
                rng.uniform(low=-np.sqrt(6. / (n_in + n_out)),
                            high=np.sqrt(6. / (n_in + n_out)),
                            size=(n_in, n_out)),
                dtype=theano.config.floatX)
            if activation == theano.tensor.nnet.sigmoid:
                W_values *= 4
            W = theano.shared(value=W_values, name='W', borrow=True)
        if b is None:
            b_values = np.zeros((n_out,), dtype=theano.config.floatX)
            b = theano.shared(value=b_values, name='b', borrow=True)
        self.W = W
        self.b = b
        lin_output = T.dot(input, self.W) + self.b
        self.output = (lin_output if activation is None
                       else activation(lin_output))
        # parameters of the model
        self.params = [self.W, self.b]


# Convolution + pooling layer (conv + maxpooling)
class LeNetConvPoolLayer(object):
    def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
        assert image_shape[1] == filter_shape[1]
        self.input = input
        fan_in = np.prod(filter_shape[1:])
        fan_out = (filter_shape[0] * np.prod(filter_shape[2:]) / np.prod(poolsize))
        # initialize weights with random weights
        W_bound = np.sqrt(6. / (fan_in + fan_out))
        self.W = theano.shared(
            np.asarray(rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                       dtype=theano.config.floatX),
            borrow=True)
        # the bias is a 1D tensor -- one bias per output feature map
        b_values = np.zeros((filter_shape[0],), dtype=theano.config.floatX)
        self.b = theano.shared(value=b_values, borrow=True)
        # convolution
        conv_out = conv.conv2d(
            input=input,
            filters=self.W,
            image_shape=image_shape,
            filter_shape=filter_shape,
        )
        # subsampling
        pooled_out = pool_2d(
            input=conv_out,
            ws=poolsize,
            ignore_border=True)
        self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))
        # store parameters of this layer
        self.params = [self.W, self.b]


# Save the trained parameters to disk
def save_params(param1, param2, param3, param4):
    import pickle
    write_file = open('params.pkl', 'wb')
    pickle.dump(param1, write_file, -1)
    pickle.dump(param2, write_file, -1)
    pickle.dump(param3, write_file, -1)
    pickle.dump(param4, write_file, -1)
    write_file.close()


"""
The basic CNN building blocks are defined above; the function below applies them to the olivettifaces
dataset. The model follows LeNet and is trained with minibatch SGD, which is why many shapes carry
batch_size, e.g. image_shape=(batch_size, 1, 57, 47).
Tunable parameters:
  batch_size -- note that n_train_batches, n_valid_batches and n_test_batches all depend on it
  nkerns=[5, 10] -- the number of kernels of the first and second conv layers
  n_out of the fully connected HiddenLayer -- change the classifier's n_in accordingly
  and, importantly, the learning rate learning_rate.
"""
def main(learning_rate=0.05, n_epochs=200, dataset='olivettifaces.gif',
         nkerns=[5, 10], batch_size=40):
    # random number generator used to initialize the parameters
    rng = np.random.RandomState(23455)

    # load the data: training, validation and test sets
    datasets = get_data(dataset)
    train_set_x, train_set_y = datasets[0]
    valid_set_x, valid_set_y = datasets[1]
    test_set_x, test_set_y = datasets[2]

    # number of minibatches in each set
    n_train_batches = train_set_x.get_value(borrow=True).shape[0]
    n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
    n_test_batches = test_set_x.get_value(borrow=True).shape[0]
    n_train_batches /= batch_size
    n_valid_batches /= batch_size
    n_test_batches /= batch_size

    # define a few symbolic variables; x holds the face data and feeds layer0
    index = T.lscalar()
    x = T.matrix('x')
    y = T.ivector('y')

    ######################
    # Build the CNN:
    # input + layer0 (LeNetConvPoolLayer) + layer1 (LeNetConvPoolLayer)
    #       + layer2 (HiddenLayer) + layer3 (LogisticRegression)
    ######################
    print('... building the model')

    # Reshape matrix of rasterized images of shape (batch_size, 57 * 47)
    # to a 4D tensor, compatible with our LeNetConvPoolLayer
    # (57, 47) is the size of the images.
    layer0_input = x.reshape((batch_size, 1, 57, 47))

    # First conv + maxpool layer
    # after convolution: (57-5+1, 47-5+1) = (53, 43)
    # after maxpooling: (53/2, 43/2) = (26, 21), because the border is ignored
    # 4D output tensor is thus of shape (batch_size, nkerns[0], 26, 21)
    layer0 = LeNetConvPoolLayer(
        rng,
        input=layer0_input,
        image_shape=(batch_size, 1, 57, 47),
        filter_shape=(nkerns[0], 1, 5, 5),
        poolsize=(2, 2))

    # Second conv + maxpool layer; input is the previous output (batch_size, nkerns[0], 26, 21)
    # after convolution: (26-5+1, 21-5+1) = (22, 17)
    # after maxpooling: (22/2, 17/2) = (11, 8), because the border is ignored
    # 4D output tensor is thus of shape (batch_size, nkerns[1], 11, 8)
    layer1 = LeNetConvPoolLayer(
        rng,
        input=layer0.output,
        image_shape=(batch_size, nkerns[0], 26, 21),
        filter_shape=(nkerns[1], nkerns[0], 5, 5),
        poolsize=(2, 2))

    # HiddenLayer: fully connected; its input has shape (batch_size, num_pixels), i.e. each
    # sample's feature maps from layer0/layer1 are flattened into one long vector per row,
    # so the previous output (batch_size, nkerns[1], 11, 8) becomes (batch_size, nkerns[1] * 11 * 8)
    layer2_input = layer1.output.flatten(2)
    layer2 = HiddenLayer(
        rng,
        input=layer2_input,
        n_in=nkerns[1] * 11 * 8,
        n_out=2000,    # number of output units of the fully connected layer, user-defined and tunable
        activation=T.tanh)

    # classifier: n_in equals the fully connected layer's output size, n_out equals the 40 classes
    layer3 = LogisticRegression(input=layer2.output, n_in=2000, n_out=40)

    ###############
    # Define the pieces of the optimization: cost function, train/validate/test models,
    # and the parameter update rule (gradient descent)
    ###############
    # cost function
    cost = layer3.negative_log_likelihood(y)

    test_model = theano.function(
        [index],
        layer3.errors(y),
        givens={
            x: test_set_x[index * batch_size: (index + 1) * batch_size],
            y: test_set_y[index * batch_size: (index + 1) * batch_size]})

    validate_model = theano.function(
        [index],
        layer3.errors(y),
        givens={
            x: valid_set_x[index * batch_size: (index + 1) * batch_size],
            y: valid_set_y[index * batch_size: (index + 1) * batch_size]})

    # all parameters
    params = layer3.params + layer2.params + layer1.params + layer0.params
    # gradient of the cost w.r.t. each parameter
    grads = T.grad(cost, params)
    # parameter update rule
    updates = [(param_i, param_i - learning_rate * grad_i)
               for param_i, grad_i in zip(params, grads)]

    # train_model updates the parameters with minibatch SGD
    train_model = theano.function(
        [index],
        cost,
        updates=updates,
        givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]})

    ###############
    # Train the CNN and look for the best parameters.
    ###############
    print('... training')
    # In LeNet5: batch_size=500, n_train_batches=50000/500=100, patience=10000.
    # For olivettifaces: batch_size=40, n_train_batches=320/40=8, so patience is set to 800;
    # it can be tuned to the situation, and making it a bit larger does no harm.
    patience = 800
    patience_increase = 2
    improvement_threshold = 0.99
    validation_frequency = min(n_train_batches, patience / 2)
    best_validation_loss = np.inf
    best_iter = 0
    test_score = 0.
    start_time = time.clock()

    epoch = 0
    done_looping = False
    while (epoch < n_epochs) and (not done_looping):
        epoch = epoch + 1
        for minibatch_index in range(int(n_train_batches)):
            iter = (epoch - 1) * n_train_batches + minibatch_index
            if iter % 100 == 0:
                print('training @ iter = ', iter)
            cost_ij = train_model(minibatch_index)

            if (iter + 1) % validation_frequency == 0:
                # compute zero-one loss on validation set
                validation_losses = [validate_model(i) for i in range(int(n_valid_batches))]
                this_validation_loss = np.mean(validation_losses)
                print('epoch %i, minibatch %i/%i, validation error %f %%' %
                      (epoch, minibatch_index + 1, n_train_batches,
                       this_validation_loss * 100.))

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:
                    # improve patience if loss improvement is good enough
                    if this_validation_loss < best_validation_loss * improvement_threshold:
                        patience = max(patience, iter * patience_increase)
                    # save best validation score and iteration number
                    best_validation_loss = this_validation_loss
                    best_iter = iter
                    save_params(layer0.params, layer1.params, layer2.params, layer3.params)  # save parameters

                    # test it on the test set
                    test_losses = [test_model(i) for i in range(int(n_test_batches))]
                    test_score = np.mean(test_losses)
                    print(('     epoch %i, minibatch %i/%i, test error of '
                           'best model %f %%') %
                          (epoch, minibatch_index + 1, n_train_batches,
                           test_score * 100.))

            if patience <= iter:
                done_looping = True
                break

    end_time = time.clock()
    print('Optimization complete.')
    print('Best validation score of %f %% obtained at iteration %i, '
          'with test performance %f %%' %
          (best_validation_loss * 100., best_iter + 1, test_score * 100.))
    print('The code for file ' + os.path.split(__file__)[1] +
          ' ran for %.2fm' % ((end_time - start_time) / 60.), file=sys.stderr)


if __name__ == '__main__':
    main()

The training program produces a .pkl file (params.pkl) that stores the trained parameters.
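
A minimal sketch (assuming training has finished, params.pkl is in the working directory and Theano is importable, since the pickled objects are Theano shared variables) of how that saved file can be inspected:

import pickle

# The four pickled objects are the [W, b] pairs of layer0..layer3
with open('params.pkl', 'rb') as f:
    layer0_params = pickle.load(f)
    layer1_params = pickle.load(f)
    layer2_params = pickle.load(f)
    layer3_params = pickle.load(f)

# W of the first conv layer; with nkerns=[5, 10] its shape is (5, 1, 5, 5)
print(layer0_params[0].get_value().shape)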
4.2: The test program

In the same directory, create a use_CNN.py file and write in the following code:

# -*-coding:utf8-*-#
"""
本程序实现的功能:
在train_CNN_olivettifaces.py中我们训练好并保存了模型的参数,利用这些保存下来的参数来初始化CNN模型,
这样就得到一个可以使用的CNN系统,将人脸图输入这个CNN系统,预测人脸图的类别。
@author:wepon(http://2hwp.com)
讲解这份代码的文章:http://blog.csdn.net/u012162613/article/details/43277187
"""import os
import sys
import pickleimport numpy
from PIL import Imageimport theano
import theano.tensor as T
from theano.tensor.signal.pool import pool_2d
from theano.tensor.nnet import conv# 读取之前保存的训练参数
# layer0_params~layer3_params都是包含W和b的,layer*_params[0]是W,layer*_params[1]是b
def load_params(params_file):f = open(params_file, 'rb')layer0_params = pickle.load(f)layer1_params = pickle.load(f)layer2_params = pickle.load(f)layer3_params = pickle.load(f)f.close()return layer0_params, layer1_params, layer2_params, layer3_params# 读取图像,返回numpy.array类型的人脸数据以及对应的label
def load_data(dataset_path):img = Image.open(dataset_path)img_ndarray = numpy.asarray(img, dtype='float64') / 256faces = numpy.empty((400, 2679))for row in range(20):for column in range(20):faces[row * 20 + column] = numpy.ndarray.flatten(img_ndarray[row * 57:(row + 1) * 57, column * 47:(column + 1) * 47])label = numpy.empty(400)for i in range(40):label[i * 10:i * 10 + 10] = ilabel = label.astype(numpy.int)return faces, label"""
train_CNN_olivettifaces中的LeNetConvPoolLayer、HiddenLayer、LogisticRegression是随机初始化的
下面将它们定义为可以用参数来初始化的版本
"""class LogisticRegression(object):def __init__(self, input, params_W, params_b, n_in, n_out):self.W = params_Wself.b = params_bself.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)self.y_pred = T.argmax(self.p_y_given_x, axis=1)self.params = [self.W, self.b]def negative_log_likelihood(self, y):return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])def errors(self, y):if y.ndim != self.y_pred.ndim:raise TypeError('y should have the same shape as self.y_pred',('y', y.type, 'y_pred', self.y_pred.type))if y.dtype.startswith('int'):return T.mean(T.neq(self.y_pred, y))else:raise NotImplementedError()class HiddenLayer(object):def __init__(self, input, params_W, params_b, n_in, n_out,activation=T.tanh):self.input = inputself.W = params_Wself.b = params_blin_output = T.dot(input, self.W) + self.bself.output = (lin_output if activation is Noneelse activation(lin_output))self.params = [self.W, self.b]# 卷积+采样层(conv+maxpooling)
class LeNetConvPoolLayer(object):def __init__(self, input, params_W, params_b, filter_shape, image_shape, poolsize=(2, 2)):assert image_shape[1] == filter_shape[1]self.input = inputself.W = params_Wself.b = params_b# 卷积conv_out = conv.conv2d(input=input,filters=self.W,filter_shape=filter_shape,image_shape=image_shape)# 子采样pooled_out = pool_2d(input=conv_out,ws=poolsize,ignore_border=True)self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))self.params = [self.W, self.b]"""
用之前保存下来的参数初始化CNN,就得到了一个训练好的CNN模型,然后使用这个模型来测图像
注意:n_kerns跟之前训练的模型要保持一致。dataset是你要测试的图像的路径,params_file是之前训练时保存的参数文件的路径
"""def use_CNN(dataset='olivettifaces.gif', params_file='params.pkl', nkerns=[5, 10]):# 读取测试的图像,这里读取整个olivettifaces.gif,即全部样本,得到faces、labelfaces, label = load_data(dataset)face_num = faces.shape[0]  # 有多少张人脸图# 读入参数layer0_params, layer1_params, layer2_params, layer3_params = load_params(params_file)x = T.matrix('x')  # 用变量x表示输入的人脸数据,作为layer0的输入####################### 用读进来的参数初始化各层参数W、b######################layer0_input = x.reshape((face_num, 1, 57, 47))layer0 = LeNetConvPoolLayer(input=layer0_input,params_W=layer0_params[0],params_b=layer0_params[1],image_shape=(face_num, 1, 57, 47),filter_shape=(nkerns[0], 1, 5, 5),poolsize=(2, 2))layer1 = LeNetConvPoolLayer(input=layer0.output,params_W=layer1_params[0],params_b=layer1_params[1],image_shape=(face_num, nkerns[0], 26, 21),filter_shape=(nkerns[1], nkerns[0], 5, 5),poolsize=(2, 2))layer2_input = layer1.output.flatten(2)layer2 = HiddenLayer(input=layer2_input,params_W=layer2_params[0],params_b=layer2_params[1],n_in=nkerns[1] * 11 * 8,n_out=2000,activation=T.tanh)layer3 = LogisticRegression(input=layer2.output, params_W=layer3_params[0], params_b=layer3_params[1], n_in=2000,n_out=40)# 定义theano.function,让x作为输入,layer3.y_pred(即预测的类别)作为输出f = theano.function([x],  # funtion 的输入必须是list,即使只有一个输入layer3.y_pred)# 预测的类别predpred = f(faces)# 将预测的类别pred与真正类别label对比,输出错分的图像for i in range(face_num):if pred[i] != label[i]:print('picture: %i is person %i, but mis-predicted as person %i' % (i, label[i], pred[i]))if __name__ == '__main__':use_CNN()"""一点笔记,对theano.function的理解,不一定正确,后面深入理解了再回头看看
在theano里面,必须通过function定义输入x和输出,然后调用function,才会开始计算,比如在use_CNN里面,在定义layer0时,即使将faces作为输入,将layer1~layer3定义好后,也无法直接用layer3.y_pred来获得所属类别。
因为在theano中,layer0~layer3只是一种“图”关系,我们定义了layer0~layer3,也只是创建了这种图关系,但是如果没有funtion,它是不会计算的。
这也是为什么要定义x的原因:x = T.matrix('x')
然后将变量x作为layer0的输入。
最后,定义一个function:
f = theano.function([x],    #funtion 的输入必须是list,即使只有一个输入layer3.y_pred)
将x作为输入,layer3.y_pred作为输出。
当调用f(faces)时,就获得了预测值
"""

Run it and you will get the results; I won't go into the details here. If there are any mistakes, please point them out.
-----------------------------------------------------------------------------------------------------------------------------------------

All of the code for this post has been uploaded; you can find it here.
-----------------------------------------------------------------------------------------------------------------------------------------