Implementing a 3D-Convolution U-Net in TensorFlow 2.6 for Brain Tumor Segmentation, with Model Parallelism

  • Overview
  • The U-Net network
    • Network structure
    • Code
  • Model training
    • Training environment
    • Data loading and preprocessing
    • Training
    • Training results
  • Model-parallel version
    • Splitting the model
    • Code
    • Problems encountered

Overview

The network architecture and experiment code below are taken from my undergraduate thesis project. Building on that, the two halves of the model are placed on two GPUs for model-parallel training, but the model-parallel version did not deliver the expected benefit. If an experienced reader spots the mistake, I would be grateful to have it pointed out. Thanks.

The U-Net network

U-Net is a convolutional neural network in which sampling is carried out by convolution operations. It was proposed in 2015 in the paper U-Net: Convolutional Networks for Biomedical Image Segmentation.

Network structure

The U-Net is implemented with 3D convolutions because each sample is a 4-dimensional MRI volume, and a model implemented in TensorFlow 2 takes a batch of such volumes as input. The model input is therefore a 5-dimensional tensor with shape (number of volumes, volume dim 1, volume dim 2, volume dim 3, volume dim 4).
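
For instance, stacking two of these 4-D volumes yields the 5-D model input (a minimal sketch; the per-volume shape (240, 240, 155, 4) is the one used throughout this post):

    import tensorflow as tf

    # A single MRI volume: 240 x 240 x 155 voxels with 4 modalities (channels).
    volume = tf.zeros((240, 240, 155, 4))

    # A batch of two volumes is the 5-D tensor the model consumes:
    # (number of volumes, dim 1, dim 2, dim 3, dim 4).
    batch = tf.stack([volume, volume])
    print(batch.shape)  # (2, 240, 240, 155, 4)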

The network structure is shown below.

Figure 1: The 3D-convolution version of the U-Net built here

Stage    | #  | Layer              | Kernel shape | Output tensor shape
Encoder  | 1  | batchnormalization |              | (batch size, 240, 240, 155, 4)
         | 2  | conv3d             | (3,3,3)      | (batch size, 240, 240, 155, 8)
         | 3  | batchnormalization |              | (batch size, 240, 240, 155, 8)
         | 4  | conv3d             | (3,3,3)      | (batch size, 240, 240, 155, 16)
         | 5  | batchnormalization |              | (batch size, 240, 240, 155, 16)
         | 6  | conv3d             | (3,3,2)      | (batch size, 238, 238, 155, 16)
         | 7  | batchnormalization |              | (batch size, 238, 238, 155, 16)
         | 8  | conv3d             | (3,3,1)      | (batch size, 118, 118, 77, 32)
         | 9  | batchnormalization |              | (batch size, 118, 118, 77, 32)
         | 10 | conv3d             | (3,3,1)      | (batch size, 58, 58, 39, 64)
         | 11 | batchnormalization |              | (batch size, 58, 58, 39, 64)
         | 12 | maxpooling3d       | (2,2,1)      | (batch size, 29, 29, 39, 64)
Decoder  | 13 | batchnormalization |              | (batch size, 29, 29, 39, 64)
         | 14 | upsampling3d       | (2,2,1)      | (batch size, 58, 58, 39, 64)
         | 15 | conv3dTranspose    | (3,3,1)      | (batch size, 58, 58, 39, 32)
         | 16 | concat             |              | (batch size, 58, 58, 39, 128)
         | 17 | batchnormalization |              | (batch size, 58, 58, 39, 32)
         | 18 | upsampling3d       | (2,2,2)      | (batch size, 116, 116, 78, 64)
         | 19 | conv3dTranspose    | (3,3,1)      | (batch size, 118, 118, 78, 32)
         | 20 | conv3d             | (3,3,3)      | (batch size, 116, 116, 76, 32)
         | 21 | batchnormalization |              | (batch size, 116, 116, 76, 32)
         | 22 | conv3dTranspose    | (3,3,2)      | (batch size, 118, 118, 77, 16)
         | 23 | concat             |              | (batch size, 118, 118, 77, 32)
         | 24 | batchnormalization |              | (batch size, 118, 118, 77, 16)
         | 25 | upsampling3d       | (2,2,2)      | (batch size, 236, 236, 154, 16)
         | 26 | conv3dTranspose    | (3,3,1)      | (batch size, 238, 238, 154, 16)
         | 27 | concat             |              | (batch size, 238, 238, 154, 32)
         | 28 | batchnormalization |              | (batch size, 238, 238, 154, 16)
         | 29 | conv3dTranspose    | (3,3,1)      | (batch size, 240, 240, 154, 8)
         | 30 | conv3dTranspose    | (1,1,5)      | (batch size, 240, 240, 158, 4)
         | 31 | conv3d             | (1,1,4)      | (batch size, 240, 240, 155, 1)

Table 1: Per-layer information for the 3D-convolution U-Net built here

Code

  1. Load the framework

    from tensorflow.keras.layers import BatchNormalization,Conv3D,MaxPooling3D,Conv3DTranspose,UpSampling3D
    import tensorflow as tf
    
  2. Encoder
    class unet_encoder(tf.keras.Model):
        def __init__(self):
            super(unet_encoder, self).__init__()
            self.b1 = BatchNormalization()
            self.conv1 = Conv3D(8, 3, activation='relu', padding='same')
            self.b2 = BatchNormalization()
            self.conv2 = Conv3D(16, 3, activation='relu', padding='same')
            self.b3 = BatchNormalization()
            self.conv3 = Conv3D(16, (3, 3, 2), activation='relu')
            self.b4 = BatchNormalization()
            self.conv4 = Conv3D(32, (3, 3, 1), activation='relu', strides=2)
            self.b5 = BatchNormalization()
            self.conv5 = Conv3D(64, (3, 3, 1), activation='relu', strides=2)
            self.b6 = BatchNormalization()
            self.maxpool1 = MaxPooling3D((2, 2, 1))

        def call(self, x, features):
            x = self.b1(x)
            x = self.conv1(x)
            x = self.b2(x)
            x = self.conv2(x)
            x = self.b3(x)
            # First skip-connection feature map
            x = self.conv3(x)
            x = self.b4(x)
            features.append(x)
            # Second skip-connection feature map
            x = self.conv4(x)
            x = self.b5(x)
            features.append(x)
            # Third skip-connection feature map
            x = self.conv5(x)
            x = self.b6(x)
            features.append(x)
            # Output
            outputs = self.maxpool1(x)
            return outputs
    
  3. Decoder
    class unet_decoder(tf.keras.Model):
        def __init__(self):
            super(unet_decoder, self).__init__()
            self.b1 = BatchNormalization()
            self.up1 = UpSampling3D((2, 2, 1))
            self.conv1tp = Conv3DTranspose(64, (3, 3, 1), activation='relu', padding='same')
            self.b2 = BatchNormalization()
            self.up2 = UpSampling3D((2, 2, 2))
            self.conv2tp = Conv3DTranspose(32, (3, 3, 1), activation='relu')
            self.conv2 = Conv3D(32, 3, activation='relu')
            self.b3 = BatchNormalization()
            self.conv3tp = Conv3DTranspose(16, (3, 3, 2), activation='relu')
            self.b4 = BatchNormalization()
            self.up4 = UpSampling3D((2, 2, 2))
            self.conv4tp = Conv3DTranspose(16, (3, 3, 1), activation='relu')
            self.b5 = BatchNormalization()
            self.conv5tp = Conv3DTranspose(8, (3, 3, 1), activation='relu')
            self.conv6tp = Conv3DTranspose(4, (1, 1, 5), activation='relu')
            self.conv_out = Conv3D(1, (1, 1, 4), activation='relu')

        def call(self, x, features):
            x = self.b1(x)
            x = self.up1(x)
            x = self.conv1tp(x)
            x = tf.concat((features[-1], x), axis=-1)
            x = self.b2(x)
            x = self.up2(x)
            x = self.conv2tp(x)
            x = self.conv2(x)
            x = self.b3(x)
            x = self.conv3tp(x)
            x = tf.concat((features[-2], x), axis=-1)
            x = self.b4(x)
            x = self.up4(x)
            x = self.conv4tp(x)
            x = tf.concat((features[-3], x), axis=-1)
            x = self.b5(x)
            x = self.conv5tp(x)
            x = self.conv6tp(x)
            x = self.conv_out(x)
            outputs = x
            return outputs
    
  4. Wrapper class that combines the two parts (a usage sketch follows this list)
    class Unet3D(tf.keras.Model):
        def __init__(self, encoder, decoder):
            super(Unet3D, self).__init__()
            # Skip-connection feature maps collected by the encoder. Note that this
            # list grows by three entries on every forward pass; the decoder only
            # reads the last three.
            self.features = []
            self.encoder = encoder
            self.decoder = decoder

        def call(self, x):
            x = self.encoder(x, self.features)
            outputs = self.decoder(x, self.features)
            return outputs
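
As a usage sketch (not part of the original code), the model can be built and given one dummy volume to sanity-check the output shape; this assumes enough memory for a single full-resolution volume:

    import tensorflow as tf

    # Build the full model from the two sub-models.
    unet = Unet3D(unet_encoder(), unet_decoder())
    unet.build(input_shape=(None, 240, 240, 155, 4))

    # One random volume in, one segmentation volume out.
    x = tf.random.normal((1, 240, 240, 155, 4))
    y = unet(x)
    print(y.shape)  # expected per Table 1: (1, 240, 240, 155, 1)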
    

Model training

Training environment

Software   | Version
Python     | 3.8.11
TensorFlow | 2.6.0-gpu
CUDA       | 11.2
cuDNN      | 8.1.0
nibabel    | 3.2.2

Table 2: Main software used

Processor | Model                   | VRAM
GPU       | NVIDIA GeForce RTX 3090 | 24 GB

Table 3: GPU information

Data loading and preprocessing

  1. Data source
    The MSD brain tumor dataset (Baidu PaddlePaddle AI Studio).

  2. Volume rescaling
    Nearest-neighbour interpolation is used to rescale selected dimensions of a volume, so that the input tensor shape cannot differ from the input shape the model was designed for.

    import nibabel as nib
    import numpy as np
    def nearest_4d(img, size):
        res = np.zeros(size)
        for i in range(res.shape[0]):
            for j in range(res.shape[1]):
                for k in range(res.shape[2]):
                    idx = i * img.shape[0] // res.shape[0]
                    idy = j * img.shape[1] // res.shape[1]
                    idz = k * img.shape[2] // res.shape[2]
                    res[i, j, k, :] = img[idx, idy, idz, :]
        return res
    
  3. Data generator
    A generator and an iterator are used to read a fixed number of volumes at a time from disk into memory; a usage sketch follows the class definitions below.

    # Read samples lazily, as an iterator, from their file paths
    class DataIterator:
        def __init__(self, image_paths, label_paths, size=None, transp_shape=[0, 1, 2, 3], mode='nib'):
            self.image_paths = image_paths
            self.label_paths = label_paths
            self.size = size
            self.transp = transp_shape
            self.mode = mode

        def read_and_resize(self, img_path, lbl_path):
            if self.mode == 'nib':
                img = nib.load(img_path)
                lbl = nib.load(lbl_path)
                img = img.get_fdata(caching='fill', dtype='float32')
                lbl = lbl.get_fdata(caching='fill', dtype='float32')
            elif self.mode == 'np':
                img = np.load(img_path)
                lbl = np.load(lbl_path)
            else:
                return None, None
            # Normalize to [0, 1]
            img /= np.max(img)
            lbl /= np.max(lbl)
            img = img.transpose(self.transp)
            if len(lbl.shape) < len(img.shape):
                lbl = np.expand_dims(lbl, axis=-1)
            lbl = lbl.transpose(self.transp)
            if self.size is not None:
                if len(self.size) == 3:
                    # nearest_3d is the 3-D counterpart of nearest_4d (defined elsewhere)
                    img = nearest_3d(img, self.size)
                    lbl = nearest_3d(lbl, self.size)
                else:
                    img = nearest_4d(img, self.size)
                    lbl = nearest_4d(lbl, self.size)
            return img, lbl

        def __iter__(self):
            for img_path, lbl_path in zip(self.image_paths, self.label_paths):
                img, lbl = self.read_and_resize(img_path, lbl_path)
                if isinstance(img, np.ndarray) and isinstance(lbl, np.ndarray):
                    yield (img, lbl)
                else:
                    return
    # Data generator. The training labels have one dimension fewer than the images,
    # so dimensions are expanded before the batch is returned.
    class DataGenerator:
        def __init__(self, image_paths, label_paths, size=None, batch_size=32, transp_shape=[0, 1, 2, 3], mode='nib'):
            dataiter = DataIterator(image_paths, label_paths, size, transp_shape, mode)
            self.batch_size = batch_size
            self.dataiter = iter(dataiter)

        def __iter__(self):
            while 1:
                i = 0
                imgs = []
                lbls = []
                for img, lbl in self.dataiter:
                    imgs.append(img)
                    lbls.append(lbl)
                    i += 1
                    if i >= self.batch_size:
                        break
                if i == 0:
                    break
                imgs = np.stack(imgs)
                lbls = np.stack(lbls)
                if len(imgs.shape) < 5:
                    imgs = np.expand_dims(imgs, axis=-1)
                    lbls = np.expand_dims(lbls, axis=-1)
                yield (imgs, lbls)
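
A usage sketch for the generator (the directory layout mirrors the training script below; passing a size such as (240, 240, 155, 4) would force resizing through nearest_4d):

    import os

    image_dir = './data/train/'
    label_dir = './data/labels/'
    # Assumes image and label files are named so that sorting pairs them up.
    image_paths = [image_dir + p for p in sorted(os.listdir(image_dir))]
    label_paths = [label_dir + p for p in sorted(os.listdir(label_dir))]

    # batch_size=1 matches the training configuration; size=None keeps the native shape.
    gen = DataGenerator(image_paths, label_paths, size=None, batch_size=1,
                        transp_shape=[0, 1, 2, 3], mode='nib')
    for imgs, lbls in gen:
        print(imgs.shape, lbls.shape)  # e.g. (1, 240, 240, 155, 4) (1, 240, 240, 155, 1)
        break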
    

Training

  1. Load dependencies

    import tensorflow as tf
    from tensorflow.keras import losses,optimizers
    from model import unet_encoder, unet_decoder, Unet3D
    from DataGenerator import DataGenerator
    from datetime import datetime
    from time import time
    import os
    
  2. Prepare the data paths
    # Data paths
    image_dir_path = './data/train/'
    label_dir_path = './data/labels/'
    images_paths = os.listdir(image_dir_path)
    labels_paths = os.listdir(label_dir_path)
    image_paths = [image_dir_path+p for p in images_paths]
    label_paths = [label_dir_path+p for p in labels_paths]
    
  3. Log files
    # Log files (per-epoch and per-step records)
    log1 = open('./log/epoch_file_form','w',encoding='utf-8')
    log2 = open('./log/step_file_form','w',encoding='utf-8')
    date_mark = str(datetime.now())
    log1.write(date_mark+'\n')
    log2.write(date_mark+'\n')
    
  4. Model definition
    # Model definition
    encoder_model = unet_encoder()
    decoder_model = unet_decoder()
    unet = Unet3D(encoder_model,decoder_model)
    unet.build(input_shape=(None,240,240,155,4))
    unet.summary()
    
  5. Optimizer and loss function
    The Adam optimizer is used with a learning rate of 1e-5, and the loss is binary cross-entropy.

    # Optimizer and loss function
    optimizer = optimizers.Adam(learning_rate=1e-5)
    losser = losses.BinaryCrossentropy()
    
  6. Training loop
    # Training
    epochs = 30
    s1 = time()
    for i in range(epochs):
        s2 = time()
        loss_sum = 0
        step = 0
        datagener = iter(DataGenerator(image_paths, label_paths, None, 1, [0, 1, 2, 3]))
        for batch in datagener:
            s3 = time()
            step += 1
            x = batch[0]
            y = batch[1]
            with tf.GradientTape() as tape:
                out = unet(x)
                loss = losser(y_pred=out, y_true=y)
            grads = tape.gradient(loss, unet.trainable_variables)
            optimizer.apply_gradients(zip(grads, unet.trainable_variables))
            e3 = time()
            loss_sum += loss
            info_step = f'step:{step:03}\tloss:{loss}\t running time: {e3-s3:.3f} s'
            log2.write(info_step + '\n')
            print(' ' * 80, end='\r')   # clear the console line
            print(info_step, end='\r')
        e2 = time()
        avg_loss = loss_sum / step if step != 0 else 'no samples'
        info_epoch = f'epoch {i+1:02}\t average loss {avg_loss}\t running time {e2-s2:.3f} s'
        log1.write(info_epoch + '\n')
        print(' ' * 80, end='\r')       # clear the console line
        print(info_epoch)
    e1 = time()
    all_time = f'Training time {e1-s1:.3f} s'
    log1.write(all_time + '\n')
    log2.write(all_time + '\n')
    print(all_time)
    log1.close()
    log2.close()
    # Save the model weights (a reload sketch follows below)
    encoder_model.save_weights('./models/encoder_params_formal')
    decoder_model.save_weights('./models/decoder_params_formal')
    unet.save_weights('./models/unet_params_formal')
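
For completeness, a sketch (not in the original script) of how the saved weights could be reloaded later for inference, assuming the same model classes and the checkpoint paths used above:

    # Rebuild the architecture and restore the trained weights.
    encoder_model = unet_encoder()
    decoder_model = unet_decoder()
    unet = Unet3D(encoder_model, decoder_model)
    unet.build(input_shape=(None, 240, 240, 155, 4))
    unet.load_weights('./models/unet_params_formal')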

Training results

  1. Average training loss and training time for each of the 30 training epochs, visualized below.

Figure 2: Visualization of the training metrics

  2. Comparison of the outputs of models trained for different numbers of epochs.

Figure 3: Comparison of model outputs across training epochs

Model-parallel version

Splitting the model

Two GPUs are used: the encoder is placed on GPU 0 and the decoder on GPU 1. A minimal sketch of the placement pattern follows; the full implementation is in the next subsection.
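
The intended placement, assuming two visible GPUs (the actual implementation wraps this pattern inside the Unet3DParallel class below):

    import tensorflow as tf

    # Construct each half under the device it is meant to run on.
    with tf.device('/gpu:0'):
        encoder = unet_encoder()   # encoder on GPU 0
    with tf.device('/gpu:1'):
        decoder = unet_decoder()   # decoder on GPU 1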

Code

  1. Model implementation (a usage sketch follows this list)

    from tensorflow.keras.layers import BatchNormalization,Conv3D,MaxPooling3D,Conv3DTranspose,UpSampling3D
    import tensorflow as tf

    # Copy a tensor onto the given GPU by creating a zero tensor there and adding.
    def copy_tensor_to_gpu(tensor, gpu_id):
        with tf.device(f'/gpu:{gpu_id}'):
            res = tf.zeros_like(tensor)
            res = res + tensor
        return res

    def copy_tensor_to_cpu(tensor, cpu_id):
        with tf.device(f'/cpu:{cpu_id}'):
            res = tf.zeros_like(tensor)
            res = res + tensor
        return res

    class unet_encoder(tf.keras.Model):
        def __init__(self):
            super(unet_encoder, self).__init__()
            self.b1 = BatchNormalization()
            self.conv1 = Conv3D(8, 3, activation='relu', padding='same')
            self.b2 = BatchNormalization()
            self.conv2 = Conv3D(16, 3, activation='relu', padding='same')
            self.b3 = BatchNormalization()
            self.conv3 = Conv3D(16, (3, 3, 2), activation='relu')
            self.b4 = BatchNormalization()
            self.conv4 = Conv3D(32, (3, 3, 1), activation='relu', strides=2)
            self.b5 = BatchNormalization()
            self.conv5 = Conv3D(64, (3, 3, 1), activation='relu', strides=2)
            self.b6 = BatchNormalization()
            self.maxpool1 = MaxPooling3D((2, 2, 1))

        def call(self, x, features, gpu_id):
            x = self.b1(x)
            x = self.conv1(x)
            x = self.b2(x)
            x = self.conv2(x)
            x = self.b3(x)
            # First skip-connection feature map (copied to the decoder's GPU)
            x = self.conv3(x)
            x = self.b4(x)
            features[0] = copy_tensor_to_gpu(x, gpu_id)
            # Second skip-connection feature map
            x = self.conv4(x)
            x = self.b5(x)
            features[1] = copy_tensor_to_gpu(x, gpu_id)
            # Third skip-connection feature map
            x = self.conv5(x)
            x = self.b6(x)
            features[2] = copy_tensor_to_gpu(x, gpu_id)
            # Output
            outputs = self.maxpool1(x)
            return outputs

    class unet_decoder(tf.keras.Model):
        def __init__(self):
            super(unet_decoder, self).__init__()
            self.b1 = BatchNormalization()
            self.up1 = UpSampling3D((2, 2, 1))
            self.conv1tp = Conv3DTranspose(64, (3, 3, 1), activation='relu', padding='same')
            self.b2 = BatchNormalization()
            self.up2 = UpSampling3D((2, 2, 2))
            self.conv2tp = Conv3DTranspose(32, (3, 3, 1), activation='relu')
            self.conv2 = Conv3D(32, 3, activation='relu')
            self.b3 = BatchNormalization()
            self.conv3tp = Conv3DTranspose(16, (3, 3, 2), activation='relu')
            self.b4 = BatchNormalization()
            self.up4 = UpSampling3D((2, 2, 2))
            self.conv4tp = Conv3DTranspose(16, (3, 3, 1), activation='relu')
            self.b5 = BatchNormalization()
            self.conv5tp = Conv3DTranspose(8, (3, 3, 1), activation='relu')
            self.conv6tp = Conv3DTranspose(4, (1, 1, 5), activation='relu')
            self.conv_out = Conv3D(1, (1, 1, 4), activation='relu')

        def call(self, x, features):
            x = self.b1(x)
            x = self.up1(x)
            x = self.conv1tp(x)
            x = tf.concat((features[-1], x), axis=-1)
            x = self.b2(x)
            x = self.up2(x)
            x = self.conv2tp(x)
            x = self.conv2(x)
            x = self.b3(x)
            x = self.conv3tp(x)
            x = tf.concat((features[-2], x), axis=-1)
            x = self.b4(x)
            x = self.up4(x)
            x = self.conv4tp(x)
            x = tf.concat((features[-3], x), axis=-1)
            x = self.b5(x)
            x = self.conv5tp(x)
            x = self.conv6tp(x)
            x = self.conv_out(x)
            outputs = x
            return outputs

    class Unet3DParallel(tf.keras.Model):
        def __init__(self, gpu_group):
            super(Unet3DParallel, self).__init__()
            self.gpus = gpu_group
            # The skip-connection feature maps live on the decoder's GPU
            with tf.device(f'/gpu:{gpu_group[1]}'):
                self.features = [None for i in range(3)]
            with tf.device(f'/gpu:{gpu_group[0]}'):
                self.encoder = unet_encoder()
            with tf.device(f'/gpu:{gpu_group[1]}'):
                self.decoder = unet_decoder()

        def call(self, x):
            x = self.encoder(x, self.features, self.gpus[1])
            outputs = self.decoder(x, self.features)
            return outputs
    
  2. The training procedure is the same as above, except that GPU memory growth is enabled:
    import tensorflow as tf
    gpus = tf.config.experimental.list_physical_devices('GPU')
    for gpu in gpus:
        tf.config.experimental.set_memory_growth(gpu, True)
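
A usage sketch for the parallel model, with GPU indices 0 and 1 (matching the summary shown in the next subsection); the training loop itself is unchanged:

    # Build the model-parallel U-Net: encoder on GPU 0, decoder on GPU 1.
    unet = Unet3DParallel([0, 1])
    unet.build(input_shape=(None, 240, 240, 155, 4))
    unet.summary()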
    

Problems encountered

  1. The two halves of the model have different numbers of trainable parameters (the encoder has fewer than the decoder), yet GPU 0 uses more memory, runs at higher utilization, and draws more power than GPU 1.

    # Trainable parameter counts
    Model: "unet3d_parallel"
    _________________________________________________________________
    Layer (type)                 Output Shape              Param #
    =================================================================
    unet_encoder (unet_encoder)  multiple                  32664
    _________________________________________________________________
    unet_decoder (unet_decoder)  multiple                  121373
    =================================================================
    Total params: 154,037
    Trainable params: 153,149
    Non-trainable params: 888
    _________________________________________________________________
    # GPU utilization reported by nvidia-smi
    +-----------------------------------------------------------------------------+
    | NVIDIA-SMI 470.94       Driver Version: 470.94       CUDA Version: 11.4     |
    |-------------------------------+----------------------+----------------------+
    | GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
    | Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
    |                               |                      |               MIG M. |
    |===============================+======================+======================|
    |   0  NVIDIA GeForce ...  On   | 00000000:3E:00.0 Off |                  N/A |
    | 59%   61C    P2   211W / 350W |  23746MiB / 24268MiB |     67%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    |   1  NVIDIA GeForce ...  On   | 00000000:88:00.0 Off |                  N/A |
    | 46%   56C    P2   120W / 350W |   4504MiB / 24268MiB |     22%      Default |
    |                               |                      |                  N/A |
    +-------------------------------+----------------------+----------------------+
    
  2. In principle, once the model is split, the two GPUs should form a pipeline, so from the second training step onward each step should take less time than a single-GPU step. The experiment shows the opposite: a single GPU takes roughly 0.7 s per step, while the split model takes roughly 1.2 s per step (a different machine was used, so these times are not directly comparable with the training charts above). A diagnostic sketch follows.
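
One diagnostic that may help narrow this down (a sketch, not something from the original experiment): TensorFlow can log the device each op actually executes on, which makes it possible to check whether the forward and backward passes really run on the intended GPUs, given that only layer construction is wrapped in tf.device inside Unet3DParallel.

    import tensorflow as tf

    # Must be called before any tensors or ops are created.
    tf.debugging.set_log_device_placement(True)

    # ...then build Unet3DParallel and run a single training step as above; every op
    # is logged together with its device, e.g. /job:localhost/replica:0/task:0/device:GPU:1.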
