ADNI Series

1. 【ADNI】Data Preprocessing (1): SPM, CAT12

2. 【ADNI】Data Preprocessing (2): Getting subject slices

3. 【ADNI】Data Preprocessing (3): CNNs

4. 【ADNI】Data Preprocessing (4): Get top k slices according to CNNs

5. 【ADNI】Data Preprocessing (5): Get top k slices (pMCI_sMCI) according to CNNs

6. 【ADNI】Data Preprocessing (6): ADNI_slice_dataloader ||| show image


This post covers two functions:

1) concatenating the slices from the same subject's MRI image into a single array;

2) displaying the slice images.

Subject IDs and their labels are stored in a .txt file, as shown below:

141_S_1137 1
141_S_1152 1
002_S_0295 0
002_S_0559 0
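
A minimal sketch of reading such a subject-ID/label file with np.loadtxt, mirroring the read_lists helper used later. The file name here is illustrative, and the mapping 1 → AD, 0 → NC is only inferred from the examples in this post, so check it against your own label files.

import numpy as np

# hypothetical file name; point this at your own subject-ID/label .txt
rows = np.loadtxt("./path_txt/val_subject_labels.txt", dtype=str)  # shape (N, 2): ID, label
label_names = {1: "AD", 0: "NC"}  # assumed encoding, inferred from the examples above
for subject_id, label in rows:
    print(subject_id, label_names[int(label)])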

Each slice record has the form subject id ||| slice path ||| label name:

141_S_0696|||/home/hcq/alzheimer_disease/ADNI_825/experiments_FineTunning/majority_select_top51_slices_folder_02_AD_NC/validation/AD/141_S_0696_slice_Y74.jpg|||AD
141_S_0696|||/home/hcq/alzheimer_disease/ADNI_825/experiments_FineTunning/majority_select_top51_slices_folder_02_AD_NC/validation/AD/141_S_0696_slice_Z46.jpg|||AD
002_S_0413|||/home/hcq/alzheimer_disease/ADNI_825/experiments_FineTunning/majority_select_top51_slices_folder_02_AD_NC/validation/NC/002_S_0413_slice_X46.jpg|||NC
002_S_0413|||/home/hcq/alzheimer_disease/ADNI_825/experiments_FineTunning/majority_select_top51_slices_folder_02_AD_NC/validation/NC/002_S_0413_slice_X50.jpg|||NC
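
Each of these records can be split into its three fields with a plain string split on "|||"; the short example below uses the first line from the listing above.

# split one "|||"-delimited slice record into its three fields
line = ("141_S_0696|||/home/hcq/alzheimer_disease/ADNI_825/experiments_FineTunning/"
        "majority_select_top51_slices_folder_02_AD_NC/validation/AD/141_S_0696_slice_Y74.jpg|||AD")
subject_id, slice_path, label_name = line.split("|||")
print(subject_id)   # 141_S_0696
print(slice_path)   # .../141_S_0696_slice_Y74.jpg
print(label_name)   # AD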

Result:

1) subject: image = [batch_size, slice_num, img_w, img_h]

2) label: AD or NC

Source code:

import random
import os
import numpy as np
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import cv2


def read_lists(file_path):
    # each line: "subjectID label", e.g. "141_S_1137 1"
    dataset_list = np.loadtxt(file_path, dtype=str).tolist()
    subjectID_path, labels = zip(*[(l[0], int(l[1])) for l in dataset_list])
    return subjectID_path, labels


class Dataset_ADNI_TopK:
    def __init__(self, subjectID_list, subjectID_labels, folder_name):
        self.subjectID_list = subjectID_list
        self.subjectID_labels = subjectID_labels
        self.slice_path = os.path.join("./path_txt", folder_name + "_majority_select_top51_slices_folder_02.txt")
        self.shuffled = False

    def slice_concatenate(self, batch_size, slice_num):
        # each line: "subjectID|||slice_path|||label_name"
        slice_list = np.loadtxt(self.slice_path, dtype=str).tolist()
        x = np.zeros((batch_size, slice_num, 227, 227))
        y = np.zeros(batch_size)
        for bs in range(batch_size):
            subjectID = self.subjectID_list[bs]
            subjectID_label = self.subjectID_labels[bs]
            # keep only the slices that belong to this subject
            subject_slices = [s for s in slice_list if s.split("|||")[0] == subjectID]
            for i, slice_struct in enumerate(subject_slices):
                image = mpimg.imread(slice_struct.split("|||")[1])
                # np.reshape(image, (227, 227)) does not work here, so resize instead
                image = cv2.resize(image, (227, 227))
                x[bs, i % slice_num, :, :] = image
            y[bs] = subjectID_label
        return x, y


train_subject_ID = './path_txt/train_sujectID_majority_select_top51_slices_folder_02.txt'
val_subject_ID = './path_txt/val_sujectID_majority_select_top51_slices_folder_02.txt'
train_subject_ID_list, train_subject_ID_label_list = read_lists(train_subject_ID)
val_subject_ID_list, val_subject_ID_label_list = read_lists(val_subject_ID)

# dataset_train = Dataset_ADNI_TopK(train_subject_ID_list, train_subject_ID_label_list, "train")
# dataset_train.slice_concatenate()

dataset_val = Dataset_ADNI_TopK(val_subject_ID_list, val_subject_ID_label_list, "val")
batch_size_subject = 2
slice_num = 51
image, label = dataset_val.slice_concatenate(batch_size_subject, slice_num)

### show image: the first 16 slices of the first subject in the batch
for i in range(16):
    plt.subplot(4, 4, i + 1)
    plt.imshow(image[0, i, :, :], cmap='gray')
plt.show()
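
Since the series ultimately feeds these slices to CNNs in PyTorch (see the Reference at the end), here is a hedged sketch, not part of the original script, of turning the returned numpy batch into tensors; normalization and a channel dimension would still be needed for a real model.

import torch

# image: (batch_size, slice_num, 227, 227) float array, label: (batch_size,) array
images_t = torch.from_numpy(image).float()
labels_t = torch.from_numpy(label).long()
print(images_t.shape, labels_t.shape)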

The refined dataloader version, hcq_data_processing.py:

# -*- coding: utf-8 -*-
import random
import os
import numpy as np
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import cv2


def read_lists(file_path):
    # each line: "subjectID|||label"
    dataset_list = np.loadtxt(file_path, dtype=str).tolist()
    subjectID_path, labels = zip(*[(l.split("|||")[0], int(l.split("|||")[1])) for l in dataset_list])
    return subjectID_path, labels


class Dataset_ADNI_TopK:
    def __init__(self, subjectID_list, subjectID_labels, folder_name, shuffled, batch_size, slice_num):
        self.subjectID_list = subjectID_list
        self.subjectID_labels = subjectID_labels
        self.shuffled = shuffled
        self.batch_size = batch_size
        self.len_dataset = len(self.subjectID_list)
        self.slice_num = slice_num
        self.folder_name = folder_name
        self.batch_size_used = batch_size
        if shuffled:
            # shuffle subject IDs and their labels together
            z = list(zip(self.subjectID_list, self.subjectID_labels))
            random.shuffle(z)
            self.subjectID_list, self.subjectID_labels = [list(l) for l in zip(*z)]

    def slice_concatenate(self, index):
        root_path = "/home/reserch/documents/deeplearning/alzheimers_disease_DL/pytorch/dataset_path/original_825_top_entropy51"
        model_img_size = 224
        x = np.zeros((self.batch_size, self.slice_num, model_img_size, model_img_size))
        y = np.zeros(self.batch_size)
        for bs in range(self.batch_size):
            if (index * self.batch_size + bs) >= self.len_dataset:
                ### hcq, 20180528, dataloader ###
                # With batch_size = 16:
                # train_dataset_num = 344      --> iter_num = train_dataset_num / batch_size = 21
                # validation_dataset_num = 86  --> iter_num = validation_dataset_num / batch_size = 5
                # For the training set:
                # - range(iter_num) handles 21 full iterations --> 21 x 16 = 336 samples
                # - remaining samples: 344 - 336 = 8  <-- train_dataset_num % batch_size
                # - range(iter_num + 1): the last iteration is a partial batch, so the
                #   unused rows are deleted from the pre-allocated arrays below
                delete_samples_num = self.batch_size - (self.len_dataset % self.batch_size)
                for ii in range(delete_samples_num):
                    index_delete = self.batch_size - ii - 1
                    x = np.delete(x, index_delete, axis=0)
                    y = np.delete(y, index_delete, axis=0)
                self.batch_size_used = bs
                break
            subjectID = self.subjectID_list[index * self.batch_size + bs]
            subjectID_label = self.subjectID_labels[index * self.batch_size + bs]
            y[bs] = subjectID_label
            # per-subject slice list: each line is "subjectID|||slice_path|||label_name"
            new_slice_txt_path = os.path.join(root_path, self.folder_name + "_slice_txt",
                                              subjectID + "_" + self.folder_name + ".txt")
            slice_list = np.loadtxt(new_slice_txt_path, dtype=str).tolist()
            for i, slice_struct in enumerate(slice_list):
                image = mpimg.imread(slice_struct.split("|||")[1])
                # np.reshape(image, (227, 227)) does not work here, so resize instead
                image = cv2.resize(image, (model_img_size, model_img_size))
                x[bs, i % self.slice_num, :, :] = image
        # return one batch of (images, labels)
        return x, y

    def iter_len(self):
        # number of full batches in the dataset
        iter_num = self.len_dataset // self.batch_size
        return iter_num


# train_subject_ID = './path_txt/train_sujectID_majority_select_top51_slices_folder_02.txt'
# val_subject_ID = './path_txt/val_sujectID_majority_select_top51_slices_folder_02.txt'
# train_subject_ID_list, train_subject_ID_label_list = read_lists(train_subject_ID)
# val_subject_ID_list, val_subject_ID_label_list = read_lists(val_subject_ID)
#
# batch_size_subject = 1
# slice_num = 51
# dataset_val = Dataset_ADNI_TopK(val_subject_ID_list, val_subject_ID_label_list, "val", True, batch_size_subject, slice_num)
# image, label = dataset_val.slice_concatenate(0)
# dataset_val.iter_len()
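
To make the batching comments above concrete, here is a minimal epoch-loop sketch. It assumes the commented-out setup above has been uncommented so that dataset_val, val_subject_ID_label_list, and batch_size_subject exist; iter_len() gives the number of full batches, and one extra iteration picks up the remaining partial batch (skip the extra call if the dataset size divides evenly by the batch size).

iter_num = dataset_val.iter_len()
extra = 1 if len(val_subject_ID_label_list) % batch_size_subject else 0
for index in range(iter_num + extra):
    images, labels = dataset_val.slice_concatenate(index)
    # images: (batch_size_used, slice_num, 224, 224), labels: (batch_size_used,)
    print(index, images.shape, labels.shape)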

Reference:

  • PyTorch ImageFolder: https://github.com/pytorch/vision/blob/master/torchvision/datasets/folder.py
