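The Digital Database for Screening Mammography (DDSM) is a very large breast-image database with more than ten thousand images. However, the images are stored in LJPEG format, which neither common image software (such as Photoshop, ACDSee, or the built-in Windows image viewer) nor programming tools (such as MATLAB) can read, so they must be converted to a more common format before use. I found many methods online, and none of them worked when I tried them, including a program written by the database's creators at the University of South Florida [1] and XMedCon [2], a medical image format conversion tool. What finally worked was a complete program written by Dr. Chris Rose of the University of Manchester; after a few modifications to his program, I successfully converted the images to PNG. His program is linked at http://microserf.org.uk/academic/Software.html (PS: I recently found that this link is dead, so I have put the source on my GitHub at https://github.com/hd11224/DDSM, including all the tools and software needed.)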

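The download from the link above includes a user manual that explains how to use the program. The program is written in Ruby and has to run under Cygwin; the manual describes how to install Cygwin and the other required tools. The workflow is: the user types in an image name, the program downloads that image from the FTP server, and then converts it through several steps into PNG. When I followed the manual, the program did not run successfully. I opened the get-ddsm-mammo file in VS2013 to read the source and found that the step that downloads the image from the FTP server was failing. After I modified it, the program ran successfully. My modified program is as follows: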

#!/usr/bin/ruby

# This program gets a specified mammogram from the DDSM website and
# converts it to a PNG image. See the help message for full details.

require 'net/ftp'

# Specify the name of the info-file.
def info_file_name
  'info-file.txt'
end

# Specify the name of the file that records the processed image names.
def image_names
  'image_name.txt'
end

# Get an FTP file as specified by a DDSM path (e.g.,
# /pub/DDSM/cases/cancers/cancer_06/case1141/A-1141-1.ics) and return the
# local path to the file, or return nil if the file could not be downloaded.
def get_file_via_ftp(ddsm_path)
  ftp = Net::FTP.new('figment.csee.usf.edu')
  ftp.passive = true
  ftp.login
  ftp.chdir(File.dirname(ddsm_path))
  puts File.basename(ddsm_path)
  ftp.getbinaryfile(File.basename(ddsm_path))
  #ftp.getbinaryfile(ddsm_path)
  # Will be stored local to this program, under the same file name

  # Check to make sure that we managed to get the file.
  if !FileTest.exist?(File.basename(ddsm_path))
    puts "Could not get the file #{File.basename(ddsm_path)} from the DDSM FTP server; perhaps the server is busy."
    exit(-1)
  end
  return File.basename(ddsm_path)
end

# Return the string input with the system's filesep at the end; if there
# is one there already then return input.
def ensure_filesep_terminated(input)
  if input[input.length-1].chr != File::SEPARATOR
    input += File::SEPARATOR
  end
  return input
end

# Check program input; input is the program input (i.e ARGV).
def check_inputs(input)
  if input.length != 1
    puts get_help
    exit(-1)
  end

  # See if the user wanted the help docs.
  if input[0] == '--help'
    puts get_help
    exit(-1)
  end

  # Check to make sure that the info file exists.
  if !FileTest.exist?(info_file_name)
    puts "The file #{info_file_name} does not exist; use catalogue-ddsm-ftp-server.rb"
    exit(-1)
  end
end

# Given the name of a DDSM image, return the path to the
# .ics file associated with the image name. If we can't find the
# path, then we return nil.
def get_ics_path_for_image(image_name)
  # Does image_name look right?
  if image_name[/._\d{4,4}_.\..+/].nil?
    raise 'image_name seems to be wrong. It is: ' + image_name
  end

  # Edit the image name, as .ics files have the format 'A-0384-1.ics';
  # there is no '.RIGHT_CC' (for example).
  image_name = image_name[0..(image_name.rindex('.')-1)] # Strip everything after and including the last '.'.
  image_name[1] = '-'
  image_name[6] = '-' # Change the '_'s to '-'s (better regexp-based approach?).
  image_name+='.ics' # Add the file suffix.

  # Get the path to the .ics file for the specified image.
  File.open(info_file_name) do |file|
    file.each_line do |line|
      # Does this line specify the .ics file for the specified image name?
      if !line[/.+#{image_name}/].nil?
        # If so, we can stop looking
        return line
      end
    end
  end

  # If we get here, then we did not find a match, so we will return nil.
  return nil
end

# Given a line from a .ics file, return a string that specifies the
# number of rows and cols in the image described by the line. The
# string would be '123 456' if the image has 123 rows and 456 cols.
def get_image_dims(line)
  rows = line[/.+LINES\s\d+/][/\d+/]
  cols = line[/.+PIXELS_PER_LINE\s\d+/][/PIXELS_PER_LINE\s\d+/][/\d+/]
  return rows + ' ' + cols
end

# Given an image name and a string representing the location of a
# local .ics file, get the image dimensions and digitizer name for
# image_name. Return a hash which :image_dims maps to a string of the
# image dims (which would be '123 456' if the image has 123 rows and
# 456 cols) and :digitizer maps to the digitizer name. If we can't
# determine the dimensions and/or digitizer name, the corresponding
# entry in the hash will be nil.
def get_image_dims_and_digitizer(image_name, ics_file)
  # Get the name of the image view (e.g. 'RIGHT_CC')
  image_view = image_name[image_name.rindex('.')+1..image_name.length-1]

  image_dims = nil
  digitizer = nil

  # Read the image dimensions and digitizer name from the file.
  File.open(ics_file, 'r') do |file|
    file.each_line do |line|
      if !line[/#{image_view}.+/].nil?
        # Read the image dimensions
        image_dims = get_image_dims(line)
      end
      if !line[/DIGITIZER.+/].nil?
        # Read the digitizer type from the file.
        digitizer = line.split[1].downcase # Get the second word in the DIGITIZER line.

        # There are two types of Howtek scanner and they are
        # distinguished by the first letter in image_name.
        if digitizer == 'howtek'
          if image_name[0..0].upcase == 'A'
            digitizer += '-mgh'
          elsif image_name[0..0].upcase == 'D'
            digitizer += '-ismd'
          else
            raise 'Error trying to determine Howtek digitizer variant.'
          end
        end
      end
    end
  end

  # Return an associative array specifying the image dimensions and
  # digitizer used.
  return {:image_dims => image_dims, :digitizer => digitizer}
end

# Given the name of a DDSM image, return a string that describes
# the image dimensions and the name of the digitizer that was used to
# capture it.
def do_get_image_info(image_name)
  # Get the path to the ics file for image_name.
  ftp_path = get_ics_path_for_image(image_name)
  ftp_path.chomp!

  # Get the ics file; providing us with a string representing
  # the local location of the file.
  ics_file = get_file_via_ftp(ftp_path)

  # Get the image dimensions and digitizer for image_name.
  image_dims_and_digitizer = get_image_dims_and_digitizer(image_name, ics_file)

  # Remove the .ics file as we don't need it any more.
  File.delete(ics_file)

  return image_dims_and_digitizer
end

# Given a mammogram name and the path to the image info file, get the
# image dimensions and digitizer name string.
def get_image_info(image_name)
  # Get the image dimensions and digitizer type for the specified
  # image as a string.
  image_info = do_get_image_info(image_name)

  # Now output the result to standard output.
  all_ok = !image_info[:image_dims].nil? && !image_info[:digitizer].nil? # Is everything OK?
  if all_ok
    ret_val = image_info[:image_dims] + ' ' + image_info[:digitizer]
  end
  return ret_val
end

# Return a non-existent random filename.
def get_temp_filename
  rand_name = "#{rand(10000000)}" # A longish string
  if FileTest.exist?(rand_name)
    rand_name = get_temp_filename
  end
  return rand_name
end

# Retrieve the LJPEG file for the mammogram with the specified
# image_name, given the path to the info file. Return the path to the
# local file if successful. If we can't get the file, then return nil.
def get_ljpeg(image_name)
  # Get the path to the image file on the mirror of the FTP server.
  path = nil
  File.open(info_file_name) do |file|
    file.each_line do |line|
      if !line[/.+#{image_name}\.LJPEG/].nil?
        # We've found it, so get the file.
        line.chomp!
        local_path = get_file_via_ftp(line)
        return local_path
      end
    end
  end

  # If we get here we didn't find where the file is on the server.
  return nil
end

# Given the path to the dir containing the jpeg program, the path to a
# LJPEG file, convert it to a PNM file. Return the path to the PNM
# file.
def ljpeg_to_pnm(ljpeg_file, dims_and_digitizer)
  # First convert it to raw format.
  command = "./jpeg.exe -d -s #{ljpeg_file}"
  `#{command}` # Run it.
  raw_file = ljpeg_file + '.1' # The jpeg program adds a .1 suffix.

  # See if the .1 file was created.
  if !FileTest.exist?(raw_file)
    raise 'Could not convert from LJPEG to raw.'
  end

  # Now convert the raw file to PNM and delete the raw file.
  command = "./ddsmraw2pnm.exe #{raw_file} #{dims_and_digitizer}"
  pnm_file = `#{command}`
  File.delete(raw_file)
  if $? != 0
    raise 'Could not convert from raw to PNM.'
  end

  # Return the path to the PNM file.
  return pnm_file.split[0]
end

# Convert a PNM file to a PNG file. pnm_file is the path to the pnm file
# and target_png_file is the name of the PNG file that we want created.
def pnm_to_png(pnm_file, target_png_file)
  command = "convert -depth 16 #{pnm_file} #{target_png_file}"
  `#{command}`
  if !FileTest.exist?(target_png_file)
    raise 'Could not convert from PNM to PNG.'
  end
  return target_png_file
end

# Append the name of a processed image to image_name.txt.
def write_image_names(name)
  namefile = File.open(image_names, 'a')
  namefile.puts name
  namefile.puts "\r\n"
  namefile.close
end

# The entry point of the program.
def main
  # Check to see if the input is sensible.
  #check_inputs(ARGV)
  #image_name = ARGV[0]

  File.open('read_names.txt','r') do |file|
    file.each_line do |line|
      image_name = line
      image_name.chomp!

      # Get the image dimensions and digitizer name string for the
      # specified image.
      image_info = get_image_info(image_name)

      # Get the LJPEG file from the mirror of the FTP site, returning the
      # path to the local file.
      ljpeg_file = get_ljpeg(image_name)

      # Convert the LJPEG file to PNM and delete the original LJPEG.
      pnm_file = ljpeg_to_pnm(ljpeg_file, image_info)
      File.delete(ljpeg_file)

      # Now convert the PNM file to PNG and delete the PNM file.
      target_png_file = image_name + '.png'
      png_file = pnm_to_png(pnm_file, target_png_file)
      File.delete(pnm_file)

      # Test to see if we got something.
      if !FileTest.exist?(png_file)
        raise 'Could not create PNG file.'
        exit(-1)
      end

      # Display the path to the file.
      puts File.expand_path(png_file)

      # Record the name of the image we have just processed.
      write_image_names(image_name)
      #exit(0)
    end
  end
  exit(0)
end

# The help message
def get_help
  <<END_OF_HELP

This program gets a specified mammogram from a local mirror of the
DDSM FTP Server, converts it to a PNG image and saves it to a target
directory; if the target directory already contains a suitably-named
file, the download and conversion are skipped.

Call this program using:

  ruby get-ddsm-mammo.rb <image-name>

(Note: the '\\' simply indicates that the above command should be on
one line.)

where:

* <image-name> is the name of the DDSM image you want to get and
  convert, for example: 'A_1141_1.LEFT_MLO'.

If successful, the program will print the path to the PNG file of
the requested mammogram to standard output and will return a status
code of 0. If unsuccessful, the program should display a
useful error message and return a non-zero status code.

END_OF_HELP
end

# Call the entry point.
main
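
The comment in get_ics_path_for_image asks whether a regexp-based approach would be better than overwriting the characters at positions 1 and 6. A minimal sketch of one such approach follows. The helper name ics_name_for is hypothetical and not part of the original program, and the pattern is anchored at both ends, which is stricter than the unanchored check the program uses.

```ruby
# Convert a DDSM image name such as 'A_1141_1.LEFT_MLO' into the name of
# its .ics file ('A-1141-1.ics'). Returns nil if the name does not match.
# ics_name_for is a hypothetical helper, not part of the original program.
def ics_name_for(image_name)
  # One character, '_', four digits, '_', one character, '.', then the view.
  match = image_name.match(/\A(.)_(\d{4})_(.)\..+\z/)
  return nil if match.nil?

  "#{match[1]}-#{match[2]}-#{match[3]}.ics"
end

puts ics_name_for('A_1141_1.LEFT_MLO') # prints A-1141-1.ics
```

Returning nil for a malformed name, rather than raising, leaves the decision about how to handle a bad name to the caller.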

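One major annoyance is that the original program requires you to type in image names by hand and handles only one image per run, so each image must finish before the next can start, which is slow and tedious. In the program posted above I therefore made one more change to allow batch processing. Write the names of the images to process in a txt file beforehand, one per line, name the file read_names, and then run the program simply by typing ./get-ddsm-mammo. The program looks like this when running:

[Figure: screenshot of the program running]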
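
Because get_ics_path_for_image raises as soon as a name fails its pattern check, one malformed or blank line in read_names.txt stops the whole batch. The following is a small sketch of a pre-flight check that could be run before starting; partition_names is a hypothetical helper, and the pattern is copied from the program above.

```ruby
# Pattern the program above uses to sanity-check an image name.
NAME_PATTERN = /._\d{4,4}_.\..+/

# Split the lines of a name list into [valid_names, rejected_lines].
# Surrounding whitespace is stripped and blank lines are ignored.
# partition_names is a hypothetical helper, not part of the original program.
def partition_names(lines)
  names = lines.map(&:strip).reject(&:empty?)
  names.partition { |name| !name[NAME_PATTERN].nil? }
end

# Only report on read_names.txt if it is present in the current directory.
if File.exist?('read_names.txt')
  valid, rejected = partition_names(File.readlines('read_names.txt'))
  puts "#{valid.length} usable names"
  rejected.each { |name| puts "Suspicious name: #{name}" }
end
```

This only checks the shape of each name; it does not confirm that the image is actually listed in info-file.txt.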

Each time an image is finished, the program writes its name to a txt file named image_name, so create a txt file named image_name before running the program.
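
Since image_name.txt records every image that has been converted, it can also be used to resume an interrupted batch. The sketch below, under the assumption that both files sit in the current directory, lists the names in read_names.txt that are not yet in image_name.txt. The helper remaining_names is hypothetical. It strips whitespace and skips blank lines because write_image_names writes an extra "\r\n" after each name.

```ruby
# Return the names in `requested` that do not yet appear in `done`.
# Both arguments are arrays of raw lines; whitespace is stripped and
# blank lines are ignored.
# remaining_names is a hypothetical helper, not part of the original program.
def remaining_names(requested, done)
  finished = done.map(&:strip).reject(&:empty?)
  requested.map(&:strip).reject(&:empty?) - finished
end

# Print the names still to be processed, one per line.
if File.exist?('read_names.txt') && File.exist?('image_name.txt')
  todo = remaining_names(File.readlines('read_names.txt'),
                         File.readlines('image_name.txt'))
  puts todo
end
```

The printed list can be saved as a new read_names.txt so that the next run converts only the images that are still missing.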

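One last point: the user manual says to install Ruby together with Cygwin. The manual was written for an older Cygwin version, so Ruby is no longer at the location the manual shows; it is now listed separately. Install the Ruby and rubygems packages shown below:

[Figure: the Ruby and rubygems packages in the Cygwin installer]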

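The images that have been converted to PNG then appear in the ddsm-software folder:

[Figure: the converted PNG images in the ddsm-software folder]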

[1] http://marathon.csee.usf.edu/Mammography/software/heathusf_v1.1.0.html

[2] http://sourceforge.net/projects/xmedcon/
