python-astropy.io.fits: reading rows from a large FITS file with multiple HDUs

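I have a FITS file of about 50 GB containing multiple HDUs, all with the same format: a (1E5 x 1E6) array with 1E5 objects and 1E6 timestamps. The HDUs describe different physical properties such as flux, RA, DEC, etc. I only want to read 5 objects from each HDU (i.e. a (5 x 1E6) array).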

Python 2.7, astropy 1.0.3, Linux x86_64

So far I have tried many of the suggestions I found, but nothing has worked. My best approach is still:

import numpy as np
from astropy.io import fits

# the five objects I want to read out
obj_list = ['Star1', 'Star15', 'Star700', 'Star2000', 'Star5000']
dic = {}

# fname is the path to the FITS file
with fits.open(fname, memmap=True, do_not_scale_image_data=True) as hdulist:
    # There is a special HDU 'OBJECTS' which is an (1E5 x 1) array and contains
    # the info which index in the fits file corresponds to which object.

    # First, get the indices of the rows that describe the objects in the fits
    # file (not necessarily in order!)
    ind_objs = np.in1d(hdulist['OBJECTS'].data, obj_list, assume_unique=True).nonzero()[0]  # indices of the candidates

    # Second, read out the 5 objects' time series
    dic['FLUX'] = hdulist['FLUX'].data[ind_objs]  # (5 x 1E6) array
    dic['RA'] = hdulist['RA'].data[ind_objs]  # (5 x 1E6) array
    dic['DEC'] = hdulist['DEC'].data[ind_objs]  # (5 x 1E6) array

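This code works well and fast for files up to about 20 GB, but runs out of memory for larger ones (the larger files only contain more objects, not more timestamps). I don't understand why: astropy.io.fits uses mmap under the hood and, as far as I understand, should only load the (5 x 1E6) array into memory? So regardless of the file size, what I want to read out always has the same size.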
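The expectation described here can be illustrated with plain NumPy, independent of astropy. This is only a minimal sketch: the file name `demo.dat`, the array dimensions and the row indices are made up for the illustration. It maps a small big-endian float32 array and pulls out a few rows by fancy indexing, which copies only those rows into ordinary memory. Note that creating the mapping still reserves virtual address space for the full array, even though very little physical memory ends up being used.

```python
import os
import tempfile

import numpy as np

# Hypothetical small stand-in for one HDU: 1000 objects x 50 timestamps.
n_obj, n_time = 1000, 50
path = os.path.join(tempfile.mkdtemp(), "demo.dat")
data = np.arange(n_obj * n_time, dtype=np.float32).reshape(n_obj, n_time)
data.astype(">f4").tofile(path)  # FITS stores image data big-endian

# Mapping reserves address space for the whole array but reads nothing yet.
mapped = np.memmap(path, dtype=">f4", mode="r", shape=(n_obj, n_time))

# Fancy indexing copies only the selected rows into a regular array.
rows = np.array([0, 14, 699])
subset = np.asarray(mapped[rows])

print(subset.shape)   # (3, 50)
print(subset.nbytes)  # 600 bytes, regardless of how many objects the file holds

del mapped  # drop the reference to the mapping once the rows are copied
```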

Edit: here is the error message:

    dic['RA'] = hdulist['RA'].data[ind_objs] # (5 x 1E6) array
  File "/usr/local/python/lib/python2.7/site-packages/astropy-1.0.3-py2.7-linux-x86_64.egg/astropy/utils/decorators.py", line 341, in __get__
    val = self._fget(obj)
  File "/usr/local/python/lib/python2.7/site-packages/astropy-1.0.3-py2.7-linux-x86_64.egg/astropy/io/fits/hdu/image.py", line 239, in data
    data = self._get_scaled_image_data(self._data_offset, self.shape)
  File "/usr/local/python/lib/python2.7/site-packages/astropy-1.0.3-py2.7-linux-x86_64.egg/astropy/io/fits/hdu/image.py", line 585, in _get_scaled_image_data
    raw_data = self._get_raw_data(shape, code, offset)
  File "/usr/local/python/lib/python2.7/site-packages/astropy-1.0.3-py2.7-linux-x86_64.egg/astropy/io/fits/hdu/base.py", line 523, in _get_raw_data
    return self._file.readarray(offset=offset, dtype=code, shape=shape)
  File "/usr/local/python/lib/python2.7/site-packages/astropy-1.0.3-py2.7-linux-x86_64.egg/astropy/io/fits/file.py", line 248, in readarray
    shape=shape).view(np.ndarray)
  File "/usr/local/python/lib/python2.7/site-packages/numpy/core/memmap.py", line 254, in __new__
    mm = mmap.mmap(fid.fileno(), bytes, access=acc, offset=start)
mmap.error: [Errno 12] Cannot allocate memory
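The failing call is `mmap.mmap` itself, so the error is raised while the mapping is being created, before any rows are read. One possible cause, which this traceback alone does not confirm, is a per-process limit on virtual address space (on Linux, `mmap` fails with ENOMEM when `RLIMIT_AS` would be exceeded). The sketch below only inspects that limit using the standard `resource` module; `format_limit` is a hypothetical helper name introduced for the illustration.

```python
import resource


def format_limit(value):
    """Render an rlimit value as a human-readable string."""
    if value == resource.RLIM_INFINITY:
        return "unlimited"
    return "%.1f GiB" % (value / 2.0 ** 30)


# RLIMIT_AS caps the total virtual address space of the process,
# which includes every memory-mapped region.
soft, hard = resource.getrlimit(resource.RLIMIT_AS)
print("address space limit: soft=%s, hard=%s"
      % (format_limit(soft), format_limit(hard)))
```

If the soft limit turns out to be smaller than the combined size of the HDUs being mapped, that would be consistent with the error, but other causes such as the system's overcommit settings are not ruled out by this check.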

Edit 2:

Thanks, I have now incorporated the suggestions, which lets me handle FITS files of up to 50 GB. New code:

# the five objects I want to read out
obj_list = ['Star1', 'Star15', 'Star700', 'Star2000', 'Star5000']
dic = {}

with fits.open(fname, mode='denywrite', memmap=True, do_not_scale_image_data=True) as hdulist:
    # There is a special HDU 'OBJECTS' which is an (1E5 x 1) array and contains
    # the info which index in the fits file corresponds to which object.

    # First, get the indices of the rows that describe the objects in the fits
    # file (not necessarily in order!)
    ind_objs = np.in1d(hdulist['OBJECTS'].data, obj_list, assume_unique=True).nonzero()[0]  # indices of the candidates

    # Second, read out the 5 objects' time series
    dic['FLUX'] = hdulist['FLUX'].data[ind_objs]  # (5 x 1E6) array
    del hdulist['FLUX'].data
    dic['RA'] = hdulist['RA'].data[ind_objs]  # (5 x 1E6) array
    del hdulist['RA'].data
    dic['DEC'] = hdulist['DEC'].data[ind_objs]  # (5 x 1E6) array
    del hdulist['DEC'].data

mode='denywrite' did not make any difference.

memmap=True is indeed not the default and has to be set manually.

del hdulist['FLUX'].data etc. now lets me read files of 50 GB instead of 20 GB.

New problem:

Anything larger than 50 GB still causes the same memory error, but now directly on the first line:

dic['FLUX'] = hdulist['FLUX'].data[ind_objs] # (5 x 1E6) array
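One way to avoid mapping a whole HDU at once is to seek to each wanted row and read just that row. The sketch below is an assumption about how this could be done, not something astropy does on its own. It relies on FITS image data being stored as a contiguous, big-endian, C-ordered array. `read_rows`, `demo.bin` and the 2880-byte offset are hypothetical and used for illustration only. In a real file the byte offset of the HDU's data would have to be looked up; astropy's `hdu.fileinfo()` is expected to report it under the `'datLoc'` key, but that should be checked against the installed version.

```python
import os
import tempfile

import numpy as np


def read_rows(path, data_offset, n_cols, row_indices, dtype=">f4"):
    """Read selected rows of a C-ordered 2-D array stored at data_offset."""
    dtype = np.dtype(dtype)
    row_bytes = n_cols * dtype.itemsize
    out = np.empty((len(row_indices), n_cols), dtype=dtype)
    with open(path, "rb") as fh:
        for i, row in enumerate(row_indices):
            # Jump straight to the start of this row and read only its bytes.
            fh.seek(data_offset + int(row) * row_bytes)
            out[i] = np.frombuffer(fh.read(row_bytes), dtype=dtype)
    return out


# Small self-contained demo: a placeholder "header" followed by a 20 x 8 array.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
full = np.arange(20 * 8, dtype=np.float32).reshape(20, 8)
with open(demo_path, "wb") as fh:
    fh.write(b" " * 2880)                    # stands in for a header block
    fh.write(full.astype(">f4").tobytes())   # big-endian data, as in FITS

picked = read_rows(demo_path, 2880, 8, [3, 0, 17])
print(picked.shape)  # (3, 8)
```

Memory use here depends only on the number of rows requested and their length, not on the number of objects in the file. The read is raw, so no BSCALE/BZERO scaling is applied, which matches what do_not_scale_image_data=True asks for in the code above.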
