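While maintaining some code, I came across a function my predecessor wrote to get the CellSize of raster data. He knew about GDAL, yet he called the AE (ArcGIS Engine) interfaces to work it out, which baffled me.

The original approach used AE's Raster object to read the raster's size and the bounding rectangle built from its real projected coordinate pairs, and then computed the width and height of each cell. I was speechless.

So I looked into how GDAL exposes basic information about a dataset (Metadata).

Here is the official GDAL description of Metadata in its data model: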

GDAL metadata is auxiliary format and application specific textual data kept as a list of name/value pairs. The names are required to be well behaved tokens (no spaces, or odd characters). The values can be of any length, and contain anything except an embedded null (ASCII zero).

The metadata handling system is not well tuned to handling very large bodies of metadata. Handling of more than 100K of metadata for a dataset is likely to lead to performance degradation.

Metadata is split into named groups called domains, with the default domain having no name (NULL or ""). Some specific domains exist for special purposes. Note that currently there is no way to enumerate all the domains available for a given object, but applications can "test" for any domains they know how to interpret.

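Two points in this description deserve attention. First, GDAL itself does offer very rich Metadata, but not every dataset carries all of it. Second, the description says that handling large bodies of metadata (more than 100K) is likely to degrade performance. That said, 100K is already a lot of content.

This metadata is obtained through the Dataset.GetMetadata(String) method of the API, which returns an array of strings. In pure theory the contents of this array are very rich, and include the following (copied from the GDAL website):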

The following metadata items have well defined semantics in the default domain:

  • AREA_OR_POINT: May be either "Area" (the default) or "Point". Indicates whether a pixel value should be assumed to represent a sampling over the region of the pixel or a point sample at the center of the pixel. This is not intended to influence interpretation of georeferencing which remains area oriented.
  • NODATA_VALUES: The value is a list of space separated pixel values matching the number of bands in the dataset that can be collectively used to identify pixels that are nodata in the dataset. With this style of nodata a pixel is considered nodata in all bands if and only if all bands match the corresponding value in the NODATA_VALUES tuple. This metadata is not widely honoured by GDAL drivers, algorithms or utilities at this time.
  • MATRIX_REPRESENTATION: This value, used for Polarimetric SAR datasets, contains the matrix representation that this data is provided in. The following are acceptable values:
    • SCATTERING
    • SYMMETRIZED_SCATTERING
    • COVARIANCE
    • SYMMETRIZED_COVARIANCE
    • COHERENCY
    • SYMMETRIZED_COHERENCY
    • KENNAUGH
    • SYMMETRIZED_KENNAUGH
  • POLARIMETRIC_INTERP: This metadata item is defined for Raster Bands for polarimetric SAR data. This indicates which entry in the specified matrix representation of the data this band represents. For a dataset provided as a scattering matrix, for example, acceptable values for this metadata item are HH, HV, VH, VV. When the dataset is a covariance matrix, for example, this metadata item will be one of Covariance_11, Covariance_22, Covariance_33, Covariance_12, Covariance_13, Covariance_23 (since the matrix itself is a hermitian matrix, that is all the data that is required to describe the matrix).
  • METADATATYPE: If IMAGERY Domain present, the item consist the reader which processed the metadata. Now present such readers:
    • DG: DigitalGlobe imagery metadata
    • GE: GeoEye (or formally SpaceImaging) imagery metadata
    • OV: OrbView imagery metadata
    • DIMAP: Pleiades imagery metadata
    • MSP: Resurs DK-1 imagery metadata
    • ODL: Landsat imagery metadata

SUBDATASETS Domain

The SUBDATASETS domain holds a list of child datasets. Normally this is used to provide pointers to a list of images stored within a single multi image file.

For example, an NITF with two images might have the following subdataset list.

SUBDATASET_1_NAME=NITF_IM:0:multi_1b.ntf
SUBDATASET_1_DESC=Image 1 of multi_1b.ntf
SUBDATASET_2_NAME=NITF_IM:1:multi_1b.ntf
SUBDATASET_2_DESC=Image 2 of multi_1b.ntf

The value of the _NAME is the string that can be passed to GDALOpen() to access the file. The _DESC value is intended to be a more user friendly string that can be displayed to the user in a selector.

Drivers which support subdatasets advertize the DMD_SUBDATASETS capability. This information is reported when the --format and --formats options are passed to the command line utilities.

Currently, drivers which support subdatasets are: ADRG, ECRGTOC, GEORASTER, GTiff, HDF4, HDF5, netCDF, NITF, NTv2, OGDI, PDF, PostGISRaster, Rasterlite, RPFTOC, RS2, WCS, and WMS.

IMAGE_STRUCTURE Domain

Metadata in the default domain is intended to be related to the image, and not particularly related to the way the image is stored on disk. That is, it is suitable for copying with the dataset when it is copied to a new format. Some information of interest is closely tied to a particular file format and storage mechanism. In order to prevent this getting copied along with datasets it is placed in a special domain called IMAGE_STRUCTURE that should not normally be copied to new formats.

Currently the following items are defined by RFC 14 as having specific semantics in the IMAGE_STRUCTURE domain.

  • COMPRESSION: The compression type used for this dataset or band. There is no fixed catalog of compression type names, but where a given format includes a COMPRESSION creation option, the same list of values should be used here as there.
  • NBITS: The actual number of bits used for this band, or the bands of this dataset. Normally only present when the number of bits is non-standard for the datatype, such as when a 1 bit TIFF is represented through GDAL as GDT_Byte.
  • INTERLEAVE: This only applies on datasets, and the value should be one of PIXEL, LINE or BAND. It can be used as a data access hint.
  • PIXELTYPE: This may appear on a GDT_Byte band (or the corresponding dataset) and have the value SIGNEDBYTE to indicate the unsigned byte values between 128 and 255 should be interpreted as being values between -128 and -1 for applications that recognise the SIGNEDBYTE type.

RPC Domain

The RPC metadata domain holds metadata describing the Rational Polynomial Coefficient geometry model for the image if present. This geometry model can be used to transform between pixel/line and georeferenced locations. The items defining the model are:

  • ERR_BIAS: Error - Bias. The RMS bias error in meters per horizontal axis of all points in the image (-1.0 if unknown)
  • ERR_RAND: Error - Random. RMS random error in meters per horizontal axis of each point in the image (-1.0 if unknown)
  • LINE_OFF: Line Offset
  • SAMP_OFF: Sample Offset
  • LAT_OFF: Geodetic Latitude Offset
  • LONG_OFF: Geodetic Longitude Offset
  • HEIGHT_OFF: Geodetic Height Offset
  • LINE_SCALE: Line Scale
  • SAMP_SCALE: Sample Scale
  • LAT_SCALE: Geodetic Latitude Scale
  • LONG_SCALE: Geodetic Longitude Scale
  • HEIGHT_SCALE: Geodetic Height Scale
  • LINE_NUM_COEFF (1-20): Line Numerator Coefficients. Twenty coefficients for the polynomial in the Numerator of the rn equation. (space separated)
  • LINE_DEN_COEFF (1-20): Line Denominator Coefficients. Twenty coefficients for the polynomial in the Denominator of the rn equation. (space separated)
  • SAMP_NUM_COEFF (1-20): Sample Numerator Coefficients. Twenty coefficients for the polynomial in the Numerator of the cn equation. (space separated)
  • SAMP_DEN_COEFF (1-20): Sample Denominator Coefficients. Twenty coefficients for the polynomial in the Denominator of the cn equation. (space separated)

These fields are directly derived from the document prospective GeoTIFF RPC document (http://geotiff.maptools.org/rpc_prop.html) which in turn is closely modeled on the NITF RPC00B definition.

The line and pixel offset expressed with LINE_OFF and SAMP_OFF are with respect to the center of the pixel.

IMAGERY Domain (remote sensing)

For satellite or aerial imagery the IMAGERY Domain may be present. It depends on exist special metadata files near the image file. The files at the same directory with image file tested by the set of metadata readers, if files can be processed by the metadata reader, it fill the IMAGERY Domain with the following items:

  • SATELLITEID: A satellite or scanner name
  • CLOUDCOVER: Cloud coverage. The value between 0 - 100 or 999 if not available
  • ACQUISITIONDATETIME: The image acquisition date time in UTC

xml: Domains

Any domain name prefixed with "xml:" is not normal name/value metadata. It is a single XML document stored in one big string.

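Since the official documentation gives only this much, what we can gather is that in GDAL's data model, Metadata is stored as an array of key-value pairs in the form key=value, grouped by Domain.

In fact, the parameter that Dataset.GetMetadata(String) expects is the name of the Domain. For the Default Domain, pass NULL or "".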
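To make the key=value layout concrete, here is a minimal sketch of turning such a string array into a dictionary. The code in this post is C#; the sketch uses plain Python only to show the logic and does not call GDAL. The function name `parse_metadata` and the sample list are assumptions for illustration, not part of the GDAL API.

```python
from typing import Dict, Iterable


def parse_metadata(items: Iterable[str]) -> Dict[str, str]:
    """Turn GDAL-style "KEY=VALUE" strings into a dictionary."""
    parsed: Dict[str, str] = {}
    for item in items:
        # Split on the first "=" only, because a value may itself contain "=".
        key, sep, value = item.partition("=")
        if sep:  # skip entries that are not key=value pairs
            parsed[key] = value
    return parsed


# Hypothetical sample resembling what the default domain may return.
default_domain = ["AREA_OR_POINT=Area"]
metadata = parse_metadata(default_domain)
print(metadata.get("AREA_OR_POINT"))  # Area
print(metadata.get("NODATA_VALUES"))  # None, because the key is absent
```

Looking keys up with `get` rather than indexing keeps the caller safe when a dataset does not carry a given item.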

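Happily, this information looks very valuable.

Sadly, as the documentation says, not every dataset contains every key-value pair.

In my local test, a Landsat 8 scene yielded no Metadata other than AREA_OR_POINT.

So the local code currently uses LINQ plus a ternary operator. Because the project only uses Landsat 8 data for now (bands other than the 15 m panchromatic band), it can simply fall back to 30 m whenever the cellSize cannot be obtained from the Metadata.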
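A rough sketch of that lookup-with-fallback logic follows, in Python rather than the C# of the actual project. The key name `CELLSIZE` is purely hypothetical, since the post does not say which metadata item was queried, and `cell_size_or_default` is an illustrative name, not the author's code.

```python
from typing import Sequence

# Fallback for Landsat 8 non-panchromatic bands, as described in the post.
DEFAULT_CELL_SIZE_M = 30.0


def cell_size_or_default(items: Sequence[str],
                         key: str = "CELLSIZE",  # hypothetical key name
                         default: float = DEFAULT_CELL_SIZE_M) -> float:
    """Return the cell size found in "KEY=VALUE" metadata, or the default."""
    prefix = key + "="
    # First matching entry or None, similar in spirit to LINQ's FirstOrDefault.
    raw = next((item[len(prefix):] for item in items
                if item.startswith(prefix)), None)
    try:
        # Conditional expression: the Python counterpart of a ternary operator.
        return float(raw) if raw is not None else default
    except ValueError:
        # The value was present but not numeric, so fall back as well.
        return default


print(cell_size_or_default(["AREA_OR_POINT=Area"]))  # 30.0
```

A hard-coded default like this only holds while every input shares one resolution, which is the stated situation for this project.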

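If you need similar functionality, my suggestion is to compute the projected coordinates of the image's upper-left and lower-right corner points directly, then divide the resulting distances by the raster's width and height to get the CellSize. You will need to decide for yourself whether to drop the decimals or how many digits to keep. I avoid the interfaces AE provides because the COM objects they create are hard to release and AE interfaces generally carry considerable overhead; for low-level functionality, save where you can.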
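The corner-based calculation can be sketched as below, again in plain Python without GDAL. It assumes a north-up image and that the two points are the outer corners of the raster extent, not pixel centers; `cell_size_from_corners` and the sample coordinates are made up for illustration.

```python
from typing import Optional, Tuple

Point = Tuple[float, float]  # (x, y) in projected coordinates


def cell_size_from_corners(upper_left: Point, lower_right: Point,
                           width_px: int, height_px: int,
                           ndigits: Optional[int] = None) -> Tuple[float, float]:
    """Return (cell width, cell height) from the raster's outer corners."""
    if width_px <= 0 or height_px <= 0:
        raise ValueError("raster width and height must be positive")
    # Extent along each axis divided by the number of pixels on that axis.
    cell_w = abs(lower_right[0] - upper_left[0]) / width_px
    cell_h = abs(upper_left[1] - lower_right[1]) / height_px
    if ndigits is not None:
        # Optional rounding, since the post leaves the precision to the caller.
        cell_w, cell_h = round(cell_w, ndigits), round(cell_h, ndigits)
    return cell_w, cell_h


# Made-up extent: 3000 x 6000 map units covered by 100 x 200 pixels.
print(cell_size_from_corners((500000.0, 4100000.0),
                             (503000.0, 4094000.0), 100, 200))  # (30.0, 30.0)
```

If the corner coordinates were pixel centers instead, the divisors would be `width_px - 1` and `height_px - 1`. GDAL also exposes an affine geotransform through GetGeoTransform, where for a north-up image elements 1 and 5 hold the pixel width and the (usually negative) pixel height, so the division may not be needed at all.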

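Some questions remain, of course, such as how to tell whether the unit of a raster's CellSize is meters or something else, and how much error a purely mathematical approach introduces.

As for how to make sure the data has metadata, I will come back and update this post after further study.

PS: All of the code mentioned above is in a C# environment.

Thanks for reading.

Reposted from: https://www.cnblogs.com/DannielZhang/p/5167814.html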
