This section introduces the data structures used by AVI files. To avoid introducing ambiguity through translation, this part is kept in the original English; a translation may be added later if time allows.

AVIMAINHEADER structure

The AVIMAINHEADER structure defines global information in an AVI file.

typedef struct _avimainheader {
    FOURCC fcc;
    DWORD  cb;
    DWORD  dwMicroSecPerFrame;
    DWORD  dwMaxBytesPerSec;
    DWORD  dwPaddingGranularity;
    DWORD  dwFlags;
    DWORD  dwTotalFrames;
    DWORD  dwInitialFrames;
    DWORD  dwStreams;
    DWORD  dwSuggestedBufferSize;
    DWORD  dwWidth;
    DWORD  dwHeight;
    DWORD  dwReserved[4];
} AVIMAINHEADER;

Members

fcc
    Specifies a FOURCC code. The value must be 'avih'.
cb
    Specifies the size of the structure, not including the initial 8 bytes.
dwMicroSecPerFrame
    Specifies the number of microseconds between frames. This value indicates the overall timing for the file.
dwMaxBytesPerSec
    Specifies the approximate maximum data rate of the file. This value indicates the number of bytes per second the system must handle to present an AVI sequence as specified by the other parameters contained in the main header and stream header chunks.
dwPaddingGranularity
    Specifies the alignment for data, in bytes. Pad the data to multiples of this value.
dwFlags

Contains a bitwise combination of zero or more of the following flags:

Value                  Meaning
AVIF_COPYRIGHTED       Indicates the AVI file contains copyrighted data and software. When this flag is used, software should not permit the data to be duplicated.
AVIF_HASINDEX          Indicates the AVI file has an index.
AVIF_ISINTERLEAVED     Indicates the AVI file is interleaved.
AVIF_MUSTUSEINDEX      Indicates that the application should use the index, rather than the physical ordering of the chunks in the file, to determine the order of presentation of the data. For example, this flag could be used to create a list of frames for editing.
AVIF_WASCAPTUREFILE    Indicates the AVI file is a specially allocated file used for capturing real-time video. Applications should warn the user before writing over a file with this flag set because the user probably defragmented this file.

dwTotalFrames
    Specifies the total number of frames of data in the file.
dwInitialFrames
    Specifies the initial frame for interleaved files. Noninterleaved files should specify zero. If you are creating interleaved files, specify the number of frames in the file prior to the initial frame of the AVI sequence in this member.
    To give the audio driver enough audio to work with, the audio data in an interleaved file must be skewed from the video data. Typically, the audio data should be moved forward enough frames to allow approximately 0.75 seconds of audio data to be preloaded. This dwInitialFrames member should be set to the number of frames the audio is skewed. Also set the same value for the dwInitialFrames member of the AVISTREAMHEADER structure in the audio stream header.
dwStreams
    Specifies the number of streams in the file. For example, a file with audio and video has two streams.
dwSuggestedBufferSize
    Specifies the suggested buffer size for reading the file. Generally, this size should be large enough to contain the largest chunk in the file. If set to zero, or if it is too small, the playback software will have to reallocate memory during playback, which will reduce performance. For an interleaved file, the buffer size should be large enough to read an entire record, and not just a chunk.
dwWidth
    Specifies the width of the AVI file in pixels.
dwHeight
    Specifies the height of the AVI file in pixels.
dwReserved
    Reserved. Set this array to zero.

Remarks

The header file Vfw.h defines a MainAVIHeader structure that is equivalent to this structure, but omits the fcc and cb members.
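
As an illustration of how these members fit together, here is a minimal C sketch that fills an AVIMAINHEADER for a hypothetical 640x480, 25 fps file with one video and one audio stream. It assumes the typedefs and AVIF_* flags from Aviriff.h; the RIFF_FOURCC macro and all concrete values are illustrative only, not a definitive implementation.

#include <string.h>

#define RIFF_FOURCC(a,b,c,d) \
    ((DWORD)(a) | ((DWORD)(b) << 8) | ((DWORD)(c) << 16) | ((DWORD)(d) << 24))

static void fill_main_header(AVIMAINHEADER *h)
{
    memset(h, 0, sizeof(*h));                        /* also clears dwReserved[] */
    h->fcc                   = RIFF_FOURCC('a','v','i','h');
    h->cb                    = sizeof(*h) - 8;       /* size excluding fcc and cb */
    h->dwMicroSecPerFrame    = 1000000 / 25;         /* 25 fps -> 40000 us per frame */
    h->dwFlags               = AVIF_HASINDEX | AVIF_ISINTERLEAVED;
    h->dwTotalFrames         = 0;                    /* patched after the last frame is written */
    h->dwStreams             = 2;                    /* one video stream + one audio stream */
    h->dwSuggestedBufferSize = 0x10000;              /* illustrative; should cover the largest chunk */
    h->dwWidth               = 640;
    h->dwHeight              = 480;
}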

Requirements

Header Aviriff.h

AVISTREAMHEADER structure

The AVISTREAMHEADER structure contains information about one stream in an AVI file.

typedef struct _avistreamheader {
    FOURCC fcc;
    DWORD  cb;
    FOURCC fccType;
    FOURCC fccHandler;
    DWORD  dwFlags;
    WORD   wPriority;
    WORD   wLanguage;
    DWORD  dwInitialFrames;
    DWORD  dwScale;
    DWORD  dwRate;
    DWORD  dwStart;
    DWORD  dwLength;
    DWORD  dwSuggestedBufferSize;
    DWORD  dwQuality;
    DWORD  dwSampleSize;
    struct {
        short int left;
        short int top;
        short int right;
        short int bottom;
    } rcFrame;
} AVISTREAMHEADER;

Members

fcc
    Specifies a FOURCC code. The value must be 'strh'.
cb
    Specifies the size of the structure, not including the initial 8 bytes.
fccType

Contains a FOURCC that specifies the type of the data contained in the stream. The following standard AVI values for video and audio are defined.

FOURCC Description
'auds' Audio stream
'mids' MIDI stream
'txts' Text stream
'vids' Video stream

fccHandler
    Optionally, contains a FOURCC that identifies a specific data handler. The data handler is the preferred handler for the stream. For audio and video streams, this specifies the codec for decoding the stream.
dwFlags

Contains any flags for the data stream. The bits in the high-order word of these flags are specific to the type of data contained in the stream. The following standard flags are defined.

Value                      Meaning
AVISF_DISABLED Indicates this stream should not be enabled by default.
AVISF_VIDEO_PALCHANGES Indicates this video stream contains palette changes. This flag warns the playback software that it will need to animate the palette.

wPriority
    Specifies priority of a stream type. For example, in a file with multiple audio streams, the one with the highest priority might be the default stream.
wLanguage
    Language tag.
dwInitialFrames
    Specifies how far audio data is skewed ahead of the video frames in interleaved files. Typically, this is about 0.75 seconds. If you are creating interleaved files, specify the number of frames in the file prior to the initial frame of the AVI sequence in this member. For more information, see the remarks for the dwInitialFrames member of the AVIMAINHEADER structure.
dwScale
    Used with dwRate to specify the time scale that this stream will use. Dividing dwRate by dwScale gives the number of samples per second. For video streams, this is the frame rate. For audio streams, this rate corresponds to the time needed to play nBlockAlign bytes of audio, which for PCM audio is just the sample rate. (A small timing sketch follows the Remarks section below.)
dwRate
   See dwScale.
dwStart
    Specifies the starting time for this stream. The units are defined by the dwRate and dwScale members in the main file header. Usually, this is zero, but it can specify a delay time for a stream that does not start concurrently with the file.
dwLength
    Specifies the length of this stream. The units are defined by the dwRate and dwScale members of the stream's header.
dwSuggestedBufferSize
    Specifies how large a buffer should be used to read this stream. Typically, this contains a value corresponding to the largest chunk present in the stream. Using the correct buffer size makes playback more efficient. Use zero if you do not know the correct buffer size.
dwQuality
    Specifies an indicator of the quality of the data in the stream. Quality is represented as a number between 0 and 10,000. For compressed data, this typically represents the value of the quality parameter passed to the compression software. If set to –1, drivers use the default quality value.
dwSampleSize
    Specifies the size of a single sample of data. This is set to zero if the samples can vary in size. If this number is nonzero, then multiple samples of data can be grouped into a single chunk within the file. If it is zero, each sample of data (such as a video frame) must be in a separate chunk. For video streams, this number is typically zero, although it can be nonzero if all video frames are the same size. For audio streams, this number should be the same as the nBlockAlign member of the WAVEFORMATEX structure describing the audio.
rcFrame

Specifies the destination rectangle for a text or video stream within the movie rectangle specified by the dwWidth and dwHeight members of the AVI main header structure. The rcFrame member is typically used in support of multiple video streams. Set this rectangle to the coordinates corresponding to the movie rectangle to update the whole movie rectangle. Units for this member are pixels. The upper-left corner of the destination rectangle is relative to the upper-left corner of the movie rectangle.

Remarks

Some of the members of this structure are also present in the AVIMAINHEADER structure. The data in the AVIMAINHEADER structure applies to the whole file, while the data in the AVISTREAMHEADER structure applies to one stream.

The header file Vfw.h defines an AVIStreamHeader structure that is equivalent to this structure, but omits the fcc and cb members.
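
The timing sketch referenced above: a hedged example (a hypothetical helper, not part of Aviriff.h) that derives the per-second sample rate and the stream duration from dwRate, dwScale, and dwLength.

#include <stdio.h>

static void print_stream_timing(const AVISTREAMHEADER *sh)
{
    if (sh->dwScale == 0 || sh->dwRate == 0)
        return;                                      /* malformed header */

    /* dwRate / dwScale is the sample rate; for video that is the frame rate,
     * e.g. dwRate = 30000, dwScale = 1001  ->  29.97 fps. */
    double samples_per_sec = (double)sh->dwRate / (double)sh->dwScale;

    /* dwLength is expressed in the same units, so length * scale / rate is seconds. */
    double duration_sec = (double)sh->dwLength * sh->dwScale / sh->dwRate;

    printf("rate: %.3f samples/s, duration: %.3f s\n", samples_per_sec, duration_sec);
}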

Requirements

Header Aviriff.h

BITMAPINFOHEADER structure

The BITMAPINFOHEADER structure contains information about the dimensions and color format of a device-independent bitmap (DIB).

Note  This structure is also described in the GDI documentation. However, the semantics for video data are slightly different than the semantics used for GDI. If you are using this structure to describe video data, use the information given here.

typedef struct tagBITMAPINFOHEADER {
    DWORD biSize;
    LONG  biWidth;
    LONG  biHeight;
    WORD  biPlanes;
    WORD  biBitCount;
    DWORD biCompression;
    DWORD biSizeImage;
    LONG  biXPelsPerMeter;
    LONG  biYPelsPerMeter;
    DWORD biClrUsed;
    DWORD biClrImportant;
} BITMAPINFOHEADER;

Members

biSize
    Specifies the number of bytes required by the structure. This value does not include the size of the color table or the size of the color masks, if they are appended to the end of structure. See Remarks.
biWidth
    Specifies the width of the bitmap, in pixels. For information about calculating the stride of the bitmap, see Remarks.
biHeight
    Specifies the height of the bitmap, in pixels.

For uncompressed RGB bitmaps, if biHeight is positive, the bitmap is a bottom-up DIB with the origin at the lower left corner. If biHeight is negative, the bitmap is a top-down DIB with the origin at the upper left corner.

For YUV bitmaps, the bitmap is always top-down, regardless of the sign of biHeight. Decoders should offer YUV formats with positive biHeight, but for backward compatibility they should accept YUV formats with either positive or negative biHeight.

For compressed formats, biHeight must be positive, regardless of image orientation.
biPlanes
    Specifies the number of planes for the target device. This value must be set to 1.
biBitCount
    Specifies the number of bits per pixel (bpp). For uncompressed formats, this value is the average number of bits per pixel. For compressed formats, this value is the implied bit depth of the uncompressed image, after the image has been decoded.
biCompression

For compressed video and YUV formats, this member is a FOURCC code, specified as a DWORD in little-endian order. For example, YUYV video has the FOURCC 'VYUY' or 0x56595559. For more information, see FOURCC Codes.

For uncompressed RGB formats, the following values are possible:

Value Meaning
BI_RGB Uncompressed RGB.
BI_BITFIELDS Uncompressed RGB with color masks. Valid for 16-bpp and 32-bpp bitmaps.

See Remarks for more information. Note that BI_JPG and BI_PNG are not valid video formats.

For 16-bpp bitmaps, if biCompression equals BI_RGB, the format is always RGB 555. If biCompression equals BI_BITFIELDS, the format is either RGB 555 or RGB 565. Use the subtype GUID in the AM_MEDIA_TYPE structure to determine the specific RGB type.

biSizeImage
    Specifies the size, in bytes, of the image. This can be set to 0 for uncompressed RGB bitmaps.
biXPelsPerMeter
    Specifies the horizontal resolution, in pixels per meter, of the target device for the bitmap.
biYPelsPerMeter
    Specifies the vertical resolution, in pixels per meter, of the target device for the bitmap.
biClrUsed
    Specifies the number of color indices in the color table that are actually used by the bitmap. See Remarks for more information.
biClrImportant

Specifies the number of color indices that are considered important for displaying the bitmap. If this value is zero, all colors are important.

Remarks

Color Tables
    The BITMAPINFOHEADER structure may be followed by an array of palette entries or color masks. The rules depend on the value of biCompression.
    If biCompression equals BI_RGB and the bitmap uses 8 bpp or less, the bitmap has a color table immediately following the BITMAPINFOHEADER structure. The color table consists of an array of RGBQUAD values. The size of the array is given by the biClrUsed member. If biClrUsed is zero, the array contains the maximum number of colors for the given bit depth; that is, 2^biBitCount colors.
    If biCompression equals BI_BITFIELDS, the bitmap uses three DWORD color masks (red, green, and blue, respectively), which specify the byte layout of the pixels. The 1 bits in each mask indicate the bits for that color within the pixel.
    If biCompression is a video FOURCC, the presence of a color table is implied by the video format. You should not assume that a color table exists when the bit depth is 8 bpp or less. However, some legacy components might assume that a color table is present. Therefore, if you are allocating a BITMAPINFOHEADER structure, it is recommended to allocate space for a color table when the bit depth is 8 bpp or less, even if the color table is not used.

When the BITMAPINFOHEADER is followed by a color table or a set of color masks, you can use the BITMAPINFO structure to reference the color table or the color masks. The BITMAPINFO structure is defined as follows:

typedef struct tagBITMAPINFO {
    BITMAPINFOHEADER bmiHeader;
    RGBQUAD          bmiColors[1];
} BITMAPINFO;

If you cast the BITMAPINFOHEADER to a BITMAPINFO, the bmiHeader member refers to the BITMAPINFOHEADER and the bmiColors member refers to the first entry in the color table, or the first color mask.

Be aware that if the bitmap uses a color table or color masks, then the size of the entire format structure (the BITMAPINFOHEADER plus the color information) is not equal to sizeof(BITMAPINFOHEADER) or sizeof(BITMAPINFO). You must calculate the actual size for each instance.
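
A hedged sketch of that size calculation, following the color-table rules above (it assumes the Windows BI_RGB/BI_BITFIELDS constants and the RGBQUAD type; it is an illustration, not a definitive implementation):

static size_t format_block_size(const BITMAPINFOHEADER *bih)
{
    size_t size = bih->biSize;                       /* header itself, without color info */

    if (bih->biCompression == BI_BITFIELDS) {
        size += 3 * sizeof(DWORD);                   /* red, green, and blue masks */
    } else if (bih->biCompression == BI_RGB && bih->biBitCount <= 8) {
        DWORD colors = bih->biClrUsed ? bih->biClrUsed
                                      : (DWORD)1 << bih->biBitCount;
        size += colors * sizeof(RGBQUAD);            /* palette entries */
    }
    return size;
}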

Calculating Surface Stride

In an uncompressed bitmap, the stride is the number of bytes needed to go from the start of one row of pixels to the start of the next row. The image format defines a minimum stride for an image. In addition, the graphics hardware might require a larger stride for the surface that contains the image.

For uncompressed RGB formats, the minimum stride is always the image width in bytes, rounded up to the nearest DWORD. You can use the following formula to calculate the stride:

stride = ((((biWidth * biBitCount) + 31) & ~31) >> 3)
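
Written out as a small C helper (a direct transcription of the formula above, valid for uncompressed RGB only), where the buffer for a frame is then at least the stride multiplied by the absolute value of biHeight:

static LONG rgb_stride(LONG biWidth, WORD biBitCount)
{
    /* width in bits, rounded up to a 32-bit (DWORD) boundary, then converted to bytes */
    return ((biWidth * biBitCount + 31) & ~31) >> 3;
}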

For YUV formats, there is no general rule for calculating the minimum stride. You must understand the rules for the particular YUV format. For a description of the most common YUV formats, see Recommended 8-Bit YUV Formats for Video Rendering .

Decoders and video sources should propose formats where biWidth is the width of the image in pixels. If the video renderer requires a surface stride that is larger than the default image stride, it modifies the proposed media type by setting the following values:
    It sets biWidth equal to the surface stride in pixels.
    It sets the rcTarget member of the VIDEOINFOHEADER or VIDEOINFOHEADER2 structure equal to the image width, in pixels.
    Then the video renderer proposes the modified format by calling IPin::QueryAccept on the upstream pin. For more information about this mechanism, see Dynamic Format Changes.

If there is padding in the image buffer, never dereference a pointer into the memory that has been reserved for the padding. If the image buffer has been allocated in video memory, the padding might not be readable memory.

Requirements

Header WinGDI.h

WAVEFORMATEX structure

The WAVEFORMATEX structure defines the format of waveform-audio data. Only format information common to all waveform-audio data formats is included in this structure. For formats that require additional information, this structure is included as the first member in another structure, along with the additional information.

typedef struct {
    WORD  wFormatTag;
    WORD  nChannels;
    DWORD nSamplesPerSec;
    DWORD nAvgBytesPerSec;
    WORD  nBlockAlign;
    WORD  wBitsPerSample;
    WORD  cbSize;
} WAVEFORMATEX;

Members

wFormatTag

Waveform-audio format type. Format tags are registered with Microsoft Corporation for many compression algorithms. A complete list of format tags can be found in the Mmreg.h header file. For one- or two-channel Pulse Code Modulation (PCM) data, this value should be WAVE_FORMAT_PCM.

If wFormatTag equals WAVE_FORMAT_EXTENSIBLE, the structure is interpreted as a WAVEFORMATEXTENSIBLE structure.

If wFormatTag equals WAVE_FORMAT_MPEG, the structure is interpreted as an MPEG1WAVEFORMAT structure.

If wFormatTag equals WAVE_FORMAT_MPEGLAYER3, the structure is interpreted as an MPEGLAYER3WAVEFORMAT structure.

Before reinterpreting a WAVEFORMATEX structure as one of these extended structures, verify that the actual structure size is sufficiently large and that the cbSize member indicates a valid size.

nChannels
    Number of channels in the waveform-audio data. Monaural data uses one channel and stereo data uses two channels.
nSamplesPerSec
    Sample rate, in samples per second (hertz). If wFormatTag is WAVE_FORMAT_PCM, then common values for nSamplesPerSec are 8.0 kHz, 11.025 kHz, 22.05 kHz, and 44.1 kHz. For non-PCM formats, this member must be computed according to the manufacturer's specification of the format tag.
nAvgBytesPerSec
    Required average data-transfer rate, in bytes per second, for the format tag. If wFormatTag is WAVE_FORMAT_PCM, nAvgBytesPerSec must equal nSamplesPerSec × nBlockAlign. For non-PCM formats, this member must be computed according to the manufacturer's specification of the format tag.
nBlockAlign
    Block alignment, in bytes. The block alignment is the minimum atomic unit of data for the wFormatTag format type. If wFormatTag is WAVE_FORMAT_PCM, nBlockAlign must equal (nChannels × wBitsPerSample) / 8. For non-PCM formats, this member must be computed according to the manufacturer's specification of the format tag.
    Software must process a multiple of nBlockAlign bytes of data at a time. Data written to and read from a device must always start at the beginning of a block. For example, it is illegal to start playback of PCM data in the middle of a sample (that is, on a non-block-aligned boundary).
wBitsPerSample
    Bits per sample for the wFormatTag format type. If wFormatTag is WAVE_FORMAT_PCM, then wBitsPerSample should be equal to 8 or 16. For non-PCM formats, this member must be set according to the manufacturer's specification of the format tag. If wFormatTag is WAVE_FORMAT_EXTENSIBLE, this value can be any integer multiple of 8.
    Some compression schemes do not define a value for wBitsPerSample, so this member can be zero.
cbSize

Size, in bytes, of extra format information appended to the end of the WAVEFORMATEX structure. This information can be used by non-PCM formats to store extra attributes for the wFormatTag. If no extra information is required by the wFormatTag, this member must be set to zero. For WAVE_FORMAT_PCM formats (and only WAVE_FORMAT_PCM formats), this member is ignored. However, it is still recommended to set the value.
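
To tie the PCM relationships above together, here is a minimal sketch that fills a WAVEFORMATEX for 16-bit stereo PCM at 44.1 kHz. It assumes WAVE_FORMAT_PCM from Mmreg.h; the concrete sample rate and channel count are illustrative only.

#include <string.h>

static void fill_pcm_format(WAVEFORMATEX *wf)
{
    memset(wf, 0, sizeof(*wf));
    wf->wFormatTag      = WAVE_FORMAT_PCM;
    wf->nChannels       = 2;
    wf->nSamplesPerSec  = 44100;
    wf->wBitsPerSample  = 16;
    wf->nBlockAlign     = (wf->nChannels * wf->wBitsPerSample) / 8;   /* 4 bytes per sample frame */
    wf->nAvgBytesPerSec = wf->nSamplesPerSec * wf->nBlockAlign;       /* 176400 bytes per second */
    wf->cbSize          = 0;            /* ignored for PCM, but set it anyway */
}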

Requirements

Header Mmreg.h

AVIOLDINDEX structure

The AVIOLDINDEX structure describes an AVI 1.0 index ('idx1' format). New AVI files should use an AVI 2.0 index ('indx' format).

typedef struct _avioldindex {
    FOURCC fcc;
    DWORD  cb;
    struct _avioldindex_entry {
        DWORD dwChunkId;
        DWORD dwFlags;
        DWORD dwOffset;
        DWORD dwSize;
    } aIndex[];
} AVIOLDINDEX;

Members

fcc
    Specifies a FOURCC code. The value must be 'idx1'.
cb
    Specifies the size of the structure, not including the initial 8 bytes.
aIndex
    Array of structures that contain the following members.
dwChunkId

Specifies a FOURCC that identifies a stream in the AVI file. The FOURCC must have the form 'xxyy' where xx is the stream number and yy is a two-character code that identifies the contents of the stream:

    db (Uncompressed video frame)
    dc (Compressed video frame)
    pc (Palette change)
    wb (Audio data)

dwFlags

Specifies a bitwise combination of zero or more of the following flags:

Value Meaning
AVIIF_KEYFRAME The data chunk is a key frame.
AVIIF_LIST The data chunk is a 'rec ' list.
AVIIF_NO_TIME The data chunk does not affect the timing of the stream. For example, this flag should be set for palette changes.

dwOffset
    Specifies the location of the data chunk in the file. The value should be specified as an offset, in bytes, from the start of the 'movi' list; however, in some AVI files it is given as an offset from the start of the file.
dwSize

Specifies the size of the data chunk, in bytes.

Remarks

This structure consists of the initial RIFF chunk (the fcc and cb members) followed by one index entry for each data chunk in the 'movi' list.
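
As a rough illustration of how the entries are laid out, the following hypothetical helper walks an in-memory 'idx1' chunk and lists the key frames of stream 00. It assumes the AVIOLDINDEX definition and AVIIF_* flags above; error handling and the offset-base ambiguity noted for dwOffset are ignored.

#include <stdio.h>
#include <string.h>

static void list_keyframes(const AVIOLDINDEX *idx)
{
    DWORD entries = idx->cb / sizeof(idx->aIndex[0]);    /* each index entry is 16 bytes */

    for (DWORD i = 0; i < entries; i++) {
        /* '00db' / '00dc': uncompressed / compressed frames of stream 00;
         * the FOURCC bytes are stored in file order, so compare the prefix. */
        if (memcmp(&idx->aIndex[i].dwChunkId, "00d", 3) == 0 &&
            (idx->aIndex[i].dwFlags & AVIIF_KEYFRAME)) {
            printf("key frame at offset %lu, size %lu\n",
                   (unsigned long)idx->aIndex[i].dwOffset,
                   (unsigned long)idx->aIndex[i].dwSize);
        }
    }
}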

Requirements

Header Aviriff.h

The planned articles in this series are:

AVI Audio/Video Container Format Study (1) — A Summary of the Microsoft RIFF File Format

AVI Audio/Video Container Format Study (2) — AVI RIFF File Reference

AVI Audio/Video Container Format Study (3) — AVI Data Structure Analysis

AVI Audio/Video Container Format Study (4) — Muxing AVI Audio/Video in C on Linux

Wen Lee

2018.04.01

  

---------------------------------------Update 2022.08.20 21:18----------------------------------------

For various reasons, future articles will be published on my WeChat official account; this platform will no longer be updated.

The test project code for the related CSDN articles has also been moved to the official account, where it can be downloaded for free (no points required).

You can reach the account through the QR code on my homepage, or by searching for the WeChat official account liwen01.

liwen01   2022.08.20

---------------------------------------Update 2022.08.20 21:18----------------------------------------
