Video Encoding and Decoding (1): Determining H.264 Frame Types When Encoding with ffmpeg

This article explains how I frames, B frames and P frames are produced when encoding with ffmpeg, and how to determine the frame type in code.

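Many articles online explain how to tell I, B and P frames apart, but they stop at the definitions in the official H.264 specification. Without tying those definitions to ffmpeg, the discussion stays theoretical and is of little practical use. Many of them also conflate an I frame with an I slice and treat an I slice as an I frame, which is wrong. This article uses ffmpeg to show how the various frame types are actually encoded, and corrects some claims made in other articles.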

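First, some definitions:

I frame: intra prediction only; it can be decoded without reference to any other frame

P frame: forward prediction from previously decoded frames

B frame: bidirectional prediction from both earlier and later frames

IDR picture: an I frame that provides random access; the first frame of a stream is an IDR picture. (An IDR picture is always an I picture, but an I picture is not necessarily an IDR picture.) No frame after an IDR frame may reference anything that precedes that IDR frame

(1) An IDR frame is always an I frame
(2) An I frame contains SPS, PPS and I slices (in the ffmpeg output examined in this article, the SPS and PPS are emitted in the same packet as the I slice)
(3) A P frame contains P slices
(4) A B frame contains B slices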

FFMPEG has the following parameters related to frame types:

AVCodecContext* pCodecCtx;
pCodecCtx->gop_size = 25;
pCodecCtx->max_b_frames = 3;

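gop_size sets the number of frames in a GOP. By default the first frame of a GOP is an I frame, so gop_size can also be read as the distance, in frames, between two adjacent I frames (the encoder treats it as a maximum and may insert an I frame earlier, for example at a scene cut). max_b_frames sets the maximum number of B frames that may appear between two adjacent non-B frames.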
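To make the two parameters concrete, here is a minimal sketch in C that writes the idealized coded-order pattern of one GOP. gop_pattern is a hypothetical helper written for this illustration, not an ffmpeg function, and it assumes the encoder always uses the maximum number of B frames. A real encoder decides frame types adaptively, so the actual output may differ.

```c
#include <stddef.h>

/* Writes an idealized coded-order frame pattern for one GOP into out
 * (NUL-terminated) and returns the number of frames written.
 * Hypothetical helper for illustration only: it assumes every group
 * uses max_b_frames B frames, which a real encoder is free not to do. */
size_t gop_pattern(int gop_size, int max_b_frames, char *out, size_t cap)
{
    size_t n = 0;
    int remaining = gop_size - 1;          /* frames after the leading I frame */

    if (cap == 0)
        return 0;
    if (gop_size <= 0) {
        out[0] = '\0';
        return 0;
    }
    if (n + 1 < cap)
        out[n++] = 'I';
    while (remaining > 0 && n + 1 < cap) {
        /* B frames in this group: at most max_b_frames, and one slot
         * is needed for the P frame they depend on. */
        int b = remaining - 1 < max_b_frames ? remaining - 1 : max_b_frames;

        out[n++] = 'P';                    /* in coded order the P comes first */
        remaining--;
        for (; b > 0 && n + 1 < cap; b--, remaining--)
            out[n++] = 'B';
    }
    out[n] = '\0';
    return n;
}
```

With gop_size = 25 and max_b_frames = 3 this yields an I followed by six PBBB groups; with max_b_frames = 0 it yields an I followed only by P frames.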

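A GOP is either closed or open. With closed GOPs, the first frame of every GOP is an IDR picture. With open GOPs, only the first GOP starts with an IDR picture; the first frame of each later GOP is a non-IDR picture.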

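Now look at the encoding function:

int ret = avcodec_encode_video2(pCodecCtx, &pkt, pFrame, &got_picture);

The third parameter, pFrame, is of type AVFrame*. Its field pFrame->pict_type requests the type of the frame to be encoded. This value is set by the caller and defaults to 0; the encoder does not fill it in on the input frame, i.e. it will not mark the frame as an I frame or a B frame for you. pict_type has the following type: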

enum AVPictureType {
    AV_PICTURE_TYPE_NONE = 0, ///< Undefined
    AV_PICTURE_TYPE_I,        ///< Intra
    AV_PICTURE_TYPE_P,        ///< Predicted
    AV_PICTURE_TYPE_B,        ///< Bi-dir predicted
    AV_PICTURE_TYPE_S,        ///< S(GMC)-VOP MPEG4
    AV_PICTURE_TYPE_SI,       ///< Switching Intra
    AV_PICTURE_TYPE_SP,       ///< Switching Predicted
    AV_PICTURE_TYPE_BI,       ///< BI type
};

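The frame type is determined from the value of pkt.data[4] in the encoded output. This assumes the packet begins with the 4-byte start code 00 00 00 01, so that pkt.data[4] is the header byte of the first NAL unit; (pkt.data[4] & 0x1F) is then the nal_unit_type defined in the H.264 specification. The values that matter in this article are:

1: coded slice of a non-IDR picture
5: coded slice of an IDR picture
6: SEI
7: SPS
8: PPS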
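As a sketch of what the 0x1F mask does, the header byte of a NAL unit can be split into its three fields as follows. NalHeader and parse_nal_header are names chosen for this illustration and are not part of ffmpeg; the field layout (1 + 2 + 5 bits) is the one defined by the H.264 specification.

```c
#include <stdint.h>

/* The three fields packed into the first byte of an H.264 NAL unit.
 * Illustrative type, not an ffmpeg structure. */
typedef struct {
    unsigned forbidden_zero_bit; /* 1 bit, must be 0 in a valid stream   */
    unsigned nal_ref_idc;        /* 2 bits, 0 = not used for reference   */
    unsigned nal_unit_type;      /* 5 bits, e.g. 1, 5, 6, 7, 8           */
} NalHeader;

/* Splits one NAL header byte (for example pkt.data[4] after a
 * 4-byte start code) into its fields. */
NalHeader parse_nal_header(uint8_t b)
{
    NalHeader h;
    h.forbidden_zero_bit = (b >> 7) & 0x01;
    h.nal_ref_idc        = (b >> 5) & 0x03;
    h.nal_unit_type      = b & 0x1F;
    return h;
}
```

For example, 0x67 decodes to nal_ref_idc 3 and nal_unit_type 7 (SPS), while 0x41 and 0x01 both decode to nal_unit_type 1 and differ only in nal_ref_idc (2 and 0).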

Here a yuv video file is encoded into an H264 file, with the following parameters:

pCodecCtx->max_b_frames = 3;
pFrame->pict_type = AV_PICTURE_TYPE_NONE;
pCodecCtx->gop_size = 25;

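With these parameters, the test program prints cNalu = pkt.data[4] and type = (pkt.data[4] & 0x1F) for every packet.

When pFrame->pict_type = AV_PICTURE_TYPE_NONE, pkt.data[4] takes only three values: 0x01, 0x41 and 0x67. (pkt.data[4] & 0x1F) is therefore either 0x01 (non-IDR slice) or 0x07 (SPS); 0x01 and 0x41 differ only in the nal_ref_idc bits. For H.264 as encoded here, only I frames carry SPS and PPS, so 0x07 marks an I frame and 0x01 marks a P or B frame. In short, with pFrame->pict_type = AV_PICTURE_TYPE_NONE and looking at the frame header (the first NAL unit of the packet), (pkt.data[4] & 0x1F) equal to 0x07 means an I frame and equal to 0x01 means a B or P frame. Note that this is about the frame header. For a slice header, a nal_unit_type of 0x1 can also belong to a non-IDR I slice.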
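To tell P, B and I slices apart when nal_unit_type is 1, the slice header itself has to be read: it begins with two Exp-Golomb coded fields, first_mb_in_slice and slice_type. The following is a minimal sketch under simplifying assumptions: the helper names are made up for this example, the input is a single NAL unit without its start code, and emulation prevention bytes (00 00 03) are not removed, which normally does not matter for the first few bits of a slice header.

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal MSB-first bit reader (illustrative, not an ffmpeg type). */
typedef struct {
    const uint8_t *buf;
    size_t size_bits;
    size_t pos;
} BitReader;

static int read_bit(BitReader *br)
{
    int bit;
    if (br->pos >= br->size_bits)
        return -1;                               /* out of data */
    bit = (br->buf[br->pos >> 3] >> (7 - (br->pos & 7))) & 1;
    br->pos++;
    return bit;
}

/* Reads one unsigned Exp-Golomb value ue(v); returns -1 on error. */
static long read_ue(BitReader *br)
{
    int zeros = 0, bit, i;
    long suffix = 0;

    while ((bit = read_bit(br)) == 0)
        if (++zeros > 16)
            return -1;                           /* implausibly long code */
    if (bit < 0)
        return -1;
    for (i = 0; i < zeros; i++) {
        bit = read_bit(br);
        if (bit < 0)
            return -1;
        suffix = (suffix << 1) | bit;
    }
    return ((1L << zeros) - 1) + suffix;
}

/* Returns 'P', 'B', 'I', 'S' (SP or SI) for a slice NAL unit,
 * or '?' if the unit is not a slice or cannot be parsed.
 * nal points at the NAL header byte, i.e. just after the start code. */
char slice_type_letter(const uint8_t *nal, size_t size)
{
    BitReader br;
    unsigned type;
    long slice_type;

    if (size < 2)
        return '?';
    type = nal[0] & 0x1F;
    if (type != 1 && type != 5)
        return '?';                              /* not a coded slice */
    br.buf = nal + 1;
    br.size_bits = (size - 1) * 8;
    br.pos = 0;
    if (read_ue(&br) < 0)                        /* first_mb_in_slice */
        return '?';
    slice_type = read_ue(&br);                   /* slice_type, 0..9 */
    if (slice_type < 0 || slice_type > 9)
        return '?';
    switch (slice_type % 5) {                    /* 5..9 repeat 0..4 */
    case 0:  return 'P';
    case 1:  return 'B';
    case 2:  return 'I';
    default: return 'S';
    }
}
```

A caller would pass pkt.data + 4 and pkt.size - 4 for a packet that starts with a 4-byte start code and contains a single slice; for a packet that starts with SPS, the slice NAL unit has to be located first.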

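The frame sequence after encoding is as follows; the frame sequence of one group is: I PBBB PBBB PBBB PBBB PBBB PBBB

The first frame produced by ffmpeg (the I frame) has a length of 0x1F35, i.e. 7989 in decimal, which matches the I-frame size of 7989 shown in the frame sequence figure.

Looking at the slice structure of the encoded file, the I frame contains four structures, SPS, PPS, SEI and an I slice, with a total length of 0x1F35 (7989). Each following P slice and B slice forms a frame of its own: the first P frame has a length of 0x797 (1943) and the first B frame a length of 0x3D8 (984).

The hexadecimal dump of the file shows the same structure (figure: hex dump of the encoded file).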
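A packet like that first one can be split into its NAL units by scanning for start codes. The sketch below is an illustration only, with hypothetical helper names; it accepts both the 3-byte (00 00 01) and 4-byte (00 00 00 01) start codes and records the nal_unit_type of each unit it finds.

```c
#include <stddef.h>
#include <stdint.h>

/* Scans an Annex B byte stream and stores the nal_unit_type of each
 * NAL unit found, up to max_types entries. Returns the number stored.
 * A 4-byte start code is simply a zero byte followed by 00 00 01,
 * so searching for 00 00 01 covers both forms. */
size_t list_nal_types(const uint8_t *buf, size_t size,
                      uint8_t *types, size_t max_types)
{
    size_t count = 0;
    size_t i = 0;

    while (i + 3 <= size) {
        if (buf[i] == 0 && buf[i + 1] == 0 && buf[i + 2] == 1) {
            if (i + 3 < size && count < max_types)
                types[count++] = buf[i + 3] & 0x1F;
            i += 3;                     /* continue after the start code */
        } else {
            i++;
        }
    }
    return count;
}

/* Returns the nal_unit_type of the first coded slice in the buffer
 * (1 = non-IDR slice, 5 = IDR slice), or -1 if there is none. */
int first_slice_nal_type(const uint8_t *buf, size_t size)
{
    uint8_t types[32];
    size_t n = list_nal_types(buf, size, types, 32);
    size_t i;

    for (i = 0; i < n; i++)
        if (types[i] == 1 || types[i] == 5)
            return types[i];
    return -1;
}
```

For a packet laid out as SPS, PPS, SEI, slice, the first header byte gives type 7, while first_slice_nal_type reports the type of the slice itself, which is what distinguishes an IDR picture (5) from a non-IDR one (1).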

When the parameters are set to:

pCodecCtx->max_b_frames = 0;
pFrame->pict_type = AV_PICTURE_TYPE_NONE;
pCodecCtx->gop_size = 25;

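Here the number of B frames is set to 0, and the printed output contains only I frames and P frames.

The encoded sequence is as follows; the frame sequence of one group is: I PPPPPPPPPPPPPPPPPPPP

The behaviour for the other pict_type settings is as follows:

AV_PICTURE_TYPE_NONE: three frame types are output: I frames, B frames and P frames. The spacing of I frames and B frames follows gop_size and max_b_frames. When max_b_frames is 0, only I frames and P frames are output.

AV_PICTURE_TYPE_I: every output frame is an I frame. max_b_frames should be set to 0 in this case; if it is not, the type value may be 0x1 even though the frame is actually an I frame.

AV_PICTURE_TYPE_P: the output consists of I frames and P frames; max_b_frames should be set to 0. The I-frame interval is set by gop_size.

AV_PICTURE_TYPE_B: same effect as AV_PICTURE_TYPE_I; max_b_frames must be greater than 0.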

The test code is the example written by Lei Xiaohua (雷霄骅):

#include "stdafx.h"
#include <stdio.h>

#define __STDC_CONSTANT_MACROS

// Note: this example uses the older FFmpeg API (av_register_all,
// AVStream::codec, avcodec_encode_video2), which is deprecated in
// later FFmpeg releases.
#ifdef _WIN32
//Windows
extern "C"
{
#include "libavutil/opt.h"
#include "libavcodec/avcodec.h"
#include "libavformat/avformat.h"
};
#else
//Linux...
#ifdef __cplusplus
extern "C"
{
#endif
#include <libavutil/opt.h>
#include <libavcodec/avcodec.h>
#include <libavformat/avformat.h>
#ifdef __cplusplus
};
#endif
#endif

#pragma comment(lib,"avcodec.lib")
#pragma comment(lib,"avutil.lib")
#pragma comment(lib,"swscale.lib")
#pragma comment(lib, "avformat.lib")

// Drain the frames still buffered inside the encoder and write them out.
int flush_encoder(AVFormatContext *fmt_ctx, unsigned int stream_index) {
    int ret;
    int got_frame;
    AVPacket enc_pkt;
    if (!(fmt_ctx->streams[stream_index]->codec->codec->capabilities &
          CODEC_CAP_DELAY))
        return 0;
    while (1) {
        enc_pkt.data = NULL;
        enc_pkt.size = 0;
        av_init_packet(&enc_pkt);
        // A NULL frame tells the encoder to flush its delayed packets.
        ret = avcodec_encode_video2(fmt_ctx->streams[stream_index]->codec, &enc_pkt,
                                    NULL, &got_frame);
        av_frame_free(NULL);
        if (ret < 0)
            break;
        if (!got_frame) {
            ret = 0;
            break;
        }
        printf("Flush Encoder: Succeed to encode 1 frame!\tsize:%5d\n", enc_pkt.size);
        /* mux encoded frame */
        ret = av_write_frame(fmt_ctx, &enc_pkt);
        if (ret < 0)
            break;
    }
    return ret;
}

int main(int argc, char* argv[])
{
    AVFormatContext* pFormatCtx;
    AVOutputFormat* fmt;
    AVStream* video_st;
    AVCodecContext* pCodecCtx;
    AVCodec* pCodec;
    AVPacket pkt;
    uint8_t* picture_buf;
    AVFrame* pFrame;
    int picture_size;
    int y_size;
    int framecnt = 0;
    //FILE *in_file = fopen("src01_480x272.yuv", "rb");  //Input raw YUV data
    // Note: a .y4m file also contains a stream header and per-frame
    // headers, which the loop below reads as if they were pixel data.
    FILE *in_file = fopen("../akiyo_cif.y4m", "rb");     //Input raw YUV data
    int in_w = 352, in_h = 288;                          //Input data's width and height
    int framenum = 300;                                  //Frames to encode
    const char* out_file = "ds.h264";

    if (in_file == NULL) {
        printf("Failed to open input file! \n");
        return -1;
    }

    av_register_all();
    //Method1.
    pFormatCtx = avformat_alloc_context();
    //Guess Format
    fmt = av_guess_format(NULL, out_file, NULL);
    pFormatCtx->oformat = fmt;

    //Open output URL
    if (avio_open(&pFormatCtx->pb, out_file, AVIO_FLAG_READ_WRITE) < 0) {
        printf("Failed to open output file! \n");
        return -1;
    }

    video_st = avformat_new_stream(pFormatCtx, 0);
    //video_st->time_base.num = 1;
    //video_st->time_base.den = 25;
    if (video_st == NULL) {
        return -1;
    }

    //Param that must set
    pCodecCtx = video_st->codec;
    pCodecCtx->codec_id = fmt->video_codec;
    pCodecCtx->codec_type = AVMEDIA_TYPE_VIDEO;
    pCodecCtx->pix_fmt = AV_PIX_FMT_YUV420P;
    pCodecCtx->width = in_w;
    pCodecCtx->height = in_h;
    pCodecCtx->bit_rate = 400000;
    pCodecCtx->gop_size = 25;
    pCodecCtx->time_base.num = 1;
    pCodecCtx->time_base.den = 25;

    //H264
    //pCodecCtx->me_range = 16;
    //pCodecCtx->max_qdiff = 4;
    //pCodecCtx->qcompress = 0.6;
    pCodecCtx->qmin = 10;
    pCodecCtx->qmax = 51;

    //Optional Param
    pCodecCtx->max_b_frames = 3;

    // Set Option
    AVDictionary *param = 0;
    //H.264
    if (pCodecCtx->codec_id == AV_CODEC_ID_H264) {
        av_dict_set(&param, "preset", "slow", 0);
        av_dict_set(&param, "tune", "zerolatency", 0);
        //av_dict_set(&param, "profile", "main", 0);
    }
    //H.265
    //if (pCodecCtx->codec_id == AV_CODEC_ID_H265) {
    //    av_dict_set(&param, "preset", "ultrafast", 0);
    //    av_dict_set(&param, "tune", "zero-latency", 0);
    //}

    //Show some Information
    av_dump_format(pFormatCtx, 0, out_file, 1);

    pCodec = avcodec_find_encoder(pCodecCtx->codec_id);
    if (!pCodec) {
        printf("Can not find encoder! \n");
        return -1;
    }
    if (avcodec_open2(pCodecCtx, pCodec, &param) < 0) {
        printf("Failed to open encoder! \n");
        return -1;
    }

    pFrame = av_frame_alloc();
    picture_size = avpicture_get_size(pCodecCtx->pix_fmt, pCodecCtx->width, pCodecCtx->height);
    picture_buf = (uint8_t *)av_malloc(picture_size);
    avpicture_fill((AVPicture *)pFrame, picture_buf, pCodecCtx->pix_fmt, pCodecCtx->width, pCodecCtx->height);

    //Write File Header
    avformat_write_header(pFormatCtx, NULL);

    av_new_packet(&pkt, picture_size);

    y_size = pCodecCtx->width * pCodecCtx->height;
    pFrame->width = pCodecCtx->width;
    pFrame->height = pCodecCtx->height;
    pFrame->format = pCodecCtx->pix_fmt;

    int iPFrame = 0;
    // The frame type requested from the encoder; changed between test runs.
    pFrame->pict_type = AV_PICTURE_TYPE_NONE;

    for (int i = 0; i < framenum; i++) {
        //Read raw YUV data
        if (fread(picture_buf, 1, y_size * 3 / 2, in_file) <= 0) {
            printf("Failed to read raw data! \n");
            return -1;
        }
        else if (feof(in_file)) {
            break;
        }
        pFrame->data[0] = picture_buf;                  // Y
        pFrame->data[1] = picture_buf + y_size;         // U
        pFrame->data[2] = picture_buf + y_size * 5 / 4; // V
        //PTS
        //pFrame->pts=i;
        pFrame->pts = i*(video_st->time_base.den) / ((video_st->time_base.num) * 25);
        int got_picture = 0;
        //Encode
        int ret = avcodec_encode_video2(pCodecCtx, &pkt, pFrame, &got_picture);
        if (ret < 0) {
            printf("Failed to encode! \n");
            return -1;
        }
        if (got_picture == 1) {
            framecnt++;
            pkt.stream_index = video_st->index;
            // Header byte of the first NAL unit, after the 4-byte start code.
            unsigned char cNalu = pkt.data[4];
            unsigned char type = (cNalu & 0x1f);
            if (type == 7) {
                // Packet starts with SPS: treated as an I frame.
                iPFrame = 0;
                printf("Succeed to encode frame: %5d\tsize:%5d, cNalu = 0x%-2x, type = 0x%-2x,=====I frame\n", framecnt, pkt.size, cNalu, type);
            }
            else if (type == 1) {
                // Non-IDR slice: a P or B frame (counted here as P).
                iPFrame++;
                printf("Succeed to encode frame: %5d\tsize:%5d, cNalu = 0x%-2x, type = 0x%-2x, P frame:%d\n", framecnt, pkt.size, cNalu, type, iPFrame);
            }
            else {
                printf("Other frame type: cNalu = 0x%-2x, type = 0x%-2x\n", cNalu, type);
            }
            ret = av_write_frame(pFormatCtx, &pkt);
            av_free_packet(&pkt);
        }
    }

    //Flush Encoder
    int ret = flush_encoder(pFormatCtx, 0);
    if (ret < 0) {
        printf("Flushing encoder failed\n");
        return -1;
    }

    //Write file trailer
    av_write_trailer(pFormatCtx);

    //Clean
    if (video_st) {
        avcodec_close(video_st->codec);
        av_free(pFrame);
        av_free(picture_buf);
    }
    avio_close(pFormatCtx->pb);
    avformat_free_context(pFormatCtx);

    fclose(in_file);

    return 0;
}
