Learning FFmpeg: Function Analysis of swr (libswresample)

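Table of Contents

Brief overview
- Sample format conversion
- Channel layout conversion
- Sample rate conversion
Using the libswresample library
- Function reference
Example code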

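For source-code analysis of the main FFmpeg API functions, refer to the "Lei Shen" (雷神) article series, which is collected in the article "ffmpeg学习（2）获取和使用，源码分析" (Learning FFmpeg (2): Obtaining and Using FFmpeg, Source Analysis).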

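libswresample is used mainly for audio resampling and format conversion. It provides the following:

Sample rate conversion: changes the sampling rate of the audio, for example from 44100 Hz down to 8000 Hz. Converting from a higher to a lower sampling rate is lossy.

Channel layout conversion: changes the channel layout, for example from stereo to mono. When the input channels cannot be mapped directly onto the output, the conversion is lossy because it involves mixing with channel-specific gain factors.

Sample format conversion: changes the sample format, for example from s16 (AV_SAMPLE_FMT_S16) PCM data to u8 (AV_SAMPLE_FMT_U8) or 32-bit float (AV_SAMPLE_FMT_FLT) PCM data. It can also convert between packed and planar layouts.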

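Brief overview

For an introduction to PCM, see the article "ffmpeg学习音频采样数据PCM". Sample formats and channel layouts can be converted by hand in simple cases; see the article "ffmpeg学习（6）音频解码、音频数据处理" for details. A short recap follows.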

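Sample format conversion

To convert samples from 32-bit float to unsigned 8-bit (unsigned char), map the value range to [0, 255]. The result must be clamped, because an input of exactly 1.0 would otherwise map to 256 and overflow.

// frame holds planar float samples; in_sample_bytes == sizeof(float)
for (int n = 0; n < frame->nb_samples; n++) {
    for (int c = 0; c < frame->channels; c++) {
        float vsrc = *(float *)(frame->data[c] + n * in_sample_bytes);
        float v = vsrc * 128 + 128;
        if (v < 0)   v = 0;      // clamp to the u8 range
        if (v > 255) v = 255;
        unsigned char vdst = (unsigned char)v;
        fwrite(&vdst, sizeof(unsigned char), 1, fpcm);
    }
}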

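To convert samples from 32-bit float to 16-bit short, map the value range to [-32768, 32767], again clamping so that an input of 1.0 does not overflow.

for (int n = 0; n < frame->nb_samples; n++) {
    for (int c = 0; c < frame->channels; c++) {
        float vsrc = *(float *)(frame->data[c] + n * in_sample_bytes);
        float v = vsrc * 32768;
        if (v < -32768) v = -32768;  // clamp to the s16 range
        if (v >  32767) v =  32767;
        short vdst = (short)v;
        fwrite(&vdst, sizeof(short), 1, fpcm);
    }
}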
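The per-sample arithmetic in the two loops above can be factored into small helpers. The following is only a sketch: the names float_to_u8 and float_to_s16 are made up for illustration and are not FFmpeg functions, and libswresample's own rounding and dithering may differ from this simple truncation.

```c
#include <stdint.h>

/* Hypothetical helpers (not part of FFmpeg): convert one float sample in
 * the nominal range [-1.0, 1.0] to an integer format, clamping values
 * that would otherwise overflow the destination type. */
uint8_t float_to_u8(float v)
{
    float scaled = v * 128.0f + 128.0f;   /* map [-1, 1) to [0, 256) */
    if (scaled < 0.0f)   scaled = 0.0f;
    if (scaled > 255.0f) scaled = 255.0f;
    return (uint8_t)scaled;
}

int16_t float_to_s16(float v)
{
    float scaled = v * 32768.0f;          /* map [-1, 1) to [-32768, 32768) */
    if (scaled < -32768.0f) scaled = -32768.0f;
    if (scaled >  32767.0f) scaled =  32767.0f;
    return (int16_t)scaled;
}
```

With these helpers, silence (0.0) maps to 128 in u8 and to 0 in s16, and a full-scale input of 1.0 saturates at 255 and 32767 respectively instead of wrapping around.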

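Channel layout conversion

When going from fewer channels to more, you can duplicate an existing channel. When going from more to fewer, you can simply keep the channels you need.
When reducing 2 channels to 1, either keep one of them or take the average:

for (int n = 0; n < frame->nb_samples; n++) {
    float vdst = 0;
    for (int c = 0; c < frame->channels; c++)
        vdst += *(float *)(frame->data[c] + n * in_sample_bytes);
    vdst /= frame->channels;   // average of all channels
    fwrite(&vdst, sizeof(float), 1, fpcm);
}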
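The opposite direction, duplicating a channel to go from fewer channels to more, is described above but not shown. A minimal sketch for mono to interleaved stereo follows; the function name mono_to_stereo_packed is hypothetical, and it assumes float samples.

```c
#include <stddef.h>

/* Hypothetical helper: duplicate each mono sample into the left and right
 * slots of an interleaved (packed) stereo buffer.
 * dst must have room for 2 * nb_samples floats. */
void mono_to_stereo_packed(const float *mono, float *dst, size_t nb_samples)
{
    for (size_t n = 0; n < nb_samples; n++) {
        dst[2 * n]     = mono[n];  /* left  */
        dst[2 * n + 1] = mono[n];  /* right */
    }
}
```

For a planar stereo output, no per-sample loop would be needed: the mono plane could simply be copied once into each of the two output planes.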

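Sample rate conversion

Only the case where the source rate is an integer multiple of the target rate is shown here, for example 48000 Hz to 8000 Hz. In that case we keep one out of every 6 input samples. Note that this simple decimation applies no low-pass filter first, so it can introduce aliasing. For example:

for (int n = 0; n < frame->nb_samples; n += 6) {
    for (int c = 0; c < frame->channels; c++) {
        float vsrc = *(float *)(frame->data[c] + n * in_sample_bytes);
        float v = vsrc * 128;
        if (v < -128) v = -128;  // clamp to the signed 8-bit range
        if (v >  127) v =  127;
        signed char vdst = (signed char)v;
        fwrite(&vdst, sizeof(vdst), 1, fpcm);
    }
}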
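When the two rates are not integer multiples of each other, dropping samples no longer works and output samples have to be interpolated. The sketch below uses plain linear interpolation on a single channel to illustrate the idea. The function name resample_linear is hypothetical, no anti-aliasing filter is applied, and the quality is well below what libswresample produces.

```c
#include <stddef.h>

/* Hypothetical helper: resample one channel by linear interpolation.
 * Writes at most out_cap samples to out and returns the number written.
 * No low-pass filtering is applied, so downsampling may alias. */
size_t resample_linear(const float *in, size_t in_count, int in_rate,
                       float *out, size_t out_cap, int out_rate)
{
    size_t n = 0;
    if (in_count == 0 || in_rate <= 0 || out_rate <= 0)
        return 0;
    for (; n < out_cap; n++) {
        /* Position of output sample n on the input sample axis. */
        double pos = (double)n * in_rate / out_rate;
        size_t i0  = (size_t)pos;
        if (i0 >= in_count)
            break;                                    /* input exhausted */
        size_t i1   = (i0 + 1 < in_count) ? i0 + 1 : i0;  /* hold last sample */
        double frac = pos - (double)i0;
        out[n] = (float)(in[i0] * (1.0 - frac) + in[i1] * frac);
    }
    return n;
}
```

When in_rate is an integer multiple of out_rate, every position falls exactly on an input sample and the function reduces to the decimation shown above.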

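Using the libswresample library

When the sample rate of the audio differs from the sample rate of the player, the audio must be resampled before it can play correctly; otherwise the playback speed may change. (When one rate is not an integer multiple of the other, doing this by hand requires interpolation and similar techniques.) This section focuses on using libswresample to convert audio sample data.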

Workflow
(1) Create a SwrContext
(2) Call swr_convert() to convert the sample data
(3) Free the SwrContext

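As with SwsContext, there are two ways to create the SwrContext for the first time:

Method 1: call SwrContext *swr = swr_alloc_set_opts(…), then call swr_init(swr);

Method 2: call SwrContext *swr = swr_alloc();, then set each parameter with av_opt_set_xxxx(), and finally call swr_init(swr);.

Usually the first method is used for the initial setup. If parameters need to change later, call av_opt_set_xxxx() again and then re-run swr_init(swr);

Note that newer FFmpeg releases deprecate swr_alloc_set_opts() in favor of swr_alloc_set_opts2(); this article uses the older API.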

Function reference

Allocating and configuring the SwrContext

/**
 * Allocate SwrContext if needed and set/reset common parameters.
 *
 * This function does not require s to be allocated with swr_alloc(). On the
 * other hand, swr_alloc() can use swr_alloc_set_opts() to set the parameters
 * on the allocated context.
 *
 * @param s               existing Swr context if available, or NULL if not
 * @param out_ch_layout   output channel layout (AV_CH_LAYOUT_*)
 * @param out_sample_fmt  output sample format (AV_SAMPLE_FMT_*).
 * @param out_sample_rate output sample rate (frequency in Hz)
 * @param in_ch_layout    input channel layout (AV_CH_LAYOUT_*)
 * @param in_sample_fmt   input sample format (AV_SAMPLE_FMT_*).
 * @param in_sample_rate  input sample rate (frequency in Hz)
 * @param log_offset      logging level offset
 * @param log_ctx         parent logging context, can be NULL
 *
 * @see swr_init(), swr_free()
 * @return NULL on error, allocated context otherwise
 */
struct SwrContext *swr_alloc_set_opts(struct SwrContext *s,
                                      int64_t out_ch_layout, enum AVSampleFormat out_sample_fmt, int out_sample_rate,
                                      int64_t  in_ch_layout, enum AVSampleFormat  in_sample_fmt, int  in_sample_rate,
                                      int log_offset, void *log_ctx);

Converting sample data. The parameters are the output and input buffer pointers together with their sample counts, and the return value is the number of samples produced per channel. Passing NULL as the input flushes the data buffered inside the context.

/**
 * Convert audio.
 *
 * in and in_count can be set to 0 to flush the last few samples out at the
 * end.
 *
 * If more input is provided than output space, then the input will be buffered.
 * You can avoid this buffering by using swr_get_out_samples() to retrieve an
 * upper bound on the required number of output samples for the given number of
 * input samples. Conversion will run directly without copying whenever possible.
 *
 * @param s         allocated Swr context, with parameters set
 * @param out       output buffers, only the first one need be set in case of packed audio
 * @param out_count amount of space available for output in samples per channel
 * @param in        input buffers, only the first one need to be set in case of packed audio
 * @param in_count  number of input samples available in one channel
 *
 * @return number of samples output per channel, negative value on error
 */
int swr_convert(struct SwrContext *s, uint8_t **out, int out_count,
                const uint8_t **in, int in_count);

Other related functions

int av_get_bytes_per_sample(enum AVSampleFormat sample_fmt); // number of bytes occupied by one sample
int av_sample_fmt_is_planar(enum AVSampleFormat sample_fmt); // whether the sample format is planar

Example code

The input PCM file is 16-bit, 44100 Hz, stereo (packed). The output PCM file is required to be 32-bit integer, 48000 Hz, stereo (planar).

#include <stdio.h>

#ifdef __cplusplus
extern "C" {
#endif
#include "libswresample/swresample.h"
#include "libavutil/opt.h"
#include "libavutil/mem.h"
#ifdef __cplusplus
}
#endif

// Uses the older channel-layout API (uint64_t masks, swr_alloc_set_opts),
// which is deprecated in newer FFmpeg releases.
int main()
{
    // Input file and parameters
    FILE *in_file = fopen("../files/Titanic_44100_s16_stero.pcm", "rb");
    const int in_sample_rate = 44100;
    enum AVSampleFormat in_sfmt = AV_SAMPLE_FMT_S16;  // input is interleaved (packed), not planar
    uint64_t in_channel_layout = AV_CH_LAYOUT_STEREO;
    int in_channels = av_get_channel_layout_nb_channels(in_channel_layout);
    const int in_nb_samples = 2048;
    int in_spb = av_get_bytes_per_sample(in_sfmt);

    // Output file and parameters
    FILE *out_file = fopen("out.pcm", "wb");
    const int out_sample_rate = 48000;
    enum AVSampleFormat out_sfmt = AV_SAMPLE_FMT_S32P;
    uint64_t out_channel_layout = AV_CH_LAYOUT_STEREO;
    int out_channels = av_get_channel_layout_nb_channels(out_channel_layout);
    int out_nb_samples = av_rescale_rnd(in_nb_samples, out_sample_rate, in_sample_rate, AV_ROUND_UP);
    int out_spb = av_get_bytes_per_sample(out_sfmt);

    if (!in_file || !out_file) {
        fprintf(stderr, "Failed to open input or output file\n");
        return -1;
    }

    // Use AVFrames to hold the PCM buffers used for conversion
    AVFrame *in_frame = av_frame_alloc();
    av_samples_alloc(in_frame->data, in_frame->linesize, in_channels, in_nb_samples, in_sfmt, 1);
    AVFrame *out_frame = av_frame_alloc();
    av_samples_alloc(out_frame->data, out_frame->linesize, out_channels, out_nb_samples, out_sfmt, 1);

    // swr context (alternative: allocate, then set options one by one)
    //SwrContext *swr_ctx = swr_alloc();
    //av_opt_set_channel_layout(swr_ctx, "in_channel_layout", in_channel_layout, 0);
    //av_opt_set_channel_layout(swr_ctx, "out_channel_layout", out_channel_layout, 0);
    //av_opt_set_int(swr_ctx, "in_sample_rate", in_sample_rate, 0);
    //av_opt_set_int(swr_ctx, "out_sample_rate", out_sample_rate, 0);
    //av_opt_set_sample_fmt(swr_ctx, "in_sample_fmt", in_sfmt, 0);
    //av_opt_set_sample_fmt(swr_ctx, "out_sample_fmt", out_sfmt, 0);
    //swr_init(swr_ctx);
    SwrContext *swr_ctx = NULL;
    swr_ctx = swr_alloc_set_opts(swr_ctx, out_channel_layout, out_sfmt, out_sample_rate,
                                 in_channel_layout, in_sfmt, in_sample_rate, 0, NULL);
    if (!swr_ctx || swr_init(swr_ctx) < 0) {
        fprintf(stderr, "Failed to initialize SwrContext\n");
        return -1;
    }

    // To modify parameters later:
    //av_opt_set_int(swr_ctx, "in_sample_rate", in_sample_rate, 0);
    //swr_init(swr_ctx);

    // Convert and save
    int frameCnt = 0;
    while (1) {
        // read samples
        int read_samples = fread(in_frame->data[0], in_spb * in_channels, in_nb_samples, in_file);
        if (read_samples <= 0)
            break;

        // convert prepare: make room for buffered samples plus the new input
        int dst_nb_samples = av_rescale_rnd(swr_get_delay(swr_ctx, in_sample_rate) + in_nb_samples,
                                            out_sample_rate, in_sample_rate, AV_ROUND_UP);
        if (dst_nb_samples > out_nb_samples) {
            // buffers from av_samples_alloc() are not released by av_frame_unref()
            av_freep(&out_frame->data[0]);
            out_nb_samples = dst_nb_samples;
            av_samples_alloc(out_frame->data, out_frame->linesize, out_channels, out_nb_samples, out_sfmt, 1);
        }

        // convert
        int out_samples = swr_convert(swr_ctx, out_frame->data, out_nb_samples,
                                      (const uint8_t **)in_frame->data, read_samples);
        if (out_samples < 0)
            break;

        // write (always interleaved in the output file)
        if (av_sample_fmt_is_planar(out_sfmt)) {  // planar
            for (int i = 0; i < out_samples; i++) {
                for (int c = 0; c < out_channels; c++)
                    fwrite(out_frame->data[c] + i * out_spb, 1, out_spb, out_file);
            }
        } else {  // packed
            fwrite(out_frame->data[0], out_spb * out_channels, out_samples, out_file);
        }
        printf("Succeed to convert frame %4d, samples [%d]->[%d]\n", frameCnt++, read_samples, out_samples);
    }

    // flush swr: drain the samples still buffered in the context
    printf("Flush samples \n");
    while (1) {
        int out_samples = swr_convert(swr_ctx, out_frame->data, out_nb_samples, NULL, 0);
        if (out_samples <= 0)  // nothing left, or an error
            break;

        // write
        if (av_sample_fmt_is_planar(out_sfmt)) {
            for (int i = 0; i < out_samples; i++) {
                for (int c = 0; c < out_channels; c++)
                    fwrite(out_frame->data[c] + i * out_spb, 1, out_spb, out_file);
            }
        } else {
            fwrite(out_frame->data[0], out_spb * out_channels, out_samples, out_file);
        }
        printf("Succeed to convert frame %d samples %d\n", frameCnt++, out_samples);
    }

    // free
    av_freep(&in_frame->data[0]);
    av_freep(&out_frame->data[0]);
    av_frame_free(&in_frame);
    av_frame_free(&out_frame);
    swr_free(&swr_ctx);
    fclose(in_file);
    fclose(out_file);
    return 0;
}

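Number of output samples
When the output sample rate changes, the number of samples per channel changes accordingly: a higher rate yields more samples, and a lower rate yields fewer. It is computed as

int out_nb_samples = av_rescale_rnd(in_nb_samples, out_sample_rate, in_sample_rate, AV_ROUND_UP);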
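With AV_ROUND_UP and positive arguments, the call above amounts to ceil(in_nb_samples * out_sample_rate / in_sample_rate). The stand-in below makes the arithmetic concrete. The name rescale_round_up is hypothetical, and this is not FFmpeg's implementation: in particular it skips the overflow handling that av_rescale_rnd performs.

```c
#include <stdint.h>

/* Hypothetical stand-in for av_rescale_rnd(a, b, c, AV_ROUND_UP) with
 * non-negative a, b and positive c: returns ceil(a * b / c).
 * Unlike the FFmpeg function, it does not guard against 64-bit overflow. */
int64_t rescale_round_up(int64_t a, int64_t b, int64_t c)
{
    return (a * b + c - 1) / c;
}
```

For the example in this article, rescale_round_up(2048, 48000, 44100) gives 2230, because 2048 * 48000 / 44100 is about 2229.1 and is rounded up.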

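Computing the number of samples to convert

In practice the number of input samples may vary, or input may arrive late. When more input is supplied than the output buffer can take, swr_ctx buffers it internally; if that data is not drained in time it can pile up and affect the output (for example in real-time streaming).
In that case the output buffer has to be reallocated so that it can hold both the newly converted data and the buffered data:

int dst_nb_samples = av_rescale_rnd(swr_get_delay(swr_ctx, in_sample_rate) + in_nb_samples,
                                    out_sample_rate, in_sample_rate, AV_ROUND_UP);
if (dst_nb_samples > out_nb_samples) {
    // free the old buffer and allocate a larger one
}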
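The "free and reallocate" step is a grow-only buffer: it is resized only when the required sample count exceeds the current capacity. The sketch below shows the pattern with plain malloc/free. The OutBuf type and outbuf_reserve function are hypothetical; with FFmpeg buffers the corresponding calls would be av_freep() on data[0] followed by av_samples_alloc().

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical grow-only output buffer for one packed stream. */
typedef struct {
    uint8_t *data;
    int      capacity;   /* in samples per channel */
} OutBuf;

/* Make sure buf can hold nb_samples per channel. Existing contents are
 * discarded when the buffer grows.
 * Returns 0 on success, -1 on allocation failure. */
int outbuf_reserve(OutBuf *buf, int nb_samples, int channels, int bytes_per_sample)
{
    if (nb_samples <= buf->capacity)
        return 0;                        /* already large enough */
    uint8_t *p = (uint8_t *)malloc((size_t)nb_samples * channels * bytes_per_sample);
    if (!p)
        return -1;
    free(buf->data);                     /* release the old block */
    buf->data = p;
    buf->capacity = nb_samples;
    return 0;
}
```

Because the buffer never shrinks, a single burst of delayed input causes one reallocation and later calls reuse the larger block.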

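Calling swr_convert and handling the result

The output buffer and sample capacity passed to swr_convert are the dynamically adjusted values described above. When processing the converted samples, always go by the return value of swr_convert rather than the buffer capacity.

For example, if the number of samples actually produced is out_samples, the subsequent handling is

    // write
    if (av_sample_fmt_is_planar(out_sfmt)) {  // planar
        for (int i = 0; i < out_samples; i++) {
            for (int c = 0; c < out_channels; c++)
                fwrite(out_frame->data[c] + i * out_spb, 1, out_spb, out_file);
        }
    } else {  // packed
        fwrite(out_frame->data[0], out_spb * out_channels, out_samples, out_file);
    }

The output of the final flush is handled in the same way.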
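The planar branch above issues one fwrite per sample per channel. One possible alternative, sketched here under the assumption that a scratch buffer of out_samples * out_channels * out_spb bytes is available, is to interleave the planes in memory first and write the result once. The function name interleave_planar is hypothetical and is not an FFmpeg API.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy nb_samples from `channels` planar buffers into
 * one interleaved (packed) buffer. dst must hold
 * nb_samples * channels * bytes_per_sample bytes. */
void interleave_planar(uint8_t *dst, uint8_t *const *planes,
                       int channels, int nb_samples, int bytes_per_sample)
{
    for (int i = 0; i < nb_samples; i++) {
        for (int c = 0; c < channels; c++) {
            memcpy(dst, planes[c] + (size_t)i * bytes_per_sample, bytes_per_sample);
            dst += bytes_per_sample;
        }
    }
}
```

After interleave_planar(packed, out_frame->data, out_channels, out_samples, out_spb), a single fwrite(packed, out_spb * out_channels, out_samples, out_file) produces the same file contents as the nested loop.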

Screenshot of the results

Load the PCM file in Audacity and set the import parameters as follows

The data is correct.