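I. Encoding and Decoding Principles

Encoding principles

(1) Level offset (zero-bias level shift)
For pixels with 2^n gray levels, subtracting 2^(n-1) turns the unsigned samples into signed values, so the range becomes symmetric about zero (for 8-bit samples, [0, 255] becomes [-128, 127]). This greatly reduces the probability of values with a large absolute value and improves coding efficiency.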

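(2) Discrete cosine transform (8×8 DCT)
The image is first divided into 8×8 blocks. If the width (or height) is not a multiple of 8, the image is padded by replicating its edge pixels, which avoids changing the spectral distribution. Each block is then transformed with the DCT (Discrete Cosine Transform). In matrix form the transform can be written as F = C·X·C^T, where X is the 8×8 block, F holds its coefficients, and C is the transform kernel matrix.

The DCT concentrates the energy in a few coefficients and decorrelates the samples, which makes spatial redundancy easier to remove and improves coding efficiency. The DCT itself is lossless (up to rounding) and does not compress the image, since its output is just another set of 64 coefficients; it prepares the data for the quantization step that follows.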
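The level shift and the matrix form of the DCT can be sketched as follows. This is a minimal, unoptimized illustration and not the code of any particular encoder. The names `COS16`, `cos16` and `forward_dct_8x8` are hypothetical, and the orthonormal scaling of the kernel (sqrt(1/8) for the first row, 1/2 for the others) is an assumption about the normalization. The cosines are taken from a small table of cos(k·π/16) so that the sketch does not depend on the math library.

```c
#include <assert.h>
#include <stddef.h>

#define DCT_N 8

/* cos(k*pi/16) for k = 0..8; all other angles follow by symmetry. */
static const double COS16[9] = {
    1.0, 0.9807852804032304, 0.9238795325112867, 0.8314696123025452,
    0.7071067811865476, 0.5555702330196022, 0.3826834323650898,
    0.19509032201612825, 0.0
};

/* Return cos(k*pi/16) for any k >= 0 using the table above. */
static double cos16(int k)
{
    k %= 32;                                     /* period 2*pi */
    if (k > 16) k = 32 - k;                      /* cos(2*pi - a) = cos(a) */
    return (k > 8) ? -COS16[16 - k] : COS16[k];  /* cos(pi - a) = -cos(a) */
}

/* Level-shift an 8-bit block by 128 and compute F = C * X * C^T. */
static void forward_dct_8x8(unsigned char pixels[DCT_N][DCT_N],
                            double coef[DCT_N][DCT_N])
{
    double c[DCT_N][DCT_N], tmp[DCT_N][DCT_N];
    assert(pixels != NULL && coef != NULL);

    /* Build the kernel matrix C (orthonormal DCT-II). */
    for (int u = 0; u < DCT_N; u++) {
        double scale = (u == 0) ? 0.35355339059327373 : 0.5;
        for (int x = 0; x < DCT_N; x++)
            c[u][x] = scale * cos16((2 * x + 1) * u);
    }
    /* tmp = C * (X - 128) */
    for (int u = 0; u < DCT_N; u++)
        for (int y = 0; y < DCT_N; y++) {
            double s = 0.0;
            for (int x = 0; x < DCT_N; x++)
                s += c[u][x] * (pixels[x][y] - 128);
            tmp[u][y] = s;
        }
    /* coef = tmp * C^T */
    for (int u = 0; u < DCT_N; u++)
        for (int v = 0; v < DCT_N; v++) {
            double s = 0.0;
            for (int y = 0; y < DCT_N; y++)
                s += tmp[u][y] * c[v][y];
            coef[u][v] = s;
        }
}
```

For a flat block all the energy ends up in the DC coefficient, which illustrates the energy compaction described above.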

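(3) Quantization
Among the steps described here, quantization is the only one that introduces error and the only one that actually produces compression. It determines the quality of the result and is therefore the core of the JPEG compression algorithm. The JPEG standard uses a midtread uniform quantizer: the input is a DCT coefficient and the output is a quantized coefficient.

A distinction is made between the recommended quantization table and the table actually used.

Recommended quantization table: derived from experiments on human perceptual thresholds. The low frequencies, to which the eye is sensitive, are quantized finely, and the high frequencies, to which it is less sensitive, are quantized coarsely, which reduces visual redundancy.
Actual quantization table: the recommended table multiplied by a scale factor derived from the quality factor:
- quality factor ≤ 50: scale factor = 50 / quality factor
- quality factor > 50: scale factor = 2 - quality factor / 50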
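The scaling rule and the midtread quantizer can be sketched as below. This is an illustration under stated assumptions, not the code of a specific encoder: the function names `scale_quant_table` and `quantize` are hypothetical, and the clamping of each table entry to 1..255 follows the common convention for 8-bit tables (it also keeps the step from becoming 0 at quality 100).

```c
#include <assert.h>

/* Scale a base (recommended) table for a quality factor in 1..100. */
static void scale_quant_table(const unsigned char base[64], int quality,
                              unsigned char out[64])
{
    assert(quality >= 1 && quality <= 100);
    double scale = (quality <= 50) ? 50.0 / quality
                                   : 2.0 - quality / 50.0;
    for (int i = 0; i < 64; i++) {
        int q = (int)(base[i] * scale + 0.5);   /* round to nearest */
        if (q < 1)   q = 1;                     /* a step of 0 is not allowed */
        if (q > 255) q = 255;                   /* keep within 8 bits */
        out[i] = (unsigned char)q;
    }
}

/* Midtread uniform quantizer: round coef / q to the nearest integer,
 * so that small coefficients map to 0. */
static int quantize(double coef, int q)
{
    assert(q > 0);
    double r = coef / q;
    return (int)(r >= 0.0 ? r + 0.5 : r - 0.5);
}
```

With a base step of 16, quality 25 doubles the step to 32 and quality 75 halves it to 8, so lower quality means coarser quantization.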

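(4) DC coefficient: differential coding (DPCM)
The DC coefficient of an 8×8 block after the DCT has two properties: its value is relatively large, and the DC coefficients of neighbouring blocks are correlated (that is, redundant). JPEG therefore applies DPCM (differential pulse-code modulation) and codes the difference between the quantized DC coefficients of adjacent blocks, DIFF = DC_i - DC_(i-1).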
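A minimal sketch of the DPCM step and its inverse is shown below. The function names are hypothetical, and the predictor is assumed to start at 0 for the first block, which is the usual convention.

```c
#include <assert.h>
#include <stddef.h>

/* Encoder side: diff[i] = dc[i] - dc[i-1], with the predictor starting at 0. */
static void dc_dpcm_encode(const int *dc, int *diff, size_t n)
{
    int prev = 0;
    assert(dc != NULL && diff != NULL);
    for (size_t i = 0; i < n; i++) {
        diff[i] = dc[i] - prev;
        prev = dc[i];
    }
}

/* Decoder side: accumulate the differences to recover the DC values. */
static void dc_dpcm_decode(const int *diff, int *dc, size_t n)
{
    int prev = 0;
    assert(diff != NULL && dc != NULL);
    for (size_t i = 0; i < n; i++) {
        prev += diff[i];
        dc[i] = prev;
    }
}
```

The decoder has to keep the previous DC value between blocks, which is the role of the `previous_DC` field in the decoder structures discussed later.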

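(5) AC coefficients: zig-zag scan and run-length coding (Zig-Zag + RLE)

Zig-zag scan

After the DCT and quantization, most of the nonzero AC coefficients lie in the low-frequency region at the top-left corner of the block. Reading the coefficients in zig-zag order, from low to high frequency, therefore produces long runs of zeros, which suits RLE (Run-Length Encoding). If all remaining coefficients are 0, an EOB (End of Block) is emitted directly.

Run-length coding

To shorten long runs of zeros, each nonzero coefficient (level) is coded together with the number of zeros that precede it (run), as a pair (run, level).

Example: 0, -2, 0, 0, 3, 2, -4 -> run-length code: (1,-2), (2,3), (0,2), (0,-4), EOB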
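The zig-zag order can be generated by walking the anti-diagonals of the block and alternating direction. The sketch below is one way to do this; real decoders usually hard-code the 64-entry table instead. The names `build_zigzag` and `zigzag_scan` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* order[k] = raster index (row * 8 + col) of the k-th coefficient in zig-zag order. */
static void build_zigzag(unsigned char order[64])
{
    int k = 0;
    for (int s = 0; s <= 14; s++) {          /* s = row + col of the diagonal */
        int lo = (s > 7) ? s - 7 : 0;        /* smallest valid row on it */
        int hi = (s < 7) ? s : 7;            /* largest valid row on it */
        if (s % 2 == 0) {                    /* even diagonals run bottom-left to top-right */
            for (int r = hi; r >= lo; r--)
                order[k++] = (unsigned char)(r * 8 + (s - r));
        } else {                             /* odd diagonals run top-right to bottom-left */
            for (int r = lo; r <= hi; r++)
                order[k++] = (unsigned char)(r * 8 + (s - r));
        }
    }
    assert(k == 64);
}

/* Reorder a raster-order block into zig-zag order. */
static void zigzag_scan(const int block[64], const unsigned char order[64],
                        int out[64])
{
    assert(block != NULL && order != NULL && out != NULL);
    for (int k = 0; k < 64; k++)
        out[k] = block[order[k]];
}
```

The first entries come out as 0, 1, 8, 16, 9, 2, 3, 10, 17, 24, matching the familiar zig-zag table.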
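The example above can be reproduced with a short run-length coder. This is a simplified sketch: the type `run_level` and the function `rle_encode` are hypothetical names, EOB is represented implicitly by the end of the returned pair list, and the cap of 15 on the run length that the real JPEG format imposes is left out.

```c
#include <assert.h>
#include <stddef.h>

struct run_level {
    int run;    /* number of zeros preceding the coefficient */
    int level;  /* the nonzero coefficient itself */
};

/* Encode n zig-zag ordered AC coefficients as (run, level) pairs.
 * Returns the number of pairs written; trailing zeros produce no pair,
 * they are covered by the EOB that follows the last pair. */
static size_t rle_encode(const int *ac, size_t n, struct run_level *out)
{
    size_t count = 0;
    int run = 0;
    assert(ac != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        if (ac[i] == 0) {
            run++;
        } else {
            out[count].run = run;
            out[count].level = ac[i];
            count++;
            run = 0;
        }
    }
    return count;
}
```

The output array needs room for at most n pairs.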

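(6) Huffman coding

The DPCM output for the DC coefficients and the RLE output for the AC coefficients are Huffman coded:

The category ID (the number of bits needed to represent the value) is coded with a variable-length Huffman code; in the default DC tables the longer code words look like a unary code.
The index within the category is coded with a fixed-length code whose length equals the category. There are four code tables in total: luminance DC, luminance AC, chrominance DC and chrominance AC.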
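The split into a category and an index within the category can be sketched as follows. The function names are hypothetical; the mapping of negative values (adding 2^size - 1) is the convention used by baseline JPEG, and the Huffman coding of the category itself is not shown.

```c
#include <assert.h>

/* Category = number of bits needed for |v| (0 for v == 0). */
static int category(int v)
{
    unsigned a = (v < 0) ? (unsigned)(-v) : (unsigned)v;
    int size = 0;
    while (a != 0) {
        size++;
        a >>= 1;
    }
    return size;
}

/* Fixed-length index within the category, `size` bits wide.
 * Positive values are sent as-is, negative values as v + 2^size - 1. */
static unsigned index_bits(int v, int size)
{
    assert(size >= 0 && size <= 15);
    return (v >= 0) ? (unsigned)v : (unsigned)(v + (1 << size) - 1);
}

/* Decoder side: recover the signed value from the index bits. */
static int extend(unsigned bits, int size)
{
    assert(size >= 0 && size <= 15);
    if (size == 0)
        return 0;
    /* A leading 0 bit marks a negative value. */
    return (bits < (1u << (size - 1))) ? (int)bits - (1 << size) + 1
                                       : (int)bits;
}
```

For example, -5 falls in category 3 and is sent as the three bits 010, which `extend` maps back to -5.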

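Decoding principles

① Decode the Huffman data, ② decode the DC differences, ③ reconstruct the quantized coefficients, ④ inverse DCT

⑤ Discard the padded rows/columns, ⑥ undo the level offset, ⑦ interpolate the discarded CbCr samples, ⑧ YCbCr -> RGB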
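The last step, YCbCr -> RGB, can be sketched with the full-range conversion used by JFIF files. The function names are hypothetical, and floating-point arithmetic is used for clarity where a real decoder would typically use fixed-point tables.

```c
#include <assert.h>
#include <stddef.h>

/* Round to nearest and clamp to the 8-bit range. */
static unsigned char clamp_u8(double x)
{
    if (x < 0.0)   return 0;
    if (x > 255.0) return 255;
    return (unsigned char)(x + 0.5);
}

/* Convert one full-range YCbCr sample (chroma centered on 128) to RGB. */
static void ycbcr_to_rgb(unsigned char y, unsigned char cb, unsigned char cr,
                         unsigned char rgb[3])
{
    assert(rgb != NULL);
    rgb[0] = clamp_u8(y + 1.402 * (cr - 128));
    rgb[1] = clamp_u8(y - 0.344136 * (cb - 128) - 0.714136 * (cr - 128));
    rgb[2] = clamp_u8(y + 1.772 * (cb - 128));
}
```

A sample with Cb = Cr = 128 carries no colour information, so it maps to a gray value equal to Y.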

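II. JPEG File Format

A JPEG file is organized as segments. Each segment starts with a marker (0xFF followed by a one-byte marker identifier), followed by a 2-byte length field and the corresponding payload. The length counts the length field itself and the payload, but not the two marker bytes. The SOI and EOI markers consist of the 2-byte marker only.

Note that a run of consecutive 0xFF bytes does not start a marker; the extra 0xFF bytes are fill bytes. In addition, within the entropy-coded data, a 0x00 that follows 0xFF is a stuffed byte and is skipped.

Taking test.jpg as an example, the binary layout of the file can be inspected.

The parse file generated by the JPEG_parser program can also help with the analysis.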
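The segment layout described above is enough to walk through the header of a file. The sketch below counts the segments from SOI up to and including SOS; it is an illustration only, the function name `count_header_segments` is hypothetical, and it assumes that every marker before SOS carries a length field (true for the usual APPn, DQT, SOF, DHT and DRI segments).

```c
#include <assert.h>
#include <stddef.h>

/* Walk the segments that follow SOI and return how many were seen up to and
 * including SOS (marker 0xDA). Returns -1 if the data is malformed. */
static int count_header_segments(const unsigned char *buf, size_t len)
{
    size_t pos = 2;
    int count = 0;
    assert(buf != NULL);
    if (len < 2 || buf[0] != 0xFF || buf[1] != 0xD8)
        return -1;                          /* file must start with SOI */
    while (pos < len) {
        if (buf[pos] != 0xFF)
            return -1;                      /* every segment starts with 0xFF */
        while (pos < len && buf[pos] == 0xFF)
            pos++;                          /* skip 0xFF and any fill bytes */
        if (pos + 2 >= len)
            return -1;                      /* need the id and 2 length bytes */
        unsigned char id = buf[pos];
        /* Big-endian length: includes its own 2 bytes, not the marker. */
        size_t seg_len = ((size_t)buf[pos + 1] << 8) | buf[pos + 2];
        if (seg_len < 2 || pos + 1 + seg_len > len)
            return -1;
        count++;
        if (id == 0xDA)
            return count;                   /* entropy-coded data follows SOS */
        pos += 1 + seg_len;                 /* jump to the next marker */
    }
    return -1;                              /* no SOS found */
}
```

After SOS the data is no longer organized in length-prefixed segments, which is why the walk stops there; the byte-stuffing rule applies from that point on.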

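III. Debugging the Program

Structures

struct huffman_table: stores a Huffman code table. It gives the code word and code length for each symbol and supports fast table-driven lookup.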

struct huffman_table
{
  /* Fast look up table, using HUFFMAN_HASH_NBITS bits we can have directly the symbol,
   * if the symbol is <0, then we need to look into the tree table */
  short int lookup[HUFFMAN_HASH_SIZE];
  /* code size: give the number of bits of a symbol is encoded */
  unsigned char code_size[HUFFMAN_HASH_SIZE];
  /* some place to store value that is not encoded in the lookup table
   * FIXME: Calculate if 256 value is enough to store all values
   */
  uint16_t slowtable[16-HUFFMAN_HASH_NBITS][256];
};

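struct component: stores the decoding information of one component: the horizontal and vertical sampling factors, a pointer to the quantization table, and pointers to the AC and DC Huffman tables.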

struct component
{
  unsigned int Hfactor;
  unsigned int Vfactor;
  float *Q_table;                   /* Pointer to the quantisation table to use */
  struct huffman_table *AC_table;
  struct huffman_table *DC_table;
  short int previous_DC;            /* Previous DC coefficient */
  short int DCT[64];                /* DCT coef */
#if SANITY_CHECK
  unsigned int cid;
#endif
};

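DC coefficients are coded with DPCM followed by Huffman coding, and DPCM decoding needs the previously decoded value. The structure therefore also holds the temporary state needed during decoding: DCT[64] stores the DCT coefficients of the current block, and previous_DC stores the previously decoded DC coefficient.

struct jdec_private: the JPEG decoder context. It holds everything used during decoding, such as the image data, the quantization tables and the Huffman tables, together with the buffers that store the pixel values produced by the IDCT.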

struct jdec_private
{
  /* Public variables */
  uint8_t *components[COMPONENTS];
  unsigned int width, height;                 /* Size of the image */
  unsigned int flags;

  /* Private variables */
  const unsigned char *stream_begin, *stream_end;
  unsigned int stream_length;

  const unsigned char *stream;                /* Pointer to the current stream */
  unsigned int reservoir, nbits_in_reservoir;

  struct component component_infos[COMPONENTS];
  float Q_tables[COMPONENTS][64];             /* quantization tables */
  struct huffman_table HTDC[HUFFMAN_TABLES];  /* DC huffman tables   */
  struct huffman_table HTAC[HUFFMAN_TABLES];  /* AC huffman tables   */
  int default_huffman_table_initialized;
  int restart_interval;
  int restarts_to_go;                         /* MCUs left in this restart interval */
  int last_rst_marker_seen;                   /* Rst marker is incremented each time */

  /* Temp space used after the IDCT to store each components */
  uint8_t Y[64*4], Cr[64], Cb[64];

  jmp_buf jump_state;
  /* Internal Pointer use for colorspace conversion, do not modify it !!! */
  uint8_t *plane[COMPONENTS];
};

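Code structure

1. Preparation

A command-line argument selects the output file format; this experiment uses YUV420.

2. main

Takes the input and output file names as arguments and opens TRACEFILE.

3. convert_one_image

Opens the input and output files, initializes the jdec structure and obtains the image parameters.

4. tinyjpeg_parse_header

Advances the stream pointer and calls parse_JFIF().

5. parse_JFIF

Calls parse_SOF, parse_DQT, parse_SOS, parse_DHT and parse_DRI in a loop, according to the marker encountered, until the SOS segment is found.

6. parse_DQT

Reads the quantization table data and calls build_quantization_table to build the quantization table.

7. parse_DHT

Reads the Huffman table data and builds the Huffman table with build_huffman_table.

8. tinyjpeg_decode

Decodes the JPEG image using the table information.

9. write_yuv

Writes the decoded image to a file in YUV format.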

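TRACE

When TRACE is 1, tracing is enabled and key information is logged; otherwise tracing is disabled. The information logged during decoding can be used to check whether the file was decoded correctly.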

#define snprintf _snprintf          // add by nx
// Change this value to turn tracing on or off
#define TRACE 1                     // add by nxn
#define TRACEFILE "trace_jpeg.txt"  // add by nxn

IV. Experimental Output

Quantization matrices and Huffman tables

Adding the following code to build_quantization_table() prints the quantization matrix:

#if TRACE
    const unsigned char* zz1 = zigzag;
    for (int i = 0; i < 8; i++) {
        for (int j = 0; j < 8; j++) {
            /* tab-separate the entries so adjacent values do not run together */
            fprintf(p_trace, "%d\t", ref_table[*zz1++]);
            if (j == 7) {
                fprintf(p_trace, "\n");
            }
        }
    }
#endif

The Huffman tables are already traced in build_huffman_table, so that output can be used directly.

Outputting the DC and AC coefficient images

int tinyjpeg_decode(struct jdec_private *priv, int pixfmt)
{
  ···
  // Output files for the DC and AC coefficient images
  FILE* DCFile;
  FILE* ACFile_1, * ACFile_5, * ACFile_15;
  // The files hold raw bytes, so open them in binary mode
  DCFile = fopen("DC.yuv", "wb");
  ACFile_1 = fopen("AC1.yuv", "wb");
  ACFile_5 = fopen("AC5.yuv", "wb");
  ACFile_15 = fopen("AC15.yuv", "wb");
  unsigned char uvbuf = 128;                        // neutral chroma sample
  unsigned char DCbuf, ACbuf_1, ACbuf_5, ACbuf_15;  // one output sample each
  int cnt = 0;                                      // number of Y samples written

  decode_mcu_table = decode_mcu_3comp_table;
  switch (pixfmt) {
     case TINYJPEG_FMT_YUV420P:
       colorspace_array_conv = convert_colorspace_yuv420p;
       if (priv->components[0] == NULL)
         priv->components[0] = (uint8_t *)malloc(priv->width * priv->height);
       if (priv->components[1] == NULL)
         priv->components[1] = (uint8_t *)malloc(priv->width * priv->height/4);
       if (priv->components[2] == NULL)
         priv->components[2] = (uint8_t *)malloc(priv->width * priv->height/4);
       bytes_per_blocklines[0] = priv->width;
       bytes_per_blocklines[1] = priv->width/4;
       bytes_per_blocklines[2] = priv->width/4;
       bytes_per_mcu[0] = 8;
       bytes_per_mcu[1] = 4;
       bytes_per_mcu[2] = 4;
       break;
  ···
  //xstride_by_mcu = ystride_by_mcu = 8;
  if ((priv->component_infos[cY].Hfactor | priv->component_infos[cY].Vfactor) == 1)
  {
     decode_MCU = decode_mcu_table[0];
     convert_to_pixfmt = colorspace_array_conv[0];
#if TRACE
     fprintf(p_trace,"Use decode 1x1 sampling\n");
     fflush(p_trace);
#endif
  ···
  /* Just the decode the image by macroblock (size is 8x8, 8x16, or 16x16) */
  for (y=0; y < priv->height/ystride_by_mcu; y++)
   {
     //trace("Decoding row %d\n", y);
     priv->plane[0] = priv->components[0] + (y * bytes_per_blocklines[0]);
     priv->plane[1] = priv->components[1] + (y * bytes_per_blocklines[1]);
     priv->plane[2] = priv->components[2] + (y * bytes_per_blocklines[2]);
     for (x=0; x < priv->width; x+=xstride_by_mcu)
      {
        decode_MCU(priv);
        // Write the DC and AC values of the current block to the files.
        // DC is mapped from [-512, 512] to the 8-bit range, AC values are offset by 128.
        DCbuf = (unsigned char)((priv->component_infos->DCT[0] + 512) / 4.0);
        fwrite(&DCbuf, 1, 1, DCFile);
        ACbuf_1 = (unsigned char)((priv->component_infos->DCT[1] + 128));
        fwrite(&ACbuf_1, 1, 1, ACFile_1);
        ACbuf_5 = (unsigned char)((priv->component_infos->DCT[5] + 128));
        fwrite(&ACbuf_5, 1, 1, ACFile_5);
        ACbuf_15 = (unsigned char)((priv->component_infos->DCT[15] + 128));
        fwrite(&ACbuf_15, 1, 1, ACFile_15);
        cnt++;
  ···
#if TRACE
  fprintf(p_trace,"Input file size: %d\n", priv->stream_length+2);
  fprintf(p_trace,"Input bytes actually read: %d\n", priv->stream - priv->stream_begin + 2);
  fflush(p_trace);
#endif
  // Append constant U and V planes (a quarter of the Y size each) to complete the YUV 4:2:0 file
  for (int j = 0; j < cnt * 0.25 * 2; j++)
  {
    fwrite(&uvbuf, sizeof(unsigned char), 1, DCFile);
    fwrite(&uvbuf, sizeof(unsigned char), 1, ACFile_1);
    fwrite(&uvbuf, sizeof(unsigned char), 1, ACFile_5);
    fwrite(&uvbuf, sizeof(unsigned char), 1, ACFile_15);
  }
  fclose(DCFile);
  fclose(ACFile_1);
  fclose(ACFile_5);
  fclose(ACFile_15);
  return 0;
}

[Figure: DC coefficient image | AC coefficient images]

Computing the probability distributions

#include <cstdio>
#include <iostream>
#define uchar unsigned char
#pragma warning(disable:4996)
using namespace std;

// Read the yuv files and write txt files with the probability distribution of the sample values
void Freq(uchar* buffer, double* freq, int size);

// Compute the distribution of the first `size` bytes of inName and write it to outName
bool WriteDistribution(const char* inName, const char* outName, int size)
{
    FILE* in = fopen(inName, "rb");   // binary mode: the input holds raw samples
    FILE* out = fopen(outName, "w");
    if (in == NULL || out == NULL)
    {
        cout << "failed to open " << inName << " or " << outName << endl;
        if (in != NULL) fclose(in);
        if (out != NULL) fclose(out);
        return false;
    }
    uchar* buf = new uchar[size];
    int n = (int)fread(buf, 1, size, in);  // number of bytes actually read
    double freq[256] = { 0 };
    Freq(buf, freq, n);
    fprintf(out, "%s\t%s\n", "symbol", "freq");
    for (int i = 0; i < 256; i++)
    {
        fprintf(out, "%d\t%f\n", i, freq[i]);
    }
    delete[] buf;
    fclose(in);
    fclose(out);
    return true;
}

int main()
{
    int size = 128 * 128;  // number of samples to read from each file
    bool ok = WriteDistribution("DC.yuv", "DC.txt", size);
    ok = WriteDistribution("AC1.yuv", "AC_1", size) && ok;
    ok = WriteDistribution("AC5.yuv", "AC_5", size) && ok;
    ok = WriteDistribution("AC15.yuv", "AC_15", size) && ok;
    cout << (ok ? "done" : "finished with errors") << endl;
    return ok ? 0 : 1;
}

void Freq(uchar* buffer, double* freq, int size)
{
    if (size <= 0) return;  // nothing to count; avoids dividing by zero
    // Count the occurrences of each value
    for (int i = 0; i < size; i++)
    {
        freq[buffer[i]] += 1;
    }
    // Convert the counts to probabilities
    for (int i = 0; i < 256; i++)
    {
        freq[i] = freq[i] / size;
    }
}

[Figure: probability distribution of the DC image | probability distribution of the AC images]

