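Objective

Understand the basic principles of dictionary coding, implement an LZW decoder in C, C++, Python, or a similar language, and analyze the encoding and decoding algorithms.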

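Method

Algorithm description

Supporting code

Dictionary tree node array

struct {
    int suffix;       // suffix character (last character of the phrase)
    int parent;       // parent node
    int firstchild;   // first child node
    int nextsibling;  // next sibling node at the same level
} dictionary[MAX_CODE + 1];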

Dictionary initialization

void InitDictionary(void) {
    int i;
    for (i = 0; i < 256; i++) {
        dictionary[i].suffix = i;
        dictionary[i].parent = -1;
        dictionary[i].firstchild = -1;
        dictionary[i].nextsibling = i + 1;
    }
    dictionary[255].nextsibling = -1;
    next_code = 256;
}

Adding an entry to the dictionary

void AddToDictionary(int character, int string_code) {
    int firstsibling, nextsibling;
    if (0 > string_code) return;
    dictionary[next_code].suffix = character;
    dictionary[next_code].parent = string_code;
    dictionary[next_code].nextsibling = -1;
    dictionary[next_code].firstchild = -1;
    firstsibling = dictionary[string_code].firstchild;
    if (-1 < firstsibling) {   // the parent has child
        nextsibling = firstsibling;
        while (-1 < dictionary[nextsibling].nextsibling)
            nextsibling = dictionary[nextsibling].nextsibling;
        dictionary[nextsibling].nextsibling = next_code;
    }
    else {  // no child before, modify it to be the first
        dictionary[string_code].firstchild = next_code;
    }
    next_code++;
}
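To make the tree layout concrete, here is a minimal, self-contained sketch of the same idea. The names `demo_node`, `demo_dict`, `demo_init`, `demo_add`, and `demo_phrase`, and the capacity `DEMO_MAX_CODE`, are hypothetical and not part of the lab code. It shows that a phrase is never stored as a string: each entry keeps only its last character and a link to its parent, and the full phrase is recovered by walking the parent links back to a root.

```c
#include <assert.h>

#define DEMO_MAX_CODE 4095  /* hypothetical small capacity for this demo */

struct demo_node {
    int suffix;       /* last character of the phrase */
    int parent;       /* code of the phrase without its last character, -1 for roots */
    int firstchild;   /* first phrase that extends this one, -1 if none */
    int nextsibling;  /* next phrase with the same parent, -1 if none */
};

static struct demo_node demo_dict[DEMO_MAX_CODE + 1];
static int demo_next_code;

/* Codes 0..255 are the single-character roots. */
void demo_init(void) {
    for (int i = 0; i < 256; i++) {
        demo_dict[i].suffix = i;
        demo_dict[i].parent = -1;
        demo_dict[i].firstchild = -1;
        demo_dict[i].nextsibling = (i < 255) ? i + 1 : -1;
    }
    demo_next_code = 256;
}

/* Add the phrase "phrase(parent) + character" and return its new code. */
int demo_add(int character, int parent) {
    assert(parent >= 0 && demo_next_code <= DEMO_MAX_CODE);
    int code = demo_next_code++;
    demo_dict[code] = (struct demo_node){character, parent, -1, -1};
    /* Follow the child list of the parent to its end and append there. */
    int *link = &demo_dict[parent].firstchild;
    while (*link != -1) link = &demo_dict[*link].nextsibling;
    *link = code;
    return code;
}

/* Write the phrase for `code` into out (NUL-terminated) and return its length. */
int demo_phrase(int code, char *out) {
    int len = 0;
    for (int c = code; c >= 0; c = demo_dict[c].parent) len++;
    out[len] = '\0';
    int i = len;
    for (int c = code; c >= 0; c = demo_dict[c].parent)
        out[--i] = (char)demo_dict[c].suffix;
    return len;
}
```

After `demo_init()`, calling `demo_add('B', 'A')` creates code 256 for "AB", `demo_add('C', 256)` creates code 257 for "ABC", and `demo_add('D', 'A')` creates code 258 for "AD", which becomes the next sibling of 256 under the root 'A'.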

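LZW encoding: principle and implementation

Principle

The idea behind LZW encoding is to keep extracting new strings from the character stream (informally, new "entries") and to represent each entry by a "code name", that is, a codeword. Encoding the character stream then amounts to replacing its strings with codewords to produce a codeword stream, which is how the data gets compressed. LZW encoding is built around a translation table called the dictionary, and the encoder converts input to output by maintaining this dictionary. The input of the LZW encoder is a character stream, for example a string of 8-bit ASCII characters; the output is a codeword stream in which each codeword takes n bits (for example, 12 bits). The encoding algorithm proceeds as follows:

Step 1: Initialize the dictionary with every possible single character, and initialize the current prefix P to empty.
Step 2: Let the current character C be the next character in the character stream. If no characters remain, output the codeword for the current prefix P and stop.
Step 3: Check whether P+C is in the dictionary.
    (1) If yes, extend P with C, that is, let P = P+C, and return to step 2.
    (2) If no:
        output the codeword W that corresponds to the current prefix P;
        add P+C to the dictionary;
        let P = C, and return to step 2.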
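The steps above can be sketched as a small in-memory encoder. This is an illustration under stated assumptions, not the lab implementation: the names `sk_find` and `sk_encode` and the capacity `SKETCH_MAX_CODES` are hypothetical, codes are written to an `int` array instead of a bit stream, and the dictionary is searched linearly over (prefix, character) pairs rather than through the child/sibling tree used in this report.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_CODES 4096  /* hypothetical dictionary capacity */

static int sk_prefix[SKETCH_MAX_CODES];  /* code of P for each entry P+C */
static int sk_char[SKETCH_MAX_CODES];    /* character C for each entry P+C */

/* Return the code of P+C, or -1 if it is not in the dictionary.
   An empty prefix (p < 0) plus C is the single character C itself. */
static int sk_find(int p, int c, int next_code) {
    if (p < 0) return c;
    for (int i = 256; i < next_code; i++)
        if (sk_prefix[i] == p && sk_char[i] == c) return i;
    return -1;
}

/* Encode n bytes from `in`; write codes to `out` (room for n codes)
   and return how many codes were written. */
size_t sk_encode(const unsigned char *in, size_t n, int *out) {
    assert(in != NULL && out != NULL);
    int next_code = 256;  /* step 1: codes 0..255 are the single characters */
    int p = -1;           /* step 1: the current prefix P is empty */
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        int c = in[i];                      /* step 2: next character C */
        int pc = sk_find(p, c, next_code);  /* step 3: is P+C known? */
        if (pc >= 0) {
            p = pc;                         /* 3(1): P = P+C */
        } else {
            out[count++] = p;               /* 3(2): output the code of P */
            if (next_code < SKETCH_MAX_CODES) {
                sk_prefix[next_code] = p;   /* add P+C to the dictionary */
                sk_char[next_code] = c;
                next_code++;
            }
            p = c;                          /* P = C */
        }
    }
    if (p >= 0) out[count++] = p;           /* end of input: flush the last prefix */
    return count;
}
```

Encoding the 7-byte input "ABABABA" produces four codes: 65 ('A'), 66 ('B'), 256 ("AB"), and 258 ("ABA"). Entries 256 = "AB", 257 = "BA", and 258 = "ABA" are created along the way.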

Implementation

LZW encoding

void LZWEncode(FILE* fp, BITFILE* bf) {
    int character;
    int string_code;
    int index;
    unsigned long file_length;

    fseek(fp, 0, SEEK_END);
    file_length = ftell(fp);
    fseek(fp, 0, SEEK_SET);
    BitsOutput(bf, file_length, 4 * 8);  // write the original file length as a 32-bit header
    InitDictionary();
    string_code = -1;
    while (EOF != (character = fgetc(fp))) {
        index = InDictionary(character, string_code);
        if (0 <= index) {  // string+character in dictionary
            string_code = index;
        }
        else {  // string+character not in dictionary
            output(bf, string_code);  // write the codeword of the current string to the bit stream
            if (MAX_CODE > next_code) {  // free space in dictionary
                // add string+character to dictionary
                AddToDictionary(character, string_code);
            }
            string_code = character;
        }
    }
    output(bf, string_code);  // flush the codeword of the last string
}

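LZW decoding: principle and implementation

Principle

When LZW decoding starts, the decoding dictionary is identical to the encoding dictionary: it contains every possible prefix root. The decoding algorithm is as follows:

Step 1: At the start of decoding, the dictionary contains every possible prefix root.
Step 2: Let CW := the first codeword in the codeword stream.
Step 3: Output the string string.CW to the character stream.
Step 4: Previous codeword PW := current codeword CW.
Step 5: Current codeword CW := the next codeword in the codeword stream.
Step 6: Check whether the string string.CW is in the dictionary.
    (1) If yes:
        output string.CW to the character stream;
        current prefix P := string.PW;
        current character C := the first character of string.CW;
        add the string P+C to the dictionary.
    (2) If no:
        current prefix P := string.PW;
        current character C := the first character of string.CW (which in this case equals the first character of string.PW);
        output the string P+C to the character stream, then add it to the dictionary.
Step 7: Check whether any codewords remain to be decoded.
    (1) If yes, return to step 4.
    (2) If no, stop.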
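The decoding steps can be sketched in the same in-memory style. Again this is only an illustration: `dec_expand`, `dec_decode`, and `DEC_MAX_CODES` are hypothetical names, codes are read from an `int` array, the output buffer is assumed to be large enough, and the code stream is assumed to be well formed (an unknown code is always the next code to be assigned).

```c
#include <assert.h>
#include <stddef.h>

#define DEC_MAX_CODES 4096  /* hypothetical dictionary capacity */

static int dec_prefix[DEC_MAX_CODES];          /* code of P for each entry P+C */
static unsigned char dec_char[DEC_MAX_CODES];  /* character C for each entry P+C */

/* Expand `code` into `out` in reading order; return the phrase length. */
static size_t dec_expand(int code, unsigned char *out) {
    size_t len = 1;  /* every phrase ends at a single-character root */
    for (int c = code; c >= 256; c = dec_prefix[c]) len++;
    size_t i = len;
    int c = code;
    while (c >= 256) {
        out[--i] = dec_char[c];
        c = dec_prefix[c];
    }
    out[--i] = (unsigned char)c;
    return len;
}

/* Decode n codes into `out`; return the number of bytes written. */
size_t dec_decode(const int *codes, size_t n, unsigned char *out) {
    int next_code = 256;  /* step 1: only the roots are known */
    int pw = -1;          /* previous codeword PW, none yet */
    size_t pos = 0;
    for (size_t k = 0; k < n; k++) {
        int cw = codes[k];                      /* steps 2 and 5 */
        size_t len;
        if (cw < next_code) {                   /* 6(1): string.CW is known */
            len = dec_expand(cw, out + pos);
        } else {                                /* 6(2): not in the dictionary yet */
            assert(pw >= 0 && cw == next_code);
            len = dec_expand(pw, out + pos);    /* P = string.PW */
            out[pos + len] = out[pos];          /* C = its first character */
            len++;
        }
        if (pw >= 0 && next_code < DEC_MAX_CODES) {
            dec_prefix[next_code] = pw;         /* add P+C: P = string.PW, */
            dec_char[next_code] = out[pos];     /* C = first char of string.CW */
            next_code++;
        }
        pos += len;
        pw = cw;                                /* step 4 */
    }
    return pos;
}
```

Decoding the four codes 65, 66, 256, 258 yields the 7 bytes "ABABABA". The last code, 258, takes branch 6(2): at that moment the decoder has only defined entries up to 257.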

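Key issue

Explanation of step 6(2)
During decoding, a codeword may not be found in the dictionary. This happens when the encoder has just created a new entry P+C and the very next prefix P it emits is that same entry. The decoder adds each entry one step later than the encoder does, so it receives the codeword before it has defined it. In this situation the last character of the newly created P+C is the same as its first character, so the decoder can rebuild the entry as string.PW followed by the first character of string.PW.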
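A short trace shows how this arises. For the input "AAA" the encoder reads 'A' (P = "A"), then reads 'A' again: "AA" is not in the dictionary, so it outputs 65, adds 256 = "AA", and sets P = "A". It then reads the last 'A': "AA" is now in the dictionary, so P becomes 256, and at the end of input it outputs 256. The decoder receives 65 and writes "A", then receives 256 while its own dictionary still ends at 255. The sketch below, with the hypothetical helper name `missing_phrase`, builds the phrase for such a code from the previous phrase alone.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Build the phrase for a code the decoder has not defined yet:
   the previous phrase followed by the previous phrase's first character.
   `out` must have room for prev_len + 2 bytes. Returns the new length. */
size_t missing_phrase(const char *prev, size_t prev_len, char *out) {
    assert(prev != NULL && out != NULL && prev_len > 0);
    memcpy(out, prev, prev_len);
    out[prev_len] = prev[0];
    out[prev_len + 1] = '\0';
    return prev_len + 1;
}
```

With the previous phrase "A" the result is "AA", which matches entry 256 in the trace above; with the previous phrase "AB" the result is "ABA".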

Implementation

void LZWDecode(BITFILE* bf, FILE* fp) {
    int character = 0;
    int new_code, last_code;
    int phrase_length;
    unsigned long file_length;

    file_length = BitsInput(bf, 4 * 8);  // read the 32-bit original file length
    if (-1 == file_length) file_length = 0;
    InitDictionary();
    last_code = -1;
    while (0 < file_length) {
        new_code = input(bf);  // read one codeword from the bit stream
        if (new_code >= next_code) {  // this is the case CSCSC (not in dict)
            // the phrase is the previous phrase followed by its own first character;
            // d_stack is filled back to front, so slot 0 holds the last character
            d_stack[0] = character;
            phrase_length = DecodeString(1, last_code);
        }
        else {
            phrase_length = DecodeString(0, new_code);  // decode by looking up the dictionary
        }
        character = d_stack[phrase_length - 1];  // first character of the decoded phrase
        while (0 < phrase_length) {
            phrase_length--;
            fputc(d_stack[phrase_length], fp);
            file_length--;
        }
        if (MAX_CODE > next_code) {  // add the new phrase to dictionary
            AddToDictionary(character, last_code);
        }
        last_code = new_code;
    }
}

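Results

Debugging the LZW encoder and decoder

Set the project arguments for encoding

File produced by encoding

Set the project arguments for decoding

File produced by decoding, identical to the original file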

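Compressing at least ten files of different formats with the LZW encoder

File type	Original size (bytes)	Size after LZW encoding (bytes)	Compression ratio (compressed / original)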
docx	620,942	771,188	124.2%
bmp	3,030	980	32.3%
rgb	196,608	183,272	93.2%
yuv	98,304	69,634	70.8%
pdf	211,722	271,978	128.4%
jpg	170,416	197,340	115.8%
txt	3,946	2,144	54.3%
png	55,238	82,704	149.7%
eddx	33,477	54,948	164.1%
zip	505,449	640,588	126.7%

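The table shows that LZW often fails to compress and instead makes the file larger: only bmp, rgb, yuv, and txt shrank, while the other six files grew. My guess is that this comes from limitations of this particular program. It writes every codeword as a fixed 16 bits and stops adding entries once the dictionary is full, and the files that grew include formats such as docx, pdf, jpg, png, and zip, whose contents are typically already compressed. For that reason I do not consider these ratios representative of LZW in general.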
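The effect of the fixed code width can be checked with a little arithmetic. The sketch below assumes the output layout used by `LZWEncode` in this report (a 32-bit length header followed by one 16-bit codeword per phrase) and that the sizes in the table include that header; the function names `lzw16_output_bytes` and `size_ratio_percent` are hypothetical.

```c
#include <assert.h>

/* Output size in bytes for this layout: 4-byte header + 2 bytes per codeword. */
unsigned long lzw16_output_bytes(unsigned long code_count) {
    return 4UL + 2UL * code_count;
}

/* Compressed size as a percentage of the original size;
   values above 100 mean the file grew. */
double size_ratio_percent(unsigned long original, unsigned long compressed) {
    assert(original > 0);
    return 100.0 * (double)compressed / (double)original;
}
```

Under these assumptions a file shrinks only when each codeword replaces more than two input bytes on average. The txt row corresponds to (2,144 - 4) / 2 = 1,070 codewords for 3,946 bytes, about 3.7 bytes per codeword. The docx row corresponds to 385,592 codewords for 620,942 bytes, about 1.6 bytes per codeword, which is why that file grew.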

Complete code

bitio.h: header file

bitio.c: bit-stream input and output functions

/*
 * Definitions for bitwise IO
 *
 * vim: ts=4 sw=4 cindent
 */
#include <stdlib.h>
#include <stdio.h>
#include "bitio.h"

BITFILE* OpenBitFileInput(char* filename) {
    BITFILE* bf;
    bf = (BITFILE*)malloc(sizeof(BITFILE));
    if (NULL == bf) return NULL;
    if (NULL == filename) bf->fp = stdin;
    else bf->fp = fopen(filename, "rb");
    if (NULL == bf->fp) {
        free(bf);  // do not leak the BITFILE when the file cannot be opened
        return NULL;
    }
    bf->mask = 0x80;
    bf->rack = 0;
    return bf;
}

BITFILE* OpenBitFileOutput(char* filename) {
    BITFILE* bf;
    bf = (BITFILE*)malloc(sizeof(BITFILE));
    if (NULL == bf) return NULL;
    if (NULL == filename) bf->fp = stdout;
    else bf->fp = fopen(filename, "wb");
    if (NULL == bf->fp) {
        free(bf);  // do not leak the BITFILE when the file cannot be opened
        return NULL;
    }
    bf->mask = 0x80;
    bf->rack = 0;
    return bf;
}

void CloseBitFileInput(BITFILE* bf) {
    fclose(bf->fp);
    free(bf);
}

void CloseBitFileOutput(BITFILE* bf) {
    // Output the remaining bits
    if (0x80 != bf->mask) fputc(bf->rack, bf->fp);
    fclose(bf->fp);
    free(bf);
}

int BitInput(BITFILE* bf) {
    int value;
    if (0x80 == bf->mask) {
        bf->rack = fgetc(bf->fp);
        if (EOF == bf->rack) {
            fprintf(stderr, "Read after the end of file reached\n");
            exit(-1);
        }
    }
    value = bf->mask & bf->rack;
    bf->mask >>= 1;
    if (0 == bf->mask) bf->mask = 0x80;
    return ((0 == value) ? 0 : 1);
}

unsigned long BitsInput(BITFILE* bf, int count) {
    unsigned long mask;
    unsigned long value;
    mask = 1UL << (count - 1);  // unsigned constant: shifting into bit 31 is well defined
    value = 0L;
    while (0 != mask) {
        if (1 == BitInput(bf))
            value |= mask;
        mask >>= 1;
    }
    return value;
}

void BitOutput(BITFILE* bf, int bit) {
    if (0 != bit) bf->rack |= bf->mask;
    bf->mask >>= 1;
    if (0 == bf->mask) {  // eight bits in rack
        fputc(bf->rack, bf->fp);
        bf->rack = 0;
        bf->mask = 0x80;
    }
}

void BitsOutput(BITFILE* bf, unsigned long code, int count) {
    unsigned long mask;
    mask = 1UL << (count - 1);  // unsigned constant: shifting into bit 31 is well defined
    while (0 != mask) {
        BitOutput(bf, (int)(0 == (code & mask) ? 0 : 1));
        mask >>= 1;
    }
}

lzw_E.c: encoder, decoder, and main function

/*
 * Definition for LZW coding
 *
 * vim: ts=4 sw=4 cindent nowrap
 */
#include <stdlib.h>
#include <stdio.h>
#include "bitio.h"

#define MAX_CODE 65535

struct {
    int suffix;
    int parent, firstchild, nextsibling;
} dictionary[MAX_CODE + 1];
int next_code;
int d_stack[MAX_CODE];  // stack for decoding a phrase

#define input(f) ((int)BitsInput( f, 16))
#define output(f, x) BitsOutput( f, (unsigned long)(x), 16)

int DecodeString(int start, int code);
void InitDictionary(void);

void PrintDictionary(void) {
    int n;
    int count;
    for (n = 256; n < next_code; n++) {
        count = DecodeString(0, n);
        printf("%4d->", n);
        while (0 < count--) printf("%c", (char)(d_stack[count]));
        printf("\n");
    }
}

// Push the phrase for `code` onto d_stack, last character first,
// starting at index `start`; returns the resulting stack height.
int DecodeString(int start, int code) {
    int count;
    count = start;
    while (0 <= code) {
        d_stack[count] = dictionary[code].suffix;
        code = dictionary[code].parent;
        count++;
    }
    return count;
}

void InitDictionary(void) {
    int i;
    for (i = 0; i < 256; i++) {
        dictionary[i].suffix = i;
        dictionary[i].parent = -1;
        dictionary[i].firstchild = -1;
        dictionary[i].nextsibling = i + 1;
    }
    dictionary[255].nextsibling = -1;
    next_code = 256;
}

/*
 * Input: string represented by string_code in dictionary,
 * Output: the index of string+character in the dictionary
 *      index = -1 if not found
 */
int InDictionary(int character, int string_code) {
    int sibling;
    if (0 > string_code) return character;
    sibling = dictionary[string_code].firstchild;
    while (-1 < sibling) {
        if (character == dictionary[sibling].suffix) return sibling;
        sibling = dictionary[sibling].nextsibling;
    }
    return -1;
}

void AddToDictionary(int character, int string_code) {
    int firstsibling, nextsibling;
    if (0 > string_code) return;
    dictionary[next_code].suffix = character;
    dictionary[next_code].parent = string_code;
    dictionary[next_code].nextsibling = -1;
    dictionary[next_code].firstchild = -1;
    firstsibling = dictionary[string_code].firstchild;
    if (-1 < firstsibling) {  // the parent has child
        nextsibling = firstsibling;
        while (-1 < dictionary[nextsibling].nextsibling)
            nextsibling = dictionary[nextsibling].nextsibling;
        dictionary[nextsibling].nextsibling = next_code;
    }
    else {  // no child before, modify it to be the first
        dictionary[string_code].firstchild = next_code;
    }
    next_code++;
}

void LZWEncode(FILE* fp, BITFILE* bf) {
    int character;
    int string_code;
    int index;
    unsigned long file_length;

    fseek(fp, 0, SEEK_END);
    file_length = ftell(fp);
    fseek(fp, 0, SEEK_SET);
    BitsOutput(bf, file_length, 4 * 8);  // 32-bit header: original file length
    InitDictionary();
    string_code = -1;
    while (EOF != (character = fgetc(fp))) {
        index = InDictionary(character, string_code);
        if (0 <= index) {  // string+character in dictionary
            string_code = index;
        }
        else {  // string+character not in dictionary
            output(bf, string_code);
            if (MAX_CODE > next_code) {  // free space in dictionary
                // add string+character to dictionary
                AddToDictionary(character, string_code);
            }
            string_code = character;
        }
    }
    output(bf, string_code);
}

void LZWDecode(BITFILE* bf, FILE* fp) {
    int character = 0;
    int new_code, last_code;
    int phrase_length;
    unsigned long file_length;

    file_length = BitsInput(bf, 4 * 8);
    if (-1 == file_length) file_length = 0;
    InitDictionary();
    last_code = -1;
    while (0 < file_length) {
        new_code = input(bf);
        if (new_code >= next_code) {  // this is the case CSCSC (not in dict)
            d_stack[0] = character;
            phrase_length = DecodeString(1, last_code);
        }
        else {
            phrase_length = DecodeString(0, new_code);
        }
        character = d_stack[phrase_length - 1];
        while (0 < phrase_length) {
            phrase_length--;
            fputc(d_stack[phrase_length], fp);
            file_length--;
        }
        if (MAX_CODE > next_code) {  // add the new phrase to dictionary
            AddToDictionary(character, last_code);
        }
        last_code = new_code;
    }
}

int main(int argc, char** argv) {
    FILE* fp;
    BITFILE* bf;

    if (4 > argc) {
        fprintf(stdout, "usage: \n%s <o> <ifile> <ofile>\n", argv[0]);
        fprintf(stdout, "\t<o>: E or D refers to encode or decode\n");
        fprintf(stdout, "\t<ifile>: input file name\n");
        fprintf(stdout, "\t<ofile>: output file name\n");
        return -1;
    }
    if ('E' == argv[1][0]) {  // do encoding
        fp = fopen(argv[2], "rb");
        bf = OpenBitFileOutput(argv[3]);
        if (NULL != fp && NULL != bf) {
            LZWEncode(fp, bf);
            fclose(fp);
            CloseBitFileOutput(bf);
            fprintf(stdout, "encoding done\n");
        }
    }
    else if ('D' == argv[1][0]) {  // do decoding
        bf = OpenBitFileInput(argv[2]);
        fp = fopen(argv[3], "wb");
        if (NULL != fp && NULL != bf) {
            LZWDecode(bf, fp);
            fclose(fp);
            CloseBitFileInput(bf);
            fprintf(stdout, "decoding done\n");
        }
    }
    else {  // otherwise
        fprintf(stderr, "not supported operation\n");
    }
    return 0;
}

