Table of Contents

NAME

QccENTHuffmanDesign - design of Huffman code table from probability distribution

SYNOPSIS

#include "libQccPack.h"

int QccENTHuffmanDesign(const int *symbols, const double *probabilities, int num_symbols, QccENTHuffmanTable *huffman_table);

DESCRIPTION

QccENTHuffmanDesign() designs a Huffman code table for the alphabet of symbols given in symbols and their associated probabilities of occurrence. num_symbols gives the number of symbols in the alphabet. The symbols themselves must be nonnegative integers less than or equal to QCCENTHUFFMAN_MAXSYMBOL (currently, QCCENTHUFFMAN_MAXSYMBOL is defined to be 100000). The probabilities are nonnegative doubles less than 1; the sum of all the probabilities should be equal to 1.0.

QccENTHuffmanDesign() can create a Huffman code table suitable for decoding or encoding using the functions QccENTHuffmanEncode(3) and QccENTHuffmanDecode(3) , respectively. For an encoding table, huffman_table->table_type should be set to QCCENTHUFFMAN_ENCODETABLE before calling QccENTHuffmanDesign(). For a decoding table, huffman_table->table_type should be set to QCCENTHUFFMAN_DECODETABLE before calling QccENTHuffmanDesign().

QccENTHuffmanDesign() will allocate space for the Huffman code table via a call to QccENTHuffmanTableAlloc(3) so this does not need to be done in advance.

Any of the symbols which have zero probabilities are not included in the output Huffman code table. That is, such zero-probability symbols are not included in the building of the Huffman code tree, are therefore do not get Huffman codewords assigned to them.

The design algorithm employed by QccENTHuffmanDesign() is essentially identical to the tree-based technique described in Huffman's original paper. In order to keep the implementations of encoding and decoding practical, the length of Huffman codewords has been limited to QCCENTHUFFMAN_MAXCODEWORDLEN (31 bits) in this implementation. Unfortunately, Huffman's original algorithm cannot predict the length of the longest codeword in advance of designing the code tree, nor can it place an upper bound on codeword lengths. As a result, QccENTHuffmanDesign() will fail if, while designing the Huffman code, it encounters a situation that requires a codeword of length longer than QCCENTHUFFMAN_MAXCODEWORDLEN bits; this problem often arises when the symbol alphabet contains a large number of symbols that have relatively small probabilities. In such a case, attempts to generate the code table are terminated, and a return value of 1 is returned to the calling procedure.

It would probably be better to have the implementation resort to a modified Huffman code (i.e., a code in which low-probability symbols are "collapsed" into a single node in the code tree, and a Hufffman-code prefix along with a fixed-length code is used for the collapsed symbols) if a Huffman code cannot be built without exceeding QCCENTHUFFMAN_MAXCODEWORDLEN codeword length. This has not be done in this implementation, however.

RETURN VALUE

These routines return 0 on success, and 1 on failure.

SEE ALSO

QccENTHuffmanEncode(3) , QccENTHuffmanDecode(3) , QccENTHuffmanTable(3) , QccPackENT(3) , QccPack(3)

D. A. Huffman, "A Method for the Construction of Minimum-Redundancy Codes," Proceedings of the IRE, vol. 40, pp. 1098-1101, September 1952.

AUTHOR

Copyright (C) 1997-2021 James E. Fowler


Table of Contents



Get QccPack at SourceForge.net. Fast, secure and Free Open Source software downloads