int QccENTHuffmanDesign(const int *symbols, const double *probabilities, int num_symbols, QccENTHuffmanTable *huffman_table);
QccENTHuffmanDesign() can create a Huffman code table suitable for decoding or encoding using the functions QccENTHuffmanEncode(3) and QccENTHuffmanDecode(3) , respectively. For an encoding table, huffman_table->table_type should be set to QCCENTHUFFMAN_ENCODETABLE before calling QccENTHuffmanDesign(). For a decoding table, huffman_table->table_type should be set to QCCENTHUFFMAN_DECODETABLE before calling QccENTHuffmanDesign().
QccENTHuffmanDesign() will allocate space for the Huffman code table via a call to QccENTHuffmanTableAlloc(3) so this does not need to be done in advance.
Any of the symbols which have zero probabilities are not included in the output Huffman code table. That is, such zero-probability symbols are not included in the building of the Huffman code tree, are therefore do not get Huffman codewords assigned to them.
The design algorithm employed by QccENTHuffmanDesign() is essentially identical to the tree-based technique described in Huffman's original paper. In order to keep the implementations of encoding and decoding practical, the length of Huffman codewords has been limited to QCCENTHUFFMAN_MAXCODEWORDLEN (31 bits) in this implementation. Unfortunately, Huffman's original algorithm cannot predict the length of the longest codeword in advance of designing the code tree, nor can it place an upper bound on codeword lengths. As a result, QccENTHuffmanDesign() will fail if, while designing the Huffman code, it encounters a situation that requires a codeword of length longer than QCCENTHUFFMAN_MAXCODEWORDLEN bits; this problem often arises when the symbol alphabet contains a large number of symbols that have relatively small probabilities. In such a case, attempts to generate the code table are terminated, and a return value of 1 is returned to the calling procedure.
It would probably be better to have the implementation resort to a modified Huffman code (i.e., a code in which low-probability symbols are "collapsed" into a single node in the code tree, and a Hufffman-code prefix along with a fixed-length code is used for the collapsed symbols) if a Huffman code cannot be built without exceeding QCCENTHUFFMAN_MAXCODEWORDLEN codeword length. This has not be done in this implementation, however.
D. A. Huffman, "A Method for the Construction of Minimum-Redundancy Codes," Proceedings of the IRE, vol. 40, pp. 1098-1101, September 1952.