gtrencode implements the generalized-threshold-replenishment (GTR) algorithm for adaptive vector quantization (AVQ). The vectors of datfile (DAT format) are encoded using the GTR algorithm, with initial_codebook (CBK format), if specified, being the initial codebook. gtrencode outputs a channel of VQ indices, channelfile (CHN format), and side information, sideinfofile (SID format). codebook_coder gives the scalar quantizer used to implement the codebook coder and generate the side information (see below). Note: gtrencode implements the move-to-front variant of the GTR algorithm.
If -ic is given, then the initial codebook is read from initial_codebook. If -cs is given, then max_codebook_size gives the maximum allowable codebook size. That is, each vector that updates the codebook is added as a new codeword, increasing the codebook size, until the number of codewords reaches max_codebook_size. After that point, each vector update replaces an existing codeword in the codebook. If -cs is given and no initial codebook is specified (no -ic option given), the codebook starts empty and the first vector coded must update the codebook. If -ic is given, then max_codebook_size is set to the size of initial_codebook regardless of whether -cs is given. If neither -ic nor -cs is given, max_codebook_size defaults to 256 and the codebook starts empty.
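The option rules above can be sketched in Python as follows. This is an illustration only, not the gtrencode source: the function names resolve_max_codebook_size and apply_update are hypothetical, codewords are modeled as plain lists, and the choice of which codeword to replace once the codebook is full (here, the one at the back of the list) is an assumption, since the text above says only that an existing codeword is replaced.

```python
def resolve_max_codebook_size(initial_codebook=None, cs=None, default=256):
    """Return the effective max_codebook_size for the given options.

    initial_codebook: list of codewords read via -ic, or None if -ic is absent.
    cs: value passed with -cs, or None if -cs is absent.
    """
    if initial_codebook is not None:
        # -ic overrides -cs: the size is fixed by the initial codebook.
        return len(initial_codebook)
    if cs is not None:
        return cs
    return default


def apply_update(codebook, new_codeword, max_size):
    """Apply one codebook update in place.

    While the codebook is below max_size it grows; afterwards each update
    replaces an existing codeword. Inserting at the front and evicting the
    back entry is an assumption made for this sketch.
    """
    if len(codebook) >= max_size:
        codebook.pop()  # assumed replacement policy: drop the last codeword
    codebook.insert(0, new_codeword)


# Example: -cs 2 with no -ic, so the codebook starts empty.
codebook = []
max_size = resolve_max_codebook_size(cs=2)
for vector in ([1.0, 2.0], [3.0, 4.0], [5.0, 6.0]):
    apply_update(codebook, vector, max_size)
```

After the three updates in the example, the codebook holds two codewords: the first two updates grow it to max_codebook_size, and the third replaces an existing entry.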
gtrdecode is used for the corresponding decoding of channelfile and sideinfofile; avqrate calculates the bit rate for the GTR algorithm as represented by this channel and the side information. See gtrdecode(1) and avqrate(1) for more details.
If option -fc is specified, gtrencode outputs the final state of the codebook, taking into account all codebook updates performed during coding, to final_codebook.
The side information output by gtrencode consists of a series of flags indicating whether or not the codebook is updated at a given time, as well as the vectors added to the codebook during a codebook update. The side-information file, sideinfofile, consists of a sequence of symbols of the following form:
<symbol type> <symbol value>
where <symbol type> is 1 for a flag and 3 for an update vector. If <symbol type> = 1, then the current symbol is a flag and <symbol value> is either 0 or 1. <symbol value> = 1 indicates that a codebook update is to be performed, and the next symbol will be the new codeword; <symbol value> = 0 indicates no update. If <symbol type> = 3, then the current symbol is a vector. In this case, <symbol value> = v[1] v[2] ... v[dim]. The scalar quantizer codebook_coder (SQ format) is used to quantize each component of the vector that updates the codebook. Thus, each v[i] is an index output from this scalar quantizer. Note: dim is the vector dimension specified in the header of sideinfofile (see QccPack(1)). Note: in all instances, <symbol value> and <symbol type> are stored as ASCII characters. The value of N (see QccPack(1)) specified in the header of sideinfofile gives the number of symbols with <symbol type> = 1, i.e., the number of flags stored in sideinfofile.
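The symbol layout described above can be illustrated with a small Python reader. It is a sketch under several assumptions: the SID header has already been stripped and dim taken from it, the symbols are separated by whitespace, and a vector symbol appears only directly after a flag with value 1. The function name parse_side_info is hypothetical and is not part of QccPack.

```python
def parse_side_info(text, dim):
    """Parse the body of a side-information stream into a list of symbols.

    text: ASCII symbol stream with the header removed (assumption).
    dim:  vector dimension, as given in the sideinfofile header.
    Returns a list of ("flag", 0 or 1) and ("vector", [indices]) tuples.
    """
    tokens = text.split()
    symbols = []
    i = 0
    expect_vector = False  # True right after a flag with value 1
    while i < len(tokens):
        sym_type = int(tokens[i])
        i += 1
        if sym_type == 1:
            if expect_vector:
                raise ValueError("flag found where an update vector was expected")
            if i >= len(tokens):
                raise ValueError("truncated flag symbol")
            flag = int(tokens[i])
            i += 1
            if flag not in (0, 1):
                raise ValueError("flag value must be 0 or 1")
            symbols.append(("flag", flag))
            expect_vector = (flag == 1)
        elif sym_type == 3:
            if not expect_vector:
                raise ValueError("update vector without a preceding flag of 1")
            indices = [int(t) for t in tokens[i:i + dim]]
            if len(indices) != dim:
                raise ValueError("truncated update vector")
            i += dim
            symbols.append(("vector", indices))
            expect_vector = False
        else:
            raise ValueError("unknown symbol type: %d" % sym_type)
    if expect_vector:
        raise ValueError("stream ended before the update vector")
    return symbols


# Example stream: no update, then an update carrying a 3-dimensional vector
# of scalar-quantizer indices, then no update.
example = "1 0 1 1 3 5 2 7 1 0"
parsed = parse_side_info(example, dim=3)
num_flags = sum(1 for kind, _ in parsed if kind == "flag")
```

In this example num_flags is 3, which is the quantity the header field N is described as holding. The values in each vector are scalar-quantizer indices, so recovering the actual codeword components would still require the codebook_coder quantizer.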
J. E. Fowler and S. C. Ahalt, "Adaptive Vector Quantization Using Generalized Threshold Replenishment," in Proceedings of the IEEE Data Compression Conference (J. A. Storer and M. Cohn, eds.), (Snowbird, UT), pp. 317-326, IEEE Computer Society Press, March 1997.