chnarithmeticencode performs adaptive arithmetic coding of the channel sequence stored in channelfile (CHN format). chnarithmeticencode is based on the well-known and widely used implementation of arithmetic coding popularized by Witten et al. It includes their first-order adaptive model; additionally, it can use a second-order model for more complex modeling of data sources.
The -o option specifies the order of the adaptive model. If order is 1, chnarithmeticencode simply replicates the operation of the coder and adaptive model described by Witten et al. If order is 2, the previous symbol in the channel serves as a context for the coding of the current symbol. This higher order model should provide greater coding efficiency (i.e., more compression) if the data source (channelfile) possesses some degree of memory rather than being the memoryless source assumed by the first-order model.
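The difference between the two model orders can be illustrated with a minimal sketch that computes the ideal code length of an adaptive frequency model. This is not the code used by chnarithmeticencode: the function name adaptive_rate, the starting context of 0, and the initial count of 1 per symbol are assumptions made for illustration, and the count rescaling and EOF symbol of a real coder are omitted.

```python
import math


def adaptive_rate(symbols, alphabet_size, order=1):
    """Ideal code length, in bits per symbol, under an adaptive frequency model.

    order=1 keeps one table of symbol counts; order=2 keeps one table per
    previous symbol, so the previous symbol acts as the coding context.
    """
    if order not in (1, 2):
        raise ValueError("order must be 1 or 2")
    num_contexts = 1 if order == 1 else alphabet_size
    # Every count starts at 1 so that no symbol ever has zero probability.
    counts = [[1] * alphabet_size for _ in range(num_contexts)]
    totals = [alphabet_size] * num_contexts
    context = 0  # assumed starting context, before any symbol has been seen
    bits = 0.0
    for s in symbols:
        # Cost of coding s with the current estimate, then adapt the model.
        bits += -math.log2(counts[context][s] / totals[context])
        counts[context][s] += 1
        totals[context] += 1
        if order == 2:
            context = s
    return bits / len(symbols)


# A source with strong memory: symbols strictly alternate.
source = [0, 1] * 500
print("order 1:", adaptive_rate(source, 2, order=1))
print("order 2:", adaptive_rate(source, 2, order=2))
```

For this alternating source the first-order estimate stays near 1 bit per symbol, while the second-order estimate falls far below it, because each symbol is almost fully predicted by the one before it.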
chnarithmeticencode first writes a few bytes of header information (the original number of symbols in channelfile, the symbol alphabet size, and the arithmetic-coding order) to outfile. The byte-packed bitstream produced by the arithmetic coding follows. As in the implementation by Witten et al., a special EOF symbol is output at the end of the bitstream to allow the decoder, chnarithmeticdecode(1), to detect the stopping point.
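The header-plus-EOF arrangement can be sketched as follows. The actual byte layout written by chnarithmeticencode is not specified here, so the three big-endian 32-bit fields, the function names, and the choice of EOF index are all hypothetical and serve only to show the idea.

```python
import struct

# Assumed layout: three big-endian 32-bit unsigned integers.
# The real header layout of chnarithmeticencode may differ.
HEADER_FORMAT = ">III"


def pack_header(num_symbols, alphabet_size, order):
    """Serialize the three header fields that precede the bitstream."""
    return struct.pack(HEADER_FORMAT, num_symbols, alphabet_size, order)


def unpack_header(data):
    """Recover (num_symbols, alphabet_size, order) from the start of data."""
    size = struct.calcsize(HEADER_FORMAT)
    return struct.unpack(HEADER_FORMAT, data[:size])


def with_eof(symbols, alphabet_size):
    """Append an EOF symbol, here taken as the first index past the alphabet."""
    eof_symbol = alphabet_size  # hypothetical choice of EOF index
    return list(symbols) + [eof_symbol]


header = pack_header(num_symbols=1000, alphabet_size=256, order=2)
print(len(header), unpack_header(header))
print(with_eof([3, 1, 2], alphabet_size=4))
```

Giving the EOF symbol an index outside the real alphabet means the model must cover alphabet_size + 1 symbols, and the decoder can stop as soon as it decodes that extra index.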
Normally, chnarithmeticencode prints to stdout the rate achieved by the arithmetic coding, expressed in bits per channel symbol. This output may be suppressed with the -s option (silent mode). The -vo option indicates that only the value of the rate is to be printed (terse output). The -s option overrides the -vo option. If the -d option is given, the rate is printed as bits per vector component (i.e., the bit rate of the arithmetic coding divided by vector_dimension). For example, the -d option gives a convenient way to calculate the bit rate, in bits per original source symbol, when the channel symbols are indices output from a vector quantizer (see vqencode(1)). Note: the -d option in no way affects the operation of the arithmetic encoder; it affects only the printed rate.
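The relationship between the two printed rates amounts to a single division, sketched below. The function name and arguments are hypothetical, and whether the tool counts the header bytes in its reported rate is not stated here, so the sketch simply uses whatever bit count it is given.

```python
def coding_rate(num_coded_bits, num_symbols, vector_dimension=1):
    """Rate in bits per channel symbol, or per vector component if
    vector_dimension is greater than 1 (the effect of the -d option)."""
    if num_symbols <= 0 or vector_dimension <= 0:
        raise ValueError("num_symbols and vector_dimension must be positive")
    bits_per_channel_symbol = num_coded_bits / num_symbols
    return bits_per_channel_symbol / vector_dimension


# 1000 vector-quantizer indices coded in 3000 bits, each index
# representing a vector of 4 source samples.
print(coding_rate(3000, 1000))      # bits per channel symbol
print(coding_rate(3000, 1000, 4))   # bits per original source symbol
```

With these example numbers, 3 bits per channel symbol corresponds to 0.75 bits per original source symbol.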
I. A. Witten, R. M. Neal, and J. G. Cleary, "Arithmetic Coding For Data Compression," Communications of the ACM, vol. 30, no. 6, pp. 520-540, June 1987.
T. M. Cover and J. A. Thomas, Elements of Information Theory. New York: John Wiley & Sons, Inc., 1991.