QccVIDRDWTBlockDecodeHeader.3

NAME

QccVIDRDWTBlockEncode, QccVIDRDWTBlockDecodeHeader, QccVIDRDWTBlockDecode - encode/decode an image sequence using the RDWT-block algorithm

SYNOPSIS

int QccVIDRDWTBlockEncode(QccIMGImageSequence *image_sequence, const QccFilter *filter1, const QccFilter *filter2, const QccFilter *filter3, int subpixel_accuracy, QccBitBuffer *output_buffer, int blocksize, int num_levels, int target_bit_cnt, const QccWAVWavelet *wavelet, const QccString mv_filename, int read_motion_vectors, int quiet);

int QccVIDRDWTBlockDecodeHeader(QccBitBuffer *input_buffer, int *num_rows, int *num_cols, int *start_frame_num, int *end_frame_num, int *blocksize, int *num_levels, int *target_bit_cnt);

int QccVIDRDWTBlockDecode(QccIMGImageSequence *image_sequence, const QccFilter *filter1, const QccFilter *filter2, const QccFilter *filter3, int subpixel_accuracy, QccBitBuffer *input_buffer, int target_bit_cnt, int blocksize, int num_levels, const QccWAVWavelet *wavelet, const QccString mv_filename, int quiet);

DESCRIPTION

Encoding

QccVIDRDWTBlockEncode() encodes an image_sequence using the RDWT-block video-coding algorithm by Park and Kim. Essentially, the RDWT-block algorithm involves traditional block-based motion estimation and motion compensation wherein the redundant phases of the RDWT of the reference frame are used to circumvent the shift variance of the wavelet transform; see "ALGORITHM" below for greater detail.

image_sequence is the image sequence to be coded and should indicate a collection of grayscale images of the same size stored as separate, numbered files; the filename indicated by image_sequence must contain one printf(3)-style numerical descriptor which will then be filled in with the current frame number (e.g., football.%03d.pgm will become football.000.pgm, football.001.pgm, etc.; see QccPackIMG(3)). Each frame of image_sequence must have a size which is an integer multiple of blocksize both horizontally and vertically. The start_frame_num and end_frame_num fields of image_sequence should indicate the desired starting and stopping frames, respectively, for the encoding; these should either be set manually or via a call to QccIMGImageSequenceFindFrameNums(3) prior to calling QccVIDRDWTBlockEncode().
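
The filename expansion and the frame-size constraint described above can be illustrated with a minimal sketch. The helpers frame_filename() and frame_size_is_valid() are hypothetical names introduced for this example only; they are not part of QccPack, which performs its own filename handling internally.

```c
#include <stdio.h>

/* Hypothetical helper (not a QccPack routine): expand a printf(3)-style
   template such as "football.%03d.pgm" into the filename of one frame.
   Returns 0 on success, 1 if the result does not fit in dst. */
int frame_filename(char *dst, size_t dst_size,
                   const char *template_name, int frame_num)
{
  int written = snprintf(dst, dst_size, template_name, frame_num);
  return ((written < 0) || ((size_t)written >= dst_size)) ? 1 : 0;
}

/* Hypothetical helper (not a QccPack routine): nonzero if both frame
   dimensions are integer multiples of blocksize. */
int frame_size_is_valid(int num_rows, int num_cols, int blocksize)
{
  if ((blocksize <= 0) || (num_rows <= 0) || (num_cols <= 0))
    return 0;
  return ((num_rows % blocksize) == 0) && ((num_cols % blocksize) == 0);
}
```

For instance, a 288x352 frame satisfies the constraint for a blocksize of 16, whereas a 240x360 frame does not, since 360 is not a multiple of 16.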

filter1, filter2, and filter3 are interpolation filters for supporting the subpixel accuracy specified by subpixel_accuracy which can be one of QCCVID_ME_FULLPIXEL, QCCVID_ME_HALFPIXEL, QCCVID_ME_QUARTERPIXEL, or QCCVID_ME_EIGHTHPIXEL, indicating full-, half-, quarter-, or eighth-pixel accuracy, respectively. See "SUBPIXEL ACCURACY" below.

output_buffer is the output bitstream which must be of QCCBITBUFFER_OUTPUT type and opened via a prior call to QccBitBufferStart(3).

num_levels gives the number of levels of dyadic wavelet decomposition to perform, and wavelet is the wavelet to use for decomposition.

When encoding the current frame, QccVIDRDWTBlockEncode() will first output to output_buffer all the motion vectors for the frame, and then an embedded intraframe encoding of the motion-compensated residual. target_bit_cnt is the desired number of bits to output for each frame, including motion vectors and motion-compensated residual. After target_bit_cnt bits have been produced for the current frame, QccVIDRDWTBlockEncode() will call QccBitBufferFlush(3) to flush the bit-buffer contents to the output bitstream. Encoding of the next frame starts on the next byte boundary of the output bitstream.
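
The byte alignment described above implies a simple relationship between target_bit_cnt and the space each frame occupies in the bitstream. The following sketch assumes that the encoder produces exactly target_bit_cnt bits for every frame before flushing; the helper names are hypothetical, and offsets are measured from the start of the first coded frame because the size of the header is not specified here.

```c
/* Hypothetical helper (not a QccPack routine): number of bytes one frame
   occupies when target_bit_cnt bits are written and the bit buffer is
   then flushed to the next byte boundary. */
long frame_bytes(int target_bit_cnt)
{
  return ((long)target_bit_cnt + 7) / 8;
}

/* Hypothetical helper: byte offset of coded frame frame_index
   (0 = first coded frame), relative to the start of the first coded
   frame. Assumes every frame uses exactly target_bit_cnt bits. */
long frame_offset(int frame_index, int target_bit_cnt)
{
  return (long)frame_index * frame_bytes(target_bit_cnt);
}
```

Under these assumptions, a target_bit_cnt of 8001 costs 1001 bytes per frame, the final byte carrying seven bits of padding.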

mv_filename gives the filename for files of motion vectors. If read_motion_vectors is FALSE, then the motion vectors are written to mv_filename via QccVIDMotionVectorsWriteFile(3). mv_filename should have a printf(3)-style numerical descriptor which will then be filled in with the current frame number before writing, so that motion vectors are separated into multiple files, one file per frame. On the other hand, if read_motion_vectors is TRUE, then motion vectors are read from mv_filename via QccVIDMotionVectorsReadFile(3), in which case QccVIDRDWTBlockEncode() performs no motion estimation itself and simply uses the motion vectors read from the files for coding.

If quiet = 0, QccVIDRDWTBlockEncode() will print to stdout a number of statistics concerning each frame as it is encoding. If quiet = 1, this verbose output is suppressed.

Decoding

QccVIDRDWTBlockDecodeHeader() decodes the header information in a bitstream produced by QccVIDRDWTBlockEncode(). The input bitstream is input_buffer which must be of QCCBITBUFFER_INPUT type and opened via a prior call to QccBitBufferStart(3). The header information is returned in num_rows (vertical size of image-sequence frames), num_cols (horizontal size of image-sequence frames), start_frame_num (number of the first frame of the sequence), end_frame_num (number of the last frame of the sequence), blocksize (size of the blocks used in motion estimation and compensation), num_levels (number of wavelet-transform levels), and target_bit_cnt (number of bits encoded for each frame).

QccVIDRDWTBlockDecode() decodes the bitstream input_buffer, reconstructing each image of the output image sequence and writing it to a separate, numbered grayscale-image file. The filename denoted by image_sequence must contain one printf(3)-style numerical descriptor which is filled in with the number of the current frame being decoded. The bitstream must already have had its header read by a prior call to QccVIDRDWTBlockDecodeHeader() (i.e., you call QccVIDRDWTBlockDecodeHeader() first and then QccVIDRDWTBlockDecode()). If quiet = 0, then QccVIDRDWTBlockDecode() prints a brief message to stdout after decoding each frame; if quiet = 1, then this message is suppressed.

mv_filename gives the name of files of motion vectors. If mv_filename is NULL, then QccVIDRDWTBlockDecode() simply decodes the motion vectors to use in decoding from the bitstream (the usual state of affairs). On the other hand, if mv_filename is not NULL, then the motion vectors stored in the bitstream are ignored, and the motion vectors are read from mv_filename instead. mv_filename should have a printf(3)-style numerical descriptor which will then be filled in with the current frame number before reading via QccVIDMotionVectorsReadFile(3).

ALGORITHM

In recent years, a number of video coders have been proposed that use the RDWT to circumvent the shift variance of traditional critically sampled wavelet transforms, a trait which has long hindered deployment of motion estimation and compensation in the DWT domain. The majority of prior work concerning RDWT-based video coding originates in the work of Park and Kim, who proposed the RDWT-block coder implemented here. In essence, the RDWT-block coder works as follows. An input frame is decomposed with a critically sampled DWT via QccWAVSubbandPyramidDWT(3) and partitioned into blocks wherein each NxN block (N = blocksize) is composed of the coefficients from each subband that correspond spatially to a particular NxN block in the original image. A full-search block-matching algorithm is then used to compute motion vectors for each wavelet-domain block; the system uses as the reference for this search an RDWT decomposition of the previous reconstructed frame, generated by QccWAVWaveletRedundantDWT2D(3). The motion-estimation procedure of the coder amounts to identifying, for each block of the current frame, a particular critically sampled DWT in the RDWT, and a displacement within that DWT (see QccWAVWaveletRedundantDWT2DSubsample(3)). Transmission of a single motion vector per block suffices to convey all of this motion information to the decoder.
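
The relationship between a motion vector, a phase of the RDWT, and a displacement within that phase can be illustrated with a deliberately simplified sketch. It is not the QccPack implementation: it assumes a one-dimensional signal, a single decomposition level, an unnormalized Haar lowpass filter (sum of two neighboring samples), circular boundary extension, and the whole signal treated as one block. All function names are hypothetical. The search compares the critically sampled band of the current signal against every shifted subsampling of the redundant band of the reference, and the winning shift is then split into a phase (which critically sampled transform within the RDWT) and a displacement within that phase.

```c
#include <stdlib.h>

#define SIGNAL_LENGTH 16
#define NUM_COEFFS (SIGNAL_LENGTH / 2)

/* Circular index into a signal of SIGNAL_LENGTH samples */
static int wrap(int index)
{
  return ((index % SIGNAL_LENGTH) + SIGNAL_LENGTH) % SIGNAL_LENGTH;
}

/* Redundant (undecimated) Haar lowpass band: one coefficient per sample */
void redundant_lowpass(const int *signal, int *band)
{
  for (int n = 0; n < SIGNAL_LENGTH; n++)
    band[n] = signal[n] + signal[wrap(n + 1)];
}

/* Critically sampled Haar lowpass band: keeps even positions only */
void critical_lowpass(const int *signal, int *band)
{
  for (int k = 0; k < NUM_COEFFS; k++)
    band[k] = signal[2 * k] + signal[wrap(2 * k + 1)];
}

/* Full search: try every shift in [-search_range, search_range] and
   return the one minimizing the sum of absolute differences (SAD)
   between the current band and the shifted, subsampled reference. */
int full_search(const int *current_band, const int *reference_rdwt,
                int search_range)
{
  int best_shift = 0;
  long best_sad = -1;
  for (int shift = -search_range; shift <= search_range; shift++)
    {
      long sad = 0;
      for (int k = 0; k < NUM_COEFFS; k++)
        sad += labs((long)current_band[k] -
                    (long)reference_rdwt[wrap(2 * k + shift)]);
      if ((best_sad < 0) || (sad < best_sad))
        {
          best_sad = sad;
          best_shift = shift;
        }
    }
  return best_shift;
}

/* Split a shift into an RDWT phase (0 or 1 for one level) and a
   displacement within that phase, so shift = 2 * displacement + phase */
void split_shift(int shift, int *phase, int *displacement)
{
  *phase = ((shift % 2) + 2) % 2;
  *displacement = (shift - *phase) / 2;
}
```

If the current signal is the reference advanced circularly by three samples, the search in this sketch matches exactly at a shift of 3, which splits into phase 1 and displacement 1. A critically sampled reference would contain only the even phase and could not represent this odd shift, which is the shift variance that the RDWT reference avoids.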

After creating the motion-compensated residual in the critically sampled DWT domain, the RDWT-block coder then performs intraframe coding of the residual. Although it is possible to use any wavelet-domain coder for this task, Park and Kim used the embedded SPIHT coder as the intraframe coder. The QccPack implementation of the RDWT-block coder does the same via a call to QccSPIHTEncode2(3) on the DWT-domain motion-compensated residual.

The coder developed by Park and Kim has often been called the "low-band shift" (LBS) method due to their proposal for implementing the RDWT via a sequence of one-sample shifts and filter banks. However, "low-band shift" is just one of several equivalent ways of implementing the RDWT (the original algorithme à trous being another; see QccWAVWaveletRedundantDWT2D(3)). Consequently, we refer to Park and Kim's method as "RDWT block" to emphasize its use of traditional block-based motion estimation and compensation, albeit in the RDWT domain, and to distinguish it from other RDWT-based techniques employing significantly different coder architectures.

SUBPIXEL ACCURACY

As described originally by Park and Kim, the RDWT-block algorithm uses motion estimation and compensation with full, integer-pixel accuracy. However, due to the linearity of the RDWT, it is possible to implement subpixel interpolation in the RDWT domain in a manner similar to that used in the spatial domain to support traditional subpixel accuracy. Specifically, one simply interpolates each subband of the RDWT to subpixel accuracy independently. QccVIDRDWTBlockEncode() and QccVIDRDWTBlockDecode() both call QccVIDMotionEstimationCreateReferenceFrame(3) for each subband of the RDWT of the reference frame to interpolate the subband to the accuracy specified by subpixel_accuracy. The filters filter1, filter2, and filter3 are passed to QccVIDMotionEstimationCreateReferenceFrame(3) to control whether filtered interpolation or bilinear interpolation is performed at each step of the subpixel interpolation. See QccVIDMotionEstimationCreateReferenceFrame(3) for more detail.
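
As a rough illustration of interpolating a single subband independently, the following sketch performs plain bilinear half-pixel interpolation on one row-major array of coefficients. It is an assumption-laden stand-in, not the behavior of QccVIDMotionEstimationCreateReferenceFrame(3): the function name, the output layout of (2*rows - 1) x (2*cols - 1) samples, and the absence of any filtering are all choices made for this example. In the scheme described above, such a step would be applied separately to each subband of the RDWT of the reference frame.

```c
/* Hypothetical helper (not a QccPack routine): bilinear half-pixel
   interpolation of one subband stored in row-major order.
   in  holds rows x cols coefficients;
   out must hold (2*rows - 1) x (2*cols - 1) values.
   Even output positions copy the original coefficients; odd positions
   average the two or four nearest original coefficients. */
void bilinear_halfpixel(const double *in, int rows, int cols, double *out)
{
  int out_rows = 2 * rows - 1;
  int out_cols = 2 * cols - 1;
  for (int r = 0; r < out_rows; r++)
    for (int c = 0; c < out_cols; c++)
      {
        int r0 = r / 2;
        int c0 = c / 2;
        int r1 = r0 + (r % 2);  /* same as r0 at integer positions */
        int c1 = c0 + (c % 2);
        out[r * out_cols + c] =
          0.25 * (in[r0 * cols + c0] + in[r0 * cols + c1] +
                  in[r1 * cols + c0] + in[r1 * cols + c1]);
      }
}
```

Because the RDWT is linear, interpolating the subbands in this way is consistent with interpolating in the spatial domain, which is the property the paragraph above relies on.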

AUTHOR

Written by Joe Boettcher <jbb15@msstate.edu> based on the algorithm and code originally developed by Suxia Cui.
