49
Manual and Exercises Prof.dr.ir. R.L. Lagendijk Information and Communication Theory Group Delft University of Technology [email protected] Delft

vcdemomanualquestions3

Embed Size (px)

Citation preview

Page 1: vcdemomanualquestions3

Manual and Exercises

Prof.dr.ir. R.L. LagendijkInformation and Communication Theory Group

Delft University of [email protected]

������������ �����������

Delft

Page 2: vcdemomanualquestions3

VcDemo® Manual and Excercises -2-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Contents

Contents___________________________________________________________________2

1 About VcDemo®_________________________________________________________3

2 Subsampling (SS) ______________________________________________________10

3 Pulse Code Modulation (PCM)____________________________________________12

4 Differential Pulse Code Modulation (DPCM) ________________________________14

5 Vector Quantization (VQ) ________________________________________________16

6 Fractal Compression (FRAC) ____________________________________________19

7 Subband Compression (SBC) ____________________________________________21

8 Discrete Cosine Transform (DCT)_________________________________________24

9 Baseline JPEG_________________________________________________________27

10 Set Partioning In Hierarchical Trees (SPIHT) _______________________________30

11 Embedded Zerotree Wavelet (EZW) ________________________________________32

12 Motion Estimation______________________________________________________34

13 MPEG-1/2 Decoder _____________________________________________________38

14 MPEG-1/2 Encoder_____________________________________________________41

15 Tips and Tricks ________________________________________________________44

Page 3: vcdemomanualquestions3

VcDemo® Manual and Excercises -3-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

1 About VcDemo®

The VcDemo® image and video compression learning tool is a fully menu-driven package. It isoperated by selecting compression techniques and parameters using buttons. In most cases you shouldbe able to run VcDemo® without any further explanations starting from the main application window.This window is shown in Figure 1.1 (this is actually taken from an earlier release: the current versionhas even more options). The purpose of VcDemo® is to be able to experiment with different(sometimes complex) image compression algorithms without bothering about their actualimplementation. For this reason VcDemo® is menu driven, and allows the user to experiment withdifferent parameters that define the specific compression operations, e.g. in PCM compression whatnumber of bits to use, and whether or not to apply dithering. On one hand visualization of imagecompression results will help to gain insight in "how good" a compression algorithm performs. On theother hand the compression results will stimulate a problem-solving attitude. A generally effectiveattitude toward using VcDemo® is asking yourself all the time the questions “what will happen given acertain combination of parameters”, “what is the cause of a certain compression artifacts”, and “howdo compression techniques compare visually and numerically”.

VcDemo® is a tool assisting the learning process, but does in itself not explain the compressiontechniques. Lecture materials (sheets) for a full course on signal, image and video compression isavailable from the author’s website: http://www-ict.its.tudelft.nl/et10-38. The VcDemo® software,including sample images and sequences, can be downloaded from http://www-ict.its.tudelft.nl/vcdemo.

Figure 1.1. Main application window prior to loading an image to work on.

Installation

Page 4: vcdemomanualquestions3

VcDemo® Manual and Excercises -4-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

The VcDemo® software comes as a self-installing package. Just download the most recent release ofVcDemo®, unzip the file, and run Setup.exe. The install shield will ask for an installation directory,and suggests a default one in C:\Program Files. In this install directory, four subdirectories will becreated, namely:� a directory for storing all quantization tables and filter coefficients needed by VcDemo®.

Normally this will be the subdirectory \Tables of the installation directory. The files in thisdirectory should never be touched or removed by users.

� a directory for storing vector quantization codebooks. Normally this will be the subdirectory\CodeBook of the installation directory. The vector quantization module will read and let the userwrite codebooks into this directory

� an image directory. VcDemo® comes with several test images that are written into this directory.However, VcDemo® will work with any bmp image, provided that the dimensions satisfy certaincompression technique-dependent constraints (for instance, even number of rows and columns). Apossible choice for this directory is the subdirectory \Images of the installation directory.

� An “results” directory, where the user-saved files (jpeg, mpeg) are stored.Selected installation directories can always be changed by clicking “Help�Setting”. Credits andquick access to the web pages of the author’s research group at TU-Delft. After installing VcDemo®,you are ready to go. Just click on VcDemo.exe in the installation directory, and the demo will start.VcDemo® does not install any additional dll’s on your system.

Over time the VcDemo installation files started to get larger and larger. Therefor, we currently have a“Full” installation, with all the nuts and bolts, which is 15 MByte (zipped) and a “Light” installationwhich is substantially smalle because some of the video files have been omitted. You can alwaysdownload and add these files later on.

Getting StartedAfter starting VcDemo®, an image needs to be loaded to work upon. Clicking on “File�Open Image”or on the toolbar button will give the contents of the selected image directory. Pick any bmp image,but be aware that some compression methods are restricted to image dimensions that are even, orpowers of 2. A good choice is an image of 256x256 pixels, as this can be used with all compressionalgorithms, will and will nicely fit in the workspace. Larger images can be used, but they aresometimes shown in smaller windows because lack of space (try to double click the left mouse buttonin the image and see what happens!). Of course the windows of these larger images can always beresized to see them completely. VcDemo® can load color or paletted images. All compression routineson images work, however, on 8 bit gray value images, so internally the input image format isconverted to 256 gray values. For sequences (MPEG) this may be different.

Figure 1.2 shows VcDemo® after opening one image. This image is called the “start image”, as allcompression operations will work on this image. Multiple images can be open at the same time butonly one image can be the start image (and also your workspace will get cluttered). Results fromcompression operations can themselves be promoted to be a start image, so that concatenatedcompression can be carried out (use right mouse button in the image to open this menu).

On the toolbar, the pull-down menu, and the buttons all compression options are now visible. If someof the compression options are not active, this is due to the dimensions of the currently active startimage, or because of the image type (still image, video sequence, mpeg stream).

Page 5: vcdemomanualquestions3

VcDemo® Manual and Excercises -5-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Figure 1.2 Main application window after loading one image to work on.

Clicking one of the compression techniques will make the parameter menu pop-up. The menu isalways positions on the right of the workspace and is immovable. Figure 1.3 shows the result ofclicking the DPCM button. The menu consists of a number of “tabs” that can be clicked. Each of thetabs contains parameters that the user can control. After clicking “Apply” the compression will becarried out using the current setting of these parameters. With “Cancel” the menu will disappear, butall resulting images obtained will stay in the workspace.

Figure 1.3 Window after activating DPCM compression.

The lower part of the menu has a text window that will show numerical results of the compression,such as bit rate, signal-to-noise ration (not peak-SNR, but variance-based SNR). The results shown inthis text window are saved for each compressed image shown in the workspace. This information isreached by clicking in an image on the right mouse button, followed by clicking on “ImageInformation”.

���������

� �� ������

�����������

������ ���

Page 6: vcdemomanualquestions3

VcDemo® Manual and Excercises -6-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Figure 1.4 Window with DPCM results and right mouse-button menu.

Figure 1.4 shows the resulting configuration after pushing “Apply” for DPCM compression. Ingeneral the start image is positioned in the upper left corner of the workspace. The compressed anddecompressed image is shown below the start image, in a staggered fashion. Other output results, suchas spectra, DCT coefficients, or – here in DPCM compression – the prediction difference are shown tothe right of the start image. This figure also shows the right-mouse button menu. From this menu,images can be saved and can be set as start image. Image compression menu’s can also directly bestarted from the right mouse button menu; notice that compression modules started in this way stillrefer to the actual start image, not to the (compressed) image in which the compression module isstarted.

Multiple compression modules can be active, and can refer to the same or to different start images.This option is very useful if compression results are to be compared. For beginner’s operation it iseasier to have one start image and one compression module active at a time.

Figure 1.5 Multiple DPCM results information window about one of the results.

���������

����� �����������

�����

������ ���

����������

��� ������

�� ����������

!������ ��������

����������������

Page 7: vcdemomanualquestions3

VcDemo® Manual and Excercises -7-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Figure 1.5 shows several compression results in the workspace. For one of the results the informationwindow has been activated, such that the corresponding parameter settings and numerical results canbe reviewed. To terminate the active compression module and clean up all compressed images andother results related to it, close the start image. This is the easiest way to clean up a clutteredworkspace..

Working with SequencesSequences come in two flavors, namely as YUV raw data files, or as a series of bmp images.

YUV Sequences� YUV sequences contain the raw image sequences without any header. The data is stored as a

series of Y-U-V data blocks. The storage format can be luminance only, or luminance-chrominance in 4:2:0 or 4:2:2 format. With each sequence a header file must exist of the samename. For instance

…\adir\foo.yuvcontains YUV data, of which the format is described in foo.hdr. Luminance-only sequences arecontained in file with extension *.y.

The ASCII header file may start with comment lines, starting with /*Then the following numbers are expected (in this order, one line per number):

- (int) number of lines- (int) number of pixels per line- (boolean) flag indicating progressive (1) or interlaced (0) format- (int) color representation: gray value (0), 4:2:2 (1), or 4:2:0 (2)- (int) number of frames

If you have a raw Y or YUV data file (for instance MPEG test sequences), just edit a simpleheader file with the same name and the extension .hdr, and you are ready to go. Look at theexamples provided in VcDemo image directory.

BMP Sequences� An BMP image sequence is called, for instance, foo1.bmp, foo2.bmp, …, foo9.bmp, foo10.bmp,

etc. The program does not have knowledge of the length of this sequence beforehand. To indicateto the program that an foo sequence exists, the images must be organized as follows:

…\adir\foo.seq…\adir\foo\foo1.bmp…\adir\foo\foo2.bmp…\adir\foo\foo3.bmp

etc.Upon loading time, the program will look for files with extension *.seq, and – when selected – willload this image as the first image of the sequence. Therefore, foo.seq must be identical to foo1.bmp(just a copy with different extension). All other frames will be loaded from the subdirectory foo, untila next image cannot be found so that the program realizes that the sequence end has been reached.

Image sequences are opened by clicking on “File�Open Sequence”. The sequences with extension*yuv, *.y or *.seq are listed and can be opened. The first frame will be shown only. The modules thatwork with sequences can now be activated. The motion rendering in the motion estimation module isheavily dependent on the CPU speed.

Page 8: vcdemomanualquestions3

VcDemo® Manual and Excercises -8-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Working with MPEG compressed sequencesThe image directory (or any other directory) can include mpeg streams with extensions *.mpg,(generically used for mpeg streams) *.m1v (specifically for mpeg-1 video), and *.m2v (specificallyfor mpeg-2 video).

MPEG sequences are opened by clicking on “File�Open MPEG stream”. The mpeg sequences withextension .mpg are listed and can be opened. Via the pull-down menu the files with extensions *.m1vand *.m2v can be shown and opened. Upon opening the file, the window will just show nice colorsbut no video. After activating the mpeg decompression module, the window will be resized and videowill be displayed. Obviously the motion rendering is heavily dependent on the CPU speed.

HistoryLook at the web pages http://www-ict.its.tudelft.nl/vcdemo for the history of VcDemo.

Conditions of UsageEducational Use ONLYVcDemo® is a non-commercial free-ware educational tool. Any commercial usage of the VcDemo®

software is explicitly disallowed. Results - either visual or numerical - obtained with the VcDemo®

software can be used for producing non-commercial educational materials. VcDemo® is, however, notintended as and has not been optimized for ultimately comparing PSNR, visual, or encoding/decodingtime performance. This type of use of VcDemo® is therefore discouraged.

No WarrantyVcDemo® is provided "as is" without warranty of any kind, either expressed or implied. TU-Delft/ICT-Group disclaims all warranties, expressed or implied.

No LiabilityTU-Delft is not liable for any direct, indirect, special, incidental or consequential damages arising outof the use -- or the inability to use -- the VcDemo® package.

CopyrightsChanging, analyzing, or decompiling the VcDemo® executable code is not allowed. Copyrights to theVcDemo® user interface is held by TU-Delft/ICT Group. Copyrights of software code used in certaincompression is held by third parties (see credit in VcDemo® program).

AcceptanceDownloading the VcDemo® software implies agreement on the above terms of use, and in particularagreement on the restriction to educational use only.

In case of Problems or BugsThe VcDemo® compression routines have been used and extensively tested over nearly 10 years. Stillproblems may occur due to particular settings or due to small bugs in the totally renovated Win32interface. Problems and bugs can be reported to the author ([email protected]), whowill respond to your bug report but does not necessarily commit himself to fixing particular problemsor bugs. In the third release of VcDemo all reported bugs have been fixed.

Page 9: vcdemomanualquestions3

VcDemo® Manual and Excercises -9-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

The author also welcomes feedback to this manual and the exercises. In particular NEW exerciseswould be appreciated. The current set is pretty standard, so I am looking for exiting new ones!

More InformationMore information and updated can be found on the is due to VcDemo® web site. Although the sourcecode of VcDemo® is not freeware nor shareware, we have put the VcDemo® developers “manual” onthe web site so that you can get an idea of “how it has been done” in case you are interested.

AcknowledgementsAlthough the VcDemo® has a long history, the current design, user interface and most of theimplementation is due to Alberto Donadoni. On behalf of the Information and Communication TheoryGroup and all future users of VcDemo®, Alberto thanks! Great job done.

If you like VcDemo® please tell your colleagues or students. VcDemo® is freeware! The VcDemo®

web site has a downloadable information flier about VcDemo®. It might be a good idea to put it up ina place where other interested people can see it.

Page 10: vcdemomanualquestions3

VcDemo® Manual and Excercises -10-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

2 Subsampling (SS)

Compression OptionsThe compression options are shown in Figure 2.1. There are three tabs with the following parameters:

Figure 2.1 Compression options for subsampling.

Factor: Sets the subsample factor. The subsampled image can be shown in its subsampleform, i.e. a smaller image, or can be blown up to the original image size using pixelreplication.

Filter: Prior to subsampling anti-aliasing filtering is normally applied. On the filter tab the anti-aliasing filtering can be switched on/off, and the length of the low-pass filter (number offilter coefficients) can be selected. The longer the filter, the steeper is the roll-off of thetransfer function of the low-pass filter.

Spectrum: To get insight in the spectral behavior of subsampling, the spectrum of the subsampled (ororiginal) image can be shown. The computation of a spectrum is a problem in itself. In theVcDemo® subsampling menu the (windowed) periodogram estimator is used. The usercan select the window that is used to suppress leakage frequencies. Notice that thespectrum calculation does not have any influence on the actual subsampling process.

Exercises

(a) Inspect the spectrum of the image “Build512B.bmp”. The DC component of the image is in thecenter of the spectrum. Explain the line structure that you observe.

Page 11: vcdemomanualquestions3

VcDemo® Manual and Excercises -11-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

(b) Subsample by a factor of 2 this image without anti-aliasing filtering, and with a 17 taps anti-aliasing filter. Find the differences in the spectra and in the subsampled images. Can you pin-pointthe aliasing in one of the subsampled images? Try different subsampling factors and differentanti-aliasing filters.

(c) Try a couple of other test images. Is aliasing visually objectionable? Does your opinion depend onthe image or subsample factor?

Page 12: vcdemomanualquestions3

VcDemo® Manual and Excercises -12-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

3 Pulse Code Modulation (PCM)

Compression OptionsThe compression options are shown in Figure 3.1. There are three tabs with the following parameters:

Figure 3.1 Compression options for PCM compression.

Bit rate: Select PCM bit rate, integers only. A uniform quantizer optimized for a uniform pdf isapplied.

Dithering: Prior to PCM compression, a small amount of uniform noise can be added to the image.This causes PCM compression artifacts at low bit rates to be less objectionable. Since thenoise is generated deterministically, it can be subtracted at the decoder side. This optioncan be switched off/on. The amount of noise is controlled by the dither step size.

Errors: When data becomes more and more compressed, the bits become more and morevulnerable to channel errors. The VcDemo® allows the injection of random bit errors intothe compressed bit stream, prior to decompression. In this way an impression can beobtained of the effects of (simple) channel errors. Different bit error rates can be selected.

Exercises(a) Find out for the images “Lena256B”, “Clown256B” and “Odie256B” at which bit rate PCM

compression artifacts becomes visible. At what bit rates do the artifacts objectionable? How manygray values are available at that bit rate to represent the image?

(b) Add dither to the image prior to PCM compression. What is the “gain” in bit rate when the visualquality of a PCM compressed and dithered PCM compressed image are compared? What is the

Page 13: vcdemomanualquestions3

VcDemo® Manual and Excercises -13-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

conclusion is only the numerical measures (SNR, MSE) are taken into account? What is the bestoption, subtract the dither at the decoder side or not?

(c) Consider a classical test image, namely the 2-D frequency sweep or zone plate. Is this imageuseful for evaluating the quality of PCM?

(d) Add channel errors at different bit error rates and different bit rates. Give an explanation for whatyou see.

(e) Draw an SNR-versus-bit-rate-plot on the grid supplied in the end of this document, using theLena256B image. What are theoretical predictions for this curve? How well does the theory fit thepractice?

(f) Repeat (e) for the “Noise256B” image. This image contains random noise. Is there a differencebetween (e) and (f)?

Page 14: vcdemomanualquestions3

VcDemo® Manual and Excercises -14-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

4 Differential Pulse CodeModulation (DPCM)

Compression OptionsThe compression options are shown in Figure 4.1. There are four tabs with the following parameters:

Figure 4.1 Compression options for DPCM compression.

Model: Four different prediction models can be used. The main differences are between the leftone (vertical 1-D prediction) and the other three (2-D predictions).

Bit rate: Select DPCM bit rate, integers only. A Lloyd-Max quantizer optimized for a Laplacianpdf is applied. In the text window an estimate of the actual bit rate after VLC (Huffman)coding will be given. This number should be used when making SNR versus bit rate plots.

Levels: A choice can be made between an even and odd level quantizers.Errors: This tab allows the injection of random bit errors into the compressed bit stream, prior to

decompression. In this way an impression can be obtained of the effects of (simple)channel errors. Different bit error rates can be selected.

The prediction difference or prediction error:The prediction difference inside the DPCM loop is shown. Zero values are displayed asgray, while positive and negative values are displayed as brighter or darker than theoverall gray value. The prediction difference is scaled for maximum visibility.

Page 15: vcdemomanualquestions3

VcDemo® Manual and Excercises -15-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Exercises(a) Select the image “Lena256B”. Select the 1-D predictor. Carry out the compression at bitrates 6 to

1 bpp. Draw the results in an SNR-bitrate plot in the same plot as the one created for PCM.(b) Observe the correlation matrix in the experiment (a), the prediction coefficient, the resulting

prediction error variance and prediction gain. Explain how these are related and check youranswer on the basis of the numerical results. Does the experimentally determined prediction(coding) gain correspond to the theoretical model for a linear predictor of the first order? What isthe theoretical gain over PCM in terms of bit rate reduction at the same quality?

(c) Try to pin-point the DPCM slope overload and explain the structure of the prediction differenceimage.

(d) Carry out the compression for different bit rates using the other prediction models. What is therelation between the prediction model and the prediction/coding gain? Check the relation betweenthe correlation matrix and the prediction coefficients.

(e) Plot an SNR-bit rate curve for the 4-th order prediction model.(f) Examine the consequences of bit errors on the DPCM compression at different error probabilities

and for different prediction models. Explain the structure of the degradations observed especiallythe differences between the 4 prediction models.

(g) Consider the texture image “Texture256B”. Find the best prediction model and explain why it isthe best.

(h) Consider the “Noise256B” image, and look at the correlation structure and the resulting predictioncoefficients and prediction/coding gain. Explain the numerical results observed.

(i) Consider the “zone plate” and the “Odie256B” image. Why does DPCM not work well on theseimages?

Remark: At low bit rates you may see funny effects in the compressed images. These are oscillationpatterns caused by a particular predictor and quantizer combination. Try to explain why is it possiblefor these oscillations to occur while the predictor is supposedly “optimal”.

DiscussionThe PCM and DPCM compression modules are not spatially adaptive in the sense that the samequantizer is applied to all picture elements. One could also select a different quantizer for each pictureelement, or for groups of pixels (e.g. blocks). Calculate the amount of overhead generated in this wayfor several adaptation choices.

Page 16: vcdemomanualquestions3

VcDemo® Manual and Excercises -16-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

5 Vector Quantization (VQ)

Compression OptionsThe VQ compression module has two modes in which it can work. In the first place one can use anexisting codebook for compressing an image. Secondly, on can design a codebook usingtrainingsimages.

The main options are shown in Figure 5.1. There are three tabs with the following parameters:

Figure 5.1 Compression options for VQ compression.

Codebook:Choose to create a new codebook or load an existing codebook. The other two tabs aredisabled until a codebook has been designed or loaded.

Bits: Select the number of bits per codebook vector. This choice in combination with thecodebook vector size determines the compression bit rate.

Errors: This tab allows the injection of random bit errors into the compressed bit stream, prior todecompression. Different bit error rates can be selected.

Figure 5.2 shows the menu when “Load Codebook” is pressed. The directory opened is the one setunder “Help�Settings”. The codebook names usually show the size of the codevectors, and theminimum and maximum rate stored in the codebook. Note that a single codebook stores the codebookvectors for multiple bit rates. The codebooks are simply bmp images, and can also be viewed using anormal image viewer. From top to bottom, the codebooks of increasing rate are stored, sometimes in arow wrapped fashion. Figure 5.3 shows an example of a (very small) codebook with codebook vectorsize 4x4 pixels for bit rates of 0.0625 to 0.31 bit per pixel.

Page 17: vcdemomanualquestions3

VcDemo® Manual and Excercises -17-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

1 bit2 bit3 bit4 bit5 bit

Figure 5.2 Load Codebook options.

Figure 5.3 Example of a small codebook “image”

Figure 5.4 shows the menu for “Create Codebook”. This menu has a large number of parameters thatcan be grouped into two classes:

Codebook vectors:� Horizontal and vertical dimension of the codebook vectors can be set independently. The

dimensions must be a multiple of the image dimensions.� Minimum and maximum bit rate for which the codebook needs to be designed. Note that

the design of a codebook is exponentially dependent on the bit rate. The design of acodebook for more than 8 bits may take from 15 minutes up to hours – of course machinedependent but be prepared to wait (long) and get some coffee! The program will outputsome information in the text window so you will know it is still alive.

Initialization and convergence:� The design of a codebook is an iterative procedure. The codebook vectors can be

initialized using random numbers in the range [0,255] or with constant gray values,uniformly distributed over the codebook vectors in the range [0,255]. The results will bedifferent and convergence speed may be different.

� The codebook design iterations have to be truncated after sufficient convergence. Twothresholds are in effect, namely the difference in MSE in two subsequent iterations(calculated on the current estimate of the codebook and the trainingvectors) which iscalled the convergence threshold, and the number of iterations carried out (to preventinfinite computation time). Setting smaller convergence thresholds and/or larger numberof iterations yields larger computation times.

The codebooks of different rates are designed sequentially, starting at the lowest rate codebook. TheMSE between the current estimate of the codebook and the trainingvectors is shown in the textwindow for each iteration. Observe the decrease of this number. In the initial iterations sometimes notrainingsvectors are assigned to codebook vectors. The percentage of no-assignments is shown in thetext window. These no-assignment codebook vectors are re-initialized with constant gray values.

Page 18: vcdemomanualquestions3

VcDemo® Manual and Excercises -18-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

The codebook will be designed using training vectors drawn from the current start image. In generalmultiple trainingsimages are necessary for robust codebooks. Use a simple program (like MsPaint) tocombine several images into a single image, load this composite image as start image, and invoke thecodebook design procedure. Codebooks that have been earlier designed and saved can be used lateron.

Figure 5.4 Options for VQ codebook design.

Exercises(a) Load one of the predesigned codebooks. Try to pin-point the codebook vectors. How many

codebook vectors are there for the different bit rates? For instance, how large is the code book ifthe bitrate is 0.5 bit/pixel and the block size is 4x4?

(b) Load the “Lena256B” image. Load the predesigned codebook “Standard_4x4_min1_max12.cbk”.This codebook has been designed on 4 test images (fruit, camman, clown, girl). Make an SNR-bitrate curve in the same plot as the DPCM/PCM curves.

(c) Examine the consequences of bit errors on the VQ compression at different error probabilities andfor different prediction models. Explain the structure of the degradations observed.

(d) Load a composite image that can be used for codebook design. A rule of thumb is that at least 10times as many trainingvectors are needed as the number of codebook vectors to be designed. Theactually needed number of images depends on the size of the codebook vectors and the maximumbit rate for which the codebook is to be designed. Be careful not to put your PC in a 4 hourscomputation mode!

(e) Design a codebook for 4x4 codebook vectors, and observe the iteration information that comes byin the text window.

(f) After saving your codebook, apply it to the Lena256B image. Are the SNR/MSE numbers thesame as those you got under (b)? Explain.

(g) Consider different codebook vector dimensions, for instance 8x2 and/or 2x8. Design the codebookfor these sizes, and compare them to the results you obtained under (f). Explain differences.

(h) Try another image, for instance “Odie256B” or the “Noise256B” image.

Page 19: vcdemomanualquestions3

VcDemo® Manual and Excercises -19-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

6 Fractal Compression (FRAC)

Compression OptionsThe compression options are shown in Figure 6.1. There are four tabs with the following parameters:

Figure 6.1 Compression options for fractal compression.

Size: The maximum and minimum range block size can be set here. Generally, the smaller therange blocks are allowed to become, the higher the bit rate and the longer the computationtime for encoding and decoding.

Threshold: To determine if a range block has to be split into 4 smaller range blocks, a check is carriedout on the MSE of the range block and the best domain block found. If the square root ofthe MSE (RMS) is larger than the threshold set, the range block will be split.

Bits: The domain to range block mappings are determined by a shift vector (fixed rate,dependent on range block size), an isomorphic projection (fixed 3 bits), and the scale andoffset for the intensity transformation. The number of bits for the scale and offset factorcan be selected here. The more bits are chosen, the higher the bit rate of the compressedimage.

Iterations: The decoder carries out an iterative reconstruction to recover the compressed image. Herethe number of iterations can be chosen. Once an image is encoded, choosing a differentnumber of iterations does not require compressing the image again. The compressedimage is stored on disk as a bitstream in the file “FractalCodedImageBitStream.fif”. Thedecoder reads bits from this file.

Exercises

Page 20: vcdemomanualquestions3

VcDemo® Manual and Excercises -20-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

(a) Set the maximum and minimum block size to the same number, and compress the “Lena256B”image. Draw SNR-bit rate curves for variable number of reconstruction iterations, and for avariable range block size. Keep the quantization of the scale and offset fixed at the defaultsettings.

(b) Let the compression algorithm itself figure out how large to choose the range block size. Startwith a high threshold value, and decrease this value to get progressively better results and smallerrange blocks. Also the computation time will go up. Draw SNR-bit rate curves.

(c) Choose a fixed setting for the threshold value and the number of reconstruction iterations.Determine the influence of the scale and offset quantization.

(d) Try another image, for instance “Odie256B” or the “Noise256B” image.

Remark: Fractal compression as implemented here does not have a possibility for rate control.Therefore, as you vary parameters both the bit rate and the MSE/SNR will change. Keep an eye onthis when comparing results.

Page 21: vcdemomanualquestions3

VcDemo® Manual and Excercises -21-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

7 Subband Compression (SBC)

Compression OptionsThe compression options are shown in two parts in Figure 7.1. There are five tabs with the followingparameters:

Figure 7.1 Compression options for subband coding.

Decomp: The subband decomposition structure can be selected here. A choice of 6 commondecompositions is available, though many more decompositions can be envisioned. Thequantization of the subbands can be switched off to measure the quality of filter banksthat carry out the subband decomposition and reconstruction.

Filter: The number of filter coefficients for the low pass QMF analysis filter can be selected. Theother filters are derived from this low pass QMF filter. The filters used are those designedby Johnston.

Subs: The quantization of the subbands can be selected. Subband 1 (low-pass subband) can becompressed independently of the higher frequency subbands. For the subbands a choicecan be made between PCM and DPCM. The quantizers considered for each subbanddepend on the pdf selected here. The c-value specifies the shape parameter of ageneralized Gaussian pdf. If DPCM is chosen, the DPCM prediction model can beselected.

Bitrate: Different bit rates can be selected here. The limited number of choices here has been hardcoded, but is not essential to subband compression as the bit allocation procedure(implemented here as the greedy or convex hull algorithm) can achieve any desired

Page 22: vcdemomanualquestions3

VcDemo® Manual and Excercises -22-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

(overall) bit rate. Normally entropy coding (Huffman) is applied to the quantizer outputs,but in order to accommodate experiments with channel bit errors, the entropy coding hasto be switched off.

Errors: This tab allows the injection of random bit errors into the compressed bit stream, prior todecompression. In this way an impression can be obtained of the effects of (simple)channel errors. Different bit error rates can be selected. Bit errors can only be injected ifthe entropy coding of the quantizer output levels has been switched off.

The original and compressed subbandsThe subbands of the image to be compressed are shown to the right of the start image. Anexample is show in Figure 7.2 (left image). The subbands are organized according to thedecomposition scheme selected. The low frequency subband is shown in the upper leftcorner, with increasing frequency bands to the right and below. In order to visualize thesubband information, the subbands are scaled to maximum intensity range. “Gray” meanszero value, and negative and positive values are darker and brighter, respectively, than thezero value. The actual variance of the subbands can be found in the text window.

After quantization and VLC encoding, the subbands are transmitted and decoded by thereceiver. The quantized subbands are shown immediately below the original subbands(see Figure 7.2 right image). Subbands that have received zero bits in the bit allocation areentirely gray. The actual bit allocation results can be found in the text window. Again,subbands are scaled for maximum visibility. The effect of channel errors is also visible inthe individual subbands.

Figure 7.2 Original and compressed subbands. Subbands are scaled for maximum visibility.

Exercises(a) The decomposition of the images is done using tree-structure 2-channel QMF filter banks. These

banks do not yield prefect reconstruction. Check the quality of the reconstructed image (forinstance for the “Lena256B” image) for different decompositions and QMF filters (i.e. number offilter coefficients).

Page 23: vcdemomanualquestions3

VcDemo® Manual and Excercises -23-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

(b) Select the zoneplate “Zone256B” image, and carry out different subband decompositions (noquantization). Start with a 4-subband decomposition, and use different QMF filters. Explain thecontent of the subbands that you get.

(c) Load the “Lena256B” image. Select the 16-coefficient QMF filter, a 28-subband decomposition,and select PCM compression for all subbands. Pick a reasonable value for the c-parameter (howcan you check the correctness of your choice?). Draw two SNR-bitrate curves, one with and onewithout entropy coding. How much additional compression or SNR does entropy-coding give?

(d) Compare the SNR numbers obtained in exercise (c) with the SNR numbers from exercise (a).What can you conclude about the relative importance of the “perfect reconstruction” requirement?

(e) Repeat (c) for incorrect values of the c-parameter. Look closely at the difference between theselected bit rate, the bit rate predicted after bit allocation, and the actually obtained bit rate.Explain difference observed.

(f) Repeat (c) – using “correct” settings for the c-parameter – for two other cases, namely� DPCM compression of all subbands,� DPCM compression of subband 1, and PCM compression of all other subbands.

What performance gain does the additional complexity of DPCM compression bring within asubband compression system?

(g) Repeat (c) for the “Noise256B” image. Look at the subband variances and the results of the bitallocation. Compare your results with PCM and DPCM compression of this noise image.

(h) Load the “Lena256B” image, and compute SNR-bitrate curves using different subbanddecompositions and different QMF filters, keeping the compression of the subbands fixed (forinstance using DPCM for subband 1 and PCM for the others, using a fixed prediction model, andusing a fixed c-parameter).

(i) Examine the consequences of bit errors on SBC compression at different error probabilities andfor different subband decompositions and (DPCM) prediction models. Explain the structure of thedegradations observed especially the locally occurring oscillation patterns. Compare the effects ofbit errors in the subbands with those in the reconstructed image.

Page 24: vcdemomanualquestions3

VcDemo® Manual and Excercises -24-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

8 Discrete Cosine Transform(DCT)

Compression OptionsThe compression options are shown in two parts in Figure 8.1. There are five tabs with the followingparameters:

Figure 8.1 Compression options for DCT compression.

Size: The size of the DCT transform blocks can be selected (2x2, 4x4, 8x8, 16x16). Thequantization of the DCT coefficients can be switched off to measure the quality offorward and reverse DCT transform.

Coefs: The quantization of the DCT coefficients can be selected. DCT coefficient 1 (mean valuein the DCT block) can be compressed independently of the higher frequency DCTcoefficients. For the DCT coefficients a choice can be made between PCM and DPCM.The quantizers considered for each DCT coefficient depend on the pdf selected here. Thec-value specifies the shape parameter of a generalized Gaussian pdf. If DPCM is chosen,the DPCM prediction model can be selected. The DCT coefficients with the same index(originating from different DCT blocks) are all quantized with the same quantizer. Thusthe DCT compression implemented here has a global quantization behavior. In contrast tothis, JPEG is a DCT compression scheme with local quantization behavior.

Bitrate: Different bit rates can be selected here. The limited number of choices here has been hardcoded, but is not essential to DCT compression as the bit allocation procedure(implemented here as the greedy or convex hull algorithm) can achieve any desired

Page 25: vcdemomanualquestions3

VcDemo® Manual and Excercises -25-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

(overall) bit rate. Normally entropy coding (Huffman) is applied to the quantizer outputs,but in order to accommodate experiments with channel bit errors, the entropy coding hasto be switched off.

Errors: This tab allows the injection of random bit errors into the compressed bit stream, prior todecompression. In this way an impression can be obtained of the effects of (simple)channel errors. Different bit error rates can be selected. Bit errors can only be injected ifthe entropy coding of the quantizer output levels has been switched off.

Display: The DCT coefficients obtained after transformation and after quantization can bedisplayed in two ways, namely

� as collections of DCT coefficients. In this case the DCT coefficients with thesame index from all DCT blocks are collected in a single set. These sets are muchlike subbands in subband compression, and give an easy to understand impressionof the bit allocation result,

� as DCT Blocks in the correct spatial location. In this case the DCT coefficients areshown as NxN blocks that sit in the spatial position corresponding to the NxNpixels the DCT coefficients are calculated from. For visibility purposes the DCTcoefficients are scaled. This display mode gives an easy to understand impressionof the local frequency content of an image, and – after the bit allocation – animpression of which part of an image are easy or hard to compress.

The original and compressed DCT coefficientsThe DCT coefficients of the image to be compressed are shown to the right of the startimage. The DCT coefficients are organized according to the Display mode selected. Inorder to visualize the subband information, the subbands are scaled to maximum intensityrange. “Gray” means zero value, and negative and positive values are darker and brighter,respectively, than the zero value. The actual variance of the subbands can be found in thetext window.

After quantization and VLC encoding, the DCT coefficients are transmitted and decodedby the receiver. The quantized DCT coefficients are shown immediately below theoriginal DCT coefficients (see Figure 8.2 left image). DCT coefficients that have receivedzero bits in the bit allocation are entirely gray. The actual bit allocation results can befound in the text window. Again, DCT coefficients are scaled for maximum visibility. Theeffect of channel errors is also visible in the individual DCT coefficients. The quantizedDCT coefficients are organized depending on the Display mode selected. The right imagein Figure 8.2 shows the quantized DCT coefficients if the option DCT Blocks is selected.

Page 26: vcdemomanualquestions3

VcDemo® Manual and Excercises -26-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Figure 8.2 Compressed DCT coefficient (scaled for maximum visibility) for the two different Displaymodes (left: Collections, right: DCT Blocks).

Exercises(a) The images to be compressed are transformed into DCT coefficients. The DCT is an orthogonal

transform. Check the quality of the reconstructed image (for instance for the “Lena256B” image)for different DCT sizes. How well does the theory correspond with the practical implementation?

(b) Load the “Lena256B” image. Select the 8x8 DCT transform, and select PCM compression for allDCT coefficients. Pick a reasonable value for the c-parameter (how can you check the correctnessof your choice?). Draw two SNR-bitrate curves, one with and one without entropy coding. Howmuch additional compression or SNR does entropy-coding give?

(c) Repeat (c) for incorrect values of the c-parameter. Look closely at the difference between theselected bit rate, the bit rate predicted after bit allocation, and the actually obtained bit rate.Explain difference observed.

(d) Repeat (c) – using “correct” settings for the c-parameter – for two other cases, namely� DPCM compression of all DCT coefficients,� DPCM compression of DCT coefficient 1, and PCM compression of all other DCT

coefficients.What performance gain does the additional complexity of DPCM compression bring within aDCT compression system?

(e) Repeat (c) for the “Noise256B” image. Look at the variances of the DCT coefficients and theresults of the bit allocation. Compare your results with PCM and DPCM compression of this noiseimage.

(f) Load the “Lena256B” image, and compute SNR-bitrate curves using different DCT block sizes,keeping the compression of the DCT coefficients fixed (for instance using DPCM for DCTcoefficient 1 and PCM for the others, using a fixed prediction model, and using a fixed c-parameter).

(g) Examine the consequences of bit errors on DCT compression at different error probabilities andfor different DCT block sizes and (DPCM) prediction models. Explain the structure of thedegradations observed. Compare the effects of bit errors in the DCT coefficients with those in thereconstructed image.

Page 27: vcdemomanualquestions3

VcDemo® Manual and Excercises -27-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

9 Baseline JPEGCompression Options

The compression options are shown in two parts in Figure 9.1. There are six tabs with the followingparameters:

Figure 9.1 Compression options for JPEG compression.

Bitrate: The JPEG quality factor Q can be set here. By selecting the quality factor, the bitrate isimplicitly defined, but the bitrate is not known beforehand. Instead of setting the qualityfactor, a bitrate can be selected. The quality setting corresponding with this bitrate will bedetermined by a simple search procedure. The resulting quality factor is shown in the textwindow.

Quant: The quantization of the DCT coefficients is carried out using the standard JPEGquantizers, which can – however – be influenced by selecting a particular quantization ornormalization matrix. The normalization matrix to be used can be selected here. Theoptions are

� the standard JPEG normalization matrix for luminance information,� the standard JPEG normalization matrix for chrominance information,� a normalization matrix filled with equal weights for all DCT coefficients

(weight=50).� a “high-emphasis” normalization matrix, that puts more emphasis on high

frequency DCT coefficients rather than on low-frequency DCT coefficients. Thismatrix has been added for educational purposes only, and will never be used inpractice.

Page 28: vcdemomanualquestions3

VcDemo® Manual and Excercises -28-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Notice that in JPEG both quality factor and SNR vary as one changes the selection of thenormalization matrix.

The user quality factor Q and normalization matrix N(u,v) are used as follows to quantizethe DCT coefficients F(u,v):

F u v NINTF u v

Q N u v* ( , )

( , )

' ( , )�

��

��

where the minimal value of Q' N(u,v) is 1.0. Q' is related to the user quality factor Q asshown in Figure 9.2.

0 10 20 30 40 50 60 70 80 90 1000

1

2

3

4

5

6

Q'

Q

Figure 9.2 Relation between the user quality factor Q and the internally used factor Q’.

Huffman: Different entropy coding tables can be selected here, namely no entropy coding (i.e. fixedlength coding) for the DCT coefficients, standard VLC, or a VLC table optimized for theimage under consideration.

Smooth: If checked, the decompressed image is smoothed somewhat to suppress blocking artifacts.Markers: In the JPEG module, errors can be injected into the actual JPEG stream. This means that

VLC codes, header information, and other crucial information may get corrupted. Toblock the effect of progressively worsening VLC decoding errors, unique markers may beinserted into the JPEG bit stream. The periodicity of these markers can be selected here.Notice that as more frequently markers are inserted, the overall bit rate will go up, or – ifa fixed bit rate has been set – the SNR will go down.

Errors: This tab allows the injection of random bit errors into the compressed bit stream, prior todecompression. In this way an impression can be obtained of the effects of (simple)channel errors. Different bit error rates can be selected. Bit errors potentially corruptcrucial information such as header information (image size, VLC tables). In case theerrors are too severe, decoding is interrupted.Without any precautions many of the bit errors may cause the application to crash.Though the authors believe that they have identified most of these cases, exceptionalcases may still cause page faults, access violations or worse. If such case occurs,please report this situation – with proper documentation – to the authors.

Compressed bitstream:

Page 29: vcdemomanualquestions3

VcDemo® Manual and Excercises -29-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

The compressed image is written to disk in a file name “JpegCodedImageBitStream.jpg”.This file contains the JPEG compressed data, embedded in the JFIF header protocol. Thefile can be viewed with any standard image display or processing application.

Viewing of DCT coefficients:The JPEG module itself does not show the (quantized) DCT coefficients. To view these,use the option “Set as start image” on the JPEG decompressed image. Then run the DCTcompression module on this image, using 8x8 DCT blocks and switching off the encodingof the DCT coefficients. The DCT coefficients computed from the JPEG compressedimage will now be displayed. Both display modes (Collections and DCT Blocks) areinstructive to study. Figure 9.3 shows an example of the resulting image.

Figure 9.3 JPEG compressed DCT coefficient (scaled for maximum visibility).

Exercises(a) Load the “Lena256B” image. Use the standard luminance normalization matrix. Draw three SNR-

bitrate curves, one for each entropy coding choice. At the same time, plot three Quality-SNRcurves. How much additional SNR does entropy-coding give? What are your observations aboutthe Quality-SNR curves?

(b) Repeat (a) for the “flat normalization matrix”. Observe that the SNRs obtained in this case arehigher! Give a justification for using the “standard luminance normalization” instead of the “flatnormalization matrix”.

(c) Compare the quantization of the DCT coefficients in JPEG with the one resulting from the DCTcompression module, for instance for the “Lena256B” image at a bit rate of 1.0 bit per pixel.Follow the instructions given under “Viewing of DCT coefficients”. Can you find and explain thedifferences?

(d) Examine the effect of smoothing the JPEG decompressed image.(e) Determine the decrease in SNR when inserting markers into the JPEG bit stream.(f) Examine the consequences of bit errors on JPEG compression at different error probabilities.

Page 30: vcdemomanualquestions3

VcDemo® Manual and Excercises -30-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

10 Set Partioning In HierarchicalTrees (SPIHT)

Compression OptionsThe compression options are shown in two parts in Figure 10.1. There are three tabs with thefollowing parameters:

Encoder bitrate A bitrate can be selected here to which to current image will be encoded. Thecoded file will be at the selected bitrate

Smoothing Seven levels of smoothing can be selecting. This will adjust the filtercoefficients to get a smoothing effect ( 0 is no smoothing, 7 is smoothing level7.)

Decoder bitrate The coded image will be decoded until the selected decoder bitrate is reached,the rest of the coded information will not be used to restore the image. Thedecoder bit rate MUST be smaller than the encoded bit rate. Note that thedecoder bit rates is not automatically adjusted for changing encoder bit rate.

The number of splits (see “Remark”) is determined by the SPIHT algorithm itself and is not aselectable parameter here. If you want ot play with this parameter, use the SBC or EZW moduleinstead as the overall effects are more or less the same as in SPIHT.

The SPIHT algorithm used here is based on commercial implementation of PrimaComp Inc. Troy,NY, USA. Many of the reported compression factors in VcDemo are calculated on the basis of VLCassignments, but without actually creating the compressed file. The SPIHT encoder does create a

Figure 10.1 Compression options for SPIHT compression

Page 31: vcdemomanualquestions3

VcDemo® Manual and Excercises -31-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

temporary file on disk and the decoder read this file. For that reason the reported compression factorsand bit rate are exact.

Figure 10.2 The original and compressed subbands

The Windows containing the original image subbands and the compressed image subbands (argggh,this term will upset some of the wavelet people; of course we mean wavelet coefficients here, but hey,aren’t they the same as subbands?) are similar as in the SBC module.

Exercises

(a) Load the “Lena256B” image. Draw the SNR-bitrate curve, use a smoothing setting of zero.(b) Load Camman256.bmp and perform SPIHT coding. Explain the octave structure of the Subbands.(c) Why does changing the encoder bitrate seem not to have a result on de decoded picture.(d) Load Noise256.bmp and explain the effect of “smoothing” on a picture.(e) For which purpose is the decoder bitrate setting very useful.(f) Perform SPIHT on Lena256B.bmp with the following bitrate settings: 0.10, 0.25, 0.50 an 1.00

bpp.

RemarkThe term “number of levels” is very confusing. If an image is not decomposed, what is the number oflevels in this case? Zero or one? In the VcDemo we talk about “the number of resolution levels”. Nosplitting means one resolution level, one split means two resolution levels, etc. This terminology isused in both EZW and SPIHT.

Page 32: vcdemomanualquestions3

VcDemo® Manual and Excercises -32-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

11 Embedded Zerotree Wavelet(EZW)

Compression Options

Encoder Bitrate The encoder bitrate can be selected.Levels Select the number of levels up to which the spectrum of image will be splitDecoder Bitrate The coded image will be decoded until the selected decoder bitrate is reached,

the rest of the coded information will not be used to restore the image.

The original and compressed subbandsThe Windows containing the original image subbands and the compressedimage subbands are similar as in the SPIHT module.

Exercises

(a) Load Clown256.bmp and explain the effect of choosing a very low level (1 or 2) and a bitrate of0.2. or lower.

(b) Load the “Lena256B” image. Draw the SNR-bitrate curve, use the maximal number of levels.(c) Compare SPIHT, EZW and JPEG for bitrates of 0.10, 0.25, 0.50 and 1.00. describe the visual and

numerical (SNR) differences. Use your SNR-bitrate curves, Specify what is the best coder foreach bitrate range (for “Lena256B”, use no smoothing for SPIHT and the maximal number oflevels for EZW, use the default settings for JPEG)

Figure 11.1 Options for EZW module (actual interface differs slightly)

Page 33: vcdemomanualquestions3

VcDemo® Manual and Excercises -33-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

(d) What is “continously scalable” and is EZW continuously scalable. Are SPIHT and JPEGcontinuously scalable?

RemarkThe term “number of levels” is very confusing. If an image is not decomposed, what is the number oflevels in this case? Zero or one? In the VcDemo we talk about “the number of resolution levels”. Nosplitting means one resolution level, one split means two resolution levels, etc. This terminology isused in both EZW and SPIHT.

Page 34: vcdemomanualquestions3

VcDemo® Manual and Excercises -34-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

12 Motion EstimationCompression Options

The compression options are shown in two parts in Figure 12.1. There are five tabs with the followingparameters:

Figure 12.1. Options for Motion Estimation module (the actual interface looks slightly different).

Hierarchy Standard block mathcing or hierarchical block matching (two or three levels) can beselected. Depending on the choice at this tab, some of the other tabs change content as forinstance the block size is a function of the resolution level.

Search: Different types of motion estimation search strategies can be used, namely: (i) full search,(ii) one-at-a-time search, (iii) N-step search. Note that full search can become extremelycomputationally demanding.

Size: The size of the (square) blocks for which the motion vector is estimated, can be selected.In hierarchical estimation, the block size is the size at the higher resolution level. Thesizes at the other resolution levels are determined by the program (look at the outputwindow).

Displ’mnt: The maximum displacement can be selected for “Full search” and “One-at-a-time”. Thelarger the maximum displacement, the more computationally demanding the motion

Page 35: vcdemomanualquestions3

VcDemo® Manual and Excercises -35-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

estimator becomes. For hierarchical estimation, the maximum displacements at thedifferent resolution levels (low to high resolution) are given.

N-Step: For the motion estimation search type “N-Step” the number of levels (steps) can beselected. The larger the number of steps, the larger the maximum tolerable displacementis. For hierarchical estimation the number of steps per resolution level is given.

Video: Different video display options can be selected. The estimated motion fields are internallysaved, so that the option “Display Again” re-uses already estimated motion fields.Changing the motion estimation options of course requires re-estimation of the motionfield.

Images shown:Motion is estimated on all frames of the image sequences. The estimation process cannotbe interrupted. Four windows are shown (see Figure 10.2), namely:� the original image sequence (upper left),� the difference between two consecutive frames (upper right),� the motion compensated prediction of the current image, with the motion field

overlayed (bottom left). The begin point of the motion vectors is indicated with ablack dot,

� the motion compensated frame difference (bottom right).

The text window shows for each frame the variance of the original frame, the variance ofthe frame difference, the variance of the motion compensated frame difference, and anestimate of the differential entropy of the estimated motion field (in bit/vector). For thelatter a simple 1-D lossless DPCM is carried out on the motion vector field, and thehistogram of the DPCM difference is calculated. From the histogram the entropy of thevector field is estimated (total of horizontal and vertical component).

Page 36: vcdemomanualquestions3

VcDemo® Manual and Excercises -36-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Figure 12.2 Displayed windows in Motion Estimation demo.

Exercises

Load the car sequence, and use “Full search” motion estimation with 8x8 blocks and a maximumdisplacement of 8 pixels.

(a) Discuss why it is advantageous to encode a difference image instead of the original image. Baseyou motivation on the prediction gain.

(b) How large is the gain in the prediction gain obtained when – besides taking frame differences –motion compensation is used?

(c) How large is the overhead of the motion field encoding when a bit rate of 0.5 bit per pixel is usedfor the compression of the motion compensated frame difference.

(d) Does motion compensation pay off in all parts of the frame? If not, in which parts does motioncompensation not seem to work? How is this handled in real compression systems?

(e) Compare the efficiency of motion compensation for varying displacements. What is the optimaltrade-off between complexity and compression gain in this case?

(f) Compare the efficiency of motion compensation for varying block sizes. Discuss the effectsobserved.

(g) Compare the differences in efficiency for the three different motion estimation search strategies,and discuss the trade off between complexity and prediction gain.

(h) Compare the differences in efficiency for no hierarchical blockmatching with 2 and 3-levelhierarchical blockmatching.

Page 37: vcdemomanualquestions3

VcDemo® Manual and Excercises -37-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

RemarkThis module has a lot of parameters if hierarchical motion estimation is taken into account. It is noteasy to find the optimal setting for all cases. Beginners should stick with the non-hierarchical version.The fact that the parameter choice is not easy of course also says something about real applications ofmotion estimation (if you agree with me …. what does it say?)

Page 38: vcdemomanualquestions3

VcDemo® Manual and Excercises -38-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

13 MPEG-1/2 DecoderCompression Options

The MPEG decoder is fully MPEG-1 and MPEG-2 compliant. Test streams obtained from the Internetcan be decoded by the module, including MPEG video and program streams. The MPEGdecompression options are shown in Figure 13.1. There are four tabs with the following parameters:

Figure 13.1 Decompression options for the MPEG decoder module (actual interface differs slightly)

Operation: The decoder operation and output can be chosen, namely: (i) normal output, showing thedecoder frames as would be shown by any decoder; (ii) the predicted frames withoutadding the quantized prediction difference that are sent from encoder to decoder in thebitstream. In this way an impression can be obtained of the prediction quality. Notice thatprediction differences will accumulate in this case; (iii) the coded prediction differences,i.e. the information sent from encoder to decoder in the bitstream. In this way theprediction quality and the number of intra-coded macroblocks can easily be evaluated.

Display: Three display options can be switched on/off, namely: (i) color or black-and-whitedisplay, (ii) show the motion vectors in overlay with the frame information; (iii) skip theB-frames (for slow machines, gives better motion rendering).

Video: Different video display options can be selected. The frame-by-frame option allows forevaluating individually decoded frames.

Page 39: vcdemomanualquestions3

VcDemo® Manual and Excercises -39-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Save As: The decoded MPEG stream can also be saved as a raw YUV file (plus header). Be awareof the data expansion that you will get, and have enough disk space available if youdecoded a long MPEG stream!

Image shown:One windows is shown containing the decoder frame, the predicted frame, or theprediction difference depending on the selected Operation choice (see Figure 11.2). Thetext pane displays unformation decoded from the Mpeg stream about video format andcoding parameters. During the decoding also the GOP structure is displayed. Notice thatthe decoding order and display order may be different in Mpeg due to the B-frames.

The overlayed motion field show the begin point of the motion vectors as a red dot.Macroblocks that have only a dot have a motion vector of zero, or have been intra-coded.By looking at the coded frame differences these two cases can be differentiated. Themotion vectors shown are always backward motion vectors (as in P frames) unless amacroblock in a B frame has been predicted in a forward manner. In this case the forwardmotion vector is used. Some macroblocks use multiple motion vectors (B-macroblocksand in various MPEG-2 coding modes). Notice that still only one vector is shown.

Figure 13.2 Displayed windows in MPEG decompression demo.

Exercises

Page 40: vcdemomanualquestions3

VcDemo® Manual and Excercises -40-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Load the son Mpeg stream. Decode the stream in a frame-by-frame mode. Decoding can be terminatedby clicking “No”. Clicking “Cancel” will cause the automatic decoding of the rest of the Mpeg stream.

(a) A few frames have to be decoded before the display starts. Why?(b) Explain the meaning of the parameters shown in the tex window. Study in particular the GOP

structure.(c) Set the Operation option such that coded differences are shown. Explain the content that the video

window shows. It may be helpful to overlay the motion field with the coded differences.(d) Near frame 28 a scene change occurs. Check what happens to the decoded frames, the predicted

frames, the coded differences, and the motion field.(e) Find parts in the sequences where the motion compensated prediction fails (look at the frame

differences or at the predicted frames), and explain why the motion compensation failed in thesecases.

(f) Decode other Mpeg sequences, and find “interesting” parts of the sequences. Explain why youthink these parts are interesting …

Page 41: vcdemomanualquestions3

VcDemo® Manual and Excercises -41-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

14 MPEG-1/2 Encoder

Compression Options

When opening the MPEG encoder interface, all options have been deactivated until a name has beenselected for the compressed MPEG file. Depending on the type of MPEG compression (MPEG-1 orMPEG-2 on the first tab) certain option are activated or deactivated.

An MPEG encoder has MANY different parameters that one can set. In the VcDemo MPEG encoderthe choice have been limited severely in order to for novices to het meaningful results. Therefore, theMPEG encoder should not be considered for commercial compression. It works, sure, but you mightwant to have more control over the parameters when doing the real thing.

The interface has six tabs.

Figure 14.1 Compression options for the MPEG encoder module

File: Set the Mpeg file name, and select the format (MPEG-1 or MPEG-2).Rate: Set bit rate of the compressed file, in Mbit/second.GOP: Different group of picture structures can be selected here. Although GOPs come in a

wide variety of choices, here only several standard (and non adaptive) ones have beenprovided for the sake of simplicity.

Page 42: vcdemomanualquestions3

VcDemo® Manual and Excercises -42-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Motion: The maximum displacement is selected here. The encoder uses hierarchical motionestimation for frame-based compression, and full search for field-based compression.

Format: For MPEG-2 compressed format, the format of the video sequence can be set here. Ifinterlaced format is selected, the position of the first field must be indicated.

Field/Frame: In case the input video sequence has an interlaced format, the full power of MPEG-2can be used. There are several ways the encoder can be forced to certain option. Here,the encoder can be forced to always use frame-based motion estimation and DCT incase frame coding is used. In case field coding is used, these options are irrelevant.The field coding seems, however, to have some bugs in determining the motion vectorrange, so the use of this option is discouraged (the performance is bad anyway).

Output images and data:The encoder display the image just encoded. Since the encoder does not always process the framestime sequentially, you will see funny motion effects (movie going back and forth).

The most useful information can be found in the text window. You will find the bit rate per frame, theway the frame has been encoded (I/P/B), and the way the individual macroblocks have beencompressed (Figure 14.2)

Figure 14.2 Compression output

This is the legend for the first character:S: skippedI: intra coded0: forward perdicted without motion compensationF: forward frame/16x8 prediction (in frame/field pictures resp.)f: forward field predictionp: dual prime predictionB: backward frame/16x8 predictionb: backward field predictionD: fram,e/16x8 interpolationd: field interpolation

Page 43: vcdemomanualquestions3

VcDemo® Manual and Excercises -43-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Second character:space: coded, no quantizer changeQ: coded, quantizer change

Compatibility issues:The MPEG encoder produces an MPEG video stream. This video stream is a perfectly MPEGcomplient stream, but in case you try to play it with the Windows Media playes it will not work.Why? Simply because there is no system information, most notably timing information, which themedia player requires. It is easily verified that with other MPEG decoders one can view the MPEGfiles produced by VcDemo. Alternatively, you can use the VcDemo MPEG 1/2 decoder to view theresult. By the way, the MPEG decoder will play any MPEG video stream, with or without timinginformation and with or without system stream information.

ExercisesWill follow some time in the future. For now, I think that playing with the encoder is enough fun byitself. The best thing to do is to encode a sequence, see if you can understand what is going on, andthen view the results in the MPEG decoder. Try different sequences (lot of motion, no motion) and trydifferent bit rates and GOP structures.

Page 44: vcdemomanualquestions3

VcDemo® Manual and Excercises -44-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

15 Tips and Tricks

Here is a short list of handy tips and tricks for the VcDemo.

Toolbar � Settings Allows you to change the default directory for images and user output.This might be handy is you put all you images and sequences in anotherthan the standard directory.

Left mouse button doubleclick in image

This shows an image at its true size. Large images are usually scaled tofit them in the work area.

Right mouse button inimage

Shows the context dependent menu, which gives access to informationabout the image in which you clicked. This is handy to figure out whichparameter settings were used for a particular result. You can also savean image in this way as bmp file.

Drag bottom left imagecorner

To enlarge/shrink images

Click X in start image To close the start image and all the derived results.Computation takes longtime

Use a smaller image. Your PC is probably not the fastest in the world oryour machine has less than 32 MByte RAM. Alternatively, closeVcDemo and start it again. If this helps, you found a memory leak inVcDemo. Please report this to the author.

Hit any key … To interrupt the compression, decompression, or motion estimationprocess on image sequences.

Click on a module button .. if you had already activated it before, but activated a second modulelater on. The first module still lives, but is hidden behind the secondone. If you click the button corresponding to the first one, it will pop-upand come to the foreground.

Modules deactivated Some modules require particular image sizes, usually powers of 2 oreven number of columns/rows. If a module is not acitve, than the imagethat you loaded has a funny size. Crop the image to another size.

My color image (bmp) isshown as gray value

That’s correct. The VcDemo internally converts color images to grayvalue ones. All compression techniques can work on color images, butto understand compression techniques, color issues are a little too much.That’s why VcDemo supports gray value processing only (for MPEGencoding and decoding color is supported (YUV format)).

I cannot read the text inthe parameter menus onthe right hand side of mydisplay

That’s a bug I am unable to fix (suggestions …?). You can fix theproblem by changing font size (in Display Properties� Settings) to“small”. The problem should be gone now.

Page 45: vcdemomanualquestions3

VcDemo® Manual and Excercises -45-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

SNR-bitrate Drawing Grids

Two different grid types are provided:� Range 0 - 40 dB and 1 - 4 bit per pixel. Use this one for PCM and DPCM.� Range 0 - 25 dB and 0 – 2.5 bit per pixel. Use this one for VQ, FRAC, SBC, DCT, and JPEG,

and copy some of the PCM/DPCM points into this curve.

Page 46: vcdemomanualquestions3

VcDemo® Manual and Excercises -46-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Page 47: vcdemomanualquestions3

VcDemo® Manual and Excercises -47-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Page 48: vcdemomanualquestions3

VcDemo® Manual and Excercises -48-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000

Page 49: vcdemomanualquestions3

VcDemo® Manual and Excercises -49-

Delft University of Technology. Information and Communication Theory Group© Copyright 1992-2000