Junqing Chen's Homepage

Projects

Perceptual Metrics and Perceptual Models

Introduction

In this project, we examine perceptual metrics and use them to evaluate the quality of still image coders. We show that metrics based on the mean squared error (such as PSNR) fail to predict image quality when one compares artifacts generated by different types of image coders (e.g., block-based, subband, and wavelet coders). We consider three different types of coders: JPEG, the Safranek-Johnston perceptual subband coder (PIC), and the Said-Pearlman SPIHT algorithm with perceptually weighted subband quantization, based on the Watson et al. visual thresholds. We show that incorporating perceptual weighting in the SPIHT algorithm results in a significant improvement in visual quality. The metrics we consider are based on the same image decompositions (subband, wavelet, DCT) as the corresponding compression algorithms. Such metrics are computationally efficient and considerably simpler than more elaborate metrics (e.g., by Daly, Lubin, and Teo and Heeger). However, since each of the metrics is used for the optimization of a coder, one expects each to be biased towards that coder. We use the metrics to evaluate the performance of the compression techniques over a wide range of bit rates. Our experiments indicate that the PIC metric provides the best correlation with subjective evaluations. It predicts that at very low bit rates the SPIHT algorithm and the 8x8 PIC coder perform best, while at high bit rates the 4x4 PIC coder is best. More importantly, we show that the relative algorithm performance depends on image content, with the subband and DCT coders performing best for images with substantial high-frequency content, and the wavelet coders performing best for smoother images.

Perceptual Coders

Image compression techniques make use of the properties of the human visual system (HVS). Most traditional compression techniques make implicit use of HVS characteristics to achieve high compression ratios without significantly sacrificing image quality. An explicit perceptual model allows even higher compression ratios by determining the maximum amount of distortion that can be introduced in each image component without being perceived. This maximum is usually referred to as the just noticeable distortion (JND) level. When the distortions introduced by the compression scheme are below the JND level, they are invisible to the eye, and the scheme is perceptually lossless.

Most perceptually based compression algorithms share a similar basic structure, which is shown in Fig. 1. It includes a frequency analysis stage, consisting of a bank of linear filters that decompose the image into several components (usually called subbands) with different spatial frequencies and orientations; a quantization stage, whose parameters are controlled by the perceptual model; and an entropy coding stage. The entropy coding stage is lossless, and the frequency analysis stage is either lossless or introduces very little distortion. Thus, any significant information loss occurs at the quantization stage. The role of the perceptual model is to control the amount of distortion introduced at the quantization stage, so that it is either invisible (perceptually lossless) or minimally visible (perceptually lossy). Typically, the perceptual model accounts for variations in HVS sensitivity to different frequencies and brightness levels, as well as nonlinear mechanisms that account for masking. We will look at perceptual models in more detail in the next section.

Figure 1. Perceptual Coder
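The stages described above can be sketched in a few lines. This is only an illustration under stated assumptions, not any of the coders studied in this project: it assumes an 8x8 block DCT as the frequency analysis, uses a hypothetical table `jnd_thresholds` with made-up values (not the Watson or Safranek-Johnston models), and omits the entropy coding stage. A uniform quantizer with step size 2t keeps the error of a coefficient within its threshold t.

```python
import numpy as np

N = 8  # block size (assumption: 8x8 blocks, as in JPEG)

def dct_matrix(n=N):
    """Orthonormal DCT-II matrix C, so that C @ x is the 1-D DCT of x."""
    k = np.arange(n).reshape(-1, 1)  # frequency index
    i = np.arange(n).reshape(1, -1)  # sample index
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

C = dct_matrix()

# Hypothetical visibility thresholds (illustrative values only):
# larger thresholds at higher frequencies allow coarser quantization there.
u, v = np.meshgrid(np.arange(N), np.arange(N), indexing="ij")
jnd_thresholds = 1.0 + 0.5 * (u + v)

def encode_block(block, thresholds=jnd_thresholds):
    """Frequency analysis followed by threshold-controlled quantization."""
    coeffs = C @ block @ C.T          # 2-D DCT of the block
    step = 2.0 * thresholds           # step 2t gives |error| <= t
    return np.round(coeffs / step).astype(int)

def decode_block(indices, thresholds=jnd_thresholds):
    """Dequantize and invert the DCT."""
    coeffs = indices * (2.0 * thresholds)
    return C.T @ coeffs @ C

def max_error_ratio(block, thresholds=jnd_thresholds):
    """Largest |coefficient error| / threshold; a value <= 1 means every
    coefficient error stays within its threshold."""
    coeffs = C @ block @ C.T
    rec_coeffs = encode_block(block, thresholds) * (2.0 * thresholds)
    return float(np.max(np.abs(coeffs - rec_coeffs) / thresholds))
```

In this sketch the only loss occurs in `encode_block`, at the rounding step, which mirrors the observation that any significant information loss occurs at the quantization stage.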

Perceptual Metrics

The goal of a perceptual metric for image quality is to determine the differences between two images that are visible to the human visual system. Usually one of the images is the reference, which is considered to be "original," "perfect," or "uncorrupted." The second image has been modified or distorted in some sense, e.g., by the compression algorithm. It is very difficult to evaluate the quality of an image without a reference. Thus, a more appropriate term would be image "fidelity" or "integrity," or alternatively, image "distortion." However, for historical reasons we adopt the term image quality.

As we saw in the previous section, perceptual coders use models of the HVS to determine the visibility of the distortions at the quantization stage. Such models can also be used to evaluate the performance of any compression scheme. The basic structure of a perceptual metric is shown in Fig. 2. The original and decoded images undergo the same frequency analysis (DCT, subband, wavelet, etc.). The perceptual model is used to weight the different components of the error signal according to the HVS sensitivity in order to produce an output image that represents perceptual error. If desired, error pooling can be performed to produce a single number that represents the perceived distortion.


Figure 2. Perceptual Metric
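The structure in Fig. 2 can be illustrated with a short sketch. It is not the PIC, wavelet, or Watson metric: the 8x8 block DCT, the `thresholds` array supplied by the caller, and the Minkowski pooling exponent `beta=4` are all assumptions chosen for illustration, and masking is ignored.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix C, so that C @ x is the 1-D DCT of x."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def blockwise_dct(image, n=8):
    """n x n block DCT of an image whose sides are multiples of n.
    Returns an array of shape (rows/n, cols/n, n, n)."""
    h, w = image.shape
    c = dct_matrix(n)
    blocks = image.reshape(h // n, n, w // n, n).transpose(0, 2, 1, 3)
    return c @ blocks @ c.T  # matmul broadcasts over the block indices

def perceptual_error_map(original, decoded, thresholds, n=8):
    """Same frequency analysis on both images, then each error component
    is expressed in units of its (assumed) visibility threshold."""
    err = (blockwise_dct(original.astype(float), n)
           - blockwise_dct(decoded.astype(float), n))
    return err / thresholds

def pool(error_map, beta=4.0):
    """Minkowski pooling of the error map into a single number."""
    return float(np.mean(np.abs(error_map) ** beta) ** (1.0 / beta))
```

With `thresholds` set to all ones and `beta=2`, the pooled value reduces to the root of the ordinary MSE, because the block DCT used here is orthonormal. A non-uniform `thresholds` table is what makes the result differ from an MSE-based measure.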

A number of perceptual metrics have been proposed in the literature. Some of these metrics are quite general and are intended for general applicability, including image compression, image halftoning, display evaluation, etc. Such metrics are based on models of the low level processing of the HVS, such as the optics, the retina, the lateral geniculate nucleus, and the striate cortex. They are quite elaborate, computationally intensive, and require careful calibration and parameter selection.

In this project, we consider metrics that are designed specifically for image compression. Such metrics are simpler and computationally more efficient. For example, they may apply only to specific viewing conditions and display devices, and may be intended for specific classes of coders (e.g., block-based or wavelet coders).

Results

Where MSE fails

First, we examine the performance of the MSE metric. We show that it cannot be used to evaluate the quality of image coders that include perceptual weighting. Figs. 3(a) and (b) show an image (BANK) coded at 0.52 bits/pixel using the SPIHT coder and the 4x4 PIC coder, respectively. The image resolution is 512x512 pixels. The PSNR (based on the MSE) is 33.3 dB for the SPIHT image and 29.4 dB for the PIC image. This is not surprising, as the SPIHT algorithm is designed to minimize the MSE and has no perceptual weighting. The PIC coder, on the other hand, uses perceptual weighting. The viewing distance is six image heights, or 21 inches, which at the given resolution results in roughly the same visual resolution (in pixels/degree) as standard-definition TV as well as high-definition TV. At a close viewing distance (and assuming that it survives the reproduction procedure), the reader may see ringing near the edge of the PIC image. On the other hand, the SPIHT image has considerable blurring, especially on the wall near the left edge of the image. However, if one holds the image at the intended viewing distance (approximately at arm's length), the ringing disappears, and all that remains visible is the blurring of the SPIHT image.
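For reference, the two quantities used in this comparison can be computed as in the sketch below. The PSNR function assumes 8-bit images with a peak value of 255, which is not stated above, and `pixels_per_degree` is a hypothetical helper that converts a viewing distance given in image heights into visual resolution at the center of the image.

```python
import math
import numpy as np

def psnr(original, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB, derived from the MSE.
    Assumes 8-bit images unless a different peak is given."""
    diff = original.astype(float) - decoded.astype(float)
    mse = float(np.mean(diff ** 2))
    if mse == 0.0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / mse)

def pixels_per_degree(image_height_px, distance_in_heights):
    """Number of pixels subtended by one degree of visual angle,
    measured at the image center."""
    distance_px = distance_in_heights * image_height_px
    return 2.0 * distance_px * math.tan(math.radians(0.5))

# Viewing geometry described in the text: 512-pixel image height,
# viewed from six image heights (21 inches).
ppd = pixels_per_degree(512, 6)   # about 53.6 pixels/degree
image_height_in = 21.0 / 6.0      # 3.5 inches
```

Because PSNR is logarithmic in the MSE, a gap of about 4 dB between two coded images corresponds to an MSE roughly 2.5 times larger for the image with the lower PSNR.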

In our SPIE 2001 paper, the order of these two figures was incorrect.

In contrast, the perceptual PSNR (P-PSNR) provided by the PIC metric is 46.8 dB for the SPIHT image and 49.5 dB for the PIC image. Observe that the distortion is considerably higher for the SPIHT image, especially near the wall on the left, where there is substantial blurring. As expected, the MSE metric fails to predict the quality of the perceptually weighted coder when it is compared to a coder that is not perceptually weighted. Fig. 4 shows the performance of the two metrics for the BANK image for the four different coders over a wide range of bit rates.

Figure 3. (a) SPIHT coder, PSNR = 33.4 dB, P-PSNR = 46.8 dB (b) 4x4 PIC coder, PSNR = 29.3 dB, P-PSNR = 49.5 dB

Figure 4. MSE Metric and PIC Metric for BANK image

Which Perceptual Metric works best?

We now compare the performance of the PIC metric to the wavelet metric and Watson's DCT-based metric. Fig. 5 shows the ROSE image coded by the perceptually weighted SPIHT (P-SPIHT) coder and the 4x4 PIC coder at 0.7 bits/pixel. According to the PIC metric, the P-PSNR is higher for the PIC coded image (45.58 dB) than for the P-SPIHT coded image (42.15 dB), which agrees with the subjective image quality. Observe that the line on the road is a bit blurred in the P-SPIHT image, while it is considerably sharper in the PIC image. In contrast, the value of the wavelet metric is higher for the P-SPIHT image.

(a) P-SPIHT coder, P-PSNR = 42.15 dB (b) 4x4 PIC coder, P-PSNR = 45.58 dB
Figure 5. ROSE image at 0.7 bits/pixel

Fig. 6 shows the ROSE image coded by JPEG and the perceptually weighted SPIHT coder at 0.13 bits/pixel. Though they both have pronounced artifacts, most of our readers will agree that the P-SPIHT image looks less objectionable. In this case, both metrics predict the subjective observation.

(a) JPEG coder, P-PSNR = 30.77 dB (b) P-SPIHT coder, P-PSNR = 32.99 dB
Figure 6. ROSE image at 0.13 bits/pixel

Fig. 7 shows the results of the PIC, wavelet, and Watson metrics over a wide range of bit rates, from very low (around 0.1 bits/pixel) to perceptually transparent (as determined by the 4x4 PIC coder). As the figure shows, the PIC metric predicts that at lower rates the P-SPIHT coder has better visual quality, while at higher rates the 4x4 PIC coder dominates. This result correlates well with informal subjective tests. While one can certainly argue that the PIC metric is biased towards the PIC coder, it successfully predicts the general trends in relative coder performance. The wavelet metric, on the other hand, is biased towards the SPIHT coder, as both use the same wavelet decomposition. Similarly, the Watson metric is biased towards JPEG. The same conclusions were obtained over a number of different images. Thus, the PIC metric provides the best correlation with subjective evaluations. However, while it predicts the trends in relative performance, it does not predict the exact position of the cross-over points, nor does it predict absolute performance.

Figure 7. Perceptual Metrics for ROSE image (panels: PIC Metric, Wavelet Metric, Watson Metric (DCT))

Contact | ©1999 Junqing Chen | Last updated