Image coding with subband decomposition (Alt band ayrıştırmasıyla görüntü kodlama)

No title translation is provided in the record.

Thesis No: 75588
Author: BURÇİN AÇAN
Advisors: ASSOC. PROF. DR. MELİH PAZARCI
Thesis Type: Master's
Subjects: Electrical and Electronics Engineering
Keywords: Not specified.
Year: 1998
Language: Turkish
University: İstanbul Teknik Üniversitesi
Institute: Fen Bilimleri Enstitüsü
Department: Department of Electronics and Communication Engineering
Division: Not specified.
Page Count: 124

Abstract

Today, when very large amounts of digital information are processed, reducing the amount of processed information as much as possible is one of the biggest problems in this field. Digital images are an important part of the world's digital information and communication, and because image data are so voluminous, storing them in less space than their raw form occupies is of particular importance. Image coding systems yield considerable savings in memory and processing time in storage, transmission and computation, the three main stages of signal processing, and their performance improves day by day along with the need. Advances in technology are also reflected in this field, and new coding systems appear every day. Subband coding, a competitor to the DCT-based image coding methods such as JPEG and MPEG that are the most widely used in the world today, is a new coding system that is open to research. This study focuses on systems that code images by means of subband decomposition and vector quantization. The second chapter gives basic information on digital signal processing and image coding. The third chapter covers the theoretical foundations of subband decomposition and the problems encountered in practical implementation. The fourth chapter presents the results obtained from image coding applications built by quantizing the subband components with the different vector quantizers that were developed. In this study, images of different quality were obtained at different coding levels. Taken as a whole, the applications can be said to have achieved quite successful results. Especially considering that the coding was carried out with the system's computation time and memory requirements in mind, it can be concluded that subband coding systems have a bright future.

Abstract (Translation)

SUBBAND CODING

After the invention of the transistor, digital technology developed very rapidly. As a result, we now have very powerful computers, which let us handle larger amounts of data in a shorter time than was possible several years ago. On the other hand, as the power of computers grows, the amount of data to be processed grows too. Especially with the advent of the information superhighway, the Internet, the amount of digital information generated and transmitted in everyday applications has grown explosively. Examples of applications that deal with very large amounts of data include some of the proposed standards for high definition television (HDTV) systems, digital video conference systems, and the very large databases created by NASA's satellite missions and other projects. To make these and similar applications feasible, it is important to apply some form of data compression to the data. Many data compression techniques have been proposed for this purpose.

Images are among the most important parts of many applications, and they contain more data than the other components. Transmitting images through transmission channels therefore takes a large amount of time, and storing them takes a large amount of hard disk space. Because of this need to reduce the size of digital images, many researchers work on different image coding schemes. These efforts have produced many image coding techniques and standards. For example, GIF (Graphics Interchange Format) and JPEG (Joint Photographic Experts Group) are the most widely used compressed formats for still images. While GIF uses a lossless coding technique, JPEG is a lossy coding standard based on the DCT (Discrete Cosine Transform). Its flexibility and high compression ratios made JPEG especially popular.
The motion picture coding standards MPEG-1 (Moving Picture Experts Group) and MPEG-2 are DCT-based coding standards for motion picture sequences. One of the newer image coding techniques is subband coding. The encouraging results reported by different researchers show that subband coding is one of the most effective image coding schemes. In this thesis, different coding approaches based on subband coding and vector quantization are proposed.

A general digital image coding system consists of three basic parts: signal decomposition, quantization and lossless coding. In the signal decomposition step, the image is typically decomposed into several subimages by a linear transformation such as the DFT or DCT, which reduces the correlation between the subimages or produces a useful data structure. (This step is needed because the pixels of natural images are highly correlated.)

The second step of image coding is quantization, which makes the coding lossy. Quantization maps the large set of possible pixel values onto a smaller set of values. There are two kinds of quantization: scalar quantization and vector quantization. Scalar quantization works on individual pixel values, which are mapped onto a small set of output values: all pixel values that fall within a given interval are mapped to a single output value. In uniform scalar quantization, in which the input intervals have equal width, quantization can be done by dividing the pixel value by the desired interval width and storing only the integer part of the result. This operation loses information, because the original pixel values cannot be recovered by the reverse operation, but less space is required to store the quantized pixel values. In vector quantization, the pixels are grouped into blocks of different sizes and the pixels of a block are quantized together.
In this view, the whole image is regarded as consisting of these pixel groups (vectors). Each vector is compared with the vectors in the codebook of the vector quantizer, and the index of the nearest codebook vector is selected to represent it. The length of this index is ⌈log2 K⌉ bits, which depends on the number of codewords K in the codebook; this is smaller than the total number of bits in the pixel group that the index represents. The most important part of vector quantization is codebook design. A good codebook includes most of the possible vector combinations; it therefore contains many codewords, which increases the index length, so the compression ratio cannot be high. On the other hand, a small codebook will not contain enough vector combinations to reconstruct a good quality image. This tradeoff between quality and compression ratio should be examined very carefully.

The last part of an image coding system is the lossless entropy coding of the quantizer outputs. Here, further compression is achieved by a coding method such as Huffman, Lempel-Ziv or arithmetic coding. The general idea is to assign shorter symbols to more common codewords, while unlikely codewords are coded with longer symbols. This kind of coding reduces the average codeword length, and no information is lost. Although most image coding systems have these three parts in common, lossless coding is optional for subband coding systems that use vector quantization. The reason is that the output symbols of a vector quantizer are sometimes equally likely, in which case lossless coding brings no gain.

In this work, signal decomposition consists of filtering the original image with an analysis filter pair and maximally decimating the resulting signals to obtain 4 different subbands.
These filters are separable 2-D Conjugate Quadrature Filters (CQF) of length 8, which provide perfect reconstruction in the absence of quantization. The resulting subbands are named LL, LH, HL and HH according to whether the rows and columns received low or high frequency filtering, so they represent different frequency bands of the original image. The information content of these bands differs. Most of the important image information is in the LL band. The LH and HL bands contain the horizontal and vertical high frequency components, which are the edges in the picture. The high frequency components in both directions are contained in the HH band, which has negligible energy compared to the other bands in natural images.

An important problem in subband decomposition is convolution near the image boundaries. When an image is filtered in 2-D, the convolved sequence is longer than the original, even though it is maximally decimated. If some pixels at the subband boundaries are truncated, significant artifacts occur at the boundaries of the reconstructed image. So, for perfect reconstruction at the boundaries after the decomposition stage, more pixels must be stored than the original image contains. This is an important drawback of subband analysis, but the problem can be solved by applying circular convolution or by using a symmetric extension method. Unfortunately these methods can only be used with symmetric filters.

In the second step of the image coding system developed in this thesis, different vector quantization methods are used. All of them are intraband vector quantization methods, meaning that all components of a vector are taken from within a single subband. As mentioned before, the most important task is the generation of codebooks for vector quantization. At this point the main goal was image quality, not high compression ratios. Therefore, the codebooks were made large where needed.
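The one-level decomposition into LL, LH, HL and HH bands with circular boundary handling, as described above, can be illustrated with a minimal sketch. It is not the thesis implementation: the length-8 CQF coefficients are not given in this summary, so the 2-tap Haar pair (the simplest orthonormal CQF) is swapped in as an assumption, and the function names `analysis_matrix`, `analyze` and `synthesize` are hypothetical. The band labels assume that the first letter refers to filtering along axis 0 and the second to filtering along axis 1. Image dimensions are assumed to be even.

```python
import numpy as np

def analysis_matrix(h0, n):
    """Circular analysis operator: n/2 low-pass rows, then n/2 high-pass rows."""
    h0 = np.asarray(h0, dtype=float)
    taps = len(h0)
    # Assumed CQF relation: high-pass = time-reversed low-pass with alternating signs.
    h1 = np.array([(-1) ** k * h0[taps - 1 - k] for k in range(taps)])
    a = np.zeros((n, n))
    for m in range(n // 2):              # one output sample per 2 inputs (decimation)
        for k in range(taps):
            col = (2 * m + k) % n        # index wraps around: circular convolution
            a[m, col] += h0[k]
            a[n // 2 + m, col] += h1[k]
    return a

def analyze(image, h0):
    """Split an image into four maximally decimated subbands."""
    rows, cols = image.shape
    ar, ac = analysis_matrix(h0, rows), analysis_matrix(h0, cols)
    coeffs = ar @ image @ ac.T           # separable: filter axis 0, then axis 1
    r, c = rows // 2, cols // 2
    return {"LL": coeffs[:r, :c], "LH": coeffs[:r, c:],
            "HL": coeffs[r:, :c], "HH": coeffs[r:, c:]}

def synthesize(bands, h0):
    """Rebuild the image from the four subbands."""
    coeffs = np.block([[bands["LL"], bands["LH"]],
                       [bands["HL"], bands["HH"]]])
    rows, cols = coeffs.shape
    ar, ac = analysis_matrix(h0, rows), analysis_matrix(h0, cols)
    # For an orthonormal filter pair the synthesis operator is the transpose.
    return ar.T @ coeffs @ ac

HAAR = [2 ** -0.5, 2 ** -0.5]            # stand-in for the length-8 CQF (assumption)

image = np.arange(64, dtype=float).reshape(8, 8)
bands = analyze(image, HAAR)
restored = synthesize(bands, HAAR)
print(sum(b.size for b in bands.values()) == image.size)  # no extra boundary samples
print(np.allclose(restored, image))                        # reconstruction check
```

Because the indices wrap around, the four subbands together hold exactly as many samples as the original image, which is the point of the circular extension. The explicit matrices are used only for clarity; a practical coder would filter and decimate directly.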
The codebooks were generated with the generalized Lloyd algorithm (GLA), also known as the Linde-Buzo-Gray (LBG) algorithm. The initial codebooks for starting the LBG algorithm are generated by a modified version of the pairwise nearest neighbor (PNN) algorithm, which is the most successful initialization algorithm. After the initial codebook was generated, its vectors were trained with the LBG algorithm using two selected bands taken from two different images. The four subbands all have different characteristics, with one exception: the LH and HL subbands are statistically very similar. The correlation between pixel values and the amount of information contained are very much alike in the two bands. It was therefore decided to use the same codebook to vector quantize both of them. The only difference in applying quantization to these two bands is the direction of the vectors.

To quantize the LH and HL subbands, unconstrained full-search vector quantizers were used. This method is guaranteed to find the nearest vector in the codebook, but it has one important drawback: the computational complexity, and with it the processing time and memory requirements of the coder, increases linearly with the codebook size. Therefore this method can only be used with small codebooks. For the LH and HL components, good image quality can be achieved with small codebooks, because the information content of these bands is small. The LL band, however, must be coded with a much larger codebook, because it has a high information content. Consequently, a faster algorithm is needed. The solution found for this problem is multistage vector quantization, one of the most effective vector quantization methods in the literature with respect to memory and speed requirements. Multistage VQ divides the encoding task into several stages.
In the first stage, the image is roughly encoded and a difference image between the original and its rough representation is computed. In the second stage, this difference image is encoded using the difference codebook. This encoding stage can be repeated many times, but in this work two-stage quantizers were used. Unlike full-search VQ, in which the number of representable vector combinations is exactly the number of codewords in the codebook, multistage VQ provides a number of combinations equal to the product of the numbers of vectors in the two codebooks.

After its effects were examined, the HH band, which contains the least information, was simply discarded, because the quality of most pictures remained almost the same as that of the original image. One exception was found in a picture with a high HH band content, but there was no visible artifact in the picture after the HH band was discarded. The human visual system is also less sensitive to high frequency diagonal details, hence artifacts due to the omission of the HH band are not visible. A sketch of the applied vector quantization is shown in Figure 1.

Figure 1: Vectors used for quantizing different subbands

As can be seen in Figure 1, vectors of size 4x2 were used to quantize the LH and HL bands. The reason for this relatively large vector size is the low information content of these bands. For the same reason, the codebooks are quite small, containing between 31 and 86 vectors. The LL band, on the other hand, was quantized using 4x2 and 2x2 vectors. Coding with 4x2 vectors resulted in poor image quality because of the large, directional coding vectors, so square 2x2 vectors were found to be useful for coding the LL band. Here the codebooks were much larger than those used for the LH and HL bands, because the LL band has more pixel variation and contains much more information than the other bands.
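The two-stage quantizer described above, with a full-search stage followed by a second full search on the difference, can be sketched as follows. This is only an illustration under stated assumptions: the codebooks `cb1` and `cb2` are tiny hand-made examples for 2x2 blocks flattened to 4 components, not the codebooks trained in the thesis, and the function names are hypothetical.

```python
import numpy as np

def nearest_index(vectors, codebook):
    """Full-search VQ: index of the closest codeword (squared Euclidean distance)."""
    diff = vectors[:, None, :] - codebook[None, :, :]
    return np.argmin((diff ** 2).sum(axis=2), axis=1)

def two_stage_encode(vectors, cb1, cb2):
    """Stage 1 codes the vector roughly, stage 2 codes what is left over."""
    i1 = nearest_index(vectors, cb1)
    residual = vectors - cb1[i1]          # the "difference image" in vector form
    i2 = nearest_index(residual, cb2)
    return i1, i2

def two_stage_decode(i1, i2, cb1, cb2):
    """Reconstruction is the sum of the two selected codewords."""
    return cb1[i1] + cb2[i2]

def index_bits(k):
    """Bits needed for an index into a codebook of k codewords: ceil(log2 k)."""
    return (k - 1).bit_length()

# Hypothetical codebooks (assumption, for illustration only).
cb1 = np.array([[0, 0, 0, 0], [100, 100, 100, 100], [200, 200, 200, 200]], dtype=float)
cb2 = np.array([[0, 0, 0, 0], [10, 10, 10, 10],
                [-10, -10, -10, -10], [10, -10, 10, -10]], dtype=float)

vectors = np.array([[108, 109, 111, 110],
                    [195, 188, 192, 190],
                    [12, -9, 11, -10]], dtype=float)

i1, i2 = two_stage_encode(vectors, cb1, cb2)
decoded = two_stage_decode(i1, i2, cb1, cb2)

combinations = len(cb1) * len(cb2)                          # product of codebook sizes
bits_per_vector = index_bits(len(cb1)) + index_bits(len(cb2))
print(i1, i2, combinations, bits_per_vector)
```

Each encoded vector requires only `len(cb1) + len(cb2)` distance computations, while the decoder can produce `len(cb1) * len(cb2)` distinct reconstructions, which is the saving the text attributes to multistage VQ.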
Nine test images were used in this work. All of the test images used in the coding applications are 8-bit grayscale images of size 256 x 256. The image coding results are given as peak signal-to-noise ratio (PSNR). An important point is that although PSNR is a mathematically meaningful quality measure, an image with visible artifacts can have a high PSNR, while a good quality image may have a relatively low PSNR. Therefore, the opinion of a trained observer matters much more for image quality. This is especially important in subband coding, because the errors that lower the PSNR are often invisible, which is another important advantage of subband coding over block transforms.

The coding results after vector quantization are in general encouraging. For example, at high bit rates such as 2.81 bpp, there are no visible artifacts in the image. A trained observer can see some small localized artifacts in images coded at medium bit rates (e.g. 1.53 bpp). At the highest compression level, where only the LL band is coded, a bit rate of 1.13 bpp is achieved, but these images have some small artifacts at the boundaries and around the edges, which can also be seen by an untrained observer. Some numerical results are given in Table 1. The subband coded images can be rated as high to medium quality images, which can be used in many applications, and the bit rates achievable by further decomposition of the subbands are promising. Furthermore, the processing time and codebook sizes are also acceptable. Examination of the quantizer outputs showed some statistical irregularity in the vector assignments (the codewords are not used equally often), so the vector quantizer output was found to be a candidate for entropy coding. With Huffman encoding, more compression is achieved.
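For reference, the PSNR measure used above to report coding results can be computed as in the following minimal sketch. It assumes the common definition for 8-bit images, 10·log10(255² / MSE); the summary does not spell out the exact formula used in the thesis, and the function name `psnr` is hypothetical.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit grayscale images."""
    original = np.asarray(original, dtype=np.float64)
    reconstructed = np.asarray(reconstructed, dtype=np.float64)
    mse = np.mean((original - reconstructed) ** 2)   # mean squared error
    if mse == 0:
        return float("inf")                          # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy example: a 256 x 256 image whose reconstruction is off by 5 gray levels everywhere.
original = np.full((256, 256), 100, dtype=np.uint8)
reconstructed = np.full((256, 256), 105, dtype=np.uint8)
print(round(psnr(original, reconstructed), 2))
```

A uniform error of this kind and a few strongly visible localized errors can give the same mean squared error, which is why the text cautions against relying on PSNR alone.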
Although the results are encouraging, better quality images at the same bit rate, or lower bit rates at the same image quality, can be achieved using multistage subband analysis. Applying the analysis stage a second time yields 16 uniform subbands. Alternatively, only the lowest subband can be analyzed further with the analysis filters; this decomposition technique is called octave-band decomposition. By analyzing the subbands further, a more appropriate quantization can be applied to each band, and better PSNR values and image quality can be achieved.

Table 1: The comparison of different results
* This image is used for generating the initial codebooks.
** These images are used for training the initial codebooks with the LBG algorithm.

Similar Theses

Thesis No
673005
Spektral ve Faz Tabanlı Özniteliklerle Çok Sınıflı Motor Hayali EEG Sinyallerinin Sınıflandırılması
Classification of multi-class motor imaginary eeg signals with spectral and phase-based features
OSMAN ÇETİN
Master's
Turkish
2021
Electrical and Electronics Engineering, Kütahya Dumlupınar Üniversitesi
Department of Electrical and Electronics Engineering
ASST. PROF. DR. MUSTAFA TOSUN
Thesis No
46285
Sayısal görüntülerin alt band kodlanması
Subband coding of digital images
SIDIK DÜNDAR
Master's
Turkish
1995
Electrical and Electronics Engineering, İstanbul Teknik Üniversitesi
ASST. PROF. DR. M. ERTUĞRUL ÇELEBİ
Thesis No
76484
Subband decomposition vector quantization architectures for coding of speech and audio signals
Konuşma ve işitsel sinyaller için alt band vektör niceleme mimarileri
ONUR ENGİN TAÇKIN
Master's
English
1998
Electrical and Electronics Engineering, Boğaziçi Üniversitesi
Department of Electrical and Electronics Engineering
PROF. DR. MEHMET BÜLENT SANKUR
Thesis No
21736
2.4-2.7 GHz radyo-röle alt band çevirici tasarımı ve gerçekleştirilmesi
Design and realization of a 2.4-2.7 GHz radiolink down-converter
H.BÜLENT YAĞCI
Master's
Turkish
1992
Electrical and Electronics Engineering, İstanbul Teknik Üniversitesi
PROF. DR. OSMAN PALAMUTÇUOĞULLARI
Thesis No
83713
Automatic speech segmentation based on subband decomposition
Alt bant ayrışıma dayalı otomatik konuşma bölütleme
ARÇIN BOZKURT
Master's
English
1999
Electrical and Electronics Engineering, İhsan Doğramacı Bilkent Üniversitesi
Department of Electrical and Electronics Engineering
PROF. DR. A. ENİS ÇETİN
