Termal görüntülere derin öğrenme tabanlı süper çözünürlük yöntemlerinin uygulanması
Application of deep learning based super resolution in thermal images
- Thesis No: 894770
- Advisors: PROF. DR. ENDER METE EKŞİOĞLU
- Thesis Type: Master's
- Subjects: Electrical and Electronics Engineering
- Keywords: Not specified.
- Year: 2023
- Language: Turkish
- University: İstanbul Teknik Üniversitesi
- Institute: Institute of Science (Fen Bilimleri Enstitüsü)
- Department: Department of Electronics and Communication Engineering
- Program: Electronics Engineering
- Page Count: 89
Abstract
With advancing technology, imaging systems have come into frequent use in military and civilian applications such as target detection, target tracking, vehicle license plate recognition, and disease detection from radiological images. Because sensors operating at visible wavelengths need a light source, daytime sensors cannot produce images at night or in bad weather. Thermal imaging systems were developed to meet this need. They work on the principle that every object with heat emits electromagnetic waves, which detectors can sense. Image resolution plays a very important role in allowing imaging systems to perform well in their intended tasks. However important resolution may be, high-resolution sensors are costly and difficult to manufacture, so their production capacity is limited and they often cannot be chosen. For this reason, many software-based methods have been developed and put into use to obtain high-resolution images from low-resolution ones. These methods can be grouped under three general headings: interpolation-based methods, reconstruction-based methods, and example-based methods. Interpolation-based methods, the first group, are traditional image processing methods that do not require high computing power. Their general principle is to increase resolution by using the pixels surrounding a given pixel. Because these methods lose the high-frequency content of the image, sharp details do not appear in the high-resolution output image.
In reconstruction-based methods, the low-resolution image is required to be similar to the high-resolution image, and the similar low-resolution image is structurally converted into a high-resolution image. In example-based super-resolution methods, the other family, high-resolution images are created by establishing similarity with learned examples. These methods require more processing power and memory. Today, with the arrival of advanced processors, deep learning has come into use in many fields. Deep learning entered the super-resolution field through convolutional neural networks. Many different deep learning architectures were subsequently derived and achieved the highest success in this field. Although many deep learning architectures exist for color images, few super-resolution architectures have been applied to thermal images. The reasons include the lack of datasets containing large numbers of thermal images and the fact that thermal imaging systems were not widely used until recently. In this study, deep learning super-resolution models used for single color images are applied to thermal images and the results are evaluated. In this context, the following are examined in order: the deep learning based SRCNN model, which first used convolutional neural networks for super resolution, then the VDSR model, the ESRGAN model, and the SwinIR model. In this examination, two different interpolation methods were used to obtain the low-resolution model input from the original image in the high-resolution dataset, and the comparison was made for both techniques. Two different types of evaluation metrics were used in the comparison.
The full-reference metrics PSNR and SSIM, which evaluate against a reference image, and the no-reference metrics NIQE, PIQUE, and BRISQUE were used.
Abstract (Translation)
Nowadays, imaging systems are crucial in various areas such as the military, medicine, traffic monitoring, and public security, and they have been improved for these specific purposes. Despite these improvements, image quality may still be insufficient to reach the intended goals. At this point, image resolution plays an important role. Because of the high cost of high-resolution sensors and other hardware limitations, higher-resolution imaging sensors may not be preferred in imaging systems. For these reasons, the resolution of images coming from imaging systems needs to be recovered digitally by processors. Super-resolution algorithms emerged to meet this need. In the simplest terms, super resolution converts low-resolution input images into high-resolution output images by using different techniques. Super resolution has two main branches: single-image super resolution and multi-image super resolution. In single-image super resolution, a single image is given as input to the super-resolution model. In multi-image super resolution, the model uses more than one image to obtain a high-resolution image. The electromagnetic spectrum is divided by wavelength into gamma rays, X-rays, ultraviolet rays, the visible region, infrared rays, microwaves, and radio waves. Daytime imaging systems, which operate at visible wavelengths of 400-700 nm, need a light source by their working principle. They stop working where the light source is insufficient or absent. For this reason, a different kind of imaging system was needed for darkness and bad weather, and thermal imaging systems came into use.
The working principle of thermal imaging systems is that objects at any temperature above 0 kelvin (absolute zero) radiate at invisible infrared wavelengths, and this radiation can be detected by thermal imaging systems. Thermal cameras operate in the infrared region, which spans wavelengths between 700 nm and 1 mm, but within this range they are used only in limited bands because the atmosphere transmits poorly at many of these wavelengths. These bands are near infrared (700-1000 nm), short-wave infrared (1000-2500 nm), medium-wave infrared (3000-5000 nm), and long-wave infrared (8000-12000 nm). Thanks to thermal imaging systems, it is possible to detect, recognize, and identify objects in complete darkness or in insufficient ambient light. For these reasons, thermal imaging systems have been used extensively in fields such as the military, medicine, and public security. However, thermal imaging systems have lower resolutions and higher production costs than daytime imaging systems. Because of the high cost and production difficulty of high-resolution thermal sensors, super-resolution methods are used to obtain higher-resolution images from thermal cameras. Compared with color images, thermal images have fewer high-frequency components and a higher noise level from the thermal detector. For this reason, applying super-resolution methods to them is relatively difficult. Single-image super-resolution techniques fall into three main branches: interpolation-based, reconstruction-based, and example-based super resolution. Interpolation-based methods are widely used because of their low cost and ease of implementation. Their main principle is that the values of a pixel's neighbors are used to determine new pixel values.
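The neighbor-pixel principle can be illustrated with a minimal Python sketch. It is not taken from the thesis: the function names (`nearest_sample`, `bilinear_sample`, `resize`), the pixel-center coordinate mapping, and the edge clamping are assumptions made for illustration, and the image is represented as a plain list of rows of gray levels.

```python
def nearest_sample(img, y, x):
    """Nearest-neighbor: copy the single closest input pixel."""
    h, w = len(img), len(img[0])
    r = min(h - 1, max(0, int(y + 0.5)))
    c = min(w - 1, max(0, int(x + 0.5)))
    return img[r][c]


def bilinear_sample(img, y, x):
    """Bilinear: distance-weighted average of the four surrounding pixels."""
    h, w = len(img), len(img[0])
    # Clamp the sampling position to the image (assumed border handling).
    y = min(max(y, 0.0), h - 1.0)
    x = min(max(x, 0.0), w - 1.0)
    y0, x0 = int(y), int(x)
    y1, x1 = min(y0 + 1, h - 1), min(x0 + 1, w - 1)
    dy, dx = y - y0, x - x0
    top = img[y0][x0] * (1 - dx) + img[y0][x1] * dx
    bottom = img[y1][x0] * (1 - dx) + img[y1][x1] * dx
    return top * (1 - dy) + bottom * dy


def resize(img, out_h, out_w, sample):
    """Resize a 2D list of gray levels with the given sampling function."""
    in_h, in_w = len(img), len(img[0])
    out = []
    for i in range(out_h):
        # Map each output pixel center back to input coordinates.
        y = (i + 0.5) * in_h / out_h - 0.5
        row = []
        for j in range(out_w):
            x = (j + 0.5) * in_w / out_w - 0.5
            row.append(sample(img, y, x))
        out.append(row)
    return out
```

For example, `resize(img, 4, 4, nearest_sample)` doubles a 2x2 image by repeating pixels, while passing `bilinear_sample` blends neighbors instead. Bicubic interpolation follows the same pattern with a 4x4 neighborhood and cubic weights and is omitted here for brevity.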
In nearest-neighbor interpolation, each new pixel value is taken from the single nearest pixel. This method is very fast, but the quality of the interpolated output image is low. In bilinear interpolation, new pixel values are determined from the four nearest pixels. Bilinear interpolation is slower than nearest-neighbor interpolation but gives better image quality. In bicubic interpolation, new pixel values are computed from the 16 nearest pixels, weighted by distance. Bicubic interpolation has the highest computing cost and the best image quality of the three methods. In reconstruction-based methods, the high-resolution image is generated by using prior knowledge of the image. The constraint is that the low-resolution and high-resolution images are similar in structure. Example-based super-resolution methods create a high-resolution image from a low-resolution image by drawing on examples. These methods need more computation power. There are many example-based methods: high-frequency transfer, neighbor embedding, sparse coding, anchored regression, regression trees, and deep learning. In high-frequency transfer, the high-resolution image is obtained by combining the high-frequency component of the low-resolution image with the interpolated high-resolution image. In neighbor embedding, patches from the input image and its down-scaled version are used to build a database. Patches from the low-resolution image are matched with high-resolution patches in the database, and the matched high-resolution patches are combined to give the high-resolution output image. Sparse coding is an improved version of neighbor embedding. In the training stage, two dictionaries are created. One contains patches from natural scenes, and the other contains the down-scaled versions of the patches in the first.
In the inference stage, patches from the low-resolution image are matched with patches from the down-scaled dictionary. Each matched down-scaled patch is then mapped to its counterpart in the other dictionary. Finally, the high-resolution image is created by combining these patches. In anchored regression, there is an external database that contains a low-resolution dictionary and low-to-high-resolution mapping matrices. Using this database, patches from the low-resolution image are matched with high-resolution patches, and the high-resolution output image is generated. In the regression trees technique, the low-resolution image is divided into small patches. These patches enter at the top of the tree and exit at the most suitable node. The high-resolution image is generated from this regression model. Finally, the most popular example-based technique today is deep learning based super resolution. Deep learning is a sub-branch of machine learning built on structures called artificial neural networks. It is called deep because the artificial neural network has many layers. Deep learning models generally need to be trained with large datasets. Many algorithms and methods have been developed to make learning possible, raise accuracy to the desired level, and shorten training time. Examples include activation functions, optimization algorithms, the backpropagation algorithm, batch normalization, and dropout. Today, with the arrival of advanced processors, deep learning has come into use in many areas. For example, deep learning models can understand and respond to speech, recognize objects, people, and faces in an image, drive vehicles, and play games such as chess and computer games. Deep learning entered the field of super resolution through convolutional neural networks.
Since then, many different deep learning architectures have been derived and have achieved the highest success in this field. Although there are many deep learning architectures for color images, few super-resolution architectures have been applied to thermal images. The reasons include the scarcity of thermal image data and the fact that thermal imaging systems were not widely used until recently. In this study, four deep learning models developed for color images are used for single-image super resolution on thermal images. The super-resolution convolutional neural network (SRCNN) was the first model to use convolutional neural networks (CNNs) for super resolution. In SRCNN, the low-resolution image is upscaled by the desired scale factor before being given to the model, so the input and output images of the model have the same size. The SRCNN architecture has three main operations: patch extraction, non-linear mapping, and reconstruction. The very deep super-resolution architecture (VDSR) was inspired by the VGG-Net architecture. VDSR has many convolutional layers, which increase the training time of the model. As in SRCNN, the input image is upscaled before being given to the model. The high-resolution image is obtained by adding the model output to the upscaled image. The generative adversarial network (GAN) is one of the best-known network types for super resolution. The enhanced super-resolution generative adversarial network (ESRGAN) is an enhanced version of the super-resolution GAN (SRGAN). In the ESRGAN model, all batch normalization layers are removed from the generator, and the original basic block is replaced by the residual-in-residual dense block. Many deep learning super-resolution models are based on CNNs. The SwinIR model instead uses a transformer structure, the Swin Transformer. SwinIR has three modules: shallow feature extraction, deep feature extraction, and high-quality image reconstruction.
Two kinds of quality assessment are used for image super resolution. Qualitative assessment methods are based on human evaluation. Quantitative assessment methods are based on mathematical computation with image features such as structure and luminance. Quantitative methods have two main branches: full-reference methods and no-reference methods. In full-reference methods, unlike no-reference methods, the output image is compared with the original reference image. In this thesis, peak signal-to-noise ratio (PSNR) and the structural similarity index measure (SSIM) are used as full-reference image quality assessment methods. PSNR assesses the similarity between the high-resolution output and the original image. Among candidate outputs, the one with the highest PSNR is the closest to the original image. SSIM evaluates an image by considering structure, contrast, and luminance. A higher SSIM value likewise indicates higher image quality. As no-reference methods, the naturalness image quality evaluator (NIQE), the perception-based image quality evaluator (PIQUE), and the blind/referenceless image spatial quality evaluator (BRISQUE) are used. For these three methods, a lower value indicates better image quality. In this study, four deep learning super-resolution models are analyzed on thermal images with five quality assessment metrics. Low-resolution thermal images obtained by downscaling the original thermal images are given to the models as input. The downscaling was done with bilinear interpolation and with bicubic interpolation. The model outputs for low-resolution images reduced from the same original thermal image by the different interpolation methods were examined.
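As a rough illustration of the full-reference comparison described above, PSNR is commonly computed from the mean squared error (MSE) between the reference and the output as 10·log10(MAX² / MSE). The sketch below is not the thesis code: the function names `mse` and `psnr` are hypothetical, images are plain lists of rows, and the peak value of 255 assumes 8-bit gray levels (thermal sensors often deliver higher bit depths, in which case `max_value` would change).

```python
import math


def mse(reference, output):
    """Mean squared error between two equally sized 2D lists of gray levels."""
    total = 0.0
    count = 0
    for ref_row, out_row in zip(reference, output):
        for r, o in zip(ref_row, out_row):
            total += (r - o) ** 2
            count += 1
    return total / count


def psnr(reference, output, max_value=255.0):
    """Peak signal-to-noise ratio in dB; higher means closer to the reference."""
    error = mse(reference, output)
    if error == 0:
        # Identical images: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / error)
```

Calling `psnr(original, model_output)` for each model's output on the same original image gives values that can be ranked directly, with the highest indicating the output closest to the original.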
Similar Theses
- Süper çözünürlük algoritmalarında derin öğrenmenin etkisi
The effect of deep learning in super resolution algorithms
MEHMET CAN ÖZ
Master's
Turkish
2024
Electrical and Electronics Engineering, Gazi Üniversitesi, Department of Electrical and Electronics Engineering
PROF. DR. TUĞBA SELCEN NAVRUZ
- Termal görüntü çözünürlüğünün artırılması için derin öğrenme tabanlı bulut sisteminin geliştirilmesi
Development of cloud system based on deep learning for thermal image resolution enhancement
FATİH MEHMET ŞENALP
Doctorate
Turkish
2022
Electrical and Electronics Engineering, Konya Teknik Üniversitesi, Department of Electrical and Electronics Engineering
DOÇ. DR. MURAT CEYLAN
- Deep learning-based object tracking system by using visual and thermal infrared fusion
Termal kızılötesi ve görünür bant kaynaştırma kullanarak derin öğrenme tabanlı nesne takip sistemi
ABBAS TÜRKOĞLU
Master's
English
2023
Computer Engineering and Computer Science and Control, Orta Doğu Teknik Üniversitesi, Department of Modeling and Simulation
DOÇ. DR. ELİF SÜRER
DOÇ. DR. ERDEM AKAGÜNDÜZ
- Havadan alınan termal kamera görüntülerinde canlı tasnifinin yapılmasında derin öğrenme tabanlı tekniklerin uygulanması
Application of deep learning-based techniques for performing live sorting on aerial thermal camera images
HALİL USLU
Master's
Turkish
2022
Computer Engineering and Computer Science and Control, Fırat Üniversitesi, Department of Software Engineering
PROF. DR. ENGİN AVCI
- Hiperspektral termal görüntülerde hedef tespiti
Target detection in hyperspectral thermal images
METEHAN YALÇIN
Master's
Turkish
2023
Electrical and Electronics Engineering, Hacettepe Üniversitesi, Department of Electrical and Electronics Engineering
DOÇ. DR. SENİHA ESEN YÜKSEL ERDEM
DOÇ. DR. ALPER KOZ