Görüntü işlemede derin öğrenme tabanlı süper çözünürlük uygulamaları
Deep learning based super resolution applications in image processing
- Thesis No: 659034
- Advisors: PROF. DR. ENDER METE EKŞİOĞLU
- Thesis Type: Master's
- Subjects: Electrical and Electronics Engineering
- Keywords: Image Processing, Super Resolution, Image Enhancement, Deep Learning, Interpolation
- Year: 2021
- Language: Turkish
- University: İstanbul Teknik Üniversitesi
- Institute: Institute of Science (Fen Bilimleri Enstitüsü)
- Department: Department of Electronics and Communication Engineering
- Discipline: Telecommunication Engineering
- Number of Pages: 91
Abstract
The details of an image play an important role in systems such as medical diagnosis, satellite imaging, vehicle license plate recognition and face recognition. For images in which details are critical, low resolution images are insufficient. Converting the systems that capture these images into systems capable of high resolution is difficult, both in terms of cost and in terms of storing the resulting high resolution images. For this reason, it is important to obtain high resolution images using existing systems. Today's digital cameras are expected to deliver ever higher image resolution. However, because sensor technology has reached a saturation level, image resolution is limited as well. To cope with this, super resolution, which obtains a high resolution image from low resolution images, has been proposed as a solution. Super resolution is a resolution enhancement technique that uses a single image or multiple images of the same scene. Many methods exist in the literature for the single image enhancement problem. Interpolation-based, model-based and deep learning-based methods were proposed, historically, in that order. Interpolation-based methods, the first to be proposed for image enhancement, were simple yet successful. However, for most applications the high frequency detail they recover is insufficient. In the years when the idea of super resolution was proposed, deep learning was not widespread, and various interpolation-based resolution enhancement methods were in use. The spread of deep learning and the later introduction of convolutional neural networks made super resolution a popular problem. Thus, deep learning-based super resolution methods took their place in the literature. In this thesis, deep learning-based super resolution methods were applied to single images.
The deep learning methods implemented and compared with one another for this purpose are SRGAN, EDSR, SAN, RCAN and CAR. To show that the deep learning-based methods used are far more successful than upsampling-based methods, images obtained by bicubic interpolation are also included. The performance results obtained in this study from the deep learning-based super resolution methods agree with the reference values. To shed light on future work, the performance results of the implemented deep learning-based super resolution methods are presented and the deep learning methods proposed for super resolution are compared.
Abstract (Translation)
The details of an image play an important role in systems such as medical diagnosis, satellite imaging, vehicle license plate recognition and face recognition. Where details matter, low resolution images are insufficient. Converting the systems that capture these images into systems capable of high resolution is difficult, both in terms of cost and in terms of storing the resulting high resolution images. For this reason, it is important to obtain high resolution images using existing systems. A typical digital camera can magnify an image. Synthetic magnification of a region of interest is used in applications such as surveillance, forensic, scientific, medical and satellite imaging. For surveillance and forensic purposes, digital video recorders (DVRs) are replacing closed-circuit television (CCTV) systems, and objects in an image, such as the faces of criminals or car license plates, often need to be magnified. Super resolution is also applied in medical imaging such as Computed Tomography (CT) and Magnetic Resonance Imaging (MRI), where multiple images can be acquired but resolution quality is limited. In satellite imaging applications such as remote sensing, super resolution is likewise used to increase the resolution of the image of a target of interest.
Camera sensors in digital cameras, tablets and phones can capture high resolution images, and the resolution of these images is sufficient for general use. In some cases, however, these devices produce low resolution images, or an application expects a resolution that shows more detail. Super resolution is used, for example, for: (i) devices that cannot produce high resolution images and videos; (ii) displaying details contained in an image, such as the faces of people or license plates; (iii) blurry and noisy images; (iv) image processing applications that need better resolution than the camera can provide; and (v) pre-processing that improves the performance of other image-based algorithms, such as face recognition or disease diagnosis from MR images.
Today's digital cameras are expected to deliver ever higher image resolution. However, because sensor technology has reached a saturation level, image resolution is limited as well. To overcome this, super resolution, which obtains high resolution images from low resolution images, has been proposed. Super resolution is a resolution enhancement technique that uses a single image or multiple images of the same scene. Many methods exist in the literature for single image enhancement. Interpolation-based, reconstruction-based and deep learning-based methods were proposed, historically, in that order.
Interpolation-based methods, the first to be proposed for image enhancement, were simple yet successful, and they impose little computational load. However, for most applications the high frequency detail they recover is insufficient: interpolation acts as a low-pass filter, so high frequency components cannot be recovered. As a result, object edges in the new image can be over-smoothed or show jagged artifacts. For more flexible solutions, reconstruction-based methods built on priors such as non-local similarity and sparsity have been used. Although reconstruction-based methods succeeded in producing high resolution images, they had shortcomings. They often require a time-consuming optimization process to obtain results. In addition, their performance depends on image characteristics and has been observed to degrade rapidly as a result.
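The smoothing effect of interpolation described above can be illustrated with a small sketch. It is not the thesis's code: it works on a one-dimensional row of pixel values rather than an image, uses linear interpolation as a simpler stand-in for bicubic interpolation, and the names `downsample`, `upsample_linear`, `high_res`, `low_res` and `restored` are hypothetical.

```python
def downsample(signal, factor):
    """Keep every `factor`-th sample (naive decimation, no pre-filter)."""
    return signal[::factor]


def upsample_linear(signal, factor):
    """Rebuild the dense grid by linear interpolation between kept samples."""
    out = []
    for left, right in zip(signal, signal[1:]):
        for step in range(factor):
            t = step / factor
            out.append((1 - t) * left + t * right)
    out.append(signal[-1])
    return out


# Hypothetical 1-D "image row" with a sharp edge between index 4 and index 5.
high_res = [0, 0, 0, 0, 0, 1, 1, 1, 1]
low_res = downsample(high_res, 2)
restored = upsample_linear(low_res, 2)

print(low_res)
print(restored)
```

In this sketch the step from 0 to 1 comes back as a ramp through an intermediate value, which is the kind of edge smoothing the text attributes to interpolation.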
Although approaches such as linear, bicubic or nearest neighbor interpolation are faster, deep learning-based super resolution methods achieve far better results in image enhancement. Super resolution, an active research topic in signal processing in recent years, is an image reconstruction and enhancement technique with an important place in the literature. The adjective "super" in its name expresses the characteristic of this signal processing technique. One of the biggest advantages of the signal processing approach is its low cost and its ability to reuse existing low resolution systems. Multi-image super resolution, which builds a high resolution image from more than one image of the same scene, is an accepted technique in many fields such as medical imaging, satellite imaging and video applications. Single image super resolution techniques, which are explained in later parts of the thesis, produce a high resolution image algorithmically from a single low resolution image.
Artificial intelligence is a broad field of study concerned with methods that enable computers to simulate human intelligence. Its basic topics are knowledge representation, inference and learning. Machine learning is the sub-branch of artificial intelligence concerned with learning and is the most widely used today. Instead of making decisions based on predetermined rules, a machine learns from large data sets and aims to make better inferences. Machine learning is used in every aspect of daily life, even if not everyone is aware of it.
Object recognition, speech-to-text software, and e-commerce sites that recommend products based on users' interests and previously viewed products are among the most basic examples. Machine learning is divided into three categories: supervised learning, unsupervised learning and reinforcement learning. In supervised learning, the algorithm is given both the input and the expected result. The algorithm compares its own output with the expected result and updates its parameters accordingly. Unsupervised learning uses only the input data, without giving the expected output to the algorithm; it is used especially for clustering large data sets. Reinforcement learning has a reward mechanism: the algorithm learns which parameters lead to the highest reward.
Neural networks originated in the modeling of biological nervous systems. After achieving successful results in machine learning tasks, they grew and developed within engineering. For this reason, an explanation of artificial neural networks has to start with biological neural networks. The brain's smallest computing unit is the neuron. A biological nerve cell contains structures such as the axon, dendrites and nucleus. The axon carries the cell's signals onward and is the output of the nerve cell. The dendrites are the input of the cell and collect signals from other neurons. The structure that connects the axons and dendrites of different nerve cells so that they can communicate is called a synapse. The human nervous system contains approximately 86 billion neurons, interconnected by synapses. Artificial neural networks can be defined as layers of perceptrons.
Each module converts the incoming data into a different representation for the next module. These modules are called layers in the literature, and each layer consists of multiple units, or neurons. A fully connected artificial neural network with N hidden layers contains weights and biases that are learned during training. Each layer outputs the weighted sum of the inputs it receives. There are no connections between neurons in the same layer, and the weights and biases between neurons in different layers are learned during the training phase. Neurons in two successive layers affect each other through activation functions, and these values determine how well the model learns.
In the years when the idea of super resolution was proposed, deep learning was not widespread, and various interpolation-based image enhancement methods were used. The spread of deep learning and the later introduction of convolutional neural networks made super resolution a popular problem. Thus, deep learning-based super resolution methods took their place in the literature. Deep learning aims to learn the relationship between inputs and outputs. For this purpose, deep learning approaches allow models composed of multiple layers to learn representations of data at various levels. Deep learning methods have advanced technology in fields such as speech recognition, object recognition and object detection.
In this thesis, deep learning-based super resolution methods were applied to single images. The deep learning methods implemented and compared with one another for this purpose are SRGAN, EDSR, SAN, RCAN and CAR. To show that the deep learning-based methods used are far more successful than classical methods, images obtained by bicubic interpolation are also included.
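The fully connected network described above (weighted sums, biases and activation functions between successive layers) can be sketched as a forward pass. This is a minimal illustration, not code from the thesis: the choice of ReLU as the activation, the function names `dense`, `relu` and `forward`, and all numeric weights are assumptions made for the example. In practice the weights and biases would be learned during training rather than written by hand.

```python
def dense(inputs, weights, biases):
    """One fully connected layer: per unit, weighted sum of inputs plus bias."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def relu(values):
    """Assumed activation function; the text only says 'various' are used."""
    return [max(0.0, v) for v in values]


def forward(inputs, layers):
    """Run inputs through each (weights, biases) pair in order.

    ReLU is applied after every layer except the last one.
    """
    activations = inputs
    for index, (weights, biases) in enumerate(layers):
        activations = dense(activations, weights, biases)
        if index < len(layers) - 1:
            activations = relu(activations)
    return activations


# Hypothetical hand-picked parameters for a network with one hidden layer.
layers = [
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),  # hidden layer: 2 inputs -> 2 units
    ([[2.0, 1.0]], [0.5]),                      # output layer: 2 units -> 1 output
]
output = forward([3.0, 1.0], layers)
print(output)
```

Note that units within one layer never read each other's values; each unit only sees the outputs of the previous layer, which matches the description of no connections inside a layer.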
The methods used to measure the performance of an image enhancement method fall into two groups: objective and subjective. Objective performance measures use the high resolution image, called the ground truth, the low resolution image derived from it, and the high resolution image reconstructed from that low resolution image. In this study, the performance results obtained from the deep learning-based super resolution methods agreed with the reference values. To shed light on future studies, the performance results of the implemented deep learning-based super resolution methods are presented and the deep learning methods proposed for super resolution are compared.
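The objective evaluation described above can be sketched as a small pipeline: derive a low resolution image from the ground truth, reconstruct a high resolution image from it, and score the reconstruction against the ground truth. The abstract does not name the metric, so peak signal-to-noise ratio (PSNR), a commonly used objective measure, is assumed here. Nearest neighbor upscaling stands in for the super resolution method, and the function names and the 4x4 test image are hypothetical.

```python
import math


def downsample(image, factor):
    """Derive the low resolution image by keeping every `factor`-th pixel."""
    return [row[::factor] for row in image[::factor]]


def upscale_nearest(image, factor):
    """Stand-in reconstruction: repeat each pixel `factor` times in both axes."""
    out = []
    for row in image:
        wide = [pixel for pixel in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    squared_errors = [
        (r - e) ** 2
        for ref_row, est_row in zip(reference, estimate)
        for r, e in zip(ref_row, est_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Hypothetical 4x4 grayscale ground truth (8-bit range assumed).
ground_truth = [
    [10, 20, 30, 40],
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [50, 60, 70, 80],
]
low_res = downsample(ground_truth, 2)
reconstructed = upscale_nearest(low_res, 2)
score = psnr(ground_truth, reconstructed)
print(f"PSNR: {score:.2f} dB")
```

A higher PSNR means the reconstruction is closer to the ground truth, so the same function could be used to compare a bicubic baseline with the output of a learned model on the same image.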
Similar Theses
- Self-supervised pansharpening: Guided colorization of panchromatic images using generative adversarial networks
Öz-denetimli pankeskinleştirme: Çekişmeli üretken ağlar ile pankromatik görüntülerin güdümlü renklendirilmesi
FURKAN ÖZÇELİK
Master's
English
2020
Computer Engineering and Computer Science and Control, İstanbul Teknik Üniversitesi, Department of Computer Engineering
PROF. DR. GÖZDE ÜNAL
- Deep learning-based techniques for 3D point cloud analysis
3B nokta bulutu analizi için derin öğrenme temelli teknikler
YUSUF HÜSEYİN ŞAHİN
PhD
English
2023
Computer Engineering and Computer Science and Control, İstanbul Teknik Üniversitesi, Department of Computer Engineering
PROF. DR. GÖZDE ÜNAL
- Kalite kontrol sistemi için derin öğrenme tabanlı bir model önerisi
A deep learning-based model proposal for a quality control system
YAREN ÇELİK
Master's
Turkish
2022
Industrial and Industrial Engineering, Başkent Üniversitesi, Department of Industrial Engineering
PROF. DR. BERNA DENGİZ
- Design and deployment of deep learning based fuzzy logic systems
Derin öğrenme tabanlı bulanık sistemlerin geliştirilmesi ve uygulanması
AYKUT BEKE
PhD
English
2023
Computer Engineering and Computer Science and Control, İstanbul Teknik Üniversitesi, Department of Control and Automation Engineering
ASSOC. PROF. DR. TUFAN KUMBASAR
- Konuşma sinyali ve ses telleri görüntülerinden derin öğrenme tabanlı glotal alan kestirimi
Deep learning based estimation of glottal area from speech and vocal folds images
YAŞAR SAİD DERDİMAN
Master's
Turkish
2023
Electrical and Electronics Engineering, Süleyman Demirel Üniversitesi, Department of Electrical and Electronics Engineering
ASST. PROF. DR. TURGAY KOÇ