Evrişimli sinir ağları ile yüksek çözünürlüklü uydu görüntülerinden uçak tespiti

Aircraft detection from high resolution satellite images with convolutional neural networks

Thesis No: 694473
Author: EMİNE DİLŞAD ÜNSAL
Advisor: PROF. DR. ELİF SERTEL
Thesis Type: Master's
Subjects: Geodesy and Photogrammetry
Keywords: Not specified.
Year: 2021
Language: Turkish
University: İstanbul Teknik Üniversitesi
Institute: Bilişim Enstitüsü
Department: İletişim Sistemleri Ana Bilim Dalı
Program: Uydu Haberleşmesi ve Uzaktan Algılama Bilim Dalı
Number of Pages: 112

Abstract

Artificial intelligence systems, which have entered every area of life with Industry 4.0, can solve increasingly complex problems and have considerably raised the human standard of living. In recent years, the steadily increasing spatial and temporal resolution of earth observation satellites has made many objects on the Earth's surface far easier to detect. Object detection has been an important topic ever since the disciplines of remote sensing and photogrammetry first emerged. Initially performed manually, object detection is now carried out with convolutional neural networks, a type of deep learning model, because traditional learning methods are of limited use for detecting objects in modern remote sensing images. The reason is that traditional detection methods mostly rely on a single simple feature, which is effective for small images with simple backgrounds. Today's remote sensing images are more complex, so traditional methods face many difficulties when processing large remote sensing images. With convolutional neural networks, analyst errors can be minimized and time, labor and cost can be saved. Building a successful artificial neural network requires training on many examples of the target object or objects. The resulting convolutional neural networks make it possible to detect multiple objects at once, perform change analysis and make predictions about the future. For an input image processed by a multi-layer convolutional neural network, the weights of every layer are updated: after the chosen number of feedforward and backpropagation passes, the error is computed from the difference between the predicted value and the true value. Many military and civilian aircraft fly at different altitudes in national and international airspace.
Because high-resolution satellite images offer a wide field of view and altitude monitoring, they are very useful for aircraft detection. Aircraft detection from satellite images is needed for many reasons, such as supervising airspace, controlling air traffic and defense applications. Building a dataset for object identification from satellite images is demanding in terms of time and cost, and the open-source datasets currently available are limited. Google Earth is an important opportunity for researchers as a source of free satellite imagery. In this thesis, the HRPlanesv2 dataset, which consists of Google Earth images, was used. The aim is to publish the dataset after the thesis so that others can use it and to help meet this need in the literature. In the literature, aircraft detection from satellite images has mostly relied on models that generate candidate regions. Although these models achieve highly accurate results, they are slow at detection time. Because YOLO models detect objects by dividing images into grids rather than generating candidate regions, they are very useful, particularly for real-time detection. For this reason, training in this thesis was carried out with YOLO and Faster RCNN, two of the most successful deep learning models for object detection. The models used had previously been trained on the open-source COCO dataset. Three separate datasets were created to suit the purposes of the experiments. Care was taken that the aircraft proportions in the first dataset are 80% or more, while in the second dataset this threshold was lowered to 50%. The images in all datasets were acquired under clear weather conditions. A small portion of the images in the second dataset, on which transfer learning was performed, contain noise. YOLOv4, YOLOv5 and Faster-RCNN-Inception-v2 experiments were carried out on the first dataset.
The purpose of the experiments on the second dataset is to perform transfer learning, which shows how much the best result from the first experiments affects learning. The third dataset consists of images from different satellites with different sensing parameters. In deep learning studies, a dataset with a wide variety of examples of the target object significantly improves accuracy. Because collecting such data is laborious in remote sensing, data augmentation techniques such as mosaicking, cropping and rotating images are used, which synthetically increases the capacity of a limited dataset. A wide variety of data augmentation methods were applied in the experiments to increase the capacity of the dataset, chiefly mosaicking, cutting and mixing. In some experiments, more than one augmentation method was applied at the same time with the aim of increasing accuracy. Not all augmentation methods affected the experimental results in the same way, so the question of which augmentation technique is most effective has no single answer. The most successful augmentation method for YOLOv4 was mosaicking, while for FRCNN it was random horizontal and vertical flipping. The experiments were compared using the mean average precision (mAP) metric, and the YOLOv5 model gave the best result. A result common to all experiments is that increasing the size of the input images, which the author describes as expanding the depth of the network, raised the mean average precision but also lengthened the experiments. Transferring pre-trained weights between models through transfer learning also increased the success rate of the experiments. To evaluate this, the models were first trained on the small dataset using pre-trained COCO weights, and these results were then compared with training the small dataset starting from the weights obtained by training on the large dataset.
Computers with powerful graphics cards make deep learning experiments easier. In this thesis, the FRCNN and YOLOv4 experiments were run on computers with an NVIDIA GeForce RTX 2080Ti graphics card and an Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz processor, while the YOLOv5 experiment was run on the Colab platform offered free of charge by Google. The developers of the YOLOv5 model recommend training for between 300 and 1000 epochs. However, because the HRPlanesv2 dataset used in this thesis requires a large amount of memory, usage limits were encountered, so the YOLOv5 experiments were carried out with transfer learning as six separate experiments of 50 epochs each.

Abstract (Translation)

Artificial intelligence systems, which have entered all fields of life with Industry 4.0, can solve increasingly complex problems and have dramatically raised the human standard of living. In recent years, the increasing spatial and temporal resolution of earth observation satellites has made ground objects significantly easier to detect. Object detection has been a central issue since the disciplines of remote sensing and photogrammetry first emerged. Done manually in the early days, object detection is now performed with convolutional neural networks, a kind of deep learning model, because traditional learning methods are of limited use for object detection in modern remote sensing images. Traditional detection methods often rely on a single simple feature that is effective for small images with simple backgrounds. Remote sensing images today are more complex, so traditional methods face many difficulties when processing large remote sensing images. With convolutional neural networks, it is possible to minimize gross mistakes made by analysts and to save time, labor and cost. Creating a successful artificial neural network requires training on many examples of the target object or objects. The resulting convolutional neural networks can detect more than one object at the same time, perform change analysis and make predictions for the future. As an input image passes through a multi-layer convolutional neural network, each layer applies its weights in turn until the last layer produces a prediction; this forward pass is called feedforward. An error is then calculated from the difference between the predicted value and the ground truth value. To reduce this error, the network propagates it backward and updates the weights using the derivative of the error with respect to each weight. One full pass over the training data is called an epoch.
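
The feedforward, error and weight-update cycle described above can be illustrated on a deliberately tiny model. The sketch below is not taken from the thesis: it assumes a single linear neuron instead of a convolutional network, a squared-error loss, and hypothetical names (`train_linear_neuron`, `learning_rate`, `samples`).

```python
def train_linear_neuron(samples, learning_rate=0.1, epochs=500):
    """Fit y = w * x + b with per-sample gradient descent.

    A toy stand-in for a network (one weight, one bias); names are illustrative.
    """
    w, b = 0.0, 0.0
    for _ in range(epochs):              # one epoch = one full pass over the samples
        for x, y_true in samples:
            y_pred = w * x + b           # feedforward: compute the prediction
            error = y_pred - y_true      # prediction minus ground truth
            # Backward step: derivatives of the loss 0.5 * error**2
            # with respect to w and b.
            grad_w = error * x
            grad_b = error
            w -= learning_rate * grad_w  # move each parameter against its gradient
            b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1 (assumed for illustration only).
samples = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]
w, b = train_linear_neuron(samples)
print(f"w={w:.3f}, b={b:.3f}")
```

On this toy data the parameters should approach w = 2 and b = 1. A real convolutional network repeats the same cycle over many layers and many weights.
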
Many military and civilian aircraft fly at different altitudes in national and international airspace. High-resolution satellite images are very useful for aircraft detection, as they provide a wide field of view and altitude monitoring. Aircraft detection from satellite images is essential for tasks such as monitoring airspace, controlling air traffic and defense applications. Creating a dataset for object detection from satellite images is a time-consuming and costly process, and the open-source datasets currently available are limited. Google Earth is an important opportunity for researchers as a source of free high-resolution satellite imagery. In this thesis, the HRPlanesv2 dataset, which consists of Google Earth images, was used. The aim is to make the dataset available to users by publishing it after the thesis and to help meet this need in the literature. In the literature, models that generate candidate regions have been used for aircraft detection from satellite images. Although these models achieve highly accurate results, they are slow at detection time. YOLO models are very useful in deep learning, especially for real-time object detection, as they detect objects by dividing images into grids rather than generating candidate regions. For this reason, training in this thesis was carried out with YOLO and Faster RCNN, two of the most successful deep learning models for object detection. The models used were previously trained on the open-source COCO dataset. Three separate datasets were created for the purposes of the experiments. The planes in the first dataset have a proportion of 80% or more, and the planes in the second dataset have a proportion of 50% or more. The images in all datasets were acquired in good weather conditions. A few of the images in the second dataset, on which transfer learning is performed, contain noise.
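
The grid-based idea attributed to YOLO above can be illustrated with a small sketch. It assumes the simplest form of the scheme, in which the grid cell containing the center of an object is the one responsible for it; actual YOLOv4 and YOLOv5 implementations add anchors and several grid scales, which are not shown. The function name, box format and image size are hypothetical.

```python
def grid_cell_for_box(box, image_width, image_height, grid_size):
    """Return the (row, col) of the grid cell containing the center of a box.

    box is (xmin, ymin, xmax, ymax) in pixels; the image is divided into
    grid_size x grid_size equally sized cells.
    """
    xmin, ymin, xmax, ymax = box
    center_x = (xmin + xmax) / 2.0
    center_y = (ymin + ymax) / 2.0
    col = int(center_x / image_width * grid_size)
    row = int(center_y / image_height * grid_size)
    # Clamp so a center on the right or bottom edge still maps to a valid cell.
    col = min(max(col, 0), grid_size - 1)
    row = min(max(row, 0), grid_size - 1)
    return row, col


# Hypothetical aircraft box in a 640 x 640 image split into a 20 x 20 grid.
aircraft_box = (100, 200, 180, 260)
cell = grid_cell_for_box(aircraft_box, image_width=640, image_height=640, grid_size=20)
print(cell)  # (7, 4): the box center (140, 230) falls in row 7, column 4
```

Because every cell is evaluated in a single pass over the image, no separate candidate-region stage is needed, which is the source of the speed advantage the abstract mentions.
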
YOLOv4, YOLOv5 and Faster-RCNN-Inception-v2 experiments were performed on the first dataset. The purpose of the experiments with the second dataset is to apply transfer learning, showing how much the best result obtained from the first experiments affects learning. The third dataset consists of images from different satellites with different sensing parameters. In deep learning studies, a dataset containing varied examples of the target object significantly increases the accuracy of the results. Because assembling such a dataset is laborious in remote sensing, data augmentation techniques such as mosaicking, cropping and rotating images are used, which synthetically increases the capacity of a limited dataset. Various data augmentation methods were applied in the experiments for this purpose, chiefly mosaic, cutmix and cutout. In some experiments, more than one augmentation method was applied at the same time with the aim of increasing accuracy. Not all data augmentation methods affected the experimental results in the same way, so there is no single answer to the question of which data augmentation technique is most effective. The most successful augmentation method for YOLOv4 was mosaic, while for FRCNN it was random horizontal and vertical flipping. The experiments were compared using the mean average precision (mAP) metric, and the YOLOv5 model achieved the best results. A result common to all experiments is that increasing the size of the input images, which the author describes as increasing the depth of the network, raised the mean average precision but also lengthened the experiments. Transferring pre-trained weights between models through transfer learning increased the success rate of the experiments.
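
For readers unfamiliar with the mAP metric used in the comparison above, the following is a minimal sketch of average precision for a single class on a single image. It is not the evaluation code used in the thesis: the IoU threshold of 0.5, the greedy matching rule, the all-point interpolation and all names and sample boxes are assumptions made for illustration.

```python
def iou(a, b):
    """Intersection over union of two (xmin, ymin, xmax, ymax) boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truth, iou_threshold=0.5):
    """detections: list of (score, box); ground_truth: list of boxes."""
    if not ground_truth:
        return 0.0
    matched = set()
    hits = []
    # Visit detections from most to least confident.
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt_box in enumerate(ground_truth):
            if idx in matched:
                continue  # each ground-truth box can be matched only once
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(True)   # true positive
        else:
            hits.append(False)  # false positive
    # Precision and recall after each detection.
    precisions, recalls, true_positives = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += 1 if hit else 0
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))
    # Make precision non-increasing from right to left (interpolation).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Area under the interpolated precision-recall curve.
    ap, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


ground_truth = [(0, 0, 10, 10), (20, 20, 30, 30)]
detections = [
    (0.9, (0, 0, 10, 10)),    # correct
    (0.8, (50, 50, 60, 60)),  # false alarm
    (0.7, (20, 20, 30, 30)),  # correct
]
ap = average_precision(detections, ground_truth)
print(round(ap, 4))  # 0.8333
```

mAP is the mean of this quantity over all classes; with a single class such as aircraft, mAP and AP coincide. Benchmark implementations also pool detections over all images and may average over several IoU thresholds.
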
To evaluate these results, the models were first trained on the second dataset using pre-trained COCO weights, and the results were then compared with those from training the small dataset starting from the weights of the large-dataset training. Computers with powerful graphics cards facilitate experiments in deep learning studies. In this thesis, the FRCNN and YOLOv4 experiments were carried out on computers with an NVIDIA GeForce RTX 2080Ti graphics card and an Intel(R) Core(TM) i9-9900K CPU @ 3.60GHz processor, while the YOLOv5 experiment was conducted on the Colab platform, which Google offers free of charge. For the YOLOv5 model, the developers recommend training for between 300 and 1000 epochs. However, because the HRPlanesv2 dataset used in this thesis requires a large amount of memory, usage limits were encountered, so the YOLOv5 experiments were carried out with transfer learning as six separate experiments of 50 epochs each.
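
The staged YOLOv5 schedule described above can be sketched as a loop. The abstract does not say how the six runs were linked; the sketch assumes that each 50-epoch run starts from the weights saved by the previous one. `run_in_stages` and `fake_train_stage` are hypothetical names, and the placeholder trainer only counts epochs instead of calling a real framework.

```python
def run_in_stages(train_stage, initial_weights, stages=6, epochs_per_stage=50):
    """Run several short training stages, passing weights from one to the next.

    train_stage(weights, epochs) must return the weights after training.
    """
    weights = initial_weights
    history = []
    for stage in range(1, stages + 1):
        # Each stage starts from the weights produced by the previous stage.
        weights = train_stage(weights, epochs_per_stage)
        history.append((stage, stage * epochs_per_stage))
    return weights, history


def fake_train_stage(weights, epochs):
    """Placeholder for a real training call; it only counts epochs."""
    return {"epochs_seen": weights["epochs_seen"] + epochs}


final_weights, history = run_in_stages(fake_train_stage, {"epochs_seen": 0})
print(final_weights["epochs_seen"], history[-1])  # 300 (6, 300)
```

Six stages of 50 epochs reach the 300-epoch lower bound quoted from the developers. Restarting a run normally resets the optimizer state and learning-rate schedule, so the outcome need not match a single uninterrupted 300-epoch run.
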

Similar Theses

Thesis No: 665780
Title: Building detection from very high resolution satellite images with deep learning approach
Title (translation): Derin öğrenme yaklaşımı ile çok yüksek çözünürlüklü uydu görüntülerinde bina tespiti
Author: ESRA ÖZAYDIN
Thesis Type: Master's
Language: English
Year: 2021
Subject: Geodesy and Photogrammetry
University: İstanbul Teknik Üniversitesi
Department: Geomatik Mühendisliği Ana Bilim Dalı
Advisor: PROF. DR. ELİF SERTEL

Thesis No: 768837
Title: Deep learning-based building segmentation using high-resolution aerial images
Title (translation): Yüksek çözünürlüklü hava görüntüleri kullanarak derin öğrenme temelli bina bölütlemesi
Author: BATUHAN SARITÜRK
Thesis Type: Doctorate
Language: English
Year: 2022
Subject: Geodesy and Photogrammetry
University: İstanbul Teknik Üniversitesi
Department: Geomatik Mühendisliği Ana Bilim Dalı
Advisor: PROF. DR. DURSUN ZAFER ŞEKER

Thesis No: 737828
Title: Vessel detection from very high-resolution satellite images with deep learning methods
Title (translation): Derin öğrenme metotları kullanılarak çok yüksek çözünürlüklü uydu görüntülerinden gemi tespiti
Author: FURKAN BÜYÜKKANBER
Thesis Type: Master's
Language: English
Year: 2022
Subject: Science and Technology
University: İstanbul Teknik Üniversitesi
Department: İletişim Sistemleri Ana Bilim Dalı
Advisor: PROF. DR. MUSTAFA YANALAK

Thesis No: 753612
Title: Single-frame and multi-frame super-resolution on remote sensing images via deep learning approaches
Title (translation): Derin öğrenme yaklaşımlarıyla uzaktan algılama görüntülerinde tek çerçeve ve çok çerçeve süper çözünürlük
Author: PEIJUAN WANG
Thesis Type: Doctorate
Language: English
Year: 2022
Subject: Communication Sciences
University: İstanbul Teknik Üniversitesi
Department: İletişim Sistemleri Ana Bilim Dalı
Advisor: PROF. DR. ELİF SERTEL

Thesis No: 768797
Title: Çok yüksek çözünürlüklü hava fotoğraflarından derin öğrenme yöntemi ile bina bölütlemesi
Title (translation): Building segmentation from very high resolution aerial imagery using deep learning
Author: NURAN ASLANTAŞ
Thesis Type: Master's
Language: Turkish
Year: 2022
Subject: Science and Technology
University: Yıldız Teknik Üniversitesi
Department: Harita Mühendisliği Ana Bilim Dalı
Advisor: PROF. DR. BÜLENT BAYRAM
