Fotogrametri ve lıdar tekniği ile üretilen nokta bulutlarının makine öğrenmesi ile sınıflandırılması ve analizi

Classification and analysis of point clouds generated by photogrammetry and lidar technique with machine learning


Thesis No: 706906
Author: KÜBRA ÖZCAN
Advisor: Assoc. Prof. Dr. ZAİDE DURAN
Thesis Type: Master's
Subjects: Geodesy and Photogrammetry
Keywords: Not specified.
Year: 2021
Language: Turkish
University: İstanbul Teknik Üniversitesi
Institute: Graduate School (Lisansüstü Eğitim Enstitüsü)
Department: Geomatics Engineering
Division: Geomatics Engineering
Number of Pages: 93

Abstract

Obtaining data is a critical concern today, and many methods exist for doing so. Beyond the variety of data generation methods, the process of extracting information from these data and evaluating it remains an active research topic. Machine learning algorithms are frequently used in this process. In this thesis, aerial photogrammetry and LIDAR scanning, two of the first methods that come to mind for data generation, were used. Two different point clouds were obtained from flights and scans over the same area of the Istanbul Technical University Ayazağa Campus. Using the CloudCompare software, these point clouds were separated into four classes (structure, dense wooded area, ground surface and sparse vegetation) to build the data sets. The geometric features of the data sets were then computed: 22 features were used for the LIDAR data set, while the aerial photogrammetry data set used the same 22 features plus color information (RGB), for a total of 25 features. Machine learning data preprocessing steps were applied to these data sets in Python. The data sets were classified with nine supervised machine learning algorithms, and the algorithms were compared in terms of accuracy, precision, recall, F1 score and the time each algorithm spent on classification. The algorithms compared were random forest (RF), decision tree (DCT), k-nearest neighbors (KNN), multi-layer perceptron (MLP), logistic regression (LR), Gaussian naive Bayes (GNB), linear discriminant analysis (LDA), AdaBoost classifier (ADB) and support vector machine (SVM).
For the LIDAR point cloud, the algorithm with the highest accuracy was MLP, at 0.931. Its precision, recall and F1 score were 0.931, 0.931 and 0.931, respectively. RF ranked second with an accuracy of 0.916, and DCT third with an accuracy of 0.896. The lowest-performing algorithm was GNB, with accuracy, precision, recall and F1 score of 0.481, 0.641, 0.485 and 0.552, respectively. As with the LIDAR point cloud, the best-performing algorithm on the aerial photogrammetry point cloud was MLP, with an accuracy of 0.996. Its precision, recall and F1 score were 0.996, 0.996 and 0.996, respectively. RF ranked second with an accuracy of 0.995, followed by KNN in third place. GNB ranked ninth with an accuracy of 0.775; its precision, recall and F1 score were 0.807, 0.774 and 0.790, respectively. The algorithms were also evaluated in terms of classification time. KNN produced results fastest, while SVM took the longest.

Abstract (Translation)

Obtaining data is an important problem today, and there are various methods for doing so, such as remote sensing and photogrammetry. Just as important as obtaining data, however, is the process of extracting information from it, and machine learning classification algorithms are commonly used for this purpose. This thesis used two different data sets covering the same region: one obtained by photogrammetric methods and one obtained by LIDAR scanning. In aerial photogrammetry, the data set is produced by flying with metric cameras and processing the resulting images, whereas LIDAR scanning uses laser pulses to measure the distance to the surface. The two data sets were obtained by creating two point clouds of the same area of the Istanbul Technical University Ayazağa Campus, one by flight and one by scanning. These point clouds were divided into four classes using the CloudCompare program: structure, forest, ground and low vegetation. The separated classes were then labeled using the Python programming language. In addition, to extract meaningful information from these data sets, features were created by calculating their geometric properties. In its most basic definition, geometric feature calculation derives meaningful information from the positions of the points in a data set. The geometric features obtained from the point clouds formed the feature spaces used for classification, and color information was added to them for the photogrammetric point cloud. As a result, 22 features were used for the LIDAR data set and 25 features for the aerial photogrammetry data set. Data preprocessing steps were then applied to the data sets to prepare them for supervised classification.
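The abstract does not list the 22 geometric features. As an illustration only, a minimal sketch of one common family of such features, eigenvalue-based descriptors computed from the covariance of a point's neighborhood, might look like the following. The function name `eigen_features`, the synthetic neighborhoods and the choice of linearity, planarity and sphericity are assumptions made for this sketch, not the method confirmed by the thesis.

```python
import numpy as np

def eigen_features(neighborhood):
    """Return (linearity, planarity, sphericity) for an (n, 3) array of points."""
    pts = np.asarray(neighborhood, dtype=float)
    cov = np.cov(pts, rowvar=False)  # 3x3 covariance of the x, y, z coordinates
    # eigvalsh returns eigenvalues in ascending order; reverse so that l1 >= l2 >= l3
    l1, l2, l3 = np.clip(np.linalg.eigvalsh(cov)[::-1], 0.0, None)
    if l1 == 0.0:  # degenerate neighborhood (all points identical)
        return 0.0, 0.0, 0.0
    return (l1 - l2) / l1, (l2 - l3) / l1, l3 / l1

rng = np.random.default_rng(0)
# Hypothetical neighborhoods: a flat patch (roof- or ground-like) and a thin line (edge-like)
plane = np.column_stack([rng.uniform(0, 1, 200),
                         rng.uniform(0, 1, 200),
                         rng.normal(0, 0.001, 200)])
line = np.column_stack([np.linspace(0, 1, 200),
                        rng.normal(0, 0.001, 200),
                        rng.normal(0, 0.001, 200)])

plane_feats = eigen_features(plane)  # planarity dominates
line_feats = eigen_features(line)    # linearity dominates
```

The three values always sum to 1, so each neighborhood is described by how its spread is distributed among line-like, plane-like and volume-like structure.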
First, data sets may contain missing values because of the way the data were obtained or entered, so missing data analysis was performed as the first step. Next, feature extraction was performed and categorical data were converted into numerical data. Because of hardware limitations, approximately 250,000 points were randomly selected from each class, so each data set consists of approximately 1,000,000 points. The resulting data sets were split into training and test data. The data sets were classified by nine different machine learning algorithms, which were compared in terms of accuracy, precision, recall, F1 score and the time spent on classification. The algorithms compared across the data sets are random forest (RF), decision tree (DCT), k-nearest neighbors (KNN), multi-layer perceptron (MLP), logistic regression (LR), Gaussian naive Bayes (GNB), linear discriminant analysis (LDA), AdaBoost classifier (ADB) and support vector machine (SVM). To describe these algorithms briefly, the first algorithm used is logistic regression. Although its name suggests regression, LR is a classification algorithm. It is preferred in many studies because it is easy, fast and simple. The second algorithm used is LDA, which performs classification using statistical information and a linear combination of the features. The third algorithm used is KNN. KNN is a simple machine learning algorithm and one of the most widely preferred. It has two basic parameters: the value of k and the distance function. The next classification algorithm is DCT. To produce its result, this algorithm builds an inverted tree structure that branches from a heterogeneous root node toward increasingly homogeneous leaf nodes.
The algorithm reaches its result by choosing the most decisive feature. The next algorithm is GNB, which classifies according to a probabilistic distribution and gives successful results on unbalanced data sets. The sixth algorithm used is MLP, a class of feedforward neural network. The ADB algorithm produces its result by summing the values obtained at each iteration over the data set. The next algorithm is RF, which is based on extracting the most important feature from the data set. There is a linear relationship between the number of trees and the result obtained. The last algorithm used is SVM, which determines the decision boundary between classes by maximizing the margin to the points in the data set. For the LIDAR point cloud, the highest overall accuracy, 0.931, was obtained with the multi-layer perceptron (MLP) method. Its precision, recall and F1 score are 0.931, 0.931 and 0.931, respectively. The RF algorithm comes second with an accuracy of 0.916, and the DCT algorithm third with an accuracy of 0.896. The algorithm with the lowest performance is GNB, whose accuracy, precision, recall and F1 score are 0.481, 0.641, 0.485 and 0.552, respectively. As with the LIDAR point cloud, the algorithm with the best performance on the aerial photogrammetry point cloud is MLP, with an accuracy of 0.996. Its precision, recall and F1 score are 0.996, 0.996 and 0.996, respectively. The second is the RF algorithm with an accuracy of 0.995, and the third is the KNN algorithm. The ninth is the GNB algorithm, with an accuracy of 0.775. The precision, recall and F1 score of the GNB algorithm are 0.807, 0.774 and 0.790, respectively.
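As a quick consistency check on the reported GNB figures, the F1 score can be recomputed as the harmonic mean of precision and recall. This is a sketch under an assumption: the abstract does not say how the per-class scores were averaged, and for multi-class averages the F1 score need not equal the harmonic mean of the averaged precision and recall. The helper name `f1_from` is introduced here for illustration.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)

# The two values reported alongside accuracy for GNB in the abstract.
# The harmonic mean is symmetric, so the order of the two arguments does not matter.
lidar_f1 = round(f1_from(0.641, 0.485), 3)
photo_f1 = round(f1_from(0.807, 0.774), 3)
```

Under that assumption, both recomputed values agree with the F1 scores stated for GNB (0.552 for the LIDAR data set and 0.790 for the photogrammetry data set).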
The algorithms were also evaluated in terms of classification time. The KNN algorithm produced its result in the shortest time, while the SVM algorithm took the longest.
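The evaluation procedure described in this abstract (random per-class sampling, a training/test split, classification, then scoring and timing) can be sketched in plain Python as follows. Everything concrete here is an assumption for illustration: the toy 2D clusters stand in for the real feature vectors, the 70/30 split ratio and k = 3 are not stated in the abstract, and the small KNN routine is a stand-in rather than the implementation used in the thesis.

```python
import random
import time
from collections import defaultdict

def balanced_sample(points, labels, per_class, rng):
    """Randomly keep at most `per_class` points from every class, then shuffle."""
    by_class = defaultdict(list)
    for p, y in zip(points, labels):
        by_class[y].append(p)
    sample = []
    for y, pts in by_class.items():
        for p in rng.sample(pts, min(per_class, len(pts))):
            sample.append((p, y))
    rng.shuffle(sample)
    return sample

def knn_predict(train, x, k=3):
    """Majority vote among the k training points closest to x (squared Euclidean distance)."""
    nearest = sorted(train, key=lambda t: sum((a - b) ** 2 for a, b in zip(t[0], x)))[:k]
    votes = defaultdict(int)
    for _, y in nearest:
        votes[y] += 1
    return max(votes, key=votes.get)

def macro_scores(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall and F1."""
    classes = sorted(set(y_true))
    acc = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
    precs, recs, f1s = [], [], []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        precs.append(prec)
        recs.append(rec)
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    n = len(classes)
    return acc, sum(precs) / n, sum(recs) / n, sum(f1s) / n

rng = random.Random(42)
# Toy stand-in data: four well-separated 2D clusters named after the four classes
centers = {"structure": (0, 0), "forest": (10, 0), "ground": (0, 10), "low vegetation": (10, 10)}
points, labels = [], []
for name, (cx, cy) in centers.items():
    for _ in range(200):
        points.append((cx + rng.gauss(0, 1), cy + rng.gauss(0, 1)))
        labels.append(name)

data = balanced_sample(points, labels, per_class=100, rng=rng)
split = int(0.7 * len(data))  # assumed 70/30 split
train, test = data[:split], data[split:]

start = time.perf_counter()
predictions = [knn_predict(train, x) for x, _ in test]
elapsed = time.perf_counter() - start  # classification time in seconds

accuracy, precision, recall, f1 = macro_scores([y for _, y in test], predictions)
```

Swapping `knn_predict` for another classifier while keeping the same sampling, split and scoring code is what makes a comparison across algorithms like the one reported here meaningful.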

Similar Theses

Thesis No: 510451
Title (Turkish): Nokta bulutu verilerinin yerel geoit modellerinin değerlendirilmesinde kullanılması üzerine bir inceleme
Title (English): An investigation on the use of point cloud data in evaluation of local geoid models
Author: EMRAH ÖZÖGEL
Thesis Type: Master's
Language: Turkish
Year: 2018
Subject: Geodesy and Photogrammetry
University: İstanbul Teknik Üniversitesi
Department: Geomatics Engineering
Advisor: Assoc. Prof. Dr. SERDAR EROL

Thesis No: 811608
Title (Turkish): Nokta bulutu verilerinden çıkarılan bina çatı geometrisinin düzgünleştirilmesine yönelik yeni bir algoritma ve arayüz geliştirme
Title (English): A novel algorithm and interface for automatic regularization of building boundary geometry extracted from point cloud data
Author: EMİRHAN ÖZDEMİR
Thesis Type: Doctorate
Language: Turkish
Year: 2023
Subject: Geodesy and Photogrammetry
University: Karadeniz Teknik Üniversitesi
Department: Surveying Engineering (Harita Mühendisliği)
Advisor: Prof. Dr. FEVZİ KARSLI

Thesis No: 909011
Title (Turkish): Mekânsal dijital ikizlere yönelik yapı modeli üretiminde prosedürel modelleme yönteminin tasarımı ve geliştirilmesi
Title (English): Design and development of procedural modeling method in generating structure models for spatial digital twins
Author: GÜÇLÜ ŞENYURDUSEV
Thesis Type: Doctorate
Language: Turkish
Year: 2024
Subject: Science and Technology
University: İstanbul Teknik Üniversitesi
Department: Geomatics Engineering
Advisor: Assoc. Prof. Dr. AHMET ÖZGÜR DOĞRU

Thesis No: 559889
Title (English): Photogrammetry based heritage modeling with shape embedding
Title (Turkish): Tarihi yapıların fotogrametri ve gömülü biçimlerle modellenmesi
Author: DEMİRCAN TAŞ
Thesis Type: Master's
Language: English
Year: 2019
Subject: Architecture
University: İstanbul Teknik Üniversitesi
Department: Informatics
Advisor: Prof. Dr. MİNE ÖZKAR KABAKÇIOĞLU

Thesis No: 567177
Title (Turkish): Geleneksel fotogrametrik yöntemle üretilen haritaların 3 boyutlu konum doğruluğu analizi: Çağlayan/Erzincan örneği
Title (English): 3-dimensional position accuracy analysis of maps produced by conventional photogrammetric method: Çaglayan/Erzincan example
Author: GÖKHAN KARA
Thesis Type: Master's
Language: Turkish
Year: 2019
Subject: Geodesy and Photogrammetry
University: Zonguldak Bülent Ecevit Üniversitesi
Department: Geomatics Engineering
Advisor: Asst. Prof. Dr. HÜSEYİN KEMALDERE
