Investigation of artificial intelligence-based point cloud semantic segmentation

Yapay zeka tabanlı nokta bulutu semantik bölümlendirmesinin incelenmesi

Thesis No: 775403
Author: MUHAMMED ENES ATİK
Advisor: PROF. DR. ZAİDE DURAN
Thesis Type: Doctorate
Subjects: Geodesy and Photogrammetry
Keywords: Not specified.
Year: 2022
Language: English
University: İstanbul Teknik Üniversitesi
Institute: Graduate School
Department: Department of Geomatics Engineering
Program: Geomatics Engineering
Page Count: 156

Abstract

With the growing range of applications for three-dimensional (3D) point clouds, extracting information from 3D data has become an important field of study in photogrammetry, remote sensing, computer vision, and robotics. The geometric information contained in point clouds is valuable for carrying out many applications successfully. Point clouds can be acquired with 3D scanners, Light Detection and Ranging (LiDAR), Structure from Motion (SfM), photogrammetry, and RGB-D cameras. Among these technologies, the use of LiDAR, which can sense from airborne, terrestrial, and mobile platforms, is expanding day by day. Mobile LiDAR point clouds are particularly useful for mapping and autonomous vehicles. They are acquired with laser scanners mounted on a moving vehicle. Accurate perception of the surroundings, mapping, and precise positioning are the main requirements for autonomous driving. Mobile LiDAR point clouds are an information-rich data source for these tasks. Point cloud semantic segmentation has become an important research topic over the last decade. With advances in artificial intelligence techniques, semantic segmentation of point clouds is being applied in many fields. Although many methods and data sets have been published and research continues rapidly, more research is still needed. Deep learning techniques make it possible to semantically segment large and complex point clouds successfully. Semantic segmentation also has significant potential for helping autonomous driving systems perceive and map their environment. This thesis presents three articles that examine the use of artificial intelligence techniques in the semantic segmentation of point clouds. A new deep learning-based semantic segmentation approach is proposed.
In addition, approaches for improving the performance of existing machine learning and deep learning techniques are presented. The first article investigates the semantic segmentation performance of eight machine learning approaches on point clouds acquired with airborne and mobile LiDAR sensors. The feature vector of each point is built from geometric features that describe the geometric relationships within a local neighborhood of the point. Generating additional information beyond the point coordinates alone improves the semantic segmentation performance of the algorithms. The neighborhood of a point is defined by a sphere centered on that point. The study examines how the semantic segmentation accuracy of the machine learning algorithms changes as the radius of this sphere is varied. Determining the most suitable radius increases the discriminative power of the geometric features and thus the accuracy of the algorithms. The results were compared with those of recent methods that use the same data sets. The second article presents a new projection-based deep learning approach for point cloud semantic segmentation. First, the point clouds are converted into 2D images. These images are created by projecting the irregular structure of the point cloud onto a 2D plane using spherical projection. Like an image sequence, mobile LiDAR point clouds consist of consecutive frames. To ensure safe autonomous driving, these data must be evaluated quickly and accurately. Once converted, the point clouds can be processed like 2D images. U-Net and SegNet are widely used image segmentation methods. The proposed method (SegUnet3D) combines the two.
The input data pass through two branches, U-Net and SegNet, and in the final stage the computed weights are summed to produce the final predictions. Geometric features were computed to describe the points. Each geometric feature was added to the 2D images as an image band, producing multi-band images that represent the point cloud. Using the geometric features improved the semantic segmentation performance of the method. The SemanticPOSS and RELLIS-3D data sets were used to apply the proposed method. SemanticPOSS covers a dense urban area, whereas RELLIS-3D covers a rural area, so the performance of the method on different terrain types was also examined. In addition, to determine the optimum parameters, the experiments were repeated while varying the input image size and the minimum number of points required to compute the geometric features. The proposed method was compared with recent methods in the literature. It improved the mIoU metric by up to 15.9% on the SemanticPOSS data set and by up to 5.4% on the RELLIS-3D data set. The third article examines the effect of feature selection algorithms on the point cloud semantic segmentation performance of deep learning networks. The filter-based information gain (IG), Chi-square (Chi2), and ReliefF algorithms were used to select the relevant features. Because filter-based methods do not depend on a classifier, they produce more consistent results when determining the optimum features. RandLA-Net and Superpoint Graph (SPG), which operate directly on points, were chosen as the deep learning networks. Both can process geometric features as input data. The experiments were carried out on three popular mobile LiDAR point cloud data sets: Toronto3D, SZTAKI-CityMLS, and Paris-CARLA-3D. Using three data sets is important for generalizing the hypothesis of the article.
Toronto3D and Paris-CARLA-3D contain color information for each point. Considering the 3D coordinates (x, y, z), the color information (red, green, blue), and the selected geometric features, ten feature combinations were created for these two data sets. Because SZTAKI-CityMLS does not contain color information, five feature combinations were created for it. The results show that using the feature subsets determined by feature selection yields higher semantic segmentation accuracy than using all features. Similar results were obtained on all data sets. Color information also significantly increases semantic segmentation accuracy. In particular, without color information it is not possible to distinguish geometrically similar classes such as road and road markings. According to the feature importance scores, the most important feature is the height difference within a point's neighborhood. The feature importance ranking is consistent with the results of the first article. This study concludes that the success of point cloud semantic segmentation depends on the features selected. In summary, this thesis examines the effect of using geometric features in point cloud semantic segmentation (PCSS) applications based on artificial intelligence approaches. The points fed to the architecture are described with geometric features in order to improve the PCSS performance of machine learning and deep learning algorithms. Analyses were carried out to determine how a point can be described most accurately within its surface neighborhood. Mobile LiDAR point clouds, an important data source for autonomous driving, are the focus of the research.
The performance analyses and the proposed methods are presented so that they can be reproduced and applied in other studies on point cloud semantic segmentation.

Abstract (Translation)

As the range of applications for 3D point clouds grows, information extraction from 3D data has become an important field of study in photogrammetry, remote sensing, computer vision, and robotics. The geometric information contained in point clouds is valuable for the successful implementation of many applications. Point clouds can be obtained with 3D scanners, Light Detection and Ranging (LiDAR), Structure from Motion (SfM), photogrammetry, and RGB-D cameras. Among these technologies, the use of LiDAR, which can sense from airborne, terrestrial, and mobile platforms, is expanding day by day. Mobile LiDAR point clouds are especially useful for mapping and autonomous vehicles. They are obtained with laser scanners mounted on a moving vehicle. Accurate perception of the surroundings, mapping, and precise positioning are essential requirements for autonomous driving, and mobile LiDAR point clouds are an information-rich data source for these tasks. Point cloud semantic segmentation has become an important research topic in the last decade. With the development of artificial intelligence techniques, semantic segmentation of point clouds has been applied in many areas. Although many methods and data sets have been shared in the literature and research continues rapidly, more research is needed. Deep learning techniques enable successful semantic segmentation of large and complex point clouds. Semantic segmentation also has significant potential for helping autonomous driving systems perceive and map the environment. This thesis presents three articles examining the use of artificial intelligence techniques in the semantic segmentation of point clouds. A new deep learning-based semantic segmentation approach is proposed. In addition, approaches to improving the performance of existing machine learning and deep learning techniques are presented.
In the first article, the semantic segmentation performance of eight machine learning approaches was investigated using point clouds acquired with airborne and mobile LiDAR sensors. The feature vector of each point is built from geometric features that describe the geometric relationships in a local neighborhood of the point. The 3D coordinates alone are not sufficient for semantic segmentation, so additional information must be generated. The neighborhood of a point is defined by a sphere centered on the point. The study examines how the semantic segmentation accuracy of the machine learning algorithms changes with the radius of this sphere. Determining the most suitable radius increases the distinctiveness of the geometric features and thus the accuracy of the algorithms. The results were compared with those of current methods using the same data sets. In the second article, a new projection-based deep learning approach for point cloud semantic segmentation is presented. First, point clouds are converted into 2D images. These images are created by projecting the irregular structure of the point cloud onto a 2D plane using spherical projection. Like an image sequence, mobile LiDAR point clouds consist of consecutive frames. These data need to be evaluated quickly and accurately to ensure safe autonomous driving. Once converted, point clouds can be treated as 2D images. U-Net and SegNet are commonly used image segmentation methods. The proposed method (SegUnet3D) was created by combining these two methods. Input data proceed through two branches, U-Net and SegNet, and the final predictions are produced by summing the calculated weights in the final stage. Geometric features were calculated to describe the points. Each geometric feature is attached to the 2D images as an image band.
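The spherical projection step described above can be sketched as follows. This is a minimal illustration, not the SegUnet3D implementation: the function name `spherical_projection`, the 64 x 1024 image size, and the vertical field of view (+3 to -25 degrees) are assumptions chosen for the example, and keeping the nearest point per pixel is one common convention that the abstract does not specify. Each per-point feature becomes one extra band next to the range band.

```python
import numpy as np


def spherical_projection(points, features, height=64, width=1024,
                         fov_up_deg=3.0, fov_down_deg=-25.0):
    """Project an (N, 3) point cloud onto a (height, width, 1 + C) image.

    Band 0 holds the range of the nearest point in each pixel; the
    remaining C bands hold that point's per-point features.
    All default parameters are illustrative assumptions.
    """
    points = np.asarray(points, dtype=np.float64)
    features = np.asarray(features, dtype=np.float64).reshape(len(points), -1)

    depth = np.linalg.norm(points, axis=1)
    valid = depth > 0                      # drop points at the sensor origin
    points, features, depth = points[valid], features[valid], depth[valid]

    yaw = np.arctan2(points[:, 1], points[:, 0])   # azimuth in [-pi, pi]
    pitch = np.arcsin(points[:, 2] / depth)        # elevation angle

    fov_up, fov_down = np.radians(fov_up_deg), np.radians(fov_down_deg)
    fov = fov_up - fov_down

    # Normalised image coordinates, then integer pixel indices.
    u = 0.5 * (1.0 - yaw / np.pi) * width
    v = (1.0 - (pitch - fov_down) / fov) * height
    u = np.clip(np.floor(u), 0, width - 1).astype(np.int64)
    v = np.clip(np.floor(v), 0, height - 1).astype(np.int64)

    # Keep only the nearest point per pixel: sort by pixel, then by range,
    # and take the first entry of each pixel group.
    flat = v * width + u
    order = np.lexsort((depth, flat))
    _, first = np.unique(flat[order], return_index=True)
    keep = order[first]

    image = np.zeros((height, width, 1 + features.shape[1]), dtype=np.float32)
    image[v[keep], u[keep], 0] = depth[keep]
    image[v[keep], u[keep], 1:] = features[keep]
    return image
```

Calling `spherical_projection(points, features)` with an `(N, 3)` coordinate array and an `(N, C)` feature array returns an image with `1 + C` bands that a 2D segmentation network could consume. Pixels that receive no point stay at zero.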
Thus, multi-band images representing the point cloud were created. The use of geometric features improved the semantic segmentation performance of the method. The SemanticPOSS and RELLIS-3D data sets were used to implement the proposed method. SemanticPOSS covers a dense urban area, and RELLIS-3D covers a rural area. Thus, the performance of the proposed method on different terrain types was also examined. In addition, the experiments were repeated to determine the optimum parameters by changing the input image size and the minimum number of points required to calculate the geometric features. The proposed method was compared with the current methods in the literature. The proposed method improved the mIoU metric by up to 15.9% on the SemanticPOSS data set and by up to 5.4% on the RELLIS-3D data set. The third article examines the effect of feature selection algorithms on the point cloud semantic segmentation performance of deep learning networks. Filter-based information gain (IG), Chi-square (Chi2), and ReliefF algorithms were used to select the relevant features. Because filter-based methods do not depend on a classifier, they produce more consistent results in determining the optimum features. RandLA-Net and Superpoint Graph (SPG), which operate directly on points, were chosen as the deep learning networks. Both methods can process geometric features as input data. Experiments were performed on three popular mobile LiDAR point cloud data sets: Toronto3D, SZTAKI-CityMLS, and Paris-CARLA-3D. The use of three data sets is important for generalizing the hypothesis of the article. Toronto3D and Paris-CARLA-3D contain color information for each point. Considering the 3D coordinates (x, y, z), color information (red, green, blue), and selected geometric features, ten feature combinations were created for these two data sets. Because SZTAKI-CityMLS does not contain color information, five feature combinations were created for it.
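To make the idea of filter-based feature selection concrete, the sketch below ranks features by information gain without training any classifier. It is an illustration under stated assumptions rather than the procedure used in the thesis: the helper names (`entropy`, `information_gain`, `rank_features`) are hypothetical, continuous features are discretised with equal-width bins (10 by default), and only IG is shown, not Chi2 or ReliefF.

```python
import numpy as np


def entropy(labels):
    """Shannon entropy (in bits) of a 1D array of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())


def information_gain(feature, labels, bins=10):
    """IG = H(labels) - H(labels | binned feature).

    Equal-width binning is an assumption made for this example.
    """
    feature = np.asarray(feature, dtype=np.float64)
    labels = np.asarray(labels)
    edges = np.histogram_bin_edges(feature, bins=bins)
    binned = np.digitize(feature, edges[1:-1])   # bin index 0 .. bins - 1

    conditional = 0.0
    for b in np.unique(binned):
        mask = binned == b
        # Weight each bin's label entropy by the share of points in the bin.
        conditional += mask.mean() * entropy(labels[mask])
    return entropy(labels) - conditional


def rank_features(X, y, names, bins=10):
    """Return (name, score) pairs sorted from most to least informative."""
    X = np.asarray(X, dtype=np.float64)
    scores = [(name, information_gain(X[:, j], y, bins))
              for j, name in enumerate(names)]
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

Because the score depends only on the data and the labels, the same ranking can be reused for any downstream network. A subset would then be formed by keeping the top-ranked features.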
As a result, cases where feature subsets determined by feature selection are used have higher semantic segmentation accuracy than cases where all features are used. Similar results were obtained from all data sets. Color information also significantly increases the accuracy of semantic segmentation. In particular, without color information it is not possible to distinguish geometrically similar classes such as road and road marking. According to the feature importance scores, the most important feature is the height difference within a point's neighborhood. The feature importance ranking is consistent with the results of the first article. This study concluded that the success of point cloud semantic segmentation depends on the features selected. In summary, this thesis examines the effect of using geometric features in point cloud semantic segmentation (PCSS) applications with artificial intelligence approaches. Each point of the point cloud is described using geometric features to improve the PCSS performance of machine learning and deep learning algorithms. Analyses were carried out to identify how a point can be described most accurately within its surface neighborhood. Mobile LiDAR point clouds, an important data source for autonomous driving, are the focus of the research. A fast and efficient projection-based deep learning network has been developed for point cloud semantic segmentation for autonomous driving. The performance analyses and proposed methods are presented in a reproducible form that can be applied in other studies of point cloud semantic segmentation.
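The per-point geometric features discussed in the abstract can be illustrated with a short sketch. Only the spherical neighborhood and the height difference come from the abstract. The eigenvalue-based linearity, planarity, and sphericity used here are a common choice in the literature and an assumption for this example, as are the function name `geometric_features`, the default radius, and the brute-force neighbor search (a spatial index would normally be used for large clouds).

```python
import numpy as np


def geometric_features(points, radius=1.0, min_points=3):
    """Compute per-point features from a spherical neighborhood.

    Returns an (N, 4) array with columns:
    linearity, planarity, sphericity, height difference.
    Points with fewer than `min_points` neighbors keep all-zero features.
    """
    points = np.asarray(points, dtype=np.float64)
    feats = np.zeros((len(points), 4))

    for i, p in enumerate(points):
        # Brute-force radius search, O(N) per point (illustration only).
        mask = np.linalg.norm(points - p, axis=1) <= radius
        nbrs = points[mask]
        if len(nbrs) < min_points:
            continue

        # Height difference: vertical extent of the neighborhood.
        feats[i, 3] = nbrs[:, 2].max() - nbrs[:, 2].min()

        # Eigenvalues of the 3x3 covariance matrix, sorted l1 >= l2 >= l3.
        evals = np.linalg.eigvalsh(np.cov(nbrs.T))[::-1]
        l1, l2, l3 = np.clip(evals, 0.0, None)
        if l1 > 0:
            feats[i, 0] = (l1 - l2) / l1   # linearity
            feats[i, 1] = (l2 - l3) / l1   # planarity
            feats[i, 2] = l3 / l1          # sphericity

    return feats
```

Re-running the function with different `radius` values is one way to reproduce the kind of radius sensitivity analysis the first article describes: a radius that is too small gives unstable eigenvalues, while one that is too large mixes several objects into the same neighborhood.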

Similar Theses

Thesis No: 837034
Title: Tarihi yarımada'da turizm amaçlı, mekansal tabanlı sanal gerçeklik olanaklarının araştırılması
Title (English): Research on spatial-based virtual reality opportunities for tourism purposes in the historical peninsula
Author: SANÇAR BUHUR
Thesis Type: Doctorate
Language: Turkish
Year: 2023
Subject: Geodesy and Photogrammetry
University: İstanbul Teknik Üniversitesi
Department: Department of Geomatics Engineering
Advisor: PROF. DR. NEBİYE MUSAOĞLU

Thesis No: 770344
Title: Türkiye kentsel dönüşüm uygulamaları ve yapay zekâ tabanlı algoritmalar kullanarak kentsel dönüşüm sürecinin incelenmesi
Title (English): Investigation of the urban transformation processes using Turkish urban transformation applications and artificial intelligence-based algorithms
Author: HACI ABDULLAH UÇAN
Thesis Type: Doctorate
Language: Turkish
Year: 2022
Subject: Civil Engineering
University: Karadeniz Teknik Üniversitesi
Department: Department of Civil Engineering
Advisor: PROF. DR. TAYFUN DEDE

Thesis No: 747651
Title: Elektrikli araç şarj istasyonlarının akıllı şebekelerle entegrasyonunun sağlanması ve bu istasyonların şebekeye getireceği yükün incelenmesi: Bursa örneği
Title (English): Ensuring the integration of electric vehicle charging stations with smart grids and investigation of the load that these stations bring to the network: Bursa
Author: ENES AVCİ
Thesis Type: Master's
Language: Turkish
Year: 2022
Subject: Electrical and Electronics Engineering
University: Bursa Teknik Üniversitesi
Department: Department of Electrical-Electronics Engineering
Advisor: PROF. DR. MUSA AYDIN

Thesis No: 55993
Title: Bir robot koluna kumanda eden doğal dil anlama sistemi
Title (English): No title translation available
Author: HASAN FERİT KEÇECİ
Thesis Type: Master's
Language: Turkish
Year: 1996
Subject: Computer Engineering and Computer Science and Control
University: İstanbul Teknik Üniversitesi
Advisor: PROF.DR. EŞREF ADALI

Thesis No: 650041
Title: Artificial intelligence-based maximum power point tracking controller for pv modules under partial shading conditions
Title (Turkish): Kısmi gölgelenme koşullarındaki pv paneller için yapay zeka tabanlı maksimum güç noktası izleme denetleyicisi
Author: FUAD ALHAJOMAR
Thesis Type: Doctorate
Language: English
Year: 2020
Subject: Electrical and Electronics Engineering
University: Selçuk Üniversitesi
Department: Department of Electrical-Electronics Engineering
Advisor: PROF. DR. AHMET AFŞİN KULAKSIZ
