Large scale landmark recognition using global descriptors and encoded features

Recognition of geographic location and landmark datasets with the help of features


Thesis No: 753644
Author: BERK TAŞKIN
Advisors: PROF. DR. MİNE ELİF KARSLIGİL YAVUZ
Thesis Type: Master's
Subjects: Computer Engineering and Computer Science and Control
Keywords: Not specified.
Year: 2022
Language: English
University: Yıldız Teknik Üniversitesi
Institute: Fen Bilimleri Enstitüsü
Department: Computer Engineering
Division: Computer Engineering
Page Count: 64

Abstract

With the growing importance of computer vision algorithms such as object recognition and semantic segmentation, research in this area has gained popularity in recent years. As a result, the datasets in use have also grown in size. However, larger datasets bring a heavier labeling burden. Although the images used in the great majority of studies are labeled by humans, this is no longer feasible for very large datasets. In this study, an automated system for dataset cleaning that uses outlier detection, clustering, and deep learning based classification techniques was designed and implemented. The Google Landmarks v2 dataset, which forms the basis of the study, contains ~200k classes and 5M image samples. This dataset was first used to create three feature vectors from an AutoEncoder, DELF, and DELG. After the feature vectors were generated, outlier detection (kNN, Isolation Forest, VAE) and clustering (FAISS) methods were used to eliminate noisy samples. In the final part of the study, the feature vectors mentioned above and the EfficientNet architecture were used to build a classification model. Before dataset cleaning, accuracy scores of 49% on AutoEncoder features, 55% on DELF features, 57% on DELG features, and 61% on EfficientNet-B5 were obtained. In light of these results, EfficientNet was chosen for dataset cleaning. A detailed training run carried out to obtain baseline performance values reached an accuracy of 69%, after which the dataset was cleaned using the prediction confidences of the trained model. A new training run was then started from scratch on the cleaned dataset and reached an accuracy of 71%. This showed that dataset cleaning is possible using classification architectures.

Abstract (Translation)

The growing importance of computer vision tasks such as object detection and segmentation has made research on these topics increasingly popular in recent years. As a result, ever larger datasets have appeared. Larger datasets, however, bring a greater labeling effort. While images are manually annotated in most studies, manual labeling is infeasible for very large collections of data, so automatic annotations are used instead. In this research, an automated cleaning technique for one such dataset is outlined and carried out using various outlier detection, clustering, and deep learning based classification algorithms. The basis of the study is Google Landmarks Dataset v2, a dataset with ~200k classes and 5M images. The dataset was first used to create three feature vectors, from an AutoEncoder, DELF, and DELG, in this order. The extracted feature vectors were then used in outlier detection (kNN, Isolation Forest, VAE) and clustering (FAISS) schemes to remove noisy samples, with varying degrees of success across the different features. The final part of the study used these feature vectors and the EfficientNet architecture to build classification models. Prior to cleaning, validation accuracies of 49% on AutoEncoder features, 55% on DELF features, 57% on DELG features, and 61% with EfficientNet-B5 were achieved. In light of these results, EfficientNet was chosen for dataset cleaning. Following a further training and fine-tuning step, which reached 69% accuracy, the dataset was cleaned using the prediction confidences of the trained model. Another model was then trained from scratch on the cleaned dataset, achieving 71% validation accuracy. This shows that dataset cleaning can be achieved using classification architectures.
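
The abstract does not say how the prediction confidences were turned into a keep-or-discard decision. As a rough illustration only, the sketch below assumes one plausible rule: a sample is kept when the trained classifier assigns at least a chosen probability to the sample's existing label. The function names, the tuple layout, the toy logits, and the 0.5 threshold are all hypothetical and are not taken from the thesis.

```python
import math


def softmax(logits):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def clean_by_confidence(samples, threshold=0.5):
    """Split samples by the model's confidence in their existing label.

    `samples` is a list of (sample_id, logits, label_index) tuples.
    Both the tuple layout and the threshold rule are assumptions made
    for this sketch, not the method described in the thesis.
    Returns (kept_ids, removed_ids).
    """
    kept, removed = [], []
    for sample_id, logits, label in samples:
        confidence = softmax(logits)[label]
        if confidence >= threshold:
            kept.append(sample_id)
        else:
            removed.append(sample_id)
    return kept, removed


# Toy data: three images, three classes (hypothetical values).
samples = [
    ("img_a", [4.0, 0.5, 0.1], 0),  # model agrees strongly with label 0
    ("img_b", [0.2, 3.5, 0.3], 0),  # model prefers class 1: possible label noise
    ("img_c", [1.0, 1.1, 0.9], 2),  # model is unsure about every class
]
kept, removed = clean_by_confidence(samples, threshold=0.5)
print("kept:", kept)
print("removed:", removed)
```

With a rule of this kind, the threshold trades recall for purity: a higher value discards more mislabeled images but also more hard, correctly labeled ones, so it would normally be tuned on a held-out split.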

Similar Theses

Thesis No: 503382
Title: Face recognition with local Walsh transform
Turkish Title: Yerel Walsh dönüşümü ile yüz tanıma
Author: MERYEM UZUN PER
Thesis Type: Doctorate
Language: English
Year: 2018
Subjects: Computer Engineering and Computer Science and Control
University: İstanbul Teknik Üniversitesi
Department: Computer Engineering
Advisors: PROF. DR. MUHİTTİN GÖKMEN

Thesis No: 541783
Title: Age and gender classification from ear images
Turkish Title: Kulak imgelerinden yaş ve cinsiyet sınıflandırma
Author: DOĞUCAN YAMAN
Thesis Type: Master's
Language: English
Year: 2018
Subjects: Computer Engineering and Computer Science and Control
University: İstanbul Teknik Üniversitesi
Department: Computer Engineering
Advisors: DOÇ. DR. HAZIM KEMAL EKENEL

Thesis No: 401591
Title: Enabling dynamics in face analysis
Turkish Title: Not available
Author: HAMDİ DİBEKLİOĞLU
Thesis Type: Doctorate
Language: English
Year: 2014
Subjects: Computer Engineering and Computer Science and Control
University: Universiteit van Amsterdam
Advisors: PROF. DR. THEO GEVERS; PROF. DR. A. W. M. SMEULDERS

Thesis No: 728716
Title: Short term electricity load forecasting with deep learning
Turkish Title: Derin öğrenme ile kısa dönemli elektrik yük talep tahmini
Author: İBRAHİM YAZICI
Thesis Type: Doctorate
Language: English
Year: 2022
Subjects: Computer Engineering and Computer Science and Control
University: İstanbul Teknik Üniversitesi
Department: Industrial Engineering
Advisors: DR. ÖĞR. ÜYESİ ÖMER FARUK BEYCA

Thesis No: 720406
Title: Synthesization and reconstruction of 3d faces by deep neural networks
Turkish Title: Not available
Author: BARİS GECER
Thesis Type: Doctorate
Language: English
Year: 2020
Subjects: Biotechnology
University: University of London
Advisors: DR. STEFANOS ZAFEİRİOU
