Combining features and semantics for low-level computer vision

Başlık çevirisi mevcut değil.

PDF İndir

Tez No: 523516
Yazar: FATMA GÜNEY
Danışmanlar: Dr. ANDREAS GEIGER
Tez Türü: Doktora
Konular: Bilim ve Teknoloji, Matematik, Science and Technology, Mathematics
Anahtar Kelimeler: Belirtilmemiş.
Yıl: 2017
Dil: İngilizce
Üniversite: Eberhard-Karls-Universität Tübingen
Enstitü: Yurtdışı Enstitü
Ana Bilim Dalı: Belirtilmemiş.
Bilim Dalı: Belirtilmemiş.
Sayfa Sayısı: 176

Özet

Özet yok.

Özet (Çeviri)

Visual perception of depth and motion plays a significant role in understanding and navigating the environment. Reconstructing outdoor scenes in 3D and estimating the motion from video cameras are of utmost importance for applications like autonomous driving. The corresponding problems in computer vision have witnessed tremendous progress over the last decades, yet some aspects still remain challenging today. Striking examples are reflecting and textureless surfaces or large motions which cannot be easily recovered using traditional local methods. Further challenges include occlusions, large distortions and difficult lighting conditions. In this thesis, we propose to overcome these challenges by modeling non-local interactions leveraging semantics and contextual information. Firstly, for binocular stereo estimation, we propose to regularize over larger areas on the image using object-category specific disparity proposals which we sample using inverse graphics techniques based on a sparse disparity estimate and a semantic segmentation of the image. The disparity proposals encode the fact that objects of certain categories are not arbitrarily shaped but typically exhibit regular structures. We integrate them as non-local regularizer for the challenging object class 'car' into a superpixelbased graphical model and demonstrate its benefits especially in reflective regions. Secondly, for 3D reconstruction, we leverage the fact that the larger the reconstructed area, the more likely objects of similar type and shape will occur in the scene. This is particularly true for outdoor scenes where buildings and vehicles often suffer from missing texture or reflections, but share similarity in 3D shape. We take advantage of this shape similarity by localizing objects using detectors and jointly reconstructing them while learning a volumetric model of their shape. This allows to reduce noise while completing missing surfaces as objects of similar shape benefit from all observations for the respective category. Evaluations with respect to LIDAR ground-truth on a novel challenging suburban dataset show the advantages of modeling structural dependencies between objects. Finally, motivated by the success of deep learning techniques in matching problems, we present a method for learning context-aware features for solving optical flow using discrete optimization. Towards this goal, we present an efficient way of training a context network with a large receptive field size on top of a local network using dilated convolutions on patches. We perform feature matching by comparing each pixel in the reference image to every pixel in the target image, utilizing fast GPU matrix multiplication. The matching cost volume from the network's output forms the data term for discrete MAP inference in a pairwise Markov random field. Extensive evaluations reveal the importance of context for feature matching. v

Benzer Tezler

Tez No
515803
Indexing both content and concept for high-dimensional multimedia data
Çok boyutlu çokluortam veri erişimi için içerik ve anlam dizinleme
SERDAR ARSLAN
Doktora
İngilizce
2018
Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol Orta Doğu Teknik Üniversitesi
Bilgisayar Mühendisliği Ana Bilim Dalı
PROF. DR. ADNAN YAZICI
Tez No
496320
Building of Turkish propbank and semantic role labeling of Turkish
Türkçe önerme veri tabanının oluşturulması ve Türkçenin anlamsal görev çözümlemesi
GÖZDE GÜL ŞAHİN
Doktora
İngilizce
2018
Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol İstanbul Teknik Üniversitesi
Bilgisayar Mühendisliği Ana Bilim Dalı
PROF. DR. EŞREF ADALI
Tez No
202704
Image auto-annotation based on combination of text and visual clustering
Resimlerin metin ve görsel kümelemeye dayalı olarak otomatik etiketlenmesi
ERBUĞ ÇELEBİ
Doktora
İngilizce
2006
Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol Dokuz Eylül Üniversitesi
Bilgisayar Mühendisliği Ana Bilim Dalı
YRD. DOÇ. DR. ADİL ALPKOÇAK
Tez No
944874
Multi-agent planning with automated curriculum learning
Otomatik müfredat öğrenmesi ile çoklu ajan planlaması
ONUR AKGÜN
Doktora
İngilizce
2025
Bilgisayar Mühendisliği Bilimleri-Bilgisayar ve Kontrol İstanbul Teknik Üniversitesi
Mekatronik Mühendisliği Ana Bilim Dalı
DOÇ. DR. NAZIM KEMAL ÜRE
Tez No
143475
Combining image features for semantic descriptions
Anlamsal tanımlamalar için görüntü öznitelikleri birleştirme
MEDENİ SOYSAL
Yüksek Lisans
İngilizce
2003
Elektrik ve Elektronik Mühendisliği Orta Doğu Teknik Üniversitesi
Elektrik-Elektronik Mühendisliği Ana Bilim Dalı
DOÇ. DR. AYDIN ALATAN

Geri Dön