Ego noise estimation for robot audition

Title translation not available.

Thesis No: 400042
Author: GÖKHAN İNCE
Advisors: PROF. JUNİCHİ IMURA
Thesis Type: Doctorate
Subjects: Mechanical Engineering
Keywords: Not specified.
Year: 2011
Language: English
University: Tokyo Institute of Technology
Institute: Institute Abroad
Department: Not specified.
Discipline: Not specified.
Number of Pages: 176

Abstract

No abstract.

Abstract (Translation)

Robots should listen to their surrounding world through the microphones embedded in their bodies to recognize and understand the auditory environment. This artificial listening capability, called robot audition, is an important function for understanding the surrounding auditory world, including sounds such as human voices, music, and other environmental sounds. Robot audition can be improved by incorporating another modality, robot motion, so that the framework is extended to active robot audition. In that sense, active audition can be considered the first step towards endowing the robot with intelligent behavior. It provides the robot with a processing architecture that will allow it to learn and reason about how to behave in response to complex acoustic environments and conditions.

The most important problem encountered in the active audition domain is ego noise, which can be described as the robot's own noise generated during its motion. However, it cannot be solved effectively with conventional methods proposed in other signal processing domains. The basic problem with ego noise, like all types of noise in a robot audition system, is that it causes the Signal-to-Noise Ratio (SNR) to drop and contaminates the spectrum of the recorded signal, so that it is almost impossible to perform the fundamental applications of robot audition, such as Sound Source Localization (SSL), Sound Source Separation (SSS) and Automatic Speech Recognition (ASR), accurately. Because the complexity of the ego noise increases with the number of motors in action, the negative effects of ego noise are even more severe for a moving robot with many degrees of freedom.

This thesis addresses the problem of estimating the ego noise of a robot in order to suppress it for various tasks. The aim of this thesis is to establish a real-time, online ego noise estimation system. To develop a framework for estimating ego noise and to integrate it into the general robot audition framework effectively, we have to consider the following three issues: (1) modeling the process of ego noise estimation, (2) online processing and (3) general applicability of our ego noise estimation method for robot audition.

In order to address the modeling issue of ego noise estimation, we first have to resolve three important sub-issues we have identified: the knowledge gathering issue, the representation issue and the algorithm issue. Templates are good representations of motor noise when the same actions are performed over and over again. We model the ego noise using templates by associating discrete time series data representing the motion (i.e., the angular status of each joint of the robot) with another series of discrete time data representing the ego noise spectrum. The data is stored in a database so that the noise can later be estimated instantaneously. However, the necessity of offline training poses strict constraints. The new "online" scheme can distinguish between stationary noise (i.e., static fan noise, hardware noise of the robot and possibly changing background noise) and non-stationary ego-motion noise, and treats them in separate processes. Furthermore, the proposed online training of the templates makes the template-based noise estimation method better suited to real-world applications because it can learn the ego noise of unknown motions on the fly. Whereas the proposed "template learning" mechanism can discriminate new data entries from the existing templates in the database, the "template update" mechanism adaptively sustains the accuracy and precision of the templates. It also prevents rapid growth of the database. The final issue is the confirmation of the general applicability and compliance of the proposed ego noise estimation method on several robot audition applications. The established frameworks for ego noise reduction, noise-robust feature extraction, ASR and SSL are presented.

In Chapter 1, we introduce our motivation, our goals, and the technical issues for this study. The problems and requirements for robot audition are explained, and we give the appropriate approaches to these issues.

Chapter 2 surveys the literature related to robot audition and signal processing. Since there are different noise sources in a robot environment and ego noise is strongly intertwined with all of them, our robot audition framework has diverse noise processing blocks. We explain in detail the basic methods used in these blocks as existing work. The properties of all noise sources are explained along with a detailed analysis of the noise signals and robot motions. Related work is also summarized in this chapter. We describe the technical differences between our approaches and conventional ones.

In Chapter 3, after specifying general criteria for choosing the optimal estimation process for each noise type, we explain how to approach the modeling process of ego noise estimation specifically. We then propose an estimation method called parameterized template estimation. The performance of this original method is compared with that of existing single-channel noise estimation methods.

Chapter 4 describes the developments we made to the basic parameterized template estimation system so that it runs online. In order to cope with changing environmental noise, we adapt the abstract template concept to our needs. We generate the templates in such a way that they represent only the non-stationary noise. The stationary portion of the ego noise, together with ambient noise, is dealt with by a stationary noise estimation method. We explain the details of this unified framework for noise estimation. Moreover, we eliminate the necessity of human intervention in the training procedure by introducing an incremental template learning scheme. Finally, we evaluate the performance of the proposed methods in terms of estimation quality and noise reduction accuracy by using objective performance criteria and discuss the results.

Chapter 5 delves into the question of how to suppress the whole-body motion noise of a robot more robustly. For this purpose, we integrate template-based ego noise estimation with established work from the multi-channel noise reduction literature. Microphone array-based sound source separation is adequate to cancel motor noise with certain spatial properties; thus the performance of this hybrid noise reduction system exceeds the individual performances of the template estimation and multi-channel noise reduction methods. In this chapter, we discuss the implementation and its evaluation in terms of ASR accuracy.

Chapter 6 describes Missing Feature Theory (MFT)-based integration of ego noise reduction and ASR. We focus on two different ASR systems: single-talker ASR and multi-talker ASR. Both systems rely on the single-channel and multi-channel noise reduction methods to generate spectro-temporal masks that filter out the unreliable acoustic features. We present detailed results regarding recognition accuracy to determine optimal parameters of the mask generation process for each system.

In Chapter 7, we provide an extended version of the parameterized template estimation that operates on multi-channel audio data. This feature enables a Sound Source Localization (SSL) scheme to whiten the ego noise, which eliminates its interfering effect on the spatio-temporal plane of the MUltiple SIgnal Classification (MUSIC) method for SSL. We assess the performance in terms of localization accuracy and peak detection rates for MUSIC.

Chapter 8 outlines the contributions of this thesis and gives an insight into the remaining issues and future work.

Chapter 9 summarizes and concludes this dissertation.
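
The template idea described in the abstract (joint states associated with noise spectra, stored in a database, with "template learning" and "template update" steps) can be illustrated with a minimal sketch. This is not the thesis's algorithm: the nearest-neighbour lookup, the Euclidean distance, the exponential averaging, the class and function names, and the parameter values `new_template_dist` and `update_rate` are all assumptions made for illustration. The final helper uses plain magnitude spectral subtraction as a stand-in suppression step, which the abstract does not specify.

```python
import math


class TemplateDatabase:
    """Toy template store mapping a joint-state vector to a noise magnitude spectrum.

    Hypothetical sketch: distance metric, threshold and update rule are assumptions.
    """

    def __init__(self, new_template_dist=0.1, update_rate=0.2):
        self.new_template_dist = new_template_dist  # assumed novelty threshold
        self.update_rate = update_rate              # assumed smoothing factor
        self.joint_states = []  # stored joint-state vectors
        self.spectra = []       # stored noise magnitude spectra

    def _nearest(self, joint_state):
        """Return (index, distance) of the closest stored joint state."""
        best_idx, best_dist = None, math.inf
        for idx, stored in enumerate(self.joint_states):
            dist = math.dist(stored, joint_state)
            if dist < best_dist:
                best_idx, best_dist = idx, dist
        return best_idx, best_dist

    def estimate(self, joint_state):
        """Return the spectrum of the closest template, or None if the store is empty."""
        idx, _ = self._nearest(joint_state)
        return None if idx is None else list(self.spectra[idx])

    def learn(self, joint_state, noise_spectrum):
        """Add a new template for an unseen motion, or refine the closest existing one."""
        idx, dist = self._nearest(joint_state)
        if idx is None or dist > self.new_template_dist:
            # "Template learning": the motion is far from every stored template.
            self.joint_states.append(list(joint_state))
            self.spectra.append(list(noise_spectrum))
            return "added"
        # "Template update": blend the new observation into the existing template,
        # which also keeps the database from growing with near-duplicates.
        a = self.update_rate
        self.spectra[idx] = [
            (1 - a) * old + a * new
            for old, new in zip(self.spectra[idx], noise_spectrum)
        ]
        return "updated"


def spectral_subtract(observed, noise, floor=0.0):
    """Subtract an estimated noise magnitude spectrum bin by bin, clamped at `floor`."""
    return [max(o - n, floor) for o, n in zip(observed, noise)]
```

In this sketch, a call to `learn` during a noise-only motion either stores a new template or refines the nearest one, and `estimate` followed by `spectral_subtract` gives a rough suppressed spectrum for a frame recorded in a similar joint state.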

Similar Theses

Thesis No
831282
Wavelet-based adaptive array signal processing for gunshot detection and DOA estimation on unmanned air vehicles
İHA üzerinde silah atışı tespiti ve yön tahmini için dalgacık temelli uyarlanabilir dizi işaret işleme
MURAT YILMAZ
Doctorate
English
2023
Computer Engineering and Computer Science and Control; Middle East Technical University
Department of Information Systems
PROF. DR. BANU GÜNEL KILIÇ
Thesis No
173932
Doğrusal programlama yaklaşımı ile toplu taşıma sistemlerinin planlanması ve çizelgelenmesi
Planning and scheduling of public transit systems with the linear programming approach
MEHMET EMRAH ÖZKAYA
Master's
Turkish
2006
Business Administration; Hacettepe University
Department of Business Administration
ASSOC. PROF. DR. AYDIN ULUCAN
Thesis No
46532
Sincap kafesli asenkron makinenin rotor alan yönlendirmeli kontrolü
Rotor field-orientation control of a squirrel cage induction machine
SAFFET ALTAY
Master's
Turkish
1995
Electrical and Electronics Engineering; Istanbul Technical University
PROF. DR. M. EMİN TACER
Thesis No
322358
Information valuation and processing in performance contexts with noisy feedback: Experimental evidence
Performans bağlamında kesin olmayan geri bildirimlerle bilgi değerleme ve işleme: Deneysel bulgular
ERGÜN KOTAN
Master's
English
2012
Economics; Koç University
Department of Economics
ASST. PROF. DR. SEDA ERTAÇ
Thesis No
779206
Stochastic future prediction in real world driving scenarios
No title translation
ADİL KAAN AKAN
Master's
English
2022
Computer Engineering and Computer Science and Control; Koç University
Department of Computer Engineering
ASST. PROF. DR. FATMA GÜNEY
