Dynamic system modeling and state estimation for speech signal

Turkish title: Konuşma işareti için dinamik sistem modelleme ve durum kestirimi

Thesis No: 268427
Author: İBRAHİM YÜCEL ÖZBEK
Advisor: PROF. DR. MÜBECCEL DEMİREKLER
Thesis Type: Doctoral
Subjects: Electrical and Electronics Engineering
Keywords: Not specified.
Year: 2010
Language: English
University: Orta Doğu Teknik Üniversitesi (Middle East Technical University)
Institute: Fen Bilimleri Enstitüsü (Graduate School of Natural and Applied Sciences)
Department: Department of Electrical and Electronics Engineering
Discipline: Not specified.
Page Count: 150

Abstract (translated from the Turkish)

This thesis presents a comprehensive framework for improving the performance of formant frequency tracking and acoustic (and/or visual)-to-articulatory inversion algorithms. The possible improvements are summarized below.

The first part of the thesis investigates how formant frequencies should be estimated when their number is fixed and when it is variable.

Tracking a fixed number of formants rests on the assumption that a fixed number of formant frequencies is present throughout the utterance. The proposed method is based on combining (jointly using) dynamic programming with a Kalman filter/smoother. In this method the speech signal is segmented into voiced and unvoiced parts, and for each segment the formant candidates are matched to formant tracks by dynamic programming. After matching, each formant frequency is estimated with a Kalman filter/smoother. The performance of the proposed algorithm is compared with other algorithms in the literature.

When tracking a variable number of formants, only the formant frequencies visible in the spectrogram are considered. The number of formants to be tracked can therefore change over time, and it must also be estimated. To this end the proposed method uses additional algorithms, such as a decision mechanism for starting and ending formant tracks. Each formant track is updated with incoming measurements by a Kalman filter. The performance of the method is demonstrated on several examples.

The second part of the thesis improves the performance of acoustic (and/or visual)-to-articulatory inversion algorithms. The work on this topic falls into two categories: inversion based on Gaussian mixture models (GMM) and inversion based on jump Markov linear systems (JMLS).

In GMM-based inversion, the articulator movements (positions) and the acoustic (and/or visual) data are modeled as a jointly distributed Gaussian mixture. The conditional mean of this distribution is the desired estimator. Within this method, the benefits of combining different acoustic features and the effectiveness of different fusion methods for combining acoustic and visual data are examined. Several dynamic smoothing methods are also proposed for smoothing the estimated articulatory trajectories. The performance of the proposed methods is compared with other algorithms in the literature.

In JMLS-based inversion, the acoustic space and the articulatory space are linked through multiple state-space representations. This turns the articulatory inversion problem into a state estimation problem in which the acoustic (and/or visual) data are the measurements and the state vector consists of the articulator positions. The proposed inversion method first learns the parameters of the state-space models with the expectation maximization (EM) method; the state is then estimated with an interacting multiple model (IMM) filter/smoother.

Abstract (English)

This thesis presents a comprehensive framework for improving current formant tracking and audio (and/or visual)-to-articulatory inversion algorithms. The possible improvements are summarized as follows.

The first part of the thesis investigates formant frequency estimation when the number of formants to be estimated is fixed and when it is variable.

The fixed-number formant tracking method assumes that the number of formant frequencies is fixed throughout the speech utterance. The proposed algorithm combines a dynamic programming algorithm with Kalman filtering/smoothing. In this method, the speech signal is divided into voiced and unvoiced segments, and the formant candidates are associated via a dynamic programming algorithm for each voiced and unvoiced part separately. An individual adaptive Kalman filter/smoother is then used to estimate each formant frequency. The performance of the proposed algorithm is compared with some algorithms given in the literature.

The variable-number formant tracking method considers only those formant frequencies that are visible in the spectrogram. The number of formant frequencies is therefore not fixed and can change along the speech waveform, so the number of formants to track must also be estimated. For this purpose, the proposed algorithm uses extra logic (a formant track start/end decision unit). The measurement update of each individual formant trajectory is handled via a Kalman filter. The performance of the proposed algorithm is illustrated by some examples.

The second part of this thesis is concerned with improving audiovisual-to-articulatory inversion performance.
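
The Kalman filtering/smoothing step of the formant-tracking method described above can be illustrated with a minimal sketch. It assumes a scalar random-walk model for a single formant frequency with fixed noise variances; the thesis uses an adaptive variant whose details are not given in this abstract. The function name `kalman_smooth`, the parameters `q`, `r`, `p0`, and their default values are hypothetical and chosen only for illustration. The forward pass is a standard Kalman filter and the backward pass is a Rauch-Tung-Striebel smoother.

```python
def kalman_smooth(measurements, q=100.0 ** 2, r=200.0 ** 2, x0=None, p0=1e6):
    """Scalar random-walk Kalman filter followed by an RTS smoother.

    measurements: formant candidates (Hz) already assigned to one track.
    q, r, p0: assumed process, measurement, and initial variances (Hz^2).
    Returns (filtered, smoothed) estimates as two lists.
    """
    n = len(measurements)
    if n == 0:
        return [], []

    x_pred, p_pred = [0.0] * n, [0.0] * n
    x_filt, p_filt = [0.0] * n, [0.0] * n

    # Start from the first measurement unless an initial state is given.
    x_prev = measurements[0] if x0 is None else x0
    p_prev = p0

    for k, z in enumerate(measurements):
        # Prediction: x_k = x_{k-1} + w_k with var(w_k) = q.
        x_pred[k] = x_prev
        p_pred[k] = p_prev + q
        # Measurement update: z_k = x_k + v_k with var(v_k) = r.
        gain = p_pred[k] / (p_pred[k] + r)
        x_filt[k] = x_pred[k] + gain * (z - x_pred[k])
        p_filt[k] = (1.0 - gain) * p_pred[k]
        x_prev, p_prev = x_filt[k], p_filt[k]

    # Backward (Rauch-Tung-Striebel) pass over the filtered estimates.
    x_smooth = x_filt[:]
    for k in range(n - 2, -1, -1):
        c = p_filt[k] / p_pred[k + 1]
        x_smooth[k] = x_filt[k] + c * (x_smooth[k + 1] - x_pred[k + 1])

    return x_filt, x_smooth
```

Given candidate frequencies already assigned to one track (for example by the dynamic programming step), `kalman_smooth([500.0, 520.0, 480.0, 510.0, 490.0])` returns the filtered and smoothed tracks as two lists of the same length as the input.
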
The related studies fall into two groups: Gaussian mixture model (GMM) regression based inversion and jump Markov linear system (JMLS) based inversion.

The GMM regression based inversion method models audio (and/or visual) and articulatory data as a joint Gaussian mixture model. The conditional expectation of the articulatory data given the audio (and/or visual) data under this distribution gives the desired articulatory estimate. In this method, we examine the usefulness of combining various acoustic features and the effectiveness of various fusion techniques for combining audio and visual features. We also propose dynamic smoothing methods to smooth the estimated articulatory trajectories. The performance of the proposed algorithm is illustrated and compared with conventional algorithms.

JMLS based inversion ties the acoustic (and/or visual) space and the articulatory space together via multiple state-space representations. In this way, the articulatory inversion problem is converted into a state estimation problem in which the audiovisual data are the measurements and the articulatory positions are the state variables. The proposed inversion method first learns the parameter set of the state-space models via an expectation maximization (EM) based algorithm; the state is then estimated via an interacting multiple model (IMM) filter/smoother.
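
The GMM regression step described above, in which the articulatory estimate is the conditional expectation under a joint Gaussian mixture, can be sketched for the simplest case of one scalar acoustic feature x and one scalar articulator position y. This is an illustration under that assumption, not the thesis implementation; the function name `gmm_conditional_mean` and the tuple layout `(weight, mean_x, mean_y, var_x, cov_xy)` are hypothetical. In practice the mixture parameters would be learned from parallel audio-articulatory data, and both x and y would be vectors.

```python
import math


def gmm_conditional_mean(x, components):
    """E[y | x] under a joint Gaussian mixture over scalar (x, y).

    components: list of (weight, mean_x, mean_y, var_x, cov_xy) tuples
                (hypothetical layout, one tuple per mixture component).
    """
    weighted_likelihoods = []
    component_means = []
    for weight, mean_x, mean_y, var_x, cov_xy in components:
        # Marginal likelihood of x under this component, times its weight.
        lik = math.exp(-0.5 * (x - mean_x) ** 2 / var_x)
        lik /= math.sqrt(2.0 * math.pi * var_x)
        weighted_likelihoods.append(weight * lik)
        # Conditional mean of y given x within this component
        # (linear regression of y on x).
        component_means.append(mean_y + cov_xy / var_x * (x - mean_x))

    total = sum(weighted_likelihoods)
    # Posterior-weighted average of the per-component conditional means.
    return sum(l * m for l, m in zip(weighted_likelihoods, component_means)) / total
```

With a single component the function reduces to ordinary linear regression of y on x. If x lies far from every component, the likelihoods can underflow to zero and the division fails; a practical version would compute the posterior weights in the log domain.
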

Similar Theses

Thesis No: 409990
Title (Turkish): Zamanla değişen özbağlanımlı modele dayalı olarak durağan olmayan rasgele işaretlerin modellenmesi
Title (English): Modelling the nonstationary random signals based upon the time-varying autoregressive model
Author: SİMGE ZEREY
Thesis Type: Master's
Language: Turkish
Year: 2014
Subject: Electrical and Electronics Engineering
University: Pamukkale Üniversitesi
Department: Department of Electrical and Electronics Engineering
Advisor: ASSOC. PROF. DR. AYDIN KIZILKAYA

Thesis No: 894524
Title (Turkish): Yinelemeli sinir ağları ile işaret dili tanıma
Title (English): Sign language recognition with recurrent neural networks
Author: İBRAHİM ÇETİNKAYA
Thesis Type: Master's
Language: Turkish
Year: 2023
Subject: Computer Engineering and Computer Science and Control
University: İstanbul Teknik Üniversitesi
Department: Department of Mechatronics Engineering
Advisor: PROF. DR. TAMER ÖLMEZ

Thesis No: 903942
Title (Turkish): Akımsız nikel esaslı alaşım kaplamalarda en iyi kaplama özelliklerini sağlayan banyo parametrelerinin yapay zeka yöntemleri ile tersine optimizasyonu
Title (English): Inverse optimization of bath parameters providing the best coating properties in electroless nickel-based alloy coatings using artificial intelligence methods
Author: MEHMET FATİH TAŞKIN
Thesis Type: Doctoral
Language: Turkish
Year: 2024
Subject: Industrial and Industrial Engineering
University: Sakarya Üniversitesi
Department: Department of Industrial Engineering
Advisor: PROF. DR. ÖZER UYGUN

Thesis No: 665220
Title (Turkish): Adaptif model öngörülü kontrolör ile konsensus kontrolü
Title (English): Consensus control with adaptive model predictive control
Author: ANIL YILMAZ
Thesis Type: Master's
Language: Turkish
Year: 2021
Subject: Computer Engineering and Computer Science and Control
University: İstanbul Teknik Üniversitesi
Department: Department of Control and Automation Engineering
Advisor: ASSOC. PROF. DR. YAPRAK YALÇIN

Thesis No: 806902
Title (Turkish): Elektrikli araçlar için yüksek doğrulukla şarj kestirimi sunan batarya yönetim sistemi tasarımı
Title (English): Design of battery management system providing high accuracy state of charge estimation for electric vehicles
Author: MUSTAFA MERT SERİNBAŞ
Thesis Type: Master's
Language: Turkish
Year: 2023
Subject: Electrical and Electronics Engineering
University: İstanbul Teknik Üniversitesi
Department: Department of Electrical Engineering
Advisor: ASST. PROF. DR. MEHMET ONUR GÜLBAHÇE
