Speech analysis and speech compression and encoding algorithms

Original title: Ses analizi ve ses sıkıştırma ve kodlama algoritmaları. The record provides no translated title.

Thesis No: 46192
Author: SERVET YILDIRIM
Advisors: ASST. PROF. DR. M. ERTUĞRUL ÇELEBİ
Thesis Type: Master's
Subjects: Electrical and Electronics Engineering
Keywords: Not specified.
Year: 1995
Language: Turkish
University: İstanbul Teknik Üniversitesi
Institute: Fen Bilimleri Enstitüsü
Department: Not specified.
Division: Not specified.
Number of Pages: 70

Abstract

This thesis focuses mainly on speech analysis and investigates speech compression and coding algorithms. The linear prediction method is used for this purpose. Instead of storing every sampled value of the speech, parameters that carry the speech information are used. These parameters are stored so that they can be compared when the same voice command arrives later. To carry out these analyses, the relevant algorithms and methods are discussed and interpreted in the various sections.

There are many methods and algorithms for finding the linear prediction coefficients, and many of them are described in this work. The autoregressive (AR) process attracts more interest than conventional estimation methods because it gives better resolution. The AR estimation method was first developed for processing geophysical data. It is sometimes also called the maximum entropy method. AR spectral estimation is widely applied in radar, sonar, imaging, speech analysis, radio astronomy, medical electronics, and direction finding.

In clinical settings, the AR process is well suited to analysing the signals, in particular EEG (electroencephalogram), EP (evoked potential), and VEP (visual evoked potential) signals, that are used for diagnosing epilepsy, for locating foreign structures inside the skull by comparing the electrical activity in the two hemispheres of the brain, for determining the depth of anesthesia of an anesthetized patient, and for determining whether the brain of a psychiatric patient has an organic disorder.

Abstract (Translation)

SPEECH ANALYSIS AND SPEECH COMPRESSION AND ENCODING ALGORITHMS

Speech is the basic means humans use for communication. The information to be transmitted is first encoded at a discrete level according to the rules of the language used by the speaker and then, through the complex physiological process of speech production, converted to an acoustic signal. When this signal is received by another speaker of the same language, it is converted into another discrete sequence and decoded to extract the transmitted information.

This study is based on the basic principles of the linear predictive filter, whose steady-state system function is of the form

$$H(z) = \frac{G}{1 + \sum_{k=1}^{p} a_k z^{-k}}$$

The main problem is therefore to find the predictor coefficients $a_k$ of the system. The basic idea behind linear predictive analysis is that a speech sample can be approximated as a linear combination of past speech samples. The speech samples $s_n$ are related to the excitation $u_n$ by the simple equation

$$s_n = -\sum_{k=1}^{p} a_k s_{n-k} + G u_n$$

A linear predictor with prediction coefficients $a_k$ is defined as a system whose output is

$$\hat{s}_n = -\sum_{k=1}^{p} a_k s_{n-k}$$

The basic problem of linear prediction analysis is to determine a set of predictor coefficients $a_k$ directly from the speech signal in such a manner as to obtain a good estimate of the speech. Because of the time-varying nature of the speech signal, the predictor coefficients must be estimated from short segments of the speech signal. By minimizing the sum of the squared differences (over a finite interval) between the actual speech samples $s_n$ and the linearly predicted ones $\hat{s}_n$, a unique set of predictor coefficients can be determined. This approach leads to a set of linear equations that can be efficiently solved to obtain the predictor parameters.
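The least-squares step described above can be illustrated with a minimal Python sketch of the autocorrelation formulation. It assumes the sign convention $s_n = -\sum_k a_k s_{n-k} + G u_n$, under which the normal equations read $\sum_k a_k R(|i-k|) = -R(i)$ for $i = 1, \dots, p$. The function names (`autocorrelation`, `lpc_autocorrelation`) and the test filter are illustrative choices, not taken from the thesis, and the system is solved with a general-purpose solver rather than a fast recursion.

```python
import numpy as np

def autocorrelation(frame, p):
    """Return R(0..p) for one analysis frame (samples outside the frame count as zero)."""
    frame = np.asarray(frame, dtype=float)
    n = len(frame)
    return np.array([np.dot(frame[:n - k], frame[k:]) for k in range(p + 1)])

def lpc_autocorrelation(frame, p):
    """Solve sum_k a_k R(|i-k|) = -R(i), i = 1..p, for the predictor coefficients a_1..a_p."""
    r = autocorrelation(frame, p)
    # Toeplitz coefficient matrix built from R(0..p-1)
    R = np.array([[r[abs(i - k)] for k in range(p)] for i in range(p)])
    a = np.linalg.solve(R, -r[1:])
    return a, r

# Hypothetical check: impulse response of a known all-pole filter
# s_n = -a_1 s_{n-1} - a_2 s_{n-2} + u_n with a = (-1.2, 0.72), poles at 0.6 +/- 0.6j.
true_a = np.array([-1.2, 0.72])
h = np.zeros(600)
for n in range(len(h)):
    h[n] = 1.0 if n == 0 else 0.0
    for k, ak in enumerate(true_a, start=1):
        if n - k >= 0:
            h[n] -= ak * h[n - k]

a_est, r = lpc_autocorrelation(h, 2)
print(a_est)  # close to true_a, since the response has decayed well before the frame ends
```

On real speech the frame would normally be a short windowed segment, and the recovered coefficients are only an approximation of the vocal-tract filter.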
The speech signal can be modeled as the output of a time-varying linear system excited by either random noise (for unvoiced speech) or a quasi-periodic sequence of impulses (for voiced speech). The parameters of this model are the voiced/unvoiced classification, the pitch period for voiced speech, the gain, and the coefficients of the digital filter.

This study is composed of five sections. Some sections are devoted to a discussion of how a variety of speech parameters can be reliably estimated using linear prediction methods. As applied to speech processing, the term linear prediction refers to a variety of essentially equivalent formulations of the problem of modeling the speech waveform. The differences among these formulations concern the details of the computations used to obtain the predictor coefficients. Section 2 discusses two such formulations, the autocorrelation method and the covariance method, both of which lead to equations for the prediction coefficients of the speech model.

In order to implement a linear prediction analysis system effectively, it is necessary to solve the linear equations in an efficient manner. A variety of techniques can be applied to solve a system of p linear equations in p unknowns. Because of the special properties of the coefficient matrices, it is possible to solve the equations much more efficiently than is possible in general. Section 3 discusses two methods for obtaining the predictor coefficients: the Cholesky decomposition solution for the covariance method and the Levinson-Durbin recursive solution for the autocorrelation method.

Section 4 discusses different methods for estimating the pitch period and for voiced/unvoiced classification. The amplitude of the speech signal varies appreciably with time. In particular, the amplitude of unvoiced segments is generally much lower than the amplitude of voiced segments.
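The Levinson-Durbin recursion named above for the autocorrelation method exploits the Toeplitz structure of the coefficient matrix. The following Python sketch is one common form of the recursion, written for a prediction-error filter $1 + \sum_k a_k z^{-k}$. The function name `levinson_durbin` and the demo values are assumptions made for illustration, and this is not necessarily the exact variant implemented in the thesis.

```python
import numpy as np

def levinson_durbin(r, p):
    """Solve sum_k a_k R(|i-k|) = -R(i), i = 1..p, in O(p^2) operations.

    r : autocorrelation values R(0..p)
    Returns (a, err): coefficients a_1..a_p and the final prediction error.
    """
    r = np.asarray(r, dtype=float)
    a = np.zeros(p + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, p + 1):
        # Correlation of the current order-(i-1) predictor with lag i
        acc = r[i] + np.dot(a[1:i], r[i - 1:0:-1])
        k = -acc / err                      # reflection coefficient for order i
        a_prev = a.copy()
        a[1:i] = a_prev[1:i] + k * a_prev[i - 1:0:-1]
        a[i] = k
        err *= (1.0 - k * k)                # error shrinks (or stays equal) at each order
    # For this sign convention err equals r[0] + sum_k a_k r[k]
    return a[1:], err

# Hypothetical demo: R(k) = 0.5**k corresponds to a first-order process
r_demo = [1.0, 0.5, 0.25]
a_demo, err_demo = levinson_durbin(r_demo, 2)
print(a_demo, err_demo)  # a_1 = -0.5, a_2 = 0, error 0.75
```

The design choice here is to update the coefficient vector in place from a copy of the previous order, which keeps the recursion short at the cost of one small allocation per order.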
The short-time energy of the speech signal provides a convenient representation that reflects these amplitude variations. In general, the short-time energy can be defined as

$$E_n = \sum_{m=n-N+1}^{n} s_m^2$$

where N is the window length. One difficulty with the short-time energy function is that it is very sensitive to large signal levels.

After the filter coefficients, the pitch period, and the voiced/unvoiced classification have been estimated, the speech is synthesized using the relation

$$s_n = -\sum_{k=1}^{p} a_k s_{n-k} + G u_n$$

where $a_k$ are the coefficients of the digital filter, $u_n$ is the excitation (unit impulses spaced at the estimated pitch period when the speech is voiced, and random white noise when the speech is unvoiced), $s_n$ is the output of the prediction filter, p is the number of coefficients, and G is the gain parameter, which can be obtained from

$$G^2 = R(0) + \sum_{k=1}^{p} a_k R(k)$$

where R(k) is the autocorrelation sequence of the input signal.

Section 5 analyses speech signals using a computer program written for this purpose. The speech is analysed over short time windows of different lengths. To find the LPC coefficients, the autocorrelation method is applied and the autocorrelation values are computed. The Levinson-Durbin recursive solution is then applied to solve the autocorrelation equations, and the coefficients of the digital filter are obtained for each window.
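The short-time energy and the synthesis relation above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the sign convention is $s_n = -\sum_k a_k s_{n-k} + G u_n$, the unvoiced excitation is unit-variance Gaussian noise, and the helper names (`short_time_energy`, `excitation`, `synthesize`) are hypothetical rather than taken from the thesis program.

```python
import numpy as np

def short_time_energy(s, n, N):
    """E_n = sum of s_m^2 for m = n-N+1 .. n (samples before index 0 count as zero)."""
    start = max(0, n - N + 1)
    frame = np.asarray(s[start:n + 1], dtype=float)
    return float(np.dot(frame, frame))

def excitation(length, voiced, pitch_period=None, rng=None):
    """Unit impulses every pitch_period samples (voiced) or white noise (unvoiced)."""
    if voiced:
        if pitch_period is None or pitch_period < 1:
            raise ValueError("voiced excitation needs a positive pitch_period")
        u = np.zeros(length)
        u[::pitch_period] = 1.0
        return u
    rng = np.random.default_rng() if rng is None else rng
    return rng.standard_normal(length)

def synthesize(a, G, u):
    """Run s_n = -sum_k a_k s_{n-k} + G u_n with zero initial conditions."""
    a = np.asarray(a, dtype=float)
    s = np.zeros(len(u))
    for n in range(len(u)):
        acc = G * u[n]
        for k in range(1, len(a) + 1):
            if n - k >= 0:
                acc -= a[k - 1] * s[n - k]
        s[n] = acc
    return s

# Hypothetical usage: one voiced frame with a 40-sample pitch period
u_voiced = excitation(160, voiced=True, pitch_period=40)
frame = synthesize([-1.2, 0.72], G=1.0, u=u_voiced)
print(short_time_energy(frame, n=159, N=160))
```

A practical synthesizer would carry the filter state across frames instead of restarting from zero initial conditions, which this sketch omits for brevity.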

Similar Theses

Thesis No: 887325
Title: Brain-inspired cortical-coding algorithm for multimedia processing
Title (Turkish): Multimedya işlemek için beyinden esinlenilmiş kortikal kodlama algoritması
Author: AHMET EMİN ÜNAL
Thesis Type: Master's
Language: English
Year: 2024
Subjects: Computer Engineering and Computer Science and Control
University: İstanbul Teknik Üniversitesi
Department: Computer Engineering
Advisor: PROF. DR. BURAK BERK ÜSTÜNDAĞ

Thesis No: 252687
Title: Ses sıkıştırma tekniklerinin başarım analizi
Title (English): Performance analysis of speech compression techniques
Author: DİNÇER YARIMÇAM
Thesis Type: Master's
Language: Turkish
Year: 2009
Subjects: Electrical and Electronics Engineering
University: İstanbul Üniversitesi
Department: Electrical and Electronics Engineering
Advisor: PROF. DR. OSMAN NURİ UÇAN

Thesis No: 488061
Title: Subband decomposition and fractal image compression based steganography
Title (Turkish): Altbant ayrıştırma ve fraktal imge sıkıştırma tabanlı steganografi
Author: SUHAD FAKHRI HUSSEIN ALBASRAWI
Thesis Type: Master's
Language: English
Year: 2017
Subjects: Computer Engineering and Computer Science and Control
University: İstanbul Teknik Üniversitesi
Department: Informatics Applications
Advisor: ASSOC. PROF. DR. BEHÇET UĞUR TÖREYİN

Thesis No: 496613
Title: Alt uzay yöntemleri kullanarak işaret kodlama
Title (English): Signal coding by using subspace methods
Author: SERKAN KESER
Thesis Type: Doctorate
Language: Turkish
Year: 2017
Subjects: Electrical and Electronics Engineering
University: Eskişehir Osmangazi Üniversitesi
Department: Electrical and Electronics Engineering
Advisor: PROF. DR. MEHMET BİLGİNER GÜLMEZOĞLU

Thesis No: 770074
Title: Data-driven design and analysis of next generation mobile networks for anomaly detection and signal classification with fast, robust and light machine learning
Title (Turkish): Hızlı, Sağlam ve Hafif Makine Öğrenmesi ile Anormallik Algılaması ve Sinyal Sınıflandırması için Yeni Nesil Mobil Ağların Veriye Dayalı Tasarımı ve Analizi
Author: MUHAMMED FURKAN KUCUK
Thesis Type: Doctorate
Language: English
Year: 2022
Subjects: Electrical and Electronics Engineering
University: University of South Florida
Department: Communications
Advisor: ASSOC. PROF. DR. İSMAİL UYSAL
