
Konuşma işaretlerinin analiz ve sentezi

Analysis and synthesis of speech signals

  1. Thesis No: 22036
  2. Author: KAREN BÜYÜKAŞIKOĞLU
  3. Advisors: PROF. DR. EŞREF ADALI
  4. Thesis Type: Master's
  5. Subjects: Computer Engineering and Computer Science and Control
  6. Keywords: Not specified.
  7. Year: 1992
  8. Language: Turkish
  9. University: İstanbul Teknik Üniversitesi
  10. Institute: Fen Bilimleri Enstitüsü
  11. Department: Not specified.
  12. Discipline: Not specified.
  13. Number of Pages: 91

Abstract

In this study, an attempt was made to generate speech signals artificially, based on the principles of speech synthesis. Linear Prediction Analysis was used for this purpose. The human voice passes through various sound-producing organs in the body before reaching the mouth and lips. The different sounds in speech have different characteristics: some consist of periodic impulses, while others have a vibration resembling white noise. The region of the body through which the sound passes is assumed to act on the sound like a linear prediction filter. The main goal is to find the coefficients of such a filter for the different sounds. Once these coefficients are found, speech can be produced by generating the sounds that correspond to excitations in the form of periodic impulses or white noise. After the brief introduction in Section 1, Section 2 discusses Linear Prediction Analysis. Section 3 shows how the coefficients of the linear prediction filter can be obtained. Section 4 explains several methods used to determine the characteristics of the speech signal and to distinguish voiced from unvoiced sounds. Section 5 attempts to generate speech signals artificially in light of the information obtained in the preceding sections.

Abstract (Translation)

One of the most powerful speech analysis techniques is the method of linear predictive analysis. Linear prediction has been widely used as an approach to speech analysis and synthesis. In this study, speech signals are synthesized based on the basic principles of linear prediction analysis. The human vocal tract is assumed to act as a linear predictive filter whose steady-state system function is of the form

$$H(z) = \frac{G}{1 - \sum_{k=1}^{p} a_k z^{-k}}$$

The main problem is therefore to find the predictor coefficients $\{a_k\}$ of the system. The basic idea behind linear predictive analysis is that a speech sample can be approximated as a linear combination of past speech samples. The speech samples $y_n$ are related to the excitation $x_n$ by the simple equation

$$y_n = \sum_{k=1}^{p} a_k y_{n-k} + G x_n$$

A linear predictor with prediction coefficients $\{a_k\}$ is defined as a system whose output is

$$\hat{y}_n = \sum_{k=1}^{p} a_k y_{n-k}$$

The basic problem of linear prediction analysis is to determine a set of predictor coefficients $\{a_k\}$ directly from the speech signal in such a manner as to obtain a good estimate of the speech. Because of the time-varying nature of the speech signal, the predictor coefficients must be estimated from short segments of the signal. By minimizing the sum of the squared differences (over a finite interval) between the actual speech samples, $y_n$, and the linearly predicted ones, $\hat{y}_n$, a unique set of predictor coefficients can be determined. This approach leads to a set of linear equations that can be solved efficiently to obtain the predictor parameters.

The speech signal can be modeled as the output of a time-varying linear system excited by either random noise (for unvoiced speech) or a quasi-periodic sequence of impulses (for voiced speech). The parameters of this model are the voiced/unvoiced classification, the pitch period for voiced speech, the gain, and the coefficients of the digital filter.
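The excitation-plus-filter model described above can be sketched in a few lines of Python. This is only a minimal illustration, not the thesis program: it assumes the convention $y_n = \sum_k a_k y_{n-k} + G x_n$ with zero initial conditions, and the function names `make_excitation` and `synthesize` are hypothetical.

```python
import random


def make_excitation(n_samples, voiced, pitch_period=1, seed=0):
    """Return the excitation x_n: unit impulses spaced `pitch_period`
    samples apart for voiced speech, Gaussian white noise for unvoiced."""
    if voiced:
        return [1.0 if n % pitch_period == 0 else 0.0 for n in range(n_samples)]
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [rng.gauss(0.0, 1.0) for _ in range(n_samples)]


def synthesize(a, gain, excitation):
    """Run the all-pole filter y_n = sum_k a_k * y_(n-k) + gain * x_n.

    `a` holds a_1..a_p; samples before n = 0 are taken as zero.
    """
    y = []
    for n, x in enumerate(excitation):
        acc = gain * x
        for k, a_k in enumerate(a, start=1):
            if n - k >= 0:
                acc += a_k * y[n - k]
        y.append(acc)
    return y


# One voiced toy frame: first-order filter, pitch period of 4 samples.
frame = synthesize([0.5], 2.0, make_excitation(5, voiced=True, pitch_period=4))
print(frame)  # -> [2.0, 1.0, 0.5, 0.25, 2.125]
```

Each impulse restarts a decaying response whose shape is set by the coefficients, which is why the same filter can produce either a pitched or a noise-like output depending only on the excitation.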
This study is composed of five sections. Some sections are devoted to how a variety of speech parameters can be reliably estimated using linear prediction methods. As applied to speech processing, the term linear prediction refers to a variety of essentially equivalent formulations of the problem of modeling the speech waveform. The differences among these formulations concern the details of the computations used to obtain the predictor coefficients.

Section 2 discusses two such formulations, the autocorrelation method and the covariance method. Both lead to a set of equations for the prediction coefficients of the speech model. To implement a linear prediction analysis system effectively, these linear equations must be solved efficiently. A variety of techniques can be applied to solve a system of p linear equations in p unknowns, but the special properties of the coefficient matrices make it possible to solve the equations much more efficiently than in the general case.

Section 3 discusses two methods for obtaining the predictor coefficients: the Cholesky decomposition solution for the covariance method and the Levinson-Durbin recursive solution for the autocorrelation method.

Section 4 discusses methods for estimating the pitch period and for voiced/unvoiced classification. The amplitude of the speech signal varies appreciably with time. In particular, the amplitude of unvoiced segments is generally much lower than the amplitude of voiced segments. The short-time energy of the speech signal provides a convenient representation that reflects these amplitude variations. In general, the short-time energy can be defined as

$$E_n = \sum_{m=n-N+1}^{n} x^2(m)$$

One difficulty with the short-time energy function is that it is very sensitive to large signal levels.
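As a rough illustration of that sensitivity, the short-time energy can be computed directly from its definition. The sketch below is an assumption-laden toy in Python: the function name `short_time_energy` is hypothetical, samples before index 0 are treated as zero, and the two ten-sample frames are made up for the demonstration.

```python
def short_time_energy(x, n, window):
    """E_n: sum of x[m]**2 over the `window` samples ending at index n.

    Samples before index 0 are treated as zero.
    """
    start = max(0, n - window + 1)
    return sum(v * v for v in x[start:n + 1])


quiet = [0.1] * 10
spiky = [0.1] * 9 + [1.0]  # the same frame with one large sample
print(short_time_energy(quiet, 9, 10), short_time_energy(spiky, 9, 10))
```

Because each sample is squared, the single sample that is ten times larger than its neighbours raises the frame energy by roughly a factor of eleven.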
A simple way of alleviating this problem is to define an average magnitude function

$$M_n = \sum_{m=n-N+1}^{n} |x(m)|$$

where the sum of absolute values of the signal is computed instead of the sum of squares. Here N is the window length. The major significance of $E_n$ or $M_n$ is that it provides a basis for distinguishing voiced speech segments from unvoiced speech segments. The energy function can also be used to locate approximately the time at which voiced speech becomes unvoiced, and vice versa, and to distinguish speech from silence.

Section 5 describes the synthesis of the speech signals. A computer program was written for this purpose, and a "Sound Blaster" sound card was used. The program digitizes the speech with the help of the card's utility programs and processes the speech samples to obtain the synthetic speech; the same utilities then convert the synthetic speech samples back into an analog speech signal.

The speech is analyzed in short time intervals (windows) of 20 ms. To find the predictor coefficients, the autocorrelation method is applied and the autocorrelation values are computed. The autocorrelation equations are then solved with the Levinson-Durbin recursive solution technique, giving the predictor coefficients for each window.

For pitch period estimation, the modified autocorrelation analysis algorithm is used (Mar, 1990). In this algorithm, the autocorrelation sequence of the prediction error signal, $R_e(k)$, is expressed in terms of the autocorrelation sequence of the actual input, $R_x$, and the autocorrelation sequence of the prediction coefficients, $R_a$:

$$R_e(k) = \sum_{j=-p}^{p} R_a(j)\, R_x(k-j)$$

The autocorrelation function for the coefficients is defined as

$$R_a(j) = \sum_{i=0}^{p-|j|} \tilde{a}_i\, \tilde{a}_{i+|j|}$$

where $\tilde{a}_0 = 1$ and $\tilde{a}_i = -a_i$ for $i \geq 1$ are the coefficients of the prediction error filter. The pitch is detected by finding the peak of the normalized autocorrelation sequence $R_e(k)/R_e(0)$ in the time interval that corresponds to 15% to 73% of the selected window. If the value of this peak is at least 0.25, the window is considered voiced, with a pitch period equal to the lag $k$ at the peak divided by the sampling frequency. If the peak value is less than 0.25, the frame is considered unvoiced and the pitch is set to zero.

After the filter coefficients, the pitch period, and the voiced/unvoiced classification have been estimated, the relation

$$y_n = \sum_{k=1}^{p} a_k y_{n-k} + G x_n$$

is used to synthesize the speech. Here $\{a_k\}$ are the coefficients of the digital filter; $\{x_n\}$ are unit impulses spaced at the estimated pitch period when the speech is voiced, and random white noise when the speech is unvoiced; $\{y_n\}$ is the output of the prediction filter; $p$ is the number of coefficients; and $G$ is the gain parameter, obtained from

$$G^2 = R(0) - \sum_{k=1}^{p} a_k R(k)$$

where $R(k)$ is the autocorrelation sequence of the input signal.

Many synthetic speech signals were produced. After each synthetic signal was computed, its waveform was plotted and compared with the real one. The two waveforms were observed to be nearly identical and to sound similar. The computer program written for these computations in Microsoft Quick Basic version 7.1 is given in Appendix A.
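The per-window analysis described above (autocorrelation values, Levinson-Durbin recursion, gain, and the 0.25 threshold on the normalized residual autocorrelation) can be sketched as follows. This is a minimal Python illustration under stated assumptions, not a port of the Quick Basic program in Appendix A: all function names are hypothetical, the lag search range is passed in explicitly as `lo_lag` and `hi_lag` rather than derived from the window length, the prediction error filter is taken as 1, -a_1, ..., -a_p, and the 8 kHz sampling rate in the demo is made up.

```python
import math


def autocorrelation(frame, max_lag):
    """R(k) = sum_n frame[n] * frame[n + k] for k = 0..max_lag."""
    n = len(frame)
    return [sum(frame[i] * frame[i + k] for i in range(n - k))
            for k in range(max_lag + 1)]


def levinson_durbin(r, p):
    """Solve the autocorrelation equations for a_1..a_p.

    Returns (a, g2) with g2 = R(0) - sum_k a_k R(k), the squared gain.
    """
    a, err = [], r[0]
    for i in range(1, p + 1):
        acc = r[i] - sum(a[j] * r[i - 1 - j] for j in range(i - 1))
        k = acc / err if err > 0.0 else 0.0  # guard silent frames
        a = [a[j] - k * a[i - 2 - j] for j in range(i - 1)] + [k]
        err *= 1.0 - k * k
    return a, err


def residual_autocorrelation(a, r_x, max_lag):
    """R_e(k) = sum_{j=-p..p} R_a(|j|) * R_x(|k - j|) for k = 0..max_lag.

    `r_x` must contain lags 0..max_lag + p.
    """
    p = len(a)
    c = [1.0] + [-a_k for a_k in a]  # assumed prediction error filter
    r_a = [sum(c[i] * c[i + j] for i in range(p + 1 - j)) for j in range(p + 1)]
    return [sum(r_a[abs(j)] * r_x[abs(k - j)] for j in range(-p, p + 1))
            for k in range(max_lag + 1)]


def detect_pitch(frame, a, fs, lo_lag, hi_lag, threshold=0.25):
    """Return (voiced, pitch_period_in_seconds) for one window."""
    r_x = autocorrelation(frame, hi_lag + len(a))
    r_e = residual_autocorrelation(a, r_x, hi_lag)
    if r_e[0] <= 0.0:
        return False, 0.0
    peak_lag = max(range(lo_lag, hi_lag + 1), key=lambda k: r_e[k])
    if r_e[peak_lag] / r_e[0] >= threshold:
        return True, peak_lag / fs
    return False, 0.0


# Toy window: impulses every 4 samples, analysed with p = 2.
window = [1.0, 0.0, 0.0, 0.0] * 4
coeffs, g2 = levinson_durbin(autocorrelation(window, 2), 2)
gain = math.sqrt(max(g2, 0.0))
voiced, period = detect_pitch(window, coeffs, 8000.0, lo_lag=2, hi_lag=10)
print(coeffs, gain, voiced, period)
```

The returned coefficients, gain, voicing flag and pitch period are exactly the four parameters the synthesis relation needs for one window.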

Similar Theses

  1. Türkçe fonemler için en uygun ana dalgacık fonksiyonunun araştırılması

    The investigation of optimum mother wavelet function for Turkish phonemes

    ÖZKAN ARSLAN

    Master's

    Turkish

    2014

    Electrical and Electronics Engineering

    Ege Üniversitesi

    Department of Electrical and Electronics Engineering

    ASST. PROF. DR. ERKAN ZEKİ ENGİN

  2. Düşük bit hızlarında konuşma kodlama ve uygulamaları

    Low bit rate speech coding and applications

    TARIK AŞKIN

  3. Doğrusal öngörü ile konuşma işareti kodlayıcısı tasarımı

    Design of a linear predictive speech coder

    YILMAZ KIRÇİÇEK

    Master's

    Turkish

    2007

    Computer Engineering and Computer Science and Control

    Yıldız Teknik Üniversitesi

    Department of Communications

    PROF. DR. VEDAT TAVŞANOĞLU

  4. Das Leseverstehen allgemein und im deutsch als Fremdsprache-unterricht

    No title translation available

    SERPİL BAL

    Master's

    German

    1991

    German Language and Literature

    İstanbul Üniversitesi

    DR. PİA ANGELA GÖKTÜRK

  5. Enhancement of the coded speech using filtering

    Filtreleme kullanarak kodlanmış sesin iyileştirilmesi

    SALİH SİNAN TAYLAN

    Master's

    English

    2017

    Electrical and Electronics Engineering

    Işık Üniversitesi

    Department of Electrical and Electronics Engineering

    ASSOC. PROF. DR. ÜMİT GÜZ

    ASSOC. PROF. DR. HAKAN GÜRKAN