Türk dili için konuşma üretme

No title translation available.

Thesis No: 75579
Author: NİHAL ALICI
Advisors: PROF. DR. EŞREF ADALI
Thesis Type: Master's
Subjects: Computer Engineering and Computer Science and Control
Keywords: Not specified.
Year: 1998
Language: Turkish
University: İstanbul Teknik Üniversitesi
Institute: Fen Bilimleri Enstitüsü
Department: Control and Computer Engineering
Discipline: Not specified.
Number of Pages: 107

Abstract

Natural Language Processing (NLP), as a branch of Artificial Intelligence, which aims to reproduce human functions, tries to make natural language understandable and usable by computers. In this respect, linguistics also falls within the scope of NLP. This thesis enables the computer to pronounce Turkish and thereby allows visually impaired people to read and write. Similar work exists for other languages, chiefly English. However, because the sound and syllable structures of languages differ, that work does not meet the need for Turkish.

The work carried out can be summarized in the following stages:
1- Examining the sound structure of Turkish.
2- Determining the sound units for Turkish.
3- Recording and filtering the sound samples.
4- Transferring the sound samples to the database.
5- Establishing the syllabification algorithm and the other algorithms related to reading.
6- Implementing the software.

In the second stage, two-letter syllables were chosen as the smallest sound unit, and all other sounds were derived from them.

The software, implemented in an Object-Oriented and Event-Driven manner, contains an editor section and a reading section. In the editor section, text typed by the user or an existing text is read aloud. Here a visually impaired user can learn to use the keyboard by hearing what they type. The written document can also be saved to a file. The reading section has a similar screen layout but contains a non-interactive page. The user opens an existing file, and its content is both displayed and read aloud. In addition, the software voices all operations performed, the warning messages at critical points, and, by making use of mouse movements, the command buttons. The user therefore does not need anyone else's help, not only while writing and reading but also while operating the program.

Software used: Borland Delphi 2.0, Borland Paradox 4.5, Cool Edit 96, Media Player

Abstract (Translation)

A SPEECH PRODUCTION SYSTEM FOR TURKISH LANGUAGE

This study is related to Artificial Intelligence (AI) research. The aim of such research is to teach computers to understand spoken communication, because language is meant for communicating about the world. The largest part of human linguistic communication occurs as speech; written language is a fairly recent invention. Processing written language is easier, in some ways, than processing speech. Building a system that understands spoken language requires all the facilities of a written-language understanding system, plus enough additional knowledge to handle the noise and ambiguities of the audio signal. The average person can speak at about twice the rate at which a proficient typist can type. Finally, such studies would remove the keyboard as an obstacle in human/machine dialog, which would encourage less technically oriented people to make more extensive use of machines.

Speech Processing is a branch of Natural Language Processing. There are three kinds of Speech Processing applications:
1- Speech Recognition
2- Speech Understanding
3- Speech Production

Speech Recognition is more difficult than the others, because all languages have structures that are very complex for computers, and the same meaning can be expressed in different ways. For this reason, far more projects have been carried out in speech recognition than in the other two areas. There is a very close parallel between the problems, approaches, and accomplishments of speech recognition and those of computer vision. In speech recognition research, the more general case using a deeper level of understanding has been solved to a reasonable degree of accuracy through a great deal of effort and considerable expense.

In speech understanding systems, the input is generally speech, which the computer processes; in other words, these systems convert speech to written text. In speech production systems, by contrast, the output is speech.
That is, written text is converted to speech. Every language needs different algorithms for this operation, because the structures, phonemes, and pronunciation of natural languages differ greatly. This thesis presents a speech production system for the Turkish language.

For Natural Language Processing (NLP), this maxim holds: "to enhance the likelihood of success, restrict the problem domain". Speech production is the problem of translating text into speech rather than vice versa. It has also been attacked with neural networks. Speech production is easier than speech recognition and speech understanding, so high-performance programs can be written. Studies of this kind exist for English, but this system was developed for speech production in Turkish. The aim is to implement a speech production system that converts a text file to speech in Turkish. The input to the system is ordinary written text, and the output is the acoustic rendition of that text.

Linguists have long studied the rules governing the translation of text into speech units called phonemes. A traditional approach to the problem would be to write all these rules down and use a production system to apply them. However, most of the rules have exceptions, and these exceptions must also be programmed in. A connectionist approach is simply to present a network with words and their pronunciations, and hope that the network will discover the regularities and remember the exceptions.

In this thesis, the method used is the concatenation of the waveforms of the phonemes. The smallest unit that can be used is the phoneme. Using phonemes minimizes storage needs, but it requires interpolation at the phoneme boundaries and decreases the quality of the resulting speech. In the written language, each letter corresponds to one phoneme.
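As an illustration of the concatenation step and of the interpolation needed at unit boundaries, the following Python sketch joins sample sequences with a short linear crossfade. It is not the thesis's Delphi implementation: the function name `concat_units`, the `overlap` parameter, and the use of plain lists of float samples are assumptions made for this example.

```python
from typing import List, Sequence


def concat_units(units: Sequence[Sequence[float]], overlap: int = 4) -> List[float]:
    """Join waveform units end to end, blending `overlap` samples at each seam."""
    out: List[float] = []
    for unit in units:
        samples = list(unit)
        # Never blend more samples than either side actually has.
        n = max(0, min(overlap, len(out), len(samples)))
        base = len(out) - n
        for k in range(n):
            # The weight ramps from the outgoing unit toward the incoming one.
            w = (k + 1) / (n + 1)
            out[base + k] = (1.0 - w) * out[base + k] + w * samples[k]
        # Append whatever part of the incoming unit was not blended.
        out.extend(samples[n:])
    return out
```

With `overlap=0` the function reduces to plain concatenation; a larger overlap smooths each seam at the cost of shortening the output by up to `overlap` samples per joint.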
The phonemes of the Turkish language are stored on the hard disk to form a phoneme waveform library. The first step is to obtain the speech samples from the speaker. These samples are filtered and stored on the hard disk. Only one word at a time is taken from the speaker and stored, and the analysis is performed only after all the needed words have been recorded. After syllabification, each syllable is looked up in the database. If it is found, it is replaced by its phonetic equivalent; if not, it is produced from the stored phonemes.

The Turkish alphabet consists of 29 letters: a, b, c, ç, d, e, f, g, ğ, h, ı, i, j, k, l, m, n, o, ö, p, r, s, ş, t, u, ü, v, y, z
Eight of these letters are vowels: a, e, ı, i, o, ö, u, ü
The remaining 21 are consonants: b, c, ç, d, f, g, ğ, h, j, k, l, m, n, p, r, s, ş, t, v, y, z

In the Turkish language, there are six kinds of syllables:
Vowel
Vowel - Consonant
Consonant - Vowel
Consonant - Vowel - Consonant
Vowel - Consonant - Consonant
Consonant - Vowel - Consonant - Consonant

There can be only one vowel in a syllable, and only one consonant before the vowel, except in the first syllable of a word. In the first syllable, therefore, any consonants that come before the vowel belong to that vowel.

All two-letter syllables and the eight vowels are stored. Syllables with more than two letters are not stored; they must be produced by combining one-letter and two-letter units. For this reason, consonants are also stored on their own. As a result, 504 different sounds are first stored as WAV files on the hard disk of the PC. These sounds and their file paths are then stored in a database created with Paradox 4.5, from which the system retrieves the sound records.
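The six syllable patterns listed in this summary suggest a simple splitting rule, sketched here in Python: every syllable holds exactly one vowel, and when consonants stand between two vowels, the last of them opens the next syllable. This is a common textbook rule for Turkish and not necessarily the procedure used in the thesis. The names `VOWELS`, `turkish_lower`, and `syllabify` are illustrative, and the input is assumed to be a single word in precomposed (NFC) Unicode.

```python
from typing import List

VOWELS = set("aeıioöuü")


def turkish_lower(text: str) -> str:
    """Lowercase text, mapping I to dotless ı and İ to i as Turkish requires."""
    return text.replace("I", "ı").replace("İ", "i").lower()


def syllabify(word: str) -> List[str]:
    """Split one Turkish word into syllables, one vowel per syllable."""
    word = turkish_lower(word)
    vowel_positions = [i for i, ch in enumerate(word) if ch in VOWELS]
    if not vowel_positions:
        # No vowel: nothing to split on, return the word unchanged.
        return [word] if word else []
    syllables = []
    start = 0
    for cur, nxt in zip(vowel_positions, vowel_positions[1:]):
        # Adjacent vowels split between them; otherwise the last
        # consonant of the cluster starts the next syllable.
        cut = nxt if nxt == cur + 1 else nxt - 1
        syllables.append(word[start:cut])
        start = cut
    syllables.append(word[start:])
    return syllables
```

For example, `syllabify("bilgisayar")` returns `["bil", "gi", "sa", "yar"]`, and a word-initial cluster as in `syllabify("tren")` stays in one syllable, `["tren"]`.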
The kinds of sounds:
20 consonants * 8 vowels = 160 (ba, be, bı, bi, ... zö, zü)
21 consonants * 8 vowels = 168 (ğ included) (ab, eb, ıb, ib, ... öz, üz)
21 consonants * 8 vowel contexts = 168 ((a)b, (e)b, (ı)b, ... (u)z, (ü)z)
8 vowels = 8 (a, e, ı, i, o, ö, u, ü)
TOTAL = 504 sounds

While performing this operation, the words that conform to the pronunciation rule are replaced by their phonetic equivalents. The letters that correspond to two phonemes are classified according to the rule, and the appropriate phonemes are inserted in their place. The resulting speech needs further work before the human ear can understand it better: more signal processing on the resulting waveform, and recording the speech samples in a better environment and under better conditions to eliminate the noise introduced.

The system consists of three parts:
1- Editor: This part is for writing and listening to text. The system reads aloud every word the user enters. File operations such as open, save, and close can all be done here. There is also a 'Read Key Mode', which helps blind users learn and use the keyboard.
2- Reading: In this part, the system reads existing files aloud. The contents of the file cannot be changed.
3- Parameters: All parameters related to the operation of the program can be changed by the user.

The system is designed as an Object-Oriented and Event-Driven program with Borland Delphi 2.0. It contains four forms, including the main menu.

At design time, the first step is to obtain the speech samples from the speaker. These samples are filtered with an FFT filter and stored on the hard disk as WAV files. Normally, if connected speech is used in the analysis, detecting word boundaries is a problem, especially in noisy environments. Therefore, only one word at a time is taken from the speaker and stored, and the analysis is performed only after all the needed words have been recorded. Production is performed using the phoneme waveforms found in the analysis step.
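The total of 504 stored sounds given in the breakdown earlier in this summary can be checked by enumerating the units. The Python sketch below assumes that the consonant left out of the consonant-vowel group is ğ (the breakdown states only that ğ is included in the other groups) and borrows the `(a)b` notation for a consonant recorded after a given vowel. All names are illustrative.

```python
VOWELS = "aeıioöuü"
CONSONANTS = "bcçdfgğhjklmnprsştvyz"


def build_inventory():
    """Return the four groups of stored sound units as lists of labels."""
    # Consonant + vowel syllables; ğ is assumed to be the excluded consonant.
    cv = [c + v for c in CONSONANTS if c != "ğ" for v in VOWELS]
    # Vowel + consonant syllables, ğ included.
    vc = [v + c for c in CONSONANTS for v in VOWELS]
    # Lone consonants, one label per preceding-vowel context.
    lone = [f"({v}){c}" for c in CONSONANTS for v in VOWELS]
    vowels = list(VOWELS)
    return {"cv": cv, "vc": vc, "lone_consonant": lone, "vowel": vowels}


inventory = build_inventory()
total = sum(len(group) for group in inventory.values())
print({name: len(group) for name, group in inventory.items()}, total)
```

The group sizes come out as 160, 168, 168, and 8, which sum to 504 and match the figures in the summary.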
When the user finishes typing a word, one word at a time is extracted from the text. The words in the written text are first divided into syllables by the syllabification procedure. Then the waveforms, that is, the recordings of all phonemes in the syllable, are concatenated in time to form the vocal equivalent of the written text. The resulting speech is devoid of intonation and prosody, since correct intonation requires that the meaning of the text be understood, which is outside the scope of this work. Numerals in the text can be converted into words and thus vocalized. Punctuation marks such as the full stop and the comma are taken into account, and corresponding pauses are inserted into the waveform. At this time, however, the resulting speech needs further work before the human ear can understand it better: more signal processing on the resulting waveform, and recording the speech samples in a better environment to eliminate the noise introduced.

Software used: Borland Delphi 2.0, Borland Paradox 4.5, Cool Edit 96, Media Player
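Converting numerals to words before synthesis, as the summary mentions, might look like the following Python sketch. It covers only 0 to 9999 and leaves longer digit runs untouched. The range limit and the names `number_to_words` and `expand_numerals` are assumptions for illustration, not details taken from the thesis.

```python
import re

ONES = ["", "bir", "iki", "üç", "dört", "beş", "altı", "yedi", "sekiz", "dokuz"]
TENS = ["", "on", "yirmi", "otuz", "kırk", "elli", "altmış", "yetmiş", "seksen", "doksan"]


def number_to_words(n: int) -> str:
    """Spell out an integer from 0 to 9999 in Turkish."""
    if not 0 <= n <= 9999:
        raise ValueError("only 0..9999 is supported in this sketch")
    if n == 0:
        return "sıfır"
    thousands, rest = divmod(n, 1000)
    hundreds, rest = divmod(rest, 100)
    tens, ones = divmod(rest, 10)
    parts = []
    if thousands:
        # Turkish says "bin" for 1000, not "bir bin".
        if thousands > 1:
            parts.append(ONES[thousands])
        parts.append("bin")
    if hundreds:
        # Likewise "yüz" for 100, not "bir yüz".
        if hundreds > 1:
            parts.append(ONES[hundreds])
        parts.append("yüz")
    if tens:
        parts.append(TENS[tens])
    if ones:
        parts.append(ONES[ones])
    return " ".join(parts)


def expand_numerals(text: str) -> str:
    """Replace each run of up to four ASCII digits with its Turkish spelling."""
    def repl(match):
        digits = match.group()
        # Longer runs are outside this sketch's range, so keep them as-is.
        if len(digits) > 4:
            return digits
        return number_to_words(int(digits))
    return re.sub(r"[0-9]+", repl, text)
```

For example, `number_to_words(1998)` returns `"bin dokuz yüz doksan sekiz"`. The expanded text can then be passed through syllabification like any other words.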

Similar Theses

Thesis No
631494
Yabancı dil olarak Türkçe öğretiminde yazma becerisinin temel ve orta seviyelerde hedef dil bağlamında karşılaştırılması
A comparison of writing skill in the context of target language at the elementary and intermediate levels of Turkish as a foreign language
MELİKE NUR ÇEP
Master's
Turkish
2021
Education and Training, İstanbul Arel Üniversitesi
Department of Turkish as a Foreign Language
ASSOC. PROF. DR. ALİ TAŞTEKİN
Thesis No
779818
Dava metinlerine söylem çözümlemesi bağlamında bir yaklaşım denemesi: Yassıada davaları örneği
An attempt to approach the case texts in the context of discourse analysis: Yassıada case sample
EMİNE DAMLA TURAN
Doctorate
Turkish
2023
Linguistics, Karadeniz Teknik Üniversitesi
Department of Turkish Language and Literature
PROF. DR. ASİYE MEVHİBE COŞAR
Thesis No
724649
Postmodernizmin Türk Dili ve Edebiyatı Öğretimine etkileri
The effects of postmodernism on Turkish Language and Literature Teaching
SELMA ÖZHAN
Master's
Turkish
2022
Education and Training, Van Yüzüncü Yıl Üniversitesi
Department of Turkish and Social Sciences Education
ASSOC. PROF. DR. FETHİ DEMİR
Thesis No
845647
Yabancı dil olarak Türkçe öğretimi'nde klasik edebî metinlerin kullanımı ve Samipaşazade Sezai'nin 'Kediler' adlı hikâyesinin B1 seviyesine göre sadeleştirilmesi
Using classical literary works at teaching Turkish as a foreign language and simplification literary work of the Samipaşazade Sezai's story named 'Kediler' for B1 level
SİBEL ÜNLÜTÜRK
Master's
Turkish
2024
Education and Training, Bursa Uludağ Üniversitesi
Department of Turkish and Social Sciences Education
ASSOC. PROF. DR. LEVENT ALİ ÇANAKLI
Thesis No
315065
A cross-cultural study of Americans, Turkish and Kazakh EFL students' use of English speech acts: Apology, request and complaint
Türk ve Kazak İngilizce öğrencileri ve Amerikalıların özür, rica ve şikâyet üretimleri üzerine kültürlerarası karşılaştırmalı bir çalışma
NURZHANAT AMETBEK
Master's
English
2012
Education and Training, Hacettepe Üniversitesi
Department of Foreign Languages Education
PROF. DR. MEHMET DEMİREZEN
