Türkçe Metin Seslendirme

Şentürk, Tuncay

dc.contributor.advisor	Adalı, Eşref	tr_TR
dc.contributor.author	Şentürk, Tuncay	tr_TR
dc.contributor.department	Bilgisayar Mühendisliği	tr_TR
dc.contributor.department	Computer Engineering	en_US
dc.date	2010	tr_TR
dc.date.accessioned	2010-06-25	tr_TR
dc.date.accessioned	2015-04-07T13:59:33Z
dc.date.available	2015-04-07T13:59:33Z
dc.date.issued	2010-06-29	tr_TR
dc.description	Tez (Yüksek Lisans) -- İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 2010	tr_TR
dc.description	Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 2010	en_US
dc.description.abstract	Bu çalışmada temel amaç, Türkçe metinlerin insan sesine dönüştürülebilmesi ve “Türkçe Metin Seslendirme” sisteminin geliştirilmesidir. Bu sistem geliştirilirken üç farklı yöntem incelenmiş, uygulanmış ve aralarındaki anlaşılırlık istatistiksel olarak ölçülmüştür. İlk olarak, “çift-ses (diphone) eklemeli yöntem” uygulanmıştır. Anlaşılırlığı düşük olmasa da doğallıktan uzak sonuçlar elde edilmiştir. Bunun üzerine, donanım maliyetinin de azalması ile, çift-ses eklemeye nazaran günümüz koşullarında daha kabul görmüş “hece eklemeli yöntem” geliştirilmiştir. Anlaşılırlık olarak ve ses kalitesinde olumlu yönde fark olduğu istatistiksel olarak ispatlanmıştır. Son olarak, ses süre ve şiddetinin değiştirilmesi suretiyle, vurgu ve tonlamada da başarılı sonuçlar elde edilmiştir. Tüm çalışmalar için gerekli ses dosyalarının hazırlanması amacıyla önce Türk Dil Kurumunun ses veritabanı kullanılmıştır. Ancak bu veritabanında kelimelerin vurgulu ve iki farklı kişi (erkek ve kadın) tarafından karışık olarak okunmuş olması dolayısıyla çok olumlu sonuçlar elde edilememiştir. Daha sonra, yazılan program vasıtası ile MBROLA kütüphanelerinin kullanılması ile, tüm ses dosyalarının otomatik olarak oluşturulabilmesi sağlanmıştır. Oluşturulan bu ses dosyalarına, genlik dengeleme algoritması uygulanmış, ses dosyaları arasındaki en fazla ve en az genlik seviye farklılıkları aza indirgenerek anlaşılırlık arttırılmıştır. Son olarak bu hecelerin birleşme noktalarında seslerin türlerine göre belirlenen kurallar uygulanarak, gerçek ses dosyalarındaki dalga şekillerine benzer doğallık oluşturulmaya çalışılmıştır. Hazırlanan program üç ana bileşenden oluşmaktadır: • Metinden XML dosyası oluşturma : İlk bileşen, girilen metni dilbilgisi kuralları çerçevesinde, belirlenen biçimde bir XML yapısına dönüştürür. • XML’den ses üretme : Bu bileşen, belirlenen kurallar doğrultusunda hazırlanmış XML dosyasını veya katarını, Türkçe ses dosyasına dönüştürür. 
• Kullanıcı arayüzü : Programın kullanılabilmesi için hazırlanmış arayüz bileşenidir. Her iki bileşen, birbirine bağlanmıştır ve görsel arayüz ile kullanıcının girmiş olduğu metin, yine kullanıcının belirlemiş olduğu yöntem ile ses dosyasına dönüştürülüp, seslendirilir. Tüm yöntemlerin ayrı ayrı anlaşılırlığının tespit edilebilmesi için; cümleler, farklı yaş gruplarındaki insanlara dinletilmiş ve alınan cevaplara göre belirli formül yardımı ile yüz üzerinden puan verilecek şekilde hesaplama yapılarak, bir matriste sunulmuştur. Son olarak, görme engellilerin de ekran görüntüsü gerektirmeden kullanabileceği metin düzenleme program hazırlanmıştır.	tr_TR
dc.description.abstract	The main purpose of this study is to develop a Turkish text-to-speech system that converts text written in Turkish into a human voice. Three different methods were examined and implemented for this system, and their intelligibility was compared statistically. First, the diphone concatenation method was applied. Although intelligibility was not low, the results were far from natural. Therefore, given the falling cost of hardware, the syllable concatenation method, which is more widely accepted under today's conditions than diphone concatenation, was developed. A positive difference in intelligibility and sound quality was demonstrated statistically. Finally, by changing the duration and amplitude of the sounds, successful results were also obtained for stress and intonation. To prepare the audio files needed for all of this work, the sound database of the Turkish Language Association (TDK) was used first. However, because the words in this database were read with stress and by two different speakers (one male, one female) in mixed order, very favorable results could not be achieved. Then a program was written that uses the MBROLA libraries to create all the sound files automatically. An amplitude balancing algorithm was applied to these audio files, and intelligibility was increased by reducing the differences between the maximum and minimum amplitude levels across the sound files. Finally, rules determined according to the types of the sounds were applied at the junction points of the syllables, in an attempt to produce naturalness similar to the waveforms of real audio recordings. The program consists of three main components: • Text to XML: the first component converts the entered text into an XML structure of a specified format, within the framework of grammar rules. • XML to sound: this component converts an XML file or string, prepared in accordance with the specified rules, into a Turkish audio file. • User interface: the interface component prepared so that the program can be used. 
The two conversion components are linked together, and through the graphical interface the text entered by the user is converted to an audio file with the method selected by the user and then played back. To determine the intelligibility of each method separately, sentences were played to people in different age groups; their answers were converted by a formula to a score out of 100, and the results were presented in a matrix. Finally, a text editing program was developed that the visually impaired can use without needing a screen display.	en_US
dc.description.degree	Yüksek Lisans	tr_TR
dc.description.degree	M.Sc.	en_US
dc.identifier.uri	http://hdl.handle.net/11527/359
dc.publisher	Fen Bilimleri Enstitüsü	tr_TR
dc.publisher	Institute of Science and Technology	en_US
dc.rights	İTÜ tezleri telif hakkı ile korunmaktadır. Bunlar, bu kaynak üzerinden herhangi bir amaçla görüntülenebilir, ancak yazılı izin alınmadan herhangi bir biçimde yeniden oluşturulması veya dağıtılması yasaklanmıştır.	tr_TR
dc.rights	İTÜ theses are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission.	en_US
dc.subject	Türkçe	tr_TR
dc.subject	Metin seslendirme	tr_TR
dc.subject	Görme Engelliler	tr_TR
dc.subject	Turkish	en_US
dc.subject	Text to speech	en_US
dc.subject	Visually impaired	en_US
dc.title	Türkçe Metin Seslendirme	tr_TR
dc.title.alternative	Turkish Text To Speech Synthesizer	en_US
dc.type	Master Thesis	en_US

Files

Original bundle

Name: 10547.pdf
Size: 1.58 MB
Format: Adobe Portable Document Format

Collections

FBE - Computer Engineering Graduate Program - Master's