Çoklu Koşullu Rassal Alanlar Kullanarak Türkçe Biçimbilimsel Belirsizlik Giderme (Turkish Morphological Disambiguation Using Multiple Conditional Random Fields)

Date
2013-02-18
Authors
Ehsani, Razieh
Publisher
Fen Bilimleri Enstitüsü
Institute of Science and Technology
Abstract
This work addresses the problem of morphological disambiguation in Turkish. It approaches the problem with a statistical machine learning method, namely conditional random fields.
This thesis presents the results of main part-of-speech tagging and full morphological disambiguation of Turkish sentences using multiple Conditional Random Fields (CRFs). Although CRFs have been applied to many different languages for part-of-speech (POS) tagging, Turkish poses interesting challenges for modeling with them. The challenges include issues related to the statistical model of the problem as well as issues related to computational complexity and scaling. In this thesis, we propose a novel model for main-POS tagging in Turkish. Furthermore, we propose some approaches that reduce the computational complexity and allow better scaling characteristics, or that improve the performance without increased complexity. These approaches are discussed with respect to their advantages and disadvantages. We show that the best approach is competitive with the current state of the art in accuracy and also in training and test durations. The good results obtained represent a good first step towards full morphological disambiguation.
Description
Tez (Yüksek Lisans) -- İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 2013
Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 2013
Keywords
koşullu rassal alanlar, Türkçe, biçimbilimsel, belirsizlik, conditional random fields, hidden markov models, morphological disambiguation, Turkish
Citation