Türkçe sözlük analizi

dc.contributor.advisor Harmanci, Emre
dc.contributor.author Bayramoğlu, Arzu
dc.contributor.authorID 19318
dc.contributor.department Control and Automation Engineering en_US
dc.date.accessioned 2023-03-16T05:59:45Z
dc.date.available 2023-03-16T05:59:45Z
dc.date.issued 1991
dc.description Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 1991 en_US
dc.description.abstract In Turkey, a growing number of documents in private and public workplaces are written using computers and word processors. Turkish word processors, which offer many functions, let users enter their documents in a given format. This alone, however, is not sufficient for document quality. For this reason, facilities have been implemented that analyse the entered text and offer suggestions for errors. The simplest operation is word analysis. The agglutinative structure of Turkish, the complex word forms that result from it, and the rules of Turkish are the factors that make the analysis difficult. In this thesis, Turkish words are analysed without going into syntactic and morphological relations. The aim is to analyse words while a Turkish document is being written and to warn the user when an incorrect word is encountered. en_US
dc.description.abstract In Turkey, a growing number of documents in public and private offices are prepared with computers and word processors. Word processors offer many functions for entering and formatting documents, but this alone does not ensure the quality of a document. For this reason, tools have been developed that analyse the text and suggest corrections for spelling errors. The simplest task is analysing individual words. The agglutinative nature of Turkish, the complex word formations that result from it, and the rules of the language are the factors that make this analysis more difficult.

This thesis develops a system for analysing Turkish words without going into morphological relations. In other words, we do not have to prove that the word "yapıt" is formed from the root "yap" and the derivational suffix "-it"; we put the word "yapıt" directly in the root dictionary.

The word analysis process investigates whether a written word is valid according to Turkish word structure. To do so, one must either search for the word in a list of all possible Turkish words, or try to generate it from a list of roots, a list of suffixes and a set of word-generation rules. The first method cannot be used for an agglutinative language such as Turkish.

All morphemes occurring at the beginning of words (roots and stems) are covered by the root morpheme dictionary. Loan words, emphatic forms, some female and male personal names, family names, nationality names, city names, substantives, pronouns, adjectives, verbs, adverbs, postpositions, conjunctions and interjections are all in the root dictionary. The word classes used for root morphemes are as follows:

* IS substantive
* S adjective
* I substantival (used either as an adj. or a sub.)
* ZM pronoun
* F verb
* FO negative verb
* ZF adverb
* ED postposition
* BA conjunction
* UN interjection
* ÖZ proper name
* SO question word
* FI verbal substantival (used either as a verb or a sub.)
* FS verbal adjectival (used either as a verb or an adj.)

Because we do not go into morphological relations, we deal only with conjugational suffixes. Some derivational suffixes are nevertheless included in the suffix dictionary, since it would not be sensible to put every word formed with them in the root dictionary. These suffixes are as follows: [suffix table not preserved in this record; legend: 1 = derivational suffix]

Almost all Turkish suffixes are subject to the vowel and consonant harmony rules. This means that a Turkish morpheme may often have 2, 4, 8, 16 or even 24 allomorphs. According to vowel harmony, unrounded vowels (a, ı, e, i) are followed by unrounded vowels, and rounded vowels (o, u, ö, ü) are followed by low unrounded (a, e) or high rounded (u, ü) vowels.

The Turkish consonant harmony rules are the following:

- Turkish words do not end in certain voiced consonants (b, c, d, g), with a few exceptions such as ad, öd, hac.
- The final voiceless consonants (p, ç, t, k) become (b, c, d, ğ) respectively when they are followed directly by a vowel, except that a final "k" preceded by "n" becomes "g" instead of "ğ".

The analysis process starts with the spelling check. If the word fails this subprocess, the entire process stops. If it succeeds, the word enters the word analysis subprocess, which requires more time than the spelling check. The word analysis subprocess involves:

- Root recognition with a dictionary look-up, which determines where the root morpheme ends and the suffix morphemes begin.
- Suffix recognition.
- Testing the root and the suffixes for structural validity. Turkish roots can be classified into two main classes: substantival and verbal. The verbal class comprises the verbs, while the substantival class comprises nouns and adjectives. The suffixes that each of these groups can take are different. The detailed paradigms for verbal and substantival grammars are as follows:

  VERBAL WORD PARADIGM
  (1) root -- (y)mi*ş | ma*li* | (i*)yor | (y)a*ca*k* -- person -- d*i*r
  (2) root -- main tense -- person
  (3) root -- m(a*) -- person
  (5) i -- (y)mi*ş | (y)d*i* | (y)sa* -- person
  (6) root -- (a*)r/i*r | (y)mi*ş | ma*li* | (y)a*ca*k* -- (y)ken
  (7) root -- 0R/UL generating suffix except (y)ken
  (8) root -- si*n(la*r) | yı*n(ı*z)

- Testing for harmony rules.

An important part of the system is the database, which contains Turkish roots, suffixes and their properties. en_US
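The two harmony rules stated in the abstract can be illustrated with a minimal Python sketch. This is not the thesis implementation: the function names `obeys_rounding_harmony` and `soften_final` are hypothetical, input is assumed to be lowercase already, and the vowel rule is coded exactly as the abstract words it (rounding only). The consonant rule is applied unconditionally here, whereas a real system would presumably consult per-root properties stored in the database, since not every root alternates.

```python
# Sketch of the harmony rules as worded in the abstract (assumed helper names).
UNROUNDED = set("aıei")
ROUNDED = set("ouöü")
VOWELS = UNROUNDED | ROUNDED
ALLOWED_AFTER_ROUNDED = set("aeuü")  # low unrounded (a, e) or high rounded (u, ü)

# Final voiceless consonant -> voiced counterpart before a vowel.
SOFTEN = {"p": "b", "ç": "c", "t": "d", "k": "ğ"}


def obeys_rounding_harmony(word: str) -> bool:
    """Check every vowel against the vowel that precedes it.

    Assumes `word` is lowercase; Python's str.lower() does not map
    'I' to dotless 'ı', so case folding is left to the caller.
    """
    vowels = [ch for ch in word if ch in VOWELS]
    for prev, cur in zip(vowels, vowels[1:]):
        if prev in UNROUNDED and cur not in UNROUNDED:
            return False
        if prev in ROUNDED and cur not in ALLOWED_AFTER_ROUNDED:
            return False
    return True


def soften_final(root: str) -> str:
    """Return the form a root takes before a vowel-initial suffix."""
    if not root:
        return root
    if root.endswith("nk"):
        # Special case from the abstract: "k" after "n" becomes "g", not "ğ".
        return root[:-1] + "g"
    last = root[-1]
    if last in SOFTEN:
        return root[:-1] + SOFTEN[last]
    return root


if __name__ == "__main__":
    print(obeys_rounding_harmony("okul"))    # True
    print(obeys_rounding_harmony("doktor"))  # False (loan word)
    print(soften_final("kitap"))             # kitab
    print(soften_final("renk"))              # reng
```

Loan words such as "doktor" fail the vowel check even though they are valid entries, which is consistent with the abstract placing loan words directly in the root dictionary rather than validating them by rule.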
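The root and suffix recognition steps listed in the abstract can be sketched as a dictionary look-up that tries every prefix of the word as a root and then segments the remainder into known suffixes. The dictionaries below are tiny hypothetical samples, not the thesis database, and the names `ROOTS`, `SUFFIXES`, `split_suffixes` and `analyse` are assumptions made for illustration. The sketch deliberately omits the spelling check, the structural validity test against the word paradigms, and the harmony test.

```python
from __future__ import annotations

# Hypothetical sample data: root -> word class code from the abstract's list.
ROOTS = {"ev": "IS", "göz": "IS", "yap": "F", "yapıt": "IS"}
# Hypothetical sample of suffix allomorphs.
SUFFIXES = {"ler", "lar", "de", "da", "im", "ım"}


def split_suffixes(rest: str) -> list[list[str]]:
    """Return every way to cut `rest` into known suffixes.

    An empty remainder has exactly one segmentation: the empty list.
    """
    if not rest:
        return [[]]
    results = []
    for i in range(1, len(rest) + 1):
        head = rest[:i]
        if head in SUFFIXES:
            for tail in split_suffixes(rest[i:]):
                results.append([head] + tail)
    return results


def analyse(word: str) -> list[tuple[str, str, list[str]]]:
    """List (root, word class, suffixes) candidates, longest root first."""
    analyses = []
    for i in range(len(word), 0, -1):
        root = word[:i]
        if root in ROOTS:
            for suffixes in split_suffixes(word[i:]):
                analyses.append((root, ROOTS[root], suffixes))
    return analyses


if __name__ == "__main__":
    print(analyse("evlerde"))  # [('ev', 'IS', ['ler', 'de'])]
    print(analyse("yapıt"))    # [('yapıt', 'IS', [])]
    print(analyse("evxyz"))    # []
```

Because "yapıt" is stored as a root in its own right, the look-up accepts it without needing a derivational suffix, which mirrors the design choice described in the abstract. An empty result corresponds to the case where the user would be warned about an invalid word.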
dc.description.degree M.Sc. en_US
dc.identifier.uri http://hdl.handle.net/11527/23547
dc.language.iso tr
dc.publisher Institute of Science and Technology en_US
dc.rights All works uploaded to the institutional repository are protected by copyright. They may be viewed from this source for any purpose, but reproduction or distribution in any format is prohibited without written permission. en_US
dc.subject Computer Engineering and Computer Science and Control en_US
dc.subject Word analysis en_US
dc.subject Turkish en_US
dc.title Türkçe sözlük analizi tr_TR
dc.title.alternative Turkish word analysis en_US
dc.type Master Thesis tr_TR
Files
Original bundle
Name:
19318.pdf
Size:
3.62 MB
Format:
Adobe Portable Document Format