FBE - Computer Engineering Graduate Program - Doctorate
Browsing FBE - Computer Engineering Graduate Program - Doctorate by Sustainable Development Goal "Goal 9: Industry, Innovation and Infrastructure"
Item
An integrated architecture for information extraction from documents in Turkish
(Institute of Science and Technology, 2009-12-25) Adalı, Şerif ; Sönmez, Coşkun A ; 504012098 ; Computer Engineering
In this study, ontology-based information extraction and document layout analysis techniques are integrated to extract domain-specific events and entities. The proposed "Concept Zoning" technique makes extraction concepts easy to define and increases the portability of the IE system: it requires only concept definitions, whereas comparable approaches rely on large sets of linguistic patterns. The proposed architecture works well when applied to restricted-domain applications. It also successfully detects data in tabular, list, or itemized form. When an unknown event is encountered, concept similarity is calculated by comparing the concepts in the input document against the concepts in the ontology, and new attributes, key concept nodes, and concept properties are incrementally added to the knowledge base by the user. The domain ontology is enriched by adding newly discovered instances. Experimental results indicate that a high-performance document processing system has to cover a sufficient number of lexical resources, extraction concepts, and document models. In addition, document layout analysis is used to detect unknown entity types, and the approach verifies the extracted information and the relations among them by using key values defined for each domain event.
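The abstract does not say how concept similarity is computed. As a rough illustration only, the sketch below assumes a Jaccard overlap between the set of concepts found in a document and the concept set of each event in the ontology. The function names, the threshold, and the sample ontology are all hypothetical and are not taken from the thesis.

```python
from typing import Dict, Optional, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard overlap of two concept sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_matching_event(
    doc_concepts: Set[str],
    ontology_events: Dict[str, Set[str]],
    threshold: float = 0.5,  # assumed cut-off, not from the thesis
) -> Tuple[Optional[str], float]:
    """Return the most similar known event, or None if nothing reaches the threshold."""
    best_name: Optional[str] = None
    best_score = 0.0
    for name, concepts in ontology_events.items():
        score = jaccard(doc_concepts, concepts)
        if score > best_score:
            best_name, best_score = name, score
    if best_score < threshold:
        # Treated as an unknown event: a user would extend the knowledge base.
        return None, best_score
    return best_name, best_score


# Hypothetical ontology and document concepts, for illustration only.
ontology_events = {
    "meeting": {"date", "place", "participant", "agenda"},
    "purchase": {"date", "item", "price", "supplier"},
}
doc_concepts = {"date", "item", "price", "quantity"}

match, score = best_matching_event(doc_concepts, ontology_events)
print(match, score)
```

In this sketch, a document whose best score falls below the threshold is reported as unmatched, which corresponds loosely to the "unknown event" case in which the user adds new concepts to the knowledge base.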