Veritabanı sistemlerinde sorgulama optimizasyonu ve SQL standardları

Çelik, Güler
Süreli Yayın başlığı
Süreli Yayın ISSN
Cilt Başlığı
Fen Bilimleri Enstitüsü
Bu çalışmada sorgulama optimizasyonu ve 1987 yılından itibaren standart bir dil olarak tanımlanan SQL (Structured Query Language) sorgulama dilinin, sorgulama optimizasyonundaki özel faydalan dilin özellikleri ile birlikte incelenmiştir. SQL, ilk kez 1987 yılında International Standard Organization (ISO) tarafından standard olarak tanımlanmıştır. Bu çalışmada, SQL dili incelenirken temel alman standardın versiyonu 1992 tarihli uluslararası standarttır (ISO 1992). SQL dilinin neden önemli olduğu, veri tabanı dünyasında ne denli ciddiye alınması gerektiği ve neden standart bir dil olarak tanımlandığı yine bu çalışmanın kapsamında incelenmiştir. SQL dilinin özellikleri vurgulanırken, dilin özelliklerinin daha iyi kavranması amacıyla örneklere yer verilmiştir. Bu çalışma kapsamında örnek bir sistem seçilmiş ve bu sistemin veri tabam tasannu yapılmıştır. Veri tabam tasarımında, gerçek hayatta uygulanagelen yöntemlere yakın bir yöntem kullanılmıştır. Bu yöntemde veritabanı yaşam süreci olarak kabul edilmekte ve veri ortak bir kaynak olarak görülmektedir. Yöntem gereksinim formülasyonu, mantıksal tasarım ve gerçekleştirme gibi adımlardan sonra veri tabanının bilgi işlem gereksinimleri altoda verimliliğini hedefleyen bir yöntemdir. Mantıksal kayıt erişim sayılan ve taşıma hacmi değerleri yaklaşımı ile yapılan tasarımın performansı incelenir ve elde edilen bu değerleri en aza indirerek performansı artıracak yeni bir tasarım daha yapılır. Bu tez kapsamında incelenen sistem, otomobil imalat ve satışını yapan firmaya bağlı yetkili servis ve bakım istasyonundaki onarım ve yedek parça stok hareketlerinin otomasyonunu amaçlamaktadır. Bu sistemin, bahsedilen yöntem ile veri tabanın tasannu yapılmış ve bu elde edilen tasanm yeniden düzenlenmiştir. Yapılan her iki veri tabam tasannu bilgisayar ortamında denenmiş ve sonuçlar karşılaştınlmıştır.
The structure of the database management systems (DBMS) are composed of two basic parts: The data definition part and data manipulation part. Data definition refers to how the structure of the database is communicated to the DBMS. The overall structure of the database describes all the records, all the data items, all the files, all the sets, used in the database. This is usually specified through some type of special language called the data definition language (DDL). Data manipulation refers to methods used to retrieve, add, modify, and delete data in a database. There are four basic methods of data manipulation in databases: programming language interface, query, report writers, and system utilities. Query is used to inquire into database. Selecting the best possible solution to solving a query is called optimization. Actually, the system might not recognize all posible solutions to the query solution or, it might take too long to evaluate all possible solutions, so the system might just try to improve on the current solution. There are two factors to consider when attempting to optimize a query: time and space. Time refers to the time that it takes to carry out the operations to solve the query. Space refers to how much space is needed to construct the intermediate tables to solve the query. In general, the JOIN-type queries are the ones that the system has to try its best to optimize, for they are the most time consuming to execute and generate the most space for resultant tables. This means that JOINs should be placed as late as possible in the execute queue. However, it must be the job of the relational DBMS to optimize queries, and this should be transparent. From the programmer's and the system's query optimization point of view, it is generally considered better to write a query with as little nesting and imbed ordering as possible. Query languages are stand-alone languages that allow rapid access to the database. These languages do not have to be compiled/translated or linked before they are used. They are designed primarly for retrieval of data, although data maintenance use is usually allowed. Query languages are two category. The first category is called command-oriented-query languages. This category encomppasses all query languages in which the commands are specified in English-like text. The second category is called screen-oriented languages. In this category, the user enters commands through a fill-in-the-blank mechanism. This thesis presents the importance of the SQL in the query optimization. SQL is currently without rival as a standard database language. Every major supplier currently provides database products that either based on SQL or have an SQL interface. Besides being a standard in its own right, SQL also provides an important tool for defining other key standards. Most important research on database technology has been centred on SQL since the mid seventies in both academic community and industry research labs. In that context, it has been a focal point for investigating such crucial technological issues as query optimization, data distribution and distributed processing. SQL, whose name is derived from Structured Query Language, is in the command-oriented-query languages category. SQL is a relational database language. SQL is a comprehensive database language; it has statements for data definition, query, and update. Hence, it is both DDL and DML. In addition, it has facilities for defining views on the database, for creating and dropping indexes on the files that represent relations, and for embedding SQL statements into a general-purpose programming language such as PL/1, COBOL or PASCAL. Each of this is discussed on this thesis. SQL as such is a database sub-language, which means that it is not a complete language in its own right since it contains no control commands. SQL provides definitional language facilities for data definition, for integrity constraint definition and, to some extent, for acces control definition. For data manipulation, it provides a module language that consists of procedures caontaining statements that can retrieve, store and modify various ways. SQL is not a database management system, nor is it a stand-alone product. SQL is an integral part of database management system, a language and a viii tool for communicating with the database management system. SQL links the components of the typical DBMS, such as forms facility, report writer, interactive query facility, together. The database engine is the heart of the DBMS, responsible for actually structring, storing, and recieving the data on the disk. It accepts requests from other DBMS components, from user-written application programs, and even from other computer systems. SQL plays many different roles: SQL is an interactive query language. Users type SQL commands into an interactive SQL program to retrieve data and display it on the screen. In addition to this, SQL is a database programming language. Programmers embed SQL commands into their application programs to acces the data in a database. The database administrator responsaible for managing a minicomputer or mainframe database uses SQL to define the database structure and control acces to the stored data. Personal computer programs use SQL to communicate over a local area network with database servers that stor shared data. Many new applications are using this client/server architecture, which minimizes network traffic and allows both PCs and servers to do what they do best. SQL is also a distributed database language. Distributed database management systems use SQL to help disribute data across many connected computer systems. The DBMS software on each system uses SQL to communicate with the other systems, sending requests for data access. Finaly, SQL is a database gateway language. In a computer network with a mix of differnt DBMS products, SQL is often used in a gateway that allows one brand of DBMS to communicate with another brand. SQL is an important language as it is the first and so far only standard database language to gain wide acceptance. The SQL standard language is used in other standards or devoloping standards as a definitional tool. The development of the language is supported by considerable academic interest, providing both a theoretical basis for the language and the techniques needed the implement it succesfully. This is true in the areas of query optimization, distrubition of data, privacy and security. This thesis includes a sample database design. Although the methodology emphasizes the logical design step, the activities of requirements analysis and physical design are also addressed. The methos accepts database life ix cycle concept. The design steps are requirement formulation and and anlysis taking place. In the logical design step address the design and refinement of an information structure through the consolidation of the user information requirements specifications. In order to meausure of project the performance of the design logical record acces approach is used. Logical record acces caounts are made to estimate the excepted number of occurences of each record type retrieved explicitly in the course of executing an application. When weighted by processing frequencies, they show which application will probably require the most I/O accesses to and from the database. If logical record access counts can be specified for each application by record type, it is easy to use their results to calculate transport volume, or the total amount of the data flow between the application programs and the database management system. The transport volume due to one record type in one application is just the product of that record type's size and its number of logical record accesses in that application. This may be summed over all record types to calculate the transport volume in bytes for the application. Finally the various application frequencies and summed to obtain the global transport vaolume. By noting which applications are dominat in terms of frequency of executaion, logical record accesses per unit time and transport volume, one can determine where design improvements will be most effective. Using this database design method, a sample database designed in this thesis. The system, aims to operate maintenance, restoration of automobile and automation of spare part and stock in the authorized service and repairs stations of a automobile company. Service objects to use computer in each step of reparing of a automobile and in the automation of reserve piece and stock. For this reason, a data base is needed to bring into existence. First a data base design had done and implemented, then examined in computer environment, for data base performance. The IBM AS/400 consists of a environment is used for this study. Then logical record access counts and transport volumes calculated. Improvements had done on the design and new design re-implemented and examined. The comparison charts and SQL programs for both systems are given in appendixes. The main objective of this study is to optimize queries in database systems and express the advantages of using SQL standards in query optimization.
Tez (Yüksek Lisans) -- İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 1994
Anahtar kelimeler
Mühendislik Bilimleri, SQL, Sistem analizi, Sorgulama dili, Veri tabanı sistemleri, Engineering Sciences, Structured query language, System analysis, Query lanquage, Database systems