Analysis and visualization of two and three dimensional data structures using a rational cubic spline function: A case study on water, natural gas, and electricity data in Istanbul
Date
2025-02-14
Authors
Yavuz, Zübeyde
Publisher
Graduate School
Abstract
A spline is an interpolation technique that connects a series of data points with a smooth curve. It is built from piecewise polynomials, one defined between each pair of consecutive data points, and the derivatives of the resulting function are continuous. The curve therefore passes from one piece to the next without corners or sudden changes. Because splines use piecewise functions instead of a single higher-degree polynomial, they also prevent the excessive oscillations, i.e. unnecessary fluctuations, that higher-degree polynomial interpolation can produce. Splines are preferred when working with continuous data sets in data visualization, trend modeling, engineering mathematics, computer graphics, numerical analysis, climate science, bioinformatics, machine learning and artificial intelligence. Applied to real data sets, spline interpolation yields functions that pass smoothly and continuously between the available data points. Such functions are also used to estimate missing or noisy data, to determine trends and to model the dynamics of a system. Many spline methods exist in the literature, and the appropriate type depends on the accuracy, computation and shape control requirements imposed by the structure of the data set and the modeling needs. In this thesis, different spline types are examined, and the linear spline, cubic spline and rational cubic spline methods known in the literature are used. The linear spline provides a fast and simple solution, whereas the cubic spline increases interpolation accuracy through smooth transitions.
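The contrast between the linear and the cubic spline can be sketched in a few lines of Python. This is a minimal illustration, not the thesis code: the monthly readings are made-up values, and the use of `numpy.interp` and `scipy.interpolate.CubicSpline` is an assumption about tooling.

```python
import numpy as np
from scipy.interpolate import CubicSpline

# Hypothetical monthly readings (illustrative values, not thesis data).
months = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
usage = np.array([12.0, 15.0, 11.0, 9.0, 14.0, 13.0])

# Dense grid on which both interpolants are evaluated.
grid = np.linspace(months[0], months[-1], 101)

# Linear spline: straight segments, continuous but with corners at the knots.
linear = np.interp(grid, months, usage)

# Cubic spline: piecewise cubics with continuous first and second derivatives.
cubic = CubicSpline(months, usage)
smooth = cubic(grid)
```

Both curves pass through every data point; they differ only in how they behave between the points, which is where the choice of spline type matters.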
The rational cubic spline has the advantage of supporting additional control conditions such as positivity, and it is examined in detail in this study. Among the many forms of the rational cubic spline, we use the form with four free parameters and C^1 continuity conditions. Shape control analyses were performed to show how these parameters stretch the bends of the curve or otherwise control its shape, and to examine how well they produce the desired shapes of parametric curves. In particular, the analyses examined how well rational cubic spline curves fit the underlying function. With the help of the shape control parameters, the rational cubic spline can preserve the positivity of the curve. Positivity is important for data with a physical meaning, such as the water, natural gas and electricity consumption data used here, which cannot be negative. In addition, three derivative selection methods were used for the rational cubic spline: the arithmetic method, the geometric method and the central difference method. These derivative selections were evaluated for their effect on the accuracy of the curve and the success of the modeling. The arithmetic method provides an averaged transition, the geometric method captures proportional changes better, and the central difference method increases interpolation accuracy through a symmetric derivative calculation. The Peano Kernel Theorem is used to determine how close the rational cubic spline curve is to the function, to test the accuracy of the spline derivatives, and to determine upper bounds on the error in differential and integral error estimates. As a tool for the error analysis of spline interpolation, it also allows the approximation errors of different spline types to be compared.
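The three derivative selection methods can be illustrated with a short sketch. The abstract does not give the formulas, so the ones below are an assumption: they follow a formulation common in the shape-preserving spline literature, restricted to interior knots, and the function name `interior_derivatives` is hypothetical.

```python
import numpy as np

def interior_derivatives(x, y):
    """Estimate derivatives at interior knots with three textbook rules.

    Assumed formulas, not necessarily those used in the thesis.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    h = np.diff(x)              # knot spacings h_i
    delta = np.diff(y) / h      # secant slopes Delta_i
    h_left, h_right = h[:-1], h[1:]
    d_left, d_right = delta[:-1], delta[1:]
    total = h_left + h_right

    # Arithmetic: weighted average of the two neighbouring secant slopes.
    arithmetic = (h_right * d_left + h_left * d_right) / total

    # Geometric: weighted geometric mean of the slopes. Setting it to zero
    # where the slopes change sign or vanish is an assumption of this sketch.
    same_sign = d_left * d_right > 0
    safe_left = np.where(same_sign, np.abs(d_left), 1.0)
    safe_right = np.where(same_sign, np.abs(d_right), 1.0)
    magnitude = safe_left ** (h_right / total) * safe_right ** (h_left / total)
    geometric = np.where(same_sign, np.sign(d_left) * magnitude, 0.0)

    # Central difference: slope of the chord through the two neighbours.
    central = (y[2:] - y[:-2]) / (x[2:] - x[:-2])
    return arithmetic, geometric, central
```

On uniformly spaced knots the arithmetic and central difference estimates coincide; they differ only when the spacing is uneven. End-point derivatives need separate one-sided formulas, which are omitted here.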
These analyses reveal which parameters should be optimized, and in what way, to increase the precision of spline interpolation. The real data sets used in this study are the water, natural gas and electricity consumption data of Istanbul province, obtained from the official website of the Istanbul Metropolitan Municipality (IMM) and the Istanbul Water and Sewerage Administration (ISKI). The visualizations were produced in the Python programming language. Because water, natural gas and electricity consumption can never be negative, the interpolation process must use a method that does not allow negative values. The spline interpolation methods used here, especially the rational cubic spline, were applied so that the consumption amounts remain compatible with physical reality. Applied to the real data sets, the linear spline, cubic spline and rational cubic spline methods give a smooth and understandable representation of the consumption data in time series format. They make it possible to estimate values that were not measured, or that were measured but are missing, and to perform precise analyses on the consumption data. Since consumption does not change regularly over time, it matters that spline interpolation is sensitive to the changes and fluctuations that occur and reflects them in its results. These interpolation methods and visualization techniques provide valuable information for municipalities, energy and water distribution companies, environmental researchers and city planners.
Examining consumption trends with these visualizations and models is important for the efficient use of resources, for establishing a supply-demand balance and for sustainability. It also supports taking precautions against possible problems when planning maintenance and repair processes and infrastructure investments. In the two-dimensional visualizations, the linear spline, cubic spline and rational cubic spline were used, and the derivatives for the rational cubic spline were calculated with the arithmetic, geometric and central difference methods. Each interpolation and derivative calculation method is drawn in its own color so that users can compare the performance of the methods. The two-dimensional visualizations present the change of consumption over time in a simple and understandable way and make increases or decreases in a given period easier to observe. In addition, error analyses were performed to evaluate the accuracy of the interpolation methods and to compare their performance. Five error metrics were used: mean absolute error (MAE), mean square error (MSE), coefficient of determination (R^2), root mean square error (RMSE) and sum of squared errors (SSE). In this way, the accuracy and reliability of the different methods on the data sets could be compared. For the three-dimensional visualizations, bicubic spline and rational bicubic spline interpolation methods were used. To obtain the three-dimensional images, the SciPy package in Python was used for cubic spline interpolation. For the rational bicubic spline images, a rectangular region was defined and the rational cubic spline function was extended over it. These visuals present the water and natural gas consumption of 39 districts of Istanbul province in detail, with changes on a yearly and monthly basis.
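The five error metrics named above can be computed as in the following sketch. RMSE is taken here as the root mean square error, and the function name `error_metrics` is hypothetical. The abstract does not state at which points the interpolants were compared with the observations, so the inputs are simply two arrays of equal length.

```python
import numpy as np

def error_metrics(observed, predicted):
    """Return MAE, MSE, RMSE, SSE and R^2 for two equal-length sequences."""
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residual = observed - predicted

    sse = float(np.sum(residual ** 2))        # sum of squared errors
    mse = sse / residual.size                 # mean square error
    # Total sum of squares around the mean of the observations.
    total = float(np.sum((observed - observed.mean()) ** 2))

    return {
        "MAE": float(np.mean(np.abs(residual))),
        "MSE": mse,
        "RMSE": float(np.sqrt(mse)),
        "SSE": sse,
        "R2": 1.0 - sse / total,
    }
```

An interpolating spline reproduces the data exactly at its own knots, so a meaningful comparison needs reference values that were not used to build the curve, for example held-out observations.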
Three-dimensional visualizations allow a more holistic examination of temporal and spatial consumption patterns, and thus more comprehensive analyses for both decision makers and researchers. Working with several data sets and comparing the applications also helps to evaluate the consumption habits associated with different geographical, climatic and demographic characteristics. As a result, such interpolation and visualization approaches support faster, more strategic, technically grounded and more effective decision-making in areas such as energy and water management, and contribute to the development of policies for infrastructure and resource management that are more resilient to possible future problems.
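A bicubic spline surface over a year-by-month grid can be sketched with SciPy as follows. The abstract names SciPy but not the routine, so `scipy.interpolate.RectBivariateSpline` is an assumption, and the consumption values are synthetic rather than taken from the Istanbul data.

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

# Rectangular grid: years along one axis, months along the other.
years = np.arange(2015, 2021, dtype=float)
months = np.arange(1, 13, dtype=float)

# Synthetic consumption for one hypothetical district: a slow yearly
# increase plus a seasonal cycle (illustrative values only).
yy, mm = np.meshgrid(years, months, indexing="ij")
consumption = 100.0 + 2.0 * (yy - 2015.0) + 15.0 * np.cos(2.0 * np.pi * (mm - 1.0) / 12.0)

# kx=ky=3 gives a bicubic spline; s=0 makes it interpolate the grid values.
surface = RectBivariateSpline(years, months, consumption, kx=3, ky=3, s=0)

# Evaluate on a finer grid, ready for a three-dimensional surface plot.
fine_years = np.linspace(years[0], years[-1], 51)
fine_months = np.linspace(months[0], months[-1], 111)
dense = surface(fine_years, fine_months)
```

The array `dense` can be passed to a surface plotting routine such as Matplotlib's `plot_surface`. An ordinary bicubic spline of this kind does not guarantee positivity, which is the property the rational bicubic form is meant to add.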
Description
Thesis (M.Sc.) -- Istanbul Technical University, Graduate School, 2025
Keywords
rational cubic spline,
three dimensional data