Deep wavelet neural network for spatio-temporal data fusion
Date
2022-07-19
Authors
Kulaglic, Ajla
Publisher
Graduate School
Abstract
Machine learning (ML) algorithms have recently gained prominence in prediction problems. Constructing an accurate machine learning model is a real challenge because of the nature of the data, the number of data samples, and the accuracy and complexity of the model. This study introduces a new machine learning structure for temporal and spatio-temporal, univariate and multivariate prediction problems: the predictive error compensated neural network (PECNET) model, which combines spatio-temporal data. Temporal data contain information within the observation time window, and their bandwidth is limited by the sampling rate. Spatial data provide information about spatial location, while spatio-temporal data combine temporal and spatial resolution. The PECNET model captures both time dependencies and the spatial relationships between different data sources by fusing multivariate input patterns at multiple lengths and sampling resolutions. It achieves reliable prediction performance with relatively low model complexity and minimizes overfitting and underfitting problems. In the proposed model, additional networks predict the error of previously trained networks, and their outputs compensate the overall prediction. The main network uses data that are highly correlated with the target, taken through moving frames at multiple scales. PECNET improves time series prediction accuracy by enhancing orthogonal features within a data fusion scheme. The same structure and hyperparameter set are applied to quite different problems to verify the robustness and accuracy of the proposed model. Root-zone soil moisture, wind speed, financial time series, and stationary and non-stationary time series benchmark problems are selected to evaluate the PECNET model. The results show that using multiple neural networks improves prediction accuracy and helps prevent overfitting for distinct types of problems.
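The error-compensation idea in the abstract can be illustrated with a minimal sketch. It is not the thesis implementation: ordinary least-squares lines stand in for the neural networks, and the names `fit_line`, `mse`, and `error_compensated_forecast` are hypothetical. A primary model predicts the next value of a series, a secondary model predicts the primary model's residual from the previous (shifted) residual, and the two outputs are summed.

```python
import math


def fit_line(xs, ys):
    """Least-squares fit of y = a*x + b (assumed stand-in for a trained network)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    a = cov_xy / var_x if var_x else 0.0
    b = mean_y - a * mean_x
    return a, b


def mse(pred, target):
    """Mean squared error between two equally long sequences."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)


def error_compensated_forecast(series):
    """Return (primary, compensated, target) one-step-ahead predictions."""
    # Primary model: predict series[t] from series[t-1].
    x1, y1 = series[:-1], series[1:]
    a1, b1 = fit_line(x1, y1)
    primary = [a1 * x + b1 for x in x1]
    residual = [y - p for y, p in zip(y1, primary)]

    # Secondary model: predict residual[t] from the shifted residual[t-1].
    a2, b2 = fit_line(residual[:-1], residual[1:])
    correction = [a2 * r + b2 for r in residual[:-1]]

    # The first primary output has no earlier residual, so it is dropped.
    compensated = [p + c for p, c in zip(primary[1:], correction)]
    return primary[1:], compensated, y1[1:]


if __name__ == "__main__":
    # Synthetic series chosen only for illustration.
    demo = [math.sin(0.3 * t) + 0.5 * math.sin(0.9 * t) for t in range(200)]
    primary, compensated, target = error_compensated_forecast(demo)
    print("primary MSE:    ", mse(primary, target))
    print("compensated MSE:", mse(compensated, target))
```

Because the secondary fit includes an intercept, its in-sample squared error cannot exceed that of applying no correction, so the compensated training error is never worse than the primary one in this toy setting. Whether the same holds on unseen data depends on the series.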
The first part of this dissertation focuses on designing and implementing the proposed PECNET model. The algorithm is implemented in the Python programming language, and its performance is evaluated on stochastic and chaotic time series benchmark problems from the literature. The results highlight several aspects of PECNET implementations. The major contribution of the proposed method is improved prediction accuracy for distinct types of time series data (chaotic and stochastic) using multiple neural networks, where the secondary network is trained on the shifted time series of the primary network's prediction error. Overfitting is avoided owing to an increase in recurrence-related feedback. The same structure and hyperparameter set are applied to a wide range of time series prediction problems with moving frames at multiple scales. Preprocessing the input data with the discrete wavelet transform (DWT) improves accuracy more than feeding the time series data directly to the neural network in predictive error compensation.
The second part of the dissertation introduces PECNET for the stock price prediction problem. The selected data are non-stationary time series. Because traditional normalization techniques have difficulty with non-stationary time series data, an average normalization method is proposed: the average value of the current input to the neural networks is computed and subtracted from the individual input values. The proposed normalization method represents the different volatilities and preserves the original properties within each input sequence. Stock price time series at different frequencies are used together in one neural network, while an additional network uses the previous residual errors as inputs. An updated learning method is applied in this part, which enhances the overall prediction performance.
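The average normalization step can be sketched as follows. This is an illustration under stated assumptions, not the thesis code: the function names `sliding_windows`, `average_normalize`, and `denormalize` are hypothetical, and adding the window mean back to a prediction is an assumed inverse step that the abstract does not spell out.

```python
def sliding_windows(series, length):
    """Split a series into overlapping input windows of the given length."""
    return [series[i:i + length] for i in range(len(series) - length + 1)]


def average_normalize(window):
    """Subtract the window's own mean from every value in the window."""
    mean = sum(window) / len(window)
    return [value - mean for value in window], mean


def denormalize(prediction, mean):
    """Assumed inverse step: shift a normalized prediction back to price level."""
    return prediction + mean


if __name__ == "__main__":
    # Illustrative prices only; not data from the thesis.
    prices = [100.0, 102.0, 104.0, 150.0, 152.0, 154.0]
    for window in sliding_windows(prices, 3):
        normalized, mean = average_normalize(window)
        print(window, "->", normalized, "mean:", mean)
```

Two windows with the same shape but different price levels map to the same normalized input, so the model sees local movement and not the drifting level of a non-stationary series.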
In the third part, the improved PECNET model enables the selection of orthogonal features in data fusion applications. Different data types can be fused into a single model by extracting valuable knowledge from multivariate time series data. This extraction is done by checking the correlation between the remaining features and the residual error: PECNET chooses the data that correlate most highly with the residual error of the previously trained network. It is well known that irrelevant features cause overfitting in forecasting models, which is a critical issue given the number of samples and the number of available features. Restricting the feature set to the essential features therefore reduces the computational cost of the learning process and improves accuracy by minimizing overfitting.
In the fourth chapter, the root-zone soil moisture problem is introduced. For this purpose, in-situ agrometeorological measurements and satellite remote sensing indices are used. The distance between the central point and known stations is calculated. Root-zone soil moisture is estimated in three ways: using only accumulated ground-based measurements as input data, using only remote sensing indices, and combining both. Applying PECNET to the spatio-temporal root-zone soil moisture estimation problem shows promising results. The results can be used to obtain a soil moisture map of neighboring points where sensor information is unavailable.
The fifth part examines the decomposition of input time series data into frequency bands and the applicability of different filtering methods. For this purpose, the Butterworth filter is implemented and used as an additional filtering method. In addition to the closing stock price as input data, Far-Eastern stock market indices are included to provide the spatial dimension for the financial time series forecasting example.
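The correlation-based feature selection described in the third part can be sketched in a few lines. This is a simplified illustration, not the thesis implementation: `pearson` and `select_feature` are hypothetical names, and ranking by the absolute value of the Pearson correlation is an assumption, since the abstract only states that the data correlating most highly with the residual error are chosen.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def select_feature(residual, candidates):
    """Pick the candidate feature most correlated with the residual error.

    `candidates` maps a feature name to its values, aligned with `residual`.
    Using the absolute correlation is an assumption of this sketch.
    """
    scores = {
        name: abs(pearson(values, residual))
        for name, values in candidates.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


if __name__ == "__main__":
    # Illustrative numbers only; not data from the thesis.
    residual = [1.0, 2.0, 3.0, 4.0, 5.0]
    candidates = {
        "feature_a": [2.0, 4.0, 6.0, 8.0, 10.5],
        "feature_b": [1.0, -1.0, 1.0, -1.0, 1.0],
    }
    best, scores = select_feature(residual, candidates)
    print("selected:", best)
    print("scores:", scores)
```

In a cascade, the selected feature would become the input of the next network, and the procedure would repeat on the new residual with the remaining features.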
The overall results show that fusing spatial and temporal data in a separately trained, cascaded PECNET model can achieve promising results without causing overfitting or reducing model performance. The proposed wavelet-preprocessed PECNET also leaves room for improvement through other preprocessing techniques and different types of neural networks.
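The wavelet preprocessing mentioned in the abstract can be illustrated with a single-level discrete wavelet transform. The abstract does not name the wavelet family, so the Haar wavelet used here is an assumption chosen for brevity, and `haar_dwt` and `haar_idwt` are hypothetical helper names. The transform splits a signal into a low-frequency approximation band and a high-frequency detail band, each half the original length.

```python
import math


def haar_dwt(signal):
    """Single-level Haar DWT: return (approximation, detail) coefficients."""
    if len(signal) % 2:
        raise ValueError("signal length must be even")
    scale = math.sqrt(2)
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) / scale for a, b in pairs]   # low-frequency band
    detail = [(a - b) / scale for a, b in pairs]   # high-frequency band
    return approx, detail


def haar_idwt(approx, detail):
    """Invert haar_dwt and rebuild the original signal."""
    scale = math.sqrt(2)
    signal = []
    for a, d in zip(approx, detail):
        signal.extend([(a + d) / scale, (a - d) / scale])
    return signal


if __name__ == "__main__":
    # Illustrative values only.
    samples = [4.0, 6.0, 10.0, 12.0, 8.0, 8.0]
    approx, detail = haar_dwt(samples)
    print("approximation:", approx)
    print("detail:       ", detail)
    print("reconstructed:", haar_idwt(approx, detail))
```

Applying `haar_dwt` again to the approximation band yields coarser scales, which is one way to obtain inputs at multiple resolutions.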
Description
Thesis (Ph.D.) -- Istanbul Technical University, Graduate School, 2022
Keywords
data,
artificial intelligence,
time series,
time domain analysis,
time-frequency analysis