LEE Computer Engineering Graduate Program
Browsing LEE Computer Engineering Graduate Program by author "Akgül, Abdullah"

Item: Memory-based approaches to problems in probabilistic modeling
(Lisansüstü Eğitim Enstitüsü, 2022-10-25) Akgül, Abdullah ; Ünal, Gözde ; 504201504 ; Computer Engineering

Deep neural networks are a widely accepted solution for many problems in machine learning; however, applying them to safety-critical areas such as health care is still an active research topic. To be employed in such fields, a deep neural network is expected to fit the in-domain data set, provide calibrated predictions on problematic regions of the target domain, and separate out-of-domain queries. Even though these requirements have been studied extensively, the studies are highly fragmented, and no existing model satisfies all of them simultaneously.

Continual Learning (CL) is a framework that aims to learn numerous tasks sequentially. An ideal CL method adapts to new tasks without forgetting previous ones. However, neural networks suffer from catastrophic forgetting, a performance drop on previously learned tasks caused by a newly learned task. CL is nevertheless crucial for building intelligent systems that can adapt to environmental change. For this reason CL is an active topic, but research on it focuses mainly on image classification, with limited work on time-series classification and none on multimodal dynamics modeling.

In this thesis, we employ an external memory to deal with problems in probabilistic modeling. Our solutions can be summarized as follows:

i) Evidential Turing Processes (ETP): We give the first definition of total calibration. After investigating two Bayesian paradigms, the Bayesian Model and the Evidential Bayesian Model, we introduce the Complete Bayesian Model (CBM), which unifies the two. We develop ETP as an instance of CBMs with neural episodic memory.
We build a pipeline to evaluate the models' performance for total calibration. We compare our solution, the ETP member of CBMs, with state-of-the-art members of other paradigms, and we also provide an ablation study. We investigate the models' performance on five real-world data sets: one time-series classification task and four image classification tasks. Furthermore, we test the models on corrupted versions of data sets. We use four metrics: test error for prediction accuracy, Expected Calibration Error (ECE) for in-domain calibration, Negative Log-Likelihood (NLL) for model fit, and area under the ROC curve for out-of-domain detection. We report that only the ETP excels in all three aspects of total calibration simultaneously.

ii) Continual Dynamic Dirichlet Process (CDDP) for Continual Learning of Multimodal Dynamics: We introduce a new problem, CL of multimodal dynamics. Since the problem is novel, we construct a baseline from existing approaches. For this new problem, we introduce a novel solution that employs an external memory to transfer knowledge between tasks. We build a pipeline for this newly introduced problem in which new tasks arrive sequentially and each task has a certain number of samples from different modes. Differences in task order may lead to different results in CL setups; therefore, we change the task order for each replication. We also generate synthetic data sets and adapt time-series classification data sets to evaluate the models' performance on the problem. We compare the models using Normalized Mean Squared Error (NMSE) as a measure of prediction accuracy and NLL as a measure of Bayesian model fit that quantifies uncertainty. We find that our approach, CDDP, compares favorably to the established parameter transfer approach in CL of multimodal dynamical systems.
To sum up, our experiments in this thesis show that an external memory architecture can be used both to calibrate neural networks for use in safety-critical areas and for CL of multimodal dynamics.
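
To make the evaluation metrics named in the abstract concrete, the following is a minimal Python sketch of three of them: Expected Calibration Error, Negative Log-Likelihood, and Normalized Mean Squared Error. It is not the thesis code. The function names, the choice of ten equal-width confidence bins, and the normalization of the MSE by the variance of the targets are all assumptions made for illustration; the thesis may define these quantities differently.

```python
import math


def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE with equal-width confidence bins (assumed binning scheme).

    confidences: predicted probability of the chosen class, in [0, 1].
    correct: 1 if the prediction was right, 0 otherwise.
    """
    n = len(confidences)
    bins = [[] for _ in range(n_bins)]
    for conf, hit in zip(confidences, correct):
        # A confidence of exactly 1.0 is placed in the last bin.
        idx = min(int(conf * n_bins), n_bins - 1)
        bins[idx].append((conf, hit))
    ece = 0.0
    for members in bins:
        if not members:
            continue
        avg_conf = sum(c for c, _ in members) / len(members)
        accuracy = sum(h for _, h in members) / len(members)
        # Weight each bin's |accuracy - confidence| gap by its share of samples.
        ece += (len(members) / n) * abs(accuracy - avg_conf)
    return ece


def negative_log_likelihood(probs, labels, eps=1e-12):
    """Mean negative log-probability assigned to the true class."""
    total = sum(math.log(max(p[y], eps)) for p, y in zip(probs, labels))
    return -total / len(labels)


def normalized_mse(predictions, targets):
    """MSE divided by the variance of the targets (assumed normalization)."""
    n = len(targets)
    mean_t = sum(targets) / n
    var_t = sum((t - mean_t) ** 2 for t in targets) / n
    mse = sum((p - t) ** 2 for p, t in zip(predictions, targets)) / n
    return mse / var_t
```

Under this normalization, a predictor that always outputs the mean of the targets scores an NMSE of 1, so values below 1 indicate an improvement over that trivial baseline. For ECE, a model that is 90% confident on four samples but correct on only three of them has a gap of 0.15 in that bin.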