LEE - Computer Engineering - PhD
-
Item: A composed technical debt identification methodology to predict software vulnerabilities (Graduate School, 2023-10-27) Halepmollası, Ruşen ; Kühn Tosun, Ayşe ; 504162505 ; Computer Engineering
Software systems must be evolvable and maintainable to meet evolving customer requirements and technology advancements in a rapidly changing IT landscape. Technical debt refers to the cost that accumulates as a consequence of rushed design decisions, hasty code implementations, and inadequate testing, which compromise long-term software quality for short-term objectives. When technical debt remains invisible and unmanaged, it accumulates over time and, like financial debt, incurs interest payments in the form of extra effort for future development. The accumulated debt complicates software maintainability and evolvability and potentially leads to security risks. Technical Debt Management is a continuous process, and hence it is important to integrate it into the overall software development process. Software security is a quality characteristic that refers to protecting systems and networks against vulnerabilities and exploits by building secure software. By integrating security best practices throughout the software development life cycle, the risks associated with security vulnerabilities can be mitigated. To reduce the likelihood of a system's vulnerability, incorporating security-oriented thinking into systems is the better strategy, as combining functional and secure development throughout the overall life cycle offers protection at all layers of the software. Moreover, coding and design flaws are significant contributors to vulnerabilities, highlighting the importance of addressing technical debt as a means to prevent security threats. The main objective of this thesis is to explore the relationship between technical debt and software security and to provide insights that bridge the gap between technical and business stakeholders. To accomplish this objective, we collected and analyzed real-world data from various projects' GitHub repositories and the National Vulnerability Database. The vulnerability data is linked to the corresponding code changes, enabling the identification of vulnerability-inducing commits. Moreover, we prepared an additional dataset of code smells using the PMD tool to investigate the impact of code quality issues on software security. In this thesis, we focus on offering valuable insights into the relationship between technical debt and software security through the collection and analysis of real vulnerability data from open source projects. This analysis provides a deeper understanding of how technical debt impacts software security and the associated risks. First, we investigate the relationship between technical debt indicators, such as code smells and code faults, and refactoring activities, recognizing the role of refactoring in mitigating technical debt. We thereby provide empirical findings that add depth to the understanding of refactoring impact. By analyzing refactoring activities and their impact on technical debt, we aim to identify the extent to which refactoring can enhance or reduce code smells and/or faults. Then, we conduct a comprehensive analysis of technical debt indicators, including software metrics, code smells, and bugs, to predict software security risks. By examining multiple technical debt indicators, we aim to provide a holistic view of the relationship between technical debt and vulnerabilities.
This analysis will assist in identifying specific indicators that can reliably predict software security risks, thereby enabling proactive mitigation efforts. We employ two types of research methods: exploratory research and explanatory research. These methods are used to investigate various aspects of software development, each serving a distinct purpose. Both exploratory and explanatory studies play crucial roles in software engineering research. Exploratory studies enable us to explore new or poorly understood phenomena, while explanatory studies allow us to investigate cause-and-effect relationships between variables.
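The abstract mentions linking vulnerability fixes to the code changes that introduced them. Below is a minimal, hypothetical sketch of an SZZ-style heuristic for that step, assuming the fixing commit and the file and line it touches are already known from an NVD-to-commit mapping; it is an illustration only, not the thesis's actual pipeline.

```python
import subprocess

def blame_origin(repo, fix_commit, path, line_no):
    """Trace one line touched by a fixing commit back to the commit that last
    modified it before the fix: an SZZ-style candidate for the change that
    introduced the vulnerability. Arguments are placeholders for illustration."""
    out = subprocess.run(
        ["git", "-C", repo, "blame", "-L", f"{line_no},{line_no}",
         "--porcelain", f"{fix_commit}^", "--", path],
        capture_output=True, text=True, check=True,
    ).stdout
    # In porcelain output the first token of the first line is the blamed commit hash.
    return out.splitlines()[0].split()[0]

# Hypothetical usage: the fix commit and touched line would come from parsing the fix's diff.
# inducing = blame_origin("/path/to/repo", "abc1234", "src/module.c", 42)
```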
-
Item: A new key performance indicator design for academic publishing (Graduate School, 2022-06-16) Hamedgolzar, Negar ; Külekci, Muhammed Oğuzhan ; 704181014 ; Computer Sciences
Science and social science research are crucial to a nation's long-term sustainable progress in both the social and economic spheres. Advances in scientific and social science research support improvements in living standards and quality of life. Because of the growing importance of research to economic success, many countries are rapidly transitioning to a knowledge-based economy and lessening their reliance on natural resources. Research funding is crucial for the advancement of science and technology as well as for the growth of society and the economy. Bibliometric indicators are vital instruments for figuring out the extent, growth, and global distribution of research in order to recognize and evaluate its progress. Bibliometric indicators are commonly used to evaluate the scientific production, visibility, and capacity of research publications in the context of global science. These statistics are mostly based on the number of published scientific research documents and their citations. Bibliometric indicators evaluate the quantity and quality of research output, and structural indicators analyze the relationships between authors, publications, and topics of research in general science. Science and technology cannot advance unless researchers provide evidence for and publicize the results of their experiments. Keeping these factors in mind, the current study aimed to create a stronger and more objective ranking system that can evaluate the quality and quantity of a country's research output better than existing techniques. In this thesis, we have developed an evaluation metric called the AtE ratio to evaluate a country's international visibility in terms of scientific productivity. As a quantitative and qualitative indicator, all publications on a particular topic with at least one author from the target country, as well as their citations, are counted. The ratios of actual publications or citations to expected values (AtE), which are estimated based on the country's GDP and population size, are used to evaluate the country's international visibility. If the ratio is higher than one, the associated country performs well relative to its global presence. Additionally, we have created a website that allows for more flexible data processing and visualization. The underlying data volume was large: many criteria were taken into account, including the number of scientific categories, subcategories, countries, published articles, citations, and publication journals. As it was not possible to incorporate all of the outcomes in this study, the website makes it possible to display and explore the data for any desired period between 2001 and 2020 and for any chosen combination of factors. We calculated AtE ratios using four different approaches for eight different scientific areas and provided the top 20 countries with the greatest AtE ratios. The findings are shown in 19 figures and 8 tables. From the results, we notice that only Israel appears in all 32 categories. This signifies that, given its GDP share and population size, Israel is performing remarkably well in the output of science and has invested significantly in research. The majority of the top twenty countries also have high incomes.
Nonetheless, Cyprus is the only small country among the non-high-income countries to have appeared on the list 15 times, a testament to the country's tireless efforts to produce research in the best possible way. Furthermore, we also show the top ten countries with the most papers published in all categories, without any normalization. Australia, the United Kingdom, and the Netherlands appear among the top 20 countries 28, 27, and 27 times, respectively. This demonstrates that, despite having smaller populations and lower GDPs than other wealthy countries, these countries have generated science on par with them in terms of quantity and quality. Another significant element is that the United States was listed among the top 20 countries on multiple occasions. The AtE ratio represents the actual versus expected number of publications or citations, where the expectation depends on GDP or population. The United States has the largest GDP in the world and is the world's third most populous country. To be among the top 20 countries in terms of AtE ratios, a country must publish massive numbers of publications with high citations; it is remarkable that the United States has done so 20 times. Other countries with smaller GDPs than the US, such as China and India, appeared on the list only four and three times, respectively, demonstrating the vast disparity between these countries. These findings indicate the United States' vast and unequaled global power in high-quality science production. In addition, we investigated Turkey's ranking as a special case in eight scientific fields. In terms of the total number of publications, its highest global position is 17th, in the areas of "Engineering and Computer Sciences" and "Health and Medical Sciences." Similarly, Turkey's AtE ratio is greater than one only in the "Engineering and Computer Sciences" and "Health and Medical Sciences" categories, where it is 1.15 and 1.06, respectively. These findings suggest that Turkey's policymakers should focus more on scientific research in order to boost the country's science production in terms of quantity and quality. Finally, it can be said that this is a new framework that lets us examine the science production of countries from a new angle by considering their GDP share and population size. Policymakers can also better understand in which areas their country has weaknesses and strengths in the production of science and whether it is playing its part according to its size in the world.
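As a concrete reading of the AtE definition above, here is a minimal sketch that computes an actual-to-expected ratio assuming the expected count is simply the country's share of world GDP (or population) multiplied by the world total; the thesis's exact estimation model may differ, and the numbers below are illustrative only.

```python
def ate_ratio(country_count, country_share, world_total):
    """Actual-to-Expected ratio: observed publications (or citations) divided by
    the count expected from the country's share of world GDP or population."""
    expected = country_share * world_total
    return country_count / expected

# Illustrative numbers only (not taken from the thesis):
# a country producing 12,000 of 1,000,000 world publications with a 0.8% GDP share.
print(ate_ratio(country_count=12_000, country_share=0.008, world_total=1_000_000))  # 1.5 -> above expectation
```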
-
Item: AI-powered web application security mechanisms (Graduate School, 2024-12-11) Demirel Yılmazer, Dilek ; Sandıkkaya, Mehmet Tahir ; 504172515 ; Computer Engineering
In the current era of widespread digitalization, the volume of processed private and sensitive data has significantly increased due to the adoption of web-based applications. With this expansion, the need for robust cybersecurity measures to protect against external threats has grown immensely. Corporate networks traditionally served as a barrier to prevent direct access from the Internet, but attackers now target web application servers, which are the main points of contact for end users. Thus, this thesis presents AI-based mechanisms for protecting the sensitive information of companies that rely on web-based applications for data storage and exchange. As web application security becomes a top concern across industries, high-performance computing and intelligent solutions are needed to analyze and comprehend vast amounts of web application logs. Machine learning, a branch of artificial intelligence, emerges as a key technique to address these issues. Machine learning is well suited for identifying and evaluating web-based attacks since it allows computers to learn from data and predict results. The thesis explores how machine learning techniques such as regression, prediction, and classification effectively resolve common web application security problems. Researchers have applied these techniques in network management and operation, resource optimization, security analysis, and user profiling. Additionally, zero-shot learning, a technique commonly associated with natural language processing and computer vision, is proposed as a promising approach in web application security for detecting previously unseen attacks. This thesis presents AI-powered web application security mechanisms that lay the groundwork for the threat detection capabilities of ML. It focuses on malicious web request and web session detection using supervised and unsupervised approaches and makes three major contributions. First, this thesis introduces the Zero-Shot Learning approach using a Convolutional Neural Network (ZSL-CNN), which effectively tackles the high false positive rates and unbalanced data issues encountered during ML-based web application attack detection. The approach is evaluated using five distinct web request datasets, and the ZSL-CNN model outperforms other models with a remarkable true positive rate. Second, this thesis presents an innovative approach that uses machine learning-based classification to detect malicious web sessions. This technique combines an embedding layer with machine learning algorithms and demonstrates superior accuracy compared to benchmark methodologies. Finally, this thesis introduces another innovative approach that combines unsupervised learning methodologies. This approach, which focuses on web-based session security, employs two unsupervised learning algorithms to efficiently discriminate benign sessions from malicious sessions for a web application. This thesis presents a comprehensive investigation of the intersection of machine learning and web application security in the digital age, providing valuable insights and innovative solutions for protecting web applications.
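Both this work and the character-level study below feed raw HTTP requests to neural models that start with an embedding layer. The following is a minimal, hypothetical sketch of how request strings could be mapped to fixed-length integer sequences for such a layer; the actual preprocessing, vocabulary, and sequence length used in the theses are not specified here.

```python
import numpy as np

def encode_requests(requests, max_len=500):
    """Map raw HTTP request strings to fixed-length integer sequences
    (one id per character) suitable for an embedding layer; 0 is padding.
    max_len is an arbitrary illustrative choice."""
    vocab = {}
    batch = np.zeros((len(requests), max_len), dtype=np.int64)
    for i, req in enumerate(requests):
        for j, ch in enumerate(req[:max_len]):
            batch[i, j] = vocab.setdefault(ch, len(vocab) + 1)
    return batch, vocab

X, vocab = encode_requests([
    "GET /index.php?id=1 HTTP/1.1",
    "GET /index.php?id=1%27%20OR%20%271%27=%271 HTTP/1.1",  # injection-like example
])
print(X.shape, len(vocab))
```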
-
Item: Artificial intelligence based and digital twin enabled aeronautical AD-HOC network management (Graduate School, 2022-12-20) Bilen, Tuğçe ; Canberk, Berk ; 504172508 ; Computer Engineering
The number of passengers using aircraft has been increasing steadily over the years. With this increase in the number of passengers, their needs have also changed significantly. In-flight connectivity (IFC) has become a crucial necessity for passengers with evolving aeronautical technology. Passengers want to connect to the Internet without interruption, regardless of their location and the time. For these reasons, aeronautical networks attract the attention of both industry and academia. Currently, satellite connectivity and air-to-ground (A2G) networks dominate existing IFC solutions. However, the high installation/equipment cost and latency of the satellites reduce their efficiency. Also, the terrestrial deployment of A2G stations reduces the coverage area, especially for remote flights over the ocean. Aeronautical Ad-hoc Networks (AANETs) are a novel solution that can satisfy the huge demand for IFC while also addressing the shortcomings of satellite and A2G connectivity. AANETs are based on creating air-to-air (A2A) links between airplanes and transmitting packets over these connections to enable IFC. AANETs dramatically increase the Internet access rates of airplanes by widening the coverage area thanks to these established A2A links. However, mobility and atmospheric effects on AANETs increase A2A link breakages by leading to frequent aircraft replacement and reduced link quality. Accordingly, mobility and atmospheric effects create the specific characteristics of AANETs. More specifically, the ultra-dynamic link characteristics of high-density airplanes create an unstructured and unstable topology in three-dimensional space for AANETs. To handle these specific characteristics, we first form a more stable, organized, and structured AANET topology. Then, we should continuously enable the sustainability and mapping of this created AANET topology by considering broken A2A links. Finally, we can route the packets over this formed, sustained, and mapped AANET topology. However, the above-explained AANET-specific characteristics restrict the applicability of conventional topology and routing management algorithms to AANETs by increasing their complexity. More clearly, the AANET-specific characteristics make management challenging by reducing the packet delivery success of the AANET and increasing transfer delay. At that point, artificial intelligence (AI)-based solutions have been adapted to AANETs to cope with the high management complexity by providing intelligent frameworks and architectures. Although AI-based management approaches are widely used in terrestrial networks, there is a lack of a comprehensive study that supports AI-based solutions for AANETs. Here, an AI-based AANET can take topology formation, sustainability, and routing management decisions in an automated fashion by considering its specific characteristics thanks to learning operations. Therefore, AI-based methodologies have an essential role in handling the management complexity of this hard-to-follow AANET environment, as they support intelligent management architectures while also overcoming the drawbacks of conventional methodologies. On the other hand, these methodologies can increase the computational complexity of AANETs.
At that point, we propose the utilization of Digital Twin (DT) technology to handle the computational complexity issues of AI-based methodologies. Based on these, in this thesis, we aim to propose an AI-based and DT-enabled management for AANETs. This system consists of four main models: AANET Topology Formation Management, AANET Topology Sustainability Management, AANET Topology Mapping Management, and AANET Routing Management. Here, our first aim is to form a stable, organized, and structured AANET topology. Then, we enable the sustainability of this formed topology. We also continuously map the formed and sustained AANET topology to airplanes. Finally, the packets of airplanes are routed on this formed, sustained, and mapped AANET topology. We create these four models with different AI-based methodologies and combine all of them under DT technology in the final step. In the Topology Formation Management, we propose a three-phased topology formation model for AANETs based on unsupervised learning. The main reason for proposing an unsupervised learning-based algorithm is that, before forming the topology, we have independently located airplanes with unstructured characteristics in AANETs; they can be considered unlabeled training data for unsupervised learning. This management model utilizes the spatio-temporal locations of aircraft to create a more stable, organized, and structured AANET topology in the form of clusters. More clearly, the first phase corresponds to aircraft cluster formation, where we aim to increase AANET stability by creating spatially correlated clusters. The second phase consists of the A2A link determination for reducing the packet transfer delay. Finally, the cluster head selection increases the packet delivery ratio in the AANET. In the Topology Sustainability Management, we propose a learning vector quantization (LVQ) based topology sustainability model for AANETs based on supervised learning. The main reason for proposing a supervised learning-based algorithm is that we already have an AANET topology before the A2A link breakage, and we can use it for training in supervised learning. Accordingly, we can consider the clusters in the AANET topology as a pattern; then, we can find the best matching cluster of an aircraft observing A2A link breakages through pattern classification instead of continuously recreating the topology. This management model works in three phases, winning cluster selection, intra-cluster link determination, and attribute update, to increase the packet delivery ratio with reduced end-to-end latency. In the Topology Mapping Management, we propose a gated recurrent unit (GRU) based topology mapping model for AANETs. In topology formation, we create the AANET topology in the form of clusters by collecting airplanes having similar features under the same set. In topology sustainability, we sustain the formed clustered AANET topology with supervised learning. However, these formed and sustained AANET topologies must be continuously mapped to the clustered airplanes to notify them about the current situation. This procedure can be considered a part of sustainability management. Here, we continuously notify the airplanes about topological changes with the GRU at each timestamp. This management model works in two main parts: forget and update gates. In the Routing Management, we propose a Q-learning (QLR) based routing management model for AANETs.
For this aim, we map the AANET environment to reinforcement learning. Here, the QLR-based management model aims to let the airplanes find their routing paths through exploration and exploitation. Accordingly, the routing algorithm can adapt to the dynamic conditions of AANETs. In this management model, we adapt the Bellman equation to the AANET environment by proposing different methodologies for its related QLR components. Accordingly, this model consists of two main parts: current state and maximum state-action determination, and dynamic reward determination. Therefore, we execute the topology formation, sustainability, and routing management modules through unsupervised, supervised, and reinforcement learning-based algorithms, respectively. Additionally, we take advantage of neural networks in topology mapping management. After managing the topology and routing of AANETs with AI-based models, in the DT-enabled AANET management we support them with DT technology. The DT can virtually replicate the physical AANET components through closed-loop feedback in real time to solve the computational challenges of AI-based methodologies. Therefore, we introduce the utilization of DT technology for AANET orchestration and propose a DT-enabled AANET (DT-AANET) management model. This model consists of the Physical AANET Twin and a Controller that includes the Digital AANET Twin with an Operational Module. Here, the Digital AANET Twin virtually represents the physical environment, while the Operational Module executes the implemented AI-based models. Therefore, in this thesis, we aim to propose an AI-based and DT-enabled management for AANETs. In this management system, we first propose AI-based methodologies for AANET topology formation, topology sustainability, topology mapping, and routing issues. Then, we support these AI-based methodologies with DT technology. The proposed complete management model increases the packet delivery success of AANETs while reducing end-to-end latency.
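To make the Bellman-equation adaptation above concrete, here is a minimal, generic tabular Q-learning update of the kind such a routing model builds on; the states, actions, and reward shown are placeholder assumptions, not the thesis's AANET-specific definitions.

```python
from collections import defaultdict

def q_update(Q, state, action, reward, next_state, next_actions, alpha=0.1, gamma=0.9):
    """One Bellman update: Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)).
    In a routing setting, the state could be the aircraft currently holding the packet,
    the actions its candidate next-hop neighbors, and the reward a function of link
    quality and delay (all hypothetical here)."""
    best_next = max((Q[(next_state, a)] for a in next_actions), default=0.0)
    Q[(state, action)] += alpha * (reward + gamma * best_next - Q[(state, action)])

Q = defaultdict(float)  # Q-table over (state, action) pairs
q_update(Q, state="A1", action="A2", reward=1.0, next_state="A2", next_actions=["A3", "A4"])
print(Q[("A1", "A2")])
```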
-
Item: Character-level dilated deep neural networks for web attack detection (Graduate School, 2024-09-16) Moarref, Nazanin ; Sandıkkaya, Mehmet Tahir ; 504172517 ; Computer Engineering
The swift expansion of web-based technology has resulted in a rise in intricate and advanced attacks directed at web applications. An effective approach is necessary to defend against these evolving attacks. The objective of this thesis is to develop an effective method for detecting attacks. The goal is to detect attacks by utilizing Hyper Text Transfer Protocol (HTTP) requests while minimizing the complexity of the preprocessing stage. For this reason, the HTTP requests are utilized at the character level; therefore, the requests are interpreted as sequences of characters. Many studies have offered solutions to attack detection problems that leverage machine learning (ML) techniques. Feature engineering is required for many solutions in this field in order to achieve efficient performance. Nevertheless, many of the applied techniques struggle to maintain the sequential information in the input. Deep learning (DL) has garnered a lot of interest in attack detection since feature engineering is regarded as the most labor-intensive step in developing an ML system. Because DL models are able to automatically learn the feature representation and sequential patterns within any given input and generalize the feature representation efficiently, DL approaches outperform many traditional ML techniques. However, extracting long-term relationships remains a challenge for DL applications, despite their cutting-edge performance in attack detection. Larger receptive fields are necessary for convolutional neural networks (CNNs) to cover longer sequences. More layers are required for wider receptive fields, and more layers mean more parameters and a more difficult training process. Employing long short-term memory networks (LSTMs) is another efficient method for managing sequential data. Unfortunately, LSTMs still struggle to learn long-term relations because of their inability to deal with the problem of vanishing/exploding gradients. For capturing long-range dependencies in sequential data or time-series analysis, dilated neural networks are a good choice. By utilizing dilated layers and skip connections in LSTMs, the issue of vanishing/exploding gradients is mitigated. In CNNs, the receptive field can be expanded using dilated convolutions without requiring more computation or parameters. This is accomplished by inserting gaps (dilation) between the convolutional filter elements. Dilated networks are therefore well suited for tasks that necessitate comprehending dependencies across a wide range, since they can capture more contextual information. In this thesis, the performances of both the dilated LSTM and CNN-based methodologies are assessed. Two distinct methodologies based on dilated LSTMs are evaluated: dilated LSTM and dilated bidirectional LSTM (Bi-LSTM). The first layer of the dilated Bi-LSTM methodology contains Bi-LSTM blocks; consequently, the model in the Bi-LSTM layer retains the data available on both sides of each time step. With the aid of the dilated layers on top of the Bi-LSTM layer, the model reduces the vanishing/exploding gradients problem and learns the temporal relations of different scales at various levels. With the exception of the LSTM blocks in the first layer, the structure of the dilated LSTM is akin to the dilated Bi-LSTM methodology.
Multiple channels of multilayer dilated CNN blocks with different kernel sizes make up the dilated CNN-based model, MC-MLDCNN. There are multiple layers in each channel, and their dilation sizes increase exponentially. By combining a variety of channels and multiple layers of dilated CNNs, the model recognizes the correlation and interdependence within character resolution in HTTP requests at various levels and scales. Three different datasets are used to assess the efficacy of the CNN-based and dilated LSTM-based approaches in discovering the long-term dependencies of complicated attacks. The Consejo Superior de Investigaciones Científicas (CSIC) 2010 dataset, the Web Application Firewall (WAF) dataset, and the self-collected dataset, which has been gathered for nearly a decade, are all used in the experiments. The WAF dataset includes only the query portion of HTTP requests, the self-collected data includes only the Uniform Resource Identifier (URI) portion, and the full text of HTTP requests can be found in the CSIC 2010 dataset. The experimental results demonstrate that MC-MLDCNN performs better than the dilated LSTM-based models in terms of accuracy, recall, precision, and F1 score. MC-MLDCNN-based models also require less computation time and converge faster. Therefore, the methodology proposed in this thesis to detect web attacks is MC-MLDCNN. The efficiency of the proposed methodology is compared with several cutting-edge DL-based methodologies found in the literature, along with some conventional DL approaches. The experimental outcomes demonstrate the superiority of the proposed methodology using the same attack detection metrics mentioned above. Keeping the rate of categorizing normal requests as attacks (false positives) low while maintaining accurate attack detection is critical for any effective web attack detection system, since a high false positive rate (FPR) disrupts business continuity. To ensure enhanced security without compromising the availability and usability of web applications, the FPR scores are also analyzed.
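Below is a minimal PyTorch sketch of one such channel: a character embedding followed by stacked 1D convolutions whose dilation grows exponentially, widening the receptive field without adding parameters per layer. It illustrates the dilated building block only; the layer counts, per-channel kernel sizes, and the way channels are merged in MC-MLDCNN are assumptions, not the published architecture.

```python
import torch
import torch.nn as nn

class DilatedChannel(nn.Module):
    """One illustrative channel: embedding + stacked dilated 1D convolutions
    (dilation 1, 2, 4, 8, ...) + global max pooling + a binary head."""
    def __init__(self, vocab_size, emb_dim=32, kernel_size=3, n_layers=4, channels=64):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        layers, in_ch = [], emb_dim
        for i in range(n_layers):
            d = 2 ** i  # exponentially increasing dilation
            layers += [nn.Conv1d(in_ch, channels, kernel_size,
                                 dilation=d, padding=d * (kernel_size - 1) // 2),
                       nn.ReLU()]
            in_ch = channels
        self.convs = nn.Sequential(*layers)
        self.head = nn.Linear(channels, 2)  # benign vs. attack

    def forward(self, x):                       # x: (batch, seq_len) of character ids
        h = self.emb(x).transpose(1, 2)         # -> (batch, emb_dim, seq_len)
        h = self.convs(h).max(dim=2).values     # global max pooling over the sequence
        return self.head(h)

logits = DilatedChannel(vocab_size=100)(torch.randint(1, 100, (4, 500)))
print(logits.shape)  # (4, 2)
```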
-
Item: Classification of melanoma malignancy in dermatology (Lisansüstü Eğitim Enstitüsü, 2021) Gazioğlu, Bilge Süheyla ; Kamaşak, Mustafa Ersel ; 709938 ; Computer Engineering
Cancer has become one of the most common diseases all over the world in recent years. Approximately 40% of all incidences are skin cancer. The incidence of skin cancer has increased tenfold in the last 50 years, and the risk of developing skin cancer is about 20%. Skin cancer has symptoms such as abnormal tissue growth, redness, pigmentation abnormalities, and non-healing wounds. Melanoma is a rare type of skin cancer with higher mortality compared to other types of skin cancer. Melanoma can be defined as the result of uncontrolled division and proliferation of melanocytes. Worldwide, melanoma is the 20th most common cancer, with an estimated 287,723 new cases (1.6% of all cancers). In the USA, more than two hundred thousand new cases of melanoma were diagnosed in 2021, and its incidence increases more rapidly than that of other forms of cancer. Melanoma incidence increased by up to 237% in the last 30 years. In our country, Turkey, melanoma is relatively rare compared to other countries. Cancer cells display rapid growth and systematic spread. As in all types of cancer, early diagnosis is of great importance for the treatment of skin cancer. Early diagnosis improves treatment success and prognosis. To detect a melanoma, changes in the color, shape, and structure of the skin, as well as swelling and stains on the skin, are carefully examined by physicians. Besides the physician's examination, computer aided diagnosis (CAD) mechanisms are recommended for early diagnosis. In this thesis, deep learning models have been used to determine whether skin lesions are benign or malignant melanoma. The classification of the lesions is considered from two different points of view. In the first study, the effect of objects in the image and of image quality on classification performance was examined by using four different deep learning models, and the sensitivity of these models was tested. In the second study, the aim was to establish a pre-diagnosis system that could help dermatologists by proposing a binary classification (benign nevi or malignant melanoma) mechanism on the ISIC dataset. In clinical settings, it is not always possible to capture flawless skin images. Sometimes skin images can be blurry, noisy, or have low contrast. In other cases, images can contain external objects. The aim of the first study is to investigate the effects of external objects (ruler, hair) and image quality (blur, noise, contrast) using widely used Convolutional Neural Network (CNN) models. The classification performances of the frequently used ResNet50, DenseNet121, VGG16, and AlexNet models are compared, and the resilience of these models to external objects and image quality degradation is examined. Distortions in the images are discussed under three main headings: blur, noise, and contrast changes. For this purpose, different levels of image distortion were obtained by adjusting different parameters, and datasets were created for the three distortion types and distortion levels. The most common external object in skin images is hair on the skin. In addition, rulers are commonly used as a scale for suspicious lesions on the skin. In order to determine the effect of external objects on lesion classification, three separate test sets were created. These sets consist of images containing a ruler, images containing hair, and images with no external object (none).
The third set consists only of mole (lesion) images. With these three sets, the four models were trained and their classification performances were analyzed. The dataset that did not contain any object other than the lesion was expected to be classified with the highest accuracy. However, when the results were analyzed, since the image set containing hair had the highest number of images in the total dataset, the best classification performance in our system was measured with the DenseNet model on this subset. As a result of these tests, the ResNet model showed better classification performance compared to the other models. Melanoma images, unlike benign images, can be better recognized under contrast changes; we recommend the ResNet model whenever there is low contrast. Noise significantly degrades the performance on melanoma images, and the recognition rates decrease faster than those of benign lesions in the noisy set. Both classes are sensitive to blur changes. The best accuracy in the blurred and noisy datasets is obtained with the DenseNet model. Images containing a ruler decreased the accuracy, and ResNet performed better on this set. Hairy images have the best success rate in our system since they account for the maximum number of images in the total dataset. We evaluated the accuracy as 89.22% for the hair set, 86% for the ruler set, and 88.81% for the none set. As a general result of the first study, we can conclude that DenseNet can be used for melanoma classification with image distortions and degradations, since it is more resistant to image distortion. In recent years, deep learning models with high accuracy values in computer aided diagnosis systems have been used frequently in biomedical image processing research. Convolutional neural networks are also widely used in skin lesion classification to increase classification accuracy. In another study discussed in this thesis, deep learning models from five architecture families were used to classify the images in a specially created skin lesion dataset. The dataset used in this study consists of images from the ISIC dataset. In the dataset, released in 2020, there are two classes (benign and malignant) and three diagnoses (nevus, melanoma, and unknown); we only considered images with nevus and melanoma diagnoses. The dataset had 565 melanoma and 600 benign lesion images in total. We separated 115 images of the malignant melanoma class and 120 images of the benign nevi class as our test set. The rest of the data was used for model training. With pre-processing methods such as flipping and rotation, the training dataset was divided into 5 parts and the number of images in the training set was increased. DenseNet121, DenseNet161, DenseNet169, DenseNet201, ResNet18, ResNet50, VGGNet19, VGGNet16_bn, SqueezeNet1_1, SqueezeNet1_0, and AlexNet models were trained with each subset. Using these models, an ensemble system was designed in which the results of the models were combined with the majority voting method. The accuracy of the proposed model is 95.76% over the dataset.
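As an illustration of the majority-voting step described above, here is a minimal sketch that combines per-model class predictions by taking the most frequent label per sample; the model outputs shown are made-up placeholders, not results from the thesis.

```python
import numpy as np

def majority_vote(predictions):
    """Combine per-model class predictions (shape: n_models x n_samples) by
    choosing, for each sample, the class predicted by the most models."""
    predictions = np.asarray(predictions)
    return np.array([np.bincount(col).argmax() for col in predictions.T])

# Illustrative only: three models voting on four lesions (0 = benign, 1 = melanoma)
print(majority_vote([[0, 1, 1, 0],
                     [0, 1, 0, 0],
                     [1, 1, 1, 0]]))   # -> [0 1 1 0]
```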
-
Item: Codebook learning: Challenges and applications in image representation learning (Graduate School, 2024-12-27) Can Baykal, Gülçin ; Ünal, Gözde ; 504202505 ; Computer Engineering
The rapid advancement of Machine Learning (ML) and Artificial Intelligence (AI) has paved the way for novel approaches in image representation learning for Computer Vision (CV), particularly through the utilization of codebook learning techniques. A codebook consists of representative vectors, also known as codewords, embeddings, or prototypes depending on the context, that capture the essential features of the data. Codebook learning involves training these discrete representations within models, allowing the mapping of continuous data into a set of quantized or discrete vectors. This thesis studies codebook learning in two different contexts: the exploration of its challenges and the exploitation of the learned codebook in various tasks, including image generation and disentanglement. By examining three key studies, this thesis aims to provide a comprehensive understanding of how the challenges of codebook learning can be mitigated and how the learned codebook can be leveraged to enhance various image representation learning tasks. Codebook learning is beneficial in various applications, including image generation and classification tasks. It can be integrated into models like discrete Variational Autoencoders (VAEs), where it allows for efficient encoding and decoding of information, thereby improving performance in generative tasks. Additionally, in prototype-based classification, codebooks consist of prototypes that characterize distinct classes within a dataset, enabling more accurate predictions. The versatility of codebook learning across different frameworks underscores its significance in advancing techniques for representation learning. The studies in this thesis perform codebook learning within different frameworks and focus both on the challenges of codebook learning and on incorporating the learned codebook to solve significant problems in different image representation learning tasks. The first study addresses the challenge of codebook collapse, where codebook learning is performed within a discrete VAE framework. This phenomenon occurs when the learned codebook fails to capture the diversity of the input data: multiple inputs get mapped to a limited number of codewords, leading to redundancy and a loss of representational power. This issue particularly arises in models such as Vector Quantized Variational Autoencoders (VQ-VAEs) and discrete VAEs, which rely on discrete representations for effective learning. The proposed solution involves hierarchical Bayesian modeling to mitigate the codebook collapse. This work contributes significantly to the field by providing empirical evidence and theoretical insights into the root cause of codebook collapse and by overcoming this collapse, thereby enhancing the representational power of discrete VAEs. After the first study, which focuses on exploring the challenges of codebook learning within a VAE framework, the second and third works focus on problems of various image representation learning tasks where codebook learning can be exploited. In the second study, the focus shifts to the computational time problem of deep generative models, especially diffusion models.
Diffusion models require relatively long times to converge, and our hypothesis is that incorporating informative signals about the data during the training of the diffusion model might reduce the convergence time. However, the critical thing to manage is obtaining these informative signals in a negligibly short time, so that reducing the training time of the diffusion model also reduces the overall computational time. To learn such informative signals, we perform codebook learning within the framework of training a classifier, and the learned codebook consists of prototypes that represent the classes in the data. The second study in this thesis shows that using the class prototypes that are learned in a short time as informative signals during the training of the diffusion model leads to better generative performance in the early stages of training and eliminates the need for longer training. The motivation of the third study is to address another important representation learning problem, disentanglement, a key aspect of understanding and representing complex data structures. Disentanglement refers to the ability to separate and manipulate the underlying factors of variation in the data, which is crucial for tasks such as attribute manipulation and controlled generation. On the grounds of the categorical nature of the underlying generative factors, our hypothesis is that using discrete representations, which are well suited to categorical data, might aid disentanglement in the image representation. Therefore, we build a novel framework to learn a codebook within the framework of discrete VAEs and propose an original optimization-based regularization to further assist disentanglement. The findings of this study demonstrate that using discrete representations and optimization-based regularizers leads to significant improvements in terms of disentanglement. This research emphasizes the synergy between codebook learning and disentanglement, advocating for further exploration of their combined potential in advancing image representation learning. The exploration of these three studies reveals the critical challenges and advantages associated with codebook learning. The first study lays the groundwork by addressing the fundamental issue of codebook collapse, while the subsequent studies demonstrate the applicability of codebook learning in diverse contexts such as image generation and disentanglement. Together, these works illustrate that a robust understanding of codebook learning can lead to significant advancements in image generation and disentanglement. In summary, this thesis contributes to the growing literature on codebook learning by providing a detailed overview that includes its challenges and applications. The findings highlight the importance of addressing inherent challenges while leveraging the benefits of codebook learning for practical applications. Insights gained from this research aim not only to enhance the performance of existing models but also to inspire future innovations in image representation learning.
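For readers unfamiliar with the quantization step that makes codebook collapse possible, here is a minimal NumPy sketch of VQ-VAE-style nearest-codeword assignment. It is illustrative only and does not implement the hierarchical Bayesian remedy proposed in the thesis; collapse shows up when only a handful of codeword indices are ever selected.

```python
import numpy as np

def quantize(z, codebook):
    """Map each continuous encoder output z_i to its nearest codeword
    (Euclidean distance). Returns the chosen indices and the quantized vectors."""
    # z: (n, d) encoder outputs, codebook: (K, d) learned codewords
    d2 = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (n, K) squared distances
    idx = d2.argmin(axis=1)
    return idx, codebook[idx]

rng = np.random.default_rng(0)
idx, zq = quantize(rng.normal(size=(8, 16)), rng.normal(size=(64, 16)))
print(np.unique(idx).size, "of 64 codewords used")  # few distinct indices would signal collapse
```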
-
Item: Compression of geometry videos by 3D-SPECK wavelet coder (Lisansüstü Eğitim Enstitüsü, 2021) Bahçe Gülbak, Canan ; Bayazıt, Uluğ ; 723134 ; Computer Engineering
A geometry image represents a manifold surface in 3D space as a 2D array of 3D points. Producing one involves three steps: first, cutting the manifold, which essentially defines the boundary of the square; second, defining the parametrization, which defines the interior of the square; and lastly, rasterizing and scan-converting the geometry and applying compression to it. By representing manifold 3D objects using a global 2D parametrization (mapping), it is possible to use existing video techniques to represent 3D animations. The 2D-SPECK coder, introduced by Islam and Pearlman, codes sets of DWT coefficients grouped within subbands. The SPECK coder differs from other schemes in that it does not use trees that span, and exploit the similarity across, different subbands; instead, it makes use of sets in the form of blocks. The main idea is to exploit the clustering of energy in frequency and space in the hierarchical structures of wavelet-transformed images. The 3D-SPECK coder is an extension of the 2D-SPECK algorithm for compressing 3D data with high coding efficiency. A geometry video is formed as a sequence of geometry images, where each frame is a remeshed form of a frame of an animated mesh sequence. To efficiently code geometry videos by exploiting temporal as well as spatial correlation at multiple scales, this thesis proposes the use of the 3D-SPECK algorithm, which has been successfully applied to the coding of volumetric medical image data and hyperspectral image data in the past. The thesis also puts forward several postprocessing operations on the reconstructed surfaces that compensate for the visual artifacts appearing in the form of undulations due to the loss of high-frequency wavelet coefficients, cracks near geometry image boundaries due to vertex coordinate quantization errors, and serrations due to regular or quad-splitting triangulation of local regions of large anisotropic geometric stretch. Experimental results on several animated mesh sequences demonstrate the superiority of the subjective and objective coding performance of the newly proposed approach over those of the commonly recognized animated mesh sequence coding approaches at low and medium coding rates.
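To give a sense of the transform stage that precedes 3D-SPECK set partitioning, here is a minimal PyWavelets sketch of a multi-level 3D DWT over a geometry-video array. The array shape, wavelet choice, and decomposition level are illustrative assumptions, and the SPECK set partitioning and bit-plane coding themselves are not shown.

```python
import numpy as np
import pywt  # PyWavelets

# A geometry video as a (frames, height, width, xyz) array; hypothetical shape.
video = np.random.rand(32, 64, 64, 3).astype(np.float32)

# Multi-level 3D DWT applied per coordinate channel over (time, row, column).
# The resulting subband coefficients are what a 3D-SPECK-style coder would then
# encode by set partitioning and bit-plane coding (not implemented here).
coeffs = [pywt.wavedecn(video[..., c], wavelet="bior4.4", level=2, axes=(0, 1, 2))
          for c in range(3)]
approx = coeffs[0][0]           # coarsest approximation subband of the X channel
print(approx.shape)
```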
-
Item: DA4HI: A deep learning framework for facial emotion recognition in affective systems for children with hearing impairments (Graduate School, 2023-11-23) Gürpınar, Cemal ; Köse, Hatice ; Arıca, Nafiz ; 504122504 ; Computer Engineering
The study of facial emotions has important implications for fields such as psychology, neuroscience, and computer science, including the recognition and interpretation of facial expressions to improve communication between humans and computers and to help people with particular needs, such as the elderly, children with ASD (Autism Spectrum Disorder), or children with hearing impairment. Facial expressions allow emotions to be easily observed by human beings. The term "action unit" (AU) refers to the smallest facial movements that can be visually distinguished. Facial Expression Recognition (FER) pertains to the identification of general emotional states such as happiness, sadness, and anger. The identification of the distinct muscle movements related to various facial expressions can be achieved by using AU detection, allowing for a more comprehensive examination of facial expressions. FER and AU detection are thus two interrelated yet separate problems in the field of emotion recognition: FER identifies the overall expression, while AU detection examines the specific AUs that contribute to it. Because children with special needs and children with hearing impairments express their emotions differently, it can be more challenging to comprehend the emotional states of these children than those of adults. However, psychological studies, numerous child-machine interaction applications, and social robots for children all use children's emotions as one of the most crucial cues for evaluating their cognitive development. Many FER and AU detection systems use models pre-trained on adult data, but few are trained on child data, even though children's facial morphology differs from that of adults. One of the main reasons for this is that datasets for children are scarce, since children are a sensitive group and accessing their data requires additional procedures. As a result, models that are trained only on images of children may not be sufficiently robust and general. The motivation of this thesis study is to develop and implement a model for recognizing facial emotions in children with hearing impairment, to be utilized in real time on Pepper, a socially assistive humanoid robot platform, for the "RoboRehab: Assistive Audiology Rehabilitation Robot" project (TUBITAK 118E214). The spontaneous facial data of children who have hearing impairments was gathered in a study involving interaction with a Pepper humanoid robot and a tablet-based game. In both experimental conditions, the responses of the children were captured via a video recording device positioned in their direct line of sight. The frames from the videos that showed notable levels of emotional intensity were chosen for analysis. The resulting images, which were subsequently labeled by annotators, were categorized as neutral, negative, and positive. Also, 18 distinct action units were detected in the aforementioned frames.
One of the research questions to be answered in this thesis is whether the process of recognizing emotions is better served by directly detecting facial expressions in an image or by first detecting the facial action units within the image and subsequently utilizing them to recognize emotions. In order to conduct the experiment on recognizing emotions directly from the images, pre-trained Convolutional Neural Network (CNN) models were fine-tuned by transfer learning to improve the recognition performance on hearing-impaired children's facial expressions. For this purpose, since human faces have different morphological structures according to age group, the contribution of transfer learning from typical adults and typically developing children to hearing-impaired children was explored. The AffectNet dataset was used for adults and the CAFE dataset for typical children. The CAFE dataset was classified into three emotion categories, namely positive, negative, and neutral, in order to align with the emotional categories present in the dataset of hearing-impaired children. The present study analyzed the impact of training with both the basic 8 emotions (namely, angry, disgust, contempt, fear, happy, neutral, sad, and surprise) and 3 emotions (positive, negative, and neutral) on the performance of a model trained on the AffectNet dataset. This was done in light of the fact that the AffectNet dataset contains a larger number of images than the CAFE and hearing-impaired children's datasets. As a result of these experiments, it was found that fine-tuning a model trained on adult datasets containing the eight basic emotions contributed positively to the facial expression recognition performance for hearing-impaired children. For the emotion recognition experiment through AU detection, due to the limited availability of data on hearing-impaired children, a model was proposed that employs a contrastive learning-based domain adaptation method using a Siamese network to enhance the performance of facial AU detection. Another research question that this thesis set out to answer is how effective domain adaptation from adults, who have different facial morphology but abundant data, is compared to domain adaptation from children, who have similar facial morphology but much less data. In Siamese networks, it is important to identify positive and negative pairs. The distinction between positive and negative pairs is unambiguous when mutually exclusive labels are employed, but it becomes increasingly uncertain when non-mutually exclusive labels are assigned to each image. In order to benefit from current methodologies that rely on contrastive loss, it is necessary to impose limiting assumptions. The straightforward method entails considering a pair of images as positive if they share identical labels and negative otherwise. For AU-labeled data, the same AUs must be detected in both images for an image pair to be positive. This is a very restrictive approach, especially since children's data is difficult to access and therefore scarce, and facial AU detection is a challenging problem; it further reduces the small dataset. A less strict strategy involves assuming that each image pair is positive if the images share at least one label. This method is also far from ideal.
Instead, given that the detection of facial AUs constitutes a multi-label classification task, the incorporation of a novel smoothing parameter, denoted as $\beta$, which serves to modulate the impact of comparable samples on the loss function of the contrastive learning approach, was proposed. The findings indicate that the incorporation of children's data (Child Affective Facial Expressions, CAFE) in domain adaptation produces superior performance compared to the use of adults' data (Denver Intensity of Spontaneous Facial Action, DISFA). Furthermore, the adoption of the smoothing parameter $\beta$ results in a noteworthy enhancement of recognition success. In relation to the aforementioned inquiry, an additional question was explored within the thesis: since the datasets containing the facial expressions of children, especially hearing-impaired children, are limited and difficult to access, which approach would yield superior results, transfer learning or domain adaptation? In order to address this inquiry, we used the Hearing-Impaired Children (HIC) dataset, which has been meticulously annotated by several experts with both AU and emotion labels. Within the domain adaptation framework, the Siamese network model was trained using data derived from typically developing children, namely the CAFE dataset, with the objective of identifying the Action Units (AUs) shown by children in the HIC dataset. Next, the top portion of the Siamese network model was used as the AU classifier. An artificial neural network (ANN) model was built on top of the AU classifier in order to identify the emotional states of the children in the HIC dataset. The transfer learning approach, on the other hand, used the EfficientNet-B0 model, which was first trained on the ImageNet and CAFE datasets and then fine-tuned on the HIC dataset. In this work, we use the HIC dataset, which consists of positive, negative, and neutral emotions, to train the deep learning models. It was observed that the models trained using transfer learning and domain adaptation techniques had comparable outcomes. The experts who annotated the HIC dataset in terms of emotion assigned to the neutral class those emotions that cannot be categorized as either positive or negative. For instance, the feeling of surprise transcends the categories of negative and positive emotions and is consequently included in the neutral class. However, in the domain adaptation concept, where AUs are first detected and emotions are then classified, the neutral emotion was assumed not to exhibit any Action Units (AUs). In the RoboRehab project, we examined the ability to recognize positive and negative emotions in hearing-impaired children. To do this, we investigated the performance of detecting just positive and negative emotions using both the transfer learning and domain adaptation concepts. When the investigation was carried out on the HIC dataset restricted to positive and negative emotions, it was observed that the domain adaptation strategy yielded much better outcomes than transfer learning.
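The exact form of the $\beta$-smoothed contrastive loss is not given in this abstract; the sketch below shows one plausible formulation of the stated idea, weighting each pair's pull and push terms by its AU label overlap raised to the power $\beta$. The function names, the Jaccard-based similarity, and all values are assumptions for illustration, not the thesis's loss.

```python
import torch
import torch.nn.functional as F

def soft_contrastive_loss(z1, z2, y1, y2, beta=0.5, margin=1.0):
    """Contrastive loss for multi-label (AU) pairs. Instead of a hard positive/negative
    decision, the label overlap (Jaccard similarity) of each pair, smoothed by beta,
    weights how strongly the pair is pulled together or pushed apart."""
    d = F.pairwise_distance(z1, z2)                        # embedding distances
    inter = (y1 * y2).sum(dim=1).float()
    union = ((y1 + y2) > 0).sum(dim=1).clamp(min=1).float()
    sim = inter / union                                    # 0 = disjoint AUs, 1 = identical AUs
    w = sim ** beta                                        # beta < 1 boosts partially matching pairs
    pull = w * d.pow(2)
    push = (1 - w) * F.relu(margin - d).pow(2)
    return (pull + push).mean()

# Toy usage: 4 pairs, 8-dim embeddings, 18 binary AU labels per image
z1, z2 = torch.randn(4, 8), torch.randn(4, 8)
y1, y2 = torch.randint(0, 2, (4, 18)), torch.randint(0, 2, (4, 18))
print(soft_contrastive_loss(z1, z2, y1, y2, beta=0.5))
```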
-
Item: Deep learning-based techniques for 3D point cloud analysis (Graduate School, 2023-10-10) Şahin, Yusuf Hüseyin ; Ünal, Gözde ; 504172510 ; Computer Engineering
This thesis presents two innovative works in the field of point cloud processing: ODFNet and ALReg. The ODFNet work proposes a new method for the classification and segmentation of point clouds, while ALReg aims to facilitate the training of neural networks for point cloud registration using active learning. Convolutional neural networks (CNNs) have been widely used for visual tasks such as object categorization, object detection, and semantic segmentation. However, the application of CNNs to point clouds is a relatively recent development. Notably, there has been no previous approach that specifically utilizes point density information. To enhance the representational power of local features, we propose leveraging the distribution of point orientations within a neighborhood relative to a reference point. This led us to introduce the concept of point Orientation Distribution Functions (ODFs). To compute the ODFs, our approach divides the spherical region around each point into a set of cones aligned with predefined orientations. Within each cone, we calculate the density of points, resulting in the ODF for that particular point. These ODF patterns provide valuable information about the spatial structure and orientation characteristics of the objects within the point cloud. The ODFs allow us to summarize the local neighborhood structure of the point cloud in a concise manner, which is beneficial for our point cloud analysis network model design. By incorporating the ODFs, we can effectively capture and utilize the significant local geometric information present in the point cloud, leading to improved performance and accuracy in our analysis tasks. Additionally, we introduce ODFNet, a neural network model specifically designed for point cloud analysis tasks. ODFNet utilizes the ODFBlock and ODFs to effectively incorporate the directional information captured by the ODFs. This integration enhances the performance of various tasks, such as classification, segmentation, and other related applications, by leveraging the rich geometric information provided by the ODFs. The experimental results confirm that ODFNet achieves state-of-the-art performance in both classification and segmentation tasks. Point cloud registration, which involves aligning multiple point clouds by calculating the rigid transformation between them, is a crucial task in computer vision. While there have been significant advancements in point cloud registration using various methods, most of them rely on training the network with the entire shape. An alternative approach that has not been explored in the literature is active learning, which could also be employed to address this problem. Thus, in this thesis, we also introduce ALReg, an active learning approach aimed at improving the efficiency of point cloud registration by selectively utilizing informative regions to reduce training time. To achieve this, we modify the baseline registration networks by incorporating Monte Carlo DropOut (MCDO) to efficiently calculate uncertainty.
Our main objective is to demonstrate that similar accuracy scores can be achieved by using fewer point clouds or parts of point clouds in the training phase of any point cloud registration network, provided that the selection of these training samples/parts is done effectively. By leveraging ALReg, our goal is to incorporate the most effective point cloud parts into the training procedure, thereby improving the efficiency and effectiveness of the registration network.
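The cone-based ODF computation described above can be sketched as follows; the cone axes, the neighbourhood size k, and the cone opening angle are illustrative placeholders rather than the settings used in ODFNet.

```python
import numpy as np

def orientation_distribution(points, k=16, cone_dirs=None, cos_half_angle=0.8):
    # Per-point orientation descriptor in the spirit of ODFs: the k nearest
    # neighbours of each point are assigned to the predefined cone they fall
    # inside, and the normalised count per cone is returned.
    points = np.asarray(points, dtype=float)
    if cone_dirs is None:
        cone_dirs = np.array([[1, 0, 0], [-1, 0, 0], [0, 1, 0],
                              [0, -1, 0], [0, 0, 1], [0, 0, -1]], dtype=float)
    odfs = np.zeros((len(points), len(cone_dirs)))
    for i, p in enumerate(points):
        diff = points - p
        dist = np.linalg.norm(diff, axis=1)
        idx = np.argsort(dist)[1:k + 1]                        # nearest neighbours, self excluded
        dirs = diff[idx] / np.maximum(dist[idx, None], 1e-12)  # unit directions to neighbours
        cos = dirs @ cone_dirs.T                               # alignment with each cone axis
        odfs[i] = (cos >= cos_half_angle).sum(axis=0)          # neighbours inside each cone
    return odfs / np.maximum(odfs.sum(axis=1, keepdims=True), 1)
```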
-
ÖgeDeep wavelet neural network for spatio-temporal data fusion(Graduate School, 2022-07-19) Kulaglic, Ajla ; Üstündağ, Burak Berk ; 504112516 ; Computer EngineeringMachine Learning (ML) algorithms have recently gained prominence in prediction problems. The construction of an accurate machine learning model becomes a real challenge concerning the nature of the data, the number of data samples, and the accuracy and complexity of the model. This study introduces a new machine learning structure for temporal and spatio-temporal, univariate, and multivariate prediction problems. The predictive error compensated neural network model (PECNET), which combines spatio-temporal data, has been developed. Temporal data contain information within the observation time window, and their bandwidth is limited by the sampling rate. Spatial data, on the other hand, provide information regarding spatial location, while spatio-temporal data combine temporal and spatial resolution. The PECNET model can capture both time dependencies and the spatial relationships between different data sources by fusing multivariate input patterns at multiple window lengths and sampling resolutions. PECNET achieves reliable prediction performance with relatively low model complexity and mitigates overfitting and underfitting. In the proposed model, additional networks are used to predict the error of previously trained networks to compensate for the overall prediction error. The main network uses data highly correlated with the target through moving frames at multiple scales. PECNET improves time series prediction accuracy by enhancing orthogonal features within a data fusion scheme. The same structure and hyperparameter sets are applied to quite different problems to verify the proposed model's robustness and accuracy. Root-zone soil moisture, wind speed, financial time series data, and stationary and non-stationary time series benchmark problems are selected to evaluate the PECNET model. The results show improvements in prediction accuracy and in overfitting prevention when multiple neural networks are used for distinct types of problems. The first part of this dissertation focuses on designing and implementing the proposed PECNET model. The algorithm is implemented in the Python programming language, and its performance is evaluated on stochastic and chaotic time series benchmark problems found in the literature. The results highlight several aspects of the PECNET implementation. The major contributions of the proposed method lie in improving prediction accuracy for distinct types of time series data (chaotic and stochastic) using multiple neural networks, where the secondary network is trained on the time-shifted prediction error of the primary network. Overfitting is avoided due to an increase in recurrence-related feedback. The same structure and hyperparameter sets are applied to a wide range of time series prediction problems with moving frames at multiple scales. Preprocessing the input data with the discrete wavelet transform (DWT) yields a larger accuracy improvement than applying the time series data directly to the neural network in predictive error compensation. PECNET for the stock price prediction problem is introduced in the second part of the dissertation. The selected data represent non-stationary time series. 
Because traditional normalization techniques struggle with non-stationary time series data, an average normalization method is proposed. The average value of the current input to the neural networks is computed and subtracted from the corresponding input data. The proposed normalization method is able to represent different volatilities while preserving the original properties within each input sequence. Different frequencies of the stock price time series are used together in one neural network, while an additional network uses the previous residual errors as inputs. An updated learning method is applied in this part, enhancing the overall prediction performance. In the third part, the improved PECNET model enables choosing orthogonal features in data fusion applications. Different data types can be fused into one single model by extracting valuable knowledge from multivariate time series data. This extraction is done by checking the correlation between the remaining features and the residual error: PECNET chooses the data that correlate most strongly with the residual error of the previously trained network. It is well known that irrelevant features cause overfitting in forecasting models, which is a critical issue considering the number of samples and the number of available features. Therefore, reducing the feature set to the essential features lowers the computational cost of the learning process and improves accuracy by minimizing overfitting. In the fourth chapter, the root-zone soil moisture problem is introduced. For this purpose, in-situ agrometeorological measurements and satellite remote sensing indices are used, and the distances between the central point and the known stations are calculated. Root-zone soil moisture is estimated using only accumulated ground-based measurements as input, using only remote sensing indices, and using a combination of both. Applying PECNET to the spatio-temporal root-zone soil moisture estimation problem shows promising results, which can be used to obtain a soil moisture map of neighboring points where sensor information is unavailable. The fifth part examines the decomposition of input time series data into frequency bands and the applicability of different filtering methods. For this purpose, the Butterworth filter is implemented and used as an additional filtering method. Besides the closing stock price, Far-Eastern stock market indices are included as input data to provide the spatial dimension for the financial time series forecasting example. The overall results show that fusing spatial and temporal data in a separately trained cascaded PECNET model can achieve promising results without causing overfitting or degrading model performance. The proposed wavelet-preprocessed PECNET also leaves room for further improvement through various preprocessing techniques as well as different types of neural networks.
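The error-compensation idea at the core of PECNET, training an additional network on the residual error of a previously trained one and summing the outputs, can be illustrated with the following minimal sketch; the regressor class, layer sizes, and feature split are placeholders, not the architecture used in the dissertation.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def fit_error_compensated(X_main, X_aux, y):
    # Primary network maps the main features to the target; a second network is
    # then trained on auxiliary features to predict the residual error of the
    # first, and the two outputs are summed at inference time.
    primary = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
    primary.fit(X_main, y)
    residual = y - primary.predict(X_main)              # error left by the primary network

    compensator = MLPRegressor(hidden_layer_sizes=(32,), max_iter=2000, random_state=0)
    compensator.fit(X_aux, residual)                    # learn to predict that error

    def predict(Xm, Xa):
        return primary.predict(Xm) + compensator.predict(Xa)
    return predict
```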
-
ÖgeDetection of common IoT authentication attacks and design of a lightweight authentication and key management protocol(Graduate School, 2023-12-18) Çetintav, Işıl ; Sandıkkaya, Mehmet Tahir ; 504182509 ; Computer EngineeringThe Internet of Things (IoT) has grown rapidly over the past several years. IoT establishes connections between devices and the Internet; a "thing" can be virtually any kind of object, and things are smart in that they can connect to the Internet and make decisions automatically. IoT devices are widely used, and there are numerous IoT devices worldwide. As devices are deployed in diverse settings, such as daily life, smart homes, smart cars, and smart agriculture, they offer various benefits. However, while IoT devices are helpful, their spread raises several concerns. Management is crucial for IoT devices, and such a large number of devices cannot be managed easily. Beyond management, IoT devices are vulnerable due to their characteristics. One of these characteristics is that IoT devices are resource-constrained: they generally have limited CPU, memory, and storage. Since implementing comprehensive security mechanisms is expensive, users tend not to use them. Additionally, authentication and key exchange protocols are generally deficient. All of these issues make IoT devices vulnerable. A vulnerable device can easily be captured by attackers, and so can a group of devices. Botnets (robot networks) pose a threat to IoT devices by infecting and capturing them, and can thus mount large-scale attacks on information systems. Since IoT devices face numerous attacks, they need robust security mechanisms. The IoT devices considered in this thesis are resource-constrained, have weak security mechanisms, maintain continuous Internet connectivity, and perform specific functions. Although these devices may hold personal data, it is assumed that a breach of this data does not constitute a serious problem; a weather-sensor data breach, for instance, is not a significant concern for users. The devices in question transmit small amounts of data such as temperature, humidity, and commands. This thesis begins with a test environment set up to monitor attacks and attackers and to understand attacker behavior and characteristics. The test environment is installed on the WalT platform, a reproducible and reusable computer management environment. On this platform, a honeypot mechanism is installed for monitoring, with the aim of providing comprehensive support for proposing a suitable and effective security mechanism. The honeypot reveals the attackers who send malicious requests, the attack types, and how easily the attacks can be carried out. Analysis of the honeypot data makes it evident that weak authentication introduces vulnerabilities and leads to authentication attacks. Against the detected attacks, a lightweight One-Time Password (OTP) authentication and key exchange protocol is proposed that can be used over both long and close range, is easy for users to use, and is computationally low-cost. The proposed authentication protocol includes a key exchange and presents a hierarchic management model. The hierarchic model provides easy management, cost-effective key exchange, and independence between devices. The proposed protocol has another crucial feature: all session data and session keys (ephemeral keys) are updated in every session, and every session is independent of the others. 
All session data and ephemeral keys are computed using only primitive cryptographic functions such as the XOR operation and hash functions; thus, the protocol is a cost-effective and lightweight protocol. The protocol begins with the registration of servers and devices. Firstly, all servers are registered to their upper-level server. The registered servers then start the authentication phase to register devices, so device registration is completed with authenticated servers. Communication can be in two directions, device-to-server and server-to-device, and the protocol can be initiated by either participant. Devices and servers verify several values during authentication, generate ephemeral keys at the end of it, and encrypt messages with these ephemeral keys. The protocol is formally verified with the AVISPA model checker. This thesis also presents an informal security analysis of the protocol, which shows that it is robust against attacks such as replay, theft, and DoS. A performance analysis of the proposed protocol is also presented: devices perform only one XOR operation and eleven hash computations. Looking at studies in the literature, every authentication protocol has specific features, requirements, and goals. The proposed protocol has the following features: it is a lightweight and cost-effective authentication, key exchange, and message transfer protocol; it consumes little power thanks to its primitive computations; it requires no extra hardware (e.g., smart card or RFID tag); devices can be authenticated both remotely and nearby; and users can communicate with a single device or a group of devices. In this thesis, the stated goals are achieved: attackers are monitored with a honeypot, the security issues of IoT devices are revealed, and a lightweight authentication and key exchange protocol with a well-suited management model is proposed.
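As an illustration of how per-session values can be refreshed using only XOR and hashing, the sketch below derives a new session identifier and ephemeral key from a shared secret and fresh randomness; it is not the thesis protocol itself, and the function names and message fields are assumptions.

```python
import hashlib
import secrets

def h(*parts: bytes) -> bytes:
    # SHA-256 over the concatenation of the given byte strings
    return hashlib.sha256(b"".join(parts)).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def derive_session(shared_secret: bytes, prev_session_id: bytes):
    # Illustrative per-session update built only from XOR and hash primitives,
    # so that every session uses fresh values independent of the previous one.
    nonce = secrets.token_bytes(32)                       # fresh randomness per session
    session_id = h(prev_session_id, nonce)                # new session identifier
    ephemeral_key = xor(h(shared_secret, session_id), h(nonce))
    return session_id, ephemeral_key, nonce
```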
-
ÖgeDeveloping a novel artificial intelligence based method for diagnosing chronic obstructive pulmonary disease(Graduate School, 2023-11-06) Moran, İnanç ; Altılar, Deniz Turgay ; 504072504 ; Computer EngineeringToday, research on machine learning and deep learning continues intensively, owing to their success in data classification and practical applications and their capacity to accurately reveal the information contained in data. Since the beginning of the 21st century, deep learning in particular has produced very successful results, leaving traditional learning models behind and revolutionizing the state of the art. In this context, the detection of a fatal and global disease using deep learning has been researched in this thesis. The motivation of this research is to introduce the first study on automated Chronic Obstructive Pulmonary Disease (COPD) diagnosis using deep learning, together with the first annotated dataset in this field. The primary objective and contribution of this research is the development and design of an artificial intelligence system capable of diagnosing COPD utilizing only the heart signal (electrocardiogram, ECG) of the patient. In contrast to the traditional way of diagnosing COPD, which requires spirometer tests and a laborious workup in a hospital setting, the proposed system uses the classification capabilities of deep transfer learning and the patient's heart signal, which carries COPD signs in itself and can be obtained from any modern smart device. Since the disease progresses slowly and conceals itself until the final stage, hospital visits for diagnosis are uncommon. Hence, the medical goal of this research is to detect COPD using a simple heart signal before it becomes incurable. Deep transfer learning frameworks, previously trained on a general image dataset, are transferred to carry out automatic diagnosis of COPD by classifying image equivalents of patients' electrocardiogram signals, produced by signal-to-image transform techniques. Xception, VGG-19, InceptionResNetV2, DenseNet-121, and "trained-from-scratch" convolutional neural network architectures have been investigated for the detection of COPD, and it is demonstrated that they are able to obtain high performance rates in classifying nearly 33,000 instances using diverse training strategies. The highest classification rate was obtained by the Xception model at 99%. Although machine learning and deep learning generate accurate results, these techniques have long been criticized as "black boxes". Recently, explainability has become a crucial issue in deep learning: despite the exceptional performance of deep learning algorithms in various tasks, it is difficult to explain their inner workings and decision-making mechanisms in an understandable way. Explainable AI methods make it possible to anticipate the outcomes of an AI model and to comprehend its decision-making process. The LIME and SHAP methods, which are among the approaches that make the results of deep learning and machine learning models interpretable, have been used to interpret the classifications made in this thesis. This research shows that the newly introduced COPD detection approach is effective, easily applicable, and eliminates the burden of considerable effort in a hospital. It could also be put into practice and serve as a diagnostic aid for chest disease experts by providing a deeper and faster interpretation of ECG signals. 
Using the knowledge gained while identifying COPD from ECG signals may aid in the early diagnosis of future diseases for which little data is currently available.
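The transfer-learning setup, a CNN backbone pretrained on ImageNet with a small classification head trained on ECG-derived images, can be sketched roughly as below; the input size, dropout rate, and two-stage (freeze then fine-tune) strategy are assumptions, not the exact configuration reported in the thesis.

```python
import tensorflow as tf

def build_copd_classifier(input_shape=(299, 299, 3)):
    # Xception backbone pretrained on ImageNet, frozen for an initial
    # feature-extraction stage; a small binary head (COPD / non-COPD) is
    # trained on top of the pooled features of the ECG-derived images.
    base = tf.keras.applications.Xception(
        include_top=False, weights="imagenet",
        input_shape=input_shape, pooling="avg")
    base.trainable = False                      # fine-tuning could be enabled in a later stage
    x = tf.keras.layers.Dropout(0.3)(base.output)
    out = tf.keras.layers.Dense(1, activation="sigmoid")(x)
    model = tf.keras.Model(base.input, out)
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    return model
```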
-
ÖgeDeveloping morphology disambiguation and named entity recognition for amharic(Graduate School, 2024-11-01) Jibril, Ebrahim Chekol ; Tantuğ, Ahmet Cüneyd ; 504122515 ; Computer EngineeringMorphological disambiguation is defined as the process of selecting the correct morphological analysis for a given word within a specific context. Developing natural language processing (NLP) applications is very challenging without effective morphological disambiguation. Semitic languages, including Arabic, Amharic, and Hebrew, present increased challenges for NLP tasks due to their complex morphology. Named Entity Recognition (NER) plays a crucial role as a preliminary phase in various downstream tasks such as machine translation, information retrieval, and question answering. It is an essential component of information extraction, used to identify proper names and temporal and numeric values in open domain text. The NER task is particularly difficult for Semitic languages because of their highly inflected nature. In this research, new datasets for developing word embeddings and performing morphological disambiguation are collected, and a relatively large dataset is annotated and made publicly available. Multiple Amharic named entity recognition systems are constructed utilizing contemporary deep learning techniques, including transfer learning with RoBERTa—a transformer-based model. Additionally, Bidirectional Long Short-Term Memory (BiLSTM) models are employed and integrated with a conditional random fields layer to enhance performance. A BiLSTM model is also developed specifically for morphology disambiguation using a newly prepared dataset. The Synthetic Minority Over-sampling Technique (SMOTE) is utilized to address the imbalance in class distribution within the datasets. The study achieves state-of-the-art results for Amharic named entity recognition, attaining an F1-score of 93% with RoBERTa, and achieves an accuracy of 90% for morphology disambiguation. In Chapter 1, the dissertation establishes the context by introducing the Amharic language and its significance within Ethiopia and the broader Semitic language family. It outlines the primary challenges associated with processing Amharic texts due to its rich morphological structure and orthographic variations. Key research questions and objectives are formulated, with a focus on advancing NLP capabilities for Amharic through improved morphological disambiguation and NER. Chapter 2 provides a comprehensive overview of Amharic's linguistic properties, emphasizing its script and morphological complexity. The lack of capitalization and extensive character set are identified as challenges for NLP. This chapter provides the foundational understanding necessary for developing tools capable of handling Amharic's unique linguistic features. In Chapter 3, the focus is on developing effective models for Amharic morphology disambiguation. Related work from other Semitic languages is reviewed, and the construction of relevant datasets is detailed. The application of BiLSTM models to tackle Amharic's morphological properties is described, along with the experimental setup and evaluation metrics, which demonstrate significant improvements in accuracy. Chapter 4 addresses the challenges of performing NER in Amharic, a task complicated by the language's morphological and orthographic features. A new, extensively annotated Amharic NER dataset is introduced. 
The chapter evaluates various model architectures, including BiLSTM-CRF and RoBERTa, a transformer-based model, and discusses the resulting enhancements in model performance. In Chapter 5, the research findings are synthesized, emphasizing the dissertation's contributions to advancing Amharic NLP through the development of high-performing models and comprehensive datasets. Recommendations for future work include enhancements in tools for spelling correction and further expansion of NER datasets to improve the system's capabilities. Through this comprehensive approach, the dissertation significantly contributes to the field of computational linguistics for low-resource languages, offering novel insights and methodologies that can be adapted for other morphologically rich languages.
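A rough sketch of transformer-based token classification of the kind used for the NER experiments is given below; the checkpoint name, label set, and the omission of subword/label alignment are simplifying assumptions, and the model would first have to be fine-tuned on the annotated Amharic corpus before its predictions were meaningful.

```python
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification

def tag_tokens(words, model_name="xlm-roberta-base",
               labels=("O", "B-PER", "I-PER", "B-LOC", "I-LOC")):
    # Placeholder checkpoint and label set; the head is randomly initialised
    # here and must be fine-tuned on the annotated NER dataset first.
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForTokenClassification.from_pretrained(model_name, num_labels=len(labels))
    enc = tok(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        logits = model(**enc).logits                 # (1, seq_len, num_labels)
    preds = logits.argmax(-1).squeeze(0).tolist()
    # Note: special tokens and subword pieces are not realigned to words here.
    return [labels[i] for i in preds]
```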
-
ÖgeDinamik ortamlar için istatiksel metotlar kullanan çoklu evrimsel algoritmalar(Lisansüstü Eğitim Enstitüsü, 2022-09-19) Gazioğlu, Emrullah ; Uyar, Ayşe Şima ; 504152518 ; Bilgisayar MühendisligiCombinatorial optimization problems encountered in the real world are inherently dynamic. Since the optimum to be found in a dynamic environment changes over time, heuristic approaches can succeed only if they are well adapted to such environments. An environmental change may occur on either side of the optimization problem (in the constraints and/or in the objective function). The simplest way to handle a change is to restart the algorithm; however, the new optimum may not be far from the previous one, so restarting is not practical. Instead, the knowledge acquired so far can be exploited to adapt to the current environment. To realize this adaptation, several dynamic-environment criteria must be considered: (i) the frequency of change, (ii) the severity of change, (iii) cycle length/cycle accuracy, and (iv) the predictability of change. Both deterministic and heuristic methods have been used in the literature to address such problems; when these proved insufficient, metaheuristic algorithms were adopted. Genetic Algorithms (GAs) are very popular optimization algorithms that fall into the Evolutionary Algorithms subclass of metaheuristics and are inspired by the biological evolution of species in nature. Despite their considerable success in the literature, GAs lose genetic diversity in changing environments. The main reasons are (i) losing useful solutions and (ii) failing to exploit the relationships among the problem's variables. In this thesis, an implicit memory scheme based on a multi-chromosome structure is developed to address the first problem. For the second problem, a Bayesian Network is used to exploit the relationships among the problem's variables (also known as the genes of a chromosome). In real biology, epistasis refers to the interaction of genes on a chromosome: more precisely, the effect of one gene depends on the presence or absence of one or more other genes. To benefit from gene interactions, the Bayesian Optimization Algorithm, a well-known Estimation of Distribution Algorithm, is injected into the proposed algorithm alongside the multiple representation. Overall, this thesis proposes a GA-based, multi-chromosome algorithm that uses statistical methods to cope with dynamic optimization problems. First, multiple representation is added to the GA to obtain an implicit memory scheme: genetic operators are executed on the implicit memory, while fitness evaluations are carried out on the phenotypes of the candidate solutions. Furthermore, variants of the proposed algorithm are built using several immigrant schemes previously introduced in the literature, and their behavior under different parameter values is observed. Three problems are solved to test the proposed algorithm: Decomposable Unitation-Based Functions, the Dynamic Knapsack Problem, and the Multidimensional Knapsack Problem. Decomposable Unitation-Based Functions are benchmark problems frequently used in dynamic optimization because they cover various levels of complexity. In these functions, each candidate solution is divided into four-bit blocks; the fitness of each block is computed separately and the values are summed to obtain the candidate's overall fitness. The Knapsack Problem is a common problem format in computer science, in which the goal is to collect items of high value and low weight, maximizing the return while keeping the load to a minimum. A dynamic version of this problem is solved to observe the effects on real-world problems; many real-world problems in financial management and industry relate to it, such as cargo loading, production planning, capital budgeting, project selection, and portfolio management. The Multidimensional Knapsack Problem differs from the standard version in that it involves multiple resources, each with its own constraints, and it is harder to solve since there are as many constraints as resources instead of a single one. Two dynamic-environment methods are used to solve the above problems: the first is the XOR generator, and the second creates new datasets via a Normal distribution. In conclusion, this thesis proposes a GA that uses both a statistical method and an implicit memory scheme to solve dynamic optimization problems. Three different problems are solved to observe the behavior of the proposed method in dynamic environments, and its performance is then compared with a state-of-the-art method from the literature. The results show that the proposed method is highly effective in solving dynamic optimization problems.
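The XOR generator mentioned above is a standard way of making a stationary binary-coded problem dynamic; the sketch below shows the idea on a simple base function, with the severity, change frequency, and OneMax base problem chosen only for illustration.

```python
import random

def make_xor_dynamic(static_fitness, n_bits, severity=0.2, frequency=50):
    # XOR dynamic-environment generator (sketch): every `frequency` evaluations a
    # random mask with severity*n_bits additional flipped bits is applied, and
    # candidate solutions are XORed with it before the static fitness is used,
    # which moves the optimum without altering the landscape structure.
    state = {"mask": [0] * n_bits, "evals": 0}

    def dynamic_fitness(bits):
        if state["evals"] and state["evals"] % frequency == 0:
            for i in random.sample(range(n_bits), int(severity * n_bits)):
                state["mask"][i] ^= 1                  # environment change
        state["evals"] += 1
        shifted = [b ^ m for b, m in zip(bits, state["mask"])]
        return static_fitness(shifted)

    return dynamic_fitness

# Illustration with a OneMax base problem (placeholder, not the thesis benchmarks):
onemax = lambda bits: sum(bits)
fitness = make_xor_dynamic(onemax, n_bits=20)
```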
-
ÖgeDirectional regularization based variational models for image recovery(Graduate School, 2022-08-19) Türeyen Demircan, Ezgi ; Kamaşak, Mustafa E. ; 504152509 ; Computer EngineeringThis thesis explores how local directional cues can be utilized in image recovery. Our intent is to provide image regularization paradigms that encourage the underlying directionalities. To this end, in the first phase of the thesis work, we design direction-aware analysis-based regularization terms. We boost the structure tensor total variation (STV) functionals used in inverse imaging problems so that they encode directional priors. More specifically, we suggest redefining structure tensors to describe the distribution of the ``directional" first-order derivatives within a local neighborhood. With this decision, we bring direction-awareness to the STV penalty terms, which originally imposed local structural regularity. We enrich the nonlocal counterpart of the STV in the same way, which additionally imposed nonlocal image self-similarity. These two types of regularizers are used to model denoising and deblurring problems within a variational framework. Since they result in convex energy functionals, we also develop convex optimization algorithms by devising the proximal maps of our direction-aware penalty terms. With these contributions in place, the major barrier to making these regularizers applicable lies in the difficulty of estimating the directional parameters (i.e., the directions/orientations and the degree of anisotropy). Although it is possible to come across uni-directional images, real-world images usually exhibit no directional dominance. The underlying directions of uni-directional (or partially directional) images are easy to estimate precisely; however, arbitrary and unstable directions call for spatially varying directional parameters. In this regard, we propose two different parameter estimation procedures, each of which employs the eigendecompositions of the semi-local/nonlocal structure tensors. We also make use of total variation (TV) regularization in one of the proposed procedures and a filterbank of anisotropic Gaussian kernels (AGKs) in the other. As our image regularization frameworks require the guidance of the directional parameter maps, we use the term ``direction-guided" in naming our regularizers. Through quantitative and visual experiments, we demonstrate how beneficial the involvement of directional information is by validating the superiority of our regularizers over state-of-the-art analysis-based regularization schemes, including STV and nonlocal STV. In the second phase of the thesis, we shift our focus from model-driven to data-driven image restoration; more specifically, we deal with transfer learning. As the target field, we choose fluorescence microscopy imaging, where noise is a very common phenomenon but data-driven denoising is less applicable due to the lack of ground-truth images. In order to tackle this challenge, we suggest tailoring a dataset by handpicking images from unrelated source datasets. This selective procedure explores some low-level view-based features (i.e., color, isotropy/anisotropy, and directionality) of the candidate images, and their similarities to those of the fluorescence microscopy images. Based upon our experience with model-driven restoration techniques, we speculate that these low-level characteristics (especially directions) play an important role in image restoration. 
In order to encourage a deep learning model to exploit these characteristics, one could embed them into the training data. In fact, we establish the possibility of offering a good balance between content-awareness and universality of the model by transferring only low-level knowledge and letting the unrelated images bring additional knowledge. In addition to training a feed-forward denoising convolutional neural network (DnCNN) on our tailored dataset, we also suggest integrating a small amount of fluorescence data through the use of fine-tuning for better-recovered micrographs. We conduct extensive experiments considering both Gaussian and mixed Poisson-Gaussian denoising problems. On the one hand, the experiments show that our approach is able to curate a dataset, which is significantly superior to the arbitrarily chosen unrelated source datasets, and competitive against the real fluorescence images. On the other hand, the involvement of fine-tuning further boosts the performance by stimulating the content-awareness, at the expense of a limited amount of target-specific data that we assume is available.
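A standard way of obtaining such spatially varying directional parameters is to eigendecompose a smoothed structure tensor; the sketch below estimates a per-pixel orientation and a simple anisotropy measure in this spirit, although the thesis builds its estimators on directional derivatives, semi-local/nonlocal tensors, TV, and AGK filterbanks rather than this plain form.

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def local_orientations(img, rho=2.0):
    # Smooth the gradient outer products into the structure tensor
    # J = G_rho * (grad u grad u^T), then read off the dominant orientation
    # and a simple (l1 - l2) / (l1 + l2) anisotropy measure per pixel.
    ux = sobel(img, axis=1, output=float)
    uy = sobel(img, axis=0, output=float)
    jxx = gaussian_filter(ux * ux, rho)
    jxy = gaussian_filter(ux * uy, rho)
    jyy = gaussian_filter(uy * uy, rho)

    theta = 0.5 * np.arctan2(2 * jxy, jxx - jyy)       # dominant gradient orientation
    tr, det = jxx + jyy, jxx * jyy - jxy ** 2
    disc = np.sqrt(np.maximum(tr ** 2 / 4 - det, 0))
    l1, l2 = tr / 2 + disc, tr / 2 - disc              # eigenvalues, l1 >= l2
    anisotropy = (l1 - l2) / np.maximum(l1 + l2, 1e-12)
    return theta, anisotropy
```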
-
ÖgeEnergy efficient resource management in cloud datacenters(Graduate School, 2023-07-11) Çağlar, İlksen ; Altılar, Deniz Turgay ; 504102501 ; Computer EngineeringWe propose an energy-efficient resource allocation approach that integrates the Holt-Winters forecasting model to optimize energy consumption while considering performance in a cloud computing environment. The approach is based on an adaptive decision mechanism for turning machines on/off and detecting over-utilization. In this way, it avoids performance degradation and improves energy efficiency. The proposed model consists of three functional modules: a forecasting module, a workload placement module along with physical and virtual machines, and a monitoring module. The forecasting module determines the required number of processing units (Nr) according to user demand. It evaluates the number of submitted workloads (NoSW), their mean execution time in the interval, and their mean CPU requirements to calculate the approximate total processing requirement (APRtotal). These three values are forecast separately using two forecasting methodologies, namely Holt-Winters (HW) and Auto-Regressive Integrated Moving Average (ARIMA). Holt-Winters gives significantly better results in terms of Mean Absolute Percentage Error (MAPE), since the time series includes seasonality and trend; in addition, because the interval is short and the period to be forecast is long, ARIMA is not the right choice. The future demand for processing units is calculated using these data, so the forecasting module is based on the Holt-Winters methodology, with an error rate of 8.85. The workload placement module is responsible for allocating workloads to suitable VMs and allocating these VMs to suitable servers. Based on the information received from the forecasting module, decisions about turning a server on or off and about the placement of incoming workloads are made in this module. The monitoring module is responsible for observing the system status over 5-minute intervals. The consolidation algorithm is based on a single threshold for deciding whether a server is over-utilized. In other words, if the CPU utilization ratio exceeds the predefined threshold, the server is over-utilized; otherwise, the server is under-loaded. If the server's utilization equals the threshold, the server is running at the optimal utilization rate. Unlike in other studies, overload detection does not trigger VM migration. Overloading is undesirable since it causes performance degradation, but it can be acceptable under some conditions. The threshold is neither the only nor a sufficient parameter for deciding the allocation of incoming workloads; besides the threshold, future demand is considered as important as the system's current state. The proposed algorithm also uses other parameters, such as the remaining execution time of a workload, the number of active servers (Na), and the required number of servers (Nr), besides the efficient-utilization threshold. The system can become unstable in two cases: (1) Na is greater than Nr, which means there are under-utilized servers and causes energy inefficiency; (2) Nr is greater than Na, in which case, if new servers are not switched on, servers become over-utilized and performance degrades. The algorithm is implemented and evaluated in CloudSim, which is commonly preferred in the literature since it provides a fair comparison between the proposed algorithm and previous approaches and is easy to adapt and implement. 
However, in the benchmark, workloads arrive at the system in a static manner, whereas workload usage rates vary over time and our algorithm supports dynamic submission. Therefore, to make a fair comparison, the benchmark code is modified to meet this dynamic requirement by using the Google Cluster Data via MongoDB integration. The forecasting module is based on Holt-Winters as described above; therefore, the approach is named Look-ahead Energy Efficient Allocation - Holt Winters (LAA-HW). If the actual values were known instead of the forecast values, the system would give the result of Look-ahead Energy Efficient Allocation - Optimal (LAA-O). The proposed model uses the Na and Nr parameters to determine the system's trend, i.e., whether the system has more active servers than required. If Na is greater than Nr, incoming workloads are allocated to already-active servers. This causes a bottleneck for workloads with short execution times and low CPU requirements, such as the Google trace-log workloads: the daily mean CPU requirement and mean execution time are 3% and 1.13 minutes, respectively. This yields a small Nr value and results in fewer received workloads than Local Regression-Minimum Migration Time (LRMMT). The number of migrations is zero in our approach, and the energy consumed by switching servers on and off in our model is lower than that of the migration-based model.
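The forecasting step can be sketched with an off-the-shelf Holt-Winters implementation as below; the seasonal period, forecast horizon, and per-server capacity constant are placeholders rather than the values used in the evaluation.

```python
import numpy as np
from statsmodels.tsa.holtwinters import ExponentialSmoothing

def forecast_required_servers(demand_history, season_length=288, horizon=12,
                              mean_capacity_per_server=1.0):
    # demand_history: past total processing requirement per interval (an
    # APRtotal-like series); at least two full seasons of data are needed.
    # Returns the predicted number of servers Nr for the next `horizon` intervals.
    model = ExponentialSmoothing(
        np.asarray(demand_history, dtype=float),
        trend="add", seasonal="add", seasonal_periods=season_length)
    fit = model.fit()
    demand_forecast = fit.forecast(horizon)
    return np.ceil(demand_forecast / mean_capacity_per_server).astype(int)
```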
-
ÖgeEtkin sorgu önerileri için kullanıcı sorgularının görev tabanlı yönetilmesi(Lisansüstü Eğitim Enstitüsü, 2024-08-19) Ateş, Nurullah ; Yaslan, Yusuf ; 504142517 ; Bilgisayar MühendisligiAccurately inferring the implicit intents of Internet users improves the online search experience and helps users complete their tasks more efficiently. Users issue a variety of queries to reach the information they need, producing time-ordered query logs; a large volume of search queries is recorded as users interact with search engines to satisfy their information needs. Proper analysis of query data enables user tasks to be predicted and better understood. Queries belonging to different search tasks may appear within the same session, and a single search task may also span multiple sessions. Search task extraction (STE) is the process of clustering queries that share the same intent and are scattered across the query log into distinct groups. Correctly identifying user intent improves the performance of search-guidance processes in search engines and e-commerce platforms, such as query suggestion and reformulation, personalized recommendation, and advertising. The effectiveness of STE, however, depends on overcoming its challenges: internal difficulties such as short or misspelled queries and missing keywords, and external difficulties such as an unknown number of clusters and limited labeled data. Within this thesis, three studies are carried out to improve query-driven web navigation and to offer solutions to the STE problem. The first, "supervised learning based query segment extraction", addresses the detection of query segments. Query segments sometimes appear as part of a search task and sometimes as an entire task, which makes their correct identification and grouping important. The Query Clustering with Head-Tail Components (QC-HTC) algorithm, one of the algorithms frequently used in search-task studies, detects search tasks by combining query segments and focuses on how suitable segments can be merged; for this reason, this query segment detection study was conducted before the STE work. The second study performs graph-based search task extraction with a Siamese Network (SN), using the Query Clustering with Weighted Connected Components (QC-WCC) and QC-HTC graph clustering algorithms. Since both algorithms require the similarity between two queries, a Siamese Network is used in this thesis to estimate that similarity. The ability of SNs to learn the similarity between two examples from little data makes them one of the most suitable methods for the search-task problem. To measure this similarity, the two inputs are typically processed by parallel branches with the same architecture and parameters at the input of the network. This makes the Siamese architecture more direct and effective at modeling the relations (similarity/dissimilarity) between two inputs, and because both inputs are processed with the same network structure and parameters, the SN learns with fewer parameters; as a result, especially in settings with scarce labels such as STE, the model generalizes better and achieves stronger results. In the last study of this thesis, search task extraction is performed using k-contour based recurrent deep graph clustering. QC-WCC and QC-HTC are the most frequently used clustering methods for STE; when forming search tasks (clusters), they use only pairwise query similarities above a certain threshold and consider no graph-topological property other than the similarity between two queries. Since the most widely used STE methods are graph based, this thesis seeks a graph-based solution and proposes a model that exploits the deep topological properties of the graph. The studies mentioned above are expanded in three parts below. Query segmentation is the first stage commonly performed when analyzing user queries and determines whether consecutive queries belong to the same subtask. Any deficiency in query segmentation directly harms task identification and indirectly harms other downstream query-based problems and activities such as query suggestion. Recent studies have focused on Recurrent Neural Networks (RNNs) and attention-based Artificial Neural Networks (ANNs) to capture the meaning expressed by queries. In this thesis, a Siamese Convolutional Neural Network (CNN) with a decision network is proposed that also refines the query embedding vectors specifically for the query segmentation problem. The proposed method is compared with the Context Attention based Long Short Term Memory (CA-LSTM) model and a Bidirectional Recurrent Neural Network (BiRNN) based model on the Webis Search Mission Corpus 2012 (WSMC12) and Cross-Session Task Extraction (CSTE) datasets. Our model achieves 95% performance, a 1% improvement over existing models, and reaches 81% accuracy on the CSTE dataset, a 6% increase in classification accuracy over the previous best results. Deep learning models require large amounts of training data, yet search-task labeled datasets are rare and small. To overcome these limitations, the second study proposes the Graph based Search Task Extraction Using Siamese Network (Graph-SeTES) model, which integrates a structure using both distance metrics and decision networks with the feature extraction process. Graph-SeTES addresses the internal challenges of STE by using Wikipedia2vec for short queries and fastText for misspelled queries, and addresses the external challenges by producing good results with the SN even with little labeled data. Graph-SeTES is compared with high-performing STE models in the literature and obtains better results: 6% better than the best baseline on the CSTE dataset, a margin preserved on WSMC12. Most existing methods prefer graph-based clustering algorithms that use pairwise relations between queries. This is because graph-based clustering algorithms can cluster similar queries in a natural structure by using both local information (e.g., the direct link between two queries) and global information (e.g., the overall structure formed by multiple query groups). However, instead of exploiting graph-topological structure, these methods cluster the graph according to a simple threshold. Recent studies have used deep clustering layers to prevent the model size from growing with the number of queries; however, these models require labeled data and ignore the embedding representations of modern language models. In this study, a novel k-Contour based Graph Convolutional Network with a Connective proximity Clustering Layer (CoGCN-C-CL) architecture is proposed that uses graph-topological features to extract search tasks without requiring data labeling. CoGCN-C-CL learns query representations and search tasks simultaneously. Applying the k-contour extraction algorithm, highly connected k-contour subgraphs that are denser than their surroundings are extracted. While the k-contours identify distinct and independent regions of the graph with different edge densities, the Graph Convolutional Network (GCN) exploits the interactions among the nodes in these regions. Experimental results show that CoGCN-C-CL outperforms the best existing search-task clustering methods on commonly used search-task datasets. The novel methods presented in this thesis increase STE performance by analyzing and grouping query expressions more effectively than existing methods. The focal points of the work are detecting similar query pairs with Siamese Networks and k-contour based recurrent deep graph clustering techniques. By overcoming the challenges of STE, the proposed methods aim to increase the quality and efficiency of access to information over the Internet by supporting processes such as query suggestion, personalized recommendation, and advertising. As future work, various graph structural features could be explored to further improve the STE process, and the proposed Siamese Network could be adapted to operate in a self-supervised fashion to make it more independent; such adaptations could increase the model's generalization ability and reduce its dependence on labeled datasets, providing a more effective learning process.
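A minimal Siamese matcher with shared-weight branches and a decision head, in the spirit of (but not identical to) the networks used in these studies, might look as follows; the embedding dimensionality, hidden sizes, and pair-combination features are assumptions, and the inputs are taken to be pretrained query embeddings.

```python
import torch
import torch.nn as nn

class SiameseQueryMatcher(nn.Module):
    # Both queries are encoded by the same network (shared weights); the decision
    # head then scores whether they belong to the same search task from the
    # element-wise difference and product of the two embeddings.
    def __init__(self, emb_dim=300, hidden=128):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(emb_dim, hidden), nn.ReLU(), nn.Linear(hidden, hidden))
        self.decision = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, q1, q2):
        z1, z2 = self.encoder(q1), self.encoder(q2)      # shared weights on both branches
        pair = torch.cat([(z1 - z2).abs(), z1 * z2], dim=-1)
        return torch.sigmoid(self.decision(pair)).squeeze(-1)

# q1 and q2 would be query embeddings (e.g. averaged Wikipedia2vec/fastText vectors).
```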
-
ÖgeEtmen tabanlı bir anlamsal süreç çalışma ortamının geliştirilmesi(Lisansüstü Eğitim Enstitüsü, 2021) Kır, Hüseyin ; Erdoğan, Takuhi Nadia ; 672532 ; Bilgisayar MühendisliğiFor a long time, the field of enterprise information systems was dominated by data-centric systems that put enterprise data at the center and focused on its management. Over time, the view spread that information, like other production instruments, is an intermediate product that organizations consume and produce to reach their goals, and that the real focus should be the functions that realize production. With this approach, while enterprise data and knowledge retained their importance, business processes were placed at the center and Business Process Management Systems (BPMSs) emerged. BPMSs are general-purpose software systems that take as input process models representing how the organization operates and aim to increase the effectiveness and productivity of production processes by coordinating the participants. Over time, these systems have evolved to support the whole process life cycle (design, execution, monitoring, analysis, and improvement). Traditionally, BPMSs have focused on modeling and executing predictable and repeatable processes that are described in detail by regulations. All possible workflows of such processes are fully known, and the decisions that process participants may take are foreseen in advance; such processes currently constitute the vast majority of enterprise processes. Nevertheless, 16% of organizations report that they have to change their business processes on the fly due to unforeseen events, and 10% report that some of their processes change daily. These are precisely the knowledge-intensive and artful processes that existing BPMSs fall short of managing. Knowledge-intensive processes (KIPs) are processes whose execution and management depend on knowledge workers who perform various knowledge-driven decision-making tasks. They usually have a high-level workflow, but its details are known only implicitly by the domain expert. These processes cannot be expressed with a formal process model and, most of the time, are not even written down. An example is an energy expert's evaluation of a hydroelectric power plant project: the expert advances the process by performing many experience-based examinations, such as assessing the submitted feasibility study, the necessity and appropriateness of expropriations, and the realism of the expected production projections and their consistency with demand, consulting other experts (legal, planning, etc.) as needed. The flow of the process emerges entirely from momentary needs and the expert's experience, and every evaluation may follow a different flow. The process management research field envisions a future in which virtual organizations are established and people in different parts of the world who do not know each other can join the same process and produce collaboratively. The Covid-19 pandemic has accelerated this trend and made remote collaboration a necessity. As a result, applying process-oriented approaches in demanding knowledge-intensive scenarios that existing infrastructures do not support has become mandatory. To realize the goal of knowledge-centric process management, today's BPM systems need to concentrate on concepts such as collaboration, adaptability, and context awareness, and must begin to support a set of new requirements. In general, these requirements can be summarized as developing an enterprise knowledge base in which the entire enterprise environment, data, and rules are modeled, and building on this knowledge model an execution environment that is triggered by knowledge, shaped by rules, serves organizational goals, and allows dynamic collaborations. Research in this direction is still largely at the academic level and focuses only on the intelligent exception-handling problem; moreover, because these studies are far from enterprise standards and difficult to apply, their industrial adoption has not spread. The method developed in this thesis is based on three hypotheses for managing highly variable knowledge-intensive business processes. First, process design is not limited to modeling tasks and control flows; the data, rule, goal, work environment, and workflow perspectives that make up the process space must be handled holistically. Second, to achieve encapsulation and componentization, process executions should be managed not through task flows that update enterprise knowledge, but through interacting autonomous entities (intelligent software agents), each with its own goals, beliefs, decisions, and life cycle. The ultimate goal of BPM systems is evolving from coordinating workflows toward assisting the decision-making of domain experts; accordingly, the third hypothesis is that at least part of the expertise of knowledge workers should be digitalized and carried out by autonomous software delegates. To this end, a two-stage solution supporting the whole BPM life cycle is proposed. First, an integrated modeling methodology is developed that seamlessly unifies the modeling paradigms and design components of business processes, enterprise knowledge management, and multi-agent systems and allows them to be modeled together. The incrementally developed models semantically describe the organization, the work environment, enterprise strategies, functionalities, and constraints, and constitute the enterprise knowledge model. In designing these models, standards and best practices used in industry and in agent-oriented software engineering are reused as much as possible, so that the approach can be applied easily to real-life problems. In the second stage of the thesis, a multi-agent based process execution environment is developed that allows agents to make autonomous, goal-oriented, and knowledge-driven behavior adaptations at run time. Agents that use the developed knowledge model exhibit cognitive capabilities (such as goal-driven planning, rule compliance, knowledge-driven behaviors, and dynamic collaborations) and support the decision-making processes of knowledge workers. For this purpose, with a heuristic planning approach inspired by the way domain experts make decisions, the actions to be taken are decided step by step at run time as new information emerges, progressively closing the gap between goals and facts. The experts' goal-oriented behavior selection, process-quality assessment, compliance checking, exception handling, and dynamic negotiation and collaboration capabilities are digitalized so that they can be carried out by agents. In this way, processes can be dynamically adapted at run time and reshaped through momentary interactions to reach organizational goals. The experimental studies show that, in an environment with sufficient resources for process execution, the framework developed in this thesis can successfully handle randomly generated run-time exceptions. Compared with existing studies in the literature, the developed system is shown to be the most comprehensive solution, satisfying the great majority of the core requirements of knowledge-intensive process management systems.
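The run-time behaviour described above, closing the gap between goals and facts one compliant action at a time, can be caricatured by the following loop; the belief, goal, and action interfaces are hypothetical stand-ins for the semantic knowledge model and heuristic planner of the actual framework.

```python
def run_agent(beliefs, goals, actions, max_steps=100):
    # `beliefs` is a dict-like knowledge state; each goal is a predicate over
    # beliefs; each action exposes applicable(beliefs), violates_rules(beliefs)
    # and apply(beliefs) -> new beliefs. All of these are illustrative interfaces.
    for _ in range(max_steps):
        open_goals = [g for g in goals if not g(beliefs)]
        if not open_goals:
            return beliefs                               # all goals satisfied
        candidates = [a for a in actions
                      if a.applicable(beliefs) and not a.violates_rules(beliefs)]
        if not candidates:
            raise RuntimeError("no compliant action available; escalate to a human expert")
        # choose the action whose simulated effect satisfies the most open goals
        best = max(candidates,
                   key=lambda a: sum(g(a.apply(dict(beliefs))) for g in open_goals))
        beliefs = best.apply(beliefs)
    return beliefs
```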
-
ÖgeFace recognition and person re-identification for person recognition(2020) Başaran, Emrah ; Kamaşak, Mustafa Ersel ; Gökmen, Muhittin ; 629137 ; Bilgisayar Mühendisliği ; Computer EngineeringFace recognition and person re-identification (re-ID) are needed in many different areas, above all personal and public security, forensic cases, and entertainment. Face images carry rich and highly discriminative features for person identification. In addition, since face images can be acquired without contact or cooperation, face recognition has a wider application area than applications that use other biometric identifiers such as iris and fingerprint. In the re-ID problem, whole-body images are used rather than biometric identifiers; the basic aim is to match person images recorded by different cameras. When face images cannot be obtained, or are not of sufficient quality for face recognition, re-ID is an important method for person identification. In this thesis, the face recognition problem, which is extremely important for person identification, is addressed first, and novel methods are then proposed for the re-ID problem. Re-ID is examined in two different settings, because when the color information that carries the most important cues for re-ID cannot be obtained from images recorded in poorly lit or dark environments, re-ID changes and becomes an even harder problem. The first study uses RGB images acquired in the visible domain; the second studies the cross-domain re-ID problem using infrared images together with RGB images. In the literature, face recognition is generally handled in two ways, identification and verification. The most important part of the face recognition systems developed for both is how descriptors are constructed for face images, and recognition performance depends largely on the quality of these descriptors. In the face recognition part of this thesis, unsupervised feature extraction methods built essentially on local Zernike moments (LZM) are proposed to obtain strong descriptors. First, feature extraction from holistic face images is addressed. In the developed method, local features are revealed in two ways. In the first, phase-magnitude histograms (PMHs) are computed on the complex pattern maps obtained by applying the LZM transform twice in succession. In the second, gray-level histograms are used; these are computed on the gray-level images produced by encoding the LZM pattern maps with a local XOR operator. Both the PMHs and the gray-level histograms are computed separately in the sub-regions of holistic face images that have been divided into sub-regions. Then, all histograms obtained from each pattern map are concatenated to form feature vectors. In the last stage, the dimensionality of these vectors is reduced. In the proposed method, Whitened Principal Component Analysis (WPCA) is used for dimensionality reduction, following a block-based approach: first, sub-regions are brought together to form blocks, and then the dimensionality of the feature vectors obtained from these blocks is reduced separately. The effects of these methods on face recognition performance and the results achieved are demonstrated using the Face Recognition Technology (FERET) dataset. In the second part of the face recognition studies, another method is proposed in which feature extraction is carried out around facial landmarks. In this method, patches are extracted around the landmarks, and the PMHs used in the feature vectors are computed in the sub-regions of these patches. To obtain features containing both local and holistic information of the face images, an image pyramid is used within the method; multi-scale descriptors are obtained by extracting features separately from the LZM pattern maps of the images in the pyramid. Then, the features obtained from the image pyramid are concatenated to create a separate feature vector for each landmark, and in the last step, the dimensionality of the vectors is reduced separately using WPCA. The FERET, Labeled Faces in the Wild (LFW), and Surveillance Cameras Face (SCface) datasets are used to test the performance of the proposed method. The results show that the proposed method is robust to variations such as illumination, facial expression, and pose; its success on low-resolution face images acquired in uncontrolled environments or in the infrared spectrum is also demonstrated. Person re-identification is a very challenging task due to factors such as background clutter and variations in pose, illumination, and camera viewpoint. These factors seriously affect the process of extracting strong and at the same time discriminative features, making it difficult to distinguish different persons successfully. In recent years, the great majority of re-ID studies have exploited deep learning to develop methods that can cope with these factors. In general, these studies try to increase the quality of the representations learned for person images by extracting local features from body parts, where the body parts are detected with bounding-box detection methods. In this thesis, a method developed with deep learning techniques is proposed for the re-ID problem. In this method, as in other studies, local features are obtained from body parts; however, semantic parsing is used to detect the parts instead of bounding boxes. Semantic parsing of body images, thanks to its pixel-level accuracy and ability to model arbitrary boundaries, is naturally a better alternative to bounding-box detection. In the proposed method, semantic parsing is used effectively for the re-ID problem, reaching the best known performance on the datasets used in the experiments. Besides semantic segmentation, a training method is also proposed that enables widely used deep architectures such as Inception and ResNet to be trained more efficiently for the re-ID problem. The success of the methods is demonstrated by experiments on the Market-1501, CUHK03, and DukeMTMC-reID datasets. Another study carried out within this thesis addresses the visible-infrared cross-domain re-ID (VI-ReID) problem. VI-ReID is extremely important for carrying out surveillance in poorly lit or dark environments. In recent years, many re-ID studies have been conducted in the visible domain; in contrast, very few studies on VI-ReID exist in the literature. In addition to the challenges already present in re-ID, such as pose and illumination changes, background clutter, and occlusion, the lack of color information in infrared images makes VI-ReID an even harder problem; as a result, the performance of VI-ReID systems is typically lower than that of re-ID systems. In this thesis, a four-stream method is proposed to improve the performance of VI-ReID, again exploiting deep learning techniques. In each stream of the proposed method, a separate deep convolutional neural network (CNN) is trained using a different representation of the input images, so that the CNN model in each stream learns different and at the same time complementary features. In the first stream, a CNN model is trained using gray-level and infrared input images. The input images in the second stream are RGB images and 3-channel infrared images created by repeating the infrared channel. In the other two streams, local pattern maps obtained with the LZM transform are used as input images: in the third stream these pattern maps are obtained from gray-level and infrared images, and in the last stream from RGB and 3-channel infrared images. In the final step, the distances between images are computed using a re-ranking algorithm proposed in the literature. Experiments on the SYSU-MM01 and RegDB datasets demonstrate the success of the proposed method.
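The block-based descriptor pipeline (sub-region histograms concatenated and then reduced with whitened PCA) can be sketched as follows, with the LZM transform itself omitted; the grid size, histogram range, and number of components are illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

def block_histogram_descriptor(pattern_map, grid=(8, 8), bins=16):
    # Split a (gray-level) pattern map into a grid of sub-regions, take a
    # histogram in each one, and concatenate them into a single feature vector.
    # The 0-256 range assumes an 8-bit-like pattern map.
    h, w = pattern_map.shape
    gh, gw = grid
    feats = []
    for i in range(gh):
        for j in range(gw):
            cell = pattern_map[i * h // gh:(i + 1) * h // gh,
                               j * w // gw:(j + 1) * w // gw]
            hist, _ = np.histogram(cell, bins=bins, range=(0, 256), density=True)
            feats.append(hist)
    return np.concatenate(feats)

def fit_whitened_pca(train_descriptors, n_components=128):
    # Whitened PCA used to reduce the dimensionality of each block's descriptors.
    return PCA(n_components=n_components, whiten=True).fit(train_descriptors)
```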