Character-level dilated deep neural networks for web attack detection
Publisher
Graduate School
Abstract
The rapid expansion of web-based technology has led to a rise in intricate and advanced attacks on website security, and an effective approach is needed to defend against these evolving attacks. The objective of this thesis is to develop an effective method for detecting attacks from Hypertext Transfer Protocol (HTTP) requests while minimizing the complexity of the preprocessing stage. For this reason, the HTTP requests are processed at the character level, that is, they are interpreted as sequences of characters. Many studies have proposed solutions to the attack detection problem that leverage machine learning (ML) techniques. Many of these solutions require feature engineering to perform well, and many of the applied techniques struggle to preserve the sequential information in the input. Because feature engineering is regarded as the most labor-intensive step in developing an ML system, deep learning (DL) has attracted considerable interest in attack detection: DL models learn feature representations and sequential patterns from the input automatically and generalize those representations efficiently. As a result, DL approaches outperform many traditional ML techniques. However, despite their state-of-the-art performance in attack detection, extracting long-term relationships remains a challenge for DL applications. Convolutional neural networks (CNNs) need larger receptive fields to cover longer sequences. Wider receptive fields require more layers, and more layers mean more parameters and a more difficult training process. Long short-term memory networks (LSTMs) are another effective method for handling sequential data. Unfortunately, LSTMs still struggle to learn long-term relations because they do not fully overcome the problem of vanishing/exploding gradients.
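To make the character-level view of a request concrete, the following is a minimal sketch of how an HTTP request string could be turned into a fixed-length sequence of integer ids. The vocabulary (printable ASCII), the padding and unknown ids, the maximum length, and the names `VOCAB` and `encode_request` are illustrative assumptions and are not taken from the thesis.

```python
import string

# Assumed vocabulary: printable ASCII characters.
# Id 0 is reserved for padding and id 1 for characters outside the vocabulary.
PAD_ID, UNK_ID = 0, 1
VOCAB = {ch: i + 2 for i, ch in enumerate(string.printable)}


def encode_request(request: str, max_len: int = 64) -> list[int]:
    """Map each character to an integer id, then truncate or right-pad to max_len."""
    ids = [VOCAB.get(ch, UNK_ID) for ch in request[:max_len]]
    return ids + [PAD_ID] * (max_len - len(ids))


# Hypothetical request line used only for illustration.
example = "GET /index.jsp?id=1' OR '1'='1 HTTP/1.1"
encoded = encode_request(example)
```

No tokenization or hand-crafted features are involved in this kind of encoding; a sequence model receives the raw character order directly.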
Dilated neural networks are a good choice for capturing long-range dependencies in sequential data or time series. In LSTMs, dilated layers and skip connections mitigate the vanishing/exploding gradient problem. In CNNs, dilated convolutions expand the receptive field without requiring more computation or parameters; this is achieved by inserting gaps (dilation) between the elements of the convolutional filter. Because they capture more contextual information, dilated networks are well suited to tasks that require understanding dependencies across a wide range. In this thesis, the performance of both dilated LSTM-based and dilated CNN-based methodologies is assessed. Two distinct methodologies based on dilated LSTMs are evaluated: dilated LSTM and dilated bidirectional LSTM (Bi-LSTM). The first layer of the dilated Bi-LSTM methodology consists of Bi-LSTM blocks, so at each time step this layer retains the information available on both sides of that step. With the dilated layers stacked on top of the Bi-LSTM layer, the model reduces the vanishing/exploding gradient problem and learns temporal relations of different scales at various levels. Apart from the LSTM blocks in its first layer, the dilated LSTM has a structure similar to that of the dilated Bi-LSTM. The dilated CNN-based model, MC-MLDCNN, is composed of multiple channels of multilayer dilated CNN blocks with different kernel sizes. Each channel contains multiple layers whose dilation sizes increase exponentially. By combining several channels and multiple layers of dilated CNNs, the model recognizes the correlation and interdependence among characters in HTTP requests at various levels and scales. Three different datasets are used to assess how effectively the dilated CNN-based and dilated LSTM-based approaches capture the long-term dependencies in complex attacks.
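The effect of exponentially increasing dilation on the receptive field can be illustrated with a small pure-Python sketch. It assumes stride-1 one-dimensional convolutions without padding; the kernel size of 3, the four-layer depth, and the function names are hypothetical choices for illustration, not the configuration used in the thesis.

```python
def receptive_field(kernel_size: int, dilations: list[int]) -> int:
    """Receptive field of stacked stride-1 1D convolutions, one dilation per layer."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilated_conv1d(x: list[float], weights: list[float], dilation: int) -> list[float]:
    """Unpadded 1D dilated convolution (cross-correlation form, as used in DL libraries)."""
    span = (len(weights) - 1) * dilation  # distance between first and last filter tap
    return [
        sum(w * x[t + j * dilation] for j, w in enumerate(weights))
        for t in range(len(x) - span)
    ]


# Four layers with kernel size 3: plain stacking versus exponentially growing dilation.
plain_rf = receptive_field(3, [1, 1, 1, 1])    # 9 characters
dilated_rf = receptive_field(3, [1, 2, 4, 8])  # 31 characters

# The same 3-weight filter reaches farther apart inputs when dilation is 2.
out = dilated_conv1d([1.0, 2.0, 3.0, 4.0, 5.0], [1.0, 0.0, -1.0], dilation=2)  # [-4.0]
```

Each layer still holds only `kernel_size` weights per filter, so in this sketch the wider receptive field comes from the spacing of the filter taps rather than from extra parameters.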
The experiments use the Consejo Superior de Investigaciones Científicas (CSIC) 2010 dataset, the Web Application Firewall (WAF) dataset, and a self-collected dataset gathered over nearly a decade. The CSIC 2010 dataset contains the full text of HTTP requests, the WAF dataset includes only the query portion, and the self-collected dataset includes only the Uniform Resource Identifier (URI) portion. The experimental results demonstrate that MC-MLDCNN outperforms the dilated LSTM-based models in attack detection in terms of accuracy, recall, precision, and F1 score. MC-MLDCNN-based models also require less computation time and converge faster. Therefore, MC-MLDCNN is the methodology proposed in this thesis for detecting web attacks. The proposed methodology is compared with several state-of-the-art DL-based methodologies from the literature as well as with some conventional DL approaches, and the experimental outcomes demonstrate its superiority on the same metrics. Keeping the rate at which normal requests are classified as attacks (false positives) low while maintaining accurate attack detection is a critical requirement for any effective web attack detection system, because a high false positive rate (FPR) disrupts business continuity. To ensure enhanced security without compromising the availability and usability of web applications, the FPR scores are also analyzed.
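For reference, the evaluation metrics named above follow the standard confusion-matrix definitions, sketched below with attacks treated as the positive class. The function name and the counts in the usage line are hypothetical and do not correspond to any result reported in the thesis.

```python
def detection_metrics(tp: int, fp: int, tn: int, fn: int) -> dict[str, float]:
    """Standard binary-classification metrics; 'positive' means a request flagged as an attack."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    # FPR: share of normal requests that were wrongly classified as attacks.
    fpr = fp / (fp + tn) if fp + tn else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1, "fpr": fpr}


# Made-up counts for illustration only.
metrics = detection_metrics(tp=90, fp=5, tn=95, fn=10)
```

Accuracy alone can look high even when many legitimate requests are blocked, which is why the FPR is worth reporting next to the other four scores.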
Description
Thesis (Ph.D.) -- Istanbul Technical University, Graduate School, 2024
Subject
deep neural networks
