Deep learning based road segmentation from multi-source and multi-scale data

Date
2023-05-12
Authors
Öztürk, Ozan
Publisher
Graduate School
Abstract
Roads are geographical objects that matter in many application areas, such as city planning, traffic management, disaster management, and military interventions. The success of these applications depends on how quickly and accurately road information can be obtained. Researchers have mostly used satellite and/or aerial photographs as data sources and have focused on acquiring road information automatically. Although artificial intelligence (AI)-based approaches, which have been widely used in recent years, have produced successful results, automatic segmentation of roads from remote sensing data is still considered a difficult and important problem because of the complex and irregular structure of roads. AI has been developed to enable computers to perform human abilities such as reasoning, perception, and problem-solving. The most basic expectation is that AI can overcome problems for which traditional approaches are insufficient. As a recent trend in AI, deep learning (DL) methods model more complex relationships in the data and distinguish its hidden features more accurately. DL is data-driven, and the quality, quantity, and variety of training data directly affect model performance. For this reason, comprehensive datasets such as MNIST, COCO, and ImageNet have been published. However, the number of datasets containing geographic details is limited by comparison. In addition, such datasets represent only the characteristics of the regions where they were created. Models trained on them can therefore distinguish details only to the extent that they can learn them from these limited data. It is extremely difficult for such models to predict roads effectively in regions characterized by complex road networks, such as Istanbul.
This thesis aims to close the data gap in DL-based road segmentation studies, to produce datasets representative of the study region, and finally to combine data obtained from different sources in order to overcome the problems encountered in existing research that uses only optical images. The thesis is divided into five main parts. The introduction gives a general overview of the subject, including comprehensive information on current studies and the motivation for the thesis. In the second part, a fast, accurate, and comprehensive infrastructure for producing road datasets was built on a web map service to overcome data-related problems. For this purpose, service providers whose maps can be edited on user request were considered suitable. A data generation program was developed in Python using the Static API of the Google Maps Platform. In this program, the properties of the mask images corresponding to the satellite images were defined with JavaScript code. An automatic static map style was created for road segmentation. With this program, the desired number of images can be generated, randomly or in sequence, at fixed image sizes and within the boundaries of specified test regions. The Google Maps Platform does not provide geographic information about the images; to overcome this deficiency, geo-referencing of the satellite images and their corresponding masks was added to the program. The third part of the thesis aims to create an Istanbul road dataset, because road segmentation studies require a dataset that represents the characteristics of the region being tested. Istanbul's road network is continually developing as the population grows. Because it contains different road types and land use details, it can provide the data diversity required by DL applications.
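The geo-referencing step described above can be illustrated with a minimal sketch. It assumes the images follow the Web Mercator projection with 256-pixel base tiles at scale 1, and that each image is requested by its center coordinate, zoom level, and pixel size. The function names (`latlon_to_world_px`, `world_px_to_latlon`, `image_bounds`) and the sample coordinates are hypothetical and are not taken from the thesis program.

```python
import math

TILE_SIZE = 256  # assumed base tile size in pixels (scale=1)


def latlon_to_world_px(lat, lon, zoom):
    """Project WGS84 degrees to Web Mercator pixel coordinates at a zoom level."""
    world = TILE_SIZE * 2 ** zoom  # width/height of the whole map in pixels
    x = (lon + 180.0) / 360.0 * world
    siny = math.sin(math.radians(lat))
    y = (0.5 - math.log((1 + siny) / (1 - siny)) / (4 * math.pi)) * world
    return x, y


def world_px_to_latlon(x, y, zoom):
    """Inverse of latlon_to_world_px: pixel coordinates back to WGS84 degrees."""
    world = TILE_SIZE * 2 ** zoom
    lon = x / world * 360.0 - 180.0
    n = math.pi * (1 - 2 * y / world)
    lat = math.degrees(math.atan(math.sinh(n)))
    return lat, lon


def image_bounds(center_lat, center_lon, zoom, width, height):
    """Return (west, south, east, north) in degrees for an image centered
    at the given coordinate. Pixel y grows southward, so the top edge is north."""
    cx, cy = latlon_to_world_px(center_lat, center_lon, zoom)
    north, west = world_px_to_latlon(cx - width / 2, cy - height / 2, zoom)
    south, east = world_px_to_latlon(cx + width / 2, cy + height / 2, zoom)
    return west, south, east, north


# Hypothetical example: a 512x512 image near Istanbul at zoom level 16.
west, south, east, north = image_bounds(41.0, 29.0, 16, 512, 512)
print(west, south, east, north)
```

Bounds computed this way could be attached to a satellite image and its mask alike, since both would share the same center, zoom level, and size; how the thesis program stores the geo-reference is not stated in the abstract.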
The changing and evolving structure of Istanbul makes it one of the most important regions to observe and analyze continuously. To examine how different satellite image resolutions and different mask generalization levels contribute to road segmentation, images at zoom levels 14, 15, 16, and 17 were generated from Google Maps in this thesis. As a result, 10,000 optical images and road mask images were produced for each zoom level in the test regions in Istanbul. The deep residual U-Net architecture was used to test the performance of the generated dataset in DL models. Examination of the training metrics of the models' predictions showed that the Istanbul dataset achieved successful results in segmenting road pixels at each zoom level separately. In addition, the DeepGlobe and Massachusetts datasets, which are widely used in road segmentation studies, were included in the analysis to test the prediction performance of models trained with datasets generated outside the study region.
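To give a rough sense of what the four zoom levels mean in terms of image resolution, the sketch below applies the standard Web Mercator ground-resolution formula. It assumes 256-pixel base tiles at scale 1 and an approximate latitude of 41° N for Istanbul; neither value is stated in the abstract, and `ground_resolution` is a hypothetical helper name.

```python
import math

# Meters per pixel at the equator for zoom 0 with 256-pixel tiles:
# 2 * pi * 6378137 / 256 (WGS84 equatorial radius).
EQUATOR_M_PER_PX_Z0 = 156543.03392

ISTANBUL_LAT = 41.0  # assumed approximate latitude, in degrees


def ground_resolution(lat_deg, zoom):
    """Approximate ground size of one pixel, in meters, at a latitude and zoom."""
    return EQUATOR_M_PER_PX_Z0 * math.cos(math.radians(lat_deg)) / 2 ** zoom


for zoom in (14, 15, 16, 17):
    print(zoom, round(ground_resolution(ISTANBUL_LAT, zoom), 2), "m/pixel")
```

Each additional zoom level halves the ground size of a pixel, so under these assumptions the range from zoom 14 to zoom 17 spans roughly a factor of eight in resolution.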
Description
Thesis (Ph.D.) -- Istanbul Technical University, Graduate School, 2023
Keywords
deep learning, image segmentation, remote sensing