LEE - Computer Engineering - Master's Degree
Browsing LEE - Computer Engineering - Master's Degree by subject "aerial photography"
Item: Crowd density map estimation system from aerial images (Graduate School, 2023-07-31) Çetinkaya, Osman Tarık ; Ekenel, Hazım Kemal ; 504201559 ; Computer Engineering
Urbanization, which has emerged from people's choice or need to live in cities, is a social and economic transformation. Recently, the notion of a "smart city" has gained significant popularity because it brings together elements such as sustainability, livability, quality of life, competition, branding, governance, participation, social welfare, and digitalization, thereby contributing to urban development. Cities of varying sizes around the globe have been formulating smart city strategies for many years. Making a city "smart" is a strategy for alleviating the problems caused by urban population growth and rapid urbanization. A good example is a smart solution to increasing traffic density in a big city: an automatic system, based on detailed analysis, that directs newly arrived vehicles to empty spaces according to the total capacity of the parking lots and does not admit new vehicles when capacity is full. The earthquake in Kahramanmaraş, Turkey, in February 2023 showed that a system that can automatically detect where earthquake victims are concentrated has become a necessity. In any such natural disaster, it is very important to quickly identify groups of people in the affected regions and to provide support with the help of drones. Military use cases are another application area for crowd counting. Today, it is very important for unmanned vehicles developed for military purposes to process the images in videos or photographs and carry on with their task according to an algorithm.
In situations such as smuggling at the borders or illegal immigration, there is a growing need to detect people and crowds in images taken from UAVs. Crowd analysis is also very important for situations that require visual surveillance, such as anomalies and alarm conditions. In recent years, many different methods have been proposed for crowd density map estimation, and the most popular approach is now to estimate crowd counts from density maps. These density maps are usually computed with the help of CNNs. Most crowd counting datasets in the literature consist of images collected from surveillance cameras. Such images are taken at an oblique, fixed angle and at relatively close range compared with drone footage, so people occupy most of the frame. The approach proposed in this study matters most in emergencies where images must be taken by drones because no surveillance cameras are available. The developed system consists of two stages. In the first stage, a binary classifier determines whether the image contains any people. If it does, the crowd estimation algorithm then calculates the density map of the people in the image. This study involves the development of a crowd density map estimation system that leverages the robust feature extraction capabilities of deep CNN architectures. In our system, the binary classifier runs before the CNN designed for the crowd counting task; it is included to decide whether or not an image taken from a UAV contains people. To test the performance of the proposed system, we used the VisDrone-CC2020 dataset [1]. We applied image inpainting methods to this dataset to create UAV images that contain no people.
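The two-stage flow described above can be illustrated with a minimal sketch. It is not the thesis implementation: the function names, the list-of-lists image type, and the toy stand-ins for the classifier and the density estimator are all hypothetical, and the real stages would be trained CNNs.

```python
from typing import Callable, List, Optional

Image = List[List[float]]
DensityMap = List[List[float]]


def estimate_crowd(
    image: Image,
    contains_person: Callable[[Image], bool],
    estimate_density: Callable[[Image], DensityMap],
) -> Optional[DensityMap]:
    """Run the density estimator only if the classifier reports people."""
    if not contains_person(image):
        return None  # stage 1 rejected the image, so stage 2 is skipped
    return estimate_density(image)


def count_from_density(density: Optional[DensityMap]) -> float:
    """The estimated count is the sum of the density map (0 if rejected)."""
    if density is None:
        return 0.0
    return sum(sum(row) for row in density)


# Toy stand-ins (hypothetical), used only to exercise the control flow.
def toy_classifier(image: Image) -> bool:
    return any(pixel > 0.5 for row in image for pixel in row)


def toy_estimator(image: Image) -> DensityMap:
    return [[1.0 if pixel > 0.5 else 0.0 for pixel in row] for row in image]
```

With these stand-ins, an all-zero image is rejected by the first stage and counted as 0, while an image with two bright pixels passes the gate and yields a count of 2.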
For binary classification, the pretrained ResNet50 model [6] was fine-tuned on this dataset and achieved 87% accuracy. For crowd counting, the second stage of the system, we used SGANet [9]. SGANet was designed specifically for this problem. We created a new architecture by adding several layers to this network. To train the network, ground truth density maps were created first. Ground truth density maps are produced from the images and labels provided by the dataset, while output density maps are learned by our SGANet. Comparing the learned density map with the ground truth density map yields a loss, which is used to train our SGANet. We obtained a mean absolute error (MAE) of 8.65; MAE is the most widely used metric in crowd counting. We then performed an error analysis of the models trained for binary classification and for crowd counting. For the binary classifier, we concluded that incorrect outputs can be caused by artifacts left in the photos by image inpainting. For crowd counting, we concluded that small percentage errors in dense scenes have a large effect on MAE, so new metrics should be developed for this problem. In addition, in photographs taken from a much greater height above the ground, the pixels representing a person and those representing any other object on the ground look similar and are very few in number. We therefore concluded that in these scenes the model predicts more people than are actually present.
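As a rough illustration of the ground truth density maps and the MAE metric mentioned above, here is a minimal sketch. It assumes the common practice of placing one normalized Gaussian per annotated point; the abstract does not state which kernel the thesis used, and the function names and the `sigma` value are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def gaussian_density_map(
    height: int,
    width: int,
    points: Sequence[Tuple[int, int]],
    sigma: float = 2.0,  # assumed kernel width, not taken from the thesis
) -> List[List[float]]:
    """Place one normalized Gaussian per annotated (row, col) point."""
    density = [[0.0] * width for _ in range(height)]
    for r0, c0 in points:
        # Unnormalized kernel evaluated over the whole image.
        kernel = [
            [math.exp(-((r - r0) ** 2 + (c - c0) ** 2) / (2 * sigma ** 2))
             for c in range(width)]
            for r in range(height)
        ]
        total = sum(sum(row) for row in kernel)
        # Normalize so each person contributes exactly 1 to the map's sum.
        for r in range(height):
            for c in range(width):
                density[r][c] += kernel[r][c] / total
    return density


def mean_absolute_error(
    true_counts: Sequence[float], predicted_counts: Sequence[float]
) -> float:
    """Average absolute difference between true and predicted counts."""
    errors = [abs(t - p) for t, p in zip(true_counts, predicted_counts)]
    return sum(errors) / len(errors)


ground_truth = gaussian_density_map(16, 16, [(4, 4), (10, 12)])
count = sum(sum(row) for row in ground_truth)  # approximately 2

# A sparse scene that is 10% off and a dense scene that is only 5% off.
mae = mean_absolute_error([10, 1000], [9, 950])  # 25.5
```

In the last line, the dense scene contributes an absolute error of 50 against 1 for the sparse scene, even though its relative error is smaller. This is the effect the error analysis points to when it says small percentage errors in dense scenes weigh heavily on MAE.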