Investigation of artificial intelligence-based point cloud semantic segmentation

Date
2022-12-08
Authors
Atik, Muhammed Enes
Publisher
Graduate School
Abstract
As the range of applications for 3D point clouds grows, extracting information from 3D data has become an important field of study in photogrammetry, remote sensing, computer vision, and robotics. The geometric information contained in point clouds is valuable for many applications. Point clouds can be obtained with 3D scanners, Light Detection and Ranging (LiDAR), Structure from Motion (SfM), photogrammetry, and RGB-D cameras. Among these technologies, LiDAR, which can be operated from aerial, terrestrial, and mobile platforms, is finding new uses day by day. Mobile LiDAR point clouds, acquired by laser scanners mounted on a moving vehicle, are especially useful for mapping and autonomous vehicles. Accurate perception of the surroundings, mapping, and precise positioning are essential requirements for autonomous driving, and mobile LiDAR point clouds are an information-rich data source for these tasks. Point cloud semantic segmentation has become an important research topic in the last decade. With the development of artificial intelligence techniques, semantic segmentation of point clouds has been applied in many areas. Many methods and data sets have been shared in the literature, and although research is progressing rapidly, more is needed. Deep learning techniques also enable successful semantic segmentation of large and complex point clouds. Semantic segmentation has significant potential for helping autonomous driving systems perceive and map the environment. This thesis presents three articles examining the use of artificial intelligence techniques in the semantic segmentation of point clouds. A new deep learning-based semantic segmentation approach is proposed, and approaches to improving the performance of existing machine learning and deep learning techniques are presented.
In the first article, the semantic segmentation performance of eight machine learning approaches was investigated using point clouds acquired with aerial and mobile LiDAR sensors. The 3D coordinates of the points alone are not sufficient for semantic segmentation, so additional information must be derived. The feature vector of each point is therefore built from geometric features that describe the geometric relationships in the local neighborhood of the point. The neighborhood of a point is defined by a sphere centered on the point. The study examined how the semantic segmentation accuracy of the machine learning algorithms changes with the radius of this sphere. Determining the most suitable radius increases the distinctiveness of the geometric features and thus the accuracy of the algorithms. The results were compared with those of current methods on the same data sets.
In the second article, a new projection-based deep learning approach for point cloud semantic segmentation is presented. Mobile LiDAR point clouds consist of frames, similar to an image sequence, and this data must be evaluated quickly and accurately to ensure safe autonomous driving. First, the point clouds are converted into 2D images by projecting the irregular structure of the point cloud onto a 2D plane using spherical projection. Once converted, the point clouds can be treated as 2D images. U-Net and SegNet are commonly used image segmentation methods, and the proposed method (SegUnet3D) was created by combining them. Input data passes through two branches, U-Net and SegNet, and the final predictions are produced by summing the calculated weights in the final stage. Geometric features were calculated to describe the points, and each geometric feature is attached to the 2D images as an additional image band.
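The spherical projection step of the second article can be illustrated with a minimal sketch. It uses the projection formula that is common for rotating LiDAR range images; the abstract does not give the exact formulation used in the thesis, so the function name `spherical_projection`, the image size, and the vertical field of view are all assumptions chosen for illustration. Only the range channel is filled here; geometric feature bands would be written into further channels at the same pixel.

```python
import math


def spherical_projection(points, height=4, width=8,
                         fov_up_deg=15.0, fov_down_deg=-15.0):
    """Project (x, y, z) points onto a height x width range image.

    Hypothetical parameters: the image size and field of view are
    illustrative values, not those used in the thesis.
    Each cell holds the range of the closest point that falls into it,
    or None if no point was projected there.
    """
    fov_up = math.radians(fov_up_deg)
    fov_down = math.radians(fov_down_deg)
    fov = fov_up - fov_down
    image = [[None] * width for _ in range(height)]

    for x, y, z in points:
        r = math.sqrt(x * x + y * y + z * z)
        if r == 0.0:
            continue  # the sensor origin has no defined direction
        yaw = math.atan2(y, x)    # horizontal angle in [-pi, pi]
        pitch = math.asin(z / r)  # vertical angle

        # Normalize both angles to [0, 1] and scale to pixel coordinates.
        u = 0.5 * (1.0 - yaw / math.pi) * width
        v = (1.0 - (pitch - fov_down) / fov) * height

        # Points outside the field of view are clamped to the border.
        col = min(width - 1, max(0, int(u)))
        row = min(height - 1, max(0, int(v)))

        # Keep the nearest return when several points share a pixel.
        if image[row][col] is None or r < image[row][col]:
            image[row][col] = r
    return image


frame = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (-1.0, 0.0, 0.0)]
range_image = spherical_projection(frame)
```

In this toy frame the first two points lie in the same direction, so only the nearer one is kept in the shared pixel.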
With the geometric features attached as bands, multi-spectral images representing the point cloud were created. The use of geometric features improved the semantic segmentation performance of the method. The SemanticPOSS and RELLIS-3D data sets were used to evaluate the proposed method. SemanticPOSS covers a dense urban area and RELLIS-3D covers a rural area, so the performance of the proposed method on different terrain types was also examined. In addition, the experiments were repeated with different input image sizes and different minimum numbers of points required to calculate the geometric features in order to determine the optimum parameters. The proposed method was compared with current methods in the literature. It improved the mean intersection over union (mIoU) by up to 15.9% on the SemanticPOSS data set and by up to 5.4% on the RELLIS-3D data set.
The third article examines the effect of feature selection algorithms on the point cloud semantic segmentation performance of deep learning networks. The filter-based information gain (IG), Chi-square (Chi2), and ReliefF algorithms were used to select the relevant features. Because filter-based methods do not depend on a classifier, they produce more consistent results in determining the optimum features. RandLA-Net and Superpoint Graph (SPG), deep learning networks that operate directly on points, were chosen. Both methods can process geometric features as input data. Experiments were performed on three popular mobile LiDAR point cloud data sets: Toronto3D, SZTAKI-CityMLS, and Paris-CARLA-3D. Using three data sets is important for generalizing the article's hypothesis. Toronto3D and Paris-CARLA-3D contain color information for each point. Considering the 3D coordinates (x, y, z), color information (red, green, blue), and selected geometric features, ten feature combinations were created for these two data sets.
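As an illustration of filter-based feature selection, the following sketch ranks per-point features by information gain with respect to the class labels. It is a simplified stand-in rather than the procedure used in the third article: the equal-width binning, the number of bins, and the feature names in the usage example are assumptions made for this example.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels, bins=4):
    """Information gain of one continuous feature after equal-width binning.

    The binning scheme is an assumption for this sketch.
    """
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid division by zero for constant features
    binned = [min(bins - 1, int((v - lo) / span * bins)) for v in values]

    # Group the class labels by the bin their feature value falls into.
    groups = {}
    for b, label in zip(binned, labels):
        groups.setdefault(b, []).append(label)

    # Entropy that remains once the binned feature value is known.
    conditional = sum(len(g) / len(labels) * entropy(g)
                      for g in groups.values())
    return entropy(labels) - conditional


def rank_features(feature_columns, labels, bins=4):
    """Return (name, score) pairs sorted from most to least informative."""
    scores = {name: information_gain(column, labels, bins)
              for name, column in feature_columns.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy data: four points, two classes, two features.
labels = ["road", "road", "building", "building"]
features = {
    "height_difference": [0.1, 0.2, 5.0, 6.0],
    "uninformative": [1.0, 5.0, 1.1, 5.1],
}
ranking = rank_features(features, labels)
```

Because the score is computed from the data alone, without training a classifier, the ranking is the same regardless of which network later consumes the selected features.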
In the third article's experiments, feature subsets determined by feature selection yielded higher semantic segmentation accuracy than using all features, and similar results were obtained on all data sets. Color information also significantly increases the accuracy of semantic segmentation. In particular, without color information it is not possible to distinguish geometrically similar classes such as road and road marking. According to the feature importance scores, the most important feature is the height difference within a point's neighborhood. This is consistent with the feature importance ranking obtained in the first article. The study concluded that the success of point cloud semantic segmentation depends on the selected features.
In summary, this thesis examined the effect of using geometric features in point cloud semantic segmentation (PCSS) with artificial intelligence approaches. Each point of the point cloud is described using geometric features to improve the PCSS performance of machine learning and deep learning algorithms. Analyses were carried out to find the most accurate description of a point within its local surface area. Mobile LiDAR point clouds, an important data source for autonomous driving, are the focus of the research. A fast and efficient projection-based deep learning network was developed for point cloud semantic segmentation for autonomous driving. The performance analyses and proposed methods are presented in a reproducible manner so that they can be applied in other studies of point cloud semantic segmentation.
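The height difference within a spherical neighborhood, which the abstract reports as the most important feature, can be sketched as follows. This is a minimal illustration under stated assumptions: the function names, the brute-force neighbor search, and the default `min_points` value are hypothetical, and a spatial index such as a k-d tree would normally replace the linear scan on real point clouds.

```python
import math


def radius_neighbors(points, center, radius):
    """Return all points within `radius` of `center` (brute-force scan).

    The center itself is included if it is part of `points`.
    """
    return [p for p in points if math.dist(p, center) <= radius]


def height_difference(points, center, radius, min_points=3):
    """Height range (max z minus min z) inside the spherical neighborhood.

    Returns None when fewer than `min_points` neighbors are found,
    since the feature is not meaningful for very sparse neighborhoods.
    The default of 3 is an illustrative value.
    """
    neighbors = radius_neighbors(points, center, radius)
    if len(neighbors) < min_points:
        return None
    heights = [p[2] for p in neighbors]
    return max(heights) - min(heights)


# Hypothetical toy cloud: three near-ground points and one elevated point.
cloud = [(0.0, 0.0, 0.0), (0.5, 0.0, 0.1), (0.0, 0.5, 0.2), (0.0, 0.0, 3.0)]
center = cloud[0]
for radius in (0.1, 1.0, 5.0):
    print(radius, height_difference(cloud, center, radius))
```

Running the loop with several radii shows why the choice of radius matters: a small sphere contains too few points to compute the feature, while a large one mixes the ground points with the elevated point and changes the feature value.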
Description
Thesis (Ph.D.) -- Istanbul Technical University, Graduate School, 2022
Keywords
digital photogrammetry, artificial intelligence, three dimensional maps