LEE - Computer Engineering Graduate Program

Permanent URI for this community

http://hdl.handle.net/11527/19206


Identification of object manipulation anomalies for service robots

(Graduate School, 2021) Altan, Doğan ; Uzar Sarıel, Sanem ; 709912 ; Computer Engineering

Recent advancements in artificial intelligence have increased the use of service robots in many domains, including households, schools and factories, where they facilitate daily life by assisting with domestic tasks. The characteristics of such domains require robots to interact intensively with humans, and these interactions in turn require extending the abilities of service robots to deal with safety and ethical issues. Since service robots are usually assigned complex tasks, unexpected deviations of the task state are highly probable. These deviations are called anomalies, and they need to be continually monitored and handled for robust execution. After an anomaly is detected, it should be identified for effective recovery. Identification requires a time series analysis of onboard sensor readings, since some anomaly indicators are observed long before the anomaly is detected. Because these sensor readings are generally taken asynchronously, they need to be fused effectively to be interpreted correctly. This thesis addresses the anomaly identification problem in everyday object manipulation scenarios. The problem is handled from two perspectives, distinguished by the type of features that are processed. Two frameworks are investigated: the first takes domain symbols as features, while the second considers convolutional features. Chapter 5 presents the first framework, which analyzes symbols as features. It fuses auditory, visual and proprioceptive sensory modalities with an early fusion method. Before fusion, a visual modeling system generates visual predicates and provides them as inputs to the framework, and auditory data are fed into a support vector machine (SVM) based classifier to obtain distinct sound classes. These data are then fused and processed within a deep learning architecture.
The architecture consists of an early fusion scheme, a long short-term memory (LSTM) block, a dense layer and a majority voting scheme. Once the extracted features are fed into this architecture, the anomaly that occurred is classified. Chapter 6 presents a convolutional three-stream anomaly identification (CLUE-AI) architecture that fuses visual, auditory and proprioceptive sensory modalities. Visual convolutional features are extracted with convolutional neural networks (CNNs) from raw 2D images gathered through an RGB-D camera. These visual features are then fed into an LSTM block with a self-attention mechanism. After attention values are calculated for each image in the gathered sequence, a dense layer outputs the attention-enabled results for that sequence. In the auditory stage, Mel-frequency cepstral coefficient (MFCC) features are extracted from the auditory data gathered through a microphone and fed into a CNN block. The position of the gripper and the force it applies are also fed into a dedicated CNN block. The outputs of the three streams are then concatenated with a late fusion mechanism, and the resulting feature vector is fed into fully connected layers, which output the anomaly type. The experiments are conducted on real-world everyday object manipulation scenarios performed by a Baxter robot equipped with an RGB-D camera on its head and a microphone placed on its torso. Comparative performance evaluations, parameter analyses and multimodality analyses are carried out to show the validity of the frameworks. The results indicate that the presented frameworks identify anomalies with f-scores of 92% and 94%, respectively; that is, the CLUE-AI framework outperforms the symbol-based framework in classifying the types of anomalies that occur.
In addition, unlike the symbol-based framework, the CLUE-AI framework does not require additional external modules such as a scene interpreter or a sound classifier, and it provides better results than the symbol-based solution.

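As a rough illustration of two steps named in the abstract above, the Python sketch below shows late fusion by concatenation (as described for CLUE-AI) and majority voting over per-window predictions (as described for the symbol-based framework). It is not the thesis implementation: the function names, the toy feature values and the anomaly labels are assumptions made for illustration, and the CNN, LSTM and dense blocks are omitted entirely.

```python
from collections import Counter
from typing import List, Sequence


def late_fusion(visual: Sequence[float],
                auditory: Sequence[float],
                proprioceptive: Sequence[float]) -> List[float]:
    """Concatenate per-stream feature vectors into one fused vector.

    In a late fusion scheme each stream is processed separately first;
    only the resulting feature vectors are joined.
    """
    return list(visual) + list(auditory) + list(proprioceptive)


def majority_vote(labels: Sequence[str]) -> str:
    """Return the most frequent label; ties go to the label seen first."""
    if not labels:
        raise ValueError("labels must be non-empty")
    counts = Counter(labels)
    best = max(counts.values())
    for label in labels:
        if counts[label] == best:
            return label
    raise AssertionError("unreachable: some label must reach the best count")


if __name__ == "__main__":
    # Toy stream outputs (hypothetical values, not from the thesis).
    fused = late_fusion([0.1, 0.2], [0.3], [0.4, 0.5])
    print(len(fused))

    # Hypothetical per-window predictions for one manipulation sequence.
    print(majority_vote(["slip", "slip", "drop"]))
```

In a full system the fused vector would be passed to fully connected layers rather than printed; the sketch only shows the shape of the data flow.
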
Software defect prediction with a personalization focus and challenges during deployment

(Graduate School, 2021) Eken, Beyza ; Kühn Tosun, Ayşe ; 723330 ; Computer Engineering

Organizations apply software quality assurance (SQA) techniques to deliver high-quality products to their customers. Developing defect-free software plays a critical role in SQA activities. The increasing usage of software systems and their rapid growth in size and complexity raise the importance of effective defect detection. Software defect prediction (SDP) is a subfield of empirical software engineering that focuses on building automated and effective ways of detecting defects in software systems. Many SDP models have been proposed over two decades, and current state-of-the-art models mostly utilize artificial intelligence (AI) and machine learning (ML) techniques together with product, process, and people-related metrics collected from software repositories. So far, the people aspect of SDP has been studied less than the algorithm aspect (i.e., ensembling or tuning machine learners) and the data aspect (i.e., proposing new metrics). While the majority of people-focused studies incorporate developer or team-related metrics into SDP models, personalized SDP models have recently been proposed. Moreover, the majority of SDP research so far focuses on building models that achieve high prediction performance. Case studies in industrial software projects and studies that examine the applicability of SDP models in practice are relatively few. However, for an SDP solution to be successful and efficient, its applicability in real life is as important as its prediction accuracy. This thesis focuses on two main goals: 1) assessing the people factor in SDP to understand whether it helps to improve the prediction accuracy of SDP models, and 2) prototyping an SDP solution for an industrial setting and assessing its deployment performance. First, we conduct an empirical analysis to understand the effect of community smell patterns on the prediction of bug-prone software classes.
The term "community smell" was recently coined to describe collaboration and communication flaws in organizations. Our motivation in this part comes from studies that show the success of incorporating community factors, i.e., sociotechnical network metrics, into models that predict bug-prone software modules. Prior studies also show a statistical association between community smells and code smells (which are code antipatterns) and report the predictive success of code smell-related metrics in the SDP problem. We assess the contribution of community smells to the prediction of bug-prone classes against the contribution of other state-of-the-art metrics (e.g., static code metrics) and code smell metrics. Our analysis of ten open-source projects shows that community smells improve the prediction rates of baseline models by 3% in terms of area under the curve (AUC), while the code smell intensity metric improves the prediction rates by 17%. One reason is that the existing ways of detecting community smell patterns may not capture the team's communication patterns richly enough, since they mine patterns only from the mailing archives of organizations. Another reason is that technical code flaws (the code smell intensity metric) represent defect-related information better than community smells do. Considering the difficulty of extracting community patterns and the higher success of the code smell intensity metric in SDP, we direct our research toward the code development skills of developers and the personalized SDP approach. Second, we investigate personalized SDP models. The rationale behind the personalized SDP approach is that different developers tend to have different development patterns and, consequently, their work may exhibit different defect patterns.
In the personalized approach, each developer in the team has an SDP model that is trained solely on that developer's own development history and whose predictions target only that developer. In the traditional approach, by contrast, a single SDP model is trained on the whole team's development history, and its predictions target anyone in the team. Prior studies report promising results for personalized SDP models. Still, their experimental setups are very limited in terms of data, context, model validation, and exploration of the characteristics that affect the success of personalized models. We conduct a comprehensive investigation of personalized change-level SDP on 222 developers from six open-source projects, utilizing two state-of-the-art ML algorithms and 13 process metrics collected from software code repositories that measure development activity from the size, history, diffusion, and experience aspects. We evaluate model performance using rigorous validation setups, seven assessment criteria, and statistical tests. Our analysis shows that personalized models (PM) predict defects better than general models (GM), i.e., they increase recall by up to 24% for 83% of developers. However, PM also increases false alarms relative to GM by up to 12% for 77% of developers. Moreover, PM is superior to GM for developers who contribute to software modules that many prior developers have contributed to, whereas GM is superior to PM for the more experienced developers. Further, the information gained from the various process metrics in predicting defects differs among individuals, but the size aspect is the most important one across the whole team. In the third part of the thesis, we build prototype personalized and general SDP models for our partner from the telecommunication industry.
Using the same empirical setup as in our investigation of personalized models in open-source projects, we observe that GM detects more defects than PM (i.e., 29% higher recall) in our industrial case. However, PM gives 40% fewer false alarms than GM, leading to a lower code inspection cost. Moreover, we observe that utilizing multiple data sources, such as semantic information extracted from commit descriptions and latent features of development activity, and applying log filtering to metric values improve the recall of PM by up to 25% and lower GM's false alarms by up to 32%. Considering the industrial team's perspective on prediction success criteria, we pick a model to deploy that produces balanced recall and false alarm rates: the GM that utilizes the process and latent metrics with log filtering. We also observe that the semantic metrics extracted from commit descriptions do not seem to contribute to the prediction of defects as much as the process and latent metrics do. In the fourth and last part of the thesis, we deploy the chosen SDP prototype into our industrial partner's real development environment and share our insights on the deployment. Integrating SDP models into real development environments poses several challenges regarding performance validation, consistency, and data accuracy. Offline research setups may not be suitable for observing the performance of SDP models in real life, since the online (real-life) data flow of software systems differs from that of offline setups. For example, in real life, discovering bug-inducing commits takes time due to the bug life cycle, which introduces label noise into the training sets of an online setup. An offline dataset does not have that problem, since it is a pre-collected batch.
Moreover, deployed SDP models need to be re-trained (updated) with recent commits to keep their prediction performance consistent and to keep up with the non-stationary nature of software. We propose an online prediction setup to investigate the deployed prototype's real-life performance under two parameters: 1) a train-test (TT) gap, which is a time gap between the train and test commits used to avoid learning from noisy data, and 2) a model update period (UP), used to include recent data in the model learning process. Our empirical analysis shows that the offline performance of the SDP prototype reflects its online performance after the first year of the project. Also, the online prediction performance is significantly affected by the TT gap and UP values, by up to 37% and 18% in terms of recall, respectively. In deployment, we set the TT gap to 8 months and the UP to 3 days, since these values are the most suitable according to the online evaluation results in terms of prediction capability and consistency over time. The thesis concludes that using the personalized SDP approach leads to promising results in predicting defects. However, whether PM should be chosen over GM depends on factors such as the ML algorithm used, the organization's prediction performance assessment criteria, and the developers' development characteristics. Future research in personalized SDP may focus on profiling developers in a transferable way instead of building a model for each software project. For example, collecting developer activity from public repositories to create a profile or using cross-project personalized models would be some options. Moreover, our industrial experience provides good insights into the challenges of applying SDP in an industrial context, from data collection to model deployment.
Practitioners should consider using online prediction setups and conducting a domain analysis of the team's practices, prediction success criteria, and project context (i.e., release cycle) before making deployment decisions, in order to obtain good and consistent prediction performance. The interpretability and usability of models play a crucial role in the future of SDP studies. More researchers are becoming interested in such aspects of SDP models, i.e., developer perceptions of SDP tools and the actionability of prediction outputs.
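The online prediction setup described in this abstract, with its train-test (TT) gap and update period (UP), can be sketched roughly as follows. This is one possible reading of the setup, not the thesis code: the `Commit` type, the day-based time unit, and the rule that each round trains on commits at least one TT gap older than the retraining time and tests on commits arriving in the next update period are all assumptions. The helper for recall and false alarm rate uses the standard definitions, TP / (TP + FN) and FP / (FP + TN).

```python
from dataclasses import dataclass
from typing import Iterator, List, Sequence, Tuple


@dataclass(frozen=True)
class Commit:
    day: int      # commit time in days since project start (assumed unit)
    buggy: bool   # label: whether the commit was found to induce a bug


def online_rounds(commits: Sequence[Commit],
                  tt_gap_days: int,
                  update_period_days: int
                  ) -> Iterator[Tuple[List[Commit], List[Commit]]]:
    """Yield (train, test) splits for successive model updates.

    At each retraining time t, the model learns only from commits that are
    at least tt_gap_days old, so their labels have had time to settle, and
    it is tested on the commits of the following update period.
    """
    if tt_gap_days < 0 or update_period_days <= 0:
        raise ValueError("gap must be >= 0 and update period must be > 0")
    if not commits:
        return
    last_day = max(c.day for c in commits)
    t = tt_gap_days
    while t <= last_day:
        train = [c for c in commits if c.day <= t - tt_gap_days]
        test = [c for c in commits if t < c.day <= t + update_period_days]
        if train and test:
            yield train, test
        t += update_period_days


def recall_and_false_alarm(actual: Sequence[bool],
                           predicted: Sequence[bool]) -> Tuple[float, float]:
    """Return (recall, false alarm rate) for binary defect predictions."""
    tp = sum(1 for a, p in zip(actual, predicted) if a and p)
    fn = sum(1 for a, p in zip(actual, predicted) if a and not p)
    fp = sum(1 for a, p in zip(actual, predicted) if not a and p)
    tn = sum(1 for a, p in zip(actual, predicted) if not a and not p)
    recall = tp / (tp + fn) if tp + fn else 0.0
    false_alarm = fp / (fp + tn) if fp + tn else 0.0
    return recall, false_alarm


if __name__ == "__main__":
    # Ten toy commits, one per day (hypothetical data).
    history = [Commit(day=d, buggy=(d % 3 == 0)) for d in range(10)]
    for train, test in online_rounds(history, tt_gap_days=3,
                                     update_period_days=2):
        print([c.day for c in train], [c.day for c in test])
```

A model would be fitted on each `train` list and scored on the matching `test` list with `recall_and_false_alarm`; sweeping `tt_gap_days` and `update_period_days` then mirrors, in spirit, the parameter analysis the abstract reports.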


Browsing LEE - Computer Engineering Graduate Program by Subject "Artificial intelligence"
