Automatic Classification of Help Tickets in User Support Systems
Publisher
Fen Bilimleri Enstitüsü
Institute of Science and Technology
Abstract
Under today's competitive conditions, the service quality provided by user support systems has become an important factor in determining customers' satisfaction with an organization, both before and after the purchase of a product or service. Organizations aim to increase customer satisfaction by evaluating and answering the problems, opinions, and requests of their customers or users as effectively as possible. User support systems record the problems, opinions, and requests received from users as help tickets and assign them to the relevant support staff member or unit to be answered. Some current support systems perform this assignment by having the user select from predefined options, while others rely on an operator or human classifier. Both methods require time-consuming and error-prone human effort. Especially in large organizations, this leads to inefficient use of resources and negatively affects user satisfaction. In this study, a new support system architecture is proposed to improve the quality of user support systems. Using artificial intelligence algorithms and natural language processing methods, help tickets are automatically assigned to the relevant unit, and recurring help tickets are answered automatically by the system. To convert textual data into numerical form, the bag-of-words approach is taken as the starting point, and every term in the dataset is treated as an independent feature. Because of this approach, the feature vector used in text classification becomes very large. Particularly in agglutinative languages such as Turkish, Japanese, and Hungarian, a term can occur in many different forms. This also gives rise to the data sparsity problem.
For this reason, the dataset was subjected to morphological analysis followed by morphological disambiguation. Terms with low idf values, which have no discriminative power for classification, as well as conjunctions and common words used independently of the topic (stop words), were removed from the feature vector. In this way, the feature vector size was reduced as far as possible and the feature vector was cleared of noise. In addition, a number of operations were carried out in the preprocessing stage to get rid of misspellings and numerical expressions in the dataset. With the proposed system, assigning help tickets to the relevant unit is automated, which reduces the load of this task on support staff. The manual assignment process, which requires time-consuming and error-prone human labor, is thereby reduced to a minimum. By automating the assignment, help tickets from users are answered in a shorter time, which increases the service quality provided and hence end-user satisfaction. By minimizing incorrect assignments, valuable support resources can be used more effectively and efficiently. Furthermore, detecting recurring help tickets and answering them automatically reduces the workload of support staff.
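The idf-based pruning described above could be sketched roughly as follows. This is an illustration only, not the thesis implementation: the function names, the toy tickets, and the `min_idf` cutoff are assumptions, and the formula idf = log(N / df) is one common definition that the abstract does not confirm.

```python
import math
from collections import Counter

def idf_scores(docs):
    """Return idf = log(N / df) for every term in a list of tokenized documents."""
    n_docs = len(docs)
    df = Counter()
    for tokens in docs:
        df.update(set(tokens))  # count each term at most once per document
    return {term: math.log(n_docs / count) for term, count in df.items()}

def prune_low_idf(docs, min_idf):
    """Drop terms whose idf is below min_idf, treating them as stop words."""
    idf = idf_scores(docs)
    keep = {term for term, score in idf.items() if score >= min_idf}
    return [[t for t in tokens if t in keep] for tokens in docs]

# Hypothetical, already-stemmed tickets; "ve" ("and") occurs in every one.
tickets = [
    ["ve", "internet", "yavaş"],
    ["ve", "şifre", "sıfırla"],
    ["ve", "internet", "kopuyor"],
]
pruned = prune_low_idf(tickets, min_idf=0.1)  # illustrative cutoff
```

A term that occurs in every ticket gets an idf of zero and is dropped, which shortens the feature vector without removing topic-bearing terms such as "internet".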
In the current competitive environment, the service quality of user support systems, before and after the purchase of products or services, has become an important factor that affects customer satisfaction. A user support system, or Issue Tracking System (ITS), is a type of software that captures and keeps track of customer issues, which may be problems or requests, and assigns the resulting tickets to the relevant support person or unit so that a solution can be produced. In the support process, incoming tickets are analyzed and assessed by the organization's support teams in order to fulfill the ticket request. In a large organization, better allocation and effective use of valuable support resources results directly in substantial cost cuts. Addressing issue tickets to the appropriate person or unit in the support team is crucial for better allocation of support resources and improved end-user satisfaction. ITSs in use today route a help ticket to the appropriate unit or person in one of two ways. Some ITSs rely on end users to choose the right problem category or related unit from predefined categories. The main drawback of this type of ITS is that a ticket may be misaddressed to unrelated staff and then need an extra redirection step to reach the right staff. Other ITSs rely on support operators to assign each incoming issue to the right staff. Especially in large organizations, manual assignment is not sufficiently practical. This error-prone process requires costly human effort that is time-consuming and tedious. Human involvement may lead to improperly assigned tickets. In addition, the manual assignment step increases the average response time of tickets, which reduces end-user satisfaction.
In this study, an extension to an ITS is proposed that automatically assigns issue tickets to the relevant person or unit in the support team. Using machine learning techniques, the proposed extension, which is capable of meeting the needs of large organizations, reduces manual effort and human error while ensuring high-quality service levels and improved end-user satisfaction. Additionally, this study proposes automatic responses to repetitive tickets. In most cases, users do not check previously asked questions and ask similar questions again. The proposed system aims to use resources in the most efficient manner, so it first tries to answer help tickets without any assistance from a technical expert. The similarity between the incoming help ticket and each FAQ is calculated. The answer of the most similar FAQ is suggested to the user, provided its similarity exceeds a predefined threshold. If the user is not satisfied with the suggestion, the ticket is escalated to the next level to be answered by a technical person. The proposed system contains a two-phase classification process for assigning an issue ticket to the related support unit. The first classification detects the category of the ticket, which corresponds directly to the department responsible for the issue, while the second classification determines the subcategory or unit under that category, which describes the type of problem within the determined department. For example, an issue ticket describing a network connection problem must be directed to the network problem category (or network department) and classified under the low-speed problem subcategory defined under the network department. Since the proposed system is semi-automatic, the issue ticket is assigned to the relevant category or subcategory if the prediction confidence of each classification is greater than a predetermined threshold value.
Otherwise, an operator manually classifies the issue ticket into the related category and/or subcategory. According to the classification results, issue tickets are assigned to the support staff with the right expertise for the issue described in the ticket, so that a response can be returned to the end user. The assignment of tickets to category and subcategory is essentially a single-label, multi-class text classification problem. This is a widely studied problem for which various algorithms and feature extraction techniques can be used. Although the proposed system is language independent, its implementation may require additional language-specific preprocessing steps, because the problem definition is expressed in a specific natural language such as Turkish or English. To conduct our experiments, a dataset of approximately ten thousand issue tickets in Turkish was collected from the ITU Issue Tracking System, a web application through which users can submit requests on various issues to different departments within the university. Each issue ticket contains date, user, category (related department), subcategory (the problem type under the related department), ticket subject, and ticket body attributes. The problem definition of each ticket is given as unstructured natural language text. In this study, the category, subcategory, free-form ticket content, and ticket subject are used to categorize help tickets. The remaining attributes, such as the sending date and user information, are ignored. Terms that are not distinctive for classification, namely those whose idf over all documents is below a certain threshold, are treated as stop words. Many of these terms are conjunctions used independently of the topic and misspelled words. The stop words are removed from the feature vectors. In this way, the feature vector size is reduced as much as possible and noise in the feature vector is eliminated.
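The semi-automatic, two-phase assignment described above might look roughly like the sketch below. It is not the thesis implementation: `route_ticket`, the toy classifiers, the confidence values, and the 0.8 threshold are all hypothetical, and each classifier is simply assumed to return a label together with a confidence in [0, 1].

```python
from typing import Callable, Dict, Optional, Tuple

# Assumed interface: ticket text -> (predicted label, confidence in [0, 1]).
Classifier = Callable[[str], Tuple[str, float]]

def route_ticket(
    text: str,
    category_clf: Classifier,
    subcategory_clfs: Dict[str, Classifier],
    threshold: float = 0.8,  # hypothetical value
) -> Tuple[Optional[str], Optional[str]]:
    """Return (category, subcategory); None means that level goes to an operator."""
    category, conf = category_clf(text)
    if conf <= threshold:
        return None, None  # phase 1 not confident: fully manual classification
    sub_clf = subcategory_clfs.get(category)
    if sub_clf is None:
        return category, None  # no subcategory model for this department
    subcategory, sub_conf = sub_clf(text)
    if sub_conf <= threshold:
        return category, None  # phase 2 not confident: operator picks subcategory
    return category, subcategory

# Toy stand-ins for trained models (hypothetical rules and confidences).
def toy_category_clf(text: str) -> Tuple[str, float]:
    return ("network", 0.93) if "internet" in text else ("other", 0.40)

def toy_network_clf(text: str) -> Tuple[str, float]:
    return ("low speed", 0.85) if "slow" in text else ("connection loss", 0.55)

subs = {"network": toy_network_clf}
result = route_ticket("internet is slow", toy_category_clf, subs)
```

Returning `None` for a level keeps the operator in the loop only where the model is unsure, which matches the "category and/or subcategory" wording for the manual fallback.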
In order to classify tickets with the most suitable learning technique, four different supervised machine learning techniques are applied using the WEKA tool: decision tree, SVM with a polynomial kernel, which allows non-linear models, naïve Bayes, and k-nearest neighbors. Ten-fold cross-validation is used to measure the average performance of the machine learning algorithms. The SVM classifier is a discriminant-based algorithm. It considers only the examples close to the discriminant, or boundary, and ignores the other instances. Thus, the complexity of the classifier depends only on the number of support vectors, not on the dataset size. This makes it well suited to problems that involve large amounts of data. Kernel-based algorithms are formulated as a convex optimization problem and find a single best solution. The decision tree is a simple and widely used classification technique. The classifier consists of a series of test questions and conditions arranged in a tree structure. A greedy algorithm builds the tree from the top down. At each node, the best split of the remaining data is sought. To reduce the size of the decision tree and increase accuracy, pruning is performed by removing sections of the tree that provide little information gain for classifying instances. The nearest neighbor algorithm predicts the label of a new instance by measuring its distance to a predefined number of the closest training samples. This algorithm is commonly preferred when there is little or no prior knowledge about the distribution of the data. It is an instance-based classification algorithm. Naïve Bayes is a statistical classification algorithm based on Bayes' theorem. It performs quite well when the training data is small and does not cover all possibilities. Also, the classifier relates to features rather than instances. In brief, manually assigning issue tickets to the appropriate unit or person in the support team is not sufficiently feasible for large organizations.
It is time-consuming, and mistakes may occur due to human error. In this study, a model based on supervised machine learning algorithms is proposed to assign tickets automatically. A dataset of previously categorized tickets is used to train the classification algorithms. The bag-of-words approach is used to extract feature vectors. Morphological analysis of terms is performed to avoid the data sparseness problem and to decrease the vector size. Four different supervised classification algorithms are implemented so that their performance can be compared. Commonly used term weighting methods are used to convert text into numerical form. The classification performance depends directly on the machine learning algorithm, the weighting method, and the dataset. Consequently, the proposed approach reduces manual effort and human error while ensuring high service levels and improved end-user satisfaction. The proposed system also gives large organizations better allocation and more effective use of valuable support resources.
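The FAQ matching step mentioned in the abstract (suggest the answer of the most similar FAQ when its similarity exceeds a threshold) can be illustrated with a small sketch. The abstract does not name the similarity measure, so cosine similarity over raw term counts is an assumption here, as are the function names, the sample FAQs, and the 0.5 threshold.

```python
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two term-count vectors (Counter objects)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0

def suggest_answer(ticket, faqs, threshold=0.5):
    """Return the answer of the most similar FAQ, or None to escalate the ticket.

    faqs is a list of (question, answer) pairs; threshold is a hypothetical value.
    """
    ticket_vec = Counter(ticket.lower().split())
    best_score, best_answer = 0.0, None
    for question, answer in faqs:
        score = cosine(ticket_vec, Counter(question.lower().split()))
        if score > best_score:
            best_score, best_answer = score, answer
    return best_answer if best_score > threshold else None

# Hypothetical FAQ entries for illustration.
faqs = [
    ("how do i reset my password", "Use the self-service portal."),
    ("wifi is slow in the library", "Report the access point number."),
]
suggestion = suggest_answer("how can i reset my password", faqs)
no_match = suggest_answer("printer is out of toner", faqs)
```

A `None` result corresponds to escalating the ticket to a technical person. For Turkish text, the tickets and FAQs would presumably go through the same morphological preprocessing before being compared.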
Description
Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 2014
Subject
User Support Systems, Automatic Assignment, Issue Ticket
