Please use this identifier to cite or link to this item: http://hdl.handle.net/11527/416
Title: Kullanıcı Tarayıcı Geçmişine Dayanarak Müşteri Yorumlarının Özetlenmesi
Other Titles: Personalized Feature Based Summarization
Authors: Öğüdücü, Şule Gündüz
Kavasoğlu, F. Zehra
10005736
Bilgisayar Mühendisliği
Computer Engineering
Keywords: sentiment analizi
ürün yorumları analizi
metin madenciliği
sentiment analysis
product review mining
text mining
Issue Date: 19-Jul-2013
Publisher: Fen Bilimleri Enstitüsü
Institute of Science and Technology
Abstract: In recent years, people have grown increasingly accustomed to shopping over the Internet, which has led to the rapid spread of shopping sites. When shopping online, how existing users rate a product and what they say about it play an important role in the product choices of potential customers. For this reason, most shopping sites include customer review sections on their product pages. Because there is no pressure when sharing on the Internet, people feel free and express their opinions about products sincerely. This leads other customers to find those reviews credible and to base their product choices on them. However, this freedom also produces a very large number of reviews per product, and given the reading time required, it is not really feasible for potential customers to read all of them. Instead of reading every review, potential customers tend to decide after reading a few of the reviews near the top. The topmost reviews, however, may not cover the features these customers are interested in, in which case reading them is merely a waste of time. Presenting the reviews that cover the features of interest in summarized form lets customers obtain the information they are looking for quickly. In this context, condensing product reviews into summary information organized by product attributes is useful for many customers. Products may share common features, but they may also have quite different ones, and the features customers pay attention to vary with personal needs and tastes. A customer's typical behavior is to search the shopping site according to the criteria he/she cares about and to click on and examine the products that attract his/her interest.
Customer reviews can be read in the relevant section of each product page the customer clicks. However, the clicked links or the search results may not contain information on exactly the features of interest. Therefore, anticipating the features being sought and retrieving the product reviews of interest will not only save the customer time but also make it easier to choose the most suitable product. In this study, while summarizing product reviews by attribute, we aim to determine the personally preferred features and to rank reviews about the features of interest higher. The words the user enters when searching and the common features of the clicked links are considered important for understanding personally preferred features, and the product reviews are searched according to these words. The reviews returned by the search are shown under the relevant product and feature. The study uses the products and user reviews of the shopping site hepsiburada.com. The developed method is compared, using the ROUGE library, with one of the existing feature based review summarization techniques. The results are promising in two categories: relevance to the searched criteria and running time. In terms of relevance to the personally searched criteria, the developed method shows at least a 3-fold improvement over the existing method. In terms of running time, the developed method is found to be 36-55 times faster than the existing method.
With the growing popularity of the Internet, e-commerce web sites are taking up more and more space in our lives, and a growing number of people shop online. Every e-commerce web site today has a product review feature that allows customers to express their opinions about the products they have purchased and to indicate how satisfied they are with them. These comments are important for potential customers when deciding which product to buy. However, reading the large number of customer reviews available for each product is time consuming. For this reason, customers usually read only a few of the topmost comments and skip the rest. In addition, each customer may comment on different specifications of a product, and the reviews reflect the customers' personal judgements because their requirements and expectations differ in many ways. A review is therefore subjective, and it provides important personal feedback about the product. At this point, a feature based summarization of the product reviews is very helpful for potential customers in selecting the best product option. To build such a summary, the sentiments of the reviews should be analyzed in order to determine the positive and negative sides of the product. Several previous studies on feature based summarization address this issue by summarizing the reviews for a common user profile. Nevertheless, interests and needs differ for each customer, and a potential customer wants to use the reviews that address his/her personal interests and needs when selecting the most suitable product option.
Thus, reviews should be filtered according to the personal preferences of the potential customer, and feature based summarization should be directed by those preferences. In this paper, we propose a novel feature based approach for personalized review summarization that gives weight to the preferences of the individual potential customer. Existing feature based review summarization methods ignore individual preferences; their main goal is to summarize the reviews for an average user with a common profile. Personalization has mostly been studied in general text mining approaches. News filtering according to personal preferences is investigated in some works [1, 2]. Personalization in review mining is also examined in a recent study on personalized recommendation of user comments [3]. However, these general text mining approaches do not meet the requirements of product review mining in the way feature based investigations do. To the best of our knowledge, there are no studies on personalization in feature based investigations. Feature based systems usually take the comments on one specific model and explore their feature-sentiment relations. Yet this approach is not sufficient for a potential customer. A customer who is looking for a phone wants to examine all the available models and compare the reviews based on his/her personal preferences. Reading only the reviews about the features related to his/her personal preferences can help the potential customer make the right decision and save the time needed to find the valuable information in a vast number of reviews. In this work, we propose a novel method for personalized review summarization on an e-commerce web site. Our method can be summarized as follows:
1. Extracting common features of the products from the click-through pages of the current user,
2. Finding the desired products of the current user based on the extracted common product features,
3. Finding the reviews that are more related to the desired products in the search,
4. Identifying product features in the reviews,
5. Identifying positive/negative opinions in the reviews,
6. Generating feature-opinion pairs to understand the sentiment related to a feature,
7. Producing a product model based summary of these feature-opinion pairs.
Our method differs from traditional feature based summarization in a number of ways. First, we use the search log history of users to extract user preferences for personalization. Second, our method has a shorter runtime because the summarization is performed only on the filtered, relevant reviews. Lastly, while identifying product features, we use the product features taken from the web site, which can be regarded as a supervised approach to feature extraction. To the best of our knowledge, there is no study on review summarization for Turkish and no available data set, so we started our work by building the data set. We used the user reviews and product properties sections of the e-commerce web site “hepsiburada.com” and collected 4877 user reviews. Our categories are: SLR cameras (12 models), mobile phones (7 models), books (5 different books), movies (6 different movies), washing machines (3 models), irons (8 models), mice (12 models), sports clothes (4 different items), hair straighteners (8 models) and watches (6 models). We also store 1464 product properties related to the models. We propose a personalized feature based summarization (PFBS) method and implement it in Java. The computer used in this work has a 2.53 GHz Intel Core 2 Duo CPU, 4 GB RAM and a 32-bit Windows 7 operating system. We evaluate our method by comparing it with Hu and Liu’s existing feature based summarization (FBS) method using the statistical comparison tool “ROUGE”.
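The seven steps listed above can be illustrated with a minimal Python sketch. It is not the thesis implementation (which is written in Java and works on Turkish reviews): the toy data, the English opinion lexicon, whitespace tokenization, the `min_share` threshold and the function names `common_features`, `filter_reviews`, `feature_opinion_pairs` and `summarize` are all assumptions made for illustration.

```python
from collections import Counter

# Hypothetical toy data standing in for click-through pages and reviews.
CLICKED_PRODUCT_FEATURES = [
    {"android", "touchscreen", "wifi"},
    {"android", "wifi", "camera"},
    {"android", "bluetooth", "wifi"},
]
REVIEWS = [
    ("phone-a", "The android interface is great"),
    ("phone-a", "The battery is bad"),
    ("phone-b", "The wifi connection is poor"),
    ("iron-x", "The steam boiler is excellent"),
]
# Assumed tiny opinion lexicon; a real system needs a proper sentiment resource.
OPINION_WORDS = {"great": "+", "excellent": "+", "bad": "-", "poor": "-"}


def common_features(clicked, min_share=1.0):
    """Step 1: keep features present in at least min_share of the clicked pages."""
    counts = Counter(f for feats in clicked for f in feats)
    return {f for f, c in counts.items() if c / len(clicked) >= min_share}


def filter_reviews(reviews, preference_terms):
    """Steps 2-3: keep reviews that mention at least one preference term."""
    return [
        (product, text)
        for product, text in reviews
        if set(text.lower().split()) & preference_terms
    ]


def feature_opinion_pairs(reviews, known_features):
    """Steps 4-6: pair every known feature in a review with its opinion words."""
    pairs = []
    for product, text in reviews:
        tokens = text.lower().split()
        features = [t for t in tokens if t in known_features]
        opinions = [OPINION_WORDS[t] for t in tokens if t in OPINION_WORDS]
        pairs.extend((product, f, o) for f in features for o in opinions)
    return pairs


def summarize(pairs):
    """Step 7: count positive and negative opinions per (product, feature)."""
    summary = {}
    for product, feature, polarity in pairs:
        pos, neg = summary.get((product, feature), (0, 0))
        if polarity == "+":
            pos += 1
        else:
            neg += 1
        summary[(product, feature)] = (pos, neg)
    return summary


query_terms = {"android"}  # words the user typed into the search box
preferences = common_features(CLICKED_PRODUCT_FEATURES) | query_terms
relevant = filter_reviews(REVIEWS, preferences)
summary = summarize(feature_opinion_pairs(relevant, preferences))
```

Because the feature-opinion steps only see the filtered reviews, the expensive part of the pipeline runs on a small subset of the data, which is the intuition behind the shorter runtime claimed for the personalized method.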
We calculate the coherence score between the summarized results and the users' search query words to understand how relevant the results returned by each system are. Considering the reading capacity of a potential customer, the first 100 feature-opinion paired sentences from each system are passed to the summarization step. We select the top 5 categories from the category list ordered by feature count: mobile phone, SLR camera, mouse, hair straighteners and wrist watch. We generate 5 sample user scenarios in these categories. We compare the two systems from two perspectives:
1. F-scores for coherence
2. Running times
First, we take only the reviews in the related category. For example, for a mobile phone user’s scenario, only the reviews in the mobile phone category are given to the systems. We want to understand how relevant the returned reviews are even when the related category is given to the systems. The coherence results show that PFBS has considerably higher coherence scores than FBS. Second, we give all the reviews to the systems, ignoring their categories. PFBS produces the same values as in the first experiment, which shows that PFBS is stable and retrieves the related reviews under both conditions. The coherence scores for FBS are less than or equal to its results under the first condition, which indicates that FBS returns the most relevant reviews only when it is given the reviews in the related category. We also compare the two systems by runtime. Our method has shorter runtimes than FBS because it deals only with the related reviews returned by the search process, whereas FBS deals with all of the reviews, which makes it slower than our system. According to the results, our system is approximately 36 times faster than FBS when only the reviews in the related categories are taken, and approximately 55 times faster when all the reviews are taken. This shows that our system is more successful than FBS at handling high-volume data. We observe that when users search with general words such as “sports watch”, “cheap smart phone” or “wireless optic mouse”, the coherence scores are much higher than for feature based searches such as “Android operating system” or “ionic hair straighteners”. This shows that people tend to make general comments and do not explain their satisfaction in detail, so capturing the details becomes much more important at that point. We also observe that if a product has numerous features, the diversity in the reviews increases, which lowers the coherence scores.
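As an illustration of how a coherence F-score between a summary sentence and the query words could be computed, the following Python sketch uses a ROUGE-1-style clipped unigram overlap. The abstract does not state which ROUGE variant or settings were used, so the unigram choice, the whitespace tokenizer and the function name `unigram_f_score` are assumptions.

```python
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a simplifying assumption)."""
    return text.lower().split()


def unigram_f_score(candidate, reference):
    """ROUGE-1-style F-score based on clipped unigram overlap."""
    cand = Counter(tokenize(candidate))
    ref = Counter(tokenize(reference))
    # Counter intersection keeps the minimum count of each shared token.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical summary sentence scored against a user's query words.
score = unigram_f_score("android phone works fast", "android phone")
unrelated = unigram_f_score("steam boiler heats quickly", "android phone")
```

In this example the sentence shares both query words, giving precision 2/4 and recall 2/2, while a sentence about a steam boiler shares none and scores zero.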
When all of the reviews are given to the systems in the sample scenarios, the FBS results include reviews about irrelevant features. For example, sentences about the “steam boiler” feature are listed in the results for a mobile phone customer. When only the reviews in the related category are given to the systems, the results seem relevant at first glance, but in detail they can still be irrelevant for the potential customer. In a mobile phone user’s scenario, when the reviews in the mobile phone category are given to the systems, both systems list only mobile phone reviews. However, if the customer searches for the Android operating system and not for the Bada system, reviews on both operating systems appear in the results, although Android comments tend to rank higher in the PFBS results. Thus, PFBS has higher coherence scores than FBS in all of the situations. In conclusion, this work orders customer reviews for a potential customer according to his/her interests. While he/she surfs the web site, we collect his/her web log data at each page and filter the related customer reviews based on his/her search queries and click-through pages. We display a feature based summarization for each model obtained from the filtered reviews. The results of our work are very promising, and we believe that personalization in feature based review summarization systems will become important in the near future.
Description: Tez (Yüksek Lisans) -- İstanbul Teknik Üniversitesi, Fen Bilimleri Enstitüsü, 2013
Thesis (M.Sc.) -- İstanbul Technical University, Institute of Science and Technology, 2013
URI: http://hdl.handle.net/11527/416
Appears in Collections:Bilgisayar Mühendisliği Lisansüstü Programı - Yüksek Lisans

Files in This Item:
File: 13763.pdf
Size: 1.02 MB
Format: Adobe PDF


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.