
Analyzing Customer Reviews on Social Media via Applying Association Rule

2021-12-11 13:29:10 Nancy Awadallah Awad and Amena Mahmoud
Computers, Materials & Continua, 2021, Issue 8

Nancy Awadallah Awad and Amena Mahmoud

1 Department of Computer and Information Systems, Sadat Academy for Management Sciences, Cairo, 11742, Egypt

2 Faculty of Computers and Information, Department of Computer Science, Kafr El Sheikh University, Kafr El Sheikh, 33511, Egypt

Abstract: The rapid growth in the use of social media opens up new challenges and opportunities for analyzing various aspects and patterns of communication. In text mining, several techniques are available, such as information clustering, extraction, summarization, and classification. In this study, a text mining framework is presented that consists of four phases: retrieving, processing, indexing, and association rule mining. It is applied using the association rule mining technique to find the terms associated with the Huawei P30 Pro phone. Customer reviews were extracted from several websites and Facebook groups, such as review.cnet.com, CNET.Technology on Facebook, and amazon.com, where customers from all over the world post their notes on cell phones. In this analysis, a total of 192 reviews of the Huawei P30 Pro were collected and evaluated with text mining techniques. The findings show that, according to reviewers, the Huawei P30 Pro has strong points such as the best safety, a high-quality camera, a battery that lasts more than 24 hours, and a very fast processor. This paper aims to show that text mining reduces human effort by identifying significant documents. This helps customers make better-informed product choices and, at the same time, lets sales managers learn how their products were received by customers.

Keywords: Machine learning; text mining; social media; big data; association rule; document clustering

1 Introduction

Since the rise in social media usage over the last decade, individuals have been looking to the crowd for information, using social media as an additional source alongside traditional media. Social media data can be analyzed to gain insights into issues, trends, influential actors, and other kinds of information [1]. Through social network analysis, influencers or opinion leaders can be identified, and the reach of such a person can be revealed by analyzing their follower network.

Text Mining (TM) is defined as a process for extracting meaningful information from collected text data. Before any data mining technique is applied, the preprocessing operations, which are an important part of TM, must be taken into consideration [2].

In the field of text data analysis, several applications are used, such as information extraction, summarization, document classification, and clustering. With the rise of web technology, the vast amount of textual documentation has become a subject of more intensive study, and it needs to be processed properly so that data mining techniques can help researchers obtain meaningful information [3].

The objective of this study is to discuss how applying association rules with text mining helps researchers learn the importance of an item (product) on social media without having to review a huge amount of customer data. In this study, the researchers analyzed consumer feedback on the Huawei P30 Pro phone, and text mining techniques were used to evaluate how various words are used together with other words and what customer responses center on this phone.

The rest of this paper is organized as follows. Section 2 discusses the background of the text mining process. Section 3 presents previous studies that focused on text mining techniques. Section 4 presents the proposed framework for text mining. Section 5 discusses the methodology of this study, and Section 6 discusses the analysis and results of applying the proposed framework to text mining of customers' reviews on social media.

2 Background

2.1 Text Mining Process

TM is the extraction of meaningful information from text: unstructured data or natural language text is taken as the input, patterns are extracted from it, and it is then processed to obtain structured text [2]. TM includes five key steps for processing the text:

—Document Gathering: In the first step, the text documents are collected in different formats, which may be HTML, PDF, or Word documents [4].

—Document Pre-Processing:

In the second step, redundancies and inconsistencies are removed, words are separated, and stemming is applied, so that the documents are prepared for the next stages. This is done through the following three sub-steps [5,6].

(a) Tokenization:

The given document string is split into single units, or tokens [5].

(b) Stop-word removal:

Common words such as a, an, but, and, of, and the are removed in this step.

(c) Stemming:

Stemming reduces a group of closely related words with the same meaning to a common stem. This step identifies the base form of a given word [4]. Porter’s algorithm is one of the most widely used stemming algorithms.

—Text Transformation: Since a text document is a collection of words and their occurrences [7], the Vector Space Model and Bag of Words are the two main ways in which such documents are represented.

—Feature Selection: This step removes irrelevant features from the input [7,8]. The two techniques for feature selection are filtering and wrapping.

—Pattern Selection: In this stage, the conventional data mining process is combined with the text mining process [8,9]. Tab. 1 illustrates text mining techniques.

Table 1: Text mining techniques
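The pre-processing sub-steps (a) to (c) and the Bag of Words representation described in this section can be sketched as follows. This is only an illustrative sketch, not the pipeline used in the study: the stop-word list is a tiny assumed sample, and `naive_stem` is a hypothetical suffix stripper that stands in for a real stemmer such as Porter’s algorithm.

```python
import re
from collections import Counter

# Assumed, deliberately tiny stop-word list; real lists are much longer.
STOP_WORDS = {"a", "an", "but", "and", "of", "the", "is", "are"}

def tokenize(text):
    """(a) Tokenization: split a document string into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def remove_stop_words(tokens):
    """(b) Stop-word removal: drop very common, uninformative words."""
    return [t for t in tokens if t not in STOP_WORDS]

def naive_stem(token):
    """(c) Stemming: strip a few common suffixes (NOT Porter's algorithm)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def bag_of_words(text):
    """Text transformation: map a document to stem -> occurrence count."""
    tokens = remove_stop_words(tokenize(text))
    return Counter(naive_stem(t) for t in tokens)

# Made-up review sentence, used only to exercise the functions.
bow = bag_of_words("The camera is great and the cameras of the phone lasted")
print(bow)
```

In this toy sentence, "camera" and "cameras" collapse to the same stem, which is the effect that stemming is meant to have before word counts are taken.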

3 Literature Review

The Internet is an environment for collecting a huge and expanding amount of data. These data can be broadly divided into two types, qualitative and quantitative [13,14].

Social network investment is a form of consumption, and the various types of returns on social capital, such as economic returns [15] and political, social, and cultural gains, are the drivers of utility consumption [16]. However, not all social media have equal political significance in generating social capital, as this depends on the types of platforms and the types of activities [17,18].

In this section, previous studies of text mining on social media are presented, such as the effects of social media on customers’ purchasing via the internet. Text analysis through machine learning is also introduced.

3.1 Social Media Analysis

Authors in [19] identified the volume of data as the challenge most often cited by social media analysis researchers. They presented a summary of the main challenges facing researchers in the discovery, collection, and preparation steps of the social media analytics research process, which come before the data is analyzed.

Authors in [20,21] indicated that the use of social media exposes individuals to information on political issues or current events, raising their awareness and understanding of these issues and increasing their likelihood of engaging in civic and political life.

Authors in [22,23] described how text mining tools help analyze posts and determine what they refer to or mean, as well as the behavior and reactions of individuals on the social media network.

3.2 Text Analysis

Authors in [2] integrated one of the association rule mining algorithms, namely Apriori, into text mining to find interesting patterns that can be easily understood through visualization techniques.

They indicated that the analysis of sentiment is a specific form of text analysis for the identification of valence and the analysis of subjectivity of user-generated content (UGC).

Authors in [24] applied sentiment analysis methods, by which the overall sentiment of Web-based textual data and various forms of UGC, whether positive, neutral, or negative, can be measured. Authors in [25] said that obtaining a structured form of text is the most important task in text mining, while authors in [26] noted that dealing with structured data using mining tools is easy compared to unstructured data. They also presented text mining applications in national security systems, bioinformatics, and business intelligence.

Authors in [27] described the steps of the text mining process and presented a survey of text mining techniques used in research fields, such as decision trees, clustering, and categorization.

Authors in [28] applied the k-means clustering technique to group similar text documents through a web-based text mining process, and they used the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm to identify information occurrence in documents.

3.3 Supervised and Unsupervised Machine Learning

Supervised learning refers to a classification technique for machine learning that uses a set of labeled training data to determine class labels for unseen instances. Common classification algorithms include K-Nearest Neighbors, Support Vector Machine (SVM), Logistic Regression, and Naive Bayes (NB) [29,30]. Authors in [29] illustrated that, for sentiment analysis, both supervised methods and the classic lexicon-based approach (an unsupervised technique) have been used.

The lexicon-based approach compares the characteristics of the text with pre-defined positive and negative sentiment lexicons and determines whether the document has a more positive or negative tone. For UGC valence classification, the supervised classification method also exists. However, a restriction of the lexicon-based approach to online review sentiment detection is that this method is highly domain-dependent.
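As a minimal sketch of the lexicon-based idea, the function below counts matches against two small word lists and reports the dominant tone. The lexicons `POSITIVE` and `NEGATIVE` and the function name `lexicon_polarity` are assumptions made for illustration; they are not taken from the cited studies, and a practical lexicon would be far larger and tuned to the domain.

```python
import re

# Hypothetical, hand-picked lexicons for illustration only.
POSITIVE = {"good", "great", "fast", "best", "nice"}
NEGATIVE = {"poor", "slow", "bad", "worst"}

def lexicon_polarity(text):
    """Return 'positive', 'negative', or 'neutral' from lexicon match counts."""
    tokens = re.findall(r"[a-z]+", text.lower())
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(lexicon_polarity("Great camera and fast processor"))
print(lexicon_polarity("The charger is poor"))
```

Because the result depends entirely on which words are in the lists, the same sketch also makes the domain dependence mentioned above visible: a word that is positive for phones may be neutral or negative elsewhere.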

Authors in [31] noted that supervised machine learning techniques have shown relatively better performance than unsupervised methods, but the need for large expert annotated training data to be generated from scratch could be one limitation, as the technique may fail when there is insufficient training data.

By comparing the performance of various classification techniques (‘helpfulness analysis’), authors in [32] tried to identify the most helpful TripAdvisor hotel reviews. Authors in [33] discussed that the lexicon-based approach is a common unsupervised method for determining the polarity and semantic orientation of SM statements, which involves predefining lexicons of positive and negative words and phrases.

The process of Social Media Analytics (SMA) proposed by [34] was used to inform the framework for classification. There were three steps involved in the process of analyzing SM content: tracking, preparation, and analysis.

After a thorough review of the SMA methods used to accomplish these processes, the following SM analytical methods are applied: text analysis; sentiment analysis; content analysis; trend analysis; predictive analytics; social network analysis; spatial analysis; and comparative analysis.

4 Proposed Framework for Text Mining System

In this section, the proposed framework for text mining is presented. It has four main phases (retrieving data, processing, indexing, and association rule mining), as illustrated in Fig. 1.

4.1 Retrieving Data

In this study, the researchers gathered consumer feedback from review.cnet.com, CNET.Technology on Facebook, and amazon.com. Several file formats (RTF, TXT, DOC, etc.) are accepted at this stage and are converted into XML format at the processing stage.

4.2 Processing Phase

The processing phase has several sub-steps (transformation, filtration, and stemming of the documents). In this phase, text is first gathered from the different sources for transformation. After that, the filtering process removes unimportant words, such as grammatical words (common adverbs, articles, determiners, pronouns, prepositions, and non-informative verbs such as be), from the document content. The content of the documents is checked and all unimportant words listed in the stop-word list are eliminated. After that, special characters, parentheses, and commas are replaced with spaces between words in the converted document. After this process is complete, word stemming begins, which removes each word’s prefixes and suffixes. A stemming dictionary (lexicon) is used as the stemming algorithm.

4.3 Indexing Phase

The techniques for automated production of indexes associated with documents usually rely on frequency-based weighting schemes. The TF-IDF (Term Frequency-Inverse Document Frequency) weighting scheme assigns higher weights to distinguishing terms in a document and is the most widely used weighting scheme [35]. It is defined in Eq. (1), where $w(i,j) \geq 0$:

$$w(i,j) = \mathrm{tfidf}(d_i, t_j) = \begin{cases} N_{d_i,t_j} \cdot \log \dfrac{|C|}{N_{t_j}} & \text{if } N_{d_i,t_j} \geq 1 \\ 0 & \text{if } N_{d_i,t_j} = 0 \end{cases} \qquad (1)$$

Here $N_{d_i,t_j}$ is the number of times term $t_j$ occurs in document $d_i$, $N_{t_j}$ is the number of documents in collection $C$ in which $t_j$ occurs, and $|C|$ is the number of documents in $C$. The second clause sets $w(i,j) = 0$ for words that do not occur in the document ($N_{d_i,t_j} = 0$).

The document frequency formula is given in Eq. (2):

$$\log \frac{|C|}{N_{t_j}} = \log |C| - \log N_{t_j} \qquad (2)$$

For a word that occurs in only one document, $\log |C| - \log N_{t_j} = \log |C| - \log 1 = \log |C|$, which gives full weight to such words [36]. Once a weighting scheme has been selected, indexing is performed by selecting, for each document, the keywords that satisfy the given weight constraints. After this step, X-means clustering of the documents was performed.
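The TF-IDF weighting can be sketched in a few lines. The sketch assumes the weight of a term in a document is its count in that document multiplied by log(|C| / N_tj), where |C| is the number of documents and N_tj is the number of documents containing the term; terms absent from a document simply get no entry (weight 0). The function name `tf_idf` and the toy documents are illustrative assumptions, not data from the study.

```python
import math
from collections import Counter

def tf_idf(docs):
    """docs: list of token lists. Returns one {term: weight} dict per document."""
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter()
    for tokens in docs:
        doc_freq.update(set(tokens))
    weights = []
    for tokens in docs:
        counts = Counter(tokens)  # term frequency within this document
        weights.append(
            {t: c * math.log(n_docs / doc_freq[t]) for t, c in counts.items()}
        )
    return weights

# Made-up, already pre-processed reviews.
docs = [
    ["camera", "great", "camera"],
    ["battery", "great"],
    ["processor", "fast", "great"],
]
w = tf_idf(docs)
print(w[0])
```

In this toy collection, "great" appears in every document, so its weight is log(3/3) = 0, while "camera" appears in one document only and receives the full log(3) factor per occurrence.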

4.4 Association Rule Mining

In this phase, an algorithm is used to find related words that are frequently used together and to compute the confidence and lift of these words, which helps in forming association rules. For text mining using association rules, the Frequent Pattern Growth (FP-Growth) algorithm is used.

The FP-Growth algorithm is more applicable than the Apriori algorithm. It represents the database in the form of a frequent pattern tree, or FP-tree, whose purpose is to mine the most frequent patterns [37]. The database is split into pattern fragments, and each item of these fragments is analyzed. The lower nodes of the FP-tree represent the itemsets, while the root node represents null [38].

Fig. 2 illustrates pseudocode for mining frequent patterns using the FP-Growth algorithm.

Figure 2: Pseudocode for FP-Growth algorithm [39]
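For readers without access to the figure, a compact FP-Growth sketch is given below. It is not a transcription of the pseudocode in [39] and not the RapidMiner implementation; the names `Node`, `build_tree`, `fp_growth`, and `mine`, as well as the toy transactions, are assumptions made for illustration. It builds an FP-tree whose root holds no item, then recursively mines the conditional pattern base of each frequent item.

```python
from collections import defaultdict

class Node:
    def __init__(self, item, parent):
        self.item, self.parent = item, parent
        self.count = 0
        self.children = {}

def build_tree(transactions, min_count):
    """transactions: list of (items, count). Returns (header table, item counts)."""
    freq = defaultdict(int)
    for items, count in transactions:
        for item in set(items):
            freq[item] += count
    freq = {i: c for i, c in freq.items() if c >= min_count}
    root = Node(None, None)          # the root node represents null
    header = defaultdict(list)       # item -> all tree nodes holding that item
    for items, count in transactions:
        # Keep frequent items only, most frequent first (ties broken by name).
        ordered = sorted((i for i in set(items) if i in freq),
                         key=lambda i: (-freq[i], i))
        node = root
        for item in ordered:
            child = node.children.get(item)
            if child is None:
                child = Node(item, node)
                node.children[item] = child
                header[item].append(child)
            child.count += count
            node = child
    return header, freq

def fp_growth(transactions, min_count, suffix=frozenset()):
    """Yield (itemset, support count) for every frequent itemset."""
    header, freq = build_tree(transactions, min_count)
    for item, nodes in header.items():
        itemset = suffix | {item}
        yield itemset, freq[item]
        # Conditional pattern base: the prefix path above each node of `item`.
        conditional = []
        for node in nodes:
            path, parent = [], node.parent
            while parent is not None and parent.item is not None:
                path.append(parent.item)
                parent = parent.parent
            if path:
                conditional.append((path, node.count))
        yield from fp_growth(conditional, min_count, itemset)

def mine(transactions, min_count):
    return dict(fp_growth([(t, 1) for t in transactions], min_count))

# Made-up word sets, one per review.
reviews = [
    ["huawei", "camera", "good"],
    ["huawei", "camera"],
    ["samsung", "note"],
    ["samsung", "note", "galaxy"],
    ["huawei", "good"],
]
frequent = mine(reviews, min_count=2)
for itemset, count in sorted(frequent.items(), key=lambda kv: (-kv[1], sorted(kv[0]))):
    print(sorted(itemset), count)
```

Unlike Apriori, this approach does not generate and test candidate itemsets level by level; it scans the data to build the tree and then works on the tree alone.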

5 Methodology

Customer reviews were collected from several sites and Facebook groups, such as review.cnet.com, CNET.Technology on Facebook, and amazon.com, where customers from everywhere post their notes about mobile phones. In this study, a total of 192 reviews of the Huawei P30 Pro were collected from the previously mentioned Facebook groups and Amazon sites and analyzed through text mining techniques.

Next, stop words, which carry no significant information and occur very frequently, such as ‘a’, ‘an’, ‘is’, and ‘are’, were removed through the stop-words process. After that, unimportant words such as grammatical words (common adverbs, articles, determiners, pronouns, prepositions, and non-informative verbs such as be) were removed from the document content by the filtering process. Next, a stemming dictionary (lexicon) was used as the stemming algorithm.

Afterward, the indexing phase began, in which the TF-IDF value of each word in each document was computed. A matrix was created containing the TF-IDF score of each word.

In the next step, the X-means [40] clustering algorithm was applied after the document-term matrix was formed. X-means clustering produced three clusters. After that, association rule mining was applied. The RapidMiner tool was used to process the collected data.

6 Analysis Results

6.1 Analysis of Clusters

X-means clustering was applied to the collected data and produced three clusters, which were identified as technical feedback, emotional feedback, and smartphone brand feedback. Some of the words are given in Tab. 2.

Table 2: Important words in three clusters

Cluster_0 represents customers’ remarks that focused on the technical aspects of the Huawei P30 Pro, Cluster_1 represents customers’ remarks that focused on emotional feedback, and Cluster_2 represents customers’ comparisons between several products and brands (Huawei, iPhone, Samsung) concerning the features of the Huawei P30 Pro, iPhone 11, and Samsung Galaxy note 10 plus.

6.2 Analysis of Word Counts

Word counts can be used to determine which words are most frequent and therefore likely to be meaningful in the output. Across all reviews, the words Huawei pro, Huawei, and mate occurred very frequently. Tab. 3 presents the counts of the most repeated words in customers’ reviews.

Table 3: Word counts

6.3 Word Association Analysis

Association rule mining reveals the relations between words and their co-occurrences in the documents. In this phase, the FP-Growth algorithm is used to extract related words that are repeatedly used together and to compute the confidence and lift of these words, which helps in forming association rules.
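Confidence and lift can be computed directly from co-occurrence counts. The sketch below uses the standard definitions: support(X) is the fraction of reviews containing every word in X, confidence(A → B) = support(A ∪ B) / support(A), and lift(A → B) = confidence(A → B) / support(B). The function name `rule_metrics` and the four toy reviews are assumptions for illustration and are not the study's data.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Support, confidence, and lift of the rule antecedent -> consequent."""
    a, c = set(antecedent), set(consequent)
    sets = [set(t) for t in transactions]
    n = len(sets)
    supp_a = sum(a <= t for t in sets) / n         # reviews containing A
    supp_c = sum(c <= t for t in sets) / n         # reviews containing B
    supp_ac = sum((a | c) <= t for t in sets) / n  # reviews containing both
    if supp_a == 0 or supp_c == 0:
        raise ValueError("antecedent and consequent must each occur at least once")
    confidence = supp_ac / supp_a
    return {"support": supp_ac, "confidence": confidence, "lift": confidence / supp_c}

# Made-up word sets, one per review.
reviews = [
    ["samsung", "note"],
    ["samsung", "galaxy", "note"],
    ["huawei", "camera"],
    ["huawei", "good"],
]
m = rule_metrics(reviews, ["note"], ["samsung"])
print(m)
```

A lift above 1 means the two sides of the rule co-occur more often than they would if they were independent, which is why rules are usually ranked by lift.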

In this study, Tab. 4 shows the important rules extracted by applying association rule mining. A higher lift value indicates a stronger rule. The rules capture the associations between smartphone brands (Apple, Huawei, Samsung) and products (iPhone 11, Pro, mate, Galaxy S, note 10 plus). It can be said that the word note is associated with Samsung, as is the word galaxy. It can also be said that at least some people associated the words good, camera, and great with either Huawei (P20 Pro, P30 Pro, mate 20) or the product itself, especially the Huawei P30 Pro.

Table 4: Text mining results via applying association rules

7 Conclusion

Text mining reduces human effort by identifying significant documents. Thus, it was not necessary to read all 192 customer reviews to understand customers’ opinions about the Huawei P30 Pro, which accounted for a large portion of most reviewers’ comments. Some user feedback expressed loyalty to the iPhone, while other feedback compared it with the Samsung Galaxy note 10 plus.

Customers who love Samsung claim that it is easy to use and well priced, but others consider the Samsung charger to be poor. In particular, Huawei lovers (Huawei P30 Pro) say that it has strong points such as the best safety, high camera quality, a battery that lasts more than 24 h, and a very good processor.

Funding Statement: The authors received no specific funding for this study.

Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
