
A multi-target stance detection based on Bi-LSTM network with position-weight

High Technology Letters, 2020, No. 4

Xu Yilong (徐翼龍), Li Wenfa, Wang Gongming, Huang Lingyun

(*Smart City College, Beijing Union University, Beijing 100101, P.R.China)
(**College of Robotics, Beijing Union University, Beijing 100101, P.R.China)
(***Tianyuan Network Co., Ltd., Beijing 100193, P.R.China)
(****Beijing Tianyuan Network Co., Ltd., Beijing 100193, P.R.China)
(*****Chinatelecom Information Development Co., Ltd., Beijing 100093, P.R.China)

Abstract

Key words: long short-term memory (LSTM), multi-target, natural language processing, stance detection

0 Introduction

In recent years, the continuous improvement and development of social media has led an increasing number of people to use social media to share and discuss their attitudes toward different people, events, and objects. Analysis of such stance-bearing texts on social media can help us understand users' preferences and opinions. Such information plays an important role in public opinion analysis, and a large number of researchers, such as Wang et al.[1] and Li et al.[2], have used stance detection technology to find it.

Multi-target stance detection[3] is a sub-task of stance detection. It is used to mine the stance classifications of different targets in one text. Typical examples include mining the opinions of different politicians in elections and analyzing user recognition of different brands in similar products. In addition to determining the stance toward a target, multi-target stance detection identifies the corresponding positions of the different targets in the same text.

In the field of multi-target stance detection, most methods combine the 2 tasks of target location (determining the context that describes each target) and stance detection (determining the stance label for a given target) into one task during execution. Accordingly, these methods tend to enlarge the structure of the model to enhance its ability to mine features. The result, however, is the comprehensive optimal solution of the 2 tasks rather than the optimal solution of stance detection. Thus, the stance detection of a certain target is easily affected by the descriptions of the other targets, which may reduce the accuracy of the result. Therefore, target location and stance detection should be executed successively and independently.

To enable such execution, the proposal is as follows. First, the context range in the text corresponding to each target is located; for a text claiming, for example, that 'Hillary Clinton is a liar,' the context range for the target Hillary Clinton covers that claim. Then, the stance is determined by analyzing the target text within the context range. Based on this idea, a bidirectional long short-term memory (Bi-LSTM) network with position-weight is proposed to carry out multi-target stance detection. Bi-LSTM can describe the dependency relationship between words from front to back and from back to front, and the position-weight vector can describe the impact of words on the different targets of stance detection. The multi-target stance detection dataset of the 2016 American election is used to validate the proposed method.

1 Related work

In recent years, given the rapid growth in the number of social media users, researchers have begun to focus on stance detection in social media texts. In 2016, the international conferences SemEval[4] and the 5th CCF International Conference on Natural Language Processing and Chinese Computing (NLPCC)[5] provided annotated data on stance detection in social media. Thereafter, some researchers began studying different types of data for stance detection[6,7]. Given the similarity between stance detection and sentiment analysis, researchers made efforts to distinguish between them[8] and attempted to use sentiment analysis to obtain improved stance detection results[9].

For social media text, deep learning methods, including the traditional convolutional neural network (CNN) and recurrent neural network (RNN)[10-13] as well as fusion models[14-17], are typically used to detect stance.

Subsequently, researchers began applying stance detection to texts containing different targets in the same category. This is called multi-target stance detection and is regarded as a sub-task of stance detection[18-21]. For example, Liu et al.[19] proposed an approach to automatically obtain the zones of each target and combined it with the LSTM method to obtain the corresponding stance. The results demonstrated the effectiveness of two-target stance detection. Lin et al.[21] designed a topic-based approach that detects multiple standpoints in Web text by generating a stance classifier according to the distribution of standpoint-related topic terms. They produced the parameter values of the classifier with this adaptive method and proved its effectiveness through experiments.

The above methods do not use the positional relationship between the words in the text and the targets to help the algorithm distinguish between the content describing the different targets. In order to obtain the best stance detection effect, it is necessary to extract the appropriate clause as the input text according to the context range corresponding to the target, so as to avoid the influence of unrelated text. Therefore, in this paper, an unsupervised method to extract the context ranges of different targets in the text is proposed. Then, the Bi-LSTM network with position-weight is generated by combining position-weight with the Bi-LSTM approach. Finally, the stance labels of the different targets are predicted with LSTM and Softmax classification. The details of the approach are explained in the next section.

2 Proposed method

The architecture of our model is shown in Fig.1. It consists of the following 5 modules: embedding layer, Bi-LSTM layer, position-weight fusion layer, LSTM layer, and Softmax classifier. The result of combining the word embeddings of all the target topics and the input text serves as the input of the model, and the output consists of the author's stance labels for all the possible target topics.

Fig.1 Model architecture

2.1 Embedding layer

The embedding layer transforms every word in the input text into one vector, each of which expresses the relationship between the words applicable to the context. By representing each word in a text as a 1×n vector, the text can be represented as an l×n matrix (l is the number of words in the text, and n is the dimension of each lexical vector). Accordingly, the input text can be converted into a numerical matrix, which facilitates feature extraction by the algorithm.
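As a minimal illustration, the following Python sketch maps a tokenized text to an l×n matrix; the lookup table word_vectors is a hypothetical dictionary of pre-trained vectors, not any specific embedding used in the paper.

```python
import numpy as np

def embed_text(tokens, word_vectors, dim):
    """Map a tokenized text of l words to an l x n embedding matrix.

    tokens       : list of l word strings
    word_vectors : dict mapping a word to an n-dim numpy vector (hypothetical lookup)
    dim          : n, the dimension of each word vector
    """
    rows = []
    for w in tokens:
        # out-of-vocabulary words fall back to a zero vector
        rows.append(word_vectors.get(w, np.zeros(dim)))
    return np.stack(rows)          # shape: (l, n)
```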

2.2 Bi-LSTM layer

To extract valid features from unstructured text, LSTM is used to encode the text. The input of the LSTM is a tensor formed by arranging the embedded vectors of the words to be processed in order from front to back. The corresponding output is a tensor composed of implicit states of the LSTM units in order from front to back. LSTM can describe long-distance lexical dependency in the text and is suitable for text data modeling[22].

The Bi-LSTM network consists of a forward LSTM and a backward LSTM. The input of the forward LSTM network is composed of the embedded vectors of the words arranged in order from front to back, and the input of the backward LSTM network is the same series of embedded vectors arranged in the opposite order. Thus, Bi-LSTM can describe the dependency relationship between words in both the front-to-back and back-to-front directions. Therefore, in the Bi-LSTM network, each word is first transmitted to a forward LSTM unit and then to a backward LSTM unit, and its output is the result of splicing the hidden states of the 2 LSTM units.
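The sketch below shows this encoding step with PyTorch's built-in bidirectional LSTM; the layer sizes and sequence length are illustrative assumptions, not the configuration reported in the paper.

```python
import torch
import torch.nn as nn

# Minimal Bi-LSTM encoding sketch (sizes are assumptions for illustration).
embed_dim, hidden_dim = 100, 64
bilstm = nn.LSTM(input_size=embed_dim, hidden_size=hidden_dim,
                 batch_first=True, bidirectional=True)

x = torch.randn(1, 20, embed_dim)   # (batch, l words, n-dim embeddings)
outputs, _ = bilstm(x)              # forward and backward hidden states are spliced
print(outputs.shape)                # torch.Size([1, 20, 2 * hidden_dim])
```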

In the LSTM model[23], unit t is calculated as follows:

$i_t = \sigma(x_t U_i + h_{t-1} W_i + b_i)$  (1)

$f_t = \sigma(x_t U_f + h_{t-1} W_f + b_f)$  (2)

$o_t = \sigma(x_t U_o + h_{t-1} W_o + b_o)$  (3)

$q_t = \tanh(x_t U_q + h_{t-1} W_q + b_q)$  (4)

$p_t = f_t \times p_{t-1} + i_t \times q_t$  (5)

$h_t = o_t \times \tanh(p_t)$  (6)

where $U \in \mathbb{R}^{d \times n}$ and $W \in \mathbb{R}^{n \times n}$ are weight matrices, $b \in \mathbb{R}^{n}$ is the bias vector, $d$ is the dimension of the word embedding, $\sigma$ and $\tanh$ denote the sigmoid and tanh activation functions, and $n$ is the output size of the LSTM network. The LSTM unit consists of the input gate $i_t$, forget gate $f_t$, and output gate $o_t$.
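For reference, a minimal NumPy sketch of one LSTM step following Eqs (1)-(6) is given below; the dictionary layout of the gate parameters (keys 'i', 'f', 'o', 'q') is an implementation convenience, not part of the original formulation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, p_prev, U, W, b):
    """One LSTM unit following Eqs (1)-(6).

    x_t    : (d,) word embedding at step t
    h_prev : (n,) previous hidden state h_{t-1}
    p_prev : (n,) previous cell state p_{t-1}
    U, W, b: dicts holding U in R^{d x n}, W in R^{n x n} and b in R^n
             for the gates 'i', 'f', 'o' and the candidate 'q'
    """
    i_t = sigmoid(x_t @ U['i'] + h_prev @ W['i'] + b['i'])   # Eq.(1) input gate
    f_t = sigmoid(x_t @ U['f'] + h_prev @ W['f'] + b['f'])   # Eq.(2) forget gate
    o_t = sigmoid(x_t @ U['o'] + h_prev @ W['o'] + b['o'])   # Eq.(3) output gate
    q_t = np.tanh(x_t @ U['q'] + h_prev @ W['q'] + b['q'])   # Eq.(4) candidate state
    p_t = f_t * p_prev + i_t * q_t                           # Eq.(5) cell state
    h_t = o_t * np.tanh(p_t)                                 # Eq.(6) hidden state
    return h_t, p_t
```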

2.3 Position-weight fusion layer

When Bi-LSTM alone is used to extract the features, it becomes difficult to analyze the differences between the multiple targets of the text. This leads to a lack of pertinence when the algorithm processes multiple targets in the same text. Therefore, in order to reflect the differences among the corresponding features of different targets in the text, a two-stage method is designed. The first step calculates the ultimate position-weight vector, and the second step concatenates the position-weight vector and the output of the Bi-LSTM layer.

2.3.1 Calculating the final position-weight vector

(7)

(8)

(9)

At this point, each component of vector E represents the influence of each word on the target, as shown in Fig.2. Each element in E is a value between 0 and 1.

Fig.2 Position-weight vector of 2 targets in the same text

In order to control the effect of vector E on the prediction result, the coefficient μ is used to scale each component of vector E by a factor of μ. The influence of the position weight on the system can be changed by adjusting μ, as follows:

$E_{\mu} = E \times \mu$  (10)

where E_μ is the ultimate position-weight vector. Each element in E_μ is a value between 0 and μ.
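The following Python sketch illustrates how such a weight vector is scaled by μ as in Eq.(10). The distance-based decay used here to build E is a hypothetical illustration only; it is not the formula defined by Eqs (7)-(9).

```python
import numpy as np

def position_weights(num_words, target_span, mu):
    """Illustrative position weights for one target (the decay rule below is
    hypothetical, standing in for Eqs (7)-(9))."""
    start, end = target_span            # word indices of the target's context range
    E = np.zeros(num_words)
    for i in range(num_words):
        if start <= i < end:
            E[i] = 1.0                  # word lies inside the target's context range
        else:
            dist = min(abs(i - start), abs(i - (end - 1)))
            E[i] = 1.0 / (1.0 + dist)   # decays with distance, stays between 0 and 1
    return E * mu                       # Eq.(10): each element now lies between 0 and mu
```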

2.3.2 Concatenating position-weight vector and output of Bi-LSTM

In the Bi-LSTM network, the output of each word is composed of the spliced hidden states of the forward and backward LSTM units. In addition, each word corresponds to one position-weight in E_μ. Thus, a new vector is produced by concatenating the position-weight of each word and the Bi-LSTM output, and this vector is taken as the output of the position-weight fusion layer. It can not only describe the dependency between words in the different directions, but can also describe the impact of a word on the different targets of stance detection.
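A minimal PyTorch sketch of this fusion step is given below; the tensor shapes are illustrative assumptions matching the Bi-LSTM sketch above.

```python
import torch

# Append each word's position weight to its Bi-LSTM output vector.
bilstm_out = torch.randn(1, 20, 2 * 64)        # (batch, l, 2 * hidden_dim)
e_mu = torch.rand(1, 20, 1)                    # (batch, l, 1) position weights E_mu
fused = torch.cat([bilstm_out, e_mu], dim=-1)  # (batch, l, 2 * hidden_dim + 1)
```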

2.4 LSTM layer

To determine the stance labels from the fusion of the position-weight and the output of Bi-LSTM, an LSTM is used for re-encoding[23]. This process re-extracts features from the fused tensor of the previous subsection in the order of the text. The input of this layer is the output vector of the position-weight fusion layer, and the output is the hidden state of the last LSTM unit.

2.5 Softmax classifier

The output of the LSTM layer is taken as the input of this layer, and the Softmax classifier[24]is used to predict the stance labels of the different targets.
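A minimal PyTorch sketch of these last two stages (re-encoding LSTM followed by Softmax classification) is shown below; the layer sizes and the three stance labels (e.g., favor, against, neither) are assumptions for illustration.

```python
import torch
import torch.nn as nn

# Re-encode the fused sequence, then classify the final hidden state.
fused_dim, hidden_dim, num_labels = 2 * 64 + 1, 64, 3
lstm = nn.LSTM(input_size=fused_dim, hidden_size=hidden_dim, batch_first=True)
classifier = nn.Linear(hidden_dim, num_labels)

fused = torch.randn(1, 20, fused_dim)    # output of the position-weight fusion layer
_, (h_last, _) = lstm(fused)             # hidden state of the last LSTM unit
logits = classifier(h_last.squeeze(0))   # (batch, num_labels)
probs = torch.softmax(logits, dim=-1)    # stance probabilities for one target
```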

3 Experiment

3.1 Experimental setting

The specific process of completing the multi-target stance detection task is shown in Fig.3.

Fig.3 Flow chart for multi-target stance detection

The experiment used the stance detection corpus for the 2016 US general election constructed by Sobhani et al.[3]. This corpus contains 3 datasets, each of which is a collection of tweets and stance labels for 2 candidates. In the original corpus, the 2 target words of each sentence were combined for analysis in Ref.[3]. The distribution of the data is shown in Table 1. In addition, the model parameters are shown in Table 2.

Table 1 Details of the experimental datasets

Table 2 Main parameter setting in our experiment

3.2 Evaluation metric

As a classification task, stance detection is more inclined to improve the classification accuracy of the "favor" and "against" stances. Therefore, the average F1 score of "favor" and "against" (Favg) was used as the evaluation indicator[4].
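A minimal sketch of this metric using scikit-learn is shown below; the label strings and example predictions are hypothetical.

```python
from sklearn.metrics import f1_score

def f_avg(y_true, y_pred):
    """Mean of the F1 scores of the 'favor' and 'against' classes only."""
    return f1_score(y_true, y_pred, labels=['FAVOR', 'AGAINST'], average='macro')

# Hypothetical example
y_true = ['FAVOR', 'AGAINST', 'NONE', 'FAVOR']
y_pred = ['FAVOR', 'AGAINST', 'FAVOR', 'AGAINST']
print(f_avg(y_true, y_pred))
```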

3.3 Baselines

The selected baselines are as follows.

Sequence-to-sequence (Seq2Seq)[26]. Recently, the Seq2Seq model has achieved good performance when dealing with sequential problems. Therefore, Ref.[3] applied this model to the multi-target stance detection problem. In this method, the text is used as the input of the model, and the stance labels for the different targets are output. The advantage of this algorithm is that it not only mines the stance related to the target from the text, but also refers to the relationship between multiple targets.

Target-related zone modeling (TRZM)[19]. This model is proposed for multi-target stance detection tasks. It uses a region segmentation method to divide a text containing 2 targets into 4 parts, and then a multi-input LSTM is used to process these parts and detect the stances.

3.4 Results and discussion

In order to verify the effectiveness of the proposed method, the following 2 experiments are carried out: comparison between the proposed method and the related baselines, and comparisons of different parameters in the position-weight fusion layer.

3.4.1 Comparison between the proposed method and the related baselines

The experimental results of the algorithm are compared with those of the other algorithms, as shown in Table 3, where PW-Bi-LSTM is the bidirectional LSTM network with position-weight proposed in this paper. When comparing the F1 values of the different methods, the ranking is PW-Bi-LSTM > TRZM > Seq2Seq. The conclusions drawn from these results are as follows.

1) Although Seq2Seq has the ability to refer to different labels when detecting the stance, it does not take into account the effect of the positional relationship between the text and the target. This may be the reason why its performance is lower than that of TRZM and the model in this article.

2) The performance of TRZM is not outstanding, but it can meet the practical requirement, because the combination of feature extraction and deep learning is a good way to improve multi-target stance detection. However, this method splits the integrity of the text, which may be the reason for its weaker performance.

3) The proposed method outperformed the other methods on the 3 datasets, and its macro Favg is 1.4% higher than the corresponding values of the other algorithms. Compared with the other methods, the proposed method can automatically extract the position features of the different targets in the text and expand the tolerance for input difference. For input text with multiple targets, the other methods may be affected by the other targets when detecting the stance toward a given target, and their accuracies decrease accordingly. However, the proposed method can avoid the influence of irrelevant targets, and its accuracy does not change much.

Table 3 Performances of our approach and the compared methods

3.4.2 Comparisons of different parameters in the position-weight fusion layer

One of the key parameters affecting the performance of the proposed method is the coefficient μ mentioned in Section 2.3. In order to determine the influence of the ultimate position-weight vector on the algorithm and to find the optimal coefficient μ in the position-weight fusion layer, Favg for different values of μ on the development and test sets of the 3 datasets are compared, as shown in Fig.4. The figure shows that when the ultimate position-weight vector is added to the algorithm (i.e., μ ≠ 0), Favg is significantly improved, which indicates that this addition can improve the result of multi-target stance detection. In addition, the effect of the proposed algorithm is related to the coefficient μ. In the 3 datasets, the best results on the development sets are achieved when μ equals 3, 5, and 10, respectively. Thus, the effect of the proposed algorithm can be improved by adjusting the coefficient μ.

Fig.4 Favg for different coefficients (μ) in the proposed method. The x-axis denotes the coefficient size, and the y-axis refers to Favg

4 Conclusions

In this study, a Bi-LSTM network with position-weight for multi-target stance detection is proposed. The positional relationship between each word and the target is represented as a vector, which is then fused with the Bi-LSTM model to refine the stance detection. The experimental results demonstrate the validity of the proposed method and indicate that adding the multi-target position information can expand the tolerance for input difference and diversity. In the future, additional position feature extraction methods and real-world data covering a wider range of topics will be adopted for continuous improvement and optimization of the algorithm. In addition, the small number of datasets used in this paper leads to large volatility in the experimental results; therefore, the study of this volatility will be considered in follow-up work.
