Reposted from: https://zhuanlan.zhihu.com/p/81542002

Unlike document- and sentence-level sentiment analysis, fine-grained sentiment analysis aims to analyze the sentiment toward specific aspects of goods and services in review texts. For example, in the sentence "waiters are unfriendly but the pasta is out of this world.", the two aspects waiters and pasta express negative and positive sentiment respectively.

For example, a tweet about COVID-19 may touch on multiple aspects, such as vaccines, government control measures, and Long COVID, and the author may hold a different sentiment polarity toward each of them. These aspect-level sentiments often differ from the overall sentiment of the sentence, which is why ABSA is needed.

Contents

  • Preface
  • 1. Target-Oriented Opinion Words Extraction
  • 2. BERT on ABSA
  • 3. AdaRNN (Adaptive Recursive Neural Network for Target-Dependent Twitter Sentiment Classification)
  • 4. TD-LSTM (Target-Dependent LSTM)
  • 5. MemNet
  • 6. AT-LSTM
  • 7. ATAE-LSTM
  • 8. IAN

Preface

This article introduces some classic models for the ABSA task, focusing on two subtasks: aspect term sentiment analysis and aspect category sentiment analysis. Note first that in ABSA work, aspect-based sentiment analysis and aspect-level sentiment analysis generally refer to the same task; likewise, aspect term and target refer to the same object, as do sentence and context, although different papers may use different terms for the same thing.

1. Target-Oriented Opinion Words Extraction

TOWE is an ABSA subtask that extracts, for a given aspect (target) in a sentence, the opinion words that describe it.

Typical ABSA can analyze the sentiment polarity of different aspects in a review text, but it cannot reveal the opinion words attached to each target. The authors therefore first annotate target-specific opinion words on the SemEval14, SemEval15, and SemEval16 ABSA datasets.


The difficulty of TOWE is that, for the same sentence, different input targets should yield different labeled opinion words. The core of the problem is how to model the semantic relationship between the target and its context so as to obtain a target-specific text representation.


A common baseline is TD-LSTM, which uses two LSTMs to model the left context plus the target and the target plus the right context, respectively; the final hidden vectors of the two LSTMs are concatenated and fed into a softmax classifier (see the sketch in the TD-LSTM section below).

Another common approach is the IOG model. IOG adopts an encoder-decoder framework; the encoder contains three components: Inward-LSTM, Outward-LSTM, and Global LSTM. The review sentence is split into three parts according to the target position: the left context, the target, and the right context.

Outward-LSTM consists of two LSTMs running in opposite directions that model the preceding and following contexts starting from the target in the middle of the sentence (inside-out), thereby propagating target information into its context to generate a target-specific representation.

In contrast, Inward-LSTM uses two inward-facing LSTMs that encode from both ends of the sentence toward the target position (outside-in).

In addition, to compensate for the fragmentation and incompleteness of sentence information caused by splitting the context, a Global LSTM is introduced to model the overall semantics of the sentence, i.e., an ordinary BiLSTM encodes the complete sentence. Finally, the representations at corresponding positions from the Inward-LSTM, Outward-LSTM, and Global LSTM are concatenated to obtain the encoder output.

On the decoder side, two decoding strategies are tried: one performs an independent three-way classification at each position, which decodes faster; the other uses a CRF to model dependencies between labels.
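
To make the encoder structure concrete, here is a minimal PyTorch sketch of the IOG encoder idea. This is not the authors' implementation: all class and variable names are my own, batch size is fixed to 1 for clarity, and the way the two half-sequences are stitched back together at the target span is a simplifying assumption.

```python
import torch
import torch.nn as nn

class IOGEncoderSketch(nn.Module):
    """Minimal sketch of the IOG encoder: Inward-LSTM, Outward-LSTM,
    and a Global BiLSTM whose per-position outputs are concatenated."""

    def __init__(self, emb_dim: int, hidden: int):
        super().__init__()
        # one LSTM per direction for the inward and outward passes
        self.in_l2r = nn.LSTM(emb_dim, hidden, batch_first=True)  # left -> target
        self.in_r2l = nn.LSTM(emb_dim, hidden, batch_first=True)  # right -> target
        self.out_l = nn.LSTM(emb_dim, hidden, batch_first=True)   # target -> left
        self.out_r = nn.LSTM(emb_dim, hidden, batch_first=True)   # target -> right
        self.global_lstm = nn.LSTM(emb_dim, hidden, batch_first=True,
                                   bidirectional=True)

    def forward(self, emb: torch.Tensor, t_start: int, t_end: int):
        # emb: (1, seq_len, emb_dim); [t_start, t_end) is the target span
        left_tgt = emb[:, :t_end]        # left context + target
        tgt_right = emb[:, t_start:]     # target + right context

        # Inward: encode from both ends toward the target
        h_in_l, _ = self.in_l2r(left_tgt)
        h_in_r, _ = self.in_r2l(torch.flip(tgt_right, dims=[1]))
        h_in_r = torch.flip(h_in_r, dims=[1])

        # Outward: encode from the target toward both ends
        h_out_l, _ = self.out_l(torch.flip(left_tgt, dims=[1]))
        h_out_l = torch.flip(h_out_l, dims=[1])
        h_out_r, _ = self.out_r(tgt_right)

        # Stitch the two halves back to full sequence length (the target
        # positions appear in both halves; here we simply keep the left
        # half up to t_start and the right half from the target onward).
        h_in = torch.cat([h_in_l[:, :t_start], h_in_r], dim=1)
        h_out = torch.cat([h_out_l[:, :t_start], h_out_r], dim=1)

        h_global, _ = self.global_lstm(emb)
        return torch.cat([h_in, h_out, h_global], dim=-1)  # (1, seq_len, 4*hidden)

enc = IOGEncoderSketch(emb_dim=50, hidden=32)
x = torch.randn(1, 9, 50)              # a 9-token sentence, target = tokens 3..4
print(enc(x, t_start=3, t_end=5).shape)  # torch.Size([1, 9, 128])
```

On top of these per-position representations, the decoder would then apply either an independent three-way classification at each position or a CRF layer, as described above.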


2. BERT on ABSA

Utilizing BERT for Aspect-Based Sentiment Analysis via Constructing Auxiliary Sentence
(NAACL 2019)

The model converts aspect-level sentiment classification into a sentence pair classification problem by constructing auxiliary sentences.


Taking the sentence "LOCATION1 is central London and it is so extremely expensive" as an example, there are four ways to construct the auxiliary sentence:

The first is the QA-M paradigm. The auxiliary sentence is “what do you think of the price of LOCATION1”.

The second is the NLI-M paradigm. The auxiliary sentence is “LOCATION1-price”.

The third, which performs best, is the QA-B paradigm. Three auxiliary sentences are constructed, "the polarity of the aspect price of LOCATION1 is positive/none/negative", and each is given a binary Yes/No classification. The results of the three sentences are scored, and the polarity whose sentence receives the highest Yes score is selected as the predicted category for the aspect.


The fourth is the NLI-B paradigm. The auxiliary sentences are "LOCATION1 - price - positive/none/negative", and the prediction method is the same as above.
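
To illustrate the four paradigms, here is a minimal sketch in plain Python (the function and variable names are my own, not the paper's). Each (original sentence, auxiliary sentence) pair would then be fed to a BERT sentence-pair classifier.

```python
# Construct the four auxiliary-sentence paradigms for a (target, aspect) pair.
POLARITIES = ["positive", "none", "negative"]

def build_auxiliary_sentences(target: str, aspect: str) -> dict:
    return {
        # QA-M: a pseudo question; the model does 3-way sentiment classification
        "QA-M": [f"what do you think of the {aspect} of {target}"],
        # NLI-M: a pseudo hypothesis phrase; also 3-way classification
        "NLI-M": [f"{target} - {aspect}"],
        # QA-B: one sentence per polarity; the model answers Yes/No for each
        "QA-B": [f"the polarity of the aspect {aspect} of {target} is {p}"
                 for p in POLARITIES],
        # NLI-B: phrase form of QA-B, also scored Yes/No per polarity
        "NLI-B": [f"{target} - {aspect} - {p}" for p in POLARITIES],
    }

sentence = "LOCATION1 is central London and it is so extremely expensive"
for name, aux_list in build_auxiliary_sentences("LOCATION1", "price").items():
    for aux in aux_list:
        print(name, "|", sentence, "[SEP]", aux)
```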

3. AdaRNN (Adaptive Recursive Neural Network for Target-Dependent Twitter Sentiment Classification)

The model devises a rule that transforms a dependency parse tree into a binary tree in which the target is located at one of the two child nodes of the root. An RNN then propagates the sentiment information of words bottom-up toward the target, finally producing a vector representation of the sentence, which is fed into a softmax classifier.

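
As a rough illustration of the bottom-up propagation only, here is a hypothetical PyTorch sketch. It is deliberately not the real AdaRNN, which adaptively mixes several composition functions with learned weights; this sketch collapses that to a single tanh composition, and the toy tree and all names are my own.

```python
import torch
import torch.nn as nn

class RecursiveComposerSketch(nn.Module):
    """Bottom-up composition over a binary tree. The real AdaRNN selects
    adaptively among multiple composition matrices; this sketch uses one
    composition function purely to show the data flow toward the root."""

    def __init__(self, dim: int):
        super().__init__()
        self.compose = nn.Linear(2 * dim, dim)

    def forward(self, node):
        # node is either a leaf word vector (Tensor) or a (left, right) pair
        if isinstance(node, torch.Tensor):
            return node
        left, right = node
        h = torch.cat([self.forward(left), self.forward(right)], dim=-1)
        return torch.tanh(self.compose(h))

dim = 16
words = {w: torch.randn(dim) for w in ["waiters", "are", "unfriendly"]}
# a toy binarized tree with the target "waiters" as a child of the root
tree = (words["waiters"], (words["are"], words["unfriendly"]))
root = RecursiveComposerSketch(dim)(tree)
print(root.shape)  # torch.Size([16]); fed to a softmax classifier in the paper
```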

4. TD-LSTM (Target-Dependent LSTM)

Effective LSTMs for Target-Dependent Sentiment Classification proposes two models, TD-LSTM and TC-LSTM.

TD-LSTM uses two LSTMs to model the left context plus the target and the target plus the right context, respectively. The final hidden vectors of the two LSTMs are concatenated and fed into a softmax classifier for classification.

TC-LSTM concatenates the average of the target word vectors with the word vector of each word in the sentence, and then performs the same operations as TD-LSTM.
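
A minimal PyTorch sketch of TD-LSTM under the description above (class, argument, and tensor names are my own, not the paper's code):

```python
import torch
import torch.nn as nn

class TDLSTMSketch(nn.Module):
    """Sketch of TD-LSTM: a left-to-right LSTM over (left context + target)
    and a right-to-left LSTM over (target + right context); their final
    hidden states are concatenated and classified with softmax."""

    def __init__(self, emb_dim: int, hidden: int, n_classes: int = 3):
        super().__init__()
        self.lstm_l = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.lstm_r = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, emb, t_start: int, t_end: int):
        # emb: (batch, seq_len, emb_dim); [t_start, t_end) is the target span
        _, (h_l, _) = self.lstm_l(emb[:, :t_end])                    # L -> target
        _, (h_r, _) = self.lstm_r(torch.flip(emb[:, t_start:], [1]))  # R -> target
        feats = torch.cat([h_l[-1], h_r[-1]], dim=-1)
        return torch.softmax(self.fc(feats), dim=-1)

model = TDLSTMSketch(emb_dim=50, hidden=32)
x = torch.randn(2, 9, 50)                  # 2 sentences, target = tokens 3..4
print(model(x, t_start=3, t_end=5).shape)  # torch.Size([2, 3])
```

For TC-LSTM, one would first average the target embeddings and concatenate that vector to every token embedding (doubling the input dimension) before running the same two LSTMs.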

5. MemNet

Aspect Level Sentiment Classification with Deep Memory Network proposes MemNet, a method based on multi-hop attention. Each word in the sentence is first converted into a word vector; the word vectors of all words in the sentence form a matrix called the memory. The word vectors of the aspect words are averaged as the initial aspect representation. Each hop contains an attention layer and a linear layer: the attention layer uses the aspect representation from the previous hop as the query, computes the similarity between the query and each word vector in the sentence, and takes a weighted sum of the word vectors to obtain a context vector. The context vector is then summed with the linearly transformed aspect representation to produce the aspect representation for this hop. Hops of the same structure are stacked several times, so that aspect representations at different levels serve as queries to extract contextual features. The final aspect representation is classified by a softmax classifier.

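
The hop structure can be sketched as follows. This is a simplified, hypothetical implementation following only the description above (refinements from the paper, such as location weighting, are omitted, and all names are my own):

```python
import torch
import torch.nn as nn

class MemNetSketch(nn.Module):
    """Sketch of MemNet's multi-hop attention: each hop attends over the
    word-vector memory with the current aspect vector as query, then adds
    a linear transform of that aspect vector."""

    def __init__(self, emb_dim: int, hops: int = 3, n_classes: int = 3):
        super().__init__()
        self.attn = nn.Linear(2 * emb_dim, 1)      # scores [memory; aspect]
        self.linear = nn.Linear(emb_dim, emb_dim)  # shared across hops
        self.hops = hops
        self.out = nn.Linear(emb_dim, n_classes)

    def forward(self, memory, aspect):
        # memory: (batch, seq_len, emb_dim); aspect: (batch, emb_dim),
        # initialized as the average of the aspect word vectors
        for _ in range(self.hops):
            q = aspect.unsqueeze(1).expand_as(memory)
            scores = torch.softmax(self.attn(torch.cat([memory, q], -1)), dim=1)
            context = (scores * memory).sum(dim=1)   # weighted sum of memory
            aspect = context + self.linear(aspect)   # next hop's query
        return torch.softmax(self.out(aspect), dim=-1)

model = MemNetSketch(emb_dim=50)
mem = torch.randn(2, 9, 50)     # sentence memory
asp = torch.randn(2, 50)        # averaged aspect embedding
print(model(mem, asp).shape)    # torch.Size([2, 3])
```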

6. AT-LSTM

Attention-based LSTM for Aspect-level Sentiment Classification

In AT-LSTM, each word in the sentence is first converted into a word vector, and an LSTM encodes the sentence to obtain hidden states. A learnable parameter vector serves as the query; the concatenation of each hidden state with a linearly transformed aspect embedding serves as the key; and the hidden states serve as the values for computing attention. The resulting context vector is concatenated with the last hidden state and passed through a nonlinear transformation to obtain the final classification feature, after which a softmax classifier predicts the probability distribution over classes.

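
A minimal sketch following the description above (the projection layers reflect my reading of the attention formulation; all names are hypothetical, not the paper's code):

```python
import torch
import torch.nn as nn

class ATLSTMSketch(nn.Module):
    """Sketch of AT-LSTM: keys are [projected hidden state; projected aspect
    embedding], the query is a learned vector, values are the hidden states."""

    def __init__(self, emb_dim: int, hidden: int, n_classes: int = 3):
        super().__init__()
        self.lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.proj_h = nn.Linear(hidden, hidden, bias=False)
        self.proj_a = nn.Linear(emb_dim, hidden, bias=False)
        self.query = nn.Parameter(torch.randn(2 * hidden))  # learnable query
        self.fc_ctx = nn.Linear(hidden, hidden, bias=False)
        self.fc_last = nn.Linear(hidden, hidden, bias=False)
        self.out = nn.Linear(hidden, n_classes)

    def forward(self, emb, aspect):
        # emb: (batch, seq_len, emb_dim); aspect: (batch, emb_dim)
        H, _ = self.lstm(emb)                                   # (B, T, hidden)
        a = self.proj_a(aspect).unsqueeze(1).expand(-1, H.size(1), -1)
        keys = torch.tanh(torch.cat([self.proj_h(H), a], -1))   # (B, T, 2*hidden)
        alpha = torch.softmax(keys @ self.query, dim=1)         # (B, T)
        ctx = (alpha.unsqueeze(-1) * H).sum(dim=1)              # context vector
        feats = torch.tanh(self.fc_ctx(ctx) + self.fc_last(H[:, -1]))
        return torch.softmax(self.out(feats), dim=-1)

model = ATLSTMSketch(emb_dim=50, hidden=32)
print(model(torch.randn(2, 9, 50), torch.randn(2, 50)).shape)  # torch.Size([2, 3])
```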

7. ATAE-LSTM

Attention-based LSTM with Aspect Embedding

To make better use of aspect information, the embedding of each word is concatenated with the aspect embedding at the embedding layer, yielding aspect-aware word representations. The subsequent operations are the same as in AT-LSTM.
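
Since the only change relative to AT-LSTM is at the input layer, a short sketch suffices (hypothetical tensor names and shapes):

```python
import torch

# ATAE-LSTM changes only the LSTM input: the aspect embedding is appended
# to every word embedding before encoding.
word_emb = torch.randn(2, 9, 50)    # (batch, seq_len, emb_dim)
aspect_emb = torch.randn(2, 50)     # (batch, emb_dim)
atae_input = torch.cat(
    [word_emb, aspect_emb.unsqueeze(1).expand(-1, 9, -1)], dim=-1
)                                   # (2, 9, 100)
# atae_input is then fed to the same LSTM + attention stack as AT-LSTM.
print(atae_input.shape)
```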

8. IAN

Interactive Attention Networks for Aspect-Level Sentiment Classification

The paper proposes IAN, an interactive attention network that computes attention interactively between the context and the target: target information is used to select the important words in the context, and context information is used to select the important words in the target. Specifically, separate LSTMs compute hidden states for the target and the context, and those hidden states are average-pooled to obtain fixed-length vector representations of each.

The pooled target representation is used as the query, and the context hidden states as keys and values, to compute attention and obtain a target-aware context representation.

Similarly, the pooled context representation is used as the query and the target hidden states as keys and values to obtain a context-aware target representation. The two representations are concatenated as the classification feature and passed through a linear layer with tanh activation to obtain logits; finally, softmax outputs the class probabilities.
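
A minimal sketch of the two attention directions (my own names; the scoring function is simplified to a dot product, whereas the paper uses a more elaborate score):

```python
import torch
import torch.nn as nn

def pooled_attention(query, keys):
    """Dot-product attention with a pooled vector as query; the keys double
    as values, matching the description above (a simplification)."""
    scores = torch.softmax((keys @ query.unsqueeze(-1)).squeeze(-1), dim=1)
    return (scores.unsqueeze(-1) * keys).sum(dim=1)

class IANSketch(nn.Module):
    """Sketch of IAN: separate LSTMs for context and target, average pooling,
    then attention in both directions; features = [context_rep; target_rep]."""

    def __init__(self, emb_dim: int, hidden: int, n_classes: int = 3):
        super().__init__()
        self.ctx_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.tgt_lstm = nn.LSTM(emb_dim, hidden, batch_first=True)
        self.fc = nn.Linear(2 * hidden, n_classes)

    def forward(self, ctx_emb, tgt_emb):
        Hc, _ = self.ctx_lstm(ctx_emb)            # (B, Tc, hidden)
        Ht, _ = self.tgt_lstm(tgt_emb)            # (B, Tt, hidden)
        c_pool, t_pool = Hc.mean(dim=1), Ht.mean(dim=1)
        ctx_rep = pooled_attention(t_pool, Hc)    # target attends to context
        tgt_rep = pooled_attention(c_pool, Ht)    # context attends to target
        logits = torch.tanh(self.fc(torch.cat([ctx_rep, tgt_rep], -1)))
        return torch.softmax(logits, dim=-1)

model = IANSketch(emb_dim=50, hidden=32)
out = model(torch.randn(2, 9, 50), torch.randn(2, 2, 50))
print(out.shape)                                  # torch.Size([2, 3])
```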

