A Summary of Relation Extraction Papers (continuously updated)

2000

1. Miller, Scott, et al. "A novel use of statistical parsing to extract information from text." 1st Meeting of the North American Chapter of the Association for Computational Linguistics. 2000. Citations: 246

[paper]

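Main idea: Relation extraction is usually built as a pipeline: part-of-speech tagging first, then entity recognition, then syntactic parsing, and finally semantic interpretation. The biggest problem with this design is error propagation: mistakes made in one step carry over to the next. To address this, the authors design a joint model that trains these steps together. The model uses augmented parse trees to extract sentence-level relations. To train it, the authors built a domain-specific corpus by combining TREEBANK parses with manual annotation, and then mined relations with pattern matching and statistical methods. One drawback is that annotating a corpus of augmented parse trees is very time-consuming. Another is that pattern-matching rules have to be designed for each relation type. These drawbacks limit where the model can be applied.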

2004

1. Culotta, Aron, and Jeffrey Sorensen. "Dependency tree kernels for relation extraction." Proceedings of the 42nd annual meeting on association for computational linguistics. Association for Computational Linguistics, 2004. Citations: 848

[paper]

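Main idea: The method first builds augmented dependency trees to obtain a range of features for a sentence and its two entities, then defines a tree kernel function that maps the sample features into a high-dimensional space, and finally uses an SVM for relation classification. Its drawback is a heavy reliance on the quality of the augmented dependency trees.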

2. Kambhatla, Nanda. "Combining lexical, syntactic, and semantic features with maximum entropy models for extracting relations." Proceedings of the ACL 2004 on Interactive poster and demonstration sessions. Association for Computational Linguistics, 2004. Citations: 512

[paper]

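Main idea: Earlier methods relied on syntactic parse trees. This paper designs a range of features and uses a maximum entropy model, achieving good results on relation extraction. The features include: words, entity type, mention level, overlap, dependency, and parse tree. Experiments show that the classifier performs well even when only a small number of parse-tree features are used. The drawback of this model is that the features have to be designed by hand.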

2005

1. Zhao, Shubin, and Ralph Grishman. "Extracting relations with integrated information using kernel methods." Proceedings of the 43rd annual meeting on association for computational linguistics. Association for Computational Linguistics, 2005. Citations: 365

[paper]

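Main idea: This paper is an upgrade of the 2004 kernel-based model. The authors consider information at three levels of granularity: tokenization, sentence parsing, and deep dependency analysis. They design a kernel function for each information source and then combine the kernels. The drawback is that the method still depends on NLP tools to extract features, and the model is fairly complex.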

2. GuoDong, Zhou, et al. "Exploring various knowledge in relation extraction." Proceedings of the 43rd annual meeting on association for computational linguistics. Association for Computational Linguistics, 2005. Citations: 559

[paper]

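Main idea: This paper also targets sentence-level relation extraction: it first extracts features and then uses an SVM for relation classification. Its contribution lies in feature design. The authors find that chunking information is enough to capture the syntactic information, and they also use WordNet and name lists to enrich the semantic information. The drawback is that features still have to be designed by hand, and the model depends on the accuracy of NLP tools.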

3. Bunescu, Razvan C., and Raymond J. Mooney. "A shortest path dependency kernel for relation extraction." Proceedings of the conference on human language technology and empirical methods in natural language processing. Association for Computational Linguistics, 2005. Citations: 816

[paper]

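Main idea: This is another kernel-method paper based on dependency trees. It turns a sentence into a graph in which words are the nodes and dependency relations are the edges. The shortest path between the two entities can then be found, and combining the word, part-of-speech, and entity-type features of the nodes on that path gives the final features. A kernel method with an SVM is then used for relation classification. The contribution is the use of the shortest dependency path, which resembles how people reason about relations. The drawback is that the method still depends on the accuracy of NLP tools.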
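The graph construction and shortest-path step described above can be sketched in a few lines. This is a minimal illustration and not the paper's implementation: the function name `shortest_dependency_path`, the `(head, dependent)` edge format, and the hand-written parse of the example sentence are all assumptions made for this sketch. A real system would take the edges from a dependency parser.

```python
from collections import deque


def shortest_dependency_path(edges, source, target):
    """Return the list of tokens on the shortest path between two entities.

    edges: iterable of (head, dependent) pairs, treated as undirected.
    Returns None if the two tokens are not connected.
    """
    # Build an undirected adjacency list from the dependency edges.
    graph = {}
    for head, dependent in edges:
        graph.setdefault(head, []).append(dependent)
        graph.setdefault(dependent, []).append(head)

    # Breadth-first search, remembering each node's predecessor.
    parents = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break
        for neighbor in graph.get(node, []):
            if neighbor not in parents:
                parents[neighbor] = node
                queue.append(neighbor)

    if target not in parents:
        return None

    # Walk back from the target to the source, then reverse.
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parents[node]
    return path[::-1]


# Hypothetical, hand-written parse of "protesters seized several stations".
edges = [("seized", "protesters"), ("seized", "stations"), ("stations", "several")]
path = shortest_dependency_path(edges, "protesters", "stations")
print(path)
```

The modifier "several" is off the path and is dropped, which is the intended effect: only the words that connect the two entities are kept as features.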

2006

1. Mooney, Raymond J., and Razvan C. Bunescu. "Subsequence kernels for relation extraction." Advances in neural information processing systems. 2006. Citations: 486

[paper]

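Main idea: This is another kernel-method paper. Its contribution is to use the two entities to split a sentence into three parts: the part to the left of entity 1, the part between entity 1 and entity 2, and the part to the right of entity 2. This yields finer-grained features and improves performance. The experiments were run on protein interaction datasets.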
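The three-way split is easy to show on a tokenized sentence. The sketch below is only an illustration of the segmentation, not of the subsequence kernel itself. The function name `split_by_entities`, the span convention (start inclusive, end exclusive), and the example sentence are assumptions.

```python
def split_by_entities(tokens, e1_span, e2_span):
    """Split tokens into (before, between, after) around two entity spans.

    Spans are (start, end) token indices, end exclusive, and entity 1 is
    assumed to come before entity 2 without overlapping it.
    """
    (start1, end1), (start2, end2) = e1_span, e2_span
    if not (0 <= start1 < end1 <= start2 < end2 <= len(tokens)):
        raise ValueError("entity spans must be ordered and non-overlapping")
    before = tokens[:start1]
    between = tokens[end1:start2]
    after = tokens[end2:]
    return before, between, after


# Hypothetical sentence with placeholder protein names.
tokens = "interaction of ProtA with ProtB was observed".split()
before, between, after = split_by_entities(tokens, (2, 3), (4, 5))
print(before, between, after)
```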

2. Culotta, Aron, Andrew McCallum, and Jonathan Betz. "Integrating probabilistic extraction models and data mining to discover relations and patterns in text." Proceedings of the main conference on Human Language Technology Conference of the North American Chapter of the Association of Computational Linguistics. Association for Computational Linguistics, 2006. Citations: 191

[paper]

2008

1. Banko, Michele, and Oren Etzioni. "The tradeoffs between open and traditional relation extraction." Proceedings of ACL-08: HLT. 2008. Citations: 433

[paper]

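Main idea: This paper focuses on open-domain relation extraction, where the relation types are not defined in advance. The authors find that 95% of sentence-level relations can be summarized by 8 patterns. Sentence-level relation extraction can therefore be framed as a sequence labeling problem over these 8 patterns, which is solved by training a CRF model. The method has many restrictions and performs well only on relation extraction tasks that fit these specific patterns.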

2. Bundschus, Markus, et al. "Extraction of semantic biomedical relations from text using conditional random fields." BMC bioinformatics 9.1 (2008): 207. Citations: 223

[paper]

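Main idea: This is a relation extraction paper in the biomedical domain, with two datasets: disease-treatment relations and gene-disease relations. The authors use two CRF models for relation extraction. The first CRF recognizes entities and the second identifies the relation the entities belong to, so the second CRF depends on the output of the first. The drawback is that features have to be designed and the method depends on NLP tools.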

2009

1. Mintz, Mike, et al. "Distant supervision for relation extraction without labeled data." Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP: Volume 2-Volume 2. Association for Computational Linguistics, 2009. Citations: 1483

[paper]

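Main idea: This is the founding work of distantly supervised relation extraction and is worth reading closely. Its basic assumption is: if two entities have some relation in an existing knowledge base, then any sentence in which the two entities co-occur also expresses that relation. Under this assumption, the relations already in the knowledge base can be used to label text automatically, which produces a large number of positive samples. Negative samples are labeled using random entity pairs. Training samples are generated this way to reduce manual annotation, and then features are designed and a relation classifier is trained. The approach has two main problems. First, the assumption is too strong: two entities sometimes appear together without expressing the relation defined in the knowledge base, and when two entities have several types of relation it is impossible to tell which one a given sentence is talking about. Second, this labeling scheme depends on the performance of NER.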
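The labeling assumption can be sketched as a simple lookup. This is a toy illustration rather than the paper's pipeline: the function name `distant_label`, the `"NA"` label for pairs missing from the knowledge base, and the example facts and sentences are all made up for this sketch, and entity recognition is assumed to have been done already.

```python
def distant_label(sentences, knowledge_base, negative_label="NA"):
    """Label each sentence with the knowledge-base relation of its entity pair.

    sentences: list of (text, entity1, entity2) tuples.
    knowledge_base: dict mapping (entity1, entity2) to a relation name.
    Pairs that are not in the knowledge base get the negative label.
    """
    labeled = []
    for text, entity1, entity2 in sentences:
        relation = knowledge_base.get((entity1, entity2), negative_label)
        labeled.append((text, entity1, entity2, relation))
    return labeled


# Hypothetical knowledge base and sentences.
knowledge_base = {("Steve Jobs", "Apple"): "founder_of"}
sentences = [
    ("Steve Jobs co-founded Apple in 1976.", "Steve Jobs", "Apple"),
    ("Steve Jobs left Apple in 1985.", "Steve Jobs", "Apple"),
    ("Steve Jobs visited Kyoto.", "Steve Jobs", "Kyoto"),
]
labeled = distant_label(sentences, knowledge_base)
for row in labeled:
    print(row)
```

The second sentence receives the `founder_of` label even though it does not express that relation. This is exactly the kind of wrong label that the later multi-instance and attention-based papers in this list try to handle.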

2010

1. Yao, Limin, Sebastian Riedel, and Andrew McCallum. "Collective cross-document relation extraction without labelled data." Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2010. Citations: 116

[paper]

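Main idea: Distant supervision produces a lot of noise, and the authors observe that the wrong labels usually violate some basic constraints. For a relation such as "nationality", for example, the first entity is usually a person, the second is usually a country, and a particular kind of predicate links the two. The authors exploit this to design a cross-document relation extraction model that recognizes entity types, builds a graph from the features of the two entities gathered across multiple documents, and then decides the relation between them. The drawback is that the model is fairly complex and still depends on many NLP tools.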

2011

1. Hoffmann, Raphael, et al. "Knowledge-based weak supervision for information extraction of overlapping relations." Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies-Volume 1. Association for Computational Linguistics, 2011. Citations: 524

[paper]

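Main idea: This paper addresses the extraction of overlapping relations, meaning that the same pair of entities can hold several types of relation. To reduce the effect of distant supervision noise, it uses multi-instance learning and combines sentence-level and document-level features to extract overlapping relations. The drawback is the dependence on NLP tools.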

2012

1. Surdeanu, Mihai, et al. "Multi-instance multi-label learning for relation extraction." Proceedings of the 2012 joint conference on empirical methods in natural language processing and computational natural language learning. Association for Computational Linguistics, 2012. Citations: 430

[paper]

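Main idea: Distant supervision has two problems. The first is that the automatically labeled data contains many errors. The second is that it cannot handle entity pairs that hold several types of relation. The authors define Multi-instance Multi-label Learning, using multi-instance learning to deal with the noise and multi-label learning to deal with the second problem.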

2. Takamatsu, Shingo, Issei Sato, and Hiroshi Nakagawa. "Reducing wrong labels in distant supervision for relation extraction." Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics: Long Papers-Volume 1. Association for Computational Linguistics, 2012. Citations: 142

[paper]

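Main idea: To address wrong labels in distant supervision, the authors design a generative model that judges the pattern in which a relation is expressed. If a pattern can express a given relation type, it is called a positive pattern, and the automatically labeled samples that match it should be kept. If a pattern produces only wrong samples, it is called a negative pattern, and the automatically labeled samples that match it should be discarded. Learning these rules removes wrong samples from the automatic labels and so improves relation extraction performance.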

2013

1. Riedel, Sebastian, et al. "Relation extraction with matrix factorization and universal schemas." Proceedings of the 2013 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. 2013. Citations: 396

[paper]

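Main idea: This paper addresses open-domain relation extraction (OpenIE). Its main contribution is to apply the idea of collaborative filtering to relation extraction. It first builds a matrix whose rows are entity pairs and whose columns are relations drawn from structured or unstructured data. Factorizing the matrix then gives a score for each entity pair belonging to a given relation. The method is nearly ten points more accurate than the baseline. Its drawbacks are also clear: the model is complex, and the matrix becomes very large as the number of entities grows.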

2014

1. Zeng, Daojian, et al. "Relation classification via convolutional deep neural network." (2014). Citations: 609

[paper]

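Main idea: Relation extraction with traditional machine learning algorithms requires feature design, and the errors made by NLP tools propagate. Deep learning can extract high-dimensional features automatically without using NLP tools. This paper proposes using a convolutional neural network for relation extraction. The input to the network is word embeddings together with word position embeddings, and a sentence representation is obtained through convolution, pooling, and non-linear layers. Because the position embeddings of the entities and other lexical features are taken into account, the entity information in the sentence is used effectively for relation extraction. This paper also opened a new era for the field, and CNNs have since been widely used for relation extraction.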
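The position input mentioned above is usually derived from each token's relative distance to the two entities, and each distance is then looked up in an embedding table. The sketch below only computes the distances. The function name `relative_positions` and the clipping parameter `max_distance` are assumptions added for illustration, not details taken from the paper.

```python
def relative_positions(num_tokens, e1_index, e2_index, max_distance=30):
    """Return one (distance to entity 1, distance to entity 2) pair per token.

    Distances are clipped to [-max_distance, max_distance] so that the
    position embedding table stays a fixed size (an assumption of this sketch).
    """
    def clip(distance):
        return max(-max_distance, min(max_distance, distance))

    return [
        (clip(i - e1_index), clip(i - e2_index))
        for i in range(num_tokens)
    ]


# Five tokens, entity 1 at index 1 and entity 2 at index 3.
positions = relative_positions(5, e1_index=1, e2_index=3)
print(positions)
```

In a full model, each of the two distances would index its own embedding table and the resulting vectors would be concatenated with the word embedding before the convolution layer.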

2015

1. Nguyen, Thien Huu, and Ralph Grishman. "Relation extraction: Perspective from convolutional neural networks." Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing. 2015. Citations: 191

[paper]

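Main idea: This paper improves on Zeng's 2014 method, which used convolution kernels with a single window size and also relied on lexical features. This paper instead uses windows of several sizes to extract n-gram features and no longer uses external lexical features such as WordNet. The experiments show that the proposed method is second only to traditional machine learning methods that use a wide range of features.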

2. Zeng, Daojian, et al. "Distant supervision for relation extraction via piecewise convolutional neural networks." Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing. 2015. Citations: 293

[paper]

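Main idea: The motivation of this paper is to deal with the wrong samples produced by distant supervision. It makes two contributions: the first is to use a piecewise CNN to extract sentence features, and the second is to use multi-instance learning to reduce the effect of wrong samples. The piecewise CNN uses the two entities to split a sentence into three parts (the part left of entity 1, the part between the two entities, and the part right of entity 2), applies convolution and max-pooling to each part to obtain three sets of features, and then merges them into a single sentence representation. In the multi-instance learning part, all sentences that mention the same entity pair are treated as one bag. As long as one sentence in the bag expresses a relation, the bag is classified as that relation type, which reduces the influence of wrong samples.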
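Piecewise max-pooling can be sketched without any deep learning framework. The code below assumes the convolution has already produced one row of filter outputs per token. The function name `piecewise_max_pool`, the choice to include each entity token in the segment that ends at it, and the zero padding for empty segments are assumptions of this sketch and may differ from the paper's exact boundaries.

```python
def piecewise_max_pool(feature_map, e1_index, e2_index):
    """Max-pool each filter separately over three segments of the sentence.

    feature_map: non-empty list of per-token lists, shape [n_tokens][n_filters].
    Returns a flat list of length 3 * n_filters.
    """
    n_filters = len(feature_map[0])
    segments = [
        feature_map[: e1_index + 1],              # up to and including entity 1
        feature_map[e1_index + 1 : e2_index + 1],  # after entity 1 up to entity 2
        feature_map[e2_index + 1 :],              # after entity 2
    ]
    pooled = []
    for segment in segments:
        if segment:
            # zip(*segment) iterates over filters (columns) within the segment.
            pooled.extend(max(column) for column in zip(*segment))
        else:
            # Empty segment, e.g. entity 2 is the last token.
            pooled.extend([0.0] * n_filters)
    return pooled


# Toy convolution output: 5 tokens, 2 filters.
feature_map = [
    [0.1, 0.5],
    [0.7, 0.2],
    [0.3, 0.9],
    [0.4, 0.1],
    [0.8, 0.6],
]
pooled = piecewise_max_pool(feature_map, e1_index=1, e2_index=3)
print(pooled)
```

Compared with a single max over the whole sentence, which would keep only two numbers here, the piecewise version keeps one maximum per filter and per segment, so some information about where a feature fired relative to the entities is preserved.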

3. Xu, Yan, et al. "Classifying relations via long short term memory networks along shortest dependency paths." Proceedings of the 2015 conference on empirical methods in natural language processing. 2015. Citations: 210

[paper]

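Main idea: This paper is among the earliest to use LSTMs for relation extraction. For sentence-level relation extraction, dependency-tree features are very important. Finding the shortest path between the two entities in the dependency tree extracts the key information effectively and removes irrelevant information. Once the shortest path is found, it is split into two parts at the root node, which helps determine the direction of the relation. An LSTM then extracts features from these two sub-paths, and multi-channel LSTMs bring in additional information such as POS tags and WordNet. A pooling layer collects the features produced by the multi-channel LSTMs, and the relation classification result is obtained at the end. The experiments show that an LSTM combined with the shortest dependency path extracts sentence-level relations effectively.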

4. Xu, Kun, et al. "Semantic relation classification via convolutional neural networks with simple negative sampling." arXiv preprint arXiv:1506.07650 (2015). Citations: 130

[paper]

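Main idea: This paper is similar to the LSTM shortest-dependency-path method. The difference is that it uses a CNN to extract features from the shortest-path sequence. The authors also find that the order of the entities matters for judging sentence-level relations, so they use negative sampling: the order of the two entities is swapped to create negative samples for training. The experiments show that a CNN plus negative sampling plus features such as WordNet reaches a leading F1 score.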
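The negative sampling step can be sketched as a small data augmentation routine. This is a simplified reading of the idea, not the paper's code: the function name `with_reversed_negatives`, the `"NEGATIVE"` label, and the example path are assumptions made for illustration.

```python
def with_reversed_negatives(examples, negative_label="NEGATIVE"):
    """Add one negative example per positive by reversing the entity order.

    examples: list of (path, label) pairs, where path is the token sequence
    on the shortest dependency path from the first entity to the second.
    """
    augmented = []
    for path, label in examples:
        augmented.append((path, label))
        # The same path read from the second entity to the first is
        # treated as a negative sample.
        augmented.append((path[::-1], negative_label))
    return augmented


# Hypothetical positive example.
examples = [(["virus", "causes", "disease"], "Cause-Effect(e1,e2)")]
augmented = with_reversed_negatives(examples)
for path, label in augmented:
    print(path, label)
```

One caveat with this simple version is that a symmetric relation reads the same in both directions, so a real pipeline would likely need to skip or treat such relation types differently.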

2016

1. Lin, Yankai, et al. "Neural relation extraction with selective attention over instances." Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). 2016. Citations: 253

[paper]

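Main idea: This paper improves on the 2015 piecewise CNN algorithm and mainly targets the wrong-label problem of distant supervision. The piecewise CNN paper uses multi-instance learning to handle wrong labels, but it selects only the single sentence in the bag that best expresses the relation, which throws away a lot of information. This paper improves on that point. The authors design a sentence-level attention mechanism that scores all sentences in the bag to assess the contribution of each one, and then combines the information from all sentences in the bag to perform relation extraction.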
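The attention over a bag can be sketched as a softmax-weighted sum of sentence vectors. The sketch scores each sentence with a plain dot product against a relation query vector, which is a simplification chosen for this illustration and not necessarily the scoring function used in the paper. The function name `selective_attention` and the toy vectors are assumptions.

```python
import math


def selective_attention(sentence_vectors, relation_query):
    """Combine the sentence vectors of one bag using attention weights.

    Each sentence is scored by its dot product with relation_query, the
    scores are normalized with a softmax, and the bag representation is
    the weighted sum of the sentence vectors.
    """
    scores = [
        sum(x * q for x, q in zip(vector, relation_query))
        for vector in sentence_vectors
    ]
    # Subtract the maximum score for numerical stability.
    highest = max(scores)
    exps = [math.exp(score - highest) for score in scores]
    total = sum(exps)
    weights = [value / total for value in exps]

    dim = len(relation_query)
    bag_vector = [
        sum(weight * vector[d] for weight, vector in zip(weights, sentence_vectors))
        for d in range(dim)
    ]
    return weights, bag_vector


# Toy bag of three sentence vectors and a toy relation query.
sentence_vectors = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
relation_query = [2.0, 0.0]
weights, bag_vector = selective_attention(sentence_vectors, relation_query)
print(weights)
print(bag_vector)
```

Sentences that match the relation query receive larger weights, while poorly matching (possibly mislabeled) sentences are down-weighted instead of being discarded outright.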

2. Wang, Linlin, et al. "Relation classification via multi-level attention cnns." (2016). Citations: 135

[paper]

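Main idea: The authors design a novel CNN architecture for sentence-level relation extraction, with two contributions. The first is a two-level attention mechanism for extracting sentence features. The first level is entity-specific attention, which computes how relevant each input word is to each entity. The second level is relation-specific pooling attention, which computes how relevant the n-gram features after convolution are to each relation. The second contribution is a pair-wise hinge loss function. The experiments show that the model extracts the key features of a sentence effectively and reaches SOTA performance. The drawback, on an intuitive level, is that the model is fairly complex and feels somewhat inelegant.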

2017

1. Ji, Guoliang, et al. "Distant supervision for relation extraction with sentence-level attention and entity descriptions." Thirty-First AAAI Conference on Artificial Intelligence. 2017. Citations: 66

[paper]

2. Wu, Yi, David Bamman, and Stuart Russell. "Adversarial training for relation extraction." Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing. 2017. Citations: 46

[paper]

2018

1. Verga, Patrick, Emma Strubell, and Andrew McCallum. "Simultaneously self-attending to all mentions for full-abstract biological relation extraction." arXiv preprint arXiv:1802.10569 (2018). Citations: 30

[paper]

2. Qin, Pengda, Weiran Xu, and William Yang Wang. "Robust distant supervision relation extraction via deep reinforcement learning." arXiv preprint arXiv:1805.09927 (2018). Citations: 21

[paper]

3. Han, Xu, et al. "Fewrel: A large-scale supervised few-shot relation classification dataset with state-of-the-art evaluation." arXiv preprint arXiv:1810.10147 (2018). Citations: 5

[paper]

4. Feng, Jun, et al. "Reinforcement learning for relation classification from noisy data." Thirty-Second AAAI Conference on Artificial Intelligence. 2018. Citations: 43

[paper]

2019

Sahu, Sunil Kumar, et al. "Inter-sentence Relation Extraction with Document-level Graph Convolutional Neural Network." arXiv preprint arXiv:1906.04684 (2019).

[paper]
