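1. Task requirements

Using the provided spaCy or PyTorch code (choose one) and the material on dependency parsing, train a dependency parsing model that can identify relationships between people.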

2. The dataset

The training set looks like this:
The test set looks like this:

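3. Preparing the dataset

Dependency syntax explains the syntactic structure of a sentence by analyzing the dependency relations between the components of a linguistic unit. It holds that the core verb of a sentence is the central component that governs all the others, while the verb itself is not governed by any other component. Every governed component is subordinate to its governor through some relation.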
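To make the governor/dependent idea concrete, here is a minimal sketch that stores a parse as a list of head indices, one per token, and assumes the root is the token that points at itself (the convention spaCy uses). The sentence and its head indices are illustrative only, and `find_root` and `dependents_of` are hypothetical helper names, not part of any library.

```python
from typing import Dict, List


def find_root(heads: List[int]) -> int:
    """Return the index of the token that is its own head (the root)."""
    roots = [i for i, h in enumerate(heads) if h == i]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {len(roots)}")
    return roots[0]


def dependents_of(heads: List[int]) -> Dict[int, List[int]]:
    """Map each governor index to the indices of the tokens it governs."""
    children: Dict[int, List[int]] = {i: [] for i in range(len(heads))}
    for i, h in enumerate(heads):
        if h != i:  # skip the root's self-loop
            children[h].append(i)
    return children


# Illustrative annotation (an assumption): 是 is the core verb and root.
tokens = ["小红", "是", "小绿", "的", "妻子"]
heads = [1, 1, 4, 2, 1]

root = find_root(heads)
children = dependents_of(heads)
print("root:", tokens[root])
print("governed by root:", [tokens[i] for i in children[root]])
```

The root verb governs the subject and the relation noun directly, and the remaining tokens hang off those, which is the tree shape the paragraph above describes.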
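To make the later model training easier, the training set now needs to be annotated. The annotation relations for Chinese dependency parsing are as follows:

Taking 房祖名是成龙的儿子 as an example, the sentence needs to be annotated in the following form:

Build TRAIN_DATA as follows (again using 房祖名是成龙的儿子 as the example):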

TRAIN_DATA = [
    ("房祖名 是 成龙 的 儿子", {
        'heads': [1, 1, 4, 3, 1],
        'deps': ['nsubj', 'ROOT', 'amod', 'prep', 'pobj'],
    }),
]
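Because `heads` and `deps` must line up one-to-one with the space-separated tokens, a quick sanity check before training can catch annotation slips. The following is a minimal sketch, not part of the original code; `check_annotation` is a hypothetical helper, and it only checks lengths, index ranges, and the presence of a `ROOT` label.

```python
def check_annotation(text, annotations):
    """Return a list of problems found in one (text, annotations) pair."""
    tokens = text.split()  # the training text is pre-segmented with spaces
    heads = annotations.get("heads", [])
    deps = annotations.get("deps", [])
    problems = []
    if len(heads) != len(tokens):
        problems.append(f"{len(tokens)} tokens but {len(heads)} heads")
    if len(deps) != len(tokens):
        problems.append(f"{len(tokens)} tokens but {len(deps)} deps")
    for i, h in enumerate(heads):
        if not 0 <= h < len(tokens):
            problems.append(f"head {h} of token {i} is out of range")
    if "ROOT" not in deps:
        problems.append("no ROOT label")
    return problems


sample = ("房祖名 是 成龙 的 儿子",
          {"heads": [1, 1, 4, 3, 1],
           "deps": ["nsubj", "ROOT", "amod", "prep", "pobj"]})
broken = ("房祖名 是 成龙 的 儿子",
          {"heads": [1, 1, 7], "deps": ["nsubj", "amod"]})

print(check_annotation(*sample))
print(check_annotation(*broken))
```

Running the check over every entry in TRAIN_DATA before training is cheap and avoids harder-to-read errors later in the pipeline.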

4. Training the model

4.1 Importing packages

Import the required packages.

from __future__ import unicode_literals, print_function
import plac
import random
from pathlib import Path
import spacy
from spacy.training import Example
from spacy.pipeline.dep_parser import DEFAULT_PARSER_MODEL

4.2 Annotating the model parameters (language, output directory, and number of training iterations)

@plac.annotations(
    model=("Model name. Defaults to blank 'en' model.", "option", "m", str),
    output_dir=("Optional output directory", "option", "o", Path),
    n_iter=("Number of training iterations", "option", "n", int),
)

4.3 Building the model

def main(model=None, output_dir='./result', n_iter=100):
    """Load the model, set up the pipeline and train the parser."""
    if model is not None:
        nlp = spacy.load(model)  # load existing spaCy model
        print("Loaded model '%s'" % model)
    else:
        nlp = spacy.blank('en')  # create blank Language class
        print("Created blank 'en' model")

    # add the parser to the pipeline if it doesn't exist;
    # in spaCy v3, nlp.add_pipe takes the component name and returns the component
    if 'parser' not in nlp.pipe_names:
        config = {'model': DEFAULT_PARSER_MODEL}
        parser = nlp.add_pipe('parser', first=True, config=config)
    # otherwise, get it, so we can add labels to it
    else:
        parser = nlp.get_pipe('parser')

    # add labels to the parser
    for _, annotations in TRAIN_DATA:
        for dep in annotations.get('deps', []):
            parser.add_label(dep)

    # get names of other pipes to disable them during training
    other_pipes = [pipe for pipe in nlp.pipe_names if pipe != 'parser']
    with nlp.disable_pipes(*other_pipes):  # only train parser
        optimizer = nlp.begin_training()
        for itn in range(n_iter):
            random.shuffle(TRAIN_DATA)
            losses = {}
            for text, annotations in TRAIN_DATA:
                example = Example.from_dict(nlp.make_doc(text), annotations)
                nlp.update([example], sgd=optimizer, losses=losses)
            print(losses)

    # test the trained model; the text is space-separated like the training
    # data, because the blank 'en' tokenizer does not segment Chinese
    test_text = "王祖蓝 是 陈奕迅 的 弟弟"
    doc = nlp(test_text)
    print('Dependencies', [(t.text, t.dep_, t.head.text) for t in doc])

    # save model to output directory
    if output_dir is not None:
        output_dir = Path(output_dir)
        if not output_dir.exists():
            output_dir.mkdir()
        nlp.to_disk(output_dir)
        print("Saved model to", output_dir)

        # test the saved model
        print("Loading from", output_dir)
        nlp2 = spacy.load(output_dir)
        doc = nlp2(test_text)
        print('Dependencies', [(t.text, t.dep_, t.head.text) for t in doc])


if __name__ == '__main__':
    plac.call(main)

5. Testing the model

Load the saved model and test it.

from __future__ import unicode_literals, print_function
import spacy


def test_model(nlp):
    texts = [
        "小红 是 小绿 的 妻子",
        "房祖名 和 成龙 的 关系 是 父子",
        "张晓龙 和 张佳宁 的 舅舅",
        "陈奕迅 的 弟弟 是 王祖蓝",
        "房祖名 和 成龙 的 关系 是",
        "小风 是 小火 的 女儿",
        "冯绍峰 的 妻子 是 赵丽颖",
    ]
    docs = nlp.pipe(texts)
    for doc in docs:
        print(doc.text)
        print([(t.text, t.dep_, t.head.text) for t in doc if t.dep_ != '-'])


if __name__ == '__main__':
    nlp = spacy.load('./result')
    test_model(nlp)
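The `(text, dep, head)` triples that `test_model` prints can be turned into an explicit relationship with a small amount of post-processing. The sketch below is one possible approach, not the original author's method: it assumes the labeling scheme of the single training example (`nsubj` for the first person, `pobj` for the relation noun, `amod` for the person attached to that noun), and `extract_relation` is a hypothetical helper. It works on plain tuples, so it does not need a trained model to try out.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]  # (token text, dependency label, head text)


def extract_relation(triples: List[Triple]) -> Optional[Tuple[str, str, str]]:
    """Return (person, relation, other_person), or None if the pattern is absent.

    Assumes the label scheme of the training example above:
    nsubj = first person, pobj = relation noun, amod = person modifying it.
    """
    subject = next((t for t, dep, _ in triples if dep == "nsubj"), None)
    relation = next((t for t, dep, _ in triples if dep == "pobj"), None)
    if subject is None or relation is None:
        return None
    other = next(
        (t for t, dep, head in triples if dep == "amod" and head == relation),
        None,
    )
    if other is None:
        return None
    return (subject, relation, other)


# Triples written by hand to mirror the annotated training sentence.
parsed = [
    ("房祖名", "nsubj", "是"),
    ("是", "ROOT", "是"),
    ("成龙", "amod", "儿子"),
    ("的", "prep", "的"),
    ("儿子", "pobj", "是"),
]
print(extract_relation(parsed))
```

Sentences with a different word order, such as 陈奕迅 的 弟弟 是 王祖蓝, would need their own pattern, so this should be read as a starting point rather than a complete extractor.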

The test results are as follows:
