Phrase tags (17)

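| Tag | English name | Description (from the Chinese gloss) |
|---|---|---|
| ADJP | Adjective phrase | Adjective phrase, projected from JJ |
| ADVP | Adverbial phrase headed by AD | Adverb phrase or adverbial introduced by an adverb |
| CLP | Classifier phrase | Measure-word (classifier) phrase |
| CP | Clause headed by C (complementizer) | Clause introduced by a complementizer: complement clauses and relative clauses |
| DNP | Phrase formed by "XP+DEG" | Phrase built from the XP+DEG structure |
| DP | Determiner phrase | Determiner phrase |
| DVP | Phrase formed by "XP+DEV" | Phrase built from the XP+DEV structure |
| FRAG | Fragment | Fragment |
| IP | Inflectional phrase | Simple clause headed by I (INFL or another inflectional element) |
| LCP | Phrase formed by "XP+LC" | Phrase headed by a localizer |
| LST | List marker | List-marker phrase used in explanatory enumerations |
| NP | Noun phrase | Noun phrase |
| PP | Preposition phrase | Prepositional phrase |
| PRN | Parenthetical | Parenthetical expression |
| QP | Quantifier phrase | Numeral phrase, built from numerals and measure words |
| UCP | Unidentical coordination phrase | Coordination of constituents of unlike categories |
| VP | Verb phrase | Verb phrase |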

Verb compound tags (6)

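VCD: Coordinated verb compound, e.g. (VCD (VV 投资) (VV 办厂))
VCP: Verb compound of the form VV + VC (verb + copula 是)
VNV: A-not-A (A不A) or A-one-A (A一A) compound, e.g. (VNV (VV 能) (AD 不) (VV 能))
VPT: Potential compound of the form V-得-R or V-不-R, e.g. (VPT (VV 得) (AD 不) (VV 到))
VRD: Verb-resultative compound in which the second component is the result of the first, e.g. (VRD (VV 呈现) (VV 出)); (VP (VRD (VV 联合) (VV 起来)))
VSB: Modifier + head compound; the first component is an intransitive verb, and no adjunct or aspect marker comes between the two components, e.g. (VSB (VV 加速) (VV 建设)); (VP (VSB (VV 仰头) (VV 望去)))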
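The bracketed examples above are ordinary parenthesized trees, so they can be read with a very small recursive parser. The following is a minimal sketch, assuming balanced parentheses and whitespace-separated tokens; `BracketTree` and its methods are hypothetical names for illustration and are not part of CoreNLP.

```java
import java.util.ArrayList;
import java.util.List;

/** Minimal reader for bracketed trees such as "(VRD (VV w1) (VV w2))". Hypothetical helper, not CoreNLP API. */
class BracketTree {
    final String label;                                   // phrase tag, POS tag, or the word itself for a leaf
    final List<BracketTree> children = new ArrayList<>();

    BracketTree(String label) { this.label = label; }

    boolean isLeaf() { return children.isEmpty(); }

    /** Parses one bracketed tree; assumes balanced parentheses. */
    static BracketTree parse(String text) {
        // Pad parentheses with spaces so that splitting on whitespace yields clean tokens.
        String[] tokens = text.replace("(", " ( ").replace(")", " ) ").trim().split("\\s+");
        return parseNode(tokens, new int[]{0});
    }

    private static BracketTree parseNode(String[] tokens, int[] pos) {
        if (!tokens[pos[0]].equals("(")) {
            return new BracketTree(tokens[pos[0]++]);     // a bare token is a leaf (the word)
        }
        pos[0]++;                                         // consume "("
        BracketTree node = new BracketTree(tokens[pos[0]++]);
        while (!tokens[pos[0]].equals(")")) {
            node.children.add(parseNode(tokens, pos));
        }
        pos[0]++;                                         // consume ")"
        return node;
    }

    /** Collects the words under this node, left to right. */
    List<String> words() {
        List<String> out = new ArrayList<>();
        collect(this, out);
        return out;
    }

    private static void collect(BracketTree t, List<String> out) {
        if (t.isLeaf()) { out.add(t.label); return; }
        for (BracketTree c : t.children) collect(c, out);
    }

    public static void main(String[] args) {
        // Pinyin stands in for the Chinese words so the source stays ASCII-only.
        BracketTree t = parse("(VP (VRD (VV lianhe) (VV qilai)))");
        System.out.println(t.label + " -> " + t.children.get(0).label + " " + t.words());
    }
}
```

Applied to (VP (VRD (VV 联合) (VV 起来))), the root label is VP, its only child is the VRD compound, and `words()` returns the two verbs in order.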

NP

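A phrase whose head is a noun. From a grammatical point of view it has two senses: (1) a phrase defined by its syntactic role, for example a chunk that acts as subject or object in the sentence, in which case an auxiliary function tag can be added, such as NP-Sbj or NP-Obj; (2) an entity or attribute in a knowledge base, in which case the chunk is called a baseNP.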
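When labels carry an auxiliary function tag in the form described above (category, hyphen, function), it is often convenient to separate the two parts before looking anything up by category. A minimal sketch, assuming the hyphen-separated format shown above; `ChunkLabel` is a hypothetical name used only for illustration.

```java
/** Splits a label such as "NP-Sbj" into category and function tag. Hypothetical helper. */
class ChunkLabel {
    final String category;   // e.g. "NP"
    final String function;   // e.g. "Sbj", or "" when the label has no function tag

    ChunkLabel(String label) {
        int dash = label.indexOf('-');
        if (dash < 0) {
            category = label;
            function = "";
        } else {
            category = label.substring(0, dash);
            function = label.substring(dash + 1);
        }
    }

    public static void main(String[] args) {
        ChunkLabel l = new ChunkLabel("NP-Sbj");
        System.out.println(l.category + " / " + l.function);
    }
}
```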

VP

A semantic chunk centered on a verb, together with its modifying, restricting, and coordinated constituents.

Source code in CoreNLP

nonTerminalInfo.put("ROOT", new String[][]{{left, "IP"}});
nonTerminalInfo.put("PAIR", new String[][]{{left, "IP"}});

// Major syntactic categories
nonTerminalInfo.put("ADJP", new String[][]{{left, "JJ", "ADJP"}}); // there is one ADJP unary rewrite to AD but otherwise all have JJ or ADJP
nonTerminalInfo.put("ADVP", new String[][]{{left, "AD", "CS", "ADVP", "JJ"}}); // CS is a subordinating conjunctor, and there are a couple of ADVP->JJ unary rewrites
nonTerminalInfo.put("CLP", new String[][]{{right, "M", "CLP"}});
// nonTerminalInfo.put("CP", new String[][]{{left, "WHNP", "IP", "CP", "VP"}}); // this is complicated; see bracketing guide p. 34. Actually, all WHNP are empty. IP/CP seems to be the best semantic head; syntax would dictate DEC/ADVP. Using IP/CP/VP/M is INCREDIBLY bad for Dep parser - lose 3% absolute.
nonTerminalInfo.put("CP", new String[][]{{right, "DEC", "WHNP", "WHPP"}, rightExceptPunct}); // the (syntax-oriented) right-first head rule
// nonTerminalInfo.put("CP", new String[][]{{right, "DEC", "ADVP", "CP", "IP", "VP", "M"}}); // the (syntax-oriented) right-first head rule
nonTerminalInfo.put("DNP", new String[][]{{right, "DEG", "DEC"}, rightExceptPunct}); // according to tgrep2, first preparation, all DNPs have a DEG daughter
nonTerminalInfo.put("DP", new String[][]{{left, "DT", "DP"}}); // there's one instance of DP adjunction
nonTerminalInfo.put("DVP", new String[][]{{right, "DEV", "DEC"}}); // DVP always has DEV under it
nonTerminalInfo.put("FRAG", new String[][]{{right, "VV", "NN"}, rightExceptPunct}); // FRAG seems only to be used for bits at the beginnings of articles: "Xinwenshe<DATE>" and "(wan)"
nonTerminalInfo.put("INTJ", new String[][]{{right, "INTJ", "IJ", "SP"}});
nonTerminalInfo.put("IP", new String[][]{{left, "VP", "IP"}, rightExceptPunct}); // CDM July 2010 following email from Pi-Chuan changed preference to VP over IP: IP can be -SBJ, -OBJ, or -ADV, and shouldn't be head
nonTerminalInfo.put("LCP", new String[][]{{right, "LC", "LCP"}}); // there's a bit of LCP adjunction
nonTerminalInfo.put("LST", new String[][]{{right, "CD", "PU"}}); // covers all examples
nonTerminalInfo.put("NP", new String[][]{{right, "NN", "NR", "NT", "NP", "PN", "CP"}}); // Basic heads are NN/NR/NT/NP; PN is pronoun. Some NPs are nominalized relative clauses without overt nominal material; these are NP->CP unary rewrites. Finally, note that this doesn't give any special treatment of coordination.
nonTerminalInfo.put("PP", new String[][]{{left, "P", "PP"}}); // in the manual there's an example of VV heading PP but I couldn't find such an example with tgrep2
// cdm 2006: PRN changed to not choose punctuation. Helped parsing (if not significantly)
// nonTerminalInfo.put("PRN", new String[][]{{left, "PU"}}); // presumably left/right doesn't matter
nonTerminalInfo.put("PRN", new String[][]{{left, "NP", "VP", "IP", "QP", "PP", "ADJP", "CLP", "LCP"}, {rightdis, "NN", "NR", "NT", "FW"}});
// cdm 2006: QP: add OD -- occurs some; occasionally NP, NT, M; parsing performance no-op
nonTerminalInfo.put("QP", new String[][]{{right, "QP", "CLP", "CD", "OD", "NP", "NT", "M"}}); // there's some QP adjunction
// add OD?
nonTerminalInfo.put("UCP", new String[][]{{left,}}); // an alternative would be "PU","CC"
nonTerminalInfo.put("VP", new String[][]{{left, "VP", "VCD", "VPT", "VV", "VCP", "VA", "VC", "VE", "IP", "VSB", "VCP", "VRD", "VNV"}, leftExceptPunct}); // note that ba and long bei introduce IP-OBJ small clauses; short bei introduces VP
// add BA, LB, as needed

// verb compounds
nonTerminalInfo.put("VCD", new String[][]{{left, "VCD", "VV", "VA", "VC", "VE"}}); // could easily be right instead
nonTerminalInfo.put("VCP", new String[][]{{left, "VCD", "VV", "VA", "VC", "VE"}}); // not much info from documentation
nonTerminalInfo.put("VRD", new String[][]{{left, "VCD", "VRD", "VV", "VA", "VC", "VE"}}); // definitely left
nonTerminalInfo.put("VSB", new String[][]{{right, "VCD", "VSB", "VV", "VA", "VC", "VE"}}); // definitely right, though some examples look questionably classified (na2lai2 zhi1fu4)
nonTerminalInfo.put("VNV", new String[][]{{left, "VV", "VA", "VC", "VE"}}); // left/right doesn't matter
nonTerminalInfo.put("VPT", new String[][]{{left, "VV", "VA", "VC", "VE"}}); // activity verb is to the left

// some POS tags apparently sit where phrases are supposed to be
nonTerminalInfo.put("CD", new String[][]{{right, "CD"}});
nonTerminalInfo.put("NN", new String[][]{{right, "NN"}});
nonTerminalInfo.put("NR", new String[][]{{right, "NR"}});

// I'm adding these POS tags to do primitive morphology for character-level
// parsing. It shouldn't affect anything else because heads of preterminals are not
// generally queried - GMA
nonTerminalInfo.put("VV", new String[][]{{left}});
nonTerminalInfo.put("VA", new String[][]{{left}});
nonTerminalInfo.put("VC", new String[][]{{left}});
nonTerminalInfo.put("VE", new String[][]{{left}});

// new for ctb6.
nonTerminalInfo.put("FLR", new String[][]{rightExceptPunct});

// new for CTB9
nonTerminalInfo.put("DFL", new String[][]{rightExceptPunct});
nonTerminalInfo.put("EMO", new String[][]{leftExceptPunct}); // left/right doesn't matter
nonTerminalInfo.put("INC", new String[][]{leftExceptPunct});
nonTerminalInfo.put("INTJ", new String[][]{leftExceptPunct});
nonTerminalInfo.put("OTH", new String[][]{leftExceptPunct});
nonTerminalInfo.put("SKIP", new String[][]{leftExceptPunct});
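Each entry in the listing above pairs a parent category with a search direction and a list of candidate head categories. To make that concrete, here is a simplified sketch of how one such rule could be applied to pick the head child. It assumes the categories are tried in priority order, that each category is searched from the stated side, and that the outermost child on that side is the fallback. `HeadRuleSketch` and `headIndex` are hypothetical names; this is an illustration, not CoreNLP's implementation, and it ignores multi-rule entries, `rightdis`, and the punctuation-skipping defaults.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified head-rule lookup; an illustration only, not CoreNLP's implementation. */
class HeadRuleSketch {
    static final String LEFT = "left";
    static final String RIGHT = "right";

    // A few rules copied from the listing: {direction, categories in priority order}.
    static final Map<String, String[]> RULES = new HashMap<>();
    static {
        RULES.put("ADJP", new String[]{LEFT, "JJ", "ADJP"});
        RULES.put("NP", new String[]{RIGHT, "NN", "NR", "NT", "NP", "PN", "CP"});
        RULES.put("PP", new String[]{LEFT, "P", "PP"});
        RULES.put("VRD", new String[]{LEFT, "VCD", "VRD", "VV", "VA", "VC", "VE"});
        RULES.put("VSB", new String[]{RIGHT, "VCD", "VSB", "VV", "VA", "VC", "VE"});
    }

    /** Returns the index of the head child, or -1 when there are no children. */
    static int headIndex(String parent, List<String> childLabels) {
        int n = childLabels.size();
        if (n == 0) return -1;
        String[] rule = RULES.get(parent);
        if (rule == null) return 0;                       // assumption: leftmost child when no rule exists
        boolean fromLeft = rule[0].equals(LEFT);
        for (int r = 1; r < rule.length; r++) {           // categories in priority order
            if (fromLeft) {
                for (int i = 0; i < n; i++) {
                    if (childLabels.get(i).equals(rule[r])) return i;
                }
            } else {
                for (int i = n - 1; i >= 0; i--) {
                    if (childLabels.get(i).equals(rule[r])) return i;
                }
            }
        }
        return fromLeft ? 0 : n - 1;                      // assumption: fall back to the outermost child
    }

    public static void main(String[] args) {
        System.out.println(headIndex("VRD", Arrays.asList("VV", "VV")));
        System.out.println(headIndex("VSB", Arrays.asList("VV", "VV")));
        System.out.println(headIndex("NP", Arrays.asList("DP", "NN", "NN")));
    }
}
```

Under these assumptions the two verb compounds behave as the source comments say: VRD takes its left verb as head ("definitely left") and VSB its right verb ("definitely right"). Priority order also matters: for an NP with children NN and CP, the NN wins even though CP is further right, because NN comes first in the category list.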
