Baidu ERNIE-Gram: Pre-Training with Explicitly N-Gram Masked Language Modeling

2024-04-16 01:16:15

Contents

Introduction

Motivation

Main work

3.2 Explicitly N-gram Masked Language Modeling

3.3 Comprehensive N-gram Prediction

3.4 Enhanced N-gram Relation Modeling

Experimental results

Ablation studies

Effect of Explicitly N-gram MLM

Size of N-gram Lexicon

Effect of Comprehensive N-gram Prediction and Enhanced N-gram Relation Modeling

Thoughts and summary

Introduction

ERNIE-Gram is an explicitly n-gram masking and predicting method that eliminates the limitations of previous contiguous masking strategies and incorporates coarse-grained linguistic information into pre-training more fully. ERNIE-Gram also conducts comprehensive n-gram prediction and relation modeling to further enhance the learning of semantic n-grams during pre-training.

Motivation

BERT’s MLM focuses on the representations of fine-grained text units (e.g. words or subwords in English and characters in Chinese) and rarely considers coarse-grained linguistic information (e.g. named entities or phrases in English and words in Chinese), which leads to inadequate representation learning.
Many efforts have been devoted to integrating coarse-grained semantic information by independently masking and predicting contiguous sequences of n tokens, namely n-grams, such as named entities, phrases (Sun et al., 2019b), and whole words.
We argue that such contiguous masking strategies are less effective and reliable, since the predictions of the tokens in a masked n-gram are independent of each other, which neglects the intra-dependencies of n-grams.

Main work

3.2 Explicitly N-gram Masked Language Modeling

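As shown in Figure 1(a) above: earlier contiguous MLM ignores the dependencies between the tokens inside an n-gram. At prediction time the tokens of a masked n-gram are predicted independently of one another, so the loss is a sum of separate per-token terms.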
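A sketch of this loss in notation, reconstructed from the description above rather than taken from the post. The symbols are assumptions chosen for illustration: `z_M` is the set of masked n-grams, `z_{\setminus M}` is the sequence after masking, and `x` is a token inside a masked n-gram `z`.

```latex
-\log p_\theta(z_M \mid z_{\setminus M})
  = -\sum_{z \in z_M} \sum_{x \in z} \log p_\theta(x \mid z_{\setminus M})
```

The inner sum runs over individual tokens, which is exactly where the independence between tokens of the same n-gram enters.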

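As shown in Figure 1(b) above: explicitly n-gram MLM treats each n-gram as a single unit (one token), which requires an additional n-gram lexicon. Prediction then happens at a single position, so the loss has one term per masked n-gram.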
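A sketch of this loss in notation, again reconstructed from the description and not taken from the post. The symbols are assumptions: `y_M` is the set of identities (lexicon entries) of the masked n-grams, and `\bar{z}_{\setminus M}` is the sequence in which each masked n-gram has been replaced by a single mask position.

```latex
-\log p_\theta(y_M \mid \bar{z}_{\setminus M})
  = -\sum_{y \in y_M} \log p_\theta(y \mid \bar{z}_{\setminus M})
```

Compared with the per-token form, there is no inner sum over tokens: each n-gram is one classification over the n-gram lexicon.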

3.3 Comprehensive N-gram Prediction

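Going a step further, this work predicts both the n-gram as a whole and each token inside it at the same time. The authors carefully designed the mask matrix for this; see the original paper for details.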
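The idea of predicting an n-gram both as a whole and token by token can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `build_cnp_example`, the toy lexicon, and the `[MASK]` string are assumptions, and the paper's mask matrix is left out entirely.

```python
MASK = "[MASK]"


def build_cnp_example(tokens, span, lexicon):
    """Build one training example with a coarse and several fine targets.

    tokens  : list of str, the input sequence
    span    : (start, end) half-open index range of the n-gram to mask
    lexicon : dict mapping an n-gram string to its id (hypothetical toy lexicon)
    """
    start, end = span
    ngram_tokens = tokens[start:end]
    ngram = " ".join(ngram_tokens)
    if ngram not in lexicon:
        raise KeyError(f"n-gram {ngram!r} is not in the lexicon")

    # The whole n-gram collapses into a single masked position.
    masked = tokens[:start] + [MASK] + tokens[end:]

    return {
        "input": masked,
        "coarse_target": lexicon[ngram],  # n-gram id, predicted as one unit
        "fine_targets": ngram_tokens,     # inner tokens, predicted one by one
    }


example = build_cnp_example(
    ["he", "reads", "the", "new", "york", "times", "daily"],
    span=(3, 6),
    lexicon={"new york times": 0, "new york": 1},
)
```

Here `coarse_target` corresponds to the explicit n-gram objective and `fine_targets` to the per-token objective; a real model would also need the attention masking that the paper designs so that the two kinds of prediction do not leak information to each other.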

3.4 Enhanced N-gram Relation Modeling

To explicitly learn the semantic relationships between n-grams, we jointly pre-train a small generator model θ′ with explicitly n-gram MLM objective to sample plausible n-gram identities. Then we employ the generated identities to perform masking and train the standard model θ to predict the original n-grams from fake ones in coarse-grained and fine-grained manners, as shown in Figure 3(a), which is efficient to model the pair relationships between similar n-grams.
This models the relationships between n-grams and borrows part of the idea from ELECTRA.
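The replacement step can be sketched roughly as below. This is a toy under stated assumptions, not the paper's code: the generator is stood in for by a fixed probability table (`generator_probs`), and the function name `replace_with_sampled_ngram` is hypothetical.

```python
import random


def replace_with_sampled_ngram(tokens, span, generator_probs, rng):
    """Swap the n-gram at `span` for one sampled from a generator distribution.

    generator_probs : dict mapping candidate n-gram strings to probabilities;
                      a stand-in for the small generator's output (assumption)
    rng             : random.Random instance, for reproducible sampling
    """
    start, end = span
    original = tokens[start:end]
    candidates = list(generator_probs)
    weights = [generator_probs[c] for c in candidates]
    sampled = rng.choices(candidates, weights=weights, k=1)[0]

    # The sampled n-gram may have a different length from the original one.
    corrupted = tokens[:start] + sampled.split() + tokens[end:]
    # The standard model is trained to recover `original` from `corrupted`.
    return corrupted, original, sampled


rng = random.Random(0)
corrupted, target, sampled = replace_with_sampled_ngram(
    ["he", "reads", "the", "new", "york", "times", "daily"],
    span=(3, 6),
    generator_probs={
        "washington post": 0.6,
        "new york times": 0.3,
        "guardian": 0.1,
    },
    rng=rng,
)
```

The point of sampling from a generator instead of inserting a plain mask is that the replacement is a plausible neighbour of the original n-gram, so recovering the original forces the model to tell similar n-grams apart.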

Experimental results

ERNIE-Gram fairly consistently outperforms the pre-trained models it is compared with.

Ablation studies

Effect of Explicitly N-gram MLM

The gain of explicitly n-gram MLM over contiguous MLM is not as large as one might expect, around 0.5.

Size of N-gram Lexicon

Effect of Comprehensive N-gram Prediction and Enhanced N-gram Relation Modeling

Enhanced n-gram relation modeling (ENRM) appears to have a larger effect than comprehensive n-gram prediction (CNP).

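Thoughts and summary

The work as a whole feels fairly complex. Getting a real improvement and climbing the leaderboards is clearly not easy, but the approach does not feel very smooth to me; the greatest truths are simple.
In earlier related projects I had no good way of handling n-grams or spans either (I considered enlarging the vocabulary to include words). I did not expect that plain contiguous MLM would already be effective, yet the gain of explicitly n-gram MLM over contiguous MLM is smaller than I had imagined (I was too naive). As a side note, this also suggests that character-level processing performs reasonably well.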
