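Recently I came across an interesting idea on Twitter: suppose research papers were published on GitHub, and each follow-up paper was submitted as a diff against the original. In AI and machine learning, information overload has long been a problem, with a large number of new papers appearing every month. Presenting the literature as a commit history could be a refreshing way to see what each paper actually changed.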

So let's borrow some of BERT's star power and see what this would look like when applied to the BERT family of papers.

commit arXiv:1810.04805
Author: Devlin et al.
Date: Thu Oct 11 00:50:01 2018 +0000
Initial Commit: BERT
-Transformer Decoder
+Masked Language Modeling
+Next Sentence Prediction
+WordPiece 30K

commit arXiv:1901.07291
Author: Lample et al.
Date: Tue Jan 22 13:22:34 2019 +0000
Cross-lingual Language Model Pretraining
+Translation Language Modeling (TLM)
+Causal Language Modeling (CLM)

commit arXiv:1906.08237
Author: Yang et al.
Date: Wed Jun 19 17:35:48 2019 +0000
XLNet: Generalized Autoregressive Pretraining for Language Understanding
-Masked Language Modeling
-BERT Transformer
+Permutation Language Modeling
+Transformer-XL
+Two-stream self-attention

commit arXiv:1907.10529
Author: Joshi et al.
Date: Wed Jul 24 15:43:40 2019 +0000
SpanBERT: Improving Pre-training by Representing and Predicting Spans
-Random Token Masking
-Next Sentence Prediction
-Bi-sequence Training
+Continuous Span Masking
+Span-Boundary Objective (SBO)
+Single-Sequence Training

commit arXiv:1907.11692
Author: Liu et al.
Date: Fri Jul 26 17:48:29 2019 +0000
RoBERTa: A Robustly Optimized BERT Pretraining Approach
-Next Sentence Prediction
-Static Masking of Tokens
+Dynamic Masking of Tokens
+Byte Pair Encoding (BPE) 50K
+Large batch size
+CC-NEWS dataset

commit arXiv:1908.10084
Author: Reimers et al.
Date: Tue Aug 27 08:50:17 2019 +0000
Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks
+Siamese Network Structure
+Finetuning on SNLI and MNLI

commit arXiv:1909.11942
Author: Lan et al.
Date: Thu Sep 26 07:06:13 2019 +0000
ALBERT: A Lite BERT for Self-supervised Learning of Language Representations
-Next Sentence Prediction
+Sentence Order Prediction
+Cross-layer Parameter Sharing
+Factorized Embeddings

commit arXiv:1910.01108
Author: Sanh et al.
Date: Wed Oct 2 17:56:28 2019 +0000
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
-Next Sentence Prediction
-Token-Type Embeddings
-[CLS] pooling
+Knowledge Distillation
+Cosine Embedding Loss
+Dynamic Masking

commit arXiv:1911.03894
Author: Martin et al.
Date: Sun Nov 10 10:46:37 2019 +0000
CamemBERT: a Tasty French Language Model
-BERT
-English
+RoBERTa
+French OSCAR dataset (138GB)
+Whole-word Masking (WWM)
+SentencePiece Tokenizer

commit arXiv:1912.05372
Author: Le et al.
Date: Wed Dec 11 14:59:32 2019 +0000
FlauBERT: Unsupervised Language Model Pre-training for French
-BERT
-English
+RoBERTa
+fastBPE
+Stochastic Depth
+French dataset (71GB)
+FLUE (French Language Understanding Evaluation) benchmark
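
The entries above all share one layout: a `commit` line with the arXiv ID, the author, the date, the title, then `-` lines for what a paper dropped and `+` lines for what it introduced. As a minimal sketch of how such a log could be generated from structured data, the Python below renders one entry. The `PaperCommit` class and `format_commit` function are hypothetical names made up for this illustration and are not part of any existing tool.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PaperCommit:
    """One paper described as a commit (hypothetical structure for illustration)."""
    arxiv_id: str
    author: str
    date: str
    title: str
    removed: List[str] = field(default_factory=list)  # ideas the paper drops
    added: List[str] = field(default_factory=list)    # ideas the paper introduces


def format_commit(paper: PaperCommit) -> str:
    """Render a paper in the same layout as the entries above."""
    lines = [
        f"commit arXiv:{paper.arxiv_id}",
        f"Author: {paper.author}",
        f"Date: {paper.date}",
        paper.title,
    ]
    # Removals come first, then additions, mirroring the log above.
    lines += [f"-{item}" for item in paper.removed]
    lines += [f"+{item}" for item in paper.added]
    return "\n".join(lines)


# Data copied from the ALBERT entry above.
albert = PaperCommit(
    arxiv_id="1909.11942",
    author="Lan et al.",
    date="Thu Sep 26 07:06:13 2019 +0000",
    title="ALBERT: A Lite BERT for Self-supervised Learning of Language Representations",
    removed=["Next Sentence Prediction"],
    added=[
        "Sentence Order Prediction",
        "Cross-layer Parameter Sharing",
        "Factorized Embeddings",
    ],
)

print(format_commit(albert))
```

Keeping removals and additions as separate lists makes it easy to add a new paper without touching the formatting logic.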
