《Improving the Robustness of Question Answering Systems to Question Paraphrasing》

National University of Singapore

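This paper addresses the robustness of question answering models to question paraphrasing. It introduces two test sets and shows experimentally that retraining with data augmentation achieves fairly good results on both of them.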

Motivation:


Method:

Train a model, the Paraphrase-Guided Paraphrasing Network:
(source question + paraphrase suggestion (manually constructed) -> target question (output))

[Figure: the model input after preprocessing]

Training corpora:
1. WikiAnswers paraphrase corpus: 22 million question pairs, of which 350,000 are selected
2. Quora Question Pairs dataset: 280,000 training examples

Once the model is trained, how are adversarial examples generated on SQuAD?

1. Extract all n-grams (up to 6-grams) from the source question and remove unigrams that are stopwords.
2. Search the paraphrase database PPDB for paraphrases of the remaining n-grams with an equivalence score above 0.25.
3. The results form a set of paraphrase suggestions that the model uses to generate paraphrased questions.
4. Use the pretrained model by Wieting and Gimpel (2018) to obtain a paraphrase similarity score for the generated paraphrases.
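
Steps 1 to 3 above can be sketched roughly as follows. This is a minimal illustration, not the paper's code: the stopword list, the `TOY_PPDB` dictionary (a stand-in for a real PPDB lookup), and all function names are assumptions made for this note.

```python
from typing import Dict, List, Tuple

# Tiny illustrative stopword list (assumption; the list used in the paper is not given here).
STOPWORDS = {"the", "a", "an", "of", "in", "is", "was", "what", "who", "to", "did"}

# Hypothetical stand-in for PPDB: phrase -> [(paraphrase, equivalence score)].
TOY_PPDB: Dict[str, List[Tuple[str, float]]] = {
    "invented": [("created", 0.61), ("made up", 0.12)],
    "the telephone": [("the phone", 0.48)],
}


def extract_ngrams(question: str, max_n: int = 6) -> List[str]:
    """Return all n-grams up to max_n, skipping unigrams that are stopwords."""
    tokens = question.lower().rstrip("?").split()
    ngrams = []
    for n in range(1, max_n + 1):
        for i in range(len(tokens) - n + 1):
            gram = tokens[i:i + n]
            if n == 1 and gram[0] in STOPWORDS:
                continue
            ngrams.append(" ".join(gram))
    return ngrams


def paraphrase_suggestions(
    question: str,
    ppdb: Dict[str, List[Tuple[str, float]]],
    min_score: float = 0.25,
) -> List[str]:
    """Collect paraphrases of the question's n-grams whose score exceeds min_score."""
    suggestions: List[str] = []
    for gram in extract_ngrams(question):
        for paraphrase, score in ppdb.get(gram, []):
            if score > min_score and paraphrase not in suggestions:
                suggestions.append(paraphrase)
    return suggestions


suggestions = paraphrase_suggestions("Who invented the telephone?", TOY_PPDB)
print(suggestions)
```

Each suggestion would then be paired with the source question and fed to the paraphrasing model, so k suggestions give up to k candidate paraphrases.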


A total of 1,062 paraphrased questions are produced.

Human evaluation:
78.1% of the generated paraphrases are judged to be semantically equivalent, and 78.6% are judged to be fluent.

[Figure: an example]

For adversarial (distracting) questions:

A natural adversarial example is generated by using words in the context near a wrong answer candidate of the same type.

We perform such paraphrasing manually by going through question and context pairs from the
SQuAD development set and re-writing the question

We create a total of 56 paraphrased questions

Fine-tuning:

1. Non-Adversarial Paraphrased Test Set
Uses the method described above, with no human involvement.
Paraphrases with a similarity score above 0.9 are kept, to create more diverse paraphrased questions as training data.
We randomly sample 25,000 paraphrased questions to be used as additional training data.
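
The threshold-then-sample step can be sketched as below. The function name, the fixed seed, and the example scores are assumptions for illustration; in the paper the scores come from the paraphrase similarity model, which is not reproduced here.

```python
import random
from typing import List, Tuple


def select_training_paraphrases(
    scored: List[Tuple[str, float]],
    min_similarity: float = 0.9,
    sample_size: int = 25000,
    seed: int = 0,
) -> List[str]:
    """Keep paraphrases scoring above the threshold, then sample at most sample_size."""
    kept = [question for question, score in scored if score > min_similarity]
    if len(kept) <= sample_size:
        return kept
    return random.Random(seed).sample(kept, sample_size)


# Made-up scores standing in for the output of a paraphrase similarity model.
scored_paraphrases = [
    ("Who created the telephone?", 0.95),
    ("Who phoned the inventor?", 0.41),
    ("Who was the inventor of the telephone?", 0.93),
]
selected = select_training_paraphrases(scored_paraphrases, sample_size=2)
print(selected)
```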

2. Adversarial Paraphrased Test Set
• We use Flair (Akbik et al., 2018), trained on the OntoNotes dataset[8], which contains 12 named entity classes, to label named entity classes.
• Extract sentences from the context containing named entities of the same type.
• Take the noun and verb phrases obtained by syntactic parsing as paraphrase suggestions; each suggestion must contain at least two words and must not overlap with the answer.
• Use the paraphrasing model to paraphrase the questions.
• Keep paraphrases with a paraphrase similarity score above 0.83; we want to allow context words that could be very different from the question words to appear in the generated paraphrase.
This gives an additional 25,000 paraphrased training examples.
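
The suggestion constraint (at least two words, no overlap with the answer) might be checked as in the sketch below. Treating "overlap" as shared lowercase tokens is an assumption of this note, and the function names and example phrases are hypothetical; the phrases would normally come from a syntactic parser.

```python
from typing import List


def is_valid_suggestion(phrase: str, answer: str) -> bool:
    """A suggestion needs at least two words and must share no token with the answer."""
    phrase_tokens = phrase.lower().split()
    answer_tokens = set(answer.lower().split())
    if len(phrase_tokens) < 2:
        return False
    return answer_tokens.isdisjoint(phrase_tokens)


def filter_suggestions(phrases: List[str], answer: str) -> List[str]:
    """Keep only the candidate phrases that satisfy both constraints."""
    return [p for p in phrases if is_valid_suggestion(p, answer)]


# Hypothetical noun and verb phrases taken from context sentences.
candidate_phrases = ["the French army", "in 1066", "retreated", "crossed the river"]
valid = filter_suggestions(candidate_phrases, answer="1066")
print(valid)
```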


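Applying the paper's data-processing method yields a robustness-oriented training set, which is then used as augmentation data to retrain the model. Tables 3, 4, 5, and 6 all show fairly large improvements, especially on the adversarial examples.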

Some related work, for reference:
Adversarial Examples for Question Answering
Jia and Liang[1] :
appending a distracting sentence to the end of a passage.
Drawback:
the adversarial examples created are unnatural and not
expected to be present in naturally occurring passages.

Some previous work used question paraphrasing to create more natural adversarial examples.
Ribeiro[2]:
use of back translation to obtain paraphrasing rules.
Rychalska[3]:
replaced the most important question word with a synonym from WordNet and ELMo embeddings

Neural Paraphrasing Networks

Besides single paraphrase generation, prior work has also recognized the value of generating multiple paraphrases.
Gupta[5]: used a variational autoencoder (VAE)

Xu[6] assumed that different paraphrasing styles used different rewriting patterns, which were represented as latent embeddings. These embeddings were used to augment the decoder’s hidden state to generate different paraphrases.

This paper's approach:
A more guided approach to generate diverse paraphrases,
Given k suggestions, our model is thus able to generate up to k paraphrased questions.

Paraphrasing as an Intermediate Task to Question Answering
Dong[7]:
The probability distribution of the answer was then generated for each paraphrased question, which was subsequently weighted by the score of each paraphrased question to compute the overall conditional probability
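
The weighting described above can be written compactly as follows. The notation is an assumption of this note: $q$ is the original question, $H_q$ is the set of candidate paraphrases (which may include $q$ itself), $p(q' \mid q)$ is the score assigned to paraphrase $q'$, and $p(a \mid q')$ is the answer distribution produced by the QA model for $q'$.

```latex
p(a \mid q) = \sum_{q' \in H_q} p(a \mid q') \, p(q' \mid q)
```

Every candidate paraphrase requires its own pass through the QA model, so the cost grows with the size of $H_q$.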
This is fairly time-consuming.
In this paper: we consider question paraphrasing as a separate task

References

[1] Adversarial examples for evaluating reading comprehension systems. EMNLP, 2017
[2] Semantically equivalent adversarial rules for debugging NLP models. ACL, 2018
[3] Are you tough enough? Framework for robustness validation of machine comprehension systems.
[4] Learning to paraphrase for question answering. EMNLP, 2017
[5] A deep generative framework for paraphrase generation. AAAI, 2018
[6] D-PAGE: Diverse paraphrase generation. 2018
[7] Learning to paraphrase for question answering. EMNLP, 2017
[8] https://catalog.ldc.upenn.edu/docs/LDC2013T19/OntoNotes-Release-5.0.pdf
