Table of Contents

  • Motivation
  • Intro
  • Contribution
  • MIMICS-Click
  • MIMICS-ClickExplore
  • MIMICS-Manual
  • Data Analysis
    • Question Template Analysis
    • Analyzing Engagement Based on Clarification Impression
    • Analysis Based on Query Length
    • Analysis Based on the Number of Candidate Answers
    • Analyzing Click Entropy Distribution on Candidate Answers (confused)

Original paper: MIMICS: A Large-Scale Data Collection for Search Clarification (arxiv.org)

Motivation

The research community still lacks a large-scale dataset for studying different aspects of search clarification.

Intro

  • each clarification in MIMICS consists of a clarifying question and up to five candidate answers

All the datasets presented in this paper contain only queries from the en-US market.

MIMICS-Click and MIMICS-ClickExplore are based on user interactions in Bing.

MIMICS-Manual is based on manual annotations of clarification panes by multiple trained annotators.

Contribution

The paper creates MIMICS, which consists of three datasets:

  • MIMICS-Click: includes over 400k unique queries with the associated clarification panes.
  • MIMICS-ClickExplore: is an exploration dataset and contains multiple clarification panes per query. It includes over 60k unique queries.
  • MIMICS-Manual: is a smaller dataset with manual annotations for clarifying questions, candidate answer sets, and the landing result page after clicking on individual candidate answers.
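Concretely, each record in these datasets pairs one query with one clarification pane plus its interaction signals. A minimal sketch of such a record; the field names mirror the columns described for the released TSV files, but the exact types here are our assumption:

```python
from dataclasses import dataclass
from typing import List

@dataclass
class QueryClarificationRecord:
    """One MIMICS record: a query plus its clarification pane.

    Field names mirror the released TSV columns (query, question,
    option_1..option_5, impression_level, engagement_level,
    option_cclick_1..option_cclick_5); the types are assumptions.
    """
    query: str                  # the submitted search query
    question: str               # the clarifying question shown on the SERP
    options: List[str]          # up to five candidate answers
    impression_level: str       # e.g., "low" / "medium" / "high"
    engagement_level: int       # clickthrough-based label (0 = no clicks)
    option_cclick: List[float]  # conditional click probability per answer
```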

MIMICS-Click

  • only kept the queries for which a clarification pane was rendered in the search engine result page (SERP).

  • the clarification panes were generated solely from the submitted queries; therefore they do not include session or personalization information

  • This resulted in 414,362 unique queries, each associated with exactly one clarification pane, of which 71,188 clarifications received positive clickthrough rates.
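That is, roughly 17.2% of the panes received clicks (71,188 / 414,362). A minimal pandas sketch of this check, assuming the released MIMICS-Click.tsv layout with an integer engagement_level column where 0 means the pane was never clicked:

```python
import pandas as pd

# Minimal sketch, assuming the released MIMICS-Click.tsv layout
# (tab-separated; engagement_level == 0 means no clicks).
df = pd.read_csv("MIMICS-Click.tsv", sep="\t")

positive = df[df["engagement_level"] > 0]
print(f"{len(df)} query-clarification pairs, "
      f"{len(positive)} with positive engagement "
      f"({len(positive) / len(df):.1%})")  # expected ≈ 17.2%
```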

MIMICS-ClickExplore

Although MIMICS-Click is an invaluable resource for learning to generate clarifications and related research problems, it does not allow researchers to study some tasks, such as click bias in user interactions with clarification.

  • we used the top-m clarifications generated by our algorithms and presented them to different sets of users. User interactions with multiple clarification panes for the same query in the same time period enable comparison of these clarification panes (see the pairwise sketch after this list)

  • The resulting dataset contains 64,007 unique queries and 168,921 query-clarification pairs, of which 89,441 query-clarification pairs received positive engagement.

  • Note that the sampling strategies for MIMICS-Click and MIMICS-ClickExplore are different, which resulted in significantly more query-clarification pairs with low impressions in MIMICS-Click.
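Since several panes were live for the same query in the same period, one natural use of this dataset (our own illustration, not a recipe from the paper) is mining pairwise preferences for learning to rank clarifications:

```python
from itertools import combinations

import pandas as pd

# Hypothetical sketch: turn MIMICS-ClickExplore into pairwise preferences.
# Assumes the released TSV layout with query, question, and an integer
# engagement_level column.
df = pd.read_csv("MIMICS-ClickExplore.tsv", sep="\t")

pairs = []  # (query, preferred question, less-preferred question)
for query, group in df.groupby("query"):
    for (_, a), (_, b) in combinations(group.iterrows(), 2):
        if a["engagement_level"] == b["engagement_level"]:
            continue  # no preference signal if both panes performed equally
        winner, loser = (a, b) if a["engagement_level"] > b["engagement_level"] else (b, a)
        pairs.append((query, winner["question"], loser["question"]))
```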

MIMICS-Manual

Clicks do not necessarily reflect all quality aspects. In addition, they can be biased for many reasons.

  • we randomly sampled queries from the query logs to collect manual annotations for a set of realistic user queries.
  • the same algorithm was then used to generate one or more clarification panes for each query
  • each query-clarification pair was assigned to at least three annotators

Step 1: the annotators were asked to skim and review a few pages of the search results returned by Bing

Step 2: each clarifying question is given a label of 2 (Good), 1 (Fair), or 0 (Bad); the candidate answers are not shown to the annotators at this stage

Step 3: the annotators were asked to judge the overall quality of the candidate answer set (again 2, 1, or 0)

Step 4: the annotators judged the quality of the landing SERP (the SERP shown after clicking an answer) for each individual candidate answer

  • Our annotations resulted in over 2.4k unique queries and over 2.8k query-clarification pairs

Note: when a generic template is used instead of a clarifying question (i.e., “select one to refine your search”), the annotators are not asked to provide a question quality label.
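The paper does not say how the three or more labels per pair are merged; a minimal majority-vote sketch (our own assumption) for aggregating the 0/1/2 labels:

```python
from collections import Counter
from statistics import median

def aggregate_labels(labels):
    """Merge per-annotator quality labels (0=Bad, 1=Fair, 2=Good).

    Majority vote, falling back to the median when no label has a strict
    majority. This aggregation rule is our assumption; the paper does
    not specify one.
    """
    label, freq = Counter(labels).most_common(1)[0]
    if freq > len(labels) / 2:
        return label
    return int(median(labels))

print(aggregate_labels([2, 2, 1]))  # -> 2 (majority)
print(aggregate_labels([0, 1, 2]))  # -> 1 (no majority; median)
```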

Data Analysis

Question Template Analysis

  • the last four templates (T4–T7), which are less frequent in the dataset and more specific, have led to higher engagement than T1, T2, and T3 in both MIMICS-Click and MIMICS-ClickExplore
  • the exploration dataset has higher average engagement than MIMICS-Click.
    • The reason is that the number of query-clarification pairs with zero engagement in MIMICS-Click is higher than in MIMICS-ClickExplore
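The template analysis reduces to grouping engagement by question template. MIMICS ships the question text rather than a template id, so the prefix mapping below is an illustrative stand-in, not the paper's exact T1–T7 taxonomy:

```python
import pandas as pd

# Minimal sketch of the engagement-by-template analysis. The prefix
# mapping is illustrative only; it is not the paper's template taxonomy.
TEMPLATE_PREFIXES = [
    ("select one to refine your search", "generic"),
    ("what do you want to know about", "T-what"),
    ("which", "T-which"),
]

def to_template(question) -> str:
    q = str(question).lower()
    for prefix, template_id in TEMPLATE_PREFIXES:
        if q.startswith(prefix):
            return template_id
    return "other"

df = pd.read_csv("MIMICS-Click.tsv", sep="\t")
df["template"] = df["question"].map(to_template)
print(df.groupby("template")["engagement_level"].mean())
```

Swapping the grouping key for impression_level, a query-length bucket, or the number of non-empty options reproduces the impression, query-length, and answer-count analyses below.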

Analyzing Engagement Based on Clarification Impression

MIMICS-Click and MIMICS-ClickExplore contain a three-level impression label (low, medium, or high) per query-clarification pair

  • there is a negligible difference in average engagement across impression levels

Analysis Based on Query Length

We study user engagement and manual quality labels with respect to query length

  • the average engagement increases as the queries get longer.

    • longer queries are often natural language questions, while short queries are keyword queries
  • this is inconsistent with the manual annotations, which suggest that single-word queries have higher question quality, answer set quality, and landing page quality

    • This observation suggests that user engagement with clarification is not necessarily aligned with the clarification quality

Analysis Based on the Number of Candidate Answers

  • there is only a small difference in average engagement across answer-set sizes in both MIMICS-Click and MIMICS-ClickExplore

    • The clarifications with three candidate answers have led to slightly higher engagement than the rest.
  • The clarifications with three candidate answers have the best question quality but the worst answer set quality
    • This highlights that question quality may play a key role in increasing user engagement

Analyzing Click Entropy Distribution on Candidate Answers (confused)

MIMICS-Click and MIMICS-ClickExplore both contain the conditional click probability for each individual candidate answer.

The entropy of this probability distribution shows how clicks are distributed across the candidate answers.

  • the number of peaks in the entropy distribution is aligned with the number of candidate answers.

    • The entropy values at which the histogram peaks suggest that in many cases clicks follow a near-uniform distribution over m out of the n candidate answers (see the entropy sketch below)
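The peak structure becomes concrete once you note that a uniform click split over m answers has entropy log(m), so the histogram peaks near log(2), log(3), log(4), and log(5): one peak per possible number of effectively clicked answers. A minimal sketch, assuming the option_cclick_1..5 columns hold the per-answer conditional click probabilities:

```python
import math

def click_entropy(cclicks):
    """Shannon entropy (in nats) of the conditional click distribution.

    `cclicks` holds per-answer conditional click probabilities
    (option_cclick_1..5 in the released TSVs); empty/zero slots
    for missing answers are ignored.
    """
    probs = [p for p in cclicks if p and p > 0]
    if not probs:
        return 0.0
    total = sum(probs)
    probs = [p / total for p in probs]  # re-normalize defensively
    return -sum(p * math.log(p) for p in probs)

# Uniform over m answers -> entropy log(m): peaks near
# log(2)≈0.69, log(3)≈1.10, log(4)≈1.39, log(5)≈1.61.
print(click_entropy([0.5, 0.5, 0.0, 0.0, 0.0]))  # ≈ 0.693
print(click_entropy([0.2, 0.2, 0.2, 0.2, 0.2]))  # ≈ 1.609
```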
