1. Visual reasoning [from RAVEN, CVPR 2019]

Early attempts were made from the 1940s to the 1970s in the field of logic-based AI. Newell argued that one of the potential solutions to AI was “to construct a single program that would take a standard intelligence test” [42]. There were two important trials: (i) Evans presented an AI algorithm that solved a type of geometric analogy task in the Wechsler Adult Intelligence Scale (WAIS) test [10, 11], and (ii) Simon and Kotovsky devised a program that solved Thurstone letter series completion problems [54]. However, these early attempts were heuristic-based with hand-crafted rules, making them difficult to apply to other problems.
The reasoning ability of modern vision systems was first systematically analyzed using the CLEVR dataset [22]. By carefully controlling inductive bias and slicing the vision systems’ reasoning ability into several axes, Johnson et al. successfully identified major drawbacks of existing models. A subsequent work [23] on this dataset achieved good performance by introducing a program generator in a structured space and combining it with a program execution engine. A similar work that also leveraged language-guided structured reasoning was proposed in [18]. Modules with special attention mechanisms were later proposed to solve this visual reasoning task in an end-to-end manner [19, 49, 59]. However, superior performance gains were observed in very recent works [6, 36, 58] that fell back to structured representations by using primitives, dependency trees, or logic. These works also inspire us to incorporate structure information into solving the RPM problem.

More generally, Bisk et al. [4] studied visual reasoning in a 3D block world. Perez et al. [46] introduced a conditional layer for visual reasoning. Aditya et al. [1] proposed a probabilistic soft logic in an attention module to increase model interpretability. Barrett et al. [3] measured abstract reasoning in neural networks.

