Abstract

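Some super-resolution methods proposed for text images overlook the fact that the visual quality of strokes, the atomic units of text, plays an essential role in text recognition. When humans look at a low-resolution text image, they inherently use partial stroke-level details to recover the appearance of whole characters. Inspired by Gestalt psychology, this paper proposes a stroke-aware scene text image super-resolution method containing a Stroke-Focused Module (SFM) that concentrates on the stroke-level internal structure of characters in text images. Specifically, the paper designs rules for decomposing English characters and digits at the stroke level, then pre-trains a text recognizer to provide stroke-level attention maps as positional clues, with the aim of enforcing consistency between the generated super-resolution (SR) image and the high-resolution (HR) ground truth.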

Method

Pixel-wise Supervision Module

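This module is the same as in Scene Text Telescope: Text-Focused Scene Image Super-Resolution.
The difference between the SR image and the HR image is measured with an L2 loss.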
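
The pixel-wise supervision can be sketched as follows. This is a minimal illustration, assuming the usual mean-squared-error form of the L2 loss, L_PSM = ||I_HR - I_SR||_2^2 averaged over pixels. The symbols I_SR, I_HR and L_PSM, the function name, and the nested-list image layout are assumptions made for this example, not details given in these notes. Plain Python lists keep the sketch dependency-free; a real training loop would use a tensor library.

```python
from typing import Sequence

# Hypothetical layout: an H x W grayscale image stored as nested lists.
Image = Sequence[Sequence[float]]


def pixel_l2_loss(sr: Image, hr: Image) -> float:
    """Mean squared error between an SR image and its HR ground truth."""
    if not sr or len(sr) != len(hr):
        raise ValueError("SR and HR images must be non-empty and equally tall")
    total = 0.0
    count = 0
    for sr_row, hr_row in zip(sr, hr):
        if len(sr_row) != len(hr_row):
            raise ValueError("SR and HR rows must have the same width")
        for s, h in zip(sr_row, hr_row):
            total += (s - h) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must contain at least one pixel")
    return total / count


# Toy 2 x 2 images: two pixels agree, two differ by 0.5.
sr_image = [[0.0, 0.5], [1.0, 0.5]]
hr_image = [[0.0, 1.0], [1.0, 0.0]]
loss = pixel_l2_loss(sr_image, hr_image)
```

Averaging rather than summing only rescales the loss; which normalization the original implementation uses is not stated here.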

Stroke-Focused Module

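To exploit finer-grained attention maps, we pre-train a Transformer-based recognizer with stroke-level labels on two synthetic datasets, Synth90k and SynthText. More specifically, given a character-level label c_GT = {c_1, c_2, …, c_t}, we decompose each character and concatenate the results to construct the stroke-level label s_GT = {s_1, s_2, …, s_t'}, where t and t' denote the maximum label lengths at the character level and the stroke level respectively (t ≤ t'). Once the recognizer has converged, we discard the sequence prediction y_pred, which is supervised by a cross-entropy loss during training, and use only the sequence of stroke-level attention maps generated by the multi-head self-attention module as stroke-level positional clues. Denoting the attention maps of the HR image as A_HR = {A_HR^1, A_HR^2, …, A_HR^t'} and those of the SR image as A_SR = {A_SR^1, A_SR^2, …, A_SR^t'}, an L1 loss is then applied to keep the two sets of maps consistent.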
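
The two steps described above, building stroke-level labels from character-level labels and comparing HR and SR attention maps with an L1 loss, can be sketched roughly as below. The stroke table is a made-up toy mapping and does not reproduce the decomposition rules designed in the paper; the function names, the integer stroke IDs, and the averaging over all map entries are likewise assumptions for illustration.

```python
from typing import Dict, List, Sequence

# Hypothetical stroke table: these IDs are invented for the example and are
# NOT the decomposition rules from the paper.
STROKES: Dict[str, List[int]] = {
    "l": [1],
    "o": [2],
    "b": [1, 2],
    "d": [2, 1],
}

# One H x W attention map per stroke step, stored as nested lists.
AttentionMap = Sequence[Sequence[float]]


def to_stroke_labels(chars: str) -> List[int]:
    """Decompose each character and concatenate the strokes (c_GT -> s_GT)."""
    labels: List[int] = []
    for ch in chars:
        labels.extend(STROKES[ch])
    return labels


def attention_l1_loss(a_sr: Sequence[AttentionMap],
                      a_hr: Sequence[AttentionMap]) -> float:
    """Mean absolute difference between SR and HR attention map sequences."""
    if len(a_sr) != len(a_hr):
        raise ValueError("both sequences need one map per stroke step")
    total = 0.0
    count = 0
    for map_sr, map_hr in zip(a_sr, a_hr):
        for row_sr, row_hr in zip(map_sr, map_hr):
            for x, y in zip(row_sr, row_hr):
                total += abs(x - y)
                count += 1
    if count == 0:
        raise ValueError("attention maps must not be empty")
    return total / count


# Four characters expand to six strokes, so t <= t' holds.
s_gt = to_stroke_labels("bold")

# Two stroke steps with 1 x 2 maps each; only the first step disagrees.
a_hr = [[[1.0, 0.0]], [[0.0, 1.0]]]
a_sr = [[[0.5, 0.5]], [[0.0, 1.0]]]
sfm_loss = attention_l1_loss(a_sr, a_hr)
```

Because the recognizer is frozen after pre-training, such a loss would only steer the super-resolution network: gradients flow through the SR image into the generator, while the attention maps of the HR image act as fixed targets.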

Conclusion

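In this paper, we propose a stroke-aware scene text image super-resolution method, inspired by Gestalt psychology, that highlights the details of stroke regions. The proposed method can indeed generate more distinguishable super-resolution text images. As the experimental results show, the proposed SFM achieves state-of-the-art performance on TextZoom and on the manually constructed Chinese character dataset Degraded-IC13 without introducing additional time overhead.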

[Paper Reading] Text Gestalt: Stroke-Aware Scene Text Image Super-Resolution
