Subjects: cs.CV

1.Spatiotemporal Deformation Perception for Fisheye Video Rectification


Authors: Shangrong Yang, Chunyu Lin, Kang Liao, Yao Zhao





Although distortion correction for fisheye images has been extensively studied, correcting fisheye videos remains an elusive challenge. Existing image correction methods treat the frames of a fisheye video independently and ignore the correlation across the sequence, resulting in temporal jitter in the corrected video. To solve this problem, we propose a temporal weighting scheme to get a plausible global optical flow, which mitigates the jitter effect by progressively reducing the weight of frames. Subsequently, we observe that the inter-frame optical flow of the video helps in perceiving the local spatial deformation of the fisheye video. Therefore, we derive the spatial deformation from the flows of the fisheye and distortion-free videos, thereby enhancing the local accuracy of the predicted result. However, the independent correction for each frame disrupts the temporal correlation. Due to the property of fisheye video, a distorted moving object may be able to find its distortion-free pattern at another moment. To this end, a temporal deformation aggregator is designed to reconstruct the deformation correlation between frames and provide a reliable global feature. Our method achieves end-to-end correction and demonstrates superior correction quality and stability compared with state-of-the-art (SOTA) correction methods.
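The temporal weighting idea in this abstract can be illustrated with a minimal sketch. The abstract does not give the actual weighting formula, so the geometric decay, the `decay` parameter, and the function names `temporal_weights` and `fuse_flows` below are all assumptions made for illustration; flow fields are simplified to flat lists of (dx, dy) pairs.

```python
def temporal_weights(num_frames, decay=0.5):
    """Return normalized weights that shrink with temporal distance.

    Index 0 is the current frame; higher indices are frames further away.
    The geometric decay is an assumption, not the paper's formula.
    """
    if num_frames < 1:
        raise ValueError("num_frames must be at least 1")
    raw = [decay ** k for k in range(num_frames)]
    total = sum(raw)
    return [w / total for w in raw]


def fuse_flows(flows, decay=0.5):
    """Fuse per-frame flow fields into one global flow by weighted averaging.

    flows: list of flow fields ordered from the current frame outward;
           each field is a list of (dx, dy) tuples of equal length.
    """
    weights = temporal_weights(len(flows), decay)
    num_pixels = len(flows[0])
    fused = []
    for i in range(num_pixels):
        dx = sum(w * f[i][0] for w, f in zip(weights, flows))
        dy = sum(w * f[i][1] for w, f in zip(weights, flows))
        fused.append((dx, dy))
    return fused


if __name__ == "__main__":
    # Three hypothetical one-pixel flow fields; the nearest frame dominates.
    flows = [[(1.0, 0.0)], [(0.0, 1.0)], [(0.0, 1.0)]]
    print(temporal_weights(3))
    print(fuse_flows(flows))
```

Averaging with weights that decay over time keeps the fused flow close to the current frame while still smoothing out frame-to-frame fluctuations, which is the intuition behind reducing jitter.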

2.Convolutional Neural Networks Trained to Identify Words Provide a Good Account of Visual Form Priming Effects


Authors: Dong Yin, Valerio Biscione, Jeffrey Bowers





A wide variety of orthographic coding schemes and models of visual word identification have been developed to account for masked priming data that provide a measure of orthographic similarity between letter strings. These models tend to include hand-coded orthographic representations with single-unit coding for specific forms of knowledge (e.g., units coding for a letter in a given position or a letter sequence). Here we assess how well a range of these coding schemes and models account for the pattern of form priming effects taken from the Form Priming Project and compare these findings to results observed with 11 standard deep neural network models (DNNs) developed in computer science. We find that deep convolutional networks perform as well as or better than the coding schemes and word recognition models, whereas transformer networks did less well. The success of convolutional networks is remarkable as their architectures were not developed to support word recognition (they were designed to perform well on object recognition) and they classify pixel images of words (rather than artificial encodings of letter strings). The findings add to the recent work of Hannagan et al. (2021) suggesting that convolutional networks may capture key aspects of visual word identification.
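To make the notion of a hand-coded orthographic representation concrete, here is a small sketch of two toy similarity measures between a prime and a target string: one based on letters in fixed positions and one based on adjacent letter pairs. These are simplified illustrations only; the function names are hypothetical, and they are not the specific coding schemes or models evaluated in the paper.

```python
def slot_similarity(prime, target):
    """Fraction of positions where prime and target share the same letter."""
    longest = max(len(prime), len(target))
    if longest == 0:
        return 0.0
    matches = sum(1 for p, t in zip(prime, target) if p == t)
    return matches / longest


def adjacent_bigrams(word):
    """Set of adjacent letter pairs in a word."""
    return {word[i:i + 2] for i in range(len(word) - 1)}


def bigram_similarity(prime, target):
    """Jaccard overlap between the adjacent-bigram sets of two strings."""
    a, b = adjacent_bigrams(prime), adjacent_bigrams(target)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


if __name__ == "__main__":
    # A transposed-letter prime for the target "judge".
    print(slot_similarity("jugde", "judge"))
    print(bigram_similarity("jugde", "judge"))
```

Under a measure like this, a higher score for a prime-target pair would be read as a prediction of stronger form priming, which is the kind of prediction that can then be compared against behavioral priming data.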

3.Cross-Layer Retrospective Retrieving via Layer Attention (ICLR 2023)


Authors: Yanwen Fang, Yuxi Cai, Jintai Chen, Jingyu Zhao, Guangjian Tian, Guodong Li





Growing evidence shows that strengthening layer interactions can enhance the representation power of a deep neural network, while self-attention excels at learning interdependencies by retrieving query-activated information. Motivated by this, we devise a cross-layer attention mechanism, called multi-head recurrent layer attention (MRLA), that sends a query representation of the current layer to all previous layers to retrieve query-related information from different levels of receptive fields. A lightweight version of MRLA is also proposed to reduce the quadratic computation cost. The proposed layer attention mechanism can enrich the representation power of many state-of-the-art vision networks, including CNNs and vision transformers. Its effectiveness has been extensively evaluated in image classification, object detection and instance segmentation tasks, where improvements are consistently observed. For example, our MRLA improves Top-1 accuracy by 1.6% on ResNet-50, while only introducing 0.16M parameters and 0.07B FLOPs. Surprisingly, it boosts performance by a large margin of 3-4% box AP and mask AP in dense prediction tasks. Our code is available at
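The core idea of a layer querying earlier layers can be sketched with plain single-head dot-product attention over per-layer feature vectors. This is a simplified illustration, not MRLA itself: the learned query/key/value projections, the multi-head structure, and the recurrent lightweight variant are all omitted, and the function name `layer_attention` is hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def layer_attention(layer_feats):
    """Let the current (last) layer retrieve information from all layers.

    layer_feats: list of equal-length feature vectors, one per layer,
                 ordered from the earliest layer to the current one.
    Assumption: features serve directly as queries, keys and values
    (identity projections), which a real network would learn instead.
    """
    query = layer_feats[-1]
    dim = len(query)
    scores = [
        sum(q * k for q, k in zip(query, feat)) / math.sqrt(dim)
        for feat in layer_feats
    ]
    weights = softmax(scores)
    output = [
        sum(w * feat[j] for w, feat in zip(weights, layer_feats))
        for j in range(dim)
    ]
    return output, weights


if __name__ == "__main__":
    feats = [[1.0, 0.0], [0.0, 1.0], [0.0, 2.0]]
    output, weights = layer_attention(feats)
    print(weights)
    print(output)
```

Because layer t attends to all t layers up to and including itself, running this at every layer costs on the order of T^2 score computations for T layers, which is the quadratic cost the lightweight version is meant to reduce.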





