This column collects computer vision papers. Date: September 7, 2021. Source: Paper Digest.

1, TITLE: Towards Expressive Communication with Internet Memes: A New Multimodal Conversation Dataset and Benchmark
AUTHORS: Zhengcong Fei ; Zekang Li ; Jinchao Zhang ; Yang Feng ; Jie Zhou
CATEGORY: cs.CL [cs.CL, cs.CV]
HIGHLIGHT: In this paper, we propose a new task named Meme incorporated Open-domain Dialogue (MOD). To facilitate the MOD research, we construct a large-scale open-domain multimodal dialogue dataset incorporating abundant Internet memes into utterances.

2, TITLE: Data Efficient Masked Language Modeling for Vision and Language
AUTHORS: Yonatan Bitton ; Gabriel Stanovsky ; Michael Elhadad ; Roy Schwartz
CATEGORY: cs.CL [cs.CL, cs.CV, cs.LG]
HIGHLIGHT: In this paper, we observe several key disadvantages of MLM in this setting.

3, TITLE: Spatiotemporal Inconsistency Learning for DeepFake Video Detection
AUTHORS: Zhihao Gu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Specifically, we present a novel temporal modeling paradigm in TIM by exploiting the temporal difference over adjacent frames along with both horizontal and vertical directions.

4, TITLE: Utilizing Adversarial Targeted Attacks to Boost Adversarial Robustness
AUTHORS: Uriya Pesso ; Koby Bibas ; Meir Feder
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We propose a novel solution by adopting the recently suggested Predictive Normalized Maximum Likelihood.

5, TITLE: On Robustness of Generative Representations Against Catastrophic Forgetting
AUTHORS: Wojciech Masarczyk ; Kamil Deja ; Tomasz Trzciński
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work, we aim at answering this question by posing and validating a set of research hypotheses related to the specificity of representations built internally by neural models.

6, TITLE: ISyNet: Convolutional Neural Networks Design for AI Accelerator
AUTHORS: Alexey Letunovskiy et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: For many years the main goal of the research was to improve the quality of models, even if the complexity was impractically high.

7, TITLE: Learning Object-Compositional Neural Radiance Field for Editable Scene Rendering
AUTHORS: Bangbang Yang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a novel neural scene rendering system, which learns an object-compositional neural radiance field and produces realistic rendering with editing capability for a clustered and real-world scene.

8, TITLE: Weakly Supervised Relative Spatial Reasoning for Visual Question Answering
AUTHORS: Pratyay Banerjee ; Tejas Gokhale ; Yezhou Yang ; Chitta Baral
CATEGORY: cs.CV [cs.CV, cs.CL, cs.LG]
HIGHLIGHT: In this work, we evaluate the faithfulness of V&L models to such geometric understanding, by formulating the prediction of pair-wise relative locations of objects as a classification as well as a regression task.

9, TITLE: Audio-Visual Transformer Based Crowd Counting
AUTHORS: Usman Sajid ; Xiangyu Chen ; Hasan Sajid ; Taejoon Kim ; Guanghui Wang
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: The paper proposes a new audiovisual multi-task network to address the critical challenges in crowd counting by effectively utilizing both visual and audio inputs for better modalities association and productive feature extraction.

10, TITLE: RiWNet: A Moving Object Instance Segmentation Network Being Robust in Adverse Weather Conditions
AUTHORS: Chenjie Wang ; Chengyuan Li ; Bin Luo ; Wei Wang ; Jun Liu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We compare RiWNet to several other state-of-the-art methods in some challenging datasets, and RiWNet shows better performance especially under adverse weather conditions. Finally, in order to verify the effect of moving instance segmentation in different weather disturbances, we propose a VKTTI-moving dataset which is a moving instance segmentation dataset based on the VKTTI dataset, taking into account different weather scenes such as rain, fog, sunset, morning as well as overcast.

11, TITLE: GOHOME: Graph-Oriented Heatmap Output for Future Motion Estimation
AUTHORS: Thomas Gilles ; Stefano Sabatini ; Dzmitry Tsishkou ; Bogdan Stanciulescu ; Fabien Moutarde
CATEGORY: cs.CV [cs.CV, cs.RO]
HIGHLIGHT: In this paper, we propose GOHOME, a method leveraging graph representations of the High Definition Map and sparse projections to generate a heatmap output representing the future position probability distribution for a given agent in a traffic scene.

12, TITLE: Sparse Spatial Attention Network for Semantic Segmentation
AUTHORS: Mengyu Liu ; Hujun Yin
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present a sparse spatial attention network (SSANet) to improve the efficiency of the spatial attention mechanism without sacrificing the performance.

13, TITLE: Stimuli-Aware Visual Emotion Analysis
AUTHORS: Jingyuan Yang ; Jie Li ; Xiumei Wang ; Yuxuan Ding ; Xinbo Gao
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: Inspired by the Stimuli-Organism-Response (S-O-R) emotion model in psychological theory, we proposed a stimuli-aware VEA method consisting of three stages, namely stimuli selection (S), feature extraction (O) and emotion prediction (R).

14, TITLE: Robust Fine-tuning of Zero-shot Models
AUTHORS: Mitchell Wortsman et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We address this tension by introducing a simple and effective method for improving robustness: ensembling the weights of the zero-shot and fine-tuned models.

15, TITLE: Dual Transfer Learning for Event-based End-task Prediction Via Pluggable Event to Image Translation
AUTHORS: Lin Wang ; Yujeong Chae ; Kuk-Jin Yoon
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a simple yet flexible two-stream framework named Dual Transfer Learning (DTL) to effectively enhance the performance on the end-tasks without adding extra inference cost.

16, TITLE: A Comprehensive Approach for UAV Small Object Detection with Simulation-based Transfer Learning and Adaptive Fusion
AUTHORS: Chen Rui ; Guo Youwei ; Zheng Huafei ; Jiang Hongyu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To tackle these problems, a novel comprehensive approach that combines transfer learning based on simulation data and adaptive fusion is proposed.

17, TITLE: Square Root Marginalization for Sliding-Window Bundle Adjustment
AUTHORS: Nikolaus Demmel ; David Schubert ; Christiane Sommer ; Daniel Cremers ; Vladyslav Usenko
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper we propose a novel square root sliding-window bundle adjustment suitable for real-time odometry applications.

18, TITLE: Robust Attentive Deep Neural Network for Exposing GAN-generated Faces
AUTHORS: Hui Guo ; Shu Hu ; Xin Wang ; Ming-Ching Chang ; Siwei Lyu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To address these shortcomings, we propose a robust, attentive, end-to-end network that can spot GAN-generated faces by analyzing their eye inconsistencies.

19, TITLE: Toward Realistic Single-View 3D Object Reconstruction with Unsupervised Learning from Multiple Images
AUTHORS: Long-Nhat Ho ; Anh Tuan Tran ; Quynh Phung ; Minh Hoai
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we eliminate the symmetry requirement with a novel unsupervised algorithm that can learn a 3D reconstruction network from a multi-image dataset.

20, TITLE: Spatial Domain Feature Extraction Methods for Unconstrained Handwritten Malayalam Character Recognition
AUTHORS: Jomy John
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Spatial domain features suitable for recognition are chosen in this work.

21, TITLE: Does Melania Trump Have A Body Double from The Perspective of Automatic Face Recognition?
AUTHORS: Khawla Mallat ; Fabiola Becerra-Riera ; Annette Morales-González ; Heydi Méndez-Vázquez ; Jean-Luc Dugelay
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we explore whether automatic face recognition can help in verifying widespread misinformation on social media, particularly conspiracy theories that are based on the existence of body doubles.

22, TITLE: Exploiting Spatial-Temporal Semantic Consistency for Video Scene Parsing
AUTHORS: Xingjian He et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a Spatial-Temporal Semantic Consistency method to capture class-exclusive context information.

23, TITLE: Stochastic Neural Radiance Fields: Quantifying Uncertainty in Implicit 3D Representations
AUTHORS: Jianxiong Shen ; Adria Ruiz ; Antonio Agudo ; Francesc Moreno
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this context, we propose Stochastic Neural Radiance Fields (S-NeRF), a generalization of standard NeRF that learns a probability distribution over all the possible radiance fields modeling the scene.

24, TITLE: Less Is More: Lighter and Faster Deep Neural Architecture for Tomato Leaf Disease Classification
AUTHORS: Sabbir Ahmed ; Md. Bakhtiar Hasan ; Tasnim Ahmed ; Redwan Karim Sony ; Md. Hasanul Kabir
CATEGORY: cs.CV [cs.CV, cs.LG, I.4.9]
HIGHLIGHT: This work proposes a lightweight transfer learning-based approach for detecting diseases from tomato leaves.

25, TITLE: Identification of Driver Phone Usage Violations Via State-of-the-Art Object Detection with Tracking
AUTHORS: Steven Carrell ; Amir Atapour-Abarghouei
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: In this work, we propose a custom-trained state-of-the-art object detector to work with roadside cameras to capture driver phone usage without the need for human intervention.

26, TITLE: Robust Event Detection Based on Spatio-Temporal Latent Action Unit Using Skeletal Information
AUTHORS: Hao Xing ; Yuxuan Xue ; Mingchuan Zhou ; Darius Burschka
CATEGORY: cs.CV [cs.CV, cs.HC, I.5.1; I.5.2; I.5.3]
HIGHLIGHT: This paper proposes a novel dictionary learning approach to detect event action using skeletal information extracted from RGBD video.

27, TITLE: CTRL-C: Camera Calibration TRansformer with Line-Classification
AUTHORS: Jinwoo Lee et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, we propose Camera calibration TRansformer with Line-Classification (CTRL-C), an end-to-end neural network-based approach to single image camera calibration, which directly estimates the camera parameters from an image and a set of line segments.

28, TITLE: Self-supervised Product Quantization for Deep Unsupervised Image Retrieval
AUTHORS: Young Kyun Jang ; Nam Ik Cho
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To tackle these issues, we propose the first deep unsupervised image retrieval method dubbed Self-supervised Product Quantization (SPQ) network, which is label-free and trained in a self-supervised manner.

29, TITLE: Underwater 3D Reconstruction Using Light Fields
AUTHORS: Yuqi Ding ; Yu Ji ; Jingyi Yu ; Jinwei Ye
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we present an underwater 3D reconstruction solution using light field cameras.

30, TITLE: Image Recognition Via Vietoris-Rips Complex
AUTHORS: Yasuhiko Asao ; Jumpei Nagase ; Ryotaro Sakamoto ; Shiro Takagi
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a way to extract such features from images by a method based on algebraic topology.

31, TITLE: Learning to Generate Scene Graph from Natural Language Supervision
AUTHORS: Yiwu Zhong ; Jing Shi ; Jianwei Yang ; Chenliang Xu ; Yin Li
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose one of the first methods that learn from image-sentence pairs to extract a graphical representation of localized objects and their relationships within an image, known as scene graph.

32, TITLE: GDP: Stabilized Neural Network Pruning Via Gates with Differentiable Polarization
AUTHORS: Yi Guo et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In view of the research gaps, we present a new module named Gates with Differentiable Polarization (GDP), inspired by principled optimization ideas.

33, TITLE: GeneAnnotator: A Semi-automatic Annotation Tool for Visual Scene Graph
AUTHORS: Zhixuan Zhang ; Chi Zhang ; Zhenning Niu ; Le Wang ; Yuehu Liu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this manuscript, we introduce a semi-automatic scene graph annotation tool for images, the GeneAnnotator.

34, TITLE: Reasoning Graph Networks for Kinship Verification: from Star-shaped to Hierarchical
AUTHORS: Wanhua Li ; Jiwen Lu ; Abudukelimu Wuerkaixi ; Jianjiang Feng ; Jie Zhou
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we investigate the problem of facial kinship verification by learning hierarchical reasoning graph networks.

35, TITLE: Learning Fine-Grained Motion Embedding for Landscape Animation
AUTHORS: Hongwei Xue et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper we focus on landscape animation, which aims to generate time-lapse videos from a single landscape image. To train and evaluate on diverse time-lapse videos, we build the largest high-resolution Time-lapse video dataset with Diverse scenes, namely Time-lapse-D, which includes 16,874 video clips with over 10 million frames.

36, TITLE: From Contexts to Locality: Ultra-high Resolution Image Segmentation Via Locality-aware Contextual Correlation
AUTHORS: Qi Li ; Weixiang Yang ; Wenxi Liu ; Yuanlong Yu ; Shengfeng He
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we innovate the widely used high-resolution image segmentation pipeline, in which an ultra-high resolution image is partitioned into regular patches for local segmentation and then the local results are merged into a high-resolution semantic mask.

37, TITLE: 3D Human Texture Estimation from A Single Image with Transformers
AUTHORS: Xiangyu Xu ; Chen Change Loy
CATEGORY: cs.CV [cs.CV, cs.AI, cs.GR, cs.LG, cs.MM]
HIGHLIGHT: We propose a Transformer-based framework for 3D human texture estimation from a single image.

38, TITLE: Weakly Supervised Few-Shot Segmentation Via Meta-Learning
AUTHORS: Pedro H. T. Gama ; Hugo Oliveira ; José Marcato Junior ; Jefersson A. dos Santos
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this paper, we present two novel meta learning methods, named WeaSeL and ProtoSeg, for the few-shot semantic segmentation task with sparse annotations.

39, TITLE: Revisiting 3D ResNets for Video Recognition
AUTHORS: Xianzhi Du et al.
CATEGORY: cs.CV [cs.CV, cs.LG, eess.IV]
HIGHLIGHT: We propose a simple scaling strategy for 3D ResNets, in combination with improved training strategies and minor architectural changes.

40, TITLE: Comparing The Machine Readability of Traffic Sign Pictograms in Austria and Germany
AUTHORS: Alexander Maletzky ; Stefan Thumfart ; Christoph Wruß
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: To that end, we train classification models on synthetic data sets and evaluate their classification accuracy in a controlled setting.

41, TITLE: Point-Based Neural Rendering with Per-View Optimization
AUTHORS: Georgios Kopanas ; Julien Philip ; Thomas Leimkühler ; George Drettakis
CATEGORY: cs.CV [cs.CV, cs.GR]
HIGHLIGHT: We introduce a general approach that is initialized with MVS, but allows further optimization of scene properties in the space of input views, including depth and reprojected features, resulting in improved novel-view synthesis.

42, TITLE: Improved RAMEN: Towards Domain Generalization for Visual Question Answering
AUTHORS: Bhanuka Manesha Samarasekara Vitharana Gamage ; Lim Chern Hong
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: This study provides two major improvements to the early/late fusion module and aggregation module of the RAMEN architecture, with the objective of further strengthening domain generalization.

43, TITLE: Visual Recognition with Deep Learning from Biased Image Datasets
AUTHORS: Robin Vogel ; Stephan Clémençon ; Pierre Laforgue
CATEGORY: cs.CV [cs.CV, cs.CY, cs.LG, stat.ML]
HIGHLIGHT: In this paper, we show how biasing models, originally introduced for nonparametric estimation in (Gill et al., 1988), and recently revisited from the perspective of statistical learning theory in (Laforgue and Clémençon, 2019), can be applied to remedy these problems in the context of visual recognition.

44, TITLE: Tensor Normalization and Full Distribution Training
AUTHORS: Wolfgang Fuhl
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: In this work, we introduce pixel wise tensor normalization, which is inserted after rectifier linear units and, together with batch normalization, provides a significant improvement in the accuracy of modern deep neural networks.

45, TITLE: Information Theory-Guided Heuristic Progressive Multi-View Coding
AUTHORS: Jiangmeng Li et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.LG]
HIGHLIGHT: Guided by it, we build a multi-view coding method with a three-tier progressive architecture, namely Information theory-guided heuristic Progressive Multi-view Coding (IPMC).

46, TITLE: PR-Net: Preference Reasoning for Personalized Video Highlight Detection
AUTHORS: Runnan Chen et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a simple yet efficient preference reasoning framework (PR-Net) to explicitly take the diverse interests into account for frame-level highlight prediction.

47, TITLE: Encoder-decoder with Multi-level Attention for 3D Human Shape and Pose Estimation
AUTHORS: Ziniu Wan et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To this end, we propose Multi-level Attention Encoder-Decoder Network (MAED), including a Spatial-Temporal Encoder (STE) and a Kinematic Topology Decoder (KTD) to model multi-level attentions in a unified framework.

48, TITLE: ERA: Entity Relationship Aware Video Summarization with Wasserstein GAN
AUTHORS: Guande Wu ; Jianzhe Lin ; Claudio T. Silva
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper proposes a novel Entity relationship Aware video summarization method (ERA) to address the above problems.

49, TITLE: Image Inpainting Applied to Art: Completing Escher's Print Gallery
AUTHORS: Lucia Cipolina-Kun ; Simone Caenazzo ; Gaston Mazzei ; Aditya Srinivas Menon
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: We introduce M.C. Escher's Print Gallery lithography as a use case example.

50, TITLE: Fast Image-Anomaly Mitigation for Autonomous Mobile Robots
AUTHORS: Gianmario Fumagalli ; Yannick Huber ; Marcin Dymczyk ; Roland Siegwart ; Renaud Dubé
CATEGORY: cs.CV [cs.CV, cs.AI, cs.RO]
HIGHLIGHT: In this work we address this important issue by implementing a pre-processing step that can effectively mitigate such artifacts in a real-time fashion, thus supporting the deployment of autonomous systems with limited compute capabilities.

51, TITLE: To Be Critical: Self-Calibrated Weakly Supervised Learning for Salient Object Detection
AUTHORS: Yongri Piao ; Jian Wang ; Miao Zhang ; Zhengxuan Ma ; Huchuan Lu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this work, 1) we propose a self-calibrated training strategy by explicitly establishing a mutual calibration loop between pseudo labels and network predictions, liberating the saliency network from error-prone propagation caused by pseudo labels.

52, TITLE: Robust Mitosis Detection Using A Cascade Mask-RCNN Approach With Domain-Specific Residual Cycle-GAN Data Augmentation
AUTHORS: Gauthier Roy et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: For the MIDOG mitosis detection challenge, we created a cascade algorithm consisting of a Mask-RCNN detector, followed by a classification ensemble consisting of ResNet50 and DenseNet201 to refine detected mitotic candidates.

53, TITLE: Bridging The Gap Between Events and Frames Through Unsupervised Domain Adaptation
AUTHORS: Nico Messikommer ; Daniel Gehrig ; Mathias Gehrig ; Davide Scaramuzza
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: To overcome this drawback, we propose a task transfer method that allows models to be trained directly with labeled images and unlabeled event data.

54, TITLE: Moving Object Detection for Event-based Vision Using K-means Clustering
AUTHORS: Anindya Mondal ; Mayukhmali Das
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: In this paper, we investigate the application of the k-means clustering technique in detecting moving objects in event-based data.

55, TITLE: Class Semantics-based Attention for Action Detection
AUTHORS: Deepak Sridhar et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a novel attention mechanism, the Class Semantics-based Attention (CSA), that learns from the temporal distribution of semantics of action classes present in an input video to find the importance scores of the encoded features, which are used to provide attention to the more useful encoded features.

56, TITLE: The Animation Transformer: Visual Correspondence Via Segment Matching
AUTHORS: Evan Casey et al.
CATEGORY: cs.CV [cs.CV, cs.AI, cs.GR]
HIGHLIGHT: To that end, we propose the Animation Transformer (AnT) which uses a transformer-based architecture to learn the spatial and visual relationships between segments across a sequence of images.

57, TITLE: Seam Carving Detection and Localization Using Two-Stage Deep Neural Networks
AUTHORS: Lakshmanan Nataraj ; Chandrakanth Gudavalli ; Tajuddin Manhar Mohammed ; Shivkumar Chandrasekaran ; B. S. Manjunath
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we propose a two-step method to detect and localize seam carved images.

58, TITLE: Deep Saliency Prior for Reducing Visual Distraction
AUTHORS: Kfir Aberman et al.
CATEGORY: cs.CV [cs.CV, cs.GR, cs.LG]
HIGHLIGHT: We present results on a variety of natural images and conduct a perceptual study to evaluate and validate the changes in viewers' eye-gaze between the original images and our edited results.

59, TITLE: A Realistic Approach to Generate Masked Faces Applied on Two Novel Masked Face Recognition Data Sets
AUTHORS: Tudor Mare et al.
CATEGORY: cs.CV [cs.CV, cs.LG]
HIGHLIGHT: We propose a method for enhancing data sets containing faces without masks by creating synthetic masks and overlaying them on faces in the original images.

60, TITLE: F3S: Free Flow Fever Screening
AUTHORS: Kunal Rao et al.
CATEGORY: cs.CV [cs.CV, eess.IV]
HIGHLIGHT: We present a novel fever-screening system, F3S, that uses edge machine learning techniques to accurately measure core body temperatures of multiple individuals in a free-flow setting.

61, TITLE: Pyramid R-CNN: Towards Better Performance and Adaptability for 3D Object Detection
AUTHORS: Jiageng Mao et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: We present a flexible and high-performance framework, named Pyramid R-CNN, for two-stage 3D object detection from point clouds.

62, TITLE: Navigating The Mise-en-Page: Interpretive Machine Learning Approaches to The Visual Layouts of Multi-Ethnic Periodicals
AUTHORS: Benjamin Charles Germain Lee ; Joshua Ortiz Baco ; Sarah H. Salter ; Jim Casey
CATEGORY: cs.CV [cs.CV, cs.DL, cs.IR]
HIGHLIGHT: This paper presents a computational method of analysis that draws from machine learning, library science, and literary studies to map the visual layouts of multi-ethnic newspapers from the late 19th and early 20th century United States.

63, TITLE: Fusformer: A Transformer-based Fusion Approach for Hyperspectral Image Super-resolution
AUTHORS: Jin-Fan Hu ; Ting-Zhu Huang ; Liang-Jian Deng
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we design a network based on the transformer for fusing the low-resolution hyperspectral images and high-resolution multispectral images to obtain the high-resolution hyperspectral images.

64, TITLE: Towards Accurate Alignment in Real-time 3D Hand-Mesh Reconstruction
AUTHORS: Xiao Tang ; Tianyu Wang ; Chi-Wing Fu
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: This paper presents a novel pipeline by decoupling the hand-mesh reconstruction task into three stages: a joint stage to predict hand joints and segmentation; a mesh stage to predict a rough hand mesh; and a refine stage to fine-tune it with an offset mesh for mesh-image alignment.

65, TITLE: Deep Person Generation: A Survey from The Perspective of Face, Pose and Cloth Synthesis
AUTHORS: Tong Sha ; Wei Zhang ; Tong Shen ; Zhoujun Li ; Tao Mei
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: More than two hundred papers are covered for a thorough overview, and the milestone works are highlighted to witness the major technical breakthrough.

66, TITLE: Voxel Transformer for 3D Object Detection
AUTHORS: Jiageng Mao et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: In this paper, we resolve the problem by introducing a Transformer-based architecture that enables long-range relationships between voxels by self-attention.

67, TITLE: Hierarchical Object-to-Zone Graph for Object Navigation
AUTHORS: Sixian Zhang et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: Previous works usually implement deep models to train an agent to predict actions in real-time.

68, TITLE: Parsing Table Structures in The Wild
AUTHORS: Rujiao Long et al.
CATEGORY: cs.CV [cs.CV]
HIGHLIGHT: For designing such a system, we propose an approach named Cycle-CenterNet on top of CenterNet with a novel cycle-pairing module to simultaneously detect and group tabular cells into structured tables. Alongside our Cycle-CenterNet, we also present a large-scale dataset, named Wired Table in the Wild (WTW), which includes well-annotated structure parsing of multiple style tables in several scenes such as photos, scanned files, web pages, etc.

69, TITLE: Efficient Action Recognition Using Confidence Distillation
AUTHORS: Shervin Manzuri Shalmani ; Fei Chiang ; Rong Zheng
CATEGORY: cs.CV [cs.CV, cs.AI]
HIGHLIGHT: To mitigate both these issues, we propose the confidence distillation framework to teach a representation of uncertainty of the teacher to the student sampler and divide the task of full video prediction between the student and the teacher models.

70, TITLE: RAMA: A Rapid Multicut Algorithm on GPU
AUTHORS: Ahmed Abbas ; Paul Swoboda
CATEGORY: cs.DC [cs.DC, cs.CV, cs.DS, cs.LG]
HIGHLIGHT: We propose a highly parallel primal-dual algorithm for the multicut (a.k.a. correlation clustering) problem, a classical graph clustering problem widely used in machine learning and computer vision.

71, TITLE: CodeNeRF: Disentangled Neural Radiance Fields for Object Categories
AUTHORS: Wonbong Jang ; Lourdes Agapito
CATEGORY: cs.GR [cs.GR, cs.CV, cs.LG]
HIGHLIGHT: We conduct experiments on the SRN benchmark, which show that CodeNeRF generalises well to unseen objects and achieves on-par performance with methods that require known camera pose at test time.

72, TITLE: Sensor Data Augmentation with Resampling for Contrastive Learning in Human Activity Recognition
AUTHORS: Jinqiang Wang ; Tao Zhu ; Jingyuan Gan ; Huansheng Ning ; Yaping Wan
CATEGORY: cs.HC [cs.HC, cs.CV]
HIGHLIGHT: To optimize the effect of contrast learning models, in this paper, we investigate the sampling frequency of sensors and propose a resampling data augmentation method.

73, TITLE: Improving Joint Learning of Chest X-Ray and Radiology Report By Word Region Alignment
AUTHORS: Zhanghexuan Ji et al.
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CL, cs.CV, eess.IV]
HIGHLIGHT: This paper proposes a Joint Image Text Representation Learning Network (JoImTeRNet) for pre-training on chest X-ray images and their radiology reports.

74, TITLE: Cluster-Promoting Quantization with Bit-Drop for Minimizing Network Quantization Loss
AUTHORS: Jung Hyun Lee ; Jihun Yun ; Sung Ju Hwang ; Eunho Yang
CATEGORY: cs.LG [cs.LG, cs.CV]
HIGHLIGHT: In this work, we propose a novel quantization method for neural networks, Cluster-Promoting Quantization (CPQ) that finds the optimal quantization grids while naturally encouraging the underlying full-precision weights to gather around those quantization grids cohesively during training.

75, TITLE: Fair Federated Learning for Heterogeneous Face Data
AUTHORS: Samhita Kanaparthy ; Manisha Padala ; Sankarshan Damle ; Sujit Gujar
CATEGORY: cs.LG [cs.LG, cs.CV, cs.CY]
HIGHLIGHT: To resolve this challenge, we propose several aggregation techniques.

76, TITLE: Sparse-MLP: A Fully-MLP Architecture with Conditional Computation
AUTHORS: Yuxuan Lou ; Fuzhao Xue ; Zangwei Zheng ; Yang You
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: In this paper, we propose Sparse-MLP, scaling the recent MLP-Mixer model with sparse MoE layers, to achieve a more computation-efficient architecture.

77, TITLE: Active Learning for Automated Visual Inspection of Manufactured Products
AUTHORS: Elena Trajkova ; Jože M. Rožanec ; Paulien Dam ; Blaž Fortuna ; Dunja Mladenić
CATEGORY: cs.LG [cs.LG, cs.AI, cs.CV]
HIGHLIGHT: In this research, we compare three active learning approaches and five machine learning algorithms applied to visual defect inspection with real-world data provided by Philips Consumer Lifestyle BV.

78, TITLE: Multi-Agent Variational Occlusion Inference Using People As Sensors
AUTHORS: Masha Itkina ; Ye-Ji Mun ; Katherine Driggs-Campbell ; Mykel J. Kochenderfer
CATEGORY: cs.RO [cs.RO, cs.AI, cs.CV, cs.LG, cs.MA, I.2.9; I.2.10]
HIGHLIGHT: We propose an occlusion inference method that characterizes observed behaviors of human agents as sensor measurements, and fuses them with those from a standard sensor suite.

79, TITLE: Navigational Path-Planning For All-Terrain Autonomous Agricultural Robot
AUTHORS: Vedant Ghodke
CATEGORY: cs.RO [cs.RO, cs.CV]
HIGHLIGHT: This report paper compares novel algorithms for autonomous navigation of farmlands.

80, TITLE: Timbre Transfer with Variational Auto Encoding and Cycle-Consistent Adversarial Networks
AUTHORS: Russell Sammut Bonnici ; Charalampos Saitis ; Martin Benning
CATEGORY: cs.SD [cs.SD, cs.AI, cs.CV, cs.LG, eess.AS]
HIGHLIGHT: This research project investigates the application of deep learning to timbre transfer, where the timbre of a source audio can be converted to the timbre of a target audio with minimal loss in quality.

81, TITLE: Generative Models Improve Radiomics Performance in Different Tasks and Different Datasets: An Experimental Study
AUTHORS: Junhua Chen ; Inigo Bermejo ; Andre Dekker ; Leonard Wee
CATEGORY: q-bio.QM [q-bio.QM, cs.CV, eess.IV]
HIGHLIGHT: In this article, we investigate the possibility of using deep learning generative models to improve the performance of radiomics from low dose CTs.

82, TITLE: Predicting Isocitrate Dehydrogenase Mutation Status in Glioma Using Structural Brain Networks and Graph Neural Networks
AUTHORS: Yiran Wei et al.
CATEGORY: eess.IV [eess.IV, cs.CV, q-bio.NC]
HIGHLIGHT: Here we propose a method to predict the IDH mutation using GNN, based on the structural brain network of patients.

83, TITLE: A Privacy-Preserving Image Retrieval Scheme Using A Codebook Generated From Independent Plain-Image Dataset
AUTHORS: Kenta Iida ; Hitoshi Kiya
CATEGORY: eess.IV [eess.IV, cs.CV, cs.MM]
HIGHLIGHT: In this paper, we propose a privacy-preserving image-retrieval scheme using a codebook generated by using a plain-image dataset.

84, TITLE: OCTAVA: An Open-source Toolbox for Quantitative Analysis of Optical Coherence Tomography Angiography Images
AUTHORS: Gavrielle R. Untracht et al.
CATEGORY: eess.IV [eess.IV, cs.CV, q-bio.TO]
HIGHLIGHT: With the goal of contributing to standardization of OCTA data analysis, we report a user-friendly, open-source toolbox, OCTAVA (OCTA Vascular Analyzer), to automate the pre-processing, segmentation, and quantitative analysis of en face OCTA maximum intensity projection images in a standardized workflow.

85, TITLE: Right Ventricular Segmentation from Short- and Long-Axis MRIs Via Information Transition
AUTHORS: Lei Li ; Wangbin Ding ; Liqun Huang ; Xiahai Zhuang
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this work, we propose an automatic RV segmentation framework, where the information from long-axis (LA) views is utilized to assist the segmentation of short-axis (SA) views via information transition.

86, TITLE: Recognition of COVID-19 Disease Utilizing X-Ray Imaging of The Chest Using CNN
AUTHORS: Md Gulzar Hussain ; Ye Shiren
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: The goal of this research is to assess convolutional neural networks (CNNs) for diagnosing COVID-19 utilizing chest X-ray images.

87, TITLE: Automated Cardiac Resting Phase Detection Targeted on The Right Coronary Artery
AUTHORS: Seung Su Yoon et al.
CATEGORY: eess.IV [eess.IV, cs.CV, physics.med-ph]
HIGHLIGHT: The purpose of this work is to propose a fully automated framework that allows the detection of the right coronary artery (RCA) resting phase (RP) within CINE series.

88, TITLE: Automatic Segmentation of The Optic Nerve Head Region in Optical Coherence Tomography: A Methodological Review
AUTHORS: RITA MARQUES et. al.
CATEGORY: eess.IV [eess.IV, cs.CV, physics.med-ph]
HIGHLIGHT: This review summarizes the current state-of-the-art in automatic segmentation of the ONH in OCT.

89, TITLE: Deep Learning Facilitates Fully Automated Brain Image Registration of Optoacoustic Tomography and Magnetic Resonance Imaging
AUTHORS: YEXING HU et. al.
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: Here we propose a fully automated registration method for MSOT-MRI multimodal imaging empowered by deep learning.

90, TITLE: (M)SLAe-Net: Multi-Scale Multi-Level Attention Embedded Network for Retinal Vessel Segmentation
AUTHORS: Shreshth Saini ; Geetika Agrawal
CATEGORY: eess.IV [eess.IV, cs.CV, cs.LG]
HIGHLIGHT: In this work, we propose a multi-scale, multi-level attention embedded CNN architecture ((M)SLAe-Net) to address the issue of multi-stage processing for robust and precise segmentation of retinal vessels.

91, TITLE: Evaluation of Convolutional Neural Networks for COVID-19 Classification on Chest X-Rays
AUTHORS: FELIPE ANDRÉ ZEISER et. al.
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: In this paper, we evaluate convolutional neural networks for identifying pneumonia due to COVID-19 in chest X-rays (XR).

92, TITLE: A Decoupled Uncertainty Model for MRI Segmentation Quality Estimation
AUTHORS: Richard Shaw ; Carole H. Sudre ; Sebastien Ourselin ; M. Jorge Cardoso ; Hugh G. Pemberton
CATEGORY: eess.IV [eess.IV, cs.CV]
HIGHLIGHT: We aim to automate the process using a probabilistic network that estimates segmentation uncertainty through a heteroscedastic noise model, providing a measure of task-specific quality.

93, TITLE: Image Compression with Recurrent Neural Network and Generalized Divisive Normalization
AUTHORS: Khawar Islam ; L. Minh Dang ; Sujin Lee ; Hyeonjoon Moon
CATEGORY: eess.IV [eess.IV, cs.CV, cs.MM]
HIGHLIGHT: In this paper, two effective novel blocks are developed: analysis and synthesis block that employs the convolution layer and Generalized Divisive Normalization (GDN) in the variable-rate encoder and decoder side.

94, TITLE: Multi-View Spatial-Temporal Graph Convolutional Networks with Domain Generalization for Sleep Stage Classification
AUTHORS: ZIYU JIA et. al.
CATEGORY: eess.SP [eess.SP, cs.AI, cs.CV, cs.LG]
HIGHLIGHT: To address the above challenges, we propose a multi-view spatial-temporal graph convolutional network (MSTGCN) with domain generalization for sleep stage classification.
