Computer Vision and Pattern Recognition

Authors and titles for recent submissions

Tue, 14 May 2024

[1] arXiv:2405.07992 [pdf, other]: Title: MambaOut: Do We Really Need Mamba for Vision?

Authors: Weihao Yu, Xinchao Wang

Comments: Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[2] arXiv:2405.07988 [pdf, ps, other]: Title: A Generalist Learner for Multifaceted Medical Image Interpretation

Authors: Hong-Yu Zhou, Subathra Adithan, Julián Nicolás Acosta, Eric J. Topol, Pranav Rajpurkar

Comments: Technical study

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[3] arXiv:2405.07974 [pdf, other]: Title: SignAvatar: Sign Language 3D Motion Reconstruction and Generation

Authors: Lu Dong, Lipisha Chaudhary, Fei Xu, Xiao Wang, Mason Lary, Ifeoma Nwogu

Comments: Accepted by FG2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[4] arXiv:2405.07969 [pdf, other]: Title: Investigating the Semantic Robustness of CLIP-based Zero-Shot Anomaly Segmentation

Authors: Kevin Stangl, Marius Arvinte, Weilin Xu, Cory Cornelius

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[5] arXiv:2405.07966 [pdf, other]: Title: OverlapMamba: Novel Shift State Space Model for LiDAR-based Place Recognition

Authors: Qiuchi Xiang, Jintao Cheng, Jiehao Luo, Jin Wu, Rui Fan, Xieyuanli Chen, Xiaoyu Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[6] arXiv:2405.07933 [pdf, other]: Title: Authentic Hand Avatar from a Phone Scan via Universal Hand Model

Authors: Gyeongsik Moon, Weipeng Xu, Rohan Joshi, Chenglei Wu, Takaaki Shiratori

Comments: Accepted to CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[7] arXiv:2405.07921 [pdf, other]: Title: Can Better Text Semantics in Prompt Tuning Improve VLM Generalization?

Authors: Hari Chandana Kuchibhotla, Sai Srinivas Kancheti, Abbavaram Gowtham Reddy, Vineeth N Balasubramanian

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8] arXiv:2405.07919 [pdf, other]: Title: Exploring the Low-Pass Filtering Behavior in Image Super-Resolution

Authors: Haoyu Deng, Zijing Xu, Yule Duan, Xiao Wu, Wenjie Shu, Liang-Jian Deng

Comments: Accepted by ICML 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[9] arXiv:2405.07916 [pdf, other]: Title: IMAFD: An Interpretable Multi-stage Approach to Flood Detection from time series Multispectral Data

Authors: Ziyang Zhang, Plamen Angelov, Dmitry Kangin, Nicolas Longépé

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[10] arXiv:2405.07913 [pdf, other]: Title: CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models

Authors: Nick Stracke, Stefan Andreas Baumann, Joshua M. Susskind, Miguel Angel Bautista, Björn Ommer

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[11] arXiv:2405.07868 [pdf, other]: Title: Boostlet.js: Image processing plugins for the web via JavaScript injection

Authors: Edward Gaibor, Shruti Varade, Rohini Deshmukh, Tim Meyer, Mahsa Geshvadi, SangHyuk Kim, Vidhya Sree Narayanappa, Daniel Haehn

Comments: 5 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[12] arXiv:2405.07865 [pdf, other]: Title: AnoVox: A Benchmark for Multimodal Anomaly Detection in Autonomous Driving

Authors: Daniel Bogdoll, Iramm Hamdard, Lukas Namgyu Rößler, Felix Geisler, Muhammed Bayram, Felix Wang, Jan Imhof, Miguel de Campos, Anushervon Tabarov, Yitian Yang, Hanno Gottschalk, J. Marius Zöllner

Comments: Daniel Bogdoll, Iramm Hamdard, and Lukas Namgyu Rößler contributed equally

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[13] arXiv:2405.07857 [pdf, other]: Title: Synergistic Integration of Coordinate Network and Tensorial Feature for Improving Neural Radiance Fields from Sparse Inputs

Authors: Mingyu Kim, Jun-Seong Kim, Se-Young Yun, Jin-Hwa Kim

Comments: ICML2024 ; Project page is accessible at this https URL ; Code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[14] arXiv:2405.07847 [pdf, other]: Title: SceneFactory: A Workflow-centric and Unified Framework for Incremental Scene Modeling

Authors: Yijun Yuan, Michael Bleier, Andreas Nüchter

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[15] arXiv:2405.07845 [pdf, other]: Title: Multi-Task Learning for Fatigue Detection and Face Recognition of Drivers via Tree-Style Space-Channel Attention Fusion Network

Authors: Shulei Qu, Zhenguo Gao, Xiaowei Chen, Na Li, Yakai Wang, Xiaoxiao Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[16] arXiv:2405.07814 [pdf, other]: Title: NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images

Authors: Matthew Keller, Chi-en Amy Tai, Yuhao Chen, Pengcheng Xi, Alexander Wong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17] arXiv:2405.07801 [pdf, other]: Title: Deep Learning-Based Object Pose Estimation: A Comprehensive Survey

Authors: Jian Liu, Wei Sun, Hui Yang, Zhiwen Zeng, Chongpei Liu, Jin Zheng, Xingyu Liu, Hossein Rahmani, Nicu Sebe, Ajmal Mian

Comments: 27 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18] arXiv:2405.07798 [pdf, other]: Title: FreeVA: Offline MLLM as Training-Free Video Assistant

Authors: Wenhao Wu

Comments: Preprint. Work in progress

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[19] arXiv:2405.07784 [pdf, other]: Title: Generating Human Motion in 3D Scenes from Text Descriptions

Authors: Zhi Cen, Huaijin Pi, Sida Peng, Zehong Shen, Minghui Yang, Shuai Zhu, Hujun Bao, Xiaowei Zhou

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20] arXiv:2405.07777 [pdf, other]: Title: GMSR:Gradient-Guided Mamba for Spectral Reconstruction from RGB Images

Authors: Xinying Wang, Zhixiong Huang, Sifan Zhang, Jiawen Zhu, Lin Feng

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[21] arXiv:2405.07776 [pdf, other]: Title: SAR Image Synthesis with Diffusion Models

Authors: Denisa Qosja, Simon Wagner, Daniel O'Hagan

Comments: Published at IEEE Radar Conference 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Signal Processing (eess.SP)
[22] arXiv:2405.07723 [pdf, other]: Title: Coarse or Fine? Recognising Action End States without Labels

Authors: Davide Moltisanti, Hakan Bilen, Laura Sevilla-Lara, Frank Keller

Comments: The Eleventh Workshop on Fine-Grained Visual Categorization (CVPR 24)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23] arXiv:2405.07702 [pdf, other]: Title: FORESEE: Multimodal and Multi-view Representation Learning for Robust Prediction of Cancer Survival

Authors: Liangrui Pan, Yijun Peng, Yan Li, Yiyi Liang, Liwen Xu, Qingchun Liang, Shaoliang Peng

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[24] arXiv:2405.07698 [pdf, other]: Title: oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving

Authors: Abdul Hannan Khan, Syed Tahseen Raza Rizvi, Dheeraj Varma Chittari Macharavtu, Andreas Dengel

Comments: 9 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25] arXiv:2405.07696 [pdf, other]: Title: MonoMAE: Enhancing Monocular 3D Detection through Depth-Aware Masked Autoencoders

Authors: Xueying Jiang, Sheng Jin, Xiaoqin Zhang, Ling Shao, Shijian Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2405.07680 [pdf, other]: Title: Establishing a Unified Evaluation Framework for Human Motion Generation: A Comparative Analysis of Metrics

Authors: Ali Ismail-Fawaz, Maxime Devanne, Stefano Berretti, Jonathan Weber, Germain Forestier

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[27] arXiv:2405.07663 [pdf, other]: Title: Sign Stitching: A Novel Approach to Sign Language Production

Authors: Harry Walsh, Ben Saunders, Richard Bowden

Comments: 18 pages, 3 figures, 4 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[28] arXiv:2405.07655 [pdf, other]: Title: Quality-aware Selective Fusion Network for V-D-T Salient Object Detection

Authors: Liuxin Bao, Xiaofei Zhou, Xiankai Lu, Yaoqi Sun, Haibing Yin, Zhenghui Hu, Jiyong Zhang, Chenggang Yan

Comments: Accepted by IEEE Transactions on Image Processing (TIP)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29] arXiv:2405.07653 [pdf, other]: Title: Fast Training Data Acquisition for Object Detection and Segmentation using Black Screen Luminance Keying

Authors: Thomas Pöllabauer, Volker Knauthe, André Boller, Arjan Kuijper, Dieter Fellner

Comments: 32. International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision'2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[30] arXiv:2405.07648 [pdf, other]: Title: CDFormer:When Degradation Prediction Embraces Diffusion Model for Blind Image Super-Resolution

Authors: Qingguo Liu, Chenyi Zhuang, Pan Gao, Jie Qin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[31] arXiv:2405.07600 [pdf, other]: Title: Integrity Monitoring of 3D Object Detection in Automated Driving Systems using Raw Activation Patterns and Spatial Filtering

Authors: Hakan Yekta Yatbaz, Mehrdad Dianati, Konstantinos Koufos, Roger Woodman

Comments: Submitted to ITSC 2024. arXiv admin note: text overlap with arXiv:2404.07685

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[32] arXiv:2405.07595 [pdf, other]: Title: Environmental Matching Attack Against Unmanned Aerial Vehicles Object Detection

Authors: Dehong Kong, Siyuan Liang, Wenqi Ren

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[33] arXiv:2405.07594 [pdf, other]: Title: RGBD-Glue: General Feature Combination for Robust RGB-D Point Cloud Registration

Authors: Congjia Chen, Xiaoyu Jia, Yanhong Zheng, Yufu Qu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[34] arXiv:2405.07582 [pdf, other]: Title: FRRffusion: Unveiling Authenticity with Diffusion-Based Face Retouching Reversal

Authors: Fengchuang Xing, Xiaowen Shi, Yuan-Gen Wang, Chunsheng Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[35] arXiv:2405.07573 [pdf, other]: Title: MaskFuser: Masked Fusion of Joint Multi-Modal Tokenization for End-to-End Autonomous Driving

Authors: Yiqun Duan, Xianda Guo, Zheng Zhu, Zhen Wang, Yu-Kai Wang, Chin-Teng Lin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36] arXiv:2405.07571 [pdf, other]: Title: TattTRN: Template Reconstruction Network for Tattoo Retrieval

Authors: Lazaro Janier Gonzalez-Soler, Maciej Salwowski, Christian Rathgeb, Daniel Fischer

Comments: Accepted at CVPR Workshop 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[37] arXiv:2405.07550 [pdf, other]: Title: Wild Berry image dataset collected in Finnish forests and peatlands using drones

Authors: Luigi Riz, Sergio Povoli, Andrea Caraffa, Davide Boscaini, Mohamed Lamine Mekhalfi, Paul Chippendale, Marjut Turtiainen, Birgitta Partanen, Laura Smith Ballester, Francisco Blanes Noguera, Alessio Franchi, Elisa Castelli, Giacomo Piccinini, Luca Marchesotti, Micael Santos Couceiro, Fabio Poiesi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[38] arXiv:2405.07524 [pdf, other]: Title: HybridHash: Hybrid Convolutional and Self-Attention Deep Hashing for Image Retrieval

Authors: Chao He, Hongxi Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[39] arXiv:2405.07523 [pdf, other]: Title: Adaptation of Distinct Semantics for Uncertain Areas in Polyp Segmentation

Authors: Quang Vinh Nguyen, Van Thong Huynh, Soo-Hyung Kim

Comments: 13 pages with 7 figures, British Machine Vision Conference 2023

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[40] arXiv:2405.07520 [pdf, ps, other]: Title: Dehazing Remote Sensing and UAV Imagery: A Review of Deep Learning, Prior-based, and Hybrid Approaches

Authors: Gao Yu Lee, Jinkuan Chen, Tanmoy Dam, Md Meftahul Ferdaus, Daniel Puiu Poenar, Vu N Duong

Comments: Submitted to journal and under review, once the paper is accepted, the copyright will be transferred to the corresponding journal

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41] arXiv:2405.07516 [pdf, other]: Title: Support-Query Prototype Fusion Network for Few-shot Medical Image Segmentation

Authors: Xiaoxiao Wu, Zhenguo Gao, Xiaowei Chen, Yakai Wang, Shulei Qu, Na Li

Comments: 19 pages, 7 figures, 4 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[42] arXiv:2405.07481 [pdf, other]: Title: Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis

Authors: Tianci Bi, Xiaoyi Zhang, Zhizheng Zhang, Wenxuan Xie, Cuiling Lan, Yan Lu, Nanning Zheng

Comments: Accepted to CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[43] arXiv:2405.07472 [pdf, other]: Title: GaussianVTON: 3D Human Virtual Try-ON via Multi-Stage Gaussian Splatting Editing with Image Prompting

Authors: Haodong Chen, Yongle Huang, Haojian Huang, Xiangsheng Ge, Dian Shao

Comments: On-going work

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[44] arXiv:2405.07459 [pdf, other]: Title: DualFocus: A Unified Framework for Integrating Positive and Negative Descriptors in Text-based Person Retrieval

Authors: Yuchuan Deng, Zhanpeng Hu, Jiakun Han, Chuang Deng, Qijun Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45] arXiv:2405.07451 [pdf, other]: Title: CLIP-Powered TASS: Target-Aware Single-Stream Network for Audio-Visual Question Answering

Authors: Yuanyuan Jiang, Jianqin Yin

Comments: Submitted to the Journal on February 6, 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46] arXiv:2405.07444 [pdf, other]: Title: Motion Keyframe Interpolation for Any Human Skeleton via Temporally Consistent Point Cloud Sampling and Reconstruction

Authors: Clinton Mo, Kun Hu, Chengjiang Long, Dong Yuan, Zhiyong Wang

Comments: 17 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[47] arXiv:2405.07425 [pdf, other]: Title: Sakuga-42M Dataset: Scaling Up Cartoon Research

Authors: Zhenglin Pan, Yu Zhu, Yuxuan Mu

Comments: Arxiv Pre-print. Work in Progress

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[48] arXiv:2405.07411 [pdf, other]: Title: MoVL:Exploring Fusion Strategies for the Domain-Adaptive Application of Pretrained Models in Medical Imaging Tasks

Authors: Haijiang Tian, Jingkun Yue, Xiaohong Liu, Guoxing Yang, Zeyu Jiang, Guangyu Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[49] arXiv:2405.07407 [pdf, other]: Title: PitcherNet: Powering the Moneyball Evolution in Baseball Video Analytics

Authors: Jerrin Bright, Bavesh Balaji, Yuhao Chen, David A Clausi, John S Zelek

Comments: IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW'24)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[50] arXiv:2405.07399 [pdf, other]: Title: Semi-Supervised Weed Detection for Rapid Deployment and Enhanced Efficiency

Authors: Alzayat Saleh, Alex Olsen, Jake Wood, Bronson Philippa, Mostafa Rahimi Azghadi

Comments: 16 pages, 4 figures, 6 tables. Submitted to Elsevier

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[51] arXiv:2405.07369 [pdf, other]: Title: Incorporating Anatomical Awareness for Enhanced Generalizability and Progression Prediction in Deep Learning-Based Radiographic Sacroiliitis Detection

Authors: Felix J. Dorfner, Janis L. Vahldiek, Leonhard Donle, Andrei Zhukov, Lina Xu, Hartmut Häntze, Marcus R. Makowski, Hugo J.W.L. Aerts, Fabian Proft, Valeria Rios Rodriguez, Judith Rademacher, Mikhail Protopopov, Hildrun Haibel, Torsten Diekhoff, Murat Torgutalp, Lisa C. Adams, Denis Poddubnyy, Keno K. Bressem

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[52] arXiv:2405.07364 [pdf, other]: Title: BoQ: A Place is Worth a Bag of Learnable Queries

Authors: Amar Ali-bey, Brahim Chaib-draa, Philippe Giguère

Comments: Accepted at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[53] arXiv:2405.07346 [pdf, other]: Title: Understanding and Evaluating Human Preferences for AI Generated Images with Instruction Tuning

Authors: Jiarui Wang, Huiyu Duan, Guangtao Zhai, Xiongkuo Min

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[54] arXiv:2405.07332 [pdf, other]: Title: PotatoGANs: Utilizing Generative Adversarial Networks, Instance Segmentation, and Explainable AI for Enhanced Potato Disease Identification and Classification

Authors: Mohammad Shafiul Alam, Fatema Tuj Johora Faria, Mukaffi Bin Moin, Ahmed Al Wase, Md. Rabius Sani, Khan Md Hasib

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[55] arXiv:2405.07319 [pdf, other]: Title: LayGA: Layered Gaussian Avatars for Animatable Clothing Transfer

Authors: Siyou Lin, Zhe Li, Zhaoqi Su, Zerong Zheng, Hongwen Zhang, Yebin Liu

Comments: SIGGRAPH 2024 conference track

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[56] arXiv:2405.07306 [pdf, other]: Title: Point Resampling and Ray Transformation Aid to Editable NeRF Models

Authors: Zhenyang Li, Zilong Chen, Feifan Qu, Mingqing Wang, Yizhou Zhao, Kai Zhang, Yifan Peng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[57] arXiv:2405.07293 [pdf, other]: Title: Sparse Sampling is All You Need for Fast Wrong-way Cycling Detection in CCTV Videos

Authors: Jing Xu, Wentao Shi, Sheng Ren, Pan Gao, Peng Zhou, Jie Qin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[58] arXiv:2405.07288 [pdf, other]: Title: Erasing Concepts from Text-to-Image Diffusion Models with Few-shot Unlearning

Authors: Masane Fuchi, Tomohiro Takagi

Comments: 23 pages, 28 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[59] arXiv:2405.07284 [pdf, ps, other]: Title: Zero Shot Context-Based Object Segmentation using SLIP (SAM+CLIP)

Authors: Saaketh Koundinya Gundavarapu, Arushi Arora, Shreya Agarwal

Comments: 5 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[60] arXiv:2405.07272 [pdf, ps, other]: Title: MAML MOT: Multiple Object Tracking based on Meta-Learning

Authors: Jiayi Chen, Chunhua Deng

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[61] arXiv:2405.07257 [pdf, other]: Title: Listen, Disentangle, and Control: Controllable Speech-Driven Talking Head Generation

Authors: Changpeng Cai, Guinan Guo, Jiao Li, Junhao Su, Chenghao He, Jing Xiao, Yuanxu Chen, Lei Dai, Feiyu Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[62] arXiv:2405.07202 [pdf, other]: Title: Unified Video-Language Pre-training with Synchronized Audio

Authors: Shentong Mo, Haofan Wang, Huaxia Li, Xu Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[63] arXiv:2405.07201 [pdf, other]: Title: Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception

Authors: Haoming Chen, Zhizhong Zhang, Yanyun Qu, Ruixin Zhang, Xin Tan, Yuan Xie

Comments: Accepted to CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[64] arXiv:2405.07194 [pdf, other]: Title: Differentiable Model Scaling using Differentiable Topk

Authors: Kai Liu, Ruohui Wang, Jianfei Gao, Kai Chen

Comments: Accepted by ICML 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[65] arXiv:2405.07178 [pdf, other]: Title: Hologram: Realtime Holographic Overlays via LiDAR Augmented Reconstruction

Authors: Ekansh Agrawal

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[66] arXiv:2405.07174 [pdf, other]: Title: CRSFL: Cluster-based Resource-aware Split Federated Learning for Continuous Authentication

Authors: Mohamad Wazzeh, Mohamad Arafeh, Hani Sami, Hakima Ould-Slimane, Chamseddine Talhi, Azzam Mourad, Hadi Otrok

Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[67] arXiv:2405.07171 [pdf, other]: Title: Enhanced Online Test-time Adaptation with Feature-Weight Cosine Alignment

Authors: WeiQin Chuah, Ruwan Tennakoon, Alireza Bab-Hadiashar

Comments: 22 pages, 7 figures, 8 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[68] arXiv:2405.07167 [pdf, other]: Title: 3D Hand Mesh Recovery from Monocular RGB in Camera Space

Authors: Haonan Li, Patrick P. K. Chen, Yitong Zhou

Comments: 21 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[69] arXiv:2405.07166 [pdf, other]: Title: Resource Efficient Perception for Vision Systems

Authors: A V Subramanyam, Niyati Singal, Vinay K Verma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[70] arXiv:2405.07164 [pdf, other]: Title: Modeling Pedestrian Intrinsic Uncertainty for Multimodal Stochastic Trajectory Prediction via Energy Plan Denoising

Authors: Yao Liu, Quan Z. Sheng, Lina Yao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[71] arXiv:2405.07157 [pdf, other]: Title: Semi-Self-Supervised Domain Adaptation: Developing Deep Learning Models with Limited Annotated Data for Wheat Head Segmentation

Authors: Alireza Ghanbari, Gholamhassan Shirdel, Farhad Maleki

Comments: 12

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[72] arXiv:2405.07155 [pdf, other]: Title: Enhancing Multi-modal Learning: Meta-learned Cross-modal Knowledge Distillation for Handling Missing Modalities

Authors: Hu Wang, Congbo Ma, Yuyuan Liu, Yuanhong Chen, Yu Tian, Jodie Avery, Louise Hull, Gustavo Carneiro

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[73] arXiv:2405.07121 [pdf, other]: Title: In The Wild Ellipse Parameter Estimation for Circular Dining Plates and Bowls

Authors: Akil Pathiranage, Chris Czarnecki, Yuhao Chen, Pengcheng Xi, Linlin Xu, Alexander Wong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[74] arXiv:2405.07116 [pdf, other]: Title: CoViews: Adaptive Augmentation Using Cooperative Views for Enhanced Contrastive Learning

Authors: Nazim Bendib

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[75] arXiv:2405.07047 [pdf, other]: Title: Unsupervised Density Neural Representation for CT Metal Artifact Reduction

Authors: Qing Wu, Xu Guo, Lixuan Chen, Dongming He, Hongjiang Wei, Xudong Wang, S. Kevin Zhou, Yifeng Zhang, Jingyi Yu, Yuyao Zhang

Comments: 13 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[76] arXiv:2405.07046 [pdf, other]: Title: Retrieval Enhanced Zero-Shot Video Captioning

Authors: Yunchuan Ma, Laiyun Qing, Guorong Li, Yuankai Qi, Quan Z. Sheng, Qingming Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[77] arXiv:2405.07044 [pdf, other]: Title: Semantic Guided Large Scale Factor Remote Sensing Image Super-resolution with Generative Diffusion Prior

Authors: Ce Wang, Wanjie Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[78] arXiv:2405.07031 [pdf, other]: Title: Global Motion Understanding in Large-Scale Video Object Segmentation

Authors: Volodymyr Fedynyak, Yaroslav Romanus, Oles Dobosevych, Igor Babin, Roman Riazantsev

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[79] arXiv:2405.07027 [pdf, other]: Title: TD-NeRF: Novel Truncated Depth Prior for Joint Camera Pose and Neural Radiance Field Optimization

Authors: Zhen Tan, Zongtan Zhou, Yangbing Ge, Zi Wang, Xieyuanli Chen, Dewen Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[80] arXiv:2405.07012 [pdf, other]: Title: Incorporating Degradation Estimation in Light Field Spatial Super-Resolution

Authors: Zeyu Xiao, Zhiwei Xiong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[81] arXiv:2405.06994 [pdf, other]: Title: GRASP-GCN: Graph-Shape Prioritization for Neural Architecture Search under Distribution Shifts

Authors: Sofia Casarin, Oswald Lanz, Sergio Escalera

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[82] arXiv:2405.06980 [pdf, other]: Title: Fractals as Pre-training Datasets for Anomaly Detection and Localization

Authors: C. I. Ugwu, S. Casarin, O. Lanz

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[83] arXiv:2405.06948 [pdf, other]: Title: Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation

Authors: Shengyuan Liu, Bo Wang, Ye Ma, Te Yang, Xipeng Cao, Quan Chen, Han Li, Di Dong, Peng Jiang

Comments: 26 pages, 13 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[84] arXiv:2405.06945 [pdf, other]: Title: Direct Learning of Mesh and Appearance via 3D Gaussian Splatting

Authors: Ancheng Lin, Jun Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[85] arXiv:2405.06944 [pdf, other]: Title: Learning Monocular Depth from Focus with Event Focal Stack

Authors: Chenxu Jiang, Mingyuan Lin, Chi Zhang, Zhenghai Wang, Lei Yu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[86] arXiv:2405.06929 [pdf, other]: Title: PRENet: A Plane-Fit Redundancy Encoding Point Cloud Sequence Network for Real-Time 3D Action Recognition

Authors: Shenglin He, Xiaoyang Qu, Jiguang Wan, Guokuan Li, Changsheng Xie, Jianzong Wang

Comments: Accepted by the 2024 International Joint Conference on Neural Networks (IJCNN 2024)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[87] arXiv:2405.06926 [pdf, other]: Title: TAI++: Text as Image for Multi-Label Image Classification by Co-Learning Transferable Prompt

Authors: Xiangyu Wu, Qing-Yuan Jiang, Yang Yang, Yi-Feng Wu, Qing-Guo Chen, Jianfeng Lu

Comments: Accepted for publication at IJCAI 2024; 13 pages; 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[88] arXiv:2405.06918 [pdf, other]: Title: Super-Resolving Blurry Images with Events

Authors: Chi Zhang, Mingyuan Lin, Xiang Zhang, Chenxu Jiang, Lei Yu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[89] arXiv:2405.06916 [pdf, other]: Title: High-order Neighborhoods Know More: HyperGraph Learning Meets Source-free Unsupervised Domain Adaptation

Authors: Jinkun Jiang, Qingxuan Lv, Yuezun Li, Yong Du, Sheng Chen, Hui Yu, Junyu Dong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[90] arXiv:2405.06914 [pdf, other]: Title: Non-confusing Generation of Customized Concepts in Diffusion Models

Authors: Wang Lin, Jingyuan Chen, Jiaxin Shi, Yichen Zhu, Chen Liang, Junzhong Miao, Tao Jin, Zhou Zhao, Fei Wu, Shuicheng Yan, Hanwang Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[91] arXiv:2405.06911 [pdf, other]: Title: Replication Study and Benchmarking of Real-Time Object Detection Models

Authors: Pierre-Luc Asselin, Vincent Coulombe, William Guimont-Martin, William Larrivée-Hardy

Comments: Authors are presented in alphabetical order, each having equal contribution to the work. Copyright may be transferred without notice, after which this version may no longer be accessible

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[92] arXiv:2405.06903 [pdf, other]: Title: UniGarmentManip: A Unified Framework for Category-Level Garment Manipulation via Dense Visual Correspondence

Authors: Ruihai Wu, Haoran Lu, Yiyan Wang, Yubo Wang, Hao Dong

Comments: CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[93] arXiv:2405.06893 [pdf, other]: Title: ADLDA: A Method to Reduce the Harm of Data Distribution Shift in Data Augmentation

Authors: Haonan Wang

Comments: 8 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[94] arXiv:2405.06887 [pdf, other]: Title: FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment

Authors: Jinglin Xu, Sibo Yin, Guohao Zhao, Zishuo Wang, Yuxin Peng

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[95] arXiv:2405.06875 [pdf, other]: Title: LogicAL: Towards logical anomaly synthesis for unsupervised anomaly localization

Authors: Ying Zhao

Comments: Accepted to Visual Anomaly and Novelty Detection (VAND) 2.0 Workshop at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[96] arXiv:2405.06872 [pdf, other]: Title: eCAR: edge-assisted Collaborative Augmented Reality Framework

Authors: Jinwoo Jeon, Woontack Woo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[97] arXiv:2405.06865 [pdf, other]: Title: Disrupting Style Mimicry Attacks on Video Imagery

Authors: Josephine Passananti, Stanley Wu, Shawn Shan, Haitao Zheng, Ben Y. Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[98] arXiv:2405.06849 [pdf, other]: Title: GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs

Authors: Mustafa Munir, William Avery, Md Mostafijur Rahman, Radu Marculescu

Comments: Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[99] arXiv:2405.06845 [pdf, other]: Title: CasCalib: Cascaded Calibration for Motion Capture from Sparse Unsynchronized Cameras

Authors: James Tang, Shashwat Suri, Daniel Ajisafe, Bastian Wandt, Helge Rhodin

Comments: Accepted to the 18th IEEE International Conference on Automatic Face and Gesture Recognition

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[100] arXiv:2405.06841 [pdf, other]: Title: Bridging the Gap: Protocol Towards Fair and Consistent Affect Analysis

Authors: Guanyu Hu, Eleni Papadopoulou, Dimitrios Kollias, Paraskevi Tzouveli, Jie Wei, Xinyu Yang

Comments: accepted at IEEE FG 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[101] arXiv:2405.06828 [pdf, other]: Title: G-FARS: Gradient-Field-based Auto-Regressive Sampling for 3D Part Grouping

Authors: Junfeng Cheng, Tania Stathaki

Comments: CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[102] arXiv:2405.06821 [pdf, other]: Title: Synchronized Object Detection for Autonomous Sorting, Mapping, and Quantification of Medical Materials

Authors: Federico Zocco, Daniel Lake, Shahin Rahimifard

Comments: To be submitted

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[103] arXiv:2405.06814 [pdf, other]: Title: Dual-Task Vision Transformer for Rapid and Accurate Intracerebral Hemorrhage Classification on CT Images

Authors: Jialiang Fan, Guoyu Lu, Xinhui Fan

Comments: 9 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[104] arXiv:2405.06782 [pdf, other]: Title: GraphRelate3D: Context-Dependent 3D Object Detection with Inter-Object Relationship Graphs

Authors: Mingyu Liu, Ekim Yurtsever, Marc Brede, Jun Meng, Walter Zimmer, Xingcheng Zhou, Bare Luka Zagar, Yuning Cui, Alois Knoll

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[105] arXiv:2405.06778 [pdf, other]: Title: Shape Conditioned Human Motion Generation with Diffusion Model

Authors: Kebing Xue, Hyewon Seo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[106] arXiv:2405.06765 [pdf, other]: Title: Common Corruptions for Enhancing and Evaluating Robustness in Air-to-Air Visual Object Detection

Authors: Anastasios Arsenos, Vasileios Karampinis, Evangelos Petrongonas, Christos Skliros, Dimitrios Kollias, Stefanos Kollias, Athanasios Voulodimos

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[107] arXiv:2405.06749 [pdf, other]: Title: Ensuring UAV Safety: A Vision-only and Real-time Framework for Collision Avoidance Through Object Detection, Tracking, and Distance Estimation

Authors: Vasileios Karampinis, Anastasios Arsenos, Orfeas Filippopoulos, Evangelos Petrongonas, Christos Skliros, Dimitrios Kollias, Stefanos Kollias, Athanasios Voulodimos

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[108] arXiv:2405.07991 (cross-list from cs.RO) [pdf, other]: Title: SPIN: Simultaneous Perception, Interaction and Navigation

Authors: Shagun Uppal, Ananye Agarwal, Haoyu Xiong, Kenneth Shaw, Deepak Pathak

Comments: In CVPR 2024. Website at this https URL

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Systems and Control (eess.SY)
[109] arXiv:2405.07990 (cross-list from cs.CL) [pdf, other]: Title: Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots

Authors: Chengyue Wu, Yixiao Ge, Qiushan Guo, Jiahao Wang, Zhixuan Liang, Zeyu Lu, Ying Shan, Ping Luo

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[110] arXiv:2405.07987 (cross-list from cs.LG) [pdf, other]: Title: The Platonic Representation Hypothesis

Authors: Minyoung Huh, Brian Cheung, Tongzhou Wang, Phillip Isola

Comments: Equal contributions

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[111] arXiv:2405.07930 (cross-list from cs.MM) [pdf, other]: Title: Improving Multimodal Learning with Multi-Loss Gradient Modulation

Authors: Konstantinos Kontras, Christos Chatzichristos, Matthew Blaschko, Maarten De Vos

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[112] arXiv:2405.07905 (cross-list from eess.IV) [pdf, other]: Title: PLUTO: Pathology-Universal Transformer

Authors: Dinkar Juyal, Harshith Padigela, Chintan Shah, Daniel Shenker, Natalia Harguindeguy, Yi Liu, Blake Martin, Yibo Zhang, Michael Nercessian, Miles Markey, Isaac Finberg, Kelsey Luu, Daniel Borders, Syed Ashar Javed, Emma Krause, Raymond Biju, Aashish Sood, Allen Ma, Jackson Nyman, John Shamshoian, Guillaume Chhor, Darpan Sanghavi, Marc Thibault, Limin Yu, Fedaa Najdawi, Jennifer A. Hipp, Darren Fahy, Benjamin Glass, Eric Walk, John Abel, Harsha Pokkalla, Andrew H. Beck, Sean Grullon

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[113] arXiv:2405.07869 (cross-list from eess.IV) [pdf, other]: Title: Enhancing Clinically Significant Prostate Cancer Prediction in T2-weighted Images through Transfer Learning from Breast Cancer

Authors: Chi-en Amy Tai, Alexander Wong

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[114] arXiv:2405.07861 (cross-list from eess.IV) [pdf, other]: Title: Improving Breast Cancer Grade Prediction with Multiparametric MRI Created Using Optimized Synthetic Correlated Diffusion Imaging

Authors: Chi-en Amy Tai, Alexander Wong

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[115] arXiv:2405.07854 (cross-list from eess.IV) [pdf, other]: Title: Using Multiparametric MRI with Optimized Synthetic Correlated Diffusion Imaging to Enhance Breast Cancer Pathologic Complete Response Prediction

Authors: Chi-en Amy Tai, Alexander Wong

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[116] arXiv:2405.07842 (cross-list from astro-ph.IM) [pdf, other]: Title: Ground-based Image Deconvolution with Swin Transformer UNet

Authors: Utsav Akhaury, Pascale Jablonka, Jean-Luc Starck, Frédéric Courbin

Comments: 11 pages, 14 figures

Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Computer Vision and Pattern Recognition (cs.CV)
[117] arXiv:2405.07827 (cross-list from cs.MM) [pdf, other]: Title: Automatic Recognition of Food Ingestion Environment from the AIM-2 Wearable Sensor

Authors: Yuning Huang, Mohamed Abul Hassan, Jiangpeng He, Janine Higgins, Megan McCrory, Heather Eicher-Miller, Graham Thomas, Edward O Sazonov, Fengqing Maggie Zhu

Comments: Accepted at CVPRw 2024

Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[118] arXiv:2405.07813 (cross-list from cs.LG) [pdf, other]: Title: Localizing Task Information for Improved Model Merging and Compression

Authors: Ke Wang, Nikolaos Dimitriadis, Guillermo Ortiz-Jimenez, François Fleuret, Pascal Frossard

Comments: Accepted at ICML 2024; The first two authors contributed equally to this work; Project website: this https URL

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[119] arXiv:2405.07780 (cross-list from cs.LG) [pdf, other]: Title: Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition

Authors: Zhiyong Yang, Qianqian Xu, Zitai Wang, Sicong Li, Boyu Han, Shilong Bao, Xiaochun Cao, Qingming Huang

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[120] arXiv:2405.07762 (cross-list from eess.IV) [pdf, other]: Title: A method for supervoxel-wise association studies of age and other non-imaging variables from coronary computed tomography angiograms

Authors: Johan Öfverstedt, Elin Lundström, Göran Bergström, Joel Kullberg, Håkan Ahlström

Comments: 34 pages

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[121] arXiv:2405.07674 (cross-list from eess.IV) [pdf, other]: Title: CoVScreen: Pitfalls and recommendations for screening COVID-19 using Chest X-rays

Authors: Sonit Singh

Comments: 21 pages

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[122] arXiv:2405.07606 (cross-list from cs.HC) [pdf, other]: Title: AIris: An AI-powered Wearable Assistive Device for the Visually Impaired

Authors: Dionysia Danai Brilli, Evangelos Georgaras, Stefania Tsilivaki, Nikos Melanitis, Konstantina Nikita

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
[123] arXiv:2405.07544 (cross-list from cs.RO) [pdf, other]: Title: Automatic Odometry-Less OpenDRIVE Generation From Sparse Point Clouds

Authors: Leon Eisemann, Johannes Maucher

Comments: 8 pages, 4 figures, 3 algorithms, 2 tables

Journal-ref: 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC)

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[124] arXiv:2405.07489 (cross-list from cs.LG) [pdf, other]: Title: Sparse Domain Transfer via Elastic Net Regularization

Authors: Jingwei Zhang, Farzan Farnia

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[125] arXiv:2405.07392 (cross-list from cs.RO) [pdf, other]: Title: NGD-SLAM: Towards Real-Time SLAM for Dynamic Environments without GPU

Authors: Yuhao Zhang

Comments: 12 pages, 5 figures

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[126] arXiv:2405.07338 (cross-list from eess.IV) [pdf, other]: Title: Explainable Convolutional Neural Networks for Retinal Fundus Classification and Cutting-Edge Segmentation Models for Retinal Blood Vessels from Fundus Images

Authors: Fatema Tuj Johora Faria, Mukaffi Bin Moin, Pronay Debnath, Asif Iftekher Fahim, Faisal Muhammad Shah

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2405.07309 (cross-list from cs.RO) [pdf, other]: Title: DiffGen: Robot Demonstration Generation via Differentiable Physics Simulation, Differentiable Rendering, and Vision-Language Model

Authors: Yang Jin, Jun Lv, Shuqiang Jiang, Cewu Lu

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[128] arXiv:2405.07283 (cross-list from cs.RO) [pdf, other]: Title: BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global Maps

Authors: Mingkai Jia, Qingwen Zhang, Bowen Yang, Jin Wu, Ming Liu, Patric Jensfelt

Comments: The first two authors are co-first authors. 8 pages, accepted by RA-L

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[129] arXiv:2405.07256 (cross-list from eess.IV) [pdf, other]: Title: Leveraging Fixed and Dynamic Pseudo-labels for Semi-supervised Medical Image Segmentation

Authors: Suruchi Kumari, Pravendra Singh

Comments: Under Review

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[130] arXiv:2405.07145 (cross-list from cs.CR) [pdf, other]: Title: Stable Signature is Unstable: Removing Image Watermark from Diffusion Models

Authors: Yuepeng Hu, Zhengyuan Jiang, Moyang Guo, Neil Gong

Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[131] arXiv:2405.07041 (cross-list from cs.RO) [pdf, other]: Title: Multi-agent Traffic Prediction via Denoised Endpoint Distribution

Authors: Yao Liu, Ruoyu Wang, Yuanjiang Cao, Quan Z. Sheng, Lina Yao

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[132] arXiv:2405.07033 (cross-list from cs.NI) [pdf, ps, other]: Title: A Performance Analysis Modeling Framework for Extended Reality Applications in Edge-Assisted Wireless Networks

Authors: Anik Mallik, Jiang Xie, Zhu Han

Comments: 12 pages, 4 figures; To appear in Proceedings of IEEE International Conference on Distributed Computing Systems (ICDCS), 2024

Subjects: Networking and Internet Architecture (cs.NI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Image and Video Processing (eess.IV)
[133] arXiv:2405.07023 (cross-list from eess.IV) [pdf, other]: Title: Efficient Real-world Image Super-Resolution Via Adaptive Directional Gradient Convolution

Authors: Long Peng, Yang Cao, Renjing Pei, Wenbo Li, Jiaming Guo, Xueyang Fu, Yang Wang, Zheng-Jun Zha

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[134] arXiv:2405.07001 (cross-list from cs.CL) [pdf, other]: Title: Evaluating Task-based Effectiveness of MLLMs on Charts

Authors: Yifan Wu, Lutao Yan, Yuyu Luo, Yunhai Wang, Nan Tang

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[135] arXiv:2405.06995 (cross-list from cs.SD) [pdf, other]: Title: Benchmarking Cross-Domain Audio-Visual Deception Detection

Authors: Xiaobao Guo, Zitong Yu, Nithish Muthuchamy Selvaraj, Bingquan Shen, Adams Wai-Kin Kong, Alex C. Kot

Comments: 10 pages

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[136] arXiv:2405.06880 (cross-list from eess.IV) [pdf, other]: Title: EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image Segmentation

Authors: Md Mostafijur Rahman, Mustafa Munir, Radu Marculescu

Comments: 14 pages, 5 figures, 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[137] arXiv:2405.06859 (cross-list from cs.LG) [pdf, other]: Title: Reimplementation of Learning to Reweight Examples for Robust Deep Learning

Authors: Parth Patil, Ben Boardley, Jack Gardner, Emily Loiselle, Deerajkumar Parthipan

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[138] arXiv:2405.06855 (cross-list from cs.LG) [pdf, other]: Title: Linear Explanations for Individual Neurons

Authors: Tuomas Oikarinen, Tsui-Wei Weng

Comments: Published in ICML 2024

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[139] arXiv:2405.06789 (cross-list from eess.IV) [pdf, other]: Title: Self-Consistent Recursive Diffusion Bridge for Medical Image Translation

Authors: Fuat Arslan, Bilal Kabas, Onat Dalmaz, Muzaffer Ozbey, Tolga Çukur

Comments: 11 pages, 6 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[140] arXiv:2405.06786 (cross-list from eess.IV) [pdf, other]: Title: SAM3D: Zero-Shot Semi-Automatic Segmentation in 3D Medical Images with the Segment Anything Model

Authors: Trevor J. Chan, Aarush Sahni, Jie Li, Alisha Luthra, Amy Fang, Alison Pouch, Chamith S. Rajapakse

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[141] arXiv:2405.06702 (cross-list from cs.CL) [pdf, other]: Title: Malayalam Sign Language Identification using Finetuned YOLOv8 and Computer Vision Techniques

Authors: Abhinand K., Abhiram B. Nair, Dhananjay C., Hanan Hamza, Mohammed Fawaz J., Rahma Fahim K., Anoop V. S

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[142] arXiv:2405.06646 (cross-list from cs.GR) [pdf, other]: Title: On-the-fly Learning to Transfer Motion Style with Diffusion Models: A Semantic Guidance Approach

Authors: Lei Hu, Zihao Zhang, Yongjing Ye, Yiwen Xu, Shihong Xia

Comments: 23 pages

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)

Mon, 13 May 2024

[143] arXiv:2405.06636 [pdf, other]: Title: Federated Document Visual Question Answering: A Pilot Study

Authors: Khanh Nguyen, Dimosthenis Karatzas

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[144] arXiv:2405.06634 [pdf, other]: Title: Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA Benchmark

Authors: Evan M. Williams, Kathleen M. Carley

Comments: 11 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[145] arXiv:2405.06600 [pdf, other]: Title: Multi-Object Tracking in the Dark

Authors: Xinzhe Wang, Kang Ma, Qiankun Liu, Yunhao Zou, Ying Fu

Comments: Accepted by CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[146] arXiv:2405.06598 [pdf, other]: Title: A Lightweight Transformer for Remote Sensing Image Change Captioning

Authors: Dongwei Sun, Yajie Bao, Xiangyong Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147] arXiv:2405.06593 [pdf, other]: Title: Non-Uniform Spatial Alignment Errors in sUAS Imagery From Wide-Area Disasters

Authors: Thomas Manzini, Priyankari Perali, Raisa Karnik, Mihir Godbole, Hasnat Abdullah, Robin Murphy

Comments: 6 pages, 5 figures, 1 table

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148] arXiv:2405.06586 [pdf, other]: Title: Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End Approach

Authors: Elham Ravanbakhsh, Cheng Niu, Yongqing Liang, J. Ramanujam, Xin Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149] arXiv:2405.06574 [pdf, other]: Title: Deep video representation learning: a survey

Authors: Elham Ravanbakhsh, Yongqing Liang, J. Ramanujam, Xin Li

Comments: Multimedia Tools and Applications (2023) 1-31

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[150] arXiv:2405.06547 [pdf, other]: Title: OneTo3D: One Image to Re-editable Dynamic 3D Model and Video Generation

Authors: Jinwei Lin

Comments: 24 pages, 13 figures, 2 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[151] arXiv:2405.06536 [pdf, other]: Title: Mesh Denoising Transformer

Authors: Wenbo Zhao, Xianming Liu, Deming Zhai, Junjun Jiang, Xiangyang Ji

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152] arXiv:2405.06535 [pdf, other]: Title: Controllable Image Generation With Composed Parallel Token Prediction

Authors: Jamie Stirling, Noura Al-Moubayed

Comments: 9 pages, 6 figures, non-anonymised pre-print for NeurIPS 2024 main conference. arXiv admin note: text overlap with arXiv:2402.04550, arXiv:2404.13788, arXiv:2403.06098, arXiv:2401.16025

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[153] arXiv:2405.06525 [pdf, other]: Title: Semantic and Spatial Adaptive Pixel-level Classifier for Semantic Segmentation

Authors: Xiaowen Ma, Zhenliang Ni, Xinghao Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[154] arXiv:2405.06502 [pdf, other]: Title: Multi-Target Unsupervised Domain Adaptation for Semantic Segmentation without External Data

Authors: Yonghao Xu, Pedram Ghamisi, Yannis Avrithis

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[155] arXiv:2405.06468 [pdf, other]: Title: Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image Classification

Authors: Yaoqin Ye, Junjie Zhang, Hongwei Shi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[156] arXiv:2405.06467 [pdf, other]: Title: Attend, Distill, Detect: Attention-aware Entropy Distillation for Anomaly Detection

Authors: Sushovan Jena, Vishwas Saini, Ujjwal Shaw, Pavitra Jain, Abhay Singh Raihal, Anoushka Banerjee, Sharad Joshi, Ananth Ganesh, Arnav Bhavsar

Comments: 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[157] arXiv:2405.06408 [pdf, other]: Title: I3DGS: Improve 3D Gaussian Splatting from Multiple Dimensions

Authors: Jinwei Lin

Comments: 16 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[158] arXiv:2405.06389 [pdf, other]: Title: Continual Novel Class Discovery via Feature Enhancement and Adaptation

Authors: Yifan Yu, Shaokun Wang, Yuhang He, Junzhe Chen, Yihong Gong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[159] arXiv:2405.06383 [pdf, other]: Title: How to Augment for Atmospheric Turbulence Effects on Thermal Adapted Object Detection Models?

Authors: Engin Uzun, Erdem Akagunduz

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[160] arXiv:2405.06354 [pdf, other]: Title: KeepOriginalAugment: Single Image-based Better Information-Preserving Data Augmentation Approach

Authors: Teerath Kumar, Alessandra Mileo, Malika Bendechache

Comments: This paper has been accepted at 20th International Conference on Artificial Intelligence Applications and Innovations 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[161] arXiv:2405.06345 [pdf, other]: Title: Evaluating Adversarial Robustness in the Spatial Frequency Domain

Authors: Keng-Hsin Liao, Chin-Yuan Yeh, Hsi-Wen Chen, Ming-Syan Chen

Comments: 14 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[162] arXiv:2405.06342 [pdf, other]: Title: Compression-Realized Deep Structural Network for Video Quality Enhancement

Authors: Hanchi Sun, Xiaohong Liu, Xinyang Jiang, Yifei Shen, Dongsheng Li, Xiongkuo Min, Guangtao Zhai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[163] arXiv:2405.06340 [pdf, other]: Title: Improving Transferable Targeted Adversarial Attack via Normalized Logit Calibration and Truncated Feature Mixing

Authors: Juanjuan Weng, Zhiming Luo, Shaozi Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164] arXiv:2405.06323 [pdf, other]: Title: Open Access Battle Damage Detection via Pixel-Wise T-Test on Sentinel-1 Imagery

Authors: Ollie Ballinger

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[165] arXiv:2405.06319 [pdf, other]: Title: Decoding Emotions in Abstract Art: Cognitive Plausibility of CLIP in Recognizing Color-Emotion Associations

Authors: Hanna-Sophia Widhoelzl, Ece Takmaz

Comments: To appear in the Proceedings of the Annual Meeting of the Cognitive Science Society 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[166] arXiv:2405.06288 [pdf, other]: Title: PCLMix: Weakly Supervised Medical Image Segmentation via Pixel-Level Contrastive Learning and Dynamic Mix Augmentation

Authors: Yu Lei, Haolun Luo, Lituan Wang, Zhenwei Zhang, Lei Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167] arXiv:2405.06283 [pdf, other]: Title: Novel Class Discovery for Ultra-Fine-Grained Visual Categorization

Authors: Yu Liu, Yaqi Cai, Qi Jia, Binglin Qiu, Weimin Wang, Nan Pu

Comments: 10 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168] arXiv:2405.06279 [pdf, other]: Title: Benchmarking Classical and Learning-Based Multibeam Point Cloud Registration

Authors: Li Ling, Jun Zhang, Nils Bore, John Folkesson, Anna Wåhlin

Comments: Accepted at ICRA 2024 (IEEE International Conference on Robotics and Automation 2024)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[169] arXiv:2405.06278 [pdf, other]: Title: Exploring the Interplay of Interpretability and Robustness in Deep Neural Networks: A Saliency-guided Approach

Authors: Amira Guesmi, Nishant Suresh Aswani, Muhammad Shafique

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[170] arXiv:2405.06277 [pdf, other]: Title: Learning A Spiking Neural Network for Efficient Image Deraining

Authors: Tianyu Song, Guiyue Jin, Pengpeng Li, Kui Jiang, Xiang Chen, Jiyu Jin

Comments: Accepted by IJCAI2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171] arXiv:2405.06264 [pdf, other]: Title: Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane Detection

Authors: Yunqian Fan, Xiuying Wei, Ruihao Gong, Yuqing Ma, Xiangguo Zhang, Qi Zhang, Xianglong Liu

Comments: Accepted by AAAI-24

Journal-ref: AAAI 2024, 38, 11936-11943

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172] arXiv:2405.06260 [pdf, other]: Title: Precise Apple Detection and Localization in Orchards using YOLOv5 for Robotic Harvesting Systems

Authors: Jiang Ziyue, Yin Bo, Lu Boyun

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[173] arXiv:2405.06246 [pdf, ps, other]: Title: Comparative Analysis of Advanced Feature Matching Algorithms in Challenging High Spatial Resolution Optical Satellite Stereo Scenarios

Authors: Qiyan Luo, Jidan Zhang, Yuzhen Xie, Xu Huang, Ting Han

Comments: The manuscript is accepted as an Oral Presentation at the IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2024)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174] arXiv:2405.06241 [pdf, other]: Title: MGS-SLAM: Monocular Sparse Tracking and Gaussian Mapping with Depth Smooth Regularization

Authors: Pengcheng Zhu, Yaoming Zhuang, Baoquan Chen, Li Li, Chengdong Wu, Zhanlin Liu

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[175] arXiv:2405.06228 [pdf, other]: Title: Context-Guided Spatial Feature Reconstruction for Efficient Semantic Segmentation

Authors: Zhenliang Ni, Xinghao Chen, Yingjie Zhai, Yehui Tang, Yunhe Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176] arXiv:2405.06227 [pdf, other]: Title: MaskMatch: Boosting Semi-Supervised Learning Through Mask Autoencoder-Driven Feature Learning

Authors: Wenjin Zhang, Keyi Li, Sen Yang, Chenyang Gao, Wanzhao Yang, Sifan Yuan, Ivan Marsic

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177] arXiv:2405.06217 [pdf, other]: Title: DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual Grounding

Authors: Ting Liu, Xuyang Liu, Siteng Huang, Honggang Chen, Quanjun Yin, Long Qin, Donglin Wang, Yue Hu

Comments: Accepted by ICME 2024 (Oral)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[178] arXiv:2405.06216 [pdf, other]: Title: Event-based Structure-from-Orbit

Authors: Ethan Elms (1), Yasir Latif (1), Tae Ha Park (2), Tat-Jun Chin (1) ((1) The University of Adelaide, (2) Stanford University)

Comments: This work will be published in the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[179] arXiv:2405.06214 [pdf, other]: Title: Aerial-NeRF: Adaptive Spatial Partitioning and Sampling for Large-Scale Aerial Rendering

Authors: Xiaohan Zhang, Yukui Qiu, Zhenyu Sun, Qi Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[180] arXiv:2405.06201 [pdf, other]: Title: PhysMLE: Generalizable and Priors-Inclusive Multi-task Remote Physiological Measurement

Authors: Jiyao Wang, Hao Lu, Ange Wang, Xiao Yang, Yingcong Chen, Dengbo He, Kaishun Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181] arXiv:2405.06198 [pdf, ps, other]: Title: MAPL: Memory Augmentation and Pseudo-Labeling for Semi-Supervised Anomaly Detection

Authors: Junzhuo Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[182] arXiv:2405.06196 [pdf, other]: Title: VLSM-Adapter: Finetuning Vision-Language Segmentation Efficiently with Lightweight Blocks

Authors: Manish Dhakal, Rabin Adhikari, Safal Thapaliya, Bishesh Khanal

Comments: 12 pages, 5 figures, 2 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[183] arXiv:2405.06191 [pdf, ps, other]: Title: ODC-SA Net: Orthogonal Direction Enhancement and Scale Aware Network for Polyp Segmentation

Authors: Chenhao Xu, Yudian Zhang, Kaiye Xu, Haijiang Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184] arXiv:2405.06185 [pdf, other]: Title: Zero-shot Degree of Ill-posedness Estimation for Active Small Object Change Detection

Authors: Koji Takeda, Kanji Tanaka, Yoshimasa Nakamura, Asako Kanezaki

Comments: 7 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185] arXiv:2405.06181 [pdf, other]: Title: Residual-NeRF: Learning Residual NeRFs for Transparent Object Manipulation

Authors: Bardienus P. Duisterhof, Yuemin Mao, Si Heng Teng, Jeffrey Ichnowski

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[186] arXiv:2405.06143 [pdf, other]: Title: Perceptual Crack Detection for Rendered 3D Textured Meshes

Authors: Armin Shafiee Sarvestani, Wei Zhou, Zhou Wang

Comments: Accepted by IEEE QoMEX 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Geometry (cs.CG); Multimedia (cs.MM)
[187] arXiv:2405.06128 [pdf, other]: Title: Enhanced Multimodal Content Moderation of Children's Videos using Audiovisual Fusion

Authors: Syed Hammad Ahmed, Muhammad Junaid Khan, Gita Sukthankar

Comments: 8 pages, 3 figures, Accepted at The 37th International FLAIRS Conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[188] arXiv:2405.06116 [pdf, other]: Title: Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMamba

Authors: Hongwei Ren, Yue Zhou, Jiadong Zhu, Haotian Fu, Yulong Huang, Xiaopeng Lin, Yuetong Fang, Fei Ma, Hao Yu, Bojun Cheng

Comments: Extension Journal of TTPOINT and PEPNet

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[189] arXiv:2405.06088 [pdf, other]: Title: A Mixture of Experts Approach to 3D Human Motion Prediction

Authors: Edmund Shieh, Joshua Lee Franco, Kang Min Bae, Tej Lalvani

Comments: 16 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[190] arXiv:2405.06057 [pdf, other]: Title: UnSegGNet: Unsupervised Image Segmentation using Graph Neural Networks

Authors: Kovvuri Sai Gopal Reddy, Bodduluri Saran, A. Mudit Adityaja, Saurabh J. Shigwan, Nitin Kumar

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[191] arXiv:2405.06049 [pdf, other]: Title: BB-Patch: BlackBox Adversarial Patch-Attack using Zeroth-Order Optimization

Authors: Satyadwyoom Kumar, Saurabh Gupta, Arun Balaji Buduru

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[192] arXiv:2405.05983 [pdf, ps, other]: Title: Real-Time Pill Identification for the Visually Impaired Using Deep Learning

Authors: Bo Dang, Wenchao Zhao, Yufeng Li, Danqing Ma, Qixuan Yu, Elly Yijun Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[193] arXiv:2405.06473 (cross-list from cs.RO) [pdf, other]: Title: Autonomous Driving with a Deep Dual-Model Solution for Steering and Braking Control

Authors: Ana Petra Jukić, Ana Šelek, Marija Seder, Ivana Podnar Žarko

Comments: 6 pages, 2 figures, accepted for publication in Proceedings of International Conference on Smart and Sustainable Technologies (SpliTech 2024)

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[194] arXiv:2405.06463 (cross-list from eess.IV) [pdf, other]: Title: MRSegmentator: Robust Multi-Modality Segmentation of 40 Classes in MRI and CT Sequences

Authors: Hartmut Häntze, Lina Xu, Felix J. Dorfner, Leonhard Donle, Daniel Truhn, Hugo Aerts, Mathias Prokop, Bram van Ginneken, Alessa Hering, Lisa C. Adams, Keno K. Bressem

Comments: 13 pages, 6 figures; corrected co-author info

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[195] arXiv:2405.06301 (cross-list from cs.LG) [pdf, ps, other]: Title: Learning from String Sequences

Authors: David Lindsay, Sian Lindsay

Comments: 10 pages, 1 figure, 4 tables, Technical Report

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[196] arXiv:2405.06286 (cross-list from cs.RO) [pdf, ps, other]: Title: A Joint Approach Towards Data-Driven Virtual Testing for Automated Driving: The AVEAS Project

Authors: Leon Eisemann, Mirjam Fehling-Kaschek, Silke Forkert, Andreas Forster, Henrik Gommel, Susanne Guenther, Stephan Hammer, David Hermann, Marvin Klemp, Benjamin Lickert, Florian Luettner, Robin Moss, Nicole Neis, Maria Pohle, Dominik Schreiber, Cathrina Sowa, Daniel Stadler, Janina Stompe, Michael Strobelt, David Unger, Jens Ziehn

Comments: 6 pages, 5 figures, 2 tables

Journal-ref: Proceedings of the 7th International Symposium on Future Active Safety Technology toward zero traffic accidents (JSAE FAST-zero '23), 2023

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Machine Learning (cs.LG); Systems and Control (eess.SY)
[197] arXiv:2405.06284 (cross-list from eess.IV) [pdf, other]: Title: Modality-agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale Attention

Authors: Ju-Hyeon Nam, Nur Suriza Syazwany, Su Jung Kim, Sang-Chul Lee

Comments: Accepted in Computer Vision and Pattern Recognition (CVPR) 2024

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[198] arXiv:2405.06265 (cross-list from cs.RO) [pdf, other]: Title: Uncertainty-aware Semantic Mapping in Off-road Environments with Dempster-Shafer Theory of Evidence

Authors: Junyoung Kim, Junwon Seo

Comments: Our project website can be found at this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[199] arXiv:2405.06234 (cross-list from cs.LG) [pdf, other]: Title: TS3IM: Unveiling Structural Similarity in Time Series through Image Similarity Assessment Insights

Authors: Yuhan Liu, Ke Tu

Comments: 6 pages, 6 figures

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[200] arXiv:2405.06175 (cross-list from eess.IV) [pdf, other]: Title: Prior-guided Diffusion Model for Cell Segmentation in Quantitative Phase Imaging

Authors: Zhuchen Shao, Mark A. Anastasio, Hua Li

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[201] arXiv:2405.06166 (cross-list from eess.IV) [pdf, other]: Title: MDNet: Multi-Decoder Network for Abdominal CT Organs Segmentation

Authors: Debesh Jha, Nikhil Kumar Tomar, Koushik Biswas, Gorkem Durak, Matthew Antalek, Zheyuan Zhang, Bin Wang, Md Mostafijur Rahman, Hongyi Pan, Alpay Medetalibeyoglu, Yury Velichko, Daniela Ladner, Amir Borhani, Ulas Bagci

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[202] arXiv:2405.06149 (cross-list from cs.AI) [pdf, other]: Title: DisBeaNet: A Deep Neural Network to augment Unmanned Surface Vessels for maritime situational awareness

Authors: Srikanth Vemula, Eulises Franco, Michael Frye

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

Fri, 10 May 2024

[203] arXiv:2405.05967 [pdf, other]: Title: Distilling Diffusion Models into Conditional GANs

Authors: Minguk Kang, Richard Zhang, Connelly Barnes, Sylvain Paris, Suha Kwak, Jaesik Park, Eli Shechtman, Jun-Yan Zhu, Taesung Park

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[204] arXiv:2405.05953 [pdf, other]: Title: Frame Interpolation with Consecutive Brownian Bridge Diffusion

Authors: Zonglin Lyu, Ming Li, Jianbo Jiao, Chen Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[205] arXiv:2405.05949 [pdf, other]: Title: CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts

Authors: Jiachen Li, Xinyao Wang, Sijie Zhu, Chia-Wen Kuo, Lu Xu, Fan Chen, Jitesh Jain, Humphrey Shi, Longyin Wen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[206] arXiv:2405.05945 [pdf, other]: Title: Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion Transformers

Authors: Peng Gao, Le Zhuo, Ziyi Lin, Chris Liu, Junsong Chen, Ruoyi Du, Enze Xie, Xu Luo, Longtian Qiu, Yuhang Zhang, Chen Lin, Rongjie Huang, Shijie Geng, Renrui Zhang, Junlin Xi, Wenqi Shao, Zhengkai Jiang, Tianshuo Yang, Weicai Ye, He Tong, Jingwen He, Yu Qiao, Hongsheng Li

Comments: Technical Report; Code at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[207] arXiv:2405.05900 [pdf, other]: Title: A Comprehensive Survey of Masked Faces: Recognition, Detection, and Unmasking

Authors: Mohamed Mahmoud, Mahmoud SalahEldin Kasem, Hyun-Soo Kang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[208] arXiv:2405.05858 [pdf, other]: Title: Free-Moving Object Reconstruction and Pose Estimation with Virtual Camera

Authors: Haixin Shi, Yinlin Hu, Daniel Koguciuk, Juan-Ting Lin, Mathieu Salzmann, David Ferstl

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Robotics (cs.RO)
[209] arXiv:2405.05853 [pdf, other]: Title: Robust and Explainable Fine-Grained Visual Classification with Transfer Learning: A Dual-Carriageway Framework

Authors: Zheming Zuo, Joseph Smith, Jonathan Stonehouse, Boguslaw Obara

Comments: Accepted in the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024 workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[210] arXiv:2405.05852 [pdf, other]: Title: Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for Control

Authors: Gunshi Gupta, Karmesh Yadav, Yarin Gal, Dhruv Batra, Zsolt Kira, Cong Lu, Tim G. J. Rudner

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Robotics (cs.RO); Machine Learning (stat.ML)
[211] arXiv:2405.05841 [pdf, other]: Title: Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition

Authors: Zuan Gao, Yuxin Wang, Yadong Qu, Boqiang Zhang, Zixiao Wang, Jianjun Xu, Hongtao Xie

Comments: Accepted to IJCAI2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[212] arXiv:2405.05830 [pdf, ps, other]: Title: Mask-TS Net: Mask Temperature Scaling Uncertainty Calibration for Polyp Segmentation

Authors: Yudian Zhang, Chenhao Xu, Kaiye Xu, Haijiang Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[213] arXiv:2405.05811 [pdf, other]: Title: Parallel Cross Strip Attention Network for Single Image Dehazing

Authors: Lihan Tong, Yun Liu, Tian Ye, Weijia Li, Liyuan Chen, Erkang Chen

Comments: 10 pages, 4 figures, CTISC'24

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[214] arXiv:2405.05808 [pdf, other]: Title: Fast and Controllable Post-training Sparsity: Learning Optimal Sparsity Allocation with Global Constraint in Minutes

Authors: Ruihao Gong, Yang Yong, Zining Wang, Jinyang Guo, Xiuying Wei, Yuqing Ma, Xianglong Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[215] arXiv:2405.05806 [pdf, other]: Title: MasterWeaver: Taming Editability and Identity for Personalized Text-to-Image Generation

Authors: Yuxiang Wei, Zhilong Ji, Jinfeng Bai, Hongzhi Zhang, Lei Zhang, Wangmeng Zuo

Comments: 34 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[216] arXiv:2405.05803 [pdf, other]: Title: Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid Inference

Authors: Zhihang Lin, Mingbao Lin, Luxi Lin, Rongrong Ji

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[217] arXiv:2405.05791 [pdf, other]: Title: Sequential Amodal Segmentation via Cumulative Occlusion Learning

Authors: Jiayang Ao, Qiuhong Ke, Krista A. Ehinger

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[218] arXiv:2405.05769 [pdf, other]: Title: Exploring Text-Guided Single Image Editing for Remote Sensing Images

Authors: Fangzhou Han, Lingyu Si, Hongwei Dong, Lamei Zhang, Hao Chen, Bo Du

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[219] arXiv:2405.05768 [pdf, other]: Title: FastScene: Text-Driven Fast 3D Indoor Scene Generation via Panoramic Gaussian Splatting

Authors: Yikun Ma, Dandan Zhan, Zhi Jin

Comments: Accepted by IJCAI-2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220] arXiv:2405.05766 [pdf, other]: Title: To Trust or Not to Trust: Towards a novel approach to measure trust for XAI systems

Authors: Miquel Miró-Nicolau, Gabriel Moyà-Alcover, Antoni Jaume-i-Capó, Manuel González-Hidalgo, Maria Gemma Sempere Campello, Juan Antonio Palmer Sancho

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[221] arXiv:2405.05763 [pdf, ps, other]: Title: DP-MDM: Detail-Preserving MR Reconstruction via Multiple Diffusion Models

Authors: Mengxiao Geng, Jiahao Zhu, Xiaolin Zhu, Qiqing Liu, Dong Liang, Qiegen Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[222] arXiv:2405.05760 [pdf, other]: Title: Similarity Guided Multimodal Fusion Transformer for Semantic Location Prediction in Social Media

Authors: Zhizhen Zhang, Ning Wang, Haojie Li, Zhihui Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[223] arXiv:2405.05755 [pdf, other]: Title: CSA-Net: Channel-wise Spatially Autocorrelated Attention Networks

Authors: Nick Nikzad, Yongsheng Gao, Jun Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[224] arXiv:2405.05749 [pdf, other]: Title: NeRFFaceSpeech: One-shot Audio-driven 3D Talking Head Synthesis via Generative Prior

Authors: Gihoon Kim, Kwanggyoon Seo, Sihun Cha, Junyong Noh

Comments: 11 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[225] arXiv:2405.05745 [pdf, other]: Title: Efficient Pretraining Model based on Multi-Scale Local Visual Field Feature Reconstruction for PCB CT Image Element Segmentation

Authors: Chen Chen, Kai Qiao, Jie Yang, Jian Chen, Bin Yan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[226] arXiv:2405.05742 [pdf, other]: Title: How Quality Affects Deep Neural Networks in Fine-Grained Image Classification

Authors: Joseph Smith, Zheming Zuo, Jonathan Stonehouse, Boguslaw Obara

Comments: VISAPP 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[227] arXiv:2405.05714 [pdf, other]: Title: Estimating Noisy Class Posterior with Part-level Labels for Noisy Label Learning

Authors: Rui Zhao, Bin Shi, Jianfei Ruan, Tianze Pan, Bo Dong

Comments: CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[228] arXiv:2405.05707 [pdf, other]: Title: LatentColorization: Latent Diffusion-Based Speaker Video Colorization

Authors: Rory Ward, Dan Bigioi, Shubhajit Basak, John G. Breslin, Peter Corcoran

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[229] arXiv:2405.05691 [pdf, other]: Title: StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation Framework

Authors: Yiheng Huang, Hui Yang, Chuanchen Luo, Yuxi Wang, Shibiao Xu, Zhaoxiang Zhang, Man Zhang, Junran Peng

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[230] arXiv:2405.05674 [pdf, ps, other]: Title: TransAnaNet: Transformer-based Anatomy Change Prediction Network for Head and Neck Cancer Patient Radiotherapy

Authors: Meixu Chen, Kai Wang, Michael Dohopolski, Howard Morgan, Jing Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[231] arXiv:2405.05672 [pdf, other]: Title: Multi-Stream Keypoint Attention Network for Sign Language Recognition and Translation

Authors: Mo Guan, Yan Wang, Guangkun Ma, Jiarui Liu, Mingzu Sun

Comments: 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[232] arXiv:2405.05663 [pdf, other]: Title: RPBG: Towards Robust Neural Point-based Graphics in the Wild

Authors: Qingtian Zhu, Zizhuang Wei, Zhongtian Zheng, Yifan Zhan, Zhuyu Yao, Jiawang Zhang, Kejian Wu, Yinqiang Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[233] arXiv:2405.05647 [pdf, ps, other]: Title: Letter to the Editor: What are the legal and ethical considerations of submitting radiology reports to ChatGPT?

Authors: Siddharth Agarwal, David Wood, Robin Carpenter, Yiran Wei, Marc Modat, Thomas C Booth

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[234] arXiv:2405.05636 [pdf, other]: Title: SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent Space

Authors: Zeren Zhang, Haibo Qin, Jiayu Huang, Yixin Li, Hui Lin, Yitao Duan, Jinwen Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[235] arXiv:2405.05615 [pdf, other]: Title: Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning

Authors: Shibo Jie, Yehui Tang, Ning Ding, Zhi-Hong Deng, Kai Han, Yunhe Wang

Comments: Accepted to ICML2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[236] arXiv:2405.05614 [pdf, other]: Title: Depth Awakens: A Depth-perceptual Attention Fusion Network for RGB-D Camouflaged Object Detection

Authors: Xinran Liu, Lin Qi, Yuxuan Song, Qi Wen

Journal-ref: Image and Vision Computing, 143:104924, 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Networking and Internet Architecture (cs.NI)
[237] arXiv:2405.05613 [pdf, other]: Title: Robust Pseudo-label Learning with Neighbor Relation for Unsupervised Visible-Infrared Person Re-Identification

Authors: Xiangbo Yin, Jiangming Shi, Yachao Zhang, Yang Lu, Zhizhong Zhang, Yuan Xie, Yanyun Qu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[238] arXiv:2405.05605 [pdf, other]: Title: Minimal Perspective Autocalibration

Authors: Andrea Porfiri Dal Cin, Timothy Duff, Luca Magri, Tomas Pajdla

Comments: 8 pages main paper + 2 pages references + 8 pages supplementary; to be presented at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[239] arXiv:2405.05587 [pdf, other]: Title: Navigate Beyond Shortcuts: Debiased Learning through the Lens of Neural Collapse

Authors: Yining Wang, Junjie Sun, Chenyue Wang, Mi Zhang, Min Yang

Comments: CVPR 2024 Highlight

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[240] arXiv:2405.05584 [pdf, other]: Title: A Survey on Backbones for Deep Video Action Recognition

Authors: Zixuan Tang, Youjun Zhao, Yuhang Wen, Mengyuan Liu

Comments: This paper has been accepted by ICME workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[241] arXiv:2405.05574 [pdf, other]: Title: Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft

Authors: Debabrata Pal, Anvita Singh, Saumya Saumya, Shouvik Das

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[242] arXiv:2405.05573 [pdf, other]: Title: Poisoning-based Backdoor Attacks for Arbitrary Target Label with Positive Triggers

Authors: Binxiao Huang, Jason Chun Lok, Chang Liu, Ngai Wong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[243] arXiv:2405.05553 [pdf, other]: Title: Towards Robust Physical-world Backdoor Attacks on Lane Detection

Authors: Xinwei Zhang, Aishan Liu, Tianyuan Zhang, Siyuan Liang, Xianglong Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[244] arXiv:2405.05552 [pdf, other]: Title: Bidirectional Progressive Transformer for Interaction Intention Anticipation

Authors: Zichen Zhang, Hongchen Luo, Wei Zhai, Yang Cao, Yu Kang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[245] arXiv:2405.05551 [pdf, ps, other]: Title: The object detection model uses combined extraction with KNN and RF classification

Authors: Florentina Tatrin Kurniati, Daniel HF Manongga, Irwan Sembiring, Sutarto Wijono, Roy Rudolf Huizen

Journal-ref: IJEECS, pp 436-445, Vol 35, No 1 July 2024; https://ijeecs.iaescore.com/index.php/IJEECS/article/view/35888

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[246] arXiv:2405.05538 [pdf, other]: Title: A Survey on Personalized Content Synthesis with Diffusion Models

Authors: Xulu Zhang, Xiao-Yong Wei, Wengyu Zhang, Jinlin Wu, Zhaoxiang Zhang, Zhen Lei, Qing Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[247] arXiv:2405.05530 [pdf, other]: Title: NurtureNet: A Multi-task Video-based Approach for Newborn Anthropometry

Authors: Yash Khandelwal, Mayur Arvind, Sriram Kumar, Ashish Gupta, Sachin Kumar Danisetty, Piyush Bagad, Anish Madan, Mayank Lunayach, Aditya Annavajjala, Abhishek Maiti, Sansiddh Jain, Aman Dalmia, Namrata Deka, Jerome White, Jigar Doshi, Angjoo Kanazawa, Rahul Panicker, Alpan Raval, Srinivas Rana, Makarand Tapaswi

Comments: Accepted at CVPM Workshop at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[248] arXiv:2405.05524 [pdf, other]: Title: Universal Adversarial Perturbations for Vision-Language Pre-trained Models

Authors: Peng-Fei Zhang, Zi Huang, Guangdong Bai

Comments: 9 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[249] arXiv:2405.05523 [pdf, other]: Title: Prompt When the Animal is: Temporal Animal Behavior Grounding with Positional Recovery Training

Authors: Sheng Yan, Xin Du, Zongying Li, Yi Wang, Hongcang Jin, Mengyuan Liu

Comments: Accepted by ICMEW 2024. arXiv admin note: text overlap with arXiv:2404.13657

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[250] arXiv:2405.05518 [pdf, other]: Title: DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction

Authors: Siyu Li, Jiacheng Lin, Hao Shi, Jiaming Zhang, Song Wang, You Yao, Zhiyong Li, Kailun Yang

Comments: The source code will be made publicly available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO); Image and Video Processing (eess.IV)
[251] arXiv:2405.05502 [pdf, other]: Title: Towards Accurate and Robust Architectures via Neural Architecture Search

Authors: Yuwei Ou, Yuqi Feng, Yanan Sun

Comments: Accepted by CVPR2024. arXiv admin note: substantial text overlap with arXiv:2212.14049

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[252] arXiv:2405.05497 [pdf, other]: Title: Multi-Level Feature Fusion Network for Lightweight Stereo Image Super-Resolution

Authors: Yunxiang Li, Wenbin Zou, Qiaomu Wei, Feng Huang, Jing Wu

Comments: 10 pages, 7 figures, CVPRWorkshop NTIRE2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253] arXiv:2405.05488 [pdf, ps, other]: Title: Advancing Head and Neck Cancer Survival Prediction via Multi-Label Learning and Deep Model Interpretation

Authors: Meixu Chen, Kai Wang, Jing Wang

Comments: 10 pages, 4 figures, 2 tables, 2 pages of supplementary material

Subjects: Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[254] arXiv:2405.05477 [pdf, other]: Title: DynaSeg: A Deep Dynamic Fusion Method for Unsupervised Image Segmentation Incorporating Feature Similarity and Spatial Continuity

Authors: Naimul Khan, Boujemaa Guermazi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[255] arXiv:2405.05446 [pdf, other]: Title: GDGS: Gradient Domain Gaussian Splatting for Sparse Representation of Radiance Fields

Authors: Yuanhao Gong

Comments: arXiv admin note: text overlap with arXiv:2404.09105

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[256] arXiv:2405.05428 [pdf, other]: Title: Adversary-Guided Motion Retargeting for Skeleton Anonymization

Authors: Thomas Carr, Depeng Xu, Aidong Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[257] arXiv:2405.05422 [pdf, other]: Title: EarthMatch: Iterative Coregistration for Fine-grained Localization of Astronaut Photography

Authors: Gabriele Berton, Gabriele Goletto, Gabriele Trivigno, Alex Stoken, Barbara Caputo, Carlo Masone

Comments: CVPR 2024 IMW - webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[258] arXiv:2405.05363 [pdf, other]: Title: LOC-ZSON: Language-driven Object-Centric Zero-Shot Object Retrieval and Navigation

Authors: Tianrui Guan, Yurou Yang, Harry Cheng, Muyuan Lin, Richard Kim, Rajasimman Madhivanan, Arnie Sen, Dinesh Manocha

Comments: Accepted to ICRA 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[259] arXiv:2405.05355 [pdf, other]: Title: Geometry-Informed Distance Candidate Selection for Adaptive Lightweight Omnidirectional Stereo Vision with Fisheye Images

Authors: Conner Pulling, Je Hon Tan, Yaoyu Hu, Sebastian Scherer

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[260] arXiv:2405.05354 [pdf, other]: Title: Transfer-LMR: Heavy-Tail Driving Behavior Recognition in Diverse Traffic Scenarios

Authors: Chirag Parikh, Ravi Shankar Mishra, Rohan Chandra, Ravi Kiran Sarvadevabhatla

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[261] arXiv:2405.05297 [pdf, ps, other]: Title: Deep Learning Method to Predict Wound Healing Progress Based on Collagen Fibers in Wound Tissue

Authors: Juan He, Xiaoyan Wang, Long Chen, Yunpeng Cai, Zhengshan Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[262] arXiv:2405.05295 [pdf, other]: Title: Relevant Irrelevance: Generating Alterfactual Explanations for Image Classifiers

Authors: Silvan Mertes, Tobias Huber, Christina Karle, Katharina Weitz, Ruben Schlagowski, Cristina Conati, Elisabeth André

Comments: Accepted at IJCAI 2024. arXiv admin note: text overlap with arXiv:2207.09374

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[263] arXiv:2405.05261 [pdf, other]: Title: 3D Holistic OR Anonymization

Authors: Tony Danjun Wang

Comments: This bachelor's thesis was the foundation of the paper "DisguisOR: Holistic Face Anonymization for the Operating Room" (see arXiv:2307.14241), published at IPCAI'23

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[264] arXiv:2405.05260 [pdf, other]: Title: Financial Table Extraction in Image Documents

Authors: William Watson, Bo Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[265] arXiv:2405.05956 (cross-list from cs.RO) [pdf, other]: Title: Probing Multimodal LLMs as World Models for Driving

Authors: Shiva Sreeram, Tsun-Hsuan Wang, Alaa Maalouf, Guy Rosman, Sertac Karaman, Daniela Rus

Comments: this https URL this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[266] arXiv:2405.05944 (cross-list from eess.IV) [pdf, other]: Title: MRISegmentator-Abdomen: A Fully Automated Multi-Organ and Structure Segmentation Tool for T1-weighted Abdominal MRI

Authors: Yan Zhuang, Tejas Sudharshan Mathai, Pritam Mukherjee, Brandon Khoury, Boah Kim, Benjamin Hou, Nusrat Rabbee, Ronald M. Summers

Comments: 23 pages, 13 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[267] arXiv:2405.05941 (cross-list from cs.RO) [pdf, other]: Title: Evaluating Real-World Robot Manipulation Policies in Simulation

Authors: Xuanlin Li, Kyle Hsu, Jiayuan Gu, Karl Pertsch, Oier Mees, Homer Rich Walke, Chuyuan Fu, Ishikaa Lunawat, Isabel Sieh, Sean Kirmani, Sergey Levine, Jiajun Wu, Chelsea Finn, Hao Su, Quan Vuong, Ted Xiao

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[268] arXiv:2405.05934 (cross-list from cs.LG) [pdf, other]: Title: Theoretical Guarantees of Data Augmented Last Layer Retraining Methods

Authors: Monica Welfert, Nathan Stromberg, Lalitha Sankar

Comments: Extended version of a paper accepted to ISIT 2024. arXiv admin note: text overlap with arXiv:2402.11039

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Machine Learning (stat.ML)
[269] arXiv:2405.05886 (cross-list from cs.LG) [pdf, other]: Title: Exploiting Autoencoder's Weakness to Generate Pseudo Anomalies

Authors: Marcella Astrid, Muhammad Zaigham Zaheer, Djamila Aouada, Seung-Ik Lee

Comments: SharedIt link: this https URL

Journal-ref: Neural Computing and Applications, pp.1-17 (2024)

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[270] arXiv:2405.05876 (cross-list from cs.RO) [pdf, other]: Title: Composable Part-Based Manipulation

Authors: Weiyu Liu, Jiayuan Mao, Joy Hsu, Tucker Hermans, Animesh Garg, Jiajun Wu

Comments: Presented at CoRL 2023. For videos and additional results, see our website: this https URL

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[271] arXiv:2405.05847 (cross-list from cs.LG) [pdf, other]: Title: Learned feature representations are biased by complexity, learning order, position, and more

Authors: Andrew Kyle Lampinen, Stephanie C. Y. Chan, Katherine Hermann

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[272] arXiv:2405.05846 (cross-list from cs.CR) [pdf, other]: Title: Could It Be Generated? Towards Practical Analysis of Memorization in Text-To-Image Diffusion Models

Authors: Zhe Ma, Xuhong Zhang, Qingming Li, Tianyu Du, Wenzhi Chen, Zonghui Wang, Shouling Ji

Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[273] arXiv:2405.05836 (cross-list from cs.LG) [pdf, other]: Title: Informed Decision-Making through Advancements in Open Set Recognition and Unknown Sample Detection

Authors: Atefeh Mahdavi, Marco Carvalho

Comments: Accepted for proceedings of the 57th Hawaii International Conference on System Sciences: 10 pages, 6 figures, 3-6 January 2024, Honolulu, United States

Journal-ref: Atefeh, M., & Marco, C. (2024). "Informed Decision-Making through Advancements in Open Set Recognition and Unknown Sample Detection." Proceedings of the 57th Hawaii International Conference on System Sciences, 1090-1999

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[274] arXiv:2405.05828 (cross-list from cs.RO) [pdf, other]: Title: MAD-ICP: It Is All About Matching Data -- Robust and Informed LiDAR Odometry

Authors: Simone Ferrari, Luca Di Giammarino, Leonardo Brizi, Giorgio Grisetti

Comments: this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[275] arXiv:2405.05814 (cross-list from eess.IV) [pdf, ps, other]: Title: MSDiff: Multi-Scale Diffusion Model for Ultra-Sparse View CT Reconstruction

Authors: Pinhuang Tan, Mengxiao Geng, Jingya Lu, Liu Shi, Bin Huang, Qiegen Liu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[276] arXiv:2405.05800 (cross-list from cs.GR) [pdf, other]: Title: DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation

Authors: Sitian Shen, Jing Xu, Yuheng Yuan, Xingyi Yang, Qiuhong Shen, Xinchao Wang

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[277] arXiv:2405.05792 (cross-list from cs.RO) [pdf, other]: Title: RoboHop: Segment-based Topological Map Representation for Open-World Visual Navigation

Authors: Sourav Garg, Krishan Rana, Mehdi Hosseinzadeh, Lachlan Mares, Niko Sünderhauf, Feras Dayoub, Ian Reid

Comments: Published at ICRA 2024; 9 pages, 8 figures

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[278] arXiv:2405.05787 (cross-list from cs.RO) [pdf, other]: Title: Autonomous Robotic Ultrasound System for Liver Follow-up Diagnosis: Pilot Phantom Study

Authors: Tianpeng Zhang (1), Sekeun Kim (2), Jerome Charton (2), Haitong Ma (1), Kyungsang Kim (2), Na Li (1), Quanzheng Li (2) ((1) SEAS, Harvard University (2) CAMCA, Massachusetts General Hospital and Harvard Medical School)

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Systems and Control (eess.SY)
[279] arXiv:2405.05695 (cross-list from cs.LG) [pdf, other]: Title: Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost

Authors: Yuan Gao, Weizhong Zhang, Wenhan Luo, Lin Ma, Jin-Gang Yu, Gui-Song Xia, Jiayi Ma

Comments: Accepted to ICLR 2024

Journal-ref: International Conference on Learning Representations (ICLR), 2024

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[280] arXiv:2405.05667 (cross-list from eess.IV) [pdf, other]: Title: VM-DDPM: Vision Mamba Diffusion for Medical Image Synthesis

Authors: Zhihan Ju, Wanting Zhou

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[281] arXiv:2405.05658 (cross-list from eess.IV) [pdf, ps, other]: Title: Artificial intelligence for abnormality detection in high volume neuroimaging: a systematic review and meta-analysis

Authors: Siddharth Agarwal, David A. Wood, Mariusz Grzeda, Chandhini Suresh, Munaib Din, James Cole, Marc Modat, Thomas C Booth

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[282] arXiv:2405.05648 (cross-list from cs.RO) [pdf, other]: Title: ASGrasp: Generalizable Transparent Object Reconstruction and Grasping from RGB-D Active Stereo Camera

Authors: Jun Shi, Yong A, Yixiang Jin, Dingzhe Li, Haoyu Niu, Zhezhu Jin, He Wang

Comments: IEEE International Conference on Robotics and Automation (ICRA), 2024

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[283] arXiv:2405.05619 (cross-list from cs.LG) [pdf, other]: Title: Rectified Gaussian kernel multi-view k-means clustering

Authors: Kristina P. Sinaga

Comments: 13 pages, 1 figure, 7 Tables

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[284] arXiv:2405.05588 (cross-list from cs.LG) [pdf, other]: Title: Model Inversion Robustness: Can Transfer Learning Help?

Authors: Sy-Tuyen Ho, Koh Jun Hao, Keshigeyan Chandrasegaran, Ngoc-Bao Nguyen, Ngai-Man Cheung

Journal-ref: CVPR 2024

Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[285] arXiv:2405.05564 (cross-list from eess.IV) [pdf, other]: Title: Joint Edge Optimization Deep Unfolding Network for Accelerated MRI Reconstruction

Authors: Yue Cai, Yu Luo, Jie Ling, Shun Yao

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[286] arXiv:2405.05520 (cross-list from eess.IV) [pdf, other]: Title: Continuous max-flow augmentation of self-supervised few-shot learning on SPECT left ventricles

Authors: Ádám István Szűcs, Béla Kári, Oszkár Pártos

Comments: ISBI 2024 Accepted paper for presentation

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[287] arXiv:2405.05386 (cross-list from cs.LG) [pdf, other]: Title: Interpretability Needs a New Paradigm

Authors: Andreas Madsen, Himabindu Lakkaraju, Siva Reddy, Sarath Chandar

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[288] arXiv:2405.05336 (cross-list from eess.IV) [pdf, other]: Title: Joint semi-supervised and contrastive learning enables zero-shot domain-adaptation and multi-domain segmentation

Authors: Alvaro Gomariz, Yusuke Kikuchi, Yun Yvonna Li, Thomas Albrecht, Andreas Maunz, Daniela Ferrara, Huanxiang Lu, Orcun Goksel

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

Thu, 9 May 2024

[289] arXiv:2405.05259 [pdf, other]: Title: OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies

Authors: Lingdong Kong, Youquan Liu, Lai Xing Ng, Benoit R. Cottereau, Wei Tsang Ooi

Comments: CVPR 2024 (Highlight); 26 pages, 12 figures, 11 tables; Code at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[290] arXiv:2405.05258 [pdf, other]: Title: Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving

Authors: Lingdong Kong, Xiang Xu, Jiawei Ren, Wenwei Zhang, Liang Pan, Kai Chen, Wei Tsang Ooi, Ziwei Liu

Comments: Preprint; 17 pages, 6 figures, 8 tables; Code at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
[291] arXiv:2405.05256 [pdf, other]: Title: THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models

Authors: Prannay Kaul, Zhizhong Li, Hao Yang, Yonatan Dukler, Ashwin Swaminathan, C. J. Taylor, Stefano Soatto

Comments: In CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[292] arXiv:2405.05252 [pdf, other]: Title: Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models

Authors: Hongjie Wang, Difan Liu, Yan Kang, Yijun Li, Zhe Lin, Niraj K. Jha, Yuchen Liu

Comments: Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Signal Processing (eess.SP)
[293] arXiv:2405.05241 [pdf, other]: Title: BenthicNet: A global compilation of seafloor images for deep learning applications

Authors: Scott C. Lowe, Benjamin Misiuk, Isaac Xu, Shakhboz Abdulazizov, Amit R. Baroi, Alex C. Bastos, Merlin Best, Vicki Ferrini, Ariell Friedman, Deborah Hart, Ove Hoegh-Guldberg, Daniel Ierodiaconou, Julia Mackin-McLaughlin, Kathryn Markey, Pedro S. Menandro, Jacquomo Monk, Shreya Nemani, John O'Brien, Elizabeth Oh, Luba Y. Reshitnyk, Katleen Robert, Chris M. Roelfsema, Jessica A. Sameoto, Alexandre C. G. Schimel, Jordan A. Thomson, Brittany R. Wilson, Melisa C. Wong, Craig J. Brown, Thomas Trappenberg

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[294] arXiv:2405.05237 [pdf, other]: Title: EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning

Authors: Jingfeng Yao, Xinggang Wang, Yuehao Song, Huangxuan Zhao, Jun Ma, Yajie Chen, Wenyu Liu, Bo Wang

Comments: Code available at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[295] arXiv:2405.05224 [pdf, other]: Title: Imagine Flash: Accelerating Emu Diffusion Models with Backward Distillation

Authors: Jonas Kohler, Albert Pumarola, Edgar Schönfeld, Artsiom Sanakoyeu, Roshan Sumbaly, Peter Vajda, Ali Thabet

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[296] arXiv:2405.05216 [pdf, other]: Title: FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models

Authors: Jinglin Xu, Yijie Guo, Yuxin Peng

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[297] arXiv:2405.05173 [pdf, other]: Title: A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective

Authors: Huaiyuan Xu, Junliang Chen, Shiyu Meng, Yi Wang, Lap-Pui Chau

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[298] arXiv:2405.05164 [pdf, other]: Title: ProbRadarM3F: mmWave Radar based Human Skeletal Pose Estimation with Probability Map Guided Multi-Format Feature Fusion

Authors: Bing Zhu, Zixin He, Weiyi Xiong, Guanhua Ding, Jianan Liu, Tao Huang, Wei Chen, Wei Xiang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[299] arXiv:2405.05145 [pdf, other]: Title: Conformal Semantic Image Segmentation: Post-hoc Quantification of Predictive Uncertainty

Authors: Luca Mossina, Joseba Dalmau, Léo Andéol

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[300] arXiv:2405.05143 [pdf, other]: Title: Learning Object Semantic Similarity with Self-Supervision

Authors: Arthur Aubret, Timothy Schaumlöffel, Gemma Roig, Jochen Triesch

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
[301] arXiv:2405.05133 [pdf, other]: Title: Identifying every building's function in large-scale urban areas with multi-modality remote-sensing data

Authors: Zhuohong Li, Wei He, Jiepan Li, Hongyan Zhang

Comments: 5 pages, 7 figures, accepted by IGARSS 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[302] arXiv:2405.05130 [pdf, other]: Title: Multi-scale Bottleneck Transformer for Weakly Supervised Multimodal Violence Detection

Authors: Shengyang Sun, Xiaojin Gong

Comments: Accepted by ICME 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[303] arXiv:2405.05079 [pdf, other]: Title: Power Variable Projection for Initialization-Free Large-Scale Bundle Adjustment

Authors: Simon Weber, Je Hyeong Hong, Daniel Cremers

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[304] arXiv:2405.05057 [pdf, other]: Title: Real-Time Motion Detection Using Dynamic Mode Decomposition

Authors: Marco Mignacca, Simone Brugiapaglia, Jason J. Bramburger

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[305] arXiv:2405.05039 [pdf, other]: Title: Reviewing Intelligent Cinematography: AI research for camera-based video production

Authors: Adrian Azzarelli, Nantheera Anantrasirichai, David R Bull

Comments: For researchers and cinematographers. 43 pages including Table of Contents, List of Figures and Tables. We obtained permission to use Figures 5 and 11. All other Figures have been drawn by us

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[306] arXiv:2405.05031 [pdf, other]: Title: Mitigating Bias Using Model-Agnostic Data Attribution

Authors: Sander De Coninck, Wei-Cheng Wang, Sam Leroux, Pieter Simoens

Comments: Accepted to the 2024 IEEE CVPR Workshop on Fair, Data-efficient, and Trusted Computer Vision

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307] arXiv:2405.05027 [pdf, other]: Title: StyleMamba: State Space Model for Efficient Text-driven Image Style Transfer

Authors: Zijia Wang, Zhi-Song Liu

Comments: Blind submission to ECAI 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[308] arXiv:2405.05016 [pdf, other]: Title: TGTM: TinyML-based Global Tone Mapping for HDR Sensors

Authors: Peter Todorov, Julian Hartig, Jan Meyer-Siemon, Martin Fiedler, Gregor Schewior

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[309] arXiv:2405.05012 [pdf, other]: Title: The Entropy Enigma: Success and Failure of Entropy Minimization

Authors: Ori Press, Ravid Shwartz-Ziv, Yann LeCun, Matthias Bethge

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[310] arXiv:2405.05010 [pdf, other]: Title: ${M^2D}$NeRF: Multi-Modal Decomposition NeRF with 3D Feature Fields

Authors: Ning Wang, Lefei Zhang, Angel X Chang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[311] arXiv:2405.05004 [pdf, other]: Title: TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking

Authors: Pengcheng Shao, Tianyang Xu, Zhangyong Tang, Linze Li, Xiao-Jun Wu, Josef Kittler

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[312] arXiv:2405.05001 [pdf, other]: Title: HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution

Authors: Shu-Chuan Chu, Zhi-Chao Dou, Jeng-Shyang Pan, Shaowei Weng, Junbao Li

Comments: 12 pages, 10 figures, conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[313] arXiv:2405.04997 [pdf, other]: Title: Bridging the Gap Between Saliency Prediction and Image Quality Assessment

Authors: Kirillov Alexey, Andrey Moskalenko, Dmitriy Vatolin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[314] arXiv:2405.04974 [pdf, other]: Title: Discrepancy-based Diffusion Models for Lesion Detection in Brain MRI

Authors: Keqiang Fan, Xiaohao Cai, Mahesan Niranjan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[315] arXiv:2405.04971 [pdf, other]: Title: End-to-End Semi-Supervised approach with Modulated Object Queries for Table Detection in Documents

Authors: Iqraa Ehsan, Tahira Shehzadi, Didier Stricker, Muhammad Zeshan Afzal

Comments: ICDAR-IJDAR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[316] arXiv:2405.04969 [pdf, other]: Title: A review on discriminative self-supervised learning methods

Authors: Nikolaos Giakoumoglou, Tania Stathaki

Comments: 21 pages, 7 figures, 11 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[317] arXiv:2405.04964 [pdf, other]: Title: Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution

Authors: Yi Xiao, Qiangqiang Yuan, Kui Jiang, Yuzeng Chen, Qiang Zhang, Chia-Wen Lin

Comments: Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[318] arXiv:2405.04953 [pdf, other]: Title: Supervised Anomaly Detection for Complex Industrial Images

Authors: Aimira Baitieva, David Hurych, Victor Besnier, Olivier Bernard

Comments: Accepted to CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[319] arXiv:2405.04950 [pdf, other]: Title: VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context

Authors: Yunxin Li, Baotian Hu, Haoyuan Shi, Wei Wang, Longyue Wang, Min Zhang

Comments: 17 pages; Accepted by ICML 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[320] arXiv:2405.04943 [pdf, ps, other]: Title: Unsupervised Skin Feature Tracking with Deep Neural Networks

Authors: Jose Chang, Torbjörn E.M. Nordling

Comments: arXiv admin note: text overlap with arXiv:2112.14159

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[321] arXiv:2405.04940 [src]: Title: Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReID

Authors: Wentao Tan

Comments: arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submission

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322] arXiv:2405.04918 [pdf, other]: Title: Delve into Base-Novel Confusion: Redundancy Exploration for Few-Shot Class-Incremental Learning

Authors: Haichen Zhou, Yixiong Zou, Ruixuan Li, Yuhua Li, Kui Xiao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[323] arXiv:2405.04913 [pdf, other]: Title: Weakly-supervised Semantic Segmentation via Dual-stream Contrastive Learning of Cross-image Contextual Information

Authors: Qi Lai, Chi-Man Vong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[324] arXiv:2405.04909 [pdf, other]: Title: Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models

Authors: Zhengxing Lan, Hongbo Li, Lingshan Liu, Bo Fan, Yisheng Lv, Yilong Ren, Zhiyong Cui

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[325] arXiv:2405.04900 [pdf, other]: Title: Self-supervised Gait-based Emotion Representation Learning from Selective Strongly Augmented Skeleton Sequences

Authors: Cheng Song, Lu Lu, Zhen Ke, Long Gao, Shuai Ding

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[326] arXiv:2405.04889 [pdf, other]: Title: Fast LiDAR Upsampling using Conditional Diffusion Models

Authors: Sander Elias Magnussen Helgesen, Kazuto Nakashima, Jim Tørresen, Ryo Kurazume

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[327] arXiv:2405.04883 [pdf, other]: Title: FreeBind: Free Lunch in Unified Multimodal Space via Knowledge Fusion

Authors: Zehan Wang, Ziang Zhang, Xize Cheng, Rongjie Huang, Luping Liu, Zhenhui Ye, Haifeng Huang, Yang Zhao, Tao Jin, Peng Gao, Zhou Zhao

Comments: Accepted by ICML 2024. The code and checkpoints will be released at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[328] arXiv:2405.04858 [pdf, other]: Title: Pedestrian Attribute Recognition as Label-balanced Multi-label Learning

Authors: Yibo Zhou, Hai-Miao Hu, Yirong Xiang, Xiaokang Zhang, Haotian Wu

Comments: Accepted as ICML2024 main conference paper

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[329] arXiv:2405.04834 [pdf, other]: Title: FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation

Authors: Xuehai He, Jian Zheng, Jacob Zhiyuan Fang, Robinson Piramuthu, Mohit Bansal, Vicente Ordonez, Gunnar A Sigurdsson, Nanyun Peng, Xin Eric Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[330] arXiv:2405.04815 [pdf, other]: Title: Proportion Estimation by Masked Learning from Label Proportion

Authors: Takumi Okuo, Kazuya Nishimura, Hiroaki Ito, Kazuhiro Terada, Akihiko Yoshizawa, Ryoma Bise

Comments: Accepted at The 3rd MICCAI workshop on Data Augmentation, Labeling, and Imperfections

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[331] arXiv:2405.04807 [pdf, other]: Title: Transformer Architecture for NetsDB

Authors: Subodh Kamble, Kunal Sunil Kasodekar

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[332] arXiv:2405.04800 [pdf, other]: Title: DeepDamageNet: A two-step deep-learning model for multi-disaster building damage segmentation and classification using satellite imagery

Authors: Irene Alisjahbana, Jiawei Li, Ben (Mullet) Strong, Yue Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[333] arXiv:2405.04788 [pdf, other]: Title: DiffMatch: Visual-Language Guidance Makes Better Semi-supervised Change Detector

Authors: Kaiyu Li, Xiangyong Cao, Yupeng Deng, Deyu Meng

Comments: 13 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[334] arXiv:2405.04782 [pdf, other]: Title: Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection

Authors: Zhaoxiang Zhang, Hanqiu Deng, Jinan Bao, Xingyu Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[335] arXiv:2405.04771 [pdf, other]: Title: Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches

Authors: Qing Yu, Mikihiro Tanaka, Kent Fujiwara

Comments: Accepted to CVPR 2024, Project website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[336] arXiv:2405.04759 [pdf, ps, other]: Title: Multi-Label Out-of-Distribution Detection with Spectral Normalized Joint Energy

Authors: Yihan Mei, Xinyu Wang, Dell Zhang, Xiaoling Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[337] arXiv:2405.04741 [pdf, other]: Title: All in One Framework for Multimodal Re-identification in the Wild

Authors: He Li, Mang Ye, Ming Zhang, Bo Du

Comments: 12 pages, 3 figures, CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[338] arXiv:2405.04722 [pdf, other]: Title: Detecting and Refining HiRISE Image Patches Obscured by Atmospheric Dust

Authors: Kunal Sunil Kasodekar

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[339] arXiv:2405.04717 [pdf, other]: Title: Remote Diffusion

Authors: Kunal Sunil Kasodekar

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[340] arXiv:2405.04682 [pdf, other]: Title: TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation

Authors: Hritik Bansal, Yonatan Bitton, Michal Yarom, Idan Szpektor, Aditya Grover, Kai-Wei Chang

Comments: 23 pages, 12 figures, 8 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[341] arXiv:2405.04675 [pdf, other]: Title: TexControl: Sketch-Based Two-Stage Fashion Image Generation Using Diffusion Model

Authors: Yongming Zhang, Tianyu Zhang, Haoran Xie

Comments: 5 pages, 8 figures, accepted in NICOGRAPH International 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[342] arXiv:2405.04662 [pdf, other]: Title: Radar Fields: Frequency-Space Neural Scene Representations for FMCW Radar

Authors: David Borts, Erich Liang, Tim Brödermann, Andrea Ramazzina, Stefanie Walz, Edoardo Palladin, Jipeng Sun, David Bruggemann, Christos Sakaridis, Luc Van Gool, Mario Bijelic, Felix Heide

Comments: 8 pages, 6 figures, to be published in SIGGRAPH 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[343] arXiv:2405.04650 [pdf, other]: Title: A Self-Supervised Method for Body Part Segmentation and Keypoint Detection of Rat Images

Authors: László Kopácsi, Áron Fóthi, András Lőrincz

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[344] arXiv:2405.04634 [pdf, other]: Title: FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes

Authors: Charles Gaydon, Michel Daab, Floryne Roche

Comments: 15 pages | 9 figures | 8 tables | Dataset is available at this https URL | Trained model is available at this https URL | Deep learning code repository is on GitHub at this https URL | Data engineering code repository is on GitHub at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[345] arXiv:2405.04605 [pdf, ps, other]: Title: AI in Lung Health: Benchmarking Detection and Diagnostic Models Across Multiple CT Scan Datasets

Authors: Fakrul Islam Tushar, Avivah Wang, Lavsen Dahal, Michael R. Harowicz, Kyle J. Lafata, Tina D. Tailor, Joseph Y. Lo

Comments: 16 pages, 2 tables, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[346] arXiv:2405.04589 [pdf, other]: Title: A Novel Wide-Area Multiobject Detection System with High-Probability Region Searching

Authors: Xianlei Long, Hui Zhao, Chao Chen, Fuqiang Gu, Qingyi Gu

Comments: Accepted by ICRA 2024

Journal-ref: 2024 IEEE International Conference on Robotics and Automation (ICRA)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[347] arXiv:2405.04549 [pdf, other]: Title: ClothPPO: A Proximal Policy Optimization Enhancing Framework for Robotic Cloth Manipulation with Observation-Aligned Action Spaces

Authors: Libing Yang, Yang Li, Long Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[348] arXiv:2405.04538 [pdf, other]: Title: DiffFinger: Advancing Synthetic Fingerprint Generation through Denoising Diffusion Probabilistic Models

Authors: Freddie Grabovski, Lior Yasur, Yaniv Hacmon, Lior Nisimov, Stav Nimrod

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[349] arXiv:2405.04537 [pdf, other]: Title: An intuitive multi-frequency feature representation for SO(3)-equivariant networks

Authors: Dongwon Son, Jaehyung Kim, Sanghyeon Son, Beomjoon Kim

Comments: ICLR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[350] arXiv:2405.04536 [pdf, other]: Title: When Training-Free NAS Meets Vision Transformer: A Neural Tangent Kernel Perspective

Authors: Qiqi Zhou, Yichen Zhu

Comments: ICASSP2024 oral

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[351] arXiv:2405.04535 [pdf, other]: Title: Image Classification for CSSVD Detection in Cacao Plants

Authors: Atuhurra Jesse, N'guessan Yves-Roland Douha, Pabitra Lenka

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[352] arXiv:2405.05170 (cross-list from cs.MM) [pdf, other]: Title: Picking watermarks from noise (PWFN): an improved robust watermarking model against intensive distortions

Authors: Sijing Xie, Chengxin Zhao, Nan Sun, Wei Li, Hefei Ling

Comments: Accepted by ICME2024

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[353] arXiv:2405.05160 (cross-list from cs.LG) [pdf, other]: Title: Selective Classification Under Distribution Shifts

Authors: Hengyue Liang, Le Peng, Ju Sun

Comments: Total 25 pages (14 pages for main body); preprint for journal submission

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[354] arXiv:2405.05095 (cross-list from math.NA) [pdf, other]: Title: Approximation properties relative to continuous scale space for hybrid discretizations of Gaussian derivative operators

Authors: Tony Lindeberg

Comments: 13 pages, 11 figures. arXiv admin note: text overlap with arXiv:2311.11317

Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV)
[355] arXiv:2405.05007 (cross-list from eess.IV) [pdf, other]: Title: HC-Mamba: Vision MAMBA with Hybrid Convolutional Techniques for Medical Image Segmentation

Authors: Jiashu Xu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[356] arXiv:2405.04966 (cross-list from cs.IT) [pdf, other]: Title: Communication-Efficient Collaborative Perception via Information Filling with Codebook

Authors: Yue Hu, Juntong Peng, Sifei Liu, Junhao Ge, Si Liu, Siheng Chen

Comments: 10 pages, Accepted by CVPR 2024

Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[357] arXiv:2405.04902 (cross-list from eess.IV) [pdf, other]: Title: HAGAN: Hybrid Augmented Generative Adversarial Network for Medical Image Synthesis

Authors: Zhihan Ju, Wanting Zhou, Longteng Kong, Yu Chen, Yi Li, Zhenan Sun, Caifeng Shan

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[358] arXiv:2405.04890 (cross-list from cs.RO) [pdf, other]: Title: GISR: Geometric Initialization and Silhouette-based Refinement for Single-View Robot Pose and Configuration Estimation

Authors: Ivan Bilić, Filip Marić, Fabio Bonsignorio, Ivan Petrović

Comments: Submitted to IEEE Robotics and Automation Letters (RA-L)

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[359] arXiv:2405.04867 (cross-list from eess.IV) [pdf, other]: Title: MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results

Authors: Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng, Yongyong Chen, Jingyong Su, Xianyu Guan, Hongyuan Yu, Cheng Wan, Jiamin Lin, Binnan Han, Yajun Zou, Zhuoyuan Wu, Yuan Huang, Yongsheng Yu, Daoan Zhang, Jizhe Li, Xuanwu Yin, Kunlong Zuo, Yunfan Lu, Yijie Xu, Wenzong Ma, Weiyu Guo, Hui Xiong, Wei Yu, Bingchun Luo, Sabari Nathan, Priya Kansal

Comments: MIPI@CVPR2024. Website: this https URL

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[360] arXiv:2405.04812 (cross-list from cs.RO) [pdf, other]: Title: General Place Recognition Survey: Towards Real-World Autonomy

Authors: Peng Yin, Jianhao Jiao, Shiqi Zhao, Lingyun Xu, Guoquan Huang, Howie Choset, Sebastian Scherer, Jianda Han

Comments: 20 pages, 12 figures, under review

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[361] arXiv:2405.04778 (cross-list from eess.IV) [pdf, other]: Title: Teacher-Student Network for Real-World Face Super-Resolution with Progressive Embedding of Edge Information

Authors: Zhilei Liu, Chenggong Zhang

Comments: Accepted by ICIP 2023

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[362] arXiv:2405.04610 (cross-list from eess.IV) [pdf, other]: Title: Exploring Explainable AI Techniques for Improved Interpretability in Lung and Colon Cancer Classification

Authors: Mukaffi Bin Moin, Fatema Tuj Johora Faria, Swarnajit Saha, Bushra Kamal Rafa, Mohammad Shafiul Alam

Comments: Accepted in 4th International Conference on Computing and Communication Networks (ICCCNet-2024)

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[363] arXiv:2405.04595 (cross-list from eess.IV) [pdf, ps, other]: Title: An Advanced Features Extraction Module for Remote Sensing Image Super-Resolution

Authors: Naveed Sultan, Amir Hajian, Supavadee Aramvith

Comments: Preprint of paper from The 21st International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology or ECTI-CON 2024, Khon Kaen, Thailand

Journal-ref: ECTI-CON 2024, Khon Kaen Thailand

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[364] arXiv:2405.04507 (cross-list from stat.AP) [pdf, other]: Title: New allometric models for the USA create a step-change in forest carbon estimation, modeling, and mapping

Authors: Lucas K. Johnson (1), Michael J. Mahoney (1), Grant Domke (2), Colin M. Beier (1) ((1) State University of New York College of Environmental Science and Forestry, (2) USDA Forest Service)

Comments: Manuscript: 16 pages, 7 figures; Supplements: 3 pages, 2 figures; Submitted to: Remote Sensing of Environment

Subjects: Applications (stat.AP); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Wed, 8 May 2024

[365] arXiv:2405.04534 [pdf, other]: Title: Tactile-Augmented Radiance Fields

Authors: Yiming Dou, Fengyu Yang, Yi Liu, Antonio Loquercio, Andrew Owens

Comments: CVPR 2024, Project page: this https URL, Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[366] arXiv:2405.04533 [pdf, other]: Title: ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning

Authors: Jing Lin, Yao Feng, Weiyang Liu, Michael J. Black

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[367] arXiv:2405.04496 [pdf, other]: Title: Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing

Authors: Yi Zuo, Lingling Li, Licheng Jiao, Fang Liu, Xu Liu, Wenping Ma, Shuyuan Yang, Yuwei Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[368] arXiv:2405.04489 [pdf, other]: Title: S3Former: Self-supervised High-resolution Transformer for Solar PV Profiling

Authors: Minh Tran, Adrian De Luis, Haitao Liao, Ying Huang, Roy McCann, Alan Mantooth, Jack Cothren, Ngan Le

Comments: Preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[369] arXiv:2405.04457 [pdf, other]: Title: Towards Geographic Inclusion in the Evaluation of Text-to-Image Models

Authors: Melissa Hall, Samuel J. Bell, Candace Ross, Adina Williams, Michal Drozdzal, Adriana Romero Soriano

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)
[370] arXiv:2405.04442 [pdf, other]: Title: AugmenTory: A Fast and Flexible Polygon Augmentation Library

Authors: Tanaz Ghahremani, Mohammad Hoseyni, Mohammad Javad Ahmadi, Pouria Mehrabi, Amirhossein Nikoofard

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[371] arXiv:2405.04416 [pdf, other]: Title: DistGrid: Scalable Scene Reconstruction with Distributed Multi-resolution Hash Grid

Authors: Sidun Liu, Peng Qiao, Zongxin Ye, Wenyu Li, Yong Dou

Comments: Originally submitted to SIGGRAPH Asia 2023

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[372] arXiv:2405.04408 [pdf, other]: Title: DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

Authors: Jiaxin Zhang, Dezhi Peng, Chongyu Liu, Peirong Zhang, Lianwen Jin

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[373] arXiv:2405.04404 [pdf, other]: Title: Vision Mamba: A Comprehensive Survey and Taxonomy

Authors: Xiao Liu, Chenxu Zhang, Lei Zhang

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[374] arXiv:2405.04403 [pdf, other]: Title: Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks

Authors: Georgios Pantazopoulos, Amit Parekh, Malvina Nikandrou, Alessandro Suglia

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[375] arXiv:2405.04390 [pdf, other]: Title: DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving

Authors: Chen Min, Dawei Zhao, Liang Xiao, Jian Zhao, Xinli Xu, Zheng Zhu, Lei Jin, Jianshu Li, Yulan Guo, Junliang Xing, Liping Jing, Yiming Nie, Bin Dai

Comments: Accepted by CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[376] arXiv:2405.04377 [pdf, other]: Title: Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing

Authors: Boqiang Zhang, Hongtao Xie, Zuan Gao, Yuxin Wang

Comments: Accepted to CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[377] arXiv:2405.04370 [pdf, other]: Title: Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos

Authors: Junyi Ma, Jingyi Xu, Xieyuanli Chen, Hesheng Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[378] arXiv:2405.04356 [pdf, other]: Title: Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation

Authors: Jihyun Kim, Changjae Oh, Hoseok Do, Soohyun Kim, Kwanghoon Sohn

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[379] arXiv:2405.04345 [pdf, other]: Title: Novel View Synthesis with Neural Radiance Fields for Industrial Robot Applications

Authors: Markus Hillemann, Robert Langendörfer, Max Heiken, Max Mehltretter, Andreas Schenk, Martin Weinmann, Stefan Hinz, Christian Heipke, Markus Ulrich

Comments: 8 pages, 8 figures, accepted for publication in The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS Archives) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[380] arXiv:2405.04327 [pdf, other]: Title: Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation

Authors: Dogucan Yaman, Fevziye Irem Eyiokur, Leonard Bärmann, Seymanur Aktı, Hazım Kemal Ekenel, Alexander Waibel

Comments: CVPR2024 NTIRE Workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[381] arXiv:2405.04312 [pdf, other]: Title: Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer

Authors: Zhuoyi Yang, Heyang Jiang, Wenyi Hong, Jiayan Teng, Wendi Zheng, Yuxiao Dong, Ming Ding, Jie Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[382] arXiv:2405.04311 [pdf, ps, other]: Title: Cross-IQA: Unsupervised Learning for Image Quality Assessment

Authors: Zhen Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[383] arXiv:2405.04309 [pdf, other]: Title: Non-rigid Structure-from-Motion: Temporally-smooth Procrustean Alignment and Spatially-variant Deformation Modeling

Authors: Jiawei Shi, Hui Deng, Yuchao Dai

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[384] arXiv:2405.04305 [pdf, other]: Title: A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields

Authors: Raiyan Rahman, Christopher Indris, Goetz Bramesfeld, Tianxiao Zhang, Kaidong Li, Xiangyu Chen, Ivan Grijalva, Brian McCornack, Daniel Flippo, Ajay Sharda, Guanghui Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[385] arXiv:2405.04299 [pdf, other]: Title: ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers

Authors: Jinke Li, Xiao He, Chonghua Zhou, Xiaoqiang Cheng, Yang Wen, Dan Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[386] arXiv:2405.04251 [pdf, other]: Title: A General Model for Detecting Learner Engagement: Implementation and Evaluation

Authors: Somayeh Malekshahi, Javad M. Kheyridoost, Omid Fatemi

Comments: 13 pages, 2 Postscript figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[387] arXiv:2405.04233 [pdf, other]: Title: Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models

Authors: Fan Bao, Chendong Xiang, Gang Yue, Guande He, Hongzhou Zhu, Kaiwen Zheng, Min Zhao, Shilong Liu, Yaole Wang, Jun Zhu

Comments: Project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[388] arXiv:2405.04211 [pdf, other]: Title: Breast Histopathology Image Retrieval by Attention-based Adversarially Regularized Variational Graph Autoencoder with Contrastive Learning-Based Feature Extraction

Authors: Nematollah Saeidi, Hossein Karshenas, Bijan Shoushtarian, Sepideh Hatamikia, Ramona Woitek, Amirreza Mahbod

Comments: 31 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[389] arXiv:2405.04189 [pdf, ps, other]: Title: Artificial Intelligence-powered fossil shark tooth identification: Unleashing the potential of Convolutional Neural Networks

Authors: Andrea Barucci, Giulia Ciacci, Pietro Liò, Tiago Azevedo, Andrea Di Cencio, Marco Merella, Giovanni Bianucci, Giulia Bosio, Simone Casati, Alberto Collareta

Comments: 40 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[390] arXiv:2405.04175 [pdf, other]: Title: Topicwise Separable Sentence Retrieval for Medical Report Generation

Authors: Junting Zhao, Yang Zhou, Zhihao Chen, Huazhu Fu, Liang Wan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[391] arXiv:2405.04167 [pdf, other]: Title: Bridging the Synthetic-to-Authentic Gap: Distortion-Guided Unsupervised Domain Adaptation for Blind Image Quality Assessment

Authors: Aobo Li, Jinjian Wu, Yongxu Liu, Leida Li

Comments: Accepted by CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[392] arXiv:2405.04164 [pdf, other]: Title: Sign2GPT: Leveraging Large Language Models for Gloss-Free Sign Language Translation

Authors: Ryan Wong, Necati Cihan Camgoz, Richard Bowden

Comments: Accepted at ICLR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[393] arXiv:2405.04133 [pdf, other]: Title: Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method

Authors: Peisong He, Leyao Zhu, Jiaxing Li, Shiqi Wang, Haoliang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[394] arXiv:2405.04121 [pdf, other]: Title: ELiTe: Efficient Image-to-LiDAR Knowledge Transfer for Semantic Segmentation

Authors: Zhibo Zhang, Ximing Yang, Weizhong Zhang, Cheng Jin

Comments: 9 pages, 6 figures, ICME 2024 oral

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[395] arXiv:2405.04103 [pdf, other]: Title: COM3D: Leveraging Cross-View Correspondence and Cross-Modal Mining for 3D Retrieval

Authors: Hao Wu, Ruochong LI, Hao Wang, Hui Xiong

Comments: Accepted by ICME 2024 oral

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[396] arXiv:2405.04100 [pdf, other]: Title: ESP: Extro-Spective Prediction for Long-term Behavior Reasoning in Emergency Scenarios

Authors: Dingrui Wang, Zheyuan Lai, Yuda Li, Yi Wu, Yuexin Ma, Johannes Betz, Ruigang Yang, Wei Li

Comments: Accepted by ICRA 2024 as Oral Presentation

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[397] arXiv:2405.04097 [pdf, other]: Title: Unmasking Illusions: Understanding Human Perception of Audiovisual Deepfakes

Authors: Ammarah Hashmi, Sahibzada Adil Shahzad, Chia-Wen Lin, Yu Tsao, Hsin-Min Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Machine Learning (cs.LG); Multimedia (cs.MM)
[398] arXiv:2405.04093 [pdf, other]: Title: DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects

Authors: Da Fu, Mingfei Rong, Eun-Hu Kim, Hao Huang, Witold Pedrycz

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[399] arXiv:2405.04044 [pdf, other]: Title: DMOFC: Discrimination Metric-Optimized Feature Compression

Authors: Changsheng Gao, Yiheng Jiang, Li Li, Dong Liu, Feng Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[400] arXiv:2405.04042 [pdf, other]: Title: Space-time Reinforcement Network for Video Object Segmentation

Authors: Yadang Chen, Wentao Zhu, Zhi-Xin Yang, Enhua Wu

Comments: Accepted by ICME 2024. 6 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[401] arXiv:2405.04009 [pdf, other]: Title: Structured Click Control in Transformer-based Interactive Segmentation

Authors: Long Xu, Yongquan Chen, Rui Huang, Feng Wu, Shiwu Lai

Comments: 10 pages, 6 figures, submitted to NeurIPS 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[402] arXiv:2405.04007 [pdf, other]: Title: SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing

Authors: Yuying Ge, Sijie Zhao, Chen Li, Yixiao Ge, Ying Shan

Comments: Technical Report; Dataset released in this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[403] arXiv:2405.03995 [pdf, other]: Title: Deep Event-based Object Detection in Autonomous Driving: A Survey

Authors: Bingquan Zhou, Jie Jiang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[404] arXiv:2405.03981 [pdf, other]: Title: Predicting Lung Disease Severity via Image-Based AQI Analysis using Deep Learning Techniques

Authors: Anvita Mahajan, Sayali Mate, Chinmayee Kulkarni, Suraj Sawant

Comments: 11 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[405] arXiv:2405.03978 [pdf, other]: Title: VMambaCC: A Visual State Space Model for Crowd Counting

Authors: Hao-Yuan Ma, Li Zhang, Shuai Shi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[406] arXiv:2405.03971 [pdf, other]: Title: Unified End-to-End V2X Cooperative Autonomous Driving

Authors: Zhiwei Li, Bozhen Zhang, Lei Yang, Tianyu Shen, Nuo Xu, Ruosen Hao, Weiting Li, Tao Yan, Huaping Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[407] arXiv:2405.03959 [pdf, other]: Title: Joint Estimation of Identity Verification and Relative Pose for Partial Fingerprints

Authors: Xiongjun Guan, Zhiyu Pan, Jianjiang Feng, Jie Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[408] arXiv:2405.03958 [pdf, other]: Title: Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model

Authors: Joo Young Choi, Jaesung R. Park, Inkyu Park, Jaewoong Cho, Albert No, Ernest K. Ryu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[409] arXiv:2405.03955 [pdf, ps, other]: Title: IPFed: Identity protected federated learning for user authentication

Authors: Yosuke Kaga, Yusei Suzuki, Kenta Takahashi

Journal-ref: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[410] arXiv:2405.03945 [pdf, other]: Title: Role of Sensing and Computer Vision in 6G Wireless Communications

Authors: Seungnyun Kim, Jihoon Moon, Jinhong Kim, Yongjun Ahn, Donghoon Kim, Sunwoo Kim, Kyuhong Shim, Byonghyo Shim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Networking and Internet Architecture (cs.NI)
[411] arXiv:2405.03894 [pdf, other]: Title: MVDiff: Scalable and Flexible Multi-View Diffusion for 3D Object Reconstruction from Single-View

Authors: Emmanuelle Bourigault, Pauline Bourigault

Comments: CVPRW: Generative Models for Computer Vision

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[412] arXiv:2405.03884 [pdf, other]: Title: BadFusion: 2D-Oriented Backdoor Attacks against 3D Object Detection

Authors: Saket S. Chaturvedi, Lan Zhang, Wenbin Zhang, Pan He, Xiaoyong Yuan

Comments: Accepted at IJCAI 2024 Conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[413] arXiv:2405.03882 [pdf, other]: Title: Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer

Authors: Huihong Shi, Haikuo Shao, Wendong Mao, Zhongfeng Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[414] arXiv:2405.03852 [pdf, other]: Title: VSA4VQA: Scaling a Vector Symbolic Architecture to Visual Question Answering on Natural Images

Authors: Anna Penzkofer, Lei Shi, Andreas Bulling

Comments: To be published in the Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci'24)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[415] arXiv:2405.03846 [pdf, other]: Title: Enhancing Apparent Personality Trait Analysis with Cross-Modal Embeddings

Authors: Ádám Fodor, Rachid R. Saboundji, András Lőrincz

Comments: 14 pages, 4 figures

Journal-ref: Annales Universitatis Scientiarium Budapestinensis de Rolando Eötvös Nominatae. Sectio Computatorica, MaCS Special Issue, 2021

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[416] arXiv:2405.03803 [pdf, other]: Title: MoDiPO: text-to-motion alignment via AI-feedback-driven Direct Preference Optimization

Authors: Massimiliano Pappa, Luca Collorone, Giovanni Ficarra, Indro Spinelli, Fabio Galasso

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[417] arXiv:2405.03770 [pdf, other]: Title: Foundation Models for Video Understanding: A Survey

Authors: Neelu Madan, Andreas Moegelmose, Rajat Modi, Yogesh S. Rawat, Thomas B. Moeslund

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[418] arXiv:2405.03722 [pdf, other]: Title: Class-relevant Patch Embedding Selection for Few-Shot Image Classification

Authors: Weihao Jiang, Haoyang Cui, Kun He

Comments: arXiv admin note: text overlap with arXiv:2405.03109

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[419] arXiv:2405.03715 [pdf, other]: Title: Iterative Filter Pruning for Concatenation-based CNN Architectures

Authors: Svetlana Pavlitska, Oliver Bagge, Federico Peccia, Toghrul Mammadov, J. Marius Zöllner

Comments: Accepted for publication at IJCNN 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[420] arXiv:2405.03702 [pdf, other]: Title: Leafy Spurge Dataset: Real-world Weed Classification Within Aerial Drone Imagery

Authors: Kyle Doherty, Max Gurinas, Erik Samsoe, Charles Casper, Beau Larkin, Philip Ramsey, Brandon Trabucco, Ruslan Salakhutdinov

Comments: Official Dataset Technical Report. Used in DA-Fusion (arXiv:2302.07944)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[421] arXiv:2405.04459 (cross-list from cs.AI) [pdf, other]: Title: A Significantly Better Class of Activation Functions Than ReLU Like Activation Functions

Authors: Mathew Mithra Noel, Yug Oswal

Comments: 14 pages

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
[422] arXiv:2405.04392 (cross-list from cs.RO) [pdf, other]: Title: BILTS: A novel bi-invariant local trajectory-shape descriptor for rigid-body motion

Authors: Arno Verduyn, Erwin Aertbeliën, Glenn Maes, Joris De Schutter, Maxim Vochten

Comments: This work has been submitted as a regular research paper for consideration in the IEEE Transactions on Robotics. Copyright may be transferred without notice, after which this version may no longer be accessible

Subjects: Robotics (cs.RO); Computational Geometry (cs.CG); Computer Vision and Pattern Recognition (cs.CV)
[423] arXiv:2405.04378 (cross-list from cs.RO) [pdf, other]: Title: Splat-MOVER: Multi-Stage, Open-Vocabulary Robotic Manipulation via Editable Gaussian Splatting

Authors: Ola Shorinwa, Johnathan Tucker, Aliyah Smith, Aiden Swann, Timothy Chen, Roya Firoozi, Monroe Kennedy III, Mac Schwager

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[424] arXiv:2405.04295 (cross-list from eess.IV) [pdf, other]: Title: Semi-Supervised Disease Classification based on Limited Medical Image Data

Authors: Yan Zhang, Chun Li, Zhaoxia Liu, Ming Li

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[425] arXiv:2405.04288 (cross-list from eess.IV) [pdf, other]: Title: BetterNet: An Efficient CNN Architecture with Residual Learning and Attention for Precision Polyp Segmentation

Authors: Owen Singh, Sandeep Singh Sengar

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[426] arXiv:2405.04274 (cross-list from eess.IV) [pdf, other]: Title: Group-aware Parameter-efficient Updating for Content-Adaptive Neural Video Compression

Authors: Zhenghao Chen, Luping Zhou, Zhihao Hu, Dong Xu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[427] arXiv:2405.04191 (cross-list from cs.LG) [pdf, other]: Title: Effective and Robust Adversarial Training against Data and Label Corruptions

Authors: Peng-Fei Zhang, Zi Huang, Xin-Shun Xu, Guangdong Bai

Comments: 12 pages, 8 figures

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[428] arXiv:2405.04169 (cross-list from eess.IV) [pdf, other]: Title: D-TrAttUnet: Toward Hybrid CNN-Transformer Architecture for Generic and Subtle Segmentation in Medical Images

Authors: Fares Bougourzi, Fadi Dornaika, Cosimo Distante, Abdelmalik Taleb-Ahmed

Comments: arXiv admin note: text overlap with arXiv:2303.15576

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[429] arXiv:2405.04071 (cross-list from cs.RO) [pdf, other]: Title: IMU-Aided Event-based Stereo Visual Odometry

Authors: Junkai Niu, Sheng Zhong, Yi Zhou

Comments: 10 pages, 7 figures, ICRA

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[430] arXiv:2405.04041 (cross-list from cs.AI) [pdf, other]: Title: Feature Map Convergence Evaluation for Functional Module

Authors: Ludan Zhang, Chaoyi Chen, Lei He, Keqiang Li

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[431] arXiv:2405.04023 (cross-list from eess.IV) [pdf, other]: Title: Lumbar Spine Tumor Segmentation and Localization in T2 MRI Images Using AI

Authors: Rikathi Pal, Sudeshna Mondal, Aditi Gupta, Priya Saha, Somoballi Ghoshal, Amlan Chakrabarti, Susmita Sur-Kolay

Comments: 9 pages, 12 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[432] arXiv:2405.03905 (cross-list from cs.AR) [pdf, other]: Title: A 65nm 36nJ/Decision Bio-inspired Temporal-Sparsity-Aware Digital Keyword Spotting IC with 0.6V Near-Threshold SRAM

Authors: Qinyu Chen, Kwantae Kim, Chang Gao, Sheng Zhou, Taekwang Jang, Tobi Delbruck, Shih-Chii Liu

Subjects: Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[433] arXiv:2405.03827 (cross-list from cs.RO) [pdf, other]: Title: Direct learning of home vector direction for insect-inspired robot navigation

Authors: Michiel Firlefyn, Jesse Hagenaars, Guido de Croon

Comments: Published at ICRA 2024, project webpage at this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[434] arXiv:2405.03762 (cross-list from eess.IV) [pdf, other]: Title: Deep learning classifier of locally advanced rectal cancer treatment response from endoscopy images

Authors: Jorge Tapias Gomez, Aneesh Rangnekar, Hannah Williams, Hannah Thompson, Julio Garcia-Aguilar, Joshua Jesse Smith, Harini Veeraraghavan

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[435] arXiv:2405.03732 (cross-list from eess.IV) [pdf, ps, other]: Title: Accelerated MR Cholangiopancreatography with Deep Learning-based Reconstruction

Authors: Jinho Kim, Marcel Dominik Nickel, Florian Knoll

Comments: 20 pages, 6 figures, 2 tables

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[436] arXiv:2405.03730 (cross-list from cs.LG) [pdf, other]: Title: Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers

Authors: Johann Schmidt, Sebastian Stober

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[437] arXiv:2405.03713 (cross-list from eess.IV) [pdf, other]: Title: Improve Cross-Modality Segmentation by Treating MRI Images as Inverted CT Scans

Authors: Hartmut Häntze, Lina Xu, Leonhard Donle, Felix J. Dorfner, Alessa Hering, Lisa C. Adams, Keno K. Bressem

Comments: 3 pages, 2 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

