Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 32


Fri, 10 May 2024 (continued, showing last 54 of 86 entries)

[33] arXiv:2405.05615 [pdf, other]: Title: Memory-Space Visual Prompting for Efficient Vision-Language Fine-Tuning

Authors: Shibo Jie, Yehui Tang, Ning Ding, Zhi-Hong Deng, Kai Han, Yunhe Wang

Comments: Accepted to ICML2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[34] arXiv:2405.05614 [pdf, other]: Title: Depth Awakens: A Depth-perceptual Attention Fusion Network for RGB-D Camouflaged Object Detection

Authors: Xinran Liu, Lin Qi, Yuxuan Song, Qi Wen

Journal-ref: Xinran Liu, Lin Qi, Yuxuan Song, and Qi Wen. Depth awakens: A depth-perceptual attention fusion network for rgb-d camouflaged object detection. Image and Vision Computing, 143:104924, 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Networking and Internet Architecture (cs.NI)
[35] arXiv:2405.05613 [pdf, other]: Title: Robust Pseudo-label Learning with Neighbor Relation for Unsupervised Visible-Infrared Person Re-Identification

Authors: Xiangbo Yin, Jiangming Shi, Yachao Zhang, Yang Lu, Zhizhong Zhang, Yuan Xie, Yanyun Qu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36] arXiv:2405.05605 [pdf, other]: Title: Minimal Perspective Autocalibration

Authors: Andrea Porfiri Dal Cin, Timothy Duff, Luca Magri, Tomas Pajdla

Comments: 8 pages main paper + 2 pages references + 8 pages supplementary; to be presented at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[37] arXiv:2405.05587 [pdf, other]: Title: Navigate Beyond Shortcuts: Debiased Learning through the Lens of Neural Collapse

Authors: Yining Wang, Junjie Sun, Chenyue Wang, Mi Zhang, Min Yang

Comments: CVPR 2024 Highlight

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[38] arXiv:2405.05584 [pdf, other]: Title: A Survey on Backbones for Deep Video Action Recognition

Authors: Zixuan Tang, Youjun Zhao, Yuhang Wen, Mengyuan Liu

Comments: This paper has been accepted by ICME workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[39] arXiv:2405.05574 [pdf, other]: Title: Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft

Authors: Debabrata Pal, Anvita Singh, Saumya Saumya, Shouvik Das

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[40] arXiv:2405.05573 [pdf, other]: Title: Poisoning-based Backdoor Attacks for Arbitrary Target Label with Positive Triggers

Authors: Binxiao Huang, Jason Chun Lok, Chang Liu, Ngai Wong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[41] arXiv:2405.05553 [pdf, other]: Title: Towards Robust Physical-world Backdoor Attacks on Lane Detection

Authors: Xinwei Zhang, Aishan Liu, Tianyuan Zhang, Siyuan Liang, Xianglong Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[42] arXiv:2405.05552 [pdf, other]: Title: Bidirectional Progressive Transformer for Interaction Intention Anticipation

Authors: Zichen Zhang, Hongchen Luo, Wei Zhai, Yang Cao, Yu Kang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[43] arXiv:2405.05551 [pdf, ps, other]: Title: The object detection model uses combined extraction with KNN and RF classification

Authors: Florentina Tatrin Kurniati, Daniel HF Manongga, Irwan Sembiring, Sutarto Wijono, Roy Rudolf Huizen

Journal-ref: IJEECS, pp 436-445, Vol 35, No 1 July 2024; https://ijeecs.iaescore.com/index.php/IJEECS/article/view/35888

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[44] arXiv:2405.05538 [pdf, other]: Title: A Survey on Personalized Content Synthesis with Diffusion Models

Authors: Xulu Zhang, Xiao-Yong Wei, Wengyu Zhang, Jinlin Wu, Zhaoxiang Zhang, Zhen Lei, Qing Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45] arXiv:2405.05530 [pdf, other]: Title: NurtureNet: A Multi-task Video-based Approach for Newborn Anthropometry

Authors: Yash Khandelwal, Mayur Arvind, Sriram Kumar, Ashish Gupta, Sachin Kumar Danisetty, Piyush Bagad, Anish Madan, Mayank Lunayach, Aditya Annavajjala, Abhishek Maiti, Sansiddh Jain, Aman Dalmia, Namrata Deka, Jerome White, Jigar Doshi, Angjoo Kanazawa, Rahul Panicker, Alpan Raval, Srinivas Rana, Makarand Tapaswi

Comments: Accepted at CVPM Workshop at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46] arXiv:2405.05524 [pdf, other]: Title: Universal Adversarial Perturbations for Vision-Language Pre-trained Models

Authors: Peng-Fei Zhang, Zi Huang, Guangdong Bai

Comments: 9 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[47] arXiv:2405.05523 [pdf, other]: Title: Prompt When the Animal is: Temporal Animal Behavior Grounding with Positional Recovery Training

Authors: Sheng Yan, Xin Du, Zongying Li, Yi Wang, Hongcang Jin, Mengyuan Liu

Comments: Accepted by ICMEW 2024. arXiv admin note: text overlap with arXiv:2404.13657

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[48] arXiv:2405.05518 [pdf, other]: Title: DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map Construction

Authors: Siyu Li, Jiacheng Lin, Hao Shi, Jiaming Zhang, Song Wang, You Yao, Zhiyong Li, Kailun Yang

Comments: The source code will be made publicly available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO); Image and Video Processing (eess.IV)
[49] arXiv:2405.05502 [pdf, other]: Title: Towards Accurate and Robust Architectures via Neural Architecture Search

Authors: Yuwei Ou, Yuqi Feng, Yanan Sun

Comments: Accepted by CVPR2024. arXiv admin note: substantial text overlap with arXiv:2212.14049

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[50] arXiv:2405.05497 [pdf, other]: Title: Multi-Level Feature Fusion Network for Lightweight Stereo Image Super-Resolution

Authors: Yunxiang Li, Wenbin Zou, Qiaomu Wei, Feng Huang, Jing Wu

Comments: 10 pages, 7 figures, CVPRWorkshop NTIRE2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[51] arXiv:2405.05488 [pdf, ps, other]: Title: Advancing Head and Neck Cancer Survival Prediction via Multi-Label Learning and Deep Model Interpretation

Authors: Meixu Chen, Kai Wang, Jing Wang

Comments: 10 pages, 4 figures, 2 tables, 2 pages of supplementary material

Subjects: Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[52] arXiv:2405.05477 [pdf, other]: Title: DynaSeg: A Deep Dynamic Fusion Method for Unsupervised Image Segmentation Incorporating Feature Similarity and Spatial Continuity

Authors: Naimul Khan, Boujemaa Guermazi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[53] arXiv:2405.05446 [pdf, other]: Title: GDGS: Gradient Domain Gaussian Splatting for Sparse Representation of Radiance Fields

Authors: Yuanhao Gong

Comments: arXiv admin note: text overlap with arXiv:2404.09105

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[54] arXiv:2405.05428 [pdf, other]: Title: Adversary-Guided Motion Retargeting for Skeleton Anonymization

Authors: Thomas Carr, Depeng Xu, Aidong Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[55] arXiv:2405.05422 [pdf, other]: Title: EarthMatch: Iterative Coregistration for Fine-grained Localization of Astronaut Photography

Authors: Gabriele Berton, Gabriele Goletto, Gabriele Trivigno, Alex Stoken, Barbara Caputo, Carlo Masone

Comments: CVPR 2024 IMW - webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[56] arXiv:2405.05363 [pdf, other]: Title: LOC-ZSON: Language-driven Object-Centric Zero-Shot Object Retrieval and Navigation

Authors: Tianrui Guan, Yurou Yang, Harry Cheng, Muyuan Lin, Richard Kim, Rajasimman Madhivanan, Arnie Sen, Dinesh Manocha

Comments: Accepted to ICRA 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[57] arXiv:2405.05355 [pdf, other]: Title: Geometry-Informed Distance Candidate Selection for Adaptive Lightweight Omnidirectional Stereo Vision with Fisheye Images

Authors: Conner Pulling, Je Hon Tan, Yaoyu Hu, Sebastian Scherer

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[58] arXiv:2405.05354 [pdf, other]: Title: Transfer-LMR: Heavy-Tail Driving Behavior Recognition in Diverse Traffic Scenarios

Authors: Chirag Parikh, Ravi Shankar Mishra, Rohan Chandra, Ravi Kiran Sarvadevabhatla

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[59] arXiv:2405.05297 [pdf, ps, other]: Title: Deep Learning Method to Predict Wound Healing Progress Based on Collagen Fibers in Wound Tissue

Authors: Juan He, Xiaoyan Wang, Long Chen, Yunpeng Cai, Zhengshan Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[60] arXiv:2405.05295 [pdf, other]: Title: Relevant Irrelevance: Generating Alterfactual Explanations for Image Classifiers

Authors: Silvan Mertes, Tobias Huber, Christina Karle, Katharina Weitz, Ruben Schlagowski, Cristina Conati, Elisabeth André

Comments: Accepted at IJCAI 2024. arXiv admin note: text overlap with arXiv:2207.09374

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[61] arXiv:2405.05261 [pdf, other]: Title: 3D Holistic OR Anonymization

Authors: Tony Danjun Wang

Comments: This bachelor's thesis was the foundation of the paper "DisguisOR: Holistic Face Anonymization for the Operating Room" (see arXiv:2307.14241), published at IPCAI'23

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[62] arXiv:2405.05260 [pdf, other]: Title: Financial Table Extraction in Image Documents

Authors: William Watson, Bo Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[63] arXiv:2405.05956 (cross-list from cs.RO) [pdf, other]: Title: Probing Multimodal LLMs as World Models for Driving

Authors: Shiva Sreeram, Tsun-Hsuan Wang, Alaa Maalouf, Guy Rosman, Sertac Karaman, Daniela Rus

Comments: this https URL this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[64] arXiv:2405.05944 (cross-list from eess.IV) [pdf, other]: Title: MRISegmentator-Abdomen: A Fully Automated Multi-Organ and Structure Segmentation Tool for T1-weighted Abdominal MRI

Authors: Yan Zhuang, Tejas Sudharshan Mathai, Pritam Mukherjee, Brandon Khoury, Boah Kim, Benjamin Hou, Nusrat Rabbee, Ronald M. Summers

Comments: 23 pages, 13 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[65] arXiv:2405.05941 (cross-list from cs.RO) [pdf, other]: Title: Evaluating Real-World Robot Manipulation Policies in Simulation

Authors: Xuanlin Li, Kyle Hsu, Jiayuan Gu, Karl Pertsch, Oier Mees, Homer Rich Walke, Chuyuan Fu, Ishikaa Lunawat, Isabel Sieh, Sean Kirmani, Sergey Levine, Jiajun Wu, Chelsea Finn, Hao Su, Quan Vuong, Ted Xiao

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[66] arXiv:2405.05934 (cross-list from cs.LG) [pdf, other]: Title: Theoretical Guarantees of Data Augmented Last Layer Retraining Methods

Authors: Monica Welfert, Nathan Stromberg, Lalitha Sankar

Comments: Extended version of a paper accepted to ISIT 2024. arXiv admin note: text overlap with arXiv:2402.11039

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Machine Learning (stat.ML)
[67] arXiv:2405.05886 (cross-list from cs.LG) [pdf, other]: Title: Exploiting Autoencoder's Weakness to Generate Pseudo Anomalies

Authors: Marcella Astrid, Muhammad Zaigham Zaheer, Djamila Aouada, Seung-Ik Lee

Comments: SharedIt link: this https URL

Journal-ref: Astrid, M., Zaheer, M.Z., Aouada, D. and Lee, S.I., 2024. Exploiting autoencoder's weakness to generate pseudo anomalies. Neural Computing and Applications, pp.1-17

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[68] arXiv:2405.05876 (cross-list from cs.RO) [pdf, other]: Title: Composable Part-Based Manipulation

Authors: Weiyu Liu, Jiayuan Mao, Joy Hsu, Tucker Hermans, Animesh Garg, Jiajun Wu

Comments: Presented at CoRL 2023. For videos and additional results, see our website: this https URL

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[69] arXiv:2405.05847 (cross-list from cs.LG) [pdf, other]: Title: Learned feature representations are biased by complexity, learning order, position, and more

Authors: Andrew Kyle Lampinen, Stephanie C. Y. Chan, Katherine Hermann

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[70] arXiv:2405.05846 (cross-list from cs.CR) [pdf, other]: Title: Could It Be Generated? Towards Practical Analysis of Memorization in Text-To-Image Diffusion Models

Authors: Zhe Ma, Xuhong Zhang, Qingming Li, Tianyu Du, Wenzhi Chen, Zonghui Wang, Shouling Ji

Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[71] arXiv:2405.05836 (cross-list from cs.LG) [pdf, other]: Title: Informed Decision-Making through Advancements in Open Set Recognition and Unknown Sample Detection

Authors: Atefeh Mahdavi, Marco Carvalho

Comments: Accepted for proceedings of the 57th Hawaii International Conference on System Sciences: 10 pages, 6 figures, 3-6 January 2024, Honolulu, United States

Journal-ref: Atefeh, M., & Marco, C. (2024). "Informed Decision-Making through Advancements in Open Set Recognition and Unknown Sample Detection." Proceedings of the 57th Hawaii International Conference on System Sciences, 1090-1999

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[72] arXiv:2405.05828 (cross-list from cs.RO) [pdf, other]: Title: MAD-ICP: It Is All About Matching Data -- Robust and Informed LiDAR Odometry

Authors: Simone Ferrari, Luca Di Giammarino, Leonardo Brizi, Giorgio Grisetti

Comments: this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[73] arXiv:2405.05814 (cross-list from eess.IV) [pdf, ps, other]: Title: MSDiff: Multi-Scale Diffusion Model for Ultra-Sparse View CT Reconstruction

Authors: Pinhuang Tan, Mengxiao Geng, Jingya Lu, Liu Shi, Bin Huang, Qiegen Liu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[74] arXiv:2405.05800 (cross-list from cs.GR) [pdf, other]: Title: DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian Representation

Authors: Sitian Shen, Jing Xu, Yuheng Yuan, Xingyi Yang, Qiuhong Shen, Xinchao Wang

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[75] arXiv:2405.05792 (cross-list from cs.RO) [pdf, other]: Title: RoboHop: Segment-based Topological Map Representation for Open-World Visual Navigation

Authors: Sourav Garg, Krishan Rana, Mehdi Hosseinzadeh, Lachlan Mares, Niko Sünderhauf, Feras Dayoub, Ian Reid

Comments: Published at ICRA 2024; 9 pages, 8 figures

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[76] arXiv:2405.05787 (cross-list from cs.RO) [pdf, other]: Title: Autonomous Robotic Ultrasound System for Liver Follow-up Diagnosis: Pilot Phantom Study

Authors: Tianpeng Zhang (1), Sekeun Kim (2), Jerome Charton (2), Haitong Ma (1), Kyungsang Kim (2), Na Li (1), Quanzheng Li (2) ((1) SEAS, Harvard University (2) CAMCA, Massachusetts General Hospital and Harvard Medical School)

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Systems and Control (eess.SY)
[77] arXiv:2405.05695 (cross-list from cs.LG) [pdf, other]: Title: Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference Cost

Authors: Yuan Gao, Weizhong Zhang, Wenhan Luo, Lin Ma, Jin-Gang Yu, Gui-Song Xia, Jiayi Ma

Comments: Accepted to ICLR 2024

Journal-ref: International Conference on Learning Representations (ICLR), 2024

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[78] arXiv:2405.05667 (cross-list from eess.IV) [pdf, other]: Title: VM-DDPM: Vision Mamba Diffusion for Medical Image Synthesis

Authors: Zhihan Ju, Wanting Zhou

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[79] arXiv:2405.05658 (cross-list from eess.IV) [pdf, ps, other]: Title: Artificial intelligence for abnormality detection in high volume neuroimaging: a systematic review and meta-analysis

Authors: Siddharth Agarwal, David A. Wood, Mariusz Grzeda, Chandhini Suresh, Munaib Din, James Cole, Marc Modat, Thomas C Booth

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[80] arXiv:2405.05648 (cross-list from cs.RO) [pdf, other]: Title: ASGrasp: Generalizable Transparent Object Reconstruction and Grasping from RGB-D Active Stereo Camera

Authors: Jun Shi, Yong A, Yixiang Jin, Dingzhe Li, Haoyu Niu, Zhezhu Jin, He Wang

Comments: IEEE International Conference on Robotics and Automation (ICRA), 2024

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[81] arXiv:2405.05619 (cross-list from cs.LG) [pdf, other]: Title: Rectified Gaussian kernel multi-view k-means clustering

Authors: Kristina P. Sinaga

Comments: 13 pages, 1 figure, 7 Tables

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[82] arXiv:2405.05588 (cross-list from cs.LG) [pdf, other]: Title: Model Inversion Robustness: Can Transfer Learning Help?

Authors: Sy-Tuyen Ho, Koh Jun Hao, Keshigeyan Chandrasegaran, Ngoc-Bao Nguyen, Ngai-Man Cheung

Journal-ref: CVPR 2024

Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[83] arXiv:2405.05564 (cross-list from eess.IV) [pdf, other]: Title: Joint Edge Optimization Deep Unfolding Network for Accelerated MRI Reconstruction

Authors: Yue Cai, Yu Luo, Jie Ling, Shun Yao

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[84] arXiv:2405.05520 (cross-list from eess.IV) [pdf, other]: Title: Continuous max-flow augmentation of self-supervised few-shot learning on SPECT left ventricles

Authors: Ádám István Szűcs, Béla Kári, Oszkár Pártos

Comments: ISBI 2024 Accepted paper for presentation

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[85] arXiv:2405.05386 (cross-list from cs.LG) [pdf, other]: Title: Interpretability Needs a New Paradigm

Authors: Andreas Madsen, Himabindu Lakkaraju, Siva Reddy, Sarath Chandar

Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[86] arXiv:2405.05336 (cross-list from eess.IV) [pdf, other]: Title: Joint semi-supervised and contrastive learning enables zero-shot domain-adaptation and multi-domain segmentation

Authors: Alvaro Gomariz, Yusuke Kikuchi, Yun Yvonna Li, Thomas Albrecht, Andreas Maunz, Daniela Ferrara, Huanxiang Lu, Orcun Goksel

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

Thu, 9 May 2024 (showing first 50 of 76 entries)

[87] arXiv:2405.05259 [pdf, other]: Title: OpenESS: Event-based Semantic Scene Understanding with Open Vocabularies

Authors: Lingdong Kong, Youquan Liu, Lai Xing Ng, Benoit R. Cottereau, Wei Tsang Ooi

Comments: CVPR 2024 (Highlight); 26 pages, 12 figures, 11 tables; Code at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[88] arXiv:2405.05258 [pdf, other]: Title: Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous Driving

Authors: Lingdong Kong, Xiang Xu, Jiawei Ren, Wenwei Zhang, Liang Pan, Kai Chen, Wei Tsang Ooi, Ziwei Liu

Comments: Preprint; 17 pages, 6 figures, 8 tables; Code at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
[89] arXiv:2405.05256 [pdf, other]: Title: THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language Models

Authors: Prannay Kaul, Zhizhong Li, Hao Yang, Yonatan Dukler, Ashwin Swaminathan, C. J. Taylor, Stefano Soatto

Comments: In CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[90] arXiv:2405.05252 [pdf, other]: Title: Attention-Driven Training-Free Efficiency Enhancement of Diffusion Models

Authors: Hongjie Wang, Difan Liu, Yan Kang, Yijun Li, Zhe Lin, Niraj K. Jha, Yuchen Liu

Comments: Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Signal Processing (eess.SP)
[91] arXiv:2405.05241 [pdf, other]: Title: BenthicNet: A global compilation of seafloor images for deep learning applications

Authors: Scott C. Lowe, Benjamin Misiuk, Isaac Xu, Shakhboz Abdulazizov, Amit R. Baroi, Alex C. Bastos, Merlin Best, Vicki Ferrini, Ariell Friedman, Deborah Hart, Ove Hoegh-Guldberg, Daniel Ierodiaconou, Julia Mackin-McLaughlin, Kathryn Markey, Pedro S. Menandro, Jacquomo Monk, Shreya Nemani, John O'Brien, Elizabeth Oh, Luba Y. Reshitnyk, Katleen Robert, Chris M. Roelfsema, Jessica A. Sameoto, Alexandre C. G. Schimel, Jordan A. Thomson, Brittany R. Wilson, Melisa C. Wong, Craig J. Brown, Thomas Trappenberg

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[92] arXiv:2405.05237 [pdf, other]: Title: EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised Learning

Authors: Jingfeng Yao, Xinggang Wang, Yuehao Song, Huangxuan Zhao, Jun Ma, Yajie Chen, Wenyu Liu, Bo Wang

Comments: codes available at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[93] arXiv:2405.05224 [pdf, other]: Title: Imagine Flash: Accelerating Emu Diffusion Models with Backward Distillation

Authors: Jonas Kohler, Albert Pumarola, Edgar Schönfeld, Artsiom Sanakoyeu, Roshan Sumbaly, Peter Vajda, Ali Thabet

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[94] arXiv:2405.05216 [pdf, other]: Title: FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion Models

Authors: Jinglin Xu, Yijie Guo, Yuxin Peng

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[95] arXiv:2405.05173 [pdf, other]: Title: A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion Perspective

Authors: Huaiyuan Xu, Junliang Chen, Shiyu Meng, Yi Wang, Lap-Pui Chau

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[96] arXiv:2405.05164 [pdf, other]: Title: ProbRadarM3F: mmWave Radar based Human Skeletal Pose Estimation with Probability Map Guided Multi-Format Feature Fusion

Authors: Bing Zhu, Zixin He, Weiyi Xiong, Guanhua Ding, Jianan Liu, Tao Huang, Wei Chen, Wei Xiang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[97] arXiv:2405.05145 [pdf, other]: Title: Conformal Semantic Image Segmentation: Post-hoc Quantification of Predictive Uncertainty

Authors: Luca Mossina, Joseba Dalmau, Léo Andéol

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[98] arXiv:2405.05143 [pdf, other]: Title: Learning Object Semantic Similarity with Self-Supervision

Authors: Arthur Aubret, Timothy Schaumlöffel, Gemma Roig, Jochen Triesch

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
[99] arXiv:2405.05133 [pdf, other]: Title: Identifying every building's function in large-scale urban areas with multi-modality remote-sensing data

Authors: Zhuohong Li, Wei He, Jiepan Li, Hongyan Zhang

Comments: 5 pages, 7 figures, accepted by IGARSS 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[100] arXiv:2405.05130 [pdf, other]: Title: Multi-scale Bottleneck Transformer for Weakly Supervised Multimodal Violence Detection

Authors: Shengyang Sun, Xiaojin Gong

Comments: Accepted by ICME 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[101] arXiv:2405.05079 [pdf, other]: Title: Power Variable Projection for Initialization-Free Large-Scale Bundle Adjustment

Authors: Simon Weber, Je Hyeong Hong, Daniel Cremers

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[102] arXiv:2405.05057 [pdf, other]: Title: Real-Time Motion Detection Using Dynamic Mode Decomposition

Authors: Marco Mignacca, Simone Brugiapaglia, Jason J. Bramburger

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[103] arXiv:2405.05039 [pdf, other]: Title: Reviewing Intelligent Cinematography: AI research for camera-based video production

Authors: Adrian Azzarelli, Nantheera Anantrasirichai, David R Bull

Comments: For researchers and cinematographers. 43 pages including Table of Contents, List of Figures and Tables. We obtained permission to use Figures 5 and 11. All other Figures have been drawn by us

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[104] arXiv:2405.05031 [pdf, other]: Title: Mitigating Bias Using Model-Agnostic Data Attribution

Authors: Sander De Coninck, Wei-Cheng Wang, Sam Leroux, Pieter Simoens

Comments: Accepted to the 2024 IEEE CVPR Workshop on Fair, Data-efficient, and Trusted Computer Vision

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[105] arXiv:2405.05027 [pdf, other]: Title: StyleMamba: State Space Model for Efficient Text-driven Image Style Transfer

Authors: Zijia Wang, Zhi-Song Liu

Comments: Blind submission to ECAI 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[106] arXiv:2405.05016 [pdf, other]: Title: TGTM: TinyML-based Global Tone Mapping for HDR Sensors

Authors: Peter Todorov, Julian Hartig, Jan Meyer-Siemon, Martin Fiedler, Gregor Schewior

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[107] arXiv:2405.05012 [pdf, other]: Title: The Entropy Enigma: Success and Failure of Entropy Minimization

Authors: Ori Press, Ravid Shwartz-Ziv, Yann LeCun, Matthias Bethge

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[108] arXiv:2405.05010 [pdf, other]: Title: ${M^2D}$NeRF: Multi-Modal Decomposition NeRF with 3D Feature Fields

Authors: Ning Wang, Lefei Zhang, Angel X Chang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[109] arXiv:2405.05004 [pdf, other]: Title: TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object Tracking

Authors: Pengcheng Shao, Tianyang Xu, Zhangyong Tang, Linze Li, Xiao-Jun Wu, Josef Kittler

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[110] arXiv:2405.05001 [pdf, other]: Title: HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-Resolution

Authors: Shu-Chuan Chu, Zhi-Chao Dou, Jeng-Shyang Pan, Shaowei Weng, Junbao Li

Comments: 12 pages, 10 figures, conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[111] arXiv:2405.04997 [pdf, other]: Title: Bridging the Gap Between Saliency Prediction and Image Quality Assessment

Authors: Kirillov Alexey, Andrey Moskalenko, Dmitriy Vatolin

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[112] arXiv:2405.04974 [pdf, other]: Title: Discrepancy-based Diffusion Models for Lesion Detection in Brain MRI

Authors: Keqiang Fan, Xiaohao Cai, Mahesan Niranjan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[113] arXiv:2405.04971 [pdf, other]: Title: End-to-End Semi-Supervised approach with Modulated Object Queries for Table Detection in Documents

Authors: Iqraa Ehsan, Tahira Shehzadi, Didier Stricker, Muhammad Zeshan Afzal

Comments: ICDAR-IJDAR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[114] arXiv:2405.04969 [pdf, other]: Title: A review on discriminative self-supervised learning methods

Authors: Nikolaos Giakoumoglou, Tania Stathaki

Comments: 21 pages, 7 figures, 11 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[115] arXiv:2405.04964 [pdf, other]: Title: Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution

Authors: Yi Xiao, Qiangqiang Yuan, Kui Jiang, Yuzeng Chen, Qiang Zhang, Chia-Wen Lin

Comments: Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[116] arXiv:2405.04953 [pdf, other]: Title: Supervised Anomaly Detection for Complex Industrial Images

Authors: Aimira Baitieva, David Hurych, Victor Besnier, Olivier Bernard

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[117] arXiv:2405.04950 [pdf, other]: Title: VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context

Authors: Yunxin Li, Baotian Hu, Haoyuan Shi, Wei Wang, Longyue Wang, Min Zhang

Comments: 17 pages; Accepted by ICML 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[118] arXiv:2405.04943 [pdf, ps, other]: Title: Unsupervised Skin Feature Tracking with Deep Neural Networks

Authors: Jose Chang, Torbjörn E.M. Nordling

Comments: arXiv admin note: text overlap with arXiv:2112.14159

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[119] arXiv:2405.04940 [pdf, other]: Title: Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReID

Authors: Wentao Tan, Changxing Ding, Jiayu Jiang, Fei Wang, Yibing Zhan, Dapeng Tao

Comments: CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[120] arXiv:2405.04918 [pdf, other]: Title: Delve into Base-Novel Confusion: Redundancy Exploration for Few-Shot Class-Incremental Learning

Authors: Haichen Zhou, Yixiong Zou, Ruixuan Li, Yuhua Li, Kui Xiao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[121] arXiv:2405.04913 [pdf, other]: Title: Weakly-supervised Semantic Segmentation via Dual-stream Contrastive Learning of Cross-image Contextual Information

Authors: Qi Lai, Chi-Man Vong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[122] arXiv:2405.04909 [pdf, other]: Title: Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models

Authors: Zhengxing Lan, Hongbo Li, Lingshan Liu, Bo Fan, Yisheng Lv, Yilong Ren, Zhiyong Cui

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[123] arXiv:2405.04900 [pdf, other]: Title: Self-supervised Gait-based Emotion Representation Learning from Selective Strongly Augmented Skeleton Sequences

Authors: Cheng Song, Lu Lu, Zhen Ke, Long Gao, Shuai Ding

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[124] arXiv:2405.04889 [pdf, other]: Title: Fast LiDAR Upsampling using Conditional Diffusion Models

Authors: Sander Elias Magnussen Helgesen, Kazuto Nakashima, Jim Tørresen, Ryo Kurazume

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[125] arXiv:2405.04883 [pdf, other]: Title: Molecule-Space: Free Lunch in Unified Multimodal Space via Knowledge Fusion

Authors: Zehan Wang, Ziang Zhang, Xize Cheng, Rongjie Huang, Luping Liu, Zhenhui Ye, Haifeng Huang, Yang Zhao, Tao Jin, Peng Gao, Zhou Zhao

Comments: Accepted by ICML 2024. The code and checkpoints are released at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[126] arXiv:2405.04858 [pdf, other]: Title: Pedestrian Attribute Recognition as Label-balanced Multi-label Learning

Authors: Yibo Zhou, Hai-Miao Hu, Yirong Xiang, Xiaokang Zhang, Haotian Wu

Comments: Accepted as ICML2024 main conference paper

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2405.04834 [pdf, other]: Title: FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation

Authors: Xuehai He, Jian Zheng, Jacob Zhiyuan Fang, Robinson Piramuthu, Mohit Bansal, Vicente Ordonez, Gunnar A Sigurdsson, Nanyun Peng, Xin Eric Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[128] arXiv:2405.04815 [pdf, other]: Title: Proportion Estimation by Masked Learning from Label Proportion

Authors: Takumi Okuo, Kazuya Nishimura, Hiroaki Ito, Kazuhiro Terada, Akihiko Yoshizawa, Ryoma Bise

Comments: Accepted at The 3rd MICCAI workshop on Data Augmentation, Labeling, and Imperfections

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[129] arXiv:2405.04807 [pdf, other]: Title: Transformer Architecture for NetsDB

Authors: Subodh Kamble, Kunal Sunil Kasodekar

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[130] arXiv:2405.04800 [pdf, other]: Title: DeepDamageNet: A two-step deep-learning model for multi-disaster building damage segmentation and classification using satellite imagery

Authors: Irene Alisjahbana, Jiawei Li, Ben (Mullet) Strong, Yue Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[131] arXiv:2405.04788 [pdf, other]: Title: DiffMatch: Visual-Language Guidance Makes Better Semi-supervised Change Detector

Authors: Kaiyu Li, Xiangyong Cao, Yupeng Deng, Deyu Meng

Comments: 13 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[132] arXiv:2405.04782 [pdf, other]: Title: Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection

Authors: Zhaoxiang Zhang, Hanqiu Deng, Jinan Bao, Xingyu Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133] arXiv:2405.04771 [pdf, other]: Title: Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches

Authors: Qing Yu, Mikihiro Tanaka, Kent Fujiwara

Comments: Accepted to CVPR 2024, Project website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[134] arXiv:2405.04759 [pdf, ps, other]: Title: Multi-Label Out-of-Distribution Detection with Spectral Normalized Joint Energy

Authors: Yihan Mei, Xinyu Wang, Dell Zhang, Xiaoling Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[135] arXiv:2405.04741 [pdf, other]: Title: All in One Framework for Multimodal Re-identification in the Wild

Authors: He Li, Mang Ye, Ming Zhang, Bo Du

Comments: 12 pages, 3 figures, CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[136] arXiv:2405.04722 [pdf, other]: Title: Detecting and Refining HiRISE Image Patches Obscured by Atmospheric Dust

Authors: Kunal Sunil Kasodekar

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)

Thu, 9 May 2024 (showing first 50 of 76 entries)