Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 112

Tue, 21 May 2024 (continued, showing last 115 of 142 entries)

[113] arXiv:2405.11921 [pdf, other]: Title: MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections

Authors: Jiayue Liu, Xiao Tang, Freeman Cheng, Roy Yang, Zhihao Li, Jianzhuang Liu, Yi Huang, Jiaqi Lin, Shiyong Liu, Xiaofei Wu, Songcen Xu, Chun Yuan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[114] arXiv:2405.11914 [pdf, other]: Title: PT43D: A Probabilistic Transformer for Generating 3D Shapes from Single Highly-Ambiguous RGB Images

Authors: Yiheng Xiong, Angela Dai

Comments: 10 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[115] arXiv:2405.11913 [pdf, other]: Title: Diff-BGM: A Diffusion Model for Video Background Music Generation

Authors: Sizhe Li, Yiming Qin, Minghang Zheng, Xin Jin, Yang Liu

Comments: Accepted by CVPR 2024 (Poster)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[116] arXiv:2405.11905 [pdf, other]: Title: CSTA: CNN-based Spatiotemporal Attention for Video Summarization

Authors: Jaewon Son, Jaehun Park, Kwangsu Kim

Comments: Accepted at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[117] arXiv:2405.11903 [pdf, ps, other]: Title: A comprehensive overview of deep learning techniques for 3D point cloud classification and semantic segmentation

Authors: Sushmita Sarker, Prithul Sarker, Gunner Stone, Ryan Gorman, Alireza Tavakkoli, George Bebis, Javad Sattarvand

Comments: Published in Springer Nature (Machine Vision and Applications)

Journal-ref: Machine Vision and Applications 35, 67 (2024)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[118] arXiv:2405.11894 [pdf, other]: Title: Refining Coded Image in Human Vision Layer Using CNN-Based Post-Processing

Authors: Takahiro Shindo, Yui Tatsumi, Taiju Watanabe, Hiroshi Watanabe

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[119] arXiv:2405.11867 [pdf, other]: Title: Depth Prompting for Sensor-Agnostic Depth Estimation

Authors: Jin-Hwi Park, Chanhwi Jeong, Junoh Lee, Hae-Gon Jeon

Comments: Accepted at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
[120] arXiv:2405.11862 [pdf, other]: Title: SEMv3: A Fast and Robust Approach to Table Separation Line Detection

Authors: Chunxia Qin, Zhenrong Zhang, Pengfei Hu, Chenyu Liu, Jiefeng Ma, Jun Du

Comments: 9 pages, 6 figures, 5 tables. Accepted by IJCAI2024 main track

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[121] arXiv:2405.11852 [pdf, other]: Title: Evolving Storytelling: Benchmarks and Methods for New Character Customization with Diffusion Models

Authors: Xiyu Wang, Yufei Wang, Satoshi Tsutsui, Weisi Lin, Bihan Wen, Alex C. Kot

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[122] arXiv:2405.11850 [pdf, other]: Title: Rethinking Overlooked Aspects in Vision-Language Models

Authors: Yuan Liu, Le Tian, Xiao Zhou, Jie Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[123] arXiv:2405.11846 [pdf, other]: Title: EPPS: Advanced Polyp Segmentation via Edge Information Injection and Selective Feature Decoupling

Authors: Mengqi Lei, Xin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[124] arXiv:2405.11837 [pdf, other]: Title: Improving the Explain-Any-Concept by Introducing Nonlinearity to the Trainable Surrogate Model

Authors: Mounes Zaval, Sedat Ozer

Comments: This paper is accepted for publication at IEEE SIU conference, 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[125] arXiv:2405.11823 [pdf, other]: Title: Stereo-Knowledge Distillation from dpMV to Dual Pixels for Light Field Video Reconstruction

Authors: Aryan Garg, Raghav Mallampali, Akshat Joshi, Shrisudhan Govindarajan, Kaushik Mitra

Comments: International Conference of Computational Photography (ICCP 2024), 11 pages and 12 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[126] arXiv:2405.11822 [pdf, other]: Title: FeTT: Continual Class Incremental Learning via Feature Transformation Tuning

Authors: Sunyuan Qiang, Xuxin Lin, Yanyan Liang, Jun Wan, Du Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2405.11814 [pdf, other]: Title: Climatic & Anthropogenic Hazards to the Nasca World Heritage: Application of Remote Sensing, AI, and Flood Modelling

Authors: Masato Sakai, Marcus Freitag, Akihisa Sakurai, Conrad M Albrecht, Hendrik F Hamann

Comments: accepted at IGARSS 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
[128] arXiv:2405.11809 [pdf, other]: Title: Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices

Authors: Baiyu Pan, Jichao Jiao, Jianxing Pang, Jun Cheng

Comments: International Conference on Robotics and Automation (ICRA) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[129] arXiv:2405.11794 [pdf, other]: Title: ViViD: Video Virtual Try-on using Diffusion Models

Authors: Zixun Fang, Wei Zhai, Aimin Su, Hongliang Song, Kai Zhu, Mao Wang, Yu Chen, Zhiheng Liu, Yang Cao, Zheng-Jun Zha

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[130] arXiv:2405.11793 [pdf, other]: Title: MM-Retinal: Knowledge-Enhanced Foundational Pretraining with Fundus Image-Text Expertise

Authors: Ruiqi Wu, Chenran Zhang, Jianle Zhang, Yi Zhou, Tao Zhou, Huazhu Fu

Comments: Early Accepted by The International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[131] arXiv:2405.11770 [pdf, other]: Title: Learning Spatial Similarity Distribution for Few-shot Object Counting

Authors: Yuanwu Xu, Feifan Song, Haofeng Zhang

Comments: Accepted to IJCAI2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[132] arXiv:2405.11765 [pdf, other]: Title: DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment

Authors: Jianhong Han, Liang Chen, Yupei Wang

Comments: Manuscript submitted to IEEE Transactions on Image Processing

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133] arXiv:2405.11757 [pdf, other]: Title: DLAFormer: An End-to-End Transformer For Document Layout Analysis

Authors: Jiawei Wang, Kai Hu, Qiang Huo

Comments: ICDAR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[134] arXiv:2405.11754 [pdf, other]: Title: Versatile Teacher: A Class-aware Teacher-student Framework for Cross-domain Adaptation

Authors: Runou Yang, Tian Tian, Jinwen Tian

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[135] arXiv:2405.11732 [pdf, ps, other]: Title: Quality assurance of organs-at-risk delineation in radiotherapy

Authors: Yihao Zhao, Cuiyun Yuan, Ying Liang, Yang Li, Chunxia Li, Man Zhao, Jun Hu, Wei Liu, Chenbin Liu

Comments: 14 pages, 5 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[136] arXiv:2405.11690 [pdf, other]: Title: InterAct: Capture and Modelling of Realistic, Expressive and Interactive Activities between Two Persons in Daily Scenarios

Authors: Yinghao Huang, Leo Ho, Dafei Qin, Mingyi Shi, Taku Komura

Comments: The first two authors contributed equally to this work

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137] arXiv:2405.11685 [pdf, other]: Title: ColorFoil: Investigating Color Blindness in Large Vision and Language Models

Authors: Ahnaf Mozib Samin, M. Firoz Ahmed, Md. Mushtaq Shahriyar Rafee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[138] arXiv:2405.11682 [pdf, other]: Title: FADet: A Multi-sensor 3D Object Detection Network based on Local Featured Attention

Authors: Ziang Guo, Zakhar Yagudin, Selamawit Asfaw, Artem Lykov, Dzmitry Tsetserukou

Comments: Submitted to IEEE

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[139] arXiv:2405.11677 [pdf, other]: Title: Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries

Authors: Christiaan G.A. Viviers, Lena Filatova, Maurice Termeer, Peter H.N. de With, Fons van der Sommen

Comments: Early author version of paper. Refer to the full paper at this https URL

Journal-ref: IEEE Transactions on Image Processing (2024) (Volume: 33) Page(s): 2462 - 2476

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[140] arXiv:2405.11675 [pdf, other]: Title: Deep Ensemble Art Style Recognition

Authors: Orfeas Menis-Mastromichalakis, Natasa Sofou, Giorgos Stamou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[141] arXiv:2405.11655 [pdf, other]: Title: Track Anything Rapter(TAR)

Authors: Tharun V. Puthanveettil, Fnu Obaid ur Rahman

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[142] arXiv:2405.11643 [pdf, other]: Title: Morphological Prototyping for Unsupervised Slide Representation Learning in Computational Pathology

Authors: Andrew H. Song, Richard J. Chen, Tong Ding, Drew F.K. Williamson, Guillaume Jaume, Faisal Mahmood

Comments: CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Applications (stat.AP)
[143] arXiv:2405.11629 [pdf, other]: Title: Searching Realistic-Looking Adversarial Objects For Autonomous Driving Systems

Authors: Shengxiang Sun, Shenzhe Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[144] arXiv:2405.11621 [pdf, ps, other]: Title: Computer Vision in the Food Industry: Accurate, Real-time, and Automatic Food Recognition with Pretrained MobileNetV2

Authors: Shayan Rokhva, Babak Teimourpour, Amir Hossein Soltani

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[145] arXiv:2405.11618 [pdf, other]: Title: Transcriptomics-guided Slide Representation Learning in Computational Pathology

Authors: Guillaume Jaume, Lukas Oldenburg, Anurag Vaidya, Richard J. Chen, Drew F.K. Williamson, Thomas Peeters, Andrew H. Song, Faisal Mahmood

Comments: CVPR'24, Oral

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[146] arXiv:2405.11616 [pdf, other]: Title: Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention

Authors: Peng Li, Yuan Liu, Xiaoxiao Long, Feihu Zhang, Cheng Lin, Mengfei Li, Xingqun Qi, Shanghang Zhang, Wenhan Luo, Ping Tan, Wenping Wang, Qifeng Liu, Yike Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147] arXiv:2405.11614 [pdf, other]: Title: Nickel and Diming Your GAN: A Dual-Method Approach to Enhancing GAN Efficiency via Knowledge Distillation

Authors: Sangyeop Yeo, Yoojin Jang, Jaejun Yoo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[148] arXiv:2405.11582 [pdf, other]: Title: SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization

Authors: Jialong Guo, Xinghao Chen, Yehui Tang, Yunhe Wang

Comments: Accepted to ICML 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[149] arXiv:2405.11574 [pdf, other]: Title: Reproducibility Study of CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification

Authors: Manan Shah, Yash Bhalgat

Comments: Reproducibility study

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[150] arXiv:2405.11564 [pdf, other]: Title: CRF360D: Monocular 360 Depth Estimation via Spherical Fully-Connected CRFs

Authors: Zidong Cao, Lin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[151] arXiv:2405.11551 [pdf, other]: Title: An Invisible Backdoor Attack Based On Semantic Feature

Authors: Yangming Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[152] arXiv:2405.11536 [pdf, other]: Title: RobMOT: Robust 3D Multi-Object Tracking by Observational Noise and State Estimation Drift Mitigation on LiDAR PointCloud

Authors: Mohamed Nagy, Naoufel Werghi, Bilal Hassan, Jorge Dias, Majid Khonji

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[153] arXiv:2405.11526 [pdf, other]: Title: Register assisted aggregation for Visual Place Recognition

Authors: Xuan Yu, Zhenyong Fu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[154] arXiv:2405.11523 [pdf, other]: Title: Diffusion-Based Hierarchical Image Steganography

Authors: Youmin Xu, Xuanyu Zhang, Jiwen Yu, Chong Mou, Xiandong Meng, Jian Zhang

Comments: arXiv admin note: text overlap with arXiv:2305.16936

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[155] arXiv:2405.11511 [pdf, other]: Title: Online Action Representation using Change Detection and Symbolic Programming

Authors: Vishnu S Nair, Sneha Sree, Jayaraj Joseph, Mohanasankar Sivaprakasam

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[156] arXiv:2405.11501 [pdf, other]: Title: DogFLW: Dog Facial Landmarks in the Wild Dataset

Authors: George Martvel, Greta Abele, Annika Bremhorst, Chiara Canori, Nareed Farhat, Giulia Pedretti, Ilan Shimshoni, Anna Zamansky

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[157] arXiv:2405.11498 [pdf, other]: Title: The Effectiveness of Edge Detection Evaluation Metrics for Automated Coastline Detection

Authors: Conor O'Sullivan, Seamus Coveney, Xavier Monteys, Soumyabrata Dev

Journal-ref: 2023 Photonics & Electromagnetics Research Symposium (PIERS)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[158] arXiv:2405.11496 [pdf, other]: Title: DEMO: A Statistical Perspective for Efficient Image-Text Matching

Authors: Fan Zhang, Xian-Sheng Hua, Chong Chen, Xiao Luo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)
[159] arXiv:2405.11494 [pdf, other]: Title: Automated Coastline Extraction Using Edge Detection Algorithms

Authors: Conor O'Sullivan, Seamus Coveney, Xavier Monteys, Soumyabrata Dev

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[160] arXiv:2405.11493 [pdf, other]: Title: Point Cloud Compression with Implicit Neural Representations: A Unified Framework

Authors: Hongning Ruan, Yulin Shao, Qianqian Yang, Liang Zhao, Dusit Niyato

Comments: 6 Pages, 6 Figures, submitted to IEEE ICCC

Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Signal Processing (eess.SP)
[161] arXiv:2405.11491 [pdf, other]: Title: BOSC: A Backdoor-based Framework for Open Set Synthetic Image Attribution

Authors: Jun Wang, Benedetta Tondi, Mauro Barni

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[162] arXiv:2405.11487 [pdf, other]: Title: "Previously on ..." From Recaps to Story Summarization

Authors: Aditya Kumar Singh, Dhruv Srivastava, Makarand Tapaswi

Comments: CVPR 2024; Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[163] arXiv:2405.11483 [pdf, other]: Title: MICap: A Unified Model for Identity-aware Movie Descriptions

Authors: Haran Raajesh, Naveen Reddy Desanur, Zeeshan Khan, Makarand Tapaswi

Comments: CVPR 2024, Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164] arXiv:2405.11481 [pdf, other]: Title: Physics-aware Hand-object Interaction Denoising

Authors: Haowen Luo, Yunze Liu, Li Yi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[165] arXiv:2405.11478 [pdf, other]: Title: Unsupervised Image Prior via Prompt Learning and CLIP Semantic Guidance for Low-Light Image Enhancement

Authors: Igor Morawski, Kai He, Shusil Dangi, Winston H. Hsu

Comments: Accepted to CVPR 2024 Workshop NTIRE: New Trends in Image Restoration and Enhancement workshop and Challenges

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[166] arXiv:2405.11476 [pdf, other]: Title: NubbleDrop: A Simple Way to Improve Matching Strategy for Prompted One-Shot Segmentation

Authors: Zhiyu Xu, Qingliang Chen

Comments: Under Review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[167] arXiv:2405.11473 [pdf, other]: Title: FIFO-Diffusion: Generating Infinite Videos from Text without Training

Authors: Jihwan Kim, Junoh Kang, Jinyoung Choi, Bohyung Han

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[168] arXiv:2405.11468 [pdf, other]: Title: Emphasizing Crucial Features for Efficient Image Restoration

Authors: Hu Gao, Bowen Ma, Ying Zhang, Jingfan Yang, Jing Yang, Depeng Dang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[169] arXiv:2405.11467 [pdf, other]: Title: AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation

Authors: Suorong Yang, Peijia Li, Xin Xiong, Furao Shen, Jian Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[170] arXiv:2405.11448 [pdf, other]: Title: Cross-Domain Knowledge Distillation for Low-Resolution Human Pose Estimation

Authors: Zejun Gu, Zhong-Qiu Zhao, Henghui Ding, Hao Shen, Zhao Zhang, De-Shuang Huang

Comments: 11 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171] arXiv:2405.11442 [pdf, other]: Title: Unifying 3D Vision-Language Understanding via Promptable Queries

Authors: Ziyu Zhu, Zhuofan Zhang, Xiaojian Ma, Xuesong Niu, Yixin Chen, Baoxiong Jia, Zhidong Deng, Siyuan Huang, Qing Li

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172] arXiv:2405.11437 [pdf, other]: Title: The First Swahili Language Scene Text Detection and Recognition Dataset

Authors: Fadila Wendigoundi Douamba, Jianjun Song, Ling Fu, Yuliang Liu, Xiang Bai

Comments: Accepted to ICDAR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[173] arXiv:2405.11351 [pdf, other]: Title: PlantTracing: Tracing Arabidopsis Thaliana Apex with CenterTrack

Authors: Yuanzhe Liu, Yixiang Mao, Yao Wang

Comments: 4 pages, 13 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174] arXiv:2405.11345 [pdf, ps, other]: Title: City-Scale Multi-Camera Vehicle Tracking System with Improved Self-Supervised Camera Link Model

Authors: Yuqiang Lin, Sam Lockyer, Adrian Evans, Markus Zarbock, Nic Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[175] arXiv:2405.11338 [pdf, ps, other]: Title: EyeFound: A Multimodal Generalist Foundation Model for Ophthalmic Imaging

Authors: Danli Shi, Weiyi Zhang, Xiaolan Chen, Yexin Liu, Jianchen Yang, Siyu Huang, Yih Chung Tham, Yingfeng Zheng, Mingguang He

Comments: 21 pages, 2 figures, 4 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[176] arXiv:2405.11337 [pdf, other]: Title: A Unified Approach Towards Active Learning and Out-of-Distribution Detection

Authors: Sebastian Schmidt, Leonard Schenk, Leo Schwinn, Stephan Günnemann

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177] arXiv:2405.11336 [pdf, other]: Title: UPAM: Unified Prompt Attack in Text-to-Image Generation Models Against Both Textual Filters and Visual Checkers

Authors: Duo Peng, Qiuhong Ke, Jun Liu

Comments: Accepted by ICML2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178] arXiv:2405.11315 [pdf, other]: Title: MediCLIP: Adapting CLIP for Few-shot Medical Image Anomaly Detection

Authors: Ximiao Zhang, Min Xu, Dehui Qiu, Ruixin Yan, Ning Lang, Xiuzhuang Zhou

Comments: 12 pages, 3 figures, 5 tables, early accepted at MICCAI 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[179] arXiv:2405.11293 [pdf, other]: Title: InfRS: Incremental Few-Shot Object Detection in Remote Sensing Images

Authors: Wuzhou Li, Jiawei Zhou, Xiang Li, Yi Cao, Guang Jin, Xuemin Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[180] arXiv:2405.11286 [pdf, other]: Title: Motion Avatar: Generate Human and Animal Avatars with Arbitrary Motion

Authors: Zeyu Zhang, Yiran Wang, Biao Wu, Shuo Chen, Zhiyuan Zhang, Shiya Huang, Wenbo Zhang, Meng Fang, Ling Chen, Yang Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181] arXiv:2405.11276 [pdf, other]: Title: Visible and Clear: Finding Tiny Objects in Difference Map

Authors: Bing Cao, Haiyu Yao, Pengfei Zhu, Qinghua Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[182] arXiv:2405.11270 [pdf, other]: Title: HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos

Authors: Qifeng Chen, Rengan Xie, Kai Huang, Qi Wang, Wenting Zheng, Rong Li, Yuchi Huo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[183] arXiv:2405.11252 [pdf, other]: Title: Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching

Authors: Xingyu Miao, Haoran Duan, Varun Ojha, Jun Song, Tejal Shah, Yang Long, Rajiv Ranjan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184] arXiv:2405.11240 [pdf, other]: Title: Testing the Performance of Face Recognition for People with Down Syndrome

Authors: Christian Rathgeb, Mathias Ibsen, Denise Hartmann, Simon Hradetzky, Berglind Ólafsdóttir

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185] arXiv:2405.11236 [pdf, other]: Title: TriLoRA: Integrating SVD for Advanced Style Personalization in Text-to-Image Generation

Authors: Chengcheng Feng, Mu He, Qiuyu Tian, Haojie Yin, Xiaofang Zhao, Hongwei Tang, Xingqiang Wei

Comments: Accepted by AI for Content Creation (AI4CC) workshop at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[186] arXiv:2405.11205 [pdf, other]: Title: Fuse & Calibrate: A bi-directional Vision-Language Guided Framework for Referring Image Segmentation

Authors: Yichen Yan, Xingjian He, Sihan Chen, Shichen Lu, Jing Liu

Comments: 12 pages, 4 figures, ICIC2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[187] arXiv:2405.11190 [pdf, other]: Title: ReasonPix2Pix: Instruction Reasoning Dataset for Advanced Image Editing

Authors: Ying Jin, Pengyang Ling, Xiaoyi Dong, Pan Zhang, Jiaqi Wang, Dahua Lin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[188] arXiv:2405.11180 [pdf, other]: Title: GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition

Authors: Mallika Garg, Debashis Ghosh, Pyari Mohan Pradhan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[189] arXiv:2405.11165 [pdf, other]: Title: Automated Multi-level Preference for MLLMs

Authors: Mengxi Zhang, Kang Rong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[190] arXiv:2405.11158 [pdf, other]: Title: Dusk Till Dawn: Self-supervised Nighttime Stereo Depth Estimation using Visual Foundation Models

Authors: Madhu Vankadari, Samuel Hodgson, Sangyun Shin, Kaichen Zhou, Andrew Markham, Niki Trigoni

Comments: The paper is published at ICRA 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[191] arXiv:2405.11154 [pdf, other]: Title: Revisiting the Robust Generalization of Adversarial Prompt Tuning

Authors: Fan Yang, Mingxuan Xia, Sangzhou Xia, Chicheng Ma, Hui Hui

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[192] arXiv:2405.11151 [pdf, other]: Title: Multi-scale Information Sharing and Selection Network with Boundary Attention for Polyp Segmentation

Authors: Xiaolu Kang, Zhuoqi Ma, Kang Liu, Yunan Li, Qiguang Miao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[193] arXiv:2405.11145 [pdf, other]: Title: Detecting Multimodal Situations with Insufficient Context and Abstaining from Baseless Predictions

Authors: Junzhang Liu, Zhecan Wang, Hammad Ayyubi, Haoxuan You, Chris Thomas, Rui Sun, Shih-Fu Chang, Kai-Wei Chang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[194] arXiv:2405.11129 [pdf, other]: Title: MotionGS : Compact Gaussian Splatting SLAM by Motion Filter

Authors: Xinli Guo, Peng Han, Weidong Zhang, Hongtian Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[195] arXiv:2405.11126 [pdf, other]: Title: Flexible Motion In-betweening with Diffusion Models

Authors: Setareh Cohan, Guy Tevet, Daniele Reda, Xue Bin Peng, Michiel van de Panne

Comments: SIGGRAPH 2024. For project page and code, see this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[196] arXiv:2405.11112 [pdf, other]: Title: Enhancing Understanding Through Wildlife Re-Identification

Authors: J. Buitenhuis

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[197] arXiv:2405.11067 [pdf, other]: Title: Bayesian Learning-driven Prototypical Contrastive Loss for Class-Incremental Learning

Authors: Nisha L. Raichur, Lucas Heublein, Tobias Feigl, Alexander Rügamer, Christopher Mutschler, Felix Ott

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[198] arXiv:2405.11021 [pdf, other]: Title: Photorealistic 3D Urban Scene Reconstruction and Point Cloud Extraction using Google Earth Imagery and Gaussian Splatting

Authors: Kyle Gao, Dening Lu, Hongjie He, Linlin Xu, Jonathan Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[199] arXiv:2405.10954 [pdf, ps, other]: Title: Multimodal CLIP Inference for Meta-Few-Shot Image Classification

Authors: Constance Ferragu, Philomene Chagniot, Vincent Coyette

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[200] arXiv:2405.10952 [pdf, other]: Title: VICAN: Very Efficient Calibration Algorithm for Large Camera Networks

Authors: Gabriel Moreira, Manuel Marques, João Paulo Costeira, Alexander Hauptmann

Comments: To appear at the IEEE International Conference on Robotics and Automation (ICRA), 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[201] arXiv:2405.10951 [pdf, other]: Title: Block Selective Reprogramming for On-device Training of Vision Transformers

Authors: Sreetama Sarkar, Souvik Kundu, Kai Zheng, Peter A. Beerel

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[202] arXiv:2405.10949 [pdf, other]: Title: Global License Plate Dataset

Authors: Siddharth Agrawal

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[203] arXiv:2405.10948 [pdf, other]: Title: Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery

Authors: Guankun Wang, Long Bai, Wan Jun Nah, Jie Wang, Zhaoxi Zhang, Zhen Chen, Jinlin Wu, Mobarakol Islam, Hongbin Liu, Hongliang Ren

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO); Image and Video Processing (eess.IV)
[204] arXiv:2405.10947 [pdf, other]: Title: Depth-aware Panoptic Segmentation

Authors: Tuan Nguyen, Max Mehltretter, Franz Rottensteiner

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[205] arXiv:2405.10946 [pdf, other]: Title: Application of Tensorized Neural Networks for Cloud Classification

Authors: Alifu Xiafukaiti, Devanshu Garg, Aruto Hosaka, Koichi Yanagisawa, Yuichiro Minato, Tsuyoshi Yoshida

Comments: 11 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Atmospheric and Oceanic Physics (physics.ao-ph)
[206] arXiv:2405.12171 (cross-list from cs.SE) [pdf, other]: Title: State of the Practice for Medical Imaging Software

Authors: W. Spencer Smith, Ao Dong, Jacques Carette, Michael D. Noseworthy

Comments: 73 pages, 14 figures, 12 tables

Subjects: Software Engineering (cs.SE); Computer Vision and Pattern Recognition (cs.CV)
[207] arXiv:2405.11880 (cross-list from cs.LG) [pdf, other]: Title: Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs

Authors: Siyu Lou, Yuntian Chen, Xiaodan Liang, Liang Lin, Quanshi Zhang

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[208] arXiv:2405.11829 (cross-list from cs.LG) [pdf, other]: Title: Adversarially Diversified Rehearsal Memory (ADRM): Mitigating Memory Overfitting Challenge in Continual Learning

Authors: Hikmat Khan, Ghulam Rasool, Nidhal Carla Bouaynaya

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[209] arXiv:2405.11708 (cross-list from cs.LG) [pdf, other]: Title: Adaptive Batch Normalization Networks for Adversarial Robustness

Authors: Shao-Yuan Lo, Vishal M. Patel

Comments: Accepted at IEEE International Conference on Advanced Video and Signal-based Surveillance (AVSS) 2024

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[210] arXiv:2405.11659 (cross-list from cs.RO) [pdf, other]: Title: Auto-Platoon : Freight by example

Authors: Tharun V. Puthanveettil, Abhijay Singh, Yashveer Jain, Vinay Bukka, Sameer Arjun S

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[211] arXiv:2405.11640 (cross-list from cs.AI) [pdf, other]: Title: Inquire, Interact, and Integrate: A Proactive Agent Collaborative Framework for Zero-Shot Multimodal Medical Reasoning

Authors: Zishan Gu, Fenglin Liu, Changchang Yin, Ping Zhang

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[212] arXiv:2405.11598 (cross-list from eess.IV) [pdf, other]: Title: AI-Assisted Diagnosis for Covid-19 CXR Screening: From Data Collection to Clinical Validation

Authors: Carlo Alberto Barbano, Riccardo Renzulli, Marco Grosso, Domenico Basile, Marco Busso, Marco Grangetto

Comments: Accepted at 21st IEEE International Symposium on Biomedical Imaging (ISBI)

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[213] arXiv:2405.11533 (cross-list from cs.LG) [pdf, other]: Title: Hierarchical Selective Classification

Authors: Shani Goren, Ido Galil, Ran El-Yaniv

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[214] arXiv:2405.11492 (cross-list from cs.RO) [pdf, other]: Title: Enhancing Vehicle Aerodynamics with Deep Reinforcement Learning in Voxelised Models

Authors: Jignesh Patel, Yannis Spyridis, Vasileios Argyriou

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[215] arXiv:2405.11386 (cross-list from eess.IV) [pdf, other]: Title: Liver Fat Quantification Network with Body Shape

Authors: Qiyue Wang, Wu Xue, Xiaoke Zhang, Fang Jin, James Hahn

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[216] arXiv:2405.11326 (cross-list from cs.LG) [pdf, other]: Title: On the Trajectory Regularity of ODE-based Diffusion Sampling

Authors: Defang Chen, Zhenyu Zhou, Can Wang, Chunhua Shen, Siwei Lyu

Comments: ICML 2024, 30 pages

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[217] arXiv:2405.11320 (cross-list from cs.LG) [pdf, other]: Title: Sampling Strategies for Mitigating Bias in Face Synthesis Methods

Authors: Emmanouil Maragkoudakis, Symeon Papadopoulos, Iraklis Varlamis, Christos Diou

Comments: Accepted to the BIAS 2023 ECML-PKDD Workshop

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[218] arXiv:2405.11301 (cross-list from cs.CL) [pdf, other]: Title: Enhancing Fine-Grained Image Classifications via Cascaded Vision Language Models

Authors: Canshi Wei

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[219] arXiv:2405.11298 (cross-list from cs.RO) [pdf, other]: Title: Visual Episodic Memory-based Exploration

Authors: Jack Vice, Natalie Ruiz-Sanchez, Pamela K. Douglas, Gita Sukthankar

Comments: FLAIRS 2023, 7 pages, 11 figures

Journal-ref: The International FLAIRS Conference Proceedings. Vol. 36. 2023

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[220] arXiv:2405.11295 (cross-list from eess.IV) [pdf, ps, other]: Title: Medical Image Analysis for Detection, Treatment and Planning of Disease using Artificial Intelligence Approaches

Authors: Nand Lal Yadav, Satyendra Singh, Rajesh Kumar, Sudhakar Singh

Comments: 10 pages, 3 figures

Journal-ref: International Journal of Microsystems and IoT, Vol. 1, Issue 5, pp. 278-287, 2023

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[221] arXiv:2405.11289 (cross-list from eess.IV) [pdf, other]: Title: Diffusion Model Driven Test-Time Image Adaptation for Robust Skin Lesion Classification

Authors: Ming Hu, Siyuan Yan, Peng Xia, Feilong Tang, Wenxue Li, Peibo Duan, Lin Zhang, Zongyuan Ge

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[222] arXiv:2405.11273 (cross-list from cs.AI) [pdf, other]: Title: Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts

Authors: Yunxin Li, Shenyuan Jiang, Baotian Hu, Longyue Wang, Wanqi Zhong, Wenhan Luo, Lin Ma, Min Zhang

Comments: 22 pages, 13 figures. Project website: this https URL. Work in progress

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[223] arXiv:2405.11176 (cross-list from cs.RO) [pdf, other]: Title: Outlier-Robust Long-Term Robotic Mapping Leveraging Ground Segmentation

Authors: Hyungtae Lim

Comments: 2 pages, 4 figures

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[224] arXiv:2405.11133 (cross-list from eess.IV) [pdf, ps, other]: Title: XCAT-2.0: A Comprehensive Library of Personalized Digital Twins Derived from CT Scans

Authors: Lavsen Dahal, Mobina Ghojoghnejad, Dhrubajyoti Ghosh, Yubraj Bhandari, David Kim, Fong Chi Ho, Fakrul Islam Tushar, Ehsan Abadi, Ehsan Samei, Joseph Lo, Paul Segars

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[225] arXiv:2405.11064 (cross-list from eess.SP) [pdf, other]: Title: TVCondNet: A Conditional Denoising Neural Network for NMR Spectroscopy

Authors: Zihao Zou, Shirin Shoushtari, Jiaming Liu, Jialiang Zhang, Patrick Judge, Emilia Santana, Alison Lim, Marcus Foston, Ulugbek S. Kamilov

Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV)
[226] arXiv:2405.11029 (cross-list from cs.LG) [pdf, other]: Title: Generative Artificial Intelligence: A Systematic Review and Applications

Authors: Sandeep Singh Sengar, Affan Bin Hasan, Sanjay Kumar, Fiona Carroll

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[227] arXiv:2405.10950 (cross-list from eess.IV) [pdf, ps, other]: Title: Classification of colorectal primer carcinoma from normal colon with mid-infrared spectra

Authors: B. Borkovits, E. Kontsek, A. Pesti, P. Gordon, S. Gergely, I. Csabai, A. Kiss, P. Pollner

Comments: 15 pages, 5 figures, 4 tables, Conferentia Chemometrica 2023 special edition, for the original digital location, see this https URL, digital biblio info: (2024) e3542

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Tissues and Organs (q-bio.TO)

Mon, 20 May 2024

[228] arXiv:2405.10934 [pdf, other]: Title: Reconstruction of Manipulated Garment with Guided Deformation Prior

Authors: Ren Li, Corentin Dumery, Zhantao Deng, Pascal Fua

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[229] arXiv:2405.10913 [pdf, other]: Title: Blackbox Adaptation for Medical Image Segmentation

Authors: Jay N. Paranjape, Shameema Sikder, S. Swaroop Vedula, Vishal M. Patel

Comments: Accepted early at MICCAI 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[230] arXiv:2405.10885 [pdf, other]: Title: FA-Depth: Toward Fast and Accurate Self-supervised Monocular Depth Estimation

Authors: Fei Wang, Jun Cheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[231] arXiv:2405.10879 [pdf, other]: Title: One registration is worth two segmentations

Authors: Shiqi Huang, Tingfa Xu, Ziyi Shen, Shaheer Ullah Saeed, Wen Yan, Dean Barratt, Yipeng Hu

Comments: Early Accepted by MICCAI2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[232] arXiv:2405.10871 [pdf, other]: Title: BraTS-Path Challenge: Assessing Heterogeneous Histopathologic Brain Tumor Sub-regions

Authors: Spyridon Bakas, Siddhesh P. Thakur, Shahriar Faghani, Mana Moassefi, Ujjwal Baid, Verena Chung, Sarthak Pati, Shubham Innani, Bhakti Baheti, Jake Albrecht, Alexandros Karargyris, Hasan Kassem, MacLean P. Nasrallah, Jared T. Ahrendsen, Valeria Barresi, Maria A. Gubbiotti, Giselle Y. López, Calixto-Hope G. Lucas, Michael L. Miller, Lee A. D. Cooper, Jason T. Huse, William R. Bell

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[233] arXiv:2405.10868 [pdf, other]: Title: Air Signing and Privacy-Preserving Signature Verification for Digital Documents

Authors: P. Sarveswarasarma, T. Sathulakjan, V. J. V. Godfrey, Thanuja D. Ambegoda

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[234] arXiv:2405.10864 [pdf, other]: Title: Improving face generation quality and prompt following with synthetic captions

Authors: Michail Tarasiou, Stylianos Moschoglou, Jiankang Deng, Stefanos Zafeiriou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[235] arXiv:2405.10842 [pdf, ps, other]: Title: Automated Radiology Report Generation: A Review of Recent Advances

Authors: Phillip Sloan, Philip Clatworthy, Edwin Simpson, Majid Mirmehdi

Comments: 24 pages, 8 figures, 6 tables. Submitted to IEEE Reviews in Biomedical Engineering

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[236] arXiv:2405.10832 [pdf, other]: Title: Open-Vocabulary Spatio-Temporal Action Detection

Authors: Tao Wu, Shuqiu Ge, Jie Qin, Gangshan Wu, Limin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[237] arXiv:2405.10802 [pdf, other]: Title: Reduced storage direct tensor ring decomposition for convolutional neural networks compression

Authors: Mateusz Gabor, Rafał Zdunek

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[238] arXiv:2405.10748 [pdf, other]: Title: Deep Data Consistency: a Fast and Robust Diffusion Model-based Solver for Inverse Problems

Authors: Hanyu Chen, Zhixiu Hao, Liying Xiao

Comments: Codes: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[239] arXiv:2405.10739 [pdf, other]: Title: Efficient Multimodal Large Language Models: A Survey

Authors: Yizhang Jin, Jian Li, Yexin Liu, Tianjun Gu, Kai Wu, Zhengkai Jiang, Muyang He, Bo Zhao, Xin Tan, Zhenye Gan, Yabiao Wang, Chengjie Wang, Lizhuang Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[240] arXiv:2405.10736 [pdf, other]: Title: StackOverflowVQA: Stack Overflow Visual Question Answering Dataset

Authors: Motahhare Mirzaei, Mohammad Javad Pirhadi, Sauleh Eetemadi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[241] arXiv:2405.10718 [pdf, other]: Title: SignLLM: Sign Languages Production Large Language Models

Authors: Sen Fang, Lei Wang, Ce Zheng, Yapeng Tian, Chen Chen

Comments: 33 pages, website at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[242] arXiv:2405.10707 [pdf, ps, other]: Title: HARIS: Human-Like Attention for Reference Image Segmentation

Authors: Mengxi Zhang, Heqing Lian, Yiming Liu, Jie Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[243] arXiv:2405.10696 [pdf, other]: Title: Autonomous AI-enabled Industrial Sorting Pipeline for Advanced Textile Recycling

Authors: Yannis Spyridis, Vasileios Argyriou, Antonios Sarigiannidis, Panagiotis Radoglou, Panagiotis Sarigiannidis

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[244] arXiv:2405.10690 [pdf, other]: Title: CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing

Authors: Faegheh Sardari, Armin Mustafa, Philip J. B. Jackson, Adrian Hilton

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[245] arXiv:2405.10674 [pdf, other]: Title: From Sora What We Can See: A Survey of Text-to-Video Generation

Authors: Rui Sun, Yumin Zhang, Tejal Shah, Jiahao Sun, Shuoying Zhang, Wenqi Li, Haoran Duan, Bo Wei, Rajiv Ranjan

Comments: A comprehensive list of text-to-video generation studies in this survey is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[246] arXiv:2405.10612 [pdf, other]: Title: Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers

Authors: Sheng Yang, Jiawang Bai, Kuofeng Gao, Yong Yang, Yiming Li, Shu-tao Xia

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[247] arXiv:2405.10610 [pdf, other]: Title: Driving Referring Video Object Segmentation with Vision-Language Pre-trained Models

Authors: Zikun Zhou, Wentao Xiong, Li Zhou, Xin Li, Zhenyu He, Yaowei Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[248] arXiv:2405.10598 [pdf, other]: Title: Learning Object-Centric Representation via Reverse Hierarchy Guidance

Authors: Junhong Zou, Xiangyu Zhu, Zhaoxiang Zhang, Zhen Lei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[249] arXiv:2405.10591 [pdf, other]: Title: GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision

Authors: Xin Tan, Wenbin Wu, Zhiwei Zhang, Chaojie Fan, Yong Peng, Zhizhong Zhang, Yuan Xie, Lizhuang Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[250] arXiv:2405.10589 [pdf, other]: Title: Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance

Authors: I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu, Ming-Hsuan Yang, Sy-Yen Kuo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[251] arXiv:2405.10577 [pdf, other]: Title: DuoSpaceNet: Leveraging Both Bird's-Eye-View and Perspective View Representations for 3D Object Detection

Authors: Zhe Huang, Yizhe Zhao, Hao Xiao, Chenyan Wu, Lingting Ge

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[252] arXiv:2405.10575 [pdf, other]: Title: Accurate Training Data for Occupancy Map Prediction in Automated Driving Using Evidence Theory

Authors: Jonas Kälble, Sascha Wirges, Maxim Tatarchenko, Eddy Ilg

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253] arXiv:2405.10567 [pdf, other]: Title: Team Samsung-RAL: Technical Report for 2024 RoboDrive Challenge-Robust Map Segmentation Track

Authors: Xiaoshuai Hao, Yifan Yang, Hui Zhang, Mengchuan Wei, Yi Zhou, Haimei Zhao, Jing Zhang

Comments: ICRA 2024 RoboDrive Challenge Robust Map Segmentation Track 3rd Place Technical Report. arXiv admin note: text overlap with arXiv:2205.09743 by other authors

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[254] arXiv:2405.10557 [pdf, other]: Title: Resolving Symmetry Ambiguity in Correspondence-based Methods for Instance-level Object Pose Estimation

Authors: Yongliang Lin, Yongzhi Su, Sandeep Inuganti, Yan Di, Naeem Ajilforoushan, Hanqing Yang, Yu Zhang, Jason Rambach

Comments: 8 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[255] arXiv:2405.10554 [pdf, other]: Title: NeRO: Neural Road Surface Reconstruction

Authors: Ruibo Wang, Song Zhang, Ping Huang, Donghai Zhang, Haoyu Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[256] arXiv:2405.10530 [pdf, other]: Title: CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation

Authors: Mushui Liu, Jun Dan, Ziqian Lu, Yunlong Yu, Yingming Li, Xi Li

Comments: 5 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[257] arXiv:2405.10529 [pdf, other]: Title: Safeguarding Vision-Language Models Against Patched Visual Prompt Injectors

Authors: Jiachen Sun, Changsheng Wang, Jiongxiao Wang, Yiwei Zhang, Chaowei Xiao

Comments: 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[258] arXiv:2405.10518 [pdf, ps, other]: Title: Enhancing Perception Quality in Remote Sensing Image Compression via Invertible Neural Network

Authors: Junhui Li, Xingsong Hou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[259] arXiv:2405.10508 [pdf, other]: Title: ART3D: 3D Gaussian Splatting for Text-Guided Artistic Scenes Generation

Authors: Pengzhi Li, Chengshuai Tang, Qinxuan Huang, Zhiheng Li

Comments: Accepted at CVPR 2024 Workshop on AI3DG

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[260] arXiv:2405.10504 [pdf, ps, other]: Title: Multi-scale Semantic Prior Features Guided Deep Neural Network for Urban Street-view Image

Authors: Jianshun Zeng, Wang Li, Yanjie Lv, Shuai Gao, YuChu Qin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[261] arXiv:2405.10489 [pdf, other]: Title: MixCut: A Data Augmentation Method for Facial Expression Recognition

Authors: Jiaxiang Yu, Yiyang Liu, Ruiyang Fan, Guobing Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[262] arXiv:2405.10456 [pdf, other]: Title: Region-level labels in ice charts can produce pixel-level segmentation for Sea Ice types

Authors: Muhammed Patel, Xinwei Chen, Linlin Xu, Yuhao Chen, K Andrea Scott, David A. Clausi

Comments: Published at ICLR 2024 Machine Learning for Remote Sensing (ML4RS) Workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[263] arXiv:2405.10444 [pdf, other]: Title: A Novel Bounding Box Regression Method for Single Object Tracking

Authors: Omar Abdelaziz, Mohamed Sami Shehata

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[264] arXiv:2405.10439 [pdf, other]: Title: Beyond Traditional Single Object Tracking: A Survey

Authors: Omar Abdelaziz, Mohamed Shehata, Mohamed Mohamed

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[265] arXiv:2405.10423 [pdf, other]: Title: Diversity-Aware Sign Language Production through a Pose Encoding Variational Autoencoder

Authors: Mohamed Ilyes Lakhal, Richard Bowden

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[266] arXiv:2405.10398 [pdf, other]: Title: Drone-type-Set: Drone types detection benchmark for drone detection and tracking

Authors: Kholoud AlDosari, AIbtisam Osman, Omar Elharrouss, Somaya AlMaadeed, Mohamed Zied Chaari

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[267] arXiv:2405.10370 [pdf, other]: Title: Grounded 3D-LLM with Referent Tokens

Authors: Yilun Chen, Shuai Yang, Haifeng Huang, Tai Wang, Ruiyuan Lyu, Runsen Xu, Dahua Lin, Jiangmiao Pang

Comments: Preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[268] arXiv:2405.10357 [pdf, other]: Title: RGB Guided ToF Imaging System: A Survey of Deep Learning-based Methods

Authors: Xin Qiao, Matteo Poggi, Pengchao Deng, Hao Wei, Chenyang Ge, Stefano Mattoccia

Comments: To appear in the International Journal of Computer Vision (IJCV)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[269] arXiv:2405.10347 [pdf, other]: Title: Networking Systems for Video Anomaly Detection: A Tutorial and Survey

Authors: Jing Liu, Yang Liu, Jieyu Lin, Jielin Li, Peng Sun, Bo Hu, Liang Song, Azzedine Boukerche, Victor C.M. Leung

Comments: Submitted to ACM Computing Surveys, under review. For more information and supplementary material, please see this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
[270] arXiv:2405.10939 (cross-list from cs.LG) [pdf, other]: Title: DINO as a von Mises-Fisher mixture model

Authors: Hariprasath Govindarajan, Per Sidén, Jacob Roll, Fredrik Lindsten

Comments: Accepted to ICLR 2023

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[271] arXiv:2405.10870 (cross-list from eess.IV) [pdf, other]: Title: Multicenter Privacy-Preserving Model Training for Deep Learning Brain Metastases Autosegmentation

Authors: Yixing Huang, Zahra Khodabakhshi, Ahmed Gomaa, Manuel Schmidt, Rainer Fietkau, Matthias Guckenberger, Nicolaus Andratschke, Christoph Bert, Stephanie Tanadini-Lang, Florian Putz

Comments: Submission to the Green Journal (Major Revision)

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[272] arXiv:2405.10833 (cross-list from eess.IV) [pdf, other]: Title: Automatic segmentation of Organs at Risk in Head and Neck cancer patients from CT and MRI scans

Authors: Sébastien Quetin, Andrew Heschl, Mauricio Murillo, Murali Rohit, Shirin A. Enger, Farhad Maleki

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[273] arXiv:2405.10803 (cross-list from eess.IV) [pdf, other]: Title: A Large-scale Multi Domain Leukemia Dataset for the White Blood Cells Detection with Morphological Attributes for Explainability

Authors: Abdul Rehman, Talha Meraj, Aiman Mahmood Minhas, Ayisha Imran, Mohsen Ali, Waqas Sultani

Comments: Early Accept

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[274] arXiv:2405.10754 (cross-list from math.OC) [pdf, other]: Title: Stable Phase Retrieval with Mirror Descent

Authors: Jean-Jacques Godeme, Jalal Fadili, Claude Amra, Myriam Zerrad

Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)
[275] arXiv:2405.10723 (cross-list from eess.IV) [pdf, other]: Title: Eddeep: Fast eddy-current distortion correction for diffusion MRI with deep learning

Authors: Antoine Legouhy, Ross Callaghan, Whitney Stee, Philippe Peigneux, Hojjat Azadbakht, Hui Zhang

Comments: submitted to MICCAI 2024

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[276] arXiv:2405.10705 (cross-list from eess.IV) [pdf, other]: Title: 3D Vessel Reconstruction from Sparse-View Dynamic DSA Images via Vessel Probability Guided Attenuation Learning

Authors: Zhentao Liu, Huangxuan Zhao, Wenhui Qin, Zhenghong Zhou, Xinggang Wang, Wenping Wang, Xiaochun Lai, Chuansheng Zheng, Dinggang Shen, Zhiming Cui

Comments: 12 pages, 13 figures, 5 tables

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[277] arXiv:2405.10702 (cross-list from cs.CL) [pdf, ps, other]: Title: Empowering Prior to Court Legal Analysis: A Transparent and Accessible Dataset for Defensive Statement Classification and Interpretation

Authors: Yannis Spyridis, Jean-Paul, Haneen Deeb, Vasileios Argyriou

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[278] arXiv:2405.10691 (cross-list from eess.IV) [pdf, other]: Title: LoCI-DiffCom: Longitudinal Consistency-Informed Diffusion Model for 3D Infant Brain Image Completion

Authors: Zihao Zhu, Tianli Tao, Yitian Tao, Haowen Deng, Xinyi Cai, Gaofeng Wu, Kaidong Wang, Haifeng Tang, Lixuan Zhu, Zhuoyang Gu, Jiawei Huang, Dinggang Shen, Han Zhang

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[279] arXiv:2405.10561 (cross-list from eess.IV) [pdf, other]: Title: Infrared Image Super-Resolution via Lightweight Information Split Network

Authors: Shijie Liu, Kang Yan, Feiwei Qin, Changmiao Wang, Ruiquan Ge, Kai Zhang, Jie Huang, Yong Peng, Jin Cao

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[280] arXiv:2405.10550 (cross-list from eess.IV) [pdf, other]: Title: LighTDiff: Surgical Endoscopic Image Low-Light Enhancement with T-Diffusion

Authors: Tong Chen, Qingcheng Lyu, Long Bai, Erjian Guo, Huxin Gao, Xiaoxiao Yang, Hongliang Ren, Luping Zhou

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[281] arXiv:2405.10531 (cross-list from cs.LG) [pdf, other]: Title: Nonparametric Teaching of Implicit Neural Representations

Authors: Chen Zhang, Steven Tin Sui Luo, Jason Chun Lok Li, Yik-Chung Wu, Ngai Wong

Comments: ICML 2024 (24 pages, 13 figures)

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[282] arXiv:2405.10497 (cross-list from cs.MM) [pdf, other]: Title: SMP Challenge: An Overview and Analysis of Social Media Prediction Challenge

Authors: Bo Wu, Peiye Liu, Wen-Huang Cheng, Bei Liu, Zhaoyang Zeng, Jia Wang, Qiushi Huang, Jiebo Luo

Comments: ACM Multimedia. arXiv admin note: text overlap with arXiv:1910.01795

Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Social and Information Networks (cs.SI)

Fri, 17 May 2024

[283] arXiv:2405.10320 [pdf, other]: Title: Toon3D: Seeing Cartoons from a New Perspective

Authors: Ethan Weber, Riley Peterlinz, Rohan Mathur, Frederik Warburg, Alexei A. Efros, Angjoo Kanazawa

Comments: Please see our project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[284] arXiv:2405.10317 [pdf, other]: Title: Text-to-Vector Generation with Neural Path Representation

Authors: Peiying Zhang, Nanxuan Zhao, Jing Liao

Comments: Accepted by SIGGRAPH 2024. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[285] arXiv:2405.10316 [pdf, other]: Title: Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model

Authors: Zheng Gu, Shiyuan Yang, Jing Liao, Jing Huo, Yang Gao

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[286] arXiv:2405.10314 [pdf, other]: Title: CAT3D: Create Anything in 3D with Multi-View Diffusion Models

Authors: Ruiqi Gao, Aleksander Holynski, Philipp Henzler, Arthur Brussee, Ricardo Martin-Brualla, Pratul Srinivasan, Jonathan T. Barron, Ben Poole

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[287] arXiv:2405.10305 [pdf, other]: Title: 4D Panoptic Scene Graph Generation

Authors: Jingkang Yang, Jun Cen, Wenxuan Peng, Shuai Liu, Fangzhou Hong, Xiangtai Li, Kaiyang Zhou, Qifeng Chen, Ziwei Liu

Comments: Accepted at NeurIPS 2023. Code: this https URL. Previous series: PSG this https URL and PVSG this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[288] arXiv:2405.10300 [pdf, other]: Title: Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

Authors: Tianhe Ren, Qing Jiang, Shilong Liu, Zhaoyang Zeng, Wenlong Liu, Han Gao, Hongjie Huang, Zhengyu Ma, Xiaoke Jiang, Yihao Chen, Yuda Xiong, Hao Zhang, Feng Li, Peijun Tang, Kent Yu, Lei Zhang

Comments: Technical report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[289] arXiv:2405.10286 [pdf, other]: Title: FFF: Fixing Flawed Foundations in contrastive pre-training results in very strong Vision-Language models

Authors: Adrian Bulat, Yassine Ouali, Georgios Tzimiropoulos

Comments: Accepted at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[290] arXiv:2405.10272 [pdf, other]: Title: Faces that Speak: Jointly Synthesising Talking Face and Speech from Text

Authors: Youngjoon Jang, Ji-Hoon Kim, Junseok Ahn, Doyeop Kwak, Hong-Sun Yang, Yoon-Cheol Ju, Il-Hwan Kim, Byeong-Yeol Kim, Joon Son Chung

Comments: CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[291] arXiv:2405.10266 [pdf, other]: Title: A Tale of Two Languages: Large-Vocabulary Continuous Sign Language Recognition from Spoken Language Supervision

Authors: Charles Raude, K R Prajwal, Liliane Momeni, Hannah Bull, Samuel Albanie, Andrew Zisserman, Gül Varol

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[292] arXiv:2405.10256 [pdf, other]: Title: Biasing & Debiasing based Approach Towards Fair Knowledge Transfer for Equitable Skin Analysis

Authors: Anshul Pundhir, Balasubramanian Raman, Pravendra Singh

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[293] arXiv:2405.10255 [pdf, other]: Title: When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models

Authors: Xianzheng Ma, Yash Bhalgat, Brandon Smart, Shuai Chen, Xinghui Li, Jian Ding, Jindong Gu, Dave Zhenyu Chen, Songyou Peng, Jia-Wang Bian, Philip H Torr, Marc Pollefeys, Matthias Nießner, Ian D Reid, Angel X. Chang, Iro Laina, Victor Adrian Prisacariu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[294] arXiv:2405.10244 [pdf, ps, other]: Title: Towards Task-Compatible Compressible Representations

Authors: Anderson de Andrade, Ivan Bajić

Comments: To be published in ICME Workshops 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[295] arXiv:2405.10185 [pdf, other]: Title: DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data

Authors: Chengxiang Fan, Muzhi Zhu, Hao Chen, Yang Liu, Weijia Wu, Huaqi Zhang, Chunhua Shen

Comments: Accepted to CVPR 2024, code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[296] arXiv:2405.10175 [pdf, other]: Title: Filling Missing Values Matters for Range Image-Based Point Cloud Segmentation

Authors: Bike Chen, Chen Gong, Juha Röning

Comments: This paper has been submitted to a journal

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[297] arXiv:2405.10160 [pdf, other]: Title: PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning

Authors: Jiancheng Pan, Muyuan Ma, Qing Ma, Cong Bai, Shengyong Chen

Comments: 15 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[298] arXiv:2405.10148 [pdf, other]: Title: SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network

Authors: Zhaoxu Li, Wei An, Gaowei Guo, Longguang Wang, Yingqian Wang, Zaiping Lin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[299] arXiv:2405.10140 [pdf, other]: Title: Libra: Building Decoupled Vision System on Large Language Models

Authors: Yifan Xu, Xiaoshan Yang, Yaguang Song, Changsheng Xu

Comments: ICML2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[300] arXiv:2405.10132 [pdf, other]: Title: Cooperative Visual-LiDAR Extrinsic Calibration Technology for Intersection Vehicle-Infrastructure: A review

Authors: Xinyu Zhang, Yijin Xiong, Qianxin Qu, Renjie Wang, Xin Gao, Jing Liu, Shichun Guo, Jun Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[301] arXiv:2405.10122 [pdf, other]: Title: Generating Coherent Sequences of Visual Illustrations for Real-World Manual Tasks

Authors: João Bordalo, Vasco Ramos, Rodrigo Valério, Diogo Glória-Silva, Yonatan Bitton, Michal Yarom, Idan Szpektor, Joao Magalhaes

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[302] arXiv:2405.10082 [pdf, other]: Title: An Integrated Framework for Multi-Granular Explanation of Video Summarization

Authors: Konstantinos Tsigos, Evlampios Apostolidis, Vasileios Mezaris

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[303] arXiv:2405.10075 [pdf, other]: Title: HecVL: Hierarchical Video-Language Pretraining for Zero-shot Surgical Phase Recognition

Authors: Kun Yuan, Vinkle Srivastav, Nassir Navab, Nicolas Padoy

Comments: Accepted by MICCAI2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[304] arXiv:2405.10053 [pdf, other]: Title: SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection

Authors: Mingxuan Liu, Tyler L. Hayes, Elisa Ricci, Gabriela Csurka, Riccardo Volpi

Comments: Accepted as a conference paper (highlight) at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[305] arXiv:2405.10046 [pdf, other]: Title: A Preprocessing and Postprocessing Voxel-based Method for LiDAR Semantic Segmentation Improvement in Long Distance

Authors: Andrea Matteazzi, Pascal Colling, Michael Arnold, Dietmar Tutsch

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[306] arXiv:2405.10041 [pdf, other]: Title: Revealing Hierarchical Structure of Leaf Venations in Plant Science via Label-Efficient Segmentation: Dataset and Method

Authors: Weizhen Liu, Ao Li, Ze Wu, Yue Li, Baobin Ge, Guangyu Lan, Shilin Chen, Minghe Li, Yunfei Liu, Xiaohui Yuan, Nanqing Dong

Comments: Accepted by IJCAI2024, Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307] arXiv:2405.10037 [pdf, other]: Title: Bilateral Event Mining and Complementary for Event Stream Super-Resolution

Authors: Zhilin Huang, Quanmin Liang, Yijie Yu, Chujun Qin, Xiawu Zheng, Kai Huang, Zikun Zhou, Wenming Yang

Comments: Accepted to CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[308] arXiv:2405.10030 [pdf, other]: Title: RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing

Authors: Huiling Zhou, Xianhao Wu, Hongming Chen, Xiang Chen, Xin He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[309] arXiv:2405.10014 [pdf, other]: Title: Frequency-Domain Refinement with Multiscale Diffusion for Super Resolution

Authors: Xingjian Wang, Li Chai, Jiming Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[310] arXiv:2405.10008 [pdf, other]: Title: Solving the enigma: Deriving optimal explanations of deep networks

Authors: Michail Mamalakis, Antonios Mamalakis, Ingrid Agartz, Lynn Egeland Mørch-Johnsen, Graham Murray, John Suckling, Pietro Lio

Comments: keywords: XAI, neuroscience, brain, 3D, 2D, computer vision, classification

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[311] arXiv:2405.09996 [pdf, other]: Title: Driving-Video Dehazing with Non-Aligned Regularization for Safety Assistance

Authors: Junkai Fan, Jiangwei Weng, Kun Wang, Yijun Yang, Jianjun Qian, Jun Li, Jian Yang

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[312] arXiv:2405.09985 [pdf, other]: Title: VirtualModel: Generating Object-ID-retentive Human-object Interaction Image by Diffusion Model for E-commerce Marketing

Authors: Binghui Chen, Chongyang Zhong, Wangmeng Xiang, Yifeng Geng, Xuansong Xie

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[313] arXiv:2405.09981 [pdf, other]: Title: Adversarial Robustness for Visual Grounding of Multimodal Large Language Models

Authors: Kuofeng Gao, Yang Bai, Jiawang Bai, Yong Yang, Shu-Tao Xia

Comments: ICLR 2024 Workshop on Reliable and Responsible Foundation Models

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[314] arXiv:2405.09976 [pdf, other]: Title: Language-Oriented Semantic Latent Representation for Image Transmission

Authors: Giordano Cicchetti, Eleonora Grassucci, Jihong Park, Jinho Choi, Sergio Barbarossa, Danilo Comminiello

Comments: Under review at IEEE International Workshop on Machine Learning for Signal Processing (MLSP) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[315] arXiv:2405.09964 [pdf, other]: Title: KPNDepth: Depth Estimation of Lane Images under Complex Rainy Environment

Authors: Zhengxu Shi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[316] arXiv:2405.09955 [pdf, other]: Title: Dual-band feature selection for maturity classification of specialty crops by hyperspectral imaging

Authors: Usman A. Zahidi, Krystian Łukasik, Grzegorz Cielniak

Comments: Preprint: Paper submitted to the special issue of "Computers and Electronics in Agriculture"

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[317] arXiv:2405.09942 [pdf, other]: Title: FPDIoU Loss: A Loss Function for Efficient Bounding Box Regression of Rotated Object Detection

Authors: Siliang Ma, Yong Xu

Comments: arXiv admin note: text overlap with arXiv:2307.07662, text overlap with arXiv:1902.09630 by other authors

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[318] arXiv:2405.09934 [pdf, other]: Title: Detecting Domain Shift in Multiple Instance Learning for Digital Pathology Using Fréchet Domain Distance

Authors: Milda Pocevičiūtė, Gabriel Eilertsen, Stina Garvin, Claes Lundström

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[319] arXiv:2405.09933 [pdf, other]: Title: MiniMaxAD: A Lightweight Autoencoder for Feature-Rich Anomaly Detection

Authors: Fengjie Wang, Chengming Liu, Lei Shi, Pang Haibo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[320] arXiv:2405.09931 [pdf, other]: Title: Learning from Observer Gaze: Zero-Shot Attention Prediction Oriented by Human-Object Interaction Recognition

Authors: Yuchen Zhou, Linkai Liu, Chao Gou

Comments: Accepted by CVPR2024. Project HomePage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[321] arXiv:2405.09924 [pdf, other]: Title: Infrared Adversarial Car Stickers

Authors: Xiaopei Zhu, Yuqiu Liu, Zhanhao Hu, Jianmin Li, Xiaolin Hu

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322] arXiv:2405.09923 [pdf, other]: Title: NTIRE 2024 Restore Any Image Model (RAIM) in the Wild Challenge

Authors: Jie Liang, Radu Timofte, Qiaosi Yi, Shuaizheng Liu, Lingchen Sun, Rongyuan Wu, Xindong Zhang, Hui Zeng, Lei Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[323] arXiv:2405.09922 [pdf, other]: Title: Cross-sensor self-supervised training and alignment for remote sensing

Authors: Valerio Marsocci (CEDRIC - VERTIGO, CNAM), Nicolas Audebert (CEDRIC - VERTIGO, CNAM, LaSTIG, IGN)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[324] arXiv:2405.09902 [pdf, other]: Title: Unveiling the Potential: Harnessing Deep Metric Learning to Circumvent Video Streaming Encryption

Authors: Arwin Gansekoele, Tycho Bot, Rob van der Mei, Sandjai Bhulai, Mark Hoogendoorn

Comments: Published in the WI-IAT 2023 proceedings

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
[325] arXiv:2405.09883 [pdf, other]: Title: RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception

Authors: Xiaosu Zhu, Hualian Sheng, Sijia Cai, Bing Deng, Shaopeng Yang, Qiao Liang, Ken Chen, Lianli Gao, Jingkuan Song, Jieping Ye

Comments: Technical report. 32 pages, 21 figures, 13 tables. this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[326] arXiv:2405.09882 [pdf, other]: Title: DiffAM: Diffusion-based Adversarial Makeup Transfer for Facial Privacy Protection

Authors: Yuhao Sun, Lingyun Yu, Hongtao Xie, Jiaming Li, Yongdong Zhang

Comments: 16 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[327] arXiv:2405.09880 [pdf, other]: Title: Deep Learning-Based Quasi-Conformal Surface Registration for Partial 3D Faces Applied to Facial Recognition

Authors: Yuchen Guo, Hanqun Cao, Lok Ming Lui

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[328] arXiv:2405.09879 [pdf, other]: Title: Generative Unlearning for Any Identity

Authors: Juwon Seo, Sung-Hoon Lee, Tae-Young Lee, Seungjun Moon, Gyeong-Moon Park

Comments: 15 pages, 17 figures, 10 tables, CVPR 2024 Poster

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[329] arXiv:2405.09874 [pdf, other]: Title: Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion

Authors: Xinyang Li, Zhangyu Lai, Linning Xu, Jianfei Guo, Liujuan Cao, Shengchuan Zhang, Bo Dai, Rongrong Ji

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[330] arXiv:2405.09873 [pdf, other]: Title: IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model

Authors: Yongsong Huang, Tomo Miyazaki, Xiaofeng Liu, Shinichiro Omachi

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[331] arXiv:2405.09863 [pdf, other]: Title: Box-Free Model Watermarks Are Prone to Black-Box Removal Attacks

Authors: Haonan An, Guang Hua, Zhiping Lin, Yuguang Fang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[332] arXiv:2405.09858 [pdf, other]: Title: Towards Realistic Incremental Scenario in Class Incremental Semantic Segmentation

Authors: Jihwan Kwak, Sungmin Cha, Taesup Moon

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[333] arXiv:2405.09828 [pdf, other]: Title: PillarNeXt: Improving the 3D detector by introducing Voxel2Pillar feature encoding and extracting multi-scale features

Authors: Xusheng Li, Chengliang Wang, Shumao Wang, Zhuo Zeng, Ji Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[334] arXiv:2405.09827 [pdf, other]: Title: Parallel Backpropagation for Shared-Feature Visualization

Authors: Alexander Lappe, Anna Bognár, Ghazaleh Ghamkhari Nejad, Albert Mukovskiy, Lucas Martini, Martin A. Giese, Rufin Vogels

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[335] arXiv:2405.09806 [pdf, other]: Title: MediSyn: Text-Guided Diffusion Models for Broad Medical 2D and 3D Image Synthesis

Authors: Joseph Cho, Cyril Zakka, Rohan Shad, Ross Wightman, Akshay Chaudhari, William Hiesinger

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[336] arXiv:2405.09789 [pdf, other]: Title: LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation

Authors: Wentao Jiang, Jing Zhang, Di Wang, Qiming Zhang, Zengmao Wang, Bo Du

Comments: Accepted by IJCAI'2024. The code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[337] arXiv:2405.09782 [pdf, other]: Title: Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection

Authors: Feiran Li, Qianqian Xu, Shilong Bao, Zhiyong Yang, Runmin Cong, Xiaochun Cao, Qingming Huang

Comments: This paper has been accepted by ICML2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[338] arXiv:2405.09777 [pdf, other]: Title: Rethinking Barely-Supervised Segmentation from an Unsupervised Domain Adaptation Perspective

Authors: Zhiqiang Shen, Peng Cao, Junming Su, Jinzhu Yang, Osmar R. Zaiane

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[339] arXiv:2405.09755 [pdf, other]: Title: Collision Avoidance Metric for 3D Camera Evaluation

Authors: Vage Taamazyan, Alberto Dall'olio, Agastya Kalra

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[340] arXiv:2405.09717 [pdf, other]: Title: From NeRFs to Gaussian Splats, and Back

Authors: Siming He, Zach Osman, Pratik Chaudhari

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[341] arXiv:2405.09713 [pdf, other]: Title: SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge

Authors: Andong Wang, Bo Wu, Sunli Chen, Zhenfang Chen, Haotian Guan, Wei-Ning Lee, Li Erran Li, Chuang Gan

Comments: CVPR

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[342] arXiv:2405.09707 [pdf, other]: Title: Point2SSM++: Self-Supervised Learning of Anatomical Shape Models from Point Clouds

Authors: Jadie Adams, Shireen Elhabian

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[343] arXiv:2405.09697 [pdf, other]: Title: Weakly Supervised Bayesian Shape Modeling from Unsegmented Medical Images

Authors: Jadie Adams, Krithika Iyer, Shireen Elhabian

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[344] arXiv:2405.09682 [pdf, other]: Title: Synth-to-Real Unsupervised Domain Adaptation for Instance Segmentation

Authors: Guo Yachan, Xiao Yi, Xue Danna, Jose Luis Gomez Zurita, Antonio M. López

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[345] arXiv:2405.09588 [pdf, ps, other]: Title: Training Deep Learning Models with Hybrid Datasets for Robust Automatic Target Detection on real SAR images

Authors: Benjamin Camus, Théo Voillemin, Corentin Le Barbu, Jean-Christophe Louvigné (DGA.MI), Carole Belloni (DGA.MI), Emmanuel Vallée (DGA.MI)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Signal Processing (eess.SP)
[346] arXiv:2405.09582 [pdf, other]: Title: AD-Aligning: Emulating Human-like Generalization for Cognitive Domain Adaptation in Deep Learning

Authors: Zhuoying Li, Bohua Wan, Cong Mu, Ruzhang Zhao, Shushan Qiu, Chao Yan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[347] arXiv:2405.09550 [pdf, other]: Title: Mask-based Invisible Backdoor Attacks on Object Detection

Authors: Shin Jeong Jin

Comments: 7 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
[348] arXiv:2405.10292 (cross-list from cs.AI) [pdf, other]: Title: Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

Authors: Yuexiang Zhai, Hao Bai, Zipeng Lin, Jiayi Pan, Shengbang Tong, Yifei Zhou, Alane Suhr, Saining Xie, Yann LeCun, Yi Ma, Sergey Levine

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[349] arXiv:2405.10262 (cross-list from cs.LG) [pdf, other]: Title: Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features

Authors: Junpeng Zhang, Qing Li, Liang Lin, Quanshi Zhang

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[350] arXiv:2405.10254 (cross-list from eess.IV) [pdf, other]: Title: PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology

Authors: George Shaikovski, Adam Casson, Kristen Severson, Eric Zimmermann, Yi Kan Wang, Jeremy D. Kunz, Juan A. Retamero, Gerard Oakley, David Klimstra, Christopher Kanan, Matthew Hanna, Michal Zelechowski, Julian Viret, Neil Tenenholtz, James Hall, Nicolo Fusi, Razik Yousfi, Peter Hamilton, William A. Moye, Eugene Vorontsov, Siqi Liu, Thomas J. Fuchs

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[351] arXiv:2405.10246 (cross-list from eess.IV) [pdf, other]: Title: A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts

Authors: Xinru Zhang, Ni Ou, Berke Doga Basaran, Marco Visentin, Mengyun Qiao, Renyang Gu, Cheng Ouyang, Yaou Liu, Paul M. Matthew, Chuyang Ye, Wenjia Bai

Comments: The work has been early accepted by MICCAI 2024

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[352] arXiv:2405.10068 (cross-list from eess.IV) [pdf, other]: Title: MrRegNet: Multi-resolution Mask Guided Convolutional Neural Network for Medical Image Registration with Large Deformations

Authors: Ruizhe Li, Grazziela Figueredo, Dorothee Auer, Christian Wagner, Xin Chen

Comments: Accepted for publication at IEEE International Symposium on Biomedical Imaging (ISBI) 2024

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[353] arXiv:2405.10020 (cross-list from cs.RO) [pdf, other]: Title: Natural Language Can Help Bridge the Sim2Real Gap

Authors: Albert Yu, Adeline Foote, Raymond Mooney, Roberto Martín-Martín

Comments: To appear in RSS 2024

Subjects: Robotics (cs.RO); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[354] arXiv:2405.10004 (cross-list from eess.IV) [pdf, other]: Title: ROCOv2: Radiology Objects in COntext Version 2, an Updated Multimodal Image Dataset

Authors: Johannes Rückert, Louise Bloch, Raphael Brüngel, Ahmad Idrissi-Yaghir, Henning Schäfer, Cynthia S. Schmidt, Sven Koitka, Obioma Pelka, Asma Ben Abacha, Alba G. Seco de Herrera, Henning Müller, Peter A. Horn, Felix Nensa, Christoph M. Friedrich

Comments: Major revision Scientific Data

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[355] arXiv:2405.09990 (cross-list from eess.IV) [pdf, other]: Title: Histopathology Foundation Models Enable Accurate Ovarian Cancer Subtype Classification

Authors: Jack Breen, Katie Allen, Kieran Zucker, Lucy Godson, Nicolas M. Orsi, Nishant Ravikumar

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[356] arXiv:2405.09959 (cross-list from eess.IV) [pdf, other]: Title: Patient-Specific Real-Time Segmentation in Trackerless Brain Ultrasound

Authors: Reuben Dorent, Erickson Torio, Nazim Haouchine, Colin Galvin, Sarah Frisken, Alexandra Golby, Tina Kapur, William Wells

Comments: Early accept at MICCAI 2024 - code available at: this https URL

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[357] arXiv:2405.09864 (cross-list from astro-ph.IM) [pdf, other]: Title: Solar multi-object multi-frame blind deconvolution with a spatially variant convolution neural emulator

Authors: A. Asensio Ramos (IAC+ULL)

Comments: 15 pages, 14 figures, accepted for publication in A&A

Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Computer Vision and Pattern Recognition (cs.CV)
[358] arXiv:2405.09851 (cross-list from eess.IV) [pdf, other]: Title: Region of Interest Detection in Melanocytic Skin Tumor Whole Slide Images -- Nevus & Melanoma

Authors: Yi Cui, Yao Li, Jayson R. Miedema, Sharon N. Edmiston, Sherif Farag, J.S. Marron, Nancy E. Thomas

Comments: 5 figures, NeurIPS 2022 Workshop

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[359] arXiv:2405.09820 (cross-list from cs.LG) [pdf, other]: Title: Densely Distilling Cumulative Knowledge for Continual Learning

Authors: Zenglin Shi, Pei Liu, Tong Su, Yunpeng Wu, Kuien Liu, Yu Song, Meng Wang

Comments: 12 pages; Continual Learning; Class-incremental Learning; Knowledge Distillation; Forgetting

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[360] arXiv:2405.09814 (cross-list from cs.GR) [pdf, other]: Title: Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis

Authors: Zeyi Zhang, Tenglong Ao, Yuyao Zhang, Qingzhe Gao, Chuan Lin, Baoquan Chen, Libin Liu

Comments: SIGGRAPH 2024 (Journal Track); Project page: this https URL

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[361] arXiv:2405.09798 (cross-list from cs.LG) [pdf, other]: Title: Many-Shot In-Context Learning in Multimodal Foundation Models

Authors: Yixing Jiang, Jeremy Irvin, Ji Hun Wang, Muhammad Ahmed Chaudhry, Jonathan H. Chen, Andrew Y. Ng

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[362] arXiv:2405.09787 (cross-list from eess.IV) [pdf, other]: Title: Analysis of the BraTS 2023 Intracranial Meningioma Segmentation Challenge

Authors: Dominic LaBella, Ujjwal Baid, Omaditya Khanna, Shan McBurney-Lin, Ryan McLean, Pierre Nedelec, Arif Rashid, Nourel Hoda Tahon, Talissa Altes, Radhika Bhalerao, Yaseen Dhemesh, Devon Godfrey, Fathi Hilal, Scott Floyd, Anastasia Janas, Anahita Fathi Kazerooni, John Kirkpatrick, Collin Kent, Florian Kofler, Kevin Leu, Nazanin Maleki, Bjoern Menze, Maxence Pajot, Zachary J. Reitman, Jeffrey D. Rudie, Rachit Saluja, Yury Velichko, Chunhao Wang, Pranav Warman, Maruf Adewole, Jake Albrecht, Udunna Anazodo, Syed Muhammad Anwar, Timothy Bergquist, Sully Francis Chen, Verena Chung, Gian-Marco Conte, Farouk Dako, James Eddy, Ivan Ezhov, Nastaran Khalili, Juan Eugenio Iglesias, Zhifan Jiang, Elaine Johanson, Koen Van Leemput, Hongwei Bran Li, Marius George Linguraru, Xinyang Liu, Aria Mahtabfar, Zeke Meier, et al. (71 additional authors not shown)

Comments: 16 pages, 11 tables, 10 figures, MICCAI

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[363] arXiv:2405.09716 (cross-list from eess.IV) [pdf, other]: Title: Illumination Histogram Consistency Metric for Quantitative Assessment of Video Sequences

Authors: Long Chen, Mobarakol Islam, Matt Clarkson, Thomas Dowrick

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[364] arXiv:2405.09711 (cross-list from cs.AI) [pdf, other]: Title: STAR: A Benchmark for Situated Reasoning in Real-World Videos

Authors: Bo Wu, Shoubin Yu, Zhenfang Chen, Joshua B Tenenbaum, Chuang Gan

Comments: NeurIPS

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[365] arXiv:2405.09695 (cross-list from cs.HC) [pdf, other]: Title: Enhancing Saliency Prediction in Monitoring Tasks: The Role of Visual Highlights

Authors: Zekun Wu, Anna Maria Feit

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
[366] arXiv:2405.09601 (cross-list from physics.med-ph) [pdf, ps, other]: Title: Fully Automated OCT-based Tissue Screening System

Authors: Shaohua Pi, Razieh Ganjee, Lingyun Wang, Riley K. Arbuckle, Chengcheng Zhao, Jose A Sahel, Bingjie Wang, Yuanyuan Chen

Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV)
[367] arXiv:2405.09600 (cross-list from cs.LG) [pdf, other]: Title: Aggregate Representation Measure for Predictive Model Reusability

Authors: Vishwesh Sangarya, Richard Bradford, Jung-Eun Kim

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[368] arXiv:2405.09594 (cross-list from eess.IV) [pdf, other]: Title: Learning Generalized Medical Image Representations through Image-Graph Contrastive Pretraining

Authors: Sameer Khanna, Daniel Michael, Marinka Zitnik, Pranav Rajpurkar

Comments: Accepted into Machine Learning for Health (ML4H) 2023

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[369] arXiv:2405.09589 (cross-list from cs.LG) [pdf, other]: Title: Unveiling Hallucination in Text, Image, Video, and Audio Foundation Models: A Comprehensive Survey

Authors: Pranab Sahoo, Prabhash Meharia, Akash Ghosh, Sriparna Saha, Vinija Jain, Aman Chadha

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[370] arXiv:2405.09586 (cross-list from eess.IV) [pdf, other]: Title: Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation

Authors: Kang Liu, Zhuoqi Ma, Mengmeng Liu, Zhicheng Jiao, Xiaolu Kang, Qiguang Miao, Kun Xie

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[371] arXiv:2405.09558 (cross-list from eess.SP) [pdf, other]: Title: An EM Body Model for Device-Free Localization with Multiple Antenna Receivers: A First Study

Authors: Vittorio Rampa, Federica Fieramosca, Stefano Savazzi, Michele D'Amico

Journal-ref: 2023 IEEE-APS Topical Conference on Antennas and Propagation in Wireless Communications (APWC)

Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[372] arXiv:2405.09552 (cross-list from eess.IV) [pdf, other]: Title: ODFormer: Semantic Fundus Image Segmentation Using Transformer for Optic Nerve Head Detection

Authors: Jiayi Wang, Yi-An Mao, Xiaoyu Ma, Sicen Guo, Yuting Shao, Xiao Lv, Wenting Han, Mark Christopher, Linda M. Zangwill, Yanlong Bi, Rui Fan

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

Thu, 16 May 2024

[373] arXiv:2405.09546 [pdf, other]: Title: BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation

Authors: Yunhao Ge, Yihe Tang, Jiashu Xu, Cem Gokmen, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Martín, Miao Liu, Pengchuan Zhang, Ruohan Zhang, Li Fei-Fei, Jiajun Wu

Comments: CVPR 2024 (Highlight). Project website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[374] arXiv:2405.09544 [pdf, other]: Title: Classifying geospatial objects from multiview aerial imagery using semantic meshes

Authors: David Russell, Ben Weinstein, David Wettergreen, Derek Young

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[375] arXiv:2405.09487 [pdf, other]: Title: Color Space Learning for Cross-Color Person Re-Identification

Authors: Jiahao Nie, Shan Lin, Alex C. Kot

Comments: Accepted by ICME 2024 (Oral)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[376] arXiv:2405.09463 [pdf, other]: Title: Gaze-DETR: Using Expert Gaze to Reduce False Positives in Vulvovaginal Candidiasis Screening

Authors: Yan Kong, Sheng Wang, Jiangdong Cai, Zihao Zhao, Zhenrong Shen, Yonghao Li, Manman Fei, Qian Wang

Comments: MICCAI-2024 early accept. Our code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[377] arXiv:2405.09459 [pdf, other]: Title: Fourier Boundary Features Network with Wider Catchers for Glass Segmentation

Authors: Xiaolin Qin, Jiacen Liu, Qianlei Wang, Shaolin Zhang, Fei Zhu, Zhang Yi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[378] arXiv:2405.09431 [pdf, other]: Title: A Survey On Text-to-3D Contents Generation In The Wild

Authors: Chenhan Jiang

Comments: 11 pages, 10 figures, 4 tables. arXiv admin note: text overlap with arXiv:2401.17807 by other authors

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[379] arXiv:2405.09426 [pdf, other]: Title: Global-Local Image Perceptual Score (GLIPS): Evaluating Photorealistic Quality of AI-Generated Images

Authors: Memoona Aziz, Umair Rehman, Muhammad Umair Danish, Katarina Grolinger

Comments: 10 pages, 3 figures. Submitted to IEEE Transactions on Human-Machine Systems

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[380] arXiv:2405.09409 [pdf, ps, other]: Title: Real-World Federated Learning in Radiology: Hurdles to overcome and Benefits to gain

Authors: Markus R. Bujotzek, Ünal Akünal, Stefan Denner, Peter Neher, Maximilian Zenk, Eric Frodl, Astha Jaiswal, Moon Kim, Nicolai R. Krekiehn, Manuel Nickel, Richard Ruppel, Marcus Both, Felix Döllinger, Marcel Opitz, Thorsten Persigehl, Jens Kleesiek, Tobias Penzkofer, Klaus Maier-Hein, Rickmer Braren, Andreas Bucher

Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[381] arXiv:2405.09404 [pdf, other]: Title: Time-Equivariant Contrastive Learning for Degenerative Disease Progression in Retinal OCT

Authors: Taha Emre, Arunava Chakravarty, Dmitrii Lachinov, Antoine Rivail, Ursula Schmidt-Erfurth, Hrvoje Bogunović

Comments: Accepted at MICCAI 2024 (early accept, top 11%)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[382] arXiv:2405.09403 [pdf, other]: Title: Identity Overlap Between Face Recognition Train/Test Data: Causing Optimistic Bias in Accuracy Measurement

Authors: Haiyu Wu, Sicong Tian, Jacob Gutierrez, Aman Bhatta, Kağan Öztürk, Kevin W. Bowyer

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[383] arXiv:2405.09365 [pdf, other]: Title: SARATR-X: A Foundation Model for Synthetic Aperture Radar Images Target Recognition

Authors: Weijie L, Wei Yang, Yuenan Hou, Li Liu, Yongxiang Liu, Xiang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[384] arXiv:2405.09355 [pdf, other]: Title: Vision-Based Neurosurgical Guidance: Unsupervised Localization and Camera-Pose Prediction

Authors: Gary Sarwin, Alessandro Carretta, Victor Staartjes, Matteo Zoli, Diego Mazzatenta, Luca Regli, Carlo Serra, Ender Konukoglu

Comments: Early Accept at MICCAI 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[385] arXiv:2405.09342 [pdf, other]: Title: Progressive Depth Decoupling and Modulating for Flexible Depth Completion

Authors: Zhiwen Yang, Jiehua Zhang, Liang Li, Chenggang Yan, Yaoqi Sun, Haibing Yin

Comments: The article is accepted by IEEE Transactions on Instrumentation & Measurement

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[386] arXiv:2405.09334 [pdf, other]: Title: Content-Based Image Retrieval for Multi-Class Volumetric Radiology Images: A Benchmark Study

Authors: Farnaz Khun Jush, Steffen Vogler, Tuan Truong, Matthias Lenga

Comments: 23 pages, 9 Figures, 13 Tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
[387] arXiv:2405.09333 [pdf, other]: Title: Application of Gated Recurrent Units for CT Trajectory Optimization

Authors: Yuedong Yuan, Linda-Sophie Schneider, Andreas Maier

Comments: 4 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[388] arXiv:2405.09321 [pdf, other]: Title: ReconBoost: Boosting Can Achieve Modality Reconcilement

Authors: Cong Hua, Qianqian Xu, Shilong Bao, Zhiyong Yang, Qingming Huang

Comments: This paper has been accepted by ICML2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[389] arXiv:2405.09291 [pdf, other]: Title: Sensitivity Decouple Learning for Image Compression Artifacts Reduction

Authors: Li Ma, Yifan Zhao, Peixi Peng, Yonghong Tian

Comments: Accepted by Transactions on Image Processing

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[390] arXiv:2405.09288 [pdf, other]: Title: DeCoDEx: Confounder Detector Guidance for Improved Diffusion-based Counterfactual Explanations

Authors: Nima Fathi, Amar Kumar, Brennan Nichyporuk, Mohammad Havaei, Tal Arbel

Comments: Accepted to Medical Imaging with Deep Learning (MIDL) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[391] arXiv:2405.09266 [pdf, other]: Title: Dance Any Beat: Blending Beats with Visuals in Dance Video Generation

Authors: Xuanchen Wang, Heng Wang, Dongnan Liu, Weidong Cai

Comments: 11 pages, 6 figures, demo page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[392] arXiv:2405.09247 [pdf, other]: Title: Graph Neural Network based Handwritten Trajectories Recognition

Authors: Anuj Sharma, Sukhdeep Singh, S Ratna

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[393] arXiv:2405.09215 [pdf, other]: Title: Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model

Authors: Wanting Xu, Yang Liu, Langping He, Xucheng Huang, Ling Jiang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[394] arXiv:2405.09194 [pdf, ps, other]: Title: Flexible image analysis for law enforcement agencies with deep neural networks to determine: where, who and what

Authors: Henri Bouma, Bart Joosten, Maarten C Kruithof, Maaike H T de Boer, Alexandru Ginsca (LIST (CEA)), Benjamin Labbe (LIST (CEA)), Quoc T Vuong (LIST (CEA))

Journal-ref: SPIE - Counterterrorism, Crime Fighting, Forensics, and Surveillance Technologies II, 2018, pp.27

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[395] arXiv:2405.09152 [pdf, other]: Title: Scalable Image Coding for Humans and Machines Using Feature Fusion Network

Authors: Takahiro Shindo, Taiju Watanabe, Yui Tatsumi, Hiroshi Watanabe

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[396] arXiv:2405.09150 [pdf, other]: Title: Curriculum Dataset Distillation

Authors: Zhiheng Ma, Anjia Cao, Funing Yang, Xing Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[397] arXiv:2405.09148 [pdf, ps, other]: Title: A Hierarchically Feature Reconstructed Autoencoder for Unsupervised Anomaly Detection

Authors: Honghui Chen, Pingping Chen, Huan Mao, Mengxi Jiang

Comments: 12 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[398] arXiv:2405.09138 [pdf, other]: Title: OpenGait: A Comprehensive Benchmark Study for Gait Recognition towards Better Practicality

Authors: Chao Fan, Saihui Hou, Junhao Liang, Chuanfu Shen, Jingzhe Ma, Dongyang Jin, Yongzhen Huang, Shiqi Yu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[399] arXiv:2405.09131 [pdf, other]: Title: RobustMVS: Single Domain Generalized Deep Multi-view Stereo

Authors: Hongbin Xu, Weitao Chen, Baigui Sun, Xuansong Xie, Wenxiong Kang

Comments: Accepted to TCSVT. Code will be released at: this https URL. Benchmark will be released at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[400] arXiv:2405.09125 [pdf, other]: Title: HAAP: Vision-context Hierarchical Attention Autoregressive with Adaptive Permutation for Scene Text Recognition

Authors: Honghui Chen, Yuhang Qiu, Jiabao Wang, Pingping Chen, Nam Ling

Comments: 12 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[401] arXiv:2405.09114 [pdf, other]: Title: SOEDiff: Efficient Distillation for Small Object Editing

Authors: Qihe Pan, Zicheng Wang, Zhen Zhao, Yiming Wu, Sifan Long, Haoran Liang, Ronghua Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[402] arXiv:2405.09083 [pdf, other]: Title: RSHazeDiff: A Unified Fourier-aware Diffusion Model for Remote Sensing Image Dehazing

Authors: Jiamei Xiong, Xuefeng Yan, Yongzhen Wang, Wei Zhao, Xiao-Ping Zhang, Mingqiang Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[403] arXiv:2405.09059 [pdf, other]: Title: Task-adaptive Q-Face

Authors: Haomiao Sun, Mingjie He, Shiguang Shan, Hu Han, Xilin Chen

Comments: Ever submitted to ECCV2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[404] arXiv:2405.09056 [pdf, other]: Title: CTS: A Consistency-Based Medical Image Segmentation Model

Authors: Kejia Zhang, Lan Zhang, Haiwei Pan, Baolong Yu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[405] arXiv:2405.09054 [pdf, other]: Title: Dim Small Target Detection and Tracking: A Novel Method Based on Temporal Energy Selective Scaling and Trajectory Association

Authors: Weihua Gao, Wenlong Niu, Wenlong Lu, Pengcheng Wang, Zhaoyuan Qi, Xiaodong Peng, Zhen Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[406] arXiv:2405.09050 [pdf, other]: Title: 3D Shape Augmentation with Content-Aware Shape Resizing

Authors: Mingxiang Chen, Jian Zhang, Boli Zhou, Yang Song

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[407] arXiv:2405.09045 [pdf, other]: Title: AMSNet: Netlist Dataset for AMS Circuits

Authors: Zhuofu Tao, Yichen Shi, Yiru Huo, Rui Ye, Zonghang Li, Li Huang, Chen Wu, Na Bai, Zhiping Yu, Ting-Jung Lin, Lei He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[408] arXiv:2405.09041 [pdf, other]: Title: Learning from Partial Label Proportions for Whole Slide Image Segmentation

Authors: Shinnosuke Matsuo, Daiki Suehiro, Seiichi Uchida, Hiroaki Ito, Kazuhiro Terada, Akihiko Yoshizawa, Ryoma Bise

Comments: Accepted at MICCAI2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[409] arXiv:2405.09032 [pdf, other]: Title: ICAL: Implicit Character-Aided Learning for Enhanced Handwritten Mathematical Expression Recognition

Authors: Jianhua Zhu, Liangcai Gao, Wenqi Zhao

Comments: Accepted by ICDAR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[410] arXiv:2405.09024 [pdf, other]: Title: Dynamic Loss Decay based Robust Oriented Object Detection on Remote Sensing Images with Noisy Labels

Authors: Guozhang Liu, Ting Liu, Mengke Yuan, Tao Pang, Guangxing Yang, Hao Fu, Tao Wang, Tongkui Liao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[411] arXiv:2405.09006 [pdf, other]: Title: Spatial Semantic Recurrent Mining for Referring Image Segmentation

Authors: Jiaxing Yang, Lihe Zhang, Jiayu Sun, Huchuan Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[412] arXiv:2405.08996 [pdf, other]: Title: Learning Correspondence for Deformable Objects

Authors: Priya Sundaresan, Aditya Ganapathi, Harry Zhang, Shivin Devgon

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[413] arXiv:2405.08992 [pdf, other]: Title: Contextual Emotion Recognition using Large Vision Language Models

Authors: Yasaman Etesam, Özge Nilay Yalçın, Chuxuan Zhang, Angelica Lim

Comments: 8 pages, website: this https URL. arXiv admin note: text overlap with arXiv:2310.19995

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[414] arXiv:2405.08991 [pdf, other]: Title: Theoretical Analysis for Expectation-Maximization-Based Multi-Model 3D Registration

Authors: David Jin, Harry Zhang, Kai Chang

Comments: arXiv admin note: substantial text overlap with arXiv:2402.10865

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[415] arXiv:2405.08961 [pdf, other]: Title: Bird's-Eye View to Street-View: A Survey

Authors: Khawlah Bajbaa, Muhammad Usman, Saeed Anwar, Ibrahim Radwan, Abdul Bais

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[416] arXiv:2405.08932 [pdf, other]: Title: Self-supervised vision-langage alignment of deep learning representations for bone X-rays analysis

Authors: Alexandre Englebert, Anne-Sophie Collin, Olivier Cornu, Christophe De Vleeschouwer

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[417] arXiv:2405.08911 [pdf, other]: Title: CLIP with Quality Captions: A Strong Pretraining for Vision Tasks

Authors: Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri, Oncel Tuzel

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[418] arXiv:2405.08909 [pdf, other]: Title: ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association

Authors: Shuxiao Ding, Lukas Schneider, Marius Cordts, Juergen Gall

Comments: 14 pages, 3 figures, accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[419] arXiv:2405.08890 [pdf, other]: Title: Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching Considering the Diversity of the Video

Authors: Tomoya Sugihara, Shuntaro Masuda, Ling Xiao, Toshihiko Yamasaki

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[420] arXiv:2405.09539 (cross-list from eess.IV) [pdf, ps, other]: Title: MMFusion: Multi-modality Diffusion Model for Lymph Node Metastasis Diagnosis in Esophageal Cancer

Authors: Chengyu Wu, Chengkai Wang, Yaqi Wang, Huiyu Zhou, Yatao Zhang, Qifeng Wang, Shuai Wang

Comments: Early accepted to MICCAI 2024 (6/6/5)

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[421] arXiv:2405.09530 (cross-list from cs.CY) [pdf, other]: Title: A community palm model

Authors: Nicholas Clinton, Andreas Vollrath, Remi D'annunzio, Desheng Liu, Henry B. Glick, Adrià Descals, Alicia Sullivan, Oliver Guinan, Jacob Abramowitz, Fred Stolle, Chris Goodman, Tanya Birch, David Quinn, Olga Danylo, Tijs Lips, Daniel Coelho, Enikoe Bihari, Bryce Cronkite-Ratcliff, Ate Poortinga, Atena Haghighattalab, Evan Notman, Michael DeWitt, Aaron Yonas, Gennadii Donchyts, Devaja Shah, David Saah, Karis Tenneson, Nguyen Hanh Quyen, Megha Verma, Andrew Wilcox

Comments: v0

Subjects: Computers and Society (cs.CY); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[422] arXiv:2405.09472 (cross-list from eess.IV) [pdf, other]: Title: Perception- and Fidelity-aware Reduced-Reference Super-Resolution Image Quality Assessment

Authors: Xinying Lin, Xuyang Liu, Hong Yang, Xiaohai He, Honggang Chen

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[423] arXiv:2405.09353 (cross-list from eess.IV) [pdf, other]: Title: Large coordinate kernel attention network for lightweight image super-resolution

Authors: Fangwei Hao, Jiesheng Wu, Haotian Lu, Ji Du, Jing Xu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[424] arXiv:2405.09298 (cross-list from eess.IV) [pdf, ps, other]: Title: Deep Blur Multi-Model (DeepBlurMM) -- a strategy to mitigate the impact of image blur on deep learning model performance in histopathology image analysis

Authors: Yujie Xiang, Bojing Liu, Mattias Rantalainen

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[425] arXiv:2405.09286 (cross-list from cs.MM) [pdf, other]: Title: MVBIND: Self-Supervised Music Recommendation For Videos Via Embedding Space Binding

Authors: Jiajie Teng, Huiyu Duan, Yucheng Zhu, Sijing Wu, Guangtao Zhai

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[426] arXiv:2405.09077 (cross-list from eess.IV) [pdf, other]: Title: Compressive Feature Selection for Remote Visual Multi-Task Inference

Authors: Saeed Ranjbar Alvar, Ivan V. Bajić

Comments: 6 pages, 8 figures, IEEE ICME Workshop on Coding for Machines

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[427] arXiv:2405.09049 (cross-list from cs.LG) [pdf, other]: Title: Perception Without Vision for Trajectory Prediction: Ego Vehicle Dynamics as Scene Representation for Efficient Active Learning in Autonomous Driving

Authors: Ross Greer, Mohan Trivedi

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[428] arXiv:2405.08981 (cross-list from cs.HC) [pdf, other]: Title: Impact of Design Decisions in Scanpath Modeling

Authors: Parvin Emami, Yue Jiang, Zixin Guo, Luis A. Leiva

Comments: 16 pages

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[429] arXiv:2405.08920 (cross-list from cs.LG) [pdf, other]: Title: Neural Collapse Meets Differential Privacy: Curious Behaviors of NoisyGD with Near-perfect Representation Learning

Authors: Chendi Wang, Yuqing Zhu, Weijie J. Su, Yu-Xiang Wang

Comments: To appear in ICML 2024

Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
