Computer Vision and Pattern Recognition

Authors and titles for recent submissions

[ total of 420 entries: 1-420 ]

Tue, 21 May 2024

[1] arXiv:2405.12221 [pdf, other]: Title: Images that Sound: Composing Images and Sounds on a Single Canvas

Authors: Ziyang Chen, Daniel Geng, Andrew Owens

Comments: Project site: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[2] arXiv:2405.12218 [pdf, other]: Title: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo

Authors: Tianqi Liu, Guangcong Wang, Shoukang Hu, Liao Shen, Xinyi Ye, Yuhang Zang, Zhiguo Cao, Wei Li, Ziwei Liu

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[3] arXiv:2405.12217 [pdf, other]: Title: Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning

Authors: Guanglin Zhou, Zhongyi Han, Shiming Chen, Biwei Huang, Liming Zhu, Salman Khan, Xin Gao, Lina Yao

Comments: 17 pages, 7 figures, 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[4] arXiv:2405.12211 [pdf, other]: Title: Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices

Authors: Nathaniel Cohen, Vladimir Kulikov, Matan Kleiner, Inbar Huberman-Spiegelglas, Tomer Michaeli

Comments: ICML 2024. Code and examples are available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[5] arXiv:2405.12202 [pdf, other]: Title: Hierarchical Neural Operator Transformer with Learnable Frequency-aware Loss Prior for Arbitrary-scale Super-resolution

Authors: Xihaier Luo, Xiaoning Qian, Byung-Jun Yoon

Comments: 20 pages, 14 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[6] arXiv:2405.12200 [pdf, other]: Title: Multi-View Attentive Contextualization for Multi-View 3D Object Detection

Authors: Xianpeng Liu, Ce Zheng, Ming Qian, Nan Xue, Chen Chen, Zhebin Zhang, Chen Li, Tianfu Wu

Comments: Accepted by CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[7] arXiv:2405.12175 [pdf, other]: Title: Enhancing Explainable AI: A Hybrid Approach Combining GradCAM and LRP for CNN Interpretability

Authors: Vaibhav Dhore, Achintya Bhat, Viraj Nerlekar, Kashyap Chavhan, Aniket Umare

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8] arXiv:2405.12150 [pdf, other]: Title: Bangladeshi Native Vehicle Detection in Wild

Authors: Bipin Saha, Md. Johirul Islam, Shaikh Khaled Mostaque, Aditya Bhowmik, Tapodhir Karmakar Taton, Md. Nakib Hayat Chowdhury, Mamun Bin Ibne Reaz

Comments: 13 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[9] arXiv:2405.12139 [pdf, other]: Title: DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM

Authors: Xuchen Li, Xiaokun Feng, Shiyu Hu, Meiqi Wu, Dailing Zhang, Jing Zhang, Kaiqi Huang

Comments: Accepted by CVPR Workshop 2024, Oral Presentation

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[10] arXiv:2405.12126 [pdf, other]: Title: Alzheimer's Magnetic Resonance Imaging Classification Using Deep and Meta-Learning Models

Authors: Nida Nasir, Muneeb Ahmed, Neda Afreen, Mustafa Sameer

Subjects: Computer Vision and Pattern Recognition (cs.CV); Emerging Technologies (cs.ET); Machine Learning (cs.LG); Multimedia (cs.MM)
[11] arXiv:2405.12114 [pdf, other]: Title: A New Cross-Space Total Variation Regularization Model for Color Image Restoration with Quaternion Blur Operator

Authors: Zhigang Jia, Yuelian Xiang, Meixiang Zhao, Tingting Wu, Michael K. Ng

Comments: 15 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA)
[12] arXiv:2405.12110 [pdf, other]: Title: CoR-GS: Sparse-View 3D Gaussian Splatting via Co-Regularization

Authors: Jiawei Zhang, Jiahe Li, Xiaohan Yu, Lei Huang, Lin Gu, Jin Zheng, Xiao Bai

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[13] arXiv:2405.12107 [pdf, other]: Title: Imp: Highly Capable Large Multimodal Models for Mobile Devices

Authors: Zhenwei Shao, Zhou Yu, Jun Yu, Xuecheng Ouyang, Lihao Zheng, Zhenbiao Gai, Mingyang Wang, Jiajun Ding

Comments: 19 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[14] arXiv:2405.12105 [pdf, other]: Title: Sheet Music Transformer ++: End-to-End Full-Page Optical Music Recognition for Pianoform Sheet Music

Authors: Antonio Ríos-Vila, Jorge Calvo-Zaragoza, David Rizo, Thierry Paquet

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[15] arXiv:2405.12070 [pdf, other]: Title: AutoSoccerPose: Automated 3D posture Analysis of Soccer Shot Movements

Authors: Calvin Yeung, Kenjiro Ide, Keisuke Fujii

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[16] arXiv:2405.12069 [pdf, other]: Title: Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping

Authors: Tianhao Wu, Jing Yang, Zhilin Guo, Jingyi Wan, Fangcheng Zhong, Cengiz Oztireli

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17] arXiv:2405.12057 [pdf, other]: Title: NPLMV-PS: Neural Point-Light Multi-View Photometric Stereo

Authors: Fotios Logothetis, Ignas Budvytis, Roberto Cipolla

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18] arXiv:2405.12018 [pdf, other]: Title: Continuous Sign Language Recognition with Adapted Conformer via Unsupervised Pretraining

Authors: Neena Aloysius, Geetha M, Prema Nedungadi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[19] arXiv:2405.12006 [pdf, other]: Title: Depth Reconstruction with Neural Signed Distance Fields in Structured Light Systems

Authors: Rukun Qiao, Hiroshi Kawasaki, Hongbin Zha

Comments: 10 pages, 8 figures, accepted by 3DV 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20] arXiv:2405.12003 [pdf, other]: Title: Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification

Authors: Weilian Zhou, Sei-Ichiro Kamata, Haipeng Wang, Man-Sing Wong, Huiying (Cynthia) Hou

Comments: 19 pages, 16 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[21] arXiv:2405.11993 [pdf, other]: Title: GGAvatar: Geometric Adjustment of Gaussian Head Avatar

Authors: Xinyang Li, Jiaxin Wang, Yixin Xuan, Gongxin Yao, Yu Pan

Comments: 9 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22] arXiv:2405.11985 [pdf, other]: Title: MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering

Authors: Jingqun Tang, Qi Liu, Yongjie Ye, Jinghui Lu, Shu Wei, Chunhui Lin, Wanqing Li, Mohamad Fitri Faiz Bin Mahmood, Hao Feng, Zhen Zhao, Yanjie Wang, Yuliang Liu, Hao Liu, Xiang Bai, Can Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23] arXiv:2405.11978 [pdf, other]: Title: SM-DTW: Stability Modulated Dynamic Time Warping for signature verification

Authors: Antonio Parziale, Moises Diaz, Miguel A. Ferrer, Angelo Marcelli

Journal-ref: Pattern Recognition Letters, Volume: 121, Pages 113-122 (2019)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[24] arXiv:2405.11977 [pdf, other]: Title: GuidedRec: Guiding Ill-Posed Unsupervised Volumetric Recovery

Authors: Alexandre Cafaro, Amaury Leroy, Guillaume Beldjoudi, Pauline Maury, Charlotte Robert, Eric Deutsch, Vincent Grégoire, Vincent Lepetit, Nikos Paragios

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25] arXiv:2405.11976 [pdf, other]: Title: Position-Guided Prompt Learning for Anomaly Detection in Chest X-Rays

Authors: Zhichao Sun, Yuliang Gu, Yepeng Liu, Zerui Zhang, Zhou Zhao, Yongchao Xu

Comments: MICCAI 2024 Early Accept

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2405.11971 [pdf, other]: Title: Data Augmentation for Text-based Person Retrieval Using Large Language Models

Authors: Zheng Li, Lijia Si, Caili Guo, Yang Yang, Qiushi Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27] arXiv:2405.11936 [pdf, other]: Title: UAV-VisLoc: A Large-scale Dataset for UAV Visual Localization

Authors: Wenjia Xu, Yaxuan Yao, Jiaqi Cao, Zhiwei Wei, Chunbo Liu, Jiuniu Wang, Mugen Peng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28] arXiv:2405.11921 [pdf, other]: Title: MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections

Authors: Jiayue Liu, Xiao Tang, Freeman Cheng, Roy Yang, Zhihao Li, Jianzhuang Liu, Yi Huang, Jiaqi Lin, Shiyong Liu, Xiaofei Wu, Songcen Xu, Chun Yuan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29] arXiv:2405.11914 [pdf, other]: Title: PT43D: A Probabilistic Transformer for Generating 3D Shapes from Single Highly-Ambiguous RGB Images

Authors: Yiheng Xiong, Angela Dai

Comments: 10 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[30] arXiv:2405.11913 [pdf, other]: Title: Diff-BGM: A Diffusion Model for Video Background Music Generation

Authors: Sizhe Li, Yiming Qin, Minghang Zheng, Xin Jin, Yang Liu

Comments: Accepted by CVPR 2024 (Poster)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31] arXiv:2405.11905 [pdf, other]: Title: CSTA: CNN-based Spatiotemporal Attention for Video Summarization

Authors: Jaewon Son, Jaehun Park, Kwangsu Kim

Comments: Accepted at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[32] arXiv:2405.11903 [pdf, ps, other]: Title: A comprehensive overview of deep learning techniques for 3D point cloud classification and semantic segmentation

Authors: Sushmita Sarker, Prithul Sarker, Gunner Stone, Ryan Gorman, Alireza Tavakkoli, George Bebis, Javad Sattarvand

Comments: Published in Springer Nature (Machine Vision and Applications)

Journal-ref: Machine Vision and Applications 35, 67 (2024)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33] arXiv:2405.11894 [pdf, other]: Title: Refining Coded Image in Human Vision Layer Using CNN-Based Post-Processing

Authors: Takahiro Shindo, Yui Tatsumi, Taiju Watanabe, Hiroshi Watanabe

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[34] arXiv:2405.11867 [pdf, other]: Title: Depth Prompting for Sensor-Agnostic Depth Estimation

Authors: Jin-Hwi Park, Chanhwi Jeong, Junoh Lee, Hae-Gon Jeon

Comments: Accepted at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
[35] arXiv:2405.11862 [pdf, other]: Title: SEMv3: A Fast and Robust Approach to Table Separation Line Detection

Authors: Chunxia Qin, Zhenrong Zhang, Pengfei Hu, Chenyu Liu, Jiefeng Ma, Jun Du

Comments: 9 pages, 6 figures, 5 tables. Accepted by IJCAI2024 main track

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36] arXiv:2405.11852 [pdf, other]: Title: Evolving Storytelling: Benchmarks and Methods for New Character Customization with Diffusion Models

Authors: Xiyu Wang, Yufei Wang, Satoshi Tsutsui, Weisi Lin, Bihan Wen, Alex C. Kot

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[37] arXiv:2405.11850 [pdf, other]: Title: Rethinking Overlooked Aspects in Vision-Language Models

Authors: Yuan Liu, Le Tian, Xiao Zhou, Jie Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[38] arXiv:2405.11846 [pdf, other]: Title: EPPS: Advanced Polyp Segmentation via Edge Information Injection and Selective Feature Decoupling

Authors: Mengqi Lei, Xin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[39] arXiv:2405.11837 [pdf, other]: Title: Improving the Explain-Any-Concept by Introducing Nonlinearity to the Trainable Surrogate Model

Authors: Mounes Zaval, Sedat Ozer

Comments: This paper is accepted for publication at IEEE SIU conference, 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[40] arXiv:2405.11823 [pdf, other]: Title: Stereo-Knowledge Distillation from dpMV to Dual Pixels for Light Field Video Reconstruction

Authors: Aryan Garg, Raghav Mallampali, Akshat Joshi, Shrisudhan Govindarajan, Kaushik Mitra

Comments: International Conference of Computational Photography (ICCP 2024), 11 pages and 12 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41] arXiv:2405.11822 [pdf, other]: Title: FeTT: Continual Class Incremental Learning via Feature Transformation Tuning

Authors: Sunyuan Qiang, Xuxin Lin, Yanyan Liang, Jun Wan, Du Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[42] arXiv:2405.11814 [pdf, other]: Title: Climatic & Anthropogenic Hazards to the Nasca World Heritage: Application of Remote Sensing, AI, and Flood Modelling

Authors: Masato Sakai, Marcus Freitag, Akihisa Sakurai, Conrad M Albrecht, Hendrik F Hamann

Comments: accepted at IGARSS 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
[43] arXiv:2405.11809 [pdf, other]: Title: Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices

Authors: Baiyu Pan, Jichao Jiao, Jianxing Pang, Jun Cheng

Comments: International Conference on Robotics and Automation (ICRA) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[44] arXiv:2405.11794 [pdf, other]: Title: ViViD: Video Virtual Try-on using Diffusion Models

Authors: Zixun Fang, Wei Zhai, Aimin Su, Hongliang Song, Kai Zhu, Mao Wang, Yu Chen, Zhiheng Liu, Yang Cao, Zheng-Jun Zha

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45] arXiv:2405.11793 [pdf, other]: Title: MM-Retinal: Knowledge-Enhanced Foundational Pretraining with Fundus Image-Text Expertise

Authors: Ruiqi Wu, Chenran Zhang, Jianle Zhang, Yi Zhou, Tao Zhou, Huazhu Fu

Comments: Early Accepted by The International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46] arXiv:2405.11770 [pdf, other]: Title: Learning Spatial Similarity Distribution for Few-shot Object Counting

Authors: Yuanwu Xu, Feifan Song, Haofeng Zhang

Comments: Accepted to IJCAI2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[47] arXiv:2405.11765 [pdf, other]: Title: DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment

Authors: Jianhong Han, Liang Chen, Yupei Wang

Comments: Manuscript submitted to IEEE Transactions on Image Processing

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[48] arXiv:2405.11757 [pdf, other]: Title: DLAFormer: An End-to-End Transformer For Document Layout Analysis

Authors: Jiawei Wang, Kai Hu, Qiang Huo

Comments: ICDAR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[49] arXiv:2405.11754 [pdf, other]: Title: Versatile Teacher: A Class-aware Teacher-student Framework for Cross-domain Adaptation

Authors: Runou Yang, Tian Tian, Jinwen Tian

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[50] arXiv:2405.11732 [pdf, ps, other]: Title: Quality assurance of organs-at-risk delineation in radiotherapy

Authors: Yihao Zhao, Cuiyun Yuan, Ying Liang, Yang Li, Chunxia Li, Man Zhao, Jun Hu, Wei Liu, Chenbin Liu

Comments: 14 pages, 5 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[51] arXiv:2405.11690 [pdf, other]: Title: InterAct: Capture and Modelling of Realistic, Expressive and Interactive Activities between Two Persons in Daily Scenarios

Authors: Yinghao Huang, Leo Ho, Dafei Qin, Mingyi Shi, Taku Komura

Comments: The first two authors contributed equally to this work

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[52] arXiv:2405.11685 [pdf, other]: Title: ColorFoil: Investigating Color Blindness in Large Vision and Language Models

Authors: Ahnaf Mozib Samin, M. Firoz Ahmed, Md. Mushtaq Shahriyar Rafee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[53] arXiv:2405.11682 [pdf, other]: Title: FADet: A Multi-sensor 3D Object Detection Network based on Local Featured Attention

Authors: Ziang Guo, Zakhar Yagudin, Selamawit Asfaw, Artem Lykov, Dzmitry Tsetserukou

Comments: Submitted to IEEE

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[54] arXiv:2405.11677 [pdf, other]: Title: Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries

Authors: Christiaan G.A. Viviers, Lena Filatova, Maurice Termeer, Peter H.N. de With, Fons van der Sommen

Comments: Early author version of paper. Refer to the full paper at this https URL

Journal-ref: IEEE Transactions on Image Processing (2024) (Volume: 33) Page(s): 2462 - 2476

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[55] arXiv:2405.11675 [pdf, other]: Title: Deep Ensemble Art Style Recognition

Authors: Orfeas Menis-Mastromichalakis, Natasa Sofou, Giorgos Stamou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[56] arXiv:2405.11655 [pdf, other]: Title: Track Anything Rapter(TAR)

Authors: Tharun V. Puthanveettil, Fnu Obaid ur Rahman

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[57] arXiv:2405.11643 [pdf, other]: Title: Morphological Prototyping for Unsupervised Slide Representation Learning in Computational Pathology

Authors: Andrew H. Song, Richard J. Chen, Tong Ding, Drew F.K. Williamson, Guillaume Jaume, Faisal Mahmood

Comments: CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Applications (stat.AP)
[58] arXiv:2405.11629 [pdf, other]: Title: Searching Realistic-Looking Adversarial Objects For Autonomous Driving Systems

Authors: Shengxiang Sun, Shenzhe Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[59] arXiv:2405.11621 [pdf, ps, other]: Title: Computer Vision in the Food Industry: Accurate, Real-time, and Automatic Food Recognition with Pretrained MobileNetV2

Authors: Shayan Rokhva, Babak Teimourpour, Amir Hossein Soltani

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[60] arXiv:2405.11618 [pdf, other]: Title: Transcriptomics-guided Slide Representation Learning in Computational Pathology

Authors: Guillaume Jaume, Lukas Oldenburg, Anurag Vaidya, Richard J. Chen, Drew F.K. Williamson, Thomas Peeters, Andrew H. Song, Faisal Mahmood

Comments: CVPR'24, Oral

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[61] arXiv:2405.11616 [pdf, other]: Title: Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention

Authors: Peng Li, Yuan Liu, Xiaoxiao Long, Feihu Zhang, Cheng Lin, Mengfei Li, Xingqun Qi, Shanghang Zhang, Wenhan Luo, Ping Tan, Wenping Wang, Qifeng Liu, Yike Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[62] arXiv:2405.11614 [pdf, other]: Title: Nickel and Diming Your GAN: A Dual-Method Approach to Enhancing GAN Efficiency via Knowledge Distillation

Authors: Sangyeop Yeo, Yoojin Jang, Jaejun Yoo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[63] arXiv:2405.11582 [pdf, other]: Title: SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization

Authors: Jialong Guo, Xinghao Chen, Yehui Tang, Yunhe Wang

Comments: Accepted to ICML 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[64] arXiv:2405.11574 [pdf, other]: Title: Reproducibility Study of CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification

Authors: Manan Shah, Yash Bhalgat

Comments: Reproducibility study

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[65] arXiv:2405.11564 [pdf, other]: Title: CRF360D: Monocular 360 Depth Estimation via Spherical Fully-Connected CRFs

Authors: Zidong Cao, Lin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[66] arXiv:2405.11551 [pdf, other]: Title: An Invisible Backdoor Attack Based On Semantic Feature

Authors: Yangming Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[67] arXiv:2405.11536 [pdf, other]: Title: RobMOT: Robust 3D Multi-Object Tracking by Observational Noise and State Estimation Drift Mitigation on LiDAR PointCloud

Authors: Mohamed Nagy, Naoufel Werghi, Bilal Hassan, Jorge Dias, Majid Khonji

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[68] arXiv:2405.11526 [pdf, other]: Title: Register assisted aggregation for Visual Place Recognition

Authors: Xuan Yu, Zhenyong Fu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[69] arXiv:2405.11523 [pdf, other]: Title: Diffusion-Based Hierarchical Image Steganography

Authors: Youmin Xu, Xuanyu Zhang, Jiwen Yu, Chong Mou, Xiandong Meng, Jian Zhang

Comments: arXiv admin note: text overlap with arXiv:2305.16936

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[70] arXiv:2405.11511 [pdf, other]: Title: Online Action Representation using Change Detection and Symbolic Programming

Authors: Vishnu S Nair, Sneha Sree, Jayaraj Joseph, Mohanasankar Sivaprakasam

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[71] arXiv:2405.11501 [pdf, other]: Title: DogFLW: Dog Facial Landmarks in the Wild Dataset

Authors: George Martvel, Greta Abele, Annika Bremhorst, Chiara Canori, Nareed Farhat, Giulia Pedretti, Ilan Shimshoni, Anna Zamansky

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[72] arXiv:2405.11498 [pdf, other]: Title: The Effectiveness of Edge Detection Evaluation Metrics for Automated Coastline Detection

Authors: Conor O'Sullivan, Seamus Coveney, Xavier Monteys, Soumyabrata Dev

Journal-ref: 2023 Photonics & Electromagnetics Research Symposium (PIERS)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[73] arXiv:2405.11496 [pdf, other]: Title: DEMO: A Statistical Perspective for Efficient Image-Text Matching

Authors: Fan Zhang, Xian-Sheng Hua, Chong Chen, Xiao Luo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)
[74] arXiv:2405.11494 [pdf, other]: Title: Automated Coastline Extraction Using Edge Detection Algorithms

Authors: Conor O'Sullivan, Seamus Coveney, Xavier Monteys, Soumyabrata Dev

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[75] arXiv:2405.11493 [pdf, other]: Title: Point Cloud Compression with Implicit Neural Representations: A Unified Framework

Authors: Hongning Ruan, Yulin Shao, Qianqian Yang, Liang Zhao, Dusit Niyato

Comments: 6 Pages, 6 Figures, submitted to IEEE ICCC

Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Signal Processing (eess.SP)
[76] arXiv:2405.11491 [pdf, other]: Title: BOSC: A Backdoor-based Framework for Open Set Synthetic Image Attribution

Authors: Jun Wang, Benedetta Tondi, Mauro Barni

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[77] arXiv:2405.11487 [pdf, other]: Title: "Previously on ..." From Recaps to Story Summarization

Authors: Aditya Kumar Singh, Dhruv Srivastava, Makarand Tapaswi

Comments: CVPR 2024; Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[78] arXiv:2405.11483 [pdf, other]: Title: MICap: A Unified Model for Identity-aware Movie Descriptions

Authors: Haran Raajesh, Naveen Reddy Desanur, Zeeshan Khan, Makarand Tapaswi

Comments: CVPR 2024, Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[79] arXiv:2405.11481 [pdf, other]: Title: Physics-aware Hand-object Interaction Denoising

Authors: Haowen Luo, Yunze Liu, Li Yi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[80] arXiv:2405.11478 [pdf, other]: Title: Unsupervised Image Prior via Prompt Learning and CLIP Semantic Guidance for Low-Light Image Enhancement

Authors: Igor Morawski, Kai He, Shusil Dangi, Winston H. Hsu

Comments: Accepted to CVPR 2024 Workshop NTIRE: New Trends in Image Restoration and Enhancement workshop and Challenges

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[81] arXiv:2405.11476 [pdf, other]: Title: NubbleDrop: A Simple Way to Improve Matching Strategy for Prompted One-Shot Segmentation

Authors: Zhiyu Xu, Qingliang Chen

Comments: Under Review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[82] arXiv:2405.11473 [pdf, other]: Title: FIFO-Diffusion: Generating Infinite Videos from Text without Training

Authors: Jihwan Kim, Junoh Kang, Jinyoung Choi, Bohyung Han

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[83] arXiv:2405.11468 [pdf, other]: Title: Emphasizing Crucial Features for Efficient Image Restoration

Authors: Hu Gao, Bowen Ma, Ying Zhang, Jingfan Yang, Jing Yang, Depeng Dang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[84] arXiv:2405.11467 [pdf, other]: Title: AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation

Authors: Suorong Yang, Peijia Li, Xin Xiong, Furao Shen, Jian Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[85] arXiv:2405.11448 [pdf, other]: Title: Cross-Domain Knowledge Distillation for Low-Resolution Human Pose Estimation

Authors: Zejun Gu, Zhong-Qiu Zhao, Henghui Ding, Hao Shen, Zhao Zhang, De-Shuang Huang

Comments: 11 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[86] arXiv:2405.11442 [pdf, other]: Title: Unifying 3D Vision-Language Understanding via Promptable Queries

Authors: Ziyu Zhu, Zhuofan Zhang, Xiaojian Ma, Xuesong Niu, Yixin Chen, Baoxiong Jia, Zhidong Deng, Siyuan Huang, Qing Li

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[87] arXiv:2405.11437 [pdf, other]: Title: The First Swahili Language Scene Text Detection and Recognition Dataset

Authors: Fadila Wendigoundi Douamba, Jianjun Song, Ling Fu, Yuliang Liu, Xiang Bai

Comments: Accepted to ICDAR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[88] arXiv:2405.11351 [pdf, other]: Title: PlantTracing: Tracing Arabidopsis Thaliana Apex with CenterTrack

Authors: Yuanzhe Liu, Yixiang Mao, Yao Wang

Comments: 4 pages, 13 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[89] arXiv:2405.11345 [pdf, ps, other]: Title: City-Scale Multi-Camera Vehicle Tracking System with Improved Self-Supervised Camera Link Model

Authors: Yuqiang Lin, Sam Lockyer, Adrian Evans, Markus Zarbock, Nic Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[90] arXiv:2405.11338 [pdf, ps, other]: Title: EyeFound: A Multimodal Generalist Foundation Model for Ophthalmic Imaging

Authors: Danli Shi, Weiyi Zhang, Xiaolan Chen, Yexin Liu, Jianchen Yang, Siyu Huang, Yih Chung Tham, Yingfeng Zheng, Mingguang He

Comments: 21 pages, 2 figures, 4 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[91] arXiv:2405.11337 [pdf, other]: Title: A Unified Approach Towards Active Learning and Out-of-Distribution Detection

Authors: Sebastian Schmidt, Leonard Schenk, Leo Schwinn, Stephan Günnemann

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[92] arXiv:2405.11336 [pdf, other]: Title: UPAM: Unified Prompt Attack in Text-to-Image Generation Models Against Both Textual Filters and Visual Checkers

Authors: Duo Peng, Qiuhong Ke, Jun Liu

Comments: Accepted by ICML2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[93] arXiv:2405.11315 [pdf, other]: Title: MediCLIP: Adapting CLIP for Few-shot Medical Image Anomaly Detection

Authors: Ximiao Zhang, Min Xu, Dehui Qiu, Ruixin Yan, Ning Lang, Xiuzhuang Zhou

Comments: 12 pages, 3 figures, 5 tables, early accepted at MICCAI 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[94] arXiv:2405.11293 [pdf, other]: Title: InfRS: Incremental Few-Shot Object Detection in Remote Sensing Images

Authors: Wuzhou Li, Jiawei Zhou, Xiang Li, Yi Cao, Guang Jin, Xuemin Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[95] arXiv:2405.11286 [pdf, other]: Title: Motion Avatar: Generate Human and Animal Avatars with Arbitrary Motion

Authors: Zeyu Zhang, Yiran Wang, Biao Wu, Shuo Chen, Zhiyuan Zhang, Shiya Huang, Wenbo Zhang, Meng Fang, Ling Chen, Yang Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[96] arXiv:2405.11276 [pdf, other]: Title: Visible and Clear: Finding Tiny Objects in Difference Map

Authors: Bing Cao, Haiyu Yao, Pengfei Zhu, Qinghua Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[97] arXiv:2405.11270 [pdf, other]: Title: HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos

Authors: Qifeng Chen, Rengan Xie, Kai Huang, Qi Wang, Wenting Zheng, Rong Li, Yuchi Huo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[98] arXiv:2405.11252 [pdf, other]: Title: Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching

Authors: Xingyu Miao, Haoran Duan, Varun Ojha, Jun Song, Tejal Shah, Yang Long, Rajiv Ranjan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[99] arXiv:2405.11240 [pdf, other]: Title: Testing the Performance of Face Recognition for People with Down Syndrome

Authors: Christian Rathgeb, Mathias Ibsen, Denise Hartmann, Simon Hradetzky, Berglind Ólafsdóttir

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[100] arXiv:2405.11236 [pdf, other]: Title: TriLoRA: Integrating SVD for Advanced Style Personalization in Text-to-Image Generation

Authors: Chengcheng Feng, Mu He, Qiuyu Tian, Haojie Yin, Xiaofang Zhao, Hongwei Tang, Xingqiang Wei

Comments: Accepted by AI for Content Creation (AI4CC) workshop at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[101] arXiv:2405.11205 [pdf, other]: Title: Fuse & Calibrate: A bi-directional Vision-Language Guided Framework for Referring Image Segmentation

Authors: Yichen Yan, Xingjian He, Sihan Chen, Shichen Lu, Jing Liu

Comments: 12 pages, 4 figures, ICIC 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[102] arXiv:2405.11190 [pdf, other]: Title: ReasonPix2Pix: Instruction Reasoning Dataset for Advanced Image Editing

Authors: Ying Jin, Pengyang Ling, Xiaoyi Dong, Pan Zhang, Jiaqi Wang, Dahua Lin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[103] arXiv:2405.11180 [pdf, other]: Title: GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition

Authors: Mallika Garg, Debashis Ghosh, Pyari Mohan Pradhan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[104] arXiv:2405.11165 [pdf, other]: Title: Automated Multi-level Preference for MLLMs

Authors: Mengxi Zhang, Kang Rong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[105] arXiv:2405.11158 [pdf, other]: Title: Dusk Till Dawn: Self-supervised Nighttime Stereo Depth Estimation using Visual Foundation Models

Authors: Madhu Vankadari, Samuel Hodgson, Sangyun Shin, Kaichen Zhou, Andrew Markham, Niki Trigoni

Comments: The paper is published at ICRA 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[106] arXiv:2405.11154 [pdf, other]: Title: Revisiting the Robust Generalization of Adversarial Prompt Tuning

Authors: Fan Yang, Mingxuan Xia, Sangzhou Xia, Chicheng Ma, Hui Hui

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[107] arXiv:2405.11151 [pdf, other]: Title: Multi-scale Information Sharing and Selection Network with Boundary Attention for Polyp Segmentation

Authors: Xiaolu Kang, Zhuoqi Ma, Kang Liu, Yunan Li, Qiguang Miao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[108] arXiv:2405.11145 [pdf, other]: Title: Detecting Multimodal Situations with Insufficient Context and Abstaining from Baseless Predictions

Authors: Junzhang Liu, Zhecan Wang, Hammad Ayyubi, Haoxuan You, Chris Thomas, Rui Sun, Shih-Fu Chang, Kai-Wei Chang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[109] arXiv:2405.11129 [pdf, other]: Title: MotionGS: Compact Gaussian Splatting SLAM by Motion Filter

Authors: Xinli Guo, Peng Han, Weidong Zhang, Hongtian Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[110] arXiv:2405.11126 [pdf, other]: Title: Flexible Motion In-betweening with Diffusion Models

Authors: Setareh Cohan, Guy Tevet, Daniele Reda, Xue Bin Peng, Michiel van de Panne

Comments: SIGGRAPH 2024. For project page and code, see this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[111] arXiv:2405.11112 [pdf, other]: Title: Enhancing Understanding Through Wildlife Re-Identification

Authors: J. Buitenhuis

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[112] arXiv:2405.11067 [pdf, other]: Title: Bayesian Learning-driven Prototypical Contrastive Loss for Class-Incremental Learning

Authors: Nisha L. Raichur, Lucas Heublein, Tobias Feigl, Alexander Rügamer, Christopher Mutschler, Felix Ott

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[113] arXiv:2405.11021 [pdf, other]: Title: Photorealistic 3D Urban Scene Reconstruction and Point Cloud Extraction using Google Earth Imagery and Gaussian Splatting

Authors: Kyle Gao, Dening Lu, Hongjie He, Linlin Xu, Jonathan Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[114] arXiv:2405.10954 [pdf, ps, other]: Title: Multimodal CLIP Inference for Meta-Few-Shot Image Classification

Authors: Constance Ferragu, Philomene Chagniot, Vincent Coyette

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[115] arXiv:2405.10952 [pdf, other]: Title: VICAN: Very Efficient Calibration Algorithm for Large Camera Networks

Authors: Gabriel Moreira, Manuel Marques, João Paulo Costeira, Alexander Hauptmann

Comments: To appear at the IEEE International Conference on Robotics and Automation (ICRA), 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[116] arXiv:2405.10951 [pdf, other]: Title: Block Selective Reprogramming for On-device Training of Vision Transformers

Authors: Sreetama Sarkar, Souvik Kundu, Kai Zheng, Peter A. Beerel

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[117] arXiv:2405.10949 [pdf, other]: Title: Global License Plate Dataset

Authors: Siddharth Agrawal

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[118] arXiv:2405.10948 [pdf, other]: Title: Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery

Authors: Guankun Wang, Long Bai, Wan Jun Nah, Jie Wang, Zhaoxi Zhang, Zhen Chen, Jinlin Wu, Mobarakol Islam, Hongbin Liu, Hongliang Ren

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO); Image and Video Processing (eess.IV)
[119] arXiv:2405.10947 [pdf, other]: Title: Depth-aware Panoptic Segmentation

Authors: Tuan Nguyen, Max Mehltretter, Franz Rottensteiner

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[120] arXiv:2405.10946 [pdf, other]: Title: Application of Tensorized Neural Networks for Cloud Classification

Authors: Alifu Xiafukaiti, Devanshu Garg, Aruto Hosaka, Koichi Yanagisawa, Yuichiro Minato, Tsuyoshi Yoshida

Comments: 11 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Atmospheric and Oceanic Physics (physics.ao-ph)
[121] arXiv:2405.12171 (cross-list from cs.SE) [pdf, other]: Title: State of the Practice for Medical Imaging Software

Authors: W. Spencer Smith, Ao Dong, Jacques Carette, Michael D. Noseworthy

Comments: 73 pages, 14 figures, 12 tables

Subjects: Software Engineering (cs.SE); Computer Vision and Pattern Recognition (cs.CV)
[122] arXiv:2405.11880 (cross-list from cs.LG) [pdf, other]: Title: Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs

Authors: Siyu Lou, Yuntian Chen, Xiaodan Liang, Liang Lin, Quanshi Zhang

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[123] arXiv:2405.11829 (cross-list from cs.LG) [pdf, other]: Title: Adversarially Diversified Rehearsal Memory (ADRM): Mitigating Memory Overfitting Challenge in Continual Learning

Authors: Hikmat Khan, Ghulam Rasool, Nidhal Carla Bouaynaya

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[124] arXiv:2405.11708 (cross-list from cs.LG) [pdf, other]: Title: Adaptive Batch Normalization Networks for Adversarial Robustness

Authors: Shao-Yuan Lo, Vishal M. Patel

Comments: Accepted at IEEE International Conference on Advanced Video and Signal-based Surveillance (AVSS) 2024

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[125] arXiv:2405.11659 (cross-list from cs.RO) [pdf, other]: Title: Auto-Platoon: Freight by example

Authors: Tharun V. Puthanveettil, Abhijay Singh, Yashveer Jain, Vinay Bukka, Sameer Arjun S

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[126] arXiv:2405.11640 (cross-list from cs.AI) [pdf, other]: Title: Inquire, Interact, and Integrate: A Proactive Agent Collaborative Framework for Zero-Shot Multimodal Medical Reasoning

Authors: Zishan Gu, Fenglin Liu, Changchang Yin, Ping Zhang

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2405.11598 (cross-list from eess.IV) [pdf, other]: Title: AI-Assisted Diagnosis for Covid-19 CXR Screening: From Data Collection to Clinical Validation

Authors: Carlo Alberto Barbano, Riccardo Renzulli, Marco Grosso, Domenico Basile, Marco Busso, Marco Grangetto

Comments: Accepted at 21st IEEE International Symposium on Biomedical Imaging (ISBI)

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[128] arXiv:2405.11533 (cross-list from cs.LG) [pdf, other]: Title: Hierarchical Selective Classification

Authors: Shani Goren, Ido Galil, Ran El-Yaniv

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[129] arXiv:2405.11492 (cross-list from cs.RO) [pdf, other]: Title: Enhancing Vehicle Aerodynamics with Deep Reinforcement Learning in Voxelised Models

Authors: Jignesh Patel, Yannis Spyridis, Vasileios Argyriou

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[130] arXiv:2405.11386 (cross-list from eess.IV) [pdf, other]: Title: Liver Fat Quantification Network with Body Shape

Authors: Qiyue Wang, Wu Xue, Xiaoke Zhang, Fang Jin, James Hahn

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[131] arXiv:2405.11326 (cross-list from cs.LG) [pdf, other]: Title: On the Trajectory Regularity of ODE-based Diffusion Sampling

Authors: Defang Chen, Zhenyu Zhou, Can Wang, Chunhua Shen, Siwei Lyu

Comments: ICML 2024, 30 pages

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[132] arXiv:2405.11320 (cross-list from cs.LG) [pdf, other]: Title: Sampling Strategies for Mitigating Bias in Face Synthesis Methods

Authors: Emmanouil Maragkoudakis, Symeon Papadopoulos, Iraklis Varlamis, Christos Diou

Comments: Accepted to the BIAS 2023 ECML-PKDD Workshop

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[133] arXiv:2405.11301 (cross-list from cs.CL) [pdf, other]: Title: Enhancing Fine-Grained Image Classifications via Cascaded Vision Language Models

Authors: Canshi Wei

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[134] arXiv:2405.11298 (cross-list from cs.RO) [pdf, other]: Title: Visual Episodic Memory-based Exploration

Authors: Jack Vice, Natalie Ruiz-Sanchez, Pamela K. Douglas, Gita Sukthankar

Comments: FLAIRS 2023, 7 pages, 11 figures

Journal-ref: The International FLAIRS Conference Proceedings. Vol. 36. 2023

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[135] arXiv:2405.11295 (cross-list from eess.IV) [pdf, ps, other]: Title: Medical Image Analysis for Detection, Treatment and Planning of Disease using Artificial Intelligence Approaches

Authors: Nand Lal Yadav, Satyendra Singh, Rajesh Kumar, Sudhakar Singh

Comments: 10 pages, 3 figures

Journal-ref: International Journal of Microsystems and IoT, Vol. 1, Issue 5, pp. 278-287, 2023

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[136] arXiv:2405.11289 (cross-list from eess.IV) [pdf, other]: Title: Diffusion Model Driven Test-Time Image Adaptation for Robust Skin Lesion Classification

Authors: Ming Hu, Siyuan Yan, Peng Xia, Feilong Tang, Wenxue Li, Peibo Duan, Lin Zhang, Zongyuan Ge

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[137] arXiv:2405.11273 (cross-list from cs.AI) [pdf, other]: Title: Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts

Authors: Yunxin Li, Shenyuan Jiang, Baotian Hu, Longyue Wang, Wanqi Zhong, Wenhan Luo, Lin Ma, Min Zhang

Comments: 22 pages, 13 figures. Project Website: this https URL. Work in progress

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[138] arXiv:2405.11176 (cross-list from cs.RO) [pdf, other]: Title: Outlier-Robust Long-Term Robotic Mapping Leveraging Ground Segmentation

Authors: Hyungtae Lim

Comments: 2 pages, 4 figures

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[139] arXiv:2405.11133 (cross-list from eess.IV) [pdf, ps, other]: Title: XCAT-2.0: A Comprehensive Library of Personalized Digital Twins Derived from CT Scans

Authors: Lavsen Dahal, Mobina Ghojoghnejad, Dhrubajyoti Ghosh, Yubraj Bhandari, David Kim, Fong Chi Ho, Fakrul Islam Tushar, Ehsan Abadi, Ehsan Samei, Joseph Lo, Paul Segars

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[140] arXiv:2405.11064 (cross-list from eess.SP) [pdf, other]: Title: TVCondNet: A Conditional Denoising Neural Network for NMR Spectroscopy

Authors: Zihao Zou, Shirin Shoushtari, Jiaming Liu, Jialiang Zhang, Patrick Judge, Emilia Santana, Alison Lim, Marcus Foston, Ulugbek S. Kamilov

Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV)
[141] arXiv:2405.11029 (cross-list from cs.LG) [pdf, other]: Title: Generative Artificial Intelligence: A Systematic Review and Applications

Authors: Sandeep Singh Sengar, Affan Bin Hasan, Sanjay Kumar, Fiona Carroll

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[142] arXiv:2405.10950 (cross-list from eess.IV) [pdf, ps, other]: Title: Classification of colorectal primer carcinoma from normal colon with mid-infrared spectra

Authors: B. Borkovits, E. Kontsek, A. Pesti, P. Gordon, S. Gergely, I. Csabai, A. Kiss, P. Pollner

Comments: 15 pages, 5 figures, 4 tables, Conferentia Chemometrica 2023 special edition, for the original digital location, see this https URL, digital biblio info: (2024) e3542

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Tissues and Organs (q-bio.TO)

Mon, 20 May 2024

[143] arXiv:2405.10934 [pdf, other]: Title: Reconstruction of Manipulated Garment with Guided Deformation Prior

Authors: Ren Li, Corentin Dumery, Zhantao Deng, Pascal Fua

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[144] arXiv:2405.10913 [pdf, other]: Title: Blackbox Adaptation for Medical Image Segmentation

Authors: Jay N. Paranjape, Shameema Sikder, S. Swaroop Vedula, Vishal M. Patel

Comments: Accepted early at MICCAI 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[145] arXiv:2405.10885 [pdf, other]: Title: FA-Depth: Toward Fast and Accurate Self-supervised Monocular Depth Estimation

Authors: Fei Wang, Jun Cheng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[146] arXiv:2405.10879 [pdf, other]: Title: One registration is worth two segmentations

Authors: Shiqi Huang, Tingfa Xu, Ziyi Shen, Shaheer Ullah Saeed, Wen Yan, Dean Barratt, Yipeng Hu

Comments: Early Accepted by MICCAI2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147] arXiv:2405.10871 [pdf, other]: Title: BraTS-Path Challenge: Assessing Heterogeneous Histopathologic Brain Tumor Sub-regions

Authors: Spyridon Bakas, Siddhesh P. Thakur, Shahriar Faghani, Mana Moassefi, Ujjwal Baid, Verena Chung, Sarthak Pati, Shubham Innani, Bhakti Baheti, Jake Albrecht, Alexandros Karargyris, Hasan Kassem, MacLean P. Nasrallah, Jared T. Ahrendsen, Valeria Barresi, Maria A. Gubbiotti, Giselle Y. López, Calixto-Hope G. Lucas, Michael L. Miller, Lee A. D. Cooper, Jason T. Huse, William R. Bell

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148] arXiv:2405.10868 [pdf, other]: Title: Air Signing and Privacy-Preserving Signature Verification for Digital Documents

Authors: P. Sarveswarasarma, T. Sathulakjan, V. J. V. Godfrey, Thanuja D. Ambegoda

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[149] arXiv:2405.10864 [pdf, other]: Title: Improving face generation quality and prompt following with synthetic captions

Authors: Michail Tarasiou, Stylianos Moschoglou, Jiankang Deng, Stefanos Zafeiriou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[150] arXiv:2405.10842 [pdf, ps, other]: Title: Automated Radiology Report Generation: A Review of Recent Advances

Authors: Phillip Sloan, Philip Clatworthy, Edwin Simpson, Majid Mirmehdi

Comments: 24 pages, 8 figures, 6 tables. Submitted to IEEE Reviews in Biomedical Engineering

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[151] arXiv:2405.10832 [pdf, other]: Title: Open-Vocabulary Spatio-Temporal Action Detection

Authors: Tao Wu, Shuqiu Ge, Jie Qin, Gangshan Wu, Limin Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152] arXiv:2405.10802 [pdf, other]: Title: Reduced storage direct tensor ring decomposition for convolutional neural networks compression

Authors: Mateusz Gabor, Rafał Zdunek

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[153] arXiv:2405.10748 [pdf, other]: Title: Deep Data Consistency: a Fast and Robust Diffusion Model-based Solver for Inverse Problems

Authors: Hanyu Chen, Zhixiu Hao, Liying Xiao

Comments: Codes: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[154] arXiv:2405.10739 [pdf, other]: Title: Efficient Multimodal Large Language Models: A Survey

Authors: Yizhang Jin, Jian Li, Yexin Liu, Tianjun Gu, Kai Wu, Zhengkai Jiang, Muyang He, Bo Zhao, Xin Tan, Zhenye Gan, Yabiao Wang, Chengjie Wang, Lizhuang Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[155] arXiv:2405.10736 [pdf, other]: Title: StackOverflowVQA: Stack Overflow Visual Question Answering Dataset

Authors: Motahhare Mirzaei, Mohammad Javad Pirhadi, Sauleh Eetemadi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[156] arXiv:2405.10718 [pdf, other]: Title: SignLLM: Sign Languages Production Large Language Models

Authors: Sen Fang, Lei Wang, Ce Zheng, Yapeng Tian, Chen Chen

Comments: 33 pages, website at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[157] arXiv:2405.10707 [pdf, ps, other]: Title: HARIS: Human-Like Attention for Reference Image Segmentation

Authors: Mengxi Zhang, Heqing Lian, Yiming Liu, Kang Rong, Jie Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[158] arXiv:2405.10696 [pdf, other]: Title: Autonomous AI-enabled Industrial Sorting Pipeline for Advanced Textile Recycling

Authors: Yannis Spyridis, Vasileios Argyriou, Antonios Sarigiannidis, Panagiotis Radoglou, Panagiotis Sarigiannidis

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159] arXiv:2405.10690 [pdf, other]: Title: CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing

Authors: Faegheh Sardari, Armin Mustafa, Philip J. B. Jackson, Adrian Hilton

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[160] arXiv:2405.10674 [pdf, other]: Title: From Sora What We Can See: A Survey of Text-to-Video Generation

Authors: Rui Sun, Yumin Zhang, Tejal Shah, Jiahao Sun, Shuoying Zhang, Wenqi Li, Haoran Duan, Bo Wei, Rajiv Ranjan

Comments: A comprehensive list of text-to-video generation studies in this survey is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[161] arXiv:2405.10612 [pdf, other]: Title: Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers

Authors: Sheng Yang, Jiawang Bai, Kuofeng Gao, Yong Yang, Yiming Li, Shu-tao Xia

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[162] arXiv:2405.10610 [pdf, other]: Title: Driving Referring Video Object Segmentation with Vision-Language Pre-trained Models

Authors: Zikun Zhou, Wentao Xiong, Li Zhou, Xin Li, Zhenyu He, Yaowei Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[163] arXiv:2405.10598 [pdf, other]: Title: Learning Object-Centric Representation via Reverse Hierarchy Guidance

Authors: Junhong Zou, Xiangyu Zhu, Zhaoxiang Zhang, Zhen Lei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164] arXiv:2405.10591 [pdf, other]: Title: GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision

Authors: Xin Tan, Wenbin Wu, Zhiwei Zhang, Chaojie Fan, Yong Peng, Zhizhong Zhang, Yuan Xie, Lizhuang Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[165] arXiv:2405.10589 [pdf, other]: Title: Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance

Authors: I-Hsiang Chen, Wei-Ting Chen, Yu-Wei Liu, Ming-Hsuan Yang, Sy-Yen Kuo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[166] arXiv:2405.10577 [pdf, other]: Title: DuoSpaceNet: Leveraging Both Bird's-Eye-View and Perspective View Representations for 3D Object Detection

Authors: Zhe Huang, Yizhe Zhao, Hao Xiao, Chenyan Wu, Lingting Ge

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[167] arXiv:2405.10575 [pdf, other]: Title: Accurate Training Data for Occupancy Map Prediction in Automated Driving Using Evidence Theory

Authors: Jonas Kälble, Sascha Wirges, Maxim Tatarchenko, Eddy Ilg

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168] arXiv:2405.10567 [pdf, other]: Title: Team Samsung-RAL: Technical Report for 2024 RoboDrive Challenge-Robust Map Segmentation Track

Authors: Xiaoshuai Hao, Yifan Yang, Hui Zhang, Mengchuan Wei, Yi Zhou, Haimei Zhao, Jing Zhang

Comments: ICRA 2024 RoboDrive Challenge Robust Map Segmentation Track 3rd Place Technical Report. arXiv admin note: text overlap with arXiv:2205.09743 by other authors

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[169] arXiv:2405.10557 [pdf, other]: Title: Resolving Symmetry Ambiguity in Correspondence-based Methods for Instance-level Object Pose Estimation

Authors: Yongliang Lin, Yongzhi Su, Sandeep Inuganti, Yan Di, Naeem Ajilforoushan, Hanqing Yang, Yu Zhang, Jason Rambach

Comments: 8 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[170] arXiv:2405.10554 [pdf, other]: Title: NeRO: Neural Road Surface Reconstruction

Authors: Ruibo Wang, Song Zhang, Ping Huang, Donghai Zhang, Haoyu Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171] arXiv:2405.10530 [pdf, other]: Title: CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation

Authors: Mushui Liu, Jun Dan, Ziqian Lu, Yunlong Yu, Yingming Li, Xi Li

Comments: 5 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172] arXiv:2405.10529 [pdf, other]: Title: Safeguarding Vision-Language Models Against Patched Visual Prompt Injectors

Authors: Jiachen Sun, Changsheng Wang, Jiongxiao Wang, Yiwei Zhang, Chaowei Xiao

Comments: 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[173] arXiv:2405.10518 [pdf, ps, other]: Title: Enhancing Perception Quality in Remote Sensing Image Compression via Invertible Neural Network

Authors: Junhui Li, Xingsong Hou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[174] arXiv:2405.10508 [pdf, other]: Title: ART3D: 3D Gaussian Splatting for Text-Guided Artistic Scenes Generation

Authors: Pengzhi Li, Chengshuai Tang, Qinxuan Huang, Zhiheng Li

Comments: Accepted at CVPR 2024 Workshop on AI3DG

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175] arXiv:2405.10504 [pdf, ps, other]: Title: Multi-scale Semantic Prior Features Guided Deep Neural Network for Urban Street-view Image

Authors: Jianshun Zeng, Wang Li, Yanjie Lv, Shuai Gao, YuChu Qin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176] arXiv:2405.10489 [pdf, other]: Title: MixCut: A Data Augmentation Method for Facial Expression Recognition

Authors: Jiaxiang Yu, Yiyang Liu, Ruiyang Fan, Guobing Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177] arXiv:2405.10456 [pdf, other]: Title: Region-level labels in ice charts can produce pixel-level segmentation for Sea Ice types

Authors: Muhammed Patel, Xinwei Chen, Linlin Xu, Yuhao Chen, K Andrea Scott, David A. Clausi

Comments: Published at ICLR 2024 Machine Learning for Remote Sensing (ML4RS) Workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178] arXiv:2405.10444 [pdf, other]: Title: A Novel Bounding Box Regression Method for Single Object Tracking

Authors: Omar Abdelaziz, Mohamed Sami Shehata

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[179] arXiv:2405.10439 [pdf, other]: Title: Beyond Traditional Single Object Tracking: A Survey

Authors: Omar Abdelaziz, Mohamed Shehata, Mohamed Mohamed

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[180] arXiv:2405.10423 [pdf, other]: Title: Diversity-Aware Sign Language Production through a Pose Encoding Variational Autoencoder

Authors: Mohamed Ilyes Lakhal, Richard Bowden

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181] arXiv:2405.10398 [pdf, other]: Title: Drone-type-Set: Drone types detection benchmark for drone detection and tracking

Authors: Kholoud AlDosari, AIbtisam Osman, Omar Elharrouss, Somaya AlMaadeed, Mohamed Zied Chaari

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[182] arXiv:2405.10370 [pdf, other]: Title: Grounded 3D-LLM with Referent Tokens

Authors: Yilun Chen, Shuai Yang, Haifeng Huang, Tai Wang, Ruiyuan Lyu, Runsen Xu, Dahua Lin, Jiangmiao Pang

Comments: Preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[183] arXiv:2405.10357 [pdf, other]: Title: RGB Guided ToF Imaging System: A Survey of Deep Learning-based Methods

Authors: Xin Qiao, Matteo Poggi, Pengchao Deng, Hao Wei, Chenyang Ge, Stefano Mattoccia

Comments: To appear in the International Journal of Computer Vision (IJCV)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184] arXiv:2405.10347 [pdf, other]: Title: Networking Systems for Video Anomaly Detection: A Tutorial and Survey

Authors: Jing Liu, Yang Liu, Jieyu Lin, Jielin Li, Peng Sun, Bo Hu, Liang Song, Azzedine Boukerche, Victor C.M. Leung

Comments: Submitted to ACM Computing Surveys, under review. For more information and supplementary material, please see this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
[185] arXiv:2405.10939 (cross-list from cs.LG) [pdf, other]: Title: DINO as a von Mises-Fisher mixture model

Authors: Hariprasath Govindarajan, Per Sidén, Jacob Roll, Fredrik Lindsten

Comments: Accepted to ICLR 2023

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[186] arXiv:2405.10870 (cross-list from eess.IV) [pdf, other]: Title: Multicenter Privacy-Preserving Model Training for Deep Learning Brain Metastases Autosegmentation

Authors: Yixing Huang, Zahra Khodabakhshi, Ahmed Gomaa, Manuel Schmidt, Rainer Fietkau, Matthias Guckenberger, Nicolaus Andratschke, Christoph Bert, Stephanie Tanadini-Lang, Florian Putz

Comments: Submission to the Green Journal (Major Revision)

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[187] arXiv:2405.10833 (cross-list from eess.IV) [pdf, other]: Title: Automatic segmentation of Organs at Risk in Head and Neck cancer patients from CT and MRI scans

Authors: Sébastien Quetin, Andrew Heschl, Mauricio Murillo, Murali Rohit, Shirin A. Enger, Farhad Maleki

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[188] arXiv:2405.10803 (cross-list from eess.IV) [pdf, other]: Title: A Large-scale Multi Domain Leukemia Dataset for the White Blood Cells Detection with Morphological Attributes for Explainability

Authors: Abdul Rehman, Talha Meraj, Aiman Mahmood Minhas, Ayisha Imran, Mohsen Ali, Waqas Sultani

Comments: Early Accept

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[189] arXiv:2405.10754 (cross-list from math.OC) [pdf, other]: Title: Stable Phase Retrieval with Mirror Descent

Authors: Jean-Jacques Godeme, Jalal Fadili, Claude Amra, Myriam Zerrad

Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)
[190] arXiv:2405.10723 (cross-list from eess.IV) [pdf, other]: Title: Eddeep: Fast eddy-current distortion correction for diffusion MRI with deep learning

Authors: Antoine Legouhy, Ross Callaghan, Whitney Stee, Philippe Peigneux, Hojjat Azadbakht, Hui Zhang

Comments: submitted to MICCAI 2024

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[191] arXiv:2405.10705 (cross-list from eess.IV) [pdf, other]: Title: 3D Vessel Reconstruction from Sparse-View Dynamic DSA Images via Vessel Probability Guided Attenuation Learning

Authors: Zhentao Liu, Huangxuan Zhao, Wenhui Qin, Zhenghong Zhou, Xinggang Wang, Wenping Wang, Xiaochun Lai, Chuansheng Zheng, Dinggang Shen, Zhiming Cui

Comments: 12 pages, 13 figures, 5 tables

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[192] arXiv:2405.10702 (cross-list from cs.CL) [pdf, ps, other]: Title: Empowering Prior to Court Legal Analysis: A Transparent and Accessible Dataset for Defensive Statement Classification and Interpretation

Authors: Yannis Spyridis, Jean-Paul, Haneen Deeb, Vasileios Argyriou

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[193] arXiv:2405.10691 (cross-list from eess.IV) [pdf, other]: Title: LoCI-DiffCom: Longitudinal Consistency-Informed Diffusion Model for 3D Infant Brain Image Completion

Authors: Zihao Zhu, Tianli Tao, Yitian Tao, Haowen Deng, Xinyi Cai, Gaofeng Wu, Kaidong Wang, Haifeng Tang, Lixuan Zhu, Zhuoyang Gu, Jiawei Huang, Dinggang Shen, Han Zhang

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[194] arXiv:2405.10561 (cross-list from eess.IV) [pdf, other]: Title: Infrared Image Super-Resolution via Lightweight Information Split Network

Authors: Shijie Liu, Kang Yan, Feiwei Qin, Changmiao Wang, Ruiquan Ge, Kai Zhang, Jie Huang, Yong Peng, Jin Cao

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[195] arXiv:2405.10550 (cross-list from eess.IV) [pdf, other]: Title: LighTDiff: Surgical Endoscopic Image Low-Light Enhancement with T-Diffusion

Authors: Tong Chen, Qingcheng Lyu, Long Bai, Erjian Guo, Huxin Gao, Xiaoxiao Yang, Hongliang Ren, Luping Zhou

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[196] arXiv:2405.10531 (cross-list from cs.LG) [pdf, other]: Title: Nonparametric Teaching of Implicit Neural Representations

Authors: Chen Zhang, Steven Tin Sui Luo, Jason Chun Lok Li, Yik-Chung Wu, Ngai Wong

Comments: ICML 2024 (24 pages, 13 figures)

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[197] arXiv:2405.10497 (cross-list from cs.MM) [pdf, other]: Title: SMP Challenge: An Overview and Analysis of Social Media Prediction Challenge

Authors: Bo Wu, Peiye Liu, Wen-Huang Cheng, Bei Liu, Zhaoyang Zeng, Jia Wang, Qiushi Huang, Jiebo Luo

Comments: ACM Multimedia. arXiv admin note: text overlap with arXiv:1910.01795

Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Social and Information Networks (cs.SI)

Fri, 17 May 2024

[198] arXiv:2405.10320 [pdf, other]: Title: Toon3D: Seeing Cartoons from a New Perspective

Authors: Ethan Weber, Riley Peterlinz, Rohan Mathur, Frederik Warburg, Alexei A. Efros, Angjoo Kanazawa

Comments: Please see our project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[199] arXiv:2405.10317 [pdf, other]: Title: Text-to-Vector Generation with Neural Path Representation

Authors: Peiying Zhang, Nanxuan Zhao, Jing Liao

Comments: Accepted by SIGGRAPH 2024. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[200] arXiv:2405.10316 [pdf, other]: Title: Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model

Authors: Zheng Gu, Shiyuan Yang, Jing Liao, Jing Huo, Yang Gao

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[201] arXiv:2405.10314 [pdf, other]: Title: CAT3D: Create Anything in 3D with Multi-View Diffusion Models

Authors: Ruiqi Gao, Aleksander Holynski, Philipp Henzler, Arthur Brussee, Ricardo Martin-Brualla, Pratul Srinivasan, Jonathan T. Barron, Ben Poole

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[202] arXiv:2405.10305 [pdf, other]: Title: 4D Panoptic Scene Graph Generation

Authors: Jingkang Yang, Jun Cen, Wenxuan Peng, Shuai Liu, Fangzhou Hong, Xiangtai Li, Kaiyang Zhou, Qifeng Chen, Ziwei Liu

Comments: Accepted at NeurIPS 2023. Code: this https URL. Previous Series: PSG this https URL and PVSG this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[203] arXiv:2405.10300 [pdf, other]: Title: Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

Authors: Tianhe Ren, Qing Jiang, Shilong Liu, Zhaoyang Zeng, Wenlong Liu, Han Gao, Hongjie Huang, Zhengyu Ma, Xiaoke Jiang, Yihao Chen, Yuda Xiong, Hao Zhang, Feng Li, Peijun Tang, Kent Yu, Lei Zhang

Comments: Technical report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[204] arXiv:2405.10286 [pdf, other]: Title: FFF: Fixing Flawed Foundations in contrastive pre-training results in very strong Vision-Language models

Authors: Adrian Bulat, Yassine Ouali, Georgios Tzimiropoulos

Comments: Accepted at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[205] arXiv:2405.10272 [pdf, other]: Title: Faces that Speak: Jointly Synthesising Talking Face and Speech from Text

Authors: Youngjoon Jang, Ji-Hoon Kim, Junseok Ahn, Doyeop Kwak, Hong-Sun Yang, Yoon-Cheol Ju, Il-Hwan Kim, Byeong-Yeol Kim, Joon Son Chung

Comments: CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[206] arXiv:2405.10266 [pdf, other]: Title: A Tale of Two Languages: Large-Vocabulary Continuous Sign Language Recognition from Spoken Language Supervision

Authors: Charles Raude, K R Prajwal, Liliane Momeni, Hannah Bull, Samuel Albanie, Andrew Zisserman, Gül Varol

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[207] arXiv:2405.10256 [pdf, other]: Title: Biasing & Debiasing based Approach Towards Fair Knowledge Transfer for Equitable Skin Analysis

Authors: Anshul Pundhir, Balasubramanian Raman, Pravendra Singh

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[208] arXiv:2405.10255 [pdf, other]: Title: When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models

Authors: Xianzheng Ma, Yash Bhalgat, Brandon Smart, Shuai Chen, Xinghui Li, Jian Ding, Jindong Gu, Dave Zhenyu Chen, Songyou Peng, Jia-Wang Bian, Philip H Torr, Marc Pollefeys, Matthias Nießner, Ian D Reid, Angel X. Chang, Iro Laina, Victor Adrian Prisacariu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[209] arXiv:2405.10244 [pdf, ps, other]: Title: Towards Task-Compatible Compressible Representations

Authors: Anderson de Andrade, Ivan Bajić

Comments: To be published in ICME Workshops 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[210] arXiv:2405.10185 [pdf, other]: Title: DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data

Authors: Chengxiang Fan, Muzhi Zhu, Hao Chen, Yang Liu, Weijia Wu, Huaqi Zhang, Chunhua Shen

Comments: Accepted to CVPR 2024; code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[211] arXiv:2405.10175 [pdf, other]: Title: Filling Missing Values Matters for Range Image-Based Point Cloud Segmentation

Authors: Bike Chen, Chen Gong, Juha Röning

Comments: This paper has been submitted to a journal

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[212] arXiv:2405.10160 [pdf, other]: Title: PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning

Authors: Jiancheng Pan, Muyuan Ma, Qing Ma, Cong Bai, Shengyong Chen

Comments: 15 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[213] arXiv:2405.10148 [pdf, other]: Title: SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network

Authors: Zhaoxu Li, Wei An, Gaowei Guo, Longguang Wang, Yingqian Wang, Zaiping Lin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[214] arXiv:2405.10140 [pdf, other]: Title: Libra: Building Decoupled Vision System on Large Language Models

Authors: Yifan Xu, Xiaoshan Yang, Yaguang Song, Changsheng Xu

Comments: ICML2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[215] arXiv:2405.10132 [pdf, other]: Title: Cooperative Visual-LiDAR Extrinsic Calibration Technology for Intersection Vehicle-Infrastructure: A review

Authors: Xinyu Zhang, Yijin Xiong, Qianxin Qu, Renjie Wang, Xin Gao, Jing Liu, Shichun Guo, Jun Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[216] arXiv:2405.10122 [pdf, other]: Title: Generating Coherent Sequences of Visual Illustrations for Real-World Manual Tasks

Authors: João Bordalo, Vasco Ramos, Rodrigo Valério, Diogo Glória-Silva, Yonatan Bitton, Michal Yarom, Idan Szpektor, Joao Magalhaes

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[217] arXiv:2405.10082 [pdf, other]: Title: An Integrated Framework for Multi-Granular Explanation of Video Summarization

Authors: Konstantinos Tsigos, Evlampios Apostolidis, Vasileios Mezaris

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[218] arXiv:2405.10075 [pdf, other]: Title: HecVL: Hierarchical Video-Language Pretraining for Zero-shot Surgical Phase Recognition

Authors: Kun Yuan, Vinkle Srivastav, Nassir Navab, Nicolas Padoy

Comments: Accepted by MICCAI2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[219] arXiv:2405.10053 [pdf, other]: Title: SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection

Authors: Mingxuan Liu, Tyler L. Hayes, Elisa Ricci, Gabriela Csurka, Riccardo Volpi

Comments: Accepted as a conference paper (highlight) at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220] arXiv:2405.10046 [pdf, other]: Title: A Preprocessing and Postprocessing Voxel-based Method for LiDAR Semantic Segmentation Improvement in Long Distance

Authors: Andrea Matteazzi, Pascal Colling, Michael Arnold, Dietmar Tutsch

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[221] arXiv:2405.10041 [pdf, other]: Title: Revealing Hierarchical Structure of Leaf Venations in Plant Science via Label-Efficient Segmentation: Dataset and Method

Authors: Weizhen Liu, Ao Li, Ze Wu, Yue Li, Baobin Ge, Guangyu Lan, Shilin Chen, Minghe Li, Yunfei Liu, Xiaohui Yuan, Nanqing Dong

Comments: Accepted by IJCAI2024, Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[222] arXiv:2405.10037 [pdf, other]: Title: Bilateral Event Mining and Complementary for Event Stream Super-Resolution

Authors: Zhilin Huang, Quanmin Liang, Yijie Yu, Chujun Qin, Xiawu Zheng, Kai Huang, Zikun Zhou, Wenming Yang

Comments: Accepted to CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[223] arXiv:2405.10030 [pdf, other]: Title: RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing

Authors: Huiling Zhou, Xianhao Wu, Hongming Chen, Xiang Chen, Xin He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[224] arXiv:2405.10014 [pdf, other]: Title: Frequency-Domain Refinement with Multiscale Diffusion for Super Resolution

Authors: Xingjian Wang, Li Chai, Jiming Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[225] arXiv:2405.10008 [pdf, other]: Title: Solving the enigma: Deriving optimal explanations of deep networks

Authors: Michail Mamalakis, Antonios Mamalakis, Ingrid Agartz, Lynn Egeland Mørch-Johnsen, Graham Murray, John Suckling, Pietro Lio

Comments: keywords: XAI, neuroscience, brain, 3D, 2D, computer vision, classification

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[226] arXiv:2405.09996 [pdf, other]: Title: Driving-Video Dehazing with Non-Aligned Regularization for Safety Assistance

Authors: Junkai Fan, Jiangwei Weng, Kun Wang, Yijun Yang, Jianjun Qian, Jun Li, Jian Yang

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[227] arXiv:2405.09985 [pdf, other]: Title: VirtualModel: Generating Object-ID-retentive Human-object Interaction Image by Diffusion Model for E-commerce Marketing

Authors: Binghui Chen, Chongyang Zhong, Wangmeng Xiang, Yifeng Geng, Xuansong Xie

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[228] arXiv:2405.09981 [pdf, other]: Title: Adversarial Robustness for Visual Grounding of Multimodal Large Language Models

Authors: Kuofeng Gao, Yang Bai, Jiawang Bai, Yong Yang, Shu-Tao Xia

Comments: ICLR 2024 Workshop on Reliable and Responsible Foundation Models

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[229] arXiv:2405.09976 [pdf, other]: Title: Language-Oriented Semantic Latent Representation for Image Transmission

Authors: Giordano Cicchetti, Eleonora Grassucci, Jihong Park, Jinho Choi, Sergio Barbarossa, Danilo Comminiello

Comments: Under review at IEEE International Workshop on Machine Learning for Signal Processing (MLSP) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[230] arXiv:2405.09964 [pdf, other]: Title: KPNDepth: Depth Estimation of Lane Images under Complex Rainy Environment

Authors: Zhengxu Shi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[231] arXiv:2405.09955 [pdf, other]: Title: Dual-band feature selection for maturity classification of specialty crops by hyperspectral imaging

Authors: Usman A. Zahidi, Krystian Łukasik, Grzegorz Cielniak

Comments: Preprint: Paper submitted to the special issue of "Computers and Electronics in Agriculture"

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[232] arXiv:2405.09942 [pdf, other]: Title: FPDIoU Loss: A Loss Function for Efficient Bounding Box Regression of Rotated Object Detection

Authors: Siliang Ma, Yong Xu

Comments: arXiv admin note: text overlap with arXiv:2307.07662, text overlap with arXiv:1902.09630 by other authors

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[233] arXiv:2405.09934 [pdf, other]: Title: Detecting Domain Shift in Multiple Instance Learning for Digital Pathology Using Fréchet Domain Distance

Authors: Milda Pocevičiūtė, Gabriel Eilertsen, Stina Garvin, Claes Lundström

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[234] arXiv:2405.09933 [pdf, other]: Title: MiniMaxAD: A Lightweight Autoencoder for Feature-Rich Anomaly Detection

Authors: Fengjie Wang, Chengming Liu, Lei Shi, Pang Haibo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[235] arXiv:2405.09931 [pdf, other]: Title: Learning from Observer Gaze: Zero-Shot Attention Prediction Oriented by Human-Object Interaction Recognition

Authors: Yuchen Zhou, Linkai Liu, Chao Gou

Comments: Accepted by CVPR2024. Project HomePage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[236] arXiv:2405.09924 [pdf, other]: Title: Infrared Adversarial Car Stickers

Authors: Xiaopei Zhu, Yuqiu Liu, Zhanhao Hu, Jianmin Li, Xiaolin Hu

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[237] arXiv:2405.09923 [pdf, other]: Title: NTIRE 2024 Restore Any Image Model (RAIM) in the Wild Challenge

Authors: Jie Liang, Radu Timofte, Qiaosi Yi, Shuaizheng Liu, Lingchen Sun, Rongyuan Wu, Xindong Zhang, Hui Zeng, Lei Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[238] arXiv:2405.09922 [pdf, other]: Title: Cross-sensor self-supervised training and alignment for remote sensing

Authors: Valerio Marsocci (CEDRIC - VERTIGO, CNAM), Nicolas Audebert (CEDRIC - VERTIGO, CNAM, LaSTIG, IGN)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[239] arXiv:2405.09902 [pdf, other]: Title: Unveiling the Potential: Harnessing Deep Metric Learning to Circumvent Video Streaming Encryption

Authors: Arwin Gansekoele, Tycho Bot, Rob van der Mei, Sandjai Bhulai, Mark Hoogendoorn

Comments: Published in the WI-IAT 2023 proceedings

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
[240] arXiv:2405.09883 [pdf, other]: Title: RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception

Authors: Xiaosu Zhu, Hualian Sheng, Sijia Cai, Bing Deng, Shaopeng Yang, Qiao Liang, Ken Chen, Lianli Gao, Jingkuan Song, Jieping Ye

Comments: Technical report. 32 pages, 21 figures, 13 tables. this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[241] arXiv:2405.09882 [pdf, other]: Title: DiffAM: Diffusion-based Adversarial Makeup Transfer for Facial Privacy Protection

Authors: Yuhao Sun, Lingyun Yu, Hongtao Xie, Jiaming Li, Yongdong Zhang

Comments: 16 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[242] arXiv:2405.09880 [pdf, other]: Title: Deep Learning-Based Quasi-Conformal Surface Registration for Partial 3D Faces Applied to Facial Recognition

Authors: Yuchen Guo, Hanqun Cao, Lok Ming Lui

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[243] arXiv:2405.09879 [pdf, other]: Title: Generative Unlearning for Any Identity

Authors: Juwon Seo, Sung-Hoon Lee, Tae-Young Lee, Seungjun Moon, Gyeong-Moon Park

Comments: 15 pages, 17 figures, 10 tables, CVPR 2024 Poster

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[244] arXiv:2405.09874 [pdf, other]: Title: Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion

Authors: Xinyang Li, Zhangyu Lai, Linning Xu, Jianfei Guo, Liujuan Cao, Shengchuan Zhang, Bo Dai, Rongrong Ji

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[245] arXiv:2405.09873 [pdf, other]: Title: IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model

Authors: Yongsong Huang, Tomo Miyazaki, Xiaofeng Liu, Shinichiro Omachi

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[246] arXiv:2405.09863 [pdf, other]: Title: Box-Free Model Watermarks Are Prone to Black-Box Removal Attacks

Authors: Haonan An, Guang Hua, Zhiping Lin, Yuguang Fang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[247] arXiv:2405.09858 [pdf, other]: Title: Towards Realistic Incremental Scenario in Class Incremental Semantic Segmentation

Authors: Jihwan Kwak, Sungmin Cha, Taesup Moon

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[248] arXiv:2405.09828 [pdf, other]: Title: PillarNeXt: Improving the 3D detector by introducing Voxel2Pillar feature encoding and extracting multi-scale features

Authors: Xusheng Li, Chengliang Wang, Shumao Wang, Zhuo Zeng, Ji Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[249] arXiv:2405.09827 [pdf, other]: Title: Parallel Backpropagation for Shared-Feature Visualization

Authors: Alexander Lappe, Anna Bognár, Ghazaleh Ghamkhari Nejad, Albert Mukovskiy, Lucas Martini, Martin A. Giese, Rufin Vogels

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[250] arXiv:2405.09806 [pdf, other]: Title: MediSyn: Text-Guided Diffusion Models for Broad Medical 2D and 3D Image Synthesis

Authors: Joseph Cho, Cyril Zakka, Rohan Shad, Ross Wightman, Akshay Chaudhari, William Hiesinger

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[251] arXiv:2405.09789 [pdf, other]: Title: LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation

Authors: Wentao Jiang, Jing Zhang, Di Wang, Qiming Zhang, Zengmao Wang, Bo Du

Comments: Accepted by IJCAI'2024. The code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[252] arXiv:2405.09782 [pdf, other]: Title: Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection

Authors: Feiran Li, Qianqian Xu, Shilong Bao, Zhiyong Yang, Runmin Cong, Xiaochun Cao, Qingming Huang

Comments: This paper has been accepted by ICML2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253] arXiv:2405.09777 [pdf, other]: Title: Rethinking Barely-Supervised Segmentation from an Unsupervised Domain Adaptation Perspective

Authors: Zhiqiang Shen, Peng Cao, Junming Su, Jinzhu Yang, Osmar R. Zaiane

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[254] arXiv:2405.09755 [pdf, other]: Title: Collision Avoidance Metric for 3D Camera Evaluation

Authors: Vage Taamazyan, Alberto Dall'olio, Agastya Kalra

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[255] arXiv:2405.09717 [pdf, other]: Title: From NeRFs to Gaussian Splats, and Back

Authors: Siming He, Zach Osman, Pratik Chaudhari

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[256] arXiv:2405.09713 [pdf, other]: Title: SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge

Authors: Andong Wang, Bo Wu, Sunli Chen, Zhenfang Chen, Haotian Guan, Wei-Ning Lee, Li Erran Li, Chuang Gan

Comments: CVPR

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[257] arXiv:2405.09707 [pdf, other]: Title: Point2SSM++: Self-Supervised Learning of Anatomical Shape Models from Point Clouds

Authors: Jadie Adams, Shireen Elhabian

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[258] arXiv:2405.09697 [pdf, other]: Title: Weakly Supervised Bayesian Shape Modeling from Unsegmented Medical Images

Authors: Jadie Adams, Krithika Iyer, Shireen Elhabian

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[259] arXiv:2405.09682 [pdf, other]: Title: Synth-to-Real Unsupervised Domain Adaptation for Instance Segmentation

Authors: Guo Yachan, Xiao Yi, Xue Danna, Jose Luis Gomez Zurita, Antonio M. López

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[260] arXiv:2405.09588 [pdf, ps, other]: Title: Training Deep Learning Models with Hybrid Datasets for Robust Automatic Target Detection on real SAR images

Authors: Benjamin Camus, Théo Voillemin, Corentin Le Barbu, Jean-Christophe Louvigné (DGA.MI), Carole Belloni (DGA.MI), Emmanuel Vallée (DGA.MI)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Signal Processing (eess.SP)
[261] arXiv:2405.09582 [pdf, other]: Title: AD-Aligning: Emulating Human-like Generalization for Cognitive Domain Adaptation in Deep Learning

Authors: Zhuoying Li, Bohua Wan, Cong Mu, Ruzhang Zhao, Shushan Qiu, Chao Yan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[262] arXiv:2405.09550 [pdf, other]: Title: Mask-based Invisible Backdoor Attacks on Object Detection

Authors: Shin Jeong Jin

Comments: 7 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
[263] arXiv:2405.10292 (cross-list from cs.AI) [pdf, other]: Title: Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning

Authors: Yuexiang Zhai, Hao Bai, Zipeng Lin, Jiayi Pan, Shengbang Tong, Yifei Zhou, Alane Suhr, Saining Xie, Yann LeCun, Yi Ma, Sergey Levine

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[264] arXiv:2405.10262 (cross-list from cs.LG) [pdf, other]: Title: Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features

Authors: Junpeng Zhang, Qing Li, Liang Lin, Quanshi Zhang

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[265] arXiv:2405.10254 (cross-list from eess.IV) [pdf, other]: Title: PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology

Authors: George Shaikovski, Adam Casson, Kristen Severson, Eric Zimmermann, Yi Kan Wang, Jeremy D. Kunz, Juan A. Retamero, Gerard Oakley, David Klimstra, Christopher Kanan, Matthew Hanna, Michal Zelechowski, Julian Viret, Neil Tenenholtz, James Hall, Nicolo Fusi, Razik Yousfi, Peter Hamilton, William A. Moye, Eugene Vorontsov, Siqi Liu, Thomas J. Fuchs

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[266] arXiv:2405.10246 (cross-list from eess.IV) [pdf, other]: Title: A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts

Authors: Xinru Zhang, Ni Ou, Berke Doga Basaran, Marco Visentin, Mengyun Qiao, Renyang Gu, Cheng Ouyang, Yaou Liu, Paul M. Matthew, Chuyang Ye, Wenjia Bai

Comments: The work has been early accepted by MICCAI 2024

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[267] arXiv:2405.10068 (cross-list from eess.IV) [pdf, other]: Title: MrRegNet: Multi-resolution Mask Guided Convolutional Neural Network for Medical Image Registration with Large Deformations

Authors: Ruizhe Li, Grazziela Figueredo, Dorothee Auer, Christian Wagner, Xin Chen

Comments: Accepted for publication at IEEE International Symposium on Biomedical Imaging (ISBI) 2024

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[268] arXiv:2405.10020 (cross-list from cs.RO) [pdf, other]: Title: Natural Language Can Help Bridge the Sim2Real Gap

Authors: Albert Yu, Adeline Foote, Raymond Mooney, Roberto Martín-Martín

Comments: To appear in RSS 2024

Subjects: Robotics (cs.RO); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[269] arXiv:2405.10004 (cross-list from eess.IV) [pdf, other]: Title: ROCOv2: Radiology Objects in COntext Version 2, an Updated Multimodal Image Dataset

Authors: Johannes Rückert, Louise Bloch, Raphael Brüngel, Ahmad Idrissi-Yaghir, Henning Schäfer, Cynthia S. Schmidt, Sven Koitka, Obioma Pelka, Asma Ben Abacha, Alba G. Seco de Herrera, Henning Müller, Peter A. Horn, Felix Nensa, Christoph M. Friedrich

Comments: Major revision Scientific Data

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[270] arXiv:2405.09990 (cross-list from eess.IV) [pdf, other]: Title: Histopathology Foundation Models Enable Accurate Ovarian Cancer Subtype Classification

Authors: Jack Breen, Katie Allen, Kieran Zucker, Lucy Godson, Nicolas M. Orsi, Nishant Ravikumar

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[271] arXiv:2405.09959 (cross-list from eess.IV) [pdf, other]: Title: Patient-Specific Real-Time Segmentation in Trackerless Brain Ultrasound

Authors: Reuben Dorent, Erickson Torio, Nazim Haouchine, Colin Galvin, Sarah Frisken, Alexandra Golby, Tina Kapur, William Wells

Comments: Early accept at MICCAI 2024 - code available at: this https URL

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[272] arXiv:2405.09864 (cross-list from astro-ph.IM) [pdf, other]: Title: Solar multi-object multi-frame blind deconvolution with a spatially variant convolution neural emulator

Authors: A. Asensio Ramos (IAC+ULL)

Comments: 15 pages, 14 figures, accepted for publication in A&A

Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Computer Vision and Pattern Recognition (cs.CV)
[273] arXiv:2405.09851 (cross-list from eess.IV) [pdf, other]: Title: Region of Interest Detection in Melanocytic Skin Tumor Whole Slide Images -- Nevus & Melanoma

Authors: Yi Cui, Yao Li, Jayson R. Miedema, Sharon N. Edmiston, Sherif Farag, J.S. Marron, Nancy E. Thomas

Comments: 5 figures, NeurIPS 2022 Workshop

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[274] arXiv:2405.09820 (cross-list from cs.LG) [pdf, other]: Title: Densely Distilling Cumulative Knowledge for Continual Learning

Authors: Zenglin Shi, Pei Liu, Tong Su, Yunpeng Wu, Kuien Liu, Yu Song, Meng Wang

Comments: 12 pages; Continual Learning; Class-incremental Learning; Knowledge Distillation; Forgetting

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[275] arXiv:2405.09814 (cross-list from cs.GR) [pdf, other]: Title: Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis

Authors: Zeyi Zhang, Tenglong Ao, Yuyao Zhang, Qingzhe Gao, Chuan Lin, Baoquan Chen, Libin Liu

Comments: SIGGRAPH 2024 (Journal Track); Project page: this https URL

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[276] arXiv:2405.09798 (cross-list from cs.LG) [pdf, other]: Title: Many-Shot In-Context Learning in Multimodal Foundation Models

Authors: Yixing Jiang, Jeremy Irvin, Ji Hun Wang, Muhammad Ahmed Chaudhry, Jonathan H. Chen, Andrew Y. Ng

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[277] arXiv:2405.09787 (cross-list from eess.IV) [pdf, other]: Title: Analysis of the BraTS 2023 Intracranial Meningioma Segmentation Challenge

Authors: Dominic LaBella, Ujjwal Baid, Omaditya Khanna, Shan McBurney-Lin, Ryan McLean, Pierre Nedelec, Arif Rashid, Nourel Hoda Tahon, Talissa Altes, Radhika Bhalerao, Yaseen Dhemesh, Devon Godfrey, Fathi Hilal, Scott Floyd, Anastasia Janas, Anahita Fathi Kazerooni, John Kirkpatrick, Collin Kent, Florian Kofler, Kevin Leu, Nazanin Maleki, Bjoern Menze, Maxence Pajot, Zachary J. Reitman, Jeffrey D. Rudie, Rachit Saluja, Yury Velichko, Chunhao Wang, Pranav Warman, Maruf Adewole, Jake Albrecht, Udunna Anazodo, Syed Muhammad Anwar, Timothy Bergquist, Sully Francis Chen, Verena Chung, Gian-Marco Conte, Farouk Dako, James Eddy, Ivan Ezhov, Nastaran Khalili, Juan Eugenio Iglesias, Zhifan Jiang, Elaine Johanson, Koen Van Leemput, Hongwei Bran Li, Marius George Linguraru, Xinyang Liu, Aria Mahtabfar, Zeke Meier, et al. (71 additional authors not shown)

Comments: 16 pages, 11 tables, 10 figures, MICCAI

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[278] arXiv:2405.09716 (cross-list from eess.IV) [pdf, other]: Title: Illumination Histogram Consistency Metric for Quantitative Assessment of Video Sequences

Authors: Long Chen, Mobarakol Islam, Matt Clarkson, Thomas Dowrick

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[279] arXiv:2405.09711 (cross-list from cs.AI) [pdf, other]: Title: STAR: A Benchmark for Situated Reasoning in Real-World Videos

Authors: Bo Wu, Shoubin Yu, Zhenfang Chen, Joshua B Tenenbaum, Chuang Gan

Comments: NeurIPS

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[280] arXiv:2405.09695 (cross-list from cs.HC) [pdf, other]: Title: Enhancing Saliency Prediction in Monitoring Tasks: The Role of Visual Highlights

Authors: Zekun Wu, Anna Maria Feit

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
[281] arXiv:2405.09601 (cross-list from physics.med-ph) [pdf, ps, other]: Title: Fully Automated OCT-based Tissue Screening System

Authors: Shaohua Pi, Razieh Ganjee, Lingyun Wang, Riley K. Arbuckle, Chengcheng Zhao, Jose A Sahel, Bingjie Wang, Yuanyuan Chen

Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV)
[282] arXiv:2405.09600 (cross-list from cs.LG) [pdf, other]: Title: Aggregate Representation Measure for Predictive Model Reusability

Authors: Vishwesh Sangarya, Richard Bradford, Jung-Eun Kim

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[283] arXiv:2405.09594 (cross-list from eess.IV) [pdf, other]: Title: Learning Generalized Medical Image Representations through Image-Graph Contrastive Pretraining

Authors: Sameer Khanna, Daniel Michael, Marinka Zitnik, Pranav Rajpurkar

Comments: Accepted into Machine Learning for Health (ML4H) 2023

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[284] arXiv:2405.09589 (cross-list from cs.LG) [pdf, other]: Title: Unveiling Hallucination in Text, Image, Video, and Audio Foundation Models: A Comprehensive Survey

Authors: Pranab Sahoo, Prabhash Meharia, Akash Ghosh, Sriparna Saha, Vinija Jain, Aman Chadha

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[285] arXiv:2405.09586 (cross-list from eess.IV) [pdf, other]: Title: Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation

Authors: Kang Liu, Zhuoqi Ma, Mengmeng Liu, Zhicheng Jiao, Xiaolu Kang, Qiguang Miao, Kun Xie

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[286] arXiv:2405.09558 (cross-list from eess.SP) [pdf, other]: Title: An EM Body Model for Device-Free Localization with Multiple Antenna Receivers: A First Study

Authors: Vittorio Rampa, Federica Fieramosca, Stefano Savazzi, Michele D'Amico

Journal-ref: 2023 IEEE-APS Topical Conference on Antennas and Propagation in Wireless Communications (APWC)

Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[287] arXiv:2405.09552 (cross-list from eess.IV) [pdf, other]: Title: ODFormer: Semantic Fundus Image Segmentation Using Transformer for Optic Nerve Head Detection

Authors: Jiayi Wang, Yi-An Mao, Xiaoyu Ma, Sicen Guo, Yuting Shao, Xiao Lv, Wenting Han, Mark Christopher, Linda M. Zangwill, Yanlong Bi, Rui Fan

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

Thu, 16 May 2024

[288] arXiv:2405.09546 [pdf, other]: Title: BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation

Authors: Yunhao Ge, Yihe Tang, Jiashu Xu, Cem Gokmen, Chengshu Li, Wensi Ai, Benjamin Jose Martinez, Arman Aydin, Mona Anvari, Ayush K Chakravarthy, Hong-Xing Yu, Josiah Wong, Sanjana Srivastava, Sharon Lee, Shengxin Zha, Laurent Itti, Yunzhu Li, Roberto Martín-Martín, Miao Liu, Pengchuan Zhang, Ruohan Zhang, Li Fei-Fei, Jiajun Wu

Comments: CVPR 2024 (Highlight). Project website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[289] arXiv:2405.09544 [pdf, other]: Title: Classifying geospatial objects from multiview aerial imagery using semantic meshes

Authors: David Russell, Ben Weinstein, David Wettergreen, Derek Young

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290] arXiv:2405.09487 [pdf, other]: Title: Color Space Learning for Cross-Color Person Re-Identification

Authors: Jiahao Nie, Shan Lin, Alex C. Kot

Comments: Accepted by ICME 2024 (Oral)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[291] arXiv:2405.09463 [pdf, other]: Title: Gaze-DETR: Using Expert Gaze to Reduce False Positives in Vulvovaginal Candidiasis Screening

Authors: Yan Kong, Sheng Wang, Jiangdong Cai, Zihao Zhao, Zhenrong Shen, Yonghao Li, Manman Fei, Qian Wang

Comments: MICCAI-2024 early accept. Our code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[292] arXiv:2405.09459 [pdf, other]: Title: Fourier Boundary Features Network with Wider Catchers for Glass Segmentation

Authors: Xiaolin Qin, Jiacen Liu, Qianlei Wang, Shaolin Zhang, Fei Zhu, Zhang Yi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[293] arXiv:2405.09431 [pdf, other]: Title: A Survey On Text-to-3D Contents Generation In The Wild

Authors: Chenhan Jiang

Comments: 11 pages, 10 figures, 4 tables. arXiv admin note: text overlap with arXiv:2401.17807 by other authors

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[294] arXiv:2405.09426 [pdf, other]: Title: Global-Local Image Perceptual Score (GLIPS): Evaluating Photorealistic Quality of AI-Generated Images

Authors: Memoona Aziz, Umair Rehman, Muhammad Umair Danish, Katarina Grolinger

Comments: 10 pages, 3 figures. Submitted to IEEE Transactions on Human-Machine Systems

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[295] arXiv:2405.09409 [pdf, ps, other]: Title: Real-World Federated Learning in Radiology: Hurdles to overcome and Benefits to gain

Authors: Markus R. Bujotzek, Ünal Akünal, Stefan Denner, Peter Neher, Maximilian Zenk, Eric Frodl, Astha Jaiswal, Moon Kim, Nicolai R. Krekiehn, Manuel Nickel, Richard Ruppel, Marcus Both, Felix Döllinger, Marcel Opitz, Thorsten Persigehl, Jens Kleesiek, Tobias Penzkofer, Klaus Maier-Hein, Rickmer Braren, Andreas Bucher

Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[296] arXiv:2405.09404 [pdf, other]: Title: Time-Equivariant Contrastive Learning for Degenerative Disease Progression in Retinal OCT

Authors: Taha Emre, Arunava Chakravarty, Dmitrii Lachinov, Antoine Rivail, Ursula Schmidt-Erfurth, Hrvoje Bogunović

Comments: Accepted at MICCAI 2024 (early accept, top 11%)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[297] arXiv:2405.09403 [pdf, other]: Title: Identity Overlap Between Face Recognition Train/Test Data: Causing Optimistic Bias in Accuracy Measurement

Authors: Haiyu Wu, Sicong Tian, Jacob Gutierrez, Aman Bhatta, Kağan Öztürk, Kevin W. Bowyer

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[298] arXiv:2405.09365 [pdf, other]: Title: SARATR-X: A Foundation Model for Synthetic Aperture Radar Images Target Recognition

Authors: Weijie L, Wei Yang, Yuenan Hou, Li Liu, Yongxiang Liu, Xiang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[299] arXiv:2405.09355 [pdf, other]: Title: Vision-Based Neurosurgical Guidance: Unsupervised Localization and Camera-Pose Prediction

Authors: Gary Sarwin, Alessandro Carretta, Victor Staartjes, Matteo Zoli, Diego Mazzatenta, Luca Regli, Carlo Serra, Ender Konukoglu

Comments: Early Accept at MICCAI 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[300] arXiv:2405.09342 [pdf, other]: Title: Progressive Depth Decoupling and Modulating for Flexible Depth Completion

Authors: Zhiwen Yang, Jiehua Zhang, Liang Li, Chenggang Yan, Yaoqi Sun, Haibing Yin

Comments: The article is accepted by IEEE Transactions on Instrumentation & Measurement

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[301] arXiv:2405.09334 [pdf, other]: Title: Content-Based Image Retrieval for Multi-Class Volumetric Radiology Images: A Benchmark Study

Authors: Farnaz Khun Jush, Steffen Vogler, Tuan Truong, Matthias Lenga

Comments: 23 pages, 9 Figures, 13 Tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
[302] arXiv:2405.09333 [pdf, other]: Title: Application of Gated Recurrent Units for CT Trajectory Optimization

Authors: Yuedong Yuan, Linda-Sophie Schneider, Andreas Maier

Comments: 4 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[303] arXiv:2405.09321 [pdf, other]: Title: ReconBoost: Boosting Can Achieve Modality Reconcilement

Authors: Cong Hua, Qianqian Xu, Shilong Bao, Zhiyong Yang, Qingming Huang

Comments: This paper has been accepted by ICML2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[304] arXiv:2405.09291 [pdf, other]: Title: Sensitivity Decouple Learning for Image Compression Artifacts Reduction

Authors: Li Ma, Yifan Zhao, Peixi Peng, Yonghong Tian

Comments: Accepted by Transactions on Image Processing

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[305] arXiv:2405.09288 [pdf, other]: Title: DeCoDEx: Confounder Detector Guidance for Improved Diffusion-based Counterfactual Explanations

Authors: Nima Fathi, Amar Kumar, Brennan Nichyporuk, Mohammad Havaei, Tal Arbel

Comments: Accepted to Medical Imaging with Deep Learning (MIDL) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[306] arXiv:2405.09266 [pdf, other]: Title: Dance Any Beat: Blending Beats with Visuals in Dance Video Generation

Authors: Xuanchen Wang, Heng Wang, Dongnan Liu, Weidong Cai

Comments: 11 pages, 6 figures, demo page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[307] arXiv:2405.09247 [pdf, other]: Title: Graph Neural Network based Handwritten Trajectories Recognition

Authors: Anuj Sharma, Sukhdeep Singh, S Ratna

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[308] arXiv:2405.09215 [pdf, other]: Title: Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model

Authors: Wanting Xu, Yang Liu, Langping He, Xucheng Huang, Ling Jiang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[309] arXiv:2405.09194 [pdf, ps, other]: Title: Flexible image analysis for law enforcement agencies with deep neural networks to determine: where, who and what

Authors: Henri Bouma, Bart Joosten, Maarten C Kruithof, Maaike H T de Boer, Alexandru Ginsca (LIST (CEA)), Benjamin Labbe (LIST (CEA)), Quoc T Vuong (LIST (CEA))

Journal-ref: SPIE - Counterterrorism, Crime Fighting, Forensics, and Surveillance Technologies II, 2018, pp.27

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[310] arXiv:2405.09152 [pdf, other]: Title: Scalable Image Coding for Humans and Machines Using Feature Fusion Network

Authors: Takahiro Shindo, Taiju Watanabe, Yui Tatsumi, Hiroshi Watanabe

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[311] arXiv:2405.09150 [pdf, other]: Title: Curriculum Dataset Distillation

Authors: Zhiheng Ma, Anjia Cao, Funing Yang, Xing Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[312] arXiv:2405.09148 [pdf, ps, other]: Title: A Hierarchically Feature Reconstructed Autoencoder for Unsupervised Anomaly Detection

Authors: Honghui Chen, Pingping Chen, Huan Mao, Mengxi Jiang

Comments: 12 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[313] arXiv:2405.09138 [pdf, other]: Title: OpenGait: A Comprehensive Benchmark Study for Gait Recognition towards Better Practicality

Authors: Chao Fan, Saihui Hou, Junhao Liang, Chuanfu Shen, Jingzhe Ma, Dongyang Jin, Yongzhen Huang, Shiqi Yu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[314] arXiv:2405.09131 [pdf, other]: Title: RobustMVS: Single Domain Generalized Deep Multi-view Stereo

Authors: Hongbin Xu, Weitao Chen, Baigui Sun, Xuansong Xie, Wenxiong Kang

Comments: Accepted to TCSVT. Code will be released at: this https URL. Benchmark will be released at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[315] arXiv:2405.09125 [pdf, other]: Title: HAAP: Vision-context Hierarchical Attention Autoregressive with Adaptive Permutation for Scene Text Recognition

Authors: Honghui Chen, Yuhang Qiu, Jiabao Wang, Pingping Chen, Nam Ling

Comments: 12 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[316] arXiv:2405.09114 [pdf, other]: Title: SOEDiff: Efficient Distillation for Small Object Editing

Authors: Qihe Pan, Zicheng Wang, Zhen Zhao, Yiming Wu, Sifan Long, Haoran Liang, Ronghua Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[317] arXiv:2405.09083 [pdf, other]: Title: RSHazeDiff: A Unified Fourier-aware Diffusion Model for Remote Sensing Image Dehazing

Authors: Jiamei Xiong, Xuefeng Yan, Yongzhen Wang, Wei Zhao, Xiao-Ping Zhang, Mingqiang Wei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[318] arXiv:2405.09059 [pdf, other]: Title: Task-adaptive Q-Face

Authors: Haomiao Sun, Mingjie He, Shiguang Shan, Hu Han, Xilin Chen

Comments: Ever submitted to ECCV2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[319] arXiv:2405.09056 [pdf, other]: Title: CTS: A Consistency-Based Medical Image Segmentation Model

Authors: Kejia Zhang, Lan Zhang, Haiwei Pan, Baolong Yu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[320] arXiv:2405.09054 [pdf, other]: Title: Dim Small Target Detection and Tracking: A Novel Method Based on Temporal Energy Selective Scaling and Trajectory Association

Authors: Weihua Gao, Wenlong Niu, Wenlong Lu, Pengcheng Wang, Zhaoyuan Qi, Xiaodong Peng, Zhen Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[321] arXiv:2405.09050 [pdf, other]: Title: 3D Shape Augmentation with Content-Aware Shape Resizing

Authors: Mingxiang Chen, Jian Zhang, Boli Zhou, Yang Song

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322] arXiv:2405.09045 [pdf, other]: Title: AMSNet: Netlist Dataset for AMS Circuits

Authors: Zhuofu Tao, Yichen Shi, Yiru Huo, Rui Ye, Zonghang Li, Li Huang, Chen Wu, Na Bai, Zhiping Yu, Ting-Jung Lin, Lei He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[323] arXiv:2405.09041 [pdf, other]: Title: Learning from Partial Label Proportions for Whole Slide Image Segmentation

Authors: Shinnosuke Matsuo, Daiki Suehiro, Seiichi Uchida, Hiroaki Ito, Kazuhiro Terada, Akihiko Yoshizawa, Ryoma Bise

Comments: Accepted at MICCAI2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[324] arXiv:2405.09032 [pdf, other]: Title: ICAL: Implicit Character-Aided Learning for Enhanced Handwritten Mathematical Expression Recognition

Authors: Jianhua Zhu, Liangcai Gao, Wenqi Zhao

Comments: Accepted by ICDAR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[325] arXiv:2405.09024 [pdf, other]: Title: Dynamic Loss Decay based Robust Oriented Object Detection on Remote Sensing Images with Noisy Labels

Authors: Guozhang Liu, Ting Liu, Mengke Yuan, Tao Pang, Guangxing Yang, Hao Fu, Tao Wang, Tongkui Liao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[326] arXiv:2405.09006 [pdf, other]: Title: Spatial Semantic Recurrent Mining for Referring Image Segmentation

Authors: Jiaxing Yang, Lihe Zhang, Jiayu Sun, Huchuan Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[327] arXiv:2405.08996 [pdf, other]: Title: Learning Correspondence for Deformable Objects

Authors: Priya Sundaresan, Aditya Ganapathi, Harry Zhang, Shivin Devgon

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[328] arXiv:2405.08992 [pdf, other]: Title: Contextual Emotion Recognition using Large Vision Language Models

Authors: Yasaman Etesam, Özge Nilay Yalçın, Chuxuan Zhang, Angelica Lim

Comments: 8 pages, website: this https URL. arXiv admin note: text overlap with arXiv:2310.19995

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[329] arXiv:2405.08991 [pdf, other]: Title: Theoretical Analysis for Expectation-Maximization-Based Multi-Model 3D Registration

Authors: David Jin, Harry Zhang, Kai Chang

Comments: arXiv admin note: substantial text overlap with arXiv:2402.10865

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[330] arXiv:2405.08961 [pdf, other]: Title: Bird's-Eye View to Street-View: A Survey

Authors: Khawlah Bajbaa, Muhammad Usman, Saeed Anwar, Ibrahim Radwan, Abdul Bais

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[331] arXiv:2405.08932 [pdf, other]: Title: Self-supervised vision-langage alignment of deep learning representations for bone X-rays analysis

Authors: Alexandre Englebert, Anne-Sophie Collin, Olivier Cornu, Christophe De Vleeschouwer

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[332] arXiv:2405.08911 [pdf, other]: Title: CLIP with Quality Captions: A Strong Pretraining for Vision Tasks

Authors: Pavan Kumar Anasosalu Vasu, Hadi Pouransari, Fartash Faghri, Oncel Tuzel

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[333] arXiv:2405.08909 [pdf, other]: Title: ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association

Authors: Shuxiao Ding, Lukas Schneider, Marius Cordts, Juergen Gall

Comments: 14 pages, 3 figures, accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[334] arXiv:2405.08890 [pdf, other]: Title: Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching Considering the Diversity of the Video

Authors: Tomoya Sugihara, Shuntaro Masuda, Ling Xiao, Toshihiko Yamasaki

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[335] arXiv:2405.09539 (cross-list from eess.IV) [pdf, ps, other]: Title: MMFusion: Multi-modality Diffusion Model for Lymph Node Metastasis Diagnosis in Esophageal Cancer

Authors: Chengyu Wu, Chengkai Wang, Yaqi Wang, Huiyu Zhou, Yatao Zhang, Qifeng Wang, Shuai Wang

Comments: Early accepted to MICCAI 2024 (6/6/5)

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[336] arXiv:2405.09530 (cross-list from cs.CY) [pdf, other]: Title: A community palm model

Authors: Nicholas Clinton, Andreas Vollrath, Remi D'annunzio, Desheng Liu, Henry B. Glick, Adrià Descals, Alicia Sullivan, Oliver Guinan, Jacob Abramowitz, Fred Stolle, Chris Goodman, Tanya Birch, David Quinn, Olga Danylo, Tijs Lips, Daniel Coelho, Enikoe Bihari, Bryce Cronkite-Ratcliff, Ate Poortinga, Atena Haghighattalab, Evan Notman, Michael DeWitt, Aaron Yonas, Gennadii Donchyts, Devaja Shah, David Saah, Karis Tenneson, Nguyen Hanh Quyen, Megha Verma, Andrew Wilcox

Comments: v0

Subjects: Computers and Society (cs.CY); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[337] arXiv:2405.09472 (cross-list from eess.IV) [pdf, other]: Title: Perception- and Fidelity-aware Reduced-Reference Super-Resolution Image Quality Assessment

Authors: Xinying Lin, Xuyang Liu, Hong Yang, Xiaohai He, Honggang Chen

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[338] arXiv:2405.09353 (cross-list from eess.IV) [pdf, other]: Title: Large coordinate kernel attention network for lightweight image super-resolution

Authors: Fangwei Hao, Jiesheng Wu, Haotian Lu, Ji Du, Jing Xu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[339] arXiv:2405.09298 (cross-list from eess.IV) [pdf, ps, other]: Title: Deep Blur Multi-Model (DeepBlurMM) -- a strategy to mitigate the impact of image blur on deep learning model performance in histopathology image analysis

Authors: Yujie Xiang, Bojing Liu, Mattias Rantalainen

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[340] arXiv:2405.09286 (cross-list from cs.MM) [pdf, other]: Title: MVBIND: Self-Supervised Music Recommendation For Videos Via Embedding Space Binding

Authors: Jiajie Teng, Huiyu Duan, Yucheng Zhu, Sijing Wu, Guangtao Zhai

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[341] arXiv:2405.09077 (cross-list from eess.IV) [pdf, other]: Title: Compressive Feature Selection for Remote Visual Multi-Task Inference

Authors: Saeed Ranjbar Alvar, Ivan V. Bajić

Comments: 6 pages, 8 figures, IEEE ICME Workshop on Coding for Machines

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[342] arXiv:2405.09049 (cross-list from cs.LG) [pdf, other]: Title: Perception Without Vision for Trajectory Prediction: Ego Vehicle Dynamics as Scene Representation for Efficient Active Learning in Autonomous Driving

Authors: Ross Greer, Mohan Trivedi

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[343] arXiv:2405.08981 (cross-list from cs.HC) [pdf, other]: Title: Impact of Design Decisions in Scanpath Modeling

Authors: Parvin Emami, Yue Jiang, Zixin Guo, Luis A. Leiva

Comments: 16 pages

Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[344] arXiv:2405.08920 (cross-list from cs.LG) [pdf, other]: Title: Neural Collapse Meets Differential Privacy: Curious Behaviors of NoisyGD with Near-perfect Representation Learning

Authors: Chendi Wang, Yuqing Zhu, Weijie J. Su, Yu-Xiang Wang

Comments: To appear in ICML 2024

Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Wed, 15 May 2024

[345] arXiv:2405.08816 [pdf, other]: Title: The RoboDrive Challenge: Drive Anytime Anywhere in Any Condition

Authors: Lingdong Kong, Shaoyuan Xie, Hanjiang Hu, Yaru Niu, Wei Tsang Ooi, Benoit R. Cottereau, Lai Xing Ng, Yuexin Ma, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu, Weichao Qiu, Wei Zhang, Xu Cao, Hao Lu, Ying-Cong Chen, Caixin Kang, Xinning Zhou, Chengyang Ying, Wentao Shang, Xingxing Wei, Yinpeng Dong, Bo Yang, Shengyin Jiang, Zeliang Ma, Dengyi Ji, Haiwen Li, Xingliang Huang, Yu Tian, Genghua Kou, Fan Jia, Yingfei Liu, Tiancai Wang, Ying Li, Xiaoshuai Hao, Yifan Yang, Hui Zhang, Mengchuan Wei, Yi Zhou, Haimei Zhao, Jing Zhang, Jinke Li, Xiao He, Xiaoqiang Cheng, Bingyang Zhang, Lirong Zhao, Dianlei Ding, Fangsheng Liu, Yixiang Yan, Hongming Wang, Nanfei Ye, Lun Luo, Yubo Tian, Yiwei Zuo, Zhe Cao, Yi Ren, Yunfan Li, Wenjie Liu, Xun Wu, Yifan Mao, Ming Li, Jian Liu, Jiayang Liu, Zihan Qin, Cunxi Chu, et al. (25 additional authors not shown)

Comments: ICRA 2024; 31 pages, 24 figures, 5 tables; Code at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[346] arXiv:2405.08815 [pdf, other]: Title: Efficient Vision-Language Pre-training by Cluster Masking

Authors: Zihao Wei, Zixuan Pan, Andrew Owens

Comments: CVPR 2024, Project page: this https URL, Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[347] arXiv:2405.08813 [pdf, other]: Title: CinePile: A Long Video Question Answering Dataset and Benchmark

Authors: Ruchit Rawal, Khalid Saifullah, Ronen Basri, David Jacobs, Gowthami Somepalli, Tom Goldstein

Comments: Project page with all the artifacts - this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[348] arXiv:2405.08807 [pdf, other]: Title: SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation

Authors: Jonathan Roberts, Kai Han, Neil Houlsby, Samuel Albanie

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[349] arXiv:2405.08794 [pdf, other]: Title: Ambiguous Annotations: When is a Pedestrian not a Pedestrian?

Authors: Luisa Schwirten, Jannes Scholz, Daniel Kondermann, Janis Keuper

Comments: Paper accepted at the CVPR 2024 Vision and Language for Autonomous Driving and Robotics Workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[350] arXiv:2405.08786 [pdf, other]: Title: Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring

Authors: Tiantian Zhang, Manxi Lin, Hongda Guo, Xiaofan Zhang, Ka Fung Peter Chiu, Aasa Feragen, Qi Dou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[351] arXiv:2405.08780 [pdf, ps, other]: Title: Harnessing the power of longitudinal medical imaging for eye disease prognosis using Transformer-based sequence modeling

Authors: Gregory Holste, Mingquan Lin, Ruiwen Zhou, Fei Wang, Lei Liu, Qi Yan, Sarah H. Van Tassel, Kyle Kovacs, Emily Y. Chew, Zhiyong Lu, Zhangyang Wang, Yifan Peng

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[352] arXiv:2405.08776 [pdf, ps, other]: Title: FolkTalent: Enhancing Classification and Tagging of Indian Folk Paintings

Authors: Nancy Hada, Aditya Singh, Kavita Vemuri

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[353] arXiv:2405.08768 [pdf, other]: Title: EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training

Authors: Yulin Wang, Yang Yue, Rui Lu, Yizeng Han, Shiji Song, Gao Huang

Comments: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). Journal version of arXiv:2211.09703 (ICCV 2023). Code is available at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[354] arXiv:2405.08765 [pdf, other]: Title: Image to Pseudo-Episode: Boosting Few-Shot Segmentation by Unlabeled Data

Authors: Jie Zhang, Yuhan Li, Yude Wang, Stephen Lin, Shiguang Shan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[355] arXiv:2405.08748 [pdf, other]: Title: Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding

Authors: Zhimin Li, Jianwei Zhang, Qin Lin, Jiangfeng Xiong, Yanxin Long, Xinchi Deng, Yingfang Zhang, Xingchao Liu, Minbin Huang, Zedong Xiao, Dayou Chen, Jiajun He, Jiahao Li, Wenyue Li, Chen Zhang, Rongwei Quan, Jianxiang Lu, Jiabin Huang, Xiaoyan Yuan, Xiaoxiao Zheng, Yixuan Li, Jihong Zhang, Chao Zhang, Meng Chen, Jie Liu, Zheng Fang, Weiyan Wang, Jinbao Xue, Yangyu Tao, Jianchen Zhu, Kai Liu, Sihuan Lin, Yifu Sun, Yun Li, Dongdong Wang, Mingtao Chen, Zhichao Hu, Xiao Xiao, Yan Chen, Yuhong Liu, Wei Liu, Di Wang, Yong Yang, Jie Jiang, Qinglin Lu

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[356] arXiv:2405.08720 [pdf, other]: Title: The Lost Melody: Empirical Observations on Text-to-Video Generation From A Storytelling Perspective

Authors: Andrew Shin, Yusuke Mori, Kunitake Kaneko

Comments: To appear at CVPR 2024 Workshop on AI for Content Creation (AI4CC)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[357] arXiv:2405.08717 [pdf, other]: Title: How Much You Ate? Food Portion Estimation on Spoons

Authors: Aaryam Sharma, Chris Czarnecki, Yuhao Chen, Pengcheng Xi, Linlin Xu, Alexander Wong

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[358] arXiv:2405.08715 [pdf, other]: Title: DeVOS: Flow-Guided Deformable Transformer for Video Object Segmentation

Authors: Volodymyr Fedynyak, Yaroslav Romanus, Bohdan Hlovatskyi, Bohdan Sydor, Oles Dobosevych, Igor Babin, Roman Riazantsev

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[359] arXiv:2405.08695 [pdf, other]: Title: The impact of Compositionality in Zero-shot Multi-label action recognition for Object-based tasks

Authors: Carmela Calabrese, Stefano Berti, Giulia Pasquale, Lorenzo Natale

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[360] arXiv:2405.08681 [pdf, other]: Title: Achieving Fairness Through Channel Pruning for Dermatological Disease Diagnosis

Authors: Qingpeng Kong, Ching-Hao Chiu, Dewen Zeng, Yu-Jen Chen, Tsung-Yi Ho, Jingtong Hu, Yiyu Shi

Comments: 13 pages, 3 figures, early accepted by International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[361] arXiv:2405.08668 [pdf, other]: Title: Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research

Authors: Qinglong Cao, Yuntian Chen, Lu Lu, Hao Sun, Zhenzhong Zeng, Xiaokang Yang, Dongxiao Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Applications (stat.AP)
[362] arXiv:2405.08609 [pdf, other]: Title: Dynamic NeRF: A Review

Authors: Jinwei Lin

Comments: 25 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[363] arXiv:2405.08593 [pdf, other]: Title: Open-Vocabulary Object Detection via Neighboring Region Attention Alignment

Authors: Sunyuan Qiang, Xianfei Li, Yanyan Liang, Wenlong Liao, Tao He, Pai Peng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[364] arXiv:2405.08589 [pdf, other]: Title: Variable Substitution and Bilinear Programming for Aligning Partially Overlapping Point Sets

Authors: Wei Lian, Zhesen Cui, Fei Ma, Hang Pan, Wangmeng Zuo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[365] arXiv:2405.08587 [pdf, other]: Title: EchoTracker: Advancing Myocardial Point Tracking in Echocardiography

Authors: Md Abulkalam Azad, Artem Chernyshov, John Nyberg, Ingrid Tveten, Lasse Lovstakken, Håvard Dalen, Bjørnar Grenne, Andreas Østvik

Comments: Submitted version that got provisionally (early) accepted (top 11%) to MICCAI2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[366] arXiv:2405.08586 [pdf, other]: Title: Cross-Domain Feature Augmentation for Domain Generalization

Authors: Yingnan Liu, Yingtian Zou, Rui Qiao, Fusheng Liu, Mong Li Lee, Wynne Hsu

Comments: Accepted to the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024); Code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[367] arXiv:2405.08578 [pdf, ps, other]: Title: Local-peak scale-invariant feature transform for fast and random image stitching

Authors: Hao Li, Lipo Wang, Tianyun Zhao, Wei Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[368] arXiv:2405.08555 [pdf, other]: Title: Dual-Branch Network for Portrait Image Quality Assessment

Authors: Wei Sun, Weixia Zhang, Yanwei Jiang, Haoning Wu, Zicheng Zhang, Jun Jia, Yingjie Zhou, Zhongpeng Ji, Xiongkuo Min, Weisi Lin, Guangtao Zhai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[369] arXiv:2405.08547 [pdf, other]: Title: Exploring Graph-based Knowledge: Multi-Level Feature Distillation via Channels Relational Graph

Authors: Zhiwei Wang, Jun Huang, Longhua Ma, Chengyu Wu, Hongyu Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[370] arXiv:2405.08533 [pdf, other]: Title: Dynamic Feature Learning and Matching for Class-Incremental Learning

Authors: Sunyuan Qiang, Yanyan Liang, Jun Wan, Du Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[371] arXiv:2405.08493 [pdf, ps, other]: Title: Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study

Authors: Qinfeng Zhu, Yuan Fang, Yuanzhi Cai, Cheng Chen, Lei Fan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[372] arXiv:2405.08487 [pdf, other]: Title: Semantic Contextualization of Face Forgery: A New Definition, Dataset, and Detection Method

Authors: Mian Zou, Baosheng Yu, Yibing Zhan, Siwei Lyu, Kede Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[373] arXiv:2405.08483 [pdf, other]: Title: RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images

Authors: Zong-Wei Hong, Yen-Yang Hung, Chu-Song Chen

Comments: Accepted by CVPR Workshop DLGC, 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[374] arXiv:2405.08463 [pdf, other]: Title: A Timely Survey on Vision Transformer for Deepfake Detection

Authors: Zhikan Wang, Zhongyao Cheng, Jiajie Xiong, Xun Xu, Tianrui Li, Bharadwaj Veeravalli, Xulei Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[375] arXiv:2405.08458 [pdf, other]: Title: Rethinking Prior Information Generation with CLIP for Few-Shot Segmentation

Authors: Jin Wang, Bingfeng Zhang, Jian Pang, Honglong Chen, Weifeng Liu

Comments: Accepted by CVPR 2024; The camera-ready version

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[376] arXiv:2405.08434 [pdf, other]: Title: TP3M: Transformer-based Pseudo 3D Image Matching with Reference

Authors: Liming Han, Zhaoxiang Liu, Shiguo Lian

Comments: Accepted by ICRA 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[377] arXiv:2405.08429 [pdf, other]: Title: TEDNet: Twin Encoder Decoder Neural Network for 2D Camera and LiDAR Road Detection

Authors: Martín Bayón-Gutiérrez, María Teresa García-Ordás, Héctor Alaiz Moretón, Jose Aveleira-Mata, Sergio Rubio Martín, José Alberto Benítez-Andrades

Comments: Source code: this https URL

Journal-ref: M Bayón-Gutiérrez, MT García-Ordás, H Alaiz Moretón, J Aveleira-Mata, S Rubio-Martín, JA Benítez-Andrades. TEDNet: Twin Encoder Decoder Neural Network for 2D Camera and LiDAR Road Detection. Logic Journal of the IGPL. 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[378] arXiv:2405.08419 [pdf, other]: Title: WaterMamba: Visual State Space Model for Underwater Image Enhancement

Authors: Meisheng Guan, Haiyong Xu, Gangyi Jiang, Mei Yu, Yeyao Chen, Ting Luo, Yang Song

Comments: arXiv admin note: substantial text overlap with arXiv:2403.06098

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[379] arXiv:2405.08344 [pdf, other]: Title: No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding

Authors: Yingjie Zhai, Wenshuo Li, Yehui Tang, Xinghao Chen, Yunhe Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[380] arXiv:2405.08337 [pdf, ps, other]: Title: Perivascular space Identification Nnunet for Generalised Usage (PINGU)

Authors: Benjamin Sinclair, Lucy Vivash, Jasmine Moses, Miranda Lynch, William Pham, Karina Dorfman, Cassandra Marotta, Shaun Koh, Jacob Bunyamin, Ella Rowsthorn, Alex Jarema, Himashi Peiris, Zhaolin Chen, Sandy R Shultz, David K Wright, Dexiao Kong, Sharon L. Naismith, Terence J. O'Brien, Meng Law

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[381] arXiv:2405.08329 [pdf, other]: Title: Cross-Dataset Generalization For Retinal Lesions Segmentation

Authors: Clément Playout, Farida Cheriet

Comments: 6 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[382] arXiv:2405.08322 [pdf, other]: Title: StraightPCF: Straight Point Cloud Filtering

Authors: Dasith de Silva Edirimuni, Xuequan Lu, Gang Li, Lei Wei, Antonio Robles-Kelly, Hongdong Li

Comments: This paper has been accepted to the IEEE/CVF CVPR Conference, 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[383] arXiv:2405.08300 [pdf, other]: Title: Vector-Symbolic Architecture for Event-Based Optical Flow

Authors: Hongzhi You, Yijun Cao, Wei Yuan, Fanjun Wang, Ning Qiao, Yongjie Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Symbolic Computation (cs.SC)
[384] arXiv:2405.08272 [pdf, other]: Title: VS-Assistant: Versatile Surgery Assistant on the Demand of Surgeons

Authors: Zhen Chen, Xingjian Luo, Jinlin Wu, Danny T.M. Chan, Zhen Lei, Jinqiao Wang, Sebastien Ourselin, Hongbin Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[385] arXiv:2405.08270 [pdf, other]: Title: Towards Clinician-Preferred Segmentation: Leveraging Human-in-the-Loop for Test Time Adaptation in Medical Image Segmentation

Authors: Shishuai Hu, Zehui Liao, Zeyou Liu, Yong Xia

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[386] arXiv:2405.08263 [pdf, other]: Title: Palette-based Color Transfer between Images

Authors: Chenlei Lv, Dan Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[387] arXiv:2405.08251 [pdf, other]: Title: Multimodal Collaboration Networks for Geospatial Vehicle Detection in Dense, Occluded, and Large-Scale Events

Authors: Xin Wu, Zhanchao Huang, Li Wang, Jocelyn Chanussot, Jiaojiao Tian

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[388] arXiv:2405.08246 [pdf, other]: Title: Compositional Text-to-Image Generation with Dense Blob Representations

Authors: Weili Nie, Sifei Liu, Morteza Mardani, Chao Liu, Benjamin Eckart, Arash Vahdat

Comments: ICML 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[389] arXiv:2405.08245 [pdf, ps, other]: Title: Progressive enhancement and restoration for mural images under low-light and defected conditions based on multi-receptive field strategy

Authors: Xiameng Wei, Binbin Fan, Ying Wang, Yanxiang Feng, Laiyi Fu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[390] arXiv:2405.08210 [pdf, other]: Title: Infinite Texture: Text-guided High Resolution Diffusion Texture Synthesis

Authors: Yifan Wang, Aleksander Holynski, Brian L. Curless, Steven M. Seitz

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[391] arXiv:2405.08204 [pdf, other]: Title: A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection

Authors: Matthew Korban, Peter Youngs, Scott T. Acton

Comments: IEEE Transactions on Pattern Analysis and Machine Intelligence (2024)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[392] arXiv:2405.08197 [pdf, other]: Title: IHC Matters: Incorporating IHC analysis to H&E Whole Slide Image Analysis for Improved Cancer Grading via Two-stage Multimodal Bilinear Pooling Fusion

Authors: Jun Wang, Yu Mao, Yufei Cui, Nan Guan, Chun Jason Xue

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[393] arXiv:2405.08114 [pdf, other]: Title: RATLIP: Generative Adversarial CLIP Text-to-Image Synthesis Based on Recurrent Affine Transformations

Authors: Chengde Lin, Xijun Lu, Guangxi Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[394] arXiv:2405.08055 [pdf, other]: Title: DiffTF++: 3D-aware Diffusion Transformer for Large-Vocabulary 3D Generation

Authors: Ziang Cao, Fangzhou Hong, Tong Wu, Liang Pan, Ziwei Liu

Comments: arXiv admin note: substantial text overlap with arXiv:2309.07920

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[395] arXiv:2405.08766 (cross-list from cs.LG) [pdf, other]: Title: Energy-based Hopfield Boosting for Out-of-Distribution Detection

Authors: Claus Hofmann, Simon Schmid, Bernhard Lehner, Daniel Klotz, Sepp Hochreiter

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[396] arXiv:2405.08745 (cross-list from eess.IV) [pdf, other]: Title: Enhancing Blind Video Quality Assessment with Rich Quality-aware Features

Authors: Wei Sun, Haoning Wu, Zicheng Zhang, Jun Jia, Zhichao Zhang, Linhan Cao, Qiubo Chen, Xiongkuo Min, Weisi Lin, Guangtao Zhai

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[397] arXiv:2405.08733 (cross-list from cs.GR) [pdf, other]: Title: A Simple Approach to Differentiable Rendering of SDFs

Authors: Zichen Wang, Xi Deng, Ziyi Zhang, Wenzel Jakob, Steve Marschner

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[398] arXiv:2405.08672 (cross-list from eess.IV) [pdf, other]: Title: EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera

Authors: Beilei Cui, Mobarakol Islam, Long Bai, An Wang, Hongliang Ren

Comments: early accepted by MICCAI 2024

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[399] arXiv:2405.08658 (cross-list from eess.IV) [pdf, other]: Title: Beyond the Black Box: Do More Complex Models Provide Superior XAI Explanations?

Authors: Mateusz Cedro, Marcin Chlebus

Comments: 15 pages, 9 figures, 5 tables

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[400] arXiv:2405.08657 (cross-list from eess.IV) [pdf, other]: Title: Self-supervised learning improves robustness of deep learning lung tumor segmentation to CT imaging differences

Authors: Jue Jiang, Aneesh Rangnekar, Harini Veeraraghavan

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[401] arXiv:2405.08654 (cross-list from cs.LG) [pdf, other]: Title: Can we Defend Against the Unknown? An Empirical Study About Threshold Selection for Neural Network Monitoring

Authors: Khoi Tran Dang, Kevin Delmas, Jérémie Guiochet, Joris Guérin

Comments: 13 pages, 5 figures, 6 tables. To appear in the proceedings of the 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024)

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[402] arXiv:2405.08621 (cross-list from eess.IV) [pdf, other]: Title: RMT-BVQA: Recurrent Memory Transformer-based Blind Video Quality Assessment for Enhanced Video Content

Authors: Tianhao Peng, Chen Feng, Duolikun Danier, Fan Zhang, David Bull

Comments: 8 pages, 2 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[403] arXiv:2405.08576 (cross-list from cs.RO) [pdf, other]: Title: Hearing Touch: Audio-Visual Pretraining for Contact-Rich Manipulation

Authors: Jared Mejia, Victoria Dean, Tess Hellebrekers, Abhinav Gupta

Comments: Accepted to ICRA 2024

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[404] arXiv:2405.08556 (cross-list from eess.IV) [pdf, other]: Title: Shape-aware synthesis of pathological lung CT scans using CycleGAN for enhanced semi-supervised lung segmentation

Authors: Rezkellah Noureddine Khiati, Pierre-Yves Brillet, Aurélien Justet, Radu Ispa, Catalin Fetita

Comments: 14 pages, 7 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[405] arXiv:2405.08431 (cross-list from eess.IV) [pdf, other]: Title: Similarity Metrics for MR Image-To-Image Translation

Authors: Melanie Dohmen, Mark Klemens, Ivo Baltruschat, Tuan Truong, Matthias Lenga

Comments: 29 pages, 6 figures, appendix with 5 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[406] arXiv:2405.08423 (cross-list from eess.IV) [pdf, other]: Title: NAFRSSR: a Lightweight Recursive Network for Efficient Stereo Image Super-Resolution

Authors: Yihong Chen, Zhen Fan, Shuai Dong, Zhiwei Chen, Wenjie Li, Minghui Qin, Min Zeng, Xubing Lu, Guofu Zhou, Xingsen Gao, Jun-Ming Liu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[407] arXiv:2405.08363 (cross-list from cs.CR) [pdf, other]: Title: UnMarker: A Universal Attack on Defensive Watermarking

Authors: Andre Kassis, Urs Hengartner

Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[408] arXiv:2405.08340 (cross-list from cs.CR) [pdf, other]: Title: Achieving Resolution-Agnostic DNN-based Image Watermarking: A Novel Perspective of Implicit Neural Representation

Authors: Yuchen Wang, Xingyu Zhu, Guanhui Ye, Shiyao Zhang, Xuetao Wei

Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[409] arXiv:2405.08297 (cross-list from cs.LG) [pdf, ps, other]: Title: Distance-Restricted Explanations: Theoretical Underpinnings & Efficient Implementation

Authors: Yacine Izza, Xuanxiang Huang, Antonio Morgado, Jordi Planes, Alexey Ignatiev, Joao Marques-Silva

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[410] arXiv:2405.08282 (cross-list from eess.IV) [pdf, ps, other]: Title: Automatic Segmentation of the Kidneys and Cystic Renal Lesions on Non-Contrast CT Using a Convolutional Neural Network

Authors: Lucas Aronson (1), Ruben Ngnitewe Massaa (1), Syed Jamal Safdar Gardezi (1), Andrew L. Wentland (1,2,3) ((1) Department of Radiology, University of Wisconsin School of Medicine & Public Health, Madison, WI, USA, (2) Department of Medical Physics, University of Wisconsin School of Medicine & Public Health, Madison, WI, USA, (3) Department of Biomedical Engineering, University of Wisconsin School of Medicine & Public Health, Madison, WI, USA)

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[411] arXiv:2405.08275 (cross-list from math.OC) [pdf, other]: Title: Power of $\ell_1$-Norm Regularized Kaczmarz Algorithms for High-Order Tensor Recovery

Authors: Katherine Henneberger, Jing Qin

Comments: arXiv admin note: text overlap with arXiv:2311.00783

Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA)
[412] arXiv:2405.08209 (cross-list from cs.CY) [pdf, other]: Title: Who's in and who's out? A case study of multimodal CLIP-filtering in DataComp

Authors: Rachel Hong, William Agnew, Tadayoshi Kohno, Jamie Morgenstern

Comments: Content warning: This paper discusses societal stereotypes and sexually-explicit material that may be disturbing, distressing, and/or offensive to the reader

Subjects: Computers and Society (cs.CY); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[413] arXiv:2405.08169 (cross-list from eess.IV) [pdf, other]: Title: Rethinking Histology Slide Digitization Workflows for Low-Resource Settings

Authors: Talat Zehra, Joseph Marino, Wendy Wang, Grigoriy Frantsuzov, Saad Nadeem

Comments: MICCAI 2024 Early Accept. First four authors contributed equally

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[414] arXiv:2405.08119 (cross-list from eess.SY) [pdf, other]: Title: GPS-IMU Sensor Fusion for Reliable Autonomous Vehicle Position Estimation

Authors: Simegnew Yihunie Alaba

Comments: 6 pages, 4 figures, and conference

Subjects: Systems and Control (eess.SY); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[415] arXiv:2405.08054 (cross-list from cs.GR) [pdf, other]: Title: Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning

Authors: Wenqi Dong, Bangbang Yang, Lin Ma, Xiao Liu, Liyuan Cui, Hujun Bao, Yuewen Ma, Zhaopeng Cui

Comments: Project webpage: this https URL

Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[416] arXiv:2405.08049 (cross-list from eess.IV) [pdf, other]: Title: Optimizing Synthetic Correlated Diffusion Imaging for Breast Cancer Tumour Delineation

Authors: Chi-en Amy Tai, Alexander Wong

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[417] arXiv:2405.08042 (cross-list from cs.HC) [pdf, other]: Title: LLAniMAtion: LLAMA Driven Gesture Animation

Authors: Jonathan Windle, Iain Matthews, Sarah Taylor

Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[418] arXiv:2405.08038 (cross-list from cs.LG) [pdf, other]: Title: Feature Expansion and enhanced Compression for Class Incremental Learning

Authors: Quentin Ferdinand (ENSTA Bretagne, Lab-STICC_MATRIX), Gilles Le Chenadec (ENSTA Bretagne, Lab-STICC_MATRIX), Benoit Clement (CROSSING, ENSTA Bretagne, Lab-STICC_MATRIX), Panagiotis Papadakis (Lab-STICC_RAMBO, IMT Atlantique - INFO), Quentin Oliveau

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[419] arXiv:2405.08020 (cross-list from cs.LG) [pdf, other]: Title: ReActXGB: A Hybrid Binary Convolutional Neural Network Architecture for Improved Performance and Computational Efficiency

Authors: Po-Hsun Chu, Ching-Han Chen

Comments: Accepted to ICCE-TW 2024

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[420] arXiv:2405.07994 (cross-list from eess.IV) [pdf, ps, other]: Title: BubbleID: A Deep Learning Framework for Bubble Interface Dynamics Analysis

Authors: Christy Dunlap, Changgen Li, Hari Pandey, Ngan Le, Han Hu

Comments: 16 pages, 4 figures

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)


