Computer Vision and Pattern Recognition

Authors and titles for recent submissions

Tue, 21 May 2024

[1]  arXiv:2405.12221 [pdf, other]
Title: Images that Sound: Composing Images and Sounds on a Single Canvas
Comments: Project site: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[2]  arXiv:2405.12218 [pdf, other]
Title: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[3]  arXiv:2405.12217 [pdf, other]
Title: Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning
Comments: 17 pages, 7 figures, 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[4]  arXiv:2405.12211 [pdf, other]
Title: Slicedit: Zero-Shot Video Editing With Text-to-Image Diffusion Models Using Spatio-Temporal Slices
Comments: ICML 2024. Code and examples are available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[5]  arXiv:2405.12202 [pdf, other]
Title: Hierarchical Neural Operator Transformer with Learnable Frequency-aware Loss Prior for Arbitrary-scale Super-resolution
Comments: 20 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[6]  arXiv:2405.12200 [pdf, other]
Title: Multi-View Attentive Contextualization for Multi-View 3D Object Detection
Comments: Accepted by CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[7]  arXiv:2405.12175 [pdf, other]
Title: Enhancing Explainable AI: A Hybrid Approach Combining GradCAM and LRP for CNN Interpretability
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8]  arXiv:2405.12150 [pdf, other]
Title: Bangladeshi Native Vehicle Detection in Wild
Comments: 13 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[9]  arXiv:2405.12139 [pdf, other]
Title: DTLLM-VLT: Diverse Text Generation for Visual Language Tracking Based on LLM
Comments: Accepted by CVPR Workshop 2024, Oral Presentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[10]  arXiv:2405.12126 [pdf, other]
Title: Alzheimer's Magnetic Resonance Imaging Classification Using Deep and Meta-Learning Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Emerging Technologies (cs.ET); Machine Learning (cs.LG); Multimedia (cs.MM)
[11]  arXiv:2405.12114 [pdf, other]
Title: A New Cross-Space Total Variation Regularization Model for Color Image Restoration with Quaternion Blur Operator
Comments: 15 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA)
[12]  arXiv:2405.12110 [pdf, other]
Title: CoR-GS: Sparse-View 3D Gaussian Splatting via Co-Regularization
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[13]  arXiv:2405.12107 [pdf, other]
Title: Imp: Highly Capable Large Multimodal Models for Mobile Devices
Comments: 19 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[14]  arXiv:2405.12105 [pdf, other]
Title: Sheet Music Transformer ++: End-to-End Full-Page Optical Music Recognition for Pianoform Sheet Music
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[15]  arXiv:2405.12070 [pdf, other]
Title: AutoSoccerPose: Automated 3D posture Analysis of Soccer Shot Movements
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[16]  arXiv:2405.12069 [pdf, other]
Title: Gaussian Head & Shoulders: High Fidelity Neural Upper Body Avatars with Anchor Gaussian Guided Texture Warping
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17]  arXiv:2405.12057 [pdf, other]
Title: NPLMV-PS: Neural Point-Light Multi-View Photometric Stereo
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18]  arXiv:2405.12018 [pdf, other]
Title: Continuous Sign Language Recognition with Adapted Conformer via Unsupervised Pretraining
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[19]  arXiv:2405.12006 [pdf, other]
Title: Depth Reconstruction with Neural Signed Distance Fields in Structured Light Systems
Comments: 10 pages, 8 figures, accepted by 3DV 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[20]  arXiv:2405.12003 [pdf, other]
Title: Mamba-in-Mamba: Centralized Mamba-Cross-Scan in Tokenized Mamba Model for Hyperspectral Image Classification
Comments: 19 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[21]  arXiv:2405.11993 [pdf, other]
Title: GGAvatar: Geometric Adjustment of Gaussian Head Avatar
Comments: 9 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22]  arXiv:2405.11985 [pdf, other]
Title: MTVQA: Benchmarking Multilingual Text-Centric Visual Question Answering
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23]  arXiv:2405.11978 [pdf, other]
Title: SM-DTW: Stability Modulated Dynamic Time Warping for signature verification
Journal-ref: Pattern Recognition Letters, Volume: 121, Pages 113-122 (2019)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[24]  arXiv:2405.11977 [pdf, other]
Title: GuidedRec: Guiding Ill-Posed Unsupervised Volumetric Recovery
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25]  arXiv:2405.11976 [pdf, other]
Title: Position-Guided Prompt Learning for Anomaly Detection in Chest X-Rays
Comments: MICCAI 2024 Early Accept
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26]  arXiv:2405.11971 [pdf, other]
Title: Data Augmentation for Text-based Person Retrieval Using Large Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27]  arXiv:2405.11936 [pdf, other]
Title: UAV-VisLoc: A Large-scale Dataset for UAV Visual Localization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28]  arXiv:2405.11921 [pdf, other]
Title: MirrorGaussian: Reflecting 3D Gaussians for Reconstructing Mirror Reflections
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29]  arXiv:2405.11914 [pdf, other]
Title: PT43D: A Probabilistic Transformer for Generating 3D Shapes from Single Highly-Ambiguous RGB Images
Comments: 10 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[30]  arXiv:2405.11913 [pdf, other]
Title: Diff-BGM: A Diffusion Model for Video Background Music Generation
Comments: Accepted by CVPR 2024 (Poster)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31]  arXiv:2405.11905 [pdf, other]
Title: CSTA: CNN-based Spatiotemporal Attention for Video Summarization
Comments: Accepted at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[32]  arXiv:2405.11903 [pdf, ps, other]
Title: A comprehensive overview of deep learning techniques for 3D point cloud classification and semantic segmentation
Comments: Published in Springer Nature (Machine Vision and Applications)
Journal-ref: Machine Vision and Applications 35, 67 (2024)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33]  arXiv:2405.11894 [pdf, other]
Title: Refining Coded Image in Human Vision Layer Using CNN-Based Post-Processing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[34]  arXiv:2405.11867 [pdf, other]
Title: Depth Prompting for Sensor-Agnostic Depth Estimation
Comments: Accepted at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
[35]  arXiv:2405.11862 [pdf, other]
Title: SEMv3: A Fast and Robust Approach to Table Separation Line Detection
Comments: 9 pages, 6 figures, 5 tables. Accepted by IJCAI2024 main track
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36]  arXiv:2405.11852 [pdf, other]
Title: Evolving Storytelling: Benchmarks and Methods for New Character Customization with Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[37]  arXiv:2405.11850 [pdf, other]
Title: Rethinking Overlooked Aspects in Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[38]  arXiv:2405.11846 [pdf, other]
Title: EPPS: Advanced Polyp Segmentation via Edge Information Injection and Selective Feature Decoupling
Authors: Mengqi Lei, Xin Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[39]  arXiv:2405.11837 [pdf, other]
Title: Improving the Explain-Any-Concept by Introducing Nonlinearity to the Trainable Surrogate Model
Comments: This paper is accepted for publication at IEEE SIU conference, 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[40]  arXiv:2405.11823 [pdf, other]
Title: Stereo-Knowledge Distillation from dpMV to Dual Pixels for Light Field Video Reconstruction
Comments: International Conference of Computational Photography (ICCP 2024), 11 pages and 12 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41]  arXiv:2405.11822 [pdf, other]
Title: FeTT: Continual Class Incremental Learning via Feature Transformation Tuning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[42]  arXiv:2405.11814 [pdf, other]
Title: Climatic & Anthropogenic Hazards to the Nasca World Heritage: Application of Remote Sensing, AI, and Flood Modelling
Comments: accepted at IGARSS 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
[43]  arXiv:2405.11809 [pdf, other]
Title: Distill-then-prune: An Efficient Compression Framework for Real-time Stereo Matching Network on Edge Devices
Comments: International Conference on Robotics and Automation (ICRA) 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[44]  arXiv:2405.11794 [pdf, other]
Title: ViViD: Video Virtual Try-on using Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45]  arXiv:2405.11793 [pdf, other]
Title: MM-Retinal: Knowledge-Enhanced Foundational Pretraining with Fundus Image-Text Expertise
Comments: Early Accepted by The International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46]  arXiv:2405.11770 [pdf, other]
Title: Learning Spatial Similarity Distribution for Few-shot Object Counting
Comments: Accepted to IJCAI2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[47]  arXiv:2405.11765 [pdf, other]
Title: DATR: Unsupervised Domain Adaptive Detection Transformer with Dataset-Level Adaptation and Prototypical Alignment
Comments: Manuscript submitted to IEEE Transactions on Image Processing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[48]  arXiv:2405.11757 [pdf, other]
Title: DLAFormer: An End-to-End Transformer For Document Layout Analysis
Comments: ICDAR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[49]  arXiv:2405.11754 [pdf, other]
Title: Versatile Teacher: A Class-aware Teacher-student Framework for Cross-domain Adaptation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[50]  arXiv:2405.11732 [pdf, ps, other]
Title: Quality assurance of organs-at-risk delineation in radiotherapy
Comments: 14 pages, 5 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[51]  arXiv:2405.11690 [pdf, other]
Title: InterAct: Capture and Modelling of Realistic, Expressive and Interactive Activities between Two Persons in Daily Scenarios
Comments: The first two authors contributed equally to this work
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[52]  arXiv:2405.11685 [pdf, other]
Title: ColorFoil: Investigating Color Blindness in Large Vision and Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[53]  arXiv:2405.11682 [pdf, other]
Title: FADet: A Multi-sensor 3D Object Detection Network based on Local Featured Attention
Comments: Submitted to IEEE
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[54]  arXiv:2405.11677 [pdf, other]
Title: Advancing 6-DoF Instrument Pose Estimation in Variable X-Ray Imaging Geometries
Comments: Early author version of paper. Refer to the full paper at this https URL
Journal-ref: IEEE Transactions on Image Processing (2024) (Volume: 33) Page(s): 2462 - 2476
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[55]  arXiv:2405.11675 [pdf, other]
Title: Deep Ensemble Art Style Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[56]  arXiv:2405.11655 [pdf, other]
Title: Track Anything Rapter(TAR)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[57]  arXiv:2405.11643 [pdf, other]
Title: Morphological Prototyping for Unsupervised Slide Representation Learning in Computational Pathology
Comments: CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Applications (stat.AP)
[58]  arXiv:2405.11629 [pdf, other]
Title: Searching Realistic-Looking Adversarial Objects For Autonomous Driving Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[59]  arXiv:2405.11621 [pdf, ps, other]
Title: Computer Vision in the Food Industry: Accurate, Real-time, and Automatic Food Recognition with Pretrained MobileNetV2
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[60]  arXiv:2405.11618 [pdf, other]
Title: Transcriptomics-guided Slide Representation Learning in Computational Pathology
Comments: CVPR'24, Oral
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[61]  arXiv:2405.11616 [pdf, other]
Title: Era3D: High-Resolution Multiview Diffusion using Efficient Row-wise Attention
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[62]  arXiv:2405.11614 [pdf, other]
Title: Nickel and Diming Your GAN: A Dual-Method Approach to Enhancing GAN Efficiency via Knowledge Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[63]  arXiv:2405.11582 [pdf, other]
Title: SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization
Comments: Accepted to ICML 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[64]  arXiv:2405.11574 [pdf, other]
Title: Reproducibility Study of CDUL: CLIP-Driven Unsupervised Learning for Multi-Label Image Classification
Comments: Reproducibility study
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[65]  arXiv:2405.11564 [pdf, other]
Title: CRF360D: Monocular 360 Depth Estimation via Spherical Fully-Connected CRFs
Authors: Zidong Cao, Lin Wang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[66]  arXiv:2405.11551 [pdf, other]
Title: An Invisible Backdoor Attack Based On Semantic Feature
Authors: Yangming Chen
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[67]  arXiv:2405.11536 [pdf, other]
Title: RobMOT: Robust 3D Multi-Object Tracking by Observational Noise and State Estimation Drift Mitigation on LiDAR PointCloud
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[68]  arXiv:2405.11526 [pdf, other]
Title: Register assisted aggregation for Visual Place Recognition
Authors: Xuan Yu, Zhenyong Fu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[69]  arXiv:2405.11523 [pdf, other]
Title: Diffusion-Based Hierarchical Image Steganography
Comments: arXiv admin note: text overlap with arXiv:2305.16936
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[70]  arXiv:2405.11511 [pdf, other]
Title: Online Action Representation using Change Detection and Symbolic Programming
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[71]  arXiv:2405.11501 [pdf, other]
Title: DogFLW: Dog Facial Landmarks in the Wild Dataset
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[72]  arXiv:2405.11498 [pdf, other]
Title: The Effectiveness of Edge Detection Evaluation Metrics for Automated Coastline Detection
Journal-ref: 2023 Photonics & Electromagnetics Research Symposium (PIERS)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[73]  arXiv:2405.11496 [pdf, other]
Title: DEMO: A Statistical Perspective for Efficient Image-Text Matching
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR)
[74]  arXiv:2405.11494 [pdf, other]
Title: Automated Coastline Extraction Using Edge Detection Algorithms
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[75]  arXiv:2405.11493 [pdf, other]
Title: Point Cloud Compression with Implicit Neural Representations: A Unified Framework
Comments: 6 Pages, 6 Figures, submitted to IEEE ICCC
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Signal Processing (eess.SP)
[76]  arXiv:2405.11491 [pdf, other]
Title: BOSC: A Backdoor-based Framework for Open Set Synthetic Image Attribution
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[77]  arXiv:2405.11487 [pdf, other]
Title: "Previously on ..." From Recaps to Story Summarization
Comments: CVPR 2024; Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[78]  arXiv:2405.11483 [pdf, other]
Title: MICap: A Unified Model for Identity-aware Movie Descriptions
Comments: CVPR 2024, Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[79]  arXiv:2405.11481 [pdf, other]
Title: Physics-aware Hand-object Interaction Denoising
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[80]  arXiv:2405.11478 [pdf, other]
Title: Unsupervised Image Prior via Prompt Learning and CLIP Semantic Guidance for Low-Light Image Enhancement
Comments: Accepted to CVPR 2024 Workshop NTIRE: New Trends in Image Restoration and Enhancement workshop and Challenges
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[81]  arXiv:2405.11476 [pdf, other]
Title: NubbleDrop: A Simple Way to Improve Matching Strategy for Prompted One-Shot Segmentation
Comments: Under Review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[82]  arXiv:2405.11473 [pdf, other]
Title: FIFO-Diffusion: Generating Infinite Videos from Text without Training
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[83]  arXiv:2405.11468 [pdf, other]
Title: Emphasizing Crucial Features for Efficient Image Restoration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[84]  arXiv:2405.11467 [pdf, other]
Title: AdaAugment: A Tuning-Free and Adaptive Approach to Enhance Data Augmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[85]  arXiv:2405.11448 [pdf, other]
Title: Cross-Domain Knowledge Distillation for Low-Resolution Human Pose Estimation
Comments: 11 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[86]  arXiv:2405.11442 [pdf, other]
Title: Unifying 3D Vision-Language Understanding via Promptable Queries
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[87]  arXiv:2405.11437 [pdf, other]
Title: The First Swahili Language Scene Text Detection and Recognition Dataset
Comments: Accepted to ICDAR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[88]  arXiv:2405.11351 [pdf, other]
Title: PlantTracing: Tracing Arabidopsis Thaliana Apex with CenterTrack
Comments: 4 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[89]  arXiv:2405.11345 [pdf, ps, other]
Title: City-Scale Multi-Camera Vehicle Tracking System with Improved Self-Supervised Camera Link Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[90]  arXiv:2405.11338 [pdf, ps, other]
Title: EyeFound: A Multimodal Generalist Foundation Model for Ophthalmic Imaging
Comments: 21 pages, 2 figures, 4 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[91]  arXiv:2405.11337 [pdf, other]
Title: A Unified Approach Towards Active Learning and Out-of-Distribution Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[92]  arXiv:2405.11336 [pdf, other]
Title: UPAM: Unified Prompt Attack in Text-to-Image Generation Models Against Both Textual Filters and Visual Checkers
Comments: Accepted by ICML2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[93]  arXiv:2405.11315 [pdf, other]
Title: MediCLIP: Adapting CLIP for Few-shot Medical Image Anomaly Detection
Comments: 12 pages, 3 figures, 5 tables, early accepted at MICCAI 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[94]  arXiv:2405.11293 [pdf, other]
Title: InfRS: Incremental Few-Shot Object Detection in Remote Sensing Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[95]  arXiv:2405.11286 [pdf, other]
Title: Motion Avatar: Generate Human and Animal Avatars with Arbitrary Motion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[96]  arXiv:2405.11276 [pdf, other]
Title: Visible and Clear: Finding Tiny Objects in Difference Map
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[97]  arXiv:2405.11270 [pdf, other]
Title: HR Human: Modeling Human Avatars with Triangular Mesh and High-Resolution Textures from Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[98]  arXiv:2405.11252 [pdf, other]
Title: Dreamer XL: Towards High-Resolution Text-to-3D Generation via Trajectory Score Matching
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[99]  arXiv:2405.11240 [pdf, other]
Title: Testing the Performance of Face Recognition for People with Down Syndrome
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[100]  arXiv:2405.11236 [pdf, other]
Title: TriLoRA: Integrating SVD for Advanced Style Personalization in Text-to-Image Generation
Comments: Accepted by AI for Content Creation (AI4CC) workshop at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[101]  arXiv:2405.11205 [pdf, other]
Title: Fuse & Calibrate: A bi-directional Vision-Language Guided Framework for Referring Image Segmentation
Comments: 12 pages, 4 figures ICIC2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[102]  arXiv:2405.11190 [pdf, other]
Title: ReasonPix2Pix: Instruction Reasoning Dataset for Advanced Image Editing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[103]  arXiv:2405.11180 [pdf, other]
Title: GestFormer: Multiscale Wavelet Pooling Transformer Network for Dynamic Hand Gesture Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[104]  arXiv:2405.11165 [pdf, other]
Title: Automated Multi-level Preference for MLLMs
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[105]  arXiv:2405.11158 [pdf, other]
Title: Dusk Till Dawn: Self-supervised Nighttime Stereo Depth Estimation using Visual Foundation Models
Comments: The paper is published at ICRA 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[106]  arXiv:2405.11154 [pdf, other]
Title: Revisiting the Robust Generalization of Adversarial Prompt Tuning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[107]  arXiv:2405.11151 [pdf, other]
Title: Multi-scale Information Sharing and Selection Network with Boundary Attention for Polyp Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[108]  arXiv:2405.11145 [pdf, other]
Title: Detecting Multimodal Situations with Insufficient Context and Abstaining from Baseless Predictions
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[109]  arXiv:2405.11129 [pdf, other]
Title: MotionGS : Compact Gaussian Splatting SLAM by Motion Filter
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[110]  arXiv:2405.11126 [pdf, other]
Title: Flexible Motion In-betweening with Diffusion Models
Comments: SIGGRAPH 2024. For project page and code, see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[111]  arXiv:2405.11112 [pdf, other]
Title: Enhancing Understanding Through Wildlife Re-Identification
Authors: J. Buitenhuis
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[112]  arXiv:2405.11067 [pdf, other]
Title: Bayesian Learning-driven Prototypical Contrastive Loss for Class-Incremental Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[113]  arXiv:2405.11021 [pdf, other]
Title: Photorealistic 3D Urban Scene Reconstruction and Point Cloud Extraction using Google Earth Imagery and Gaussian Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[114]  arXiv:2405.10954 [pdf, ps, other]
Title: Multimodal CLIP Inference for Meta-Few-Shot Image Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[115]  arXiv:2405.10952 [pdf, other]
Title: VICAN: Very Efficient Calibration Algorithm for Large Camera Networks
Comments: To appear at the IEEE International Conference on Robotics and Automation (ICRA), 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[116]  arXiv:2405.10951 [pdf, other]
Title: Block Selective Reprogramming for On-device Training of Vision Transformers
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[117]  arXiv:2405.10949 [pdf, other]
Title: Global License Plate Dataset
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[118]  arXiv:2405.10948 [pdf, other]
Title: Surgical-LVLM: Learning to Adapt Large Vision-Language Model for Grounded Visual Question Answering in Robotic Surgery
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO); Image and Video Processing (eess.IV)
[119]  arXiv:2405.10947 [pdf, other]
Title: Depth-aware Panoptic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[120]  arXiv:2405.10946 [pdf, other]
Title: Application of Tensorized Neural Networks for Cloud Classification
Comments: 11 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Atmospheric and Oceanic Physics (physics.ao-ph)
[121]  arXiv:2405.12171 (cross-list from cs.SE) [pdf, other]
Title: State of the Practice for Medical Imaging Software
Comments: 73 pages, 14 figures, 12 tables
Subjects: Software Engineering (cs.SE); Computer Vision and Pattern Recognition (cs.CV)
[122]  arXiv:2405.11880 (cross-list from cs.LG) [pdf, other]
Title: Quantifying In-Context Reasoning Effects and Memorization Effects in LLMs
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[123]  arXiv:2405.11829 (cross-list from cs.LG) [pdf, other]
Title: Adversarially Diversified Rehearsal Memory (ADRM): Mitigating Memory Overfitting Challenge in Continual Learning
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[124]  arXiv:2405.11708 (cross-list from cs.LG) [pdf, other]
Title: Adaptive Batch Normalization Networks for Adversarial Robustness
Comments: Accepted at IEEE International Conference on Advanced Video and Signal-based Surveillance (AVSS) 2024
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[125]  arXiv:2405.11659 (cross-list from cs.RO) [pdf, other]
Title: Auto-Platoon : Freight by example
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[126]  arXiv:2405.11640 (cross-list from cs.AI) [pdf, other]
Title: Inquire, Interact, and Integrate: A Proactive Agent Collaborative Framework for Zero-Shot Multimodal Medical Reasoning
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[127]  arXiv:2405.11598 (cross-list from eess.IV) [pdf, other]
Title: AI-Assisted Diagnosis for Covid-19 CXR Screening: From Data Collection to Clinical Validation
Comments: Accepted at 21st IEEE International Symposium on Biomedical Imaging (ISBI)
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[128]  arXiv:2405.11533 (cross-list from cs.LG) [pdf, other]
Title: Hierarchical Selective Classification
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[129]  arXiv:2405.11492 (cross-list from cs.RO) [pdf, other]
Title: Enhancing Vehicle Aerodynamics with Deep Reinforcement Learning in Voxelised Models
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[130]  arXiv:2405.11386 (cross-list from eess.IV) [pdf, other]
Title: Liver Fat Quantification Network with Body Shape
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[131]  arXiv:2405.11326 (cross-list from cs.LG) [pdf, other]
Title: On the Trajectory Regularity of ODE-based Diffusion Sampling
Comments: ICML 2024, 30 pages
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[132]  arXiv:2405.11320 (cross-list from cs.LG) [pdf, other]
Title: Sampling Strategies for Mitigating Bias in Face Synthesis Methods
Comments: Accepted to the BIAS 2023 ECML-PKDD Workshop
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[133]  arXiv:2405.11301 (cross-list from cs.CL) [pdf, other]
Title: Enhancing Fine-Grained Image Classifications via Cascaded Vision Language Models
Authors: Canshi Wei
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[134]  arXiv:2405.11298 (cross-list from cs.RO) [pdf, other]
Title: Visual Episodic Memory-based Exploration
Comments: FLAIRS 2023, 7 pages, 11 figures
Journal-ref: The International FLAIRS Conference Proceedings. Vol. 36. 2023
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[135]  arXiv:2405.11295 (cross-list from eess.IV) [pdf, ps, other]
Title: Medical Image Analysis for Detection, Treatment and Planning of Disease using Artificial Intelligence Approaches
Comments: 10 pages, 3 figures
Journal-ref: International Journal of Microsystems and IoT, Vol. 1, Issue 5, pp. 278-287, 2023
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[136]  arXiv:2405.11289 (cross-list from eess.IV) [pdf, other]
Title: Diffusion Model Driven Test-Time Image Adaptation for Robust Skin Lesion Classification
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[137]  arXiv:2405.11273 (cross-list from cs.AI) [pdf, other]
Title: Uni-MoE: Scaling Unified Multimodal LLMs with Mixture of Experts
Comments: 22 pages, 13 figures. Project Website: this https URL. Work in progress
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[138]  arXiv:2405.11176 (cross-list from cs.RO) [pdf, other]
Title: Outlier-Robust Long-Term Robotic Mapping Leveraging Ground Segmentation
Authors: Hyungtae Lim
Comments: 2 pages, 4 figures
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[139]  arXiv:2405.11133 (cross-list from eess.IV) [pdf, ps, other]
Title: XCAT-2.0: A Comprehensive Library of Personalized Digital Twins Derived from CT Scans
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[140]  arXiv:2405.11064 (cross-list from eess.SP) [pdf, other]
Title: TVCondNet: A Conditional Denoising Neural Network for NMR Spectroscopy
Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV)
[141]  arXiv:2405.11029 (cross-list from cs.LG) [pdf, other]
Title: Generative Artificial Intelligence: A Systematic Review and Applications
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[142]  arXiv:2405.10950 (cross-list from eess.IV) [pdf, ps, other]
Title: Classification of colorectal primer carcinoma from normal colon with mid-infrared spectra
Comments: 15 pages, 5 figures, 4 tables, Conferentia Chemometrica 2023 special edition, for the original digital location, see this https URL, digital biblio info: (2024) e3542
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Tissues and Organs (q-bio.TO)

Mon, 20 May 2024

[143]  arXiv:2405.10934 [pdf, other]
Title: Reconstruction of Manipulated Garment with Guided Deformation Prior
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[144]  arXiv:2405.10913 [pdf, other]
Title: Blackbox Adaptation for Medical Image Segmentation
Comments: Accepted early at MICCAI 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[145]  arXiv:2405.10885 [pdf, other]
Title: FA-Depth: Toward Fast and Accurate Self-supervised Monocular Depth Estimation
Authors: Fei Wang, Jun Cheng
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[146]  arXiv:2405.10879 [pdf, other]
Title: One registration is worth two segmentations
Comments: Early Accepted by MICCAI2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147]  arXiv:2405.10871 [pdf, other]
[148]  arXiv:2405.10868 [pdf, other]
Title: Air Signing and Privacy-Preserving Signature Verification for Digital Documents
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[149]  arXiv:2405.10864 [pdf, other]
Title: Improving face generation quality and prompt following with synthetic captions
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[150]  arXiv:2405.10842 [pdf, ps, other]
Title: Automated Radiology Report Generation: A Review of Recent Advances
Comments: 24 pages, 8 figures, 6 tables. Submitted to IEEE Reviews in Biomedical Engineering
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[151]  arXiv:2405.10832 [pdf, other]
Title: Open-Vocabulary Spatio-Temporal Action Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152]  arXiv:2405.10802 [pdf, other]
Title: Reduced storage direct tensor ring decomposition for convolutional neural networks compression
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[153]  arXiv:2405.10748 [pdf, other]
Title: Deep Data Consistency: a Fast and Robust Diffusion Model-based Solver for Inverse Problems
Comments: Codes: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[154]  arXiv:2405.10739 [pdf, other]
Title: Efficient Multimodal Large Language Models: A Survey
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[155]  arXiv:2405.10736 [pdf, other]
Title: StackOverflowVQA: Stack Overflow Visual Question Answering Dataset
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[156]  arXiv:2405.10718 [pdf, other]
Title: SignLLM: Sign Languages Production Large Language Models
Comments: 33 pages, website at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[157]  arXiv:2405.10707 [pdf, ps, other]
Title: HARIS: Human-Like Attention for Reference Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[158]  arXiv:2405.10696 [pdf, other]
Title: Autonomous AI-enabled Industrial Sorting Pipeline for Advanced Textile Recycling
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159]  arXiv:2405.10690 [pdf, other]
Title: CoLeaF: A Contrastive-Collaborative Learning Framework for Weakly Supervised Audio-Visual Video Parsing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[160]  arXiv:2405.10674 [pdf, other]
Title: From Sora What We Can See: A Survey of Text-to-Video Generation
Comments: A comprehensive list of text-to-video generation studies in this survey is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[161]  arXiv:2405.10612 [pdf, other]
Title: Not All Prompts Are Secure: A Switchable Backdoor Attack Against Pre-trained Vision Transformers
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[162]  arXiv:2405.10610 [pdf, other]
Title: Driving Referring Video Object Segmentation with Vision-Language Pre-trained Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[163]  arXiv:2405.10598 [pdf, other]
Title: Learning Object-Centric Representation via Reverse Hierarchy Guidance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164]  arXiv:2405.10591 [pdf, other]
Title: GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[165]  arXiv:2405.10589 [pdf, other]
Title: Improving Point-based Crowd Counting and Localization Based on Auxiliary Point Guidance
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[166]  arXiv:2405.10577 [pdf, other]
Title: DuoSpaceNet: Leveraging Both Bird's-Eye-View and Perspective View Representations for 3D Object Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[167]  arXiv:2405.10575 [pdf, other]
Title: Accurate Training Data for Occupancy Map Prediction in Automated Driving Using Evidence Theory
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168]  arXiv:2405.10567 [pdf, other]
Title: Team Samsung-RAL: Technical Report for 2024 RoboDrive Challenge-Robust Map Segmentation Track
Comments: ICRA 2024 RoboDrive Challenge Robust Map Segmentation Track 3rd Place Technical Report. arXiv admin note: text overlap with arXiv:2205.09743 by other authors
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[169]  arXiv:2405.10557 [pdf, other]
Title: Resolving Symmetry Ambiguity in Correspondence-based Methods for Instance-level Object Pose Estimation
Comments: 8 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[170]  arXiv:2405.10554 [pdf, other]
Title: NeRO: Neural Road Surface Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171]  arXiv:2405.10530 [pdf, other]
Title: CM-UNet: Hybrid CNN-Mamba UNet for Remote Sensing Image Semantic Segmentation
Comments: 5 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172]  arXiv:2405.10529 [pdf, other]
Title: Safeguarding Vision-Language Models Against Patched Visual Prompt Injectors
Comments: 15 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[173]  arXiv:2405.10518 [pdf, ps, other]
Title: Enhancing Perception Quality in Remote Sensing Image Compression via Invertible Neural Network
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[174]  arXiv:2405.10508 [pdf, other]
Title: ART3D: 3D Gaussian Splatting for Text-Guided Artistic Scenes Generation
Comments: Accepted at CVPR 2024 Workshop on AI3DG
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175]  arXiv:2405.10504 [pdf, ps, other]
Title: Multi-scale Semantic Prior Features Guided Deep Neural Network for Urban Street-view Image
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176]  arXiv:2405.10489 [pdf, other]
Title: MixCut: A Data Augmentation Method for Facial Expression Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177]  arXiv:2405.10456 [pdf, other]
Title: Region-level labels in ice charts can produce pixel-level segmentation for Sea Ice types
Comments: Published at ICLR 2024 Machine Learning for Remote Sensing (ML4RS) Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178]  arXiv:2405.10444 [pdf, other]
Title: A Novel Bounding Box Regression Method for Single Object Tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[179]  arXiv:2405.10439 [pdf, other]
Title: Beyond Traditional Single Object Tracking: A Survey
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[180]  arXiv:2405.10423 [pdf, other]
Title: Diversity-Aware Sign Language Production through a Pose Encoding Variational Autoencoder
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181]  arXiv:2405.10398 [pdf, other]
Title: Drone-type-Set: Drone types detection benchmark for drone detection and tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[182]  arXiv:2405.10370 [pdf, other]
Title: Grounded 3D-LLM with Referent Tokens
Comments: Preprint
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[183]  arXiv:2405.10357 [pdf, other]
Title: RGB Guided ToF Imaging System: A Survey of Deep Learning-based Methods
Comments: To appear in International Journal of Computer Vision (IJCV)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184]  arXiv:2405.10347 [pdf, other]
Title: Networking Systems for Video Anomaly Detection: A Tutorial and Survey
Comments: Submitted to ACM Computing Surveys, under review. For more information and supplementary material, please see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
[185]  arXiv:2405.10939 (cross-list from cs.LG) [pdf, other]
Title: DINO as a von Mises-Fisher mixture model
Comments: Accepted to ICLR 2023
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[186]  arXiv:2405.10870 (cross-list from eess.IV) [pdf, other]
Title: Multicenter Privacy-Preserving Model Training for Deep Learning Brain Metastases Autosegmentation
Comments: Submission to the Green Journal (Major Revision)
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[187]  arXiv:2405.10833 (cross-list from eess.IV) [pdf, other]
Title: Automatic segmentation of Organs at Risk in Head and Neck cancer patients from CT and MRI scans
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[188]  arXiv:2405.10803 (cross-list from eess.IV) [pdf, other]
Title: A Large-scale Multi Domain Leukemia Dataset for the White Blood Cells Detection with Morphological Attributes for Explainability
Comments: Early Accept
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[189]  arXiv:2405.10754 (cross-list from math.OC) [pdf, other]
Title: Stable Phase Retrieval with Mirror Descent
Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT)
[190]  arXiv:2405.10723 (cross-list from eess.IV) [pdf, other]
Title: Eddeep: Fast eddy-current distortion correction for diffusion MRI with deep learning
Comments: submitted to MICCAI 2024
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[191]  arXiv:2405.10705 (cross-list from eess.IV) [pdf, other]
Title: 3D Vessel Reconstruction from Sparse-View Dynamic DSA Images via Vessel Probability Guided Attenuation Learning
Comments: 12 pages, 13 figures, 5 tables
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[192]  arXiv:2405.10702 (cross-list from cs.CL) [pdf, ps, other]
Title: Empowering Prior to Court Legal Analysis: A Transparent and Accessible Dataset for Defensive Statement Classification and Interpretation
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[193]  arXiv:2405.10691 (cross-list from eess.IV) [pdf, other]
Title: LoCI-DiffCom: Longitudinal Consistency-Informed Diffusion Model for 3D Infant Brain Image Completion
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[194]  arXiv:2405.10561 (cross-list from eess.IV) [pdf, other]
Title: Infrared Image Super-Resolution via Lightweight Information Split Network
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[195]  arXiv:2405.10550 (cross-list from eess.IV) [pdf, other]
Title: LighTDiff: Surgical Endoscopic Image Low-Light Enhancement with T-Diffusion
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[196]  arXiv:2405.10531 (cross-list from cs.LG) [pdf, other]
Title: Nonparametric Teaching of Implicit Neural Representations
Comments: ICML 2024 (24 pages, 13 figures)
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[197]  arXiv:2405.10497 (cross-list from cs.MM) [pdf, other]
Title: SMP Challenge: An Overview and Analysis of Social Media Prediction Challenge
Comments: ACM Multimedia. arXiv admin note: text overlap with arXiv:1910.01795
Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Social and Information Networks (cs.SI)

Fri, 17 May 2024

[198]  arXiv:2405.10320 [pdf, other]
Title: Toon3D: Seeing Cartoons from a New Perspective
Comments: Please see our project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[199]  arXiv:2405.10317 [pdf, other]
Title: Text-to-Vector Generation with Neural Path Representation
Comments: Accepted by SIGGRAPH 2024. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[200]  arXiv:2405.10316 [pdf, other]
Title: Analogist: Out-of-the-box Visual In-Context Learning with Image Diffusion Model
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[201]  arXiv:2405.10314 [pdf, other]
Title: CAT3D: Create Anything in 3D with Multi-View Diffusion Models
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[202]  arXiv:2405.10305 [pdf, other]
Title: 4D Panoptic Scene Graph Generation
Comments: Accepted as NeurIPS 2023. Code: this https URL Previous Series: PSG this https URL and PVSG this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[203]  arXiv:2405.10300 [pdf, other]
Title: Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection
Comments: Technical report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[204]  arXiv:2405.10286 [pdf, other]
Title: FFF: Fixing Flawed Foundations in contrastive pre-training results in very strong Vision-Language models
Comments: Accepted at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[205]  arXiv:2405.10272 [pdf, other]
Title: Faces that Speak: Jointly Synthesising Talking Face and Speech from Text
Comments: CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[206]  arXiv:2405.10266 [pdf, other]
Title: A Tale of Two Languages: Large-Vocabulary Continuous Sign Language Recognition from Spoken Language Supervision
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[207]  arXiv:2405.10256 [pdf, other]
Title: Biasing & Debiasing based Approach Towards Fair Knowledge Transfer for Equitable Skin Analysis
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[208]  arXiv:2405.10255 [pdf, other]
Title: When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[209]  arXiv:2405.10244 [pdf, ps, other]
Title: Towards Task-Compatible Compressible Representations
Comments: To be published in ICME Workshops 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[210]  arXiv:2405.10185 [pdf, other]
Title: DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data
Comments: Accepted to CVPR 2024, codes are available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[211]  arXiv:2405.10175 [pdf, other]
Title: Filling Missing Values Matters for Range Image-Based Point Cloud Segmentation
Comments: This paper has been submitted to a journal
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[212]  arXiv:2405.10160 [pdf, other]
Title: PIR: Remote Sensing Image-Text Retrieval with Prior Instruction Representation Learning
Comments: 15 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[213]  arXiv:2405.10148 [pdf, other]
Title: SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[214]  arXiv:2405.10140 [pdf, other]
Title: Libra: Building Decoupled Vision System on Large Language Models
Comments: ICML2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[215]  arXiv:2405.10132 [pdf, other]
Title: Cooperative Visual-LiDAR Extrinsic Calibration Technology for Intersection Vehicle-Infrastructure: A review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[216]  arXiv:2405.10122 [pdf, other]
Title: Generating Coherent Sequences of Visual Illustrations for Real-World Manual Tasks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[217]  arXiv:2405.10082 [pdf, other]
Title: An Integrated Framework for Multi-Granular Explanation of Video Summarization
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[218]  arXiv:2405.10075 [pdf, other]
Title: HecVL: Hierarchical Video-Language Pretraining for Zero-shot Surgical Phase Recognition
Comments: Accepted by MICCAI2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[219]  arXiv:2405.10053 [pdf, other]
Title: SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection
Comments: Accepted as a conference paper (highlight) at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220]  arXiv:2405.10046 [pdf, other]
Title: A Preprocessing and Postprocessing Voxel-based Method for LiDAR Semantic Segmentation Improvement in Long Distance
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[221]  arXiv:2405.10041 [pdf, other]
Title: Revealing Hierarchical Structure of Leaf Venations in Plant Science via Label-Efficient Segmentation: Dataset and Method
Comments: Accepted by IJCAI2024, Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[222]  arXiv:2405.10037 [pdf, other]
Title: Bilateral Event Mining and Complementary for Event Stream Super-Resolution
Comments: Accepted to CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[223]  arXiv:2405.10030 [pdf, other]
Title: RSDehamba: Lightweight Vision Mamba for Remote Sensing Satellite Image Dehazing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[224]  arXiv:2405.10014 [pdf, other]
Title: Frequency-Domain Refinement with Multiscale Diffusion for Super Resolution
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[225]  arXiv:2405.10008 [pdf, other]
Title: Solving the enigma: Deriving optimal explanations of deep networks
Comments: keywords: XAI, neuroscience, brain, 3D, 2D, computer vision, classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[226]  arXiv:2405.09996 [pdf, other]
Title: Driving-Video Dehazing with Non-Aligned Regularization for Safety Assistance
Comments: Accepted by CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[227]  arXiv:2405.09985 [pdf, other]
Title: VirtualModel: Generating Object-ID-retentive Human-object Interaction Image by Diffusion Model for E-commerce Marketing
Comments: project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[228]  arXiv:2405.09981 [pdf, other]
Title: Adversarial Robustness for Visual Grounding of Multimodal Large Language Models
Comments: ICLR 2024 Workshop on Reliable and Responsible Foundation Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[229]  arXiv:2405.09976 [pdf, other]
Title: Language-Oriented Semantic Latent Representation for Image Transmission
Comments: Under review at IEEE International Workshop on Machine Learning for Signal Processing (MLSP) 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[230]  arXiv:2405.09964 [pdf, other]
Title: KPNDepth: Depth Estimation of Lane Images under Complex Rainy Environment
Authors: Zhengxu Shi
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[231]  arXiv:2405.09955 [pdf, other]
Title: Dual-band feature selection for maturity classification of specialty crops by hyperspectral imaging
Comments: Preprint: Paper submitted to the special issue of "Computers and Electronics in Agriculture"
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[232]  arXiv:2405.09942 [pdf, other]
Title: FPDIoU Loss: A Loss Function for Efficient Bounding Box Regression of Rotated Object Detection
Authors: Siliang Ma, Yong Xu
Comments: arXiv admin note: text overlap with arXiv:2307.07662, text overlap with arXiv:1902.09630 by other authors
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[233]  arXiv:2405.09934 [pdf, other]
Title: Detecting Domain Shift in Multiple Instance Learning for Digital Pathology Using Fréchet Domain Distance
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[234]  arXiv:2405.09933 [pdf, other]
Title: MiniMaxAD: A Lightweight Autoencoder for Feature-Rich Anomaly Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[235]  arXiv:2405.09931 [pdf, other]
Title: Learning from Observer Gaze: Zero-Shot Attention Prediction Oriented by Human-Object Interaction Recognition
Comments: Accepted by CVPR2024. Project HomePage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[236]  arXiv:2405.09924 [pdf, other]
Title: Infrared Adversarial Car Stickers
Comments: Accepted by CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[237]  arXiv:2405.09923 [pdf, other]
Title: NTIRE 2024 Restore Any Image Model (RAIM) in the Wild Challenge
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[238]  arXiv:2405.09922 [pdf, other]
Title: Cross-sensor self-supervised training and alignment for remote sensing
Authors: Valerio Marsocci (CEDRIC - VERTIGO, CNAM), Nicolas Audebert (CEDRIC - VERTIGO, CNAM, LaSTIG, IGN)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[239]  arXiv:2405.09902 [pdf, other]
Title: Unveiling the Potential: Harnessing Deep Metric Learning to Circumvent Video Streaming Encryption
Comments: Published in the WI-IAT 2023 proceedings
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
[240]  arXiv:2405.09883 [pdf, other]
Title: RoScenes: A Large-scale Multi-view 3D Dataset for Roadside Perception
Comments: Technical report. 32 pages, 21 figures, 13 tables. this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[241]  arXiv:2405.09882 [pdf, other]
Title: DiffAM: Diffusion-based Adversarial Makeup Transfer for Facial Privacy Protection
Comments: 16 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[242]  arXiv:2405.09880 [pdf, other]
Title: Deep Learning-Based Quasi-Conformal Surface Registration for Partial 3D Faces Applied to Facial Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[243]  arXiv:2405.09879 [pdf, other]
Title: Generative Unlearning for Any Identity
Comments: 15 pages, 17 figures, 10 tables, CVPR 2024 Poster
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[244]  arXiv:2405.09874 [pdf, other]
Title: Dual3D: Efficient and Consistent Text-to-3D Generation with Dual-mode Multi-view Latent Diffusion
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[245]  arXiv:2405.09873 [pdf, other]
Title: IRSRMamba: Infrared Image Super-Resolution via Mamba-based Wavelet Transform Feature Modulation Model
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[246]  arXiv:2405.09863 [pdf, other]
Title: Box-Free Model Watermarks Are Prone to Black-Box Removal Attacks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[247]  arXiv:2405.09858 [pdf, other]
Title: Towards Realistic Incremental Scenario in Class Incremental Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[248]  arXiv:2405.09828 [pdf, other]
Title: PillarNeXt: Improving the 3D detector by introducing Voxel2Pillar feature encoding and extracting multi-scale features
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[249]  arXiv:2405.09827 [pdf, other]
Title: Parallel Backpropagation for Shared-Feature Visualization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[250]  arXiv:2405.09806 [pdf, other]
Title: MediSyn: Text-Guided Diffusion Models for Broad Medical 2D and 3D Image Synthesis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[251]  arXiv:2405.09789 [pdf, other]
Title: LeMeViT: Efficient Vision Transformer with Learnable Meta Tokens for Remote Sensing Image Interpretation
Comments: Accepted by IJCAI'2024. The code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[252]  arXiv:2405.09782 [pdf, other]
Title: Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection
Comments: This paper has been accepted by ICML2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253]  arXiv:2405.09777 [pdf, other]
Title: Rethinking Barely-Supervised Segmentation from an Unsupervised Domain Adaptation Perspective
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[254]  arXiv:2405.09755 [pdf, other]
Title: Collision Avoidance Metric for 3D Camera Evaluation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[255]  arXiv:2405.09717 [pdf, other]
Title: From NeRFs to Gaussian Splats, and Back
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[256]  arXiv:2405.09713 [pdf, other]
Title: SOK-Bench: A Situated Video Reasoning Benchmark with Aligned Open-World Knowledge
Comments: CVPR
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[257]  arXiv:2405.09707 [pdf, other]
Title: Point2SSM++: Self-Supervised Learning of Anatomical Shape Models from Point Clouds
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[258]  arXiv:2405.09697 [pdf, other]
Title: Weakly Supervised Bayesian Shape Modeling from Unsegmented Medical Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[259]  arXiv:2405.09682 [pdf, other]
Title: Synth-to-Real Unsupervised Domain Adaptation for Instance Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[260]  arXiv:2405.09588 [pdf, ps, other]
Title: Training Deep Learning Models with Hybrid Datasets for Robust Automatic Target Detection on real SAR images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Signal Processing (eess.SP)
[261]  arXiv:2405.09582 [pdf, other]
Title: AD-Aligning: Emulating Human-like Generalization for Cognitive Domain Adaptation in Deep Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[262]  arXiv:2405.09550 [pdf, other]
Title: Mask-based Invisible Backdoor Attacks on Object Detection
Authors: Shin Jeong Jin
Comments: 7 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
[263]  arXiv:2405.10292 (cross-list from cs.AI) [pdf, other]
Title: Fine-Tuning Large Vision-Language Models as Decision-Making Agents via Reinforcement Learning
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[264]  arXiv:2405.10262 (cross-list from cs.LG) [pdf, other]
Title: Two-Phase Dynamics of Interactions Explains the Starting Point of a DNN Learning Over-Fitted Features
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[265]  arXiv:2405.10254 (cross-list from eess.IV) [pdf, other]
Title: PRISM: A Multi-Modal Generative Foundation Model for Slide-Level Histopathology
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[266]  arXiv:2405.10246 (cross-list from eess.IV) [pdf, other]
Title: A Foundation Model for Brain Lesion Segmentation with Mixture of Modality Experts
Comments: The work has been early accepted by MICCAI 2024
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[267]  arXiv:2405.10068 (cross-list from eess.IV) [pdf, other]
Title: MrRegNet: Multi-resolution Mask Guided Convolutional Neural Network for Medical Image Registration with Large Deformations
Comments: Accepted for publication at IEEE International Symposium on Biomedical Imaging (ISBI) 2024
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[268]  arXiv:2405.10020 (cross-list from cs.RO) [pdf, other]
Title: Natural Language Can Help Bridge the Sim2Real Gap
Comments: To appear in RSS 2024
Subjects: Robotics (cs.RO); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[269]  arXiv:2405.10004 (cross-list from eess.IV) [pdf, other]
Title: ROCOv2: Radiology Objects in COntext Version 2, an Updated Multimodal Image Dataset
Comments: Major revision Scientific Data
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[270]  arXiv:2405.09990 (cross-list from eess.IV) [pdf, other]
Title: Histopathology Foundation Models Enable Accurate Ovarian Cancer Subtype Classification
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[271]  arXiv:2405.09959 (cross-list from eess.IV) [pdf, other]
Title: Patient-Specific Real-Time Segmentation in Trackerless Brain Ultrasound
Comments: Early accept at MICCAI 2024 - code available at: this https URL
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[272]  arXiv:2405.09864 (cross-list from astro-ph.IM) [pdf, other]
Title: Solar multi-object multi-frame blind deconvolution with a spatially variant convolution neural emulator
Authors: A. Asensio Ramos (IAC+ULL)
Comments: 15 pages, 14 figures, accepted for publication in A&A
Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Computer Vision and Pattern Recognition (cs.CV)
[273]  arXiv:2405.09851 (cross-list from eess.IV) [pdf, other]
Title: Region of Interest Detection in Melanocytic Skin Tumor Whole Slide Images -- Nevus & Melanoma
Comments: 5 figures, NeurIPS 2022 Workshop
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[274]  arXiv:2405.09820 (cross-list from cs.LG) [pdf, other]
Title: Densely Distilling Cumulative Knowledge for Continual Learning
Comments: 12 pages; Continual Learning; Class-incremental Learning; Knowledge Distillation; Forgetting
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[275]  arXiv:2405.09814 (cross-list from cs.GR) [pdf, other]
Title: Semantic Gesticulator: Semantics-Aware Co-Speech Gesture Synthesis
Comments: SIGGRAPH 2024 (Journal Track); Project page: this https URL
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[276]  arXiv:2405.09798 (cross-list from cs.LG) [pdf, other]
Title: Many-Shot In-Context Learning in Multimodal Foundation Models
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[277]  arXiv:2405.09787 (cross-list from eess.IV) [pdf, other]
[278]  arXiv:2405.09716 (cross-list from eess.IV) [pdf, other]
Title: Illumination Histogram Consistency Metric for Quantitative Assessment of Video Sequences
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[279]  arXiv:2405.09711 (cross-list from cs.AI) [pdf, other]
Title: STAR: A Benchmark for Situated Reasoning in Real-World Videos
Comments: NeurIPS
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[280]  arXiv:2405.09695 (cross-list from cs.HC) [pdf, other]
Title: Enhancing Saliency Prediction in Monitoring Tasks: The Role of Visual Highlights
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
[281]  arXiv:2405.09601 (cross-list from physics.med-ph) [pdf, ps, other]
Title: Fully Automated OCT-based Tissue Screening System
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV)
[282]  arXiv:2405.09600 (cross-list from cs.LG) [pdf, other]
Title: Aggregate Representation Measure for Predictive Model Reusability
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[283]  arXiv:2405.09594 (cross-list from eess.IV) [pdf, other]
Title: Learning Generalized Medical Image Representations through Image-Graph Contrastive Pretraining
Comments: Accepted into Machine Learning for Health (ML4H) 2023
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[284]  arXiv:2405.09589 (cross-list from cs.LG) [pdf, other]
Title: Unveiling Hallucination in Text, Image, Video, and Audio Foundation Models: A Comprehensive Survey
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[285]  arXiv:2405.09586 (cross-list from eess.IV) [pdf, other]
Title: Factual Serialization Enhancement: A Key Innovation for Chest X-ray Report Generation
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[286]  arXiv:2405.09558 (cross-list from eess.SP) [pdf, other]
Title: An EM Body Model for Device-Free Localization with Multiple Antenna Receivers: A First Study
Journal-ref: 2023 IEEE-APS Topical Conference on Antennas and Propagation in Wireless Communications (APWC)
Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[287]  arXiv:2405.09552 (cross-list from eess.IV) [pdf, other]
Title: ODFormer: Semantic Fundus Image Segmentation Using Transformer for Optic Nerve Head Detection
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)

Thu, 16 May 2024

[288]  arXiv:2405.09546 [pdf, other]
Title: BEHAVIOR Vision Suite: Customizable Dataset Generation via Simulation
Comments: CVPR 2024 (Highlight). Project website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[289]  arXiv:2405.09544 [pdf, other]
Title: Classifying geospatial objects from multiview aerial imagery using semantic meshes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290]  arXiv:2405.09487 [pdf, other]
Title: Color Space Learning for Cross-Color Person Re-Identification
Comments: Accepted by ICME 2024 (Oral)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[291]  arXiv:2405.09463 [pdf, other]
Title: Gaze-DETR: Using Expert Gaze to Reduce False Positives in Vulvovaginal Candidiasis Screening
Comments: MICCAI-2024 early accept. Our code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[292]  arXiv:2405.09459 [pdf, other]
Title: Fourier Boundary Features Network with Wider Catchers for Glass Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[293]  arXiv:2405.09431 [pdf, other]
Title: A Survey On Text-to-3D Contents Generation In The Wild
Authors: Chenhan Jiang
Comments: 11 pages, 10 figures, 4 tables. arXiv admin note: text overlap with arXiv:2401.17807 by other authors
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[294]  arXiv:2405.09426 [pdf, other]
Title: Global-Local Image Perceptual Score (GLIPS): Evaluating Photorealistic Quality of AI-Generated Images
Comments: 10 pages, 3 figures. Submitted to IEEE Transactions on Human-Machine Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[295]  arXiv:2405.09409 [pdf, ps, other]
Title: Real-World Federated Learning in Radiology: Hurdles to overcome and Benefits to gain
Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[296]  arXiv:2405.09404 [pdf, other]
Title: Time-Equivariant Contrastive Learning for Degenerative Disease Progression in Retinal OCT
Comments: Accepted at MICCAI 2024 (early accept, top 11%)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[297]  arXiv:2405.09403 [pdf, other]
Title: Identity Overlap Between Face Recognition Train/Test Data: Causing Optimistic Bias in Accuracy Measurement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[298]  arXiv:2405.09365 [pdf, other]
Title: SARATR-X: A Foundation Model for Synthetic Aperture Radar Images Target Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[299]  arXiv:2405.09355 [pdf, other]
Title: Vision-Based Neurosurgical Guidance: Unsupervised Localization and Camera-Pose Prediction
Comments: Early Accept at MICCAI 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[300]  arXiv:2405.09342 [pdf, other]
Title: Progressive Depth Decoupling and Modulating for Flexible Depth Completion
Comments: The article is accepted by IEEE Transactions on Instrumentation & Measurement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[301]  arXiv:2405.09334 [pdf, other]
Title: Content-Based Image Retrieval for Multi-Class Volumetric Radiology Images: A Benchmark Study
Comments: 23 pages, 9 Figures, 13 Tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR)
[302]  arXiv:2405.09333 [pdf, other]
Title: Application of Gated Recurrent Units for CT Trajectory Optimization
Comments: 4 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[303]  arXiv:2405.09321 [pdf, other]
Title: ReconBoost: Boosting Can Achieve Modality Reconcilement
Comments: This paper has been accepted by ICML2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[304]  arXiv:2405.09291 [pdf, other]
Title: Sensitivity Decouple Learning for Image Compression Artifacts Reduction
Comments: Accepted by Transactions on Image Processing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[305]  arXiv:2405.09288 [pdf, other]
Title: DeCoDEx: Confounder Detector Guidance for Improved Diffusion-based Counterfactual Explanations
Comments: Accepted to Medical Imaging with Deep Learning (MIDL) 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[306]  arXiv:2405.09266 [pdf, other]
Title: Dance Any Beat: Blending Beats with Visuals in Dance Video Generation
Comments: 11 pages, 6 figures, demo page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[307]  arXiv:2405.09247 [pdf, other]
Title: Graph Neural Network based Handwritten Trajectories Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[308]  arXiv:2405.09215 [pdf, other]
Title: Xmodel-VLM: A Simple Baseline for Multimodal Vision Language Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[309]  arXiv:2405.09194 [pdf, ps, other]
Title: Flexible image analysis for law enforcement agencies with deep neural networks to determine: where, who and what
Journal-ref: SPIE - Counterterrorism, Crime Fighting, Forensics, and Surveillance Technologies II, 2018, pp.27
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[310]  arXiv:2405.09152 [pdf, other]
Title: Scalable Image Coding for Humans and Machines Using Feature Fusion Network
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[311]  arXiv:2405.09150 [pdf, other]
Title: Curriculum Dataset Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[312]  arXiv:2405.09148 [pdf, ps, other]
Title: A Hierarchically Feature Reconstructed Autoencoder for Unsupervised Anomaly Detection
Comments: 12 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[313]  arXiv:2405.09138 [pdf, other]
Title: OpenGait: A Comprehensive Benchmark Study for Gait Recognition towards Better Practicality
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[314]  arXiv:2405.09131 [pdf, other]
Title: RobustMVS: Single Domain Generalized Deep Multi-view Stereo
Comments: Accepted to TCSVT. Code will be released at: this https URL. Benchmark will be released at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[315]  arXiv:2405.09125 [pdf, other]
Title: HAAP: Vision-context Hierarchical Attention Autoregressive with Adaptive Permutation for Scene Text Recognition
Comments: 12 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[316]  arXiv:2405.09114 [pdf, other]
Title: SOEDiff: Efficient Distillation for Small Object Editing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[317]  arXiv:2405.09083 [pdf, other]
Title: RSHazeDiff: A Unified Fourier-aware Diffusion Model for Remote Sensing Image Dehazing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[318]  arXiv:2405.09059 [pdf, other]
Title: Task-adaptive Q-Face
Comments: Ever submitted to ECCV2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[319]  arXiv:2405.09056 [pdf, other]
Title: CTS: A Consistency-Based Medical Image Segmentation Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[320]  arXiv:2405.09054 [pdf, other]
Title: Dim Small Target Detection and Tracking: A Novel Method Based on Temporal Energy Selective Scaling and Trajectory Association
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[321]  arXiv:2405.09050 [pdf, other]
Title: 3D Shape Augmentation with Content-Aware Shape Resizing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[322]  arXiv:2405.09045 [pdf, other]
Title: AMSNet: Netlist Dataset for AMS Circuits
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[323]  arXiv:2405.09041 [pdf, other]
Title: Learning from Partial Label Proportions for Whole Slide Image Segmentation
Comments: Accepted at MICCAI2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[324]  arXiv:2405.09032 [pdf, other]
Title: ICAL: Implicit Character-Aided Learning for Enhanced Handwritten Mathematical Expression Recognition
Comments: Accepted by ICDAR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[325]  arXiv:2405.09024 [pdf, other]
Title: Dynamic Loss Decay based Robust Oriented Object Detection on Remote Sensing Images with Noisy Labels
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[326]  arXiv:2405.09006 [pdf, other]
Title: Spatial Semantic Recurrent Mining for Referring Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[327]  arXiv:2405.08996 [pdf, other]
Title: Learning Correspondence for Deformable Objects
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[328]  arXiv:2405.08992 [pdf, other]
Title: Contextual Emotion Recognition using Large Vision Language Models
Comments: 8 pages, website: this https URL. arXiv admin note: text overlap with arXiv:2310.19995
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[329]  arXiv:2405.08991 [pdf, other]
Title: Theoretical Analysis for Expectation-Maximization-Based Multi-Model 3D Registration
Comments: arXiv admin note: substantial text overlap with arXiv:2402.10865
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[330]  arXiv:2405.08961 [pdf, other]
Title: Bird's-Eye View to Street-View: A Survey
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[331]  arXiv:2405.08932 [pdf, other]
Title: Self-supervised vision-langage alignment of deep learning representations for bone X-rays analysis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[332]  arXiv:2405.08911 [pdf, other]
Title: CLIP with Quality Captions: A Strong Pretraining for Vision Tasks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[333]  arXiv:2405.08909 [pdf, other]
Title: ADA-Track: End-to-End Multi-Camera 3D Multi-Object Tracking with Alternating Detection and Association
Comments: 14 pages, 3 figures, accepted by CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[334]  arXiv:2405.08890 [pdf, other]
Title: Language-Guided Self-Supervised Video Summarization Using Text Semantic Matching Considering the Diversity of the Video
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[335]  arXiv:2405.09539 (cross-list from eess.IV) [pdf, ps, other]
Title: MMFusion: Multi-modality Diffusion Model for Lymph Node Metastasis Diagnosis in Esophageal Cancer
Comments: Early accepted to MICCAI 2024 (6/6/5)
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[336]  arXiv:2405.09530 (cross-list from cs.CY) [pdf, other]
[337]  arXiv:2405.09472 (cross-list from eess.IV) [pdf, other]
Title: Perception- and Fidelity-aware Reduced-Reference Super-Resolution Image Quality Assessment
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[338]  arXiv:2405.09353 (cross-list from eess.IV) [pdf, other]
Title: Large coordinate kernel attention network for lightweight image super-resolution
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[339]  arXiv:2405.09298 (cross-list from eess.IV) [pdf, ps, other]
Title: Deep Blur Multi-Model (DeepBlurMM) -- a strategy to mitigate the impact of image blur on deep learning model performance in histopathology image analysis
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[340]  arXiv:2405.09286 (cross-list from cs.MM) [pdf, other]
Title: MVBIND: Self-Supervised Music Recommendation For Videos Via Embedding Space Binding
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[341]  arXiv:2405.09077 (cross-list from eess.IV) [pdf, other]
Title: Compressive Feature Selection for Remote Visual Multi-Task Inference
Comments: 6 pages, 8 figures, IEEE ICME Workshop on Coding for Machines
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[342]  arXiv:2405.09049 (cross-list from cs.LG) [pdf, other]
Title: Perception Without Vision for Trajectory Prediction: Ego Vehicle Dynamics as Scene Representation for Efficient Active Learning in Autonomous Driving
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[343]  arXiv:2405.08981 (cross-list from cs.HC) [pdf, other]
Title: Impact of Design Decisions in Scanpath Modeling
Comments: 16 pages
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[344]  arXiv:2405.08920 (cross-list from cs.LG) [pdf, other]
Title: Neural Collapse Meets Differential Privacy: Curious Behaviors of NoisyGD with Near-perfect Representation Learning
Comments: To appear in ICML 2024
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Wed, 15 May 2024

[345]  arXiv:2405.08816 [pdf, other]
[346]  arXiv:2405.08815 [pdf, other]
Title: Efficient Vision-Language Pre-training by Cluster Masking
Comments: CVPR 2024, Project page: this https URL, Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[347]  arXiv:2405.08813 [pdf, other]
Title: CinePile: A Long Video Question Answering Dataset and Benchmark
Comments: Project page with all the artifacts - this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[348]  arXiv:2405.08807 [pdf, other]
Title: SciFIBench: Benchmarking Large Multimodal Models for Scientific Figure Interpretation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[349]  arXiv:2405.08794 [pdf, other]
Title: Ambiguous Annotations: When is a Pedestrian not a Pedestrian?
Comments: Paper accepted at the CVPR 2024 Vision and Language for Autonomous Driving and Robotics Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[350]  arXiv:2405.08786 [pdf, other]
Title: Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[351]  arXiv:2405.08780 [pdf, ps, other]
Title: Harnessing the power of longitudinal medical imaging for eye disease prognosis using Transformer-based sequence modeling
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[352]  arXiv:2405.08776 [pdf, ps, other]
Title: FolkTalent: Enhancing Classification and Tagging of Indian Folk Paintings
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[353]  arXiv:2405.08768 [pdf, other]
Title: EfficientTrain++: Generalized Curriculum Learning for Efficient Visual Backbone Training
Comments: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI). Journal version of arXiv:2211.09703 (ICCV 2023). Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[354]  arXiv:2405.08765 [pdf, other]
Title: Image to Pseudo-Episode: Boosting Few-Shot Segmentation by Unlabeled Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[355]  arXiv:2405.08748 [pdf, other]
[356]  arXiv:2405.08720 [pdf, other]
Title: The Lost Melody: Empirical Observations on Text-to-Video Generation From A Storytelling Perspective
Comments: To appear at CVPR 2024 Workshop on AI for Content Creation (AI4CC)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[357]  arXiv:2405.08717 [pdf, other]
Title: How Much You Ate? Food Portion Estimation on Spoons
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[358]  arXiv:2405.08715 [pdf, other]
Title: DeVOS: Flow-Guided Deformable Transformer for Video Object Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[359]  arXiv:2405.08695 [pdf, other]
Title: The impact of Compositionality in Zero-shot Multi-label action recognition for Object-based tasks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[360]  arXiv:2405.08681 [pdf, other]
Title: Achieving Fairness Through Channel Pruning for Dermatological Disease Diagnosis
Comments: 13 pages, 3 figures, early accepted by International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[361]  arXiv:2405.08668 [pdf, other]
Title: Promoting AI Equity in Science: Generalized Domain Prompt Learning for Accessible VLM Research
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Applications (stat.AP)
[362]  arXiv:2405.08609 [pdf, other]
Title: Dynamic NeRF: A Review
Authors: Jinwei Lin
Comments: 25 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[363]  arXiv:2405.08593 [pdf, other]
Title: Open-Vocabulary Object Detection via Neighboring Region Attention Alignment
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[364]  arXiv:2405.08589 [pdf, other]
Title: Variable Substitution and Bilinear Programming for Aligning Partially Overlapping Point Sets
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[365]  arXiv:2405.08587 [pdf, other]
Title: EchoTracker: Advancing Myocardial Point Tracking in Echocardiography
Comments: Submitted version that got provisionally (early) accepted (top 11%) to MICCAI2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[366]  arXiv:2405.08586 [pdf, other]
Title: Cross-Domain Feature Augmentation for Domain Generalization
Comments: Accepted to the 33rd International Joint Conference on Artificial Intelligence (IJCAI 2024); Code is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[367]  arXiv:2405.08578 [pdf, ps, other]
Title: Local-peak scale-invariant feature transform for fast and random image stitching
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[368]  arXiv:2405.08555 [pdf, other]
Title: Dual-Branch Network for Portrait Image Quality Assessment
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[369]  arXiv:2405.08547 [pdf, other]
Title: Exploring Graph-based Knowledge: Multi-Level Feature Distillation via Channels Relational Graph
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[370]  arXiv:2405.08533 [pdf, other]
Title: Dynamic Feature Learning and Matching for Class-Incremental Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[371]  arXiv:2405.08493 [pdf, ps, other]
Title: Rethinking Scanning Strategies with Vision Mamba in Semantic Segmentation of Remote Sensing Imagery: An Experimental Study
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[372]  arXiv:2405.08487 [pdf, other]
Title: Semantic Contextualization of Face Forgery: A New Definition, Dataset, and Detection Method
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[373]  arXiv:2405.08483 [pdf, other]
Title: RDPN6D: Residual-based Dense Point-wise Network for 6Dof Object Pose Estimation Based on RGB-D Images
Comments: Accepted by CVPR Workshop DLGC, 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[374]  arXiv:2405.08463 [pdf, other]
Title: A Timely Survey on Vision Transformer for Deepfake Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[375]  arXiv:2405.08458 [pdf, other]
Title: Rethinking Prior Information Generation with CLIP for Few-Shot Segmentation
Comments: Accepted by CVPR 2024; The camera-ready version
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[376]  arXiv:2405.08434 [pdf, other]
Title: TP3M: Transformer-based Pseudo 3D Image Matching with Reference
Comments: Accepted by ICRA 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[377]  arXiv:2405.08429 [pdf, other]
Title: TEDNet: Twin Encoder Decoder Neural Network for 2D Camera and LiDAR Road Detection
Comments: Source code: this https URL
Journal-ref: M Bayón-Gutiérrez, MT García-Ordás, H Alaiz Moretón, J Aveleira-Mata, S Rubio-Martín, JA Benítez-Andrades. TEDNet: Twin Encoder Decoder Neural Network for 2D Camera and LiDAR Road Detection. Logic Journal of the IGPL. 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[378]  arXiv:2405.08419 [pdf, other]
Title: WaterMamba: Visual State Space Model for Underwater Image Enhancement
Comments: arXiv admin note: substantial text overlap with arXiv:2403.06098
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[379]  arXiv:2405.08344 [pdf, other]
Title: No Time to Waste: Squeeze Time into Channel for Mobile Video Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[380]  arXiv:2405.08337 [pdf, ps, other]
Title: Perivascular space Identification Nnunet for Generalised Usage (PINGU)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[381]  arXiv:2405.08329 [pdf, other]
Title: Cross-Dataset Generalization For Retinal Lesions Segmentation
Comments: 6 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[382]  arXiv:2405.08322 [pdf, other]
Title: StraightPCF: Straight Point Cloud Filtering
Comments: This paper has been accepted to the IEEE/CVF CVPR Conference, 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[383]  arXiv:2405.08300 [pdf, other]
Title: Vector-Symbolic Architecture for Event-Based Optical Flow
Subjects: Computer Vision and Pattern Recognition (cs.CV); Symbolic Computation (cs.SC)
[384]  arXiv:2405.08272 [pdf, other]
Title: VS-Assistant: Versatile Surgery Assistant on the Demand of Surgeons
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[385]  arXiv:2405.08270 [pdf, other]
Title: Towards Clinician-Preferred Segmentation: Leveraging Human-in-the-Loop for Test Time Adaptation in Medical Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[386]  arXiv:2405.08263 [pdf, other]
Title: Palette-based Color Transfer between Images
Authors: Chenlei Lv, Dan Zhang
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[387]  arXiv:2405.08251 [pdf, other]
Title: Multimodal Collaboration Networks for Geospatial Vehicle Detection in Dense, Occluded, and Large-Scale Events
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[388]  arXiv:2405.08246 [pdf, other]
Title: Compositional Text-to-Image Generation with Dense Blob Representations
Comments: ICML 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[389]  arXiv:2405.08245 [pdf, ps, other]
Title: Progressive enhancement and restoration for mural images under low-light and defected conditions based on multi-receptive field strategy
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[390]  arXiv:2405.08210 [pdf, other]
Title: Infinite Texture: Text-guided High Resolution Diffusion Texture Synthesis
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[391]  arXiv:2405.08204 [pdf, other]
Title: A Semantic and Motion-Aware Spatiotemporal Transformer Network for Action Detection
Comments: IEEE Transactions on Pattern Analysis and Machine Intelligence (2024)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[392]  arXiv:2405.08197 [pdf, other]
Title: IHC Matters: Incorporating IHC analysis to H&E Whole Slide Image Analysis for Improved Cancer Grading via Two-stage Multimodal Bilinear Pooling Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[393]  arXiv:2405.08114 [pdf, other]
Title: RATLIP: Generative Adversarial CLIP Text-to-Image Synthesis Based on Recurrent Affine Transformations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[394]  arXiv:2405.08055 [pdf, other]
Title: DiffTF++: 3D-aware Diffusion Transformer for Large-Vocabulary 3D Generation
Comments: arXiv admin note: substantial text overlap with arXiv:2309.07920
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[395]  arXiv:2405.08766 (cross-list from cs.LG) [pdf, other]
Title: Energy-based Hopfield Boosting for Out-of-Distribution Detection
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[396]  arXiv:2405.08745 (cross-list from eess.IV) [pdf, other]
Title: Enhancing Blind Video Quality Assessment with Rich Quality-aware Features
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[397]  arXiv:2405.08733 (cross-list from cs.GR) [pdf, other]
Title: A Simple Approach to Differentiable Rendering of SDFs
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[398]  arXiv:2405.08672 (cross-list from eess.IV) [pdf, other]
Title: EndoDAC: Efficient Adapting Foundation Model for Self-Supervised Depth Estimation from Any Endoscopic Camera
Comments: early accepted by MICCAI 2024
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[399]  arXiv:2405.08658 (cross-list from eess.IV) [pdf, other]
Title: Beyond the Black Box: Do More Complex Models Provide Superior XAI Explanations?
Comments: 15 pages, 9 figures, 5 tables
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[400]  arXiv:2405.08657 (cross-list from eess.IV) [pdf, other]
Title: Self-supervised learning improves robustness of deep learning lung tumor segmentation to CT imaging differences
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[401]  arXiv:2405.08654 (cross-list from cs.LG) [pdf, other]
Title: Can we Defend Against the Unknown? An Empirical Study About Threshold Selection for Neural Network Monitoring
Comments: 13 pages, 5 figures, 6 tables. To appear in the proceedings of the 40th Conference on Uncertainty in Artificial Intelligence (UAI 2024)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[402]  arXiv:2405.08621 (cross-list from eess.IV) [pdf, other]
Title: RMT-BVQA: Recurrent Memory Transformer-based Blind Video Quality Assessment for Enhanced Video Content
Comments: 8 pages, 2 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[403]  arXiv:2405.08576 (cross-list from cs.RO) [pdf, other]
Title: Hearing Touch: Audio-Visual Pretraining for Contact-Rich Manipulation
Comments: Accepted to ICRA 2024
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[404]  arXiv:2405.08556 (cross-list from eess.IV) [pdf, other]
Title: Shape-aware synthesis of pathological lung CT scans using CycleGAN for enhanced semi-supervised lung segmentation
Comments: 14 pages, 7 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[405]  arXiv:2405.08431 (cross-list from eess.IV) [pdf, other]
Title: Similarity Metrics for MR Image-To-Image Translation
Comments: 29 pages, 6 figures, appendix with 5 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[406]  arXiv:2405.08423 (cross-list from eess.IV) [pdf, other]
Title: NAFRSSR: a Lightweight Recursive Network for Efficient Stereo Image Super-Resolution
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[407]  arXiv:2405.08363 (cross-list from cs.CR) [pdf, other]
Title: UnMarker: A Universal Attack on Defensive Watermarking
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[408]  arXiv:2405.08340 (cross-list from cs.CR) [pdf, other]
Title: Achieving Resolution-Agnostic DNN-based Image Watermarking: A Novel Perspective of Implicit Neural Representation
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[409]  arXiv:2405.08297 (cross-list from cs.LG) [pdf, ps, other]
Title: Distance-Restricted Explanations: Theoretical Underpinnings & Efficient Implementation
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[410]  arXiv:2405.08282 (cross-list from eess.IV) [pdf, ps, other]
Title: Automatic Segmentation of the Kidneys and Cystic Renal Lesions on Non-Contrast CT Using a Convolutional Neural Network
Authors: Lucas Aronson (1), Ruben Ngnitewe Massaa (1), Syed Jamal Safdar Gardezi (1), Andrew L. Wentland (1,2,3) ((1) Department of Radiology, University of Wisconsin School of Medicine & Public Health, Madison, WI, USA, (2) Department of Medical Physics, University of Wisconsin School of Medicine & Public Health, Madison, WI, USA, (3) Department of Biomedical Engineering, University of Wisconsin School of Medicine & Public Health, Madison, WI, USA)
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[411]  arXiv:2405.08275 (cross-list from math.OC) [pdf, other]
Title: Power of $\ell_1$-Norm Regularized Kaczmarz Algorithms for High-Order Tensor Recovery
Comments: arXiv admin note: text overlap with arXiv:2311.00783
Subjects: Optimization and Control (math.OC); Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA)
[412]  arXiv:2405.08209 (cross-list from cs.CY) [pdf, other]
Title: Who's in and who's out? A case study of multimodal CLIP-filtering in DataComp
Comments: Content warning: This paper discusses societal stereotypes and sexually-explicit material that may be disturbing, distressing, and/or offensive to the reader
Subjects: Computers and Society (cs.CY); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[413]  arXiv:2405.08169 (cross-list from eess.IV) [pdf, other]
Title: Rethinking Histology Slide Digitization Workflows for Low-Resource Settings
Comments: MICCAI 2024 Early Accept. First four authors contributed equally
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[414]  arXiv:2405.08119 (cross-list from eess.SY) [pdf, other]
Title: GPS-IMU Sensor Fusion for Reliable Autonomous Vehicle Position Estimation
Comments: 6 pages, 4 figures, and conference
Subjects: Systems and Control (eess.SY); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[415]  arXiv:2405.08054 (cross-list from cs.GR) [pdf, other]
Title: Coin3D: Controllable and Interactive 3D Assets Generation with Proxy-Guided Conditioning
Comments: Project webpage: this https URL
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[416]  arXiv:2405.08049 (cross-list from eess.IV) [pdf, other]
Title: Optimizing Synthetic Correlated Diffusion Imaging for Breast Cancer Tumour Delineation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[417]  arXiv:2405.08042 (cross-list from cs.HC) [pdf, other]
Title: LLAniMAtion: LLAMA Driven Gesture Animation
Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[418]  arXiv:2405.08038 (cross-list from cs.LG) [pdf, other]
Title: Feature Expansion and enhanced Compression for Class Incremental Learning
Authors: Quentin Ferdinand (ENSTA Bretagne, Lab-STICC_MATRIX), Gilles Le Chenadec (ENSTA Bretagne, Lab-STICC_MATRIX), Benoit Clement (CROSSING, ENSTA Bretagne, Lab-STICC_MATRIX), Panagiotis Papadakis (Lab-STICC_RAMBO, IMT Atlantique - INFO), Quentin Oliveau
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[419]  arXiv:2405.08020 (cross-list from cs.LG) [pdf, other]
Title: ReActXGB: A Hybrid Binary Convolutional Neural Network Architecture for Improved Performance and Computational Efficiency
Comments: Accepted to ICCE-TW 2024
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[420]  arXiv:2405.07994 (cross-list from eess.IV) [pdf, ps, other]
Title: BubbleID: A Deep Learning Framework for Bubble Interface Dynamics Analysis
Comments: 16 pages, 4 figures
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
