Computer Vision and Pattern Recognition

Authors and titles for recent submissions

[ total of 593 entries ]

Fri, 26 Apr 2024

[1]  arXiv:2404.16831 [pdf, other]
[2]  arXiv:2404.16829 [pdf, other]
Title: Make-it-Real: Unleashing Large Multimodal Model's Ability for Painting 3D Objects with Realistic Materials
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[3]  arXiv:2404.16828 [pdf, other]
Title: Made to Order: Discovering monotonic temporal changes via self-supervised video ordering
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[4]  arXiv:2404.16825 [pdf, other]
Title: ResVR: Joint Rescaling and Viewport Rendering of Omnidirectional Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[5]  arXiv:2404.16824 [pdf, other]
Title: V2A-Mark: Versatile Deep Visual-Audio Watermarking for Manipulation Localization and Copyright Protection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[6]  arXiv:2404.16821 [pdf, other]
Title: How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites
Comments: Technical report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[7]  arXiv:2404.16820 [pdf, other]
Title: Revisiting Text-to-Image Evaluation with Gecko: On Metrics, Prompts, and Human Ratings
Comments: Data and code will be released at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8]  arXiv:2404.16818 [pdf, other]
Title: Boosting Unsupervised Semantic Segmentation with Principal Mask Proposals
Comments: Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[9]  arXiv:2404.16814 [pdf, other]
Title: Meta-Transfer Derm-Diagnosis: Exploring Few-Shot Learning and Transfer Learning for Skin Disease Classification in Long-Tail Distribution
Comments: 17 pages, 5 figures, 6 tables, submitted to IEEE Journal of Biomedical and Health Informatics
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[10]  arXiv:2404.16804 [pdf, other]
Title: AAPL: Adding Attributes to Prompt Learning for Vision-Language Models
Comments: Accepted to CVPR 2024 Workshop on Prompting in Vision, Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[11]  arXiv:2404.16790 [pdf, other]
Title: SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[12]  arXiv:2404.16781 [pdf, other]
Title: Registration by Regression (RbR): a framework for interpretable and flexible atlas registration
Comments: 11 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[13]  arXiv:2404.16773 [pdf, other]
Title: ConKeD++ -- Improving descriptor learning for retinal image registration: A comprehensive study of contrastive losses
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[14]  arXiv:2404.16771 [pdf, other]
Title: ConsistentID: Portrait Generation with Multimodal Fine-Grained Identity Preserving
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[15]  arXiv:2404.16754 [pdf, other]
Title: RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[16]  arXiv:2404.16752 [pdf, other]
Title: TokenHMR: Advancing Human Mesh Recovery with a Tokenized Pose Representation
Comments: CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[17]  arXiv:2404.16748 [pdf, other]
Title: TELA: Text to Layer-wise 3D Clothed Human Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18]  arXiv:2404.16739 [pdf, ps, other]
Title: CBRW: A Novel Approach for Cancelable Biometric Template Generation based on
Authors: Nitin Kumar, Manisha
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[19]  arXiv:2404.16717 [pdf, other]
Title: Embracing Diversity: Interpretable Zero-shot classification beyond one vector per class
Comments: Accepted to FAccT 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC)
[20]  arXiv:2404.16687 [pdf, other]
Title: NTIRE 2024 Quality Assessment of AI-Generated Content Challenge
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[21]  arXiv:2404.16685 [pdf, other]
Title: Multi-scale HSV Color Feature Embedding for High-fidelity NIR-to-RGB Spectrum Translation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[22]  arXiv:2404.16678 [pdf, other]
Title: Multimodal Semantic-Aware Automatic Colorization with Diffusion Prior
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23]  arXiv:2404.16670 [pdf, other]
Title: EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning
Comments: Accepted by CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[24]  arXiv:2404.16666 [pdf, other]
Title: PhyRecon: Physically Plausible Neural Scene Reconstruction
Comments: project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[25]  arXiv:2404.16637 [pdf, other]
Title: Zero-Shot Distillation for Image Encoders: How to Make Effective Use of Synthetic Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26]  arXiv:2404.16635 [pdf, other]
Title: TinyChart: Efficient Chart Understanding with Visual Token Merging and Program-of-Thoughts Learning
Comments: 13 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[27]  arXiv:2404.16633 [pdf, other]
Title: Self-Balanced R-CNN for Instance Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28]  arXiv:2404.16622 [pdf, other]
Title: DAVE -- A Detect-and-Verify Paradigm for Low-Shot Counting
Comments: Accepted to CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29]  arXiv:2404.16617 [pdf, other]
Title: Denoising: from classical methods to deep CNNs
Comments: 33 pages, 33 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); History and Overview (math.HO)
[30]  arXiv:2404.16612 [pdf, other]
Title: MuseumMaker: Continual Style Customization without Catastrophic Forgetting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31]  arXiv:2404.16609 [pdf, other]
Title: SFMViT: SlowFast Meet ViT in Chaotic World
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[32]  arXiv:2404.16581 [pdf, other]
Title: AudioScenic: Audio-Driven Video Scene Editing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33]  arXiv:2404.16578 [pdf, other]
Title: Road Surface Friction Estimation for Winter Conditions Utilising General Visual Features
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[34]  arXiv:2404.16573 [pdf, other]
Title: Multi-Scale Representations by Varying Window Attention for Semantic Segmentation
Comments: ICLR2024 Poster
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[35]  arXiv:2404.16571 [pdf, other]
Title: MonoPCC: Photometric-invariant Cycle Constraint for Monocular Depth Estimation of Endoscopic Images
Comments: 9 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36]  arXiv:2404.16561 [pdf, ps, other]
Title: Research on geometric figure classification algorithm based on Deep Learning
Comments: 6 pages, 9 figures
Journal-ref: Scientific Journal of Intelligent Systems Research, Volume 4, Issue 6, 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[37]  arXiv:2404.16558 [pdf, other]
Title: DeepKalPose: An Enhanced Deep-Learning Kalman Filter for Temporally Consistent Monocular Vehicle Pose Estimation
Comments: 4 pages, 3 figures, published in IET Electronics Letters
Journal-ref: Electronics Letters (ISSN: 00135194), year: 2024, volume: 60, number: 8, start page: ?
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[38]  arXiv:2404.16557 [pdf, other]
Title: Energy-Latency Manipulation of Multi-modal Large Language Models via Verbose Samples
Comments: arXiv admin note: substantial text overlap with arXiv:2401.11170
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[39]  arXiv:2404.16556 [pdf, other]
Title: Conditional Distribution Modelling for Few-Shot Image Synthesis with Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[40]  arXiv:2404.16552 [pdf, other]
Title: Efficient Solution of Point-Line Absolute Pose
Comments: CVPR 2024, 11 pages, 8 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[41]  arXiv:2404.16548 [pdf, other]
Title: Cross-Domain Spatial Matching for Camera and Radar Sensor Data Fusion in Autonomous Vehicle Perception System
Comments: 12 pages including highlights and graphical abstract, submitted to Expert Systems with Applications journal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[42]  arXiv:2404.16538 [pdf, other]
Title: OpenDlign: Enhancing Open-World 3D Learning with Depth-Aligned Images
Comments: 12 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[43]  arXiv:2404.16536 [pdf, other]
Title: 3D Face Modeling via Weakly-supervised Disentanglement Network joint Identity-consistency Prior
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[44]  arXiv:2404.16507 [pdf, other]
Title: Semantic-aware Next-Best-View for Multi-DoFs Mobile System in Search-and-Acquisition based Visual Perception
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[45]  arXiv:2404.16501 [pdf, other]
Title: 360SFUDA++: Towards Source-free UDA for Panoramic Segmentation by Learning Reliable Category Prototypes
Comments: arXiv admin note: substantial text overlap with arXiv:2403.12505
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46]  arXiv:2404.16493 [pdf, other]
Title: Commonsense Prototype for Outdoor Unsupervised 3D Object Detection
Comments: Accepted by CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[47]  arXiv:2404.16484 [pdf, other]
[48]  arXiv:2404.16474 [pdf, other]
Title: DiffSeg: A Segmentation Model for Skin Lesions Based on Diffusion Difference
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[49]  arXiv:2404.16471 [pdf, other]
Title: COBRA -- COnfidence score Based on shape Regression Analysis for method-independent quality assessment of object pose estimation from single images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[50]  arXiv:2404.16456 [pdf, other]
Title: Correlation-Decoupled Knowledge Distillation for Multimodal Sentiment Analysis with Incomplete Modalities
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[51]  arXiv:2404.16452 [pdf, other]
Title: PAD: Patch-Agnostic Defense against Adversarial Patch Attacks
Comments: Accepted by CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[52]  arXiv:2404.16451 [pdf, other]
Title: Latent Modulated Function for Computational Optimal Continuous Image Representation
Authors: Zongyao He, Zhi Jin
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[53]  arXiv:2404.16432 [pdf, other]
Title: Point-JEPA: A Joint Embedding Predictive Architecture for Self-Supervised Learning on Point Cloud
Comments: 10 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[54]  arXiv:2404.16429 [pdf, other]
Title: Depth Supervised Neural Surface Reconstruction from Airborne Imagery
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[55]  arXiv:2404.16423 [pdf, other]
Title: Neural Assembler: Learning to Generate Fine-Grained Robotic Assembly Instructions from Multi-View Images
Authors: Hongyu Yan, Yadong Mu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[56]  arXiv:2404.16422 [pdf, other]
Title: Robust Fine-tuning for Pre-trained 3D Point Cloud Models
Comments: 9 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[57]  arXiv:2404.16421 [pdf, other]
Title: SynCellFactory: Generative Data Augmentation for Cell Tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[58]  arXiv:2404.16416 [pdf, other]
Title: Learning Discriminative Spatio-temporal Representations for Semi-supervised Action Recognition
Comments: 10 pages, 6 figures, 6 tables, 56 conferences
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[59]  arXiv:2404.16409 [pdf, other]
Title: Cross-sensor super-resolution of irregularly sampled Sentinel-2 time series
Authors: Aimi Okabayashi (IRISA, OBELIX), Nicolas Audebert (CEDRIC - VERTIGO, CNAM, LaSTIG, IGN), Simon Donike (IPL), Charlotte Pelletier (OBELIX, IRISA)
Journal-ref: EARTHVISION 2024 IEEE/CVF CVPR Workshop. Large Scale Computer Vision for Remote Sensing Imagery, Jun 2024, Seattle, United States
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[60]  arXiv:2404.16398 [pdf, other]
Title: Revisiting Relevance Feedback for CLIP-based Interactive Image Retrieval
Comments: 20 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[61]  arXiv:2404.16386 [pdf, other]
Title: Promoting CNNs with Cross-Architecture Knowledge Distillation for Efficient Monocular Depth Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[62]  arXiv:2404.16385 [pdf, other]
Title: Efficiency in Focus: LayerNorm as a Catalyst for Fine-tuning Medical Visual Language Pre-trained Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[63]  arXiv:2404.16380 [pdf, ps, other]
Title: Efficient Higher-order Convolution for Small Kernels in Deep Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[64]  arXiv:2404.16375 [pdf, other]
Title: List Items One by One: A New Data Source and Learning Paradigm for Multimodal LLMs
Comments: Preprint
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[65]  arXiv:2404.16371 [pdf, other]
Title: Multimodal Information Interaction for Medical Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[66]  arXiv:2404.16359 [pdf, other]
Title: An Improved Graph Pooling Network for Skeleton-Based Action Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[67]  arXiv:2404.16348 [pdf, other]
Title: Dual Expert Distillation Network for Generalized Zero-Shot Learning
Comments: 11 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[68]  arXiv:2404.16339 [pdf, other]
Title: Training-Free Unsupervised Prompt for Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[69]  arXiv:2404.16331 [pdf, other]
Title: IMWA: Iterative Model Weight Averaging Benefits Class-Imbalanced Learning Tasks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[70]  arXiv:2404.16325 [pdf, other]
Title: Semantic Segmentation Refiner for Ultrasound Applications with Zero-Shot Foundation Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[71]  arXiv:2404.16323 [pdf, other]
Title: DIG3D: Marrying Gaussian Splatting with Deformable Transformer for Single Image 3D Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[72]  arXiv:2404.16306 [pdf, other]
Title: TI2V-Zero: Zero-Shot Image Conditioning for Text-to-Video Diffusion Models
Comments: CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[73]  arXiv:2404.16304 [pdf, other]
Title: BezierFormer: A Unified Architecture for 2D and 3D Lane Detection
Comments: ICME 2024, 11 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[74]  arXiv:2404.16302 [pdf, other]
Title: CFMW: Cross-modality Fusion Mamba for Multispectral Object Detection under Adverse Weather Conditions
Comments: The dataset and source code will be made publicly available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Robotics (cs.RO); Image and Video Processing (eess.IV)
[75]  arXiv:2404.16301 [pdf, other]
Title: Style Adaptation for Domain-adaptive Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[76]  arXiv:2404.16296 [pdf, ps, other]
Title: Research on Splicing Image Detection Algorithms Based on Natural Image Statistical Characteristics
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[77]  arXiv:2404.16268 [pdf, other]
Title: Lacunarity Pooling Layers for Plant Image Classification using Texture Analysis
Comments: 9 pages, 7 figures, accepted at 2024 IEEE/CVF Computer Vision and Pattern Recognition Vision for Agriculture Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[78]  arXiv:2404.16266 [pdf, other]
Title: A Multi-objective Optimization Benchmark Test Suite for Real-time Semantic Segmentation
Comments: 8 pages, 16 figures, GECCO 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[79]  arXiv:2404.16223 [pdf, other]
Title: Deep RAW Image Super-Resolution. A NTIRE 2024 Challenge Survey
Comments: CVPR 2024 - NTIRE Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[80]  arXiv:2404.16222 [pdf, other]
Title: Step Differences in Instructional Video
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[81]  arXiv:2404.16221 [pdf, other]
Title: NeRF-XL: Scaling NeRFs with Multiple GPUs
Comments: Webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Graphics (cs.GR)
[82]  arXiv:2404.16216 [pdf, other]
Title: ActiveRIR: Active Audio-Visual Exploration for Acoustic Environment Modeling
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[83]  arXiv:2404.16205 [pdf, other]
Title: AIS 2024 Challenge on Video Quality Assessment of User-Generated Content: Methods and Results
Comments: CVPR 2024 Workshop -- AI for Streaming (AIS) Video Quality Assessment Challenge
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[84]  arXiv:2404.16193 [pdf, other]
Title: Improving Multi-label Recognition using Class Co-Occurrence Probabilities
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[85]  arXiv:2404.16155 [pdf, other]
Title: Does SAM dream of EIG? Characterizing Interactive Segmenter Performance using Expected Information Gain
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Machine Learning (cs.LG)
[86]  arXiv:2404.16139 [pdf, other]
Title: A Survey on Intermediate Fusion Methods for Collaborative Perception Categorized by Real World Challenges
Comments: 8 pages, 6 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[87]  arXiv:2404.16136 [pdf, other]
Title: 3D Human Pose Estimation with Occlusions: Introducing BlendMimic3D Dataset and GCN Refinement
Comments: Accepted at 6th Workshop and Competition on Affective Behavior Analysis in-the-wild - CVPR 2024 Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[88]  arXiv:2404.16133 [pdf, ps, other]
Title: Quantitative Characterization of Retinal Features in Translated OCTA
Comments: The article has been revised and edited
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[89]  arXiv:2404.16123 [pdf, other]
Title: FairDeDup: Detecting and Mitigating Vision-Language Fairness Disparities in Semantic Dataset Deduplication
Comments: Conference paper at CVPR 2024. 6 pages, 8 figures. Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[90]  arXiv:2404.16038 [pdf, other]
Title: A Survey on Generative AI and LLM for Video Generation, Understanding, and Streaming
Comments: 16 pages, 10 figures, 4 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[91]  arXiv:2404.16037 [pdf, other]
Title: VN-Net: Vision-Numerical Fusion Graph Convolutional Network for Sparse Spatio-Temporal Meteorological Forecasting
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Atmospheric and Oceanic Physics (physics.ao-ph)
[92]  arXiv:2404.16823 (cross-list from cs.RO) [pdf, other]
Title: Learning Visuotactile Skills with Two Multifingered Hands
Comments: Code and Project Website: this https URL
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[93]  arXiv:2404.16767 (cross-list from cs.LG) [pdf, other]
Title: REBEL: Reinforcement Learning via Regressing Relative Rewards
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[94]  arXiv:2404.16718 (cross-list from eess.IV) [pdf, other]
Title: Features Fusion for Dual-View Mammography Mass Detection
Comments: Accepted at ISBI 2024 (21st IEEE International Symposium on Biomedical Imaging)
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[95]  arXiv:2404.16708 (cross-list from eess.IV) [pdf, other]
Title: Multi-view Cardiac Image Segmentation via Trans-Dimensional Priors
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[96]  arXiv:2404.16529 (cross-list from cs.RO) [pdf, other]
Title: Vision-based robot manipulation of transparent liquid containers in a laboratory setting
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[97]  arXiv:2404.16510 (cross-list from cs.GR) [pdf, other]
Title: Interactive3D: Create What You Want by Interactive 3D Generation
Comments: project page: this https URL
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[98]  arXiv:2404.16482 (cross-list from q-bio.NC) [pdf, other]
Title: CoCoG: Controllable Visual Stimuli Generation based on Human Concept Representations
Subjects: Neurons and Cognition (q-bio.NC); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[99]  arXiv:2404.16397 (cross-list from eess.IV) [pdf, other]
Title: Deep Learning-based Prediction of Breast Cancer Tumor and Immune Phenotypes from Histopathology
Comments: Paper accepted at the First Workshop on Imageomics (Imageomics-AAAI-24) - Discovering Biological Knowledge from Images using AI (this https URL), held as part of the 38th Annual AAAI Conference on Artificial Intelligence (this https URL)
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Quantitative Methods (q-bio.QM)
[100]  arXiv:2404.16346 (cross-list from eess.IV) [pdf, other]
Title: Light-weight Retinal Layer Segmentation with Global Reasoning
Comments: IEEE Transactions on Instrumentation & Measurement
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[101]  arXiv:2404.16336 (cross-list from cs.LG) [pdf, other]
Title: FedStyle: Style-Based Federated Learning Crowdsourcing Framework for Art Commissions
Comments: Accepted to ICME 2024
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[102]  arXiv:2404.16307 (cross-list from cs.LG) [pdf, other]
Title: Boosting Model Resilience via Implicit Adversarial Data Augmentation
Comments: 9 pages, 6 figures, accepted by IJCAI 2024
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[103]  arXiv:2404.16300 (cross-list from cs.LG) [pdf, other]
Title: Reinforcement Learning with Generative Models for Compact Support Sets
Comments: 4 pages, 2 figures. Code available at: this https URL
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[104]  arXiv:2404.16292 (cross-list from cs.GR) [pdf, other]
Title: One Noise to Rule Them All: Learning a Unified Model of Spatially-Varying Noise Patterns
Comments: In ACM Transactions on Graphics (Proceedings of SIGGRAPH) 2024, 21 pages
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[105]  arXiv:2404.16255 (cross-list from cs.CR) [pdf, other]
Title: Enhancing Privacy in Face Analytics Using Fully Homomorphic Encryption
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[106]  arXiv:2404.16212 (cross-list from cs.CR) [pdf, other]
Title: An Analysis of Recent Advances in Deepfake Image Detection in an Evolving Threat Landscape
Comments: Accepted to IEEE S&P 2024; 19 pages, 10 figures
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[107]  arXiv:2404.16192 (cross-list from cs.CL) [pdf, other]
Title: Fusion of Domain-Adapted Vision and Language Models for Medical Visual Question Answering
Comments: Clinical NLP @ NAACL 2024
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[108]  arXiv:2404.16174 (cross-list from cs.HC) [pdf, other]
Title: MiMICRI: Towards Domain-centered Counterfactual Explanations of Cardiovascular Image Classification Models
Comments: 14 pages, 6 figures, ACM FAccT 2024
Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[109]  arXiv:2404.16112 (cross-list from cs.LG) [pdf, other]
Title: Mamba-360: Survey of State Space Models as Transformer Alternative for Long Sequence Modelling: Methods, Applications, and Challenges
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[110]  arXiv:2404.16080 (cross-list from eess.IV) [pdf, other]
Title: Enhancing Diagnosis through AI-driven Analysis of Reflectance Confocal Microscopy
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[111]  arXiv:2404.16049 (cross-list from physics.med-ph) [pdf, other]
Title: Exploring the limitations of blood pressure estimation using the photoplethysmography signal
Comments: 17 pages, 7 figures, 3 tables
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Signal Processing (eess.SP)
[112]  arXiv:2404.15405 (cross-list from astro-ph.SR) [pdf, ps, other]
Title: Photometry of Saturated Stars with Machine Learning
Authors: Dominek Winecki (1), Christopher S. Kochanek (2) ((1) Dept. of Computer Science and Engineering, The Ohio State University, (2) Dept. of Astronomy, The Ohio State University)
Comments: submitted to ApJ
Subjects: Solar and Stellar Astrophysics (astro-ph.SR); Instrumentation and Methods for Astrophysics (astro-ph.IM); Computer Vision and Pattern Recognition (cs.CV)

Thu, 25 Apr 2024

[113]  arXiv:2404.16035 [pdf, other]
Title: MaGGIe: Masked Guided Gradual Human Instance Matting
Comments: CVPR 2024. Project link: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[114]  arXiv:2404.16033 [pdf, other]
Title: Cantor: Inspiring Multimodal Chain-of-Thought of MLLM
Comments: The project page is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[115]  arXiv:2404.16030 [pdf, other]
Title: MoDE: CLIP Data Experts via Clustering
Comments: IEEE CVPR 2024 Camera Ready. Code Link: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[116]  arXiv:2404.16029 [pdf, other]
Title: Editable Image Elements for Controllable Synthesis
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[117]  arXiv:2404.16022 [pdf, other]
Title: PuLID: Pure and Lightning ID Customization via Contrastive Alignment
Comments: Tech Report. Codes and models will be available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[118]  arXiv:2404.16017 [pdf, other]
Title: RetinaRegNet: A Versatile Approach for Retinal Image Registration
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computer Science and Game Theory (cs.GT); Machine Learning (cs.LG)
[119]  arXiv:2404.16012 [pdf, other]
Title: GaussianTalker: Real-Time High-Fidelity Talking Head Synthesis with Audio-Driven 3D Gaussian Splatting
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[120]  arXiv:2404.16006 [pdf, other]
Title: MMT-Bench: A Comprehensive Multimodal Benchmark for Evaluating Large Vision-Language Models Towards Multitask AGI
Comments: 77 pages, 41 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[121]  arXiv:2404.16000 [pdf, other]
Title: A comprehensive and easy-to-use multi-domain multi-task medical imaging meta-dataset (MedIMeta)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[122]  arXiv:2404.15992 [pdf, other]
Title: HDDGAN: A Heterogeneous Dual-Discriminator Generative Adversarial Network for Infrared and Visible Image Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[123]  arXiv:2404.15979 [pdf, other]
Title: On the Fourier analysis in the SO(3) space: EquiLoPO Network
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Group Theory (math.GR)
[124]  arXiv:2404.15956 [pdf, other]
Title: A Survey on Visual Mamba
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[125]  arXiv:2404.15955 [pdf, other]
Title: Beyond Deepfake Images: Detecting AI-Generated Videos
Comments: To be published in CVPRW24
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[126]  arXiv:2404.15946 [pdf, ps, other]
Title: Mammo-CLIP: Leveraging Contrastive Language-Image Pre-training (CLIP) for Enhanced Breast Cancer Diagnosis with Multi-view Mammography
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[127]  arXiv:2404.15909 [pdf, other]
Title: Learning Long-form Video Prior via Generative Pre-Training
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[128]  arXiv:2404.15903 [pdf, other]
Title: Drawing the Line: Deep Segmentation for Extracting Art from Ancient Etruscan Mirrors
Comments: 19 pages, accepted at ICDAR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[129]  arXiv:2404.15891 [pdf, other]
Title: OMEGAS: Object Mesh Extraction from Large Scenes Guided by Gaussian Segmentation
Comments: arXiv admin note: text overlap with arXiv:2311.17061 by other authors
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[130]  arXiv:2404.15889 [pdf, other]
Title: Sketch2Human: Deep Human Generation with Disentangled Geometry and Appearance Control
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[131]  arXiv:2404.15882 [pdf, ps, other]
Title: Unexplored Faces of Robustness and Out-of-Distribution: Covariate Shifts in Environment and Sensor Domains
Comments: Published as a conference paper at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[132]  arXiv:2404.15881 [pdf, other]
Title: Steal Now and Attack Later: Evaluating Robustness of Object Detection against Black-box Adversarial Attacks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[133]  arXiv:2404.15879 [pdf, other]
Title: Revisiting Out-of-Distribution Detection in LiDAR-based 3D Object Detection
Comments: Accepted for publication at the 2024 35th IEEE Intelligent Vehicles Symposium (IV 2024), June 2-5, 2024, in Jeju Island, Korea
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[134]  arXiv:2404.15851 [pdf, ps, other]
Title: Porting Large Language Models to Mobile Devices for Question Answering
Authors: Hannes Fassold
Comments: Accepted for ASPAI 2024 Conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[135]  arXiv:2404.15817 [pdf, other]
Title: Vision Transformer-based Adversarial Domain Adaptation
Authors: Yahan Li, Yuan Wu
Comments: 6 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[136]  arXiv:2404.15815 [pdf, other]
Title: Single-View Scene Point Cloud Human Grasp Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137]  arXiv:2404.15812 [pdf, other]
Title: Facilitating Advanced Sentinel-2 Analysis Through a Simplified Computation of Nadir BRDF Adjusted Reflectance
Comments: Submitted to FOSS4G Europe 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Instrumentation and Methods for Astrophysics (astro-ph.IM)
[138]  arXiv:2404.15802 [pdf, other]
Title: Raformer: Redundancy-Aware Transformer for Video Wire Inpainting
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[139]  arXiv:2404.15790 [pdf, other]
Title: Leveraging Large Language Models for Multimodal Search
Comments: Published at CVPRW 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140]  arXiv:2404.15789 [pdf, other]
Title: MotionMaster: Training-free Camera Motion Transfer For Video Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141]  arXiv:2404.15785 [pdf, other]
Title: Seeing Beyond Classes: Zero-Shot Grounded Situation Recognition via Language Explainer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[142]  arXiv:2404.15781 [pdf, other]
Title: Real-Time Compressed Sensing for Joint Hyperspectral Image Transmission and Restoration for CubeSat
Comments: Accepted by TGRS 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[143]  arXiv:2404.15774 [pdf, other]
Title: Toward Physics-Aware Deep Learning Architectures for LiDAR Intensity Simulation
Comments: 7 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[144]  arXiv:2404.15771 [pdf, other]
Title: DVF: Advancing Robust and Accurate Fine-Grained Image Retrieval with Retrieval Guidelines
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[145]  arXiv:2404.15770 [pdf, other]
Title: ChEX: Interactive Localization and Region Description in Chest X-rays
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[146]  arXiv:2404.15765 [pdf, other]
Title: 3D Face Morphing Attack Generation using Non-Rigid Registration
Comments: Accepted to 2024 18th International Conference on Automatic Face and Gesture Recognition (FG) as short paper
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147]  arXiv:2404.15743 [pdf, other]
Title: SRAGAN: Saliency Regularized and Attended Generative Adversarial Network for Chinese Ink-wash Painting Generation
Authors: Xiang Gao, Yuqi Zhang
Comments: 25 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148]  arXiv:2404.15736 [pdf, other]
Title: What Makes Multimodal In-Context Learning Work?
Comments: 20 pages, 16 figures. Accepted to CVPR 2024 Workshop on Prompting in Vision. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[149]  arXiv:2404.15734 [pdf, other]
Title: Fine-grained Spatial-temporal MLP Architecture for Metro Origin-Destination Prediction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[150]  arXiv:2404.15721 [pdf, other]
Title: SPARO: Selective Attention for Robust and Compositional Transformer Encodings for Vision
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[151]  arXiv:2404.15719 [pdf, other]
Title: HDBN: A Novel Hybrid Dual-branch Network for Robust Skeleton-based Action Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[152]  arXiv:2404.15714 [pdf, other]
Title: Ada-DF: An Adaptive Label Distribution Fusion Network For Facial Expression Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[153]  arXiv:2404.15709 [pdf, other]
Title: ViViDex: Learning Vision-based Dexterous Manipulation from Human Videos
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
[154]  arXiv:2404.15707 [pdf, other]
Title: ESR-NeRF: Emissive Source Reconstruction Using LDR Multi-view Images
Comments: CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[155]  arXiv:2404.15700 [pdf, other]
Title: MAS-SAM: Segment Any Marine Animal with Aggregated Features
Comments: Accepted by IJCAI2024. More modifications may be performed
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[156]  arXiv:2404.15697 [pdf, other]
Title: DeepFeatureX Net: Deep Features eXtractors based Network for discriminating synthetic from real images
Authors: Orazio Pontorno (1), Luca Guarnera (1), Sebastiano Battiato (1) ((1) University of Catania)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[157]  arXiv:2404.15683 [pdf, other]
Title: AnoFPDM: Anomaly Segmentation with Forward Process of Diffusion Models for Brain MRI
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[158]  arXiv:2404.15677 [pdf, other]
Title: CharacterFactory: Sampling Consistent Characters with GANs for Diffusion Models
Comments: Code will be released very soon: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159]  arXiv:2404.15672 [pdf, other]
Title: Representing Part-Whole Hierarchies in Foundation Models by Learning Localizability, Composability, and Decomposability from Anatomy via Self-Supervision
Comments: Accepted at CVPR 2024 [main conference]
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[160]  arXiv:2404.15655 [pdf, other]
Title: Multi-Modal Proxy Learning Towards Personalized Visual Multiple Clustering
Comments: Accepted by CVPR 2024. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[161]  arXiv:2404.15653 [pdf, other]
Title: CatLIP: CLIP-level Visual Recognition Accuracy with 2.7x Faster Pre-training on Web-scale Image-Text Data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[162]  arXiv:2404.15644 [pdf, other]
Title: Building-PCC: Building Point Cloud Completion Benchmarks
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[163]  arXiv:2404.15638 [pdf, other]
Title: PriorNet: A Novel Lightweight Network with Multidimensional Interactive Attention for Efficient Image Dehazing
Comments: 8 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[164]  arXiv:2404.15635 [pdf, other]
Title: A Real-time Evaluation Framework for Pedestrian's Potential Risk at Non-Signalized Intersections Based on Predicted Post-Encroachment Time
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[165]  arXiv:2404.15608 [pdf, other]
Title: Understanding and Improving CNNs with Complex Structure Tensor: A Biometrics Study
Comments: preprint manuscript
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166]  arXiv:2404.15592 [pdf, other]
Title: ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[167]  arXiv:2404.15591 [pdf, other]
Title: Domain Adaptation for Learned Image Compression with Supervised Adapters
Comments: 10 pages, published to Data compression conference 2024 (DCC2024)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[168]  arXiv:2404.15580 [pdf, other]
Title: MiM: Mask in Mask Self-Supervised Pre-Training for 3D Medical Image Analysis
Comments: submitted to journal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[169]  arXiv:2404.15564 [pdf, other]
Title: Guided AbsoluteGrad: Magnitude of Gradients Matters to Explanation's Localization and Saliency
Authors: Jun Huang, Yan Liu
Comments: CAI2024 Camera-ready Submission
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[170]  arXiv:2404.15552 [pdf, other]
Title: Cross-Temporal Spectrogram Autoencoder (CTSAE): Unsupervised Dimensionality Reduction for Clustering Gravitational Wave Glitches
Subjects: Computer Vision and Pattern Recognition (cs.CV); Instrumentation and Methods for Astrophysics (astro-ph.IM); Machine Learning (cs.LG); General Relativity and Quantum Cosmology (gr-qc)
[171]  arXiv:2404.15523 [pdf, other]
Title: Understanding Hyperbolic Metric Learning through Hard Negative Sampling
Comments: published in Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision. 2024. arXiv admin note: text overlap with arXiv:2203.10833 by other authors
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172]  arXiv:2404.15516 [pdf, other]
Title: Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval
Comments: 15 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[173]  arXiv:2404.15506 [pdf, other]
Title: Metric3D v2: A Versatile Monocular Geometric Foundation Model for Zero-shot Metric Depth and Surface Normal Estimation
Comments: Our project page is at this https URL. arXiv admin note: substantial text overlap with arXiv:2307.10984
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174]  arXiv:2404.15451 [pdf, other]
Title: CFPFormer: Feature-pyramid like Transformer Decoder for Segmentation and Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175]  arXiv:2404.15449 [pdf, other]
Title: ID-Aligner: Enhancing Identity-Preserving Text-to-Image Generation with Reward Feedback Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[176]  arXiv:2404.15447 [pdf, other]
Title: GLoD: Composing Global Contexts and Local Details in Image Generation
Authors: Moyuru Yamada
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[177]  arXiv:2404.15445 [pdf, other]
Title: Deep multi-prototype capsule networks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
[178]  arXiv:2404.15436 [pdf, other]
Title: Iterative Cluster Harvesting for Wafer Map Defect Patterns
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[179]  arXiv:2404.15406 [pdf, other]
Title: Wiki-LLaVA: Hierarchical Retrieval-Augmented Generation for Multimodal LLMs
Comments: CVPR 2024 Workshop on What is Next in Multimodal Foundation Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM)
[180]  arXiv:2404.15385 [pdf, ps, other]
Title: Sum of Group Error Differences: A Critical Examination of Bias Evaluation in Biometric Verification and a Dual-Metric Measure
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY)
[181]  arXiv:2404.15383 [pdf, other]
Title: WANDR: Intention-guided Human Motion Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[182]  arXiv:2404.15378 [pdf, other]
Title: Hierarchical Hybrid Sliced Wasserstein: A Scalable Metric for Heterogeneous Joint Distributions
Authors: Khai Nguyen, Nhat Ho
Comments: 24 pages, 11 figures, 4 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG); Machine Learning (stat.ML)
[183]  arXiv:2404.15919 (cross-list from cs.LG) [pdf, other]
Title: An Element-Wise Weights Aggregation Method for Federated Learning
Comments: 2023 IEEE International Conference on Data Mining Workshops (ICDMW)
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[184]  arXiv:2404.15918 (cross-list from eess.IV) [pdf, other]
Title: Perception and Localization of Macular Degeneration Applying Convolutional Neural Network, ResNet and Grad-CAM
Comments: 12 pages, 5 figures, 2 tables
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[185]  arXiv:2404.15847 (cross-list from physics.med-ph) [pdf, other]
Title: 3D Freehand Ultrasound using Visual Inertial and Deep Inertial Odometry for Measuring Patellar Tracking
Comments: Accepted to IEEE Medical Measurements & Applications (MeMeA) 2024
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV)
[186]  arXiv:2404.15786 (cross-list from eess.IV) [pdf, other]
Title: Rethinking Model Prototyping through the MedMNIST+ Dataset Collection
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[187]  arXiv:2404.15718 (cross-list from eess.IV) [pdf, other]
Title: Mitigating False Predictions In Unreasonable Body Regions
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[188]  arXiv:2404.15661 (cross-list from cs.GR) [pdf, other]
Title: CWF: Consolidating Weak Features in High-quality Mesh Simplification
Comments: 14 pages, 22 figures
Subjects: Graphics (cs.GR); Computational Geometry (cs.CG); Computer Vision and Pattern Recognition (cs.CV)
[189]  arXiv:2404.15532 (cross-list from cs.HC) [pdf, other]
Title: BattleAgent: Multi-modal Dynamic Emulation on Historical Battles to Complement Historical Analysis
Comments: 26 pages, 14 figures. The data and code for this project are accessible at this https URL
Subjects: Human-Computer Interaction (cs.HC); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[190]  arXiv:2404.15394 (cross-list from eess.IV) [pdf, ps, other]
Title: On Generating Cancelable Biometric Template using Reverse of Boolean XOR
Authors: Manisha, Nitin Kumar
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[191]  arXiv:2404.15367 (cross-list from eess.SP) [pdf, other]
Title: Leveraging Visibility Graphs for Enhanced Arrhythmia Classification with Graph Convolutional Networks
Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[192]  arXiv:2404.15364 (cross-list from eess.SP) [pdf, other]
Title: MP-DPD: Low-Complexity Mixed-Precision Neural Networks for Energy-Efficient Digital Predistortion of Wideband Power Amplifiers
Comments: Accepted to IEEE Microwave and Wireless Technology Letters (MWTL)
Subjects: Signal Processing (eess.SP); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[193]  arXiv:2404.15346 (cross-list from eess.SP) [pdf, other]
Title: A Novel Micro-Doppler Coherence Loss for Deep Learning Radar Applications
Comments: Presented at 2021 18th European Radar Conference (EuRAD)
Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[194]  arXiv:2404.15318 (cross-list from q-bio.QM) [pdf, ps, other]
Title: VASARI-auto: equitable, efficient, and economical featurisation of glioma MRI
Comments: 28 pages, 6 figures, 1 table
Subjects: Quantitative Methods (q-bio.QM); Computer Vision and Pattern Recognition (cs.CV); Tissues and Organs (q-bio.TO)
[195]  arXiv:2404.15312 (cross-list from eess.SP) [pdf, other]
Title: Realtime Person Identification via Gait Analysis
Subjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV)
[196]  arXiv:2404.15287 (cross-list from eess.IV) [pdf, other]
Title: A Semi-automatic Cranial Implant Design Tool Based on Rigid ICP Template Alignment and Voxel Space Reconstruction
Comments: 6 pages
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[197]  arXiv:2404.14956 (cross-list from eess.IV) [pdf, other]
Title: DAWN: Domain-Adaptive Weakly Supervised Nuclei Segmentation via Cross-Task Interactions
Comments: 13 pages, 11 figures, 8 tables
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)

Wed, 24 Apr 2024 (showing first 27 of 110 entries)

[198]  arXiv:2404.15276 [pdf, other]
Title: SMPLer: Taming Transformers for Monocular 3D Human Shape and Pose Estimation
Comments: Published at TPAMI 2024
Journal-ref: https://www.computer.org/csdl/journal/tp/2024/05/10354384/1SP2qWh8Fq0
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG); Multimedia (cs.MM)
[199]  arXiv:2404.15275 [pdf, other]
Title: ID-Animator: Zero-Shot Identity-Preserving Human Video Generation
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[200]  arXiv:2404.15272 [pdf, other]
Title: CT-GLIP: 3D Grounded Language-Image Pretraining with CT Scans and Radiology Reports for Full-Body Scenarios
Comments: 12 pages, 5 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[201]  arXiv:2404.15271 [pdf, other]
Title: Automatic Layout Planning for Visually-Rich Documents with Instruction-Following Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[202]  arXiv:2404.15267 [pdf, other]
Title: From Parts to Whole: A Unified Reference Framework for Controllable Human Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[203]  arXiv:2404.15264 [pdf, other]
Title: TalkingGaussian: Structure-Persistent 3D Talking Head Synthesis via Gaussian Splatting
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[204]  arXiv:2404.15263 [pdf, other]
Title: Multi-Session SLAM with Differentiable Wide-Baseline Pose Optimization
Comments: Accepted to CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[205]  arXiv:2404.15259 [pdf, other]
Title: FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent
Comments: Project website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[206]  arXiv:2404.15254 [pdf, other]
Title: UniMERNet: A Universal Network for Real-World Mathematical Expression Recognition
Comments: 17 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[207]  arXiv:2404.15252 [pdf, other]
Title: Source-free Domain Adaptation for Video Object Detection Under Adverse Image Conditions
Comments: accepted by the UG2+ workshop at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[208]  arXiv:2404.15244 [pdf, other]
Title: Efficient Transformer Encoders for Mask2Former-style models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[209]  arXiv:2404.15234 [pdf, other]
Title: Massively Annotated Datasets for Assessment of Synthetic and Real Data in Face Recognition
Comments: Accepted at FG 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[210]  arXiv:2404.15228 [pdf, other]
Title: Re-Thinking Inverse Graphics With Large Language Models
Comments: 31 pages; project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[211]  arXiv:2404.15224 [pdf, other]
Title: Deep Models for Multi-View 3D Object Recognition: A Review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[212]  arXiv:2404.15217 [pdf, other]
Title: Towards Large-Scale Training of Pathology Foundation Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[213]  arXiv:2404.15212 [pdf, other]
Title: Real-time Lane-wise Traffic Monitoring in Optimal ROIs
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[214]  arXiv:2404.15174 [pdf, other]
Title: Fourier-enhanced Implicit Neural Fusion Network for Multispectral and Hyperspectral Image Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[215]  arXiv:2404.15163 [pdf, other]
Title: Adaptive Mixed-Scale Feature Fusion Network for Blind AI-Generated Image Quality Assessment
Comments: IEEE Transactions on Broadcasting (TBC)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[216]  arXiv:2404.15161 [pdf, other]
Title: Combating Missing Modalities in Egocentric Videos at Test Time
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[217]  arXiv:2404.15141 [pdf, other]
Title: CutDiffusion: A Simple, Fast, Cheap, and Strong Diffusion Extrapolation Method
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[218]  arXiv:2404.15129 [pdf, ps, other]
Title: Gallbladder Cancer Detection in Ultrasound Images based on YOLO and Faster R-CNN
Comments: Published in 2024 10th International Conference on Artificial Intelligence and Robotics (QICAR)
Journal-ref: 2024 10th International Conference on Artificial Intelligence and Robotics (QICAR) (pp. 227-231). IEEE
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[219]  arXiv:2404.15127 [pdf, other]
Title: MedDr: Diagnosis-Guided Bootstrapping for Large-Scale Medical Vision-Language Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[220]  arXiv:2404.15100 [pdf, other]
Title: Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[221]  arXiv:2404.15081 [pdf, other]
Title: Perturbing Attention Gives You More Bang for the Buck: Subtle Imaging Perturbations That Efficiently Fool Customized Diffusion Models
Comments: Published at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[222]  arXiv:2404.15041 [pdf, other]
Title: LEAF: Unveiling Two Sides of the Same Coin in Semi-supervised Facial Expression Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[223]  arXiv:2404.15037 [pdf, other]
Title: DP-Net: Learning Discriminative Parts for image recognition
Comments: IEEE ICIP 2023
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[224]  arXiv:2404.15033 [pdf, other]
Title: IPAD: Industrial Process Anomaly Detection Dataset
Subjects: Computer Vision and Pattern Recognition (cs.CV)
