Computer Vision and Pattern Recognition

Authors and titles for recent submissions

Fri, 19 Apr 2024

[1]  arXiv:2404.12391 [pdf, other]
Title: On the Content Bias in Fréchet Video Distance
Comments: CVPR 2024. Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[2]  arXiv:2404.12390 [pdf, other]
Title: BLINK: Multimodal Large Language Models Can See but Not Perceive
Comments: Multimodal Benchmark, Project Url: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[3]  arXiv:2404.12389 [pdf, other]
Title: Moving Object Segmentation: All You Need Is SAM (and Flow)
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[4]  arXiv:2404.12388 [pdf, other]
Title: VideoGigaGAN: Towards Detail-rich Video Super-Resolution
Comments: project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[5]  arXiv:2404.12386 [pdf, other]
Title: SOHES: Self-supervised Open-world Hierarchical Entity Segmentation
Comments: ICLR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[6]  arXiv:2404.12385 [pdf, other]
Title: MeshLRM: Large Reconstruction Model for High-Quality Mesh
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[7]  arXiv:2404.12383 [pdf, ps, other]
Title: G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis
Comments: accepted to CVPR2024; project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8]  arXiv:2404.12382 [pdf, other]
Title: Lazy Diffusion Transformer for Interactive Image Editing
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[9]  arXiv:2404.12379 [pdf, other]
Title: Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[10]  arXiv:2404.12378 [pdf, other]
Title: 6Img-to-3D: Few-Image Large-Scale Outdoor Driving Scene Reconstruction
Comments: Joint first authorship. Project page: this https URL Code this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[11]  arXiv:2404.12372 [pdf, other]
Title: MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[12]  arXiv:2404.12368 [pdf, other]
Title: Gradient-Regularized Out-of-Distribution Detection
Comments: Under review for the 18th European Conference on Computer Vision (ECCV) 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[13]  arXiv:2404.12359 [pdf, other]
Title: Inverse Neural Rendering for Explainable Multi-Object Tracking
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Robotics (cs.RO)
[14]  arXiv:2404.12353 [pdf, other]
Title: V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[15]  arXiv:2404.12352 [pdf, other]
Title: Point-In-Context: Understanding Point Cloud via In-Context Learning
Comments: Project page: this https URL arXiv admin note: text overlap with arXiv:2306.08659
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[16]  arXiv:2404.12347 [pdf, other]
Title: AniClipart: Clipart Animation with Text-to-Video Priors
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[17]  arXiv:2404.12333 [pdf, other]
Title: Customizing Text-to-Image Diffusion with Camera Viewpoint Control
Comments: project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18]  arXiv:2404.12330 [pdf, other]
Title: A Perspective on Deep Vision Performance with Standard Image and Video Codecs
Comments: Accepted at CVPR 2024 Workshop on AI for Streaming (AIS)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[19]  arXiv:2404.12322 [pdf, other]
Title: Generalizable Face Landmarking Guided by Conditional Face Warping
Comments: Accepted in CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[20]  arXiv:2404.12309 [pdf, other]
Title: iRAG: An Incremental Retrieval Augmented Generation System for Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[21]  arXiv:2404.12295 [pdf, other]
Title: When Medical Imaging Met Self-Attention: A Love Story That Didn't Quite Work Out
Comments: 10 pages, 2 figures, 5 tables, presented at VISAPP 2024
Journal-ref: Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP (2024), ISBN 978-989-758-679-8, ISSN 2184-4321, SciTePress, pages 149-158
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22]  arXiv:2404.12292 [pdf, other]
Title: Reducing Bias in Pre-trained Models by Tuning while Penalizing Change
Comments: 12 pages, 12 figures, presented at VISAPP 2024
Journal-ref: Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP (2024), ISBN 978-989-758-679-8, ISSN 2184-4321, SciTePress, pages 90-101
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23]  arXiv:2404.12285 [pdf, other]
Title: Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[24]  arXiv:2404.12260 [pdf, other]
Title: Alleviating Catastrophic Forgetting in Facial Expression Recognition with Emotion-Centered Models
Comments: 15 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[25]  arXiv:2404.12258 [pdf, ps, other]
Title: DeepLocalization: Using change point detection for Temporal Action Localization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26]  arXiv:2404.12257 [pdf, other]
Title: Food Portion Estimation via 3D Object Scaling
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[27]  arXiv:2404.12252 [pdf, other]
Title: Deep Gaussian mixture model for unsupervised image segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28]  arXiv:2404.12246 [pdf, other]
Title: Blind Localization and Clustering of Anomalies in Textures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29]  arXiv:2404.12235 [pdf, other]
Title: Beyond Average: Individualized Visual Scanpath Prediction
Comments: To appear in CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[30]  arXiv:2404.12216 [pdf, other]
Title: ProTA: Probabilistic Token Aggregation for Text-Video Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31]  arXiv:2404.12210 [pdf, other]
Title: Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[32]  arXiv:2404.12209 [pdf, other]
Title: Partial-to-Partial Shape Matching with Geometric Consistency
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33]  arXiv:2404.12203 [pdf, other]
Title: GraFIQs: Face Image Quality Assessment Using Gradient Magnitudes
Comments: Accepted at CVPR Workshop 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[34]  arXiv:2404.12192 [pdf, other]
Title: Aligning Actions and Walking to LLM-Generated Textual Descriptions
Comments: Accepted at 2nd Workshop on Learning with Few or without Annotated Face, Body and Gesture Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[35]  arXiv:2404.12183 [pdf, other]
Title: Gait Recognition from Highly Compressed Videos
Comments: Accepted at 2nd Workshop on Learning with Few or without Annotated Face, Body and Gesture Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36]  arXiv:2404.12172 [pdf, other]
Title: How to Benchmark Vision Foundation Models for Semantic Segmentation?
Comments: CVPR 2024 Workshop Proceedings for the Second Workshop on Foundation Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[37]  arXiv:2404.12168 [pdf, other]
Title: Real-World Efficient Blind Motion Deblurring via Blur Pixel Discretization
Comments: CVPR2024 Camera-Ready
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[38]  arXiv:2404.12154 [pdf, other]
Title: StyleBooth: Image Style Editing with Multimodal Instruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[39]  arXiv:2404.12144 [pdf, other]
Title: Mushroom Segmentation and 3D Pose Estimation from Point Clouds using Fully Convolutional Geometric Features and Implicit Pose Encoding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[40]  arXiv:2404.12142 [pdf, other]
Title: SDIP: Self-Reinforcement Deep Image Prior Framework for Image Processing
Authors: Ziyu Shu, Zhixin Pan
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[41]  arXiv:2404.12139 [pdf, other]
Title: Omniview-Tuning: Boosting Viewpoint Invariance of Vision-Language Pre-training Models
Comments: 20 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[42]  arXiv:2404.12120 [pdf, other]
Title: Fortify the Guardian, Not the Treasure: Resilient Adversarial Detectors
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[43]  arXiv:2404.12104 [pdf, other]
Title: Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models
Comments: 42 pages, 17 figures, 29 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[44]  arXiv:2404.12103 [pdf, other]
Title: S3R-Net: A Single-Stage Approach to Self-Supervised Shadow Removal
Comments: NTIRE workshop @ CVPR 2024. Code & models available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[45]  arXiv:2404.12091 [pdf, other]
Title: Harnessing Joint Rain-/Detail-aware Representations to Eliminate Intricate Rains
Comments: 21 pages, 14 figures
Journal-ref: International Conference on Learning Representations 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46]  arXiv:2404.12083 [pdf, other]
Title: MambaPupil: Bidirectional Selective Recurrent model for Event-based Eye tracking
Comments: Accepted by CVPR 2024 Workshop (AIS: Vision, Graphics and AI for Streaming), top solution of challenge Event-based Eye Tracking, see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[47]  arXiv:2404.12081 [pdf, other]
Title: MaskCD: A Remote Sensing Change Detection Network Based on Mask Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[48]  arXiv:2404.12064 [pdf, other]
Title: PureForest: A Large-scale Aerial Lidar and Aerial Imagery Dataset for Tree Species Classification in Monospecific Forests
Comments: 14 pages | 5 figures | Dataset is available at this http URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[49]  arXiv:2404.12055 [pdf, other]
Title: Improving the perception of visual fiducial markers in the field using Adaptive Active Exposure Control
Comments: Paper accepted by ISER 2023
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[50]  arXiv:2404.12037 [pdf, other]
Title: Data-free Knowledge Distillation for Fine-grained Visual Categorization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[51]  arXiv:2404.12031 [pdf, other]
Title: MLS-Track: Multilevel Semantic Interaction in RMOT
Comments: 17 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[52]  arXiv:2404.12024 [pdf, other]
Title: Meta-Auxiliary Learning for Micro-Expression Recognition
Comments: 10 pages, 7 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[53]  arXiv:2404.12020 [pdf, other]
Title: Look, Listen, and Answer: Overcoming Biases for Audio-Visual Question Answering
Comments: 16 pages, 9 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[54]  arXiv:2404.12015 [pdf, other]
Title: What does CLIP know about peeling a banana?
Comments: Accepted to MAR Workshop at CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[55]  arXiv:2404.11998 [pdf, other]
Title: Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation
Comments: Accepted to CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[56]  arXiv:2404.11987 [pdf, other]
Title: MultiPhys: Multi-Person Physics-aware 3D Motion Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[57]  arXiv:2404.11981 [pdf, other]
Title: Tendency-driven Mutual Exclusivity for Weakly Supervised Incremental Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[58]  arXiv:2404.11979 [pdf, other]
Title: MTGA: Multi-view Temporal Granularity aligned Aggregation for Event-based Lip-reading
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[59]  arXiv:2404.11958 [pdf, other]
Title: Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation
Comments: Accepted by CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[60]  arXiv:2404.11957 [pdf, other]
Title: The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models
Authors: Cheng Shi, Sibei Yang
Comments: ICLR2024, Code is released at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[61]  arXiv:2404.11949 [pdf, other]
Title: Sketch-guided Image Inpainting with Partial Discrete Diffusion Process
Comments: Accepted to NTIRE Workshop @ CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[62]  arXiv:2404.11903 [pdf, other]
Title: Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition
Comments: 12 pages, 5 figures, submitted to IEEE Transactions on Multimedia
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[63]  arXiv:2404.11897 [pdf, other]
Title: AG-NeRF: Attention-guided Neural Radiance Fields for Multi-height Large-scale Outdoor Scene Rendering
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[64]  arXiv:2404.11895 [pdf, other]
Title: FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[65]  arXiv:2404.11884 [pdf, other]
Title: Seeing Motion at Nighttime with an Event Camera
Comments: Accepted by CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[66]  arXiv:2404.11871 [pdf, other]
Title: Group-On: Boosting One-Shot Segmentation with Supportive Query
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[67]  arXiv:2404.11868 [pdf, other]
Title: OPTiML: Dense Semantic Invariance Using Optimal Transport for Self-Supervised Medical Image Representation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[68]  arXiv:2404.11865 [pdf, other]
Title: From Image to Video, what do we need in multimodal LLMs?
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[69]  arXiv:2404.11864 [pdf, other]
Title: Progressive Multi-modal Conditional Prompt Tuning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[70]  arXiv:2404.11848 [pdf, other]
Title: Partial Large Kernel CNNs for Efficient Super-Resolution
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[71]  arXiv:2404.11824 [pdf, other]
Title: TextCenGen: Attention-Guided Text-Centric Background Adaptation for Text-to-Image Generation
Comments: 7 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[72]  arXiv:2404.11819 [pdf, other]
Title: Utilizing Adversarial Examples for Bias Mitigation and Accuracy Enhancement
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[73]  arXiv:2404.11812 [pdf, other]
Title: Cross-model Mutual Learning for Exemplar-based Medical Image Segmentation
Authors: Qing En, Yuhong Guo
Comments: AISTATS 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[74]  arXiv:2404.11803 [pdf, other]
Title: TempBEV: Improving Learned BEV Encoders with Combined Image and BEV Space Temporal Aggregation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[75]  arXiv:2404.11798 [pdf, other]
Title: Establishing a Baseline for Gaze-driven Authentication Performance in VR: A Breadth-First Investigation on a Very Large Dataset
Comments: 28 pages, 18 figures, 5 tables, includes supplementary material
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[76]  arXiv:2404.11797 [pdf, other]
Title: When are Foundation Models Effective? Understanding the Suitability for Pixel-Level Classification Using Multispectral Imagery
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[77]  arXiv:2404.11778 [pdf, other]
Title: CU-Mamba: Selective State Space Models with Channel Learning for Image Restoration
Authors: Rui Deng, Tianpei Gu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[78]  arXiv:2404.11770 [pdf, other]
[79]  arXiv:2404.11764 [pdf, other]
Title: Multimodal 3D Object Detection on Unseen Domains
Comments: technical report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[80]  arXiv:2404.11762 [pdf, other]
Title: IrrNet: Advancing Irrigation Mapping with Incremental Patch Size Training on Remote Sensing Imagery
Comments: Full version of the paper will be appearing in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[81]  arXiv:2404.11737 [pdf, other]
Title: Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection
Comments: technical report
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[82]  arXiv:2404.11732 [pdf, other]
Title: Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach
Comments: Accepted at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[83]  arXiv:2404.11727 [pdf, ps, other]
Title: Deep Learning for Video-Based Assessment of Endotracheal Intubation Skills
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[84]  arXiv:2404.11669 [pdf, other]
Title: Factorized Motion Fields for Fast Sparse Input Dynamic View Synthesis
Comments: Accepted at SIGGRAPH 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[85]  arXiv:2404.11630 [pdf, other]
Title: SNP: Structured Neuron-level Pruning to Preserve Attention Scores
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[86]  arXiv:2404.12387 (cross-list from cs.CL) [pdf, other]
Title: Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[87]  arXiv:2404.12341 (cross-list from cs.LG) [pdf, other]
Title: Measuring Feature Dependency of Neural Networks by Collapsing Feature Dimensions in the Data Manifold
Comments: Accepted and will be published in the International Symposium on Biomedical Imaging (ISBI) 2024
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[88]  arXiv:2404.12339 (cross-list from cs.RO) [pdf, other]
Title: SPOT: Point Cloud Based Stereo Visual Place Recognition for Similar and Opposing Viewpoints
Comments: Accepted to ICRA 2024, project website: this https URL
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[89]  arXiv:2404.12251 (cross-list from cs.LG) [pdf, other]
Title: Dynamic Modality and View Selection for Multimodal Emotion Recognition with Missing Modalities
Comments: 15 pages
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[90]  arXiv:2404.12163 (cross-list from eess.IV) [pdf, other]
Title: Unsupervised Microscopy Video Denoising
Comments: Accepted at CVPRW 2024
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[91]  arXiv:2404.12130 (cross-list from cs.LG) [pdf, other]
Title: One-Shot Sequential Federated Learning for Non-IID Data by Enhancing Local Model Diversity
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[92]  arXiv:2404.12062 (cross-list from cs.SD) [pdf, other]
Title: MIDGET: Music Conditioned 3D Dance Generation
Comments: 12 pages, 6 figures Published in AI 2023: Advances in Artificial Intelligence
Journal-ref: In Australasian Joint Conference on Artificial Intelligence (pp. 277-288). Singapore: Springer Nature Singapore 2023
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Audio and Speech Processing (eess.AS)
[93]  arXiv:2404.11974 (cross-list from eess.IV) [pdf, other]
Title: Device (In)Dependence of Deep Learning-based Image Age Approximation
Comments: This work was accepted and presented in: 2022 ICPR-Workshop on Artificial Intelligence for Multimedia Forensics and Disinformation Detection. Montreal, Quebec, Canada. However, due to a technical issue on the publishing companies' side, the work does not appear in the workshop proceedings
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[94]  arXiv:2404.11962 (cross-list from cs.AI) [pdf, other]
Title: ©Plug-in Authorization for Human Content Copyright Protection in Text-to-Image Model
Comments: 20 pages, 6 figures
Subjects: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[95]  arXiv:2404.11947 (cross-list from cs.LG) [pdf, other]
Title: VCC-INFUSE: Towards Accurate and Efficient Selection of Unlabeled Examples in Semi-supervised Learning
Comments: Accepted paper of IJCAI 2024. Shijie Fang and Qianhan Feng contributed equally to this paper
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[96]  arXiv:2404.11946 (cross-list from cs.RO) [pdf, other]
Title: S4TP: Social-Suitable and Safety-Sensitive Trajectory Planning for Autonomous Vehicles
Comments: 12 pages, 4 figures, published in IEEE Transactions on Intelligent Vehicles
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[97]  arXiv:2404.11936 (cross-list from cs.LG) [pdf, other]
Title: LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights
Comments: 8 pages, accepted to CVPR24 First Workshop on Efficient and On-Device Generation (EDGE)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[98]  arXiv:2404.11929 (cross-list from eess.IV) [pdf, other]
Title: A Symmetric Regressor for MRI-Based Assessment of Striatal Dopamine Transporter Uptake in Parkinson's Disease
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[99]  arXiv:2404.11925 (cross-list from cs.LG) [pdf, other]
Title: EdgeFusion: On-Device Text-to-Image Generation
Comments: 4 pages, accepted to CVPR24 First Workshop on Efficient and On-Device Generation (EDGE)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[100]  arXiv:2404.11889 (cross-list from eess.IV) [pdf, other]
Title: Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans
Comments: 13 pages, 10 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[101]  arXiv:2404.11843 (cross-list from eess.IV) [pdf, other]
Title: Computer-Aided Diagnosis of Thoracic Diseases in Chest X-rays using hybrid CNN-Transformer Architecture
Authors: Sonit Singh
Comments: 24 pages, 13 Figures, 13 Tables. arXiv admin note: text overlap with arXiv:1904.09925 by other authors
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[102]  arXiv:2404.11795 (cross-list from cs.LG) [pdf, other]
Title: Prompt-Driven Feature Diffusion for Open-World Semi-Supervised Learning
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[103]  arXiv:2404.11776 (cross-list from cs.LG) [pdf, ps, other]
Title: 3D object quality prediction for Metal Jet Printer with Multimodal thermal encoder
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[104]  arXiv:2404.11769 (cross-list from cs.LG) [pdf, other]
Title: QGen: On the Ability to Generalize in Quantization Aware Training
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[105]  arXiv:2404.11741 (cross-list from physics.med-ph) [pdf, other]
Title: Diffusion Schrödinger Bridge Models for High-Quality MR-to-CT Synthesis for Head and Neck Proton Treatment Planning
Comments: International Conference on the use of Computers in Radiation therapy (ICCR)
Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV)
[106]  arXiv:2404.11735 (cross-list from cs.LG) [pdf, other]
Title: Learning with 3D rotations, a hitchhiker's guide to SO(3)
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[107]  arXiv:2404.11725 (cross-list from eess.IV) [pdf, ps, other]
Title: Postoperative glioblastoma segmentation: Development of a fully automated pipeline using deep convolutional neural networks and comparison with currently available models
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[108]  arXiv:2404.11683 (cross-list from cs.RO) [pdf, other]
Title: Unifying Scene Representation and Hand-Eye Calibration with 3D Foundation Models
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[109]  arXiv:2404.11667 (cross-list from cs.LG) [pdf, other]
Title: Deep Dependency Networks and Advanced Inference Schemes for Multi-Label Classification
Comments: Will appear in AISTATS 2024. arXiv admin note: substantial text overlap with arXiv:2302.00633
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Thu, 18 Apr 2024

[110]  arXiv:2404.11615 [pdf, other]
Title: Factorized Diffusion: Perceptual Illusions by Noise Decomposition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[111]  arXiv:2404.11614 [pdf, other]
Title: Dynamic Typography: Bringing Text to Life via Video Diffusion Prior
Comments: Our demo page is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[112]  arXiv:2404.11613 [pdf, other]
Title: InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[113]  arXiv:2404.11605 [pdf, other]
Title: VG4D: Vision-Language Model Goes 4D Video Recognition
Comments: ICRA 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[114]  arXiv:2404.11593 [pdf, other]
Title: IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination
Authors: Xi Chen (1), Sida Peng (1), Dongchen Yang (1), Yuan Liu (2), Bowen Pan (3), Chengfei Lv (3), Xiaowei Zhou (1) ((1) Zhejiang University, (2) The University of Hong Kong, (3) Tao Technology Department, Alibaba Group)
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[115]  arXiv:2404.11590 [pdf, other]
Title: A Subspace-Constrained Tyler's Estimator and its Applications to Structure from Motion
Comments: 23 pages, accepted by CVPR 24
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[116]  arXiv:2404.11589 [pdf, other]
Title: Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding
Comments: WWW 2024 Companion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[117]  arXiv:2404.11576 [pdf, other]
Title: State-space Decomposition Model for Video Prediction Considering Long-term Motion Trend
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[118]  arXiv:2404.11569 [pdf, other]
Title: Simple Image Signal Processing using Global Context Guidance
Comments: Preprint under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[119]  arXiv:2404.11565 [pdf, other]
Title: MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation
Comments: Project Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[120]  arXiv:2404.11554 [pdf, other]
Title: Predicting Long-horizon Futures by Conditioning on Geometry and Time
Comments: Project page: this http URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[121]  arXiv:2404.11537 [pdf, other]
Title: SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[122]  arXiv:2404.11525 [pdf, other]
Title: JointViT: Modeling Oxygen Saturation Levels with Joint Supervision on Long-Tailed OCTA
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[123]  arXiv:2404.11492 [pdf, other]
Title: arcjetCV: an open-source software to analyze material ablation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[124]  arXiv:2404.11488 [pdf, other]
Title: Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems
Comments: 9 pages, 3 figures Accepted for publication at the Embedded Vision Workshop of the Computer Vision and Pattern Recognition conference, Seattle, 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[125]  arXiv:2404.11475 [pdf, other]
Title: AdaIR: Exploiting Underlying Similarities of Image Restoration Tasks with Adapters
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[126]  arXiv:2404.11474 [pdf, other]
Title: Towards Highly Realistic Artistic Style Transfer via Stable Diffusion with Step-aware and Layer-aware Prompt
Comments: Accepted by IJCAI2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[127]  arXiv:2404.11461 [pdf, other]
Title: Using Game Engines and Machine Learning to Create Synthetic Satellite Imagery for a Tabletop Verification Exercise
Comments: Annual Meeting of the Institute of Nuclear Materials Management (INMM), Vienna
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[128]  arXiv:2404.11429 [pdf, other]
Title: CarcassFormer: An End-to-end Transformer-based Framework for Simultaneous Localization, Segmentation and Classification of Poultry Carcass Defect
Comments: Accepted to Poultry Science Journal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[129]  arXiv:2404.11426 [pdf, other]
Title: SPAMming Labels: Efficient Annotations for the Trackers of Tomorrow
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[130]  arXiv:2404.11419 [pdf, other]
Title: SLAIM: Robust Dense Neural SLAM for Online Tracking and Mapping
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[131]  arXiv:2404.11416 [pdf, other]
Title: Neural Schrödinger Bridge Matching for Pansharpening
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[132]  arXiv:2404.11401 [pdf, other]
Title: RainyScape: Unsupervised Rainy Scene Reconstruction using Decoupled Neural Rendering
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133]  arXiv:2404.11375 [pdf, other]
Title: Text-controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[134]  arXiv:2404.11358 [pdf, other]
Title: DeblurGS: Gaussian Splatting for Camera Motion Blur
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[135]  arXiv:2404.11357 [pdf, other]
Title: Detector Collapse: Backdooring Object Detection to Catastrophic Overload or Blindness
Comments: Accepted by IJCAI-24
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[136]  arXiv:2404.11355 [pdf, other]
Title: Consisaug: A Consistency-based Augmentation for Polyp Detection in Endoscopy Image Analysis
Comments: MLMI 2023
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137]  arXiv:2404.11339 [pdf, other]
Title: Best Practices for a Handwritten Text Recognition System
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[138]  arXiv:2404.11335 [pdf, other]
Title: SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[139]  arXiv:2404.11326 [pdf, other]
Title: Single-temporal Supervised Remote Change Detection for Domain Generalization
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140]  arXiv:2404.11322 [pdf, other]
Title: VBR: A Vision Benchmark in Rome
Comments: Accepted at IEEE ICRA 2024. Website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[141]  arXiv:2404.11318 [pdf, other]
Title: Leveraging Fine-Grained Information and Noise Decoupling for Remote Sensing Change Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[142]  arXiv:2404.11317 [pdf, other]
Title: Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives
Comments: 12 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[143]  arXiv:2404.11309 [pdf, other]
Title: Achieving Rotation Invariance in Convolution Operations: Shifting from Data-Driven to Mechanism-Assured
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[144]  arXiv:2404.11302 [pdf, other]
Title: A Semantic Segmentation-guided Approach for Ground-to-Aerial Image Matching
Comments: 6 pages, 2 figures, 2 tables, Submitted to IGARSS 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[145]  arXiv:2404.11299 [pdf, other]
Title: Learning from Unlabelled Data with Transformers: Domain Adaptation for Semantic Segmentation of High Resolution Aerial Images
Comments: 6 pages, 7 figures, Submitted to IGARSS 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[146]  arXiv:2404.11291 [pdf, other]
Title: Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption
Comments: CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147]  arXiv:2404.11266 [pdf, other]
Title: Criteria for Uncertainty-based Corner Cases Detection in Instance Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148]  arXiv:2404.11265 [pdf, other]
Title: The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data
Comments: 13 pages, 6 figures, published to ICCV
Journal-ref: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 2023: 155-164
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149]  arXiv:2404.11256 [pdf, other]
Title: MMCBE: Multi-modality Dataset for Crop Biomass Estimation and Beyond
Comments: 10 pages, 10 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[150]  arXiv:2404.11249 [pdf, other]
Title: A Progressive Framework of Vision-language Knowledge Distillation and Alignment for Multilingual Scene
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[151]  arXiv:2404.11243 [pdf, other]
Title: Optical Image-to-Image Translation Using Denoising Diffusion Models: Heterogeneous Change Detection as a Use Case
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[152]  arXiv:2404.11236 [pdf, other]
Title: ONOT: a High-Quality ICAO-compliant Synthetic Mugshot Dataset
Comments: Paper accepted in IEEE FG 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[153]  arXiv:2404.11230 [pdf, other]
Title: Energy-Efficient Uncertainty-Aware Biomass Composition Prediction at the Edge
Comments: The paper has been accepted to CVPR 2024 5th Workshop on Vision for Agriculture
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[154]  arXiv:2404.11226 [pdf, other]
Title: Simple In-place Data Augmentation for Surveillance Object Detection
Comments: CVPR Workshop 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[155]  arXiv:2404.11214 [pdf, other]
Title: Feature Corrective Transfer Learning: End-to-End Solutions to Object Detection in Non-Ideal Visual Conditions
Comments: 10 pages, 3 figures, accepted by 2024 CVPR UG2 Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[156]  arXiv:2404.11207 [pdf, other]
Title: Exploring the Transferability of Visual Prompting for Multimodal Large Language Models
Comments: Accepted in CVPR 2024 as Poster (Highlight)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[157]  arXiv:2404.11205 [pdf, other]
Title: Kathakali Hand Gesture Recognition With Minimal Data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[158]  arXiv:2404.11202 [pdf, other]
Title: GhostNetV3: Exploring the Training Strategies for Compact Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159]  arXiv:2404.11161 [pdf, other]
Title: Pre-processing matters: A segment search method for WSI classification
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[160]  arXiv:2404.11159 [pdf, other]
Title: Deep Portrait Quality Assessment. A NTIRE 2024 Challenge Survey
Comments: CVPRW - NTIRE 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[161]  arXiv:2404.11156 [pdf, other]
Title: Learning SO(3)-Invariant Semantic Correspondence via Local Shape Transform
Comments: Accepted to CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[162]  arXiv:2404.11155 [pdf, other]
Title: HybriMap: Hybrid Clues Utilization for Effective Vectorized HD Map Construction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[163]  arXiv:2404.11151 [pdf, other]
Title: REACTO: Reconstructing Articulated Objects from a Single Video
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164]  arXiv:2404.11139 [pdf, other]
Title: GeoReF: Geometric Alignment Across Shape Variation for Category-level Object Pose Refinement
Comments: The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[165]  arXiv:2404.11129 [pdf, other]
Title: Fact: Teaching MLLMs with Faithful, Concise and Transferable Rationales
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166]  arXiv:2404.11127 [pdf, other]
Title: D-Aug: Enhancing Data Augmentation for Dynamic LiDAR Scenes
Comments: 4 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167]  arXiv:2404.11120 [pdf, other]
Title: TiNO-Edit: Timestep and Noise Optimization for Robust Diffusion-Based Image Editing
Comments: Conference on Computer Vision and Pattern Recognition (CVPR) 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168]  arXiv:2404.11118 [pdf, other]
Title: MHLR: Moving Haar Learning Rate Scheduler for Large-scale Face Recognition Training with One GPU
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[169]  arXiv:2404.11111 [pdf, other]
Title: CorrNet+: Sign Language Recognition and Translation via Spatial-Temporal Correlation
Comments: arXiv admin note: substantial text overlap with arXiv:2303.03202
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[170]  arXiv:2404.11108 [pdf, other]
Title: LADDER: An Efficient Framework for Video Frame Interpolation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171]  arXiv:2404.11104 [pdf, other]
Title: Object Remover Performance Evaluation Methods using Class-wise Object Removal Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172]  arXiv:2404.11100 [pdf, other]
Title: Synthesizing Realistic Data for Table Recognition
Comments: ICDAR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[173]  arXiv:2404.11098 [pdf, other]
Title: LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174]  arXiv:2404.11070 [pdf, ps, other]
Title: Sky-GVIO: an enhanced GNSS/INS/Vision navigation with FCN-based sky-segmentation in urban canyon
Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[175]  arXiv:2404.11064 [pdf, other]
Title: Rethinking 3D Dense Caption and Visual Grounding in A Unified Framework through Prompt-based Localization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[176]  arXiv:2404.11054 [pdf, other]
Title: Multilateral Temporal-view Pyramid Transformer for Video Inpainting Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177]  arXiv:2404.11052 [pdf, other]
Title: Supervised Contrastive Vision Transformer for Breast Histopathological Image Classification
Comments: 8 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[178]  arXiv:2404.11051 [pdf, ps, other]
Title: WPS-Dataset: A benchmark for wood plate segmentation in bark removal processing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[179]  arXiv:2404.11031 [pdf, other]
Title: TaCOS: Task-Specific Camera Optimization with Simulation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[180]  arXiv:2404.11025 [pdf, other]
Title: Spatial-Aware Image Retrieval: A Hyperdimensional Computing Approach for Efficient Similarity Hashing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181]  arXiv:2404.11016 [pdf, other]
Title: MaeFuse: Transferring Omni Features with Pretrained Masked Autoencoders for Infrared and Visible Image Fusion via Guided Training
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[182]  arXiv:2404.11008 [pdf, other]
Title: AKGNet: Attribute Knowledge-Guided Unsupervised Lung-Infected Area Segmentation
Authors: Qing En, Yuhong Guo
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[183]  arXiv:2404.11003 [pdf, other]
Title: InfoMatch: Entropy Neural Estimation for Semi-Supervised Image Classification
Comments: IJCAI 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184]  arXiv:2404.10992 [pdf, other]
Title: How to deal with glare for improved perception of Autonomous Vehicles
Comments: 14 pages, 9 figures, Accepted IEEE TIV
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185]  arXiv:2404.10989 [pdf, other]
Title: FairSSD: Understanding Bias in Synthetic Speech Detectors
Comments: Accepted at CVPR 2024 (WMF)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[186]  arXiv:2404.10985 [pdf, ps, other]
Title: Pixel-Wise Symbol Spotting via Progressive Points Location for Parsing CAD Images
Comments: 10 pages, 10 figures, 6 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[187]  arXiv:2404.10980 [pdf, other]
Title: Hyper Evidential Deep Learning to Quantify Composite Classification Uncertainty
Comments: In Proceedings of The Twelfth International Conference on Learning Representations, ICLR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[188]  arXiv:2404.10978 [pdf, other]
Title: Leveraging 3D LiDAR Sensors to Enable Enhanced Urban Safety and Public Health: Pedestrian Monitoring and Abnormal Activity Detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[189]  arXiv:2404.10966 [pdf, other]
Title: Domain-Specific Block Selection and Paired-View Pseudo-Labeling for Online Test-Time Adaptation
Comments: Accepted at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[190]  arXiv:2404.10947 [pdf, other]
Title: Residual Connections Harm Self-Supervised Abstract Feature Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[191]  arXiv:2404.10940 [pdf, other]
Title: Neuromorphic Vision-based Motion Segmentation with Graph Transformer Neural Network
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[192]  arXiv:2404.10927 [pdf, other]
Title: A Concise Tiling Strategy for Preserving Spatial Context in Earth Observation Imagery
Comments: Accepted to the Machine Learning for Remote Sensing (ML4RS) Workshop at ICLR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[193]  arXiv:2404.10904 [pdf, other]
Title: Multi-Task Multi-Modal Self-Supervised Learning for Facial Expression Recognition
Comments: The paper will appear in the CVPR 2024 workshops proceedings
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[194]  arXiv:2404.10896 [pdf, ps, other]
Title: From a Lossless (~1.5:1) Compression Algorithm for Llama2 7B Weights to Variable Precision, Variable Range, Compressed Numeric Data Types for CNNs and LLMs
Authors: Vincenzo Liguori
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Hardware Architecture (cs.AR)
[195]  arXiv:2404.10894 [pdf, other]
Title: Semantics-Aware Attention Guidance for Diagnosing Whole Slide Images
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[196]  arXiv:2404.10880 [pdf, other]
Title: HumMUSS: Human Motion Understanding using State Space Models
Comments: CVPR 24
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[197]  arXiv:2404.10865 [pdf, other]
Title: OSR-ViT: A Simple and Modular Framework for Open-Set Object Detection and Discovery
Comments: 28 pages, 8 figures, 7 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[198]  arXiv:2404.10864 [pdf, other]
Title: Vocabulary-free Image Classification and Semantic Segmentation
Comments: Under review, 22 pages, 10 figures, code is available at this https URL. arXiv admin note: text overlap with arXiv:2306.00917
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[199]  arXiv:2404.10856 [pdf, other]
Title: UruDendro, a public dataset of cross-section images of Pinus taeda
Comments: Submitted to Dendrochronologia. arXiv admin note: text overlap with arXiv:2305.10809
Subjects: Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[200]  arXiv:2404.10841 [pdf, other]
Title: Gasformer: A Transformer-based Architecture for Segmenting Methane Emissions from Livestock in Optical Gas Imaging
Comments: 9 pages, 5 figures, this paper has been submitted and accepted for publication at CVPRW 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[201]  arXiv:2404.10838 [pdf, other]
Title: Dynamic Self-adaptive Multiscale Distillation from Pre-trained Multimodal Large Model for Efficient Cross-modal Representation Learning
Comments: 10 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Multimedia (cs.MM)
[202]  arXiv:2404.10836 [pdf, other]
Title: Semantic-Based Active Perception for Humanoid Visual Tasks with Foveal Sensors
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[203]  arXiv:2404.11599 (cross-list from cs.LG) [pdf, other]
Title: Variational Bayesian Last Layers
Comments: International Conference on Learning Representations (ICLR) 2024
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[204]  arXiv:2404.11511 (cross-list from eess.IV) [pdf, other]
Title: Event Cameras Meet SPADs for High-Speed, Low-Bandwidth Imaging
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[205]  arXiv:2404.11459 (cross-list from cs.CL) [pdf, other]
Title: Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent
Authors: Wei Chen, Zhiyuan Li
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[206]  arXiv:2404.11428 (cross-list from eess.IV) [pdf, other]
Title: Explainable Lung Disease Classification from Chest X-Ray Images Utilizing Deep Learning and XAI
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[207]  arXiv:2404.11361 (cross-list from eess.IV) [pdf, other]
Title: Boosting Medical Image Segmentation Performance with Adaptive Convolution Layer
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[208]  arXiv:2404.11336 (cross-list from eess.SY) [pdf, other]
Title: Vision-based control for landing an aerial vehicle on a marine vessel
Authors: Haohua Dong
Subjects: Systems and Control (eess.SY); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[209]  arXiv:2404.11327 (cross-list from cs.RO) [pdf, other]
Title: Following the Human Thread in Social Navigation
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[210]  arXiv:2404.11273 (cross-list from eess.IV) [pdf, other]
Title: Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution
Comments: total of 10 pages including references, 5 tables and 5 figures, accepted for NTIRE 2024 Single Image Super Resolution (x4) challenge
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[211]  arXiv:2404.11209 (cross-list from cs.AI) [pdf, ps, other]
Title: Prompt-Guided Generation of Structured Chest X-Ray Report Using a Pre-trained LLM
Comments: Accepted by IEEE Conference on Multimedia Expo 2024
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[212]  arXiv:2404.11152 (cross-list from eess.IV) [pdf, other]
Title: Multi-target and multi-stage liver lesion segmentation and detection in multi-phase computed tomography scans
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[213]  arXiv:2404.11046 (cross-list from cs.AI) [pdf, other]
Title: Lightweight Unsupervised Federated Learning with Pretrained Vision Language Model
Authors: Hao Yan, Yuhong Guo
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[214]  arXiv:2404.10892 (cross-list from eess.IV) [pdf, other]
Title: Automatic classification of prostate MR series type using image content and metadata
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[215]  arXiv:2404.10790 (cross-list from cs.CR) [pdf, other]
Title: Multimodal Attack Detection for Action Recognition Models
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[216]  arXiv:2307.00071 (cross-list from cs.RO) [pdf, other]
Title: GIRA: Gaussian Mixture Models for Inference and Robot Autonomy
Comments: 2024 IEEE International Conference on Robotics and Automation (ICRA)
Subjects: Robotics (cs.RO); Computational Geometry (cs.CG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Wed, 17 Apr 2024 (showing first 95 of 114 entries)

[217]  arXiv:2404.10775 [pdf, other]
Title: COMBO: Compositional World Models for Embodied Multi-Agent Cooperation
Comments: 23 pages. The first three authors contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
[218]  arXiv:2404.10772 [pdf, other]
Title: Gaussian Opacity Fields: Efficient and Compact Surface Reconstruction in Unbounded Scenes
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[219]  arXiv:2404.10765 [pdf, other]
Title: RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220]  arXiv:2404.10760 [pdf, other]
Title: Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[221]  arXiv:2404.10758 [pdf, other]
Title: Watch Your Step: Optimal Retrieval for Continual Learning at Scale
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[222]  arXiv:2404.10718 [pdf, other]
Title: GazeHTA: End-to-end Gaze Target Detection with Head-Target Association
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[223]  arXiv:2404.10717 [pdf, other]
Title: Mixed Prototype Consistency Learning for Semi-supervised Medical Image Segmentation
Authors: Lijian Li
Comments: 15 pages, 2 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[224]  arXiv:2404.10716 [pdf, other]
Title: MOWA: Multiple-in-One Image Warping Model
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[225]  arXiv:2404.10713 [pdf, ps, other]
Title: A Plausibility Study of Using Augmented Reality in the Ventriculoperitoneal Shunt Operations
Comments: Accepted for the 2024 - 16th International Conference on Knowledge and Smart Technology (KST). To be published in IEEEXplore Digital Library (#61284), ISBN: 979-8-3503-7073-7
Subjects: Computer Vision and Pattern Recognition (cs.CV); Software Engineering (cs.SE)
[226]  arXiv:2404.10699 [pdf, other]
Title: ECLAIR: A High-Fidelity Aerial LiDAR Dataset for Semantic Segmentation
Comments: 11 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[227]  arXiv:2404.10690 [pdf, other]
Title: MathWriting: A Dataset For Handwritten Mathematical Expression Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[228]  arXiv:2404.10688 [pdf, other]
Title: Efficient Conditional Diffusion Model with Probability Flow Sampling for Image Super-resolution
Authors: Yutao Yuan, Chun Yuan
Comments: AAAI 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[229]  arXiv:2404.10685 [pdf, other]
Title: Generating Human Interaction Motions in Scenes with Text Control
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[230]  arXiv:2404.10681 [pdf, other]
Title: StyleCity: Large-Scale 3D Urban Scenes Stylization with Vision-and-Text Reference via Progressive Optimization
Comments: project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[231]  arXiv:2404.10667 [pdf, other]
Title: VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time
Comments: Tech Report. Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[232]  arXiv:2404.10664 [pdf, ps, other]
Title: Assessing The Impact of CNN Auto Encoder-Based Image Denoising on Image Classification Tasks
Comments: 13 pages, 13 figures, 13th International conference on innovative technologies in the field of science, engineering and technology
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[233]  arXiv:2404.10633 [pdf, other]
Title: Contextrast: Contextual Contrastive Learning for Semantic Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[234]  arXiv:2404.10626 [pdf, other]
Title: Exploring selective image matching methods for zero-shot and few-sample unsupervised domain adaptation of urban canopy prediction
Comments: ICLR 2024 Machine Learning for Remote Sensing (ML4RS) Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV); Applications (stat.AP)
[235]  arXiv:2404.10625 [pdf, other]
Title: Gaussian Splatting Decoder for 3D-aware Generative Adversarial Networks
Comments: CVPRW
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[236]  arXiv:2404.10620 [pdf, other]
Title: PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction
Comments: In Submission
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[237]  arXiv:2404.10603 [pdf, other]
Title: Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences
Comments: 25 pages, 22 figures, accepted to CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[238]  arXiv:2404.10600 [pdf, ps, other]
Title: Intra-operative tumour margin evaluation in breast-conserving surgery with deep learning
Comments: 1 pages, 6 figures and 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[239]  arXiv:2404.10595 [pdf, other]
Title: Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[240]  arXiv:2404.10584 [pdf, other]
Title: ReWiTe: Realistic Wide-angle and Telephoto Dual Camera Fusion Dataset via Beam Splitter Camera Rig
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[241]  arXiv:2404.10574 [pdf, other]
Title: Uncertainty-guided Open-Set Source-Free Unsupervised Domain Adaptation with Target-private Class Segregation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[242]  arXiv:2404.10572 [pdf, other]
Title: Label merge-and-split: A graph-colouring approach for memory-efficient brain parcellation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[243]  arXiv:2404.10571 [pdf, other]
Title: CMU-Flownet: Exploring Point Cloud Scene Flow Estimation in Occluded Scenario
Comments: 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[244]  arXiv:2404.10540 [pdf, other]
Title: SEVD: Synthetic Event-based Vision Dataset for Ego and Fixed Traffic Perception
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[245]  arXiv:2404.10539 [pdf, other]
Title: VideoSAGE: Video Summarization with Graph Representation Learning
Comments: arXiv admin note: text overlap with arXiv:2207.07783
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[246]  arXiv:2404.10534 [pdf, other]
Title: Into the Fog: Evaluating Multiple Object Tracking Robustness
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[247]  arXiv:2404.10527 [pdf, other]
Title: SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments
Comments: This submission includes the paper and supplementary material. 24 pages, 11 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[248]  arXiv:2404.10518 [pdf, other]
Title: MobileNetV4 -- Universal Models for the Mobile Ecosystem
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[249]  arXiv:2404.10501 [pdf, other]
Title: Self-Supervised Visual Preference Alignment
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[250]  arXiv:2404.10499 [pdf, other]
Title: Robust Noisy Label Learning via Two-Stream Sample Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[251]  arXiv:2404.10490 [pdf, other]
Title: Teaching Chinese Sign Language with Feedback in Mixed Reality
Comments: 8 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[252]  arXiv:2404.10484 [pdf, other]
Title: AbsGS: Recovering Fine Details for 3D Gaussian Splatting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253]  arXiv:2404.10476 [pdf, other]
Title: Efficient optimal dispersed Haar-like filters for face detection
Subjects: Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA)
[254]  arXiv:2404.10454 [pdf, other]
Title: A Computer Vision-Based Quality Assessment Technique for the automatic control of consumables for analytical laboratories
Comments: 31 pages, 13 figures, 10 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[255]  arXiv:2404.10441 [pdf, other]
Title: 1st Place Solution for ICCV 2023 OmniObject3D Challenge: Sparse-View Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[256]  arXiv:2404.10438 [pdf, other]
Title: The Unreasonable Effectiveness of Pre-Trained Features for Camera Pose Refinement
Comments: Accepted to CVPR2024 (Highlight)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[257]  arXiv:2404.10433 [pdf, other]
Title: Explainable concept mappings of MRI: Revealing the mechanisms underlying deep learning-based brain disease classification
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[258]  arXiv:2404.10411 [pdf, other]
Title: Camera clustering for scalable stream-based active distillation
Comments: This manuscript is currently under review at IEEE Transactions on Circuits and Systems for Video Technology
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[259]  arXiv:2404.10408 [pdf, other]
Title: Adversarial Identity Injection for Semantic Face Image Synthesis
Comments: Paper accepted at CVPR 2024 Biometrics Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[260]  arXiv:2404.10407 [pdf, ps, other]
Title: Comprehensive Survey of Model Compression and Speed up for Vision Transformers
Journal-ref: Journal of Information, Technology and Policy (2024): 1-12
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[261]  arXiv:2404.10405 [pdf, other]
Title: Integration of Self-Supervised BYOL in Semi-Supervised Medical Image Recognition
Comments: Accepted by ICCS 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[262]  arXiv:2404.10394 [pdf, other]
Title: Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[263]  arXiv:2404.10383 [pdf, other]
Title: Learning to Score Sign Language with Two-stage Method
Authors: Hongli Wen, Yang Xu
Comments: 9 pages, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[264]  arXiv:2404.10378 [pdf, other]
Title: Second Edition FRCSyn Challenge at CVPR 2024: Face Recognition Challenge in the Era of Synthetic Data
Comments: arXiv admin note: text overlap with arXiv:2311.10476
Journal-ref: IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRw 2024)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Machine Learning (cs.LG)
[265]  arXiv:2404.10370 [pdf, other]
Title: Know Yourself Better: Diverse Discriminative Feature Learning Improves Open Set Recognition
Authors: Jiawen Xu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[266]  arXiv:2404.10358 [pdf, other]
Title: Improving Bracket Image Restoration and Enhancement with Flow-guided Alignment and Enhanced Feature Aggregation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[267]  arXiv:2404.10357 [pdf, other]
Title: Optimization of Prompt Learning via Multi-Knowledge Representation for Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[268]  arXiv:2404.10343 [pdf, other]
Title: The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report
Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[269]  arXiv:2404.10342 [pdf, other]
Title: Referring Flexible Image Restoration
Comments: 15 pages, 19 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[270]  arXiv:2404.10335 [pdf, other]
Title: Efficiently Adversarial Examples Generation for Visual-Language Models under Targeted Transfer Scenarios using Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[271]  arXiv:2404.10332 [pdf, other]
Title: Prescribing the Right Remedy: Mitigating Hallucinations in Large Vision-Language Models via Targeted Instruction Tuning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[272]  arXiv:2404.10322 [pdf, other]
Title: Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation
Comments: Accepted by CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[273]  arXiv:2404.10319 [pdf, other]
Title: Application of Deep Learning Methods to Processing of Noisy Medical Video Data
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[274]  arXiv:2404.10318 [pdf, other]
Title: SRGS: Super-Resolution 3D Gaussian Splatting
Comments: submit ACM MM 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[275]  arXiv:2404.10314 [pdf, other]
Title: Awareness of uncertainty in classification using a multivariate model and multi-views
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[276]  arXiv:2404.10312 [pdf, other]
Title: OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[277]  arXiv:2404.10307 [pdf, other]
Title: Learnable Prompt for Few-Shot Semantic Segmentation in Remote Sensing Domain
Comments: Accepted to CVPRW 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[278]  arXiv:2404.10305 [pdf, other]
Title: TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content
Comments: 8 pages, 2 figures, Workshop of 1st MMIR Deep Multimodal Learning for Information Retrieval
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[279]  arXiv:2404.10292 [pdf, other]
Title: From Data Deluge to Data Curation: A Filtering-WoRA Paradigm for Efficient Text-based Person Search
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[280]  arXiv:2404.10279 [pdf, other]
Title: EucliDreamer: Fast and High-Quality Texturing for 3D Models with Depth-Conditioned Stable Diffusion
Comments: Short version of arXiv:2311.15573
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[281]  arXiv:2404.10272 [pdf, other]
Title: Plug-and-Play Acceleration of Occupancy Grid-based NeRF Rendering using VDB Grid and Hierarchical Ray Traversal
Comments: Short paper for CVPR Neural Rendering Intelligence Workshop 2024. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[282]  arXiv:2404.10267 [pdf, other]
Title: OneActor: Consistent Character Generation via Cluster-Conditioned Guidance
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[283]  arXiv:2404.10263 [pdf, ps, other]
Title: PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network
Comments: 12 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[284]  arXiv:2404.10242 [pdf, other]
Title: Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology
Comments: CVPR 2024 Highlight. arXiv admin note: text overlap with arXiv:2309.16064
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[285]  arXiv:2404.10241 [pdf, other]
Title: Vision-and-Language Navigation via Causal Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[286]  arXiv:2404.10237 [pdf, other]
Title: MoE-TinyMed: Mixture of Experts for Tiny Medical Large Vision-Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[287]  arXiv:2404.10227 [pdf, other]
Title: MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints
Comments: 11 pages, 5 figures; CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[288]  arXiv:2404.10213 [pdf, other]
Title: GaitPoint+: A Gait Recognition Network Incorporating Point Cloud Analysis and Recycling
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[289]  arXiv:2404.10212 [pdf, other]
Title: LWIRPOSE: A novel LWIR Thermal Image Dataset and Benchmark
Comments: Submitted in ICIP2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290]  arXiv:2404.10210 [pdf, other]
Title: MK-SGN: A Spiking Graph Convolutional Network with Multimodal Fusion and Knowledge Distillation for Skeleton-based Action Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[291]  arXiv:2404.10193 [pdf, other]
Title: Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering
Authors: Zaid Khan, Yun Fu
Comments: CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[292]  arXiv:2404.10177 [pdf, other]
Title: Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data
Comments: Preprint, work in progress. 19 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[293]  arXiv:2404.10172 [pdf, other]
Title: Forensic Iris Image-Based Post-Mortem Interval Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[294]  arXiv:2404.10170 [pdf, other]
Title: High-Resolution Detection of Earth Structural Heterogeneities from Seismic Amplitudes using Convolutional Neural Networks with Attention layers
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[295]  arXiv:2404.10166 [pdf, other]
Title: Self-Supervised Learning Featuring Small-Scale Image Dataset for Treatable Retinal Diseases Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[296]  arXiv:2404.10163 [pdf, other]
Title: EyeFormer: Predicting Personalized Scanpaths with Transformer-Guided Reinforcement Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[297]  arXiv:2404.10157 [pdf, other]
Title: Salient Object-Aware Background Generation using Text-Guided Diffusion Models
Comments: Accepted for publication at CVPR 2024's Generative Models for Computer Vision workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[298]  arXiv:2404.10156 [pdf, other]
Title: SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation
Comments: Accepted at CVPR Workshop 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[299]  arXiv:2404.10147 [pdf, other]
Title: Eyes on the Streets: Leveraging Street-Level Imaging to Model Urban Crime Dynamics
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[300]  arXiv:2404.10146 [pdf, ps, other]
Title: Cross-Modal Self-Training: Aligning Images and Pointclouds to Learn Classification without Labels
Comments: To be published in Workshop for Learning 3D with Multi-View Supervision (3DMV) at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[301]  arXiv:2404.10141 [pdf, other]
Title: ANCHOR: LLM-driven News Subject Conditioning for Text-to-Image Synthesis
Comments: 23 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Multimedia (cs.MM)
[302]  arXiv:2404.10133 [pdf, other]
Title: WB LUTs: Contrastive Learning for White Balancing Lookup Tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[303]  arXiv:2404.10130 [pdf, other]
Title: NOISe: Nuclei-Aware Osteoclast Instance Segmentation for Mouse-to-Human Domain Transfer
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[304]  arXiv:2404.10108 [pdf, other]
Title: GeoAI Reproducibility and Replicability: a computational and spatial perspective
Comments: Accepted by Annals of the American Association of Geographers
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[305]  arXiv:2404.10096 [pdf, other]
Title: Vision Augmentation Prediction Autoencoder with Attention Design (VAPAAD)
Authors: Yiqiao Yin
Comments: 12 pages, 4 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[306]  arXiv:2404.10078 [pdf, other]
Title: Low-Light Image Enhancement Framework for Improved Object Detection in Fisheye Lens Datasets
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307]  arXiv:2404.10073 [pdf, other]
Title: Explainable Light-Weight Deep Learning Pipeline for Improved Drought Stress
Comments: 21 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[308]  arXiv:2404.10054 [pdf, other]
Title: AIGeN: An Adversarial Approach for Instruction Generation in VLN
Comments: Accepted to 7th Multimodal Learning and Applications Workshop (MULA 2024) at the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Robotics (cs.RO)
[309]  arXiv:2404.10034 [pdf, other]
Title: Realistic Model Selection for Weakly Supervised Object Localization
Comments: 13 pages, 5 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[310]  arXiv:2404.10766 (cross-list from eess.IV) [pdf, other]
Title: RapidVol: Rapid Reconstruction of 3D Ultrasound Volumes from Sensorless 2D Scans
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[311]  arXiv:2404.10763 (cross-list from cs.AI) [pdf, other]
Title: LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)