Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 248

[ total of 679 entries: 1-104 | 41-144 | 145-248 | 249-352 | 353-456 | 457-560 | 561-664 | 665-679 ]
[ showing 104 entries per page: fewer | more | all ]

Tue, 4 Jun 2024 (continued, showing last 82 of 228 entries)

[249]  arXiv:2406.00434 [pdf, other]
Title: MoDGS: Dynamic Gaussian Splatting from Casually-captured Monocular Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[250]  arXiv:2406.00432 [pdf, other]
Title: Localize, Understand, Collaborate: Semantic-Aware Dragging via Intention Reasoner
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[251]  arXiv:2406.00429 [pdf, other]
Title: Towards Generalizable Multi-Object Tracking
Comments: CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[252]  arXiv:2406.00427 [pdf, other]
Title: You Only Need Less Attention at Each Stage in Vision Transformers
Comments: CVPR 2024 Camera-Ready; 10 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253]  arXiv:2406.00423 [pdf, other]
Title: Multimodal Metadata Assignment for Cultural Heritage Artifacts
Journal-ref: Multimedia Systems 29 (2023) 847-869
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[254]  arXiv:2406.00409 [pdf, other]
Title: Arabic Handwritten Text for Person Biometric Identification: A Deep Learning Approach
Comments: 6 pages, 11 figures, 4 tables, International IEEE Conference on the Intelligent Methods, Systems, and Applications (IMSA)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Neural and Evolutionary Computing (cs.NE)
[255]  arXiv:2406.00391 [pdf, other]
Title: DS@BioMed at ImageCLEFmedical Caption 2024: Enhanced Attention Mechanisms in Medical Caption Generation through Concept Detection Integration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[256]  arXiv:2406.00384 [pdf, other]
Title: CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[257]  arXiv:2406.00383 [pdf, other]
Title: SpikeMM: Flexi-Magnification of High-Speed Micro-Motions
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[258]  arXiv:2406.00348 [pdf, other]
Title: An Effective Weight Initialization Method for Deep Learning: Application to Satellite Image Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[259]  arXiv:2406.00347 [pdf, other]
Title: E$^3$-Net: Efficient E(3)-Equivariant Normal Estimation Network
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[260]  arXiv:2406.00346 [pdf, other]
Title: Details Enhancement in Unsigned Distance Field Learning for High-fidelity 3D Surface Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[261]  arXiv:2406.00345 [pdf, other]
Title: DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection
Comments: Accepted by ICML 2024. Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[262]  arXiv:2406.00334 [pdf, other]
Title: Image Captioning via Dynamic Path Customization
Comments: TNNLS24
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[263]  arXiv:2406.00327 [pdf, other]
Title: Quality Sentinel: Estimating Label Quality and Errors in Medical Segmentation Datasets
Comments: 13 pages, 6 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[264]  arXiv:2406.00313 [pdf, other]
Title: From Seedling to Harvest: The GrowingSoy Dataset for Weed Detection in Soy Crops via Instance Segmentation
Comments: 11th IEEE International Conference on Cybernetics and Intelligent Systems (CIS)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[265]  arXiv:2406.00307 [pdf, other]
Title: HENASY: Learning to Assemble Scene-Entities for Egocentric Video-Language Model
Comments: Extended Abstract accepted at EgoVis Workshop CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[266]  arXiv:2406.00290 [pdf, other]
Title: Phasor-Driven Acceleration for FFT-based CNNs
Comments: Presented in the 21st Conference on Robots and Vision (CRV 2024) Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Signal Processing (eess.SP)
[267]  arXiv:2406.00287 [pdf, other]
Title: GenPalm: Contactless Palmprint Generation with Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[268]  arXiv:2406.00282 [pdf, other]
Title: Adversarial 3D Virtual Patches using Integrated Gradients
Comments: IEEE/ACM Workshop on the Internet of Safe Things, May 23rd, 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[269]  arXiv:2406.00275 [pdf, other]
Title: StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization
Comments: Accepted at ICML 2024; Work in 2022 spring
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[270]  arXiv:2406.00272 [pdf, other]
Title: Temporally Consistent Object Editing in Videos using Extended Attention
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[271]  arXiv:2406.00263 [pdf, other]
Title: Upright adjustment with graph convolutional networks
Comments: ICIP 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[272]  arXiv:2406.00259 [pdf, other]
Title: PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[273]  arXiv:2406.00258 [pdf, other]
Title: Artemis: Towards Referential Understanding in Complex Videos
Comments: 19 pages, 14 figures. Code and data are available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[274]  arXiv:2406.00239 [pdf, other]
Title: A Review of Pulse-Coupled Neural Network Applications in Computer Vision and Image Processing
Comments: The 25th International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV 2021)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
[275]  arXiv:2406.00227 [pdf, other]
Title: ImplicitTerrain: a Continuous Surface Model for Terrain Data Analysis
Comments: 10 pages, CVPR2024 Workshop INRV
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[276]  arXiv:2406.00219 [pdf, other]
Title: Fairness in Autonomous Driving: Towards Understanding Confounding Factors in Object Detection under Challenging Weather
Comments: 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[277]  arXiv:2406.00210 [pdf, other]
Title: A-SDM: Accelerating Stable Diffusion through Model Assembly and Feature Inheritance Strategies
Comments: 19 pages, 16 figures, submitted to IEEE Transactions on Neural Networks and Learning Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[278]  arXiv:2406.00195 [pdf, other]
Title: SNED: Superposition Network Architecture Search for Efficient Video Diffusion Model
Comments: Accepted in CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[279]  arXiv:2406.00143 [pdf, other]
Title: Diversifying Query: Region-Guided Transformer for Temporal Sentence Grounding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[280]  arXiv:2406.00135 [pdf, other]
Title: Advancing Ear Biometrics: Enhancing Accuracy and Robustness through Deep Learning
Comments: 6 pages, 8 figures, 3 tables, International IEEE Conference on the Intelligent Methods, Systems, and Applications
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Multimedia (cs.MM)
[281]  arXiv:2406.00121 [pdf, other]
Title: Empowering Visual Creativity: A Vision-Language Assistant to Image Editing Recommendations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[282]  arXiv:2406.00093 [pdf, other]
Title: Bootstrap3D: Improving 3D Content Creation with Synthetic Data
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG); Multimedia (cs.MM)
[283]  arXiv:2406.01469 (cross-list from cs.NE) [pdf, other]
Title: Tomographic Reconstruction and Regularisation with Search Space Expansion and Total Variation
Subjects: Neural and Evolutionary Computing (cs.NE); Computer Vision and Pattern Recognition (cs.CV)
[284]  arXiv:2406.01467 (cross-list from cs.GR) [pdf, other]
Title: RaDe-GS: Rasterizing Depth in Gaussian Splatting
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[285]  arXiv:2406.01428 (cross-list from cs.CL) [pdf, ps, other]
Title: Superhuman performance in urology board questions by an explainable large language model enabled for context integration of the European Association of Urology guidelines: the UroBot study
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[286]  arXiv:2406.01417 (cross-list from cs.LG) [pdf, other]
Title: Mixup Augmentation with Multiple Interpolations
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[287]  arXiv:2406.01403 (cross-list from eess.IV) [pdf, other]
Title: An expert-driven data generation pipeline for histological images
Comments: 5 pages, Accepted at the International Symposium on Biomedical Imaging (ISBI) 2024, Code available at this https URL
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[288]  arXiv:2406.01299 (cross-list from eess.IV) [pdf, other]
Title: Enhancing Dynamic CT Image Reconstruction with Neural Fields Through Explicit Motion Regularizers
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[289]  arXiv:2406.01274 (cross-list from cs.LG) [pdf, other]
Title: Expected Grad-CAM: Towards gradient faithfulness
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[290]  arXiv:2406.01191 (cross-list from eess.IV) [pdf, other]
Title: S-CycleGAN: Semantic Segmentation Enhanced CT-Ultrasound Image-to-Image Translation for Robotic Ultrasonography
Comments: This paper is submitted to 2024 IEEE International Conference on Cyborg and Bionic Systems, and still under review
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[291]  arXiv:2406.01187 (cross-list from eess.IV) [pdf, other]
Title: Patch-Based Encoder-Decoder Architecture for Automatic Transmitted Light to Fluorescence Imaging Transition: Contribution to the LightMyCells Challenge
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[292]  arXiv:2406.01116 (cross-list from cs.LG) [pdf, other]
Title: Accelerating Heterogeneous Federated Learning with Closed-form Classifiers
Comments: Accepted at ICML 2024 - this https URL
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[293]  arXiv:2406.01086 (cross-list from cs.LG) [pdf, other]
Title: Effective Subset Selection Through The Lens of Neural Network Pruning
Authors: Noga Bar, Raja Giryes
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[294]  arXiv:2406.01054 (cross-list from cs.LG) [pdf, other]
Title: Confidence-Based Task Prediction in Continual Disease Classification Using Probability Distribution
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[295]  arXiv:2406.01014 (cross-list from cs.CL) [pdf, other]
Title: Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
Comments: 22 pages, 11 figures, 10 Tables
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[296]  arXiv:2406.01011 (cross-list from cs.RO) [pdf, ps, other]
Title: Multi-Object Tracking based on Imaging Radar 3D Object Detection
Comments: Presented at: 9. International ATZ-Live Automated Driving 2024
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[297]  arXiv:2406.00980 (cross-list from cs.CL) [pdf, other]
Title: Selectively Answering Visual Questions
Comments: To be published in the findings of the 2024 Annual Meeting of the Association for Computational Linguistics
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[298]  arXiv:2406.00958 (cross-list from cs.LG) [pdf, other]
Title: Navigating Conflicting Views: Harnessing Trust for Learning
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[299]  arXiv:2406.00859 (cross-list from eess.IV) [pdf, other]
Title: Streaming quanta sensors for online, high-performance imaging and vision
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[300]  arXiv:2406.00834 (cross-list from cs.GR) [pdf, other]
Title: End-to-End Hybrid Refractive-Diffractive Lens Design with Differentiable Ray-Wave Model
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
[301]  arXiv:2406.00816 (cross-list from cs.LG) [pdf, other]
Title: Invisible Backdoor Attacks on Diffusion Models
Comments: Code: this https URL
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[302]  arXiv:2406.00789 (cross-list from cs.CL) [pdf, ps, other]
Title: Developing an efficient corpus using Ensemble Data cleaning approach
Authors: Md Taimur Ahad
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[303]  arXiv:2406.00773 (cross-list from cs.LG) [pdf, other]
Title: Diffusion Tuning: Transferring Diffusion Models via Chain of Forgetting
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[304]  arXiv:2406.00758 (cross-list from eess.IV) [pdf, other]
Title: Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[305]  arXiv:2406.00683 (cross-list from eess.IV) [pdf, other]
Title: Exploiting Frequency Correlation for Hyperspectral Image Reconstruction
Comments: 14 pages, 11 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[306]  arXiv:2406.00667 (cross-list from eess.IV) [pdf, other]
Title: An Early Investigation into the Utility of Multimodal Large Language Models in Medical Imaging
Comments: Accepted in Fifth IEEE Workshop on Artificial Intelligence for HealthCare, IEEE 25th International Conference on Information Reuse and Integration for Data Science
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[307]  arXiv:2406.00645 (cross-list from cs.LG) [pdf, other]
Title: FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning
Comments: ICML 2024
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[308]  arXiv:2406.00633 (cross-list from cs.LG) [pdf, other]
Title: Improving GFlowNets for Text-to-Image Diffusion Alignment
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[309]  arXiv:2406.00573 (cross-list from cs.LG) [pdf, other]
Title: VOICE: Variance of Induced Contrastive Explanations to quantify Uncertainty in Neural Network Interpretability
Comments: Journal of Selected Topics in Signal Processing (J-STSP) Special Series on AI in Signal & Data Science
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[310]  arXiv:2406.00555 (cross-list from eess.IV) [pdf, ps, other]
Title: Length-scale study in deep learning prediction for non-small cell lung cancer brain metastasis
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[311]  arXiv:2406.00529 (cross-list from cs.LG) [pdf, other]
Title: On the Use of Anchoring for Training Vision Models
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[312]  arXiv:2406.00495 (cross-list from eess.AS) [pdf, other]
Title: Audio-Visual Talker Localization in Video for Spatial Sound Reproduction
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD)
[313]  arXiv:2406.00492 (cross-list from eess.IV) [pdf, other]
Title: SAM-VMNet: Deep Neural Networks For Coronary Angiography Vessel Segmentation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[314]  arXiv:2406.00479 (cross-list from eess.IV) [pdf, other]
Title: End-to-End Model-based Deep Learning for Dual-Energy Computed Tomography Material Decomposition
Comments: 7 pages, 4 figures, accepted manuscript in 21st IEEE International Symposium on Biomedical Imaging (ISBI) 2024
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[315]  arXiv:2406.00449 (cross-list from eess.IV) [pdf, other]
Title: Dual Hyperspectral Mamba for Efficient Spectral Compressive Imaging
Comments: 13 pages, 6 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[316]  arXiv:2406.00439 (cross-list from cs.RO) [pdf, other]
Title: Learning Manipulation by Predicting Interaction
Comments: Accepted to RSS 2024. Project page: this https URL
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[317]  arXiv:2406.00365 (cross-list from eess.IV) [pdf, other]
Title: SynthBA: Reliable Brain Age Estimation Across Multiple MRI Sequences and Resolutions
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[318]  arXiv:2406.00341 (cross-list from eess.IV) [pdf, other]
Title: DSCA: A Digital Subtraction Angiography Sequence Dataset and Spatio-Temporal Model for Cerebral Artery Segmentation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[319]  arXiv:2406.00329 (cross-list from eess.IV) [pdf, other]
Title: Whole Heart 3D+T Representation Learning Through Sparse 2D Cardiac MR Images
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[320]  arXiv:2406.00320 (cross-list from cs.SD) [pdf, other]
Title: Frieren: Efficient Video-to-Audio Generation with Rectified Flow Matching
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[321]  arXiv:2406.00315 (cross-list from cs.RO) [pdf, other]
Title: Precision and Adaptability of YOLOv5 and YOLOv8 in Dynamic Robotic Environments
Comments: 11th IEEE International Conference on Cybernetics and Intelligent Systems (CIS)
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[322]  arXiv:2406.00298 (cross-list from eess.IV) [pdf, other]
Title: Complex Style Image Transformations for Domain Generalization in Medical Images
Comments: Accepted at IEEE/CVF Computer Vision and Pattern Recognition Conference Workshops (CVPRW) 2024
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[323]  arXiv:2406.00279 (cross-list from eess.IV) [pdf, ps, other]
Title: Hybrid attention structure preserving network for reconstruction of under-sampled OCT images
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[324]  arXiv:2406.00252 (cross-list from cs.AI) [pdf, other]
Title: Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[325]  arXiv:2406.00237 (cross-list from eess.IV) [pdf, other]
Title: A Comparative Study of CNN, ResNet, and Vision Transformers for Multi-Classification of Chest Diseases
Comments: 8 pages, 6 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[326]  arXiv:2406.00212 (cross-list from eess.IV) [pdf, other]
Title: MVAD: A Multiple Visual Artifact Detector for Video Streaming
Comments: 9 pages
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[327]  arXiv:2406.00192 (cross-list from eess.IV) [pdf, other]
Title: Direct Cardiac Segmentation from Undersampled K-space Using Transformers
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[328]  arXiv:2406.00125 (cross-list from eess.IV) [pdf, ps, other]
Title: TotalVibeSegmentator: Full Torso Segmentation for the NAKO and UK Biobank in Volumetric Interpolated Breath-hold Examination Body Images
Comments: this https URL
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[329]  arXiv:2406.00123 (cross-list from eess.IV) [pdf, ps, other]
Title: Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration
Comments: Accepted at CVPR2024 as Oral Presentation && Best Paper Candidate
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[330]  arXiv:2406.00081 (cross-list from cs.LG) [pdf, other]
Title: From Structured to Unstructured: A Comparative Analysis of Computer Vision and Graph Models in solving Mesh-based PDEs
Subjects: Machine Learning (cs.LG); Computational Engineering, Finance, and Science (cs.CE); Computer Vision and Pattern Recognition (cs.CV)

Mon, 3 Jun 2024 (showing first 22 of 89 entries)

[331]  arXiv:2405.21075 [pdf, other]
Title: Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[332]  arXiv:2405.21074 [pdf, other]
Title: Latent Intrinsics Emerge from Training to Relight
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[333]  arXiv:2405.21070 [pdf, other]
Title: Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[334]  arXiv:2405.21066 [pdf, other]
Title: Mixed Diffusion for 3D Indoor Scene Synthesis
Comments: 19 pages, 14 figures. Under review. Code to be released at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[335]  arXiv:2405.21059 [pdf, other]
Title: Unified Directly Denoising for Both Variance Preserving and Variance Exploding Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[336]  arXiv:2405.21050 [pdf, other]
Title: Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[337]  arXiv:2405.21048 [pdf, other]
Title: Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling
Comments: 22 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[338]  arXiv:2405.21016 [pdf, other]
Title: MpoxSLDNet: A Novel CNN Model for Detecting Monkeypox Lesions and Performance Comparison with Pre-trained Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[339]  arXiv:2405.21013 [pdf, other]
Title: StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[340]  arXiv:2405.20991 [pdf, other]
Title: Hard Cases Detection in Motion Prediction by Vision-Language Foundation Models
Comments: IEEE Intelligent Vehicles Symposium (IV) 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[341]  arXiv:2405.20987 [pdf, other]
Title: Early Stopping Criteria for Training Generative Adversarial Networks in Biomedical Imaging
Comments: This paper is accepted at the 35th IEEE Irish Signals and Systems Conference (ISSC 2024)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[342]  arXiv:2405.20985 [pdf, other]
Title: DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[343]  arXiv:2405.20980 [pdf, other]
Title: Neural Gaussian Scale-Space Fields
Comments: 15 pages; SIGGRAPH 2024; project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[344]  arXiv:2405.20906 [pdf, ps, other]
Title: Enhancing Vision Models for Text-Heavy Content Understanding and Interaction
Comments: 5 pages, 4 figures (including 1 graph)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[345]  arXiv:2405.20892 [pdf, other]
Title: MALT: Multi-scale Action Learning Transformer for Online Action Detection
Comments: 8 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[346]  arXiv:2405.20881 [pdf, other]
Title: S4Fusion: Saliency-aware Selective State Space Model for Infrared Visible Image Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[347]  arXiv:2405.20876 [pdf, other]
Title: Investigating Calibration and Corruption Robustness of Post-hoc Pruned Perception CNNs: An Image Classification Benchmark Study
Comments: 11 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[348]  arXiv:2405.20868 [pdf, other]
Title: Responsible AI for Earth Observation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[349]  arXiv:2405.20867 [pdf, other]
Title: Automatic Channel Pruning for Multi-Head Attention
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computational Complexity (cs.CC)
[350]  arXiv:2405.20853 [pdf, other]
Title: MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[351]  arXiv:2405.20851 [pdf, other]
Title: MegActor: Harness the Power of Raw Video for Vivid Portrait Animation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[352]  arXiv:2405.20834 [pdf, other]
Title: Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)