Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 136

[ total of 729 entries: 1-104 | 33-136 | 137-240 | 241-344 | 345-448 | 449-552 | ... | 657-729 ]

Tue, 4 Jun 2024 (continued, showing last 92 of 228 entries)

[137]  arXiv:2406.00490 [pdf, ps, other]
Title: Research on the Application of Computer Vision Based on Deep Learning in Autonomous Driving Technology
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[138]  arXiv:2406.00481 [pdf, other]
Title: Effectiveness of Vision Language Models for Open-world Single Image Test Time Adaptation
Comments: PrePrint
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[139]  arXiv:2406.00480 [pdf, other]
Title: AlignSAM: Aligning Segment Anything Model to Open Context via Reinforcement Learning
Comments: CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140]  arXiv:2406.00474 [pdf, other]
Title: Adapting Fine-Grained Cross-View Localization to Areas without Fine Ground Truth
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141]  arXiv:2406.00473 [pdf, other]
Title: Pedestrian intention prediction in Adverse Weather Conditions with Spiking Neural Networks and Dynamic Vision Sensors
Comments: Submitted for peer review to IEEE Transactions on Intelligent Transportation Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[142]  arXiv:2406.00457 [pdf, other]
Title: The Curious Case of End Token: A Zero-Shot Disentangled Image Editing using CLIP
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[143]  arXiv:2406.00448 [pdf, other]
Title: Bilateral Guided Radiance Field Processing
Comments: SIGGRAPH (ACM TOG), 2024. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[144]  arXiv:2406.00447 [pdf, other]
Title: DroneVis: Versatile Computer Vision Library for Drones
Comments: 23 pages, 15 figures, 2 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Machine Learning (cs.LG); Robotics (cs.RO)
[145]  arXiv:2406.00446 [pdf, other]
Title: GLCAN: Global-Local Collaborative Auxiliary Network for Local Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[146]  arXiv:2406.00440 [pdf, other]
Title: Topo4D: Topology-Preserving Gaussian Splatting for High-Fidelity 4D Head Capture
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147]  arXiv:2406.00434 [pdf, other]
Title: MoDGS: Dynamic Gaussian Splatting from Causually-captured Monocular Videos
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148]  arXiv:2406.00432 [pdf, other]
Title: Localize, Understand, Collaborate: Semantic-Aware Dragging via Intention Reasoner
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149]  arXiv:2406.00429 [pdf, other]
Title: Towards Generalizable Multi-Object Tracking
Comments: CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[150]  arXiv:2406.00427 [pdf, other]
Title: You Only Need Less Attention at Each Stage in Vision Transformers
Comments: CVPR 2024 Camera-Ready; 10 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[151]  arXiv:2406.00423 [pdf, other]
Title: Multimodal Metadata Assignment for Cultural Heritage Artifacts
Journal-ref: Multimedia Systems 29 (2023) 847-869
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[152]  arXiv:2406.00409 [pdf, other]
Title: Arabic Handwritten Text for Person Biometric Identification: A Deep Learning Approach
Comments: 6 pages, 11 figures, 4 tables, International IEEE Conference on the Intelligent Methods, Systems, and Applications (IMSA)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Neural and Evolutionary Computing (cs.NE)
[153]  arXiv:2406.00391 [pdf, other]
Title: DS@BioMed at ImageCLEFmedical Caption 2024: Enhanced Attention Mechanisms in Medical Caption Generation through Concept Detection Integration
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[154]  arXiv:2406.00384 [pdf, other]
Title: CapeX: Category-Agnostic Pose Estimation from Textual Point Explanation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[155]  arXiv:2406.00383 [pdf, other]
Title: SpikeMM: Flexi-Magnification of High-Speed Micro-Motions
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[156]  arXiv:2406.00348 [pdf, other]
Title: An Effective Weight Initialization Method for Deep Learning: Application to Satellite Image Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[157]  arXiv:2406.00347 [pdf, other]
Title: E$^3$-Net: Efficient E(3)-Equivariant Normal Estimation Network
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[158]  arXiv:2406.00346 [pdf, other]
Title: Details Enhancement in Unsigned Distance Field Learning for High-fidelity 3D Surface Reconstruction
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159]  arXiv:2406.00345 [pdf, other]
Title: DeCoOp: Robust Prompt Tuning with Out-of-Distribution Detection
Comments: Accepted by ICML 2024. Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[160]  arXiv:2406.00334 [pdf, other]
Title: Image Captioning via Dynamic Path Customization
Comments: TNNLS24
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[161]  arXiv:2406.00327 [pdf, other]
Title: Quality Sentinel: Estimating Label Quality and Errors in Medical Segmentation Datasets
Comments: 13 pages, 6 figures, 3 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[162]  arXiv:2406.00313 [pdf, other]
Title: From Seedling to Harvest: The GrowingSoy Dataset for Weed Detection in Soy Crops via Instance Segmentation
Comments: 11th IEEE International Conference on Cybernetics and Intelligent Systems (CIS)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[163]  arXiv:2406.00307 [pdf, other]
Title: HENASY: Learning to Assemble Scene-Entities for Egocentric Video-Language Model
Comments: Extended Abstract accepted at EgoVis Workshop CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164]  arXiv:2406.00290 [pdf, other]
Title: Phasor-Driven Acceleration for FFT-based CNNs
Comments: Presented in the 21st Conference on Robots and Vision (CRV 2024) Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[165]  arXiv:2406.00287 [pdf, other]
Title: GenPalm: Contactless Palmprint Generation with Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[166]  arXiv:2406.00282 [pdf, other]
Title: Adversarial 3D Virtual Patches using Integrated Gradients
Comments: IEEE/ACM Workshop on the Internet of Safe Things, May 23rd, 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
[167]  arXiv:2406.00275 [pdf, other]
Title: StyDeSty: Min-Max Stylization and Destylization for Single Domain Generalization
Comments: Accepted at ICML 2024; Work in 2022 spring
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[168]  arXiv:2406.00272 [pdf, other]
Title: Temporally Consistent Object Editing in Videos using Extended Attention
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[169]  arXiv:2406.00263 [pdf, other]
Title: Upright adjustment with graph convolutional networks
Comments: ICIP 2020
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[170]  arXiv:2406.00259 [pdf, other]
Title: PuzzleFusion++: Auto-agglomerative 3D Fracture Assembly by Denoise and Verify
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171]  arXiv:2406.00258 [pdf, other]
Title: Artemis: Towards Referential Understanding in Complex Videos
Comments: 19 pages, 14 figures. Code and data are available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[172]  arXiv:2406.00239 [pdf, other]
Title: A Review of Pulse-Coupled Neural Network Applications in Computer Vision and Image Processing
Comments: The 25th International Conference on Image Processing, Computer Vision, and Pattern Recognition (IPCV 2021)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
[173]  arXiv:2406.00227 [pdf, other]
Title: ImplicitTerrain: a Continuous Surface Model for Terrain Data Analysis
Comments: 10 pages, CVPR2024 Workshop INRV
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174]  arXiv:2406.00219 [pdf, other]
Title: Fairness in Autonomous Driving: Towards Understanding Confounding Factors in Object Detection under Challenging Weather
Comments: 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175]  arXiv:2406.00210 [pdf, other]
Title: A-SDM: Accelerating Stable Diffusion through Model Assembly and Feature Inheritance Strategies
Comments: 19 pages, 16 figures, submitted to IEEE Transactions on Neural Networks and Learning Systems
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176]  arXiv:2406.00195 [pdf, other]
Title: SNED: Superposition Network Architecture Search for Efficient Video Diffusion Model
Comments: Accepted in CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[177]  arXiv:2406.00143 [pdf, other]
Title: Diversifying Query: Region-Guided Transformer for Temporal Sentence Grounding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178]  arXiv:2406.00135 [pdf, other]
Title: Advancing Ear Biometrics: Enhancing Accuracy and Robustness through Deep Learning
Comments: 6 pages, 8 figures, 3 tables, International IEEE Conference on the Intelligent Methods, Systems, and Applications
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Multimedia (cs.MM)
[179]  arXiv:2406.00121 [pdf, other]
Title: Empowering Visual Creativity: A Vision-Language Assistant to Image Editing Recommendations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[180]  arXiv:2406.00093 [pdf, other]
Title: Bootstrap3D: Improving 3D Content Creation with Synthetic Data
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG); Multimedia (cs.MM)
[181]  arXiv:2406.01469 (cross-list from cs.NE) [pdf, other]
Title: Tomographic Reconstruction and Regularisation with Search Space Expansion and Total Variation
Subjects: Neural and Evolutionary Computing (cs.NE); Computer Vision and Pattern Recognition (cs.CV)
[182]  arXiv:2406.01467 (cross-list from cs.GR) [pdf, other]
Title: RaDe-GS: Rasterizing Depth in Gaussian Splatting
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[183]  arXiv:2406.01428 (cross-list from cs.CL) [pdf, ps, other]
Title: Superhuman performance in urology board questions by an explainable large language model enabled for context integration of the European Association of Urology guidelines: the UroBot study
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[184]  arXiv:2406.01417 (cross-list from cs.LG) [pdf, other]
Title: Mixup Augmentation with Multiple Interpolations
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[185]  arXiv:2406.01403 (cross-list from eess.IV) [pdf, other]
Title: An expert-driven data generation pipeline for histological images
Comments: 5 pages, Accepted at the International Symposium on Biomedical Imaging (ISBI) 2024, Code available at this https URL
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[186]  arXiv:2406.01299 (cross-list from eess.IV) [pdf, other]
Title: Enhancing Dynamic CT Image Reconstruction with Neural Fields Through Explicit Motion Regularizers
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[187]  arXiv:2406.01274 (cross-list from cs.LG) [pdf, other]
Title: Expected Grad-CAM: Towards gradient faithfulness
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[188]  arXiv:2406.01191 (cross-list from eess.IV) [pdf, other]
Title: S-CycleGAN: Semantic Segmentation Enhanced CT-Ultrasound Image-to-Image Translation for Robotic Ultrasonography
Comments: This paper is submitted to 2024 IEEE International Conference on Cyborg and Bionic Systems, and still under review
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[189]  arXiv:2406.01187 (cross-list from eess.IV) [pdf, other]
Title: Patch-Based Encoder-Decoder Architecture for Automatic Transmitted Light to Fluorescence Imaging Transition: Contribution to the LightMyCells Challenge
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[190]  arXiv:2406.01116 (cross-list from cs.LG) [pdf, other]
Title: Accelerating Heterogeneous Federated Learning with Closed-form Classifiers
Comments: Accepted at ICML 2024 - this https URL
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[191]  arXiv:2406.01086 (cross-list from cs.LG) [pdf, other]
Title: Effective Subset Selection Through The Lens of Neural Network Pruning
Authors: Noga Bar, Raja Giryes
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[192]  arXiv:2406.01054 (cross-list from cs.LG) [pdf, other]
Title: Confidence-Based Task Prediction in Continual Disease Classification Using Probability Distribution
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[193]  arXiv:2406.01014 (cross-list from cs.CL) [pdf, other]
Title: Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration
Comments: 22 pages, 11 figures, 10 Tables
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[194]  arXiv:2406.01011 (cross-list from cs.RO) [pdf, ps, other]
Title: Multi-Object Tracking based on Imaging Radar 3D Object Detection
Comments: Presented at: 9. International ATZ-Live Automated Driving 2024
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[195]  arXiv:2406.00980 (cross-list from cs.CL) [pdf, other]
Title: Selectively Answering Visual Questions
Comments: To be published in the findings of the 2024 Annual Meeting of the Association for Computational Linguistics
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[196]  arXiv:2406.00958 (cross-list from cs.LG) [pdf, other]
Title: Navigating Conflicting Views: Harnessing Trust for Learning
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[197]  arXiv:2406.00859 (cross-list from eess.IV) [pdf, other]
Title: Streaming quanta sensors for online, high-performance imaging and vision
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[198]  arXiv:2406.00834 (cross-list from cs.GR) [pdf, other]
Title: End-to-End Hybrid Refractive-Diffractive Lens Design with Differentiable Ray-Wave Model
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV); Optics (physics.optics)
[199]  arXiv:2406.00816 (cross-list from cs.LG) [pdf, other]
Title: Invisible Backdoor Attacks on Diffusion Models
Comments: Code: this https URL
Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[200]  arXiv:2406.00789 (cross-list from cs.CL) [pdf, ps, other]
Title: Developing an efficient corpus using Ensemble Data cleaning approach
Authors: Md Taimur Ahad
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[201]  arXiv:2406.00773 (cross-list from cs.LG) [pdf, other]
Title: Diffusion Tuning: Transferring Diffusion Models via Chain of Forgetting
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[202]  arXiv:2406.00758 (cross-list from eess.IV) [pdf, other]
Title: Once-for-All: Controllable Generative Image Compression with Dynamic Granularity Adaption
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[203]  arXiv:2406.00683 (cross-list from eess.IV) [pdf, other]
Title: Exploiting Frequency Correlation for Hyperspectral Image Reconstruction
Comments: 14 pages, 11 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[204]  arXiv:2406.00667 (cross-list from eess.IV) [pdf, other]
Title: An Early Investigation into the Utility of Multimodal Large Language Models in Medical Imaging
Comments: Accepted in Fifth IEEE Workshop on Artificial Intelligence for HealthCare, IEEE 25th International Conference on Information Reuse and Integration for Data Science
Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[205]  arXiv:2406.00645 (cross-list from cs.LG) [pdf, other]
Title: FuRL: Visual-Language Models as Fuzzy Rewards for Reinforcement Learning
Comments: ICML 2024
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[206]  arXiv:2406.00633 (cross-list from cs.LG) [pdf, other]
Title: Improving GFlowNets for Text-to-Image Diffusion Alignment
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[207]  arXiv:2406.00573 (cross-list from cs.LG) [pdf, other]
Title: VOICE: Variance of Induced Contrastive Explanations to quantify Uncertainty in Neural Network Interpretability
Comments: Journal of Selected Topics in Signal Processing (J-STSP) Special Series on AI in Signal & Data Science
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[208]  arXiv:2406.00555 (cross-list from eess.IV) [pdf, ps, other]
Title: Length-scale study in deep learning prediction for non-small cell lung cancer brain metastasis
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[209]  arXiv:2406.00529 (cross-list from cs.LG) [pdf, other]
Title: On the Use of Anchoring for Training Vision Models
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[210]  arXiv:2406.00495 (cross-list from eess.AS) [pdf, other]
Title: Audio-Visual Talker Localization in Video for Spatial Sound Reproduction
Subjects: Audio and Speech Processing (eess.AS); Computer Vision and Pattern Recognition (cs.CV)
[211]  arXiv:2406.00492 (cross-list from eess.IV) [pdf, other]
Title: SAM-VMNet: Deep Neural Networks For Coronary Angiography Vessel Segmentation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[212]  arXiv:2406.00479 (cross-list from eess.IV) [pdf, other]
Title: End-to-End Model-based Deep Learning for Dual-Energy Computed Tomography Material Decomposition
Comments: 7 pages, 4 figures, accepted manuscript in 21st IEEE International Symposium on Biomedical Imaging (ISBI) 2024
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
[213]  arXiv:2406.00449 (cross-list from eess.IV) [pdf, other]
Title: Dual Hyperspectral Mamba for Efficient Spectral Compressive Imaging
Comments: 13 pages, 6 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[214]  arXiv:2406.00439 (cross-list from cs.RO) [pdf, other]
Title: Learning Manipulation by Predicting Interaction
Comments: Accepted to RSS 2024. Project page: this https URL
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[215]  arXiv:2406.00365 (cross-list from eess.IV) [pdf, other]
Title: SynthBA: Reliable Brain Age Estimation Across Multiple MRI Sequences and Resolutions
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[216]  arXiv:2406.00341 (cross-list from eess.IV) [pdf, other]
Title: DSCA: A Digital Subtraction Angiography Sequence Dataset and Spatio-Temporal Model for Cerebral Artery Segmentation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[217]  arXiv:2406.00329 (cross-list from eess.IV) [pdf, other]
Title: Whole Heart 3D+T Representation Learning Through Sparse 2D Cardiac MR Images
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[218]  arXiv:2406.00320 (cross-list from cs.SD) [pdf, other]
Title: Frieren: Efficient Video-to-Audio Generation with Rectified Flow Matching
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[219]  arXiv:2406.00315 (cross-list from cs.RO) [pdf, other]
Title: Precision and Adaptability of YOLOv5 and YOLOv8 in Dynamic Robotic Environments
Comments: 11th IEEE International Conference on Cybernetics and Intelligent Systems (CIS)
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[220]  arXiv:2406.00298 (cross-list from eess.IV) [pdf, other]
Title: Complex Style Image Transformations for Domain Generalization in Medical Images
Comments: Accepted at IEEE/CVF Computer Vision and Pattern Recognition Conference Workshops (CVPRW) 2024
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[221]  arXiv:2406.00279 (cross-list from eess.IV) [pdf, ps, other]
Title: Hybrid attention structure preserving network for reconstruction of under-sampled OCT images
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[222]  arXiv:2406.00252 (cross-list from cs.AI) [pdf, other]
Title: Multi-Modal and Multi-Agent Systems Meet Rationality: A Survey
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[223]  arXiv:2406.00237 (cross-list from eess.IV) [pdf, other]
Title: A Comparative Study of CNN, ResNet, and Vision Transformers for Multi-Classification of Chest Diseases
Comments: 8 pages, 6 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[224]  arXiv:2406.00212 (cross-list from eess.IV) [pdf, other]
Title: MVAD: A Multiple Visual Artifact Detector for Video Streaming
Comments: 9 pages
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[225]  arXiv:2406.00192 (cross-list from eess.IV) [pdf, other]
Title: Direct Cardiac Segmentation from Undersampled K-space Using Transformers
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[226]  arXiv:2406.00125 (cross-list from eess.IV) [pdf, ps, other]
Title: TotalVibeSegmentator: Full Torso Segmentation for the NAKO and UK Biobank in Volumetric Interpolated Breath-hold Examination Body Images
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[227]  arXiv:2406.00123 (cross-list from eess.IV) [pdf, ps, other]
Title: Correlation-aware Coarse-to-fine MLPs for Deformable Medical Image Registration
Comments: Accepted at CVPR2024 as Oral Presentation and Best Paper Candidate
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[228]  arXiv:2406.00081 (cross-list from cs.LG) [pdf, other]
Title: From Structured to Unstructured: A Comparative Analysis of Computer Vision and Graph Models in solving Mesh-based PDEs
Subjects: Machine Learning (cs.LG); Computational Engineering, Finance, and Science (cs.CE); Computer Vision and Pattern Recognition (cs.CV)

Mon, 3 Jun 2024 (showing first 12 of 89 entries)

[229]  arXiv:2405.21075 [pdf, other]
Title: Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[230]  arXiv:2405.21074 [pdf, other]
Title: Latent Intrinsics Emerge from Training to Relight
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[231]  arXiv:2405.21070 [pdf, other]
Title: Generalization Beyond Data Imbalance: A Controlled Study on CLIP for Transferable Insights
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[232]  arXiv:2405.21066 [pdf, other]
Title: Mixed Diffusion for 3D Indoor Scene Synthesis
Comments: 19 pages, 14 figures. Under review. Code to be released at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[233]  arXiv:2405.21059 [pdf, other]
Title: Unified Directly Denoising for Both Variance Preserving and Variance Exploding Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[234]  arXiv:2405.21050 [pdf, other]
Title: Spectrum-Aware Parameter Efficient Fine-Tuning for Diffusion Models
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[235]  arXiv:2405.21048 [pdf, other]
Title: Kaleido Diffusion: Improving Conditional Diffusion Models with Autoregressive Latent Modeling
Comments: 22 pages, 14 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[236]  arXiv:2405.21016 [pdf, other]
Title: MpoxSLDNet: A Novel CNN Model for Detecting Monkeypox Lesions and Performance Comparison with Pre-trained Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[237]  arXiv:2405.21013 [pdf, other]
Title: StrucTexTv3: An Efficient Vision-Language Model for Text-rich Image Perception, Comprehension, and Beyond
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[238]  arXiv:2405.20991 [pdf, other]
Title: Hard Cases Detection in Motion Prediction by Vision-Language Foundation Models
Comments: IEEE Intelligent Vehicles Symposium (IV) 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[239]  arXiv:2405.20987 [pdf, other]
Title: Early Stopping Criteria for Training Generative Adversarial Networks in Biomedical Imaging
Comments: This paper is accepted at the 35th IEEE Irish Signals and Systems Conference (ISSC 2024)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[240]  arXiv:2405.20985 [pdf, other]
Title: DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)