Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 240

Mon, 3 Jun 2024 (continued, showing last 77 of 89 entries)

[241]  arXiv:2405.20980 [pdf, other]
Title: Neural Gaussian Scale-Space Fields
Comments: 15 pages; SIGGRAPH 2024; project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[242]  arXiv:2405.20906 [pdf, ps, other]
Title: Enhancing Vision Models for Text-Heavy Content Understanding and Interaction
Comments: 5 pages, 4 figures (including 1 graph)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[243]  arXiv:2405.20892 [pdf, other]
Title: MALT: Multi-scale Action Learning Transformer for Online Action Detection
Comments: 8 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[244]  arXiv:2405.20881 [pdf, other]
Title: S4Fusion: Saliency-aware Selective State Space Model for Infrared Visible Image Fusion
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[245]  arXiv:2405.20876 [pdf, other]
Title: Investigating Calibration and Corruption Robustness of Post-hoc Pruned Perception CNNs: An Image Classification Benchmark Study
Comments: 11 pages, 3 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[246]  arXiv:2405.20868 [pdf, other]
Title: Responsible AI for Earth Observation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
[247]  arXiv:2405.20867 [pdf, other]
Title: Automatic Channel Pruning for Multi-Head Attention
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computational Complexity (cs.CC)
[248]  arXiv:2405.20853 [pdf, other]
Title: MeshXL: Neural Coordinate Field for Generative 3D Foundation Models
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[249]  arXiv:2405.20851 [pdf, other]
Title: MegActor: Harness the Power of Raw Video for Vivid Portrait Animation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[250]  arXiv:2405.20834 [pdf, other]
Title: Retrieval Meets Reasoning: Even High-school Textbook Knowledge Benefits Multimodal Reasoning
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[251]  arXiv:2405.20829 [pdf, other]
Title: Rethinking Open-World Semi-Supervised Learning: Distribution Mismatch and Inductive Inference
Comments: CVPR Workshop on Computer Vision in the Wild (CVinW), 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[252]  arXiv:2405.20810 [pdf, other]
Title: Context-aware Difference Distilling for Multi-change Captioning
Comments: Accepted by ACL 2024 main conference (long paper)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253]  arXiv:2405.20797 [pdf, other]
Title: Ovis: Structural Embedding Alignment for Multimodal Large Language Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[254]  arXiv:2405.20795 [pdf, other]
Title: InsightSee: Advancing Multi-agent Vision-Language Models for Enhanced Visual Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[255]  arXiv:2405.20791 [pdf, other]
Title: GS-Phong: Meta-Learned 3D Gaussians for Relightable Novel View Synthesis
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[256]  arXiv:2405.20786 [pdf, other]
Title: Stratified Avatar Generation from Sparse Observations
Comments: Accepted by CVPR 2024 (Oral)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[257]  arXiv:2405.20764 [pdf, other]
Title: CoMoFusion: Fast and High-quality Fusion of Infrared and Visible Image with Consistency Model
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[258]  arXiv:2405.20750 [pdf, other]
Title: Diffusion Models Are Innate One-Step Generators
Comments: 9 pages, 4 figures and 4 tables on the main contents
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[259]  arXiv:2405.20743 [pdf, other]
Title: Trajectory Forecasting through Low-Rank Adaptation of Discrete Latent Codes
Comments: 15 pages, 3 figures, 5 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[260]  arXiv:2405.20735 [pdf, other]
Title: Language Augmentation in CLIP for Improved Anatomy Detection on Multi-modal Medical Images
Comments: $\copyright$ 2024 IEEE. Accepted in 46th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC) 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[261]  arXiv:2405.20729 [pdf, other]
Title: Extreme Point Supervised Instance Segmentation
Comments: CVPR 2024 Accepted
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[262]  arXiv:2405.20721 [pdf, other]
Title: ContextGS: Compact 3D Gaussian Splatting with Anchor Level Context Model
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[263]  arXiv:2405.20720 [pdf, other]
Title: Power of Cooperative Supervision: Multiple Teachers Framework for Enhanced 3D Semi-Supervised Object Detection
Comments: under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[264]  arXiv:2405.20717 [pdf, other]
Title: Cyclic image generation using chaotic dynamics
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Chaotic Dynamics (nlin.CD)
[265]  arXiv:2405.20711 [pdf, other]
Title: Revisiting Mutual Information Maximization for Generalized Category Discovery
Comments: Preprint version
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[266]  arXiv:2405.20687 [pdf, other]
Title: Conditioning GAN Without Training Dataset
Comments: 5 pages, 2 figures, Part of my MSc project course, School Project Course 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[267]  arXiv:2405.20675 [pdf, other]
Title: Adv-KD: Adversarial Knowledge Distillation for Faster Diffusion Sampling
Comments: 7 pages, 11 figures, ELLIS Doctoral Symposium 2023 in Helsinki, Finland
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[268]  arXiv:2405.20674 [pdf, other]
Title: 4Diffusion: Multi-view Video Diffusion Model for 4D Generation
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[269]  arXiv:2405.20672 [pdf, other]
Title: Investigating and unmasking feature-level vulnerabilities of CNNs to adversarial perturbations
Comments: 22 pages, 15 figures (including appendix)
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[270]  arXiv:2405.20669 [pdf, other]
Title: Fourier123: One Image to High-Quality 3D Object Generation with Hybrid Fourier Score Distillation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[271]  arXiv:2405.20666 [pdf, other]
Title: MASA: Motion-aware Masked Autoencoder with Semantic Alignment for Sign Language Recognition
Comments: Accepted by TCSVT 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[272]  arXiv:2405.20650 [pdf, other]
Title: GenMix: Combining Generative and Mixture Data Augmentation for Medical Image Classification
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[273]  arXiv:2405.20648 [pdf, other]
Title: Shotluck Holmes: A Family of Efficient Small-Scale Large Language Vision Models For Video Captioning and Summarization
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[274]  arXiv:2405.20643 [pdf, other]
Title: Learning Gaze-aware Compositional GAN
Comments: Accepted by ETRA 2024 as Full paper, and as journal paper in Proceedings of the ACM on Computer Graphics and Interactive Techniques
Journal-ref: Proceedings of the ACM on Computer Graphics and Interactive Techniques, 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[275]  arXiv:2405.20633 [pdf, other]
Title: Action-OOD: An End-to-End Skeleton-Based Model for Robust Out-of-Distribution Human Action Detection
Comments: Under consideration at Computer Vision and Image Understanding
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[276]  arXiv:2405.20614 [pdf, other]
Title: EPIDetect: Video-based convulsive seizure detection in chronic epilepsy mouse model for anti-epilepsy drug screening
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[277]  arXiv:2405.20610 [pdf, other]
Title: Revisiting and Maximizing Temporal Knowledge in Semi-supervised Semantic Segmentation
Comments: 14 pages, 5 figures, submitted to IEEE TPAMI. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[278]  arXiv:2405.20607 [pdf, other]
Title: Textual Inversion and Self-supervised Refinement for Radiology Report Generation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[279]  arXiv:2405.20606 [pdf, other]
Title: Vision-Language Meets the Skeleton: Progressively Distillation with Cross-Modal Knowledge for 3D Action Representation Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[280]  arXiv:2405.20596 [pdf, other]
Title: Generalized Semi-Supervised Learning via Self-Supervised Feature Adaptation
Comments: 10 pages; Accepted by NeurIPS 2023
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[281]  arXiv:2405.20584 [pdf, other]
Title: Disrupting Diffusion: Token-Level Attention Erasure Attack against Diffusion-based Customization
Comments: Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[282]  arXiv:2405.20510 [pdf, other]
Title: Physically Compatible 3D Object Modeling from a Single Image
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[283]  arXiv:2405.20494 [pdf, other]
Title: Slight Corruption in Pre-training Data Makes Better Diffusion Models
Comments: 50 pages, 33 figures, 4 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[284]  arXiv:2405.20469 [pdf, other]
Title: Is Synthetic Data all We Need? Benchmarking the Robustness of Models Trained with Synthetic Images
Comments: Accepted at CVPR 2024 Workshop: SyntaGen-Harnessing Generative Models for Synthetic Visual Datasets. Project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[285]  arXiv:2405.20465 [pdf, other]
Title: ENTIRe-ID: An Extensive and Diverse Dataset for Person Re-Identification
Comments: 5 pages, 2024 18th International Conference on Automatic Face and Gesture Recognition (FG)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[286]  arXiv:2405.20462 [pdf, other]
Title: Multi-Label Guided Soft Contrastive Learning for Efficient Earth Observation Pretraining
Comments: 16 pages, 9 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[287]  arXiv:2405.20459 [pdf, other]
Title: On Calibration of Object Detectors: Pitfalls, Evaluation and Baselines
Comments: 31 pages, 8 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[288]  arXiv:2405.20443 [pdf, ps, other]
Title: P-MSDiff: Parallel Multi-Scale Diffusion for Remote Sensing Image Segmentation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[289]  arXiv:2405.20364 [pdf, other]
Title: Learning 3D Robotics Perception using Inductive Priors
Comments: Georgia Tech Ph.D. Thesis, December 2023. For more details: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[290]  arXiv:2405.20363 [pdf, other]
Title: LLMGeo: Benchmarking Large Language Models on Image Geolocation In-the-wild
Comments: 7 pages, 3 figures, 5 tables, CVPR 2024 Workshop on Computer Vision in the Wild
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[291]  arXiv:2405.21056 (cross-list from cs.RO) [pdf, other]
Title: An Organic Weed Control Prototype using Directed Energy and Deep Learning
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[292]  arXiv:2405.21022 (cross-list from cs.CL) [pdf, other]
Title: You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet
Comments: Technical report. Yiran Zhong is the corresponding author. The code is available at this https URL
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[293]  arXiv:2405.20986 (cross-list from cs.LG) [pdf, other]
Title: Uncertainty Quantification for Bird's Eye View Semantic Segmentation: Methods and Benchmarks
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[294]  arXiv:2405.20981 (cross-list from cs.AI) [pdf, other]
Title: Generative Adversarial Networks in Ultrasound Imaging: Extending Field of View Beyond Conventional Limits
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[295]  arXiv:2405.20971 (cross-list from cs.LG) [pdf, other]
Title: Amortizing intractable inference in diffusion models for vision, language, and control
Comments: Code: this https URL
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[296]  arXiv:2405.20915 (cross-list from cs.LG) [pdf, other]
Title: Fast yet Safe: Early-Exiting with Risk Control
Comments: 25 pages, 11 figures, 4 tables (incl. appendix)
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[297]  arXiv:2405.20910 (cross-list from physics.app-ph) [pdf, other]
Title: Predicting ptychography probe positions using single-shot phase retrieval neural network
Subjects: Applied Physics (physics.app-ph); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Data Analysis, Statistics and Probability (physics.data-an)
[298]  arXiv:2405.20838 (cross-list from cs.LG) [pdf, other]
Title: einspace: Searching for Neural Architectures from Fundamental Operations
Comments: Project page at this https URL
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[299]  arXiv:2405.20771 (cross-list from cs.CR) [pdf, other]
Title: Towards Black-Box Membership Inference Attack for Diffusion Models
Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[300]  arXiv:2405.20759 (cross-list from cs.LG) [pdf, other]
Title: Information Theoretic Text-to-Image Alignment
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[301]  arXiv:2405.20725 (cross-list from cs.AI) [pdf, other]
Title: GI-NAS: Boosting Gradient Inversion Attacks through Adaptive Neural Architecture Search
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[302]  arXiv:2405.20719 (cross-list from cs.AI) [pdf, other]
Title: Climate Variable Downscaling with Conditional Normalizing Flows
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Atmospheric and Oceanic Physics (physics.ao-ph)
[303]  arXiv:2405.20693 (cross-list from eess.IV) [pdf, other]
Title: R$^2$-Gaussian: Rectifying Radiative Gaussian Splatting for Tomographic Reconstruction
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[304]  arXiv:2405.20685 (cross-list from cs.LG) [pdf, other]
Title: Enhancing Counterfactual Image Generation Using Mahalanobis Distance with Distribution Preferences in Feature Space
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[305]  arXiv:2405.20628 (cross-list from cs.AI) [pdf, other]
Title: ToxVidLLM: A Multimodal LLM-based Framework for Toxicity Detection in Code-Mixed Videos
Comments: ACL Findings 2024
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[306]  arXiv:2405.20605 (cross-list from cs.LG) [pdf, other]
Title: Searching for internal symbols underlying deep learning
Comments: 10 pages, 7 figures, 3 tables and Appendix
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[307]  arXiv:2405.20559 (cross-list from physics.optics) [pdf, other]
Title: Universal evaluation and design of imaging systems using information estimation
Subjects: Optics (physics.optics); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Image and Video Processing (eess.IV); Data Analysis, Statistics and Probability (physics.data-an)
[308]  arXiv:2405.20525 (cross-list from cs.ET) [pdf, other]
Title: Comparing Quantum Annealing and Spiking Neuromorphic Computing for Sampling Binary Sparse Coding QUBO Problems
Subjects: Emerging Technologies (cs.ET); Computer Vision and Pattern Recognition (cs.CV); Discrete Mathematics (cs.DM); Neural and Evolutionary Computing (cs.NE); Quantum Physics (quant-ph)
[309]  arXiv:2405.20513 (cross-list from cs.LG) [pdf, other]
Title: Deep Modeling of Non-Gaussian Aleatoric Uncertainty
Comments: 8 pages, 7 figures
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[310]  arXiv:2405.20501 (cross-list from cs.RO) [pdf, other]
Title: ShelfHelp: Empowering Humans to Perform Vision-Independent Manipulation Tasks with a Socially Assistive Robotic Cane
Comments: 8 pages, 14 figures and charts
Journal-ref: In AAMAS (pp. 1514-1523) 2023
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[311]  arXiv:2405.20470 (cross-list from cs.RO) [pdf, other]
Title: STHN: Deep Homography Estimation for UAV Thermal Geo-localization with Satellite Imagery
Comments: 8 pages, 7 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[312]  arXiv:2405.20431 (cross-list from cs.LG) [pdf, other]
Title: Exploring the Practicality of Federated Learning: A Survey Towards the Communication Perspective
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[313]  arXiv:2405.20420 (cross-list from cs.LG) [pdf, other]
Title: Back to the Basics on Predicting Transfer Performance
Comments: 15 pages, 3 figures, 2 tables
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[314]  arXiv:2405.20413 (cross-list from cs.CR) [pdf, other]
Title: Jailbreaking Large Language Models Against Moderation Guardrails via Cipher Characters
Comments: 20 pages
Subjects: Cryptography and Security (cs.CR); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[315]  arXiv:2405.20392 (cross-list from eess.IV) [pdf, other]
Title: Can No-Reference Quality-Assessment Methods Serve as Perceptual Losses for Super-Resolution?
Comments: 4 pages, 3 figures. The first two authors contributed equally to this work
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[316]  arXiv:2405.20380 (cross-list from cs.AI) [pdf, other]
Title: Gradient Inversion of Federated Diffusion Models
Subjects: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
[317]  arXiv:2405.20355 (cross-list from cs.NE) [pdf, other]
Title: Enhancing Adversarial Robustness in SNNs with Sparse Gradients
Comments: accepted by ICML 2024
Subjects: Neural and Evolutionary Computing (cs.NE); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Fri, 31 May 2024 (showing first 27 of 144 entries)

[318]  arXiv:2405.20343 [pdf, other]
Title: Unique3D: High-Quality and Efficient 3D Mesh Generation from a Single Image
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[319]  arXiv:2405.20340 [pdf, other]
Title: MotionLLM: Understanding Human Behaviors from Human Motions and Videos
Comments: MotionLLM version 1.0, project page see this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[320]  arXiv:2405.20339 [pdf, other]
Title: Visual Perception by Large Language Model's Weights
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[321]  arXiv:2405.20337 [pdf, other]
Title: OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving
Comments: Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[322]  arXiv:2405.20336 [pdf, other]
Title: RapVerse: Coherent Vocals and Whole-Body Motions Generations from Text
Comments: Project website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[323]  arXiv:2405.20334 [pdf, other]
Title: VividDream: Generating 3D Scene with Ambient Dynamics
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[324]  arXiv:2405.20333 [pdf, other]
Title: SurgiTrack: Fine-Grained Multi-Class Multi-Tool Tracking in Surgical Videos
Comments: 15 pages, 7 figures, 9 tables, 1 video. Supplementary video available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[325]  arXiv:2405.20330 [pdf, other]
Title: 4DHands: Reconstructing Interactive Hands in 4D with Transformers
Comments: More demo videos can be seen at our project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[326]  arXiv:2405.20327 [pdf, other]
Title: GECO: Generative Image-to-3D within a SECOnd
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[327]  arXiv:2405.20325 [pdf, other]
Title: MotionFollower: Editing Video Motion via Lightweight Score-Guided Diffusion
Comments: 23 pages, 18 figures. Project page at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[328]  arXiv:2405.20324 [pdf, other]
Title: Don't drop your samples! Coherence-aware training benefits Conditional diffusion
Comments: Accepted at CVPR 2024 as a Highlight. Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[329]  arXiv:2405.20323 [pdf, other]
Title: $\textit{S}^3$Gaussian: Self-Supervised Street Gaussians for Autonomous Driving
Comments: Code is available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[330]  arXiv:2405.20320 [pdf, other]
Title: Improving the Training of Rectified Flows
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[331]  arXiv:2405.20319 [pdf, other]
Title: ParSEL: Parameterized Shape Editing with Language
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Human-Computer Interaction (cs.HC); Symbolic Computation (cs.SC)
[332]  arXiv:2405.20310 [pdf, other]
Title: A Pixel Is Worth More Than One 3D Gaussians in Single-View 3D Reconstruction
Comments: preprint, under review
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[333]  arXiv:2405.20305 [pdf, other]
Title: Can't make an Omelette without Breaking some Eggs: Plausible Action Anticipation using Large Video-Language Models
Comments: CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[334]  arXiv:2405.20299 [pdf, other]
Title: Scaling White-Box Transformers for Vision
Comments: project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[335]  arXiv:2405.20283 [pdf, other]
Title: TetSphere Splatting: Representing High-Quality Geometry with Lagrangian Volumetric Meshes
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[336]  arXiv:2405.20282 [pdf, other]
Title: SemFlow: Binding Semantic Segmentation and Image Synthesis via Rectified Flow
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[337]  arXiv:2405.20279 [pdf, other]
Title: CV-VAE: A Compatible Video VAE for Latent Generative Video Models
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[338]  arXiv:2405.20259 [pdf, other]
Title: FaceMixup: Enhancing Facial Expression Recognition through Mixed Face Regularization
Comments: 29 pages, 9 figures, paper is under review on journal
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[339]  arXiv:2405.20230 [pdf, other]
Title: Feature Fusion for Improved Classification: Combining Dempster-Shafer Theory and Multiple CNN Architectures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[340]  arXiv:2405.20224 [pdf, other]
Title: EvaGaussians: Event Stream Assisted Gaussian Splatting from Blurry Images
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
[341]  arXiv:2405.20222 [pdf, other]
Title: MOFA-Video: Controllable Image Animation via Generative Motion Field Adaptions in Frozen Image-to-Video Diffusion Model
Comments: Project Page: this https URL ; Codes: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[342]  arXiv:2405.20216 [pdf, other]
Title: Boost Your Own Human Image Generation Model via Direct Preference Optimization with AI Feedback
Comments: 28 pages, 18 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[343]  arXiv:2405.20188 [pdf, other]
Title: SPARE: Symmetrized Point-to-Plane Distance for Robust Non-Rigid Registration
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[344]  arXiv:2405.20161 [pdf, other]
Title: Landslide mapping from Sentinel-2 imagery through change detection
Comments: to be published in IEEE IGARSS 2024 conference proceedings
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)