Computer Vision and Pattern Recognition
Authors and titles for recent submissions
Tue, 14 May 2024
- [1] arXiv:2405.07992 [pdf, other]
  Title: MambaOut: Do We Really Need Mamba for Vision?
  Comments: Code: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [2] arXiv:2405.07988 [pdf, ps, other]
  Title: A Generalist Learner for Multifaceted Medical Image Interpretation
  Comments: Technical study
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [3] arXiv:2405.07974 [pdf, other]
  Title: SignAvatar: Sign Language 3D Motion Reconstruction and Generation
  Comments: Accepted by FG2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [4] arXiv:2405.07969 [pdf, other]
  Title: Investigating the Semantic Robustness of CLIP-based Zero-Shot Anomaly Segmentation
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [5] arXiv:2405.07966 [pdf, other]
  Title: OverlapMamba: Novel Shift State Space Model for LiDAR-based Place Recognition
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [6] arXiv:2405.07933 [pdf, other]
  Title: Authentic Hand Avatar from a Phone Scan via Universal Hand Model
  Comments: Accepted to CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [7] arXiv:2405.07921 [pdf, other]
  Title: Can Better Text Semantics in Prompt Tuning Improve VLM Generalization?
  Authors: Hari Chandana Kuchibhotla, Sai Srinivas Kancheti, Abbavaram Gowtham Reddy, Vineeth N Balasubramanian
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [8] arXiv:2405.07919 [pdf, other]
  Title: Exploring the Low-Pass Filtering Behavior in Image Super-Resolution
  Comments: Accepted by ICML 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [9] arXiv:2405.07916 [pdf, other]
  Title: IMAFD: An Interpretable Multi-stage Approach to Flood Detection from time series Multispectral Data
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [10] arXiv:2405.07913 [pdf, other]
  Title: CTRLorALTer: Conditional LoRAdapter for Efficient 0-Shot Control & Altering of T2I Models
  Authors: Nick Stracke, Stefan Andreas Baumann, Joshua M. Susskind, Miguel Angel Bautista, Björn Ommer
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [11] arXiv:2405.07868 [pdf, other]
  Title: Boostlet.js: Image processing plugins for the web via JavaScript injection
  Authors: Edward Gaibor, Shruti Varade, Rohini Deshmukh, Tim Meyer, Mahsa Geshvadi, SangHyuk Kim, Vidhya Sree Narayanappa, Daniel Haehn
  Comments: 5 pages, 5 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [12] arXiv:2405.07865 [pdf, other]
  Title: AnoVox: A Benchmark for Multimodal Anomaly Detection in Autonomous Driving
  Authors: Daniel Bogdoll, Iramm Hamdard, Lukas Namgyu Rößler, Felix Geisler, Muhammed Bayram, Felix Wang, Jan Imhof, Miguel de Campos, Anushervon Tabarov, Yitian Yang, Hanno Gottschalk, J. Marius Zöllner
  Comments: Daniel Bogdoll, Iramm Hamdard, and Lukas Namgyu Rößler contributed equally
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [13] arXiv:2405.07857 [pdf, other]
  Title: Synergistic Integration of Coordinate Network and Tensorial Feature for Improving Neural Radiance Fields from Sparse Inputs
  Comments: ICML2024; Project page is accessible at this https URL; Code is available at this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [14] arXiv:2405.07847 [pdf, other]
  Title: SceneFactory: A Workflow-centric and Unified Framework for Incremental Scene Modeling
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [15] arXiv:2405.07845 [pdf, other]
  Title: Multi-Task Learning for Fatigue Detection and Face Recognition of Drivers via Tree-Style Space-Channel Attention Fusion Network
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [16] arXiv:2405.07814 [pdf, other]
  Title: NutritionVerse-Direct: Exploring Deep Neural Networks for Multitask Nutrition Prediction from Food Images
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [17] arXiv:2405.07801 [pdf, other]
  Title: Deep Learning-Based Object Pose Estimation: A Comprehensive Survey
  Authors: Jian Liu, Wei Sun, Hui Yang, Zhiwen Zeng, Chongpei Liu, Jin Zheng, Xingyu Liu, Hossein Rahmani, Nicu Sebe, Ajmal Mian
  Comments: 27 pages, 7 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [18] arXiv:2405.07798 [pdf, other]
  Title: FreeVA: Offline MLLM as Training-Free Video Assistant
  Authors: Wenhao Wu
  Comments: Preprint. Work in progress
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [19] arXiv:2405.07784 [pdf, other]
  Title: Generating Human Motion in 3D Scenes from Text Descriptions
  Authors: Zhi Cen, Huaijin Pi, Sida Peng, Zehong Shen, Minghui Yang, Shuai Zhu, Hujun Bao, Xiaowei Zhou
  Comments: Project page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [20] arXiv:2405.07777 [pdf, other]
  Title: GMSR:Gradient-Guided Mamba for Spectral Reconstruction from RGB Images
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [21] arXiv:2405.07776 [pdf, other]
  Title: SAR Image Synthesis with Diffusion Models
  Comments: Published at IEEE Radar Conference 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Signal Processing (eess.SP)
- [22] arXiv:2405.07723 [pdf, other]
  Title: Coarse or Fine? Recognising Action End States without Labels
  Comments: The Eleventh Workshop on Fine-Grained Visual Categorization (CVPR 24)
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [23] arXiv:2405.07702 [pdf, other]
  Title: FORESEE: Multimodal and Multi-view Representation Learning for Robust Prediction of Cancer Survival
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [24] arXiv:2405.07698 [pdf, other]
  Title: oTTC: Object Time-to-Contact for Motion Estimation in Autonomous Driving
  Authors: Abdul Hannan Khan, Syed Tahseen Raza Rizvi, Dheeraj Varma Chittari Macharavtu, Andreas Dengel
  Comments: 9 pages, 4 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [25] arXiv:2405.07696 [pdf, other]
  Title: MonoMAE: Enhancing Monocular 3D Detection through Depth-Aware Masked Autoencoders
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [26] arXiv:2405.07680 [pdf, other]
  Title: Establishing a Unified Evaluation Framework for Human Motion Generation: A Comparative Analysis of Metrics
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [27] arXiv:2405.07663 [pdf, other]
  Title: Sign Stitching: A Novel Approach to Sign Language Production
  Comments: 18 pages, 3 figures, 4 tables
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [28] arXiv:2405.07655 [pdf, other]
  Title: Quality-aware Selective Fusion Network for V-D-T Salient Object Detection
  Authors: Liuxin Bao, Xiaofei Zhou, Xiankai Lu, Yaoqi Sun, Haibing Yin, Zhenghui Hu, Jiyong Zhang, Chenggang Yan
  Comments: Accepted by IEEE Transactions on Image Processing (TIP)
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [29] arXiv:2405.07653 [pdf, other]
  Title: Fast Training Data Acquisition for Object Detection and Segmentation using Black Screen Luminance Keying
  Comments: 32. International Conference in Central Europe on Computer Graphics, Visualization and Computer Vision'2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [30] arXiv:2405.07648 [pdf, other]
  Title: CDFormer:When Degradation Prediction Embraces Diffusion Model for Blind Image Super-Resolution
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [31] arXiv:2405.07600 [pdf, other]
  Title: Integrity Monitoring of 3D Object Detection in Automated Driving Systems using Raw Activation Patterns and Spatial Filtering
  Comments: Submitted to ITSC 2024. arXiv admin note: text overlap with arXiv:2404.07685
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [32] arXiv:2405.07595 [pdf, other]
  Title: Environmental Matching Attack Against Unmanned Aerial Vehicles Object Detection
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [33] arXiv:2405.07594 [pdf, other]
  Title: RGBD-Glue: General Feature Combination for Robust RGB-D Point Cloud Registration
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [34] arXiv:2405.07582 [pdf, other]
  Title: FRRffusion: Unveiling Authenticity with Diffusion-Based Face Retouching Reversal
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [35] arXiv:2405.07573 [pdf, other]
  Title: MaskFuser: Masked Fusion of Joint Multi-Modal Tokenization for End-to-End Autonomous Driving
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [36] arXiv:2405.07571 [pdf, other]
  Title: TattTRN: Template Reconstruction Network for Tattoo Retrieval
  Comments: Accepted at CVPR Workshop 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [37] arXiv:2405.07550 [pdf, other]
  Title: Wild Berry image dataset collected in Finnish forests and peatlands using drones
  Authors: Luigi Riz, Sergio Povoli, Andrea Caraffa, Davide Boscaini, Mohamed Lamine Mekhalfi, Paul Chippendale, Marjut Turtiainen, Birgitta Partanen, Laura Smith Ballester, Francisco Blanes Noguera, Alessio Franchi, Elisa Castelli, Giacomo Piccinini, Luca Marchesotti, Micael Santos Couceiro, Fabio Poiesi
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [38] arXiv:2405.07524 [pdf, other]
  Title: HybridHash: Hybrid Convolutional and Self-Attention Deep Hashing for Image Retrieval
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [39] arXiv:2405.07523 [pdf, other]
  Title: Adaptation of Distinct Semantics for Uncertain Areas in Polyp Segmentation
  Comments: 13 pages with 7 figures, British Machine Vision Conference 2023
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [40] arXiv:2405.07520 [pdf, ps, other]
  Title: Dehazing Remote Sensing and UAV Imagery: A Review of Deep Learning, Prior-based, and Hybrid Approaches
  Comments: Submitted to journal and under review, once the paper is accepted, the copyright will be transferred to the corresponding journal
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [41] arXiv:2405.07516 [pdf, other]
  Title: Support-Query Prototype Fusion Network for Few-shot Medical Image Segmentation
  Comments: 19 pages, 7 figures, 4 tables
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [42] arXiv:2405.07481 [pdf, other]
  Title: Text Grouping Adapter: Adapting Pre-trained Text Detector for Layout Analysis
  Comments: Accepted to CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [43] arXiv:2405.07472 [pdf, other]
  Title: GaussianVTON: 3D Human Virtual Try-ON via Multi-Stage Gaussian Splatting Editing with Image Prompting
  Comments: On-going work
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [44] arXiv:2405.07459 [pdf, other]
  Title: DualFocus: A Unified Framework for Integrating Positive and Negative Descriptors in Text-based Person Retrieval
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [45] arXiv:2405.07451 [pdf, other]
  Title: CLIP-Powered TASS: Target-Aware Single-Stream Network for Audio-Visual Question Answering
  Comments: Submitted to the Journal on February 6, 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [46] arXiv:2405.07444 [pdf, other]
  Title: Motion Keyframe Interpolation for Any Human Skeleton via Temporally Consistent Point Cloud Sampling and Reconstruction
  Comments: 17 pages, 7 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [47] arXiv:2405.07425 [pdf, other]
  Title: Sakuga-42M Dataset: Scaling Up Cartoon Research
  Comments: Arxiv Pre-print. Work in Progress
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [48] arXiv:2405.07411 [pdf, other]
  Title: MoVL:Exploring Fusion Strategies for the Domain-Adaptive Application of Pretrained Models in Medical Imaging Tasks
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [49] arXiv:2405.07407 [pdf, other]
  Title: PitcherNet: Powering the Moneyball Evolution in Baseball Video Analytics
  Comments: IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW'24)
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [50] arXiv:2405.07399 [pdf, other]
  Title: Semi-Supervised Weed Detection for Rapid Deployment and Enhanced Efficiency
  Comments: 16 pages, 4 figures, 6 tables. Submitted to Elsevier
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [51] arXiv:2405.07369 [pdf, other]
  Title: Incorporating Anatomical Awareness for Enhanced Generalizability and Progression Prediction in Deep Learning-Based Radiographic Sacroiliitis Detection
  Authors: Felix J. Dorfner, Janis L. Vahldiek, Leonhard Donle, Andrei Zhukov, Lina Xu, Hartmut Häntze, Marcus R. Makowski, Hugo J.W.L. Aerts, Fabian Proft, Valeria Rios Rodriguez, Judith Rademacher, Mikhail Protopopov, Hildrun Haibel, Torsten Diekhoff, Murat Torgutalp, Lisa C. Adams, Denis Poddubnyy, Keno K. Bressem
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [52] arXiv:2405.07364 [pdf, other]
  Title: BoQ: A Place is Worth a Bag of Learnable Queries
  Comments: Accepted at CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [53] arXiv:2405.07346 [pdf, other]
  Title: Understanding and Evaluating Human Preferences for AI Generated Images with Instruction Tuning
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [54] arXiv:2405.07332 [pdf, other]
  Title: PotatoGANs: Utilizing Generative Adversarial Networks, Instance Segmentation, and Explainable AI for Enhanced Potato Disease Identification and Classification
  Authors: Mohammad Shafiul Alam, Fatema Tuj Johora Faria, Mukaffi Bin Moin, Ahmed Al Wase, Md. Rabius Sani, Khan Md Hasib
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [55] arXiv:2405.07319 [pdf, other]
  Title: LayGA: Layered Gaussian Avatars for Animatable Clothing Transfer
  Comments: SIGGRAPH 2024 conference track
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [56] arXiv:2405.07306 [pdf, other]
  Title: Point Resampling and Ray Transformation Aid to Editable NeRF Models
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [57] arXiv:2405.07293 [pdf, other]
- [58] arXiv:2405.07288 [pdf, other]
  Title: Erasing Concepts from Text-to-Image Diffusion Models with Few-shot Unlearning
  Comments: 23 pages, 28 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [59] arXiv:2405.07284 [pdf, ps, other]
  Title: Zero Shot Context-Based Object Segmentation using SLIP (SAM+CLIP)
  Comments: 5 pages, 3 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [60] arXiv:2405.07272 [pdf, ps, other]
  Title: MAML MOT: Multiple Object Tracking based on Meta-Learning
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [61] arXiv:2405.07257 [pdf, other]
  Title: Listen, Disentangle, and Control: Controllable Speech-Driven Talking Head Generation
  Authors: Changpeng Cai, Guinan Guo, Jiao Li, Junhao Su, Chenghao He, Jing Xiao, Yuanxu Chen, Lei Dai, Feiyu Zhu
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [62] arXiv:2405.07202 [pdf, other]
  Title: Unified Video-Language Pre-training with Synchronized Audio
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
- [63] arXiv:2405.07201 [pdf, other]
  Title: Building a Strong Pre-Training Baseline for Universal 3D Large-Scale Perception
  Comments: Accepted to CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [64] arXiv:2405.07194 [pdf, other]
  Title: Differentiable Model Scaling using Differentiable Topk
  Comments: Accepted by ICML 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [65] arXiv:2405.07178 [pdf, other]
  Title: Hologram: Realtime Holographic Overlays via LiDAR Augmented Reconstruction
  Authors: Ekansh Agrawal
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [66] arXiv:2405.07174 [pdf, other]
  Title: CRSFL: Cluster-based Resource-aware Split Federated Learning for Continuous Authentication
  Authors: Mohamad Wazzeh, Mohamad Arafeh, Hani Sami, Hakima Ould-Slimane, Chamseddine Talhi, Azzam Mourad, Hadi Otrok
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
- [67] arXiv:2405.07171 [pdf, other]
  Title: Enhanced Online Test-time Adaptation with Feature-Weight Cosine Alignment
  Comments: 22 pages, 7 figures, 8 tables
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [68] arXiv:2405.07167 [pdf, other]
  Title: 3D Hand Mesh Recovery from Monocular RGB in Camera Space
  Comments: 21 pages, 7 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [69] arXiv:2405.07166 [pdf, other]
  Title: Resource Efficient Perception for Vision Systems
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [70] arXiv:2405.07164 [pdf, other]
  Title: Modeling Pedestrian Intrinsic Uncertainty for Multimodal Stochastic Trajectory Prediction via Energy Plan Denoising
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [71] arXiv:2405.07157 [pdf, other]
  Title: Semi-Self-Supervised Domain Adaptation: Developing Deep Learning Models with Limited Annotated Data for Wheat Head Segmentation
  Comments: 12
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [72] arXiv:2405.07155 [pdf, other]
  Title: Enhancing Multi-modal Learning: Meta-learned Cross-modal Knowledge Distillation for Handling Missing Modalities
  Authors: Hu Wang, Congbo Ma, Yuyuan Liu, Yuanhong Chen, Yu Tian, Jodie Avery, Louise Hull, Gustavo Carneiro
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [73] arXiv:2405.07121 [pdf, other]
  Title: In The Wild Ellipse Parameter Estimation for Circular Dining Plates and Bowls
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [74] arXiv:2405.07116 [pdf, other]
  Title: CoViews: Adaptive Augmentation Using Cooperative Views for Enhanced Contrastive Learning
  Authors: Nazim Bendib
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [75] arXiv:2405.07047 [pdf, other]
  Title: Unsupervised Density Neural Representation for CT Metal Artifact Reduction
  Authors: Qing Wu, Xu Guo, Lixuan Chen, Dongming He, Hongjiang Wei, Xudong Wang, S. Kevin Zhou, Yifeng Zhang, Jingyi Yu, Yuyao Zhang
  Comments: 13 pages
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [76] arXiv:2405.07046 [pdf, other]
  Title: Retrieval Enhanced Zero-Shot Video Captioning
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [77] arXiv:2405.07044 [pdf, other]
  Title: Semantic Guided Large Scale Factor Remote Sensing Image Super-resolution with Generative Diffusion Prior
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [78] arXiv:2405.07031 [pdf, other]
  Title: Global Motion Understanding in Large-Scale Video Object Segmentation
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [79] arXiv:2405.07027 [pdf, other]
  Title: TD-NeRF: Novel Truncated Depth Prior for Joint Camera Pose and Neural Radiance Field Optimization
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [80] arXiv:2405.07012 [pdf, other]
  Title: Incorporating Degradation Estimation in Light Field Spatial Super-Resolution
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [81] arXiv:2405.06994 [pdf, other]
  Title: GRASP-GCN: Graph-Shape Prioritization for Neural Architecture Search under Distribution Shifts
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [82] arXiv:2405.06980 [pdf, other]
  Title: Fractals as Pre-training Datasets for Anomaly Detection and Localization
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [83] arXiv:2405.06948 [pdf, other]
  Title: Training-free Subject-Enhanced Attention Guidance for Compositional Text-to-image Generation
  Comments: 26 pages, 13 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [84] arXiv:2405.06945 [pdf, other]
  Title: Direct Learning of Mesh and Appearance via 3D Gaussian Splatting
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [85] arXiv:2405.06944 [pdf, other]
  Title: Learning Monocular Depth from Focus with Event Focal Stack
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [86] arXiv:2405.06929 [pdf, other]
  Title: PRENet: A Plane-Fit Redundancy Encoding Point Cloud Sequence Network for Real-Time 3D Action Recognition
  Comments: Accepted by the 2024 International Joint Conference on Neural Networks (IJCNN 2024)
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [87] arXiv:2405.06926 [pdf, other]
  Title: TAI++: Text as Image for Multi-Label Image Classification by Co-Learning Transferable Prompt
  Comments: Accepted for publication at IJCAI 2024; 13 pages; 11 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [88] arXiv:2405.06918 [pdf, other]
  Title: Super-Resolving Blurry Images with Events
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [89] arXiv:2405.06916 [pdf, other]
  Title: High-order Neighborhoods Know More: HyperGraph Learning Meets Source-free Unsupervised Domain Adaptation
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [90] arXiv:2405.06914 [pdf, other]
  Title: Non-confusing Generation of Customized Concepts in Diffusion Models
  Authors: Wang Lin, Jingyuan Chen, Jiaxin Shi, Yichen Zhu, Chen Liang, Junzhong Miao, Tao Jin, Zhou Zhao, Fei Wu, Shuicheng Yan, Hanwang Zhang
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [91] arXiv:2405.06911 [pdf, other]
  Title: Replication Study and Benchmarking of Real-Time Object Detection Models
  Comments: Authors are presented in alphabetical order, each having equal contribution to the work. Copyright may be transferred without notice, after which this version may no longer be accessible
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [92] arXiv:2405.06903 [pdf, other]
  Title: UniGarmentManip: A Unified Framework for Category-Level Garment Manipulation via Dense Visual Correspondence
  Comments: CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [93] arXiv:2405.06893 [pdf, other]
  Title: ADLDA: A Method to Reduce the Harm of Data Distribution Shift in Data Augmentation
  Authors: Haonan Wang
  Comments: 8 page 4 fig
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [94] arXiv:2405.06887 [pdf, other]
  Title: FineParser: A Fine-grained Spatio-temporal Action Parser for Human-centric Action Quality Assessment
  Comments: Accepted by CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [95] arXiv:2405.06875 [pdf, other]
  Title: LogicAL: Towards logical anomaly synthesis for unsupervised anomaly localization
  Authors: Ying Zhao
  Comments: Accepted to Visual Anomaly and Novelty Detection (VAND) 2.0 Workshop at CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [96] arXiv:2405.06872 [pdf, other]
  Title: eCAR: edge-assisted Collaborative Augmented Reality Framework
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [97] arXiv:2405.06865 [pdf, other]
  Title: Disrupting Style Mimicry Attacks on Video Imagery
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
- [98] arXiv:2405.06849 [pdf, other]
  Title: GreedyViG: Dynamic Axial Graph Construction for Efficient Vision GNNs
  Comments: Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [99] arXiv:2405.06845 [pdf, other]
  Title: CasCalib: Cascaded Calibration for Motion Capture from Sparse Unsynchronized Cameras
  Comments: Accepted to the 18th IEEE International Conference on Automatic Face and Gesture Recognition
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [100] arXiv:2405.06841 [pdf, other]
  Title: Bridging the Gap: Protocol Towards Fair and Consistent Affect Analysis
  Comments: accepted at IEEE FG 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [101] arXiv:2405.06828 [pdf, other]
  Title: G-FARS: Gradient-Field-based Auto-Regressive Sampling for 3D Part Grouping
  Comments: CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [102] arXiv:2405.06821 [pdf, other]
  Title: Synchronized Object Detection for Autonomous Sorting, Mapping, and Quantification of Medical Materials
  Comments: To be submitted
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [103] arXiv:2405.06814 [pdf, other]
  Title: Dual-Task Vision Transformer for Rapid and Accurate Intracerebral Hemorrhage Classification on CT Images
  Comments: 9 pages, 4 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [104] arXiv:2405.06782 [pdf, other]
  Title: GraphRelate3D: Context-Dependent 3D Object Detection with Inter-Object Relationship Graphs
  Authors: Mingyu Liu, Ekim Yurtsever, Marc Brede, Jun Meng, Walter Zimmer, Xingcheng Zhou, Bare Luka Zagar, Yuning Cui, Alois Knoll
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [105] arXiv:2405.06778 [pdf, other]
  Title: Shape Conditioned Human Motion Generation with Diffusion Model
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [106] arXiv:2405.06765 [pdf, other]
  Title: Common Corruptions for Enhancing and Evaluating Robustness in Air-to-Air Visual Object Detection
  Authors: Anastasios Arsenos, Vasileios Karampinis, Evangelos Petrongonas, Christos Skliros, Dimitrios Kollias, Stefanos Kollias, Athanasios Voulodimos
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [107] arXiv:2405.06749 [pdf, other]
  Title: Ensuring UAV Safety: A Vision-only and Real-time Framework for Collision Avoidance Through Object Detection, Tracking, and Distance Estimation
  Authors: Vasileios Karampinis, Anastasios Arsenos, Orfeas Filippopoulos, Evangelos Petrongonas, Christos Skliros, Dimitrios Kollias, Stefanos Kollias, Athanasios Voulodimos
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [108] arXiv:2405.07991 (cross-list from cs.RO) [pdf, other]
  Title: SPIN: Simultaneous Perception, Interaction and Navigation
  Comments: In CVPR 2024. Website at this https URL
  Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Systems and Control (eess.SY)
- [109] arXiv:2405.07990 (cross-list from cs.CL) [pdf, other]
  Title: Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots
  Authors: Chengyue Wu, Yixiao Ge, Qiushan Guo, Jiahao Wang, Zhixuan Liang, Zeyu Lu, Ying Shan, Ping Luo
  Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
- [110] arXiv:2405.07987 (cross-list from cs.LG) [pdf, other]
  Title: The Platonic Representation Hypothesis
  Comments: Equal contributions
  Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
- [111] arXiv:2405.07930 (cross-list from cs.MM) [pdf, other]
  Title: Improving Multimodal Learning with Multi-Loss Gradient Modulation
  Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
- [112] arXiv:2405.07905 (cross-list from eess.IV) [pdf, other]
  Title: PLUTO: Pathology-Universal Transformer
  Authors: Dinkar Juyal, Harshith Padigela, Chintan Shah, Daniel Shenker, Natalia Harguindeguy, Yi Liu, Blake Martin, Yibo Zhang, Michael Nercessian, Miles Markey, Isaac Finberg, Kelsey Luu, Daniel Borders, Syed Ashar Javed, Emma Krause, Raymond Biju, Aashish Sood, Allen Ma, Jackson Nyman, John Shamshoian, Guillaume Chhor, Darpan Sanghavi, Marc Thibault, Limin Yu, Fedaa Najdawi, Jennifer A. Hipp, Darren Fahy, Benjamin Glass, Eric Walk, John Abel, Harsha Pokkalla, Andrew H. Beck, Sean Grullon
  Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [113] arXiv:2405.07869 (cross-list from eess.IV) [pdf, other]
  Title: Enhancing Clinically Significant Prostate Cancer Prediction in T2-weighted Images through Transfer Learning from Breast Cancer
  Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [114] arXiv:2405.07861 (cross-list from eess.IV) [pdf, other]
  Title: Improving Breast Cancer Grade Prediction with Multiparametric MRI Created Using Optimized Synthetic Correlated Diffusion Imaging
  Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [115] arXiv:2405.07854 (cross-list from eess.IV) [pdf, other]
  Title: Using Multiparametric MRI with Optimized Synthetic Correlated Diffusion Imaging to Enhance Breast Cancer Pathologic Complete Response Prediction
  Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [116] arXiv:2405.07842 (cross-list from astro-ph.IM) [pdf, other]
  Title: Ground-based Image Deconvolution with Swin Transformer UNet
  Comments: 11 pages, 14 figures
  Subjects: Instrumentation and Methods for Astrophysics (astro-ph.IM); Computer Vision and Pattern Recognition (cs.CV)
- [117] arXiv:2405.07827 (cross-list from cs.MM) [pdf, other]
  Title: Automatic Recognition of Food Ingestion Environment from the AIM-2 Wearable Sensor
  Authors: Yuning Huang, Mohamed Abul Hassan, Jiangpeng He, Janine Higgins, Megan McCrory, Heather Eicher-Miller, Graham Thomas, Edward O Sazonov, Fengqing Maggie Zhu
  Comments: Accepted at CVPRw 2024
  Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [118] arXiv:2405.07813 (cross-list from cs.LG) [pdf, other]
  Title: Localizing Task Information for Improved Model Merging and Compression
  Comments: Accepted ICML 2024; The first two authors contributed equally to this work; Project website: this https URL
  Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [119] arXiv:2405.07780 (cross-list from cs.LG) [pdf, other]
  Title: Harnessing Hierarchical Label Distribution Variations in Test Agnostic Long-tail Recognition
  Authors: Zhiyong Yang, Qianqian Xu, Zitai Wang, Sicong Li, Boyu Han, Shilong Bao, Xiaochun Cao, Qingming Huang
  Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [120] arXiv:2405.07762 (cross-list from eess.IV) [pdf, other]
  Title: A method for supervoxel-wise association studies of age and other non-imaging variables from coronary computed tomography angiograms
  Comments: 34 pages
  Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [121] arXiv:2405.07674 (cross-list from eess.IV) [pdf, other]
  Title: CoVScreen: Pitfalls and recommendations for screening COVID-19 using Chest X-rays
  Authors: Sonit Singh
  Comments: 21 pages
  Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [122] arXiv:2405.07606 (cross-list from cs.HC) [pdf, other]
  Title: AIris: An AI-powered Wearable Assistive Device for the Visually Impaired
  Authors: Dionysia Danai Brilli, Evangelos Georgaras, Stefania Tsilivaki, Nikos Melanitis, Konstantina Nikita
  Subjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
- [123] arXiv:2405.07544 (cross-list from cs.RO) [pdf, other]
-
Title: Automatic Odometry-Less OpenDRIVE Generation From Sparse Point CloudsComments: 8 pages, 4 figures, 3 algorithms, 2 tablesJournal-ref: 2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC)Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [124] arXiv:2405.07489 (cross-list from cs.LG) [pdf, other]
-
Title: Sparse Domain Transfer via Elastic Net RegularizationSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [125] arXiv:2405.07392 (cross-list from cs.RO) [pdf, other]
-
Title: NGD-SLAM: Towards Real-Time SLAM for Dynamic Environments without GPUAuthors: Yuhao ZhangComments: 12 pages, 5 figuresSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [126] arXiv:2405.07338 (cross-list from eess.IV) [pdf, other]
-
Title: Explainable Convolutional Neural Networks for Retinal Fundus Classification and Cutting-Edge Segmentation Models for Retinal Blood Vessels from Fundus ImagesAuthors: Fatema Tuj Johora Faria, Mukaffi Bin Moin, Pronay Debnath, Asif Iftekher Fahim, Faisal Muhammad ShahSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [127] arXiv:2405.07309 (cross-list from cs.RO) [pdf, other]
-
Title: DiffGen: Robot Demonstration Generation via Differentiable Physics Simulation, Differentiable Rendering, and Vision-Language ModelSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [128] arXiv:2405.07283 (cross-list from cs.RO) [pdf, other]
-
Title: BeautyMap: Binary-Encoded Adaptable Ground Matrix for Dynamic Points Removal in Global MapsComments: The first two authors are co-first authors. 8 pages, accepted by RA-LSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [129] arXiv:2405.07256 (cross-list from eess.IV) [pdf, other]
-
Title: Leveraging Fixed and Dynamic Pseudo-labels for Semi-supervised Medical Image SegmentationComments: Under ReviewSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [130] arXiv:2405.07145 (cross-list from cs.CR) [pdf, other]
-
Title: Stable Signature is Unstable: Removing Image Watermark from Diffusion ModelsSubjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
- [131] arXiv:2405.07041 (cross-list from cs.RO) [pdf, other]
-
Title: Multi-agent Traffic Prediction via Denoised Endpoint DistributionSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [132] arXiv:2405.07033 (cross-list from cs.NI) [pdf, ps, other]
-
Title: A Performance Analysis Modeling Framework for Extended Reality Applications in Edge-Assisted Wireless NetworksComments: 12 pages, 4 figures; To appear in Proceedings of IEEE International Conference on Distributed Computing Systems (ICDCS), 2024Subjects: Networking and Internet Architecture (cs.NI); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Image and Video Processing (eess.IV)
- [133] arXiv:2405.07023 (cross-list from eess.IV) [pdf, other]
-
Title: Efficient Real-world Image Super-Resolution Via Adaptive Directional Gradient ConvolutionAuthors: Long Peng, Yang Cao, Renjing Pei, Wenbo Li, Jiaming Guo, Xueyang Fu, Yang Wang, Zheng-Jun ZhaSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [134] arXiv:2405.07001 (cross-list from cs.CL) [pdf, other]
-
Title: Evaluating Task-based Effectiveness of MLLMs on ChartsSubjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [135] arXiv:2405.06995 (cross-list from cs.SD) [pdf, other]
-
Title: Benchmarking Cross-Domain Audio-Visual Deception DetectionAuthors: Xiaobao Guo, Zitong Yu, Nithish Muthuchamy Selvaraj, Bingquan Shen, Adams Wai-Kin Kong, Alex C. KotComments: 10 pagesSubjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
- [136] arXiv:2405.06880 (cross-list from eess.IV) [pdf, other]
-
Title: EMCAD: Efficient Multi-scale Convolutional Attention Decoding for Medical Image SegmentationComments: 14 pages, 5 figures, 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [137] arXiv:2405.06859 (cross-list from cs.LG) [pdf, other]
-
Title: Reimplementation of Learning to Reweight Examples for Robust Deep LearningSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [138] arXiv:2405.06855 (cross-list from cs.LG) [pdf, other]
-
Title: Linear Explanations for Individual NeuronsComments: Published in ICML 2024Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [139] arXiv:2405.06789 (cross-list from eess.IV) [pdf, other]
-
Title: Self-Consistent Recursive Diffusion Bridge for Medical Image TranslationComments: 11 pages, 6 figuresSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [140] arXiv:2405.06786 (cross-list from eess.IV) [pdf, other]
-
Title: SAM3D: Zero-Shot Semi-Automatic Segmentation in 3D Medical Images with the Segment Anything ModelAuthors: Trevor J. Chan, Aarush Sahni, Jie Li, Alisha Luthra, Amy Fang, Alison Pouch, Chamith S. RajapakseSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [141] arXiv:2405.06702 (cross-list from cs.CL) [pdf, other]
-
Title: Malayalam Sign Language Identification using Finetuned YOLOv8 and Computer Vision TechniquesAuthors: Abhinand K., Abhiram B. Nair, Dhananjay C., Hanan Hamza, Mohammed Fawaz J., Rahma Fahim K., Anoop V. SSubjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
- [142] arXiv:2405.06646 (cross-list from cs.GR) [pdf, other]
-
Title: On-the-fly Learning to Transfer Motion Style with Diffusion Models: A Semantic Guidance ApproachComments: 23 pagesSubjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
Mon, 13 May 2024
- [143] arXiv:2405.06636 [pdf, other]
-
Title: Federated Document Visual Question Answering: A Pilot StudySubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [144] arXiv:2405.06634 [pdf, other]
-
Title: Multimodal LLMs Struggle with Basic Visual Network Analysis: a VNA BenchmarkComments: 11 pages, 3 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
- [145] arXiv:2405.06600 [pdf, other]
-
Title: Multi-Object Tracking in the DarkComments: Accepted by CVPR2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [146] arXiv:2405.06598 [pdf, other]
-
Title: A Lightweight Transformer for Remote Sensing Image Change CaptioningSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [147] arXiv:2405.06593 [pdf, other]
-
Title: Non-Uniform Spatial Alignment Errors in sUAS Imagery From Wide-Area DisastersAuthors: Thomas Manzini, Priyankari Perali, Raisa Karnik, Mihir Godbole, Hasnat Abdullah, Robin MurphyComments: 6 pages, 5 figures, 1 tableSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [148] arXiv:2405.06586 [pdf, other]
-
Title: Enhancing Weakly Supervised Semantic Segmentation with Multi-modal Foundation Models: An End-to-End ApproachSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [149] arXiv:2405.06574 [pdf, other]
-
Title: Deep video representation learning: a surveyComments: Multimedia Tools and Applications (2023) 1-31Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [150] arXiv:2405.06547 [pdf, other]
-
Title: OneTo3D: One Image to Re-editable Dynamic 3D Model and Video GenerationAuthors: Jinwei LinComments: 24 pages, 13 figures, 2 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [151] arXiv:2405.06536 [pdf, other]
-
Title: Mesh Denoising TransformerSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [152] arXiv:2405.06535 [pdf, other]
-
Title: Controllable Image Generation With Composed Parallel Token PredictionComments: 9 pages, 6 figures, non-anonymised pre-print for NeurIPS 2024 main conference. arXiv admin note: text overlap with arXiv:2402.04550, arXiv:2404.13788, arXiv:2403.06098, arXiv:2401.16025Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [153] arXiv:2405.06525 [pdf, other]
-
Title: Semantic and Spatial Adaptive Pixel-level Classifier for Semantic SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [154] arXiv:2405.06502 [pdf, other]
-
Title: Multi-Target Unsupervised Domain Adaptation for Semantic Segmentation without External DataSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [155] arXiv:2405.06468 [pdf, other]
-
Title: Pseudo-Prompt Generating in Pre-trained Vision-Language Models for Multi-Label Medical Image ClassificationSubjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [156] arXiv:2405.06467 [pdf, other]
-
Title: Attend, Distill, Detect: Attention-aware Entropy Distillation for Anomaly DetectionAuthors: Sushovan Jena, Vishwas Saini, Ujjwal Shaw, Pavitra Jain, Abhay Singh Raihal, Anoushka Banerjee, Sharad Joshi, Ananth Ganesh, Arnav BhavsarComments: 15 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [157] arXiv:2405.06408 [pdf, other]
-
Title: I3DGS: Improve 3D Gaussian Splatting from Multiple DimensionsAuthors: Jinwei LinComments: 16 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [158] arXiv:2405.06389 [pdf, other]
-
Title: Continual Novel Class Discovery via Feature Enhancement and AdaptationSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [159] arXiv:2405.06383 [pdf, other]
-
Title: How to Augment for Atmospheric Turbulence Effects on Thermal Adapted Object Detection Models?Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [160] arXiv:2405.06354 [pdf, other]
-
Title: KeepOriginalAugment: Single Image-based Better Information-Preserving Data Augmentation ApproachComments: This paper has been accepted at 20th International Conference on Artificial Intelligence Applications and Innovations 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [161] arXiv:2405.06345 [pdf, other]
-
Title: Evaluating Adversarial Robustness in the Spatial Frequency DomainComments: 14 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [162] arXiv:2405.06342 [pdf, other]
-
Title: Compression-Realized Deep Structural Network for Video Quality EnhancementAuthors: Hanchi Sun, Xiaohong Liu, Xinyang Jiang, Yifei Shen, Dongsheng Li, Xiongkuo Min, Guangtao ZhaiSubjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [163] arXiv:2405.06340 [pdf, other]
-
Title: Improving Transferable Targeted Adversarial Attack via Normalized Logit Calibration and Truncated Feature MixingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [164] arXiv:2405.06323 [pdf, other]
-
Title: Open Access Battle Damage Detection via Pixel-Wise T-Test on Sentinel-1 ImageryAuthors: Ollie BallingerSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [165] arXiv:2405.06319 [pdf, other]
-
Title: Decoding Emotions in Abstract Art: Cognitive Plausibility of CLIP in Recognizing Color-Emotion AssociationsComments: To appear in the Proceedings of the Annual Meeting of the Cognitive Science Society 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [166] arXiv:2405.06288 [pdf, other]
-
Title: PCLMix: Weakly Supervised Medical Image Segmentation via Pixel-Level Contrastive Learning and Dynamic Mix AugmentationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [167] arXiv:2405.06283 [pdf, other]
-
Title: Novel Class Discovery for Ultra-Fine-Grained Visual CategorizationComments: 10 pages, 6 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [168] arXiv:2405.06279 [pdf, other]
-
Title: Benchmarking Classical and Learning-Based Multibeam Point Cloud RegistrationComments: Accepted at ICRA 2024 (IEEE International Conference on Robotics and Automation 2024)Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [169] arXiv:2405.06278 [pdf, other]
-
Title: Exploring the Interplay of Interpretability and Robustness in Deep Neural Networks: A Saliency-guided ApproachSubjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
- [170] arXiv:2405.06277 [pdf, other]
-
Title: Learning A Spiking Neural Network for Efficient Image DerainingComments: Accepted by IJCAI2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [171] arXiv:2405.06264 [pdf, other]
-
Title: Selective Focus: Investigating Semantics Sensitivity in Post-training Quantization for Lane DetectionComments: Accepted by AAAI-24Journal-ref: AAAI 2024, 38, 11936-11943Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [172] arXiv:2405.06260 [pdf, other]
-
Title: Precise Apple Detection and Localization in Orchards using YOLOv5 for Robotic Harvesting SystemsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [173] arXiv:2405.06246 [pdf, ps, other]
-
Title: Comparative Analysis of Advanced Feature Matching Algorithms in Challenging High Spatial Resolution Optical Satellite Stereo ScenariosComments: The manuscript is accepted as Oral Presentation in IEEE International Geoscience and Remote Sensing Symposium (IGARSS 2024)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [174] arXiv:2405.06241 [pdf, other]
-
Title: MGS-SLAM: Monocular Sparse Tracking and Gaussian Mapping with Depth Smooth RegularizationComments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessibleSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [175] arXiv:2405.06228 [pdf, other]
-
Title: Context-Guided Spatial Feature Reconstruction for Efficient Semantic SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [176] arXiv:2405.06227 [pdf, other]
-
Title: MaskMatch: Boosting Semi-Supervised Learning Through Mask Autoencoder-Driven Feature LearningSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [177] arXiv:2405.06217 [pdf, other]
-
Title: DARA: Domain- and Relation-aware Adapters Make Parameter-efficient Tuning for Visual GroundingAuthors: Ting Liu, Xuyang Liu, Siteng Huang, Honggang Chen, Quanjun Yin, Long Qin, Donglin Wang, Yue HuComments: Accepted by ICME 2024 (Oral)Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
- [178] arXiv:2405.06216 [pdf, other]
-
Title: Event-based Structure-from-OrbitAuthors: Ethan Elms (1), Yasir Latif (1), Tae Ha Park (2), Tat-Jun Chin (1) ((1) The University of Adelaide, (2) Stanford University)Comments: This work will be published in the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [179] arXiv:2405.06214 [pdf, other]
-
Title: Aerial-NeRF: Adaptive Spatial Partitioning and Sampling for Large-Scale Aerial RenderingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [180] arXiv:2405.06201 [pdf, other]
-
Title: PhysMLE: Generalizable and Priors-Inclusive Multi-task Remote Physiological MeasurementSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [181] arXiv:2405.06198 [pdf, ps, other]
-
Title: MAPL: Memory Augmentation and Pseudo-Labeling for Semi-Supervised Anomaly DetectionAuthors: Junzhuo ChenSubjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [182] arXiv:2405.06196 [pdf, other]
-
Title: VLSM-Adapter: Finetuning Vision-Language Segmentation Efficiently with Lightweight BlocksComments: 12 pages, 5 figures, 2 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [183] arXiv:2405.06191 [pdf, ps, other]
-
Title: ODC-SA Net: Orthogonal Direction Enhancement and Scale Aware Network for Polyp SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [184] arXiv:2405.06185 [pdf, other]
-
Title: Zero-shot Degree of Ill-posedness Estimation for Active Small Object Change DetectionComments: 7 pages, 7 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [185] arXiv:2405.06181 [pdf, other]
-
Title: Residual-NeRF: Learning Residual NeRFs for Transparent Object ManipulationSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [186] arXiv:2405.06143 [pdf, other]
-
Title: Perceptual Crack Detection for Rendered 3D Textured MeshesComments: Accepted by IEEE QoMEX 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Computational Geometry (cs.CG); Multimedia (cs.MM)
- [187] arXiv:2405.06128 [pdf, other]
-
Title: Enhanced Multimodal Content Moderation of Children's Videos using Audiovisual FusionComments: 8 pages, 3 figures, Accepted at The 37th International FLAIRS ConferenceSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [188] arXiv:2405.06116 [pdf, other]
-
Title: Rethinking Efficient and Effective Point-based Networks for Event Camera Classification and Regression: EventMambaAuthors: Hongwei Ren, Yue Zhou, Jiadong Zhu, Haotian Fu, Yulong Huang, Xiaopeng Lin, Yuetong Fang, Fei Ma, Hao Yu, Bojun ChengComments: Extension Journal of TTPOINT and PEPNetSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [189] arXiv:2405.06088 [pdf, other]
-
Title: A Mixture of Experts Approach to 3D Human Motion PredictionComments: 16 pages, 6 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [190] arXiv:2405.06057 [pdf, other]
-
Title: UnSegGNet: Unsupervised Image Segmentation using Graph Neural NetworksAuthors: Kovvuri Sai Gopal Reddy, Bodduluri Saran, A. Mudit Adityaja, Saurabh J. Shigwan, Nitin KumarSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [191] arXiv:2405.06049 [pdf, other]
-
Title: BB-Patch: BlackBox Adversarial Patch-Attack using Zeroth-Order OptimizationSubjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
- [192] arXiv:2405.05983 [pdf, ps, other]
-
Title: Real-Time Pill Identification for the Visually Impaired Using Deep LearningSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [193] arXiv:2405.06473 (cross-list from cs.RO) [pdf, other]
-
Title: Autonomous Driving with a Deep Dual-Model Solution for Steering and Braking ControlComments: 6 pages, 2 figures, accepted for publication in Proceedings of International Conference on Smart and Sustainable Technologies (SpliTech 2024)Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [194] arXiv:2405.06463 (cross-list from eess.IV) [pdf, other]
-
Title: MRSegmentator: Robust Multi-Modality Segmentation of 40 Classes in MRI and CT SequencesAuthors: Hartmut Häntze, Lina Xu, Felix J. Dorfner, Leonhard Donle, Daniel Truhn, Hugo Aerts, Mathias Prokop, Bram van Ginneken, Alessa Hering, Lisa C. Adams, Keno K. BressemComments: 13 pages, 6 figures; corrected co-author infoSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [195] arXiv:2405.06301 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Learning from String SequencesComments: 10 pages, 1 figure, 4 tables, Technical ReportSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computational Engineering, Finance, and Science (cs.CE); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
- [196] arXiv:2405.06286 (cross-list from cs.RO) [pdf, ps, other]
-
Title: A Joint Approach Towards Data-Driven Virtual Testing for Automated Driving: The AVEAS ProjectAuthors: Leon Eisemann, Mirjam Fehling-Kaschek, Silke Forkert, Andreas Forster, Henrik Gommel, Susanne Guenther, Stephan Hammer, David Hermann, Marvin Klemp, Benjamin Lickert, Florian Luettner, Robin Moss, Nicole Neis, Maria Pohle, Dominik Schreiber, Cathrina Sowa, Daniel Stadler, Janina Stompe, Michael Strobelt, David Unger, Jens ZiehnComments: 6 pages, 5 figures, 2 tablesJournal-ref: Proceedings of the 7th International Symposium on Future Active Safety Technology toward zero traffic accidents (JSAE FAST-zero '23), 2023Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Machine Learning (cs.LG); Systems and Control (eess.SY)
- [197] arXiv:2405.06284 (cross-list from eess.IV) [pdf, other]
-
Title: Modality-agnostic Domain Generalizable Medical Image Segmentation by Multi-Frequency in Multi-Scale AttentionComments: Accepted in Computer Vision and Pattern Recognition (CVPR) 2024Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [198] arXiv:2405.06265 (cross-list from cs.RO) [pdf, other]
-
Title: Uncertainty-aware Semantic Mapping in Off-road Environments with Dempster-Shafer Theory of EvidenceComments: Our project website can be found at this https URLSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [199] arXiv:2405.06234 (cross-list from cs.LG) [pdf, other]
- [200] arXiv:2405.06175 (cross-list from eess.IV) [pdf, other]
-
Title: Prior-guided Diffusion Model for Cell Segmentation in Quantitative Phase ImagingSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [201] arXiv:2405.06166 (cross-list from eess.IV) [pdf, other]
-
Title: MDNet: Multi-Decoder Network for Abdominal CT Organs SegmentationAuthors: Debesh Jha, Nikhil Kumar Tomar, Koushik Biswas, Gorkem Durak, Matthew Antalek, Zheyuan Zhang, Bin Wang, Md Mostafijur Rahman, Hongyi Pan, Alpay Medetalibeyoglu, Yury Velichko, Daniela Ladner, Amir Borhani, Ulas BagciSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [202] arXiv:2405.06149 (cross-list from cs.AI) [pdf, other]
-
Title: DisBeaNet: A Deep Neural Network to augment Unmanned Surface Vessels for maritime situational awarenessSubjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Fri, 10 May 2024
- [203] arXiv:2405.05967 [pdf, other]
-
Title: Distilling Diffusion Models into Conditional GANsAuthors: Minguk Kang, Richard Zhang, Connelly Barnes, Sylvain Paris, Suha Kwak, Jaesik Park, Eli Shechtman, Jun-Yan Zhu, Taesung ParkComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
- [204] arXiv:2405.05953 [pdf, other]
-
Title: Frame Interpolation with Consecutive Brownian Bridge DiffusionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [205] arXiv:2405.05949 [pdf, other]
-
Title: CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-ExpertsAuthors: Jiachen Li, Xinyao Wang, Sijie Zhu, Chia-Wen Kuo, Lu Xu, Fan Chen, Jitesh Jain, Humphrey Shi, Longyin WenSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [206] arXiv:2405.05945 [pdf, other]
-
Title: Lumina-T2X: Transforming Text into Any Modality, Resolution, and Duration via Flow-based Large Diffusion TransformersAuthors: Peng Gao, Le Zhuo, Ziyi Lin, Chris Liu, Junsong Chen, Ruoyi Du, Enze Xie, Xu Luo, Longtian Qiu, Yuhang Zhang, Chen Lin, Rongjie Huang, Shijie Geng, Renrui Zhang, Junlin Xi, Wenqi Shao, Zhengkai Jiang, Tianshuo Yang, Weicai Ye, He Tong, Jingwen He, Yu Qiao, Hongsheng LiComments: Technical Report; Code at: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [207] arXiv:2405.05900 [pdf, other]
-
Title: A Comprehensive Survey of Masked Faces: Recognition, Detection, and UnmaskingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [208] arXiv:2405.05858 [pdf, other]
-
Title: Free-Moving Object Reconstruction and Pose Estimation with Virtual CameraSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Robotics (cs.RO)
- [209] arXiv:2405.05853 [pdf, other]
-
Title: Robust and Explainable Fine-Grained Visual Classification with Transfer Learning: A Dual-Carriageway FrameworkComments: Accepted in the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024 workshopSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [210] arXiv:2405.05852 [pdf, other]
-
Title: Pre-trained Text-to-Image Diffusion Models Are Versatile Representation Learners for ControlSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Robotics (cs.RO); Machine Learning (stat.ML)
- [211] arXiv:2405.05841 [pdf, other]
-
Title: Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text RecognitionComments: Accepted to IJCAI2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [212] arXiv:2405.05830 [pdf, ps, other]
-
Title: Mask-TS Net: Mask Temperature Scaling Uncertainty Calibration for Polyp SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [213] arXiv:2405.05811 [pdf, other]
-
Title: Parallel Cross Strip Attention Network for Single Image DehazingComments: 10 pages, 4 figures, CTISC'24Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [214] arXiv:2405.05808 [pdf, other]
-
Title: Fast and Controllable Post-training Sparsity: Learning Optimal Sparsity Allocation with Global Constraint in MinutesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [215] arXiv:2405.05806 [pdf, other]
-
Title: MasterWeaver: Taming Editability and Identity for Personalized Text-to-Image GenerationComments: 34 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [216] arXiv:2405.05803 [pdf, other]
-
Title: Boosting Multimodal Large Language Models with Visual Tokens Withdrawal for Rapid InferenceSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [217] arXiv:2405.05791 [pdf, other]
-
Title: Sequential Amodal Segmentation via Cumulative Occlusion LearningSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [218] arXiv:2405.05769 [pdf, other]
-
Title: Exploring Text-Guided Single Image Editing for Remote Sensing ImagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [219] arXiv:2405.05768 [pdf, other]
-
Title: FastScene: Text-Driven Fast 3D Indoor Scene Generation via Panoramic Gaussian SplattingComments: Accepted by IJCAI-2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [220] arXiv:2405.05766 [pdf, other]
-
Title: To Trust or Not to Trust: Towards a novel approach to measure trust for XAI systemsAuthors: Miquel Miró-Nicolau, Gabriel Moyà-Alcover, Antoni Jaume-i-Capó, Manuel González-Hidalgo, Maria Gemma Sempere Campello, Juan Antonio Palmer SanchoSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [221] arXiv:2405.05763 [pdf, ps, other]
-
Title: DP-MDM: Detail-Preserving MR Reconstruction via Multiple Diffusion ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [222] arXiv:2405.05760 [pdf, other]
-
Title: Similarity Guided Multimodal Fusion Transformer for Semantic Location Prediction in Social MediaSubjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [223] arXiv:2405.05755 [pdf, other]
-
Title: CSA-Net: Channel-wise Spatially Autocorrelated Attention NetworksSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [224] arXiv:2405.05749 [pdf, other]
-
Title: NeRFFaceSpeech: One-shot Audio-driven 3D Talking Head Synthesis via Generative PriorComments: 11 pages, 5 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [225] arXiv:2405.05745 [pdf, other]
- [226] arXiv:2405.05742 [pdf, other]
-
Title: How Quality Affects Deep Neural Networks in Fine-Grained Image ClassificationComments: VISAPP 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [227] arXiv:2405.05714 [pdf, other]
-
Title: Estimating Noisy Class Posterior with Part-level Labels for Noisy Label LearningComments: CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [228] arXiv:2405.05707 [pdf, other]
-
Title: LatentColorization: Latent Diffusion-Based Speaker Video ColorizationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [229] arXiv:2405.05691 [pdf, other]
-
Title: StableMoFusion: Towards Robust and Efficient Diffusion-based Motion Generation FrameworkAuthors: Yiheng Huang, Hui Yang, Chuanchen Luo, Yuxi Wang, Shibiao Xu, Zhaoxiang Zhang, Man Zhang, Junran PengSubjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
- [230] arXiv:2405.05674 [pdf, ps, other]
-
Title: TransAnaNet: Transformer-based Anatomy Change Prediction Network for Head and Neck Cancer Patient RadiotherapySubjects: Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
- [231] arXiv:2405.05672 [pdf, other]
-
Title: Multi-Stream Keypoint Attention Network for Sign Language Recognition and TranslationComments: 15 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [232] arXiv:2405.05663 [pdf, other]
-
Title: RPBG: Towards Robust Neural Point-based Graphics in the WildAuthors: Qingtian Zhu, Zizhuang Wei, Zhongtian Zheng, Yifan Zhan, Zhuyu Yao, Jiawang Zhang, Kejian Wu, Yinqiang ZhengSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [233] arXiv:2405.05647 [pdf, ps, other]
-
Title: Letter to the Editor: What are the legal and ethical considerations of submitting radiology reports to ChatGPT?Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [234] arXiv:2405.05636 [pdf, other]
-
Title: SwapTalk: Audio-Driven Talking Face Generation with One-Shot Customization in Latent SpaceSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [235] arXiv:2405.05615 [pdf, other]
-
Title: Memory-Space Visual Prompting for Efficient Vision-Language Fine-TuningComments: Accepted to ICML2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [236] arXiv:2405.05614 [pdf, other]
-
Title: Depth Awakens: A Depth-perceptual Attention Fusion Network for RGB-D Camouflaged Object DetectionJournal-ref: Image and Vision Computing, 143:104924, 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Networking and Internet Architecture (cs.NI)
- [237] arXiv:2405.05613 [pdf, other]
-
Title: Robust Pseudo-label Learning with Neighbor Relation for Unsupervised Visible-Infrared Person Re-IdentificationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [238] arXiv:2405.05605 [pdf, other]
-
Title: Minimal Perspective AutocalibrationComments: 8 pages main paper + 2 pages references + 8 pages supplementary; to be presented at CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [239] arXiv:2405.05587 [pdf, other]
-
Title: Navigate Beyond Shortcuts: Debiased Learning through the Lens of Neural CollapseComments: CVPR 2024 HighlightSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [240] arXiv:2405.05584 [pdf, other]
-
Title: A Survey on Backbones for Deep Video Action RecognitionComments: This paper has been accepted by ICME workshopSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [241] arXiv:2405.05574 [pdf, other]
-
Title: Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of AircraftSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [242] arXiv:2405.05573 [pdf, other]
-
Title: Poisoning-based Backdoor Attacks for Arbitrary Target Label with Positive TriggersSubjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
- [243] arXiv:2405.05553 [pdf, other]
-
Title: Towards Robust Physical-world Backdoor Attacks on Lane DetectionSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [244] arXiv:2405.05552 [pdf, other]
-
Title: Bidirectional Progressive Transformer for Interaction Intention AnticipationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [245] arXiv:2405.05551 [pdf, ps, other]
-
Title: The object detection model uses combined extraction with KNN and RF classificationAuthors: Florentina Tatrin Kurniati, Daniel HF Manongga, Irwan Sembiring, Sutarto Wijono, Roy Rudolf HuizenJournal-ref: IJEECS, pp 436-445, Vol 35, No 1 July 2024; https://ijeecs.iaescore.com/index.php/IJEECS/article/view/35888Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [246] arXiv:2405.05538 [pdf, other]
-
Title: A Survey on Personalized Content Synthesis with Diffusion ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [247] arXiv:2405.05530 [pdf, other]
-
Title: NurtureNet: A Multi-task Video-based Approach for Newborn AnthropometryAuthors: Yash Khandelwal, Mayur Arvind, Sriram Kumar, Ashish Gupta, Sachin Kumar Danisetty, Piyush Bagad, Anish Madan, Mayank Lunayach, Aditya Annavajjala, Abhishek Maiti, Sansiddh Jain, Aman Dalmia, Namrata Deka, Jerome White, Jigar Doshi, Angjoo Kanazawa, Rahul Panicker, Alpan Raval, Srinivas Rana, Makarand TapaswiComments: Accepted at CVPM Workshop at CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [248] arXiv:2405.05524 [pdf, other]
-
Title: Universal Adversarial Perturbations for Vision-Language Pre-trained ModelsComments: 9 pages, 5 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
- [249] arXiv:2405.05523 [pdf, other]
-
Title: Prompt When the Animal is: Temporal Animal Behavior Grounding with Positional Recovery TrainingComments: Accepted by ICMEW 2024. arXiv admin note: text overlap with arXiv:2404.13657Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [250] arXiv:2405.05518 [pdf, other]
-
Title: DTCLMapper: Dual Temporal Consistent Learning for Vectorized HD Map ConstructionComments: The source code will be made publicly available at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO); Image and Video Processing (eess.IV)
- [251] arXiv:2405.05502 [pdf, other]
-
Title: Towards Accurate and Robust Architectures via Neural Architecture SearchComments: Accepted by CVPR2024. arXiv admin note: substantial text overlap with arXiv:2212.14049Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
- [252] arXiv:2405.05497 [pdf, other]
-
Title: Multi-Level Feature Fusion Network for Lightweight Stereo Image Super-ResolutionComments: 10 pages, 7 figures, CVPRWorkshop NTIRE2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [253] arXiv:2405.05488 [pdf, ps, other]
-
Title: Advancing Head and Neck Cancer Survival Prediction via Multi-Label Learning and Deep Model InterpretationComments: 10 pages, 4 figures, 2 tables, 2 pages of supplementary materialSubjects: Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
- [254] arXiv:2405.05477 [pdf, other]
-
Title: DynaSeg: A Deep Dynamic Fusion Method for Unsupervised Image Segmentation Incorporating Feature Similarity and Spatial ContinuitySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [255] arXiv:2405.05446 [pdf, other]
-
Title: GDGS: Gradient Domain Gaussian Splatting for Sparse Representation of Radiance FieldsAuthors: Yuanhao GongComments: arXiv admin note: text overlap with arXiv:2404.09105Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
- [256] arXiv:2405.05428 [pdf, other]
-
Title: Adversary-Guided Motion Retargeting for Skeleton AnonymizationSubjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
- [257] arXiv:2405.05422 [pdf, other]
-
Title: EarthMatch: Iterative Coregistration for Fine-grained Localization of Astronaut PhotographyAuthors: Gabriele Berton, Gabriele Goletto, Gabriele Trivigno, Alex Stoken, Barbara Caputo, Carlo MasoneComments: CVPR 2024 IMW - webpage: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [258] arXiv:2405.05363 [pdf, other]
-
Title: LOC-ZSON: Language-driven Object-Centric Zero-Shot Object Retrieval and NavigationAuthors: Tianrui Guan, Yurou Yang, Harry Cheng, Muyuan Lin, Richard Kim, Rajasimman Madhivanan, Arnie Sen, Dinesh ManochaComments: Accepted to ICRA 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [259] arXiv:2405.05355 [pdf, other]
-
Title: Geometry-Informed Distance Candidate Selection for Adaptive Lightweight Omnidirectional Stereo Vision with Fisheye ImagesSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [260] arXiv:2405.05354 [pdf, other]
-
Title: Transfer-LMR: Heavy-Tail Driving Behavior Recognition in Diverse Traffic ScenariosSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [261] arXiv:2405.05297 [pdf, ps, other]
-
Title: Deep Learning Method to Predict Wound Healing Progress Based on Collagen Fibers in Wound TissueSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [262] arXiv:2405.05295 [pdf, other]
-
Title: Relevant Irrelevance: Generating Alterfactual Explanations for Image ClassifiersAuthors: Silvan Mertes, Tobias Huber, Christina Karle, Katharina Weitz, Ruben Schlagowski, Cristina Conati, Elisabeth AndréComments: Accepted at IJCAI 2024. arXiv admin note: text overlap with arXiv:2207.09374Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [263] arXiv:2405.05261 [pdf, other]
-
Title: 3D Holistic OR AnonymizationAuthors: Tony Danjun WangComments: This bachelor's thesis was the foundation of the paper "DisguisOR: Holistic Face Anonymization for the Operating Room" (see arXiv:2307.14241), published at IPCAI'23Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [264] arXiv:2405.05260 [pdf, other]
-
Title: Financial Table Extraction in Image DocumentsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [265] arXiv:2405.05956 (cross-list from cs.RO) [pdf, other]
-
Title: Probing Multimodal LLMs as World Models for DrivingSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [266] arXiv:2405.05944 (cross-list from eess.IV) [pdf, other]
-
Title: MRISegmentator-Abdomen: A Fully Automated Multi-Organ and Structure Segmentation Tool for T1-weighted Abdominal MRIAuthors: Yan Zhuang, Tejas Sudharshan Mathai, Pritam Mukherjee, Brandon Khoury, Boah Kim, Benjamin Hou, Nusrat Rabbee, Ronald M. SummersComments: 23 pages, 13 figuresSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [267] arXiv:2405.05941 (cross-list from cs.RO) [pdf, other]
-
Title: Evaluating Real-World Robot Manipulation Policies in SimulationAuthors: Xuanlin Li, Kyle Hsu, Jiayuan Gu, Karl Pertsch, Oier Mees, Homer Rich Walke, Chuyuan Fu, Ishikaa Lunawat, Isabel Sieh, Sean Kirmani, Sergey Levine, Jiajun Wu, Chelsea Finn, Hao Su, Quan Vuong, Ted XiaoSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [268] arXiv:2405.05934 (cross-list from cs.LG) [pdf, other]
-
Title: Theoretical Guarantees of Data Augmented Last Layer Retraining MethodsComments: Extended version of a paper accepted to ISIT 2024. arXiv admin note: text overlap with arXiv:2402.11039Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Information Theory (cs.IT); Machine Learning (stat.ML)
- [269] arXiv:2405.05886 (cross-list from cs.LG) [pdf, other]
-
Title: Exploiting Autoencoder's Weakness to Generate Pseudo AnomaliesComments: SharedIt link: this https URLJournal-ref: Neural Computing and Applications, pp.1-17 (2024)Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [270] arXiv:2405.05876 (cross-list from cs.RO) [pdf, other]
-
Title: Composable Part-Based ManipulationComments: Presented at CoRL 2023. For videos and additional results, see our website: this https URLSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [271] arXiv:2405.05847 (cross-list from cs.LG) [pdf, other]
-
Title: Learned feature representations are biased by complexity, learning order, position, and moreSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [272] arXiv:2405.05846 (cross-list from cs.CR) [pdf, other]
-
Title: Could It Be Generated? Towards Practical Analysis of Memorization in Text-To-Image Diffusion ModelsSubjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
- [273] arXiv:2405.05836 (cross-list from cs.LG) [pdf, other]
-
Title: Informed Decision-Making through Advancements in Open Set Recognition and Unknown Sample DetectionComments: Accepted for proceedings of the 57th Hawaii International Conference on System Sciences: 10 pages, 6 figures, 3-6 January 2024, Honolulu, United StatesJournal-ref: Atefeh, M., & Marco, C. (2024). "Informed Decision-Making through Advancements in Open Set Recognition and Unknown Sample Detection." Proceedings of the 57th Hawaii International Conference on System Sciences, 1090-1999Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [274] arXiv:2405.05828 (cross-list from cs.RO) [pdf, other]
-
Title: MAD-ICP: It Is All About Matching Data -- Robust and Informed LiDAR OdometryComments: this https URLSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [275] arXiv:2405.05814 (cross-list from eess.IV) [pdf, ps, other]
-
Title: MSDiff: Multi-Scale Diffusion Model for Ultra-Sparse View CT ReconstructionSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [276] arXiv:2405.05800 (cross-list from cs.GR) [pdf, other]
-
Title: DragGaussian: Enabling Drag-style Manipulation on 3D Gaussian RepresentationSubjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
- [277] arXiv:2405.05792 (cross-list from cs.RO) [pdf, other]
-
Title: RoboHop: Segment-based Topological Map Representation for Open-World Visual NavigationAuthors: Sourav Garg, Krishan Rana, Mehdi Hosseinzadeh, Lachlan Mares, Niko Sünderhauf, Feras Dayoub, Ian ReidComments: Published at ICRA 2024; 9 pages, 8 figuresSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
- [278] arXiv:2405.05787 (cross-list from cs.RO) [pdf, other]
-
Title: Autonomous Robotic Ultrasound System for Liver Follow-up Diagnosis: Pilot Phantom StudyAuthors: Tianpeng Zhang (1), Sekeun Kim (2), Jerome Charton (2), Haitong Ma (1), Kyungsang Kim (2), Na Li (1), Quanzheng Li (2) ((1) SEAS, Harvard University (2) CAMCA, Massachusetts General Hospital and Harvard Medical School)Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Systems and Control (eess.SY)
- [279] arXiv:2405.05695 (cross-list from cs.LG) [pdf, other]
-
Title: Aux-NAS: Exploiting Auxiliary Labels with Negligibly Extra Inference CostComments: Accepted to ICLR 2024Journal-ref: International Conference on Learning Representations (ICLR), 2024Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
- [280] arXiv:2405.05667 (cross-list from eess.IV) [pdf, other]
-
Title: VM-DDPM: Vision Mamba Diffusion for Medical Image SynthesisSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [281] arXiv:2405.05658 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Artificial intelligence for abnormality detection in high volume neuroimaging: a systematic review and meta-analysisAuthors: Siddharth Agarwal, David A. Wood, Mariusz Grzeda, Chandhini Suresh, Munaib Din, James Cole, Marc Modat, Thomas C BoothSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [282] arXiv:2405.05648 (cross-list from cs.RO) [pdf, other]
-
Title: ASGrasp: Generalizable Transparent Object Reconstruction and Grasping from RGB-D Active Stereo CameraComments: IEEE International Conference on Robotics and Automation (ICRA), 2024Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [283] arXiv:2405.05619 (cross-list from cs.LG) [pdf, other]
-
Title: Rectified Gaussian kernel multi-view k-means clusteringAuthors: Kristina P. SinagaComments: 13 pages, 1 figure, 7 TablesSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [284] arXiv:2405.05588 (cross-list from cs.LG) [pdf, other]
-
Title: Model Inversion Robustness: Can Transfer Learning Help?Journal-ref: CVPR 2024Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
- [285] arXiv:2405.05564 (cross-list from eess.IV) [pdf, other]
- [286] arXiv:2405.05520 (cross-list from eess.IV) [pdf, other]
-
Title: Continuous max-flow augmentation of self-supervised few-shot learning on SPECT left ventriclesComments: ISBI 2024 Accepted paper for presentationSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [287] arXiv:2405.05386 (cross-list from cs.LG) [pdf, other]
-
Title: Interpretability Needs a New ParadigmSubjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
- [288] arXiv:2405.05336 (cross-list from eess.IV) [pdf, other]
-
Title: Joint semi-supervised and contrastive learning enables zero-shot domain-adaptation and multi-domain segmentationAuthors: Alvaro Gomariz, Yusuke Kikuchi, Yun Yvonna Li, Thomas Albrecht, Andreas Maunz, Daniela Ferrara, Huanxiang Lu, Orcun GokselSubjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Thu, 9 May 2024
- [289] arXiv:2405.05259 [pdf, other]
-
Title: OpenESS: Event-based Semantic Scene Understanding with Open VocabulariesComments: CVPR 2024 (Highlight); 26 pages, 12 figures, 11 tables; Code at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [290] arXiv:2405.05258 [pdf, other]
-
Title: Multi-Modal Data-Efficient 3D Scene Understanding for Autonomous DrivingAuthors: Lingdong Kong, Xiang Xu, Jiawei Ren, Wenwei Zhang, Liang Pan, Kai Chen, Wei Tsang Ooi, Ziwei LiuComments: Preprint; 17 pages, 6 figures, 8 tables; Code at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
- [291] arXiv:2405.05256 [pdf, other]
-
Title: THRONE: An Object-based Hallucination Benchmark for the Free-form Generations of Large Vision-Language ModelsAuthors: Prannay Kaul, Zhizhong Li, Hao Yang, Yonatan Dukler, Ashwin Swaminathan, C. J. Taylor, Stefano SoattoComments: In CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [292] arXiv:2405.05252 [pdf, other]
-
Title: Attention-Driven Training-Free Efficiency Enhancement of Diffusion ModelsComments: Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Image and Video Processing (eess.IV); Signal Processing (eess.SP)
- [293] arXiv:2405.05241 [pdf, other]
-
Title: BenthicNet: A global compilation of seafloor images for deep learning applicationsAuthors: Scott C. Lowe, Benjamin Misiuk, Isaac Xu, Shakhboz Abdulazizov, Amit R. Baroi, Alex C. Bastos, Merlin Best, Vicki Ferrini, Ariell Friedman, Deborah Hart, Ove Hoegh-Guldberg, Daniel Ierodiaconou, Julia Mackin-McLaughlin, Kathryn Markey, Pedro S. Menandro, Jacquomo Monk, Shreya Nemani, John O'Brien, Elizabeth Oh, Luba Y. Reshitnyk, Katleen Robert, Chris M. Roelfsema, Jessica A. Sameoto, Alexandre C. G. Schimel, Jordan A. Thomson, Brittany R. Wilson, Melisa C. Wong, Craig J. Brown, Thomas TrappenbergSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [294] arXiv:2405.05237 [pdf, other]
-
Title: EVA-X: A Foundation Model for General Chest X-ray Analysis with Self-supervised LearningAuthors: Jingfeng Yao, Xinggang Wang, Yuehao Song, Huangxuan Zhao, Jun Ma, Yajie Chen, Wenyu Liu, Bo WangComments: codes available at: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [295] arXiv:2405.05224 [pdf, other]
-
Title: Imagine Flash: Accelerating Emu Diffusion Models with Backward DistillationAuthors: Jonas Kohler, Albert Pumarola, Edgar Schönfeld, Artsiom Sanakoyeu, Roshan Sumbaly, Peter Vajda, Ali ThabetSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [296] arXiv:2405.05216 [pdf, other]
-
Title: FinePOSE: Fine-Grained Prompt-Driven 3D Human Pose Estimation via Diffusion ModelsComments: Accepted by CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [297] arXiv:2405.05173 [pdf, other]
-
Title: A Survey on Occupancy Perception for Autonomous Driving: The Information Fusion PerspectiveSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [298] arXiv:2405.05164 [pdf, other]
-
Title: ProbRadarM3F: mmWave Radar based Human Skeletal Pose Estimation with Probability Map Guided Multi-Format Feature FusionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [299] arXiv:2405.05145 [pdf, other]
-
Title: Conformal Semantic Image Segmentation: Post-hoc Quantification of Predictive UncertaintySubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [300] arXiv:2405.05143 [pdf, other]
-
Title: Learning Object Semantic Similarity with Self-SupervisionSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
- [301] arXiv:2405.05133 [pdf, other]
-
Title: Identifying every building's function in large-scale urban areas with multi-modality remote-sensing dataComments: 5 pages, 7 figures, accepted by IGARSS 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [302] arXiv:2405.05130 [pdf, other]
-
Title: Multi-scale Bottleneck Transformer for Weakly Supervised Multimodal Violence DetectionComments: Accepted by ICME 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
- [303] arXiv:2405.05079 [pdf, other]
-
Title: Power Variable Projection for Initialization-Free Large-Scale Bundle AdjustmentSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [304] arXiv:2405.05057 [pdf, other]
-
Title: Real-Time Motion Detection Using Dynamic Mode DecompositionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [305] arXiv:2405.05039 [pdf, other]
-
Title: Reviewing Intelligent Cinematography: AI research for camera-based video productionComments: For researchers and cinematographers. 43 pages including Table of Contents, List of Figures and Tables. We obtained permission to use Figures 5 and 11. All other Figures have been drawn by usSubjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
- [306] arXiv:2405.05031 [pdf, other]
-
Title: Mitigating Bias Using Model-Agnostic Data AttributionComments: Accepted to the 2024 IEEE CVPR Workshop on Fair, Data-efficient, and Trusted Computer VisionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [307] arXiv:2405.05027 [pdf, other]
-
Title: StyleMamba: State Space Model for Efficient Text-driven Image Style TransferComments: Blind submission to ECAI 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [308] arXiv:2405.05016 [pdf, other]
-
Title: TGTM: TinyML-based Global Tone Mapping for HDR SensorsSubjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [309] arXiv:2405.05012 [pdf, other]
-
Title: The Entropy Enigma: Success and Failure of Entropy MinimizationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [310] arXiv:2405.05010 [pdf, other]
-
Title: ${M^2D}$NeRF: Multi-Modal Decomposition NeRF with 3D Feature FieldsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [311] arXiv:2405.05004 [pdf, other]
-
Title: TENet: Targetness Entanglement Incorporating with Multi-Scale Pooling and Mutually-Guided Fusion for RGB-E Object TrackingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [312] arXiv:2405.05001 [pdf, other]
-
Title: HMANet: Hybrid Multi-Axis Aggregation Network for Image Super-ResolutionComments: 12 pages, 10 figures, conferenceSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [313] arXiv:2405.04997 [pdf, other]
-
Title: Bridging the Gap Between Saliency Prediction and Image Quality AssessmentSubjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [314] arXiv:2405.04974 [pdf, other]
-
Title: Discrepancy-based Diffusion Models for Lesion Detection in Brain MRISubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [315] arXiv:2405.04971 [pdf, other]
-
Title: End-to-End Semi-Supervised approach with Modulated Object Queries for Table Detection in DocumentsComments: ICDAR-IJDAR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [316] arXiv:2405.04969 [pdf, other]
-
Title: A review on discriminative self-supervised learning methodsComments: 21 pages, 7 figures, 11 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [317] arXiv:2405.04964 [pdf, other]
-
Title: Frequency-Assisted Mamba for Remote Sensing Image Super-ResolutionComments: Frequency-Assisted Mamba for Remote Sensing Image Super-ResolutionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [318] arXiv:2405.04953 [pdf, other]
-
Title: Supervised Anomaly Detection for Complex Industrial ImagesComments: Accepted to CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [319] arXiv:2405.04950 [pdf, other]
-
Title: VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual ContextComments: 17 pages; Accepted by ICML 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
- [320] arXiv:2405.04943 [pdf, ps, other]
-
Title: Unsupervised Skin Feature Tracking with Deep Neural NetworksComments: arXiv admin note: text overlap with arXiv:2112.14159Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [321] arXiv:2405.04940 [src]
-
Title: Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReIDAuthors: Wentao TanComments: arXiv admin comment: This version has been removed by arXiv administrators as the submitter did not have the rights to agree to the license at the time of submissionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [322] arXiv:2405.04918 [pdf, other]
-
Title: Delve into Base-Novel Confusion: Redundancy Exploration for Few-Shot Class-Incremental LearningSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [323] arXiv:2405.04913 [pdf, other]
-
Title: Weakly-supervised Semantic Segmentation via Dual-stream Contrastive Learning of Cross-image Contextual InformationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [324] arXiv:2405.04909 [pdf, other]
-
Title: Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [325] arXiv:2405.04900 [pdf, other]
-
Title: Self-supervised Gait-based Emotion Representation Learning from Selective Strongly Augmented Skeleton SequencesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [326] arXiv:2405.04889 [pdf, other]
-
Title: Fast LiDAR Upsampling using Conditional Diffusion ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [327] arXiv:2405.04883 [pdf, other]
-
Title: FreeBind: Free Lunch in Unified Multimodal Space via Knowledge FusionAuthors: Zehan Wang, Ziang Zhang, Xize Cheng, Rongjie Huang, Luping Liu, Zhenhui Ye, Haifeng Huang, Yang Zhao, Tao Jin, Peng Gao, Zhou ZhaoComments: Accepted by ICML 2024. The code and checkpoints will be released at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [328] arXiv:2405.04858 [pdf, other]
-
Title: Pedestrian Attribute Recognition as Label-balanced Multi-label LearningComments: Accepted as ICML2024 main conference paperSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [329] arXiv:2405.04834 [pdf, other]
-
Title: FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image GenerationAuthors: Xuehai He, Jian Zheng, Jacob Zhiyuan Fang, Robinson Piramuthu, Mohit Bansal, Vicente Ordonez, Gunnar A Sigurdsson, Nanyun Peng, Xin Eric WangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [330] arXiv:2405.04815 [pdf, other]
-
Title: Proportion Estimation by Masked Learning from Label ProportionComments: Accepted at The 3rd MICCAI workshop on Data Augmentation, Labeling, and ImperfectionsSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [331] arXiv:2405.04807 [pdf, other]
-
Title: Transformer Architecture for NetsDBSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [332] arXiv:2405.04800 [pdf, other]
-
Title: DeepDamageNet: A two-step deep-learning model for multi-disaster building damage segmentation and classification using satellite imagerySubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [333] arXiv:2405.04788 [pdf, other]
-
Title: DiffMatch: Visual-Language Guidance Makes Better Semi-supervised Change DetectorComments: 13 pages, 4 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [334] arXiv:2405.04782 [pdf, other]
-
Title: Dual-Image Enhanced CLIP for Zero-Shot Anomaly DetectionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [335] arXiv:2405.04771 [pdf, other]
-
Title: Exploring Vision Transformers for 3D Human Motion-Language Models with Motion PatchesComments: Accepted to CVPR 2024, Project website: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [336] arXiv:2405.04759 [pdf, ps, other]
-
Title: Multi-Label Out-of-Distribution Detection with Spectral Normalized Joint EnergySubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [337] arXiv:2405.04741 [pdf, other]
-
Title: All in One Framework for Multimodal Re-identification in the WildComments: 12 pages, 3 figures, CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [338] arXiv:2405.04722 [pdf, other]
-
Title: Detecting and Refining HiRISE Image Patches Obscured by Atmospheric DustAuthors: Kunal Sunil KasodekarSubjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [339] arXiv:2405.04717 [pdf, other]
-
Title: Remote DiffusionAuthors: Kunal Sunil KasodekarSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [340] arXiv:2405.04682 [pdf, other]
-
Title: TALC: Time-Aligned Captions for Multi-Scene Text-to-Video GenerationComments: 23 pages, 12 figures, 8 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [341] arXiv:2405.04675 [pdf, other]
-
Title: TexControl: Sketch-Based Two-Stage Fashion Image Generation Using Diffusion ModelComments: 5 pages, 8 figures, accepted in NICOGRAPH International 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [342] arXiv:2405.04662 [pdf, other]
-
Title: Radar Fields: Frequency-Space Neural Scene Representations for FMCW RadarAuthors: David Borts, Erich Liang, Tim Brödermann, Andrea Ramazzina, Stefanie Walz, Edoardo Palladin, Jipeng Sun, David Bruggemann, Christos Sakaridis, Luc Van Gool, Mario Bijelic, Felix HeideComments: 8 pages, 6 figures, to be published in SIGGRAPH 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [343] arXiv:2405.04650 [pdf, other]
-
Title: A Self-Supervised Method for Body Part Segmentation and Keypoint Detection of Rat ImagesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [344] arXiv:2405.04634 [pdf, other]
-
Title: FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse LandscapesComments: 15 pages | 9 figures | 8 tables | Dataset is available at this https URL | Trained model is available at this https URL | Deep learning code repository is on GitHub at this https URL | Data engineering code repository is on GitHub at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [345] arXiv:2405.04605 [pdf, ps, other]
-
Title: AI in Lung Health: Benchmarking Detection and Diagnostic Models Across Multiple CT Scan DatasetsAuthors: Fakrul Islam Tushar, Avivah Wang, Lavsen Dahal, Michael R. Harowicz, Kyle J. Lafata, Tina D. Tailor, Joseph Y. LoComments: 16 pages, 2 tables, 5 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [346] arXiv:2405.04589 [pdf, other]
-
Title: A Novel Wide-Area Multiobject Detection System with High-Probability Region SearchingComments: Accepted by ICRA 2024Journal-ref: 2024 IEEE International Conference on Robotics and Automation (ICRA)Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [347] arXiv:2405.04549 [pdf, other]
-
Title: ClothPPO: A Proximal Policy Optimization Enhancing Framework for Robotic Cloth Manipulation with Observation-Aligned Action SpacesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [348] arXiv:2405.04538 [pdf, other]
-
Title: DiffFinger: Advancing Synthetic Fingerprint Generation through Denoising Diffusion Probabilistic ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [349] arXiv:2405.04537 [pdf, other]
-
Title: An intuitive multi-frequency feature representation for SO(3)-equivariant networksComments: ICLR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
- [350] arXiv:2405.04536 [pdf, other]
-
Title: When Training-Free NAS Meets Vision Transformer: A Neural Tangent Kernel PerspectiveComments: ICASSP2024 oralSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [351] arXiv:2405.04535 [pdf, other]
-
Title: Image Classification for CSSVD Detection in Cacao PlantsSubjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [352] arXiv:2405.05170 (cross-list from cs.MM) [pdf, other]
-
Title: Picking watermarks from noise (PWFN): an improved robust watermarking model against intensive distortionsComments: Accepted by ICME2024Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [353] arXiv:2405.05160 (cross-list from cs.LG) [pdf, other]
-
Title: Selective Classification Under Distribution ShiftsComments: Total 25 pages (14 pages for main body); preprint for journal submissionSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [354] arXiv:2405.05095 (cross-list from math.NA) [pdf, other]
-
Title: Approximation properties relative to continuous scale space for hybrid discretizations of Gaussian derivative operatorsAuthors: Tony LindebergComments: 13 pages, 11 figures. arXiv admin note: text overlap with arXiv:2311.11317Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV)
- [355] arXiv:2405.05007 (cross-list from eess.IV) [pdf, other]
-
Title: HC-Mamba: Vision MAMBA with Hybrid Convolutional Techniques for Medical Image SegmentationAuthors: Jiashu XuSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [356] arXiv:2405.04966 (cross-list from cs.IT) [pdf, other]
-
Title: Communication-Efficient Collaborative Perception via Information Filling with CodebookComments: 10 pages, Accepted by CVPR 2024Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
- [357] arXiv:2405.04902 (cross-list from eess.IV) [pdf, other]
-
Title: HAGAN: Hybrid Augmented Generative Adversarial Network for Medical Image SynthesisSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [358] arXiv:2405.04890 (cross-list from cs.RO) [pdf, other]
-
Title: GISR: Geometric Initialization and Silhouette-based Refinement for Single-View Robot Pose and Configuration EstimationComments: Submitted to IEEE Robotics and Automation Letters (RA-L)Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [359] arXiv:2405.04867 (cross-list from eess.IV) [pdf, other]
-
Title: MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and ResultsAuthors: Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng, Yongyong Chen, Jingyong Su, Xianyu Guan, Hongyuan Yu, Cheng Wan, Jiamin Lin, Binnan Han, Yajun Zou, Zhuoyuan Wu, Yuan Huang, Yongsheng Yu, Daoan Zhang, Jizhe Li, Xuanwu Yin, Kunlong Zuo, Yunfan Lu, Yijie Xu, Wenzong Ma, Weiyu Guo, Hui Xiong, Wei Yu, Bingchun Luo, Sabari Nathan, Priya KansalComments: MIPI@CVPR2024. Website: this https URLSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [360] arXiv:2405.04812 (cross-list from cs.RO) [pdf, other]
-
Title: General Place Recognition Survey: Towards Real-World AutonomyAuthors: Peng Yin, Jianhao Jiao, Shiqi Zhao, Lingyun Xu, Guoquan Huang, Howie Choset, Sebastian Scherer, Jianda HanComments: 20 pages, 12 figures, under reviewSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [361] arXiv:2405.04778 (cross-list from eess.IV) [pdf, other]
-
Title: Teacher-Student Network for Real-World Face Super-Resolution with Progressive Embedding of Edge InformationComments: Accepted by ICIP 2023Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [362] arXiv:2405.04610 (cross-list from eess.IV) [pdf, other]
-
Title: Exploring Explainable AI Techniques for Improved Interpretability in Lung and Colon Cancer ClassificationAuthors: Mukaffi Bin Moin, Fatema Tuj Johora Faria, Swarnajit Saha, Bushra Kamal Rafa, Mohammad Shafiul AlamComments: Accepted in 4th International Conference on Computing and Communication Networks (ICCCNet-2024)Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [363] arXiv:2405.04595 (cross-list from eess.IV) [pdf, ps, other]
-
Title: An Advanced Features Extraction Module for Remote Sensing Image Super-ResolutionComments: Preprint of paper from The 21st International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology or ECTI-CON 2024, Khon Kaen, ThailandJournal-ref: ECTI-CON 2024, Khon Kaen ThailandSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [364] arXiv:2405.04507 (cross-list from stat.AP) [pdf, other]
-
Title: New allometric models for the USA create a step-change in forest carbon estimation, modeling, and mappingAuthors: Lucas K. Johnson (1), Michael J. Mahoney (1), Grant Domke (2), Colin M. Beier (1) ((1) State University of New York College of Environmental Science and Forestry, (2) USDA Forest Service)Comments: Manuscript: 16 pages, 7 figures; Supplements: 3 pages, 2 figures; Submitted to: Remote Sensing of EnvironmentSubjects: Applications (stat.AP); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Wed, 8 May 2024
- [365] arXiv:2405.04534 [pdf, other]
-
Title: Tactile-Augmented Radiance FieldsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [366] arXiv:2405.04533 [pdf, other]
-
Title: ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool ReasoningComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [367] arXiv:2405.04496 [pdf, other]
-
Title: Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion EditingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [368] arXiv:2405.04489 [pdf, other]
-
Title: S3Former: Self-supervised High-resolution Transformer for Solar PV ProfilingAuthors: Minh Tran, Adrian De Luis, Haitao Liao, Ying Huang, Roy McCann, Alan Mantooth, Jack Cothren, Ngan LeComments: PreprintSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [369] arXiv:2405.04457 [pdf, other]
-
Title: Towards Geographic Inclusion in the Evaluation of Text-to-Image ModelsAuthors: Melissa Hall, Samuel J. Bell, Candace Ross, Adina Williams, Michal Drozdzal, Adriana Romero SorianoSubjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)
- [370] arXiv:2405.04442 [pdf, other]
-
Title: AugmenTory: A Fast and Flexible Polygon Augmentation LibraryAuthors: Tanaz Ghahremani, Mohammad Hoseyni, Mohammad Javad Ahmadi, Pouria Mehrabi, Amirhossein NikoofardSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [371] arXiv:2405.04416 [pdf, other]
-
Title: DistGrid: Scalable Scene Reconstruction with Distributed Multi-resolution Hash GridComments: Originally submitted to Siggraph Asia 2023Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [372] arXiv:2405.04408 [pdf, other]
-
Title: DocRes: A Generalist Model Toward Unifying Document Image Restoration TasksComments: Accepted by CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [373] arXiv:2405.04404 [pdf, other]
-
Title: Vision Mamba: A Comprehensive Survey and TaxonomyComments: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [374] arXiv:2405.04403 [pdf, other]
-
Title: Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak AttacksSubjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [375] arXiv:2405.04390 [pdf, other]
-
Title: DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous DrivingAuthors: Chen Min, Dawei Zhao, Liang Xiao, Jian Zhao, Xinli Xu, Zheng Zhu, Lei Jin, Jianshu Li, Yulan Guo, Junliang Xing, Liping Jing, Yiming Nie, Bin DaiComments: Accepted by CVPR2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [376] arXiv:2405.04377 [pdf, other]
-
Title: Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and EditingComments: Accepted to CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [377] arXiv:2405.04370 [pdf, other]
-
Title: Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric VideosSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [378] arXiv:2405.04356 [pdf, other]
-
Title: Diffusion-driven GAN Inversion for Multi-Modal Face Image GenerationComments: Accepted by CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [379] arXiv:2405.04345 [pdf, other]
-
Title: Novel View Synthesis with Neural Radiance Fields for Industrial Robot ApplicationsAuthors: Markus Hillemann, Robert Langendörfer, Max Heiken, Max Mehltretter, Andreas Schenk, Martin Weinmann, Stefan Hinz, Christian Heipke, Markus UlrichComments: 8 pages, 8 figures, accepted for publication in The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS Archives) 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [380] arXiv:2405.04327 [pdf, other]
-
Title: Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and EvaluationAuthors: Dogucan Yaman, Fevziye Irem Eyiokur, Leonard Bärmann, Seymanur Aktı, Hazım Kemal Ekenel, Alexander WaibelComments: CVPR2024 NTIRE WorkshopSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [381] arXiv:2405.04312 [pdf, other]
-
Title: Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion TransformerAuthors: Zhuoyi Yang, Heyang Jiang, Wenyi Hong, Jiayan Teng, Wendi Zheng, Yuxiao Dong, Ming Ding, Jie TangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [382] arXiv:2405.04311 [pdf, ps, other]
-
Title: Cross-IQA: Unsupervised Learning for Image Quality AssessmentAuthors: Zhen ZhangSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
- [383] arXiv:2405.04309 [pdf, other]
-
Title: Non-rigid Structure-from-Motion: Temporally-smooth Procrustean Alignment and Spatially-variant Deformation ModelingComments: Accepted by CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [384] arXiv:2405.04305 [pdf, other]
-
Title: A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum FieldsAuthors: Raiyan Rahman, Christopher Indris, Goetz Bramesfeld, Tianxiao Zhang, Kaidong Li, Xiangyu Chen, Ivan Grijalva, Brian McCornack, Daniel Flippo, Ajay Sharda, Guanghui WangSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [385] arXiv:2405.04299 [pdf, other]
-
Title: ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided TransformersSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [386] arXiv:2405.04251 [pdf, other]
-
Title: A General Model for Detecting Learner Engagement: Implementation and EvaluationComments: 13 pages, 2 Postscript figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
- [387] arXiv:2405.04233 [pdf, other]
-
Title: Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion ModelsAuthors: Fan Bao, Chendong Xiang, Gang Yue, Guande He, Hongzhou Zhu, Kaiwen Zheng, Min Zhao, Shilong Liu, Yaole Wang, Jun ZhuComments: Project page at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [388] arXiv:2405.04211 [pdf, other]
-
Title: Breast Histopathology Image Retrieval by Attention-based Adversarially Regularized Variational Graph Autoencoder with Contrastive Learning-Based Feature ExtractionAuthors: Nematollah Saeidi, Hossein Karshenas, Bijan Shoushtarian, Sepideh Hatamikia, Ramona Woitek, Amirreza MahbodComments: 31 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [389] arXiv:2405.04189 [pdf, ps, other]
-
Title: Artificial Intelligence-powered fossil shark tooth identification: Unleashing the potential of Convolutional Neural NetworksAuthors: Andrea Barucci, Giulia Ciacci, Pietro Liò, Tiago Azevedo, Andrea Di Cencio, Marco Merella, Giovanni Bianucci, Giulia Bosio, Simone Casati, Alberto CollaretaComments: 40 pages, 8 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [390] arXiv:2405.04175 [pdf, other]
-
Title: Topicwise Separable Sentence Retrieval for Medical Report GenerationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [391] arXiv:2405.04167 [pdf, other]
-
Title: Bridging the Synthetic-to-Authentic Gap: Distortion-Guided Unsupervised Domain Adaptation for Blind Image Quality AssessmentComments: Accepted by CVPR2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [392] arXiv:2405.04164 [pdf, other]
-
Title: Sign2GPT: Leveraging Large Language Models for Gloss-Free Sign Language TranslationComments: Accepted at ICLR2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [393] arXiv:2405.04133 [pdf, other]
-
Title: Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection MethodSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [394] arXiv:2405.04121 [pdf, other]
-
Title: ELiTe: Efficient Image-to-LiDAR Knowledge Transfer for Semantic SegmentationComments: 9 pages, 6 figures, ICME 2024 oralSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [395] arXiv:2405.04103 [pdf, other]
-
Title: COM3D: Leveraging Cross-View Correspondence and Cross-Modal Mining for 3D RetrievalComments: Accepted by ICME 2024 oralSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [396] arXiv:2405.04100 [pdf, other]
-
Title: ESP: Extro-Spective Prediction for Long-term Behavior Reasoning in Emergency ScenariosComments: Accepted by ICRA 2024 as Oral PresentationSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [397] arXiv:2405.04097 [pdf, other]
-
Title: Unmasking Illusions: Understanding Human Perception of Audiovisual DeepfakesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Machine Learning (cs.LG); Multimedia (cs.MM)
- [398] arXiv:2405.04093 [pdf, other]
-
Title: DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained ObjectsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [399] arXiv:2405.04044 [pdf, other]
-
Title: DMOFC: Discrimination Metric-Optimized Feature CompressionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [400] arXiv:2405.04042 [pdf, other]
-
Title: Space-time Reinforcement Network for Video Object SegmentationComments: Accepted by ICME 2024. 6 pages, 10 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [401] arXiv:2405.04009 [pdf, other]
-
Title: Structured Click Control in Transformer-based Interactive SegmentationComments: 10 pages, 6 figures, submitted to NeurIPS 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [402] arXiv:2405.04007 [pdf, other]
-
Title: SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image EditingComments: Technical Report; Dataset released in this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [403] arXiv:2405.03995 [pdf, other]
-
Title: Deep Event-based Object Detection in Autonomous Driving: A SurveySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [404] arXiv:2405.03981 [pdf, other]
-
Title: Predicting Lung Disease Severity via Image-Based AQI Analysis using Deep Learning TechniquesComments: 11 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [405] arXiv:2405.03978 [pdf, other]
-
Title: VMambaCC: A Visual State Space Model for Crowd CountingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [406] arXiv:2405.03971 [pdf, other]
-
Title: Unified End-to-End V2X Cooperative Autonomous DrivingAuthors: Zhiwei Li, Bozhen Zhang, Lei Yang, Tianyu Shen, Nuo Xu, Ruosen Hao, Weiting Li, Tao Yan, Huaping LiuSubjects: Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
- [407] arXiv:2405.03959 [pdf, other]
-
Title: Joint Estimation of Identity Verification and Relative Pose for Partial FingerprintsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [408] arXiv:2405.03958 [pdf, other]
-
Title: Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion ModelSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [409] arXiv:2405.03955 [pdf, ps, other]
-
Title: IPFed: Identity protected federated learning for user authenticationJournal-ref: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
- [410] arXiv:2405.03945 [pdf, other]
-
Title: Role of Sensing and Computer Vision in 6G Wireless CommunicationsAuthors: Seungnyun Kim, Jihoon Moon, Jinhong Kim, Yongjun Ahn, Donghoon Kim, Sunwoo Kim, Kyuhong Shim, Byonghyo ShimSubjects: Computer Vision and Pattern Recognition (cs.CV); Networking and Internet Architecture (cs.NI)
- [411] arXiv:2405.03894 [pdf, other]
-
Title: MVDiff: Scalable and Flexible Multi-View Diffusion for 3D Object Reconstruction from Single-ViewComments: CVPRW: Generative Models for Computer VisionSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [412] arXiv:2405.03884 [pdf, other]
-
Title: BadFusion: 2D-Oriented Backdoor Attacks against 3D Object DetectionComments: Accepted at IJCAI 2024 ConferenceSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [413] arXiv:2405.03882 [pdf, other]
-
Title: Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision TransformerSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [414] arXiv:2405.03852 [pdf, other]
-
Title: VSA4VQA: Scaling a Vector Symbolic Architecture to Visual Question Answering on Natural ImagesComments: To be published in the Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci'24)Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [415] arXiv:2405.03846 [pdf, other]
-
Title: Enhancing Apparent Personality Trait Analysis with Cross-Modal EmbeddingsComments: 14 pages, 4 figuresJournal-ref: Annales Universitatis Scientiarium Budapestinensis de Rolando Eötvös Nominatae. Sectio Computatorica, MaCS Special Issue, 2021Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
- [416] arXiv:2405.03803 [pdf, other]
-
Title: MoDiPO: text-to-motion alignment via AI-feedback-driven Direct Preference OptimizationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [417] arXiv:2405.03770 [pdf, other]
-
Title: Foundation Models for Video Understanding: A SurveySubjects: Computer Vision and Pattern Recognition (cs.CV)
- [418] arXiv:2405.03722 [pdf, other]
-
Title: Class-relevant Patch Embedding Selection for Few-Shot Image ClassificationComments: arXiv admin note: text overlap with arXiv:2405.03109Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [419] arXiv:2405.03715 [pdf, other]
-
Title: Iterative Filter Pruning for Concatenation-based CNN ArchitecturesComments: Accepted for publication at IJCNN 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [420] arXiv:2405.03702 [pdf, other]
-
Title: Leafy Spurge Dataset: Real-world Weed Classification Within Aerial Drone ImageryAuthors: Kyle Doherty, Max Gurinas, Erik Samsoe, Charles Casper, Beau Larkin, Philip Ramsey, Brandon Trabucco, Ruslan SalakhutdinovComments: Official Dataset Technical Report. Used in DA-Fusion (arXiv:2302.07944)Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [421] arXiv:2405.04459 (cross-list from cs.AI) [pdf, other]
-
Title: A Significantly Better Class of Activation Functions Than ReLU Like Activation FunctionsComments: 14 pagesSubjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
- [422] arXiv:2405.04392 (cross-list from cs.RO) [pdf, other]
-
Title: BILTS: A novel bi-invariant local trajectory-shape descriptor for rigid-body motionComments: This work has been submitted as a regular research paper for consideration in the IEEE Transactions on Robotics. Copyright may be transferred without notice, after which this version may no longer be accessibleSubjects: Robotics (cs.RO); Computational Geometry (cs.CG); Computer Vision and Pattern Recognition (cs.CV)
- [423] arXiv:2405.04378 (cross-list from cs.RO) [pdf, other]
-
Title: Splat-MOVER: Multi-Stage, Open-Vocabulary Robotic Manipulation via Editable Gaussian SplattingAuthors: Ola Shorinwa, Johnathan Tucker, Aliyah Smith, Aiden Swann, Timothy Chen, Roya Firoozi, Monroe Kennedy III, Mac SchwagerSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [424] arXiv:2405.04295 (cross-list from eess.IV) [pdf, other]
-
Title: Semi-Supervised Disease Classification based on Limited Medical Image DataSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [425] arXiv:2405.04288 (cross-list from eess.IV) [pdf, other]
-
Title: BetterNet: An Efficient CNN Architecture with Residual Learning and Attention for Precision Polyp SegmentationSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [426] arXiv:2405.04274 (cross-list from eess.IV) [pdf, other]
-
Title: Group-aware Parameter-efficient Updating for Content-Adaptive Neural Video CompressionSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [427] arXiv:2405.04191 (cross-list from cs.LG) [pdf, other]
-
Title: Effective and Robust Adversarial Training against Data and Label CorruptionsComments: 12 pages, 8 figuresSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [428] arXiv:2405.04169 (cross-list from eess.IV) [pdf, other]
-
Title: D-TrAttUnet: Toward Hybrid CNN-Transformer Architecture for Generic and Subtle Segmentation in Medical ImagesComments: arXiv admin note: text overlap with arXiv:2303.15576Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [429] arXiv:2405.04071 (cross-list from cs.RO) [pdf, other]
-
Title: IMU-Aided Event-based Stereo Visual OdometryComments: 10 pages, 7 figures, ICRASubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [430] arXiv:2405.04041 (cross-list from cs.AI) [pdf, other]
-
Title: Feature Map Convergence Evaluation for Functional ModuleSubjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [431] arXiv:2405.04023 (cross-list from eess.IV) [pdf, other]
-
Title: Lumbar Spine Tumor Segmentation and Localization in T2 MRI Images Using AIAuthors: Rikathi Pal, Sudeshna Mondal, Aditi Gupta, Priya Saha, Somoballi Ghoshal, Amlan Chakrabarti, Susmita Sur-KolayComments: 9 pages, 12 figuresSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [432] arXiv:2405.03905 (cross-list from cs.AR) [pdf, other]
-
Title: A 65nm 36nJ/Decision Bio-inspired Temporal-Sparsity-Aware Digital Keyword Spotting IC with 0.6V Near-Threshold SRAMAuthors: Qinyu Chen, Kwantae Kim, Chang Gao, Sheng Zhou, Taekwang Jang, Tobi Delbruck, Shih-Chii LiuSubjects: Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
- [433] arXiv:2405.03827 (cross-list from cs.RO) [pdf, other]
-
Title: Direct learning of home vector direction for insect-inspired robot navigationComments: Published at ICRA 2024, project webpage at this https URLSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [434] arXiv:2405.03762 (cross-list from eess.IV) [pdf, other]
-
Title: Deep learning classifier of locally advanced rectal cancer treatment response from endoscopy imagesAuthors: Jorge Tapias Gomez, Aneesh Rangnekar, Hannah Williams, Hannah Thompson, Julio Garcia-Aguilar, Joshua Jesse Smith, Harini VeeraraghavanSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [435] arXiv:2405.03732 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Accelerated MR Cholangiopancreatography with Deep Learning-based ReconstructionComments: 20 pages, 6 figures, 2 tablesSubjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [436] arXiv:2405.03730 (cross-list from cs.LG) [pdf, other]
-
Title: Tilt your Head: Activating the Hidden Spatial-Invariance of ClassifiersSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [437] arXiv:2405.03713 (cross-list from eess.IV) [pdf, other]
-
Title: Improve Cross-Modality Segmentation by Treating MRI Images as Inverted CT ScansAuthors: Hartmut Häntze, Lina Xu, Leonhard Donle, Felix J. Dorfner, Alessa Hering, Lisa C. Adams, Keno K. BressemComments: 3 pages, 2 figuresSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)