Computer Vision and Pattern Recognition
Authors and titles for recent submissions
Fri, 29 Mar 2024
- [1] arXiv:2403.19655 [pdf, other]
  Title: GaussianCube: Structuring Gaussian Splatting using Optimal Transport for 3D Generative Modeling
  Authors: Bowen Zhang, Yiji Cheng, Jiaolong Yang, Chunyu Wang, Feng Zhao, Yansong Tang, Dong Chen, Baining Guo
  Comments: Project Page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [2] arXiv:2403.19654 [pdf, other]
  Title: RSMamba: Remote Sensing Image Classification with State Space Model
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [3] arXiv:2403.19653 [pdf, other]
  Title: Detecting Image Attribution for Text-to-Image Diffusion Models in RGB and Beyond
  Comments: Code available at this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [4] arXiv:2403.19652 [pdf, other]
  Title: InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction
  Comments: Project Page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [5] arXiv:2403.19651 [pdf, other]
  Title: MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions
  Comments: Work in progress
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Information Retrieval (cs.IR); Multimedia (cs.MM)
- [6] arXiv:2403.19646 [pdf, other]
  Title: Change-Agent: Towards Interactive Comprehensive Change Interpretation and Analysis from Change Detection and Change Captioning
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [7] arXiv:2403.19645 [pdf, other]
  Title: GANTASTIC: GAN-based Transfer of Interpretable Directions for Disentangled Image Editing in Text-to-Image Diffusion Models
  Comments: Project page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [8] arXiv:2403.19638 [pdf, other]
  Title: Siamese Vision Transformers are Scalable Audio-visual Learners
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
- [9] arXiv:2403.19632 [pdf, other]
  Title: GauStudio: A Modular Framework for 3D Gaussian Splatting and Beyond
  Comments: Code: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [10] arXiv:2403.19615 [pdf, other]
  Title: SA-GS: Scale-Adaptive Gaussian Splatting for Training-Free Anti-Aliasing
  Authors: Xiaowei Song, Jv Zheng, Shiran Yuan, Huan-ang Gao, Jingwei Zhao, Xiang He, Weihao Gu, Hao Zhao
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [11] arXiv:2403.19612 [pdf, other]
  Title: ILPO-NET: Network for the invariant recognition of arbitrary volumetric patterns in 3D
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [12] arXiv:2403.19611 [pdf, ps, other]
  Title: Nearest Neighbor Classication for Classical Image Upsampling
  Comments: 6 pages
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [13] arXiv:2403.19600 [pdf, other]
  Title: Enhance Image Classification via Inter-Class Image Mixup with Diffusion Model
  Authors: Zhicai Wang, Longhui Wei, Tan Wang, Heyu Chen, Yanbin Hao, Xiang Wang, Xiangnan He, Qi Tian
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [14] arXiv:2403.19596 [pdf, other]
  Title: LocCa: Visual Pretraining with Location-aware Captioners
  Authors: Bo Wan, Michael Tschannen, Yongqin Xian, Filip Pavetic, Ibrahim Alabdulmohsin, Xiao Wang, André Susano Pinto, Andreas Steiner, Lucas Beyer, Xiaohua Zhai
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [15] arXiv:2403.19595 [pdf, other]
  Title: Situation Awareness for Driver-Centric Driving Style Adaptation
  Comments: 14 pages, 6 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Systems and Control (eess.SY)
- [16] arXiv:2403.19593 [pdf, other]
  Title: Frame by Familiar Frame: Understanding Replication in Video Diffusion Models
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [17] arXiv:2403.19589 [pdf, other]
  Title: TOD3Cap: Towards 3D Dense Captioning in Outdoor Scenes
  Authors: Bu Jin, Yupeng Zheng, Pengfei Li, Weize Li, Yuhang Zheng, Sujie Hu, Xinyu Liu, Jinwei Zhu, Zhijie Yan, Haiyang Sun, Kun Zhan, Peng Jia, Xiaoxiao Long, Yilun Chen, Hao Zhao
  Comments: Code, data, and models are publicly available at this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [18] arXiv:2403.19588 [pdf, other]
  Title: DenseNets Reloaded: Paradigm Shift Beyond ResNets and ViTs
  Comments: Code at this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
- [19] arXiv:2403.19586 [pdf, other]
  Title: TOGS: Gaussian Splatting with Temporal Opacity Offset for Real-Time 4D DSA Rendering
  Authors: Shuai Zhang, Huangxuan Zhao, Zhenghong Zhou, Guanjun Wu, Chuansheng Zheng, Xinggang Wang, Wenyu Liu
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [20] arXiv:2403.19584 [pdf, other]
  Title: Img2Loc: Revisiting Image Geolocalization using Multi-modality Foundation Models and Image-based Retrieval-Augmented Generation
  Authors: Zhongliang Zhou, Jielu Zhang, Zihan Guan, Mengxuan Hu, Ni Lao, Lan Mu, Sheng Li, Gengchen Mai
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [21] arXiv:2403.19580 [pdf, other]
  Title: OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [22] arXiv:2403.19579 [pdf, other]
  Title: The Bad Batches: Enhancing Self-Supervised Learning in Image Classification Through Representative Batch Curation
  Comments: 8 Pages, 4 figures, IEEE WCCI 2024 Conference
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [23] arXiv:2403.19554 [pdf, other]
  Title: Cross-Attention is Not Always Needed: Dynamic Cross-Attention for Audio-Visual Dimensional Emotion Recognition
  Comments: Accepted at IEEE ICME2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [24] arXiv:2403.19549 [pdf, other]
  Title: GlORIE-SLAM: Globally Optimized RGB-only Implicit Encoding Point Cloud SLAM
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [25] arXiv:2403.19539 [pdf, other]
  Title: De-confounded Data-free Knowledge Distillation for Handling Distribution Shifts
  Authors: Yuzheng Wang, Dingkang Yang, Zhaoyu Chen, Yang Liu, Siao Liu, Wenqiang Zhang, Lihua Zhang, Lizhe Qi
  Comments: Accepted by CVPR24
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [26] arXiv:2403.19534 [pdf, other]
  Title: Locate, Assign, Refine: Taming Customized Image Inpainting with Text-Subject Guidance
  Comments: 22 pages, 14 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [27] arXiv:2403.19527 [pdf, other]
  Title: Instance-Adaptive and Geometric-Aware Keypoint Learning for Category-Level 6D Object Pose Estimation
  Comments: Accepted to CVPR2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [28] arXiv:2403.19517 [pdf, other]
  Title: XScale-NVS: Cross-Scale Novel View Synthesis with Hash Featurized Manifold
  Comments: Accepted to CVPR 2024. Project page: xscalenvs.github.io/
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [29] arXiv:2403.19514 [pdf, other]
  Title: CDIMC-net: Cognitive Deep Incomplete Multi-view Clustering Network
  Comments: Accepted by IJCAI 2020
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [30] arXiv:2403.19501 [pdf, other]
  Title: RELI11D: A Comprehensive Multimodal Human Motion Dataset and Method
  Authors: Ming Yan, Yan Zhang, Shuqiang Cai, Shuqi Fan, Xincheng Lin, Yudi Dai, Siqi Shen, Chenglu Wen, Lan Xu, Yuexin Ma, Cheng Wang
  Comments: CVPR2024, Project website: this http URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [31] arXiv:2403.19497 [pdf, other]
  Title: Surface-based parcellation and vertex-wise analysis of ultra high-resolution ex vivo 7 tesla MRI in neurodegenerative diseases
  Authors: Pulkit Khandelwal, Michael Tran Duong, Constanza Fuentes, Amanda Denning, Winifred Trotman, Ranjit Ittyerah, Alejandra Bahena, Theresa Schuck, Marianna Gabrielyan, Karthik Prabhakaran, Daniel Ohm, Gabor Mizsei, John Robinson, Monica Munoz, John Detre, Edward Lee, David Irwin, Corey McMillan, M. Dylan Tisdall, Sandhitsu Das, David Wolk, Paul A. Yushkevich
  Comments: Under review at MICCAI 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [32] arXiv:2403.19495 [pdf, other]
  Title: CoherentGS: Sparse Novel View Synthesis with Coherent 3D Gaussians
  Authors: Avinash Paliwal, Wei Ye, Jinhui Xiong, Dmytro Kotovenko, Rakesh Ranjan, Vikas Chandra, Nima Khademi Kalantari
  Comments: Project page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [33] arXiv:2403.19492 [pdf, other]
  Title: Segmentation tool for images of cracks
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [34] arXiv:2403.19490 [pdf, other]
  Title: Jointly Training and Pruning CNNs via Learnable Agent Guidance and Alignment
  Comments: IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [35] arXiv:2403.19474 [pdf, other]
  Title: SG-PGM: Partial Graph Matching Network with Semantic Geometric Fusion for 3D Scene Graph Alignment and Its Downstream Tasks
  Comments: 16 pages, 10 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [36] arXiv:2403.19473 [pdf, other]
  Title: Benchmarking Implicit Neural Representation and Geometric Rendering in Real-Time RGB-D SLAM
  Comments: CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [37] arXiv:2403.19467 [pdf, other]
  Title: Beyond Talking -- Generating Holistic 3D Human Dyadic Motion for Communication
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [38] arXiv:2403.19456 [pdf, other]
  Title: Break-for-Make: Modular Low-Rank Adaptations for Composable Content-Style Customization
  Authors: Yu Xu, Fan Tang, Juan Cao, Yuxin Zhang, Oliver Deussen, Weiming Dong, Jintao Li, Tong-Yee Lee
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Multimedia (cs.MM)
- [39] arXiv:2403.19438 [pdf, other]
  Title: SubjectDrive: Scaling Generative Data in Autonomous Driving via Subject Control
  Authors: Binyuan Huang, Yuqing Wen, Yucheng Zhao, Yaosi Hu, Yingfei Liu, Fan Jia, Weixin Mao, Tiancai Wang, Chi Zhang, Chang Wen Chen, Zhenzhong Chen, Xiangyu Zhang
  Comments: Project page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [40] arXiv:2403.19435 [pdf, other]
  Title: BAMM: Bidirectional Autoregressive Motion Model
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [41] arXiv:2403.19428 [pdf, other]
  Title: Burst Super-Resolution with Diffusion Models for Improving Perceptual Quality
  Comments: Accepted to IJCNN 2024 (International Joint Conference on Neural Networks)
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [42] arXiv:2403.19417 [pdf, other]
  Title: OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion
  Comments: To be appeared in CVPR 2024. 26 pages
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [43] arXiv:2403.19412 [pdf, other]
  Title: A Simple and Effective Point-based Network for Event Camera 6-DOFs Pose Relocalization
  Comments: Accepted by CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [44] arXiv:2403.19407 [pdf, other]
  Title: Towards Temporally Consistent Referring Video Object Segmentation
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [45] arXiv:2403.19386 [pdf, other]
  Title: PointCloud-Text Matching: Benchmark Datasets and a Baseline
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [46] arXiv:2403.19376 [pdf, other]
  Title: NIGHT -- Non-Line-of-Sight Imaging from Indirect Time of Flight Data
  Comments: Submitted to ECCV 24, 17 pages, 6 figures, 2 tables
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
- [47] arXiv:2403.19366 [pdf, other]
  Title: Infrared Small Target Detection with Scale and Location Sensitivity
  Comments: Accepted by CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [48] arXiv:2403.19336 [pdf, other]
  Title: IVLMap: Instance-Aware Visual Language Grounding for Consumer Robot Navigation
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [49] arXiv:2403.19334 [pdf, other]
  Title: Test-Time Domain Generalization for Face Anti-Spoofing
  Comments: Accepted to IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [50] arXiv:2403.19322 [pdf, other]
  Title: Plug-and-Play Grounding of Reasoning in Multimodal Large Language Models
  Comments: 14 pages, 3 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [51] arXiv:2403.19319 [pdf, other]
  Title: Mesh2NeRF: Direct Mesh Supervision for Neural Radiance Field Representation and Generation
  Authors: Yujin Chen, Yinyu Nie, Benjamin Ummenhofer, Reiner Birkl, Michael Paulitsch, Matthias Müller, Matthias Nießner
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [52] arXiv:2403.19316 [pdf, other]
  Title: Hypergraph-based Multi-View Action Recognition using Event Cameras
  Comments: Accepted by IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI 2024)
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [53] arXiv:2403.19314 [pdf, other]
  Title: Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction
  Comments: 8 pages, 7 figures, accepted by CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [54] arXiv:2403.19306 [pdf, other]
  Title: Sparse Generation: Making Pseudo Labels Sparse for weakly supervision with points
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [55] arXiv:2403.19294 [pdf, other]
  Title: FlowDepth: Decoupling Optical Flow for Self-Supervised Monocular Depth Estimation
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [56] arXiv:2403.19278 [pdf, other]
  Title: CAT: Exploiting Inter-Class Dynamics for Domain Adaptive Object Detection
  Comments: Accepted into CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [57] arXiv:2403.19265 [pdf, other]
  Title: Neural Fields for 3D Tracking of Anatomy and Surgical Instruments in Monocular Laparoscopic Video Clips
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [58] arXiv:2403.19254 [pdf, other]
  Title: Imperceptible Protection against Style Imitation from Diffusion Models
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [59] arXiv:2403.19242 [pdf, other]
  Title: RTracker: Recoverable Tracking via PN Tree Structured Memory
  Comments: accepted by CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [60] arXiv:2403.19238 [pdf, other]
  Title: Taming Lookup Tables for Efficient Image Retouching
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
- [61] arXiv:2403.19235 [pdf, other]
  Title: DreamSalon: A Staged Diffusion Framework for Preserving Identity-Context in Editable Face Generation
  Authors: Haonan Lin, Mengmeng Wang, Yan Chen, Wenbin An, Yuzhe Yao, Guang Dai, Qianying Wang, Yong Liu, Jingdong Wang
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [62] arXiv:2403.19232 [pdf, other]
  Title: AZ-NAS: Assembling Zero-Cost Proxies for Network Architecture Search
  Comments: CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [63] arXiv:2403.19225 [pdf, other]
  Title: Efficient and Effective Weakly-Supervised Action Segmentation via Action-Transition-Aware Boundary Alignment
  Comments: Accepted to CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [64] arXiv:2403.19221 [pdf, other]
  Title: Towards Multimodal Video Paragraph Captioning Models Robust to Missing Modality
  Comments: Code available at this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [65] arXiv:2403.19220 [pdf, other]
  Title: GeoAuxNet: Towards Universal 3D Representation Learning for Multi-sensor Point Clouds
  Comments: CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [66] arXiv:2403.19213 [pdf, other]
  Title: Learning Multiple Representations with Inconsistency-Guided Detail Regularization for Mask-Guided Matting
  Authors: Weihao Jiang, Zhaozhi Xie, Yuxiang Lu, Longjie Qi, Jingyong Cai, Hiroyuki Uchiyama, Bin Chen, Yue Ding, Hongtao Lu
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [67] arXiv:2403.19205 [pdf, other]
  Title: From Activation to Initialization: Scaling Insights for Optimizing Neural Fields
  Comments: CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [68] arXiv:2403.19193 [pdf, other]
  Title: Text Data-Centric Image Captioning with Interactive Prompts
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [69] arXiv:2403.19177 [pdf, other]
  Title: Rethinking Information Loss in Medical Image Segmentation with Various-sized Targets
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [70] arXiv:2403.19164 [pdf, other]
  Title: RecDiffusion: Rectangling for Image Stitching with Diffusion Models
  Authors: Tianhao Zhou, Haipeng Li, Ziyi Wang, Ao Luo, Chen-Lin Zhang, Jiajun Li, Bing Zeng, Shuaicheng Liu
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [71] arXiv:2403.19160 [pdf, other]
  Title: Within the Dynamic Context: Inertia-aware 3D Human Modeling with Pose Sequence
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [72] arXiv:2403.19158 [pdf, other]
- [73] arXiv:2403.19144 [pdf, other]
  Title: MoDiTalker: Motion-Disentangled Diffusion Model for High-Fidelity Talking Head Generation
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [74] arXiv:2403.19140 [pdf, other]
  Title: QNCD: Quantization Noise Correction for Diffusion Models
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [75] arXiv:2403.19137 [pdf, other]
  Title: CLAP4CLIP: Continual Learning with Probabilistic Finetuning for Vision-Language Models
  Comments: Work under review
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [76] arXiv:2403.19128 [pdf, other]
  Title: OmniParser: A Unified Framework for Text Spotting, Key Information Extraction and Table Recognition
  Authors: Jianqiang Wan, Sibo Song, Wenwen Yu, Yuliang Liu, Wenqing Cheng, Fei Huang, Xiang Bai, Cong Yao, Zhibo Yang
  Comments: CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [77] arXiv:2403.19124 [pdf, other]
  Title: PoCo: A Self-Supervised Approach via Polar Transformation Based Progressive Contrastive Learning for Ophthalmic Disease Diagnosis
  Authors: Jinhong Wang, Tingting Chen, Jintai Chen, Yixuan Wu, Yuyang Xu, Danny Chen, Haochao Ying, Jian Wu
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [78] arXiv:2403.19111 [pdf, other]
  Title: Patch Spatio-Temporal Relation Prediction for Video Anomaly Detection
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [79] arXiv:2403.19107 [pdf, ps, other]
  Title: Synthetic Medical Imaging Generation with Generative Adversarial Networks For Plain Radiographs
  Authors: John R. McNulty, Lee Kho, Alexandria L. Case, Charlie Fornaca, Drew Johnston, David Slater, Joshua M. Abzug, Sybil A. Russell
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [80] arXiv:2403.19104 [pdf, other]
  Title: CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation
  Comments: Accepted to CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [81] arXiv:2403.19103 [pdf, other]
  Title: Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation
  Authors: Yutong He, Alexander Robey, Naoki Murata, Yiding Jiang, Joshua Williams, George J. Pappas, Hamed Hassani, Yuki Mitsufuji, Ruslan Salakhutdinov, J. Zico Kolter
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [82] arXiv:2403.19101 [pdf, other]
  Title: AAPMT: AGI Assessment Through Prompt and Metric Transformer
  Authors: Benhao Huang
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [83] arXiv:2403.19098 [pdf, other]
  Title: GraphAD: Interaction Scene Graph for End-to-end Autonomous Driving
  Authors: Yunpeng Zhang, Deheng Qian, Ding Li, Yifeng Pan, Yong Chen, Zhenbao Liang, Zhiyao Zhang, Shurui Zhang, Hongxu Li, Maolei Fu, Yun Ye, Zhujin Liang, Yi Shan, Dalong Du
  Comments: project page: this https URL
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [84] arXiv:2403.19080 [pdf, other]
  Title: MMCert: Provable Defense against Adversarial Attacks to Multi-modal Models
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
- [85] arXiv:2403.19079 [pdf, other]
  Title: A Real-Time Framework for Domain-Adaptive Underwater Object Detection with Image Enhancement
  Comments: accepted by ICRA24
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [86] arXiv:2403.19078 [pdf, other]
  Title: MVEB: Self-Supervised Learning with Multi-View Entropy Bottleneck
  Comments: Accepted by TPAMI
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [87] arXiv:2403.19067 [pdf, other]
  Title: Low-Rank Rescaled Vision Transformer Fine-Tuning: A Residual Design Approach
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [88] arXiv:2403.19066 [pdf, other]
  Title: Generative Quanta Color Imaging
  Comments: Accepted at IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [89] arXiv:2403.19046 [pdf, other]
  Title: LITA: Language Instructed Temporal-Localization Assistant
  Authors: De-An Huang, Shijia Liao, Subhashree Radhakrishnan, Hongxu Yin, Pavlo Molchanov, Zhiding Yu, Jan Kautz
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [90] arXiv:2403.19043 [pdf, other]
  Title: Illicit object detection in X-ray images using Vision Transformers
  Authors: Jorgen Cani, Ioannis Mademlis, Adamantia Anna Rebolledo Chrysochoou, Georgios Th. Papadopoulos
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [91] arXiv:2403.19026 [pdf, other]
  Title: Egocentric Scene-aware Human Trajectory Prediction
  Comments: 14 pages, 9 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [92] arXiv:2403.19022 [pdf, other]
  Title: WALT3D: Generating Realistic Training Data from Time-Lapse Imagery for Reconstructing Dynamic Objects under Occlusion
  Comments: To appear in CVPR 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [93] arXiv:2403.19001 [pdf, other]
  Title: Cross--domain Fiber Cluster Shape Analysis for Language Performance Cognitive Score Prediction
  Authors: Yui Lo, Yuqian Chen, Dongnan Liu, Wan Liu, Leo Zekelman, Fan Zhang, Yogesh Rathi, Nikos Makris, Alexandra J. Golby, Weidong Cai, Lauren J. O'Donnell
  Comments: 2 figures, 11 pages
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV); Neurons and Cognition (q-bio.NC)
- [94] arXiv:2403.18996 [pdf, other]
  Title: Envisioning MedCLIP: A Deep Dive into Explainability for Medical Vision-Language Models
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [95] arXiv:2403.18978 [pdf, other]
  Title: TextCraftor: Your Text Encoder Can be Image Quality Controller
  Authors: Yanyu Li, Xian Liu, Anil Kag, Ju Hu, Yerlan Idelbayev, Dhritiman Sagar, Yanzhi Wang, Sergey Tulyakov, Jian Ren
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [96] arXiv:2403.18922 [pdf, other]
  Title: Lift3D: Zero-Shot Lifting of Any 2D Vision Model to 3D
  Comments: Computer Vision and Pattern Recognition Conference (CVPR), 2024
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [97] arXiv:2403.18915 [pdf, other]
  Title: PLOT-TAL -- Prompt Learning with Optimal Transport for Few-Shot Temporal Action Localization
  Comments: Under Review
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [98] arXiv:2403.18913 [pdf, other]
  Title: UniDepth: Universal Monocular Metric Depth Estimation
  Authors: Luigi Piccinelli, Yung-Hsu Yang, Christos Sakaridis, Mattia Segu, Siyuan Li, Luc Van Gool, Fisher Yu
  Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [99] arXiv:2403.18908 [pdf, other]
  Title: Enhancing Multiple Object Tracking Accuracy via Quantum Annealing
  Authors: Yasuyuki Ihara
  Comments: 19pages, 15 figures
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV); Quantum Physics (quant-ph)
- [100] arXiv:2403.18878 [pdf, other]
  Title: AIC-UNet: Anatomy-informed Cascaded UNet for Robust Multi-Organ Segmentation
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
- [101] arXiv:2403.18871 [pdf, ps, other]
  Title: Clinical Domain Knowledge-Derived Template Improves Post Hoc AI Explanations in Pneumothorax Classification
  Authors: Han Yuan, Chuan Hong, Pengtao Jiang, Gangming Zhao, Nguyen Tuan Anh Tran, Xinxing Xu, Yet Yen Yan, Nan Liu
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [102] arXiv:2403.18870 [pdf, ps, other]
  Title: SugarcaneNet2024: An Optimized Weighted Average Ensemble Approach of LASSO Regularized Pre-trained Models for Sugarcane Disease Classification
  Comments: 32 pages, 11 Figures, 13 Tables
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [103] arXiv:2403.18843 [pdf, other]
  Title: JEP-KD: Joint-Embedding Predictive Architecture Based Knowledge Distillation for Visual Speech Recognition
  Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG); Sound (cs.SD); Audio and Speech Processing (eess.AS)
- [104] arXiv:2403.19649 (cross-list from cs.RO) [pdf, other]
  Title: GraspXL: Generating Grasping Motions for Diverse Objects at Scale
  Comments: Project Page: this https URL
  Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [105] arXiv:2403.19622 (cross-list from cs.RO) [pdf, other]
  Title: RH20T-P: A Primitive-Level Robotic Dataset Towards Composable Generalization Agents
  Authors: Zeren Chen, Zhelun Shi, Xiaoya Lu, Lehan He, Sucheng Qian, Hao Shu Fang, Zhenfei Yin, Wanli Ouyang, Jing Shao, Yu Qiao, Cewu Lu, Lu Sheng
  Comments: 24 pages, 12 figures, 6 tables
  Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [106] arXiv:2403.19620 (cross-list from cs.NE) [pdf, other]
  Title: Collaborative Interactive Evolution of Art in the Latent Space of Deep Generative Models
  Comments: Preprint. The Version of Record of this contribution is to be published in the proceedings of the 13th International Conference on Artificial Intelligence in Music, Sound, Art and Design (EvoMUSART) 2024
  Subjects: Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
- [107] arXiv:2403.19607 (cross-list from cs.RO) [pdf, other]
  Title: SAID-NeRF: Segmentation-AIDed NeRF for Depth Completion of Transparent Objects
  Authors: Avinash Ummadisingu, Jongkeum Choi, Koki Yamane, Shimpei Masuda, Naoki Fukaya, Kuniyuki Takahashi
  Comments: 8 pages. An accompanying video is available at this https URL
  Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [108] arXiv:2403.19603 (cross-list from cs.CL) [pdf, other]
  Title: Semantic Map-based Generation of Navigation Instructions
  Comments: 5 pages, 2 figures, 3 tables (13 pages, 3 figures, 5 tables including references and appendices), accepted at LREC-COLING 2024
  Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [109] arXiv:2403.19522 (cross-list from cs.LG) [pdf, other]
  Title: Model Stock: All we need is just a few fine-tuned models
  Comments: Code at this https URL
  Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [110] arXiv:2403.19508 (cross-list from eess.IV) [pdf, other]
  Title: Debiasing Cardiac Imaging with Controlled Latent Diffusion Models
  Authors: Grzegorz Skorupko, Richard Osuala, Zuzanna Szafranowska, Kaisar Kushibar, Nay Aung, Steffen E Petersen, Karim Lekadir, Polyxeni Gkontra
  Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [111] arXiv:2403.19444 (cross-list from cs.LG) [pdf, other]
  Title: Transparent and Clinically Interpretable AI for Lung Cancer Detection in Chest X-Rays
  Comments: 12 pages, 10 figures
  Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [112] arXiv:2403.19425 (cross-list from eess.IV) [pdf, ps, other]
  Title: A Robust Ensemble Algorithm for Ischemic Stroke Lesion Segmentation: Generalizability and Clinical Utility Beyond the ISLES Challenge
  Authors: Ezequiel de la Rosa, Mauricio Reyes, Sook-Lei Liew, Alexandre Hutton, Roland Wiest, Johannes Kaesmacher, Uta Hanning, Arsany Hakim, Richard Zubal, Waldo Valenzuela, David Robben, Diana M. Sima, Vincenzo Anania, Arne Brys, James A. Meakin, Anne Mickan, Gabriel Broocks, Christian Heitkamp, Shengbo Gao, Kongming Liang, Ziji Zhang, Md Mahfuzur Rahman Siddiquee, Andriy Myronenko, Pooya Ashtari, Sabine Van Huffel, Hyun-su Jeong, Chi-ho Yoon, Chulhong Kim, Jiayu Huo, Sebastien Ourselin, Rachel Sparks, Albert Clèrigues, Arnau Oliver, Xavier Lladó, Liam Chalcroft, Ioannis Pappas, Jeroen Bertels, Ewout Heylen, Juliette Moreau, Nima Hatami, Carole Frindel, Abdul Qayyum, Moona Mazher, Domenec Puig, Shao-Chieh Lin, Chun-Jung Juan, Tianxi Hu, Lyndon Boone, Maged Goubran, Yi-Jui Liu, Susanne Wegener, et al. (7 additional authors not shown)
  Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [113] arXiv:2403.19415 (cross-list from eess.IV) [pdf, ps, other]
  Title: Brain-Shift: Unsupervised Pseudo-Healthy Brain Synthesis for Novel Biomarker Extraction in Chronic Subdural Hematoma
  Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [114] arXiv:2403.19326 (cross-list from cs.LG) [pdf, other]
  Title: MedBN: Robust Test-Time Adaptation against Malicious Test Samples
  Comments: Accepted to CVPR 2024
  Subjects: Machine Learning (cs.LG); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
- [115] arXiv:2403.19243 (cross-list from cs.LG) [pdf, other]
  Title: Sine Activated Low-Rank Matrices for Parameter Efficient Learning
  Comments: The first two authors contributed equally
  Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
- [116] arXiv:2403.19203 (cross-list from eess.IV) [pdf, other]
-
Title: Single-Shared Network with Prior-Inspired Loss for Parameter-Efficient Multi-Modal Imaging Skin Lesion ClassificationComments: This paper has been submitted to a journal for reviewSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [117] arXiv:2403.19174 (cross-list from cs.HC) [pdf, other]
-
Title: Algorithmic Ways of Seeing: Using Object Detection to Facilitate Art ExplorationAuthors: Louie Søs Meyer, Johanne Engel Aaen, Anitamalina Regitse Tranberg, Peter Kun, Matthias Freiberger, Sebastian Risi, Anders Sundnes LøvlieSubjects: Human-Computer Interaction (cs.HC); Computer Vision and Pattern Recognition (cs.CV)
- [118] arXiv:2403.19163 (cross-list from cs.LG) [pdf, other]
-
Title: D'OH: Decoder-Only random Hypernetworks for Implicit Neural RepresentationsComments: 29 pages, 17 figuresSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [119] arXiv:2403.19150 (cross-list from cs.LG) [pdf, other]
-
Title: Towards Understanding Dual BN In Hybrid Adversarial TrainingComments: Accepted at TMLRSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
- [120] arXiv:2403.19076 (cross-list from cs.LG) [pdf, other]
-
Title: Tiny Machine Learning: Progress and FuturesComments: IEEE Circuits and Systems Magazine (2023). arXiv admin note: text overlap with arXiv:2206.15472Journal-ref: IEEE Circuits and Systems Magazine, 23(3), pp. 8-34, October 2023Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [121] arXiv:2403.19002 (cross-list from cs.MM) [pdf, other]
-
Title: Robust Active Speaker Detection in Noisy EnvironmentsComments: 15 pages, 5 figuresSubjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
- [122] arXiv:2403.18985 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Robustness and Visual Explanation for Black Box Image, Video, and ECG Signal Classification with Reinforcement LearningAuthors: Soumyendu Sarkar, Ashwin Ramesh Babu, Sajad Mousavi, Vineet Gundecha, Avisek Naug, Sahand GhorbanpourComments: AAAI Proceedings reference: this https URLJournal-ref: 2024 Proceedings of the AAAI Conference on Artificial IntelligenceSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
- [123] arXiv:2403.18955 (cross-list from cs.LG) [pdf, other]
-
Title: Structurally Prune Anything: Any Architecture, Any Framework, Any TimeSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [124] arXiv:2403.18921 (cross-list from cs.AR) [pdf, other]
-
Title: SMOF: Streaming Modern CNNs on FPGAs with Smart Off-Chip EvictionComments: 12 pages, 8 figures, 5 tablesSubjects: Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [125] arXiv:2403.18920 (cross-list from cs.CR) [pdf, other]
-
Title: CPR: Retrieval Augmented Generation for Copyright ProtectionAuthors: Aditya Golatkar, Alessandro Achille, Luca Zancato, Yu-Xiang Wang, Ashwin Swaminathan, Stefano SoattoComments: CVPR 2024Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [126] arXiv:2403.18910 (cross-list from cs.LG) [pdf, other]
-
Title: A Geometric Explanation of the Likelihood OOD Detection ParadoxAuthors: Hamidreza Kamkari, Brendan Leigh Ross, Jesse C. Cresswell, Anthony L. Caterini, Rahul G. Krishnan, Gabriel Loaiza-GanemSubjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
- [127] arXiv:2403.18886 (cross-list from cs.LG) [pdf, other]
-
Title: Self-Expansion of Pre-trained Models with Mixture of Adapters for Continual LearningSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [128] arXiv:2403.18873 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Predicting risk of cardiovascular disease using retinal OCT imagingAuthors: Cynthia Maldonado-Garcia, Rodrigo Bonazzola, Enzo Ferrante, Thomas H Julian, Panagiotis I Sergouniotis, Nishant Ravikumara, Alejandro F FrangiComments: 18 pages for main manuscript, 7 figures, 2 pages for appendix and preprint for a journalSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Thu, 28 Mar 2024
- [129] arXiv:2403.18820 [pdf, other]
-
Title: MetaCap: Meta-learning Priors from Multi-View Imagery for Sparse-view Human Performance Capture and RenderingComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [130] arXiv:2403.18819 [pdf, other]
-
Title: Benchmarking Object Detectors with COCO: A New Path ForwardSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [131] arXiv:2403.18818 [pdf, other]
-
Title: ObjectDrop: Bootstrapping Counterfactuals for Photorealistic Object Removal and InsertionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [132] arXiv:2403.18816 [pdf, other]
-
Title: Garment3DGen: 3D Garment Stylization and Texture GenerationComments: Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [133] arXiv:2403.18814 [pdf, other]
-
Title: Mini-Gemini: Mining the Potential of Multi-modality Vision Language ModelsAuthors: Yanwei Li, Yuechen Zhang, Chengyao Wang, Zhisheng Zhong, Yixin Chen, Ruihang Chu, Shaoteng Liu, Jiaya JiaComments: Code and models are available at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
- [134] arXiv:2403.18811 [pdf, other]
-
Title: Duolando: Follower GPT with Off-Policy Reinforcement Learning for Dance AccompanimentAuthors: Li Siyao, Tianpei Gu, Zhitao Yang, Zhengyu Lin, Ziwei Liu, Henghui Ding, Lei Yang, Chen Change LoyComments: ICLR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
- [135] arXiv:2403.18807 [pdf, other]
-
Title: ECoDepth: Effective Conditioning of Diffusion Models for Monocular Depth EstimationComments: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [136] arXiv:2403.18795 [pdf, other]
-
Title: Gamba: Marry Gaussian Splatting with Mamba for single view 3D reconstructionSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [137] arXiv:2403.18791 [pdf, other]
-
Title: Object Pose Estimation via the Aggregation of Diffusion FeaturesComments: Accepted to CVPR2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [138] arXiv:2403.18784 [pdf, other]
-
Title: SplatFace: Gaussian Splat Face Reconstruction Leveraging an Optimizable SurfaceSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [139] arXiv:2403.18775 [pdf, other]
-
Title: ImageNet-D: Benchmarking Neural Network Robustness on Diffusion Synthetic ObjectComments: Accepted at CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [140] arXiv:2403.18774 [pdf, other]
-
Title: RAW: A Robust and Agile Plug-and-Play Watermark Framework for AI-Generated Images with Provable GuaranteesSubjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
- [141] arXiv:2403.18762 [pdf, other]
-
Title: ModaLink: Unifying Modalities for Efficient Image-to-PointCloud Place RecognitionAuthors: Weidong Xie, Lun Luo, Nanfei Ye, Yi Ren, Shaoyi Du, Minhang Wang, Jintao Xu, Rui Ai, Weihao Gu, Xieyuanli ChenComments: 8 pages, 11 figures, conferenceSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [142] arXiv:2403.18756 [pdf, ps, other]
-
Title: Detection of subclinical atherosclerosis by image-based deep learning on chest x-rayAuthors: Guglielmo Gallone, Francesco Iodice, Alberto Presta, Davide Tore, Ovidio de Filippo, Michele Visciano, Carlo Alberto Barbano, Alessandro Serafini, Paola Gorrini, Alessandro Bruno, Walter Grosso Marra, James Hughes, Mario Iannaccone, Paolo Fonio, Attilio Fiandrotti, Alessandro Depaoli, Marco Grangetto, Gaetano Maria de Ferrari, Fabrizio D'AscenzoComments: Submitted to European Heart Journal - Cardiovascular Imaging. Additional material included. 44 pages (30 main paper, 14 additional material), 14 figures (5 main manuscript, 9 additional material)Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [143] arXiv:2403.18730 [pdf, other]
-
Title: Towards Image Ambient Lighting NormalizationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [144] arXiv:2403.18715 [pdf, other]
-
Title: Mitigating Hallucinations in Large Vision-Language Models with Instruction Contrastive DecodingSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM)
- [145] arXiv:2403.18714 [pdf, other]
- [146] arXiv:2403.18711 [pdf, other]
-
Title: SAT-NGP : Unleashing Neural Graphics Primitives for Fast Relightable Transient-Free 3D reconstruction from Satellite ImageryComments: 5 pages, 3 figures, 1 table; Accepted to International Geoscience and Remote Sensing Symposium (IGARSS) 2024; Code available at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [147] arXiv:2403.18708 [pdf, other]
-
Title: Dense Vision Transformer Compression with Few SamplesComments: Accepted to CVPR 2024. Note: Jianxin Wu is a contributing author for the arXiv version of this paper but is not listed as an author in the CVPR version due to his role as Program ChairSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [148] arXiv:2403.18690 [pdf, other]
-
Title: Annolid: Annotate, Segment, and Track Anything You NeedSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [149] arXiv:2403.18674 [pdf, other]
-
Title: Deep Learning for Robust and Explainable Models in Computer VisionAuthors: Mohammadreza AmirianComments: 150 pages, 37 figures, 12 tablesJournal-ref: OPARU is the OPen Access Repository of Ulm University and Ulm University of Applied Sciences, 2023Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [150] arXiv:2403.18649 [pdf, other]
-
Title: Addressing Data Annotation Challenges in Multiple Sensors: A Solution for Scania Collected DatasetsAuthors: Ajinkya Khoche, Aron Asefaw, Alejandro Gonzalez, Bogdan Timus, Sina Sharif Mansouri, Patric JensfeltComments: Accepted to European Control Conference 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Systems and Control (eess.SY)
- [151] arXiv:2403.18605 [pdf, other]
-
Title: FlexEdit: Flexible and Controllable Diffusion-based Object-centric Image EditingComments: Our project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [152] arXiv:2403.18600 [pdf, other]
-
Title: RAP: Retrieval-Augmented Planner for Adaptive Procedure Planning in Instructional VideosComments: 23 pages, 6 figures, 12 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [153] arXiv:2403.18593 [pdf, other]
-
Title: Homogeneous Tokenizer Matters: Homogeneous Visual Tokenizer for Remote Sensing Image UnderstandingComments: 20 pages, 8 figures, 6 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [154] arXiv:2403.18575 [pdf, other]
-
Title: HandBooster: Boosting 3D Hand-Mesh Reconstruction by Conditional Synthesis and Sampling of Hand-Object InteractionsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [155] arXiv:2403.18565 [pdf, other]
-
Title: Artifact Reduction in 3D and 4D Cone-beam Computed Tomography Images with Deep Learning -- A ReviewComments: 16 pages, 4 figures, 1 Table, published in IEEE Access JournalJournal-ref: IEEE Access, vol. 12, pp. 10281-10295, 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [156] arXiv:2403.18554 [pdf, other]
-
Title: CosalPure: Learning Concept from Group Images for Robust Co-Saliency DetectionComments: 8 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [157] arXiv:2403.18551 [pdf, other]
-
Title: Attention Calibration for Disentangled Text-to-Image PersonalizationComments: Accepted to CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [158] arXiv:2403.18550 [pdf, other]
-
Title: OrCo: Towards Better Generalization via Orthogonality and Contrast for Few-Shot Class-Incremental LearningSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [159] arXiv:2403.18548 [pdf, other]
-
Title: A Semi-supervised Nighttime Dehazing Baseline with Spatial-Frequency Aware and Realistic Brightness ConstraintComments: This paper is accepted by CVPR2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [160] arXiv:2403.18525 [pdf, other]
-
Title: Language Plays a Pivotal Role in the Object-Attribute Compositional Generalization of CLIPComments: Oral accepted at OODCV 2023 (this http URL)Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [161] arXiv:2403.18512 [pdf, other]
-
Title: ParCo: Part-Coordinating Text-to-Motion SynthesisSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [162] arXiv:2403.18495 [pdf, other]
-
Title: Direct mineral content prediction from drill core images via transfer learningAuthors: Romana Boiger, Sergey V. Churakov, Ignacio Ballester Llagaria, Georg Kosakowski, Raphael Wüst, Nikolaos I. PrasianakisSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
- [163] arXiv:2403.18493 [pdf, other]
-
Title: VersaT2I: Improving Text-to-Image Models with Versatile RewardAuthors: Jianshu Guo, Wenhao Chai, Jie Deng, Hsiang-Wei Huang, Tian Ye, Yichen Xu, Jiawei Zhang, Jenq-Neng Hwang, Gaoang WangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [164] arXiv:2403.18490 [pdf, other]
-
Title: I2CKD : Intra- and Inter-Class Knowledge Distillation for Semantic SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [165] arXiv:2403.18476 [pdf, other]
-
Title: Modeling uncertainty for Gaussian SplattingSubjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [166] arXiv:2403.18471 [pdf, other]
-
Title: DiffusionFace: Towards a Comprehensive Dataset for Diffusion-Based Face Forgery AnalysisSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [167] arXiv:2403.18469 [pdf, other]
-
Title: Density-guided Translator Boosts Synthetic-to-Real Unsupervised Domain Adaptive Segmentation of 3D Point CloudsComments: CVPR2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [168] arXiv:2403.18461 [pdf, other]
-
Title: DiffStyler: Diffusion-based Localized Image Style TransferAuthors: Shaoxu LiSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [169] arXiv:2403.18454 [pdf, other]
-
Title: Scaling Vision-and-Language Navigation With Offline RLComments: Published in Transactions on Machine Learning Research (04/2024)Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [170] arXiv:2403.18452 [pdf, other]
-
Title: SingularTrajectory: Universal Trajectory Predictor Using Diffusion ModelComments: Accepted at CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
- [171] arXiv:2403.18443 [pdf, other]
-
Title: $\mathrm{F^2Depth}$: Self-supervised Indoor Monocular Depth Estimation via Optical Flow Consistency and Feature Map SynthesisSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [172] arXiv:2403.18442 [pdf, other]
-
Title: Backpropagation-free Network for 3D Test-time AdaptationAuthors: Yanshuo Wang, Ali Cheraghian, Zeeshan Hayder, Jie Hong, Sameera Ramasinghe, Shafin Rahman, David Ahmedt-Aristizabal, Xuesong Li, Lars Petersson, Mehrtash HarandiComments: CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [173] arXiv:2403.18425 [pdf, other]
-
Title: U-Sketch: An Efficient Approach for Sketch to Image Diffusion ModelsSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [174] arXiv:2403.18417 [pdf, other]
-
Title: ECNet: Effective Controllable Text-to-Image Diffusion ModelsAuthors: Sicheng Li, Keqiang Sun, Zhixin Lai, Xiaoshi Wu, Feng Qiu, Haoran Xie, Kazunori Miyata, Hongsheng LiSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [175] arXiv:2403.18407 [pdf, other]
-
Title: A Channel-ensemble Approach: Unbiased and Low-variance Pseudo-labels is Critical for Semi-supervised ClassificationSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [176] arXiv:2403.18406 [pdf, other]
-
Title: An Image Grid Can Be Worth a Video: Zero-shot Video Question Answering Using a VLMComments: Our code is available at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [177] arXiv:2403.18397 [pdf, ps, other]
-
Title: Colour and Brush Stroke Pattern Recognition in Abstract Art using Modified Deep Convolutional Generative Adversarial NetworksComments: 28 pages, 5 tables, 7 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [178] arXiv:2403.18383 [pdf, other]
-
Title: Generative Multi-modal Models are Good Class-Incremental LearnersComments: Accepted at CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [179] arXiv:2403.18373 [pdf, other]
-
Title: BAM: Box Abstraction Monitors for Real-time OoD Detection in Object DetectionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [180] arXiv:2403.18370 [pdf, other]
-
Title: Ship in Sight: Diffusion Models for Ship-Image Super ResolutionComments: Accepted at 2024 International Joint Conference on Neural Networks (IJCNN)Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [181] arXiv:2403.18361 [pdf, other]
-
Title: ViTAR: Vision Transformer with Any ResolutionAuthors: Qihang Fan, Quanzeng You, Xiaotian Han, Yongfei Liu, Yunzhe Tao, Huaibo Huang, Ran He, Hongxia YangSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [182] arXiv:2403.18360 [pdf, other]
-
Title: Learning CNN on ViT: A Hybrid Model to Explicitly Class-specific Boundaries for Domain AdaptationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [183] arXiv:2403.18356 [pdf, other]
-
Title: MonoHair: High-Fidelity Hair Modeling from a Monocular VideoAuthors: Keyu Wu, Lingchen Yang, Zhiyi Kuang, Yao Feng, Xutao Han, Yuefan Shen, Hongbo Fu, Kun Zhou, Youyi ZhengComments: Accepted by IEEE CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [184] arXiv:2403.18351 [pdf, other]
-
Title: Generating Diverse Agricultural Data for Vision-Based Farming ApplicationsAuthors: Mikolaj Cieslak, Umabharathi Govindarajan, Alejandro Garcia, Anuradha Chandrashekar, Torsten Hädrich, Aleksander Mendoza-Drosik, Dominik L. Michels, Sören Pirk, Chia-Chun Fu, Wojciech PałubickiComments: 10 pages, 8 figures, 3 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG)
- [185] arXiv:2403.18342 [pdf, other]
-
Title: Learning Inclusion Matching for Animation Paint Bucket ColorizationComments: Accepted to CVPR 2024. Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [186] arXiv:2403.18334 [pdf, other]
-
Title: DODA: Diffusion for Object-detection Domain Adaptation in AgricultureSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [187] arXiv:2403.18330 [pdf, other]
-
Title: Tracking-Assisted Object Detection with Event CamerasAuthors: Ting-Kang Yen, Igor Morawski, Shusil Dangi, Kai He, Chung-Yi Lin, Jia-Fong Yeh, Hung-Ting Su, Winston HsuSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [188] arXiv:2403.18328 [pdf, other]
-
Title: PIPNet3D: Interpretable Detection of Alzheimer in MRI ScansAuthors: Lisa Anita De Santi, Jörg Schlötterer, Michael Scheschenja, Joel Wessendorf, Meike Nauta, Vincenzo Positano, Christin SeifertSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [189] arXiv:2403.18318 [pdf, other]
-
Title: Uncertainty-Aware SAR ATR: Defending Against Adversarial Attacks via Bayesian Neural NetworksSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [190] arXiv:2403.18294 [pdf, other]
-
Title: Multi-scale Unified Network for Image ClassificationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [191] arXiv:2403.18293 [pdf, other]
-
Title: Efficient Test-Time Adaptation of Vision-Language ModelsComments: Accepted to CVPR 2024. The code has been released at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [192] arXiv:2403.18291 [pdf, other]
-
Title: Towards Non-Exemplar Semi-Supervised Class-Incremental LearningSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [193] arXiv:2403.18282 [pdf, other]
-
Title: SGDM: Static-Guided Dynamic Module Make Stronger Visual ModelsComments: 16 pages, 4 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [194] arXiv:2403.18281 [pdf, other]
-
Title: AIR-HLoc: Adaptive Image Retrieval for Efficient Visual LocalisationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [195] arXiv:2403.18274 [pdf, other]
-
Title: DVLO: Deep Visual-LiDAR Odometry with Local-to-Global Feature Fusion and Bi-Directional Structure AlignmentSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [196] arXiv:2403.18271 [pdf, other]
-
Title: Unleashing the Potential of SAM for Medical Adaptation via Hierarchical DecodingComments: CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [197] arXiv:2403.18270 [pdf, other]
-
Title: Image Deraining via Self-supervised Reinforcement LearningSubjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [198] arXiv:2403.18260 [pdf, other]
-
Title: Toward Interactive Regional Understanding in Vision-Large Language ModelsComments: NAACL 2024 Main ConferenceSubjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [199] arXiv:2403.18258 [pdf, other]
-
Title: Enhancing Generative Class Incremental Learning Performance with Model Forgetting ApproachSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [200] arXiv:2403.18252 [pdf, other]
-
Title: Beyond Embeddings: The Promise of Visual Table in Multi-Modal ModelsComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Multimedia (cs.MM)
- [201] arXiv:2403.18241 [pdf, other]
-
Title: NeuSDFusion: A Spatial-Aware Generative Model for 3D Shape Completion, Reconstruction, and GenerationAuthors: Ruikai Cui, Weizhe Liu, Weixuan Sun, Senbo Wang, Taizhang Shang, Yang Li, Xibin Song, Han Yan, Zhennan Wu, Shenzhou Chen, Hongdong Li, Pan JiSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG)
- [202] arXiv:2403.18238 [pdf, other]
-
Title: TAFormer: A Unified Target-Aware Transformer for Video and Motion Joint Prediction in Aerial ScenesAuthors: Liangyu Xu, Wanxuan Lu, Hongfeng Yu, Yongqiang Mao, Hanbo Bi, Chenglong Liu, Xian Sun, Kun FuComments: 17 pages, 9 figuresSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [203] arXiv:2403.18228 [pdf, other]
-
Title: Fourier or Wavelet bases as counterpart self-attention in spikformer for efficient visual classificationComments: 18 pages, 2 figures. arXiv admin note: substantial text overlap with arXiv:2308.02557Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
- [204] arXiv:2403.18211 [pdf, other]
-
Title: NeuroPictor: Refining fMRI-to-Image Reconstruction via Multi-individual Pretraining and Multi-level ModulationSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [205] arXiv:2403.18208 [pdf, other]
-
Title: An Evolutionary Network Architecture Search Framework with Adaptive Multimodal Fusion for Hand Gesture RecognitionSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Neural and Evolutionary Computing (cs.NE)
- [206] arXiv:2403.18207 [pdf, other]
-
Title: Road Obstacle Detection based on Unknown Objectness ScoresComments: ICRA 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [207] arXiv:2403.18201 [pdf, other]
-
Title: Few-shot Online Anomaly Detection and SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [208] arXiv:2403.18193 [pdf, other]
-
Title: Middle Fusion and Multi-Stage, Multi-Form Prompts for Robust RGB-T TrackingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [209] arXiv:2403.18187 [pdf, other]
-
Title: LayoutFlow: Flow Matching for Layout GenerationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [210] arXiv:2403.18186 [pdf, other]
-
Title: Don't Look into the Dark: Latent Codes for Pluralistic Image InpaintingComments: CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [211] arXiv:2403.18180 [pdf, other]
-
Title: Multi-Layer Dense Attention Decoder for Polyp SegmentationSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [212] arXiv:2403.18158 [pdf, other]
-
Title: The Effects of Short Video-Sharing Services on Video Copy DetectionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [213] arXiv:2403.18118 [pdf, other]
-
Title: EgoLifter: Open-world 3D Segmentation for Egocentric PerceptionComments: Preprint. Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [214] arXiv:2403.18117 [pdf, ps, other]
-
Title: TDIP: Tunable Deep Image Processing, a Real Time Melt Pool Monitoring SolutionSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [215] arXiv:2403.18116 [pdf, other]
-
Title: QuakeSet: A Dataset and Low-Resource Models to Monitor Earthquakes through Sentinel-1Comments: Accepted at ISCRAM 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [216] arXiv:2403.18114 [pdf, other]
-
Title: Segment Any Medical Model ExtendedAuthors: Yihao Liu, Jiaming Zhang, Andres Diaz-Pinto, Haowei Li, Alejandro Martin-Gomez, Amir Kheradmand, Mehran ArmandComments: The content of the manuscript was presented at SPIE Medical Imaging 2024 and has been accepted to appear in the conference proceedingsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [217] arXiv:2403.18104 [pdf, other]
-
Title: Mathematical Foundation and Corrections for Full Range Head Pose EstimationSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [218] arXiv:2403.18094 [pdf, other]
-
Title: A Personalized Video-Based Hand Taxonomy: Application for Individuals with Spinal Cord InjurySubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [219] arXiv:2403.18092 [pdf, other]
-
Title: OCAI: Improving Optical Flow Estimation by Occlusion and Consistency Aware InterpolationComments: CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [220] arXiv:2403.18080 [pdf, other]
-
Title: EgoPoseFormer: A Simple Baseline for Egocentric 3D Human Pose EstimationAuthors: Chenhongyi Yang, Anastasia Tkach, Shreyas Hampali, Linguang Zhang, Elliot J. Crowley, Cem KeskinComments: Tech ReportSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [221] arXiv:2403.18074 [pdf, other]
-
Title: Every Shot Counts: Using Exemplars for Repetition Counting in VideosComments: Project website: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [222] arXiv:2403.18067 [pdf, other]
-
Title: State of the art applications of deep learning within tracking and detecting marine debris: A surveyComments: Review paper, 60 pages including references, 1 figure, 3 tables, 1 supplementary dataSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [223] arXiv:2403.18063 [pdf, other]
-
Title: Spectral Convolutional Transformer: Harmonizing Real vs. Complex Multi-View Spectral Operators for Vision TransformerSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Multimedia (cs.MM)
- [224] arXiv:2403.18040 [pdf, other]
-
Title: Global Point Cloud Registration Network for Large TransformationsAuthors: Hanz Cuevas-Velasquez, Alejandro Galán-Cuenca, Antonio Javier Gallego, Marcelo Saval-Calvo, Robert B. FisherSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [225] arXiv:2403.18038 [pdf, ps, other]
-
Title: TGGLinesPlus: A robust topological graph-guided computer vision algorithm for line detection from imagesComments: Our TGGLinesPlus Python implementation is open source. 27 pages, 8 figures and 4 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [226] arXiv:2403.18036 [pdf, other]
-
Title: Move as You Say, Interact as You Can: Language-guided Human Motion Generation with Scene AffordanceAuthors: Zan Wang, Yixin Chen, Baoxiong Jia, Puhao Li, Jinlu Zhang, Jingze Zhang, Tengyu Liu, Yixin Zhu, Wei Liang, Siyuan HuangComments: CVPR 2024; 16 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [227] arXiv:2403.18033 [pdf, other]
-
Title: SpectralWaste Dataset: Multimodal Data for Waste Sorting AutomationAuthors: Sara Casao, Fernando Peña, Alberto Sabater, Rosa Castillón, Darío Suárez, Eduardo Montijano, Ana C. MurilloSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [228] arXiv:2403.17998 [pdf, other]
-
Title: Text Is MASS: Modeling as Stochastic Embedding for Text-Video RetrievalAuthors: Jiamian Wang, Guohao Sun, Pichao Wang, Dongfang Liu, Sohail Dianat, Majid Rabbani, Raghuveer Rao, Zhiqiang TaoComments: Accepted by CVPR 2024, code and model are available at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [229] arXiv:2403.17995 [pdf, other]
-
Title: Semi-Supervised Image Captioning Considering Wasserstein Graph MatchingAuthors: Yang YangSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [230] arXiv:2403.17994 [pdf, other]
-
Title: Solution for Point Tracking Task of ICCV 1st Perception Test Challenge 2023Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [231] arXiv:2403.18821 (cross-list from cs.SD) [pdf, other]
-
Title: Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and BenchmarkAuthors: Ziyang Chen, Israel D. Gebru, Christian Richardt, Anurag Kumar, William Laney, Andrew Owens, Alexander RichardComments: Accepted to CVPR 2024. Project site: this https URLSubjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
- [232] arXiv:2403.18734 (cross-list from eess.IV) [pdf, other]
-
Title: A vascular synthetic model for improved aneurysm segmentation and detection via Deep Neural NetworksSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [233] arXiv:2403.18731 (cross-list from cs.AI) [pdf, other]
-
Title: Enhancing Manufacturing Quality Prediction Models through the Integration of Explainability Methods
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Machine Learning (cs.LG)
- [234] arXiv:2403.18717 (cross-list from cs.LG) [pdf, other]
-
Title: Semi-Supervised Learning for Deep Causal Generative Models
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
- [235] arXiv:2403.18660 (cross-list from cs.GR) [pdf, other]
-
Title: InstructBrush: Learning Attention-based Instruction Optimization for Image Editing
Authors: Ruoyu Zhao, Qingnan Fan, Fei Kou, Shuai Qin, Hong Gu, Wei Wu, Pengcheng Xu, Mingrui Zhu, Nannan Wang, Xinbo Gao
Comments: Project Page: this https URL
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
- [236] arXiv:2403.18637 (cross-list from eess.IV) [pdf, other]
-
Title: Transformers-based architectures for stroke segmentation: A review
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [237] arXiv:2403.18589 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Users prefer Jpegli over same-sized libjpeg-turbo or MozJPEG
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [238] arXiv:2403.18587 (cross-list from cs.CR) [pdf, other]
-
Title: The Impact of Uniform Inputs on Activation Sparsity and Energy-Latency Attacks in Computer Vision
Comments: Accepted at the DLSP 2024
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [239] arXiv:2403.18546 (cross-list from cs.RO) [pdf, other]
-
Title: Efficient Heatmap-Guided 6-Dof Grasp Detection in Cluttered Scenes
Comments: Extensive results on GraspNet-1B dataset
Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [240] arXiv:2403.18514 (cross-list from eess.IV) [pdf, other]
-
Title: CT-3DFlow : Leveraging 3D Normalizing Flows for Unsupervised Detection of Pathological Pulmonary CT scans
Authors: Aissam Djahnine, Alexandre Popoff, Emilien Jupin-Delevaux, Vincent Cottin, Olivier Nempont, Loic Boussel
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [241] arXiv:2403.18501 (cross-list from eess.IV) [pdf, other]
-
Title: HEMIT: H&E to Multiplex-immunohistochemistry Image Translation with Dual-Branch Pix2pix Generator
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [242] arXiv:2403.18468 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Deep Learning Segmentation and Classification of Red Blood Cells Using a Large Multi-Scanner Dataset
Comments: 15 pages, 12 figures, 8 tables
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [243] arXiv:2403.18447 (cross-list from cs.CL) [pdf, other]
-
Title: Can Language Beat Numerical Regression? Language-Based Multimodal Trajectory Prediction
Comments: Accepted at CVPR 2024
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
- [244] arXiv:2403.18388 (cross-list from cs.AI) [pdf, other]
-
Title: FTBC: Forward Temporal Bias Correction for Optimizing ANN-SNN Conversion
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [245] arXiv:2403.18347 (cross-list from astro-ph.SR) [pdf, other]
-
Title: A Quantum Fuzzy-based Approach for Real-Time Detection of Solar Coronal Holes
Comments: 14 pages, 5 figures, 3 tables
Subjects: Solar and Stellar Astrophysics (astro-ph.SR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [246] arXiv:2403.18346 (cross-list from cs.CL) [pdf, other]
-
Title: Quantifying and Mitigating Unimodal Biases in Multimodal Large Language Models: A Causal Perspective
Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
- [247] arXiv:2403.18339 (cross-list from eess.IV) [pdf, other]
-
Title: H2ASeg: Hierarchical Adaptive Interaction and Weighting Network for Tumor Segmentation in PET/CT Images
Comments: 10 pages, 4 figures
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [248] arXiv:2403.18321 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Implementation of the Principal Component Analysis onto High-Performance Computer Facilities for Hyperspectral Dimensionality Reduction: Results and Comparisons
Authors: E. Martel, R. Lazcano, J. Lopez, D. Madroñal, R. Salvador, S. Lopez, E. Juarez, R. Guerra, C. Sanz, R. Sarmiento
Comments: 30 pages, 10 figures
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [249] arXiv:2403.18301 (cross-list from cs.LG) [pdf, other]
-
Title: Selective Mixup Fine-Tuning for Optimizing Non-Decomposable Objectives
Authors: Shrinivas Ramasubramanian, Harsh Rangwani, Sho Takemori, Kunal Samanta, Yuhei Umeda, Venkatesh Babu Radhakrishnan
Comments: ICLR 2024 SpotLight
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
- [250] arXiv:2403.18266 (cross-list from cs.LG) [pdf, other]
-
Title: Branch-Tuning: Balancing Stability and Plasticity for Continual Self-Supervised Learning
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [251] arXiv:2403.18233 (cross-list from eess.IV) [pdf, other]
-
Title: Benchmarking Image Transformers for Prostate Cancer Detection from Ultrasound Data
Authors: Mohamed Harmanani, Paul F. R. Wilson, Fahimeh Fooladgar, Amoon Jamzad, Mahdi Gilany, Minh Nguyen Nhat To, Brian Wodlinger, Purang Abolmaesumi, Parvin Mousavi
Comments: early draft, 7 pages; Accepted to SPIE Medical Imaging 2024
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Tissues and Organs (q-bio.TO)
- [252] arXiv:2403.18198 (cross-list from eess.IV) [pdf, other]
-
Title: Generative Medical Segmentation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [253] arXiv:2403.18196 (cross-list from cs.LG) [pdf, ps, other]
-
Title: Looking Beyond What You See: An Empirical Analysis on Subgroup Intersectional Fairness for Multi-label Chest X-ray Classification Using Social Determinants of Racial Health Inequities
Comments: ICCV CVAMD 2023
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY)
- [254] arXiv:2403.18178 (cross-list from cs.RO) [pdf, other]
-
Title: Online Embedding Multi-Scale CLIP Features into 3D Maps
Comments: 8 pages, 7 figures
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [255] arXiv:2403.18151 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Automated Report Generation for Lung Cytological Images Using a CNN Vision Classifier and Multiple-Transformer Text Decoders: Preliminary Study
Authors: Atsushi Teramoto, Ayano Michiba, Yuka Kiriyama, Tetsuya Tsukamoto, Kazuyoshi Imaizumi, Hiroshi Fujita
Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Medical Physics (physics.med-ph)
- [256] arXiv:2403.18144 (cross-list from cs.CR) [pdf, other]
-
Title: Leak and Learn: An Attacker's Cookbook to Train Using Leaked Data from Federated Learning
Comments: Accepted to CVPR 2024
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
- [257] arXiv:2403.18139 (cross-list from eess.IV) [pdf, other]
-
Title: Pseudo-MRI-Guided PET Image Reconstruction Method Based on a Diffusion Probabilistic Model
Authors: Weijie Gan, Huidong Xie, Carl von Gall, Günther Platsch, Michael T. Jurkiewicz, Andrea Andrade, Udunna C. Anazodo, Ulugbek S. Kamilov, Hongyu An, Jorge Cabello
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [258] arXiv:2403.18134 (cross-list from eess.IV) [pdf, other]
-
Title: Integrative Graph-Transformer Framework for Histopathology Whole Slide Image Representation and Classification
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [259] arXiv:2403.18132 (cross-list from cs.LG) [pdf, other]
-
Title: Recommendation of data-free class-incremental learning algorithms by simulating future data
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [260] arXiv:2403.18103 (cross-list from cs.LG) [pdf, other]
-
Title: Tutorial on Diffusion Models for Imaging and Vision
Authors: Stanley H. Chan
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [261] arXiv:2403.18096 (cross-list from cs.RO) [pdf, other]
-
Title: Efficient Multi-Band Temporal Video Filter for Reducing Human-Robot Interaction
Authors: Lawrence O'Gorman
Comments: 15 pages, 5 figures, 4 tables
Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [262] arXiv:2403.18035 (cross-list from cs.LG) [pdf, other]
-
Title: Bidirectional Consistency Models
Comments: 40 pages, 25 figures
Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [263] arXiv:2403.18028 (cross-list from cs.LG) [pdf, other]
-
Title: Predicting Species Occurrence Patterns from Partial Observations
Comments: Tackling Climate Change with Machine Learning workshop at ICLR 2024
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Populations and Evolution (q-bio.PE)
- [264] arXiv:2403.17958 (cross-list from cs.LG) [pdf, other]
-
Title: Deep Generative Domain Adaptation with Temporal Attention for Cross-User Activity Recognition
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
Wed, 27 Mar 2024
- [265] arXiv:2403.17937 [pdf, other]
-
Title: Efficient Video Object Segmentation via Modulated Cross-Attention Memory
Authors: Abdelrahman Shaker, Syed Talal Wasim, Martin Danelljan, Salman Khan, Ming-Hsuan Yang, Fahad Shahbaz Khan
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [266] arXiv:2403.17936 [pdf, other]
-
Title: ConvoFusion: Multi-Modal Conversational Diffusion for Co-Speech Gesture Synthesis
Authors: Muhammad Hamza Mughal, Rishabh Dabral, Ikhsanul Habibie, Lucia Donatelli, Marc Habermann, Christian Theobalt
Comments: CVPR 2024. Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [267] arXiv:2403.17935 [pdf, other]
-
Title: OmniVid: A Generative Framework for Universal Video Understanding
Comments: Accepted by CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [268] arXiv:2403.17934 [pdf, other]
-
Title: AiOS: All-in-One-Stage Expressive Human Pose and Shape Estimation
Authors: Qingping Sun, Yanjun Wang, Ailing Zeng, Wanqi Yin, Chen Wei, Wenjia Wang, Haiyi Mei, Chi Sing Leung, Ziwei Liu, Lei Yang, Zhongang Cai
Comments: Homepage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [269] arXiv:2403.17931 [pdf, other]
-
Title: Track Everything Everywhere Fast and Robustly
Comments: project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [270] arXiv:2403.17929 [pdf, other]
-
Title: Towards Explaining Hypercomplex Neural Networks
Comments: The paper has been accepted at IEEE WCCI 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [271] arXiv:2403.17926 [pdf, other]
-
Title: FastCAR: Fast Classification And Regression Multi-Task Learning via Task Consolidation for Modelling a Continuous Property Variable of Object Classes
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [272] arXiv:2403.17924 [pdf, other]
-
Title: AID: Attention Interpolation of Text-to-Image Diffusion
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [273] arXiv:2403.17920 [pdf, other]
-
Title: TC4D: Trajectory-Conditioned Text-to-4D Generation
Authors: Sherwin Bahmani, Xian Liu, Yifan Wang, Ivan Skorokhodov, Victor Rong, Ziwei Liu, Xihui Liu, Jeong Joon Park, Sergey Tulyakov, Gordon Wetzstein, Andrea Tagliasacchi, David B. Lindell
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [274] arXiv:2403.17915 [pdf, other]
-
Title: Leveraging Near-Field Lighting for Monocular Depth Estimation from Endoscopy Videos
Authors: Akshay Paruchuri, Samuel Ehrenstein, Shuxian Wang, Inbar Fried, Stephen M. Pizer, Marc Niethammer, Roni Sengupta
Comments: 26 pages, 7 tables, 7 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [275] arXiv:2403.17909 [pdf, other]
-
Title: ELGC-Net: Efficient Local-Global Context Aggregation for Remote Sensing Change Detection
Comments: accepted at IEEE TGRS
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [276] arXiv:2403.17898 [pdf, other]
-
Title: Octree-GS: Towards Consistent Real-time Rendering with LOD-Structured 3D Gaussians
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [277] arXiv:2403.17893 [pdf, other]
-
Title: A Survey on 3D Egocentric Human Pose Estimation
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [278] arXiv:2403.17888 [pdf, other]
-
Title: 2D Gaussian Splatting for Geometrically Accurate Radiance Fields
Comments: 12 pages, 12 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [279] arXiv:2403.17884 [pdf, other]
-
Title: Sen2Fire: A Challenging Benchmark Dataset for Wildfire Detection using Sentinel Data
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [280] arXiv:2403.17883 [pdf, other]
-
Title: Superior and Pragmatic Talking Face Generation with Teacher-Student Framework
Authors: Chao Liang, Jianwen Jiang, Tianyun Zhong, Gaojie Lin, Zhengkun Rong, Jiaqi Yang, Yongming Zhu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [281] arXiv:2403.17881 [pdf, other]
-
Title: Deepfake Generation and Detection: A Benchmark and Survey
Authors: Gan Pei, Jiangning Zhang, Menghan Hu, Guangtao Zhai, Chengjie Wang, Zhenyu Zhang, Jian Yang, Chunhua Shen, Dacheng Tao
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [282] arXiv:2403.17879 [pdf, other]
-
Title: Low-Latency Neural Stereo Streaming
Comments: Accepted by CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [283] arXiv:2403.17870 [pdf, other]
-
Title: Boosting Diffusion Models with Moving Average Sampling in Frequency Domain
Comments: CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
- [284] arXiv:2403.17869 [pdf, other]
-
Title: To Supervise or Not to Supervise: Understanding and Addressing the Key Challenges of 3D Transfer Learning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [285] arXiv:2403.17839 [pdf, other]
-
Title: ReMamber: Referring Image Segmentation with Mamba Twister
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [286] arXiv:2403.17837 [pdf, other]
-
Title: GTA-HDR: A Large-Scale Synthetic Dataset for HDR Image Reconstruction
Comments: Submitted to IEEE
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG); Multimedia (cs.MM); Image and Video Processing (eess.IV)
- [287] arXiv:2403.17834 [pdf, other]
-
Title: A foundation model utilizing chest CT volumes and radiology reports for supervised-level zero-shot detection of abnormalities
Authors: Ibrahim Ethem Hamamci, Sezgin Er, Furkan Almas, Ayse Gulnihan Simsek, Sevval Nil Esirgun, Irem Dogan, Muhammed Furkan Dasdelen, Bastian Wittmann, Enis Simsar, Mehmet Simsar, Emine Bensu Erdemir, Abdullah Alanbay, Anjany Sekuboyina, Berkan Lafci, Mehmet K. Ozdemir, Bjoern Menze
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [288] arXiv:2403.17830 [pdf, other]
-
Title: Assessment of Multimodal Large Language Models in Alignment with Human Values
Authors: Zhelun Shi, Zhipin Wang, Hongxing Fan, Zaibin Zhang, Lijun Li, Yongting Zhang, Zhenfei Yin, Lu Sheng, Yu Qiao, Jing Shao
Comments: arXiv admin note: text overlap with arXiv:2311.02692
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [289] arXiv:2403.17827 [pdf, other]
-
Title: DiffH2O: Diffusion-Based Synthesis of Hand-Object Interactions from Textual Descriptions
Authors: Sammy Christen, Shreyas Hampali, Fadime Sener, Edoardo Remelli, Tomas Hodan, Eric Sauser, Shugao Ma, Bugra Tekin
Comments: Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG)
- [290] arXiv:2403.17823 [pdf, other]
-
Title: Efficient Image Pre-Training with Siamese Cropped Masked Autoencoders
Authors: Alexandre Eymaël, Renaud Vandeghen, Anthony Cioppa, Silvio Giancola, Bernard Ghanem, Marc Van Droogenbroeck
Comments: 19 pages, 6 figures, 3 tables, 1 page of supplementary material
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [291] arXiv:2403.17822 [pdf, other]
-
Title: DN-Splatter: Depth and Normal Priors for Gaussian Splatting and Meshing
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [292] arXiv:2403.17804 [pdf, other]
-
Title: Improving Text-to-Image Consistency via Automatic Prompt Optimization
Authors: Oscar Mañas, Pietro Astolfi, Melissa Hall, Candace Ross, Jack Urbanek, Adina Williams, Aishwarya Agrawal, Adriana Romero-Soriano, Michal Drozdzal
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [293] arXiv:2403.17801 [pdf, other]
-
Title: Towards 3D Vision with Low-Cost Single-Photon Cameras
Authors: Fangzhou Mu, Carter Sifferman, Sacha Jungerman, Yiquan Li, Mark Han, Michael Gleicher, Mohit Gupta, Yin Li
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [294] arXiv:2403.17782 [pdf, other]
-
Title: GenesisTex: Adapting Image Denoising Diffusion to Texture Space
Comments: 12 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [295] arXiv:2403.17765 [pdf, other]
-
Title: MUTE-SLAM: Real-Time Neural SLAM with Multiple Tri-Plane Hash Representations
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [296] arXiv:2403.17761 [pdf, other]
-
Title: Makeup Prior Models for 3D Facial Makeup Estimation and Applications
Comments: CVPR2024. Project: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [297] arXiv:2403.17757 [pdf, other]
-
Title: Noise2Noise Denoising of CRISM Hyperspectral Data
Comments: 5 pages, 3 figures. Accepted as a conference paper at the ICLR 2024 ML4RS Workshop
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [298] arXiv:2403.17749 [pdf, other]
-
Title: Multi-Task Dense Prediction via Mixture of Low-Rank Experts
Comments: Accepted at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [299] arXiv:2403.17727 [pdf, other]
-
Title: FastPerson: Enhancing Video Learning through Effective Video Summarization that Preserves Linguistic and Visual Contexts
Journal-ref: AHs '24: Proceedings of the Augmented Humans International Conference 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
- [300] arXiv:2403.17725 [pdf, other]
-
Title: Deep Learning for Segmentation of Cracks in High-Resolution Images of Steel Bridges
Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
- [301] arXiv:2403.17712 [pdf, other]
-
Title: Invisible Gas Detection: An RGB-Thermal Cross Attention Network and A New Benchmark
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [302] arXiv:2403.17709 [pdf, other]
-
Title: Groupwise Query Specialization and Quality-Aware Multi-Assignment for Transformer-based Visual Relationship Detection
Comments: CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [303] arXiv:2403.17708 [pdf, other]
-
Title: Panonut360: A Head and Eye Tracking Dataset for Panoramic Video
Comments: 7 pages, ACM MMSys'24 accepted
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
- [304] arXiv:2403.17702 [pdf, other]
-
Title: The Solution for the CVPR 2023 1st foundation model challenge-Track2
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [305] arXiv:2403.17695 [pdf, other]
-
Title: PlainMamba: Improving Non-Hierarchical Mamba in Visual Recognition
Authors: Chenhongyi Yang, Zehui Chen, Miguel Espinosa, Linus Ericsson, Zhenyu Wang, Jiaming Liu, Elliot J. Crowley
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [306] arXiv:2403.17694 [pdf, other]
-
Title: AniPortrait: Audio-Driven Synthesis of Photorealistic Portrait Animation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Image and Video Processing (eess.IV)
- [307] arXiv:2403.17692 [pdf, other]
-
Title: Manifold-Guided Lyapunov Control with Diffusion Models
Comments: 14 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Differential Geometry (math.DG); Optimization and Control (math.OC); Computation (stat.CO)
- [308] arXiv:2403.17691 [pdf, other]
-
Title: Not All Similarities Are Created Equal: Leveraging Data-Driven Biases to Inform GenAI Copyright Disputes
Authors: Uri Hacohen, Adi Haviv, Shahar Sarfaty, Bruria Friedman, Niva Elkin-Koren, Roi Livni, Amit H Bermano
Comments: Presented at ACM CSLAW 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
- [309] arXiv:2403.17678 [pdf, other]
-
Title: Hierarchical Light Transformer Ensembles for Multimodal Trajectory Forecasting
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [310] arXiv:2403.17664 [pdf, other]
-
Title: DiffFAE: Advancing High-fidelity One-shot Facial Appearance Editing with Space-sensitive Customization and Semantic Preservation
Authors: Qilin Wang, Jiangning Zhang, Chengming Xu, Weijian Cao, Ying Tai, Yue Han, Yanhao Ge, Hong Gu, Chengjie Wang, Yanwei Fu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [311] arXiv:2403.17651 [pdf, other]
-
Title: Exploring Dynamic Transformer for Efficient Object Tracking
Authors: Jiawen Zhu, Xin Chen, Haiwen Diao, Shuai Li, Jun-Yan He, Chenyang Li, Bin Luo, Dong Wang, Huchuan Lu
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [312] arXiv:2403.17638 [pdf, other]
-
Title: Learning with Unreliability: Fast Few-shot Voxel Radiance Fields with Relative Geometric Consistency
Comments: CVPR 2024 final version
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [313] arXiv:2403.17633 [pdf, other]
-
Title: UADA3D: Unsupervised Adversarial Domain Adaptation for 3D Object Detection with Sparse LiDAR and Large Domain Gaps
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
- [314] arXiv:2403.17631 [pdf, other]
-
Title: AniArtAvatar: Animatable 3D Art Avatar from a Single Image
Authors: Shaoxu Li
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [315] arXiv:2403.17610 [pdf, other]
-
Title: MMVP: A Multimodal MoCap Dataset with Vision and Pressure Sensors
Authors: He Zhang, Shenghao Ren, Haolei Yuan, Jianhui Zhao, Fan Li, Shuangpeng Sun, Zhenghao Liang, Tao Yu, Qiu Shen, Xun Cao
Comments: CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [316] arXiv:2403.17608 [pdf, other]
-
Title: Fake or JPEG? Revealing Common Biases in Generated Image Detection Datasets
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [317] arXiv:2403.17589 [pdf, other]
-
Title: Dual Memory Networks: A Versatile Adaptation Approach for Vision-Language Models
Comments: CVPR2024; Codes are available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
- [318] arXiv:2403.17550 [pdf, other]
-
Title: DeepMIF: Deep Monotonic Implicit Fields for Large-Scale LiDAR 3D Mapping
Comments: 8 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
- [319] arXiv:2403.17541 [pdf, other]
-
Title: WordRobe: Text-Guided Generation of Textured 3D Garments
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [320] arXiv:2403.17537 [pdf, other]
-
Title: NeRF-HuGS: Improved Neural Radiance Fields in Non-static Scenes Using Heuristics-Guided Segmentation
Comments: To appear in CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [321] arXiv:2403.17530 [pdf, other]
-
Title: Boosting Few-Shot Learning with Disentangled Self-Supervised Learning and Meta-Learning for Medical Image Classification
Comments: 20 pages, 4 figures, 4 tables. Submitted to Elsevier on 25 March 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [322] arXiv:2403.17525 [pdf, other]
-
Title: Equipping Sketch Patches with Context-Aware Positional Encoding for Graphic Sketch Representation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [323] arXiv:2403.17512 [pdf, other]
-
Title: Random-coupled Neural Network
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [324] arXiv:2403.17502 [pdf, other]
-
Title: SeNM-VAE: Semi-Supervised Noise Modeling with Hierarchical Variational Autoencoder
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [325] arXiv:2403.17496 [pdf, other]
-
Title: Dr.Hair: Reconstructing Scalp-Connected Hair Strands without Pre-training via Differentiable Rendering of Line Segments
Comments: CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
- [326] arXiv:2403.17477 [pdf, other]
-
Title: DiffGaze: A Diffusion Model for Continuous Gaze Sequence Generation on 360° Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
- [327] arXiv:2403.17465 [pdf, other]
-
Title: LaRE^2: Latent Reconstruction Error Based Method for Diffusion-Generated Image Detection
Comments: CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [328] arXiv:2403.17423 [pdf, other]
-
Title: Test-time Adaptation Meets Image Enhancement: Improving Accuracy via Uncertainty-aware Logit Switching
Authors: Shohei Enomoto, Naoya Hasegawa, Kazuki Adachi, Taku Sasaki, Shin'ya Yamaguchi, Satoshi Suzuki, Takeharu Eda
Comments: Accepted to IJCNN2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
- [329] arXiv:2403.17422 [pdf, other]
-
Title: InterHandGen: Two-Hand Interaction Generation via Cascaded Reverse Diffusion
Comments: Accepted to CVPR 2024, project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [330] arXiv:2403.17409 [pdf, other]
-
Title: Neural Clustering based Visual Representation Learning
Comments: CVPR 2024. Code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [331] arXiv:2403.17390 [pdf, other]
-
Title: SSF3D: Strict Semi-Supervised 3D Object Detection with Switching Filter
Authors: Songbur Wong
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [332] arXiv:2403.17387 [pdf, other]
-
Title: Decoupled Pseudo-labeling for Semi-Supervised Monocular 3D Object Detection
Authors: Jiacheng Zhang, Jiaming Li, Xiangru Lin, Wei Zhang, Xiao Tan, Junyu Han, Errui Ding, Jingdong Wang, Guanbin Li
Comments: To appear in CVPR2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [333] arXiv:2403.17377 [pdf, other]
-
Title: Self-Rectifying Diffusion Sampling with Perturbed-Attention Guidance
Authors: Donghoon Ahn, Hyoungwon Cho, Jaewon Min, Wooseok Jang, Jungwoo Kim, SeonHwa Kim, Hyun Hee Park, Kyong Hwan Jin, Seungryong Kim
Comments: Project page is available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [334] arXiv:2403.17373 [pdf, other]
-
Title: AIDE: An Automatic Data Engine for Object Detection in Autonomous Driving
Authors: Mingfu Liang, Jong-Chyi Su, Samuel Schulter, Sparsh Garg, Shiyu Zhao, Ying Wu, Manmohan Chandraker
Comments: Accepted by CVPR-2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [335] arXiv:2403.17369 [pdf, other]
-
Title: CoDA: Instructive Chain-of-Domain Adaptation with Severity-Aware Visual Prompt Tuning
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [336] arXiv:2403.17360 [pdf, other]
-
Title: Activity-Biometrics: Person Identification from Daily Activities
Comments: CVPR 2024 Main conference
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [337] arXiv:2403.17346 [pdf, other]
-
Title: TRAM: Global Trajectory and Motion of 3D Humans from in-the-wild Videos
Comments: The project website: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [338] arXiv:2403.17343 [pdf, other]
-
Title: Language Models are Free Boosters for Biomedical Imaging Tasks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
- [339] arXiv:2403.17342 [pdf, other]
-
Title: The Solution for the ICCV 2023 1st Scientific Figure Captioning Challenge
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [340] arXiv:2403.17334 [pdf, other]
-
Title: OVER-NAV: Elevating Iterative Vision-and-Language Navigation with Open-Vocabulary Detection and StructurEd Representation
Comments: Accepted by CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [341] arXiv:2403.17330 [pdf, other]
-
Title: Staircase Localization for Autonomous Exploration in Urban Environments
Comments: 9 pages, 10 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [342] arXiv:2403.17301 [pdf, other]
-
Title: Physical 3D Adversarial Attacks against Monocular Depth Estimation in Autonomous Driving
Comments: Accepted by CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
- [343] arXiv:2403.17237 [pdf, other]
-
Title: DreamPolisher: Towards High-Quality Text-to-3D Generation via Geometric Diffusion
Comments: Project webpage: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
- [344] arXiv:2403.17223 [pdf, ps, other]
-
Title: Co-Occurring of Object Detection and Identification towards unlabeled object discovery
Comments: 6 pages, 2 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [345] arXiv:2403.17217 [pdf, other]
-
Title: DiffusionAct: Controllable Diffusion Autoencoder for One-shot Face Reenactment
Authors: Stella Bounareli, Christos Tzelepis, Vasileios Argyriou, Ioannis Patras, Georgios Tzimiropoulos
Comments: Project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
- [346] arXiv:2403.17213 [pdf, other]
-
Title: AnimateMe: 4D Facial Expressions via Diffusion Models
Authors: Dimitrios Gerogiannis, Foivos Paraperas Papantoniou, Rolandos Alexandros Potamias, Alexandros Lattas, Stylianos Moschoglou, Stylianos Ploumpis, Stefanos Zafeiriou
Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [347] arXiv:2403.17192 [pdf, ps, other]
-
Title: Strategies to Improve Real-World Applicability of Laparoscopic Anatomy Segmentation ModelsComments: 13 pages, 5 figures, 4 tablesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [348] arXiv:2403.17188 [pdf, other]
-
Title: LOTUS: Evasive and Resilient Backdoor Attacks through Sub-PartitioningAuthors: Siyuan Cheng, Guanhong Tao, Yingqi Liu, Guangyu Shen, Shengwei An, Shiwei Feng, Xiangzhe Xu, Kaiyuan Zhang, Shiqing Ma, Xiangyu ZhangComments: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR 2024)Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR)
- [349] arXiv:2403.17176 [pdf, other]
-
Title: Histogram Layers for Neural Engineered FeaturesComments: 11 pages, 7 figures, submitted for reviewSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [350] arXiv:2403.17175 [pdf, ps, other]
-
Title: Engagement Measurement Based on Facial Landmarks and Spatial-Temporal Graph Convolutional NetworksSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [351] arXiv:2403.17173 [pdf, other]
-
Title: Task2Box: Box Embeddings for Modeling Asymmetric Task RelationshipsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [352] arXiv:2403.17128 [pdf, other]
-
Title: Benchmarking Video Frame InterpolationComments: this http URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [353] arXiv:2403.17103 [pdf, other]
-
Title: Animal Avatars: Reconstructing Animatable 3D Animals from Casual VideosSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [354] arXiv:2403.17094 [pdf, other]
-
Title: SynFog: A Photo-realistic Synthetic Fog Dataset based on End-to-end Imaging Simulation for Advancing Real-World Defogging in Autonomous DrivingSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [355] arXiv:2403.17064 [pdf, other]
-
Title: Continuous, Subject-Specific Attribute Control in T2I Models by Identifying Semantic DirectionsAuthors: Stefan Andreas Baumann, Felix Krause, Michael Neumayr, Nick Stracke, Vincent Tao Hu, Björn OmmerComments: Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
- [356] arXiv:2403.17025 [pdf, other]
-
Title: Boosting Few-Shot Learning via Attentive Feature RegularizationComments: Accepted to AAAI 2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [357] arXiv:2403.17016 [pdf, other]
-
Title: HEAL-ViT: Vision Transformers on a spherical mesh for medium-range weather forecastingAuthors: Vivek RamavajjalaComments: 18 pages, 14 figures, preprintSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Atmospheric and Oceanic Physics (physics.ao-ph)
- [358] arXiv:2403.17014 [pdf, other]
-
Title: Contrastive Learning for Regression on Hyperspectral DataComments: Accepted in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [359] arXiv:2403.17013 [pdf, other]
-
Title: Temporal-Spatial Processing of Event Camera Data via Delay-Loop Reservoir Neural NetworkAuthors: Richard Lau, Anthony Tylan-Tyler, Lihan Yao, Rey de Castro Roberto, Robert Taylor, Isaiah JonesComments: 10 pages, 12 figures, Darpa Distribution Statement A. Approved for public release. Distribution UnlimitedSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [360] arXiv:2403.17933 (cross-list from cs.RO) [pdf, other]
-
Title: SLEDGE: Synthesizing Simulation Environments for Driving Agents with Generative ModelsSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [361] arXiv:2403.17916 (cross-list from cs.RO) [pdf, other]
-
Title: CMP: Cooperative Motion Prediction with Multi-Agent CommunicationSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multiagent Systems (cs.MA)
- [362] arXiv:2403.17905 (cross-list from eess.IV) [pdf, other]
-
Title: Scalable Non-Cartesian Magnetic Resonance Imaging with R2D2Comments: submitted to IEEE EUSIPCO 2024Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Signal Processing (eess.SP)
- [363] arXiv:2403.17902 (cross-list from eess.IV) [pdf, other]
-
Title: Serpent: Scalable and Efficient Image Restoration via Multi-scale Structured State Space ModelsComments: 7 pages, 5 figures, preliminary workshop submission of a comprehensive work to be released soonSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [364] arXiv:2403.17846 (cross-list from cs.RO) [pdf, other]
-
Title: Hierarchical Open-Vocabulary 3D Scene Graphs for Language-Grounded Robot NavigationComments: Code and video are available at this http URLSubjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [365] arXiv:2403.17808 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Annotated Biomedical Video Generation using Denoising Diffusion Probabilistic Models and Flow FieldsSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [366] arXiv:2403.17787 (cross-list from cs.AI) [pdf, other]
-
Title: Evaluating the Efficacy of Prompt-Engineered Large Multimodal Models Versus Fine-Tuned Vision Transformers in Image-Based Security ApplicationsSubjects: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
- [367] arXiv:2403.17770 (cross-list from eess.IV) [pdf, other]
-
Title: CT Synthesis with Conditional Diffusion Models for Abdominal Lymph Node SegmentationAuthors: Yongrui Yu, Hanyu Chen, Zitian Zhang, Qiong Xiao, Wenhui Lei, Linrui Dai, Yu Fu, Hui Tan, Guan Wang, Peng Gao, Xiaofan ZhangSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [368] arXiv:2403.17755 (cross-list from cs.AI) [pdf, other]
-
Title: DataCook: Crafting Anti-Adversarial Examples for Healthcare Data Copyright ProtectionSubjects: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV)
- [369] arXiv:2403.17734 (cross-list from eess.IV) [pdf, other]
-
Title: Paired Diffusion: Generation of related, synthetic PET-CT-Segmentation scans using Linked Denoising Diffusion Probabilistic ModelsComments: to be published in IEEE International Symposium on Biomedical Imaging 2024Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [370] arXiv:2403.17719 (cross-list from eess.SP) [pdf, other]
-
Title: Resolution Limit of Single-Photon LiDARAuthors: Stanley H. Chan, Hashan K. Weerasooriya, Weijian Zhang, Pamela Abshire, Istvan Gyongy, Robert K. HendersonSubjects: Signal Processing (eess.SP); Computer Vision and Pattern Recognition (cs.CV)
- [371] arXiv:2403.17701 (cross-list from eess.IV) [pdf, other]
-
Title: Rotate to Scan: UNet-like Mamba with Triplet SSM Module for Medical Image SegmentationSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [372] arXiv:2403.17672 (cross-list from cs.GR) [pdf, other]
-
Title: Predicting Perceived Gloss: Do Weak Labels Suffice?Authors: Julia Guerrero-Viu, J. Daniel Subias, Ana Serrano, Katherine R. Storrs, Roland W. Fleming, Belen Masia, Diego GutierrezComments: Computer Graphics Forum (Eurographics 2024)Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
- [373] arXiv:2403.17639 (cross-list from eess.IV) [pdf, other]
- [374] arXiv:2403.17615 (cross-list from eess.IV) [pdf, other]
-
Title: Grad-CAMO: Learning Interpretable Single-Cell Morphological Profiles from 3D Cell Painting ImagesSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
- [375] arXiv:2403.17549 (cross-list from cs.AI) [pdf, ps, other]
-
Title: Practical Applications of Advanced Cloud Services and Generative AI Systems in Medical Image AnalysisSubjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
- [376] arXiv:2403.17545 (cross-list from cs.CL) [pdf, other]
-
Title: A Gaze-grounded Visual Question Answering Dataset for Clarifying Ambiguous Japanese QuestionsComments: LREC-COLING 2024Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
- [377] arXiv:2403.17520 (cross-list from cs.LG) [pdf, other]
-
Title: Boosting Adversarial Training via Fisher-Rao Norm-based RegularizationComments: This paper has been accepted to CVPR2024Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [378] arXiv:2403.17503 (cross-list from cs.LG) [pdf, other]
-
Title: DS-AL: A Dual-Stream Analytic Learning for Exemplar-Free Class-Incremental LearningComments: Accepted in AAAI 2024Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
- [379] arXiv:2403.17497 (cross-list from cs.CL) [pdf, other]
-
Title: Sharing the Cost of Success: A Game for Evaluating and Learning Collaborative Multi-Agent Instruction Giving and Following PoliciesComments: 9 pages, Accepted at LREC-COLING 2024Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
- [380] arXiv:2403.17460 (cross-list from eess.IV) [pdf, other]
-
Title: Building Bridges across Spatial and Temporal Resolutions: Reference-Based Super-Resolution via Change Priors and Conditional Diffusion ModelAuthors: Runmin Dong, Shuai Yuan, Bin Luo, Mengxuan Chen, Jinxiao Zhang, Lixian Zhang, Weijia Li, Juepeng Zheng, Haohuan FuComments: Accepted by CVPR2024Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [381] arXiv:2403.17447 (cross-list from cs.LG) [pdf, other]
-
Title: Chain of Compression: A Systematic Approach to Combinationally Compress Convolutional Neural NetworksComments: 10 pages, 15 figuresSubjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Neural and Evolutionary Computing (cs.NE)
- [382] arXiv:2403.17432 (cross-list from eess.IV) [pdf, other]
-
Title: Integrating Mamba Sequence Model and Hierarchical Upsampling Network for Accurate Semantic Segmentation of Multiple Sclerosis LegionAuthors: Kazi Shahriar Sanjid, Md. Tanzim Hossain, Md. Shakib Shahariar Junayed, Dr. Mohammad Monir UddinComments: 13 pagesSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [383] arXiv:2403.17420 (cross-list from cs.MM) [pdf, other]
-
Title: Learning to Visually Localize Sound Sources from Mixtures without Prior Source KnowledgeComments: Accepted at CVPR 2024Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
- [384] arXiv:2403.17332 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Labeling subtypes in a Parkinson's Cohort using Multifeatures in MRI -- Integrating Grey and White Matter InformationAuthors: Tanmayee Samantaray, Jitender Saini, Pramod Kumar Pal, Bithiah Grace Jaganathan, Vijaya V Saradhi, Gupta CNComments: 31 pages, 10 figures, 3 tablesSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Neurons and Cognition (q-bio.NC)
- [385] arXiv:2403.17327 (cross-list from cs.SD) [pdf, other]
-
Title: Accuracy enhancement method for speech emotion recognition from spectrogram using temporal frequency correlation and positional information learning through knowledge transferSubjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Audio and Speech Processing (eess.AS)
- [386] arXiv:2403.17293 (cross-list from eess.IV) [pdf, ps, other]
-
Title: Tracing and segmentation of molecular patterns in 3-dimensional cryo-et/em density maps through algorithmic image processing and deep learning-based techniquesAuthors: Salim SazzedSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Biological Physics (physics.bio-ph)
- [387] arXiv:2403.17255 (cross-list from eess.IV) [pdf, other]
-
Title: Decoding the visual attention of pathologists to reveal their level of expertiseAuthors: Souradeep Chakraborty, Dana Perez, Paul Friedman, Natallia Sheuka, Constantin Friedman, Oksana Yaskiv, Rajarsi Gupta, Gregory J. Zelinsky, Joel H. Saltz, Dimitris SamarasSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
- [388] arXiv:2403.17177 (cross-list from eess.IV) [pdf, other]
-
Title: Brain Stroke Segmentation Using Deep Learning Models: A Comparative StudyAuthors: Ahmed Soliman, Yousif Yousif, Ahmed Ibrahim, Yalda Zafari-Ghadim, Essam A. Rashed, Mohamed MabrokSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
- [389] arXiv:2403.17084 (cross-list from cs.RO) [pdf, other]
-
Title: A Comparative Analysis of Visual Odometry in Virtual and Real-World Railways EnvironmentsSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
- [390] arXiv:2403.17083 (cross-list from eess.IV) [pdf, other]
-
Title: A Study in Dataset Pruning for Image Super-ResolutionSubjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
- [391] arXiv:2403.17042 (cross-list from eess.IV) [pdf, other]
-
Title: Provably Robust Score-Based Diffusion Posterior Sampling for Plug-and-Play Image ReconstructionSubjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Signal Processing (eess.SP); Optimization and Control (math.OC); Machine Learning (stat.ML)
- [392] arXiv:2403.13358 (cross-list from cs.RO) [pdf, other]
-
Title: GeRM: A Generalist Robotic Model with Mixture-of-experts for Quadruped RobotSubjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
Tue, 26 Mar 2024 (showing first 13 of 246 entries)
- [393] arXiv:2403.17010 [pdf, other]
-
Title: Calib3D: Calibrating Model Preferences for Reliable 3D Scene UnderstandingComments: Preprint; 37 pages, 8 figures, 11 tables; Code at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Robotics (cs.RO)
- [394] arXiv:2403.17009 [pdf, other]
-
Title: Optimizing LiDAR Placements for Robust Driving Perception in Adverse ConditionsComments: Preprint; 40 pages, 11 figures, 15 tables; Code at this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
- [395] arXiv:2403.17008 [pdf, other]
-
Title: FlashFace: Human Image Personalization with High-fidelity Identity PreservationAuthors: Shilong Zhang, Lianghua Huang, Xi Chen, Yifei Zhang, Zhi-Fan Wu, Yutong Feng, Wei Wang, Yujun Shen, Yu Liu, Ping LuoComments: Project Page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [396] arXiv:2403.17007 [pdf, other]
-
Title: DreamLIP: Language-Image Pre-training with Long CaptionsSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [397] arXiv:2403.17006 [pdf, other]
-
Title: Invertible Diffusion Models for Compressed SensingSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [398] arXiv:2403.17005 [pdf, other]
-
Title: TRIP: Temporal Residual Learning with Image Noise Prior for Image-to-Video Diffusion ModelsComments: CVPR 2024; Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
- [399] arXiv:2403.17004 [pdf, other]
-
Title: SD-DiT: Unleashing the Power of Self-supervised Discrimination in Diffusion TransformerComments: CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
- [400] arXiv:2403.17001 [pdf, other]
-
Title: VP3D: Unleashing 2D Visual Prompt for Text-to-3D GenerationComments: CVPR 2024; Project page: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
- [401] arXiv:2403.17000 [pdf, other]
-
Title: Learning Spatial Adaptation and Temporal Coherence in Diffusion Models for Video Super-ResolutionComments: CVPR 2024Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
- [402] arXiv:2403.16999 [pdf, other]
-
Title: Visual CoT: Unleashing Chain-of-Thought Reasoning in Multi-Modal Language ModelsAuthors: Hao Shao, Shengju Qian, Han Xiao, Guanglu Song, Zhuofan Zong, Letian Wang, Yu Liu, Hongsheng LiComments: Code: this https URLSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [403] arXiv:2403.16998 [pdf, other]
-
Title: Understanding Long Videos in One Multimodal Language Model PassComments: 24 pagesSubjects: Computer Vision and Pattern Recognition (cs.CV)
- [404] arXiv:2403.16997 [pdf, other]
-
Title: Composed Video Retrieval via Enriched Context and Discriminative EmbeddingsAuthors: Omkar Thawakar, Muzammal Naseer, Rao Muhammad Anwer, Salman Khan, Michael Felsberg, Mubarak Shah, Fahad Shahbaz KhanComments: CVPR-2024Subjects: Computer Vision and Pattern Recognition (cs.CV)
- [405] arXiv:2403.16996 [pdf, other]
-
Title: DriveCoT: Integrating Chain-of-Thought Reasoning with End-to-End DrivingSubjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)