Computer Vision and Pattern Recognition

Authors and titles for recent submissions

Fri, 19 Apr 2024

[1] arXiv:2404.12391 [pdf, other]: Title: On the Content Bias in Fréchet Video Distance

Authors: Songwei Ge, Aniruddha Mahapatra, Gaurav Parmar, Jun-Yan Zhu, Jia-Bin Huang

Comments: CVPR 2024. Project webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)
[2] arXiv:2404.12390 [pdf, other]: Title: BLINK: Multimodal Large Language Models Can See but Not Perceive

Authors: Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna

Comments: Multimodal Benchmark, Project Url: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[3] arXiv:2404.12389 [pdf, other]: Title: Moving Object Segmentation: All You Need Is SAM (and Flow)

Authors: Junyu Xie, Charig Yang, Weidi Xie, Andrew Zisserman

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[4] arXiv:2404.12388 [pdf, other]: Title: VideoGigaGAN: Towards Detail-rich Video Super-Resolution

Authors: Yiran Xu, Taesung Park, Richard Zhang, Yang Zhou, Eli Shechtman, Feng Liu, Jia-Bin Huang, Difan Liu

Comments: project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[5] arXiv:2404.12386 [pdf, other]: Title: SOHES: Self-supervised Open-world Hierarchical Entity Segmentation

Authors: Shengcao Cao, Jiuxiang Gu, Jason Kuen, Hao Tan, Ruiyi Zhang, Handong Zhao, Ani Nenkova, Liang-Yan Gui, Tong Sun, Yu-Xiong Wang

Comments: ICLR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[6] arXiv:2404.12385 [pdf, other]: Title: MeshLRM: Large Reconstruction Model for High-Quality Mesh

Authors: Xinyue Wei, Kai Zhang, Sai Bi, Hao Tan, Fujun Luan, Valentin Deschaintre, Kalyan Sunkavalli, Hao Su, Zexiang Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[7] arXiv:2404.12383 [pdf, ps, other]: Title: G-HOP: Generative Hand-Object Prior for Interaction Reconstruction and Grasp Synthesis

Authors: Yufei Ye, Abhinav Gupta, Kris Kitani, Shubham Tulsiani

Comments: accepted to CVPR2024; project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[8] arXiv:2404.12382 [pdf, other]: Title: Lazy Diffusion Transformer for Interactive Image Editing

Authors: Yotam Nitzan, Zongze Wu, Richard Zhang, Eli Shechtman, Daniel Cohen-Or, Taesung Park, Michaël Gharbi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[9] arXiv:2404.12379 [pdf, other]: Title: Dynamic Gaussians Mesh: Consistent Mesh Reconstruction from Monocular Videos

Authors: Isabella Liu, Hao Su, Xiaolong Wang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[10] arXiv:2404.12378 [pdf, other]: Title: 6Img-to-3D: Few-Image Large-Scale Outdoor Driving Scene Reconstruction

Authors: Théo Gieruc, Marius Kästingschäfer, Sebastian Bernhard, Mathieu Salzmann

Comments: Joint first authorship. Project page: this https URL Code this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[11] arXiv:2404.12372 [pdf, other]: Title: MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale

Authors: Xiaotang Gai, Chenyi Zhou, Jiaxiang Liu, Yang Feng, Jian Wu, Zuozhu Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[12] arXiv:2404.12368 [pdf, other]: Title: Gradient-Regularized Out-of-Distribution Detection

Authors: Sina Sharifi, Taha Entesari, Bardia Safaei, Vishal M. Patel, Mahyar Fazlyab

Comments: Under review for the 18th European Conference on Computer Vision (ECCV) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[13] arXiv:2404.12359 [pdf, other]: Title: Inverse Neural Rendering for Explainable Multi-Object Tracking

Authors: Julian Ost, Tanushree Banerjee, Mario Bijelic, Felix Heide

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Robotics (cs.RO)
[14] arXiv:2404.12353 [pdf, other]: Title: V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning

Authors: Hang Hua, Yunlong Tang, Chenliang Xu, Jiebo Luo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[15] arXiv:2404.12352 [pdf, other]: Title: Point-In-Context: Understanding Point Cloud via In-Context Learning

Authors: Mengyuan Liu, Zhongbin Fang, Xia Li, Joachim M. Buhmann, Xiangtai Li, Chen Change Loy

Comments: Project page: this https URL. arXiv admin note: text overlap with arXiv:2306.08659

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[16] arXiv:2404.12347 [pdf, other]: Title: AniClipart: Clipart Animation with Text-to-Video Priors

Authors: Ronghuan Wu, Wanchao Su, Kede Ma, Jing Liao

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[17] arXiv:2404.12333 [pdf, other]: Title: Customizing Text-to-Image Diffusion with Camera Viewpoint Control

Authors: Nupur Kumari, Grace Su, Richard Zhang, Taesung Park, Eli Shechtman, Jun-Yan Zhu

Comments: project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[18] arXiv:2404.12330 [pdf, other]: Title: A Perspective on Deep Vision Performance with Standard Image and Video Codecs

Authors: Christoph Reich, Oliver Hahn, Daniel Cremers, Stefan Roth, Biplob Debnath

Comments: Accepted at CVPR 2024 Workshop on AI for Streaming (AIS)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[19] arXiv:2404.12322 [pdf, other]: Title: Generalizable Face Landmarking Guided by Conditional Face Warping

Authors: Jiayi Liang, Haotian Liu, Hongteng Xu, Dixin Luo

Comments: Accepted in CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[20] arXiv:2404.12309 [pdf, other]: Title: iRAG: An Incremental Retrieval Augmented Generation System for Videos

Authors: Md Adnan Arefeen, Biplob Debnath, Md Yusuf Sarwar Uddin, Srimat Chakradhar

Subjects: Computer Vision and Pattern Recognition (cs.CV); Information Retrieval (cs.IR); Machine Learning (cs.LG)
[21] arXiv:2404.12295 [pdf, other]: Title: When Medical Imaging Met Self-Attention: A Love Story That Didn't Quite Work Out

Authors: Tristan Piater, Niklas Penzel, Gideon Stein, Joachim Denzler

Comments: 10 pages, 2 figures, 5 tables, presented at VISAPP 2024

Journal-ref: Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP (2024), ISBN 978-989-758-679-8, ISSN 2184-4321, SciTePress, pages 149-158

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[22] arXiv:2404.12292 [pdf, other]: Title: Reducing Bias in Pre-trained Models by Tuning while Penalizing Change

Authors: Niklas Penzel, Gideon Stein, Joachim Denzler

Comments: 12 pages, 12 figures, presented at VISAPP 2024

Journal-ref: Proceedings of the 19th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications - Volume 2: VISAPP (2024), ISBN 978-989-758-679-8, ISSN 2184-4321, SciTePress, pages 90-101

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[23] arXiv:2404.12285 [pdf, other]: Title: Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery

Authors: Yona Falinie A. Gaus, Neelanjan Bhowmik, Brian K. S. Isaac-Medina, Toby P. Breckon

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[24] arXiv:2404.12260 [pdf, other]: Title: Alleviating Catastrophic Forgetting in Facial Expression Recognition with Emotion-Centered Models

Authors: Israel A. Laurensi, Alceu de Souza Britto Jr., Jean Paul Barddal, Alessandro Lameiras Koerich

Comments: 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[25] arXiv:2404.12258 [pdf, ps, other]: Title: DeepLocalization: Using change point detection for Temporal Action Localization

Authors: Mohammed Shaiqur Rahman, Ibne Farabi Shihab, Lynna Chu, Anuj Sharma

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[26] arXiv:2404.12257 [pdf, other]: Title: Food Portion Estimation via 3D Object Scaling

Authors: Gautham Vinod, Jiangpeng He, Zeman Shao, Fengqing Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[27] arXiv:2404.12252 [pdf, other]: Title: Deep Gaussian mixture model for unsupervised image segmentation

Authors: Matthias Schwab, Agnes Mayr, Markus Haltmeier

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[28] arXiv:2404.12246 [pdf, other]: Title: Blind Localization and Clustering of Anomalies in Textures

Authors: Andrei-Timotei Ardelean, Tim Weyrich

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[29] arXiv:2404.12235 [pdf, other]: Title: Beyond Average: Individualized Visual Scanpath Prediction

Authors: Xianyu Chen, Ming Jiang, Qi Zhao

Comments: To appear in CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[30] arXiv:2404.12216 [pdf, other]: Title: ProTA: Probabilistic Token Aggregation for Text-Video Retrieval

Authors: Han Fang, Xianghao Zang, Chao Ban, Zerun Feng, Lanxiang Zhou, Zhongjiang He, Yongxiang Li, Hao Sun

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[31] arXiv:2404.12210 [pdf, other]: Title: Observation, Analysis, and Solution: Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training

Authors: Jin Gao, Shubo Lin, Shaoru Wang, Yutong Kou, Zeming Li, Liang Li, Congxuan Zhang, Xiaoqin Zhang, Yizheng Wang, Weiming Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[32] arXiv:2404.12209 [pdf, other]: Title: Partial-to-Partial Shape Matching with Geometric Consistency

Authors: Viktoria Ehm, Maolin Gao, Paul Roetzer, Marvin Eisenberger, Daniel Cremers, Florian Bernard

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[33] arXiv:2404.12203 [pdf, other]: Title: GraFIQs: Face Image Quality Assessment Using Gradient Magnitudes

Authors: Jan Niklas Kolf, Naser Damer, Fadi Boutros

Comments: Accepted at CVPR Workshop 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[34] arXiv:2404.12192 [pdf, other]: Title: Aligning Actions and Walking to LLM-Generated Textual Descriptions

Authors: Radu Chivereanu, Adrian Cosma, Andy Catruna, Razvan Rughinis, Emilian Radoi

Comments: Accepted at 2nd Workshop on Learning with Few or without Annotated Face, Body and Gesture Data

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[35] arXiv:2404.12183 [pdf, other]: Title: Gait Recognition from Highly Compressed Videos

Authors: Andrei Niculae, Andy Catruna, Adrian Cosma, Daniel Rosner, Emilian Radoi

Comments: Accepted at 2nd Workshop on Learning with Few or without Annotated Face, Body and Gesture Data

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[36] arXiv:2404.12172 [pdf, other]: Title: How to Benchmark Vision Foundation Models for Semantic Segmentation?

Authors: Tommie Kerssies, Daan de Geus, Gijs Dubbelman

Comments: CVPR 2024 Workshop Proceedings for the Second Workshop on Foundation Models

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[37] arXiv:2404.12168 [pdf, other]: Title: Real-World Efficient Blind Motion Deblurring via Blur Pixel Discretization

Authors: Insoo Kim, Jae Seok Choi, Geonseok Seo, Kinam Kwon, Jinwoo Shin, Hyong-Euk Lee

Comments: CVPR2024 Camera-Ready

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[38] arXiv:2404.12154 [pdf, other]: Title: StyleBooth: Image Style Editing with Multimodal Instruction

Authors: Zhen Han, Chaojie Mao, Zeyinzi Jiang, Yulin Pan, Jingfeng Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[39] arXiv:2404.12144 [pdf, other]: Title: Mushroom Segmentation and 3D Pose Estimation from Point Clouds using Fully Convolutional Geometric Features and Implicit Pose Encoding

Authors: George Retsinas, Niki Efthymiou, Petros Maragos

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[40] arXiv:2404.12142 [pdf, other]: Title: SDIP: Self-Reinforcement Deep Image Prior Framework for Image Processing

Authors: Ziyu Shu, Zhixin Pan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[41] arXiv:2404.12139 [pdf, other]: Title: Omniview-Tuning: Boosting Viewpoint Invariance of Vision-Language Pre-training Models

Authors: Shouwei Ruan, Yinpeng Dong, Hanqing Liu, Yao Huang, Hang Su, Xingxing Wei

Comments: 20 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[42] arXiv:2404.12120 [pdf, other]: Title: Fortify the Guardian, Not the Treasure: Resilient Adversarial Detectors

Authors: Raz Lapid, Almog Dubin, Moshe Sipper

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[43] arXiv:2404.12104 [pdf, other]: Title: Ethical-Lens: Curbing Malicious Usages of Open-Source Text-to-Image Models

Authors: Yuzhu Cai, Sheng Yin, Yuxi Wei, Chenxin Xu, Weibo Mao, Felix Juefei-Xu, Siheng Chen, Yanfeng Wang

Comments: 42 pages, 17 figures, 29 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Machine Learning (cs.LG)
[44] arXiv:2404.12103 [pdf, other]: Title: S3R-Net: A Single-Stage Approach to Self-Supervised Shadow Removal

Authors: Nikolina Kubiak, Armin Mustafa, Graeme Phillipson, Stephen Jolly, Simon Hadfield

Comments: NTIRE workshop @ CVPR 2024. Code & models available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[45] arXiv:2404.12091 [pdf, other]: Title: Harnessing Joint Rain-/Detail-aware Representations to Eliminate Intricate Rains

Authors: Wu Ran, Peirong Ma, Zhiquan He, Hao Ren, Hong Lu

Comments: 21 pages, 14 figures

Journal-ref: International Conference on Learning Representations 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[46] arXiv:2404.12083 [pdf, other]: Title: MambaPupil: Bidirectional Selective Recurrent model for Event-based Eye tracking

Authors: Zhong Wang, Zengyu Wan, Han Han, Bohao Liao, Yuliang Wu, Wei Zhai, Yang Cao, Zheng-jun Zha

Comments: Accepted by CVPR 2024 Workshop (AIS: Vision, Graphics and AI for Streaming), top solution of challenge Event-based Eye Tracking, see this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[47] arXiv:2404.12081 [pdf, other]: Title: MaskCD: A Remote Sensing Change Detection Network Based on Mask Classification

Authors: Weikang Yu, Xiaokang Zhang, Samiran Das, Xiao Xiang Zhu, Pedram Ghamisi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[48] arXiv:2404.12064 [pdf, other]: Title: PureForest: A Large-scale Aerial Lidar and Aerial Imagery Dataset for Tree Species Classification in Monospecific Forests

Authors: Charles Gaydon, Floryne Roche

Comments: 14 pages | 5 figures | Dataset is available at this http URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[49] arXiv:2404.12055 [pdf, other]: Title: Improving the perception of visual fiducial markers in the field using Adaptive Active Exposure Control

Authors: Ziang Ren, Samuel Lensgraf, Alberto Quattrini Li

Comments: Paper accepted by ISER 2023

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[50] arXiv:2404.12037 [pdf, other]: Title: Data-free Knowledge Distillation for Fine-grained Visual Categorization

Authors: Renrong Shao, Wei Zhang, Jianhua Yin, Jun Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[51] arXiv:2404.12031 [pdf, other]: Title: MLS-Track: Multilevel Semantic Interaction in RMOT

Authors: Zeliang Ma, Song Yang, Zhe Cui, Zhicheng Zhao, Fei Su, Delong Liu, Jingyu Wang

Comments: 17 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[52] arXiv:2404.12024 [pdf, other]: Title: Meta-Auxiliary Learning for Micro-Expression Recognition

Authors: Jingyao Wang, Yunhan Tian, Yuxuan Yang, Xiaoxin Chen, Changwen Zheng, Wenwen Qiang

Comments: 10 pages, 7 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[53] arXiv:2404.12020 [pdf, other]: Title: Look, Listen, and Answer: Overcoming Biases for Audio-Visual Question Answering

Authors: Jie Ma, Min Hu, Pinghui Wang, Wangchun Sun, Lingyun Song, Hongbin Pei, Jun Liu, Youtian Du

Comments: 16 pages, 9 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[54] arXiv:2404.12015 [pdf, other]: Title: What does CLIP know about peeling a banana?

Authors: Claudia Cuttano, Gabriele Rosi, Gabriele Trivigno, Giuseppe Averta

Comments: Accepted to MAR Workshop at CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[55] arXiv:2404.11998 [pdf, other]: Title: Curriculum Point Prompting for Weakly-Supervised Referring Image Segmentation

Authors: Qiyuan Dai, Sibei Yang

Comments: Accepted to CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[56] arXiv:2404.11987 [pdf, other]: Title: MultiPhys: Multi-Person Physics-aware 3D Motion Estimation

Authors: Nicolas Ugrinovic, Boxiao Pan, Georgios Pavlakos, Despoina Paschalidou, Bokui Shen, Jordi Sanchez-Riera, Francesc Moreno-Noguer, Leonidas Guibas

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[57] arXiv:2404.11981 [pdf, other]: Title: Tendency-driven Mutual Exclusivity for Weakly Supervised Incremental Semantic Segmentation

Authors: Chongjie Si, Xuehui Wang, Xiaokang Yang, Wei Shen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[58] arXiv:2404.11979 [pdf, other]: Title: MTGA: Multi-view Temporal Granularity aligned Aggregation for Event-based Lip-reading

Authors: Wenhao Zhang, Jun Wang, Yong Luo, Lei Yu, Wei Yu, Zheng He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[59] arXiv:2404.11958 [pdf, other]: Title: Not All Voxels Are Equal: Hardness-Aware Semantic Scene Completion with Self-Distillation

Authors: Song Wang, Jiawei Yu, Wentong Li, Wenyu Liu, Xiaolu Liu, Junbo Chen, Jianke Zhu

Comments: Accepted by CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[60] arXiv:2404.11957 [pdf, other]: Title: The devil is in the object boundary: towards annotation-free instance segmentation using Foundation Models

Authors: Cheng Shi, Sibei Yang

Comments: ICLR2024, Code is released at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[61] arXiv:2404.11949 [pdf, other]: Title: Sketch-guided Image Inpainting with Partial Discrete Diffusion Process

Authors: Nakul Sharma, Aditay Tripathi, Anirban Chakraborty, Anand Mishra

Comments: Accepted to NTIRE Workshop @ CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[62] arXiv:2404.11903 [pdf, other]: Title: Simultaneous Detection and Interaction Reasoning for Object-Centric Action Recognition

Authors: Xunsong Li, Pengzhan Sun, Yangcen Liu, Lixin Duan, Wen Li

Comments: 12 pages, 5 figures, submitted to IEEE Transactions on Multimedia

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[63] arXiv:2404.11897 [pdf, other]: Title: AG-NeRF: Attention-guided Neural Radiance Fields for Multi-height Large-scale Outdoor Scene Rendering

Authors: Jingfeng Guo, Xiaohan Zhang, Baozhu Zhao, Qi Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[64] arXiv:2404.11895 [pdf, other]: Title: FreeDiff: Progressive Frequency Truncation for Image Editing with Diffusion Models

Authors: Wei Wu, Qingnan Fan, Shuai Qin, Hong Gu, Ruoyu Zhao, Antoni B. Chan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[65] arXiv:2404.11884 [pdf, other]: Title: Seeing Motion at Nighttime with an Event Camera

Authors: Haoyue Liu, Shihan Peng, Lin Zhu, Yi Chang, Hanyu Zhou, Luxin Yan

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[66] arXiv:2404.11871 [pdf, other]: Title: Group-On: Boosting One-Shot Segmentation with Supportive Query

Authors: Hanjing Zhou, Mingze Yin, JinTai Chen, Danny Chen, Jian Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[67] arXiv:2404.11868 [pdf, other]: Title: OPTiML: Dense Semantic Invariance Using Optimal Transport for Self-Supervised Medical Image Representation

Authors: Azad Singh, Vandan Gorade, Deepak Mishra

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[68] arXiv:2404.11865 [pdf, other]: Title: From Image to Video, what do we need in multimodal LLMs?

Authors: Suyuan Huang, Haoxin Zhang, Yan Gao, Yao Hu, Zengchang Qin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[69] arXiv:2404.11864 [pdf, other]: Title: Progressive Multi-modal Conditional Prompt Tuning

Authors: Xiaoyu Qiu, Hao Feng, Yuechen Wang, Wengang Zhou, Houqiang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[70] arXiv:2404.11848 [pdf, other]: Title: Partial Large Kernel CNNs for Efficient Super-Resolution

Authors: Dongheon Lee, Seokju Yun, Youngmin Ro

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[71] arXiv:2404.11824 [pdf, other]: Title: TextCenGen: Attention-Guided Text-Centric Background Adaptation for Text-to-Image Generation

Authors: Tianyi Liang, Jiangqi Liu, Sicheng Song, Shiqi Jiang, Yifei Huang, Changbo Wang, Chenhui Li

Comments: 7 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[72] arXiv:2404.11819 [pdf, other]: Title: Utilizing Adversarial Examples for Bias Mitigation and Accuracy Enhancement

Authors: Pushkar Shukla, Dhruv Srikanth, Lee Cohen, Matthew Turk

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[73] arXiv:2404.11812 [pdf, other]: Title: Cross-model Mutual Learning for Exemplar-based Medical Image Segmentation

Authors: Qing En, Yuhong Guo

Comments: AISTATS 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[74] arXiv:2404.11803 [pdf, other]: Title: TempBEV: Improving Learned BEV Encoders with Combined Image and BEV Space Temporal Aggregation

Authors: Thomas Monninger, Vandana Dokkadi, Md Zafar Anwar, Steffen Staab

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Robotics (cs.RO)
[75] arXiv:2404.11798 [pdf, other]: Title: Establishing a Baseline for Gaze-driven Authentication Performance in VR: A Breadth-First Investigation on a Very Large Dataset

Authors: Dillon Lohr, Michael J. Proulx, Oleg Komogortsev

Comments: 28 pages, 18 figures, 5 tables, includes supplementary material

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[76] arXiv:2404.11797 [pdf, other]: Title: When are Foundation Models Effective? Understanding the Suitability for Pixel-Level Classification Using Multispectral Imagery

Authors: Yiqun Xie, Zhihao Wang, Weiye Chen, Zhili Li, Xiaowei Jia, Yanhua Li, Ruichen Wang, Kangyang Chai, Ruohan Li, Sergii Skakun

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[77] arXiv:2404.11778 [pdf, other]: Title: CU-Mamba: Selective State Space Models with Channel Learning for Image Restoration

Authors: Rui Deng, Tianpei Gu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[78] arXiv:2404.11770 [pdf, other]: Title: Event-Based Eye Tracking. AIS 2024 Challenge Survey

Authors: Zuowen Wang, Chang Gao, Zongwei Wu, Marcos V. Conde, Radu Timofte, Shih-Chii Liu, Qinyu Chen, Zheng-jun Zha, Wei Zhai, Han Han, Bohao Liao, Yuliang Wu, Zengyu Wan, Zhong Wang, Yang Cao, Ganchao Tan, Jinze Chen, Yan Ru Pei, Sasskia Brüers, Sébastien Crouzet, Douglas McLelland, Oliver Coenen, Baoheng Zhang, Yizhao Gao, Jingyuan Li, Hayden Kwok-Hay So, Philippe Bich, Chiara Boretti, Luciano Prono, Mircea Lică, David Dinucu-Jianu, Cătălin Grîu, Xiaopeng Lin, Hongwei Ren, Bojun Cheng, Xinan Zhang, Valentin Vial, Anthony Yezzi, James Tsai

Comments: Qinyu Chen is the corresponding author

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[79] arXiv:2404.11764 [pdf, other]: Title: Multimodal 3D Object Detection on Unseen Domains

Authors: Deepti Hegde, Suhas Lohit, Kuan-Chuan Peng, Michael J. Jones, Vishal M. Patel

Comments: technical report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[80] arXiv:2404.11762 [pdf, other]: Title: IrrNet: Advancing Irrigation Mapping with Incremental Patch Size Training on Remote Sensing Imagery

Authors: Oishee Bintey Hoque, Samarth Swarup, Abhijin Adiga, Sayjro Kossi Nouwakpo, Madhav Marathe

Comments: Full version of the paper will be appearing in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) Workshops, 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[81] arXiv:2404.11737 [pdf, other]: Title: Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection

Authors: Deepti Hegde, Suhas Lohit, Kuan-Chuan Peng, Michael J. Jones, Vishal M. Patel

Comments: technical report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[82] arXiv:2404.11732 [pdf, other]: Title: Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach

Authors: Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal, James J. Little

Comments: Accepted at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[83] arXiv:2404.11727 [pdf, ps, other]: Title: Deep Learning for Video-Based Assessment of Endotracheal Intubation Skills

Authors: Jean-Paul Ainam, Erim Yanik, Rahul Rahul, Taylor Kunkes, Lora Cavuoto, Brian Clemency, Kaori Tanaka, Matthew Hackett, Jack Norfleet, Suvranu De

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[84] arXiv:2404.11669 [pdf, other]: Title: Factorized Motion Fields for Fast Sparse Input Dynamic View Synthesis

Authors: Nagabhushan Somraj, Kapil Choudhary, Sai Harsha Mupparaju, Rajiv Soundararajan

Comments: Accepted at SIGGRAPH 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[85] arXiv:2404.11630 [pdf, other]: Title: SNP: Structured Neuron-level Pruning to Preserve Attention Scores

Authors: Kyunghwan Shim, Jaewoong Yun, Shinkook Choi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[86] arXiv:2404.12387 (cross-list from cs.CL) [pdf, other]: Title: Reka Core, Flash, and Edge: A Series of Powerful Multimodal Language Models

Authors: Aitor Ormazabal, Che Zheng, Cyprien de Masson d'Autume, Dani Yogatama, Deyu Fu, Donovan Ong, Eric Chen, Eugenie Lamprecht, Hai Pham, Isaac Ong, Kaloyan Aleksiev, Lei Li, Matthew Henderson, Max Bain, Mikel Artetxe, Nishant Relan, Piotr Padlewski, Qi Liu, Ren Chen, Samuel Phua, Yazheng Yang, Yi Tay, Yuqi Wang, Zhongkai Zhu, Zhihui Xie

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[87] arXiv:2404.12341 (cross-list from cs.LG) [pdf, other]: Title: Measuring Feature Dependency of Neural Networks by Collapsing Feature Dimensions in the Data Manifold

Authors: Yinzhu Jin, Matthew B. Dwyer, P. Thomas Fletcher

Comments: Accepted and will be published in International Symposium on Biomedical Imaging (ISBI) 2024

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[88] arXiv:2404.12339 (cross-list from cs.RO) [pdf, other]: Title: SPOT: Point Cloud Based Stereo Visual Place Recognition for Similar and Opposing Viewpoints

Authors: Spencer Carmichael, Rahul Agrawal, Ram Vasudevan, Katherine A. Skinner

Comments: Accepted to ICRA 2024, project website: this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[89] arXiv:2404.12251 (cross-list from cs.LG) [pdf, other]: Title: Dynamic Modality and View Selection for Multimodal Emotion Recognition with Missing Modalities

Authors: Luciana Trinkaus Menon, Luiz Carlos Ribeiro Neduziak, Jean Paul Barddal, Alessandro Lameiras Koerich, Alceu de Souza Britto Jr

Comments: 15 pages

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[90] arXiv:2404.12163 (cross-list from eess.IV) [pdf, other]: Title: Unsupervised Microscopy Video Denoising

Authors: Mary Aiyetigbo, Alexander Korte, Ethan Anderson, Reda Chalhoub, Peter Kalivas, Feng Luo, Nianyi Li

Comments: Accepted at CVPRW 2024

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[91] arXiv:2404.12130 (cross-list from cs.LG) [pdf, other]: Title: One-Shot Sequential Federated Learning for Non-IID Data by Enhancing Local Model Diversity

Authors: Naibo Wang, Yuchen Deng, Wenjie Feng, Shichen Fan, Jianwei Yin, See-Kiong Ng

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC)
[92] arXiv:2404.12062 (cross-list from cs.SD) [pdf, other]: Title: MIDGET: Music Conditioned 3D Dance Generation

Authors: Jinwu Wang, Wei Mao, Miaomiao Liu

Comments: 12 pages, 6 figures. Published in AI 2023: Advances in Artificial Intelligence

Journal-ref: In Australasian Joint Conference on Artificial Intelligence (pp. 277-288). Singapore: Springer Nature Singapore 2023

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Audio and Speech Processing (eess.AS)
[93] arXiv:2404.11974 (cross-list from eess.IV) [pdf, other]: Title: Device (In)Dependence of Deep Learning-based Image Age Approximation

Authors: Robert Jöchl, Andreas Uhl

Comments: This work was accepted and presented in: 2022 ICPR-Workshop on Artificial Intelligence for Multimedia Forensics and Disinformation Detection. Montreal, Quebec, Canada. However, due to a technical issue on the publishing companies' side, the work does not appear in the workshop proceedings

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[94] arXiv:2404.11962 (cross-list from cs.AI) [pdf, other]: Title: ©Plug-in Authorization for Human Content Copyright Protection in Text-to-Image Model

Authors: Chao Zhou, Huishuai Zhang, Jiang Bian, Weiming Zhang, Nenghai Yu

Comments: 20 pages, 6 figures

Subjects: Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[95] arXiv:2404.11947 (cross-list from cs.LG) [pdf, other]: Title: VCC-INFUSE: Towards Accurate and Efficient Selection of Unlabeled Examples in Semi-supervised Learning

Authors: Shijie Fang, Qianhan Feng, Tong Lin

Comments: Accepted paper of IJCAI 2024. Shijie Fang and Qianhan Feng contributed equally to this paper

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[96] arXiv:2404.11946 (cross-list from cs.RO) [pdf, other]: Title: S4TP: Social-Suitable and Safety-Sensitive Trajectory Planning for Autonomous Vehicles

Authors: Xiao Wang, Ke Tang, Xingyuan Dai, Jintao Xu, Quancheng Du, Rui Ai, Yuxiao Wang, Weihao Gu

Comments: 12 pages, 4 figures, published in IEEE Transactions on Intelligent Vehicles

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[97] arXiv:2404.11936 (cross-list from cs.LG) [pdf, other]: Title: LD-Pruner: Efficient Pruning of Latent Diffusion Models using Task-Agnostic Insights

Authors: Thibault Castells, Hyoung-Kyu Song, Bo-Kyeong Kim, Shinkook Choi

Comments: 8 pages, accepted to CVPR24 First Workshop on Efficient and On-Device Generation (EDGE)

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[98] arXiv:2404.11929 (cross-list from eess.IV) [pdf, other]: Title: A Symmetric Regressor for MRI-Based Assessment of Striatal Dopamine Transporter Uptake in Parkinson's Disease

Authors: Walid Abdullah Al, Il Dong Yun, Yun Jung Bae

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[99] arXiv:2404.11925 (cross-list from cs.LG) [pdf, other]: Title: EdgeFusion: On-Device Text-to-Image Generation

Authors: Thibault Castells, Hyoung-Kyu Song, Tairen Piao, Shinkook Choi, Bo-Kyeong Kim, Hanyoung Yim, Changgwun Lee, Jae Gon Kim, Tae-Ho Kim

Comments: 4 pages, accepted to CVPR24 First Workshop on Efficient and On-Device Generation (EDGE)

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[100] arXiv:2404.11889 (cross-list from eess.IV) [pdf, other]: Title: Multi-view X-ray Image Synthesis with Multiple Domain Disentanglement from CT Scans

Authors: Lixing Tan, Shuang Song, Kangneng Zhou, Chengbo Duan, Lanying Wang, Huayang Ren, Linlin Liu, Wei Zhang, Ruoxiu Xiao

Comments: 13 pages, 10 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[101] arXiv:2404.11843 (cross-list from eess.IV) [pdf, other]: Title: Computer-Aided Diagnosis of Thoracic Diseases in Chest X-rays using hybrid CNN-Transformer Architecture

Authors: Sonit Singh

Comments: 24 pages, 13 Figures, 13 Tables. arXiv admin note: text overlap with arXiv:1904.09925 by other authors

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[102] arXiv:2404.11795 (cross-list from cs.LG) [pdf, other]: Title: Prompt-Driven Feature Diffusion for Open-World Semi-Supervised Learning

Authors: Marzi Heidari, Hanping Zhang, Yuhong Guo

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[103] arXiv:2404.11776 (cross-list from cs.LG) [pdf, ps, other]: Title: 3D object quality prediction for Metal Jet Printer with Multimodal thermal encoder

Authors: Rachel (Lei) Chen, Wenjia Zheng, Sandeep Jalui, Pavan Suri, Jun Zeng

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[104] arXiv:2404.11769 (cross-list from cs.LG) [pdf, other]: Title: QGen: On the Ability to Generalize in Quantization Aware Training

Authors: MohammadHossein AskariHemmat, Ahmadreza Jeddi, Reyhane Askari Hemmat, Ivan Lazarevich, Alexander Hoffman, Sudhakar Sah, Ehsan Saboori, Yvon Savaria, Jean-Pierre David

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[105] arXiv:2404.11741 (cross-list from physics.med-ph) [pdf, other]: Title: Diffusion Schrödinger Bridge Models for High-Quality MR-to-CT Synthesis for Head and Neck Proton Treatment Planning

Authors: Muheng Li, Xia Li, Sairos Safai, Damien Weber, Antony Lomax, Ye Zhang

Comments: International Conference on the use of Computers in Radiation therapy (ICCR)

Subjects: Medical Physics (physics.med-ph); Computer Vision and Pattern Recognition (cs.CV)
[106] arXiv:2404.11735 (cross-list from cs.LG) [pdf, other]: Title: Learning with 3D rotations, a hitchhiker's guide to SO(3)

Authors: A. René Geist, Jonas Frey, Mikel Zobro, Anna Levina, Georg Martius

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[107] arXiv:2404.11725 (cross-list from eess.IV) [pdf, ps, other]: Title: Postoperative glioblastoma segmentation: Development of a fully automated pipeline using deep convolutional neural networks and comparison with currently available models

Authors: Santiago Cepeda, Roberto Romero, Daniel Garcia-Perez, Guillermo Blasco, Luigi Tommaso Luppino, Samuel Kuttner, Ignacio Arrese, Ole Solheim, Live Eikenes, Anna Karlberg, Angel Perez-Nunez, Trinidad Escudero, Roberto Hornero, Rosario Sarabia

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[108] arXiv:2404.11683 (cross-list from cs.RO) [pdf, other]: Title: Unifying Scene Representation and Hand-Eye Calibration with 3D Foundation Models

Authors: Weiming Zhi, Haozhan Tang, Tianyi Zhang, Matthew Johnson-Roberson

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[109] arXiv:2404.11667 (cross-list from cs.LG) [pdf, other]: Title: Deep Dependency Networks and Advanced Inference Schemes for Multi-Label Classification

Authors: Shivvrat Arya, Yu Xiang, Vibhav Gogate

Comments: Will appear in AISTATS 2024. arXiv admin note: substantial text overlap with arXiv:2302.00633

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)

Thu, 18 Apr 2024

[110] arXiv:2404.11615 [pdf, other]: Title: Factorized Diffusion: Perceptual Illusions by Noise Decomposition

Authors: Daniel Geng, Inbum Park, Andrew Owens

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[111] arXiv:2404.11614 [pdf, other]: Title: Dynamic Typography: Bringing Text to Life via Video Diffusion Prior

Authors: Zichen Liu, Yihao Meng, Hao Ouyang, Yue Yu, Bolin Zhao, Daniel Cohen-Or, Huamin Qu

Comments: Our demo page is available at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[112] arXiv:2404.11613 [pdf, other]: Title: InFusion: Inpainting 3D Gaussians via Learning Depth Completion from Diffusion Prior

Authors: Zhiheng Liu, Hao Ouyang, Qiuyu Wang, Ka Leong Cheng, Jie Xiao, Kai Zhu, Nan Xue, Yu Liu, Yujun Shen, Yang Cao

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[113] arXiv:2404.11605 [pdf, other]: Title: VG4D: Vision-Language Model Goes 4D Video Recognition

Authors: Zhichao Deng, Xiangtai Li, Xia Li, Yunhai Tong, Shen Zhao, Mengyuan Liu

Comments: ICRA 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[114] arXiv:2404.11593 [pdf, other]: Title: IntrinsicAnything: Learning Diffusion Priors for Inverse Rendering Under Unknown Illumination

Authors: Xi Chen (1), Sida Peng (1), Dongchen Yang (1), Yuan Liu (2), Bowen Pan (3), Chengfei Lv (3), Xiaowei Zhou (1) ((1) Zhejiang University, (2) The University of Hong Kong, (3) Tao Technology Department, Alibaba Group)

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[115] arXiv:2404.11590 [pdf, other]: Title: A Subspace-Constrained Tyler's Estimator and its Applications to Structure from Motion

Authors: Feng Yu, Teng Zhang, Gilad Lerman

Comments: 23 pages, accepted by CVPR 24

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[116] arXiv:2404.11589 [pdf, other]: Title: Prompt Optimizer of Text-to-Image Diffusion Models for Abstract Concept Understanding

Authors: Zezhong Fan, Xiaohan Li, Chenhao Fang, Topojoy Biswas, Kaushiki Nag, Jianpeng Xu, Kannan Achan

Comments: WWW 2024 Companion

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[117] arXiv:2404.11576 [pdf, other]: Title: State-space Decomposition Model for Video Prediction Considering Long-term Motion Trend

Authors: Fei Cui, Jiaojiao Fang, Xiaojiang Wu, Zelong Lai, Mengke Yang, Menghan Jia, Guizhong Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[118] arXiv:2404.11569 [pdf, other]: Title: Simple Image Signal Processing using Global Context Guidance

Authors: Omar Elezabi, Marcos V. Conde, Radu Timofte

Comments: Preprint under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[119] arXiv:2404.11565 [pdf, other]: Title: MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation

Authors: Kuan-Chieh (Jackson) Wang, Daniil Ostashev, Yuwei Fang, Sergey Tulyakov, Kfir Aberman

Comments: Project Website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[120] arXiv:2404.11554 [pdf, other]: Title: Predicting Long-horizon Futures by Conditioning on Geometry and Time

Authors: Tarasha Khurana, Deva Ramanan

Comments: Project page: this http URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[121] arXiv:2404.11537 [pdf, other]: Title: SSDiff: Spatial-spectral Integrated Diffusion Model for Remote Sensing Pansharpening

Authors: Yu Zhong, Xiao Wu, Liang-Jian Deng, Zihan Cao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[122] arXiv:2404.11525 [pdf, other]: Title: JointViT: Modeling Oxygen Saturation Levels with Joint Supervision on Long-Tailed OCTA

Authors: Zeyu Zhang, Xuyin Qi, Mingxi Chen, Guangxi Li, Ryan Pham, Ayub Qassim, Ella Berry, Zhibin Liao, Owen Siggs, Robert Mclaughlin, Jamie Craig, Minh-Son To

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[123] arXiv:2404.11492 [pdf, other]: Title: arcjetCV: an open-source software to analyze material ablation

Authors: Alexandre Quintart, Magnus Haw, Federico Semeraro

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[124] arXiv:2404.11488 [pdf, other]: Title: Multi-resolution Rescored ByteTrack for Video Object Detection on Ultra-low-power Embedded Systems

Authors: Luca Bompani, Manuele Rusci, Daniele Palossi, Francesco Conti, Luca Benini

Comments: 9 pages, 3 figures. Accepted for publication at the Embedded Vision Workshop of the Computer Vision and Pattern Recognition conference, Seattle, 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[125] arXiv:2404.11475 [pdf, other]: Title: AdaIR: Exploiting Underlying Similarities of Image Restoration Tasks with Adapters

Authors: Hao-Wei Chen, Yu-Syuan Xu, Kelvin C.K. Chan, Hsien-Kai Kuo, Chun-Yi Lee, Ming-Hsuan Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[126] arXiv:2404.11474 [pdf, other]: Title: Towards Highly Realistic Artistic Style Transfer via Stable Diffusion with Step-aware and Layer-aware Prompt

Authors: Zhanjie Zhang, Quanwei Zhang, Huaizhong Lin, Wei Xing, Juncheng Mo, Shuaicheng Huang, Jinheng Xie, Guangyuan Li, Junsheng Luan, Lei Zhao, Dalong Zhang, Lixia Chen

Comments: Accepted by IJCAI2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2404.11461 [pdf, other]: Title: Using Game Engines and Machine Learning to Create Synthetic Satellite Imagery for a Tabletop Verification Exercise

Authors: Johannes Hoster, Sara Al-Sayed, Felix Biessmann, Alexander Glaser, Kristian Hildebrand, Igor Moric, Tuong Vy Nguyen

Comments: Annual Meeting of the Institute of Nuclear Materials Management (INMM), Vienna

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[128] arXiv:2404.11429 [pdf, other]: Title: CarcassFormer: An End-to-end Transformer-based Framework for Simultaneous Localization, Segmentation and Classification of Poultry Carcass Defect

Authors: Minh Tran, Sang Truong, Arthur F. A. Fernandes, Michael T. Kidd, Ngan Le

Comments: Accepted to Poultry Science Journal

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[129] arXiv:2404.11426 [pdf, other]: Title: SPAMming Labels: Efficient Annotations for the Trackers of Tomorrow

Authors: Orcun Cetintas, Tim Meinhardt, Guillem Brasó, Laura Leal-Taixé

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[130] arXiv:2404.11419 [pdf, other]: Title: SLAIM: Robust Dense Neural SLAM for Online Tracking and Mapping

Authors: Vincent Cartillier, Grant Schindler, Irfan Essa

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[131] arXiv:2404.11416 [pdf, other]: Title: Neural Schrödinger Bridge Matching for Pansharpening

Authors: Zihan Cao, Xiao Wu, Liang-Jian Deng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[132] arXiv:2404.11401 [pdf, other]: Title: RainyScape: Unsupervised Rainy Scene Reconstruction using Decoupled Neural Rendering

Authors: Xianqiang Lyu, Hui Liu, Junhui Hou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133] arXiv:2404.11375 [pdf, other]: Title: Text-controlled Motion Mamba: Text-Instructed Temporal Grounding of Human Motion

Authors: Xinghan Wang, Zixi Kang, Yadong Mu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[134] arXiv:2404.11358 [pdf, other]: Title: DeblurGS: Gaussian Splatting for Camera Motion Blur

Authors: Jeongtaek Oh, Jaeyoung Chung, Dongwoo Lee, Kyoung Mu Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[135] arXiv:2404.11357 [pdf, other]: Title: Detector Collapse: Backdooring Object Detection to Catastrophic Overload or Blindness

Authors: Hangtao Zhang, Shengshan Hu, Yichen Wang, Leo Yu Zhang, Ziqi Zhou, Xianlong Wang, Yanjun Zhang, Chao Chen

Comments: Accepted by IJCAI-24

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[136] arXiv:2404.11355 [pdf, other]: Title: Consisaug: A Consistency-based Augmentation for Polyp Detection in Endoscopy Image Analysis

Authors: Ziyu Zhou, Wenyuan Shen, Chang Liu

Comments: MLMI 2023

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137] arXiv:2404.11339 [pdf, other]: Title: Best Practices for a Handwritten Text Recognition System

Authors: George Retsinas, Giorgos Sfikas, Basilis Gatos, Christophoros Nikou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[138] arXiv:2404.11335 [pdf, other]: Title: SoccerNet Game State Reconstruction: End-to-End Athlete Tracking and Identification on a Minimap

Authors: Vladimir Somers, Victor Joos, Anthony Cioppa, Silvio Giancola, Seyed Abolfazl Ghasemzadeh, Floriane Magera, Baptiste Standaert, Amir Mohammad Mansourian, Xin Zhou, Shohreh Kasaei, Bernard Ghanem, Alexandre Alahi, Marc Van Droogenbroeck, Christophe De Vleeschouwer

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[139] arXiv:2404.11326 [pdf, other]: Title: Single-temporal Supervised Remote Change Detection for Domain Generalization

Authors: Qiangang Du, Jinlong Peng, Xu Chen, Qingdong He, Liren He, Qiang Nie, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[140] arXiv:2404.11322 [pdf, other]: Title: VBR: A Vision Benchmark in Rome

Authors: Leonardo Brizi, Emanuele Giacomini, Luca Di Giammarino, Simone Ferrari, Omar Salem, Lorenzo De Rebotti, Giorgio Grisetti

Comments: Accepted at IEEE ICRA 2024. Website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[141] arXiv:2404.11318 [pdf, other]: Title: Leveraging Fine-Grained Information and Noise Decoupling for Remote Sensing Change Detection

Authors: Qiangang Du, Jinlong Peng, Changan Wang, Xu Chen, Qingdong He, Wenbing Zhu, Mingmin Chi, Yabiao Wang, Chengjie Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[142] arXiv:2404.11317 [pdf, other]: Title: Improving Composed Image Retrieval via Contrastive Learning with Scaling Positives and Negatives

Authors: Zhangchi Feng, Richong Zhang, Zhijie Nie

Comments: 12 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[143] arXiv:2404.11309 [pdf, other]: Title: Achieving Rotation Invariance in Convolution Operations: Shifting from Data-Driven to Mechanism-Assured

Authors: Hanlin Mo, Guoying Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[144] arXiv:2404.11302 [pdf, other]: Title: A Semantic Segmentation-guided Approach for Ground-to-Aerial Image Matching

Authors: Francesco Pro, Nikolaos Dionelis, Luca Maiano, Bertrand Le Saux, Irene Amerini

Comments: 6 pages, 2 figures, 2 tables, Submitted to IGARSS 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[145] arXiv:2404.11299 [pdf, other]: Title: Learning from Unlabelled Data with Transformers: Domain Adaptation for Semantic Segmentation of High Resolution Aerial Images

Authors: Nikolaos Dionelis, Francesco Pro, Luca Maiano, Irene Amerini, Bertrand Le Saux

Comments: 6 pages, 7 figures, Submitted to IGARSS 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[146] arXiv:2404.11291 [pdf, other]: Title: Closely Interactive Human Reconstruction with Proxemics and Physics-Guided Adaption

Authors: Buzhen Huang, Chen Li, Chongyang Xu, Liang Pan, Yangang Wang, Gim Hee Lee

Comments: CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147] arXiv:2404.11266 [pdf, other]: Title: Criteria for Uncertainty-based Corner Cases Detection in Instance Segmentation

Authors: Florian Heidecker, Ahmad El-Khateeb, Maarten Bieshaar, Bernhard Sick

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[148] arXiv:2404.11265 [pdf, other]: Title: The Victim and The Beneficiary: Exploiting a Poisoned Model to Train a Clean Model on Poisoned Data

Authors: Zixuan Zhu, Rui Wang, Cong Zou, Lihua Jing

Comments: 13 pages, 6 figures, published to ICCV

Journal-ref: Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV). 2023: 155-164

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149] arXiv:2404.11256 [pdf, other]: Title: MMCBE: Multi-modality Dataset for Crop Biomass Estimation and Beyond

Authors: Xuesong Li, Zeeshan Hayder, Ali Zia, Connor Cassidy, Shiming Liu, Warwick Stiller, Eric Stone, Warren Conaty, Lars Petersson, Vivien Rolland

Comments: 10 pages, 10 figures, 3 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[150] arXiv:2404.11249 [pdf, other]: Title: A Progressive Framework of Vision-language Knowledge Distillation and Alignment for Multilingual Scene

Authors: Wenbo Zhang, Yifan Zhang, Jianfeng Lin, Binqiang Huang, Jinlu Zhang, Wenhao Yu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[151] arXiv:2404.11243 [pdf, other]: Title: Optical Image-to-Image Translation Using Denoising Diffusion Models: Heterogeneous Change Detection as a Use Case

Authors: João Gabriel Vinholi, Marco Chini, Anis Amziane, Renato Machado, Danilo Silva, Patrick Matgen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[152] arXiv:2404.11236 [pdf, other]: Title: ONOT: a High-Quality ICAO-compliant Synthetic Mugshot Dataset

Authors: Nicolò Di Domenico, Guido Borghi, Annalisa Franco, Davide Maltoni

Comments: Paper accepted in IEEE FG 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[153] arXiv:2404.11230 [pdf, other]: Title: Energy-Efficient Uncertainty-Aware Biomass Composition Prediction at the Edge

Authors: Muhammad Zawish, Paul Albert, Flavio Esposito, Steven Davy, Lizy Abraham

Comments: The paper has been accepted to CVPR 2024 5th Workshop on Vision for Agriculture

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[154] arXiv:2404.11226 [pdf, other]: Title: Simple In-place Data Augmentation for Surveillance Object Detection

Authors: Munkh-Erdene Otgonbold, Ganzorig Batnasan, Munkhjargal Gochoo

Comments: CVPR Workshop 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[155] arXiv:2404.11214 [pdf, other]: Title: Feature Corrective Transfer Learning: End-to-End Solutions to Object Detection in Non-Ideal Visual Conditions

Authors: Chuheng Wei, Guoyuan Wu, Matthew J. Barth

Comments: 10 pages, 3 figures, accepted by 2024 CVPR UG2 Workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[156] arXiv:2404.11207 [pdf, other]: Title: Exploring the Transferability of Visual Prompting for Multimodal Large Language Models

Authors: Yichi Zhang, Yinpeng Dong, Siyuan Zhang, Tianzan Min, Hang Su, Jun Zhu

Comments: Accepted in CVPR 2024 as Poster (Highlight)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[157] arXiv:2404.11205 [pdf, other]: Title: Kathakali Hand Gesture Recognition With Minimal Data

Authors: Kavitha Raju, Nandini J. Warrier

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[158] arXiv:2404.11202 [pdf, other]: Title: GhostNetV3: Exploring the Training Strategies for Compact Models

Authors: Zhenhua Liu, Zhiwei Hao, Kai Han, Yehui Tang, Yunhe Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159] arXiv:2404.11161 [pdf, other]: Title: Pre-processing matters: A segment search method for WSI classification

Authors: Jun Wang, Yufei Cui, Yu Mao, Nan Guan, Chun Jason Xue

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[160] arXiv:2404.11159 [pdf, other]: Title: Deep Portrait Quality Assessment. A NTIRE 2024 Challenge Survey

Authors: Nicolas Chahine, Marcos V. Conde, Daniela Carfora, Gabriel Pacianotto, Benoit Pochon, Sira Ferradans, Radu Timofte

Comments: CVPRW - NTIRE 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[161] arXiv:2404.11156 [pdf, other]: Title: Learning SO(3)-Invariant Semantic Correspondence via Local Shape Transform

Authors: Chunghyun Park, Seungwook Sim, Jaesik Park, Minsu Cho

Comments: Accepted to CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[162] arXiv:2404.11155 [pdf, other]: Title: HybriMap: Hybrid Clues Utilization for Effective Vectorized HD Map Construction

Authors: Chi Zhang, Qi Song, Feifei Li, Yongquan Chen, Rui Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[163] arXiv:2404.11151 [pdf, other]: Title: REACTO: Reconstructing Articulated Objects from a Single Video

Authors: Chaoyue Song, Jiacheng Wei, Chuan-Sheng Foo, Guosheng Lin, Fayao Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164] arXiv:2404.11139 [pdf, other]: Title: GeoReF: Geometric Alignment Across Shape Variation for Category-level Object Pose Refinement

Authors: Linfang Zheng, Tze Ho Elden Tse, Chen Wang, Yinghan Sun, Hua Chen, Ales Leonardis, Wei Zhang

Comments: The IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[165] arXiv:2404.11129 [pdf, other]: Title: Fact: Teaching MLLMs with Faithful, Concise and Transferable Rationales

Authors: Minghe Gao, Shuang Chen, Liang Pang, Yuan Yao, Jisheng Dang, Wenqiao Zhang, Juncheng Li, Siliang Tang, Yueting Zhuang, Tat-Seng Chua

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166] arXiv:2404.11127 [pdf, other]: Title: D-Aug: Enhancing Data Augmentation for Dynamic LiDAR Scenes

Authors: Jiaxing Zhao, Peng Zheng, Rui Ma

Comments: 4 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167] arXiv:2404.11120 [pdf, other]: Title: TiNO-Edit: Timestep and Noise Optimization for Robust Diffusion-Based Image Editing

Authors: Sherry X. Chen, Yaron Vaxman, Elad Ben Baruch, David Asulin, Aviad Moreshet, Kuo-Chin Lien, Misha Sra, Pradeep Sen

Comments: Conference on Computer Vision and Pattern Recognition (CVPR) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168] arXiv:2404.11118 [pdf, other]: Title: MHLR: Moving Haar Learning Rate Scheduler for Large-scale Face Recognition Training with One GPU

Authors: Xueyuan Gong, Yain-whar Si, Zheng Zhang, Xiaochen Yuan, Ke Wang, Xinyuan Zhang, Cong Lin, Xiaoxiang Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[169] arXiv:2404.11111 [pdf, other]: Title: CorrNet+: Sign Language Recognition and Translation via Spatial-Temporal Correlation

Authors: Lianyu Hu, Wei Feng, Liqing Gao, Zekang Liu, Liang Wan

Comments: arXiv admin note: substantial text overlap with arXiv:2303.03202

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[170] arXiv:2404.11108 [pdf, other]: Title: LADDER: An Efficient Framework for Video Frame Interpolation

Authors: Tong Shen, Dong Li, Ziheng Gao, Lu Tian, Emad Barsoum

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171] arXiv:2404.11104 [pdf, other]: Title: Object Remover Performance Evaluation Methods using Class-wise Object Removal Images

Authors: Changsuk Oh, Dongseok Shim, Taekbeom Lee, H. Jin Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[172] arXiv:2404.11100 [pdf, other]: Title: Synthesizing Realistic Data for Table Recognition

Authors: Qiyu Hou, Jun Wang, Meixuan Qiao, Lujun Tian

Comments: ICDAR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[173] arXiv:2404.11098 [pdf, other]: Title: LAPTOP-Diff: Layer Pruning and Normalized Distillation for Compressing Diffusion Models

Authors: Dingkun Zhang, Sijia Li, Chen Chen, Qingsong Xie, Haonan Lu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174] arXiv:2404.11070 [pdf, ps, other]: Title: Sky-GVIO: an enhanced GNSS/INS/Vision navigation with FCN-based sky-segmentation in urban canyon

Authors: Jingrong Wang, Bo Xu, Ronghe Jin, Shoujian Zhang, Kefu Gao, Jingnan Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Signal Processing (eess.SP)
[175] arXiv:2404.11064 [pdf, other]: Title: Rethinking 3D Dense Caption and Visual Grounding in A Unified Framework through Prompt-based Localization

Authors: Yongdong Luo, Haojia Lin, Xiawu Zheng, Yigeng Jiang, Fei Chao, Jie Hu, Guannan Jiang, Songan Zhang, Rongrong Ji

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[176] arXiv:2404.11054 [pdf, other]: Title: Multilateral Temporal-view Pyramid Transformer for Video Inpainting Detection

Authors: Ying Zhang, Bo Peng, Jiaran Zhou, Huiyu Zhou, Junyu Dong, Yuezun Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177] arXiv:2404.11052 [pdf, other]: Title: Supervised Contrastive Vision Transformer for Breast Histopathological Image Classification

Authors: Mohammad Shiri, Monalika Padma Reddy, Jiangwen Sun

Comments: 8 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[178] arXiv:2404.11051 [pdf, ps, other]: Title: WPS-Dataset: A benchmark for wood plate segmentation in bark removal processing

Authors: Rijun Wang, Guanghao Zhang, Fulong Liang, Bo Wang, Xiangwei Mou, Yesheng Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[179] arXiv:2404.11031 [pdf, other]: Title: TaCOS: Task-Specific Camera Optimization with Simulation

Authors: Chengyang Yan, Donald G. Dansereau

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[180] arXiv:2404.11025 [pdf, other]: Title: Spatial-Aware Image Retrieval: A Hyperdimensional Computing Approach for Efficient Similarity Hashing

Authors: Sanggeon Yun, Ryozo Masukawa, SungHeon Jeong, Mohsen Imani

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181] arXiv:2404.11016 [pdf, other]: Title: MaeFuse: Transferring Omni Features with Pretrained Masked Autoencoders for Infrared and Visible Image Fusion via Guided Training

Authors: Jiayang Li, Junjun Jiang, Pengwei Liang, Jiayi Ma

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[182] arXiv:2404.11008 [pdf, other]: Title: AKGNet: Attribute Knowledge-Guided Unsupervised Lung-Infected Area Segmentation

Authors: Qing En, Yuhong Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[183] arXiv:2404.11003 [pdf, other]: Title: InfoMatch: Entropy Neural Estimation for Semi-Supervised Image Classification

Authors: Qi Han, Zhibo Tian, Chengwei Xia, Kun Zhan

Comments: IJCAI 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184] arXiv:2404.10992 [pdf, other]: Title: How to deal with glare for improved perception of Autonomous Vehicles

Authors: Muhammad Z. Alam, Zeeshan Kaleem, Sousso Kelouwani

Comments: 14 pages, 9 figures, Accepted IEEE TIV

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185] arXiv:2404.10989 [pdf, other]: Title: FairSSD: Understanding Bias in Synthetic Speech Detectors

Authors: Amit Kumar Singh Yadav, Kratika Bhagtani, Davide Salvi, Paolo Bestagini, Edward J. Delp

Comments: Accepted at CVPR 2024 (WMF)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[186] arXiv:2404.10985 [pdf, ps, other]: Title: Pixel-Wise Symbol Spotting via Progressive Points Location for Parsing CAD Images

Authors: Junbiao Pang, Zailin Dong, Jiaxin Deng, Mengyuan Zhu, Yunwei Zhang

Comments: 10 pages, 10 figures, 6 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[187] arXiv:2404.10980 [pdf, other]: Title: Hyper Evidential Deep Learning to Quantify Composite Classification Uncertainty

Authors: Changbin Li, Kangshuo Li, Yuzhe Ou, Lance M. Kaplan, Audun Jøsang, Jin-Hee Cho, Dong Hyun Jeong, Feng Chen

Comments: In Proceedings of The Twelfth International Conference on Learning Representations, ICLR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[188] arXiv:2404.10978 [pdf, other]: Title: Leveraging 3D LiDAR Sensors to Enable Enhanced Urban Safety and Public Health: Pedestrian Monitoring and Abnormal Activity Detection

Authors: Nawfal Guefrachi, Jian Shi, Hakim Ghazzai, Ahmad Alsharoa

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[189] arXiv:2404.10966 [pdf, other]: Title: Domain-Specific Block Selection and Paired-View Pseudo-Labeling for Online Test-Time Adaptation

Authors: Yeonguk Yu, Sungho Shin, Seunghyeok Back, Minhwan Ko, Sangjun Noh, Kyoobin Lee

Comments: Accepted at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[190] arXiv:2404.10947 [pdf, other]: Title: Residual Connections Harm Self-Supervised Abstract Feature Learning

Authors: Xiao Zhang, Ruoxi Jiang, William Gao, Rebecca Willett, Michael Maire

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[191] arXiv:2404.10940 [pdf, other]: Title: Neuromorphic Vision-based Motion Segmentation with Graph Transformer Neural Network

Authors: Yusra Alkendi, Rana Azzam, Sajid Javed, Lakmal Seneviratne, Yahya Zweiri

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[192] arXiv:2404.10927 [pdf, other]: Title: A Concise Tiling Strategy for Preserving Spatial Context in Earth Observation Imagery

Authors: Ellianna Abrahams, Tasha Snow, Matthew R. Siegfried, Fernando Pérez

Comments: Accepted to the Machine Learning for Remote Sensing (ML4RS) Workshop at ICLR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[193] arXiv:2404.10904 [pdf, other]: Title: Multi-Task Multi-Modal Self-Supervised Learning for Facial Expression Recognition

Authors: Marah Halawa, Florian Blume, Pia Bideau, Martin Maier, Rasha Abdel Rahman, Olaf Hellwich

Comments: The paper will appear in the CVPR 2024 workshops proceedings

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[194] arXiv:2404.10896 [pdf, ps, other]: Title: From a Lossless (~1.5:1) Compression Algorithm for Llama2 7B Weights to Variable Precision, Variable Range, Compressed Numeric Data Types for CNNs and LLMs

Authors: Vincenzo Liguori

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Hardware Architecture (cs.AR)
[195] arXiv:2404.10894 [pdf, other]: Title: Semantics-Aware Attention Guidance for Diagnosing Whole Slide Images

Authors: Kechun Liu, Wenjun Wu, Joann G. Elmore, Linda G. Shapiro

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[196] arXiv:2404.10880 [pdf, other]: Title: HumMUSS: Human Motion Understanding using State Space Models

Authors: Arnab Kumar Mondal, Stefano Alletto, Denis Tome

Comments: CVPR 24

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[197] arXiv:2404.10865 [pdf, other]: Title: OSR-ViT: A Simple and Modular Framework for Open-Set Object Detection and Discovery

Authors: Matthew Inkawhich, Nathan Inkawhich, Hao Yang, Jingyang Zhang, Randolph Linderman, Yiran Chen

Comments: 28 pages, 8 figures, 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[198] arXiv:2404.10864 [pdf, other]: Title: Vocabulary-free Image Classification and Semantic Segmentation

Authors: Alessandro Conti, Enrico Fini, Massimiliano Mancini, Paolo Rota, Yiming Wang, Elisa Ricci

Comments: Under review, 22 pages, 10 figures, code is available at this https URL. arXiv admin note: text overlap with arXiv:2306.00917

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[199] arXiv:2404.10856 [pdf, other]: Title: UruDendro, a public dataset of cross-section images of Pinus taeda

Authors: Henry Marichal, Diego Passarella, Christine Lucas, Ludmila Profumo, Verónica Casaravilla, María Noel Rocha Galli, Serrana Ambite, Gregory Randall

Comments: Submitted to Dendrochronologia. arXiv admin note: text overlap with arXiv:2305.10809

Subjects: Computer Vision and Pattern Recognition (cs.CV); Quantitative Methods (q-bio.QM)
[200] arXiv:2404.10841 [pdf, other]: Title: Gasformer: A Transformer-based Architecture for Segmenting Methane Emissions from Livestock in Optical Gas Imaging

Authors: Toqi Tahamid Sarker, Mohamed G Embaby, Khaled R Ahmed, Amer AbuGhazaleh

Comments: 9 pages, 5 figures, this paper has been submitted and accepted for publication at CVPRW 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[201] arXiv:2404.10838 [pdf, other]: Title: Dynamic Self-adaptive Multiscale Distillation from Pre-trained Multimodal Large Model for Efficient Cross-modal Representation Learning

Authors: Zhengyang Liang, Meiyu Liang, Wei Huang, Yawen Li, Zhe Xue

Comments: 10 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Multimedia (cs.MM)
[202] arXiv:2404.10836 [pdf, other]: Title: Semantic-Based Active Perception for Humanoid Visual Tasks with Foveal Sensors

Authors: João Luzio, Alexandre Bernardino, Plinio Moreno

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[203] arXiv:2404.11599 (cross-list from cs.LG) [pdf, other]: Title: Variational Bayesian Last Layers

Authors: James Harrison, John Willes, Jasper Snoek

Comments: International Conference on Learning Representations (ICLR) 2024

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (stat.ML)
[204] arXiv:2404.11511 (cross-list from eess.IV) [pdf, other]: Title: Event Cameras Meet SPADs for High-Speed, Low-Bandwidth Imaging

Authors: Manasi Muglikar, Siddharth Somasundaram, Akshat Dave, Edoardo Charbon, Ramesh Raskar, Davide Scaramuzza

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[205] arXiv:2404.11459 (cross-list from cs.CL) [pdf, other]: Title: Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent

Authors: Wei Chen, Zhiyuan Li

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
[206] arXiv:2404.11428 (cross-list from eess.IV) [pdf, other]: Title: Explainable Lung Disease Classification from Chest X-Ray Images Utilizing Deep Learning and XAI

Authors: Tanzina Taher Ifty, Saleh Ahmed Shafin, Shoeb Mohammad Shahriar, Tashfia Towhid

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[207] arXiv:2404.11361 (cross-list from eess.IV) [pdf, other]: Title: Boosting Medical Image Segmentation Performance with Adaptive Convolution Layer

Authors: Seyed M.R. Modaresi, Aomar Osmani, Mohammadreza Razzazi, Abdelghani Chibani

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[208] arXiv:2404.11336 (cross-list from eess.SY) [pdf, other]: Title: Vision-based control for landing an aerial vehicle on a marine vessel

Authors: Haohua Dong

Subjects: Systems and Control (eess.SY); Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[209] arXiv:2404.11327 (cross-list from cs.RO) [pdf, other]: Title: Following the Human Thread in Social Navigation

Authors: Luca Scofano, Alessio Sampieri, Tommaso Campari, Valentino Sacco, Indro Spinelli, Lamberto Ballan, Fabio Galasso

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[210] arXiv:2404.11273 (cross-list from eess.IV) [pdf, other]: Title: Training Transformer Models by Wavelet Losses Improves Quantitative and Visual Performance in Single Image Super-Resolution

Authors: Cansu Korkmaz, A. Murat Tekalp

Comments: total of 10 pages including references, 5 tables and 5 figures, accepted for NTIRE 2024 Single Image Super Resolution (x4) challenge

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[211] arXiv:2404.11209 (cross-list from cs.AI) [pdf, ps, other]: Title: Prompt-Guided Generation of Structured Chest X-Ray Report Using a Pre-trained LLM

Authors: Hongzhao Li, Hongyu Wang, Xia Sun, Hua He, Jun Feng

Comments: Accepted by IEEE Conference on Multimedia Expo 2024

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[212] arXiv:2404.11152 (cross-list from eess.IV) [pdf, other]: Title: Multi-target and multi-stage liver lesion segmentation and detection in multi-phase computed tomography scans

Authors: Abdullah F. Al-Battal, Soan T. M. Duong, Van Ha Tang, Quang Duc Tran, Steven Q. H. Truong, Chien Phan, Truong Q. Nguyen, Cheolhong An

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[213] arXiv:2404.11046 (cross-list from cs.AI) [pdf, other]: Title: Lightweight Unsupervised Federated Learning with Pretrained Vision Language Model

Authors: Hao Yan, Yuhong Guo

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[214] arXiv:2404.10892 (cross-list from eess.IV) [pdf, other]: Title: Automatic classification of prostate MR series type using image content and metadata

Authors: Deepa Krishnaswamy, Bálint Kovács, Stefan Denner, Steve Pieper, David Clunie, Christopher P. Bridge, Tina Kapur, Klaus H. Maier-Hein, Andrey Fedorov

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[215] arXiv:2404.10790 (cross-list from cs.CR) [pdf, other]: Title: Multimodal Attack Detection for Action Recognition Models

Authors: Furkan Mumcu, Yasin Yilmaz

Subjects: Cryptography and Security (cs.CR); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[216] arXiv:2307.00071 (cross-list from cs.RO) [pdf, other]: Title: GIRA: Gaussian Mixture Models for Inference and Robot Autonomy

Authors: Kshitij Goel, Wennie Tabib

Comments: 2024 IEEE International Conference on Robotics and Automation (ICRA)

Subjects: Robotics (cs.RO); Computational Geometry (cs.CG); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Wed, 17 Apr 2024 (showing first 95 of 114 entries)

[217] arXiv:2404.10775 [pdf, other]: Title: COMBO: Compositional World Models for Embodied Multi-Agent Cooperation

Authors: Hongxin Zhang, Zeyuan Wang, Qiushi Lyu, Zheyuan Zhang, Sunli Chen, Tianmin Shu, Yilun Du, Chuang Gan

Comments: 23 pages. The first three authors contributed equally

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multiagent Systems (cs.MA)
[218] arXiv:2404.10772 [pdf, other]: Title: Gaussian Opacity Fields: Efficient and Compact Surface Reconstruction in Unbounded Scenes

Authors: Zehao Yu, Torsten Sattler, Andreas Geiger

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[219] arXiv:2404.10765 [pdf, other]: Title: RefFusion: Reference Adapted Diffusion Models for 3D Scene Inpainting

Authors: Ashkan Mirzaei, Riccardo De Lutio, Seung Wook Kim, David Acuna, Jonathan Kelly, Sanja Fidler, Igor Gilitschenski, Zan Gojcic

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[220] arXiv:2404.10760 [pdf, other]: Title: Learning Feature Inversion for Multi-class Anomaly Detection under General-purpose COCO-AD Benchmark

Authors: Jiangning Zhang, Chengjie Wang, Xiangtai Li, Guanzhong Tian, Zhucun Xue, Yong Liu, Guansong Pang, Dacheng Tao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[221] arXiv:2404.10758 [pdf, other]: Title: Watch Your Step: Optimal Retrieval for Continual Learning at Scale

Authors: Truman Hickok, Dhireesha Kudithipudi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[222] arXiv:2404.10718 [pdf, other]: Title: GazeHTA: End-to-end Gaze Target Detection with Head-Target Association

Authors: Zhi-Yi Lin, Jouh Yeong Chew, Jan van Gemert, Xucong Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[223] arXiv:2404.10717 [pdf, other]: Title: Mixed Prototype Consistency Learning for Semi-supervised Medical Image Segmentation

Authors: Lijian Li

Comments: 15 pages, 2 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[224] arXiv:2404.10716 [pdf, other]: Title: MOWA: Multiple-in-One Image Warping Model

Authors: Kang Liao, Zongsheng Yue, Zhonghua Wu, Chen Change Loy

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[225] arXiv:2404.10713 [pdf, ps, other]: Title: A Plausibility Study of Using Augmented Reality in the Ventriculoperitoneal Shunt Operations

Authors: Tandin Dorji, Pakinee Aimmanee, Vich Yindeedej

Comments: Accepted for the 2024 - 16th International Conference on Knowledge and Smart Technology (KST). To be published in IEEEXplore Digital Library (#61284), ISBN: 979-8-3503-7073-7

Subjects: Computer Vision and Pattern Recognition (cs.CV); Software Engineering (cs.SE)
[226] arXiv:2404.10699 [pdf, other]: Title: ECLAIR: A High-Fidelity Aerial LiDAR Dataset for Semantic Segmentation

Authors: Iaroslav Melekhov, Anand Umashankar, Hyeong-Jin Kim, Vladislav Serkov, Dusty Argyle

Comments: 11 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[227] arXiv:2404.10690 [pdf, other]: Title: MathWriting: A Dataset For Handwritten Mathematical Expression Recognition

Authors: Philippe Gervais, Asya Fadeeva, Andrii Maksai

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[228] arXiv:2404.10688 [pdf, other]: Title: Efficient Conditional Diffusion Model with Probability Flow Sampling for Image Super-resolution

Authors: Yutao Yuan, Chun Yuan

Comments: AAAI 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[229] arXiv:2404.10685 [pdf, other]: Title: Generating Human Interaction Motions in Scenes with Text Control

Authors: Hongwei Yi, Justus Thies, Michael J. Black, Xue Bin Peng, Davis Rempe

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[230] arXiv:2404.10681 [pdf, other]: Title: StyleCity: Large-Scale 3D Urban Scenes Stylization with Vision-and-Text Reference via Progressive Optimization

Authors: Yingshu Chen, Huajian Huang, Tuan-Anh Vu, Ka Chun Shum, Sai-Kit Yeung

Comments: project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[231] arXiv:2404.10667 [pdf, other]: Title: VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time

Authors: Sicheng Xu, Guojun Chen, Yu-Xiao Guo, Jiaolong Yang, Chong Li, Zhenyu Zang, Yizhong Zhang, Xin Tong, Baining Guo

Comments: Tech Report. Project webpage: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[232] arXiv:2404.10664 [pdf, ps, other]: Title: Assessing The Impact of CNN Auto Encoder-Based Image Denoising on Image Classification Tasks

Authors: Mohsen Hami, Mahdi JameBozorg

Comments: 13 pages, 13 figures, 13th International conference on innovative technologies in the field of science, engineering and technology

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[233] arXiv:2404.10633 [pdf, other]: Title: Contextrast: Contextual Contrastive Learning for Semantic Segmentation

Authors: Changki Sung, Wanhee Kim, Jungho An, Wooju Lee, Hyungtae Lim, Hyun Myung

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[234] arXiv:2404.10626 [pdf, other]: Title: Exploring selective image matching methods for zero-shot and few-sample unsupervised domain adaptation of urban canopy prediction

Authors: John Francis, Stephen Law

Comments: ICLR 2024 Machine Learning for Remote Sensing (ML4RS) Workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV); Applications (stat.AP)
[235] arXiv:2404.10625 [pdf, other]: Title: Gaussian Splatting Decoder for 3D-aware Generative Adversarial Networks

Authors: Florian Barthel, Arian Beckmann, Wieland Morgenstern, Anna Hilsmann, Peter Eisert

Comments: CVPRW

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[236] arXiv:2404.10620 [pdf, other]: Title: PyTorchGeoNodes: Enabling Differentiable Shape Programs for 3D Shape Reconstruction

Authors: Sinisa Stekovic, Stefan Ainetter, Mattia D'Urso, Friedrich Fraundorfer, Vincent Lepetit

Comments: In Submission

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[237] arXiv:2404.10603 [pdf, other]: Title: Enhancing 3D Fidelity of Text-to-3D using Cross-View Correspondences

Authors: Seungwook Kim, Kejie Li, Xueqing Deng, Yichun Shi, Minsu Cho, Peng Wang

Comments: 25 pages, 22 figures, accepted to CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[238] arXiv:2404.10600 [pdf, ps, other]: Title: Intra-operative tumour margin evaluation in breast-conserving surgery with deep learning

Authors: Wei-Chung Shia, Yu-Len Huang, Yi-Chun Chen, Hwa-Koon Wu, Dar-Ren Chen

Comments: 1 pages, 6 figures and 2 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[239] arXiv:2404.10595 [pdf, other]: Title: Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases

Authors: Yanze Li, Wenhua Zhang, Kai Chen, Yanxin Liu, Pengxiang Li, Ruiyuan Gao, Lanqing Hong, Meng Tian, Xinhai Zhao, Zhenguo Li, Dit-Yan Yeung, Huchuan Lu, Xu Jia

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[240] arXiv:2404.10584 [pdf, other]: Title: ReWiTe: Realistic Wide-angle and Telephoto Dual Camera Fusion Dataset via Beam Splitter Camera Rig

Authors: Chunli Peng, Xuan Dong, Tiantian Cao, Zhengqing Li, Kun Dong, Weixin Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[241] arXiv:2404.10574 [pdf, other]: Title: Uncertainty-guided Open-Set Source-Free Unsupervised Domain Adaptation with Target-private Class Segregation

Authors: Mattia Litrico, Davide Talon, Sebastiano Battiato, Alessio Del Bue, Mario Valerio Giuffrida, Pietro Morerio

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[242] arXiv:2404.10572 [pdf, other]: Title: Label merge-and-split: A graph-colouring approach for memory-efficient brain parcellation

Authors: Aaron Kujawa, Reuben Dorent, Sebastien Ourselin, Tom Vercauteren

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[243] arXiv:2404.10571 [pdf, other]: Title: CMU-Flownet: Exploring Point Cloud Scene Flow Estimation in Occluded Scenario

Authors: Jingze Chen, Junfeng Yao, Qiqin Lin, Lei Li

Comments: 14 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[244] arXiv:2404.10540 [pdf, other]: Title: SEVD: Synthetic Event-based Vision Dataset for Ego and Fixed Traffic Perception

Authors: Manideep Reddy Aliminati, Bharatesh Chakravarthi, Aayush Atul Verma, Arpitsinh Vaghela, Hua Wei, Xuesong Zhou, Yezhou Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[245] arXiv:2404.10539 [pdf, other]: Title: VideoSAGE: Video Summarization with Graph Representation Learning

Authors: Jose M. Rojas Chaves, Subarna Tripathi

Comments: arXiv admin note: text overlap with arXiv:2207.07783

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[246] arXiv:2404.10534 [pdf, other]: Title: Into the Fog: Evaluating Multiple Object Tracking Robustness

Authors: Nadezda Kirillova, M. Jehanzeb Mirza, Horst Possegger, Horst Bischof

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[247] arXiv:2404.10527 [pdf, other]: Title: SPVLoc: Semantic Panoramic Viewport Matching for 6D Camera Localization in Unseen Environments

Authors: Niklas Gard, Anna Hilsmann, Peter Eisert

Comments: This submission includes the paper and supplementary material. 24 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[248] arXiv:2404.10518 [pdf, other]: Title: MobileNetV4 -- Universal Models for the Mobile Ecosystem

Authors: Danfeng Qin, Chas Leichner, Manolis Delakis, Marco Fornoni, Shixin Luo, Fan Yang, Weijun Wang, Colby Banbury, Chengxi Ye, Berkin Akin, Vaibhav Aggarwal, Tenghui Zhu, Daniele Moro, Andrew Howard

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[249] arXiv:2404.10501 [pdf, other]: Title: Self-Supervised Visual Preference Alignment

Authors: Ke Zhu, Liang Zhao, Zheng Ge, Xiangyu Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[250] arXiv:2404.10499 [pdf, other]: Title: Robust Noisy Label Learning via Two-Stream Sample Distillation

Authors: Sihan Bai, Sanping Zhou, Zheng Qin, Le Wang, Nanning Zheng

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[251] arXiv:2404.10490 [pdf, other]: Title: Teaching Chinese Sign Language with Feedback in Mixed Reality

Authors: Hongli Wen, Yang Xu, Lin Li, Xudong Ru

Comments: 8 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[252] arXiv:2404.10484 [pdf, other]: Title: AbsGS: Recovering Fine Details for 3D Gaussian Splatting

Authors: Zongxin Ye, Wenyu Li, Sidun Liu, Peng Qiao, Yong Dou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[253] arXiv:2404.10476 [pdf, other]: Title: Efficient optimal dispersed Haar-like filters for face detection

Authors: Zeinab Sedaghatjoo, Hossein Hosseinzadeh, Ahmad Shirzadi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Numerical Analysis (math.NA)
[254] arXiv:2404.10454 [pdf, other]: Title: A Computer Vision-Based Quality Assessment Technique for the automatic control of consumables for analytical laboratories

Authors: Meriam Zribi, Paolo Pagliuca, Francesca Pitolli

Comments: 31 pages, 13 figures, 10 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[255] arXiv:2404.10441 [pdf, other]: Title: 1st Place Solution for ICCV 2023 OmniObject3D Challenge: Sparse-View Reconstruction

Authors: Hang Du, Yaping Xue, Weidong Dai, Xuejun Yan, Jingjing Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[256] arXiv:2404.10438 [pdf, other]: Title: The Unreasonable Effectiveness of Pre-Trained Features for Camera Pose Refinement

Authors: Gabriele Trivigno, Carlo Masone, Barbara Caputo, Torsten Sattler

Comments: Accepted to CVPR2024 (Highlight)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[257] arXiv:2404.10433 [pdf, other]: Title: Explainable concept mappings of MRI: Revealing the mechanisms underlying deep learning-based brain disease classification

Authors: Christian Tinauer, Anna Damulina, Maximilian Sackl, Martin Soellradl, Reduan Achtibat, Maximilian Dreyer, Frederik Pahde, Sebastian Lapuschkin, Reinhold Schmidt, Stefan Ropele, Wojciech Samek, Christian Langkammer

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[258] arXiv:2404.10411 [pdf, other]: Title: Camera clustering for scalable stream-based active distillation

Authors: Dani Manjah, Davide Cacciarelli, Christophe De Vleeschouwer, Benoit Macq

Comments: This manuscript is currently under review at IEEE Transactions on Circuits and Systems for Video Technology

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[259] arXiv:2404.10408 [pdf, other]: Title: Adversarial Identity Injection for Semantic Face Image Synthesis

Authors: Giuseppe Tarollo, Tomaso Fontanini, Claudio Ferrari, Guido Borghi, Andrea Prati

Comments: Paper accepted at CVPR 2024 Biometrics Workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[260] arXiv:2404.10407 [pdf, ps, other]: Title: Comprehensive Survey of Model Compression and Speed up for Vision Transformers

Authors: Feiyang Chen, Ziqian Luo, Lisang Zhou, Xueting Pan, Ying Jiang

Journal-ref: Journal of Information, Technology and Policy (2024): 1-12

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[261] arXiv:2404.10405 [pdf, other]: Title: Integration of Self-Supervised BYOL in Semi-Supervised Medical Image Recognition

Authors: Hao Feng, Yuanzhe Jia, Ruijia Xu, Mukesh Prasad, Ali Anaissi, Ali Braytee

Comments: Accepted by ICCS 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[262] arXiv:2404.10394 [pdf, other]: Title: Portrait3D: Text-Guided High-Quality 3D Portrait Generation Using Pyramid Representation and GANs Prior

Authors: Yiqian Wu, Hao Xu, Xiangjun Tang, Xien Chen, Siyu Tang, Zhebin Zhang, Chen Li, Xiaogang Jin

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[263] arXiv:2404.10383 [pdf, other]: Title: Learning to Score Sign Language with Two-stage Method

Authors: Hongli Wen, Yang Xu

Comments: 9 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[264] arXiv:2404.10378 [pdf, other]: Title: Second Edition FRCSyn Challenge at CVPR 2024: Face Recognition Challenge in the Era of Synthetic Data

Authors: Ivan DeAndres-Tame, Ruben Tolosana, Pietro Melzi, Ruben Vera-Rodriguez, Minchul Kim, Christian Rathgeb, Xiaoming Liu, Aythami Morales, Julian Fierrez, Javier Ortega-Garcia, Zhizhou Zhong, Yuge Huang, Yuxi Mi, Shouhong Ding, Shuigeng Zhou, Shuai He, Lingzhi Fu, Heng Cong, Rongyu Zhang, Zhihong Xiao, Evgeny Smirnov, Anton Pimenov, Aleksei Grigorev, Denis Timoshenko, Kaleb Mesfin Asfaw, Cheng Yaw Low, Hao Liu, Chuyi Wang, Qing Zuo, Zhixiang He, Hatef Otroshi Shahreza, Anjith George, Alexander Unnervik, Parsa Rahimi, Sébastien Marcel, Pedro C. Neto, Marco Huber, Jan Niklas Kolf, Naser Damer, Fadi Boutros, Jaime S. Cardoso, Ana F. Sequeira, Andrea Atzori, Gianni Fenu, Mirko Marras, Vitomir Štruc, Jiang Yu, Zhangjie Li, Jichun Li, Weisong Zhao, Zhen Lei, Xiangyu Zhu, Xiao-Yu Zhang, Bernardo Biesseck, et al. (4 additional authors not shown)

Comments: arXiv admin note: text overlap with arXiv:2311.10476

Journal-ref: IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRw 2024)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Machine Learning (cs.LG)
[265] arXiv:2404.10370 [pdf, other]: Title: Know Yourself Better: Diverse Discriminative Feature Learning Improves Open Set Recognition

Authors: Jiawen Xu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[266] arXiv:2404.10358 [pdf, other]: Title: Improving Bracket Image Restoration and Enhancement with Flow-guided Alignment and Enhanced Feature Aggregation

Authors: Wenjie Lin, Zhen Liu, Chengzhi Jiang, Mingyan Han, Ting Jiang, Shuaicheng Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[267] arXiv:2404.10357 [pdf, other]: Title: Optimization of Prompt Learning via Multi-Knowledge Representation for Vision-Language Models

Authors: Enming Zhang, Bingke Zhu, Yingying Chen, Qinghai Miao, Ming Tang, Jinqiao Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[268] arXiv:2404.10343 [pdf, other]: Title: The Ninth NTIRE 2024 Efficient Super-Resolution Challenge Report

Authors: Bin Ren, Yawei Li, Nancy Mehta, Radu Timofte, Hongyuan Yu, Cheng Wan, Yuxin Hong, Bingnan Han, Zhuoyuan Wu, Yajun Zou, Yuqing Liu, Jizhe Li, Keji He, Chao Fan, Heng Zhang, Xiaolin Zhang, Xuanwu Yin, Kunlong Zuo, Bohao Liao, Peizhe Xia, Long Peng, Zhibo Du, Xin Di, Wangkai Li, Yang Wang, Wei Zhai, Renjing Pei, Jiaming Guo, Songcen Xu, Yang Cao, Zhengjun Zha, Yan Wang, Yi Liu, Qing Wang, Gang Zhang, Liou Zhang, Shijie Zhao, Long Sun, Jinshan Pan, Jiangxin Dong, Jinhui Tang, Xin Liu, Min Yan, Qian Wang, Menghan Zhou, Yiqiang Yan, Yixuan Liu, Wensong Chan, Dehua Tang, Dong Zhou, Li Wang, Lu Tian, Barsoum Emad, Bohan Jia, Junbo Qiao, Yunshuai Zhou, Yun Zhang, Wei Li, Shaohui Lin, Shenglong Zhou, Binbin Chen, Jincheng Liao, Suiyi Zhao, Zhao Zhang, Bo Wang, Yan Luo, Yanyan Wei, Feng Li, Mingshen Wang, et al. (65 additional authors not shown)

Comments: The report paper of NTIRE2024 Efficient Super-resolution, accepted by CVPRW2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[269] arXiv:2404.10342 [pdf, other]: Title: Referring Flexible Image Restoration

Authors: Runwei Guan, Rongsheng Hu, Zhuhao Zhou, Tianlang Xue, Ka Lok Man, Jeremy Smith, Eng Gee Lim, Weiping Ding, Yutao Yue

Comments: 15 pages, 19 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[270] arXiv:2404.10335 [pdf, other]: Title: Efficiently Adversarial Examples Generation for Visual-Language Models under Targeted Transfer Scenarios using Diffusion Models

Authors: Qi Guo, Shanmin Pang, Xiaojun Jia, Qing Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[271] arXiv:2404.10332 [pdf, other]: Title: Prescribing the Right Remedy: Mitigating Hallucinations in Large Vision-Language Models via Targeted Instruction Tuning

Authors: Rui Hu, Yahan Tu, Jitao Sang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[272] arXiv:2404.10322 [pdf, other]: Title: Domain-Rectifying Adapter for Cross-Domain Few-Shot Segmentation

Authors: Jiapeng Su, Qi Fan, Guangming Lu, Fanglin Chen, Wenjie Pei

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[273] arXiv:2404.10319 [pdf, other]: Title: Application of Deep Learning Methods to Processing of Noisy Medical Video Data

Authors: Danil Afonchikov, Elena Kornaeva, Irina Makovik, Alexey Kornaev

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[274] arXiv:2404.10318 [pdf, other]: Title: SRGS: Super-Resolution 3D Gaussian Splatting

Authors: Xiang Feng, Yongbo He, Yubo Wang, Yan Yang, Zhenzhong Kuang, Yu Jun, Jianping Fan, Jiajun Ding

Comments: Submitted to ACM MM 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[275] arXiv:2404.10314 [pdf, other]: Title: Awareness of uncertainty in classification using a multivariate model and multi-views

Authors: Alexey Kornaev, Elena Kornaeva, Oleg Ivanov, Ilya Pershin, Danis Alukaev

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[276] arXiv:2404.10312 [pdf, other]: Title: OmniSSR: Zero-shot Omnidirectional Image Super-Resolution using Stable Diffusion Model

Authors: Runyi Li, Xuhan Sheng, Weiqi Li, Jian Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[277] arXiv:2404.10307 [pdf, other]: Title: Learnable Prompt for Few-Shot Semantic Segmentation in Remote Sensing Domain

Authors: Steve Andreas Immanuel, Hagai Raja Sinulingga

Comments: Accepted to CVPRW 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[278] arXiv:2404.10305 [pdf, other]: Title: TC-OCR: TableCraft OCR for Efficient Detection & Recognition of Table Structure & Content

Authors: Avinash Anand, Raj Jaiswal, Pijush Bhuyan, Mohit Gupta, Siddhesh Bangar, Md. Modassir Imam, Rajiv Ratn Shah, Shin'ichi Satoh

Comments: 8 pages, 2 figures, Workshop of 1st MMIR Deep Multimodal Learning for Information Retrieval

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[279] arXiv:2404.10292 [pdf, other]: Title: From Data Deluge to Data Curation: A Filtering-WoRA Paradigm for Efficient Text-based Person Search

Authors: Jintao Sun, Zhedong Zheng, Gangyi Ding

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[280] arXiv:2404.10279 [pdf, other]: Title: EucliDreamer: Fast and High-Quality Texturing for 3D Models with Depth-Conditioned Stable Diffusion

Authors: Cindy Le, Congrui Hetang, Chendi Lin, Ang Cao, Yihui He

Comments: Short version of arXiv:2311.15573

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[281] arXiv:2404.10272 [pdf, other]: Title: Plug-and-Play Acceleration of Occupancy Grid-based NeRF Rendering using VDB Grid and Hierarchical Ray Traversal

Authors: Yoshio Kato, Shuhei Tarashima

Comments: Short paper for CVPR Neural Rendering Intelligence Workshop 2024. Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[282] arXiv:2404.10267 [pdf, other]: Title: OneActor: Consistent Character Generation via Cluster-Conditioned Guidance

Authors: Jiahao Wang, Caixia Yan, Haonan Lin, Weizhan Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[283] arXiv:2404.10263 [pdf, ps, other]: Title: PreGSU-A Generalized Traffic Scene Understanding Model for Autonomous Driving based on Pre-trained Graph Attention Network

Authors: Yuning Wang, Zhiyuan Liu, Haotian Lin, Junkai Jiang, Shaobing Xu, Jianqiang Wang

Comments: 12 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[284] arXiv:2404.10242 [pdf, other]: Title: Masked Autoencoders for Microscopy are Scalable Learners of Cellular Biology

Authors: Oren Kraus, Kian Kenyon-Dean, Saber Saberian, Maryam Fallah, Peter McLean, Jess Leung, Vasudev Sharma, Ayla Khan, Jia Balakrishnan, Safiye Celik, Dominique Beaini, Maciej Sypetkowski, Chi Vicky Cheng, Kristen Morse, Maureen Makes, Ben Mabey, Berton Earnshaw

Comments: CVPR 2024 Highlight. arXiv admin note: text overlap with arXiv:2309.16064

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[285] arXiv:2404.10241 [pdf, other]: Title: Vision-and-Language Navigation via Causal Learning

Authors: Liuyi Wang, Zongtao He, Ronghao Dang, Mengjiao Shen, Chengju Liu, Qijun Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[286] arXiv:2404.10237 [pdf, other]: Title: MoE-TinyMed: Mixture of Experts for Tiny Medical Large Vision-Language Models

Authors: Songtao Jiang, Tuo Zheng, Yan Zhang, Yeying Jin, Zuozhu Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[287] arXiv:2404.10227 [pdf, other]: Title: MS-MANO: Enabling Hand Pose Tracking with Biomechanical Constraints

Authors: Pengfei Xie, Wenqiang Xu, Tutian Tang, Zhenjun Yu, Cewu Lu

Comments: 11 pages, 5 figures; CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[288] arXiv:2404.10213 [pdf, other]: Title: GaitPoint+: A Gait Recognition Network Incorporating Point Cloud Analysis and Recycling

Authors: Huantao Ren, Jiajing Chen, Senem Velipasalar

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[289] arXiv:2404.10212 [pdf, other]: Title: LWIRPOSE: A novel LWIR Thermal Image Dataset and Benchmark

Authors: Avinash Upadhyay, Bhipanshu Dhupar, Manoj Sharma, Ankit Shukla, Ajith Abraham

Comments: Submitted to ICIP 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[290] arXiv:2404.10210 [pdf, other]: Title: MK-SGN: A Spiking Graph Convolutional Network with Multimodal Fusion and Knowledge Distillation for Skeleton-based Action Recognition

Authors: Naichuan Zheng, Hailun Xia, Zeyu Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[291] arXiv:2404.10193 [pdf, other]: Title: Consistency and Uncertainty: Identifying Unreliable Responses From Black-Box Vision-Language Models for Selective Visual Question Answering

Authors: Zaid Khan, Yun Fu

Comments: CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[292] arXiv:2404.10177 [pdf, other]: Title: Consistent Diffusion Meets Tweedie: Training Exact Ambient Diffusion Models with Noisy Data

Authors: Giannis Daras, Alexandros G. Dimakis, Constantinos Daskalakis

Comments: Preprint, work in progress. 19 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[293] arXiv:2404.10172 [pdf, other]: Title: Forensic Iris Image-Based Post-Mortem Interval Estimation

Authors: Rasel Ahmed Bhuiyan, Adam Czajka

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[294] arXiv:2404.10170 [pdf, other]: Title: High-Resolution Detection of Earth Structural Heterogeneities from Seismic Amplitudes using Convolutional Neural Networks with Attention layers

Authors: Luiz Schirmer, Guilherme Schardong, Vinícius da Silva, Rogério Santos, Hélio Lopes

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[295] arXiv:2404.10166 [pdf, other]: Title: Self-Supervised Learning Featuring Small-Scale Image Dataset for Treatable Retinal Diseases Classification

Authors: Luffina C. Huang, Darren J. Chiu, Manish Mehta

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[296] arXiv:2404.10163 [pdf, other]: Title: EyeFormer: Predicting Personalized Scanpaths with Transformer-Guided Reinforcement Learning

Authors: Yue Jiang, Zixin Guo, Hamed Rezazadegan Tavakoli, Luis A. Leiva, Antti Oulasvirta

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[297] arXiv:2404.10157 [pdf, other]: Title: Salient Object-Aware Background Generation using Text-Guided Diffusion Models

Authors: Amir Erfan Eshratifar, Joao V. B. Soares, Kapil Thadani, Shaunak Mishra, Mikhail Kuznetsov, Yueh-Ning Ku, Paloma de Juan

Comments: Accepted for publication at CVPR 2024's Generative Models for Computer Vision workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[298] arXiv:2404.10156 [pdf, other]: Title: SegFormer3D: an Efficient Transformer for 3D Medical Image Segmentation

Authors: Shehan Perera, Pouyan Navard, Alper Yilmaz

Comments: Accepted at CVPR Workshop 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[299] arXiv:2404.10147 [pdf, other]: Title: Eyes on the Streets: Leveraging Street-Level Imaging to Model Urban Crime Dynamics

Authors: Zhixuan Qi, Huaiying Luo, Chen Chi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[300] arXiv:2404.10146 [pdf, ps, other]: Title: Cross-Modal Self-Training: Aligning Images and Pointclouds to Learn Classification without Labels

Authors: Amaya Dharmasiri, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan

Comments: To be published in Workshop for Learning 3D with Multi-View Supervision (3DMV) at CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[301] arXiv:2404.10141 [pdf, other]: Title: ANCHOR: LLM-driven News Subject Conditioning for Text-to-Image Synthesis

Authors: Aashish Anantha Ramakrishnan, Sharon X. Huang, Dongwon Lee

Comments: 23 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL); Multimedia (cs.MM)
[302] arXiv:2404.10133 [pdf, other]: Title: WB LUTs: Contrastive Learning for White Balancing Lookup Tables

Authors: Sai Kumar Reddy Manne, Michael Wan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[303] arXiv:2404.10130 [pdf, other]: Title: NOISe: Nuclei-Aware Osteoclast Instance Segmentation for Mouse-to-Human Domain Transfer

Authors: Sai Kumar Reddy Manne, Brendan Martin, Tyler Roy, Ryan Neilson, Rebecca Peters, Meghana Chillara, Christine W. Lary, Katherine J. Motyl, Michael Wan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[304] arXiv:2404.10108 [pdf, other]: Title: GeoAI Reproducibility and Replicability: a computational and spatial perspective

Authors: Wenwen Li, Chia-Yu Hsu, Sizhe Wang, Peter Kedron

Comments: Accepted by Annals of the American Association of Geographers

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[305] arXiv:2404.10096 [pdf, other]: Title: Vision Augmentation Prediction Autoencoder with Attention Design (VAPAAD)

Authors: Yiqiao Yin

Comments: 12 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[306] arXiv:2404.10078 [pdf, other]: Title: Low-Light Image Enhancement Framework for Improved Object Detection in Fisheye Lens Datasets

Authors: Dai Quoc Tran, Armstrong Aboah, Yuntae Jeon, Maged Shoman, Minsoo Park, Seunghee Park

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[307] arXiv:2404.10073 [pdf, other]: Title: Explainable Light-Weight Deep Learning Pipeline for Improved Drought Stress

Authors: Aswini Kumar Patra, Lingaraj Sahoo

Comments: 21 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[308] arXiv:2404.10054 [pdf, other]: Title: AIGeN: An Adversarial Approach for Instruction Generation in VLN

Authors: Niyati Rawal, Roberto Bigazzi, Lorenzo Baraldi, Rita Cucchiara

Comments: Accepted to 7th Multimodal Learning and Applications Workshop (MULA 2024) at the IEEE/CVF Conference on Computer Vision and Pattern Recognition 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Robotics (cs.RO)
[309] arXiv:2404.10034 [pdf, other]: Title: Realistic Model Selection for Weakly Supervised Object Localization

Authors: Shakeeb Murtaza, Soufiane Belharbi, Marco Pedersoli, Eric Granger

Comments: 13 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[310] arXiv:2404.10766 (cross-list from eess.IV) [pdf, other]: Title: RapidVol: Rapid Reconstruction of 3D Ultrasound Volumes from Sensorless 2D Scans

Authors: Mark C. Eid, Pak-Hei Yeung, Madeleine K. Wyburd, João F. Henriques, Ana I.L. Namburete

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[311] arXiv:2404.10763 (cross-list from cs.AI) [pdf, other]: Title: LaDiC: Are Diffusion Models Really Inferior to Autoregressive Counterparts for Image-to-Text Generation?

Authors: Yuchi Wang, Shuhuai Ren, Rundong Gao, Linli Yao, Qingyan Guo, Kaikai An, Jianhong Bai, Xu Sun

Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)

Fri, 19 Apr 2024

Thu, 18 Apr 2024

Wed, 17 Apr 2024 (showing first 95 of 114 entries)