Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 112

[ total of 679 entries: 1-104 | 9-112 | 113-216 | 217-320 | 321-424 | 425-528 | ... | 633-679 ]

Tue, 4 Jun 2024 (continued, showing 104 of 228 entries)

[113] arXiv:2406.01555 [pdf, other]: Title: Towards Flexible Interactive Reflection Removal with Human Guidance

Authors: Xiao Chen, Xudong Jiang, Yunkang Tao, Zhen Lei, Qing Li, Chenyang Lei, Zhaoxiang Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[114] arXiv:2406.01551 [pdf, other]: Title: ELSA: Evaluating Localization of Social Activities in Urban Streets

Authors: Maryam Hosseini, Marco Cipriano, Sedigheh Eslami, Daniel Hodczak, Liu Liu, Andres Sevtsuk, Gerard de Melo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[115] arXiv:2406.01494 [pdf, other]: Title: Robust Classification by Coupling Data Mollification with Label Smoothing

Authors: Markus Heinonen, Ba-Hien Tran, Michael Kampffmeyer, Maurizio Filippone

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Machine Learning (stat.ML)
[116] arXiv:2406.01493 [pdf, other]: Title: Learning Temporally Consistent Video Depth from Video Diffusion Priors

Authors: Jiahao Shao, Yuanbo Yang, Hongyu Zhou, Youmin Zhang, Yujun Shen, Matteo Poggi, Yiyi Liao

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[117] arXiv:2406.01489 [pdf, other]: Title: DA-HFNet: Progressive Fine-Grained Forgery Image Detection and Localization Based on Dual Attention

Authors: Yang Liu, Xiaofei Li, Jun Zhang, Shengze Hu, Jun Lei

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[118] arXiv:2406.01486 [pdf, other]: Title: Differentiable Task Graph Learning: Procedural Activity Representation and Online Mistake Detection from Egocentric Videos

Authors: Luigi Seminara, Giovanni Maria Farinella, Antonino Furnari

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[119] arXiv:2406.01480 [pdf, other]: Title: Towards Automating the Retrospective Generation of BIM Models: A Unified Framework for 3D Semantic Reconstruction of the Built Environment

Authors: Ka Lung Cheung, Chi Chung Lee

Comments: CVPRW 2024, Oral

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[120] arXiv:2406.01476 [pdf, other]: Title: DreamPhysics: Learning Physical Properties of Dynamic 3D Gaussians with Video Diffusion Priors

Authors: Tianyu Huang, Yihan Zeng, Hui Li, Wangmeng Zuo, Rynson W. H. Lau

Comments: Technical report. Codes are released at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[121] arXiv:2406.01460 [pdf, other]: Title: MLIP: Efficient Multi-Perspective Language-Image Pretraining with Exhaustive Data Utilization

Authors: Yu Zhang, Qi Zhang, Zixuan Gong, Yiwei Shi, Yepeng Liu, Duoqian Miao, Yang Liu, Ke Liu, Kun Yi, Wei Fan, Liang Hu, Changwei Wang

Comments: ICML 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[122] arXiv:2406.01455 [pdf, other]: Title: Automatic Fused Multimodal Deep Learning for Plant Identification

Authors: Alfreds Lapkovskis, Natalia Nefedova, Ali Beikmohammadi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[123] arXiv:2406.01451 [pdf, other]: Title: SAM as the Guide: Mastering Pseudo-Label Refinement in Semi-Supervised Referring Expression Segmentation

Authors: Danni Yang, Jiayi Ji, Yiwei Ma, Tianyu Guo, Haowei Wang, Xiaoshuai Sun, Rongrong Ji

Comments: Accepted by ICML 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[124] arXiv:2406.01449 [pdf, other]: Title: SLANT: Spurious Logo ANalysis Toolkit

Authors: Maan Qraitem, Piotr Teterwak, Kate Saenko, Bryan A. Plummer

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[125] arXiv:2406.01432 [pdf, other]: Title: ED-SAM: An Efficient Diffusion Sampling Approach to Domain Generalization in Vision-Language Foundation Models

Authors: Thanh-Dat Truong, Xin Li, Bhiksha Raj, Jackson Cothren, Khoa Luu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[126] arXiv:2406.01429 [pdf, other]: Title: EAGLE: Efficient Adaptive Geometry-based Learning in Cross-view Understanding

Authors: Thanh-Dat Truong, Utsav Prabhu, Dongyi Wang, Bhiksha Raj, Susan Gauch, Jeyamkondan Subbiah, Khoa Luu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2406.01425 [pdf, other]: Title: Sensitivity-Informed Augmentation for Robust Segmentation

Authors: Laura Zheng, Wenjie Wei, Tony Wu, Jacob Clements, Shreelekha Revankar, Andre Harrison, Yu Shen, Ming C. Lin

Comments: 10 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[128] arXiv:2406.01402 [pdf, other]: Title: Mixture of Rationale: Multi-Modal Reasoning Mixture for Visual Question Answering

Authors: Tao Li, Linjun Shou, Xuejun Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[129] arXiv:2406.01395 [pdf, other]: Title: TE-NeXt: A LiDAR-Based 3D Sparse Convolutional Network for Traversability Estimation

Authors: Antonio Santo, Juan J. Cabrera, David Valiente, Carlos Viegas, Arturo Gil

Comments: This work has been submitted to the IEEE Transactions on Intelligent Vehicles for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[130] arXiv:2406.01388 [pdf, other]: Title: AutoStudio: Crafting Consistent Subjects in Multi-turn Interactive Image Generation

Authors: Junhao Cheng, Xi Lu, Hanhui Li, Khun Loun Zai, Baiqiao Yin, Yuhao Cheng, Yiqiang Yan, Xiaodan Liang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[131] arXiv:2406.01380 [pdf, other]: Title: Convolutional Unscented Kalman Filter for Multi-Object Tracking with Outliers

Authors: Shiqi Liu, Wenhan Cao, Chang Liu, Tianyi Zhang, Shengbo Eben Li

Comments: 11 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Applications (stat.AP)
[132] arXiv:2406.01365 [pdf, other]: Title: From Feature Visualization to Visual Circuits: Effect of Adversarial Model Manipulation

Authors: Geraldin Nanfack, Michael Eickenberg, Eugene Belilovsky

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[133] arXiv:2406.01356 [pdf, other]: Title: MP-PolarMask: A Faster and Finer Instance Segmentation for Concave Images

Authors: Ke-Lei Wang, Pin-Hsuan Chou, Young-Ching Chou, Chia-Jen Liu, Cheng-Kuan Lin, Yu-Chee Tseng

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[134] arXiv:2406.01355 [pdf, other]: Title: Differentially Private Fine-Tuning of Diffusion Models

Authors: Yu-Lin Tsai, Yizhe Li, Zekai Chen, Po-Yu Chen, Chia-Mu Yu, Xuebin Ren, Francois Buet-Golfouse

Comments: 16 pages, 5 figures, 11 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Cryptography and Security (cs.CR)
[135] arXiv:2406.01349 [pdf, other]: Title: Unleashing Generalization of End-to-End Autonomous Driving with Controllable Long Video Generation

Authors: Enhui Ma, Lijun Zhou, Tao Tang, Zhan Zhang, Dong Han, Junpeng Jiang, Kun Zhan, Peng Jia, Xianpeng Lang, Haiyang Sun, Di Lin, Kaicheng Yu

Comments: Project Page: this https URL, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[136] arXiv:2406.01337 [pdf, other]: Title: ARCH2S: Dataset, Benchmark and Challenges for Learning Exterior Architectural Structures from Point Clouds

Authors: Ka Lung Cheung, Chi Chung Lee

Comments: CVPRW 2024 (Oral)

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[137] arXiv:2406.01334 [pdf, other]: Title: HHMR: Holistic Hand Mesh Recovery by Enhancing the Multimodal Controllability of Graph Diffusion Models

Authors: Mengcheng Li, Hongwen Zhang, Yuxiang Zhang, Ruizhi Shao, Tao Yu, Yebin Liu

Comments: Accepted to CVPR 2024, project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[138] arXiv:2406.01326 [pdf, other]: Title: TabPedia: Towards Comprehensive Visual Table Understanding with Concept Synergy

Authors: Weichao Zhao, Hao Feng, Qi Liu, Jingqun Tang, Shu Wei, Binghong Wu, Lei Liao, Yongjie Ye, Hao Liu, Houqiang Li, Can Huang

Comments: 20 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[139] arXiv:2406.01316 [pdf, other]: Title: Enhancing Inertial Hand based HAR through Joint Representation of Language, Pose and Synthetic IMUs

Authors: Vitor Fortes Rey, Lala Shakti Swarup Ray, Xia Qingxin, Kaishun Wu, Paul Lukowicz

Comments: Review Copy

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[140] arXiv:2406.01315 [pdf, other]: Title: Scale-Free Image Keypoints Using Differentiable Persistent Homology

Authors: Giovanni Barbarani, Francesco Vaccarino, Gabriele Trivigno, Marco Guerra, Gabriele Berton, Carlo Masone

Comments: Accepted to ICML 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Algebraic Topology (math.AT)
[141] arXiv:2406.01314 [pdf, other]: Title: Compute-Efficient Medical Image Classification with Softmax-Free Transformers and Sequence Normalization

Authors: Firas Khader, Omar S. M. El Nahhas, Tianyu Han, Gustav Müller-Franzes, Sven Nebelung, Jakob Nikolas Kather, Daniel Truhn

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[142] arXiv:2406.01302 [pdf, ps, other]: Title: Pulmonary Embolism Mortality Prediction Using Multimodal Learning Based on Computed Tomography Angiography and Clinical Data

Authors: Zhusi Zhong, Helen Zhang, Fayez H. Fayad, Andrew C. Lancaster, John Sollee, Shreyas Kulkarni, Cheng Ting Lin, Jie Li, Xinbo Gao, Scott Collinsa, Sun H. Ahn, Harrison X. Bai, Zhicheng Jiao, Michael K. Atalay

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[143] arXiv:2406.01300 [pdf, other]: Title: pOps: Photo-Inspired Diffusion Operators

Authors: Elad Richardson, Yuval Alaluf, Ali Mahdavi-Amiri, Daniel Cohen-Or

Comments: Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[144] arXiv:2406.01294 [pdf, other]: Title: Capsule Enhanced Variational AutoEncoder for Underwater Image Reconstruction

Authors: Rita Pucci, Niki Martinel

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[145] arXiv:2406.01278 [pdf, other]: Title: fruit-SALAD: A Style Aligned Artwork Dataset to reveal similarity perception in image embeddings

Authors: Tillmann Ohm, Andres Karjus, Mikhail Tamm, Maximilian Schich

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computational Complexity (cs.CC); Machine Learning (cs.LG)
[146] arXiv:2406.01264 [pdf, other]: Title: FreeTumor: Advance Tumor Segmentation via Large-Scale Tumor Synthesis

Authors: Linshan Wu, Jiaxin Zhuang, Xuefeng Ni, Hao Chen

Comments: Preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[147] arXiv:2406.01256 [pdf, other]: Title: Augmented Commonsense Knowledge for Remote Object Grounding

Authors: Bahram Mohammadi, Yicong Hong, Yuankai Qi, Qi Wu, Shirui Pan, Javen Qinfeng Shi

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[148] arXiv:2406.01210 [pdf, other]: Title: GeminiFusion: Efficient Pixel-wise Multimodal Fusion for Vision Transformer

Authors: Ding Jia, Jianyuan Guo, Kai Han, Han Wu, Chao Zhang, Chang Xu, Xinghao Chen

Comments: Accepted by ICML 2024, code and models are available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[149] arXiv:2406.01203 [pdf, other]: Title: Scaling Up Deep Clustering Methods Beyond ImageNet-1K

Authors: Nikolas Adaloglou, Felix Michels, Kaspar Senft, Diana Petrusheva, Markus Kollmann

Comments: Work in progress

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[150] arXiv:2406.01196 [pdf, other]: Title: 3D WholeBody Pose Estimation based on Semantic Graph Attention Network and Distance Information

Authors: Sihan Wen, Xiantan Zhu, Zhiming Tan

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[151] arXiv:2406.01194 [pdf, other]: Title: AFF-ttention! Affordances and Attention models for Short-Term Object Interaction Anticipation

Authors: Lorenzo Mur-Labadia, Ruben Martinez-Cantin, Josechu Guerrero, Giovanni Maria Farinella, Antonino Furnari

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[152] arXiv:2406.01188 [pdf, other]: Title: UniAnimate: Taming Unified Video Diffusion Models for Consistent Human Image Animation

Authors: Xiang Wang, Shiwei Zhang, Changxin Gao, Jiayu Wang, Xiaoqiang Zhou, Yingya Zhang, Luxin Yan, Nong Sang

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[153] arXiv:2406.01170 [pdf, other]: Title: Zero-Shot Out-of-Distribution Detection with Outlier Label Exposure

Authors: Choubo Ding, Guansong Pang

Comments: Accepted by IJCNN 2024, 8 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[154] arXiv:2406.01159 [pdf, other]: Title: Dimba: Transformer-Mamba Diffusion Models

Authors: Zhengcong Fei, Mingyuan Fan, Changqian Yu, Debang Li, Youqiang Zhang, Junshi Huang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[155] arXiv:2406.01154 [pdf, other]: Title: DeepUniUSTransformer: Towards A Universal UltraSound Model with Prompted Guidance

Authors: Zehui Lin, Zhuoneng Zhang, Xindi Hu, Zhifan Gao, Xin Yang, Yue Sun, Dong Ni, Tao Tan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[156] arXiv:2406.01136 [pdf, other]: Title: Towards Practical Single-shot Motion Synthesis

Authors: Konstantinos Roditakis, Spyridon Thermos, Nikolaos Zioulis

Comments: CVPR 2024, AI for 3D Generation Workshop, Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[157] arXiv:2406.01127 [pdf, other]: Title: Learning Adaptive Fusion Bank for Multi-modal Salient Object Detection

Authors: Kunpeng Wang, Zhengzheng Tu, Chenglong Li, Cheng Zhang, Bin Luo

Comments: Accepted by TCSVT 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[158] arXiv:2406.01125 [pdf, other]: Title: $Δ$-DiT: A Training-Free Acceleration Method Tailored for Diffusion Transformers

Authors: Pengtao Chen, Mingzhu Shen, Peng Ye, Jianjian Cao, Chongjun Tu, Christos-Savvas Bouganis, Yiren Zhao, Tao Chen

Comments: 12 pages, 6 figures, 6 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[159] arXiv:2406.01112 [pdf, other]: Title: BACON: Bayesian Optimal Condensation Framework for Dataset Distillation

Authors: Zheng Zhou, Hongbo Zhao, Guangliang Cheng, Xiangtai Li, Shuchang Lyu, Wenquan Feng, Qi Zhao

Comments: 22 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[160] arXiv:2406.01079 [pdf, other]: Title: Object Aware Egocentric Online Action Detection

Authors: Joungbin An, Yunsu Park, Hyolim Kang, Seon Joo Kim

Comments: CVPR First Joint Egocentric Vision Workshop 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[161] arXiv:2406.01078 [pdf, other]: Title: CUT: A Controllable, Universal, and Training-Free Visual Anomaly Generation Framework

Authors: Han Sun, Yunkang Cao, Olga Fink

Comments: 9 pages excluding appendix

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[162] arXiv:2406.01076 [pdf, other]: Title: Estimating Canopy Height at Scale

Authors: Jan Pauls, Max Zimmer, Una M. Kelly, Martin Schwartz, Sassan Saatchi, Philippe Ciais, Sebastian Pokutta, Martin Brandt, Fabian Gieseke

Comments: ICML Camera-Ready, 17 pages, 14 figures, 7 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[163] arXiv:2406.01073 [pdf, other]: Title: Understanding the Cross-Domain Capabilities of Video-Based Few-Shot Action Recognition Models

Authors: Georgia Markham, Mehala Balamurali, Andrew J. Hill

Comments: Preprint. Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164] arXiv:2406.01071 [pdf, other]: Title: Visual Car Brand Classification by Implementing a Synthetic Image Dataset Creation Pipeline

Authors: Jan Lippemeier, Stefanie Hittmeyer, Oliver Niehörster, Markus Lange-Hegermann

Comments: 10 pages, 6 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[165] arXiv:2406.01069 [pdf, other]: Title: UniQA: Unified Vision-Language Pre-training for Image Quality and Aesthetic Assessment

Authors: Hantao Zhou, Longxiang Tang, Rui Yang, Guanyi Qin, Yan Zhang, Runze Hu, Xiu Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166] arXiv:2406.01063 [pdf, other]: Title: DANCE: Dual-View Distribution Alignment for Dataset Condensation

Authors: Hansong Zhang, Shikun Li, Fanzhao Lin, Weiping Wang, Zhenxing Qian, Shiming Ge

Comments: This work has been accepted by IJCAI-24

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167] arXiv:2406.01062 [pdf, other]: Title: SceneTextGen: Layout-Agnostic Scene Text Image Synthesis with Diffusion Models

Authors: Qilong Zhangli, Jindong Jiang, Di Liu, Licheng Yu, Xiaoliang Dai, Ankit Ramchandani, Guan Pang, Dimitris N. Metaxas, Praveen Krishnan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[168] arXiv:2406.01059 [pdf, other]: Title: VIP: Versatile Image Outpainting Empowered by Multimodal Large Language Model

Authors: Jinze Yang, Haoran Wang, Zining Zhu, Chenglong Liu, Meng Wymond Wu, Zeke Xie, Zhong Ji, Jungong Han, Mingming Sun

Comments: 15 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[169] arXiv:2406.01056 [pdf, other]: Title: Virtual avatar generation models as world navigators

Authors: Sai Mandava

Comments: 16 pages, 15 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Robotics (cs.RO)
[170] arXiv:2406.01042 [pdf, other]: Title: Self-Calibrating 4D Novel View Synthesis from Monocular Videos Using Gaussian Splatting

Authors: Fang Li, Hao Zhang, Narendra Ahuja

Comments: GitHub Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171] arXiv:2406.01040 [pdf, other]: Title: Synthetic Data Generation for 3D Myocardium Deformation Analysis

Authors: Shahar Zuler, Dan Raviv

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[172] arXiv:2406.01033 [pdf, ps, other]: Title: Generalized Jersey Number Recognition Using Multi-task Learning With Orientation-guided Weight Refinement

Authors: Yung-Hui Lin, Yu-Wen Chang, Huang-Chia Shih, Takahiro Ogawa

Comments: 10 pages, 6 figures, 5 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[173] arXiv:2406.01029 [pdf, other]: Title: CYCLO: Cyclic Graph Transformer Approach to Multi-Object Relationship Modeling in Aerial Videos

Authors: Trong-Thuan Nguyen, Pha Nguyen, Xin Li, Jackson Cothren, Alper Yilmaz, Khoa Luu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174] arXiv:2406.01028 [pdf, other]: Title: LLEMamba: Low-Light Enhancement via Relighting-Guided Mamba with Deep Unfolding Network

Authors: Xuanqi Zhang, Haijin Zeng, Jinwang Pan, Qiangqiang Shen, Yongyong Chen

Comments: 9 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175] arXiv:2406.01025 [pdf, ps, other]: Title: Khayyam Offline Persian Handwriting Dataset

Authors: Pourya Jafarzadeh, Padideh Choobdar, Vahid Mohammadi Safarzadeh

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176] arXiv:2406.01020 [pdf, other]: Title: CLIP-Guided Attribute Aware Pretraining for Generalizable Image Quality Assessment

Authors: Daekyu Kwon, Dongyoung Kim, Sehwan Ki, Younghyun Jo, Hyong-Euk Lee, Seon Joo Kim

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177] arXiv:2406.01003 [pdf, other]: Title: Uni-ISP: Unifying the Learning of ISPs from Multiple Cameras

Authors: Lingen Li, Mingde Yao, Xingyu Meng, Muquan Yu, Tianfan Xue, Jinwei Gu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[178] arXiv:2406.00985 [pdf, other]: Title: MultiEdits: Simultaneous Multi-Aspect Editing with Text-to-Image Diffusion Models

Authors: Mingzhen Huang, Jialing Cai, Shan Jia, Vishnu Suresh Lokhande, Siwei Lyu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[179] arXiv:2406.00977 [pdf, other]: Title: Dragonfly: Multi-Resolution Zoom Supercharges Large Visual-Language Model

Authors: Kezhen Chen, Rahul Thapa, Rahul Chalamala, Ben Athiwaratkun, Shuaiwen Leon Song, James Zou

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[180] arXiv:2406.00971 [pdf, other]: Title: MiniGPT-Reverse-Designing: Predicting Image Adjustments Utilizing MiniGPT-4

Authors: Vahid Azizi, Fatemeh Koochaki

Comments: 8 pages, 7 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[181] arXiv:2406.00956 [pdf, other]: Title: Improving Segment Anything on the Fly: Auxiliary Online Learning and Adaptive Fusion for Medical Image Segmentation

Authors: Tianyu Huang, Tao Zhou, Weidi Xie, Shuo Wang, Qi Dou, Yizhe Zhang

Comments: Project Link: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Image and Video Processing (eess.IV)
[182] arXiv:2406.00955 [pdf, other]: Title: How Video Meetings Change Your Expression

Authors: Sumit Sarin, Utkarsh Mall, Purva Tendulkar, Carl Vondrick

Comments: Project webpage is available at: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[183] arXiv:2406.00947 [pdf, other]: Title: Cross-Dimensional Medical Self-Supervised Representation Learning Based on a Pseudo-3D Transformation

Authors: Fei Gao, Siwen Wang, Churan Wang, Fandong Zhang, Hong-Yu Zhou, Yizhou Wang, Gang Yu, Yizhou Yu

Comments: Accepted at MICCAI 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184] arXiv:2406.00934 [pdf, other]: Title: LanEvil: Benchmarking the Robustness of Lane Detection to Environmental Illusions

Authors: Tianyuan Zhang, Lu Wang, Hainan Li, Yisong Xiao, Siyuan Liang, Aishan Liu, Xianglong Liu, Dacheng Tao

Comments: Submitted to ACM MM 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[185] arXiv:2406.00929 [pdf, other]: Title: Self-Supervised Geometry-Guided Initialization for Robust Monocular Visual Odometry

Authors: Takayuki Kanai, Igor Vasiljevic, Vitor Guizilini, Kazuhiro Shintani

Comments: 8 pages. 5 figures. This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[186] arXiv:2406.00919 [pdf, other]: Title: Advancing Weakly-Supervised Audio-Visual Video Parsing via Segment-wise Pseudo Labeling

Authors: Jinxing Zhou, Dan Guo, Yiran Zhong, Meng Wang

Comments: IJCV 2024 Accepted. arXiv admin note: substantial text overlap with arXiv:2303.02344

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[187] arXiv:2406.00917 [pdf, other]: Title: Alignment-Free RGBT Salient Object Detection: Semantics-guided Asymmetric Correlation Network and A Unified Benchmark

Authors: Kunpeng Wang, Danying Lin, Chenglong Li, Zhengzheng Tu, Bin Luo

Comments: Accepted by TMM 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[188] arXiv:2406.00908 [pdf, other]: Title: ZeroSmooth: Training-free Diffuser Adaptation for High Frame Rate Video Generation

Authors: Shaoshu Yang, Yong Zhang, Xiaodong Cun, Ying Shan, Ran He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[189] arXiv:2406.00907 [pdf, other]: Title: DDA: Dimensionality Driven Augmentation Search for Contrastive Learning in Laparoscopic Surgery

Authors: Yuning Zhou, Henry Badgery, Matthew Read, James Bailey, Catherine E. Davey

Comments: 29 pages, 16 figures; MIDL 2024 - Medical Imaging with Deep Learning

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[190] arXiv:2406.00891 [pdf, other]: Title: Global High Categorical Resolution Land Cover Mapping via Weak Supervision

Authors: Xin-Yi Tong, Runmin Dong, Xiao Xiang Zhu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[191] arXiv:2406.00885 [pdf, other]: Title: Visual place recognition for aerial imagery: A survey

Authors: Ivan Moskalenko, Anastasiia Kornilova, Gonzalo Ferrer

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[192] arXiv:2406.00872 [pdf, other]: Title: OLIVE: Object Level In-Context Visual Embeddings

Authors: Timothy Ossowski, Junjie Hu

Comments: ACL 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[193] arXiv:2406.00856 [pdf, other]: Title: DistilDIRE: A Small, Fast, Cheap and Lightweight Diffusion Synthesized Deepfake Detection

Authors: Yewon Lim, Changyeon Lee, Aerin Kim, Oren Etzioni

Comments: 6 pages, 1 figure

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[194] arXiv:2406.00848 [pdf, ps, other]: Title: Eating Smart: Advancing Health Informatics with the Grounding DINO based Dietary Assistant App

Authors: Abdelilah Nossair, Hamza El Housni

Comments: The work presented in this paper was part of the proceedings for the First International Conference on Artificial Intelligence (ICATA 2024)

Journal-ref: Eating Smart: Advancing Health Informatics with the Grounding DINO-based Dietary Assistant App, International Journal of Scientific and Innovative Studies, June 2024, Volume 3, Number 3, Pages 26-34, Available online at IJSRIS

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[195] arXiv:2406.00830 [pdf, other]: Title: Collaborative Novel Object Discovery and Box-Guided Cross-Modal Alignment for Open-Vocabulary 3D Object Detection

Authors: Yang Cao, Yihan Zeng, Hang Xu, Dan Xu

Comments: Code Page: this https URL. This paper has been submitted to IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI) for possible publication

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[196] arXiv:2406.00828 [pdf, other]: Title: Stealing Image-to-Image Translation Models With a Single Query

Authors: Nurit Spingarn-Eliezer, Tomer Michaeli

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[197] arXiv:2406.00808 [pdf, other]: Title: EchoNet-Synthetic: Privacy-preserving Video Generation for Safe Medical Data Sharing

Authors: Hadrien Reynaud, Qingjie Meng, Mischa Dombrowski, Arijit Ghosh, Thomas Day, Alberto Gomez, Paul Leeson, Bernhard Kainz

Comments: Accepted at MICCAI 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[198] arXiv:2406.00798 [pdf, other]: Title: PruNeRF: Segment-Centric Dataset Pruning via 3D Spatial Consistency

Authors: Yeonsung Jung, Heecheol Yun, Joonhyung Park, Jin-Hwa Kim, Eunho Yang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[199] arXiv:2406.00791 [pdf, other]: Title: Towards Point Cloud Compression for Machine Perception: A Simple and Strong Baseline by Learning the Octree Depth Level Predictor

Authors: Lei Liu, Zhihao Hu, Zhenghao Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[200] arXiv:2406.00783 [pdf, other]: Title: AI-Face: A Million-Scale Demographically Annotated AI-Generated Face Dataset and Fairness Benchmark

Authors: Li Lin, Santosh, Xin Wang, Shu Hu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[201] arXiv:2406.00777 [pdf, other]: Title: Diffusion Features to Bridge Domain Gap for Semantic Segmentation

Authors: Yuxiang Ji, Boyong He, Chenyuan Qu, Zhuoyue Tan, Chuan Qin, Liaoni Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[202] arXiv:2406.00772 [pdf, other]: Title: Unsupervised Contrastive Analysis for Salient Pattern Detection using Conditional Diffusion Models

Authors: Cristiano Patrício, Carlo Alberto Barbano, Attilio Fiandrotti, Riccardo Renzulli, Marco Grangetto, Luis F. Teixeira, João C. Neves

Comments: 18 pages, 11 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[203] arXiv:2406.00750 [pdf, other]: Title: Freeplane: Unlocking Free Lunch in Triplane-Based Sparse-View Reconstruction Models

Authors: Wenqiang Sun, Zhengyi Wang, Shuo Chen, Yikai Wang, Zilong Chen, Jun Zhu, Jun Zhang

Comments: project can be found in: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[204] arXiv:2406.00749 [pdf, other]: Title: CCF: Cross Correcting Framework for Pedestrian Trajectory Prediction

Authors: Pranav Singh Chib, Pravendra Singh

Comments: Under review

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[205] arXiv:2406.00721 [pdf, other]: Title: Explore Internal and External Similarity for Single Image Deraining with Graph Neural Networks

Authors: Cong Wang, Wei Wang, Chengjin Yu, Jie Mu

Comments: IJCAI-24; Project Page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[206] arXiv:2406.00714 [pdf, other]: Title: A Survey of Deep Learning Based Radar and Vision Fusion for 3D Object Detection in Autonomous Driving

Authors: Di Wu, Feng Yang, Benlian Xu, Pan Liao, Bo Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[207] arXiv:2406.00704 [pdf, other]: Title: An Optimized Toolbox for Advanced Image Processing with Tsetlin Machine Composites

Authors: Ylva Grønningsæter, Halvor S. Smørvik, Ole-Christoffer Granmo

Comments: 8 pages, 3 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[208] arXiv:2406.00699 [pdf, other]: Title: Towards General Robustness Verification of MaxPool-based Convolutional Neural Networks via Tightening Linear Approximation

Authors: Yuan Xiao, Shiqing Ma, Juan Zhai, Chunrong Fang, Jinyuan Jia, Zhenyu Chen

Comments: Accepted to CVPR 2024. Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[209] arXiv:2406.00696 [pdf, ps, other]: Title: Bilinear-Convolutional Neural Network Using a Matrix Similarity-based Joint Loss Function for Skin Disease Classification

Authors: Belal Ahmad, Mohd Usama, Tanvir Ahmad, Adnan Saeed, Shabnam Khatoon, Long Hu

Comments: 16 pages, 11 figures, 2 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[210] arXiv:2406.00687 [pdf, other]: Title: Lay-A-Scene: Personalized 3D Object Arrangement Using Text-to-Image Priors

Authors: Ohad Rahamim, Hilit Segev, Idan Achituve, Yuval Atzmon, Yoni Kasten, Gal Chechik

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[211] arXiv:2406.00685 [pdf, other]: Title: Improving Accuracy-robustness Trade-off via Pixel Reweighted Adversarial Training

Authors: Jiacheng Zhang, Feng Liu, Dawei Zhou, Jingfeng Zhang, Tongliang Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[212] arXiv:2406.00684 [pdf, other]: Title: Deciphering Oracle Bone Language with Diffusion Models

Authors: Haisu Guan, Huanxin Yang, Xinyu Wang, Shengwei Han, Yongge Liu, Lianwen Jin, Xiang Bai, Yuliang Liu

Comments: ACL 2024 main conference long paper

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[213] arXiv:2406.00676 [pdf, other]: Title: W-Net: A Facial Feature-Guided Face Super-Resolution Network

Authors: Hao Liu, Yang Yang, Yunxia Liu

Comments: 15 pages, 9 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[214] arXiv:2406.00672 [pdf, other]: Title: Task-oriented Embedding Counts: Heuristic Clustering-driven Feature Fine-tuning for Whole Slide Image Classification

Authors: Xuenian Wang, Shanshan Shi, Renao Yan, Qiehe Sun, Lianghui Zhu, Tian Guan, Yonghong He

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[215] arXiv:2406.00670 [pdf, other]: Title: Cascade-CLIP: Cascaded Vision-Language Embeddings Alignment for Zero-Shot Semantic Segmentation

Authors: Yunheng Li, ZhongYu Li, Quansheng Zeng, Qibin Hou, Ming-Ming Cheng

Comments: Accepted by ICML 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[216] arXiv:2406.00663 [pdf, other]: Title: SimSAM: Zero-shot Medical Image Segmentation via Simulated Interaction

Authors: Benjamin Towle, Xin Chen, Ke Zhou

Comments: Published at ISBI 2024. Awarded Top 12 Oral Presentation

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
