Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 112

[ total of 456 entries; showing entries 113-216 ]

Thu, 9 May 2024 (continued, showing last 50 of 76 entries)

[113] arXiv:2405.04971 [pdf, other]: Title: End-to-End Semi-Supervised approach with Modulated Object Queries for Table Detection in Documents

Authors: Iqraa Ehsan, Tahira Shehzadi, Didier Stricker, Muhammad Zeshan Afzal

Comments: ICDAR-IJDAR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[114] arXiv:2405.04969 [pdf, other]: Title: A review on discriminative self-supervised learning methods

Authors: Nikolaos Giakoumoglou, Tania Stathaki

Comments: 21 pages, 7 figures, 11 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[115] arXiv:2405.04964 [pdf, other]: Title: Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution

Authors: Yi Xiao, Qiangqiang Yuan, Kui Jiang, Yuzeng Chen, Qiang Zhang, Chia-Wen Lin

Comments: Frequency-Assisted Mamba for Remote Sensing Image Super-Resolution

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[116] arXiv:2405.04953 [pdf, other]: Title: Supervised Anomaly Detection for Complex Industrial Images

Authors: Aimira Baitieva, David Hurych, Victor Besnier, Olivier Bernard

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[117] arXiv:2405.04950 [pdf, other]: Title: VisionGraph: Leveraging Large Multimodal Models for Graph Theory Problems in Visual Context

Authors: Yunxin Li, Baotian Hu, Haoyuan Shi, Wei Wang, Longyue Wang, Min Zhang

Comments: 17 pages; Accepted by ICML 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL)
[118] arXiv:2405.04943 [pdf, ps, other]: Title: Unsupervised Skin Feature Tracking with Deep Neural Networks

Authors: Jose Chang, Torbjörn E.M. Nordling

Comments: arXiv admin note: text overlap with arXiv:2112.14159

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[119] arXiv:2405.04940 [pdf, other]: Title: Harnessing the Power of MLLMs for Transferable Text-to-Image Person ReID

Authors: Wentao Tan, Changxing Ding, Jiayu Jiang, Fei Wang, Yibing Zhan, Dapeng Tao

Comments: CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[120] arXiv:2405.04918 [pdf, other]: Title: Delve into Base-Novel Confusion: Redundancy Exploration for Few-Shot Class-Incremental Learning

Authors: Haichen Zhou, Yixiong Zou, Ruixuan Li, Yuhua Li, Kui Xiao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[121] arXiv:2405.04913 [pdf, other]: Title: Weakly-supervised Semantic Segmentation via Dual-stream Contrastive Learning of Cross-image Contextual Information

Authors: Qi Lai, Chi-Man Vong

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[122] arXiv:2405.04909 [pdf, other]: Title: Traj-LLM: A New Exploration for Empowering Trajectory Prediction with Pre-trained Large Language Models

Authors: Zhengxing Lan, Hongbo Li, Lingshan Liu, Bo Fan, Yisheng Lv, Yilong Ren, Zhiyong Cui

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[123] arXiv:2405.04900 [pdf, other]: Title: Self-supervised Gait-based Emotion Representation Learning from Selective Strongly Augmented Skeleton Sequences

Authors: Cheng Song, Lu Lu, Zhen Ke, Long Gao, Shuai Ding

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[124] arXiv:2405.04889 [pdf, other]: Title: Fast LiDAR Upsampling using Conditional Diffusion Models

Authors: Sander Elias Magnussen Helgesen, Kazuto Nakashima, Jim Tørresen, Ryo Kurazume

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[125] arXiv:2405.04883 [pdf, other]: Title: Molecule-Space: Free Lunch in Unified Multimodal Space via Knowledge Fusion

Authors: Zehan Wang, Ziang Zhang, Xize Cheng, Rongjie Huang, Luping Liu, Zhenhui Ye, Haifeng Huang, Yang Zhao, Tao Jin, Peng Gao, Zhou Zhao

Comments: Accepted by ICML 2024. The code and checkpoints are released at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[126] arXiv:2405.04858 [pdf, other]: Title: Pedestrian Attribute Recognition as Label-balanced Multi-label Learning

Authors: Yibo Zhou, Hai-Miao Hu, Yirong Xiang, Xiaokang Zhang, Haotian Wu

Comments: Accepted as ICML2024 main conference paper

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[127] arXiv:2405.04834 [pdf, other]: Title: FlexEControl: Flexible and Efficient Multimodal Control for Text-to-Image Generation

Authors: Xuehai He, Jian Zheng, Jacob Zhiyuan Fang, Robinson Piramuthu, Mohit Bansal, Vicente Ordonez, Gunnar A Sigurdsson, Nanyun Peng, Xin Eric Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[128] arXiv:2405.04815 [pdf, other]: Title: Proportion Estimation by Masked Learning from Label Proportion

Authors: Takumi Okuo, Kazuya Nishimura, Hiroaki Ito, Kazuhiro Terada, Akihiko Yoshizawa, Ryoma Bise

Comments: Accepted at The 3rd MICCAI workshop on Data Augmentation, Labeling, and Imperfections

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[129] arXiv:2405.04807 [pdf, other]: Title: Transformer Architecture for NetsDB

Authors: Subodh Kamble, Kunal Sunil Kasodekar

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[130] arXiv:2405.04800 [pdf, other]: Title: DeepDamageNet: A two-step deep-learning model for multi-disaster building damage segmentation and classification using satellite imagery

Authors: Irene Alisjahbana, Jiawei Li, Ben (Mullet) Strong, Yue Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[131] arXiv:2405.04788 [pdf, other]: Title: DiffMatch: Visual-Language Guidance Makes Better Semi-supervised Change Detector

Authors: Kaiyu Li, Xiangyong Cao, Yupeng Deng, Deyu Meng

Comments: 13 pages, 4 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[132] arXiv:2405.04782 [pdf, other]: Title: Dual-Image Enhanced CLIP for Zero-Shot Anomaly Detection

Authors: Zhaoxiang Zhang, Hanqiu Deng, Jinan Bao, Xingyu Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[133] arXiv:2405.04771 [pdf, other]: Title: Exploring Vision Transformers for 3D Human Motion-Language Models with Motion Patches

Authors: Qing Yu, Mikihiro Tanaka, Kent Fujiwara

Comments: Accepted to CVPR 2024, Project website: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[134] arXiv:2405.04759 [pdf, ps, other]: Title: Multi-Label Out-of-Distribution Detection with Spectral Normalized Joint Energy

Authors: Yihan Mei, Xinyu Wang, Dell Zhang, Xiaoling Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[135] arXiv:2405.04741 [pdf, other]: Title: All in One Framework for Multimodal Re-identification in the Wild

Authors: He Li, Mang Ye, Ming Zhang, Bo Du

Comments: 12 pages, 3 figures, CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[136] arXiv:2405.04722 [pdf, other]: Title: Detecting and Refining HiRISE Image Patches Obscured by Atmospheric Dust

Authors: Kunal Sunil Kasodekar

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[137] arXiv:2405.04717 [pdf, other]: Title: Remote Diffusion

Authors: Kunal Sunil Kasodekar

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[138] arXiv:2405.04682 [pdf, other]: Title: TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation

Authors: Hritik Bansal, Yonatan Bitton, Michal Yarom, Idan Szpektor, Aditya Grover, Kai-Wei Chang

Comments: 23 pages, 12 figures, 8 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[139] arXiv:2405.04675 [pdf, other]: Title: TexControl: Sketch-Based Two-Stage Fashion Image Generation Using Diffusion Model

Authors: Yongming Zhang, Tianyu Zhang, Haoran Xie

Comments: 5 pages, 8 figures, accepted in NICOGRAPH International 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[140] arXiv:2405.04662 [pdf, other]: Title: Radar Fields: Frequency-Space Neural Scene Representations for FMCW Radar

Authors: David Borts, Erich Liang, Tim Brödermann, Andrea Ramazzina, Stefanie Walz, Edoardo Palladin, Jipeng Sun, David Bruggemann, Christos Sakaridis, Luc Van Gool, Mario Bijelic, Felix Heide

Comments: 8 pages, 6 figures, to be published in SIGGRAPH 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141] arXiv:2405.04650 [pdf, other]: Title: A Self-Supervised Method for Body Part Segmentation and Keypoint Detection of Rat Images

Authors: László Kopácsi, Áron Fóthi, András Lőrincz

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[142] arXiv:2405.04634 [pdf, other]: Title: FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes

Authors: Charles Gaydon, Michel Daab, Floryne Roche

Comments: 15 pages | 9 figures | 8 tables | Dataset is available at this https URL | Trained model is available at this https URL | Deep learning code repository is on GitHub at this https URL | Data engineering code repository is on GitHub at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[143] arXiv:2405.04605 [pdf, ps, other]: Title: AI in Lung Health: Benchmarking Detection and Diagnostic Models Across Multiple CT Scan Datasets

Authors: Fakrul Islam Tushar, Avivah Wang, Lavsen Dahal, Michael R. Harowicz, Kyle J. Lafata, Tina D. Tailor, Joseph Y. Lo

Comments: 16 pages, 2 tables, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[144] arXiv:2405.04589 [pdf, other]: Title: A Novel Wide-Area Multiobject Detection System with High-Probability Region Searching

Authors: Xianlei Long, Hui Zhao, Chao Chen, Fuqiang Gu, Qingyi Gu

Comments: Accepted by ICRA 2024

Journal-ref: 2024 IEEE International Conference on Robotics and Automation (ICRA)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[145] arXiv:2405.04549 [pdf, other]: Title: ClothPPO: A Proximal Policy Optimization Enhancing Framework for Robotic Cloth Manipulation with Observation-Aligned Action Spaces

Authors: Libing Yang, Yang Li, Long Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[146] arXiv:2405.04538 [pdf, other]: Title: DiffFinger: Advancing Synthetic Fingerprint Generation through Denoising Diffusion Probabilistic Models

Authors: Freddie Grabovski, Lior Yasur, Yaniv Hacmon, Lior Nisimov, Stav Nimrod

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[147] arXiv:2405.04537 [pdf, other]: Title: An intuitive multi-frequency feature representation for SO(3)-equivariant networks

Authors: Dongwon Son, Jaehyung Kim, Sanghyeon Son, Beomjoon Kim

Comments: ICLR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[148] arXiv:2405.04536 [pdf, other]: Title: When Training-Free NAS Meets Vision Transformer: A Neural Tangent Kernel Perspective

Authors: Qiqi Zhou, Yichen Zhu

Comments: ICASSP2024 oral

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[149] arXiv:2405.04535 [pdf, other]: Title: Image Classification for CSSVD Detection in Cacao Plants

Authors: Atuhurra Jesse, N'guessan Yves-Roland Douha, Pabitra Lenka

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[150] arXiv:2405.05170 (cross-list from cs.MM) [pdf, other]: Title: Picking watermarks from noise (PWFN): an improved robust watermarking model against intensive distortions

Authors: Sijing Xie, Chengxin Zhao, Nan Sun, Wei Li, Hefei Ling

Comments: Accepted by ICME2024

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[151] arXiv:2405.05160 (cross-list from cs.LG) [pdf, other]: Title: Selective Classification Under Distribution Shifts

Authors: Hengyue Liang, Le Peng, Ju Sun

Comments: Total 25 pages (14 pages for main body); preprint for journal submission

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[152] arXiv:2405.05095 (cross-list from math.NA) [pdf, other]: Title: Approximation properties relative to continuous scale space for hybrid discretizations of Gaussian derivative operators

Authors: Tony Lindeberg

Comments: 13 pages, 11 figures. arXiv admin note: text overlap with arXiv:2311.11317

Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV)
[153] arXiv:2405.05007 (cross-list from eess.IV) [pdf, other]: Title: HC-Mamba: Vision MAMBA with Hybrid Convolutional Techniques for Medical Image Segmentation

Authors: Jiashu Xu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[154] arXiv:2405.04966 (cross-list from cs.IT) [pdf, other]: Title: Communication-Efficient Collaborative Perception via Information Filling with Codebook

Authors: Yue Hu, Juntong Peng, Sifei Liu, Junhao Ge, Si Liu, Siheng Chen

Comments: 10 pages, Accepted by CVPR 2024

Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[155] arXiv:2405.04902 (cross-list from eess.IV) [pdf, other]: Title: HAGAN: Hybrid Augmented Generative Adversarial Network for Medical Image Synthesis

Authors: Zhihan Ju, Wanting Zhou, Longteng Kong, Yu Chen, Yi Li, Zhenan Sun, Caifeng Shan

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[156] arXiv:2405.04890 (cross-list from cs.RO) [pdf, other]: Title: GISR: Geometric Initialization and Silhouette-based Refinement for Single-View Robot Pose and Configuration Estimation

Authors: Ivan Bilić, Filip Marić, Fabio Bonsignorio, Ivan Petrović

Comments: Submitted to IEEE Robotics and Automation Letters (RA-L)

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[157] arXiv:2405.04867 (cross-list from eess.IV) [pdf, other]: Title: MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results

Authors: Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng, Yongyong Chen, Jingyong Su, Xianyu Guan, Hongyuan Yu, Cheng Wan, Jiamin Lin, Binnan Han, Yajun Zou, Zhuoyuan Wu, Yuan Huang, Yongsheng Yu, Daoan Zhang, Jizhe Li, Xuanwu Yin, Kunlong Zuo, Yunfan Lu, Yijie Xu, Wenzong Ma, Weiyu Guo, Hui Xiong, Wei Yu, Bingchun Luo, Sabari Nathan, Priya Kansal

Comments: MIPI@CVPR2024. Website: this https URL

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[158] arXiv:2405.04812 (cross-list from cs.RO) [pdf, other]: Title: General Place Recognition Survey: Towards Real-World Autonomy

Authors: Peng Yin, Jianhao Jiao, Shiqi Zhao, Lingyun Xu, Guoquan Huang, Howie Choset, Sebastian Scherer, Jianda Han

Comments: 20 pages, 12 figures, under review

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[159] arXiv:2405.04778 (cross-list from eess.IV) [pdf, other]: Title: Teacher-Student Network for Real-World Face Super-Resolution with Progressive Embedding of Edge Information

Authors: Zhilei Liu, Chenggong Zhang

Comments: Accepted by ICIP 2023

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[160] arXiv:2405.04610 (cross-list from eess.IV) [pdf, other]: Title: Exploring Explainable AI Techniques for Improved Interpretability in Lung and Colon Cancer Classification

Authors: Mukaffi Bin Moin, Fatema Tuj Johora Faria, Swarnajit Saha, Bushra Kamal Rafa, Mohammad Shafiul Alam

Comments: Accepted in 4th International Conference on Computing and Communication Networks (ICCCNet-2024)

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[161] arXiv:2405.04595 (cross-list from eess.IV) [pdf, ps, other]: Title: An Advanced Features Extraction Module for Remote Sensing Image Super-Resolution

Authors: Naveed Sultan, Amir Hajian, Supavadee Aramvith

Comments: Preprint of paper from The 21st International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology or ECTI-CON 2024, Khon Kaen, Thailand

Journal-ref: ECTI-CON 2024, Khon Kaen Thailand

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[162] arXiv:2405.04507 (cross-list from stat.AP) [pdf, other]: Title: New allometric models for the USA create a step-change in forest carbon estimation, modeling, and mapping

Authors: Lucas K. Johnson (1), Michael J. Mahoney (1), Grant Domke (2), Colin M. Beier (1) ((1) State University of New York College of Environmental Science and Forestry, (2) USDA Forest Service)

Comments: Manuscript: 16 pages, 7 figures; Supplements: 3 pages, 2 figures; Submitted to: Remote Sensing of Environment

Subjects: Applications (stat.AP); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Wed, 8 May 2024 (showing first 54 of 73 entries)

[163] arXiv:2405.04534 [pdf, other]: Title: Tactile-Augmented Radiance Fields

Authors: Yiming Dou, Fengyu Yang, Yi Liu, Antonio Loquercio, Andrew Owens

Comments: CVPR 2024, Project page: this https URL, Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164] arXiv:2405.04533 [pdf, other]: Title: ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning

Authors: Jing Lin, Yao Feng, Weiyang Liu, Michael J. Black

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[165] arXiv:2405.04496 [pdf, other]: Title: Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing

Authors: Yi Zuo, Lingling Li, Licheng Jiao, Fang Liu, Xu Liu, Wenping Ma, Shuyuan Yang, Yuwei Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166] arXiv:2405.04489 [pdf, other]: Title: S3Former: Self-supervised High-resolution Transformer for Solar PV Profiling

Authors: Minh Tran, Adrian De Luis, Haitao Liao, Ying Huang, Roy McCann, Alan Mantooth, Jack Cothren, Ngan Le

Comments: Preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167] arXiv:2405.04457 [pdf, other]: Title: Towards Geographic Inclusion in the Evaluation of Text-to-Image Models

Authors: Melissa Hall, Samuel J. Bell, Candace Ross, Adina Williams, Michal Drozdzal, Adriana Romero Soriano

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)
[168] arXiv:2405.04442 [pdf, other]: Title: AugmenTory: A Fast and Flexible Polygon Augmentation Library

Authors: Tanaz Ghahremani, Mohammad Hoseyni, Mohammad Javad Ahmadi, Pouria Mehrabi, Amirhossein Nikoofard

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[169] arXiv:2405.04416 [pdf, other]: Title: DistGrid: Scalable Scene Reconstruction with Distributed Multi-resolution Hash Grid

Authors: Sidun Liu, Peng Qiao, Zongxin Ye, Wenyu Li, Yong Dou

Comments: Originally submitted to Siggraph Asia 2023

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[170] arXiv:2405.04408 [pdf, other]: Title: DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

Authors: Jiaxin Zhang, Dezhi Peng, Chongyu Liu, Peirong Zhang, Lianwen Jin

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171] arXiv:2405.04404 [pdf, other]: Title: Vision Mamba: A Comprehensive Survey and Taxonomy

Authors: Xiao Liu, Chenxu Zhang, Lei Zhang

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[172] arXiv:2405.04403 [pdf, other]: Title: Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks

Authors: Georgios Pantazopoulos, Amit Parekh, Malvina Nikandrou, Alessandro Suglia

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[173] arXiv:2405.04390 [pdf, other]: Title: DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving

Authors: Chen Min, Dawei Zhao, Liang Xiao, Jian Zhao, Xinli Xu, Zheng Zhu, Lei Jin, Jianshu Li, Yulan Guo, Junliang Xing, Liping Jing, Yiming Nie, Bin Dai

Comments: Accepted by CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174] arXiv:2405.04377 [pdf, other]: Title: Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing

Authors: Boqiang Zhang, Hongtao Xie, Zuan Gao, Yuxin Wang

Comments: Accepted to CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175] arXiv:2405.04370 [pdf, other]: Title: Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos

Authors: Junyi Ma, Jingyi Xu, Xieyuanli Chen, Hesheng Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176] arXiv:2405.04356 [pdf, other]: Title: Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation

Authors: Jihyun Kim, Changjae Oh, Hoseok Do, Soohyun Kim, Kwanghoon Sohn

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177] arXiv:2405.04345 [pdf, other]: Title: Novel View Synthesis with Neural Radiance Fields for Industrial Robot Applications

Authors: Markus Hillemann, Robert Langendörfer, Max Heiken, Max Mehltretter, Andreas Schenk, Martin Weinmann, Stefan Hinz, Christian Heipke, Markus Ulrich

Comments: 8 pages, 8 figures, accepted for publication in The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS Archives) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[178] arXiv:2405.04327 [pdf, other]: Title: Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation

Authors: Dogucan Yaman, Fevziye Irem Eyiokur, Leonard Bärmann, Seymanur Aktı, Hazım Kemal Ekenel, Alexander Waibel

Comments: CVPR2024 NTIRE Workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[179] arXiv:2405.04312 [pdf, other]: Title: Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer

Authors: Zhuoyi Yang, Heyang Jiang, Wenyi Hong, Jiayan Teng, Wendi Zheng, Yuxiao Dong, Ming Ding, Jie Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[180] arXiv:2405.04311 [pdf, ps, other]: Title: Cross-IQA: Unsupervised Learning for Image Quality Assessment

Authors: Zhen Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[181] arXiv:2405.04309 [pdf, other]: Title: Non-rigid Structure-from-Motion: Temporally-smooth Procrustean Alignment and Spatially-variant Deformation Modeling

Authors: Jiawei Shi, Hui Deng, Yuchao Dai

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[182] arXiv:2405.04305 [pdf, other]: Title: A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields

Authors: Raiyan Rahman, Christopher Indris, Goetz Bramesfeld, Tianxiao Zhang, Kaidong Li, Xiangyu Chen, Ivan Grijalva, Brian McCornack, Daniel Flippo, Ajay Sharda, Guanghui Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[183] arXiv:2405.04299 [pdf, other]: Title: ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers

Authors: Jinke Li, Xiao He, Chonghua Zhou, Xiaoqiang Cheng, Yang Wen, Dan Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184] arXiv:2405.04251 [pdf, other]: Title: A General Model for Detecting Learner Engagement: Implementation and Evaluation

Authors: Somayeh Malekshahi, Javad M. Kheyridoost, Omid Fatemi

Comments: 13 pages, 2 Postscript figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[185] arXiv:2405.04233 [pdf, other]: Title: Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models

Authors: Fan Bao, Chendong Xiang, Gang Yue, Guande He, Hongzhou Zhu, Kaiwen Zheng, Min Zhao, Shilong Liu, Yaole Wang, Jun Zhu

Comments: Project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[186] arXiv:2405.04211 [pdf, other]: Title: Breast Histopathology Image Retrieval by Attention-based Adversarially Regularized Variational Graph Autoencoder with Contrastive Learning-Based Feature Extraction

Authors: Nematollah Saeidi, Hossein Karshenas, Bijan Shoushtarian, Sepideh Hatamikia, Ramona Woitek, Amirreza Mahbod

Comments: 31 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[187] arXiv:2405.04189 [pdf, ps, other]: Title: Artificial Intelligence-powered fossil shark tooth identification: Unleashing the potential of Convolutional Neural Networks

Authors: Andrea Barucci, Giulia Ciacci, Pietro Liò, Tiago Azevedo, Andrea Di Cencio, Marco Merella, Giovanni Bianucci, Giulia Bosio, Simone Casati, Alberto Collareta

Comments: 40 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[188] arXiv:2405.04175 [pdf, other]: Title: Topicwise Separable Sentence Retrieval for Medical Report Generation

Authors: Junting Zhao, Yang Zhou, Zhihao Chen, Huazhu Fu, Liang Wan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[189] arXiv:2405.04167 [pdf, other]: Title: Bridging the Synthetic-to-Authentic Gap: Distortion-Guided Unsupervised Domain Adaptation for Blind Image Quality Assessment

Authors: Aobo Li, Jinjian Wu, Yongxu Liu, Leida Li

Comments: Accepted by CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[190] arXiv:2405.04164 [pdf, other]: Title: Sign2GPT: Leveraging Large Language Models for Gloss-Free Sign Language Translation

Authors: Ryan Wong, Necati Cihan Camgoz, Richard Bowden

Comments: Accepted at ICLR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[191] arXiv:2405.04133 [pdf, other]: Title: Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method

Authors: Peisong He, Leyao Zhu, Jiaxing Li, Shiqi Wang, Haoliang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[192] arXiv:2405.04121 [pdf, other]: Title: ELiTe: Efficient Image-to-LiDAR Knowledge Transfer for Semantic Segmentation

Authors: Zhibo Zhang, Ximing Yang, Weizhong Zhang, Cheng Jin

Comments: 9 pages, 6 figures, ICME 2024 oral

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[193] arXiv:2405.04103 [pdf, other]: Title: COM3D: Leveraging Cross-View Correspondence and Cross-Modal Mining for 3D Retrieval

Authors: Hao Wu, Ruochong LI, Hao Wang, Hui Xiong

Comments: Accepted by ICME 2024 oral

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[194] arXiv:2405.04100 [pdf, other]: Title: ESP: Extro-Spective Prediction for Long-term Behavior Reasoning in Emergency Scenarios

Authors: Dingrui Wang, Zheyuan Lai, Yuda Li, Yi Wu, Yuexin Ma, Johannes Betz, Ruigang Yang, Wei Li

Comments: Accepted by ICRA 2024 as Oral Presentation

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[195] arXiv:2405.04097 [pdf, other]: Title: Unmasking Illusions: Understanding Human Perception of Audiovisual Deepfakes

Authors: Ammarah Hashmi, Sahibzada Adil Shahzad, Chia-Wen Lin, Yu Tsao, Hsin-Min Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Machine Learning (cs.LG); Multimedia (cs.MM)
[196] arXiv:2405.04093 [pdf, other]: Title: DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects

Authors: Da Fu, Mingfei Rong, Eun-Hu Kim, Hao Huang, Witold Pedrycz

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[197] arXiv:2405.04044 [pdf, other]: Title: DMOFC: Discrimination Metric-Optimized Feature Compression

Authors: Changsheng Gao, Yiheng Jiang, Li Li, Dong Liu, Feng Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[198] arXiv:2405.04042 [pdf, other]: Title: Space-time Reinforcement Network for Video Object Segmentation

Authors: Yadang Chen, Wentao Zhu, Zhi-Xin Yang, Enhua Wu

Comments: Accepted by ICME 2024. 6 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[199] arXiv:2405.04009 [pdf, other]: Title: Structured Click Control in Transformer-based Interactive Segmentation

Authors: Long Xu, Yongquan Chen, Rui Huang, Feng Wu, Shiwu Lai

Comments: 10 pages, 6 figures, submitted to NeurIPS 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[200] arXiv:2405.04007 [pdf, other]: Title: SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing

Authors: Yuying Ge, Sijie Zhao, Chen Li, Yixiao Ge, Ying Shan

Comments: Technical Report; Dataset released in this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[201] arXiv:2405.03995 [pdf, other]: Title: Deep Event-based Object Detection in Autonomous Driving: A Survey

Authors: Bingquan Zhou, Jie Jiang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[202] arXiv:2405.03981 [pdf, other]: Title: Predicting Lung Disease Severity via Image-Based AQI Analysis using Deep Learning Techniques

Authors: Anvita Mahajan, Sayali Mate, Chinmayee Kulkarni, Suraj Sawant

Comments: 11 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[203] arXiv:2405.03978 [pdf, other]: Title: VMambaCC: A Visual State Space Model for Crowd Counting

Authors: Hao-Yuan Ma, Li Zhang, Shuai Shi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[204] arXiv:2405.03971 [pdf, other]: Title: Unified End-to-End V2X Cooperative Autonomous Driving

Authors: Zhiwei Li, Bozhen Zhang, Lei Yang, Tianyu Shen, Nuo Xu, Ruosen Hao, Weiting Li, Tao Yan, Huaping Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[205] arXiv:2405.03959 [pdf, other]: Title: Joint Estimation of Identity Verification and Relative Pose for Partial Fingerprints

Authors: Xiongjun Guan, Zhiyu Pan, Jianjiang Feng, Jie Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[206] arXiv:2405.03958 [pdf, other]: Title: Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model

Authors: Joo Young Choi, Jaesung R. Park, Inkyu Park, Jaewoong Cho, Albert No, Ernest K. Ryu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[207] arXiv:2405.03955 [pdf, ps, other]: Title: IPFed: Identity protected federated learning for user authentication

Authors: Yosuke Kaga, Yusei Suzuki, Kenta Takahashi

Journal-ref: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[208] arXiv:2405.03945 [pdf, other]: Title: Role of Sensing and Computer Vision in 6G Wireless Communications

Authors: Seungnyun Kim, Jihoon Moon, Jinhong Kim, Yongjun Ahn, Donghoon Kim, Sunwoo Kim, Kyuhong Shim, Byonghyo Shim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Networking and Internet Architecture (cs.NI)
[209] arXiv:2405.03894 [pdf, other]: Title: MVDiff: Scalable and Flexible Multi-View Diffusion for 3D Object Reconstruction from Single-View

Authors: Emmanuelle Bourigault, Pauline Bourigault

Comments: CVPRW: Generative Models for Computer Vision

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[210] arXiv:2405.03884 [pdf, other]: Title: BadFusion: 2D-Oriented Backdoor Attacks against 3D Object Detection

Authors: Saket S. Chaturvedi, Lan Zhang, Wenbin Zhang, Pan He, Xiaoyong Yuan

Comments: Accepted at IJCAI 2024 Conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[211] arXiv:2405.03882 [pdf, other]: Title: Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer

Authors: Huihong Shi, Haikuo Shao, Wendong Mao, Zhongfeng Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[212] arXiv:2405.03852 [pdf, other]: Title: VSA4VQA: Scaling a Vector Symbolic Architecture to Visual Question Answering on Natural Images

Authors: Anna Penzkofer, Lei Shi, Andreas Bulling

Comments: To be published in the Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci'24)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[213] arXiv:2405.03846 [pdf, other]: Title: Enhancing Apparent Personality Trait Analysis with Cross-Modal Embeddings

Authors: Ádám Fodor, Rachid R. Saboundji, András Lőrincz

Comments: 14 pages, 4 figures

Journal-ref: Annales Universitatis Scientiarium Budapestinensis de Rolando Eötvös Nominatae. Sectio Computatorica, MaCS Special Issue, 2021

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[214] arXiv:2405.03803 [pdf, other]: Title: MoDiPO: text-to-motion alignment via AI-feedback-driven Direct Preference Optimization

Authors: Massimiliano Pappa, Luca Collorone, Giovanni Ficarra, Indro Spinelli, Fabio Galasso

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[215] arXiv:2405.03770 [pdf, other]: Title: Foundation Models for Video Understanding: A Survey

Authors: Neelu Madan, Andreas Moegelmose, Rajat Modi, Yogesh S. Rawat, Thomas B. Moeslund

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[216] arXiv:2405.03722 [pdf, other]: Title: Class-relevant Patch Embedding Selection for Few-Shot Image Classification

Authors: Weihao Jiang, Haoyang Cui, Kun He

Comments: arXiv admin note: text overlap with arXiv:2405.03109

Subjects: Computer Vision and Pattern Recognition (cs.CV)
