Computer Vision and Pattern Recognition

Authors and titles for recent submissions, skipping first 136

[ total of 456 entries: 1-100 | 37-136 | 137-236 | 237-336 | 337-436 | 437-456 ]

Thu, 9 May 2024 (continued, showing last 26 of 76 entries)

[137] arXiv:2405.04717 [pdf, other]: Title: Remote Diffusion

Authors: Kunal Sunil Kasodekar

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[138] arXiv:2405.04682 [pdf, other]: Title: TALC: Time-Aligned Captions for Multi-Scene Text-to-Video Generation

Authors: Hritik Bansal, Yonatan Bitton, Michal Yarom, Idan Szpektor, Aditya Grover, Kai-Wei Chang

Comments: 23 pages, 12 figures, 8 tables

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[139] arXiv:2405.04675 [pdf, other]: Title: TexControl: Sketch-Based Two-Stage Fashion Image Generation Using Diffusion Model

Authors: Yongming Zhang, Tianyu Zhang, Haoran Xie

Comments: 5 pages, 8 figures, accepted in NICOGRAPH International 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[140] arXiv:2405.04662 [pdf, other]: Title: Radar Fields: Frequency-Space Neural Scene Representations for FMCW Radar

Authors: David Borts, Erich Liang, Tim Brödermann, Andrea Ramazzina, Stefanie Walz, Edoardo Palladin, Jipeng Sun, David Bruggemann, Christos Sakaridis, Luc Van Gool, Mario Bijelic, Felix Heide

Comments: 8 pages, 6 figures, to be published in SIGGRAPH 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[141] arXiv:2405.04650 [pdf, other]: Title: A Self-Supervised Method for Body Part Segmentation and Keypoint Detection of Rat Images

Authors: László Kopácsi, Áron Fóthi, András Lőrincz

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[142] arXiv:2405.04634 [pdf, other]: Title: FRACTAL: An Ultra-Large-Scale Aerial Lidar Dataset for 3D Semantic Segmentation of Diverse Landscapes

Authors: Charles Gaydon, Michel Daab, Floryne Roche

Comments: 15 pages | 9 figures | 8 tables | Dataset is available at this https URL | Trained model is available at this https URL | Deep learning code repository is on GitHub at this https URL | Data engineering code repository is on GitHub at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[143] arXiv:2405.04605 [pdf, ps, other]: Title: AI in Lung Health: Benchmarking Detection and Diagnostic Models Across Multiple CT Scan Datasets

Authors: Fakrul Islam Tushar, Avivah Wang, Lavsen Dahal, Michael R. Harowicz, Kyle J. Lafata, Tina D. Tailor, Joseph Y. Lo

Comments: 16 pages, 2 tables, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[144] arXiv:2405.04589 [pdf, other]: Title: A Novel Wide-Area Multiobject Detection System with High-Probability Region Searching

Authors: Xianlei Long, Hui Zhao, Chao Chen, Fuqiang Gu, Qingyi Gu

Comments: Accepted by ICRA 2024

Journal-ref: 2024 IEEE International Conference on Robotics and Automation (ICRA)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Robotics (cs.RO)
[145] arXiv:2405.04549 [pdf, other]: Title: ClothPPO: A Proximal Policy Optimization Enhancing Framework for Robotic Cloth Manipulation with Observation-Aligned Action Spaces

Authors: Libing Yang, Yang Li, Long Chen

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[146] arXiv:2405.04538 [pdf, other]: Title: DiffFinger: Advancing Synthetic Fingerprint Generation through Denoising Diffusion Probabilistic Models

Authors: Freddie Grabovski, Lior Yasur, Yaniv Hacmon, Lior Nisimov, Stav Nimrod

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[147] arXiv:2405.04537 [pdf, other]: Title: An intuitive multi-frequency feature representation for SO(3)-equivariant networks

Authors: Dongwon Son, Jaehyung Kim, Sanghyeon Son, Beomjoon Kim

Comments: ICLR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR)
[148] arXiv:2405.04536 [pdf, other]: Title: When Training-Free NAS Meets Vision Transformer: A Neural Tangent Kernel Perspective

Authors: Qiqi Zhou, Yichen Zhu

Comments: ICASSP2024 oral

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[149] arXiv:2405.04535 [pdf, other]: Title: Image Classification for CSSVD Detection in Cacao Plants

Authors: Atuhurra Jesse, N'guessan Yves-Roland Douha, Pabitra Lenka

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[150] arXiv:2405.05170 (cross-list from cs.MM) [pdf, other]: Title: Picking watermarks from noise (PWFN): an improved robust watermarking model against intensive distortions

Authors: Sijing Xie, Chengxin Zhao, Nan Sun, Wei Li, Hefei Ling

Comments: Accepted by ICME2024

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[151] arXiv:2405.05160 (cross-list from cs.LG) [pdf, other]: Title: Selective Classification Under Distribution Shifts

Authors: Hengyue Liang, Le Peng, Ju Sun

Comments: Total 25 pages (14 pages for main body); preprint for journal submission

Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[152] arXiv:2405.05095 (cross-list from math.NA) [pdf, other]: Title: Approximation properties relative to continuous scale space for hybrid discretizations of Gaussian derivative operators

Authors: Tony Lindeberg

Comments: 13 pages, 11 figures. arXiv admin note: text overlap with arXiv:2311.11317

Subjects: Numerical Analysis (math.NA); Computer Vision and Pattern Recognition (cs.CV)
[153] arXiv:2405.05007 (cross-list from eess.IV) [pdf, other]: Title: HC-Mamba: Vision MAMBA with Hybrid Convolutional Techniques for Medical Image Segmentation

Authors: Jiashu Xu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[154] arXiv:2405.04966 (cross-list from cs.IT) [pdf, other]: Title: Communication-Efficient Collaborative Perception via Information Filling with Codebook

Authors: Yue Hu, Juntong Peng, Sifei Liu, Junhao Ge, Si Liu, Siheng Chen

Comments: 10 pages, Accepted by CVPR 2024

Subjects: Information Theory (cs.IT); Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[155] arXiv:2405.04902 (cross-list from eess.IV) [pdf, other]: Title: HAGAN: Hybrid Augmented Generative Adversarial Network for Medical Image Synthesis

Authors: Zhihan Ju, Wanting Zhou, Longteng Kong, Yu Chen, Yi Li, Zhenan Sun, Caifeng Shan

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[156] arXiv:2405.04890 (cross-list from cs.RO) [pdf, other]: Title: GISR: Geometric Initialization and Silhouette-based Refinement for Single-View Robot Pose and Configuration Estimation

Authors: Ivan Bilić, Filip Marić, Fabio Bonsignorio, Ivan Petrović

Comments: Submitted to IEEE Robotics and Automation Letters (RA-L)

Subjects: Robotics (cs.RO); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[157] arXiv:2405.04867 (cross-list from eess.IV) [pdf, other]: Title: MIPI 2024 Challenge on Demosaic for HybridEVS Camera: Methods and Results

Authors: Yaqi Wu, Zhihao Fan, Xiaofeng Chu, Jimmy S. Ren, Xiaoming Li, Zongsheng Yue, Chongyi Li, Shangcheng Zhou, Ruicheng Feng, Yuekun Dai, Peiqing Yang, Chen Change Loy, Senyan Xu, Zhijing Sun, Jiaying Zhu, Yurui Zhu, Xueyang Fu, Zheng-Jun Zha, Jun Cao, Cheng Li, Shu Chen, Liang Ma, Shiyang Zhou, Haijin Zeng, Kai Feng, Yongyong Chen, Jingyong Su, Xianyu Guan, Hongyuan Yu, Cheng Wan, Jiamin Lin, Binnan Han, Yajun Zou, Zhuoyuan Wu, Yuan Huang, Yongsheng Yu, Daoan Zhang, Jizhe Li, Xuanwu Yin, Kunlong Zuo, Yunfan Lu, Yijie Xu, Wenzong Ma, Weiyu Guo, Hui Xiong, Wei Yu, Bingchun Luo, Sabari Nathan, Priya Kansal

Comments: MIPI@CVPR2024. Website: this https URL

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[158] arXiv:2405.04812 (cross-list from cs.RO) [pdf, other]: Title: General Place Recognition Survey: Towards Real-World Autonomy

Authors: Peng Yin, Jianhao Jiao, Shiqi Zhao, Lingyun Xu, Guoquan Huang, Howie Choset, Sebastian Scherer, Jianda Han

Comments: 20 pages, 12 figures, under review

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[159] arXiv:2405.04778 (cross-list from eess.IV) [pdf, other]: Title: Teacher-Student Network for Real-World Face Super-Resolution with Progressive Embedding of Edge Information

Authors: Zhilei Liu, Chenggong Zhang

Comments: Accepted by ICIP 2023

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[160] arXiv:2405.04610 (cross-list from eess.IV) [pdf, other]: Title: Exploring Explainable AI Techniques for Improved Interpretability in Lung and Colon Cancer Classification

Authors: Mukaffi Bin Moin, Fatema Tuj Johora Faria, Swarnajit Saha, Bushra Kamal Rafa, Mohammad Shafiul Alam

Comments: Accepted in 4th International Conference on Computing and Communication Networks (ICCCNet-2024)

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[161] arXiv:2405.04595 (cross-list from eess.IV) [pdf, ps, other]: Title: An Advanced Features Extraction Module for Remote Sensing Image Super-Resolution

Authors: Naveed Sultan, Amir Hajian, Supavadee Aramvith

Comments: Preprint of paper from The 21st International Conference on Electrical Engineering/Electronics, Computer, Telecommunications and Information Technology or ECTI-CON 2024, Khon Kaen, Thailand

Journal-ref: ECTI-CON 2024, Khon Kaen Thailand

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[162] arXiv:2405.04507 (cross-list from stat.AP) [pdf, other]: Title: New allometric models for the USA create a step-change in forest carbon estimation, modeling, and mapping

Authors: Lucas K. Johnson (1), Michael J. Mahoney (1), Grant Domke (2), Colin M. Beier (1) ((1) State University of New York College of Environmental Science and Forestry, (2) USDA Forest Service)

Comments: Manuscript: 16 pages, 7 figures; Supplements: 3 pages, 2 figures; Submitted to: Remote Sensing of Environment

Subjects: Applications (stat.AP); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Wed, 8 May 2024

[163] arXiv:2405.04534 [pdf, other]: Title: Tactile-Augmented Radiance Fields

Authors: Yiming Dou, Fengyu Yang, Yi Liu, Antonio Loquercio, Andrew Owens

Comments: CVPR 2024, Project page: this https URL, Code: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[164] arXiv:2405.04533 [pdf, other]: Title: ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning

Authors: Jing Lin, Yao Feng, Weiyang Liu, Michael J. Black

Comments: Project page: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[165] arXiv:2405.04496 [pdf, other]: Title: Edit-Your-Motion: Space-Time Diffusion Decoupling Learning for Video Motion Editing

Authors: Yi Zuo, Lingling Li, Licheng Jiao, Fang Liu, Xu Liu, Wenping Ma, Shuyuan Yang, Yuwei Guo

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[166] arXiv:2405.04489 [pdf, other]: Title: S3Former: Self-supervised High-resolution Transformer for Solar PV Profiling

Authors: Minh Tran, Adrian De Luis, Haitao Liao, Ying Huang, Roy McCann, Alan Mantooth, Jack Cothren, Ngan Le

Comments: Preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[167] arXiv:2405.04457 [pdf, other]: Title: Towards Geographic Inclusion in the Evaluation of Text-to-Image Models

Authors: Melissa Hall, Samuel J. Bell, Candace Ross, Adina Williams, Michal Drozdzal, Adriana Romero Soriano

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Human-Computer Interaction (cs.HC)
[168] arXiv:2405.04442 [pdf, other]: Title: AugmenTory: A Fast and Flexible Polygon Augmentation Library

Authors: Tanaz Ghahremani, Mohammad Hoseyni, Mohammad Javad Ahmadi, Pouria Mehrabi, Amirhossein Nikoofard

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[169] arXiv:2405.04416 [pdf, other]: Title: DistGrid: Scalable Scene Reconstruction with Distributed Multi-resolution Hash Grid

Authors: Sidun Liu, Peng Qiao, Zongxin Ye, Wenyu Li, Yong Dou

Comments: Originally submitted to Siggraph Asia 2023

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[170] arXiv:2405.04408 [pdf, other]: Title: DocRes: A Generalist Model Toward Unifying Document Image Restoration Tasks

Authors: Jiaxin Zhang, Dezhi Peng, Chongyu Liu, Peirong Zhang, Lianwen Jin

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[171] arXiv:2405.04404 [pdf, other]: Title: Vision Mamba: A Comprehensive Survey and Taxonomy

Authors: Xiao Liu, Chenxu Zhang, Lei Zhang

Comments: this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)
[172] arXiv:2405.04403 [pdf, other]: Title: Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks

Authors: Georgios Pantazopoulos, Amit Parekh, Malvina Nikandrou, Alessandro Suglia

Subjects: Computer Vision and Pattern Recognition (cs.CV); Computation and Language (cs.CL)
[173] arXiv:2405.04390 [pdf, other]: Title: DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving

Authors: Chen Min, Dawei Zhao, Liang Xiao, Jian Zhao, Xinli Xu, Zheng Zhu, Lei Jin, Jianshu Li, Yulan Guo, Junliang Xing, Liping Jing, Yiming Nie, Bin Dai

Comments: Accepted by CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[174] arXiv:2405.04377 [pdf, other]: Title: Choose What You Need: Disentangled Representation Learning for Scene Text Recognition, Removal and Editing

Authors: Boqiang Zhang, Hongtao Xie, Zuan Gao, Yuxin Wang

Comments: Accepted to CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[175] arXiv:2405.04370 [pdf, other]: Title: Diff-IP2D: Diffusion-Based Hand-Object Interaction Prediction on Egocentric Videos

Authors: Junyi Ma, Jingyi Xu, Xieyuanli Chen, Hesheng Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[176] arXiv:2405.04356 [pdf, other]: Title: Diffusion-driven GAN Inversion for Multi-Modal Face Image Generation

Authors: Jihyun Kim, Changjae Oh, Hoseok Do, Soohyun Kim, Kwanghoon Sohn

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[177] arXiv:2405.04345 [pdf, other]: Title: Novel View Synthesis with Neural Radiance Fields for Industrial Robot Applications

Authors: Markus Hillemann, Robert Langendörfer, Max Heiken, Max Mehltretter, Andreas Schenk, Martin Weinmann, Stefan Hinz, Christian Heipke, Markus Ulrich

Comments: 8 pages, 8 figures, accepted for publication in The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences (ISPRS Archives) 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Robotics (cs.RO)
[178] arXiv:2405.04327 [pdf, other]: Title: Audio-Visual Speech Representation Expert for Enhanced Talking Face Video Generation and Evaluation

Authors: Dogucan Yaman, Fevziye Irem Eyiokur, Leonard Bärmann, Seymanur Aktı, Hazım Kemal Ekenel, Alexander Waibel

Comments: CVPR2024 NTIRE Workshop

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[179] arXiv:2405.04312 [pdf, other]: Title: Inf-DiT: Upsampling Any-Resolution Image with Memory-Efficient Diffusion Transformer

Authors: Zhuoyi Yang, Heyang Jiang, Wenyi Hong, Jiayan Teng, Wendi Zheng, Yuxiao Dong, Ming Ding, Jie Tang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[180] arXiv:2405.04311 [pdf, ps, other]: Title: Cross-IQA: Unsupervised Learning for Image Quality Assessment

Authors: Zhen Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Image and Video Processing (eess.IV)
[181] arXiv:2405.04309 [pdf, other]: Title: Non-rigid Structure-from-Motion: Temporally-smooth Procrustean Alignment and Spatially-variant Deformation Modeling

Authors: Jiawei Shi, Hui Deng, Yuchao Dai

Comments: Accepted by CVPR 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[182] arXiv:2405.04305 [pdf, other]: Title: A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields

Authors: Raiyan Rahman, Christopher Indris, Goetz Bramesfeld, Tianxiao Zhang, Kaidong Li, Xiangyu Chen, Ivan Grijalva, Brian McCornack, Daniel Flippo, Ajay Sharda, Guanghui Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[183] arXiv:2405.04299 [pdf, other]: Title: ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers

Authors: Jinke Li, Xiao He, Chonghua Zhou, Xiaoqiang Cheng, Yang Wen, Dan Zhang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[184] arXiv:2405.04251 [pdf, other]: Title: A General Model for Detecting Learner Engagement: Implementation and Evaluation

Authors: Somayeh Malekshahi, Javad M. Kheyridoost, Omid Fatemi

Comments: 13 pages, 2 Postscript figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG)
[185] arXiv:2405.04233 [pdf, other]: Title: Vidu: a Highly Consistent, Dynamic and Skilled Text-to-Video Generator with Diffusion Models

Authors: Fan Bao, Chendong Xiang, Gang Yue, Guande He, Hongzhou Zhu, Kaiwen Zheng, Min Zhao, Shilong Liu, Yaole Wang, Jun Zhu

Comments: Project page at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[186] arXiv:2405.04211 [pdf, other]: Title: Breast Histopathology Image Retrieval by Attention-based Adversarially Regularized Variational Graph Autoencoder with Contrastive Learning-Based Feature Extraction

Authors: Nematollah Saeidi, Hossein Karshenas, Bijan Shoushtarian, Sepideh Hatamikia, Ramona Woitek, Amirreza Mahbod

Comments: 31 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[187] arXiv:2405.04189 [pdf, ps, other]: Title: Artificial Intelligence-powered fossil shark tooth identification: Unleashing the potential of Convolutional Neural Networks

Authors: Andrea Barucci, Giulia Ciacci, Pietro Liò, Tiago Azevedo, Andrea Di Cencio, Marco Merella, Giovanni Bianucci, Giulia Bosio, Simone Casati, Alberto Collareta

Comments: 40 pages, 8 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[188] arXiv:2405.04175 [pdf, other]: Title: Topicwise Separable Sentence Retrieval for Medical Report Generation

Authors: Junting Zhao, Yang Zhou, Zhihao Chen, Huazhu Fu, Liang Wan

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[189] arXiv:2405.04167 [pdf, other]: Title: Bridging the Synthetic-to-Authentic Gap: Distortion-Guided Unsupervised Domain Adaptation for Blind Image Quality Assessment

Authors: Aobo Li, Jinjian Wu, Yongxu Liu, Leida Li

Comments: Accepted by CVPR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[190] arXiv:2405.04164 [pdf, other]: Title: Sign2GPT: Leveraging Large Language Models for Gloss-Free Sign Language Translation

Authors: Ryan Wong, Necati Cihan Camgoz, Richard Bowden

Comments: Accepted at ICLR2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[191] arXiv:2405.04133 [pdf, other]: Title: Exposing AI-generated Videos: A Benchmark Dataset and a Local-and-Global Temporal Defect Based Detection Method

Authors: Peisong He, Leyao Zhu, Jiaxing Li, Shiqi Wang, Haoliang Li

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[192] arXiv:2405.04121 [pdf, other]: Title: ELiTe: Efficient Image-to-LiDAR Knowledge Transfer for Semantic Segmentation

Authors: Zhibo Zhang, Ximing Yang, Weizhong Zhang, Cheng Jin

Comments: 9 pages, 6 figures, ICME 2024 oral

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[193] arXiv:2405.04103 [pdf, other]: Title: COM3D: Leveraging Cross-View Correspondence and Cross-Modal Mining for 3D Retrieval

Authors: Hao Wu, Ruochong LI, Hao Wang, Hui Xiong

Comments: Accepted by ICME 2024 oral

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[194] arXiv:2405.04100 [pdf, other]: Title: ESP: Extro-Spective Prediction for Long-term Behavior Reasoning in Emergency Scenarios

Authors: Dingrui Wang, Zheyuan Lai, Yuda Li, Yi Wu, Yuexin Ma, Johannes Betz, Ruigang Yang, Wei Li

Comments: Accepted by ICRA 2024 as Oral Presentation

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[195] arXiv:2405.04097 [pdf, other]: Title: Unmasking Illusions: Understanding Human Perception of Audiovisual Deepfakes

Authors: Ammarah Hashmi, Sahibzada Adil Shahzad, Chia-Wen Lin, Yu Tsao, Hsin-Min Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computers and Society (cs.CY); Machine Learning (cs.LG); Multimedia (cs.MM)
[196] arXiv:2405.04093 [pdf, other]: Title: DCNN: Dual Cross-current Neural Networks Realized Using An Interactive Deep Learning Discriminator for Fine-grained Objects

Authors: Da Fu, Mingfei Rong, Eun-Hu Kim, Hao Huang, Witold Pedrycz

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[197] arXiv:2405.04044 [pdf, other]: Title: DMOFC: Discrimination Metric-Optimized Feature Compression

Authors: Changsheng Gao, Yiheng Jiang, Li Li, Dong Liu, Feng Wu

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[198] arXiv:2405.04042 [pdf, other]: Title: Space-time Reinforcement Network for Video Object Segmentation

Authors: Yadang Chen, Wentao Zhu, Zhi-Xin Yang, Enhua Wu

Comments: Accepted by ICME 2024. 6 pages, 10 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[199] arXiv:2405.04009 [pdf, other]: Title: Structured Click Control in Transformer-based Interactive Segmentation

Authors: Long Xu, Yongquan Chen, Rui Huang, Feng Wu, Shiwu Lai

Comments: 10 pages, 6 figures, submitted to NeurIPS 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[200] arXiv:2405.04007 [pdf, other]: Title: SEED-Data-Edit Technical Report: A Hybrid Dataset for Instructional Image Editing

Authors: Yuying Ge, Sijie Zhao, Chen Li, Yixiao Ge, Ying Shan

Comments: Technical Report; Dataset released in this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[201] arXiv:2405.03995 [pdf, other]: Title: Deep Event-based Object Detection in Autonomous Driving: A Survey

Authors: Bingquan Zhou, Jie Jiang

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[202] arXiv:2405.03981 [pdf, other]: Title: Predicting Lung Disease Severity via Image-Based AQI Analysis using Deep Learning Techniques

Authors: Anvita Mahajan, Sayali Mate, Chinmayee Kulkarni, Suraj Sawant

Comments: 11 pages

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[203] arXiv:2405.03978 [pdf, other]: Title: VMambaCC: A Visual State Space Model for Crowd Counting

Authors: Hao-Yuan Ma, Li Zhang, Shuai Shi

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[204] arXiv:2405.03971 [pdf, other]: Title: Unified End-to-End V2X Cooperative Autonomous Driving

Authors: Zhiwei Li, Bozhen Zhang, Lei Yang, Tianyu Shen, Nuo Xu, Ruosen Hao, Weiting Li, Tao Yan, Huaping Liu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multiagent Systems (cs.MA)
[205] arXiv:2405.03959 [pdf, other]: Title: Joint Estimation of Identity Verification and Relative Pose for Partial Fingerprints

Authors: Xiongjun Guan, Zhiyu Pan, Jianjiang Feng, Jie Zhou

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[206] arXiv:2405.03958 [pdf, other]: Title: Simple Drop-in LoRA Conditioning on Attention Layers Will Improve Your Diffusion Model

Authors: Joo Young Choi, Jaesung R. Park, Inkyu Park, Jaewoong Cho, Albert No, Ernest K. Ryu

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
[207] arXiv:2405.03955 [pdf, ps, other]: Title: IPFed: Identity protected federated learning for user authentication

Authors: Yosuke Kaga, Yusei Suzuki, Kenta Takahashi

Journal-ref: 2023 Asia Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Cryptography and Security (cs.CR); Machine Learning (cs.LG)
[208] arXiv:2405.03945 [pdf, other]: Title: Role of Sensing and Computer Vision in 6G Wireless Communications

Authors: Seungnyun Kim, Jihoon Moon, Jinhong Kim, Yongjun Ahn, Donghoon Kim, Sunwoo Kim, Kyuhong Shim, Byonghyo Shim

Subjects: Computer Vision and Pattern Recognition (cs.CV); Networking and Internet Architecture (cs.NI)
[209] arXiv:2405.03894 [pdf, other]: Title: MVDiff: Scalable and Flexible Multi-View Diffusion for 3D Object Reconstruction from Single-View

Authors: Emmanuelle Bourigault, Pauline Bourigault

Comments: CVPRW: Generative Models for Computer Vision

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[210] arXiv:2405.03884 [pdf, other]: Title: BadFusion: 2D-Oriented Backdoor Attacks against 3D Object Detection

Authors: Saket S. Chaturvedi, Lan Zhang, Wenbin Zhang, Pan He, Xiaoyong Yuan

Comments: Accepted at IJCAI 2024 Conference

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[211] arXiv:2405.03882 [pdf, other]: Title: Trio-ViT: Post-Training Quantization and Acceleration for Softmax-Free Efficient Vision Transformer

Authors: Huihong Shi, Haikuo Shao, Wendong Mao, Zhongfeng Wang

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[212] arXiv:2405.03852 [pdf, other]: Title: VSA4VQA: Scaling a Vector Symbolic Architecture to Visual Question Answering on Natural Images

Authors: Anna Penzkofer, Lei Shi, Andreas Bulling

Comments: To be published in the Proceedings of the Annual Meeting of the Cognitive Science Society (CogSci'24)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
[213] arXiv:2405.03846 [pdf, other]: Title: Enhancing Apparent Personality Trait Analysis with Cross-Modal Embeddings

Authors: Ádám Fodor, Rachid R. Saboundji, András Lőrincz

Comments: 14 pages, 4 figures

Journal-ref: Annales Universitatis Scientiarium Budapestinensis de Rolando Eötvös Nominatae. Sectio Computatorica, MaCS Special Issue, 2021

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC)
[214] arXiv:2405.03803 [pdf, other]: Title: MoDiPO: text-to-motion alignment via AI-feedback-driven Direct Preference Optimization

Authors: Massimiliano Pappa, Luca Collorone, Giovanni Ficarra, Indro Spinelli, Fabio Galasso

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[215] arXiv:2405.03770 [pdf, other]: Title: Foundation Models for Video Understanding: A Survey

Authors: Neelu Madan, Andreas Moegelmose, Rajat Modi, Yogesh S. Rawat, Thomas B. Moeslund

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[216] arXiv:2405.03722 [pdf, other]: Title: Class-relevant Patch Embedding Selection for Few-Shot Image Classification

Authors: Weihao Jiang, Haoyang Cui, Kun He

Comments: arXiv admin note: text overlap with arXiv:2405.03109

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[217] arXiv:2405.03715 [pdf, other]: Title: Iterative Filter Pruning for Concatenation-based CNN Architectures

Authors: Svetlana Pavlitska, Oliver Bagge, Federico Peccia, Toghrul Mammadov, J. Marius Zöllner

Comments: Accepted for publication at IJCNN 2024

Subjects: Computer Vision and Pattern Recognition (cs.CV)
[218] arXiv:2405.03702 [pdf, other]: Title: Leafy Spurge Dataset: Real-world Weed Classification Within Aerial Drone Imagery

Authors: Kyle Doherty, Max Gurinas, Erik Samsoe, Charles Casper, Beau Larkin, Philip Ramsey, Brandon Trabucco, Ruslan Salakhutdinov

Comments: Official Dataset Technical Report. Used in DA-Fusion (arXiv:2302.07944)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[219] arXiv:2405.04459 (cross-list from cs.AI) [pdf, other]: Title: A Significantly Better Class of Activation Functions Than ReLU Like Activation Functions

Authors: Mathew Mithra Noel, Yug Oswal

Comments: 14 pages

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Neural and Evolutionary Computing (cs.NE)
[220] arXiv:2405.04392 (cross-list from cs.RO) [pdf, other]: Title: BILTS: A novel bi-invariant local trajectory-shape descriptor for rigid-body motion

Authors: Arno Verduyn, Erwin Aertbeliën, Glenn Maes, Joris De Schutter, Maxim Vochten

Comments: This work has been submitted as a regular research paper for consideration in the IEEE Transactions on Robotics. Copyright may be transferred without notice, after which this version may no longer be accessible

Subjects: Robotics (cs.RO); Computational Geometry (cs.CG); Computer Vision and Pattern Recognition (cs.CV)
[221] arXiv:2405.04378 (cross-list from cs.RO) [pdf, other]: Title: Splat-MOVER: Multi-Stage, Open-Vocabulary Robotic Manipulation via Editable Gaussian Splatting

Authors: Ola Shorinwa, Johnathan Tucker, Aliyah Smith, Aiden Swann, Timothy Chen, Roya Firoozi, Monroe Kennedy III, Mac Schwager

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[222] arXiv:2405.04295 (cross-list from eess.IV) [pdf, other]: Title: Semi-Supervised Disease Classification based on Limited Medical Image Data

Authors: Yan Zhang, Chun Li, Zhaoxia Liu, Ming Li

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[223] arXiv:2405.04288 (cross-list from eess.IV) [pdf, other]: Title: BetterNet: An Efficient CNN Architecture with Residual Learning and Attention for Precision Polyp Segmentation

Authors: Owen Singh, Sandeep Singh Sengar

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[224] arXiv:2405.04274 (cross-list from eess.IV) [pdf, other]: Title: Group-aware Parameter-efficient Updating for Content-Adaptive Neural Video Compression

Authors: Zhenghao Chen, Luping Zhou, Zhihao Hu, Dong Xu

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[225] arXiv:2405.04191 (cross-list from cs.LG) [pdf, other]: Title: Effective and Robust Adversarial Training against Data and Label Corruptions

Authors: Peng-Fei Zhang, Zi Huang, Xin-Shun Xu, Guangdong Bai

Comments: 12 pages, 8 figures

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[226] arXiv:2405.04169 (cross-list from eess.IV) [pdf, other]: Title: D-TrAttUnet: Toward Hybrid CNN-Transformer Architecture for Generic and Subtle Segmentation in Medical Images

Authors: Fares Bougourzi, Fadi Dornaika, Cosimo Distante, Abdelmalik Taleb-Ahmed

Comments: arXiv admin note: text overlap with arXiv:2303.15576

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[227] arXiv:2405.04071 (cross-list from cs.RO) [pdf, other]: Title: IMU-Aided Event-based Stereo Visual Odometry

Authors: Junkai Niu, Sheng Zhong, Yi Zhou

Comments: 10 pages, 7 figures, ICRA

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[228] arXiv:2405.04041 (cross-list from cs.AI) [pdf, other]: Title: Feature Map Convergence Evaluation for Functional Module

Authors: Ludan Zhang, Chaoyi Chen, Lei He, Keqiang Li

Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
[229] arXiv:2405.04023 (cross-list from eess.IV) [pdf, other]: Title: Lumbar Spine Tumor Segmentation and Localization in T2 MRI Images Using AI

Authors: Rikathi Pal, Sudeshna Mondal, Aditi Gupta, Priya Saha, Somoballi Ghoshal, Amlan Chakrabarti, Susmita Sur-Kolay

Comments: 9 pages, 12 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[230] arXiv:2405.03905 (cross-list from cs.AR) [pdf, other]: Title: A 65nm 36nJ/Decision Bio-inspired Temporal-Sparsity-Aware Digital Keyword Spotting IC with 0.6V Near-Threshold SRAM

Authors: Qinyu Chen, Kwantae Kim, Chang Gao, Sheng Zhou, Taekwang Jang, Tobi Delbruck, Shih-Chii Liu

Subjects: Hardware Architecture (cs.AR); Computer Vision and Pattern Recognition (cs.CV); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[231] arXiv:2405.03827 (cross-list from cs.RO) [pdf, other]: Title: Direct learning of home vector direction for insect-inspired robot navigation

Authors: Michiel Firlefyn, Jesse Hagenaars, Guido de Croon

Comments: Published at ICRA 2024, project webpage at this https URL

Subjects: Robotics (cs.RO); Computer Vision and Pattern Recognition (cs.CV)
[232] arXiv:2405.03762 (cross-list from eess.IV) [pdf, other]: Title: Deep learning classifier of locally advanced rectal cancer treatment response from endoscopy images

Authors: Jorge Tapias Gomez, Aneesh Rangnekar, Hannah Williams, Hannah Thompson, Julio Garcia-Aguilar, Joshua Jesse Smith, Harini Veeraraghavan

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV)
[233] arXiv:2405.03732 (cross-list from eess.IV) [pdf, ps, other]: Title: Accelerated MR Cholangiopancreatography with Deep Learning-based Reconstruction

Authors: Jinho Kim, Marcel Dominik Nickel, Florian Knoll

Comments: 20 pages, 6 figures, 2 tables

Subjects: Image and Video Processing (eess.IV); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[234] arXiv:2405.03730 (cross-list from cs.LG) [pdf, other]: Title: Tilt your Head: Activating the Hidden Spatial-Invariance of Classifiers

Authors: Johann Schmidt, Sebastian Stober

Subjects: Machine Learning (cs.LG); Computer Vision and Pattern Recognition (cs.CV)
[235] arXiv:2405.03713 (cross-list from eess.IV) [pdf, other]: Title: Improve Cross-Modality Segmentation by Treating MRI Images as Inverted CT Scans

Authors: Hartmut Häntze, Lina Xu, Leonhard Donle, Felix J. Dorfner, Alessa Hering, Lisa C. Adams, Keno K. Bressem

Comments: 3 pages, 2 figures

Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)

Tue, 7 May 2024 (showing first 1 of 159 entries)

[236] arXiv:2405.03690 [pdf, other]: Title: How Good is my Video LMM? Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs

Authors: Muhammad Uzair Khattak, Muhammad Ferjad Naeem, Jameel Hassan, Muzammal Naseer, Federico Tombari, Fahad Shahbaz Khan, Salman Khan

Comments: Technical report

Subjects: Computer Vision and Pattern Recognition (cs.CV)
