Multimedia

Authors and titles for cs.MM in May 2022

[ total of 63 entries: 1-50 | 51-63 ]

[1] arXiv:2205.00132 [pdf, other]: Title: Learn to Understand Negation in Video Retrieval

Authors: Ziyue Wang, Aozhu Chen, Fan Hu, Xirong Li

Comments: Accepted by ACMMM2022

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[2] arXiv:2205.01583 [pdf, other]: Title: An Explore of Virtual Reality for Awareness of the Climate Change Crisis: A Simulation of Sea Level Rise

Authors: Zixiang Xu, Abraham G. Campbell, Soumyabrata Dev, Yuan Liang

Comments: Published in 8th International Conference of the Immersive Learning Research Network (iLRN 2022)

Subjects: Multimedia (cs.MM); Human-Computer Interaction (cs.HC)
[3] arXiv:2205.03595 [pdf, ps, other]: Title: $λ$-domain VVC Rate Control Based on Game Theory

Authors: Jielian Lin, Aiping Huang, Keke Zhang, Xu Wang, Tiesong Zhao

Subjects: Multimedia (cs.MM); Multiagent Systems (cs.MA)
[4] arXiv:2205.03684 [pdf, other]: Title: Timestamp-independent Haptic-Visual Synchronization

Authors: Yiwen Xu, Liangtao Huang, Tiesong Zhao, Liqun Lin, Ying Fang

Subjects: Multimedia (cs.MM)
[5] arXiv:2205.03782 [pdf, ps, other]: Title: SSIM-Variation-Based Complexity Optimization for Versatile Video Coding

Authors: Jielian Lin, Hongbin Lin, Zhichen Zhang, Yiwen Xu, Tiesong Zhao

Subjects: Multimedia (cs.MM); Multiagent Systems (cs.MA)
[6] arXiv:2205.04906 [pdf, other]: Title: Evaluating the Impact of Tiled User-Adaptive Real-Time Point Cloud Streaming on VR Remote Communication

Authors: Shishir Subramanyam, Irene Viola, Jack Jansen, Evangelos Alexiou, Alan Hanjalic, Pablo Cesar

Subjects: Multimedia (cs.MM)
[7] arXiv:2205.05177 [pdf, other]: Title: ConfLab: A Data Collection Concept, Dataset, and Benchmark for Machine Analysis of Free-Standing Social Interactions in the Wild

Authors: Chirag Raman, Jose Vargas-Quiros, Stephanie Tan, Ashraful Islam, Ekin Gedik, Hayley Hung

Comments: In Proceedings of the Neural Information Processing Systems Track on Datasets and Benchmarks (NeurIPS D&B)

Subjects: Multimedia (cs.MM); Machine Learning (cs.LG)
[8] arXiv:2205.05880 [pdf, other]: Title: Deep Decomposition and Bilinear Pooling Network for Blind Night-Time Image Quality Evaluation

Authors: Qiuping Jiang, Jiawu Xu, Yudong Mao, Wei Zhou, Xiongkuo Min, Guangtao Zhai

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[9] arXiv:2205.08007 [pdf, other]: Title: Perceptual Evaluation on Audio-visual Dataset of 360 Content

Authors: Randy F Fela, Andréas Pastor, Patrick Le Callet, Nick Zacharov, Toinon Vigier, Søren Forchhammer

Comments: 6 pages, 5 figures, International Conference on Multimedia and Expo 2022

Subjects: Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[10] arXiv:2205.08738 [pdf, other]: Title: 3D-VFD: A Victim-free Detector against 3D Adversarial Point Clouds

Authors: Jiahao Zhu, Huajun Zhou, Zixuan Chen, Yi Zhou, Xiaohua Xie

Comments: 6 pages, 13pages

Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Image and Video Processing (eess.IV)
[11] arXiv:2205.08866 [pdf, other]: Title: Seeing Sounds, Hearing Shapes: a gamified study to evaluate sound-sketches

Authors: Sebastian Löbbers, György Fazekas

Comments: Accepted at International Computer Music Conference (ICMC) 2022

Subjects: Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[12] arXiv:2205.10649 [pdf, other]: Title: Towards the Effects of Alignment Edits on the Quality of Experience of 360 Videos

Authors: Lucas Althoff, Alessandro Rodrigues, Mylène C. Q. Farias

Comments: 14 pages, 13 figures, 4 tables

Subjects: Multimedia (cs.MM)
[13] arXiv:2205.10815 [pdf, other]: Title: Recent Advances in Rate Control: From Optimisation to Implementation and Beyond

Authors: Xuekai Wei, Mingliang Zhou, Heqiang Wang, Haoyan Yang, Lei Chen, Sam Kwong

Comments: Copyright © 20xx IEEE. Personal use of this material is permitted. However, permission to use this material for any other purposes must be obtained from the IEEE by sending an email to pubs-permissions@ieee.org

Subjects: Multimedia (cs.MM)
[14] arXiv:2205.11825 [pdf, ps, other]: Title: A Rate Control Algorithm for Video-based Point Cloud Compression

Authors: Fangyu Shen, Wei Gao

Comments: 5 pages, 3 figures, 4 tables

Journal-ref: 2021 International Conference on Visual Communications and Image Processing (VCIP)

Subjects: Multimedia (cs.MM)
[15] arXiv:2205.00694 (cross-list from cs.CV) [pdf, other]: Title: A Multi-stage deep architecture for summary generation of soccer videos

Authors: Melissa Sanabria, Frédéric Precioso, Pierre-Alexandre Mattei, Thomas Menguy

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[16] arXiv:2205.00941 (cross-list from cs.SD) [pdf, ps, other]: Title: Music Interpretation Analysis. A Multimodal Approach To Score-Informed Resynthesis of Piano Recordings

Authors: Federico Simonetta

Comments: PhD Thesis. Author: F. Simonetta; tutor: S. Ntalampiras; co-tutor: F. Avanzini; Università degli studi di Milano - Dipartimento di Informatica "Giovanni Degli Antoni", 2022 Apr 22

Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[17] arXiv:2205.01155 (cross-list from cs.CV) [pdf, other]: Title: Emotion-Controllable Generalized Talking Face Generation

Authors: Sanjana Sinha, Sandika Biswas, Ravindra Yadav, Brojeshwar Bhowmick

Comments: Accepted at IJCAI 2022

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[18] arXiv:2205.01917 (cross-list from cs.CV) [pdf, other]: Title: CoCa: Contrastive Captioners are Image-Text Foundation Models

Authors: Jiahui Yu, Zirui Wang, Vijay Vasudevan, Legg Yeung, Mojtaba Seyedhosseini, Yonghui Wu

Comments: Preprint

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[19] arXiv:2205.01989 (cross-list from cs.CL) [pdf, other]: Title: MM-Claims: A Dataset for Multimodal Claim Detection in Social Media

Authors: Gullal S. Cheema, Sherzod Hakimov, Abdul Sittar, Eric Müller-Budack, Christian Otto, Ralph Ewerth

Comments: Accepted to Findings of NAACL 2022

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Social and Information Networks (cs.SI)
[20] arXiv:2205.02357 (cross-list from cs.CL) [pdf, other]: Title: Hybrid Transformer with Multi-level Fusion for Multimodal Knowledge Graph Completion

Authors: Xiang Chen, Ningyu Zhang, Lei Li, Shumin Deng, Chuanqi Tan, Changliang Xu, Fei Huang, Luo Si, Huajun Chen

Comments: Accepted by SIGIR 2022. Fix a severe bug

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[21] arXiv:2205.02456 (cross-list from cs.CV) [pdf, other]: Title: Declaration-based Prompt Tuning for Visual Question Answering

Authors: Yuhang Liu, Wei Wei, Daowan Peng, Feida Zhu

Comments: Accepted to IJCAI2022, data and codes are available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[22] arXiv:2205.02538 (cross-list from cs.CV) [pdf, other]: Title: Parametric Reshaping of Portraits in Videos

Authors: Xiangjun Tang, Wenxin Sun, Yong-Liang Yang, Xiaogang Jin

Journal-ref: MM'21: Proceedings of the 29th ACM International Conference on Multimedia, October 2021

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[23] arXiv:2205.03297 (cross-list from cs.IR) [pdf, other]: Title: Implicit semantic-based personalized micro-videos recommendation

Authors: Bo Liu

Subjects: Information Retrieval (cs.IR); Multimedia (cs.MM)
[24] arXiv:2205.03521 (cross-list from cs.CL) [pdf, other]: Title: Good Visual Guidance Makes A Better Extractor: Hierarchical Visual Prefix for Multimodal Entity and Relation Extraction

Authors: Xiang Chen, Ningyu Zhang, Lei Li, Yunzhi Yao, Shumin Deng, Chuanqi Tan, Fei Huang, Luo Si, Huajun Chen

Comments: Accepted by NAACL 2022

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[25] arXiv:2205.03534 (cross-list from cs.CL) [pdf, other]: Title: Attract me to Buy: Advertisement Copywriting Generation with Multimodal Multi-structured Information

Authors: Zhipeng Zhang, Xinglin Hou, Kai Niu, Zhongzhen Huang, Tiezheng Ge, Yuning Jiang, Qi Wu, Peng Wang

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[26] arXiv:2205.03923 (cross-list from cs.CV) [pdf, other]: Title: Unsupervised Discovery and Composition of Object Light Fields

Authors: Cameron Smith, Hong-Xing Yu, Sergey Zakharov, Fredo Durand, Joshua B. Tenenbaum, Jiajun Wu, Vincent Sitzmann

Comments: Project website: this https URL TMLR 2023

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG); Multimedia (cs.MM)
[27] arXiv:2205.04029 (cross-list from cs.SD) [pdf, other]: Title: Muskits: an End-to-End Music Processing Toolkit for Singing Voice Synthesis

Authors: Jiatong Shi, Shuai Guo, Tao Qian, Nan Huo, Tomoki Hayashi, Yuning Wu, Frank Xu, Xuankai Chang, Huazhe Li, Peter Wu, Shinji Watanabe, Qin Jin

Comments: Accepted by Interspeech

Subjects: Sound (cs.SD); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[28] arXiv:2205.04188 (cross-list from cs.CV) [pdf, other]: Title: Joint learning of object graph and relation graph for visual question answering

Authors: Hao Li, Xu Li, Belhal Karimi, Jie Chen, Mingming Sun

Comments: 6 pages, 4 figures, Accepted by ICME 2022

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[29] arXiv:2205.04264 (cross-list from cs.CV) [pdf, other]: Title: SwinIQA: Learned Swin Distance for Compressed Image Quality Assessment

Authors: Jianzhao Liu, Xin Li, Yanding Peng, Tao Yu, Zhibo Chen

Comments: CVPR2022 Workshop (CLIC) accepted

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[30] arXiv:2205.04402 (cross-list from cs.CL) [pdf, other]: Title: Detecting the Role of an Entity in Harmful Memes: Techniques and Their Limitations

Authors: Rabindra Nath Nandi, Firoj Alam, Preslav Nakov

Comments: Accepted at CONSTRAINT 2022 (Colocated with ACL-2022), disinformation, misinformation, factuality, harmfulness, fake news, propaganda, multimodality, text, images, videos, network structure, temporality

Subjects: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Social and Information Networks (cs.SI)
[31] arXiv:2205.04404 (cross-list from cs.CL) [pdf, other]: Title: TeamX@DravidianLangTech-ACL2022: A Comparative Analysis for Troll-Based Meme Classification

Authors: Rabindra Nath Nandi, Firoj Alam, Preslav Nakov

Comments: Accepted at DravidianLangTech-ACL2022 (Colocated with ACL-2022). disinformation, misinformation, factuality, harmfulness, fake news, propaganda, multimodality, text, images, videos, network structure, temporality

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Social and Information Networks (cs.SI)
[32] arXiv:2205.04749 (cross-list from cs.CV) [pdf, other]: Title: Spatio-Temporal Transformer for Dynamic Facial Expression Recognition in the Wild

Authors: Fuyan Ma, Bin Sun, Shutao Li

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[33] arXiv:2205.04908 (cross-list from cs.CV) [pdf, other]: Title: Shadow-Aware Dynamic Convolution for Shadow Removal

Authors: Yimin Xu, Mingbao Lin, Hong Yang, Fei Chao, Rongrong Ji

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[34] arXiv:2205.05069 (cross-list from cs.CV) [pdf, other]: Title: Accelerating the Training of Video Super-Resolution Models

Authors: Lijian Lin, Xintao Wang, Zhongang Qi, Ying Shan

Comments: The code is available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[35] arXiv:2205.05072 (cross-list from cs.CV) [pdf, other]: Title: Learning Visual Styles from Audio-Visual Associations

Authors: Tingle Li, Yichen Liu, Andrew Owens, Hang Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[36] arXiv:2205.05738 (cross-list from cs.CL) [pdf, other]: Title: DISARM: Detecting the Victims Targeted by Harmful Memes

Authors: Shivam Sharma, Md. Shad Akhtar, Preslav Nakov, Tanmoy Chakraborty

Comments: Accepted at NAACL 2022 (Findings)

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV); Computers and Society (cs.CY); Multimedia (cs.MM)
[37] arXiv:2205.05854 (cross-list from cs.CV) [pdf, other]: Title: Entity-aware and Motion-aware Transformers for Language-driven Action Localization in Videos

Authors: Shuo Yang, Xinxiao Wu

Comments: accepted by IJCAI-22, Codes are available at this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[38] arXiv:2205.05953 (cross-list from cs.HC) [pdf, ps, other]: Title: Emerging Immersive Communication Systems: Overview, Taxonomy, and Good Practises for QoE Assessment

Authors: Pablo Pérez, Ester Gonzalez-Sosa, Jesús Gutiérrez, Narciso García

Comments: Frontiers in Signal Processing

Journal-ref: Front. Signal Process. (2022)

Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[39] arXiv:2205.07100 (cross-list from cs.CL) [pdf, other]: Title: Multiformer: A Head-Configurable Transformer-Based Model for Direct Speech Translation

Authors: Gerard Sant, Gerard I. Gállego, Belen Alastruey, Marta R. Costa-Jussà

Comments: NAACL-SRW 2022

Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[40] arXiv:2205.07611 (cross-list from cs.CV) [pdf, other]: Title: Noise-Tolerant Learning for Audio-Visual Action Recognition

Authors: Haochen Han, Qinghua Zheng, Minnan Luo, Kaiyao Miao, Feng Tian, Yan Chen

Comments: This work has been submitted to the IEEE for possible publication. Copyright may be transferred without notice, after which this version may no longer be accessible

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[41] arXiv:2205.07721 (cross-list from cs.CV) [pdf, other]: Title: Towards Space-to-Ground Data Availability for Agriculture Monitoring

Authors: George Choumos, Alkiviadis Koukos, Vasileios Sitokonstantinou, Charalampos Kontoes

Comments: Has been accepted for publication in IEEE IVMSP 2022: this https URL Specifically in the special session "Multimodal Analysis, Fusion and Retrieval of satellite images": this https URL

Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[42] arXiv:2205.07752 (cross-list from cs.CV) [pdf, other]: Title: A Data Cube of Big Satellite Image Time-Series for Agriculture Monitoring

Authors: Thanassis Drivas, Vasileios Sitokonstantinou, Iason Tsardanidis, Alkiviadis Koukos, Charalampos Kontoes, Vassilia Karathanassi

Comments: This work has been accepted for publication in IEEE 14th Image, Video, and Multidimensional Signal Processing Workshop (IVMSP 2022)

Subjects: Computer Vision and Pattern Recognition (cs.CV); Databases (cs.DB); Multimedia (cs.MM)
[43] arXiv:2205.08324 (cross-list from cs.CV) [pdf, other]: Title: Exploring the Interactive Guidance for Unified and Effective Image Matting

Authors: Dinghao Yang, Bin Wang, Weijia Li, Yiqi Lin, Conghui He

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[44] arXiv:2205.09068 (cross-list from cs.CV) [pdf, other]: Title: VRAG: Region Attention Graphs for Content-Based Video Retrieval

Authors: Kennard Ng, Ser-Nam Lim, Gim Hee Lee

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[45] arXiv:2205.09248 (cross-list from cs.SD) [pdf, other]: Title: MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D Scenes

Authors: Anton Ratnarajah, Zhenyu Tang, Rohith Chandrashekar Aralikatti, Dinesh Manocha

Comments: Accepted to ACM Multimedia 2022. More results and source code is available at this https URL

Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[46] arXiv:2205.09256 (cross-list from cs.CV) [pdf, other]: Title: Training Vision-Language Transformers from Captions

Authors: Liangke Gui, Yingshan Chang, Qiuyuan Huang, Subhojit Som, Alex Hauptmann, Jianfeng Gao, Yonatan Bisk

Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[47] arXiv:2205.09744 (cross-list from cs.LG) [pdf, other]: Title: Overcoming Language Disparity in Online Content Classification with Multimodal Learning

Authors: Gaurav Verma, Rohit Mujumdar, Zijie J. Wang, Munmun De Choudhury, Srijan Kumar

Comments: Accepted for publication at ICWSM 2022 as a full paper

Subjects: Machine Learning (cs.LG); Computers and Society (cs.CY); Multimedia (cs.MM)
[48] arXiv:2205.09791 (cross-list from cs.CV) [pdf, other]: Title: A Peek at Peak Emotion Recognition

Authors: Tzvi Michelson, Hillel Aviezer, Shmuel Peleg

Comments: Submitted to HBU Workshop at ICPR, 6 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[49] arXiv:2205.10254 (cross-list from cs.CV) [pdf, ps, other]: Title: A Demographic Attribute Guided Approach to Age Estimation

Authors: Zhicheng Cao, Kaituo Zhang, Liaojun Pang, Heng Zhao

Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[50] arXiv:2205.10611 (cross-list from cs.CV) [pdf, other]: Title: Lightweight Human Pose Estimation Using Heatmap-Weighting Loss

Authors: Shiqi Li, Xiang Xiang

Comments: 7 pages, 5 figures

Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
