Multimedia

Authors and titles for cs.MM in Dec 2021

[ total of 67 entries: 1-25 | 26-50 | 51-67 ]
[ showing 25 entries per page: fewer | more | all ]
[1]  arXiv:2112.00331 [pdf, other]
Title: Multimodal AI Companion for Interactive Fairytale Co-creation
Subjects: Multimedia (cs.MM)
[2]  arXiv:2112.01131 [pdf, other]
Title: FNR: A Similarity and Transformer-Based Approach to Detect Multi-Modal Fake News in Social Media
Comments: 10 pages, 11 figures, 4 tables and 20 references
Subjects: Multimedia (cs.MM); Computers and Society (cs.CY); Machine Learning (cs.LG); Social and Information Networks (cs.SI)
[3]  arXiv:2112.01832 [pdf, other]
Title: Lightweight Attentional Feature Fusion for Video Retrieval by Text
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[4]  arXiv:2112.01849 [pdf, ps, other]
Title: Cross-modal Knowledge Distillation for Vision-to-Sensor Action Recognition
Comments: 5 pages, 2 figures, submitted to ICASSP2022
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG)
[5]  arXiv:2112.02070 [pdf, other]
Title: Malakai: Music That Adapts to the Shape of Emotions
Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI)
[6]  arXiv:2112.02839 [pdf, other]
Title: MoCA: Incorporating Multi-stage Domain Pretraining and Cross-guided Multimodal Attention for Textbook Question Answering
Comments: 9 pages, 6 figures
Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI)
[7]  arXiv:2112.03727 [pdf, other]
Title: RFGAN: RF-Based Human Synthesis
Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI)
[8]  arXiv:2112.05428 [pdf, other]
Title: Protecting Your NLG Models with Semantic and Robust Watermarks
Subjects: Multimedia (cs.MM)
[9]  arXiv:2112.08432 [pdf, other]
Title: Expert and Crowd-Guided Affect Annotation and Prediction
Comments: Manuscript submitted for review to IEEE Transactions on Affective Computing
Subjects: Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[10]  arXiv:2112.09401 [pdf, other]
Title: AI-Empowered Persuasive Video Generation: A Survey
Authors: Chang Liu, Han Yu
Comments: under review
Subjects: Multimedia (cs.MM)
[11]  arXiv:2112.10381 [pdf, other]
Title: Automated Vision-Based Wellness Analysis for Elderly Care Centers
Comments: To appear at the AAAI-22 Health Intelligence Workshop
Subjects: Multimedia (cs.MM)
[12]  arXiv:2112.10603 [pdf, other]
Title: A Multi-user Oriented Live Free-viewpoint Video Streaming System Based On View Interpolation
Comments: 10 pages, 7 figures
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[13]  arXiv:2112.12273 [pdf, other]
Title: Perceptual Evaluation of 360 Audiovisual Quality and Machine Learning Predictions
Subjects: Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[14]  arXiv:2112.12284 [pdf, other]
Title: A Survey on Perceptually Optimized Video Coding
Comments: 34 pages, 11 figures, 5 tables, submitted to ACM Computing Surveys
Subjects: Multimedia (cs.MM); Image and Video Processing (eess.IV)
[15]  arXiv:2112.12907 [pdf, other]
Title: 3D Point Cloud Reconstruction and SLAM as an Input
Comments: 7 pages
Subjects: Multimedia (cs.MM)
[16]  arXiv:2112.00317 (cross-list from cs.CV) [pdf, other]
Title: Unleashing the Potential of Unsupervised Pre-Training with Intra-Identity Regularization for Person Re-Identification
Comments: Technical report, code: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[17]  arXiv:2112.00355 (cross-list from cs.SD) [pdf, other]
Title: Score Transformer: Generating Musical Score from Note-level Representation
Authors: Masahiro Suzuki
Comments: Accepted at ACM Multimedia Asia 2021 (MMAsia '21); Project page: this https URL
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[18]  arXiv:2112.01085 (cross-list from cs.CV) [pdf, other]
Title: TCTN: A 3D-Temporal Convolutional Transformer Network for Spatiotemporal Predictive Learning
Comments: 8 pages, 6 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[19]  arXiv:2112.01194 (cross-list from cs.CV) [pdf, other]
Title: Video-Text Pre-training with Learned Regions
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[20]  arXiv:2112.01839 (cross-list from cs.CV) [src]
Title: Mind Your Clever Neighbours: Unsupervised Person Re-identification via Adaptive Clustering Relationship Modeling
Comments: The experimental results are not sufficient
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Machine Learning (cs.LG); Multimedia (cs.MM)
[21]  arXiv:2112.02644 (cross-list from cs.CV) [pdf, other]
Title: Boosting Mobile CNN Inference through Semantic Memory
Comments: 13 pages, 13 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[22]  arXiv:2112.03420 (cross-list from cs.HC) [pdf, other]
Title: ORCLSim: A System Architecture for Studying Bicyclist and Pedestrian Physiological Behavior Through Immersive Virtual Environments
Comments: 36 pages, 7 figures
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[23]  arXiv:2112.03857 (cross-list from cs.CV) [pdf, other]
Title: Grounded Language-Image Pre-training
Comments: Code will be released at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Multimedia (cs.MM)
[24]  arXiv:2112.04222 (cross-list from cs.CV) [pdf, other]
Title: Classification-Then-Grounding: Reformulating Video Scene Graphs as Temporal Bipartite Graphs
Comments: Accepted by CVPR 2022. Code is available at this https URL. We also won 1st place in the Video Relation Understanding (VRU) Grand Challenge at ACM Multimedia 2021 with a simplified version of our model. (The code for object tracklet generation is available at this https URL.)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[25]  arXiv:2112.04312 (cross-list from cs.CV) [pdf, other]
Title: Geometry-Guided Progressive NeRF for Generalizable and Efficient Neural Human Rendering
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG); Multimedia (cs.MM)