Multimedia

Authors and titles for recent submissions

[ total of 18 entries: 1-18 ]

Thu, 6 Oct 2022

[1]  arXiv:2210.02206 [pdf, other]
Title: Improving Visual-Semantic Embedding with Adaptive Pooling and Optimization Objective
Subjects: Multimedia (cs.MM)
[2]  arXiv:2210.02437 (cross-list from cs.SD) [pdf, other]
Title: ASVspoof 2021: Towards Spoofed and Deepfake Speech Detection in the Wild
Comments: Submitted to IEEE/ACM Transactions on Audio, Speech and Language Processing
Subjects: Sound (cs.SD); Cryptography and Security (cs.CR); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[3]  arXiv:2210.02391 (cross-list from cs.CV) [pdf, other]
Title: Geometry Driven Progressive Warping for One-Shot Face Animation
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[4]  arXiv:2210.02324 (cross-list from cs.CV) [pdf, other]
Title: Promising or Elusive? Unsupervised Object Segmentation from Real-world Single Images
Authors: Yafei Yang, Bo Yang
Comments: NeurIPS 2022. Code and data are available at project page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Robotics (cs.RO)
[5]  arXiv:2210.02257 (cross-list from cs.CR) [pdf, other]
Title: Hiding Images in Deep Probabilistic Models
Subjects: Cryptography and Security (cs.CR); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[6]  arXiv:2210.02227 (cross-list from cs.CV) [pdf, other]
Title: Comprint: Image Forgery Detection and Localization using Compression Fingerprints
Comments: Presented at the Workshop on MultiMedia FORensics in the WILD 2022, held in conjunction with the International Conference on Pattern Recognition (ICPR) 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)

Wed, 5 Oct 2022

[7]  arXiv:2210.01652 [pdf, other]
Title: A Conditional-Probability-Distribution Model for Bandwidth Estimation with Application in Live Video Streaming
Authors: Weijia Zheng
Comments: 5 pages, 6 figures
Subjects: Multimedia (cs.MM)
[8]  arXiv:2210.01719 (cross-list from cs.SD) [pdf, other]
Title: Learning the Spectrogram Temporal Resolution for Audio Classification
Comments: Under review. Code open-sourced at this https URL
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Audio and Speech Processing (eess.AS); Signal Processing (eess.SP)
[9]  arXiv:2210.01402 (cross-list from cs.CV) [pdf, other]
Title: Streaming Video Analytics On The Edge With Asynchronous Cloud Support
Comments: 12 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Multimedia (cs.MM)

Tue, 4 Oct 2022

[10]  arXiv:2210.00821 [pdf]
Title: A high accuracy and low complexity quality control method for image compression
Subjects: Multimedia (cs.MM)
[11]  arXiv:2210.00330 [pdf]
Title: Social VR and multi-party holographic communications: Opportunities, Challenges and Impact in the Education and Training Sectors
Subjects: Multimedia (cs.MM)
[12]  arXiv:2210.00757 (cross-list from cs.CV) [pdf, other]
Title: Fully Transformer Network for Change Detection of Remote Sensing Images
Comments: 18 pages, 6 figures and 5 tables. This work will appear in ACCV2022 as a poster paper
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[13]  arXiv:2210.00753 (cross-list from cs.SD) [pdf, other]
Title: Push-Pull: Characterizing the Adversarial Robustness for Audio-Visual Active Speaker Detection
Comments: Accepted by SLT 2022
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[14]  arXiv:2210.00434 (cross-list from eess.AS) [pdf, other]
Title: Music-to-Text Synaesthesia: Generating Descriptive Text from Music Recordings
Subjects: Audio and Speech Processing (eess.AS); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM); Sound (cs.SD)
[15]  arXiv:2210.00378 (cross-list from eess.AS) [pdf, other]
Title: Optimized Decoders for Mixed-Order Ambisonics
Authors: Aaron Heller (1), Eric Benjamin (2), Fernando Lopez-Lezcano (3) ((1) Artificial Intelligence Center, SRI International, (2) Surround Research, (3) Center for Computer Research in Music and Acoustics (CCRMA), Stanford University)
Comments: 9 pages, 10 figures
Journal-ref: Paper 10507, 150th Audio Engineering Society Convention, May 2021
Subjects: Audio and Speech Processing (eess.AS); Multimedia (cs.MM); Sound (cs.SD)

Mon, 3 Oct 2022

[16]  arXiv:2209.15557 [pdf, other]
Title: Explaining Hierarchical Features in Dynamic Point Cloud Processing
Subjects: Multimedia (cs.MM)
[17]  arXiv:2209.15198 (cross-list from cs.NI) [pdf]
Title: FoVR: Attention-based VR Streaming through Bandwidth-limited Wireless Networks
Subjects: Networking and Internet Architecture (cs.NI); Multimedia (cs.MM)

Fri, 30 Sep 2022

[18]  arXiv:2209.14667 (cross-list from cs.CL) [pdf, other]
Title: Domain-aware Self-supervised Pre-training for Label-Efficient Meme Analysis
Comments: Accepted at AACL-IJCNLP 2022 main conference. 9 Pages (main content); 6 Figures; 5 Tables and an Appendix
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Multimedia (cs.MM)