Multimedia

Authors and titles for cs.MM in May 2021

[ total of 58 entries: 1-58 ]
[1]  arXiv:2105.00136 [pdf, other]
Title: Cross-Modal Self-Attention with Multi-Task Pre-Training for Medical Visual Question Answering
Comments: ICMR '21: ACM International Conference on Multimedia Retrieval, Taipei, Taiwan, August 21-24, 2021
Subjects: Multimedia (cs.MM)
[2]  arXiv:2105.00567 [pdf, other]
Title: Multi-feature 360 Video Quality Estimation
Subjects: Multimedia (cs.MM)
[3]  arXiv:2105.00641 [pdf, other]
Title: Naturalistic audio-visual volumetric sequences dataset of sounding actions for six degree-of-freedom interaction
Comments: for dataset visit cvssp.org/data/navvs; accepted as poster in IEEE VR 2021
Subjects: Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[4]  arXiv:2105.01415 [pdf, ps, other]
Title: A Power and Area Efficient Lepton Hardware Encoder with Hash-based Memory Optimization
Subjects: Multimedia (cs.MM)
[5]  arXiv:2105.01475 [pdf, other]
Title: Insights on the V3C2 Dataset
Subjects: Multimedia (cs.MM)
[6]  arXiv:2105.01633 [pdf, other]
Title: An Estimation of Online Video User Engagement from Features of Continuous Emotions
Subjects: Multimedia (cs.MM); Computation and Language (cs.CL)
[7]  arXiv:2105.01701 [pdf, other]
Title: Viewport-Aware Dynamic 360° Video Segment Categorization
Subjects: Multimedia (cs.MM)
[8]  arXiv:2105.02409 [pdf, other]
Title: Multimedia Edge Computing
Comments: 20 pages, 9 figures. arXiv admin note: text overlap with arXiv:1702.07627
Subjects: Multimedia (cs.MM)
[9]  arXiv:2105.03611 [pdf, other]
Title: 360NorVic: 360-Degree Video Classification from Mobile Encrypted Video Traffic
Comments: 7 pages, 15 figures, accepted in Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV 21)
Subjects: Multimedia (cs.MM)
[10]  arXiv:2105.06361 [pdf, other]
Title: Forensic Analysis of Video Files Using Metadata
Comments: v2: fixed a typo in Section 3.4; added page number; added IEEE copyright notice
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[11]  arXiv:2105.07135 [pdf, ps, other]
Title: Analyzing Images for Music Recommendation
Comments: IEEE International Conference on Consumer Electronics (IEEE ICCE 2021)
Subjects: Multimedia (cs.MM); Artificial Intelligence (cs.AI); Sound (cs.SD); Audio and Speech Processing (eess.AS); Image and Video Processing (eess.IV)
[12]  arXiv:2105.08191 [pdf, ps, other]
Title: Adaptive Video Encoding For Different Video Codecs
Comments: Video codecs, Video signal processing, Video coding, Video compression, Video quality, Video streaming, Adaptive video streaming, Versatile Video Coding, AV1, HEVC
Journal-ref: IEEE Access 2021
Subjects: Multimedia (cs.MM); Image and Video Processing (eess.IV)
[13]  arXiv:2105.08350 [pdf, other]
Title: Generic Reversible Visible Watermarking Via Regularized Graph Fourier Transform Coding
Comments: This manuscript was accepted to IEEE Transactions on Image Processing on November 21st, 2021. It has 15 pages, 12 figures and 4 tables
Subjects: Multimedia (cs.MM); Cryptography and Security (cs.CR); Image and Video Processing (eess.IV)
[14]  arXiv:2105.09280 [pdf, other]
Title: A Deep Learning Scheme for Efficient Multimedia IoT Data Compression
Subjects: Multimedia (cs.MM)
[15]  arXiv:2105.09281 [pdf, other]
Title: A Decade of Research for Image Compression In Multimedia Laboratory
Subjects: Multimedia (cs.MM); Computer Vision and Pattern Recognition (cs.CV)
[16]  arXiv:2105.09284 [pdf, other]
Title: SemEval-2021 Task 6: Detection of Persuasion Techniques in Texts and Images
Comments: propaganda, disinformation, misinformation, fake news, memes, multimodality
Journal-ref: SemEval-2021
Subjects: Multimedia (cs.MM); Computation and Language (cs.CL); Machine Learning (cs.LG)
[17]  arXiv:2105.11095 [pdf, ps, other]
Title: Robust Watermarking using Diffusion of Logo into Autoencoder Feature Maps
Comments: 16 pages, 6 figures
Subjects: Multimedia (cs.MM); Machine Learning (cs.LG)
[18]  arXiv:2105.11563 [pdf, other]
Title: VAD360: Viewport Aware Dynamic 360-Degree Video Frame Tiling
Comments: 10, 16 figures
Subjects: Multimedia (cs.MM)
[19]  arXiv:2105.14550 [pdf, other]
Title: Blind Quality Assessment for in-the-Wild Images via Hierarchical Feature Fusion and Iterative Mixed Database Training
Comments: Accepted by IEEE Journal of Selected Topics in Signal Processing
Subjects: Multimedia (cs.MM)
[20]  arXiv:2105.00171 (cross-list from cs.CL) [pdf, other]
Title: AlloST: Low-resource Speech Translation without Source Transcription
Comments: Accepted by Interspeech 2021
Subjects: Computation and Language (cs.CL); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[21]  arXiv:2105.00397 (cross-list from cs.LG) [pdf, other]
Title: OR-Net: Pointwise Relational Inference for Data Completion under Partial Observation
Subjects: Machine Learning (cs.LG); Multimedia (cs.MM)
[22]  arXiv:2105.00708 (cross-list from cs.SD) [pdf, other]
Title: Exploiting Audio-Visual Consistency with Partial Supervision for Spatial Audio Generation
Comments: AAAI'21
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[23]  arXiv:2105.01466 (cross-list from cs.CL) [pdf, other]
Title: GraphTMT: Unsupervised Graph-based Topic Modeling from Video Transcripts
Comments: JT and LS contributed equally to this work
Subjects: Computation and Language (cs.CL); Multimedia (cs.MM)
[24]  arXiv:2105.02636 (cross-list from cs.CV) [pdf, other]
Title: Estimating Presentation Competence using Multimodal Nonverbal Behavioral Cues
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[25]  arXiv:2105.02957 (cross-list from cs.CV) [pdf, ps, other]
Title: VID-WIN: Fast Video Event Matching with Query-Aware Windowing at the Edge for the Internet of Multimedia Things
Comments: 22 pages, 24 figures, 9 tables, accepted in IEEE Internet of Things Journal
Subjects: Computer Vision and Pattern Recognition (cs.CV); Distributed, Parallel, and Cluster Computing (cs.DC); Multimedia (cs.MM)
[26]  arXiv:2105.03299 (cross-list from cs.LG) [pdf, other]
Title: Leveraging Multiple Relations for Fashion Trend Forecasting Based on Social Media
Comments: 12 pages, 8 figures
Journal-ref: IEEE Transactions on Multimedia, 2021
Subjects: Machine Learning (cs.LG); Information Retrieval (cs.IR); Multimedia (cs.MM)
[27]  arXiv:2105.04090 (cross-list from cs.SD) [pdf, other]
Title: MuseMorphose: Full-Song and Fine-Grained Piano Music Style Transfer with One Transformer VAE
Comments: Accepted for Publication at IEEE/ACM Transactions on Audio, Speech, and Language Processing (TASLP). Online supplemental materials are attached to the end of this arXiv version
Subjects: Sound (cs.SD); Machine Learning (cs.LG); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[28]  arXiv:2105.05409 (cross-list from cs.CV) [pdf, other]
Title: A Large-Scale Benchmark for Food Image Segmentation
Comments: 16 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[29]  arXiv:2105.06461 (cross-list from cs.CV) [pdf, other]
Title: 3D Spatial Recognition without Spatially Labeled 3D
Comments: CVPR 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM)
[30]  arXiv:2105.06524 (cross-list from cs.DC) [pdf, ps, other]
Title: CrossRoI: Cross-camera Region of Interest Optimization for Efficient Real Time Video Analytics at Scale
Comments: accepted in 12th ACM Multimedia Systems Conference (MMSys '21)
Subjects: Distributed, Parallel, and Cluster Computing (cs.DC); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Networking and Internet Architecture (cs.NI)
[31]  arXiv:2105.06818 (cross-list from cs.CV) [pdf, other]
Title: Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation
Comments: Accepted by CVPR 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[32]  arXiv:2105.07062 (cross-list from cs.IR) [pdf, other]
Title: Measuring the User Satisfaction in a Recommendation Interface with Multiple Carousels
Journal-ref: ACM International Conference on Interactive Media Experiences (IMX '21), June 21--23, 2021, Virtual Event, NY, USA
Subjects: Information Retrieval (cs.IR); Human-Computer Interaction (cs.HC); Machine Learning (cs.LG); Multimedia (cs.MM)
[33]  arXiv:2105.07175 (cross-list from cs.CV) [pdf, other]
Title: Cross-Modal Progressive Comprehension for Referring Segmentation
Comments: Accepted by TPAMI 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[34]  arXiv:2105.07553 (cross-list from cs.CV) [pdf, other]
Title: Prototype-supervised Adversarial Network for Targeted Attack of Deep Hashing
Comments: This paper has been accepted by CVPR 2021, and the related codes could be available at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[35]  arXiv:2105.07558 (cross-list from cs.NI) [pdf, other]
Title: fybrrStream: A WebRTC based Efficient and Scalable P2P Live Streaming Platform
Subjects: Networking and Internet Architecture (cs.NI); Multimedia (cs.MM); Social and Information Networks (cs.SI)
[36]  arXiv:2105.07585 (cross-list from cs.IR) [pdf, other]
Title: Leveraging Two Types of Global Graph for Sequential Fashion Recommendation
Subjects: Information Retrieval (cs.IR); Multimedia (cs.MM)
[37]  arXiv:2105.07841 (cross-list from cs.CY) [pdf, ps, other]
Title: Post-war Civil War Propaganda Techniques and Media Spins in Nigeria and Journalism Practice
Subjects: Computers and Society (cs.CY); Multimedia (cs.MM); Physics and Society (physics.soc-ph)
[38]  arXiv:2105.08052 (cross-list from cs.CV) [pdf, other]
Title: The Boombox: Visual Reconstruction from Acoustic Vibrations
Comments: CoRL 2021. Website: boombox.cs.columbia.edu
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Robotics (cs.RO); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[39]  arXiv:2105.08643 (cross-list from cs.LG) [pdf, other]
Title: ASM2TV: An Adaptive Semi-Supervised Multi-Task Multi-View Learning Framework for Human Activity Recognition
Comments: 7 pages, 5 figures; accepted by AAAI'22
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[40]  arXiv:2105.08649 (cross-list from cs.LG) [pdf, other]
Title: DCAP: Deep Cross Attentional Product Network for User Response Prediction
Comments: 10 pages, 7 figures, Accepted by CIKM'21
Subjects: Machine Learning (cs.LG); Artificial Intelligence (cs.AI); Information Retrieval (cs.IR); Multimedia (cs.MM)
[41]  arXiv:2105.08809 (cross-list from cs.CV) [pdf, other]
Title: Multimodal Deep Learning Framework for Image Popularity Prediction on Social Media
Comments: 14 pages, 11 figures, 7 tables
Journal-ref: IEEE Transactions on Cognitive and Developmental Systems. 2020 Nov 9
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[42]  arXiv:2105.08899 (cross-list from cs.CR) [pdf, other]
Title: FairCMS: Cloud Media Sharing with Fair Copyright Protection
Comments: Accepted by IEEE Transactions on Computational Social Systems
Subjects: Cryptography and Security (cs.CR); Multimedia (cs.MM)
[43]  arXiv:2105.09153 (cross-list from cs.HC) [pdf, ps, other]
Title: Procedural animations in interactive art experiences -- A state of the art review
Authors: C. Tollola
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[44]  arXiv:2105.10005 (cross-list from cs.CV) [pdf, other]
Title: Robust Unsupervised Multi-Object Tracking in Noisy Environments
Comments: Accepted to IEEE ICIP 2021
Journal-ref: 2021 IEEE International Conference on Image Processing (ICIP)
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG); Multimedia (cs.MM); Neural and Evolutionary Computing (cs.NE)
[45]  arXiv:2105.10754 (cross-list from cs.HC) [pdf, ps, other]
Title: Effects of VR Gaming and Game Genre on Player Experience
Comments: 2019 IEEE Games, Entertainment, Media Conference (GEM)
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[46]  arXiv:2105.11131 (cross-list from cs.CV) [pdf, other]
Title: Unsupervised Video Summarization with a Convolutional Attentive Adversarial Network
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[47]  arXiv:2105.11826 (cross-list from cs.LG) [pdf, other]
Title: Reproducibility Companion Paper: Knowledge Enhanced Neural Fashion Trend Forecasting
Journal-ref: ICMR 2021
Subjects: Machine Learning (cs.LG); Information Retrieval (cs.IR); Multimedia (cs.MM)
[48]  arXiv:2105.11941 (cross-list from cs.CV) [pdf, other]
Title: Understanding Mobile GUI: from Pixel-Words to Screen-Sentences
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[49]  arXiv:2105.12043 (cross-list from cs.CV) [pdf, other]
Title: Temporal Action Proposal Generation with Transformers
Comments: The first three authors contributed equally
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[50]  arXiv:2105.12085 (cross-list from cs.CV) [pdf, other]
Title: DSANet: Dynamic Segment Aggregation Network for Video-Level Representation Learning
Comments: Accepted to ACM MM 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[51]  arXiv:2105.13096 (cross-list from cs.IT) [pdf, ps, other]
Title: Lattice-Based Minimum-Distortion Data Hiding
Comments: 5 pages, to appear in IEEE communications letters
Subjects: Information Theory (cs.IT); Multimedia (cs.MM)
[52]  arXiv:2105.13295 (cross-list from cs.HC) [pdf, ps, other]
Title: Electromagnetic actuation for a vibrotactile display: Assessing stimuli complexity and usability
Subjects: Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[53]  arXiv:2105.14538 (cross-list from cs.CV) [pdf, other]
Title: Longer Version for "Deep Context-Encoding Network for Retinal Image Captioning"
Comments: This paper is a longer version of "Deep Context-Encoding Network for Retinal Image Captioning" which is accepted by IEEE International Conference on Image Processing (ICIP), 2021
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM)
[54]  arXiv:2105.01705 (cross-list from eess.IV) [pdf, other]
Title: Attention-based Stylisation for Exemplar Image Colourisation
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
[55]  arXiv:2105.02824 (cross-list from eess.SP) [pdf, other]
Title: Activity-Aware Deep Cognitive Fatigue Assessment using Wearables
Comments: Submitted to EMBC
Subjects: Signal Processing (eess.SP); Machine Learning (cs.LG); Multimedia (cs.MM)
[56]  arXiv:2105.07139 (cross-list from eess.IV) [pdf, other]
Title: Image Super-Resolution Quality Assessment: Structural Fidelity Versus Statistical Naturalness
Comments: Accepted by QoMEX 2021
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[57]  arXiv:2105.09999 (cross-list from eess.IV) [pdf, other]
Title: Convolutional Block Design for Learned Fractional Downsampling
Comments: 4 pages conference paper
Subjects: Image and Video Processing (eess.IV); Multimedia (cs.MM)
[58]  arXiv:2105.12700 (cross-list from eess.IV) [pdf, ps, other]
Title: Towards Transparent Application of Machine Learning in Video Processing
Comments: International Broadcasting Convention, 11-14 Sep 2020, Amsterdam, Netherlands (Technical Paper section, Virtual)
Subjects: Image and Video Processing (eess.IV); Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM)
