Multimedia

Authors and titles for cs.MM in Jun 2022, skipping first 25

[ total of 69 entries: 1-25 | 26-50 | 51-69 ]
[ showing 25 entries per page: fewer | more | all ]
[26]  arXiv:2206.01160 (cross-list from cs.CV) [pdf, other]
Title: DE-Net: Dynamic Text-guided Image Editing Adversarial Networks
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[27]  arXiv:2206.01966 (cross-list from cs.HC) [pdf, ps, other]
Title: Development and Evaluation of Dental Image Exchange and Management System: A User-Centered Perspective
Comments: 3 figures, 5 tables
Subjects: Human-Computer Interaction (cs.HC); Information Retrieval (cs.IR); Multimedia (cs.MM); Software Engineering (cs.SE)
[28]  arXiv:2206.02070 (cross-list from cs.CV) [pdf, other]
Title: Priors in Deep Image Restoration and Enhancement: A Survey
Comments: Preprint. Under review
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[29]  arXiv:2206.02284 (cross-list from cs.SD) [pdf, other]
Title: Tagged-MRI Sequence to Audio Synthesis via Self Residual Attention Guided Heterogeneous Translator
Comments: MICCAI 2022 (early accept, Oral Presentation ~3%)
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[30]  arXiv:2206.02717 (cross-list from cs.CV) [pdf, other]
Title: Scene Aware Person Image Generation through Global Contextual Conditioning
Comments: Accepted in The International Conference on Pattern Recognition (ICPR) 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[31]  arXiv:2206.03789 (cross-list from cs.CV) [pdf, other]
Title: Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation
Comments: Accepted by CVPR 2022
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[32]  arXiv:2206.04780 (cross-list from cs.SD) [pdf, other]
Title: Speak Like a Dog: Human to Non-human creature Voice Conversion
Comments: 5 pages, 4 figures
Journal-ref: 2022 Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC) (pp. 1388-1393)
Subjects: Sound (cs.SD); Artificial Intelligence (cs.AI); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[33]  arXiv:2206.05475 (cross-list from cs.LG) [pdf, other]
Title: Reducing Capacity Gap in Knowledge Distillation with Review Mechanism for Crowd Counting
Subjects: Machine Learning (cs.LG); Multimedia (cs.MM)
[34]  arXiv:2206.05651 (cross-list from cs.CV) [pdf, other]
Title: STD-NET: Search of Image Steganalytic Deep-learning Architecture via Hierarchical Tensor Decomposition
Comments: Submitted to IEEE T-DSC
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[35]  arXiv:2206.05833 (cross-list from cs.CV) [pdf, other]
Title: COLD Fusion: Calibrated and Ordinal Latent Distribution Fusion for Uncertainty-Aware Multimodal Emotion Recognition
Comments: Accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence
Subjects: Computer Vision and Pattern Recognition (cs.CV); Human-Computer Interaction (cs.HC); Multimedia (cs.MM)
[36]  arXiv:2206.05836 (cross-list from cs.CV) [pdf, other]
Title: GLIPv2: Unifying Localization and Vision-Language Understanding
Comments: NeurIPS 2022; updated with reviewers' comments addressed; Code is released at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG); Multimedia (cs.MM)
[37]  arXiv:2206.06289 (cross-list from cs.CV) [pdf, other]
Title: Silver-Bullet-3D at ManiSkill 2021: Learning-from-Demonstrations and Heuristic Rule-based Methods for Object Manipulation
Comments: Accepted by ICLR 2022 Workshop on Generalizable Policy Learning in Physical World. Top-performing systems for both no interaction and no restriction tracks in SAPIEN ManiSkill Challenge 2021. The source code and model are publicly available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Machine Learning (cs.LG); Multimedia (cs.MM); Robotics (cs.RO)
[38]  arXiv:2206.06291 (cross-list from cs.CV) [pdf, other]
Title: Exploring Structure-aware Transformer over Interaction Proposals for Human-Object Interaction Detection
Comments: CVPR 2022; Code is publicly available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[39]  arXiv:2206.06292 (cross-list from cs.CV) [pdf, other]
Title: MLP-3D: A MLP-like 3D Architecture with Grouped Time Mixing
Comments: CVPR 2022; Code is publicly available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[40]  arXiv:2206.06930 (cross-list from cs.CV) [pdf, other]
Title: Comprehending and Ordering Semantics for Image Captioning
Comments: CVPR 2022; Code is publicly available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Multimedia (cs.MM)
[41]  arXiv:2206.06931 (cross-list from cs.CV) [pdf, other]
Title: Stand-Alone Inter-Frame Attention in Video Models
Comments: CVPR 2022; Code is publicly available at: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Multimedia (cs.MM)
[42]  arXiv:2206.07684 (cross-list from cs.CV) [pdf, other]
Title: AVATAR: Unconstrained Audiovisual Speech Recognition
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[43]  arXiv:2206.07707 (cross-list from cs.CV) [pdf, other]
Title: Variable Bitrate Neural Fields
Comments: SIGGRAPH 2022. Project Page: this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG); Multimedia (cs.MM)
[44]  arXiv:2206.07893 (cross-list from cs.CV) [pdf, other]
Title: PeQuENet: Perceptual Quality Enhancement of Compressed Video with Adaptation- and Attention-based Network
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[45]  arXiv:2206.08312 (cross-list from cs.SD) [pdf, other]
Title: SoundSpaces 2.0: A Simulation Platform for Visual-Acoustic Learning
Comments: Camera-ready version. Website: this https URL Project page: this https URL
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM); Audio and Speech Processing (eess.AS)
[46]  arXiv:2206.08864 (cross-list from cs.LG) [pdf, other]
Title: Avoid Overfitting User Specific Information in Federated Keyword Spotting
Comments: Accepted by Interspeech 2022
Subjects: Machine Learning (cs.LG); Multimedia (cs.MM); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[47]  arXiv:2206.09391 (cross-list from cs.LG) [pdf, other]
Title: Towards Adversarial Attack on Vision-Language Pre-training Models
Comments: Accepted by ACM MM2022. Code is available in GitHub
Subjects: Machine Learning (cs.LG); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[48]  arXiv:2206.09427 (cross-list from cs.NI) [pdf, ps, other]
Title: QuDASH: Quantum-inspired rate adaptation approach for DASH video streaming
Comments: Accepted Version
Journal-ref: IEEE Access, 2023
Subjects: Networking and Internet Architecture (cs.NI); Multimedia (cs.MM); Image and Video Processing (eess.IV)
[49]  arXiv:2206.09597 (cross-list from cs.CV) [pdf, other]
Title: Winning the CVPR'2022 AQTC Challenge: A Two-stage Function-centric Approach
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
[50]  arXiv:2206.09853 (cross-list from cs.CV) [pdf, other]
Title: DisCoVQA: Temporal Distortion-Content Transformers for Video Quality Assessment
Subjects: Computer Vision and Pattern Recognition (cs.CV); Multimedia (cs.MM)
