

New submissions


New submissions for Fri, 18 Jun 21

[1]  arXiv:2106.09198 [pdf]
Title: Learning Perceptual Manifold of Fonts
Comments: 9 pages, 16 figures
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)

With the rapid development of deep learning techniques in generative models, combining machine intelligence with human intelligence to solve practical applications is becoming an urgent issue. Motivated by this, this work aims to adjust machine-generated character fonts with the help of human workers in a perceptual study. Although numerous fonts are available online for public use, it is difficult for common users to generate and explore a font that meets their preferences. To address this issue, we propose the perceptual manifold of fonts to visualize the perceptual adjustment in the latent space of a generative model of fonts. In our framework, we adopt a variational autoencoder network for font generation. Then, we conduct a perceptual study on the fonts generated from the multi-dimensional latent space of the generative model. After obtaining the distribution data of specific preferences, we use a manifold learning approach to visualize the font distribution. In our user study, the proposed font-exploring user interface was efficient and helpful for the designated user preferences, in contrast to a conventional user interface.

Cross-lists for Fri, 18 Jun 21

[2]  arXiv:2106.09486 (cross-list from cs.CV) [pdf, other]
Title: Deep HDR Hallucination for Inverse Tone Mapping
Journal-ref: Sensors 2021, 21, 4032
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

Inverse Tone Mapping (ITM) methods attempt to reconstruct High Dynamic Range (HDR) information from Low Dynamic Range (LDR) image content. The dynamic range of well-exposed areas must be expanded and any missing information due to over/under-exposure must be recovered (hallucinated). The majority of methods focus on the former and are relatively successful, while most attempts on the latter are not of sufficient quality, even ones based on Convolutional Neural Networks (CNNs). A major factor for the reduced inpainting quality in some works is the choice of loss function. Work based on Generative Adversarial Networks (GANs) shows promising results for image synthesis and LDR inpainting, suggesting that GAN losses can improve inverse tone mapping results. This work presents a GAN-based method that hallucinates missing information from badly exposed areas in LDR images and compares its efficacy with alternative variations. The proposed method is quantitatively competitive with state-of-the-art inverse tone mapping methods, providing good dynamic range expansion for well-exposed areas and plausible hallucinations for saturated and under-exposed areas. A density-based normalisation method, targeted for HDR content, is also proposed, as well as an HDR data augmentation method targeted for HDR hallucination.
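
As a rough illustration of the two sub-problems named in the abstract above (expanding well-exposed pixels, and identifying badly exposed pixels whose content would have to be hallucinated), the following minimal sketch may help. It is not the GAN-based method of the paper: the inverse-gamma expansion, the gamma value of 2.2, the thresholds 0.05 and 0.95, and all function names are assumptions chosen for illustration only.

```python
from typing import List, Tuple


def expand_pixel(ldr: float, gamma: float = 2.2) -> float:
    """Invert an assumed display gamma to map an LDR value in [0, 1] to linear light."""
    if not 0.0 <= ldr <= 1.0:
        raise ValueError("LDR value must lie in [0, 1]")
    return ldr ** gamma


def is_badly_exposed(ldr: float, low: float = 0.05, high: float = 0.95) -> bool:
    """Flag pixels near black or white, where the original signal was likely clipped."""
    return ldr <= low or ldr >= high


def split_image(pixels: List[float]) -> Tuple[List[float], List[int]]:
    """Expand every pixel and list the indices that would need hallucinated content."""
    expanded = [expand_pixel(p) for p in pixels]
    clipped = [i for i, p in enumerate(pixels) if is_badly_exposed(p)]
    return expanded, clipped


# Hypothetical one-dimensional "image": black, mid-grey, near-white.
expanded, clipped = split_image([0.0, 0.5, 0.98])
print(clipped)  # prints [0, 2]
```

Pixels outside the flagged set can be expanded directly, while the flagged indices mark where a learned model would have to synthesize plausible content, since no amount of per-pixel remapping can recover clipped detail.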

[3]  arXiv:2106.09509 (cross-list from cs.HC) [pdf, other]
Title: Resurrect3D: An Open and Customizable Platform for Visualizing and Analyzing Cultural Heritage Artifacts
Subjects: Human-Computer Interaction (cs.HC); Graphics (cs.GR)

Art and culture, at their best, lie in the act of discovery and exploration. This paper describes Resurrect3D, an open visualization platform for both casual users and domain experts to explore cultural artifacts. To that end, Resurrect3D takes two steps. First, it provides an interactive cultural heritage toolbox that offers not only commonly used cultural heritage tools, such as relighting and material editing, but also the ability for users to create an interactive "story": a saved session with annotations and visualizations that others can later replay. Second, Resurrect3D exposes a set of programming interfaces for extending the toolbox, so that domain experts can develop custom tools that perform artifact-specific visualization and analysis.

[4]  arXiv:2106.09696 (cross-list from cs.CV) [pdf, other]
Title: BABEL: Bodies, Action and Behavior with English Labels
Authors: Abhinanda R. Punnakkal (1), Arjun Chandrasekaran (1), Nikos Athanasiou (1), Alejandra Quiros-Ramirez (2), Michael J. Black (1) ((1) Max Planck Institute for Intelligent Systems, (2) Universitat Konstanz)
Comments: 11 pages, 4 figures, Accepted in CVPR'21
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Machine Learning (cs.LG)

Understanding the semantics of human movement -- the what, how and why of the movement -- is an important problem that requires datasets of human actions with semantic labels. Existing datasets take one of two approaches. Large-scale video datasets contain many action labels but do not contain ground-truth 3D human motion. Alternatively, motion-capture (mocap) datasets have precise body motions but are limited to a small number of actions. To address this, we present BABEL, a large dataset with language labels describing the actions being performed in mocap sequences. BABEL consists of action labels for about 43 hours of mocap sequences from AMASS. Action labels are at two levels of abstraction -- sequence labels describe the overall action in the sequence, and frame labels describe all actions in every frame of the sequence. Each frame label is precisely aligned with the duration of the corresponding action in the mocap sequence, and multiple actions can overlap. There are over 28k sequence labels and 63k frame labels in BABEL, which belong to over 250 unique action categories. Labels from BABEL can be leveraged for tasks like action recognition, temporal action localization, motion synthesis, etc. To demonstrate the value of BABEL as a benchmark, we evaluate the performance of models on 3D action recognition. We demonstrate that BABEL poses interesting learning challenges that are applicable to real-world scenarios, and can serve as a useful benchmark of progress in 3D action recognition. The dataset, baseline method, and evaluation code are made available and supported for academic research purposes at https://babel.is.tue.mpg.de/.
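
The two label levels described in the abstract above (one sequence label per mocap sequence, plus frame labels that are aligned in time and may overlap) can be pictured with a small data structure. This is a hedged sketch only: the class names, field names, and the use of seconds as the time unit are hypothetical and do not reflect the actual BABEL file schema.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class FrameLabel:
    """One action annotated over a time span (hypothetical fields, in seconds)."""
    action: str
    start: float
    end: float


@dataclass
class LabeledSequence:
    """A mocap sequence with one overall label and any number of frame labels."""
    sequence_label: str
    frame_labels: List[FrameLabel]

    def actions_at(self, t: float) -> List[str]:
        """Return every action active at time t; overlapping labels all appear."""
        return sorted(
            label.action
            for label in self.frame_labels
            if label.start <= t < label.end
        )


# Invented example data, not taken from the dataset.
seq = LabeledSequence(
    sequence_label="walk and wave",
    frame_labels=[
        FrameLabel("walk", 0.0, 4.0),
        FrameLabel("wave", 2.5, 3.5),
    ],
)
print(seq.actions_at(3.0))  # prints ['walk', 'wave']
```

Storing each frame label as an interval rather than a per-frame tag keeps overlapping actions independent of one another, which matches the abstract's statement that multiple actions can be active at the same time.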
