Graphics

New submissions

[ total of 13 entries: 1-13 ]

New submissions for Thu, 25 Apr 24

[1]  arXiv:2404.15538 [pdf, other]
Title: DreamCraft: Text-Guided Generation of Functional 3D Environments in Minecraft
Comments: 16 pages, 9 figures, accepted to Foundations of Digital Games 2024
Subjects: Graphics (cs.GR); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Machine Learning (cs.LG)

Procedural Content Generation (PCG) algorithms enable the automatic generation of complex and diverse artifacts. However, they don't provide high-level control over the generated content and typically require domain expertise. In contrast, text-to-3D methods allow users to specify desired characteristics in natural language, offering a high amount of flexibility and expressivity. But unlike PCG, such approaches cannot guarantee functionality, which is crucial for certain applications like game design. In this paper, we present a method for generating functional 3D artifacts from free-form text prompts in the open-world game Minecraft. Our method, DreamCraft, trains quantized Neural Radiance Fields (NeRFs) to represent artifacts that, when viewed in-game, match given text descriptions. We find that DreamCraft produces more aligned in-game artifacts than a baseline that post-processes the output of an unconstrained NeRF. Thanks to the quantized representation of the environment, functional constraints can be integrated using specialized loss terms. We show how this can be leveraged to generate 3D structures that match a target distribution or obey certain adjacency rules over the block types. DreamCraft inherits a high degree of expressivity and controllability from the NeRF, while still being able to incorporate functional constraints through domain-specific objectives.

[2]  arXiv:2404.15661 [pdf, other]
Title: CWF: Consolidating Weak Features in High-quality Mesh Simplification
Comments: 14 pages, 22 figures
Subjects: Graphics (cs.GR); Computational Geometry (cs.CG); Computer Vision and Pattern Recognition (cs.CV)

In mesh simplification, common requirements like accuracy, triangle quality, and feature alignment are often considered as a trade-off. Existing algorithms concentrate on just one or a few specific aspects of these requirements. For example, the well-known Quadric Error Metrics (QEM) approach prioritizes accuracy and can preserve strong feature lines/points as well but falls short in ensuring high triangle quality and may degrade weak features that are not as distinctive as strong ones. In this paper, we propose a smooth functional that simultaneously considers all of these requirements. The functional comprises a normal anisotropy term and a Centroidal Voronoi Tessellation (CVT) energy term, with the variables being a set of movable points lying on the surface. The former inherits the spirit of QEM but operates in a continuous setting, while the latter encourages even point distribution, allowing various surface metrics. We further introduce a decaying weight to automatically balance the two terms. We selected 100 CAD models from the ABC dataset, along with 21 organic models, to compare the existing mesh simplification algorithms with ours. Experimental results reveal an important observation: the introduction of a decaying weight effectively reduces the conflict between the two terms and enables the alignment of weak features. This distinctive feature sets our approach apart from most existing mesh simplification methods and demonstrates significant potential in shape understanding.
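
For readers unfamiliar with the QEM baseline named in the abstract above, the following is a minimal sketch of the classic quadric error measure: the error of a point is the sum of its squared distances to a set of planes, accumulated as a 4x4 matrix. It illustrates only that baseline idea, not the functional proposed in the paper; the function names are hypothetical and the planes are assumed to have unit normals.

```python
def plane_quadric(a, b, c, d):
    """Return the 4x4 quadric p p^T for the plane ax + by + cz + d = 0.

    Assumes (a, b, c) is a unit normal, so p . (x, y, z, 1) is a signed distance.
    """
    p = (a, b, c, d)
    return [[pi * pj for pj in p] for pi in p]


def add_quadrics(q1, q2):
    """Sum two quadrics entry by entry (quadrics of several planes accumulate)."""
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(q1, q2)]


def quadric_error(q, point):
    """Evaluate v^T Q v with v = (x, y, z, 1): the summed squared plane distances."""
    v = (point[0], point[1], point[2], 1.0)
    return sum(v[i] * q[i][j] * v[j] for i in range(4) for j in range(4))


# Illustrative usage: accumulate the planes z = 0 and x = 0, then score a point.
q = add_quadrics(plane_quadric(0.0, 0.0, 1.0, 0.0), plane_quadric(1.0, 0.0, 0.0, 0.0))
error = quadric_error(q, (3.0, 0.0, 4.0))
```

In an edge-collapse simplifier, each vertex would typically carry the accumulated quadric of its incident face planes, and candidate collapses would be ranked by this error. That discrete, greedy setting is what the abstract contrasts with its continuous formulation.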

Cross-lists for Thu, 25 Apr 24

[3]  arXiv:2404.15293 (cross-list from eess.IV) [pdf, other]
Title: Interactive Manipulation and Visualization of 3D Brain MRI for Surgical Training
Subjects: Image and Video Processing (eess.IV); Graphics (cs.GR); Neurons and Cognition (q-bio.NC)

In modern medical diagnostics, magnetic resonance imaging (MRI) is an important technique that provides detailed insights into anatomical structures. In this paper, we present a comprehensive methodology focusing on streamlining the segmentation, reconstruction, and visualization process of 3D MRI data. Segmentation involves the extraction of anatomical regions with the help of state-of-the-art deep learning algorithms. Then, 3D reconstruction converts segmented data from the previous step into multiple 3D representations. Finally, the visualization stage provides efficient and interactive presentations of both 2D and 3D MRI data. According to our interviews with doctors, integrating these three steps allows the proposed system to improve the interpretability of the anatomical information in MRI scans. Although this system was originally designed and implemented as part of a human brain haptic feedback simulation for surgeon training, it can also provide experienced medical practitioners with an effective tool for clinical data analysis, surgical planning, and other purposes.

[4]  arXiv:2404.15378 (cross-list from cs.CV) [pdf, other]
Title: Hierarchical Hybrid Sliced Wasserstein: A Scalable Metric for Heterogeneous Joint Distributions
Authors: Khai Nguyen, Nhat Ho
Comments: 24 pages, 11 figures, 4 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Machine Learning (cs.LG); Machine Learning (stat.ML)

Sliced Wasserstein (SW) and Generalized Sliced Wasserstein (GSW) have been widely used in applications due to their computational and statistical scalability. However, SW and GSW are only defined between distributions supported on a homogeneous domain. This limitation prevents their usage in applications with heterogeneous joint distributions whose marginal distributions are supported on multiple different domains. Using SW and GSW directly on the joint domains cannot make a meaningful comparison, since their homogeneous slicing operators, i.e., the Radon Transform (RT) and the Generalized Radon Transform (GRT), are not expressive enough to capture the structure of the joint support sets. To address the issue, we propose two new slicing operators, i.e., the Partial Generalized Radon Transform (PGRT) and the Hierarchical Hybrid Radon Transform (HHRT). In greater detail, PGRT is the generalization of the Partial Radon Transform (PRT), which transforms a subset of function arguments non-linearly, while HHRT is the composition of PRT and multiple domain-specific PGRTs on marginal domain arguments. By using HHRT, we extend SW into the Hierarchical Hybrid Sliced Wasserstein (H2SW) distance, which is designed specifically for comparing heterogeneous joint distributions. We then discuss the topological, statistical, and computational properties of H2SW. Finally, we demonstrate the favorable performance of H2SW in 3D mesh deformation, deep 3D mesh autoencoders, and dataset comparison.
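
As background for the abstract above, here is a minimal sketch of the standard (homogeneous) Sliced Wasserstein distance that the paper extends: project both samples onto random directions, solve the resulting 1D transport problems by sorting, and average. It does not implement PGRT, HHRT, or H2SW. The function name and parameters are hypothetical, and the sketch assumes two equal-size point clouds with uniform weights.

```python
import math
import random


def sliced_wasserstein(xs, ys, n_projections=200, p=2, seed=0):
    """Monte Carlo estimate of the sliced Wasserstein-p distance.

    xs, ys: equal-length lists of d-dimensional points (tuples of floats).
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal-size samples")
    rng = random.Random(seed)
    dim = len(xs[0])
    total = 0.0
    for _ in range(n_projections):
        # Draw a direction uniformly on the unit sphere (normalized Gaussian).
        direction = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in direction))
        direction = [c / norm for c in direction]
        # Project both clouds to 1D (a linear slice, as in the Radon transform).
        px = sorted(sum(a * b for a, b in zip(pt, direction)) for pt in xs)
        py = sorted(sum(a * b for a, b in zip(pt, direction)) for pt in ys)
        # In 1D, optimal transport between equal-size samples matches sorted values.
        total += sum(abs(a - b) ** p for a, b in zip(px, py)) / len(px)
    return (total / n_projections) ** (1.0 / p)


# Illustrative usage: a 2D point cloud compared with a copy shifted along x.
cloud = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
shifted = [(x + 1.0, y) for x, y in cloud]
distance = sliced_wasserstein(cloud, shifted)
```

Every coordinate is projected with the same linear slice here, which is the homogeneity the abstract identifies as a limitation when the marginals live on different domains.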

[5]  arXiv:2404.15889 (cross-list from cs.CV) [pdf, other]
Title: Sketch2Human: Deep Human Generation with Disentangled Geometry and Appearance Control
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

Geometry- and appearance-controlled full-body human image generation is an interesting but challenging task. Existing solutions are either unconditional or dependent on coarse conditions (e.g., pose, text), thus lacking explicit geometry and appearance control of body and garment. Sketching offers such editing ability and has been adopted in various sketch-based face generation and editing solutions. However, directly adapting sketch-based face generation to full-body generation often fails to produce high-fidelity and diverse results due to the high complexity and diversity in the pose, body shape, and garment shape and texture. Recent geometrically controllable diffusion-based methods mainly rely on prompts to generate appearance, and it is hard to balance the realism of their results against faithfulness to the sketch when the input is coarse. This work presents Sketch2Human, the first system for controllable full-body human image generation guided by a semantic sketch (for geometry control) and a reference image (for appearance control). Our solution is based on the latent space of StyleGAN-Human with inverted geometry and appearance latent codes as input. Specifically, we present a sketch encoder trained with a large synthetic dataset sampled from StyleGAN-Human's latent space and directly supervised by sketches rather than real images. Considering the entangled information of partial geometry and texture in StyleGAN-Human and the absence of disentangled datasets, we design a novel training scheme that creates geometry-preserved and appearance-transferred training data to tune a generator to achieve disentangled geometry and appearance control. Although our method is trained with synthetic data, it can handle hand-drawn sketches as well. Qualitative and quantitative evaluations demonstrate the superior performance of our method over state-of-the-art methods.

[6]  arXiv:2404.15976 (cross-list from cs.HC) [pdf, other]
Title: The State of the Art in Visual Analytics for 3D Urban Data
Comments: Accepted at EuroVis 2024 (STAR track). Surveyed works available at this https URL
Subjects: Human-Computer Interaction (cs.HC); Computers and Society (cs.CY); Graphics (cs.GR)

Urbanization has amplified the importance of three-dimensional structures in urban environments for a wide range of phenomena that are of significant interest to diverse stakeholders. With the growing availability of 3D urban data, numerous studies have focused on developing visual analysis techniques tailored to the unique characteristics of urban environments. However, incorporating the third dimension into visual analytics introduces additional challenges in designing effective visual tools to tackle urban data's diverse complexities. In this paper, we present a survey on visual analytics of 3D urban data. Our work characterizes published works along three main dimensions (why, what, and how), considering use cases, analysis tasks, data, visualizations, and interactions. We provide a fine-grained categorization of published works from visualization journals and conferences, as well as from a myriad of urban domains, including urban planning, architecture, and engineering. By incorporating perspectives from both urban and visualization experts, we identify literature gaps, motivate visualization researchers to understand challenges and opportunities, and indicate future research directions.

Replacements for Thu, 25 Apr 24

[7]  arXiv:2307.10135 (replaced) [pdf, other]
Title: A Hierarchical Architecture for Neural Materials
Subjects: Graphics (cs.GR); Computer Vision and Pattern Recognition (cs.CV)
[8]  arXiv:2212.06310 (replaced) [pdf, other]
Title: Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators
Comments: 18 pages, 16 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[9]  arXiv:2310.17994 (replaced) [pdf, other]
Title: ZeroNVS: Zero-Shot 360-Degree View Synthesis from a Single Image
Comments: Accepted to CVPR 2024. 12 pages
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[10]  arXiv:2312.09222 (replaced) [pdf, other]
Title: Mosaic-SDF for 3D Generative Models
Comments: More results and details can be found at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[11]  arXiv:2402.16558 (replaced) [pdf, other]
Title: Open Your Ears and Take a Look: A State-of-the-Art Report on the Integration of Sonification and Visualization
Comments: 30 pages, 9 figures, accepted for EuroVis 2024 conference
Subjects: Human-Computer Interaction (cs.HC); Graphics (cs.GR); Sound (cs.SD); Audio and Speech Processing (eess.AS)
[12]  arXiv:2402.18196 (replaced) [pdf, other]
Title: NToP: NeRF-Powered Large-scale Dataset Generation for 2D and 3D Human Pose Estimation in Top-View Fisheye Images
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
[13]  arXiv:2404.06279 (replaced) [pdf, other]
Title: NoiseNCA: Noisy Seed Improves Spatio-Temporal Continuity of Neural Cellular Automata
Comments: 9 pages, 12 figures
Subjects: Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Graphics (cs.GR); Multiagent Systems (cs.MA)
