New submissions

New submissions for Fri, 31 Mar 23

[1]  arXiv:2303.17041 [pdf, other]
Title: The secret of immersion: actor driven camera movement generation for auto-cinematography
Subjects: Multimedia (cs.MM); Graphics (cs.GR); Machine Learning (cs.LG)

Immersion plays a vital role in designing cinematic creations, yet the difficulty of immersive shooting prevents designers from creating satisfactory outputs. In this work, we analyze the specific components that contribute to cinematographic immersion at the spatial, emotional, and aesthetic levels, and combine these components into a high-level evaluation mechanism. Guided by this immersion mechanism, we propose a GAN-based camera control system that generates actor-driven camera movements in a 3D virtual environment to obtain immersive film sequences. The proposed encoder-decoder architecture in the generation flow transfers character motion into a camera trajectory conditioned on an emotion factor. This ensures spatial and emotional immersion by synchronizing actor and camera both physically and psychologically. Emotional immersion is further strengthened by a regularization term that controls camera shakiness to express different mental states. To achieve aesthetic immersion, we improve frame composition by modifying the synthesized camera trajectory. Based on a self-supervised adjustor, the adjusted camera placements project the character to appropriate on-frame locations that follow aesthetic rules. The experimental results indicate, both quantitatively and qualitatively, that our proposed camera control system can efficiently produce immersive cinematic videos based on fine-grained immersive shooting. Live examples are shown in the supplementary video.

[2]  arXiv:2303.17094 [pdf, other]
Title: Enhanced Stable View Synthesis
Comments: Accepted to IEEE/CVF CVPR 2023. Draft info: 13 pages, 6 Figures, 7 Tables
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

We introduce an approach to enhance novel view synthesis from images taken with a freely moving camera. The approach focuses on outdoor scenes, where recovering an accurate geometric scaffold and camera poses is challenging, leading to inferior results with the state-of-the-art stable view synthesis (SVS) method. SVS and related methods fail for outdoor scenes primarily because they (i) over-rely on multiview stereo (MVS) for geometric scaffold recovery and (ii) assume COLMAP-computed camera poses to be the best possible estimates, even though it is well known that MVS 3D reconstruction accuracy is limited by scene disparity and that camera-pose accuracy is sensitive to key-point correspondence selection. This work proposes a principled way to enhance novel view synthesis solutions, drawing inspiration from the basics of multiple view geometry. By leveraging the complementary behavior of MVS and monocular depth, we arrive at a better scene depth per view for nearby and far points, respectively. Moreover, our approach jointly refines camera poses with image-based rendering via multiple rotation averaging graph optimization. The recovered scene depth and camera poses enable better view-dependent on-surface feature aggregation across the entire scene. Extensive evaluation of our approach on popular benchmark datasets, such as Tanks and Temples, shows substantial improvement in view synthesis results compared to the prior art. For instance, our method shows a 1.5 dB PSNR improvement on Tanks and Temples. Similar statistics are observed when tested on other benchmark datasets such as FVS, Mip-NeRF 360, and DTU.
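
To put the reported 1.5 dB PSNR gain in context, the sketch below computes PSNR from mean squared error and converts a dB gain into the corresponding MSE reduction. It is a generic illustration of the metric, not the authors' evaluation code; the function names `psnr` and `mse_ratio_for_gain` and the assumption that pixel values lie in [0, 1] are ours.

```python
import math


def psnr(reference, rendered, max_value=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumption: pixels are floats in [0, max_value]; images are flattened lists.
    """
    if len(reference) != len(rendered):
        raise ValueError("images must have the same number of pixels")
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE is multiplied when PSNR rises by gain_db."""
    return 10.0 ** (-gain_db / 10.0)


if __name__ == "__main__":
    # A PSNR gain of 1.5 dB corresponds to roughly a 29% lower MSE.
    print(round(mse_ratio_for_gain(1.5), 3))
```

Because PSNR is logarithmic in the MSE, a fixed dB gain always corresponds to the same relative error reduction, regardless of the baseline quality.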

[3]  arXiv:2303.17108 [pdf, other]
Title: B-spline freeform surface tailoring for prescribed irradiance based on differentiable ray-tracing
Subjects: Optics (physics.optics); Graphics (cs.GR)

Developing a universal and flexible design method for freeform surfaces that can modulate the distribution of a zero-étendue source into an arbitrary irradiance distribution is a significant challenge in the field of non-imaging optics. Current design methods typically formulate the problem as a partial differential equation and solve it through sophisticated numerical methods, especially for off-axis situations. However, most current methods are unsuitable for directly solving multi-freeform-surface or hybrid design problems that contain both freeform and spherical surfaces. To address these challenges, we propose the B-spline surface tailoring method, based on a differentiable ray-tracing algorithm. Our method features a computationally efficient B-spline model and a two-step optimization strategy based on optimal transport mapping. This allows for rapid, iterative adjustments to the surface shape based on deviations between the simulated and target distributions while ensuring a smooth resulting surface shape. In experiments, the proposed approach performs well in both paraxial and off-axis situations, and exhibits superior flexibility when applied to a hybrid design case.
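
As a rough illustration of what a B-spline surface model involves, the sketch below evaluates a tensor-product B-spline height field with the standard Cox-de Boor recursion. The abstract does not specify the parameterization, so everything here is an assumption for illustration: the names `bspline_basis` and `surface_height`, the cubic degree, the clamped knot vector, and the 4x4 grid of control heights.

```python
def bspline_basis(i, degree, t, knots):
    """Cox-de Boor recursion: value of the i-th basis function of given degree at t."""
    if degree == 0:
        # Half-open support, so evaluation is defined on [knots[0], knots[-1]).
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left_den = knots[i + degree] - knots[i]
    right_den = knots[i + degree + 1] - knots[i + 1]
    left = 0.0
    if left_den > 0:  # skip terms with repeated knots (0/0 is treated as 0)
        left = (t - knots[i]) / left_den * bspline_basis(i, degree - 1, t, knots)
    right = 0.0
    if right_den > 0:
        right = ((knots[i + degree + 1] - t) / right_den
                 * bspline_basis(i + 1, degree - 1, t, knots))
    return left + right


def surface_height(u, v, control, knots_u, knots_v, degree=3):
    """Tensor-product B-spline height z(u, v) from a grid of control heights."""
    total = 0.0
    for i, row in enumerate(control):
        basis_u = bspline_basis(i, degree, u, knots_u)
        if basis_u == 0.0:
            continue
        for j, height in enumerate(row):
            total += basis_u * bspline_basis(j, degree, v, knots_v) * height
    return total


# Hypothetical setup: clamped cubic knots and a flat 4x4 control grid.
KNOTS = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
FLAT = [[2.0] * 4 for _ in range(4)]

if __name__ == "__main__":
    print(surface_height(0.3, 0.7, FLAT, KNOTS, KNOTS))
```

Each surface point depends linearly on the control heights, which is one reason such a model pairs naturally with gradient-based optimization; how the authors couple it to their differentiable ray tracer is not described in the abstract.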

[4]  arXiv:2303.17181 [pdf, other]
Title: Implicit View-Time Interpolation of Stereo Videos using Multi-Plane Disparities and Non-Uniform Coordinates
Comments: Accepted to CVPR 2023. Project page at this https URL and video at this https URL
Subjects: Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)

In this paper, we propose an approach for view-time interpolation of stereo videos. Specifically, we build upon X-Fields, which approximates an interpolatable mapping between the input coordinates and 2D RGB images using a convolutional decoder. Our main contribution is to identify and analyze the sources of the problems that arise when using X-Fields in our application and to propose novel techniques that overcome them. In particular, we observe that X-Fields struggles to implicitly interpolate the disparities for large-baseline cameras. Therefore, we propose multi-plane disparities to reduce the spatial distance of the objects in the stereo views. Moreover, we propose non-uniform time coordinates to handle the non-linear and sudden motion spikes in videos. We additionally introduce several simple but important improvements over X-Fields. We demonstrate that our approach produces better results than the state of the art, while running at near real-time rates and having low memory and storage costs.
