Flash Cache: Reducing Bias in Radiance Cache Based Inverse Rendering


¹Carnegie Mellon University ²Google Research
ECCV 2024 (Oral)


Abstract

State-of-the-art techniques for 3D reconstruction are largely based on volumetric scene representations, which require sampling multiple points to compute the color arriving along a ray. Using these representations for more general inverse rendering (reconstructing geometry, materials, and lighting from observed images) is challenging because recursively path-tracing such volumetric representations is expensive. Recent works alleviate this issue through the use of radiance caches: data structures that store the steady-state, infinite-bounce radiance arriving at any point from any direction. However, these solutions rely on approximations that introduce bias into the renderings and, more importantly, into the gradients used for optimization. We present a method that avoids these approximations while remaining computationally efficient. In particular, we leverage two techniques to reduce variance for unbiased estimators of the rendering equation:

  1. An occlusion-aware importance sampler for incoming illumination, and
  2. A fast cache architecture that can be used as a control variate for the radiance from a high-quality, but more expensive, volumetric cache.

We show that by removing these biases, our approach improves the generality of radiance cache based inverse rendering and increases quality in the presence of challenging light transport effects such as specular reflections.


Radiance Caching for Inverse Rendering

Radiance caching is a technique to accelerate Monte Carlo estimation of the rendering integral by "caching" the distribution of incoming radiance at every point in the scene.
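As a sketch in standard notation, the integral in question and its Monte Carlo estimate with a cache can be written as follows. The symbols are conventional choices rather than notation taken from this page, and emission is omitted: f is the BRDF, n the surface normal, p the sampling density, and L_cache the cached incoming radiance.

```latex
L_o(\mathbf{x}, \omega_o)
  = \int_{S^2} f(\mathbf{x}, \omega_o, \omega_i)\,
    L_i(\mathbf{x}, \omega_i)\, \max(0, \mathbf{n} \cdot \omega_i)\, d\omega_i
  \approx \frac{1}{N} \sum_{k=1}^{N}
    \frac{f(\mathbf{x}, \omega_o, \omega_k)\,
          L_{\text{cache}}(\mathbf{x}, \omega_k)\,
          \max(0, \mathbf{n} \cdot \omega_k)}{p(\omega_k)},
  \qquad \omega_k \sim p
```

Querying the cache for L_i replaces the recursive path tracing that would otherwise be needed to evaluate the incoming radiance.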


A neural radiance field (NeRF) can function as a high-quality radiance cache, one that requires only sparse supervision to train. To illustrate, we show the distribution of incoming radiance at the 3D point seen through the red pixel as that pixel sweeps across the scene. The radiance for each incoming ray is queried by rendering from the NeRF cache.

[Figure panels: Point to visualize, NeRF cache]

In inverse rendering, our goal is to reconstruct the scene geometry, materials, and lighting from observed images by optimizing through a differentiable rendering model. A more accurate rendering model often means better reconstruction quality. Here, we aim to make the rendering model as physically accurate as possible by combining (a) volume rendering and (b) physically-based Monte Carlo rendering with a high-quality radiance cache.
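The volume rendering half of this model can be sketched with the standard NeRF-style quadrature. This is a minimal illustration, not the authors' implementation: the function name `volume_render` and its inputs are hypothetical. In the combined model described above, the per-sample colors would come from Monte Carlo shading with the radiance cache; here they are simply passed in.

```python
import numpy as np

def volume_render(densities, colors, deltas):
    """Composite per-sample colors along one ray (standard NeRF quadrature).

    densities: (S,) volume density at each sample along the ray
    colors:    (S, 3) outgoing radiance at each sample
    deltas:    (S,) distance between consecutive samples
    """
    # Opacity contributed by each ray segment.
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance reaching each sample: product of (1 - alpha) over earlier samples.
    trans = np.concatenate([[1.0], np.cumprod(1.0 - alphas)[:-1]])
    weights = trans * alphas
    return weights @ colors, weights

# Usage: three samples, the middle one nearly opaque.
rgb, weights = volume_render(
    densities=np.array([0.0, 50.0, 1.0]),
    colors=np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]),
    deltas=np.full(3, 0.1),
)
```

Because the weights are differentiable in the densities and colors, gradients from an image loss can flow back to both geometry and shading.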


Occlusion-Aware Importance Sampling

Unfortunately, high-quality radiance caches, like the one above, are often expensive to evaluate. Luckily, variance reduction strategies, such as importance sampling, can help accelerate inference while decreasing rendering noise.


We leverage a von Mises-Fisher (vMF) importance sampler of incoming illumination, which is trained to match the distribution of incoming radiance from the cache. Specifically, we use an NGP (Instant Neural Graphics Primitives) network to predict the parameters of a mixture of vMFs for each point in space, which can produce different lobes for different light sources. Because the parameters of these distributions are spatially varying, if a light source is occluded from the perspective of one surface point, then the model can "turn off" the lobe for that light source at that point.
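To make the sampling step concrete, here is a minimal NumPy sketch of drawing directions from a vMF mixture and evaluating its density. It rests on stated assumptions: the mixture parameters are fixed constants here, whereas the page describes predicting them per point with an NGP, and all function names are hypothetical.

```python
import numpy as np

def vmf_pdf(dirs, mu, kappa):
    """Density of one vMF lobe on the unit sphere (numerically stable form)."""
    norm = kappa / (2.0 * np.pi * (1.0 - np.exp(-2.0 * kappa)))
    return norm * np.exp(kappa * (dirs @ mu - 1.0))

def sample_vmf(mu, kappa, n, rng):
    """Draw n unit directions from one vMF lobe by inverting the CDF."""
    u = rng.random(n)
    # w is the cosine of the angle between the sample and the lobe axis mu.
    w = 1.0 + np.log(u + (1.0 - u) * np.exp(-2.0 * kappa)) / kappa
    phi = 2.0 * np.pi * rng.random(n)
    r = np.sqrt(np.clip(1.0 - w * w, 0.0, None))
    # Orthonormal frame (t, b, mu) around the lobe axis.
    helper = np.array([1.0, 0.0, 0.0]) if abs(mu[2]) > 0.999 else np.array([0.0, 0.0, 1.0])
    t = np.cross(helper, mu)
    t /= np.linalg.norm(t)
    b = np.cross(mu, t)
    tangent = np.cos(phi)[:, None] * t + np.sin(phi)[:, None] * b
    return r[:, None] * tangent + w[:, None] * mu

def mixture_pdf(dirs, weights, mus, kappas):
    """Density of the full mixture: weighted sum of the lobe densities."""
    return sum(w * vmf_pdf(dirs, m, k) for w, m, k in zip(weights, mus, kappas))

def sample_mixture(weights, mus, kappas, n, rng):
    """Pick a lobe per sample by weight, sample it, and return (dirs, pdf)."""
    lobe = rng.choice(len(weights), size=n, p=weights)
    dirs = np.empty((n, 3))
    for j in range(len(weights)):
        mask = lobe == j
        dirs[mask] = sample_vmf(mus[j], kappas[j], int(mask.sum()), rng)
    return dirs, mixture_pdf(dirs, weights, mus, kappas)

# Usage: two lobes standing in for two light sources (assumed directions).
rng = np.random.default_rng(0)
mus = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 0.0]])
kappas = np.array([20.0, 5.0])
weights = np.array([0.7, 0.3])  # a weight of 0 "turns off" that light's lobe
dirs, pdfs = sample_mixture(weights, mus, kappas, 1024, rng)
```

Dividing each sampled integrand value by the returned mixture density gives the usual importance-sampled estimate. The density must be that of the whole mixture, not of the lobe that happened to generate the sample.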

[Figure panels: NeRF cache, vMFs]

Fast Cache Control Variate

In addition to our occlusion-aware importance sampling scheme, we make use of control variates, another variance reduction technique from Monte Carlo rendering.


Specifically, we use a "fast cache": a cheaper cache architecture that is trained to match the more expensive NeRF cache. On its own, the fast cache yields a low-noise but biased estimator of the rendering integral. Combining it with an unbiased estimator from the NeRF cache reduces noise in the final rendering while keeping the result unbiased.
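The general control variate idea can be illustrated with a one-dimensional toy problem. This is a sketch under stated assumptions rather than the page's actual estimator: `expensive_cache` and `fast_cache` are hypothetical stand-in functions on [0, 1], and the sample counts are arbitrary. Many cheap samples estimate the integral of the fast cache, and a few expensive samples estimate the residual between the two caches.

```python
import numpy as np

def expensive_cache(x):
    """Toy stand-in for the high-quality cache (hypothetical integrand)."""
    return np.sin(np.pi * x) ** 2 + 0.05 * np.cos(40.0 * x)

def fast_cache(x):
    """Toy stand-in for the cheap cache: close to expensive_cache, not equal."""
    return np.sin(np.pi * x) ** 2

def plain_estimate(n_expensive, rng):
    """Baseline: average a few expensive samples."""
    return expensive_cache(rng.random(n_expensive)).mean()

def control_variate_estimate(n_expensive, n_fast, rng):
    """Fast-cache integral plus an unbiased correction from the residual."""
    # Low-noise term: many cheap samples. Biased on its own, since
    # fast_cache != expensive_cache.
    fast_term = fast_cache(rng.random(n_fast)).mean()
    # Correction: a few expensive samples of the (small) residual.
    x = rng.random(n_expensive)
    residual = (expensive_cache(x) - fast_cache(x)).mean()
    return fast_term + residual

# Usage: compare the spread of both estimators at equal expensive-sample count.
rng = np.random.default_rng(0)
plain = np.array([plain_estimate(8, rng) for _ in range(2000)])
cv = np.array([control_variate_estimate(8, 512, rng) for _ in range(2000)])
print("plain std:", plain.std(), "control variate std:", cv.std())
```

The sum stays unbiased because the expectation of the residual term exactly cancels the fast cache's bias. The variance drops because the residual is much smaller in magnitude than the full integrand.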

[Figure panels: NeRF cache, Fast cache]

Full System

Our full system for inverse rendering combines the high-quality NeRF radiance cache, occlusion-aware importance sampling, and the fast cache control variate to produce a low-cost and accurate approximation of the rendering integral, while still using a volume rendering image formation model for primary rays.

Below, we show equal-sample-count renderings using (a) both the occlusion-aware importance sampler and fast cache control variate, (b) only the occlusion-aware importance sampler, and (c) neither.

[Figure panels: Ground truth, (a) With Both, (b) With Sampler, (c) With Neither]

Results

Below, we show results of our inverse rendering system on the TensoIR-synthetic dataset.


Acknowledgements

This work was carried out while Benjamin was an intern at Google Research. The authors thank Rick Szeliski, Aleksander Holynski, and Janne Kontkanen for fruitful discussions, as well as Isabella Liu for help with TensoIR comparisons. Benjamin Attal is supported by a Meta Research PhD Fellowship. Matthew O'Toole acknowledges support from NSF CAREER 2238485 and a Google gift.
