A long-standing and conceptually elegant view of computer vision is to use a generative model of the physical image formation process, together with posterior inference, to infer or explain the image observations. A key problem in this inverse graphics view is the difficulty of posterior inference at run time. This difficulty stems from a number of causes: (1) the high dimensionality of the posterior, (2) complex and dynamic dependencies between model parameters, and (3) the expense of forward graphics simulations. We address these issues in terms of local and global optimization.
For local optimization, we propose an approximate differentiable renderer (DR) [ ] that explicitly models the relationship between changes in model parameters and changes in image observations. The OpenDR framework makes it easy to express a forward graphics model, automatically obtain derivatives with respect to the model parameters, and optimize over them. Built on a new auto-differentiation package and OpenGL, OpenDR provides a local optimization method that can be incorporated into probabilistic programming frameworks. We demonstrate the power and simplicity of programming with OpenDR by using it to solve the problem of estimating human body shape from Kinect depth and RGB data.
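The idea of optimizing model parameters through the derivatives of a renderer can be illustrated with a minimal sketch. This is not the OpenDR API: the one-dimensional `render` function, its hand-written `render_jacobian`, and the plain gradient-descent loop in `fit_center` are all hypothetical stand-ins, chosen only to show how a derivative of the image with respect to a parameter drives a local fit to an observation.

```python
import numpy as np

def render(center, xs, width=1.0):
    # Toy "forward graphics model": a 1-D Gaussian blob whose position is the
    # only model parameter. A stand-in for a real renderer, not OpenDR.
    return np.exp(-0.5 * ((xs - center) / width) ** 2)

def render_jacobian(center, xs, width=1.0):
    # Analytic derivative of every pixel with respect to the center parameter.
    return render(center, xs, width) * (xs - center) / width ** 2

def fit_center(observed, xs, init, lr=0.05, n_iters=100):
    # Gradient descent on the sum-of-squares image error
    # E(center) = 0.5 * sum((render(center) - observed) ** 2).
    center = init
    for _ in range(n_iters):
        residual = render(center, xs) - observed
        grad = np.sum(residual * render_jacobian(center, xs))  # chain rule
        center -= lr * grad
    return center

xs = np.linspace(0.0, 10.0, 201)       # pixel coordinates
observed = render(6.0, xs)             # synthetic observation, true center 6.0
estimate = fit_center(observed, xs, init=4.0)
```

Because the update only follows the local gradient, it recovers the true center here only when the initial guess is close enough for the rendered and observed blobs to overlap, which is the sense in which such a method is a local optimizer.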
For more global optimization, we also propose the informed sampler [ ], which leverages computer vision features and algorithms to make informed proposals for the state of the latent variables. These proposals are accepted or rejected based on the generative graphics model. The informed sampler is simple and easy to implement, yet it enables inference in generative models that were out of reach for current uninformed samplers. We demonstrate this claim on challenging models that incorporate rendering engines, object occlusion, ill-posedness, and multi-modality.
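The propose-then-accept-or-reject scheme described above can be sketched as a Metropolis-Hastings sampler that mixes two kernels: an independence kernel drawing from an informed proposal, and a local random walk. Everything concrete here is an assumption made for illustration: the bimodal `log_target` stands in for a posterior defined by a generative model, `log_informed` and `sample_informed` stand in for a proposal built from discriminative vision features, and the mixing weight `alpha` is arbitrary. It is a toy instance of the general idea, not the authors' implementation.

```python
import numpy as np

def log_gauss(x, mu, sigma):
    return -0.5 * ((x - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))

def log_target(theta):
    # Hypothetical unnormalized posterior with two well-separated modes.
    return np.logaddexp(log_gauss(theta, -4.0, 0.5), log_gauss(theta, 4.0, 0.5))

def log_informed(theta):
    # Density of the hypothetical informed proposal: a broad, slightly
    # misplaced two-component mixture, as an imperfect detector might give.
    return np.log(0.5) + np.logaddexp(log_gauss(theta, -3.5, 1.0),
                                      log_gauss(theta, 4.5, 1.0))

def sample_informed(rng):
    return rng.normal(rng.choice([-3.5, 4.5]), 1.0)

def informed_sampler(n_steps, alpha=0.3, step=0.5, seed=0):
    rng = np.random.default_rng(seed)
    theta = -4.0
    samples = np.empty(n_steps)
    for i in range(n_steps):
        if rng.random() < alpha:
            # Global move: independent draw from the informed proposal,
            # with the Metropolis-Hastings correction for its density.
            prop = sample_informed(rng)
            log_ratio = (log_target(prop) - log_target(theta)
                         + log_informed(theta) - log_informed(prop))
        else:
            # Local move: symmetric random walk, so the proposal terms cancel.
            prop = theta + rng.normal(0.0, step)
            log_ratio = log_target(prop) - log_target(theta)
        if np.log(rng.random()) < log_ratio:
            theta = prop
        samples[i] = theta
    return samples

samples = informed_sampler(20000)
frac_right = np.mean(samples > 0)   # fraction of samples in the right-hand mode
```

Setting `alpha=0.0` reduces this to an uninformed random-walk sampler, which in this toy setup stays in the mode where it starts; the informed moves are what let the chain visit both modes, while the acceptance test keeps the target distribution as the stationary distribution.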