

2018


Model-based Optical Flow: Layers, Learning, and Geometry

Wulff, J.

Tuebingen University, April 2018 (phdthesis)

Abstract
The estimation of motion in video sequences establishes temporal correspondences between pixels and surfaces and allows reasoning about a scene using multiple frames. Despite being a focus of research for over three decades, computing motion, or optical flow, remains challenging due to a number of difficulties, including the treatment of motion discontinuities and occluded regions, and the integration of information from more than two frames. One reason for these issues is that most optical flow algorithms only reason about the motion of pixels on the image plane, while not taking the image formation pipeline or the 3D structure of the world into account. One approach to address this uses layered models, which represent the occlusion structure of a scene and provide an approximation to the geometry. The goal of this dissertation is to show ways to inject additional knowledge about the scene into layered methods, making them more robust, faster, and more accurate. First, this thesis demonstrates the modeling power of layers using the example of motion blur in videos, which is caused by fast motion relative to the exposure time of the camera. Layers segment the scene into regions that move coherently while preserving their occlusion relationships. The motion of each layer therefore directly determines its motion blur. At the same time, the layered model captures complex blur overlap effects at motion discontinuities. Using layers, we can thus formulate a generative model for blurred video sequences, and use this model to simultaneously deblur a video and compute accurate optical flow for highly dynamic scenes containing motion blur. Next, we consider the representation of the motion within layers. Since, in a layered model, important motion discontinuities are captured by the segmentation into layers, the flow within each layer varies smoothly and can be approximated using a low-dimensional subspace. We show how this subspace can be learned from training data using principal component analysis (PCA), and that flow estimation using this subspace is computationally efficient. The combination of the layered model and the low-dimensional subspace gives the best of both worlds: sharp motion discontinuities from the layers and computational efficiency from the subspace. Lastly, we show how layered methods can be dramatically improved using simple semantics. Instead of treating all layers equally, a semantic segmentation divides the scene into its static parts and moving objects. Static parts of the scene constitute a large majority of what is shown in typical video sequences; yet, in such regions optical flow is fully constrained by the depth structure of the scene and the camera motion. After segmenting out moving objects, we consider only static regions, and explicitly reason about the structure of the scene and the camera motion, yielding much better optical flow estimates. Furthermore, computing the structure of the scene allows information from multiple frames to be combined more effectively, resulting in high accuracy even in occluded regions. For moving regions, we compute the flow using a generic optical flow method, and combine it with the flow computed for the static regions to obtain a full optical flow field. By combining layered models of the scene with reasoning about the dynamic behavior of the real, three-dimensional world, the methods presented herein push the envelope of optical flow computation in terms of robustness, speed, and accuracy, giving state-of-the-art results on benchmarks and pointing to important future research directions for the estimation of motion in natural scenes.

Official link DOI Project Page [BibTex]


Co-Registration – Simultaneous Alignment and Modeling of Articulated 3D Shapes

Black, M., Hirshberg, D., Loper, M., Rachlin, E., Weiss, A.

February 2018, U.S. Patent 9,898,848 (misc)

Abstract
The present application refers to a method, a model generation unit and a computer program (product) for generating trained models (M) of moving persons, based on physically measured person scan data (S). The approach is based on a common template (T) for the respective person and on the measured person scan data (S) in different shapes and different poses. Scan data are measured with a 3D laser scanner. A generic person model is used for co-registering a set of person scan data (S), aligning the template (T) to the set of person scans (S) while simultaneously training the generic person model to become a trained person model (M) by constraining the generic person model to be scan-specific, person-specific and pose-specific, and providing the trained model (M), based on the co-registering of the measured object scan data (S).

text [BibTex]

2014


Modeling the Human Body in 3D: Data Registration and Human Shape Representation

Tsoli, A.

Brown University, Department of Computer Science, May 2014 (phdthesis)

pdf [BibTex]

2013


Statistics on Manifolds with Applications to Modeling Shape Deformations

Freifeld, O.

Brown University, August 2013 (phdthesis)

Abstract
Statistical models of non-rigid deformable shape have wide application in many fields, including computer vision, computer graphics, and biometry. We show that shape deformations are well represented through nonlinear manifolds that are also matrix Lie groups. These pattern-theoretic representations lead to several advantages over other alternatives, including a principled measure of shape dissimilarity and a natural way to compose deformations. Moreover, they enable building models using statistics on manifolds. Consequently, such models are superior to those based on Euclidean representations. We demonstrate this by modeling 2D and 3D human body shape. Shape deformations are only one example of manifold-valued data. More generally, in many computer-vision and machine-learning problems, nonlinear manifold representations arise naturally and provide a powerful alternative to Euclidean representations. Statistics is traditionally concerned with data in a Euclidean space, relying on the linear structure and the distances associated with such a space; this renders it inappropriate for nonlinear spaces. Statistics can, however, be generalized to nonlinear manifolds. Moreover, by respecting the underlying geometry, the statistical models result in not only more effective analysis but also consistent synthesis. We go beyond previous work on statistics on manifolds by showing how, even on these curved spaces, problems related to modeling a class from scarce data can be dealt with by leveraging information from related classes residing in different regions of the space. We show the usefulness of our approach with 3D shape deformations. To summarize our main contributions: 1) We define a new 2D articulated model -- more expressive than traditional ones -- of deformable human shape that factors body-shape, pose, and camera variations. Its high realism is obtained from training data generated from a detailed 3D model. 2) We define a new manifold-based representation of 3D shape deformations that yields statistical deformable-template models that are better than the current state-of-the-art. 3) We generalize a transfer learning idea from Euclidean spaces to Riemannian manifolds. This work demonstrates the value of modeling manifold-valued data and their statistics explicitly on the manifold. Specifically, the methods here provide new tools for shape analysis.

pdf Project Page [BibTex]