14 results

2015


Thumb xl grassmanteaser
Scalable Robust Principal Component Analysis using Grassmann Averages

Hauberg, S., Feragen, A., Enficiaud, R., Black, M.

IEEE Trans. Pattern Analysis and Machine Intelligence (PAMI), December 2015 (article)

Abstract
In large datasets, manual data verification is impossible, and we must expect the number of outliers to increase with data size. While principal component analysis (PCA) can reduce data size, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA are not scalable. We note that in a zero-mean dataset, each observation spans a one-dimensional subspace, giving a point on the Grassmann manifold. We show that the average subspace corresponds to the leading principal component for Gaussian data. We provide a simple algorithm for computing this Grassmann Average (GA), and show that the subspace estimate is less sensitive to outliers than PCA for general distributions. Because averages can be efficiently computed, we immediately gain scalability. We exploit robust averaging to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. The resulting Trimmed Grassmann Average (TGA) is appropriate for computer vision because it is robust to pixel outliers. The algorithm has linear computational complexity and minimal memory requirements. We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie; a task beyond any current method. Source code is available online.

preprint pdf from publisher supplemental Project Page [BibTex]

2015


Thumb xl teaser
Permutohedral Lattice CNNs

Kiefel, M., Jampani, V., Gehler, P. V.

In ICLR Workshop Track, May 2015 (inproceedings)

Abstract
This paper presents a convolutional layer that is able to process sparse input features. As an example, for image recognition problems this allows an efficient filtering of signals that do not lie on a dense grid (like pixel position), but of more general features (such as color values). The presented algorithm makes use of the permutohedral lattice data structure. The permutohedral lattice was introduced to efficiently implement a bilateral filter, a commonly used image processing operation. Its use allows for a generalization of the convolution type found in current (spatial) convolutional network architectures.

pdf link (url) Project Page [BibTex]


Thumb xl jampani15aistats teaser
Consensus Message Passing for Layered Graphical Models

Jampani, V., Eslami, S. M. A., Tarlow, D., Kohli, P., Winn, J.

In Eighteenth International Conference on Artificial Intelligence and Statistics (AISTATS), 38, pages: 425-433, JMLR Workshop and Conference Proceedings, May 2015 (inproceedings)

Abstract
Generative models provide a powerful framework for probabilistic reasoning. However, in many domains their use has been hampered by the practical difficulties of inference. This is particularly the case in computer vision, where models of the imaging process tend to be large, loopy and layered. For this reason bottom-up conditional models have traditionally dominated in such domains. We find that widely-used, general-purpose message passing inference algorithms such as Expectation Propagation (EP) and Variational Message Passing (VMP) fail on the simplest of vision models. With these models in mind, we introduce a modification to message passing that learns to exploit their layered structure by passing 'consensus' messages that guide inference towards good solutions. Experiments on a variety of problems show that the proposed technique leads to significantly more accurate inference results, not only when compared to standard EP and VMP, but also when compared to competitive bottom-up conditional models.

online pdf supplementary link (url) Project Page [BibTex]

online pdf supplementary link (url) Project Page [BibTex]

2014


Thumb xl thumb 9780262028370
Advanced Structured Prediction

Nowozin, S., Gehler, P. V., Jancsary, J., Lampert, C. H.

Advanced Structured Prediction, pages: 432, Neural Information Processing Series, MIT Press, November 2014 (book)

Abstract
The goal of structured prediction is to build machine learning models that predict relational information that itself has structure, such as being composed of multiple interrelated parts. These models, which reflect prior knowledge, task-specific relations, and constraints, are used in fields including computer vision, speech recognition, natural language processing, and computational biology. They can carry out such tasks as predicting a natural language sentence, or segmenting an image into meaningful components. These models are expressive and powerful, but exact computation is often intractable. A broad research effort in recent years has aimed at designing structured prediction models and approximate inference and learning procedures that are computationally efficient. This volume offers an overview of this recent research in order to make the work accessible to a broader research community. The chapters, by leading researchers in the field, cover a range of topics, including research trends, the linear programming relaxation approach, innovations in probabilistic modeling, recent theoretical progress, and resource-aware learning.

publisher link (url) Project Page [BibTex]

2014

publisher link (url) Project Page [BibTex]


Thumb xl grassmann
Grassmann Averages for Scalable Robust PCA

Hauberg, S., Feragen, A., Black, M. J.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 3810 -3817, Columbus, Ohio, USA, June 2014 (inproceedings)

Abstract
As the collection of large datasets becomes increasingly automated, the occurrence of outliers will increase – “big data” implies “big outliers”. While principal component analysis (PCA) is often used to reduce the size of data, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA do not scale beyond small-to-medium sized datasets. To address this, we introduce the Grassmann Average (GA), which expresses dimensionality reduction as an average of the subspaces spanned by the data. Because averages can be efficiently computed, we immediately gain scalability. GA is inherently more robust than PCA, but we show that they coincide for Gaussian data. We exploit that averages can be made robust to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. Robustness can be with respect to vectors (subspaces) or elements of vectors; we focus on the latter and use a trimmed average. The resulting Trimmed Grassmann Average (TGA) is particularly appropriate for computer vision because it is robust to pixel outliers. The algorithm has low computational complexity and minimal memory requirements, making it scalable to “big noisy data.” We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie.

pdf code supplementary material tutorial video results video talk poster DOI Project Page [BibTex]

pdf code supplementary material tutorial video results video talk poster DOI Project Page [BibTex]


Thumb xl modeltransport
Model Transport: Towards Scalable Transfer Learning on Manifolds

Freifeld, O., Hauberg, S., Black, M. J.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 1378 -1385, Columbus, Ohio, USA, June 2014 (inproceedings)

Abstract
We consider the intersection of two research fields: transfer learning and statistics on manifolds. In particular, we consider, for manifold-valued data, transfer learning of tangent-space models such as Gaussians distributions, PCA, regression, or classifiers. Though one would hope to simply use ordinary Rn-transfer learning ideas, the manifold structure prevents it. We overcome this by basing our method on inner-product-preserving parallel transport, a well-known tool widely used in other problems of statistics on manifolds in computer vision. At first, this straightforward idea seems to suffer from an obvious shortcoming: Transporting large datasets is prohibitively expensive, hindering scalability. Fortunately, with our approach, we never transport data. Rather, we show how the statistical models themselves can be transported, and prove that for the tangent-space models above, the transport “commutes” with learning. Consequently, our compact framework, applicable to a large class of manifolds, is not restricted by the size of either the training or test sets. We demonstrate the approach by transferring PCA and logistic-regression models of real-world data involving 3D shapes and image descriptors.

pdf SupMat Video poster DOI Project Page [BibTex]

pdf SupMat Video poster DOI Project Page [BibTex]

2013


Thumb xl thumb
Branch&Rank for Efficient Object Detection

Lehmann, A., Gehler, P., VanGool, L.

International Journal of Computer Vision, Springer, December 2013 (article)

Abstract
Ranking hypothesis sets is a powerful concept for efficient object detection. In this work, we propose a branch&rank scheme that detects objects with often less than 100 ranking operations. This efficiency enables the use of strong and also costly classifiers like non-linear SVMs with RBF-TeX kernels. We thereby relieve an inherent limitation of branch&bound methods as bounds are often not tight enough to be effective in practice. Our approach features three key components: a ranking function that operates on sets of hypotheses and a grouping of these into different tasks. Detection efficiency results from adaptively sub-dividing the object search space into decreasingly smaller sets. This is inherited from branch&bound, while the ranking function supersedes a tight bound which is often unavailable (except for rather limited function classes). The grouping makes the system effective: it separates image classification from object recognition, yet combines them in a single formulation, phrased as a structured SVM problem. A novel aspect of branch&rank is that a better ranking function is expected to decrease the number of classifier calls during detection. We use the VOC’07 dataset to demonstrate the algorithmic properties of branch&rank.

pdf link (url) Project Page [BibTex]

2013

pdf link (url) Project Page [BibTex]


Thumb xl cover3
Statistics on Manifolds with Applications to Modeling Shape Deformations

Freifeld, O.

Brown University, August 2013 (phdthesis)

Abstract
Statistical models of non-rigid deformable shape have wide application in many fi elds, including computer vision, computer graphics, and biometry. We show that shape deformations are well represented through nonlinear manifolds that are also matrix Lie groups. These pattern-theoretic representations lead to several advantages over other alternatives, including a principled measure of shape dissimilarity and a natural way to compose deformations. Moreover, they enable building models using statistics on manifolds. Consequently, such models are superior to those based on Euclidean representations. We demonstrate this by modeling 2D and 3D human body shape. Shape deformations are only one example of manifold-valued data. More generally, in many computer-vision and machine-learning problems, nonlinear manifold representations arise naturally and provide a powerful alternative to Euclidean representations. Statistics is traditionally concerned with data in a Euclidean space, relying on the linear structure and the distances associated with such a space; this renders it inappropriate for nonlinear spaces. Statistics can, however, be generalized to nonlinear manifolds. Moreover, by respecting the underlying geometry, the statistical models result in not only more e ffective analysis but also consistent synthesis. We go beyond previous work on statistics on manifolds by showing how, even on these curved spaces, problems related to modeling a class from scarce data can be dealt with by leveraging information from related classes residing in di fferent regions of the space. We show the usefulness of our approach with 3D shape deformations. To summarize our main contributions: 1) We de fine a new 2D articulated model -- more expressive than traditional ones -- of deformable human shape that factors body-shape, pose, and camera variations. Its high realism is obtained from training data generated from a detailed 3D model. 2) We defi ne a new manifold-based representation of 3D shape deformations that yields statistical deformable-template models that are better than the current state-of-the- art. 3) We generalize a transfer learning idea from Euclidean spaces to Riemannian manifolds. This work demonstrates the value of modeling manifold-valued data and their statistics explicitly on the manifold. Specifi cally, the methods here provide new tools for shape analysis.

pdf Project Page [BibTex]

2012


Thumb xl lietr
Lie Bodies: A Manifold Representation of 3D Human Shape. Supplemental Material

Freifeld, O., Black, M. J.

(No. 5), Max Planck Institute for Intelligent Systems, October 2012 (techreport)

pdf Project Page [BibTex]

2012

pdf Project Page [BibTex]


Thumb xl nips teaser
A Geometric Take on Metric Learning

Hauberg, S., Freifeld, O., Black, M. J.

In Advances in Neural Information Processing Systems (NIPS) 25, pages: 2033-2041, (Editors: P. Bartlett and F.C.N. Pereira and C.J.C. Burges and L. Bottou and K.Q. Weinberger), MIT Press, 2012 (inproceedings)

Abstract
Multi-metric learning techniques learn local metric tensors in different parts of a feature space. With such an approach, even simple classifiers can be competitive with the state-of-the-art because the distance measure locally adapts to the structure of the data. The learned distance measure is, however, non-metric, which has prevented multi-metric learning from generalizing to tasks such as dimensionality reduction and regression in a principled way. We prove that, with appropriate changes, multi-metric learning corresponds to learning the structure of a Riemannian manifold. We then show that this structure gives us a principled way to perform dimensionality reduction and regression according to the learned metrics. Algorithmically, we provide the first practical algorithm for computing geodesics according to the learned metrics, as well as algorithms for computing exponential and logarithmic maps on the Riemannian manifold. Together, these tools let many Euclidean algorithms take advantage of multi-metric learning. We illustrate the approach on regression and dimensionality reduction tasks that involve predicting measurements of the human body from shape data.

PDF Youtube Suppl. material Poster Project Page [BibTex]

PDF Youtube Suppl. material Poster Project Page [BibTex]


Thumb xl paperfig
Lie Bodies: A Manifold Representation of 3D Human Shape

Freifeld, O., Black, M. J.

In European Conf. on Computer Vision (ECCV), pages: 1-14, Part I, LNCS 7572, (Editors: A. Fitzgibbon et al. (Eds.)), Springer-Verlag, October 2012 (inproceedings)

Abstract
Three-dimensional object shape is commonly represented in terms of deformations of a triangular mesh from an exemplar shape. Existing models, however, are based on a Euclidean representation of shape deformations. In contrast, we argue that shape has a manifold structure: For example, summing the shape deformations for two people does not necessarily yield a deformation corresponding to a valid human shape, nor does the Euclidean difference of these two deformations provide a meaningful measure of shape dissimilarity. Consequently, we define a novel manifold for shape representation, with emphasis on body shapes, using a new Lie group of deformations. This has several advantages. First we define triangle deformations exactly, removing non-physical deformations and redundant degrees of freedom common to previous methods. Second, the Riemannian structure of Lie Bodies enables a more meaningful definition of body shape similarity by measuring distance between bodies on the manifold of body shape deformations. Third, the group structure allows the valid composition of deformations. This is important for models that factor body shape deformations into multiple causes or represent shape as a linear combination of basis shapes. Finally, body shape variation is modeled using statistics on manifolds. Instead of modeling Euclidean shape variation with Principal Component Analysis we capture shape variation on the manifold using Principal Geodesic Analysis. Our experiments show consistent visual and quantitative advantages of Lie Bodies over traditional Euclidean models of shape deformation and our representation can be easily incorporated into existing methods.

pdf supplemental material youtube poster eigenshape video code Project Page Project Page [BibTex]

pdf supplemental material youtube poster eigenshape video code Project Page Project Page [BibTex]

2011


Thumb xl mt
Branch&Rank: Non-Linear Object Detection

(Best Impact Paper Prize)

Lehmann, A., Gehler, P., VanGool, L.

In Proceedings of the British Machine Vision Conference (BMVC), pages: 8.1-8.11, (Editors: Jesse Hoey and Stephen McKenna and Emanuele Trucco), BMVA Press, September 2011, http://dx.doi.org/10.5244/C.25.8 (inproceedings)

video of talk pdf slides supplementary Project Page [BibTex]

2011

video of talk pdf slides supplementary Project Page [BibTex]

2003


Thumb xl delatorreijcvteaser
A framework for robust subspace learning

De la Torre, F., Black, M. J.

International Journal of Computer Vision, 54(1-3):117-142, August 2003 (article)

Abstract
Many computer vision, signal processing and statistical problems can be posed as problems of learning low dimensional linear or multi-linear models. These models have been widely used for the representation of shape, appearance, motion, etc., in computer vision applications. Methods for learning linear models can be seen as a special case of subspace fitting. One draw-back of previous learning methods is that they are based on least squares estimation techniques and hence fail to account for “outliers” which are common in realistic training sets. We review previous approaches for making linear learning methods robust to outliers and present a new method that uses an intra-sample outlier process to account for pixel outliers. We develop the theory of Robust Subspace Learning (RSL) for linear models within a continuous optimization framework based on robust M-estimation. The framework applies to a variety of linear learning problems in computer vision including eigen-analysis and structure from motion. Several synthetic and natural examples are used to develop and illustrate the theory and applications of robust subspace learning in computer vision.

pdf code pdf from publisher Project Page [BibTex]

2003

pdf code pdf from publisher Project Page [BibTex]

2001


Thumb xl bildschirmfoto 2012 12 11 um 11.56.46
Robust principal component analysis for computer vision

De la Torre, F., Black, M. J.

In Int. Conf. on Computer Vision, ICCV-2001, II, pages: 362-369, Vancouver, BC, USA, 2001 (inproceedings)

pdf Project Page [BibTex]

2001

pdf Project Page [BibTex]