

2018


An Online Scalable Approach to Unified Multirobot Cooperative Localization and Object Tracking

Ahmad, A., Lawless, G., Lima, P.

In IEEE International Conference on Robotics and Automation (ICRA) 2018, Journal Track, May 2018 (inproceedings)

[BibTex]



Body size estimation of self and others in females varying in BMI

Thaler, A., Geuss, M. N., Mölbert, S. C., Giel, K. E., Streuber, S., Romero, J., Black, M. J., Mohler, B. J.

PLoS ONE, 13(2), February 2018 (article)

Abstract
Previous literature suggests that a disturbed ability to accurately identify own body size may contribute to overweight. Here, we investigated the influence of personal body size, indexed by body mass index (BMI), on body size estimation in a non-clinical population of females varying in BMI. We attempted to disentangle general biases in body size estimates and attitudinal influences by manipulating whether participants believed the body stimuli (personalized avatars with realistic weight variations) represented their own body or that of another person. Our results show that the accuracy of own body size estimation is predicted by personal BMI, such that participants with lower BMI underestimated their body size and participants with higher BMI overestimated their body size. Further, participants with higher BMI were less likely to notice the same percentage of weight gain than participants with lower BMI. Importantly, these results were only apparent when participants were judging a virtual body that was their own identity (Experiment 1), but not when they estimated the size of a body with another identity and the same underlying body shape (Experiment 2a). The different influences of BMI on accuracy of body size estimation and sensitivity to weight change for self and other identity suggests that effects of BMI on visual body size estimation are self-specific and not generalizable to other bodies.

pdf DOI [BibTex]


End-to-end Recovery of Human Shape and Pose

Kanazawa, A., Black, M. J., Jacobs, D. W., Malik, J.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, 2018 (inproceedings)

Abstract
We describe Human Mesh Recovery (HMR), an end-to-end framework for reconstructing a full 3D mesh of a human body from a single RGB image. In contrast to most current methods that compute 2D or 3D joint locations, we produce a richer and more useful mesh representation that is parameterized by shape and 3D joint angles. The main objective is to minimize the reprojection loss of keypoints, which allows our model to be trained using in-the-wild images that only have ground truth 2D annotations. However, the reprojection loss alone is highly underconstrained. In this work we address this problem by introducing an adversary trained to tell whether human body shape and pose parameters are real or not using a large database of 3D human meshes. We show that HMR can be trained with and without using any paired 2D-to-3D supervision. We do not rely on intermediate 2D keypoint detections and infer 3D pose and shape parameters directly from image pixels. Our model runs in real-time given a bounding box containing the person. We demonstrate our approach on various images in-the-wild and outperform previous optimization-based methods that output 3D meshes and show competitive results on tasks such as 3D joint location estimation and part segmentation.

pdf code project video [BibTex]


Part-Aligned Bilinear Representations for Person Re-identification

Suh, Y., Wang, J., Tang, S., Mei, T., Lee, K. M.

arXiv preprint arXiv:1804.07094, 2018 (article)

Abstract
We propose a novel network that learns a part-aligned representation for person re-identification. It handles the body part misalignment problem, that is, body parts are misaligned across human detections due to pose/viewpoint change and unreliable detection. Our model consists of a two-stream network (one stream for appearance map extraction and the other one for body part map extraction) and a bilinear-pooling layer that generates and spatially pools a part-aligned map. Each local feature of the part-aligned map is obtained by a bilinear mapping of the corresponding local appearance and body part descriptors. Our new representation leads to a robust image matching similarity, which is equivalent to an aggregation of the local similarities of the corresponding body parts combined with the weighted appearance similarity. This part-aligned representation reduces the part misalignment problem significantly. Our approach is also advantageous over other pose-guided representations (e.g., extracting representations over the bounding box of each body part) by learning part descriptors optimal for person re-identification. For training the network, our approach does not require any part annotation on the person re-identification dataset. Instead, we simply initialize the part sub-stream using a pre-trained sub-network of an existing pose estimation network, and train the whole network to minimize the re-identification loss. We validate the effectiveness of our approach by demonstrating its superiority over the state-of-the-art methods on the standard benchmark datasets, including Market-1501, CUHK03, CUHK01 and DukeMTMC, and standard video dataset MARS.

link (url) [BibTex]


Temporal Human Action Segmentation via Dynamic Clustering

Zhang, Y., Sun, H., Tang, S., Neumann, H.

arXiv preprint arXiv:1803.05790, 2018 (article)

Abstract
We present an effective dynamic clustering algorithm for the task of temporal human action segmentation, which has comprehensive applications such as robotics, motion analysis, and patient monitoring. Our proposed algorithm is unsupervised, fast, generic to process various types of features, and applicable in both the online and offline settings. We perform extensive experiments of processing data streams, and show that our algorithm achieves the state-of-the-art results for both online and offline settings.

link (url) [BibTex]


Lions and Tigers and Bears: Capturing Non-Rigid, 3D, Articulated Shape from Images

Zuffi, S., Kanazawa, A., Black, M. J.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, 2018 (inproceedings)

Abstract
Animals are widespread in nature and the analysis of their shape and motion is important in many fields and industries. Modeling 3D animal shape, however, is difficult because the 3D scanning methods used to capture human shape are not applicable to wild animals or natural settings. Consequently, we propose a method to capture the detailed 3D shape of animals from images alone. The articulated and deformable nature of animals makes this problem extremely challenging, particularly in unconstrained environments with moving and uncalibrated cameras. To make this possible, we use a strong prior model of articulated animal shape that we fit to the image data. We then deform the animal shape in a canonical reference pose such that it matches image evidence when articulated and projected into multiple images. Our method extracts significantly more 3D shape detail than previous methods and is able to model new species, including the shape of an extinct animal, using only a few video frames. Additionally, the projected 3D shapes are accurate enough to facilitate the extraction of a realistic texture map from multiple frames.

pdf [BibTex]


PoTion: Pose MoTion Representation for Action Recognition

Choutas, V., Weinzaepfel, P., Revaud, J., Schmid, C.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, 2018 (inproceedings)

Abstract
Most state-of-the-art methods for action recognition rely on a two-stream architecture that processes appearance and motion independently. In this paper, we claim that considering them jointly offers rich information for action recognition. We introduce a novel representation that gracefully encodes the movement of some semantic keypoints. We use the human joints as these keypoints and term our Pose moTion representation PoTion. Specifically, we first run a state-of-the-art human pose estimator [4] and extract heatmaps for the human joints in each frame. We obtain our PoTion representation by temporally aggregating these probability maps. This is achieved by ‘colorizing’ each of them depending on the relative time of the frames in the video clip and summing them. This fixed-size representation for an entire video clip is suitable to classify actions using a shallow convolutional neural network. Our experimental evaluation shows that PoTion outperforms other state-of-the-art pose representations [6, 48]. Furthermore, it is complementary to standard appearance and motion streams. When combining PoTion with the recent two-stream I3D approach [5], we obtain state-of-the-art performance on the JHMDB, HMDB and UCF101 datasets.

PDF [BibTex]

2003


Image statistics and anisotropic diffusion

Scharr, H., Black, M. J., Haussecker, H.

In Int. Conf. on Computer Vision, pages: 840-847, October 2003 (inproceedings)

pdf [BibTex]



A switching Kalman filter model for the motor cortical coding of hand motion

Wu, W., Black, M. J., Mumford, D., Gao, Y., Bienenstock, E., Donoghue, J. P.

In Proc. IEEE Engineering in Medicine and Biology Society, pages: 2083-2086, September 2003 (inproceedings)

pdf [BibTex]


Learning the statistics of people in images and video

Sidenbladh, H., Black, M. J.

International Journal of Computer Vision, 54(1-3):183-209, August 2003 (article)

Abstract
This paper addresses the problems of modeling the appearance of humans and distinguishing human appearance from the appearance of general scenes. We seek a model of appearance and motion that is generic in that it accounts for the ways in which people's appearance varies and, at the same time, is specific enough to be useful for tracking people in natural scenes. Given a 3D model of the person projected into an image we model the likelihood of observing various image cues conditioned on the predicted locations and orientations of the limbs. These cues are taken to be steered filter responses corresponding to edges, ridges, and motion-compensated temporal differences. Motivated by work on the statistics of natural scenes, the statistics of these filter responses for human limbs are learned from training images containing hand-labeled limb regions. Similarly, the statistics of the filter responses in general scenes are learned to define a “background” distribution. The likelihood of observing a scene given a predicted pose of a person is computed, for each limb, using the likelihood ratio between the learned foreground (person) and background distributions. Adopting a Bayesian formulation allows cues to be combined in a principled way. Furthermore, the use of learned distributions obviates the need for hand-tuned image noise models and thresholds. The paper provides a detailed analysis of the statistics of how people appear in scenes and provides a connection between work on natural image statistics and the Bayesian tracking of people.

pdf pdf from publisher code DOI [BibTex]


A framework for robust subspace learning

De la Torre, F., Black, M. J.

International Journal of Computer Vision, 54(1-3):117-142, August 2003 (article)

Abstract
Many computer vision, signal processing and statistical problems can be posed as problems of learning low dimensional linear or multi-linear models. These models have been widely used for the representation of shape, appearance, motion, etc., in computer vision applications. Methods for learning linear models can be seen as a special case of subspace fitting. One drawback of previous learning methods is that they are based on least squares estimation techniques and hence fail to account for “outliers” which are common in realistic training sets. We review previous approaches for making linear learning methods robust to outliers and present a new method that uses an intra-sample outlier process to account for pixel outliers. We develop the theory of Robust Subspace Learning (RSL) for linear models within a continuous optimization framework based on robust M-estimation. The framework applies to a variety of linear learning problems in computer vision including eigen-analysis and structure from motion. Several synthetic and natural examples are used to develop and illustrate the theory and applications of robust subspace learning in computer vision.

pdf code pdf from publisher Project Page [BibTex]


Guest editorial: Computational vision at Brown

Black, M. J., Kimia, B.

International Journal of Computer Vision, 54(1-3):5-11, August 2003 (article)

pdf pdf from publisher [BibTex]


Robust parameterized component analysis: Theory and applications to 2D facial appearance models

De la Torre, F., Black, M. J.

Computer Vision and Image Understanding, 91(1-2):53-71, July 2003 (article)

Abstract
Principal component analysis (PCA) has been successfully applied to construct linear models of shape, graylevel, and motion in images. In particular, PCA has been widely used to model the variation in the appearance of people's faces. We extend previous work on facial modeling for tracking faces in video sequences as they undergo significant changes due to facial expressions. Here we consider person-specific facial appearance models (PSFAM), which use modular PCA to model complex intra-person appearance changes. Such models require aligned visual training data; in previous work, this has involved a time-consuming and error-prone hand alignment and cropping process. Instead, the main contribution of this paper is to introduce parameterized component analysis to learn a subspace that is invariant to affine (or higher order) geometric transformations. The automatic learning of a PSFAM given a training image sequence is posed as a continuous optimization problem and is solved with a mixture of stochastic and deterministic techniques achieving sub-pixel accuracy. We illustrate the use of the 2D PSFAM model with preliminary experiments relevant to applications including video-conferencing and avatar animation.

pdf [BibTex]


A Gaussian mixture model for the motor cortical coding of hand motion

Wu, W., Mumford, D., Black, M. J., Gao, Y., Bienenstock, E., Donoghue, J. P.

Neural Control of Movement, Santa Barbara, CA, April 2003 (conference)

abstract [BibTex]


Connecting brains with machines: The neural control of 2D cursor movement

Black, M. J., Bienenstock, E., Donoghue, J. P., Serruya, M., Wu, W., Gao, Y.

In 1st International IEEE/EMBS Conference on Neural Engineering, pages: 580-583, Capri, Italy, March 2003 (inproceedings)

pdf [BibTex]


A quantitative comparison of linear and non-linear models of motor cortical activity for the encoding and decoding of arm motions

Gao, Y., Black, M. J., Bienenstock, E., Wu, W., Donoghue, J. P.

In 1st International IEEE/EMBS Conference on Neural Engineering, pages: 189-192, Capri, Italy, March 2003 (inproceedings)

pdf [BibTex]


Accuracy of manual spike sorting: Results for the Utah intracortical array

Wood, F., Fellows, M., Vargas-Irwin, C., Black, M. J., Donoghue, J. P.

Program No. 279.2. 2003, Abstract Viewer and Itinerary Planner, Society for Neuroscience, Washington, DC, 2003, Online (conference)

abstract [BibTex]


Specular flow and the perception of surface reflectance

Roth, S., Domini, F., Black, M. J.

Journal of Vision, 3(9):413a, 2003 (conference)

abstract poster [BibTex]


Attractive people: Assembling loose-limbed models using non-parametric belief propagation

Sigal, L., Isard, M. I., Sigelman, B. H., Black, M. J.

In Advances in Neural Information Processing Systems 16, NIPS, pages: 1539-1546, (Editors: S. Thrun and L. K. Saul and B. Schölkopf), MIT Press, 2003 (inproceedings)

Abstract
The detection and pose estimation of people in images and video is made challenging by the variability of human appearance, the complexity of natural scenes, and the high dimensionality of articulated body models. To cope with these problems we represent the 3D human body as a graphical model in which the relationships between the body parts are represented by conditional probability distributions. We formulate the pose estimation problem as one of probabilistic inference over a graphical model where the random variables correspond to the individual limb parameters (position and orientation). Because the limbs are described by 6-dimensional vectors encoding pose in 3-space, discretization is impractical and the random variables in our model must be continuous-valued. To approximate belief propagation in such a graph we exploit a recently introduced generalization of the particle filter. This framework facilitates the automatic initialization of the body-model from low level cues and is robust to occlusion of body parts and scene clutter.

pdf (color) pdf (black and white) [BibTex]


Neural decoding of cursor motion using a Kalman filter

(Nominated: Best student paper)

Wu, W., Black, M. J., Gao, Y., Bienenstock, E., Serruya, M., Shaikhouni, A., Donoghue, J. P.

In Advances in Neural Information Processing Systems 15, pages: 133-140, MIT Press, 2003 (inproceedings)

pdf [BibTex]

1991


Dynamic motion estimation and feature extraction over long image sequences

Black, M. J., Anandan, P.

In Proc. IJCAI Workshop on Dynamic Scene Understanding, Sydney, Australia, August 1991 (inproceedings)

[BibTex]



Robust dynamic motion estimation over time

(IEEE Computer Society Outstanding Paper Award)

Black, M. J., Anandan, P.

In Proc. Computer Vision and Pattern Recognition, CVPR-91, pages: 296-302, Maui, Hawaii, June 1991 (inproceedings)

Abstract
This paper presents a novel approach to incrementally estimating visual motion over a sequence of images. We start by formulating constraints on image motion to account for the possibility of multiple motions. This is achieved by exploiting the notions of weak continuity and robust statistics in the formulation of the minimization problem. The resulting objective function is non-convex. Traditional stochastic relaxation techniques for minimizing such functions prove inappropriate for the task. We present a highly parallel incremental stochastic minimization algorithm which has a number of advantages over previous approaches. The incremental nature of the scheme makes it truly dynamic and permits the detection of occlusion and disocclusion boundaries.

pdf video abstract [BibTex]