Header logo is ps


2019


Thumb xl celia
Decoding subcategories of human bodies from both body- and face-responsive cortical regions

Foster, C., Zhao, M., Romero, J., Black, M. J., Mohler, B. J., Bartels, A., Bülthoff, I.

NeuroImage, 202(15):116085, November 2019 (article)

Abstract
Our visual system can easily categorize objects (e.g. faces vs. bodies) and further differentiate them into subcategories (e.g. male vs. female). This ability is particularly important for objects of social significance, such as human faces and bodies. While many studies have demonstrated category selectivity to faces and bodies in the brain, how subcategories of faces and bodies are represented remains unclear. Here, we investigated how the brain encodes two prominent subcategories shared by both faces and bodies, sex and weight, and whether neural responses to these subcategories rely on low-level visual, high-level visual or semantic similarity. We recorded brain activity with fMRI while participants viewed faces and bodies that varied in sex, weight, and image size. The results showed that the sex of bodies can be decoded from both body- and face-responsive brain areas, with the former exhibiting more consistent size-invariant decoding than the latter. Body weight could also be decoded in face-responsive areas and in distributed body-responsive areas, and this decoding was also invariant to image size. The weight of faces could be decoded from the fusiform body area (FBA), and weight could be decoded across face and body stimuli in the extrastriate body area (EBA) and a distributed body-responsive area. The sex of well-controlled faces (e.g. excluding hairstyles) could not be decoded from face- or body-responsive regions. These results demonstrate that both face- and body-responsive brain regions encode information that can distinguish the sex and weight of bodies. Moreover, the neural patterns corresponding to sex and weight were invariant to image size and could sometimes generalize across face and body stimuli, suggesting that such subcategorical information is encoded with a high-level visual or semantic code.

paper pdf DOI [BibTex]

2019

paper pdf DOI [BibTex]


Thumb xl multihumanoflow thumb
Learning Multi-Human Optical Flow

Ranjan, A., Hoffmann, D. T., Tzionas, D., Tang, S., Romero, J., Black, M. J.

arxiv preprint arXiv:1910.1166, November 2019 (article)

Abstract
The optical flow of humans is well known to be useful for the analysis of human action. Recent optical flow methods focus on training deep networks to approach the problem. However, the training data used by them does not cover the domain of human motion. Therefore, we develop a dataset of multi-human optical flow and train optical flow networks on this dataset. We use a 3D model of the human body and motion capture data to synthesize realistic flow fields in both single-and multi-person images. We then train optical flow networks to estimate human flow fields from pairs of images. We demonstrate that our trained networks are more accurate than a wide range of top methods on held-out test data and that they can generalize well to real image sequences. The code, trained models and the dataset are available for research.

Paper poster link (url) [BibTex]


Thumb xl autonomous mocap cover image new
Active Perception based Formation Control for Multiple Aerial Vehicles

Tallamraju, R., Price, E., Ludwig, R., Karlapalem, K., Bülthoff, H. H., Black, M. J., Ahmad, A.

IEEE Robotics and Automation Letters, Robotics and Automation Letters, 4(4):4491-4498, IEEE, October 2019 (article)

Abstract
We present a novel robotic front-end for autonomous aerial motion-capture (mocap) in outdoor environments. In previous work, we presented an approach for cooperative detection and tracking (CDT) of a subject using multiple micro-aerial vehicles (MAVs). However, it did not ensure optimal view-point configurations of the MAVs to minimize the uncertainty in the person's cooperatively tracked 3D position estimate. In this article, we introduce an active approach for CDT. In contrast to cooperatively tracking only the 3D positions of the person, the MAVs can actively compute optimal local motion plans, resulting in optimal view-point configurations, which minimize the uncertainty in the tracked estimate. We achieve this by decoupling the goal of active tracking into a quadratic objective and non-convex constraints corresponding to angular configurations of the MAVs w.r.t. the person. We derive this decoupling using Gaussian observation model assumptions within the CDT algorithm. We preserve convexity in optimization by embedding all the non-convex constraints, including those for dynamic obstacle avoidance, as external control inputs in the MPC dynamics. Multiple real robot experiments and comparisons involving 3 MAVs in several challenging scenarios are presented.

pdf DOI Project Page [BibTex]

pdf DOI Project Page [BibTex]


Thumb xl 3dmm
3D Morphable Face Models - Past, Present and Future

Egger, B., Smith, W. A. P., Tewari, A., Wuhrer, S., Zollhoefer, M., Beeler, T., Bernard, F., Bolkart, T., Kortylewski, A., Romdhani, S., Theobalt, C., Blanz, V., Vetter, T.

arxiv preprint arXiv:1909.01815, September 2019 (article)

Abstract
In this paper, we provide a detailed survey of 3D Morphable Face Models over the 20 years since they were first proposed. The challenges in building and applying these models, namely capture, modeling, image formation,and image analysis, are still active research topics, and we review the state-of-the-art in each of these areas. We also look ahead, identifying unsolved challenges, proposing directions for future research and highlighting the broad range of current and future applications.

link (url) [BibTex]

link (url) [BibTex]


Thumb xl hessepami
Learning and Tracking the 3D Body Shape of Freely Moving Infants from RGB-D sequences

Hesse, N., Pujades, S., Black, M., Arens, M., Hofmann, U., Schroeder, S.

Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019 (article)

Abstract
Statistical models of the human body surface are generally learned from thousands of high-quality 3D scans in predefined poses to cover the wide variety of human body shapes and articulations. Acquisition of such data requires expensive equipment, calibration procedures, and is limited to cooperative subjects who can understand and follow instructions, such as adults. We present a method for learning a statistical 3D Skinned Multi-Infant Linear body model (SMIL) from incomplete, low-quality RGB-D sequences of freely moving infants. Quantitative experiments show that SMIL faithfully represents the RGB-D data and properly factorizes the shape and pose of the infants. To demonstrate the applicability of SMIL, we fit the model to RGB-D sequences of freely moving infants and show, with a case study, that our method captures enough motion detail for General Movements Assessment (GMA), a method used in clinical practice for early detection of neurodevelopmental disorders in infants. SMIL provides a new tool for analyzing infant shape and movement and is a step towards an automated system for GMA.

pdf Journal DOI [BibTex]

pdf Journal DOI [BibTex]


Thumb xl kenny
Perceptual Effects of Inconsistency in Human Animations

Kenny, S., Mahmood, N., Honda, C., Black, M. J., Troje, N. F.

ACM Trans. Appl. Percept., 16(1):2:1-2:18, Febuary 2019 (article)

Abstract
The individual shape of the human body, including the geometry of its articulated structure and the distribution of weight over that structure, influences the kinematics of a person’s movements. How sensitive is the visual system to inconsistencies between shape and motion introduced by retargeting motion from one person onto the shape of another? We used optical motion capture to record five pairs of male performers with large differences in body weight, while they pushed, lifted, and threw objects. From these data, we estimated both the kinematics of the actions as well as the performer’s individual body shape. To obtain consistent and inconsistent stimuli, we created animated avatars by combining the shape and motion estimates from either a single performer or from different performers. Using these stimuli we conducted three experiments in an immersive virtual reality environment. First, a group of participants detected which of two stimuli was inconsistent. Performance was very low, and results were only marginally significant. Next, a second group of participants rated perceived attractiveness, eeriness, and humanness of consistent and inconsistent stimuli, but these judgements of animation characteristics were not affected by consistency of the stimuli. Finally, a third group of participants rated properties of the objects rather than of the performers. Here, we found strong influences of shape-motion inconsistency on perceived weight and thrown distance of objects. This suggests that the visual system relies on its knowledge of shape and motion and that these components are assimilated into an altered perception of the action outcome. We propose that the visual system attempts to resist inconsistent interpretations of human animations. Actions involving object manipulations present an opportunity for the visual system to reinterpret the introduced inconsistencies as a change in the dynamics of an object rather than as an unexpected combination of body shape and body motion.

publisher pdf DOI [BibTex]

publisher pdf DOI [BibTex]


Thumb xl virtualcaliper
The Virtual Caliper: Rapid Creation of Metrically Accurate Avatars from 3D Measurements

Pujades, S., Mohler, B., Thaler, A., Tesch, J., Mahmood, N., Hesse, N., Bülthoff, H. H., Black, M. J.

IEEE Transactions on Visualization and Computer Graphics, 25, pages: 1887,1897, IEEE, 2019 (article)

Abstract
Creating metrically accurate avatars is important for many applications such as virtual clothing try-on, ergonomics, medicine, immersive social media, telepresence, and gaming. Creating avatars that precisely represent a particular individual is challenging however, due to the need for expensive 3D scanners, privacy issues with photographs or videos, and difficulty in making accurate tailoring measurements. We overcome these challenges by creating “The Virtual Caliper”, which uses VR game controllers to make simple measurements. First, we establish what body measurements users can reliably make on their own body. We find several distance measurements to be good candidates and then verify that these are linearly related to 3D body shape as represented by the SMPL body model. The Virtual Caliper enables novice users to accurately measure themselves and create an avatar with their own body shape. We evaluate the metric accuracy relative to ground truth 3D body scan data, compare the method quantitatively to other avatar creation tools, and perform extensive perceptual studies. We also provide a software application to the community that enables novices to rapidly create avatars in fewer than five minutes. Not only is our approach more rapid than existing methods, it exports a metrically accurate 3D avatar model that is rigged and skinned.

Project Page IEEE Open Access IEEE Open Access PDF DOI [BibTex]

Project Page IEEE Open Access IEEE Open Access PDF DOI [BibTex]

2009


Thumb xl foe2009
Fields of Experts

Roth, S., Black, M. J.

International Journal of Computer Vision (IJCV), 82(2):205-29, April 2009 (article)

Abstract
We develop a framework for learning generic, expressive image priors that capture the statistics of natural scenes and can be used for a variety of machine vision tasks. The approach provides a practical method for learning high-order Markov random field (MRF) models with potential functions that extend over large pixel neighborhoods. These clique potentials are modeled using the Product-of-Experts framework that uses non-linear functions of many linear filter responses. In contrast to previous MRF approaches all parameters, including the linear filters themselves, are learned from training data. We demonstrate the capabilities of this Field-of-Experts model with two example applications, image denoising and image inpainting, which are implemented using a simple, approximate inference scheme. While the model is trained on a generic image database and is not tuned toward a specific application, we obtain results that compete with specialized techniques.

pdf pdf from publisher [BibTex]

2009

pdf pdf from publisher [BibTex]


Thumb xl ajp1
Left Ventricular Regional Wall Curvedness and Wall Stress in Patients with Ischemic Dilated Cardiomyopathy

Liang Zhong, Yi Su, Si Yong Yeo, Ru San Tan Dhanjoo Ghista, Ghassan Kassab

American Journal of Physiology – Heart and Circulatory Physiology, 296(3):H573-84, 2009 (article)

Abstract
Geometric remodeling of the left ventricle (LV) after myocardial infarction is associated with changes in myocardial wall stress. The objective of this study was to determine the regional curvatures and wall stress based on three-dimensional (3-D) reconstructions of the LV using MRI. Ten patients with ischemic dilated cardiomyopathy (IDCM) and 10 normal subjects underwent MRI scan. The IDCM patients also underwent delayed gadolinium-enhancement imaging to delineate the extent of myocardial infarct. Regional curvedness, local radii of curvature, and wall thickness were calculated. The percent curvedness change between end diastole and end systole was also calculated. In normal heart, a short- and long-axis two-dimensional analysis showed a 41 +/- 11% and 45 +/- 12% increase of the mean of peak systolic wall stress between basal and apical sections, respectively. However, 3-D analysis showed no significant difference in peak systolic wall stress from basal and apical sections (P = 0.298, ANOVA). LV shape differed between IDCM patients and normal subjects in several ways: LV shape was more spherical (sphericity index = 0.62 +/- 0.08 vs. 0.52 +/- 0.06, P < 0.05), curvedness at end diastole (mean for 16 segments = 0.034 +/- 0.0056 vs. 0.040 +/- 0.0071 mm(-1), P < 0.001) and end systole (mean for 16 segments = 0.037 +/- 0.0068 vs. 0.067 +/- 0.020 mm(-1), P < 0.001) was affected by infarction, and peak systolic wall stress was significantly increased at each segment in IDCM patients. The 3-D quantification of regional wall stress by cardiac MRI provides more precise evaluation of cardiac mechanics. Identification of regional curvedness and wall stresses helps delineate the mechanisms of LV remodeling in IDCM and may help guide therapeutic LV restoration.

[BibTex]

[BibTex]


Thumb xl mbec1
A Curvature-Based Approach for Left Ventricular Shape Analysis from Cardiac Magnetic Resonance Imaging

Si Yong Yeo, Liang Zhong, Yi Su, Ru San Tan, Dhanjoo Ghista

Medical & Biological Engineering & Computing, 47(3):313-322, 2009 (article)

Abstract
It is believed that left ventricular (LV) regional shape is indicative of LV regional function, and cardiac pathologies are often associated with regional alterations in ventricular shape. In this article, we present a set of procedures for evaluating regional LV surface shape from anatomically accurate models reconstructed from cardiac magnetic resonance (MR) images. LV surface curvatures are computed using local surface fitting method, which enables us to assess regional LV shape and its variation. Comparisons are made between normal and diseased hearts. It is illustrated that LV surface curvatures at different regions of the normal heart are higher than those of the diseased heart. Also, the normal heart experiences a larger change in regional curvedness during contraction than the diseased heart. It is believed that with a wide range of dataset being evaluated, this approach will provide a new and efficient way of quantifying LV regional function.

link (url) [BibTex]

link (url) [BibTex]

2004


Thumb xl woodtransbme04
On the variability of manual spike sorting

Wood, F., Black, M. J., Vargas-Irwin, C., Fellows, M., Donoghue, J. P.

IEEE Trans. Biomedical Engineering, 51(6):912-918, June 2004 (article)

pdf pdf from publisher [BibTex]

2004

pdf pdf from publisher [BibTex]


Thumb xl wutransbme04
Modeling and decoding motor cortical activity using a switching Kalman filter

Wu, W., Black, M. J., Mumford, D., Gao, Y., Bienenstock, E., Donoghue, J. P.

IEEE Trans. Biomedical Engineering, 51(6):933-942, June 2004 (article)

Abstract
We present a switching Kalman filter model for the real-time inference of hand kinematics from a population of motor cortical neurons. Firing rates are modeled as a Gaussian mixture where the mean of each Gaussian component is a linear function of hand kinematics. A “hidden state” models the probability of each mixture component and evolves over time in a Markov chain. The model generalizes previous encoding and decoding methods, addresses the non-Gaussian nature of firing rates, and can cope with crudely sorted neural data common in on-line prosthetic applications.

pdf pdf from publisher [BibTex]

pdf pdf from publisher [BibTex]

2003


Thumb xl hedvig
Learning the statistics of people in images and video

Sidenbladh, H., Black, M. J.

International Journal of Computer Vision, 54(1-3):183-209, August 2003 (article)

Abstract
This paper address the problems of modeling the appearance of humans and distinguishing human appearance from the appearance of general scenes. We seek a model of appearance and motion that is generic in that it accounts for the ways in which people's appearance varies and, at the same time, is specific enough to be useful for tracking people in natural scenes. Given a 3D model of the person projected into an image we model the likelihood of observing various image cues conditioned on the predicted locations and orientations of the limbs. These cues are taken to be steered filter responses corresponding to edges, ridges, and motion-compensated temporal differences. Motivated by work on the statistics of natural scenes, the statistics of these filter responses for human limbs are learned from training images containing hand-labeled limb regions. Similarly, the statistics of the filter responses in general scenes are learned to define a “background” distribution. The likelihood of observing a scene given a predicted pose of a person is computed, for each limb, using the likelihood ratio between the learned foreground (person) and background distributions. Adopting a Bayesian formulation allows cues to be combined in a principled way. Furthermore, the use of learned distributions obviates the need for hand-tuned image noise models and thresholds. The paper provides a detailed analysis of the statistics of how people appear in scenes and provides a connection between work on natural image statistics and the Bayesian tracking of people.

pdf pdf from publisher code DOI [BibTex]

2003

pdf pdf from publisher code DOI [BibTex]


Thumb xl delatorreijcvteaser
A framework for robust subspace learning

De la Torre, F., Black, M. J.

International Journal of Computer Vision, 54(1-3):117-142, August 2003 (article)

Abstract
Many computer vision, signal processing and statistical problems can be posed as problems of learning low dimensional linear or multi-linear models. These models have been widely used for the representation of shape, appearance, motion, etc., in computer vision applications. Methods for learning linear models can be seen as a special case of subspace fitting. One draw-back of previous learning methods is that they are based on least squares estimation techniques and hence fail to account for “outliers” which are common in realistic training sets. We review previous approaches for making linear learning methods robust to outliers and present a new method that uses an intra-sample outlier process to account for pixel outliers. We develop the theory of Robust Subspace Learning (RSL) for linear models within a continuous optimization framework based on robust M-estimation. The framework applies to a variety of linear learning problems in computer vision including eigen-analysis and structure from motion. Several synthetic and natural examples are used to develop and illustrate the theory and applications of robust subspace learning in computer vision.

pdf code pdf from publisher Project Page [BibTex]

pdf code pdf from publisher Project Page [BibTex]


Thumb xl ijcvcoverhd
Guest editorial: Computational vision at Brown

Black, M. J., Kimia, B.

International Journal of Computer Vision, 54(1-3):5-11, August 2003 (article)

pdf pdf from publisher [BibTex]

pdf pdf from publisher [BibTex]


Thumb xl cviu91teaser
Robust parameterized component analysis: Theory and applications to 2D facial appearance models

De la Torre, F., Black, M. J.

Computer Vision and Image Understanding, 91(1-2):53-71, July 2003 (article)

Abstract
Principal component analysis (PCA) has been successfully applied to construct linear models of shape, graylevel, and motion in images. In particular, PCA has been widely used to model the variation in the appearance of people's faces. We extend previous work on facial modeling for tracking faces in video sequences as they undergo significant changes due to facial expressions. Here we consider person-specific facial appearance models (PSFAM), which use modular PCA to model complex intra-person appearance changes. Such models require aligned visual training data; in previous work, this has involved a time consuming and error-prone hand alignment and cropping process. Instead, the main contribution of this paper is to introduce parameterized component analysis to learn a subspace that is invariant to affine (or higher order) geometric transformations. The automatic learning of a PSFAM given a training image sequence is posed as a continuous optimization problem and is solved with a mixture of stochastic and deterministic techniques achieving sub-pixel accuracy. We illustrate the use of the 2D PSFAM model with preliminary experiments relevant to applications including video-conferencing and avatar animation.

pdf [BibTex]

pdf [BibTex]