Header logo is ps


2019


Thumb xl celia
Decoding subcategories of human bodies from both body- and face-responsive cortical regions

Foster, C., Zhao, M., Romero, J., Black, M. J., Mohler, B. J., Bartels, A., Bülthoff, I.

NeuroImage, 202(15):116085, November 2019 (article)

Abstract
Our visual system can easily categorize objects (e.g. faces vs. bodies) and further differentiate them into subcategories (e.g. male vs. female). This ability is particularly important for objects of social significance, such as human faces and bodies. While many studies have demonstrated category selectivity to faces and bodies in the brain, how subcategories of faces and bodies are represented remains unclear. Here, we investigated how the brain encodes two prominent subcategories shared by both faces and bodies, sex and weight, and whether neural responses to these subcategories rely on low-level visual, high-level visual or semantic similarity. We recorded brain activity with fMRI while participants viewed faces and bodies that varied in sex, weight, and image size. The results showed that the sex of bodies can be decoded from both body- and face-responsive brain areas, with the former exhibiting more consistent size-invariant decoding than the latter. Body weight could also be decoded in face-responsive areas and in distributed body-responsive areas, and this decoding was also invariant to image size. The weight of faces could be decoded from the fusiform body area (FBA), and weight could be decoded across face and body stimuli in the extrastriate body area (EBA) and a distributed body-responsive area. The sex of well-controlled faces (e.g. excluding hairstyles) could not be decoded from face- or body-responsive regions. These results demonstrate that both face- and body-responsive brain regions encode information that can distinguish the sex and weight of bodies. Moreover, the neural patterns corresponding to sex and weight were invariant to image size and could sometimes generalize across face and body stimuli, suggesting that such subcategorical information is encoded with a high-level visual or semantic code.

paper pdf DOI [BibTex]

2019

paper pdf DOI [BibTex]


Thumb xl multihumanoflow thumb
Learning Multi-Human Optical Flow

Ranjan, A., Hoffmann, D. T., Tzionas, D., Tang, S., Romero, J., Black, M. J.

arxiv preprint arXiv:1910.1166, November 2019 (article)

Abstract
The optical flow of humans is well known to be useful for the analysis of human action. Recent optical flow methods focus on training deep networks to approach the problem. However, the training data used by them does not cover the domain of human motion. Therefore, we develop a dataset of multi-human optical flow and train optical flow networks on this dataset. We use a 3D model of the human body and motion capture data to synthesize realistic flow fields in both single-and multi-person images. We then train optical flow networks to estimate human flow fields from pairs of images. We demonstrate that our trained networks are more accurate than a wide range of top methods on held-out test data and that they can generalize well to real image sequences. The code, trained models and the dataset are available for research.

Paper poster link (url) [BibTex]


Thumb xl autonomous mocap cover image new
Active Perception based Formation Control for Multiple Aerial Vehicles

Tallamraju, R., Price, E., Ludwig, R., Karlapalem, K., Bülthoff, H. H., Black, M. J., Ahmad, A.

IEEE Robotics and Automation Letters, Robotics and Automation Letters, 4(4):4491-4498, IEEE, October 2019 (article)

Abstract
We present a novel robotic front-end for autonomous aerial motion-capture (mocap) in outdoor environments. In previous work, we presented an approach for cooperative detection and tracking (CDT) of a subject using multiple micro-aerial vehicles (MAVs). However, it did not ensure optimal view-point configurations of the MAVs to minimize the uncertainty in the person's cooperatively tracked 3D position estimate. In this article, we introduce an active approach for CDT. In contrast to cooperatively tracking only the 3D positions of the person, the MAVs can actively compute optimal local motion plans, resulting in optimal view-point configurations, which minimize the uncertainty in the tracked estimate. We achieve this by decoupling the goal of active tracking into a quadratic objective and non-convex constraints corresponding to angular configurations of the MAVs w.r.t. the person. We derive this decoupling using Gaussian observation model assumptions within the CDT algorithm. We preserve convexity in optimization by embedding all the non-convex constraints, including those for dynamic obstacle avoidance, as external control inputs in the MPC dynamics. Multiple real robot experiments and comparisons involving 3 MAVs in several challenging scenarios are presented.

pdf DOI Project Page [BibTex]

pdf DOI Project Page [BibTex]


Thumb xl 3dmm
3D Morphable Face Models - Past, Present and Future

Egger, B., Smith, W. A. P., Tewari, A., Wuhrer, S., Zollhoefer, M., Beeler, T., Bernard, F., Bolkart, T., Kortylewski, A., Romdhani, S., Theobalt, C., Blanz, V., Vetter, T.

arxiv preprint arXiv:1909.01815, September 2019 (article)

Abstract
In this paper, we provide a detailed survey of 3D Morphable Face Models over the 20 years since they were first proposed. The challenges in building and applying these models, namely capture, modeling, image formation,and image analysis, are still active research topics, and we review the state-of-the-art in each of these areas. We also look ahead, identifying unsolved challenges, proposing directions for future research and highlighting the broad range of current and future applications.

paper project page [BibTex]

paper project page [BibTex]


Thumb xl hessepami
Learning and Tracking the 3D Body Shape of Freely Moving Infants from RGB-D sequences

Hesse, N., Pujades, S., Black, M., Arens, M., Hofmann, U., Schroeder, S.

Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019 (article)

Abstract
Statistical models of the human body surface are generally learned from thousands of high-quality 3D scans in predefined poses to cover the wide variety of human body shapes and articulations. Acquisition of such data requires expensive equipment, calibration procedures, and is limited to cooperative subjects who can understand and follow instructions, such as adults. We present a method for learning a statistical 3D Skinned Multi-Infant Linear body model (SMIL) from incomplete, low-quality RGB-D sequences of freely moving infants. Quantitative experiments show that SMIL faithfully represents the RGB-D data and properly factorizes the shape and pose of the infants. To demonstrate the applicability of SMIL, we fit the model to RGB-D sequences of freely moving infants and show, with a case study, that our method captures enough motion detail for General Movements Assessment (GMA), a method used in clinical practice for early detection of neurodevelopmental disorders in infants. SMIL provides a new tool for analyzing infant shape and movement and is a step towards an automated system for GMA.

pdf Journal DOI [BibTex]

pdf Journal DOI [BibTex]


Thumb xl kenny
Perceptual Effects of Inconsistency in Human Animations

Kenny, S., Mahmood, N., Honda, C., Black, M. J., Troje, N. F.

ACM Trans. Appl. Percept., 16(1):2:1-2:18, Febuary 2019 (article)

Abstract
The individual shape of the human body, including the geometry of its articulated structure and the distribution of weight over that structure, influences the kinematics of a person’s movements. How sensitive is the visual system to inconsistencies between shape and motion introduced by retargeting motion from one person onto the shape of another? We used optical motion capture to record five pairs of male performers with large differences in body weight, while they pushed, lifted, and threw objects. From these data, we estimated both the kinematics of the actions as well as the performer’s individual body shape. To obtain consistent and inconsistent stimuli, we created animated avatars by combining the shape and motion estimates from either a single performer or from different performers. Using these stimuli we conducted three experiments in an immersive virtual reality environment. First, a group of participants detected which of two stimuli was inconsistent. Performance was very low, and results were only marginally significant. Next, a second group of participants rated perceived attractiveness, eeriness, and humanness of consistent and inconsistent stimuli, but these judgements of animation characteristics were not affected by consistency of the stimuli. Finally, a third group of participants rated properties of the objects rather than of the performers. Here, we found strong influences of shape-motion inconsistency on perceived weight and thrown distance of objects. This suggests that the visual system relies on its knowledge of shape and motion and that these components are assimilated into an altered perception of the action outcome. We propose that the visual system attempts to resist inconsistent interpretations of human animations. Actions involving object manipulations present an opportunity for the visual system to reinterpret the introduced inconsistencies as a change in the dynamics of an object rather than as an unexpected combination of body shape and body motion.

publisher pdf DOI [BibTex]

publisher pdf DOI [BibTex]


Thumb xl virtualcaliper
The Virtual Caliper: Rapid Creation of Metrically Accurate Avatars from 3D Measurements

Pujades, S., Mohler, B., Thaler, A., Tesch, J., Mahmood, N., Hesse, N., Bülthoff, H. H., Black, M. J.

IEEE Transactions on Visualization and Computer Graphics, 25, pages: 1887,1897, IEEE, 2019 (article)

Abstract
Creating metrically accurate avatars is important for many applications such as virtual clothing try-on, ergonomics, medicine, immersive social media, telepresence, and gaming. Creating avatars that precisely represent a particular individual is challenging however, due to the need for expensive 3D scanners, privacy issues with photographs or videos, and difficulty in making accurate tailoring measurements. We overcome these challenges by creating “The Virtual Caliper”, which uses VR game controllers to make simple measurements. First, we establish what body measurements users can reliably make on their own body. We find several distance measurements to be good candidates and then verify that these are linearly related to 3D body shape as represented by the SMPL body model. The Virtual Caliper enables novice users to accurately measure themselves and create an avatar with their own body shape. We evaluate the metric accuracy relative to ground truth 3D body scan data, compare the method quantitatively to other avatar creation tools, and perform extensive perceptual studies. We also provide a software application to the community that enables novices to rapidly create avatars in fewer than five minutes. Not only is our approach more rapid than existing methods, it exports a metrically accurate 3D avatar model that is rigged and skinned.

Project Page IEEE Open Access IEEE Open Access PDF DOI [BibTex]

Project Page IEEE Open Access IEEE Open Access PDF DOI [BibTex]

2012


Thumb xl eigenmaps
An SVD-Based Approach for Ghost Detection and Removal in High Dynamic Range Images

Srikantha, A., Sidibe, D., Meriaudeau, F.

International Conference on Pattern Recognition (ICPR), pages: 380-383, November 2012 (article)

pdf [BibTex]

2012

pdf [BibTex]


Thumb xl posear
Coupled Action Recognition and Pose Estimation from Multiple Views

Yao, A., Gall, J., van Gool, L.

International Journal of Computer Vision, 100(1):16-37, October 2012 (article)

publisher's site code pdf Project Page Project Page Project Page [BibTex]

publisher's site code pdf Project Page Project Page Project Page [BibTex]


Thumb xl representativecrop
DRAPE: DRessing Any PErson

Guan, P., Reiss, L., Hirshberg, D., Weiss, A., Black, M. J.

ACM Trans. on Graphics (Proc. SIGGRAPH), 31(4):35:1-35:10, July 2012 (article)

Abstract
We describe a complete system for animating realistic clothing on synthetic bodies of any shape and pose without manual intervention. The key component of the method is a model of clothing called DRAPE (DRessing Any PErson) that is learned from a physics-based simulation of clothing on bodies of different shapes and poses. The DRAPE model has the desirable property of "factoring" clothing deformations due to body shape from those due to pose variation. This factorization provides an approximation to the physical clothing deformation and greatly simplifies clothing synthesis. Given a parameterized model of the human body with known shape and pose parameters, we describe an algorithm that dresses the body with a garment that is customized to fit and possesses realistic wrinkles. DRAPE can be used to dress static bodies or animated sequences with a learned model of the cloth dynamics. Since the method is fully automated, it is appropriate for dressing large numbers of virtual characters of varying shape. The method is significantly more efficient than physical simulation.

YouTube pdf talk Project Page Project Page [BibTex]

YouTube pdf talk Project Page Project Page [BibTex]


Thumb xl ghosthdr
Ghost Detection and Removal for High Dynamic Range Images: Recent Advances

Srikantha, A., Sidib’e, D.

Signal Processing: Image Communication, 27, pages: 650-662, July 2012 (article)

pdf link (url) [BibTex]

pdf link (url) [BibTex]


Thumb xl thumb screen shot 2012 10 06 at 11.48.38 am
Visual Servoing on Unknown Objects

Gratal, X., Romero, J., Bohg, J., Kragic, D.

Mechatronics, 22(4):423-435, Elsevier, June 2012, Visual Servoing \{SI\} (article)

Abstract
We study visual servoing in a framework of detection and grasping of unknown objects. Classically, visual servoing has been used for applications where the object to be servoed on is known to the robot prior to the task execution. In addition, most of the methods concentrate on aligning the robot hand with the object without grasping it. In our work, visual servoing techniques are used as building blocks in a system capable of detecting and grasping unknown objects in natural scenes. We show how different visual servoing techniques facilitate a complete grasping cycle.

Grasping sequence video Offline calibration video Pdf DOI [BibTex]

Grasping sequence video Offline calibration video Pdf DOI [BibTex]


Thumb xl jneuroscicrop
Visual Orientation and Directional Selectivity Through Thalamic Synchrony

Stanley, G., Jin, J., Wang, Y., Desbordes, G., Wang, Q., Black, M., Alonso, J.

Journal of Neuroscience, 32(26):9073-9088, June 2012 (article)

Abstract
Thalamic neurons respond to visual scenes by generating synchronous spike trains on the timescale of 10–20 ms that are very effective at driving cortical targets. Here we demonstrate that this synchronous activity contains unexpectedly rich information about fundamental properties of visual stimuli. We report that the occurrence of synchronous firing of cat thalamic cells with highly overlapping receptive fields is strongly sensitive to the orientation and the direction of motion of the visual stimulus. We show that this stimulus selectivity is robust, remaining relatively unchanged under different contrasts and temporal frequencies (stimulus velocities). A computational analysis based on an integrate-and-fire model of the direct thalamic input to a layer 4 cortical cell reveals a strong correlation between the degree of thalamic synchrony and the nonlinear relationship between cortical membrane potential and the resultant firing rate. Together, these findings suggest a novel population code in the synchronous firing of neurons in the early visual pathway that could serve as the substrate for establishing cortical representations of the visual scene.

preprint publisher's site Project Page [BibTex]

preprint publisher's site Project Page [BibTex]


Thumb xl bilinear
Bilinear Spatiotemporal Basis Models

Akhter, I., Simon, T., Khan, S., Matthews, I., Sheikh, Y.

ACM Transactions on Graphics (TOG), 31(2):17, ACM, April 2012 (article)

Abstract
A variety of dynamic objects, such as faces, bodies, and cloth, are represented in computer graphics as a collection of moving spatial landmarks. Spatiotemporal data is inherent in a number of graphics applications including animation, simulation, and object and camera tracking. The principal modes of variation in the spatial geometry of objects are typically modeled using dimensionality reduction techniques, while concurrently, trajectory representations like splines and autoregressive models are widely used to exploit the temporal regularity of deformation. In this article, we present the bilinear spatiotemporal basis as a model that simultaneously exploits spatial and temporal regularity while maintaining the ability to generalize well to new sequences. This factorization allows the use of analytical, predefined functions to represent temporal variation (e.g., B-Splines or the Discrete Cosine Transform) resulting in efficient model representation and estimation. The model can be interpreted as representing the data as a linear combination of spatiotemporal sequences consisting of shape modes oscillating over time at key frequencies. We apply the bilinear model to natural spatiotemporal phenomena, including face, body, and cloth motion data, and compare it in terms of compaction, generalization ability, predictive precision, and efficiency to existing models. We demonstrate the application of the model to a number of graphics tasks including labeling, gap-filling, denoising, and motion touch-up.

pdf project page link (url) [BibTex]

pdf project page link (url) [BibTex]


Thumb xl thumb latent space2
A metric for comparing the anthropomorphic motion capability of artificial hands

Feix, T., Romero, J., Ek, C. H., Schmiedmayer, H., Kragic, D.

IEEE RAS Transactions on Robotics, TRO, pages: 974-980, 2012 (article)

Publisher site Human Grasping Database Project [BibTex]

Publisher site Human Grasping Database Project [BibTex]


Thumb xl rat4
The Ankyrin 3 (ANK3) Bipolar Disorder Gene Regulates Psychiatric-related Behaviors that are Modulated by Lithium and Stress

Leussis, M., Berry-Scott, E., Saito, M., Jhuang, H., Haan, G., Alkan, O., Luce, C., Madison, J., Sklar, P., Serre, T., Root, D., Petryshen, T.

Biological Psychiatry , 2012 (article)

Prepublication Article Abstract [BibTex]

Prepublication Article Abstract [BibTex]


Thumb xl imavis2012
Natural Metrics and Least-Committed Priors for Articulated Tracking

Soren Hauberg, Stefan Sommer, Kim S. Pedersen

Image and Vision Computing, 30(6-7):453-461, Elsevier, 2012 (article)

Publishers site Code PDF [BibTex]

Publishers site Code PDF [BibTex]

2007


no image
Learning static Gestalt laws through dynamic experience

Ostrovsky, Y., Wulff, J., Sinha, P.

Journal of Vision, 7(9):315-315, ARVO, June 2007 (article)

Abstract
The Gestalt laws (Wertheimer 1923) are widely regarded as the rules that help us parse the world into objects. However, it is unclear as to how these laws are acquired by an infant's visual system. Classically, these “laws” have been presumed to be innate (Kellman and Spelke 1983). But, more recent work in infant development, showing the protracted time-course over which these grouping principles emerge (e.g., Johnson and Aslin 1995; Craton 1996), suggests that visual experience might play a role in their genesis. Specifically, our studies of patients with late-onset vision (Project Prakash; VSS 2006) and evidence from infant development both point to an early role of common motion cues for object grouping. Here we explore the possibility that the privileged status of motion in the developmental timeline is not happenstance, but rather serves to bootstrap the learning of static Gestalt cues. Our approach involves computational analyses of real-world motion sequences to investigate whether primitive optic flow information is correlated with static figural cues that could eventually come to serve as proxies for grouping in the form of Gestalt principles. We calculated local optic flow maps and then examined how similarity of motion across image patches co-varied with similarity of certain figural properties in static frames. Results indicate that patches with similar motion are much more likely to have similar luminance, color, and orientation as compared to patches with dissimilar motion vectors. This regularity suggests that, in principle, common motion extracted from dynamic visual experience can provide enough information to bootstrap region grouping based on luminance and color and contour continuation mechanisms in static scenes. These observations, coupled with the cited experimental studies, lend credence to the hypothesis that static Gestalt laws might be learned through a bootstrapping process based on early dynamic experience.

link (url) DOI [BibTex]

2007

link (url) DOI [BibTex]


Thumb xl pedestal
Neuromotor prosthesis development

Donoghue, J., Hochberg, L., Nurmikko, A., Black, M., Simeral, J., Friehs, G.

Medicine & Health Rhode Island, 90(1):12-15, January 2007 (article)

Abstract
Article describes a neuromotor prosthesis (NMP), in development at Brown University, that records human brain signals, decodes them, and transforms them into movement commands. An NMP is described as a system consisting of a neural interface, a decoding system, and a user interface, also called an effector; a closed-loop system would be completed by a feedback signal from the effector to the brain. The interface is based on neural spiking, a source of information-rich, rapid, complex control signals from the nervous system. The NMP described, named BrainGate, consists of a match-head sized platform with 100 thread-thin electrodes implanted just into the surface of the motor cortex where commands to move the hand emanate. Neural signals are decoded by a rack of computers that displays the resultant output as the motion of a cursor on a computer monitor. While computer cursor motion represents a form of virtual device control, this same command signal could be routed to a device to command motion of paralyzed muscles or the actions of prosthetic limbs. The researchers’ overall goal is the development of a fully implantable, wireless multi-neuron sensor for broad research, neural prosthetic, and human neurodiagnostic applications.

pdf [BibTex]

pdf [BibTex]


Thumb xl ijcvflow2
On the spatial statistics of optical flow

Roth, S., Black, M. J.

International Journal of Computer Vision, 74(1):33-50, 2007 (article)

Abstract
We present an analysis of the spatial and temporal statistics of "natural" optical flow fields and a novel flow algorithm that exploits their spatial statistics. Training flow fields are constructed using range images of natural scenes and 3D camera motions recovered from hand-held and car-mounted video sequences. A detailed analysis of optical flow statistics in natural scenes is presented and machine learning methods are developed to learn a Markov random field model of optical flow. The prior probability of a flow field is formulated as a Field-of-Experts model that captures the spatial statistics in overlapping patches and is trained using contrastive divergence. This new optical flow prior is compared with previous robust priors and is incorporated into a recent, accurate algorithm for dense optical flow computation. Experiments with natural and synthetic sequences illustrate how the learned optical flow prior quantitatively improves flow accuracy and how it captures the rich spatial structure found in natural scene motion.

pdf preprint pdf from publisher [BibTex]

pdf preprint pdf from publisher [BibTex]


Thumb xl arrayhd
Assistive technology and robotic control using MI ensemble-based neural interface systems in humans with tetraplegia

Donoghue, J. P., Nurmikko, A., Black, M. J., Hochberg, L.

Journal of Physiology, Special Issue on Brain Computer Interfaces, 579, pages: 603-611, 2007 (article)

Abstract
This review describes the rationale, early stage development, and initial human application of neural interface systems (NISs) for humans with paralysis. NISs are emerging medical devices designed to allowpersonswith paralysis to operate assistive technologies or to reanimatemuscles based upon a command signal that is obtained directly fromthe brain. Such systems require the development of sensors to detect brain signals, decoders to transformneural activity signals into a useful command, and an interface for the user.We review initial pilot trial results of an NIS that is based on an intracortical microelectrode sensor that derives control signals from the motor cortex.We review recent findings showing, first, that neurons engaged by movement intentions persist in motor cortex years after injury or disease to the motor system, and second, that signals derived from motor cortex can be used by persons with paralysis to operate a range of devices. We suggest that, with further development, this form of NIS holds promise as a useful new neurotechnology for those with limited motor function or communication.We also discuss the additional potential for neural sensors to be used in the diagnosis and management of various neurological conditions and as a new way to learn about human brain function.

pdf preprint pdf from publisher DOI [BibTex]

pdf preprint pdf from publisher DOI [BibTex]

2005


Thumb xl ivc05
Representing cyclic human motion using functional analysis

Ormoneit, D., Black, M. J., Hastie, T., Kjellström, H.

Image and Vision Computing, 23(14):1264-1276, December 2005 (article)

Abstract
We present a robust automatic method for modeling cyclic 3D human motion such as walking using motion-capture data. The pose of the body is represented by a time-series of joint angles which are automatically segmented into a sequence of motion cycles. The mean and the principal components of these cycles are computed using a new algorithm that enforces smooth transitions between the cycles by operating in the Fourier domain. Key to this method is its ability to automatically deal with noise and missing data. A learned walking model is then exploited for Bayesian tracking of 3D human motion.

pdf pdf from publisher DOI [BibTex]

2005

pdf pdf from publisher DOI [BibTex]

1999


Thumb xl bildschirmfoto 2012 12 06 um 09.38.15
Parameterized modeling and recognition of activities

Yacoob, Y., Black, M. J.

Computer Vision and Image Understanding, 73(2):232-247, 1999 (article)

Abstract
In this paper we consider a class of human activities—atomic activities—which can be represented as a set of measurements over a finite temporal window (e.g., the motion of human body parts during a walking cycle) and which has a relatively small space of variations in performance. A new approach for modeling and recognition of atomic activities that employs principal component analysis and analytical global transformations is proposed. The modeling of sets of exemplar instances of activities that are similar in duration and involve similar body part motions is achieved by parameterizing their representation using principal component analysis. The recognition of variants of modeled activities is achieved by searching the space of admissible parameterized transformations that these activities can undergo. This formulation iteratively refines the recognition of the class to which the observed activity belongs and the transformation parameters that relate it to the model in its class. We provide several experiments on recognition of articulated and deformable human motions from image motion parameters.

pdf pdf from publisher DOI [BibTex]

1999

pdf pdf from publisher DOI [BibTex]