2017


Semantic Multi-view Stereo: Jointly Estimating Objects and Voxels

Ulusoy, A. O., Black, M. J., Geiger, A.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, July 2017 (inproceedings)

Abstract
Dense 3D reconstruction from RGB images is a highly ill-posed problem due to occlusions, textureless or reflective surfaces, as well as other challenges. We propose object-level shape priors to address these ambiguities. Towards this goal, we formulate a probabilistic model that integrates multi-view image evidence with 3D shape information from multiple objects. Inference in this model yields a dense 3D reconstruction of the scene as well as the existence and precise 3D pose of the objects in it. Our approach is able to recover fine details not captured in the input shapes while defaulting to the input models in occluded regions where image evidence is weak. Due to its probabilistic nature, the approach is able to cope with the approximate geometry of the 3D models as well as input shapes that are not present in the scene. We evaluate the approach quantitatively on several challenging indoor and outdoor datasets.

YouTube pdf suppmat Project Page [BibTex]


Deep representation learning for human motion prediction and classification

Bütepage, J., Black, M., Kragic, D., Kjellström, H.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, July 2017 (inproceedings)

Abstract
Generative models of 3D human motion are often restricted to a small number of activities and can therefore not generalize well to novel movements or applications. In this work we propose a deep learning framework for human motion capture data that learns a generic representation from a large corpus of motion capture data and generalizes well to new, unseen, motions. Using an encoding-decoding network that learns to predict future 3D poses from the most recent past, we extract a feature representation of human motion. Most work on deep learning for sequence prediction focuses on video and speech. Since skeletal data has a different structure, we present and evaluate different network architectures that make different assumptions about time dependencies and limb correlations. To quantify the learned features, we use the output of different layers for action classification and visualize the receptive fields of the network units. Our method outperforms the recent state of the art in skeletal motion prediction even though these use action specific training data. Our results show that deep feedforward networks, trained from a generic mocap database, can successfully be used for feature extraction from human motion data and that this representation can be used as a foundation for classification and prediction.

arXiv Project Page [BibTex]


Unite the People: Closing the Loop Between 3D and 2D Human Representations

Lassner, C., Romero, J., Kiefel, M., Bogo, F., Black, M. J., Gehler, P. V.

In Proceedings IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2017, IEEE, Piscataway, NJ, USA, July 2017 (inproceedings)

Abstract
3D models provide a common ground for different representations of human bodies. In turn, robust 2D estimation has proven to be a powerful tool to obtain 3D fits “in-the-wild”. However, depending on the level of detail, it can be hard to impossible to acquire labeled data for training 2D estimators at large scale. We propose a hybrid approach to this problem: with an extended version of the recently introduced SMPLify method, we obtain high quality 3D body model fits for multiple human pose datasets. Human annotators solely sort good and bad fits. This procedure leads to an initial dataset, UP-3D, with rich annotations. With a comprehensive set of experiments, we show how this data can be used to train discriminative models that produce results with an unprecedented level of detail: our models predict 31 segments and 91 landmark locations on the body. Using the 91-landmark pose estimator, we present state-of-the-art results for 3D human pose and shape estimation using an order of magnitude less training data and without assumptions about gender or pose in the fitting procedure. We show that UP-3D can be enhanced with these improved fits to grow in quantity and quality, which makes the system deployable at large scale. The data, code and models are available for research purposes.

arXiv project/code/data Project Page [BibTex]


Human Shape Estimation using Statistical Body Models

Loper, M. M.

University of Tübingen, May 2017 (thesis)

Abstract
Human body estimation methods transform real-world observations into predictions about human body state. These estimation methods benefit a variety of health, entertainment, clothing, and ergonomics applications. State may include pose, overall body shape, and appearance. Body state estimation is underconstrained by observations; ambiguity presents itself both in the form of missing data within observations, and also in the form of unknown correspondences between observations. We address this challenge with the use of a statistical body model: a data-driven virtual human. This helps resolve ambiguity in two ways. First, it fills in missing data, meaning that incomplete observations still result in complete shape estimates. Second, the model provides a statistically-motivated penalty for unlikely states, which enables more plausible body shape estimates. Body state inference requires more than a body model; we therefore build observation models whose output is compared with real observations. In this thesis, body state is estimated from three types of observations: 3D motion capture markers, depth and color images, and high-resolution 3D scans. In each case, a forward process is proposed which simulates observations. By comparing observations to the results of the forward process, state can be adjusted to minimize the difference between simulated and observed data. We use gradient-based methods because they are critical to the precise estimation of state with a large number of parameters. The contributions of this work include three parts. First, we propose a method for the estimation of body shape, nonrigid deformation, and pose from 3D markers. Second, we present a concise approach to differentiating through the rendering process, with application to body shape estimation. And finally, we present a statistical body model trained from human body scans, with state-of-the-art fidelity, good runtime performance, and compatibility with existing animation packages.

Official Version [BibTex]


Early Stopping Without a Validation Set

Mahsereci, M., Balles, L., Lassner, C., Hennig, P.

arXiv preprint arXiv:1703.09580, 2017 (article)

Abstract
Early stopping is a widely used technique to prevent poor generalization performance when training an over-expressive model by means of gradient-based optimization. To find a good point to halt the optimizer, a common practice is to split the dataset into a training and a smaller validation set to obtain an ongoing estimate of the generalization performance. In this paper we propose a novel early stopping criterion which is based on fast-to-compute, local statistics of the computed gradients and entirely removes the need for a held-out validation set. Our experiments show that this is a viable approach in the setting of least-squares and logistic regression as well as neural networks.
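For orientation, the common practice the abstract contrasts itself with (holding out a validation set and halting once its loss stops improving) can be sketched as below. This is a minimal illustration of that baseline only, not the gradient-statistics criterion the paper proposes. The function name `fit_with_validation_early_stopping`, the `patience` rule, the learning rate, and the toy data are all assumptions made for the example.

```python
import random


def fit_with_validation_early_stopping(xs, ys, val_fraction=0.2, lr=0.01,
                                       patience=5, max_epochs=500, seed=0):
    """Fit y = w*x + b by gradient descent on a training split.

    Training halts once the held-out validation loss has failed to improve
    for `patience` consecutive epochs (a hypothetical stopping rule chosen
    for illustration).
    """
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    rng.shuffle(indices)
    n_val = max(1, int(len(indices) * val_fraction))
    val_idx, train_idx = indices[:n_val], indices[n_val:]

    def mse(w, b, ids):
        return sum((w * xs[i] + b - ys[i]) ** 2 for i in ids) / len(ids)

    w, b = 0.0, 0.0
    best = {"w": w, "b": b, "val_loss": float("inf"), "epoch": 0}
    history = []           # validation loss after each epoch
    epochs_without_gain = 0

    for epoch in range(1, max_epochs + 1):
        # Full-batch gradient of the mean squared error on the training split.
        grad_w = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in train_idx) / len(train_idx)
        grad_b = sum(2 * (w * xs[i] + b - ys[i]) for i in train_idx) / len(train_idx)
        w -= lr * grad_w
        b -= lr * grad_b

        val_loss = mse(w, b, val_idx)
        history.append(val_loss)
        if val_loss < best["val_loss"]:
            best = {"w": w, "b": b, "val_loss": val_loss, "epoch": epoch}
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                break  # validation loss has stalled: stop training

    best["history"] = history
    return best


# Toy data: a noisy line (values chosen only for illustration).
noise = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + noise.gauss(0, 0.1) for x in xs]

result = fit_with_validation_early_stopping(xs, ys)
print(result["epoch"], result["val_loss"])
```

The sketch makes the cost of the baseline visible: a fraction of the data (`val_fraction`) is never used for fitting, which is the overhead a validation-free criterion aims to remove.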

link (url) Project Page [BibTex]


Data-Driven Physics for Human Soft Tissue Animation

Kim, M., Pons-Moll, G., Pujades, S., Bang, S., Kim, J., Black, M. J., Lee, S.

ACM Transactions on Graphics, (Proc. SIGGRAPH), 36(4):54:1-54:12, 2017 (article)

Abstract
Data driven models of human poses and soft-tissue deformations can produce very realistic results, but they only model the visible surface of the human body and cannot create skin deformation due to interactions with the environment. Physical simulations can generalize to external forces, but their parameters are difficult to control. In this paper, we present a layered volumetric human body model learned from data. Our model is composed of a data-driven inner layer and a physics-based external layer. The inner layer is driven with a volumetric statistical body model (VSMPL). The soft tissue layer consists of a tetrahedral mesh that is driven using the finite element method (FEM). Model parameters, namely the segmentation of the body into layers and the soft tissue elasticity, are learned directly from 4D registrations of humans exhibiting soft tissue deformations. The learned two layer model is a realistic full-body avatar that generalizes to novel motions and external forces. Experiments show that the resulting avatars produce realistic results on held out sequences and react to external forces. Moreover, the model supports the retargeting of physical properties from one avatar when they share the same topology.

video paper link (url) Project Page [BibTex]


Learning Inference Models for Computer Vision

Jampani, V.

MPI for Intelligent Systems and University of Tübingen, 2017 (phdthesis)

Abstract
Computer vision can be understood as the ability to perform 'inference' on image data. Breakthroughs in computer vision technology are often marked by advances in inference techniques, as even the model design is often dictated by the complexity of inference in them. This thesis proposes learning based inference schemes and demonstrates applications in computer vision. We propose techniques for inference in both generative and discriminative computer vision models. Despite their intuitive appeal, the use of generative models in vision is hampered by the difficulty of posterior inference, which is often too complex or too slow to be practical. We propose techniques for improving inference in two widely used techniques: Markov Chain Monte Carlo (MCMC) sampling and message-passing inference. Our inference strategy is to learn separate discriminative models that assist Bayesian inference in a generative model. Experiments on a range of generative vision models show that the proposed techniques accelerate the inference process and/or converge to better solutions. A main complication in the design of discriminative models is the inclusion of prior knowledge in a principled way. For better inference in discriminative models, we propose techniques that modify the original model itself, as inference is simple evaluation of the model. We concentrate on convolutional neural network (CNN) models and propose a generalization of standard spatial convolutions, which are the basic building blocks of CNN architectures, to bilateral convolutions. First, we generalize the existing use of bilateral filters and then propose new neural network architectures with learnable bilateral filters, which we call 'Bilateral Neural Networks'. We show how the bilateral filtering modules can be used for modifying existing CNN architectures for better image segmentation and propose a neural network approach for temporal information propagation in videos.
Experiments demonstrate the potential of the proposed bilateral networks on a wide range of vision tasks and datasets. In summary, we propose learning based techniques for better inference in several computer vision models ranging from inverse graphics to freely parameterized neural networks. In generative vision models, our inference techniques alleviate some of the crucial hurdles in Bayesian posterior inference, paving new ways for the use of model based machine learning in vision. In discriminative CNN models, the proposed filter generalizations aid in the design of new neural network architectures that can handle sparse high-dimensional data as well as provide a way for incorporating prior knowledge into CNNs.

pdf [BibTex]


Sparse Inertial Poser: Automatic 3D Human Pose Estimation from Sparse IMUs

(Best Paper, Eurographics 2017)

Marcard, T. V., Rosenhahn, B., Black, M., Pons-Moll, G.

Computer Graphics Forum 36(2), Proceedings of the 38th Annual Conference of the European Association for Computer Graphics (Eurographics), pages: 349-360, 2017 (article)

Abstract
We address the problem of making human motion capture in the wild more practical by using a small set of inertial sensors attached to the body. Since the problem is heavily under-constrained, previous methods either use a large number of sensors, which is intrusive, or they require additional video input. We take a different approach and constrain the problem by: (i) making use of a realistic statistical body model that includes anthropometric constraints and (ii) using a joint optimization framework to fit the model to orientation and acceleration measurements over multiple frames. The resulting tracker, Sparse Inertial Poser (SIP), enables motion capture using only 6 sensors (attached to the wrists, lower legs, back and head) and works for arbitrary human motions. Experiments on the recently released TNT15 dataset show that, using the same number of sensors, SIP achieves higher accuracy than the dataset baseline without using any video data. We further demonstrate the effectiveness of SIP on newly recorded challenging motions in outdoor scenarios such as climbing or jumping over a wall.

video pdf Project Page [BibTex]


Efficient 2D and 3D Facade Segmentation using Auto-Context

Gadde, R., Jampani, V., Marlet, R., Gehler, P.

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017 (article)

Abstract
This paper introduces a fast and efficient segmentation technique for 2D images and 3D point clouds of building facades. Facades of buildings are highly structured and consequently most methods that have been proposed for this problem aim to make use of this strong prior information. Contrary to most prior work, we describe a system that is almost domain independent and consists of standard segmentation methods. We train a sequence of boosted decision trees using auto-context features. This is learned using stacked generalization. We find that this technique performs better than, or comparably to, all previously published methods and present empirical results on all available 2D and 3D facade benchmark datasets. The proposed method is simple to implement, easy to extend, and very efficient at test-time inference.

arXiv Project Page [BibTex]


ClothCap: Seamless 4D Clothing Capture and Retargeting

Pons-Moll, G., Pujades, S., Hu, S., Black, M.

ACM Transactions on Graphics, (Proc. SIGGRAPH), 36(4):73:1-73:15, ACM, New York, NY, USA, 2017, Two first authors contributed equally (article)

Abstract
Designing and simulating realistic clothing is challenging and, while several methods have addressed the capture of clothing from 3D scans, previous methods have been limited to single garments and simple motions, lack detail, or require specialized texture patterns. Here we address the problem of capturing regular clothing on fully dressed people in motion. People typically wear multiple pieces of clothing at a time. To estimate the shape of such clothing, track it over time, and render it believably, each garment must be segmented from the others and the body. Our ClothCap approach uses a new multi-part 3D model of clothed bodies, automatically segments each piece of clothing, estimates the naked body shape and pose under the clothing, and tracks the 3D deformations of the clothing over time. We estimate the garments and their motion from 4D scans; that is, high-resolution 3D scans of the subject in motion at 60 fps. The model allows us to capture a clothed person in motion, extract their clothing, and retarget the clothing to new body shapes. ClothCap provides a step towards virtual try-on with a technology for capturing, modeling, and analyzing clothing in motion.

video project_page paper link (url) DOI Project Page [BibTex]


Towards Accurate Marker-less Human Shape and Pose Estimation over Time

Huang, Y., Bogo, F., Lassner, C., Kanazawa, A., Gehler, P. V., Romero, J., Akhter, I., Black, M. J.

In International Conference on 3D Vision (3DV), pages: 421-430, 2017 (inproceedings)

Abstract
Existing markerless motion capture methods often assume known backgrounds, static cameras, and sequence-specific motion priors, limiting their application scenarios. Here we present a fully automatic method that, given multiview videos, estimates 3D human pose and body shape. We take the recently proposed SMPLify method [12] as the base method and extend it in several ways. First, we fit a 3D human body model to 2D features detected in multi-view images. Second, we use a CNN method to segment the person in each image and fit the 3D body model to the contours, further improving accuracy. Third, we utilize a generic and robust DCT temporal prior to handle the left and right side swapping issue sometimes introduced by the 2D pose estimator. Validation on standard benchmarks shows our results are comparable to the state of the art and also provide a realistic 3D shape avatar. We also demonstrate accurate results on HumanEva and on challenging monocular sequences of dancing from YouTube.

Code pdf DOI Project Page [BibTex]


Capturing Hand-Object Interaction and Reconstruction of Manipulated Objects

Tzionas, D.

University of Bonn, 2017 (phdthesis)

Abstract
Hand motion capture with an RGB-D sensor has recently gained a lot of research attention; however, even the most recent approaches focus on the case of a single isolated hand. We focus instead on hands that interact with other hands or with a rigid or articulated object. Our framework successfully captures motion in such scenarios by combining a generative model with discriminatively trained salient points, collision detection and physics simulation to achieve a low tracking error with physically plausible poses. All components are unified in a single objective function that can be optimized with standard optimization techniques. We initially assume a-priori knowledge of the object's shape and skeleton. When the object shape is unknown, there are existing 3D reconstruction methods that capitalize on distinctive geometric or texture features. These methods, though, fail for textureless and highly symmetric objects like household articles, mechanical parts or toys. We show that extracting 3D hand motion for in-hand scanning effectively facilitates the reconstruction of such objects and we fuse the rich additional information of hands into a 3D reconstruction pipeline. Finally, although shape reconstruction is enough for rigid objects, there is a lack of tools that build rigged models of articulated objects that deform realistically using RGB-D data. We propose a method that creates a fully rigged model consisting of a watertight mesh, embedded skeleton and skinning weights by employing a combination of deformable mesh tracking, motion segmentation based on spectral clustering and skeletonization based on mean curvature flow.

Thesis link (url) Project Page [BibTex]

2007


A Database and Evaluation Methodology for Optical Flow

Baker, S., Scharstein, D., Lewis, J.P., Roth, S., Black, M.J., Szeliski, R.

In Int. Conf. on Computer Vision, ICCV, pages: 1-8, Rio de Janeiro, Brazil, October 2007 (inproceedings)

pdf [BibTex]


Shining a light on human pose: On shadows, shading and the estimation of pose and shape

Balan, A., Black, M. J., Haussecker, H., Sigal, L.

In Int. Conf. on Computer Vision, ICCV, pages: 1-8, Rio de Janeiro, Brazil, October 2007 (inproceedings)

pdf YouTube [BibTex]


Ensemble spiking activity as a source of cortical control signals in individuals with tetraplegia

Simeral, J. D., Kim, S. P., Black, M. J., Donoghue, J. P., Hochberg, L. R.

Biomedical Engineering Society, BMES, September 2007 (conference)

[BibTex]


Detailed human shape and pose from images

Balan, A., Sigal, L., Black, M. J., Davis, J., Haussecker, H.

In IEEE Conf. on Computer Vision and Pattern Recognition, CVPR, pages: 1-8, Minneapolis, June 2007 (inproceedings)

Abstract
Much of the research on video-based human motion capture assumes the body shape is known a priori and is represented coarsely (e.g. using cylinders or superquadrics to model limbs). These body models stand in sharp contrast to the richly detailed 3D body models used by the graphics community. Here we propose a method for recovering such models directly from images. Specifically, we represent the body using a recently proposed triangulated mesh model called SCAPE which employs a low-dimensional, but detailed, parametric model of shape and pose-dependent deformations that is learned from a database of range scans of human bodies. Previous work showed that the parameters of the SCAPE model could be estimated from marker-based motion capture data. Here we go further to estimate the parameters directly from image data. We define a cost function between image observations and a hypothesized mesh and formulate the problem as optimization over the body shape and pose parameters using stochastic search. Our results show that such rich generative models enable the automatic recovery of detailed human shape and pose from images.

pdf YouTube [BibTex]


Learning static Gestalt laws through dynamic experience

Ostrovsky, Y., Wulff, J., Sinha, P.

Journal of Vision, 7(9):315-315, ARVO, June 2007 (article)

Abstract
The Gestalt laws (Wertheimer 1923) are widely regarded as the rules that help us parse the world into objects. However, it is unclear as to how these laws are acquired by an infant's visual system. Classically, these “laws” have been presumed to be innate (Kellman and Spelke 1983). But, more recent work in infant development, showing the protracted time-course over which these grouping principles emerge (e.g., Johnson and Aslin 1995; Craton 1996), suggests that visual experience might play a role in their genesis. Specifically, our studies of patients with late-onset vision (Project Prakash; VSS 2006) and evidence from infant development both point to an early role of common motion cues for object grouping. Here we explore the possibility that the privileged status of motion in the developmental timeline is not happenstance, but rather serves to bootstrap the learning of static Gestalt cues. Our approach involves computational analyses of real-world motion sequences to investigate whether primitive optic flow information is correlated with static figural cues that could eventually come to serve as proxies for grouping in the form of Gestalt principles. We calculated local optic flow maps and then examined how similarity of motion across image patches co-varied with similarity of certain figural properties in static frames. Results indicate that patches with similar motion are much more likely to have similar luminance, color, and orientation as compared to patches with dissimilar motion vectors. This regularity suggests that, in principle, common motion extracted from dynamic visual experience can provide enough information to bootstrap region grouping based on luminance and color and contour continuation mechanisms in static scenes. These observations, coupled with the cited experimental studies, lend credence to the hypothesis that static Gestalt laws might be learned through a bootstrapping process based on early dynamic experience.

link (url) DOI [BibTex]


Decoding grasp aperture from motor-cortical population activity

Artemiadis, P., Shakhnarovich, G., Vargas-Irwin, C., Donoghue, J. P., Black, M. J.

In The 3rd International IEEE EMBS Conference on Neural Engineering, pages: 518-521, May 2007 (inproceedings)

pdf [BibTex]


Multi-state decoding of point-and-click control signals from motor cortical activity in a human with tetraplegia

Kim, S., Simeral, J., Hochberg, L., Donoghue, J. P., Friehs, G., Black, M. J.

In The 3rd International IEEE EMBS Conference on Neural Engineering, pages: 486-489, May 2007 (inproceedings)

Abstract
Basic neural-prosthetic control of a computer cursor has been recently demonstrated by Hochberg et al. [1] using the BrainGate system (Cyberkinetics Neurotechnology Systems, Inc.). While these results demonstrate the feasibility of intracortically-driven prostheses for humans with paralysis, a practical cursor-based computer interface requires more precise cursor control and the ability to “click” on areas of interest. Here we present a practical point and click device that decodes both continuous states (e.g. cursor kinematics) and discrete states (e.g. click state) from a single neural population in human motor cortex. We describe a probabilistic multi-state decoder and the necessary training paradigms that enable point and click cursor control by a human with tetraplegia using an implanted microelectrode array. We present results from multiple recording sessions and quantify the point and click performance.

pdf [BibTex]


Neuromotor prosthesis development

Donoghue, J., Hochberg, L., Nurmikko, A., Black, M., Simeral, J., Friehs, G.

Medicine & Health Rhode Island, 90(1):12-15, January 2007 (article)

Abstract
Article describes a neuromotor prosthesis (NMP), in development at Brown University, that records human brain signals, decodes them, and transforms them into movement commands. An NMP is described as a system consisting of a neural interface, a decoding system, and a user interface, also called an effector; a closed-loop system would be completed by a feedback signal from the effector to the brain. The interface is based on neural spiking, a source of information-rich, rapid, complex control signals from the nervous system. The NMP described, named BrainGate, consists of a match-head sized platform with 100 thread-thin electrodes implanted just into the surface of the motor cortex where commands to move the hand emanate. Neural signals are decoded by a rack of computers that displays the resultant output as the motion of a cursor on a computer monitor. While computer cursor motion represents a form of virtual device control, this same command signal could be routed to a device to command motion of paralyzed muscles or the actions of prosthetic limbs. The researchers’ overall goal is the development of a fully implantable, wireless multi-neuron sensor for broad research, neural prosthetic, and human neurodiagnostic applications.

pdf [BibTex]


On the spatial statistics of optical flow

Roth, S., Black, M. J.

International Journal of Computer Vision, 74(1):33-50, 2007 (article)

Abstract
We present an analysis of the spatial and temporal statistics of "natural" optical flow fields and a novel flow algorithm that exploits their spatial statistics. Training flow fields are constructed using range images of natural scenes and 3D camera motions recovered from hand-held and car-mounted video sequences. A detailed analysis of optical flow statistics in natural scenes is presented and machine learning methods are developed to learn a Markov random field model of optical flow. The prior probability of a flow field is formulated as a Field-of-Experts model that captures the spatial statistics in overlapping patches and is trained using contrastive divergence. This new optical flow prior is compared with previous robust priors and is incorporated into a recent, accurate algorithm for dense optical flow computation. Experiments with natural and synthetic sequences illustrate how the learned optical flow prior quantitatively improves flow accuracy and how it captures the rich spatial structure found in natural scene motion.

pdf preprint pdf from publisher [BibTex]


Deterministic Annealing for Multiple-Instance Learning

Gehler, P., Chapelle, O.

In Artificial Intelligence and Statistics (AIStats), 2007 (inproceedings)

pdf [BibTex]


Point-and-click cursor control by a person with tetraplegia using an intracortical neural interface system

Kim, S., Simeral, J. D., Hochberg, L. R., Friehs, G., Donoghue, J. P., Black, M. J.

Program No. 517.2. 2007 Abstract Viewer and Itinerary Planner, Society for Neuroscience, San Diego, CA, 2007, Online (conference)

[BibTex]


Assistive technology and robotic control using MI ensemble-based neural interface systems in humans with tetraplegia

Donoghue, J. P., Nurmikko, A., Black, M. J., Hochberg, L.

Journal of Physiology, Special Issue on Brain Computer Interfaces, 579, pages: 603-611, 2007 (article)

Abstract
This review describes the rationale, early stage development, and initial human application of neural interface systems (NISs) for humans with paralysis. NISs are emerging medical devices designed to allow persons with paralysis to operate assistive technologies or to reanimate muscles based upon a command signal that is obtained directly from the brain. Such systems require the development of sensors to detect brain signals, decoders to transform neural activity signals into a useful command, and an interface for the user. We review initial pilot trial results of an NIS that is based on an intracortical microelectrode sensor that derives control signals from the motor cortex. We review recent findings showing, first, that neurons engaged by movement intentions persist in motor cortex years after injury or disease to the motor system, and second, that signals derived from motor cortex can be used by persons with paralysis to operate a range of devices. We suggest that, with further development, this form of NIS holds promise as a useful new neurotechnology for those with limited motor function or communication. We also discuss the additional potential for neural sensors to be used in the diagnosis and management of various neurological conditions and as a new way to learn about human brain function.

pdf preprint pdf from publisher DOI [BibTex]


Learning Appearances with Low-Rank SVM

Wolf, L., Jhuang, H., Hazan, T.

In Conference on Computer Vision and Pattern Recognition (CVPR), 2007 (inproceedings)

pdf [BibTex]


Neural correlates of grip aperture in primary motor cortex

Vargas-Irwin, C., Shakhnarovich, G., Artemiadis, P., Donoghue, J. P., Black, M. J.

Program No. 517.10. 2007 Abstract Viewer and Itinerary Planner, Society for Neuroscience, San Diego, CA, 2007, Online (conference)

[BibTex]



Directional tuning in motor cortex of a person with ALS

Simeral, J. D., Donoghue, J. P., Black, M. J., Friehs, G. M., Brown, R. H., Krivickas, L. S., Hochberg, L. R.

Program No. 517.4. 2007 Abstract Viewer and Itinerary Planner, Society for Neuroscience, San Diego, CA, 2007, Online (conference)

[BibTex]



Denoising archival films using a learned Bayesian model

Moldovan, T. M., Roth, S., Black, M. J.

(CS-07-03), Brown University, Department of Computer Science, 2007 (techreport)

pdf [BibTex]



Steerable random fields

(Best Paper Award, INI-Graphics Net, 2008)

Roth, S., Black, M. J.

In Int. Conf. on Computer Vision, ICCV, pages: 1-8, Rio de Janeiro, Brazil, 2007 (inproceedings)

pdf [BibTex]



Toward standardized assessment of pointing devices for brain-computer interfaces

Donoghue, J., Simeral, J., Kim, S., G.M. Friehs, L. H., Black, M.

Program No. 517.16. 2007 Abstract Viewer and Itinerary Planner, Society for Neuroscience, San Diego, CA, 2007, Online (conference)

[BibTex]



A Biologically Inspired System for Action Recognition

Jhuang, H., Serre, T., Wolf, L., Poggio, T.

In International Conference on Computer Vision (ICCV), 2007 (inproceedings)

code pdf [BibTex]



AREADNE Research in Encoding And Decoding of Neural Ensembles

Shakhnarovich, G., Hochberg, L. R., Donoghue, J. P., Stein, J., Brown, R. H., Krivickas, L. S., Friehs, G. M., Black, M. J.

Program No. 517.8. 2007 Abstract Viewer and Itinerary Planner, Society for Neuroscience, San Diego, CA, 2007, Online (conference)

[BibTex]


2006


Finding directional movement representations in motor cortical neural populations using nonlinear manifold learning

Kim, S., Simeral, J., Jenkins, O., Donoghue, J., Black, M.

World Congress on Medical Physics and Biomedical Engineering 2006, Seoul, Korea, August 2006 (conference)

[BibTex]



A non-parametric Bayesian approach to spike sorting

Wood, F., Goldwater, S., Black, M. J.

In International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, pages: 1165-1169, New York, NY, August 2006 (inproceedings)

pdf [BibTex]



Predicting 3D people from 2D pictures

(Best Paper)

Sigal, L., Black, M. J.

In Proc. IV Conf. on Articulated Motion and Deformable Objects (AMDO), LNCS 4069, pages: 185-195, July 2006 (inproceedings)

Abstract
We propose a hierarchical process for inferring the 3D pose of a person from monocular images. First we infer a learned view-based 2D body model from a single image using non-parametric belief propagation. This approach integrates information from bottom-up body-part proposal processes and deals with self-occlusion to compute distributions over limb poses. Then, we exploit a learned Mixture of Experts model to infer a distribution of 3D poses conditioned on 2D poses. This approach is more general than recent work on inferring 3D pose directly from silhouettes since the 2D body model provides a richer representation that includes the 2D joint angles and the poses of limbs that may be unobserved in the silhouette. We demonstrate the method in a laboratory setting where we evaluate the accuracy of the 3D poses against ground truth data. We also estimate 3D body pose in a monocular image sequence. The resulting 3D estimates are sufficiently accurate to serve as proposals for the Bayesian inference of 3D human motion over time.

pdf pdf from publisher Video [BibTex]



Specular flow and the recovery of surface structure

Roth, S., Black, M.

In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, CVPR, 2, pages: 1869-1876, New York, NY, June 2006 (inproceedings)

Abstract
In scenes containing specular objects, the image motion observed by a moving camera may be an intermixed combination of optical flow resulting from diffuse reflectance (diffuse flow) and specular reflection (specular flow). Here, with few assumptions, we formalize the notion of specular flow, show how it relates to the 3D structure of the world, and develop an algorithm for estimating scene structure from 2D image motion. Unlike previous work on isolated specular highlights we use two image frames and estimate the semi-dense flow arising from the specular reflections of textured scenes. We parametrically model the image motion of a quadratic surface patch viewed from a moving camera. The flow is modeled as a probabilistic mixture of diffuse and specular components and the 3D shape is recovered using an Expectation-Maximization algorithm. Rather than treating specular reflections as noise to be removed or ignored, we show that the specular flow provides additional constraints on scene geometry that improve estimation of 3D structure when compared with reconstruction from diffuse flow alone. We demonstrate this for a set of synthetic and real sequences of mixed specular-diffuse objects.

pdf [BibTex]



An adaptive appearance model approach for model-based articulated object tracking

Balan, A., Black, M. J.

In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, CVPR, 1, pages: 758-765, New York, NY, June 2006 (inproceedings)

Abstract
The detection and tracking of three-dimensional human body models has progressed rapidly but successful approaches typically rely on accurate foreground silhouettes obtained using background segmentation. There are many practical applications where such information is imprecise. Here we develop a new image likelihood function based on the visual appearance of the subject being tracked. We propose a robust, adaptive, appearance model based on the Wandering-Stable-Lost framework extended to the case of articulated body parts. The method models appearance using a mixture model that includes an adaptive template, frame-to-frame matching and an outlier process. We employ an annealed particle filtering algorithm for inference and take advantage of the 3D body model to predict self occlusion and improve pose estimation accuracy. Quantitative tracking results are presented for a walking sequence with a 180 degree turn, captured with four synchronized and calibrated cameras and containing significant appearance changes and self-occlusion in each view.

pdf [BibTex]



Measure locally, reason globally: Occlusion-sensitive articulated pose estimation

Sigal, L., Black, M. J.

In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, CVPR, 2, pages: 2041-2048, New York, NY, June 2006 (inproceedings)

pdf [BibTex]



Statistical analysis of the non-stationarity of neural population codes

Kim, S., Wood, F., Fellows, M., Donoghue, J. P., Black, M. J.

In BioRob 2006, The first IEEE / RAS-EMBS International Conference on Biomedical Robotics and Biomechatronics, pages: 295-299, Pisa, Italy, February 2006 (inproceedings)

pdf [BibTex]



How to choose the covariance for Gaussian process regression independently of the basis

Franz, M., Gehler, P.

In Proceedings of the Workshop Gaussian Processes in Practice, 2006 (inproceedings)

pdf [BibTex]



The rate adapting Poisson model for information retrieval and object recognition

Gehler, P. V., Holub, A. D., Welling, M.

In Proceedings of the 23rd International Conference on Machine Learning, pages: 337-344, ICML ’06, ACM, New York, NY, USA, 2006 (inproceedings)

project page pdf DOI [BibTex]



Implicit Wiener Series, Part II: Regularised estimation

Gehler, P., Franz, M.

(148), Max Planck Institute, 2006 (techreport)

pdf [BibTex]


Tracking complex objects using graphical object models

Sigal, L., Zhu, Y., Comaniciu, D., Black, M. J.

In International Workshop on Complex Motion, LNCS 3417, pages: 223-234, Springer-Verlag, 2006 (inproceedings)

pdf pdf from publisher [BibTex]



HumanEva: Synchronized video and motion capture dataset for evaluation of articulated human motion

Sigal, L., Black, M. J.

(CS-06-08), Brown University, Department of Computer Science, 2006 (techreport)

pdf abstract [BibTex]



Bayesian population decoding of motor cortical activity using a Kalman filter

Wu, W., Gao, Y., Bienenstock, E., Donoghue, J. P., Black, M. J.

Neural Computation, 18(1):80-118, 2006 (article)

Abstract
Effective neural motor prostheses require a method for decoding neural activity representing desired movement. In particular, the accurate reconstruction of a continuous motion signal is necessary for the control of devices such as computer cursors, robots, or a patient's own paralyzed limbs. For such applications, we developed a real-time system that uses Bayesian inference techniques to estimate hand motion from the firing rates of multiple neurons. In this study, we used recordings that were previously made in the arm area of primary motor cortex in awake behaving monkeys using a chronically implanted multielectrode microarray. Bayesian inference involves computing the posterior probability of the hand motion conditioned on a sequence of observed firing rates; this is formulated in terms of the product of a likelihood and a prior. The likelihood term models the probability of firing rates given a particular hand motion. We found that a linear gaussian model could be used to approximate this likelihood and could be readily learned from a small amount of training data. The prior term defines a probabilistic model of hand kinematics and was also taken to be a linear gaussian model. Decoding was performed using a Kalman filter, which gives an efficient recursive method for Bayesian inference when the likelihood and prior are linear and gaussian. In off-line experiments, the Kalman filter reconstructions of hand trajectory were more accurate than previously reported results. The resulting decoding algorithm provides a principled probabilistic model of motor-cortical coding, decodes hand motion in real time, provides an estimate of uncertainty, and is straightforward to implement. Additionally the formulation unifies and extends previous models of neural coding while providing insights into the motor-cortical code.
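The recursion described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a scalar kinematic state (for example, hand velocity along one axis) and independent observation noise across neurons, which is a simplification that lets each firing rate be folded in with a scalar update. The function name `kalman_decode` and all parameter names and values are hypothetical.

```python
import random


def kalman_decode(rates, h, r, a, w, x0=0.0, p0=1.0):
    """Decode a scalar kinematic state from firing rates (illustrative only).

    rates: list of time steps, each a list of firing rates, one per neuron
    h, r:  per-neuron tuning gain and observation-noise variance (likelihood)
    a, w:  state transition coefficient and process-noise variance (prior)
    Returns a list of (posterior mean, posterior variance), one per time step.
    """
    x, p = x0, p0
    estimates = []
    for z in rates:
        # Predict: push the previous posterior through the linear Gaussian prior.
        x = a * x
        p = a * p * a + w
        # Update: fold in each neuron's rate in turn. Under the assumed
        # independent noise, sequential scalar updates match one joint update.
        for z_i, h_i, r_i in zip(z, h, r):
            k = p * h_i / (h_i * p * h_i + r_i)  # Kalman gain for this neuron
            x = x + k * (z_i - h_i * x)
            p = (1.0 - k * h_i) * p
        estimates.append((x, p))
    return estimates


if __name__ == "__main__":
    # Hypothetical simulation: three neurons observing a constant state of 3.0.
    random.seed(0)
    gains, noise_var = [2.0, -1.0, 0.5], [0.1, 0.1, 0.1]
    true_state = 3.0
    simulated = [
        [g * true_state + random.gauss(0.0, v ** 0.5) for g, v in zip(gains, noise_var)]
        for _ in range(20)
    ]
    mean, var = kalman_decode(simulated, gains, noise_var, a=1.0, w=0.01)[-1]
    print(mean, var)
```

The posterior variance returned alongside each estimate corresponds to the estimate of uncertainty that the abstract mentions.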

pdf preprint pdf from publisher abstract [BibTex]



Hierarchical Approach for Articulated 3D Pose-Estimation and Tracking (extended abstract)

Sigal, L., Black, M. J.

In Learning, Representation and Context for Human Sensing in Video Workshop (in conjunction with CVPR), 2006 (inproceedings)

pdf poster [BibTex]



Nonlinear physically-based models for decoding motor-cortical population activity

Shakhnarovich, G., Kim, S., Black, M. J.

In Advances in Neural Information Processing Systems 19, NIPS-2006, pages: 1257-1264, MIT Press, 2006 (inproceedings)

pdf [BibTex]



A comparison of decoding models for imagined motion from human motor cortex

Kim, S., Simeral, J., Donoghue, J. P., Hochberg, L. R., Friehs, G., Mukand, J. A., Chen, D., Black, M. J.

Program No. 256.11. 2006 Abstract Viewer and Itinerary Planner, Society for Neuroscience, Atlanta, GA, 2006, Online (conference)

[BibTex]



Denoising archival films using a learned Bayesian model

Moldovan, T. M., Roth, S., Black, M. J.

In Int. Conf. on Image Processing, ICIP, pages: 2641-2644, Atlanta, 2006 (inproceedings)

pdf [BibTex]



Efficient belief propagation with learned higher-order Markov random fields

Lan, X., Roth, S., Huttenlocher, D., Black, M. J.

In European Conference on Computer Vision, ECCV, II, pages: 269-282, Graz, Austria, 2006 (inproceedings)

pdf pdf from publisher [BibTex]
