

2018


Customized Multi-Person Tracker

Ma, L., Tang, S., Black, M. J., Gool, L. V.

In Computer Vision – ACCV 2018, Springer International Publishing, December 2018 (inproceedings)

PDF Project Page [BibTex]

On the Integration of Optical Flow and Action Recognition

Sevilla-Lara, L., Liao, Y., Güney, F., Jampani, V., Geiger, A., Black, M. J.

In German Conference on Pattern Recognition (GCPR), LNCS 11269, pages: 281-297, Springer, Cham, October 2018 (inproceedings)

Abstract
Most of the top performing action recognition methods use optical flow as a "black box" input. Here we take a deeper look at the combination of flow and action recognition, and investigate why optical flow is helpful, what makes a flow method good for action recognition, and how we can make it better. In particular, we investigate the impact of different flow algorithms and input transformations to better understand how these affect a state-of-the-art action recognition method. Furthermore, we fine tune two neural-network flow methods end-to-end on the most widely used action recognition dataset (UCF101). Based on these experiments, we make the following five observations: 1) optical flow is useful for action recognition because it is invariant to appearance, 2) optical flow methods are optimized to minimize end-point-error (EPE), but the EPE of current methods is not well correlated with action recognition performance, 3) for the flow methods tested, accuracy at boundaries and at small displacements is most correlated with action recognition performance, 4) training optical flow to minimize classification error instead of minimizing EPE improves recognition performance, and 5) optical flow learned for the task of action recognition differs from traditional optical flow especially inside the human body and at the boundary of the body. These observations may encourage optical flow researchers to look beyond EPE as a goal and guide action recognition researchers to seek better motion cues, leading to a tighter integration of the optical flow and action recognition communities.
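As a rough illustration of the end-point-error (EPE) metric the abstract refers to, the sketch below computes the mean Euclidean distance between predicted and ground-truth flow vectors. The function name `endpoint_error` and the toy flow values are assumptions made for this example; they are not taken from the paper or its code.

```python
import math

def endpoint_error(flow_pred, flow_gt):
    """Mean Euclidean distance between predicted and ground-truth flow vectors.

    Both arguments are lists of (u, v) displacement pairs, one per pixel.
    """
    if len(flow_pred) != len(flow_gt) or not flow_pred:
        raise ValueError("flow fields must be non-empty and the same size")
    total = 0.0
    for (u_p, v_p), (u_g, v_g) in zip(flow_pred, flow_gt):
        # Per-pixel error is the length of the difference vector.
        total += math.hypot(u_p - u_g, v_p - v_g)
    return total / len(flow_pred)

# Hypothetical three-pixel flow field: only the middle pixel is wrong.
pred = [(1.0, 0.0), (0.0, 0.0), (3.0, 4.0)]
gt = [(1.0, 0.0), (3.0, 4.0), (3.0, 4.0)]
epe = endpoint_error(pred, gt)
print(epe)
```

Because EPE averages over all pixels equally, a large error confined to a small region such as a motion boundary changes it only slightly, which is one way to read the abstract's second and third observations.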

arXiv DOI [BibTex]

Temporal Interpolation as an Unsupervised Pretraining Task for Optical Flow Estimation

Wulff, J., Black, M. J.

In German Conference on Pattern Recognition (GCPR), LNCS 11269, pages: 567-582, Springer, Cham, October 2018 (inproceedings)

Abstract
The difficulty of annotating training data is a major obstacle to using CNNs for low-level tasks in video. Synthetic data often does not generalize to real videos, while unsupervised methods require heuristic losses. Proxy tasks can overcome these issues, and start by training a network for a task for which annotation is easier or which can be trained unsupervised. The trained network is then fine-tuned for the original task using small amounts of ground truth data. Here, we investigate frame interpolation as a proxy task for optical flow. Using real movies, we train a CNN unsupervised for temporal interpolation. Such a network implicitly estimates motion, but cannot handle untextured regions. By fine-tuning on small amounts of ground truth flow, the network can learn to fill in homogeneous regions and compute full optical flow fields. Using this unsupervised pre-training, our network outperforms similar architectures that were trained supervised using synthetic optical flow.

pdf arXiv DOI Project Page [BibTex]

Human Motion Parsing by Hierarchical Dynamic Clustering

Zhang, Y., Tang, S., Sun, H., Neumann, H.

In Proceedings of the British Machine Vision Conference (BMVC), pages: 269, BMVA Press, September 2018 (inproceedings)

Abstract
Parsing continuous human motion into meaningful segments plays an essential role in various applications. In this work, we propose a hierarchical dynamic clustering framework to derive action clusters from a sequence of local features in an unsupervised bottom-up manner. We systematically investigate the modules in this framework and particularly propose diverse temporal pooling schemes, in order to realize accurate temporal action localization. We demonstrate our method on two motion parsing tasks: temporal action segmentation and abnormal behavior detection. The experimental results indicate that the proposed framework is significantly more effective than the other related state-of-the-art methods on several datasets.

pdf Project Page [BibTex]

Generating 3D Faces using Convolutional Mesh Autoencoders

Ranjan, A., Bolkart, T., Sanyal, S., Black, M. J.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, vol 11207, pages: 725-741, Springer, Cham, September 2018 (inproceedings)

Abstract
Learned 3D representations of human faces are useful for computer vision problems such as 3D face tracking and reconstruction from images, as well as graphics applications such as character generation and animation. Traditional models learn a latent representation of a face using linear subspaces or higher-order tensor generalizations. Due to this linearity, they cannot capture extreme deformations and non-linear expressions. To address this, we introduce a versatile model that learns a non-linear representation of a face using spectral convolutions on a mesh surface. We introduce mesh sampling operations that enable a hierarchical mesh representation that captures non-linear variations in shape and expression at multiple scales within the model. In a variational setting, our model samples diverse realistic 3D faces from a multivariate Gaussian distribution. Our training data consists of 20,466 meshes of extreme expressions captured over 12 different subjects. Despite limited training data, our trained model outperforms state-of-the-art face models with 50% lower reconstruction error, while using 75% fewer parameters. We also show that replacing the expression space of an existing state-of-the-art face model with our autoencoder achieves a lower reconstruction error. Our data, model and code are available at http://coma.is.tue.mpg.de/.

code Project Page paper supplementary DOI Project Page Project Page [BibTex]

Part-Aligned Bilinear Representations for Person Re-identification

Suh, Y., Wang, J., Tang, S., Mei, T., Lee, K. M.

In European Conference on Computer Vision (ECCV), 11218, pages: 418-437, Springer, Cham, September 2018 (inproceedings)

Abstract
Comparing the appearance of corresponding body parts is essential for person re-identification. However, body parts are frequently misaligned between detected boxes, due to the detection errors and the pose/viewpoint changes. In this paper, we propose a network that learns a part-aligned representation for person re-identification. Our model consists of a two-stream network, which generates appearance and body part feature maps respectively, and a bilinear-pooling layer that fuses two feature maps to an image descriptor. We show that it results in a compact descriptor, where the inner product between two image descriptors is equivalent to an aggregation of the local appearance similarities of the corresponding body parts, and thereby significantly reduces the part misalignment problem. Our approach is advantageous over other pose-guided representations by learning part descriptors optimal for person re-identification. Training the network does not require any part annotation on the person re-identification dataset. Instead, we simply initialize the part sub-stream using a pre-trained sub-network of an existing pose estimation network and train the whole network to minimize the re-identification loss. We validate the effectiveness of our approach by demonstrating its superiority over the state-of-the-art methods on the standard benchmark datasets including Market-1501, CUHK03, CUHK01 and DukeMTMC, and standard video dataset MARS.
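The claim that the inner product of two bilinear-pooled descriptors aggregates local appearance similarities weighted by part similarity can be checked numerically. The sketch below is a toy illustration under stated assumptions: pooling is a plain sum of outer products with no normalization, and the feature values and helper names (`outer_sum`, `aggregated_similarity`) are hypothetical rather than taken from the paper's implementation.

```python
def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))

def outer_sum(appearance, parts):
    """Bilinear pooling: sum over locations of the outer product a_i p_i^T, flattened."""
    da, dp = len(appearance[0]), len(parts[0])
    desc = [0.0] * (da * dp)
    for a, p in zip(appearance, parts):
        for i in range(da):
            for j in range(dp):
                desc[i * dp + j] += a[i] * p[j]
    return desc

def aggregated_similarity(app1, part1, app2, part2):
    """Sum over all location pairs of (appearance similarity) * (part similarity)."""
    return sum(dot(a1, a2) * dot(p1, p2)
               for a1, p1 in zip(app1, part1)
               for a2, p2 in zip(app2, part2))

# Hypothetical features: two spatial locations per image, 2-D appearance and part vectors.
app1, part1 = [[1.0, 2.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]]
app2, part2 = [[2.0, 1.0], [1.0, 1.0]], [[1.0, 0.0], [0.5, 0.5]]

lhs = dot(outer_sum(app1, part1), outer_sum(app2, part2))
rhs = aggregated_similarity(app1, part1, app2, part2)
print(lhs, rhs)  # the two values agree up to floating-point error
```

Location pairs whose part vectors are dissimilar contribute little to the sum, which is the sense in which the descriptor compares appearance only between corresponding body parts.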

pdf supplementary DOI Project Page [BibTex]

Learning Human Optical Flow

Ranjan, A., Romero, J., Black, M. J.

In 29th British Machine Vision Conference, September 2018 (inproceedings)

Abstract
The optical flow of humans is well known to be useful for the analysis of human action. Given this, we devise an optical flow algorithm specifically for human motion and show that it is superior to generic flow methods. Designing a method by hand is impractical, so we develop a new training database of image sequences with ground truth optical flow. For this we use a 3D model of the human body and motion capture data to synthesize realistic flow fields. We then train a convolutional neural network to estimate human flow fields from pairs of images. Since many applications in human motion analysis depend on speed, and we anticipate mobile applications, we base our method on SpyNet with several modifications. We demonstrate that our trained network is more accurate than a wide range of top methods on held-out test data and that it generalizes well to real image sequences. When combined with a person detector/tracker, the approach provides a full solution to the problem of 2D human flow estimation. Both the code and the dataset are available for research.

video code pdf link (url) Project Page Project Page [BibTex]

Neural Body Fitting: Unifying Deep Learning and Model-Based Human Pose and Shape Estimation

(Best Student Paper Award)

Omran, M., Lassner, C., Pons-Moll, G., Gehler, P. V., Schiele, B.

In 3DV, September 2018 (inproceedings)

Abstract
Direct prediction of 3D body pose and shape remains a challenge even for highly parameterized deep learning models. Mapping from the 2D image space to the prediction space is difficult: perspective ambiguities make the loss function noisy and training data is scarce. In this paper, we propose a novel approach (Neural Body Fitting (NBF)). It integrates a statistical body model within a CNN, leveraging reliable bottom-up semantic body part segmentation and robust top-down body model constraints. NBF is fully differentiable and can be trained using 2D and 3D annotations. In detailed experiments, we analyze how the components of our model affect performance, especially the use of part segmentations as an explicit intermediate representation, and present a robust, efficiently trainable framework for 3D human pose estimation from 2D images with competitive results on standard benchmarks. Code is available at https://github.com/mohomran/neural_body_fitting

arXiv code Project Page [BibTex]


Unsupervised Learning of Multi-Frame Optical Flow with Occlusions

Janai, J., Güney, F., Ranjan, A., Black, M. J., Geiger, A.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, vol 11220, pages: 713-731, Springer, Cham, September 2018 (inproceedings)

pdf suppmat Video Project Page DOI Project Page [BibTex]

Learning an Infant Body Model from RGB-D Data for Accurate Full Body Motion Analysis

Hesse, N., Pujades, S., Romero, J., Black, M. J., Bodensteiner, C., Arens, M., Hofmann, U. G., Tacke, U., Hadders-Algra, M., Weinberger, R., Muller-Felber, W., Schroeder, A. S.

In Int. Conf. on Medical Image Computing and Computer Assisted Intervention (MICCAI), September 2018 (inproceedings)

Abstract
Infant motion analysis enables early detection of neurodevelopmental disorders like cerebral palsy (CP). Diagnosis, however, is challenging, requiring expert human judgement. An automated solution would be beneficial but requires the accurate capture of 3D full-body movements. To that end, we develop a non-intrusive, low-cost, lightweight acquisition system that captures the shape and motion of infants. Going beyond work on modeling adult body shape, we learn a 3D Skinned Multi-Infant Linear body model (SMIL) from noisy, low-quality, and incomplete RGB-D data. We demonstrate the capture of shape and motion with 37 infants in a clinical environment. Quantitative experiments show that SMIL faithfully represents the data and properly factorizes the shape and pose of the infants. With a case study based on general movement assessment (GMA), we demonstrate that SMIL captures enough information to allow medical assessment. SMIL provides a new tool and a step towards a fully automatic system for GMA.

pdf Project page video extended arXiv version DOI Project Page [BibTex]

Deep Directional Statistics: Pose Estimation with Uncertainty Quantification

Prokudin, S., Gehler, P., Nowozin, S.

European Conference on Computer Vision (ECCV), September 2018 (conference)

Abstract
Modern deep learning systems successfully solve many perception tasks such as object pose estimation when the input image is of high quality. However, in challenging imaging conditions such as on low resolution images or when the image is corrupted by imaging artifacts, current systems degrade considerably in accuracy. While a loss in performance is unavoidable, we would like our models to quantify their uncertainty in order to achieve robustness against images of varying quality. Probabilistic deep learning models combine the expressive power of deep learning with uncertainty quantification. In this paper, we propose a novel probabilistic deep learning model for the task of angular regression. Our model uses von Mises distributions to predict a distribution over object pose angle. Whereas a single von Mises distribution makes strong assumptions about the shape of the distribution, we extend the basic model to predict a mixture of von Mises distributions. We show how to learn a mixture model using a finite and infinite number of mixture components. Our model allows for likelihood-based training and efficient inference at test time. We demonstrate on a number of challenging pose estimation datasets that our model produces calibrated probability predictions and competitive or superior point estimates compared to the current state-of-the-art.
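For readers unfamiliar with the von Mises distribution, the following minimal sketch evaluates its density and the negative log-likelihood of a finite mixture, which is the kind of quantity a likelihood-based training objective would minimize. It is not the paper's model: the function names, the power-series Bessel approximation, and the example parameters are all assumptions made for illustration.

```python
import math

def bessel_i0(x, terms=50):
    """Modified Bessel function I0 via its power series (adequate for moderate x)."""
    total, term = 0.0, 1.0
    for k in range(terms):
        total += term
        # Ratio of consecutive series terms: (x/2)^2 / (k+1)^2.
        term *= (x / 2.0) ** 2 / ((k + 1) ** 2)
    return total

def von_mises_pdf(theta, mu, kappa):
    """Density of a von Mises distribution with mean mu and concentration kappa."""
    return math.exp(kappa * math.cos(theta - mu)) / (2.0 * math.pi * bessel_i0(kappa))

def mixture_nll(theta, weights, mus, kappas):
    """Negative log-likelihood of angle theta under a finite von Mises mixture."""
    likelihood = sum(w * von_mises_pdf(theta, m, k)
                     for w, m, k in zip(weights, mus, kappas))
    return -math.log(likelihood)

# Hypothetical two-component mixture with modes at 0 and pi (e.g. a front/back ambiguity).
weights, mus, kappas = [0.7, 0.3], [0.0, math.pi], [4.0, 4.0]
nll_near_mode = mixture_nll(0.1, weights, mus, kappas)
nll_off_mode = mixture_nll(2.0, weights, mus, kappas)
print(nll_near_mode < nll_off_mode)
```

A single component can only express one mode around `mu`, whereas the mixture can place probability mass on several candidate angles at once.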

code pdf [BibTex]


Recovering Accurate 3D Human Pose in The Wild Using IMUs and a Moving Camera

Marcard, T. V., Henschel, R., Black, M. J., Rosenhahn, B., Pons-Moll, G.

In European Conference on Computer Vision (ECCV), Lecture Notes in Computer Science, vol 11214, pages: 614-631, Springer, Cham, September 2018 (inproceedings)

Abstract
In this work, we propose a method that combines a single hand-held camera and a set of Inertial Measurement Units (IMUs) attached at the body limbs to estimate accurate 3D poses in the wild. This poses many new challenges: the moving camera, heading drift, cluttered background, occlusions and many people visible in the video. We associate 2D pose detections in each image to the corresponding IMU-equipped persons by solving a novel graph based optimization problem that forces 3D to 2D coherency within a frame and across long range frames. Given associations, we jointly optimize the pose of a statistical body model, the camera pose and heading drift using a continuous optimization framework. We validated our method on the TotalCapture dataset, which provides video and IMU synchronized with ground truth. We obtain an accuracy of 26mm, which makes it accurate enough to serve as a benchmark for image-based 3D pose estimation in the wild. Using our method, we recorded 3D Poses in the Wild (3DPW), a new dataset consisting of more than 51,000 frames with accurate 3D pose in challenging sequences, including walking in the city, going up-stairs, having coffee or taking the bus. We make the reconstructed 3D poses, video, IMU and 3D models available for research purposes at http://virtualhumans.mpi-inf.mpg.de/3DPW.

pdf SupMat data project DOI Project Page [BibTex]

Decentralized MPC based Obstacle Avoidance for Multi-Robot Target Tracking Scenarios

Tallamraju, R., Rajappa, S., Black, M. J., Karlapalem, K., Ahmad, A.

2018 IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR), pages: 1-8, IEEE, August 2018 (conference)

Abstract
In this work, we consider the problem of decentralized multi-robot target tracking and obstacle avoidance in dynamic environments. Each robot executes a local motion planning algorithm which is based on model predictive control (MPC). The planner is designed as a quadratic program, subject to constraints on robot dynamics and obstacle avoidance. Repulsive potential field functions are employed to avoid obstacles. The novelty of our approach lies in embedding these non-linear potential field functions as constraints within a convex optimization framework. Our method convexifies nonconvex constraints and dependencies by replacing them with pre-computed external input forces in the robot dynamics. The proposed algorithm additionally incorporates different methods to avoid the field local minima problems associated with using potential field functions in planning. The motion planner does not enforce predefined trajectories or any formation geometry on the robots and is a comprehensive solution for cooperative obstacle avoidance in the context of multi-robot target tracking. We perform simulation studies for different scenarios to showcase the convergence and efficacy of the proposed algorithm.

Published Version link (url) DOI [BibTex]

Method and Apparatus for Estimating Body Shape

Black, M. J., Balan, A., Weiss, A., Sigal, L., Loper, M., St Clair, T.

June 2018, U.S. Patent 10,002,460 (misc)

Abstract
A system and method of estimating the body shape of an individual from input data such as images or range maps. The body may appear in one or more poses captured at different times and a consistent body shape is computed for all poses. The body may appear in minimal tight-fitting clothing or in normal clothing wherein the described method produces an estimate of the body shape under the clothing. Clothed or bare regions of the body are detected via image classification and the fitting method is adapted to treat each region differently. Body shapes are represented parametrically and are matched to other bodies based on shape similarity and other features. Standard measurements are extracted using parametric or non-parametric functions of body shape. The system components support many applications in body scanning, advertising, social networking, collaborative filtering and Internet clothing shopping.

Google Patents Project Page [BibTex]

Co-Registration – Simultaneous Alignment and Modeling of Articulated 3D Shapes

Black, M., Hirshberg, D., Loper, M., Rachlin, E., Weiss, A.

February 2018, U.S. Patent 9,898,848 (misc)

Abstract
The present application refers to a method, a model generation unit and a computer program (product) for generating trained models (M) of moving persons, based on physically measured person scan data (S). The approach is based on a common template (T) for the respective person and on the measured person scan data (S) in different shapes and different poses. Scan data are measured with a 3D laser scanner. A generic personal model is used for co-registering a set of person scan data (S), aligning the template (T) to the set of person scans (S) while simultaneously training the generic personal model to become a trained person model (M) by constraining the generic person model to be scan-specific, person-specific and pose-specific, and providing the trained model (M), based on the co-registering of the measured object scan data (S).

text [BibTex]


End-to-end Recovery of Human Shape and Pose

Kanazawa, A., Black, M. J., Jacobs, D. W., Malik, J.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, 2018 (inproceedings)

Abstract
We describe Human Mesh Recovery (HMR), an end-to-end framework for reconstructing a full 3D mesh of a human body from a single RGB image. In contrast to most current methods that compute 2D or 3D joint locations, we produce a richer and more useful mesh representation that is parameterized by shape and 3D joint angles. The main objective is to minimize the reprojection loss of keypoints, which allows our model to be trained using in-the-wild images that only have ground truth 2D annotations. However, the reprojection loss alone is highly underconstrained. In this work we address this problem by introducing an adversary trained to tell whether human body shape and pose parameters are real or not using a large database of 3D human meshes. We show that HMR can be trained with and without using any paired 2D-to-3D supervision. We do not rely on intermediate 2D keypoint detections and infer 3D pose and shape parameters directly from image pixels. Our model runs in real-time given a bounding box containing the person. We demonstrate our approach on various images in-the-wild and out-perform previous optimization-based methods that output 3D meshes and show competitive results on tasks such as 3D joint location estimation and part segmentation.

pdf code project video Project Page [BibTex]

Lions and Tigers and Bears: Capturing Non-Rigid, 3D, Articulated Shape from Images

Zuffi, S., Kanazawa, A., Black, M. J.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, 2018 (inproceedings)

Abstract
Animals are widespread in nature and the analysis of their shape and motion is important in many fields and industries. Modeling 3D animal shape, however, is difficult because the 3D scanning methods used to capture human shape are not applicable to wild animals or natural settings. Consequently, we propose a method to capture the detailed 3D shape of animals from images alone. The articulated and deformable nature of animals makes this problem extremely challenging, particularly in unconstrained environments with moving and uncalibrated cameras. To make this possible, we use a strong prior model of articulated animal shape that we fit to the image data. We then deform the animal shape in a canonical reference pose such that it matches image evidence when articulated and projected into multiple images. Our method extracts significantly more 3D shape detail than previous methods and is able to model new species, including the shape of an extinct animal, using only a few video frames. Additionally, the projected 3D shapes are accurate enough to facilitate the extraction of a realistic texture map from multiple frames.

pdf code/data 3D models Project Page [BibTex]

PoTion: Pose MoTion Representation for Action Recognition

Choutas, V., Weinzaepfel, P., Revaud, J., Schmid, C.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), IEEE Computer Society, 2018 (inproceedings)

Abstract
Most state-of-the-art methods for action recognition rely on a two-stream architecture that processes appearance and motion independently. In this paper, we claim that considering them jointly offers rich information for action recognition. We introduce a novel representation that gracefully encodes the movement of some semantic keypoints. We use the human joints as these keypoints and term our Pose moTion representation PoTion. Specifically, we first run a state-of-the-art human pose estimator [4] and extract heatmaps for the human joints in each frame. We obtain our PoTion representation by temporally aggregating these probability maps. This is achieved by ‘colorizing’ each of them depending on the relative time of the frames in the video clip and summing them. This fixed-size representation for an entire video clip is suitable to classify actions using a shallow convolutional neural network. Our experimental evaluation shows that PoTion outperforms other state-of-the-art pose representations [6, 48]. Furthermore, it is complementary to standard appearance and motion streams. When combining PoTion with the recent two-stream I3D approach [5], we obtain state-of-the-art performance on the JHMDB, HMDB and UCF101 datasets.
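The "colorize by relative time, then sum" step can be sketched for a single joint as follows. This is a simplified illustration only: it assumes a two-channel linear color code (weight `1 - t` for the first channel and `t` for the second, with `t` the relative time in [0, 1]) and omits any normalization; the function name `potion_two_channel` and the toy heatmaps are hypothetical.

```python
def potion_two_channel(heatmaps):
    """Aggregate per-frame joint heatmaps into a 2-channel time-coded image.

    heatmaps: list of T frames, each an H x W nested list of joint probabilities.
    Returns [early, late], two H x W maps weighted toward the start and the end
    of the clip respectively.
    """
    T = len(heatmaps)
    H, W = len(heatmaps[0]), len(heatmaps[0][0])
    early = [[0.0] * W for _ in range(H)]
    late = [[0.0] * W for _ in range(H)]
    for t, frame in enumerate(heatmaps):
        rel = t / (T - 1) if T > 1 else 0.0  # relative time in [0, 1]
        for y in range(H):
            for x in range(W):
                early[y][x] += (1.0 - rel) * frame[y][x]
                late[y][x] += rel * frame[y][x]
    return [early, late]

# Hypothetical 1 x 3 heatmaps: a joint moving from left to right over three frames.
frames = [[[1.0, 0.0, 0.0]], [[0.0, 1.0, 0.0]], [[0.0, 0.0, 1.0]]]
early, late = potion_two_channel(frames)
print(early, late)
```

The output has the same size regardless of the number of frames, and the direction of motion is recoverable from which channel dominates at each position, which is what makes a fixed-size input to a shallow network possible.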

PDF [BibTex]

2009


Ball Joints for Marker-less Human Motion Capture

Pons-Moll, G., Rosenhahn, B.

In IEEE Workshop on Applications of Computer Vision (WACV), December 2009 (inproceedings)

pdf [BibTex]

Background Subtraction Based on Rank Constraint for Point Trajectories

Ahmad, A., Del Bue, A., Lima, P.

In pages: 1-3, October 2009 (inproceedings)

Abstract
This work deals with a background subtraction algorithm for a fish-eye lens camera having 3 degrees of freedom, 2 in translation and 1 in rotation. The core assumption in this algorithm is that the background is composed of a dominant static plane in the world frame. The novelty lies in developing a rank-constraint based background subtraction for the equidistant projection model, a property of the fish-eye lens. A detailed simulation result is presented to support the hypotheses explained in this paper.

link (url) [BibTex]

Parametric Modeling of the Beating Heart with Respiratory Motion Extracted from Magnetic Resonance Images

Pons-Moll, G., Crosas, C., Tadmor, G., MacLeod, R., Rosenhahn, B., Brooks, D.

In IEEE Computers in Cardiology (CINC), September 2009 (inproceedings)

[BibTex]

Computer cursor control by motor cortical signals in humans with tetraplegia

Kim, S., Simeral, J. D., Hochberg, L. R., Donoghue, J. P., Black, M. J.

In 7th Asian Control Conference, ASCC09, pages: 988-993, Hong Kong, China, August 2009 (inproceedings)

pdf [BibTex]

Classification of colon polyps in NBI endoscopy using vascularization features

Stehle, T., Auer, R., Gross, S., Behrens, A., Wulff, J., Aach, T., Winograd, R., Trautwein, C., Tischendorf, J.

In Medical Imaging 2009: Computer-Aided Diagnosis, 7260, (Editors: N. Karssemeijer and M. L. Giger), SPIE, February 2009 (inproceedings)

Abstract
The evolution of colon cancer starts with colon polyps. There are two different types of colon polyps, namely hyperplasias and adenomas. Hyperplasias are benign polyps which are known not to evolve into cancer and, therefore, do not need to be removed. By contrast, adenomas have a strong tendency to become malignant. Therefore, they have to be removed immediately via polypectomy. For this reason, a method to reliably differentiate adenomas from hyperplasias during a preventive medical endoscopy of the colon (colonoscopy) is highly desirable. A recent study has shown that it is possible to distinguish both types of polyps visually by means of their vascularization. Adenomas exhibit a large number of blood vessel capillaries on their surface whereas hyperplasias show only a few of them. In this paper, we show the feasibility of computer-based classification of colon polyps using vascularization features. The proposed classification algorithm consists of several steps: For the critical part of vessel segmentation, we implemented and compared two segmentation algorithms. After a skeletonization of the detected blood vessel candidates, we used the results as seed points for the Fast Marching algorithm which is used to segment the whole vessel lumen. Subsequently, features are computed from this segmentation which are then used to classify the polyps. In leave-one-out tests on our polyp database (56 polyps), we achieve a correct classification rate of approximately 90%.

DOI [BibTex]

One-shot scanning using De Bruijn spaced grids

Ulusoy, A., Calakli, F., Taubin, G.

In Computer Vision Workshops (ICCV Workshops), 2009 IEEE 12th International Conference on, pages: 1786-1792, IEEE, 2009 (inproceedings)

Abstract
In this paper we present a new one-shot method to reconstruct the shape of dynamic 3D objects and scenes based on active illumination. In common with other related prior-art methods, a static grid pattern is projected onto the scene, a video sequence of the illuminated scene is captured, a shape estimate is produced independently for each video frame, and the one-shot property is realized at the expense of space resolution. The main challenge in grid-based one-shot methods is to engineer the pattern and algorithms so that the correspondence between pattern grid points and their images can be established very fast and without uncertainty. We present an efficient one-shot method which exploits simple geometric constraints to solve the correspondence problem. We also introduce De Bruijn spaced grids, a novel grid pattern, and show with strong empirical data that the resulting scheme is much more robust compared to those based on uniform spaced grids.

pdf link (url) DOI [BibTex]

Estimating human shape and pose from a single image

Guan, P., Weiss, A., Balan, A., Black, M. J.

In Int. Conf. on Computer Vision, ICCV, pages: 1381-1388, 2009 (inproceedings)

pdf video - mov 25MB video - mp4 10MB YouTube Project Page [BibTex]

On feature combination for multiclass object classification

Gehler, P., Nowozin, S.

In Proceedings of the Twelfth IEEE International Conference on Computer Vision, pages: 221-228, 2009, oral presentation (inproceedings)

project page, code, data GoogleScholar pdf DOI [BibTex]

Segmentation, Ordering and Multi-object Tracking Using Graphical Models

Wang, C., Gorce, M. D. L., Paragios, N.

In IEEE International Conference on Computer Vision (ICCV), 2009 (inproceedings)

pdf [BibTex]

Evaluating the potential of primary motor and premotor cortex for multidimensional neuroprosthetic control of complete reaching and grasping actions

Vargas-Irwin, C. E., Yadollahpour, P., Shakhnarovich, G., Black, M. J., Donoghue, J. P.

2009 Abstract Viewer and Itinerary Planner. Society for Neuroscience, Society for Neuroscience, 2009, Online (conference)

[BibTex]

Modeling and Evaluation of Human-to-Robot Mapping of Grasps

Romero, J., Kjellström, H., Kragic, D.

In International Conference on Advanced Robotics (ICAR), pages: 1-6, 2009 (inproceedings)

Pdf [BibTex]

An additive latent feature model for transparent object recognition

Fritz, M., Black, M., Bradski, G., Karayev, S., Darrell, T.

In Advances in Neural Information Processing Systems 22, NIPS, pages: 558-566, MIT Press, 2009 (inproceedings)

pdf slides [BibTex]

Let the kernel figure it out; Principled learning of pre-processing for kernel classifiers

Gehler, P., Nowozin, S.

In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), pages: 2836-2843, IEEE Computer Society, 2009 (inproceedings)

doi project page pdf [BibTex]

Monocular Real-Time 3D Articulated Hand Pose Estimation

Romero, J., Kjellström, H., Kragic, D.

In IEEE-RAS International Conference on Humanoid Robots, pages: 87-92, 2009 (inproceedings)

Pdf [BibTex]

Grasp Recognition and Mapping on Humanoid Robots

Do, M., Romero, J., Kjellström, H., Azad, P., Asfour, T., Kragic, D., Dillmann, R.

In IEEE-RAS International Conference on Humanoid Robots, pages: 465-471, 2009 (inproceedings)

Pdf Video [BibTex]

4D Cardiac Segmentation of the Epicardium and Left Ventricle

Pons-Moll, G., Tadmor, G., MacLeod, R. S., Rosenhahn, B., Brooks, D. H.

In World Congress of Medical Physics and Biomedical Engineering (WC), 2009 (inproceedings)

[BibTex]

Geometric Potential Force for the Deformable Model

Si Yong Yeo, Xianghua Xie, Igor Sazonov, Perumal Nithiarasu

In The 20th British Machine Vision Conference, pages: 1-11, 2009 (inproceedings)

Abstract
We propose a new external force field for deformable models which can be conveniently generalized to high dimensions. The external force field is based on hypothesized interactions between the relative geometries of the deformable model and image gradients. The evolution of the deformable model is solved using the level set method. The dynamic interaction forces between the geometries can greatly improve the deformable model performance in acquiring complex geometries and highly concave boundaries, and in dealing with weak image edges. The new deformable model can handle arbitrary cross-boundary initializations. Here, we show that the proposed method achieves significant improvements when compared against existing state-of-the-art techniques.

[BibTex]

Level Set Based Automatic Segmentation of Human Aorta

Si Yong Yeo, Xianghua Xie, Igor Sazonov, Perumal Nithiarasu

In International Conference on Computational & Mathematical Biomedical Engineering, pages: 242-245, 2009 (inproceedings)

[BibTex]

In Defense of Orthonormality Constraints for Nonrigid Structure from Motion

Akhter, I., Sheikh, Y., Khan, S.

In Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pages: 2447-2453, 2009 (inproceedings)

Abstract
In factorization approaches to nonrigid structure from motion, the 3D shape of a deforming object is usually modeled as a linear combination of a small number of basis shapes. The original approach to simultaneously estimate the shape basis and nonrigid structure exploited orthonormality constraints for metric rectification. Recently, it has been asserted that structure recovery through orthonormality constraints alone is inherently ambiguous and cannot result in a unique solution. This assertion has been accepted as conventional wisdom and is the justification of many remedial heuristics in literature. Our key contribution is to prove that orthonormality constraints are in fact sufficient to recover the 3D structure from image observations alone. We characterize the true nature of the ambiguity in using orthonormality constraints for the shape basis and show that it has no impact on structure reconstruction. We conclude from our experimentation that the primary challenge in using shape basis for nonrigid structure from motion is the difficulty in the optimization problem rather than the ambiguity in orthonormality constraints.

pdf [BibTex]

Dynamic distortion correction for endoscopy systems with exchangeable optics

Stehle, T., Hennes, M., Gross, S., Behrens, A., Wulff, J., Aach, T.

In Bildverarbeitung für die Medizin 2009, pages: 142-146, Springer Berlin Heidelberg, 2009 (inproceedings)

Abstract
Endoscopic images are strongly affected by lens distortion caused by the use of wide angle lenses. In case of endoscopy systems with exchangeable optics, e.g. in bladder endoscopy or sinus endoscopy, the camera sensor and the optics do not form a rigid system but they can be shifted and rotated with respect to each other during an examination. This flexibility has a major impact on the location of the distortion centre as it is moved along with the optics. In this paper, we describe an algorithm for the dynamic correction of lens distortion in cystoscopy which is based on a one time calibration. For the compensation, we combine a conventional static method for distortion correction with an algorithm to detect the position and the orientation of the elliptic field of view. This enables us to estimate the position of the distortion centre according to the relative movement of camera and optics. Therewith, a distortion correction for arbitrary rotation angles and shifts becomes possible without performing static calibrations for every possible combination of shifts and angles beforehand.

link (url) DOI [BibTex]

Computational mechanisms for the recognition of time sequences of images in the visual cortex

Tan, C., Jhuang, H., Singer, J., Serre, T., Sheinberg, D., Poggio, T.

Society for Neuroscience, 2009 (conference)

pdf [BibTex]

Interactive Inverse Kinematics for Monocular Motion Estimation

Morten Engell-Norregaard, Soren Hauberg, Jerome Lapuyade, Kenny Erleben, Kim S. Pedersen

In The 6th Workshop on Virtual Reality Interaction and Physical Simulation (VRIPHYS), 2009 (inproceedings)

Conference site Paper site [BibTex]

A Comprehensive Grasp Taxonomy

Feix, T., Pawlik, R., Schmiedmayer, H., Romero, J., Kragic, D.

In Robotics, Science and Systems: Workshop on Understanding the Human Hand for Advancing Robotic Manipulation, 2009 (inproceedings)

Pdf [BibTex]

Population coding of ground truth motion in natural scenes in the early visual system

Stanley, G., Black, M. J., Lewis, J., Desbordes, G., Jin, J., Alonso, J.

COSYNE, 2009 (conference)

[BibTex]

Segmentation of Human Upper Airway Using a Level Set Based Deformable Model

Si Yong Yeo, Xianghua Xie, Igor Sazonov, Perumal Nithiarasu

In The 13th Medical Image Understanding and Analysis, 2009 (inproceedings)

[BibTex]

Three Dimensional Monocular Human Motion Analysis in End-Effector Space

Soren Hauberg, Jerome Lapuyade, Morten Engell-Norregaard, Kenny Erleben, Kim S. Pedersen

In Energy Minimization Methods in Computer Vision and Pattern Recognition, 5681, pages: 235-248, Lecture Notes in Computer Science, (Editors: Cremers, Daniel and Boykov, Yuri and Blake, Andrew and Schmidt, Frank), Springer Berlin Heidelberg, 2009 (inproceedings)

Publishers site Paper site PDF [BibTex]

Decoding visual motion from correlated firing of thalamic neurons

Stanley, G. B., Black, M. J., Desbordes, G., Jin, J., Wang, Y., Alonso, J.

2009 Abstract Viewer and Itinerary Planner. Society for Neuroscience, Society for Neuroscience, 2009 (conference)

[BibTex]

2006


Finding directional movement representations in motor cortical neural populations using nonlinear manifold learning

Kim, S., Simeral, J., Jenkins, O., Donoghue, J., Black, M.

World Congress on Medical Physics and Biomedical Engineering 2006, Seoul, Korea, August 2006 (conference)

[BibTex]

A non-parametric Bayesian approach to spike sorting

Wood, F., Goldwater, S., Black, M. J.

In International Conference of the IEEE Engineering in Medicine and Biology Society, EMBS, pages: 1165-1169, New York, NY, August 2006 (inproceedings)

pdf [BibTex]

Predicting 3D people from 2D pictures

(Best Paper)

Sigal, L., Black, M. J.

In Proc. IV Conf. on Articulated Motion and Deformable Objects (AMDO), LNCS 4069, pages: 185-195, July 2006 (inproceedings)

Abstract
We propose a hierarchical process for inferring the 3D pose of a person from monocular images. First we infer a learned view-based 2D body model from a single image using non-parametric belief propagation. This approach integrates information from bottom-up body-part proposal processes and deals with self-occlusion to compute distributions over limb poses. Then, we exploit a learned Mixture of Experts model to infer a distribution of 3D poses conditioned on 2D poses. This approach is more general than recent work on inferring 3D pose directly from silhouettes since the 2D body model provides a richer representation that includes the 2D joint angles and the poses of limbs that may be unobserved in the silhouette. We demonstrate the method in a laboratory setting where we evaluate the accuracy of the 3D poses against ground truth data. We also estimate 3D body pose in a monocular image sequence. The resulting 3D estimates are sufficiently accurate to serve as proposals for the Bayesian inference of 3D human motion over time.

pdf pdf from publisher Video [BibTex]

Specular flow and the recovery of surface structure

Roth, S., Black, M.

In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, CVPR, 2, pages: 1869-1876, New York, NY, June 2006 (inproceedings)

Abstract
In scenes containing specular objects, the image motion observed by a moving camera may be an intermixed combination of optical flow resulting from diffuse reflectance (diffuse flow) and specular reflection (specular flow). Here, with few assumptions, we formalize the notion of specular flow, show how it relates to the 3D structure of the world, and develop an algorithm for estimating scene structure from 2D image motion. Unlike previous work on isolated specular highlights we use two image frames and estimate the semi-dense flow arising from the specular reflections of textured scenes. We parametrically model the image motion of a quadratic surface patch viewed from a moving camera. The flow is modeled as a probabilistic mixture of diffuse and specular components and the 3D shape is recovered using an Expectation-Maximization algorithm. Rather than treating specular reflections as noise to be removed or ignored, we show that the specular flow provides additional constraints on scene geometry that improve estimation of 3D structure when compared with reconstruction from diffuse flow alone. We demonstrate this for a set of synthetic and real sequences of mixed specular-diffuse objects.

pdf [BibTex]

An adaptive appearance model approach for model-based articulated object tracking

Balan, A., Black, M. J.

In Proc. IEEE Conf. on Computer Vision and Pattern Recognition, CVPR, 1, pages: 758-765, New York, NY, June 2006 (inproceedings)

Abstract
The detection and tracking of three-dimensional human body models has progressed rapidly but successful approaches typically rely on accurate foreground silhouettes obtained using background segmentation. There are many practical applications where such information is imprecise. Here we develop a new image likelihood function based on the visual appearance of the subject being tracked. We propose a robust, adaptive, appearance model based on the Wandering-Stable-Lost framework extended to the case of articulated body parts. The method models appearance using a mixture model that includes an adaptive template, frame-to-frame matching and an outlier process. We employ an annealed particle filtering algorithm for inference and take advantage of the 3D body model to predict self occlusion and improve pose estimation accuracy. Quantitative tracking results are presented for a walking sequence with a 180 degree turn, captured with four synchronized and calibrated cameras and containing significant appearance changes and self-occlusion in each view.

pdf [BibTex]
