

2014


Model Transport: Towards Scalable Transfer Learning on Manifolds

Freifeld, O., Hauberg, S., Black, M. J.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 1378-1385, Columbus, Ohio, USA, June 2014 (inproceedings)

Abstract
We consider the intersection of two research fields: transfer learning and statistics on manifolds. In particular, we consider, for manifold-valued data, transfer learning of tangent-space models such as Gaussian distributions, PCA, regression, or classifiers. Though one would hope to simply use ordinary R^n transfer-learning ideas, the manifold structure prevents it. We overcome this by basing our method on inner-product-preserving parallel transport, a well-known tool widely used in other problems of statistics on manifolds in computer vision. At first, this straightforward idea seems to suffer from an obvious shortcoming: Transporting large datasets is prohibitively expensive, hindering scalability. Fortunately, with our approach, we never transport data. Rather, we show how the statistical models themselves can be transported, and prove that for the tangent-space models above, the transport “commutes” with learning. Consequently, our compact framework, applicable to a large class of manifolds, is not restricted by the size of either the training or test sets. We demonstrate the approach by transferring PCA and logistic-regression models of real-world data involving 3D shapes and image descriptors.
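The transport primitive the abstract relies on can be illustrated on the unit sphere, where parallel transport along the minimizing geodesic has a closed form. The sketch below is illustrative only (function and variable names are our own, not the paper's code); it transports a tangent vector while preserving inner products, which is the property that lets whole tangent-space models be moved instead of data:

```python
import numpy as np

def parallel_transport_sphere(p, q, V):
    """Parallel-transport the rows of V (tangent vectors at p) to q
    along the minimizing geodesic of the unit sphere.

    The map preserves inner products, so e.g. a transported PCA basis
    stays orthonormal. Closed form:
        v -> v - <v, q> / (1 + <p, q>) * (p + q)
    (valid as long as q != -p)."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    V = np.atleast_2d(np.asarray(V, float))
    coef = (V @ q) / (1.0 + p @ q)
    return V - np.outer(coef, p + q)

p = np.array([1.0, 0.0, 0.0])
q = np.array([0.0, 1.0, 0.0])
v = np.array([[0.0, 1.0, 0.0]])   # tangent at p, pointing toward q
w = parallel_transport_sphere(p, q, v)
# w lies in the tangent space at q and keeps the norm of v
```

Transporting a Gaussian or a PCA model then amounts to applying this map to its mean-relative basis vectors, which is independent of the number of data points.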

pdf SupMat Video poster DOI Project Page [BibTex]



Robot Arm Pose Estimation through Pixel-Wise Part Classification

Bohg, J., Romero, J., Herzog, A., Schaal, S.

In IEEE International Conference on Robotics and Automation (ICRA) 2014, pages: 3143-3150, June 2014 (inproceedings)

Abstract
We propose to frame the problem of marker-less robot arm pose estimation as a pixel-wise part classification problem. As input, we use a depth image in which each pixel is classified to be either from a particular robot part or the background. The classifier is a random decision forest trained on a large number of synthetically generated and labeled depth images. From all the training samples ending up at a leaf node, a set of offsets is learned that votes for relative joint positions. Pooling these votes over all foreground pixels and subsequent clustering gives us an estimate of the true joint positions. Due to the intrinsic parallelism of pixel-wise classification, this approach can run in super real-time and is more efficient than previous ICP-like methods. We quantitatively evaluate the accuracy of this approach on synthetic data. We also demonstrate that the method produces accurate joint estimates on real data despite being purely trained on synthetic data.
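The pixel-wise classification core of this pipeline can be sketched with scikit-learn. The toy below is a stand-in, not the paper's implementation: the paper learns depth-probe offsets and offset votes for joint positions, whereas here the probes are fixed and the centre depth is included as a feature just to keep the toy problem separable:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Synthetic depth image: a near "robot part" (~1 m) on a far
# background (~3 m), with mild sensor noise.
rng = np.random.default_rng(0)
depth = 3.0 + rng.normal(0.0, 0.01, (64, 64))
depth[20:40, 20:40] = 1.0 + rng.normal(0.0, 0.01, (20, 20))
labels = np.zeros((64, 64), dtype=int)
labels[20:40, 20:40] = 1          # 1 = part, 0 = background

def pixel_features(depth, ys, xs):
    """Per-pixel features: the centre depth plus depth differences
    at fixed probe offsets (illustrative; the paper's probes are
    learned and depth-normalized)."""
    feats = [depth[ys, xs]]
    for dy, dx in [(-4, 0), (4, 0), (0, -4), (0, 4)]:
        oy = np.clip(ys + dy, 0, depth.shape[0] - 1)
        ox = np.clip(xs + dx, 0, depth.shape[1] - 1)
        feats.append(depth[oy, ox] - depth[ys, xs])
    return np.stack(feats, axis=1)

ys, xs = np.indices(depth.shape).reshape(2, -1)
X, y = pixel_features(depth, ys, xs), labels.ravel()
clf = RandomForestClassifier(n_estimators=20, random_state=0).fit(X, y)
acc = (clf.predict(X) == y).mean()
```

Because every pixel is classified independently, this step parallelizes trivially, which is what gives the method its super-real-time speed.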

video code pdf DOI Project Page [BibTex]



Efficient Non-linear Markov Models for Human Motion

Lehrmann, A. M., Gehler, P. V., Nowozin, S.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 1314-1321, IEEE, June 2014 (inproceedings)

Abstract
Dynamic Bayesian networks such as Hidden Markov Models (HMMs) are successfully used as probabilistic models for human motion. The use of hidden variables makes them expressive models, but inference is only approximate and requires procedures such as particle filters or Markov chain Monte Carlo methods. In this work we propose to instead use simple Markov models that only model observed quantities. We retain a highly expressive dynamic model by using interactions that are nonlinear and non-parametric. A presentation of our approach in terms of latent variables shows logarithmic growth in the number of latent states for the computation of exact log-likelihoods. We validate our model on human motion capture data and demonstrate state-of-the-art performance on action recognition and motion completion tasks.
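The key idea, a Markov model over observed quantities with a nonparametric transition, can be caricatured with a nearest-neighbour predictor of the next frame given the current one. This sketch is our own stand-in for the paper's model (all names are hypothetical), but it shows why no hidden-state inference is needed:

```python
import numpy as np

class NNMarkovModel:
    """First-order Markov model over observed pose vectors with a
    nonparametric (k-nearest-neighbour) transition function.
    An illustrative stand-in, not the paper's implementation."""
    def __init__(self, k=3):
        self.k = k

    def fit(self, frames):
        frames = np.asarray(frames, float)
        self.prev, self.next = frames[:-1], frames[1:]
        return self

    def predict(self, x):
        # Average the successors of the k training frames closest to x.
        d = np.linalg.norm(self.prev - np.asarray(x, float), axis=1)
        idx = np.argsort(d)[:self.k]
        return self.next[idx].mean(axis=0)

# A noiseless nonlinear dynamic: motion on a circle.
t = np.linspace(0, 4 * np.pi, 400)
frames = np.stack([np.cos(t), np.sin(t)], axis=1)
model = NNMarkovModel(k=3).fit(frames)
pred = model.predict(frames[100])   # one-step prediction
```

Prediction is a direct lookup in observed data, so evaluating the model never requires particle filtering or MCMC.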

Project page pdf DOI Project Page [BibTex]



Grassmann Averages for Scalable Robust PCA

Hauberg, S., Feragen, A., Black, M. J.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 3810-3817, Columbus, Ohio, USA, June 2014 (inproceedings)

Abstract
As the collection of large datasets becomes increasingly automated, the occurrence of outliers will increase – "big data" implies "big outliers". While principal component analysis (PCA) is often used to reduce the size of data, and scalable solutions exist, it is well-known that outliers can arbitrarily corrupt the results. Unfortunately, state-of-the-art approaches for robust PCA do not scale beyond small-to-medium sized datasets. To address this, we introduce the Grassmann Average (GA), which expresses dimensionality reduction as an average of the subspaces spanned by the data. Because averages can be efficiently computed, we immediately gain scalability. GA is inherently more robust than PCA, but we show that they coincide for Gaussian data. We exploit that averages can be made robust to formulate the Robust Grassmann Average (RGA) as a form of robust PCA. Robustness can be with respect to vectors (subspaces) or elements of vectors; we focus on the latter and use a trimmed average. The resulting Trimmed Grassmann Average (TGA) is particularly appropriate for computer vision because it is robust to pixel outliers. The algorithm has low computational complexity and minimal memory requirements, making it scalable to "big noisy data." We demonstrate TGA for background modeling, video restoration, and shadow removal. We show scalability by performing robust PCA on the entire Star Wars IV movie.
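The Grassmann Average admits a very short fixed-point computation: repeatedly flip the sign of each data point to align it with the current estimate, then average. A minimal sketch of the leading component (the trimmed variant, TGA, would replace the weighted mean below with a coordinate-wise trimmed mean):

```python
import numpy as np

def grassmann_average(X, iters=20, seed=0):
    """Leading Grassmann Average of the rows of X: an average of the
    1-d subspaces spanned by the data points, computed by the
    sign-alignment fixed-point iteration."""
    X = np.asarray(X, float)
    w = np.linalg.norm(X, axis=1)          # subspace weights
    U = X / w[:, None]                      # unit directions
    rng = np.random.default_rng(seed)
    q = rng.normal(size=X.shape[1])
    q /= np.linalg.norm(q)
    for _ in range(iters):
        s = np.sign(U @ q)                  # align each point with q
        s[s == 0] = 1.0
        v = (s * w) @ U                     # weighted average direction
        q = v / np.linalg.norm(v)
    return q

# Data dominated by the first axis: the average subspace is ~ e1.
X = np.array([[3.0, 0.2], [-2.5, 0.1], [4.0, -0.3], [-3.5, -0.1]])
q = grassmann_average(X)
```

Each iteration costs one pass over the data with O(ND) work and O(D) memory, which is what makes the approach scalable to very large collections.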

pdf code supplementary material tutorial video results video talk poster DOI Project Page [BibTex]



Posebits for Monocular Human Pose Estimation

Pons-Moll, G., Fleet, D. J., Rosenhahn, B.

In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), pages: 2345-2352, Columbus, Ohio, USA, June 2014 (inproceedings)

Abstract
We advocate the inference of qualitative information about 3D human pose, called posebits, from images. Posebits represent boolean geometric relationships between body parts (e.g., left-leg in front of right-leg or hands close to each other). The advantages of posebits as a mid-level representation are 1) for many tasks of interest, such qualitative pose information may be sufficient (e.g., semantic image retrieval); 2) it is relatively easy to annotate large image corpora with posebits, as it simply requires answers to yes/no questions; and 3) they help resolve challenging pose ambiguities and therefore facilitate the difficult task of image-based 3D pose estimation. We introduce posebits, a posebit database, a method for selecting useful posebits for pose estimation and a structural SVM model for posebit inference. Experiments show the use of posebits for semantic image retrieval and for improving 3D pose estimation.
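Given 3D joint positions, posebits are just boolean predicates. A minimal sketch (the joint names, coordinate convention with y up and z away from the camera, and the 0.3 m "close" threshold are our own illustrative choices, not the paper's definitions):

```python
import numpy as np

def posebits(joints, tol=0.0):
    """Example boolean geometric relationships between body parts.
    `joints` maps joint names to 3-d positions; assumes y points up
    and z points away from the camera."""
    j = {k: np.asarray(v, float) for k, v in joints.items()}
    return {
        "left_leg_in_front_of_right_leg":
            j["left_ankle"][2] < j["right_ankle"][2] - tol,
        "hands_close_together":
            np.linalg.norm(j["left_hand"] - j["right_hand"]) < 0.3,
        "left_hand_above_head":
            j["left_hand"][1] > j["head"][1] + tol,
    }

bits = posebits({
    "head": [0.0, 1.7, 3.0],
    "left_hand": [0.2, 1.9, 3.0],     # raised above the head
    "right_hand": [0.3, 1.0, 3.0],
    "left_ankle": [0.1, 0.0, 2.8],    # left foot nearer the camera
    "right_ankle": [-0.1, 0.0, 3.1],
})
```

Annotating such predicates only takes yes/no answers, which is exactly why posebit ground truth is cheap to collect at scale.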

pdf Project Page Project Page [BibTex]



Simultaneous Underwater Visibility Assessment, Enhancement and Improved Stereo

Roser, M., Dunbabin, M., Geiger, A.

IEEE International Conference on Robotics and Automation, pages: 3840-3847, Hong Kong, China, June 2014 (conference)

Abstract
Vision-based underwater navigation and obstacle avoidance demands robust computer vision algorithms, particularly for operation in turbid water with reduced visibility. This paper describes a novel method for the simultaneous underwater image quality assessment, visibility enhancement and disparity computation to increase stereo range resolution under dynamic, natural lighting and turbid conditions. The technique estimates the visibility properties from a sparse 3D map of the original degraded image using a physical underwater light attenuation model. Firstly, an iterated distance-adaptive image contrast enhancement enables a dense disparity computation and visibility estimation. Secondly, using a light attenuation model for ocean water, a color corrected stereo underwater image is obtained along with a visibility distance estimate. Experimental results in shallow, naturally lit, high-turbidity coastal environments show the proposed technique improves range estimation over the original images as well as image quality and color for habitat classification. Furthermore, the recursiveness and robustness of the technique allows real-time implementation onboard an Autonomous Underwater Vehicle for improved navigation and obstacle avoidance performance.
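The attenuation model referred to in the abstract is the standard one, I = J·e^(−c·d) + B·(1 − e^(−c·d)), and the enhancement step amounts to inverting it per pixel once the range d is known. A hedged sketch (the coefficients c and B are assumed known here, whereas the paper estimates them from a sparse 3D map; the clipping floor `t_min` is our own safeguard):

```python
import numpy as np

def enhance(I, depth, c, B, t_min=0.05):
    """Distance-adaptive visibility enhancement by inverting the
    underwater light-attenuation model
        I = J * exp(-c * d) + B * (1 - exp(-c * d)),
    where J is the unattenuated scene radiance, d the per-pixel
    range, c the attenuation coefficient and B the backscatter
    ('veiling light'). The transmission is clipped at t_min to avoid
    amplifying noise at large ranges."""
    t = np.maximum(np.exp(-c * depth), t_min)   # transmission
    return (I - B * (1.0 - t)) / t

# Round trip: attenuate a known scene, then recover it.
J = np.array([[0.8, 0.2], [0.5, 0.9]])
d = np.array([[1.0, 2.0], [3.0, 0.5]])
c, B = 0.4, 0.6
I = J * np.exp(-c * d) + B * (1.0 - np.exp(-c * d))
J_hat = enhance(I, d, c, B)
```

Because the correction depends on d, nearer pixels are enhanced less than distant ones, which is the "distance-adaptive" part of the method.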

pdf DOI [BibTex]



Preserving Modes and Messages via Diverse Particle Selection

Pacheco, J., Zuffi, S., Black, M. J., Sudderth, E.

In Proceedings of the 31st International Conference on Machine Learning (ICML-14), 32(1):1152-1160, JMLR Workshop and Conference Proceedings, Beijing, China, June 2014 (inproceedings)

Abstract
In applications of graphical models arising in domains such as computer vision and signal processing, we often seek the most likely configurations of high-dimensional, continuous variables. We develop a particle-based max-product algorithm which maintains a diverse set of posterior mode hypotheses, and is robust to initialization. At each iteration, the set of hypotheses at each node is augmented via stochastic proposals, and then reduced via an efficient selection algorithm. The integer program underlying our optimization-based particle selection minimizes errors in subsequent max-product message updates. This objective automatically encourages diversity in the maintained hypotheses, without requiring tuning of application-specific distances among hypotheses. By avoiding the stochastic resampling steps underlying particle sum-product algorithms, we also avoid common degeneracies where particles collapse onto a single hypothesis. Our approach significantly outperforms previous particle-based algorithms in experiments focusing on the estimation of human pose from single images.

pdf SupMat link (url) Project Page Project Page [BibTex]



Calibrating and Centering Quasi-Central Catadioptric Cameras

Schoenbein, M., Strauss, T., Geiger, A.

IEEE International Conference on Robotics and Automation, pages: 4443-4450, Hong Kong, China, June 2014 (conference)

Abstract
Non-central catadioptric models are able to cope with irregular camera setups and inaccuracies in the manufacturing process but are computationally demanding and thus not suitable for robotic applications. On the other hand, calibrating a quasi-central (almost central) system with a central model introduces errors due to a wrong relationship between the viewing ray orientations and the pixels on the image sensor. In this paper, we propose a central approximation to quasi-central catadioptric camera systems that is both accurate and efficient. We observe that the distance to points in 3D is typically large compared to deviations from the single viewpoint. Thus, we first calibrate the system using a state-of-the-art non-central camera model. Next, we show that by remapping the observations we are able to match the orientation of the viewing rays of a much simpler single viewpoint model with the true ray orientations. While our approximation is general and applicable to all quasi-central camera systems, we focus on one of the most common cases in practice: hypercatadioptric cameras. We compare our model to a variety of baselines in synthetic and real localization and motion estimation experiments. We show that by using the proposed model we are able to achieve near non-central accuracy while obtaining speed-ups of more than three orders of magnitude compared to state-of-the-art non-central models.

pdf DOI [BibTex]



Probabilistic Solutions to Differential Equations and their Application to Riemannian Statistics

Hennig, P., Hauberg, S.

In Proceedings of the 17th International Conference on Artificial Intelligence and Statistics, 33, pages: 347-355, JMLR: Workshop and Conference Proceedings, (Editors: S Kaski and J Corander), Microtome Publishing, Brookline, MA, April 2014 (inproceedings)

Abstract
We study a probabilistic numerical method for the solution of both boundary and initial value problems that returns a joint Gaussian process posterior over the solution. Such methods have concrete value in the statistics on Riemannian manifolds, where non-analytic ordinary differential equations are involved in virtually all computations. The probabilistic formulation permits marginalising the uncertainty of the numerical solution such that statistics are less sensitive to inaccuracies. This leads to new Riemannian algorithms for mean value computations and principal geodesic analysis. Marginalisation also means results can be less precise than point estimates, enabling a noticeable speed-up over the state of the art. Our approach is an argument for a wider point that uncertainty caused by numerical calculations should be tracked throughout the pipeline of machine learning algorithms.

pdf Youtube Supplements Project page link (url) [BibTex]



Multi-View Priors for Learning Detectors from Sparse Viewpoint Data

Pepik, B., Stark, M., Gehler, P., Schiele, B.

International Conference on Learning Representations, April 2014 (conference)

Abstract
While the majority of today's object class models provide only 2D bounding boxes, far richer output hypotheses are desirable including viewpoint, fine-grained category, and 3D geometry estimate. However, models trained to provide richer output require larger amounts of training data, preferably well covering the relevant aspects such as viewpoint and fine-grained categories. In this paper, we address this issue from the perspective of transfer learning, and design an object class model that explicitly leverages correlations between visual features. Specifically, our model represents prior distributions over permissible multi-view detectors in a parametric way -- the priors are learned once from training data of a source object class, and can later be used to facilitate the learning of a detector for a target class. As we show in our experiments, this transfer is not only beneficial for detectors based on basic-level category representations, but also enables the robust learning of detectors that represent classes at finer levels of granularity, where training data is typically even scarcer and more unbalanced. As a result, we report largely improved performance in simultaneous 2D object localization and viewpoint estimation on a recent dataset of challenging street scenes.

reviews pdf Project Page [BibTex]



NRSfM using Local Rigidity

Rehan, A., Zaheer, A., Akhter, I., Saeed, A., Mahmood, B., Usmani, M., Khan, S.

In Proceedings Winter Conference on Applications of Computer Vision, pages: 69-74, open access, IEEE, Steamboat Springs, CO, USA, March 2014 (inproceedings)

Abstract
Factorization methods for computation of nonrigid structure have limited practicality, and work well only when there is large enough camera motion between frames, with long sequences and limited or no occlusions. We show that typical nonrigid structure can often be approximated well as locally rigid sub-structures in time and space. Specifically, we assume that: 1) the structure can be approximated as rigid in a short local time window and 2) some point pairs stay relatively rigid in space, maintaining a fixed distance between them during the sequence. We first use the triangulation constraints in rigid SFM over a sliding time window to get an initial estimate of the nonrigid 3D structure. We then automatically identify relatively rigid point pairs in this structure, and use their length-constancy simultaneously with triangulation constraints to refine the structure estimate. Unlike factorization methods, the structure is estimated independent of the camera motion computation, adding to the simplicity and stability of the approach. Further, local factorization inherently handles significant natural occlusions gracefully, performing much better than the state of the art. We show more stable and accurate results as compared to the state of the art on even short sequences starting from only 15 frames, containing camera rotations as small as 2 degrees and up to 50% missing data.

link (url) [BibTex]



Model-based Anthropometry: Predicting Measurements from 3D Human Scans in Multiple Poses

Tsoli, A., Loper, M., Black, M. J.

In Proceedings Winter Conference on Applications of Computer Vision, pages: 83-90, IEEE, March 2014 (inproceedings)

Abstract
Extracting anthropometric or tailoring measurements from 3D human body scans is important for applications such as virtual try-on, custom clothing, and online sizing. Existing commercial solutions identify anatomical landmarks on high-resolution 3D scans and then compute distances or circumferences on the scan. Landmark detection is sensitive to acquisition noise (e.g. holes) and these methods require subjects to adopt a specific pose. In contrast, we propose a solution we call model-based anthropometry. We fit a deformable 3D body model to scan data in one or more poses; this model-based fitting is robust to scan noise. This brings the scan into registration with a database of registered body scans. Then, we extract features from the registered model (rather than from the scan); these include limb lengths, circumferences, and statistical features of global shape. Finally, we learn a mapping from these features to measurements using regularized linear regression. We perform an extensive evaluation using the CAESAR dataset and demonstrate that the accuracy of our method outperforms state-of-the-art methods.
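The final step, regularized linear regression from model features to measurements, has a simple closed form. A sketch with synthetic data (the feature and measurement values are fabricated for illustration; only the regression machinery mirrors the paper's last stage):

```python
import numpy as np

def fit_ridge(F, Y, lam=1e-3):
    """Regularized linear regression from body-model features F
    (limb lengths, circumferences, shape coefficients, ...) to
    tailoring measurements Y. Closed form with a bias column:
        W = (A^T A + lam * I)^{-1} A^T Y,  A = [F, 1]."""
    A = np.hstack([F, np.ones((F.shape[0], 1))])
    return np.linalg.solve(A.T @ A + lam * np.eye(A.shape[1]), A.T @ Y)

def predict(W, F):
    A = np.hstack([F, np.ones((F.shape[0], 1))])
    return A @ W

# Toy example: measurements that are a fixed linear map of features.
rng = np.random.default_rng(0)
F = rng.normal(size=(50, 4))                 # 4 features, 50 "scans"
true_W = np.array([[1.0], [0.5], [-0.2], [2.0]])
Y = F @ true_W + 0.1                          # one measurement + bias
W = fit_ridge(F, Y)
Y_hat = predict(W, F)
```

Because the features come from the registered model rather than the raw scan, the same regression applies regardless of holes or noise in the acquisition.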

pdf DOI Project Page Project Page [BibTex]



Evaluation of feature-based 3-d registration of probabilistic volumetric scenes

Restrepo, M. I., Ulusoy, A. O., Mundy, J. L.

ISPRS Journal of Photogrammetry and Remote Sensing, 98(0):1-18, 2014 (article)

Abstract
Automatic estimation of the world surfaces from aerial images has seen much attention and progress in recent years. Among current modeling technologies, probabilistic volumetric models (PVMs) have evolved as an alternative representation that can learn geometry and appearance in a dense and probabilistic manner. Recent progress, in terms of storage and speed, achieved in the area of volumetric modeling, opens the opportunity to develop new frameworks that make use of the PVM to pursue the ultimate goal of creating an entire map of the earth, where one can reason about the semantics and dynamics of the 3-d world. Aligning 3-d models collected at different time-instances constitutes an important step for successful fusion of large spatio-temporal information. This paper evaluates how effectively probabilistic volumetric models can be aligned using robust feature-matching techniques, while considering different scenarios that reflect the kind of variability observed across aerial video collections from different time instances. More precisely, this work investigates variability in terms of discretization, resolution and sampling density, errors in the camera orientation, and changes in illumination and geographic characteristics. All results are given for large-scale, outdoor sites. In order to facilitate the comparison of the registration performance of PVMs to that of other 3-d reconstruction techniques, the registration pipeline is also carried out using the Patch-based Multi-View Stereo (PMVS) algorithm. Registration performance is similar for scenes that have favorable geometry and the appearance characteristics necessary for high quality reconstruction. In scenes containing trees, such as a park, or many buildings, such as a city center, registration performance is significantly more accurate when using the PVM.

Publisher site link (url) DOI [BibTex]



Left Ventricle Segmentation by Dynamic Shape Constrained Random Walk

Yang, X., Su, Y., Wan, M., Yeo, S. Y., Lim, C., Wong, S. T., Zhong, L., Tan, R. S.

In Annual International Conference of the IEEE Engineering in Medicine and Biology Society, 2014 (inproceedings)

Abstract
Accurate and robust extraction of the left ventricle (LV) cavity is a key step for quantitative analysis of cardiac functions. In this study, we propose an improved LV cavity segmentation method that incorporates a dynamic shape constraint into the weighting function of the random walks algorithm. The method involves an iterative process that updates an intermediate result to the desired solution. The shape constraint restricts the solution space of the segmentation result, such that the robustness of the algorithm is increased to handle misleading information that emanates from noise, weak boundaries, and clutter. Our experiments on real cardiac magnetic resonance images demonstrate that the proposed method obtains better segmentation performance than the standard method.

[BibTex]


2009


Ball Joints for Marker-less Human Motion Capture

Pons-Moll, G., Rosenhahn, B.

In IEEE Workshop on Applications of Computer Vision (WACV), December 2009 (inproceedings)

pdf [BibTex]



Background Subtraction Based on Rank Constraint for Point Trajectories

Ahmad, A., Del Bue, A., Lima, P.

In pages: 1-3, October 2009 (inproceedings)

Abstract
This work deals with a background subtraction algorithm for a fish-eye lens camera having 3 degrees of freedom, 2 in translation and 1 in rotation. The core assumption in this algorithm is that the background is composed of a dominant static plane in the world frame. The novelty lies in developing a rank-constraint based background subtraction for the equidistant projection model, a property of the fish-eye lens. Detailed simulation results are presented to support the hypotheses explained in this paper.
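The rank constraint can be illustrated under a simplified affine-camera assumption (not the paper's equidistant-projection derivation): trajectories of points on the dominant plane stack into a low-rank measurement matrix, so foreground points stand out by their residual to the best low-rank fit. A hedged sketch:

```python
import numpy as np

def trajectory_residuals(W, rank=3):
    """Residual of each point trajectory w.r.t. the best rank-k
    approximation of the 2F x P measurement matrix W. Trajectories
    on the dominant (background) structure fit the low-rank model;
    independently moving points do not. Affine-camera illustration
    only."""
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    W_k = (U[:, :rank] * s[:rank]) @ Vt[:rank]
    return np.linalg.norm(W - W_k, axis=0)

# Synthetic data: 9 background trajectories spanning a rank-3
# subspace (10 frames, x/y stacked), plus one independent foreground
# point that does not lie in that subspace.
rng = np.random.default_rng(0)
basis = rng.normal(size=(20, 3))
W_bg = basis @ rng.normal(size=(3, 9))
fg = rng.normal(size=(20, 1))
W = np.hstack([W_bg, fg])
res = trajectory_residuals(W, rank=3)   # largest residual = foreground
```

Thresholding these residuals is then a per-trajectory background/foreground decision that needs no appearance model at all.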

link (url) [BibTex]



Parametric Modeling of the Beating Heart with Respiratory Motion Extracted from Magnetic Resonance Images

Pons-Moll, G., Crosas, C., Tadmor, G., MacLeod, R., Rosenhahn, B., Brooks, D.

In IEEE Computers in Cardiology (CINC), September 2009 (inproceedings)

[BibTex]



Computer cursor control by motor cortical signals in humans with tetraplegia

Kim, S., Simeral, J. D., Hochberg, L. R., Donoghue, J. P., Black, M. J.

In 7th Asian Control Conference, ASCC09, pages: 988-993, Hong Kong, China, August 2009 (inproceedings)

pdf [BibTex]



Classification of colon polyps in NBI endoscopy using vascularization features

Stehle, T., Auer, R., Gross, S., Behrens, A., Wulff, J., Aach, T., Winograd, R., Trautwein, C., Tischendorf, J.

In Medical Imaging 2009: Computer-Aided Diagnosis, 7260, (Editors: N. Karssemeijer and M. L. Giger), SPIE, February 2009 (inproceedings)

Abstract
The evolution of colon cancer starts with colon polyps. There are two different types of colon polyps, namely hyperplasias and adenomas. Hyperplasias are benign polyps which are known not to evolve into cancer and, therefore, do not need to be removed. By contrast, adenomas have a strong tendency to become malignant. Therefore, they have to be removed immediately via polypectomy. For this reason, a method to differentiate reliably adenomas from hyperplasias during a preventive medical endoscopy of the colon (colonoscopy) is highly desirable. A recent study has shown that it is possible to distinguish both types of polyps visually by means of their vascularization. Adenomas exhibit a large amount of blood vessel capillaries on their surface whereas hyperplasias show only few of them. In this paper, we show the feasibility of computer-based classification of colon polyps using vascularization features. The proposed classification algorithm consists of several steps: For the critical part of vessel segmentation, we implemented and compared two segmentation algorithms. After a skeletonization of the detected blood vessel candidates, we used the results as seed points for the Fast Marching algorithm which is used to segment the whole vessel lumen. Subsequently, features are computed from this segmentation which are then used to classify the polyps. In leave-one-out tests on our polyp database (56 polyps), we achieve a correct classification rate of approximately 90%.
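The evaluation protocol mentioned at the end, leave-one-out testing over the polyp database, can be sketched generically. The classifier below (nearest centroid) and the synthetic "vascularization features" are stand-ins for the paper's actual features and classifier; only the hold-one-out loop mirrors the evaluation:

```python
import numpy as np

def loo_accuracy(X, y):
    """Leave-one-out test: hold out each sample, train a
    nearest-centroid classifier on the rest, and classify the
    held-out sample."""
    X, y = np.asarray(X, float), np.asarray(y)
    correct = 0
    for i in range(len(y)):
        mask = np.arange(len(y)) != i
        Xtr, ytr = X[mask], y[mask]
        cents = {c: Xtr[ytr == c].mean(axis=0) for c in np.unique(ytr)}
        pred = min(cents, key=lambda c: np.linalg.norm(X[i] - cents[c]))
        correct += pred == y[i]
    return correct / len(y)

# Toy vessel features: adenomas (label 1) show many capillaries,
# hyperplasias (label 0) show few.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.2, 0.05, (20, 2)),
               rng.normal(0.8, 0.05, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
acc = loo_accuracy(X, y)
```

Leave-one-out is the natural choice for a database of only 56 polyps, since it uses all but one sample for training in every fold.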

DOI [BibTex]



One-shot scanning using De Bruijn spaced grids

Ulusoy, A., Calakli, F., Taubin, G.

In IEEE 12th International Conference on Computer Vision Workshops (ICCV Workshops), pages: 1786-1792, IEEE, 2009 (inproceedings)

Abstract
In this paper we present a new one-shot method to reconstruct the shape of dynamic 3D objects and scenes based on active illumination. In common with other related prior-art methods, a static grid pattern is projected onto the scene, a video sequence of the illuminated scene is captured, a shape estimate is produced independently for each video frame, and the one-shot property is realized at the expense of space resolution. The main challenge in grid-based one-shot methods is to engineer the pattern and algorithms so that the correspondence between pattern grid points and their images can be established very fast and without uncertainty. We present an efficient one-shot method which exploits simple geometric constraints to solve the correspondence problem. We also introduce De Bruijn spaced grids, a novel grid pattern, and show with strong empirical data that the resulting scheme is much more robust compared to those based on uniform spaced grids.
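The property that makes a De Bruijn sequence useful here is that every length-n window is unique, so a small local neighbourhood of grid lines identifies its position in the projected pattern and resolves the correspondence problem. The standard FKM (Lyndon-word) construction is short enough to show in full:

```python
def de_bruijn(k, n):
    """De Bruijn sequence B(k, n): a cyclic sequence over k symbols
    in which every length-n word appears exactly once. Standard FKM
    (Lyndon-word concatenation) construction."""
    a = [0] * (k * n)
    seq = []
    def db(t, p):
        if t > n:
            if n % p == 0:
                seq.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    return seq

s = de_bruijn(2, 3)                              # -> 00010111
# Every length-3 window of the cyclic sequence is distinct:
windows = {tuple((s + s)[i:i + 3]) for i in range(len(s))}
```

In the paper's setting the symbols encode spacings between grid lines rather than colors, which keeps the pattern robust to the scene's reflectance.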

pdf link (url) DOI [BibTex]



Estimating human shape and pose from a single image

Guan, P., Weiss, A., Balan, A., Black, M. J.

In Int. Conf. on Computer Vision, ICCV, pages: 1381-1388, 2009 (inproceedings)

pdf video - mov 25MB video - mp4 10MB YouTube Project Page [BibTex]



On feature combination for multiclass object classification

Gehler, P., Nowozin, S.

In Proceedings of the Twelfth IEEE International Conference on Computer Vision, pages: 221-228, 2009, oral presentation (inproceedings)

project page, code, data GoogleScholar pdf DOI [BibTex]



Segmentation, Ordering and Multi-object Tracking Using Graphical Models

Wang, C., de la Gorce, M., Paragios, N.

In IEEE International Conference on Computer Vision (ICCV), 2009 (inproceedings)

pdf [BibTex]



Evaluating the potential of primary motor and premotor cortex for multidimensional neuroprosthetic control of complete reaching and grasping actions

Vargas-Irwin, C. E., Yadollahpour, P., Shakhnarovich, G., Black, M. J., Donoghue, J. P.

2009 Abstract Viewer and Itinerary Planner, Society for Neuroscience, 2009, Online (conference)

[BibTex]



Modeling and Evaluation of Human-to-Robot Mapping of Grasps

Romero, J., Kjellström, H., Kragic, D.

In International Conference on Advanced Robotics (ICAR), pages: 1-6, 2009 (inproceedings)

Pdf [BibTex]



An additive latent feature model for transparent object recognition

Fritz, M., Black, M., Bradski, G., Karayev, S., Darrell, T.

In Advances in Neural Information Processing Systems 22, NIPS, pages: 558-566, MIT Press, 2009 (inproceedings)

pdf slides [BibTex]



Let the kernel figure it out; Principled learning of pre-processing for kernel classifiers

Gehler, P., Nowozin, S.

In Proceedings of the Conference on Computer Vision and Pattern Recognition (CVPR), pages: 2836-2843, IEEE Computer Society, 2009 (inproceedings)

doi project page pdf [BibTex]



Monocular Real-Time 3D Articulated Hand Pose Estimation

Romero, J., Kjellström, H., Kragic, D.

In IEEE-RAS International Conference on Humanoid Robots, pages: 87-92, 2009 (inproceedings)

Pdf [BibTex]



Grasp Recognition and Mapping on Humanoid Robots

Do, M., Romero, J., Kjellström, H., Azad, P., Asfour, T., Kragic, D., Dillmann, R.

In IEEE-RAS International Conference on Humanoid Robots, pages: 465-471, 2009 (inproceedings)

Pdf Video [BibTex]



4D Cardiac Segmentation of the Epicardium and Left Ventricle

Pons-Moll, G., Tadmor, G., MacLeod, R. S., Rosenhahn, B., Brooks, D. H.

In World Congress of Medical Physics and Biomedical Engineering (WC), 2009 (inproceedings)

[BibTex]



Geometric Potential Force for the Deformable Model

Yeo, S. Y., Xie, X., Sazonov, I., Nithiarasu, P.

In The 20th British Machine Vision Conference, pages: 1-11, 2009 (inproceedings)

Abstract
We propose a new external force field for deformable models which can be conveniently generalized to high dimensions. The external force field is based on hypothesized interactions between the relative geometries of the deformable model and image gradients. The evolution of the deformable model is solved using the level set method. The dynamic interaction forces between the geometries can greatly improve the deformable model performance in acquiring complex geometries and highly concave boundaries, and in dealing with weak image edges. The new deformable model can handle arbitrary cross-boundary initializations. Here, we show that the proposed method achieves significant improvements when compared against existing state-of-the-art techniques.

[BibTex]



Level Set Based Automatic Segmentation of Human Aorta

Yeo, S. Y., Xie, X., Sazonov, I., Nithiarasu, P.

In International Conference on Computational & Mathematical Biomedical Engineering, pages: 242-245, 2009 (inproceedings)

[BibTex]



In Defense of Orthonormality Constraints for Nonrigid Structure from Motion

Akhter, I., Sheikh, Y., Khan, S.

In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages: 2447-2453, 2009 (inproceedings)

Abstract
In factorization approaches to nonrigid structure from motion, the 3D shape of a deforming object is usually modeled as a linear combination of a small number of basis shapes. The original approach to simultaneously estimate the shape basis and nonrigid structure exploited orthonormality constraints for metric rectification. Recently, it has been asserted that structure recovery through orthonormality constraints alone is inherently ambiguous and cannot result in a unique solution. This assertion has been accepted as conventional wisdom and is the justification of many remedial heuristics in literature. Our key contribution is to prove that orthonormality constraints are in fact sufficient to recover the 3D structure from image observations alone. We characterize the true nature of the ambiguity in using orthonormality constraints for the shape basis and show that it has no impact on structure reconstruction. We conclude from our experimentation that the primary challenge in using shape basis for nonrigid structure from motion is the difficulty in the optimization problem rather than the ambiguity in orthonormality constraints.

pdf [BibTex]

Dynamic distortion correction for endoscopy systems with exchangeable optics

Stehle, T., Hennes, M., Gross, S., Behrens, A., Wulff, J., Aach, T.

In Bildverarbeitung für die Medizin 2009, pages: 142-146, Springer Berlin Heidelberg, 2009 (inproceedings)

Abstract
Endoscopic images are strongly affected by lens distortion caused by the use of wide-angle lenses. In endoscopy systems with exchangeable optics, e.g. in bladder endoscopy or sinus endoscopy, the camera sensor and the optics do not form a rigid system but can be shifted and rotated with respect to each other during an examination. This flexibility has a major impact on the location of the distortion centre, as it moves along with the optics. In this paper, we describe an algorithm for the dynamic correction of lens distortion in cystoscopy which is based on a one-time calibration. For the compensation, we combine a conventional static method for distortion correction with an algorithm that detects the position and orientation of the elliptic field of view. This enables us to estimate the position of the distortion centre according to the relative movement of camera and optics. Distortion correction for arbitrary rotation angles and shifts thus becomes possible without performing static calibrations for every possible combination of shifts and angles beforehand.
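The correction step can be sketched with a simple one-parameter radial model about a movable distortion centre. This is an illustrative stand-in, not the paper's calibration procedure: the coefficient k1, the given centre, and the fixed-point inversion are all assumptions.

```python
import numpy as np

def distort(pts, center, k1):
    """Forward barrel distortion about the current distortion centre:
    a one-parameter polynomial radial model (illustrative only)."""
    v = pts - center
    r2 = np.sum(v**2, axis=1, keepdims=True)
    return center + v * (1.0 + k1 * r2)

def undistort(pts, center, k1, iters=25):
    """Invert the radial model by fixed-point iteration on the
    undistorted offset v = v_d / (1 + k1 * |v|^2)."""
    v_d = pts - center
    v = v_d.copy()
    for _ in range(iters):
        r2 = np.sum(v**2, axis=1, keepdims=True)
        v = v_d / (1.0 + k1 * r2)
    return center + v

# The distortion centre moves with the exchangeable optics; here it is
# simply given, whereas the paper estimates it per frame from the
# elliptic field of view.
center = np.array([5.0, -3.0])
k1 = 1e-4
true_pts = np.array([[50.0, 40.0], [-30.0, 25.0], [10.0, -60.0]])
distorted = distort(true_pts, center, k1)
recovered = undistort(distorted, center, k1)
```

The key point the paper exploits is that only `center` changes as the optics rotate and shift, so the static radial calibration (here `k1`) can be reused once the centre is re-estimated.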

link (url) DOI [BibTex]

Computational mechanisms for the recognition of time sequences of images in the visual cortex

Tan, C., Jhuang, H., Singer, J., Serre, T., Sheinberg, D., Poggio, T.

Society for Neuroscience, 2009 (conference)

pdf [BibTex]

Interactive Inverse Kinematics for Monocular Motion Estimation

Engell-Norregaard, M., Hauberg, S., Lapuyade, J., Erleben, K., Pedersen, K. S.

In The 6th Workshop on Virtual Reality Interaction and Physical Simulation (VRIPHYS), 2009 (inproceedings)

Conference site Paper site [BibTex]

A Comprehensive Grasp Taxonomy

Feix, T., Pawlik, R., Schmiedmayer, H., Romero, J., Kragic, D.

In Robotics, Science and Systems: Workshop on Understanding the Human Hand for Advancing Robotic Manipulation, 2009 (inproceedings)

Pdf [BibTex]

Population coding of ground truth motion in natural scenes in the early visual system

Stanley, G., Black, M. J., Lewis, J., Desbordes, G., Jin, J., Alonso, J.

COSYNE, 2009 (conference)

[BibTex]

Segmentation of Human Upper Airway Using a Level Set Based Deformable Model

Yeo, S. Y., Xie, X., Sazonov, I., Nithiarasu, P.

In The 13th Medical Image Understanding and Analysis, 2009 (inproceedings)

[BibTex]

Three Dimensional Monocular Human Motion Analysis in End-Effector Space

Hauberg, S., Lapuyade, J., Engell-Norregaard, M., Erleben, K., Pedersen, K. S.

In Energy Minimization Methods in Computer Vision and Pattern Recognition, 5681, pages: 235-248, Lecture Notes in Computer Science, (Editors: Cremers, Daniel and Boykov, Yuri and Blake, Andrew and Schmidt, Frank), Springer Berlin Heidelberg, 2009 (inproceedings)

Publishers site Paper site PDF [BibTex]

Decoding visual motion from correlated firing of thalamic neurons

Stanley, G. B., Black, M. J., Desbordes, G., Jin, J., Wang, Y., Alonso, J.

2009 Abstract Viewer and Itinerary Planner, Society for Neuroscience, 2009 (conference)

[BibTex]


1999


Edges as outliers: Anisotropic smoothing using local image statistics

Black, M. J., Sapiro, G.

In Scale-Space Theories in Computer Vision, Second Int. Conf., Scale-Space ’99, pages: 259-270, LNCS 1682, Springer, Corfu, Greece, September 1999 (inproceedings)

Abstract
Edges are viewed as statistical outliers with respect to local image gradient magnitudes. Within local image regions we compute a robust statistical measure of the gradient variation and use this in an anisotropic diffusion framework to determine a spatially varying "edge-stopping" parameter σ. We show how to determine this parameter for two edge-stopping functions described in the literature (Perona-Malik and the Tukey biweight). Smoothing of the image is related to the local texture: in regions of low texture, small gradient values may be treated as edges, whereas in regions of high texture, large gradient magnitudes are necessary before an edge is preserved. Intuitively, these results have similarities with human perceptual phenomena such as masking and "pop-out". Results are shown on a variety of standard images.
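A minimal sketch of the idea, assuming the Perona-Malik edge-stopping function with σ set from a robust (MAD-based) scale of the gradient magnitudes. The paper estimates σ within local regions and also treats the Tukey biweight; the single global estimate and the multiplier below are simplifications:

```python
import numpy as np

def robust_sigma(grad_mag, scale=5.0):
    """Edge-stopping parameter from a robust (MAD) estimate of gradient
    variation; the multiplier `scale` is an illustrative choice."""
    med = np.median(grad_mag)
    return scale * 1.4826 * np.median(np.abs(grad_mag - med)) + 1e-8

def diffusion_step(img, dt=0.15):
    """One explicit anisotropic diffusion step with the Perona-Malik
    stopping function; sigma is estimated from the image itself
    (globally here, locally in the paper)."""
    n = np.roll(img, -1, 0) - img
    s = np.roll(img, 1, 0) - img
    e = np.roll(img, -1, 1) - img
    w = np.roll(img, 1, 1) - img
    sigma = robust_sigma(np.abs(np.stack([n, s, e, w])))
    g = lambda d: np.exp(-(d / sigma) ** 2)   # small g at outlier gradients = edges
    return img + dt * (g(np.abs(n)) * n + g(np.abs(s)) * s +
                       g(np.abs(e)) * e + g(np.abs(w)) * w)

# Noisy step edge: flat regions are smoothed, the edge survives because
# its gradient is an outlier relative to the robust scale.
rng = np.random.default_rng(0)
img = np.zeros((32, 32))
img[:, 16:] = 1.0
img += 0.05 * rng.standard_normal(img.shape)
out = img
for _ in range(20):
    out = diffusion_step(out)
```

Noise-scale gradients fall inside the robust scale and are diffused, while the step edge's large gradient is treated as an outlier and left intact.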

pdf [BibTex]



Probabilistic detection and tracking of motion discontinuities

(Marr Prize, Honorable Mention)

Black, M. J., Fleet, D. J.

In Int. Conf. on Computer Vision, ICCV-99, pages: 551-558, ICCV, Corfu, Greece, September 1999 (inproceedings)

pdf [BibTex]

Explaining optical flow events with parameterized spatio-temporal models

Black, M. J.

In IEEE Proc. Computer Vision and Pattern Recognition, CVPR’99, pages: 326-332, IEEE, Fort Collins, CO, 1999 (inproceedings)

pdf video [BibTex]


1996


Cardboard people: A parameterized model of articulated motion

Ju, S. X., Black, M. J., Yacoob, Y.

In 2nd Int. Conf. on Automatic Face- and Gesture-Recognition, pages: 38-44, Killington, Vermont, October 1996 (inproceedings)

Abstract
We extend the work of Black and Yacoob on the tracking and recognition of human facial expressions using parameterized models of optical flow to deal with the articulated motion of human limbs. We define a "cardboard person model" in which a person's limbs are represented by a set of connected planar patches. The parameterized image motion of these patches is constrained to enforce articulated motion and is solved for directly using a robust estimation technique. The recovered motion parameters provide a rich and concise description of the activity that can be used for recognition. We propose a method for performing view-based recognition of human activities from the optical flow parameters that extends previous methods to cope with the cyclical nature of human motion. We illustrate the method with examples of tracking human legs over long image sequences.
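The parameterized patch motion can be illustrated with the six-parameter affine flow model. The sketch below fits the parameters to a synthetic flow field by plain least squares; the paper instead estimates them directly from image brightness with a robust estimator, so the data and fitting route here are assumptions for illustration:

```python
import numpy as np

def affine_flow(params, x, y):
    """u = a0 + a1*x + a2*y, v = a3 + a4*x + a5*y: the parametric
    motion of one planar limb patch."""
    a0, a1, a2, a3, a4, a5 = params
    return a0 + a1 * x + a2 * y, a3 + a4 * x + a5 * y

def fit_affine(x, y, u, v):
    """Least-squares fit of the six parameters to an observed flow field."""
    A = np.stack([np.ones_like(x), x, y], axis=1)
    a_u, *_ = np.linalg.lstsq(A, u, rcond=None)
    a_v, *_ = np.linalg.lstsq(A, v, rcond=None)
    return np.concatenate([a_u, a_v])

# Synthetic patch undergoing translation plus a small rotation.
xs, ys = np.meshgrid(np.arange(10.0), np.arange(10.0))
x, y = xs.ravel(), ys.ravel()
true = np.array([0.5, 0.0, -0.05, -0.2, 0.05, 0.0])
u, v = affine_flow(true, x, y)
est = fit_affine(x, y, u, v)
```

The six recovered numbers are the kind of compact per-patch description the paper then feeds to activity recognition.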

pdf [BibTex]

Skin and Bones: Multi-layer, locally affine, optical flow and regularization with transparency

(Nominated: Best paper)

Ju, S., Black, M. J., Jepson, A. D.

In IEEE Conf. on Computer Vision and Pattern Recognition, CVPR’96, pages: 307-314, San Francisco, CA, June 1996 (inproceedings)

pdf [BibTex]

EigenTracking: Robust matching and tracking of articulated objects using a view-based representation

Black, M. J., Jepson, A.

In Proc. Fourth European Conf. on Computer Vision, ECCV’96, pages: 329-342, LNCS 1064, Springer Verlag, Cambridge, England, April 1996 (inproceedings)

pdf video [BibTex]


1993


Mixture models for optical flow computation

Jepson, A., Black, M.

In IEEE Conf. on Computer Vision and Pattern Recognition, CVPR-93, pages: 760-761, New York, NY, June 1993 (inproceedings)

pdf abstract tech report [BibTex]

A framework for the robust estimation of optical flow

(Helmholtz Prize)

Black, M. J., Anandan, P.

In Fourth International Conf. on Computer Vision, ICCV-93, pages: 231-236, Berlin, Germany, May 1993 (inproceedings)

Abstract
Most approaches for estimating optical flow assume that, within a finite image region, only a single motion is present. This single motion assumption is violated in common situations involving transparency, depth discontinuities, independently moving objects, shadows, and specular reflections. To robustly estimate optical flow, the single motion assumption must be relaxed. This work describes a framework based on robust estimation that addresses violations of the brightness constancy and spatial smoothness assumptions caused by multiple motions. We show how the robust estimation framework can be applied to standard formulations of the optical flow problem thus reducing their sensitivity to violations of their underlying assumptions. The approach has been applied to three standard techniques for recovering optical flow: area-based regression, correlation, and regularization with motion discontinuities. This work focuses on the recovery of multiple parametric motion models within a region as well as the recovery of piecewise-smooth flow fields and provides examples with natural and synthetic image sequences.
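The core idea can be sketched in one dimension: when displacements come from two motions, a quadratic penalty averages them, while a robust penalty (here the Lorentzian), minimized by iteratively reweighted least squares, locks onto the dominant motion. The mixture proportions and σ below are illustrative choices, not values from the paper:

```python
import numpy as np

def lorentzian_weight(r, sigma):
    """IRLS weight rho'(r)/r for the Lorentzian rho(r) = log(1 + (r/sigma)^2 / 2)."""
    return 1.0 / (1.0 + 0.5 * (r / sigma) ** 2)

def robust_mean(d, sigma=0.5, iters=50):
    """Dominant translation via iteratively reweighted least squares."""
    u = np.median(d)                      # robust initialisation
    for _ in range(iters):
        w = lorentzian_weight(d - u, sigma)
        u = np.sum(w * d) / np.sum(w)
    return u

rng = np.random.default_rng(1)
# 70% of displacements follow the dominant motion (+2), 30% a second motion (-5).
d = np.concatenate([2.0 + 0.1 * rng.standard_normal(70),
                    -5.0 + 0.1 * rng.standard_normal(30)])
ls = d.mean()          # quadratic penalty: dragged between the two motions
rob = robust_mean(d)   # robust penalty: stays with the dominant motion
```

The residuals of the second motion get tiny IRLS weights, which is exactly how relaxing the single-motion assumption lets one parametric model survive transparency or occlusion boundaries.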

pdf video abstract code [BibTex]

Action, representation, and purpose: Re-evaluating the foundations of computational vision

Black, M. J., Aloimonos, Y., Brown, C. M., Horswill, I., Malik, J., Sandini, G., Tarr, M. J.

In International Joint Conference on Artificial Intelligence, IJCAI-93, pages: 1661-1666, Chambery, France, 1993 (inproceedings)

pdf [BibTex]
