Header logo is ps


2015


Thumb xl silvia phd
Shape Models of the Human Body for Distributed Inference

Zuffi, S.

Brown University, May 2015 (phdthesis)

Abstract
In this thesis we address the problem of building shape models of the human body, in 2D and 3D, which are realistic and efficient to use. We focus our efforts on the human body, which is highly articulated and has interesting shape variations, but the approaches we present here can be applied to generic deformable and articulated objects. To address efficiency, we constrain our models to be part-based and have a tree-structured representation with pairwise relationships between connected parts. This allows the application of methods for distributed inference based on message passing. To address realism, we exploit recent advances in computer graphics that represent the human body with statistical shape models learned from 3D scans. We introduce two articulated body models, a 2D model, named Deformable Structures (DS), which is a contour-based model parameterized for 2D pose and projected shape, and a 3D model, named Stitchable Puppet (SP), which is a mesh-based model parameterized for 3D pose, pose-dependent deformations and intrinsic body shape. We have successfully applied the models to interesting and challenging problems in computer vision and computer graphics, namely pose estimation from static images, pose estimation from video sequences, pose and shape estimation from 3D scan data. This advances the state of the art in human pose and shape estimation and suggests that carefully de ned realistic models can be important for computer vision. More work at the intersection of vision and graphics is thus encouraged.

PDF [BibTex]


Thumb xl th teaser
From Scans to Models: Registration of 3D Human Shapes Exploiting Texture Information

Bogo, F.

University of Padova, March 2015 (phdthesis)

Abstract
New scanning technologies are increasing the importance of 3D mesh data, and of algorithms that can reliably register meshes obtained from multiple scans. Surface registration is important e.g. for building full 3D models from partial scans, identifying and tracking objects in a 3D scene, creating statistical shape models. Human body registration is particularly important for many applications, ranging from biomedicine and robotics to the production of movies and video games; but obtaining accurate and reliable registrations is challenging, given the articulated, non-rigidly deformable structure of the human body. In this thesis, we tackle the problem of 3D human body registration. We start by analyzing the current state of the art, and find that: a) most registration techniques rely only on geometric information, which is ambiguous on flat surface areas; b) there is a lack of adequate datasets and benchmarks in the field. We address both issues. Our contribution is threefold. First, we present a model-based registration technique for human meshes that combines geometry and surface texture information to provide highly accurate mesh-to-mesh correspondences. Our approach estimates scene lighting and surface albedo, and uses the albedo to construct a high-resolution textured 3D body model that is brought into registration with multi-camera image data using a robust matching term. Second, by leveraging our technique, we present FAUST (Fine Alignment Using Scan Texture), a novel dataset collecting 300 high-resolution scans of 10 people in a wide range of poses. FAUST is the first dataset providing both real scans and automatically computed, reliable "ground-truth" correspondences between them. Third, we explore possible uses of our approach in dermatology. By combining our registration technique with a melanocytic lesion segmentation algorithm, we propose a system that automatically detects new or evolving lesions over almost the entire body surface, thus helping dermatologists identify potential melanomas. We conclude this thesis investigating the benefits of using texture information to establish frame-to-frame correspondences in dynamic monocular sequences captured with consumer depth cameras. We outline a novel approach to reconstruct realistic body shape and appearance models from dynamic human performances, and show preliminary results on challenging sequences captured with a Kinect.

[BibTex]


Thumb xl thesis teaser
Long Range Motion Estimation and Applications

Sevilla-Lara, L.

Long Range Motion Estimation and Applications, University of Massachusetts Amherst, University of Massachusetts Amherst, Febuary 2015 (phdthesis)

Abstract
Finding correspondences between images underlies many computer vision problems, such as optical flow, tracking, stereovision and alignment. Finding these correspondences involves formulating a matching function and optimizing it. This optimization process is often gradient descent, which avoids exhaustive search, but relies on the assumption of being in the basin of attraction of the right local minimum. This is often the case when the displacement is small, and current methods obtain very accurate results for small motions. However, when the motion is large and the matching function is bumpy this assumption is less likely to be true. One traditional way of avoiding this abruptness is to smooth the matching function spatially by blurring the images. As the displacement becomes larger, the amount of blur required to smooth the matching function becomes also larger. This averaging of pixels leads to a loss of detail in the image. Therefore, there is a trade-off between the size of the objects that can be tracked and the displacement that can be captured. In this thesis we address the basic problem of increasing the size of the basin of attraction in a matching function. We use an image descriptor called distribution fields (DFs). By blurring the images in DF space instead of in pixel space, we in- crease the size of the basin attraction with respect to traditional methods. We show competitive results using DFs both in object tracking and optical flow. Finally we demonstrate an application of capturing large motions for temporal video stitching.

[BibTex]

[BibTex]

2013


Thumb xl implied flow whue
Puppet Flow

Zuffi, S., Black, M. J.

(7), Max Planck Institute for Intelligent Systems, October 2013 (techreport)

Abstract
We introduce Puppet Flow (PF), a layered model describing the optical flow of a person in a video sequence. We consider video frames composed by two layers: a foreground layer corresponding to a person, and background. We model the background as an affine flow field. The foreground layer, being a moving person, requires reasoning about the articulated nature of the human body. We thus represent the foreground layer with the Deformable Structures model (DS), a parametrized 2D part-based human body representation. We call the motion field defined through articulated motion and deformation of the DS model, a Puppet Flow. By exploiting the DS representation, Puppet Flow is a parametrized optical flow field, where parameters are the person's pose, gender and body shape.

pdf Project Page Project Page [BibTex]

2013

pdf Project Page Project Page [BibTex]


no image
D2.1.4 RoCKIn@Work - Innovation in Mobile Industrial Manipulation Competition Design, Rule Book, and Scenario Construction

Ahmad, A., Awaad, I., Amigoni, F., Berghofer, J., Bischoff, R., Bonarini, A., Dwiputra, R., Hegger, F., Hochgeschwender, N., Iocchi, L., Kraetzschmar, G., Lima, P., Matteucci, M., Nardi, D., Schneider, S.

(FP7-ICT-601012 Revision 0.7), RoCKIn - Robot Competitions Kick Innovation in Cognitive Systems and Robotics, sep 2013 (techreport)

Abstract
RoCKIn is a EU-funded project aiming to foster scientific progress and innovation in cognitive systems and robotics through the design and implementation of competitions. An additional objective of RoCKIn is to increase public awareness of the current state-of-the-art in robotics in Europe and to demonstrate the innovation potential of robotics applications for solving societal challenges and improving the competitiveness of Europe in the global markets. In order to achieve these objectives, RoCKIn develops two competitions, one for domestic service robots (RoCKIn@Home) and one for industrial robots in factories (RoCKIn-@Work). These competitions are designed around challenges that are based on easy-to-communicate and convincing user stories, which catch the interest of both the general public and the scientifc community. The latter is in particular interested in solving open scientific challenges and to thoroughly assess, compare, and evaluate the developed approaches with competing ones. To allow this to happen, the competitions are designed to meet the requirements of benchmarking procedures and good experimental methods. The integration of benchmarking technology with the competition concept is one of the main objectives of RoCKIn. This document describes the first version of the RoCKIn@Work competition, which will be held for the first time in 2014. The first chapter of the document gives a brief overview, outlining the purpose and objective of the competition, the methodological approach taken by the RoCKIn project, the user story upon which the competition is based, the structure and organization of the competition, and the commonalities and differences with the RoboCup@Work competition, which served as inspiration for RoCKIn@Work. The second chapter provides details on the user story and analyzes the scientific and technical challenges it poses. Consecutive chapters detail the competition scenario, the competition design, and the organization of the competition. The appendices contain information on a library of functionalities, which we believe are needed, or at least useful, for building competition entries, details on the scenario construction, and a detailed account of the benchmarking infrastructure needed — and provided by RoCKIn.

[BibTex]

[BibTex]


no image
D2.1.1 RoCKIn@Home - A Competition for Domestic Service Robots Competition Design, Rule Book, and Scenario Construction

Ahmad, A., Awaad, I., Amigoni, F., Berghofer, J., Bischoff, R., Bonarini, A., Dwiputra, R., Hegger, F., Hochgeschwender, N., Iocchi, L., Kraetzschmar, G., Lima, P., Matteucci, M., Nardi, D., Schneider, S.

(FP7-ICT-601012 Revision 0.7), RoCKIn - Robot Competitions Kick Innovation in Cognitive Systems and Robotics, sep 2013 (techreport)

Abstract
RoCKIn is a EU-funded project aiming to foster scientific progress and innovation in cognitive systems and robotics through the design and implementation of competitions. An additional objective of RoCKIn is to increase public awareness of the current state-of-the-art in robotics in Europe and to demonstrate the innovation potential of robotics applications for solving societal challenges and improving the competitiveness of Europe in the global markets. In order to achieve these objectives, RoCKIn develops two competitions, one for domestic service robots (RoCKIn@Home) and one for industrial robots in factories (RoCKIn-@Work). These competitions are designed around challenges that are based on easy-to-communicate and convincing user stories, which catch the interest of both the general public and the scientifc community. The latter is in particular interested in solving open scientific challenges and to thoroughly assess, compare, and evaluate the developed approaches with competing ones. To allow this to happen, the competitions are designed to meet the requirements of benchmarking procedures and good experimental methods. The integration of benchmarking technology with the competition concept is one of the main objectives of RoCKIn. This document describes the first version of the RoCKIn@Home competition, which will be held for the first time in 2014. The first chapter of the document gives a brief overview, outlining the purpose and objective of the competition, the methodological approach taken by the RoCKIn project, the user story upon which the competition is based, the structure and organization of the competition, and the commonalities and differences with the RoboCup@Home competition, which served as inspiration for RoCKIn@Home. The second chapter provides details on the user story and analyzes the scientific and technical challenges it poses. Consecutive chapters detail the competition scenario, the competition design, and the organization of the competition. The appendices contain information on a library of functionalities, which we believe are needed, or at least useful, for building competition entries, details on the scenario construction, and a detailed account of the benchmarking infrastructure needed — and provided by RoCKIn.

[BibTex]

[BibTex]


Thumb xl cover3
Statistics on Manifolds with Applications to Modeling Shape Deformations

Freifeld, O.

Brown University, August 2013 (phdthesis)

Abstract
Statistical models of non-rigid deformable shape have wide application in many fi elds, including computer vision, computer graphics, and biometry. We show that shape deformations are well represented through nonlinear manifolds that are also matrix Lie groups. These pattern-theoretic representations lead to several advantages over other alternatives, including a principled measure of shape dissimilarity and a natural way to compose deformations. Moreover, they enable building models using statistics on manifolds. Consequently, such models are superior to those based on Euclidean representations. We demonstrate this by modeling 2D and 3D human body shape. Shape deformations are only one example of manifold-valued data. More generally, in many computer-vision and machine-learning problems, nonlinear manifold representations arise naturally and provide a powerful alternative to Euclidean representations. Statistics is traditionally concerned with data in a Euclidean space, relying on the linear structure and the distances associated with such a space; this renders it inappropriate for nonlinear spaces. Statistics can, however, be generalized to nonlinear manifolds. Moreover, by respecting the underlying geometry, the statistical models result in not only more e ffective analysis but also consistent synthesis. We go beyond previous work on statistics on manifolds by showing how, even on these curved spaces, problems related to modeling a class from scarce data can be dealt with by leveraging information from related classes residing in di fferent regions of the space. We show the usefulness of our approach with 3D shape deformations. To summarize our main contributions: 1) We de fine a new 2D articulated model -- more expressive than traditional ones -- of deformable human shape that factors body-shape, pose, and camera variations. Its high realism is obtained from training data generated from a detailed 3D model. 2) We defi ne a new manifold-based representation of 3D shape deformations that yields statistical deformable-template models that are better than the current state-of-the- art. 3) We generalize a transfer learning idea from Euclidean spaces to Riemannian manifolds. This work demonstrates the value of modeling manifold-valued data and their statistics explicitly on the manifold. Specifi cally, the methods here provide new tools for shape analysis.

pdf Project Page [BibTex]


no image
D1.1 Specification of General Features of Scenarios and Robots for Benchmarking Through Competitions

Ahmad, A., Awaad, I., Amigoni, F., Berghofer, J., Bischoff, R., Bonarini, A., Dwiputra, R., Fontana, G., Hegger, F., Hochgeschwender, N., Iocchi, L., Kraetzschmar, G., Lima, P., Matteucci, M., Nardi, D., Schiaffonati, V., Schneider, S.

(FP7-ICT-601012 Revision 1.0), RoCKIn - Robot Competitions Kick Innovation in Cognitive Systems and Robotics, July 2013 (techreport)

Abstract
RoCKIn is a EU-funded project aiming to foster scientific progress and innovation in cognitive systems and robotics through the design and implementation of competitions. An additional objective of RoCKIn is to increase public awareness of the current state-of-the-art in robotics and the innovation potential of robotics applications. From these objectives several requirements for the work performed in RoCKIn can be derived: The RoCKIn competitions must start from convincing, easy-to-communicate user stories, that catch the attention of relevant stakeholders, the media, and the crowd. The user stories play the role of a mid- to long-term vision for a competition. Preferably, the user stories address economic, societal, or environmental problems. The RoCKIn competitions must pose open scientific challenges of interest to sufficiently many researchers to attract existing and new teams of robotics researchers for participation in the competition. The competitions need to promise some suitable reward, such as recognition in the scientific community, publicity for a team’s work, awards, or prize money, to justify the effort a team puts into the development of a competition entry. The competitions should be designed in such a way that they reward general, scientifically sound solutions to the challenge problems; such general solutions should score better than approaches that work only in narrowly defined contexts and are considred over-engineered. The challenges motivating the RoCKIn competitions must be broken down into suitable intermediate goals that can be reached with a limited team effort until the next competition and the project duration. The RoCKIn competitions must be well-defined and well-designed, with comprehensive rule books and instructions for the participants in order to guarantee a fair competition. The RoCKIn competitions must integrate competitions with benchmarking in order to provide comprehensive feedback for the teams about the suitability of particular functional modules, their overall architecture, and system integration. This document takes the first steps towards the RoCKIn goals. After outlining our approach, we present several user stories for further discussion within the community. The main objectives of this document are to identify and document relevant scenario features and the tasks and functionalities subject for benchmarking in the competitions.

[BibTex]

[BibTex]


no image
SocRob-MSL 2013 Team Description Paper for Middle Sized League

Messias, J., Ahmad, A., Reis, J., Serafim, M., Lima, P.

17th Annual RoboCup International Symposium 2013, July 2013 (techreport)

Abstract
This paper describes the status of the SocRob MSL robotic soccer team as required by the RoboCup 2013 qualification procedures. The team’s latest scientific and technical developments, since its last participation in RoboCup MSL, include further advances in cooperative perception; novel communication methods for distributed robotics; progressive deployment of the ROS middleware; improved localization through feature tracking and Mixture MCL; novel planning methods based on Petri nets and decision-theoretic frameworks; and hardware developments in ball-handling/kicking devices.

link (url) [BibTex]

link (url) [BibTex]


Thumb xl phd
Probabilistic Models for 3D Urban Scene Understanding from Movable Platforms

Geiger, A.

Karlsruhe Institute of Technology, Karlsruhe Institute of Technology, April 2013 (phdthesis)

Abstract
Visual 3D scene understanding is an important component in autonomous driving and robot navigation. Intelligent vehicles for example often base their decisions on observations obtained from video cameras as they are cheap and easy to employ. Inner-city intersections represent an interesting but also very challenging scenario in this context: The road layout may be very complex and observations are often noisy or even missing due to heavy occlusions. While Highway navigation and autonomous driving on simple and annotated intersections have already been demonstrated successfully, understanding and navigating general inner-city crossings with little prior knowledge remains an unsolved problem. This thesis is a contribution to understanding multi-object traffic scenes from video sequences. All data is provided by a camera system which is mounted on top of the autonomous driving platform AnnieWAY. The proposed probabilistic generative model reasons jointly about the 3D scene layout as well as the 3D location and orientation of objects in the scene. In particular, the scene topology, geometry as well as traffic activities are inferred from short video sequences. The model takes advantage of monocular information in the form of vehicle tracklets, vanishing lines and semantic labels. Additionally, the benefit of stereo features such as 3D scene flow and occupancy grids is investigated. Motivated by the impressive driving capabilities of humans, no further information such as GPS, lidar, radar or map knowledge is required. Experiments conducted on 113 representative intersection sequences show that the developed approach successfully infers the correct layout in a variety of difficult scenarios. To evaluate the importance of each feature cue, experiments with different feature combinations are conducted. Additionally, the proposed method is shown to improve object detection and object orientation estimation performance.

pdf [BibTex]

pdf [BibTex]


Thumb xl jampani 13 thesis
A Study of X-Ray Image Perception for Pneumoconiosis Detection

Jampani, V.

IIIT-Hyderabad, Hyderabad, India, January 2013 (mastersthesis)

Abstract
Pneumoconiosis is an occupational lung disease caused by the inhalation of industrial dust. Despite the increasing safety measures and better work place environments, pneumoconiosis is deemed to be the most common occupational disease in the developing countries like India and China. Screening and assessment of this disease is done through radiological observation of chest x-rays. Several studies have shown the significant inter and intra reader observer variation in the diagnosis of this disease, showing the complexity of the task and importance of the expertise in diagnosis. The present study is aimed at understanding the perceptual and cognitive factors affecting the reading of chest x-rays of pneumoconiosis patients. Understanding these factors helps in developing better image acquisition systems, better training regimen for radiologists and development of better computer aided diagnostic (CAD) systems. We used an eye tracking experiment to study the various factors affecting the assessment of this diffused lung disease. Specifically, we aimed at understanding the role of expertize, contralateral symmetric (CS) information present in chest x-rays on the diagnosis and the eye movements of the observers. We also studied the inter and intra observer fixation consistency along with the role of anatomical and bottom up saliency features in attracting the gaze of observers of different expertize levels, to get better insights into the effect of bottom up and top down visual saliency on the eye movements of observers. The experiment is conducted in a room dedicated to eye tracking experiments. Participants consisting of novices (3), medical students (12), residents (4) and staff radiologists (4) were presented with good quality PA chest X-rays, and were asked to give profusion ratings for each of the 6 lung zones. Image set consisting of 17 normal full chest x-rays and 16 single lung images are shown to the participants in random order. Time of the diagnosis and the eye movements are also recorded using a remote head free eye tracker. Results indicated that Expertise and CS play important roles in the diagnosis of pneumoconiosis. Novices and medical students are slow and inefficient whereas, residents and staff are quick and efficient. A key finding of our study is that the presence of CS information alone does not help improve diagnosis as much as learning how to use the information. This learning appears to be gained from focused training and years of experience. Hence, good training for radiologists and careful observation of each lung zone may improve the quality of diagnostic results. For residents, the eye scanning strategies play an important role in using the CS information present in chest radiographs; however, in staff radiologists, peripheral vision or higher-level cognitive processes seems to play role in using the CS information. There is a reasonably good inter and intra observer fixation consistency suggesting the use of similar viewing strategies. Experience is helping the observers to develop new visual strategies based on the image content so that they can quickly and efficiently assess the disease level. First few fixations seem to be playing an important role in choosing the visual strategy, appropriate for the given image. Both inter-rib and rib regions are given equal importance by the observers. Despite reading of chest x-rays being highly task dependent, bottom up saliency is shown to have played an important role in attracting the fixations of the observers. This role of bottom up saliency seems to be more in lower expertize groups compared to that of higher expertize groups. Both bottom up and top down influence of visual fixations seems to change with time. The relative role of top down and bottom up influences of visual attention is still not completely understood and it remains the part of future work. Based on our experimental results, we have developed an extended saliency model by combining the bottom up saliency and the saliency of lung regions in a chest x-ray. This new saliency model performed significantly better than bottom-up saliency in predicting the gaze of the observers in our experiment. Even though, the model is a simple combination of bottom-up saliency maps and segmented lung masks, this demonstrates that even basic models using simple image features can predict the fixations of the observers to a good accuracy. Experimental analysis suggested that the factors affecting the reading of chest x-rays of pneumoconiosis are complex and varied. A good understanding of these factors definitely helps in the development of better radiological screening of pneumoconiosis through improved training and also through the use of improved CAD tools. The presented work is an attempt to get insights into what these factors are and how they modify the behavior of the observers.

pdf [BibTex]

pdf [BibTex]


Thumb xl secretstr
A Quantitative Analysis of Current Practices in Optical Flow Estimation and the Principles Behind Them

Sun, D., Roth, S., Black, M. J.

(CS-10-03), Brown University, Department of Computer Science, January 2013 (techreport)

pdf [BibTex]

pdf [BibTex]

2011


no image
ISocRob-MSL 2011 Team Description Paper for Middle Sized League

Messias, J., Ahmad, A., Reis, J., Sousa, J., Lima, P.

15th Annual RoboCup International Symposium 2011, July 2011 (techreport)

Abstract
This paper describes the status of the ISocRob MSL robotic soccer team as required by the RoboCup 2011 qualification procedures. The most relevant technical and scientifical developments carried out by the team, since its last participation in the RoboCup MSL competitions, are here detailed. These include cooperative localization, cooperative object tracking, planning under uncertainty, obstacle detection and improvements to self-localization.

link (url) [BibTex]

2011

link (url) [BibTex]


Thumb xl screen shot 2012 03 13 at 2.41.46 pm
Dorsal Stream: From Algorithm to Neuroscience

Jhuang, H.

PhD Thesis, MIT, 2011 (techreport)

pdf [BibTex]


Thumb xl thesis
Spatial Models of Human Motion

Soren Hauberg

University of Copenhagen, 2011 (phdthesis)

PDF [BibTex]

PDF [BibTex]

2006


Thumb xl screen shot 2012 06 06 at 11.31.38 am
Implicit Wiener Series, Part II: Regularised estimation

Gehler, P., Franz, M.

(148), Max Planck Institute, 2006 (techreport)

pdf [BibTex]

2006


Thumb xl evatr
HumanEva: Synchronized video and motion capture dataset for evaluation of articulated human motion

Sigal, L., Black, M. J.

(CS-06-08), Brown University, Department of Computer Science, 2006 (techreport)

pdf abstract [BibTex]

pdf abstract [BibTex]

1996


Thumb xl miximages
Mixture Models for Image Representation

Jepson, A., Black, M.

PRECARN ARK Project Technical Report ARK96-PUB-54, March 1996 (techreport)

Abstract
We consider the estimation of local greylevel image structure in terms of a layered representation. This type of representation has recently been successfully used to segment various objects from clutter using either optical ow or stereo disparity information. We argue that the same type of representation is useful for greylevel data in that it allows for the estimation of properties for each of several different components without prior segmentation. Our emphasis in this paper is on the process used to extract such a layered representation from a given image In particular we consider a variant of the EM algorithm for the estimation of the layered model and consider a novel technique for choosing the number of layers to use. We briefly consider the use of a simple version of this approach for image segmentation and suggest two potential applications to the ARK project

pdf [BibTex]

1996

pdf [BibTex]