Header logo is ps


2019


Towards Geometric Understanding of Motion
Towards Geometric Understanding of Motion

Ranjan, A.

University of Tübingen, December 2019 (phdthesis)

Abstract

The motion of the world is inherently dependent on the spatial structure of the world and its geometry. Therefore, classical optical flow methods try to model this geometry to solve for the motion. However, recent deep learning methods take a completely different approach. They try to predict optical flow by learning from labelled data. Although deep networks have shown state-of-the-art performance on classification problems in computer vision, they have not been as effective in solving optical flow. The key reason is that deep learning methods do not explicitly model the structure of the world in a neural network, and instead expect the network to learn about the structure from data. We hypothesize that it is difficult for a network to learn about motion without any constraint on the structure of the world. Therefore, we explore several approaches to explicitly model the geometry of the world and its spatial structure in deep neural networks.

The spatial structure in images can be captured by representing it at multiple scales. To represent multiple scales of images in deep neural nets, we introduce a Spatial Pyramid Network (SpyNet). Such a network can leverage global information for estimating large motions and local information for estimating small motions. We show that SpyNet significantly improves over previous optical flow networks while also being the smallest and fastest neural network for motion estimation. SPyNet achieves a 97% reduction in model parameters over previous methods and is more accurate.

The spatial structure of the world extends to people and their motion. Humans have a very well-defined structure, and this information is useful in estimating optical flow for humans. To leverage this information, we create a synthetic dataset for human optical flow using a statistical human body model and motion capture sequences. We use this dataset to train deep networks and see significant improvement in the ability of the networks to estimate human optical flow.

The structure and geometry of the world affects the motion. Therefore, learning about the structure of the scene together with the motion can benefit both problems. To facilitate this, we introduce Competitive Collaboration, where several neural networks are constrained by geometry and can jointly learn about structure and motion in the scene without any labels. To this end, we show that jointly learning single view depth prediction, camera motion, optical flow and motion segmentation using Competitive Collaboration achieves state-of-the-art results among unsupervised approaches.

Our findings provide support for our hypothesis that explicit constraints on structure and geometry of the world lead to better methods for motion estimation.

PhD Thesis [BibTex]

2019

PhD Thesis [BibTex]

2012


Virtual Human Bodies with Clothing and Hair: From Images to Animation
Virtual Human Bodies with Clothing and Hair: From Images to Animation

Guan, P.

Brown University, Department of Computer Science, December 2012 (phdthesis)

pdf [BibTex]

2012

pdf [BibTex]


From Pixels to Layers: Joint Motion Estimation and Segmentation
From Pixels to Layers: Joint Motion Estimation and Segmentation

Sun, D.

Brown University, Department of Computer Science, July 2012 (phdthesis)

pdf [BibTex]

pdf [BibTex]


An Analysis of Successful Approaches to Human Pose Estimation
An Analysis of Successful Approaches to Human Pose Estimation

Lassner, C.

An Analysis of Successful Approaches to Human Pose Estimation, University of Augsburg, University of Augsburg, May 2012 (mastersthesis)

Abstract
The field of Human Pose Estimation is developing fast and lately leaped forward with the release of the Kinect system. That system reaches a very good perfor- mance for pose estimation using 3D scene information, however pose estimation from 2D color images is not solved reliably yet. There is a vast amount of pub- lications trying to reach this aim, but no compilation of important methods and solution strategies. The aim of this thesis is to fill this gap: it gives an introductory overview over important techniques by analyzing four current (2012) publications in detail. They are chosen such, that during their analysis many frequently used techniques for Human Pose Estimation can be explained. The thesis includes two introductory chapters with a definition of Human Pose Estimation and exploration of the main difficulties, as well as a detailed explanation of frequently used methods. A final chapter presents some ideas on how parts of the analyzed approaches can be recombined and shows some open questions that can be tackled in future work. The thesis is therefore a good entry point to the field of Human Pose Estimation and enables the reader to get an impression of the current state-of-the-art.

pdf [BibTex]

pdf [BibTex]


Exploiting pedestrian interaction via global optimization and social behaviors
Exploiting pedestrian interaction via global optimization and social behaviors

Leal-Taixé, L., Pons-Moll, G., Rosenhahn, B.

In Theoretic Foundations of Computer Vision: Outdoor and Large-Scale Real-World Scene Analysis, Springer, April 2012 (incollection)

pdf [BibTex]

pdf [BibTex]


Data-driven Manifolds for Outdoor Motion Capture
Data-driven Manifolds for Outdoor Motion Capture

Pons-Moll, G., Leal-Taix’e, L., Gall, J., Rosenhahn, B.

In Outdoor and Large-Scale Real-World Scene Analysis, 7474, pages: 305-328, LNCS, (Editors: Dellaert, Frank and Frahm, Jan-Michael and Pollefeys, Marc and Rosenhahn, Bodo and Leal-Taix’e, Laura), Springer, 2012 (incollection)

video publisher's site pdf Project Page [BibTex]

video publisher's site pdf Project Page [BibTex]


Scan-Based Flow Modelling in Human Upper Airways
Scan-Based Flow Modelling in Human Upper Airways

Perumal Nithiarasu, Igor Sazonov, Si Yong Yeo

In Patient-Specific Modeling in Tomorrow’s Medicine, pages: 241 - 280, 0, (Editors: Amit Gefen), Springer, 2012 (inbook)

[BibTex]

[BibTex]


An Introduction to Random Forests for Multi-class Object Detection
An Introduction to Random Forests for Multi-class Object Detection

Gall, J., Razavi, N., van Gool, L.

In Outdoor and Large-Scale Real-World Scene Analysis, 7474, pages: 243-263, LNCS, (Editors: Dellaert, Frank and Frahm, Jan-Michael and Pollefeys, Marc and Rosenhahn, Bodo and Leal-Taix’e, Laura), Springer, 2012 (incollection)

code code for Hough forest publisher's site pdf Project Page [BibTex]

code code for Hough forest publisher's site pdf Project Page [BibTex]


Home {3D} body scans from noisy image and range data
Home 3D body scans from noisy image and range data

Weiss, A., Hirshberg, D., Black, M. J.

In Consumer Depth Cameras for Computer Vision: Research Topics and Applications, pages: 99-118, 6, (Editors: Andrea Fossati and Juergen Gall and Helmut Grabner and Xiaofeng Ren and Kurt Konolige), Springer-Verlag, 2012 (incollection)

Project Page [BibTex]

Project Page [BibTex]


Consumer Depth Cameras for Computer Vision - Research Topics and Applications
Consumer Depth Cameras for Computer Vision - Research Topics and Applications

Fossati, A., Gall, J., Grabner, H., Ren, X., Konolige, K.

Advances in Computer Vision and Pattern Recognition, Springer, 2012 (book)

workshop publisher's site [BibTex]

workshop publisher's site [BibTex]

2009


no image
An introduction to Kernel Learning Algorithms

Gehler, P., Schölkopf, B.

In Kernel Methods for Remote Sensing Data Analysis, pages: 25-48, 2, (Editors: Gustavo Camps-Valls and Lorenzo Bruzzone), Wiley, New York, NY, USA, 2009 (inbook)

Abstract
Kernel learning algorithms are currently becoming a standard tool in the area of machine learning and pattern recognition. In this chapter we review the fundamental theory of kernel learning. As the basic building block we introduce the kernel function, which provides an elegant and general way to compare possibly very complex objects. We then review the concept of a reproducing kernel Hilbert space and state the representer theorem. Finally we give an overview of the most prominent algorithms, which are support vector classification and regression, Gaussian Processes and kernel principal analysis. With multiple kernel learning and structured output prediction we also introduce some more recent advancements in the field.

link (url) DOI [BibTex]

2009

link (url) DOI [BibTex]


no image
Visual Object Discovery

Sinha, P., Balas, B., Ostrovsky, Y., Wulff, J.

In Object Categorization: Computer and Human Vision Perspectives, pages: 301-323, (Editors: S. J. Dickinson, A. Leonardis, B. Schiele, M.J. Tarr), Cambridge University Press, 2009 (inbook)

link (url) [BibTex]

link (url) [BibTex]