The Department of Perceiving Systems regrets to inform you that Daniel Cudeiro passed away on Wednesday, December 5, 2018. For scientific issues, please contact Michael Black <firstname.lastname@example.org>; for all other issues, please contact Melanie Feldhofer <email@example.com>.
His research combined principles from Artificial Intelligence, such as Machine Learning, Natural Language Processing and Computer Vision, with concepts from psychology and computer graphics. He was most interested in understanding human communication in order to improve the state of the art in virtual avatars and human-computer interaction. His first step towards this larger goal was to develop models of facial animation. The next steps were to understand more abstract forms of human communication, such as emotions and intention. Most of his research was data driven, and for it he captured humans in 3D using our 4D scanner. His ultimate goal was to understand the building blocks that allow humans to perceive, interact and create, and finally to be able to model humans inside out (body, face, mind, motion, ...), making 3D virtual avatars indistinguishable from real humans.
From Computer Vision, he was interested in:
3D human pose and shape estimation;
2D/3D head pose and shape estimation;
and new tasks arising from zero-shot learning, transfer learning and multimodal learning.
Within Computer Graphics, he was interested in:
human faces and bodies;
and human-computer interaction.
Within Machine Learning, he was interested in:
Within Natural Language Processing, he was interested in:
language and vision;
and applied psycholinguistic research.
As part of his PhD studies, he investigated ways of combining these topics to discover new and exciting tasks while solving problems and contributing to a better world. Eager to learn, pleased to discover!
Artificial Intelligence, Natural Language Processing, Computer Vision, Transfer Learning, Multimodal Learning, 3D modelling.
In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), June 2019 (inproceedings)
Audio-driven 3D facial animation has been widely explored, but achieving realistic, human-like performance is still unsolved. This is due to the lack of available 3D datasets, models, and standard evaluation metrics. To address this, we introduce a unique 4D face dataset with about 29 minutes of 4D scans captured at 60 fps and synchronized audio from 12 speakers. We then train a neural network on our dataset that factors identity from facial motion. The learned model, VOCA (Voice Operated Character Animation) takes any speech signal as input—even speech in languages other than English—and realistically animates a wide range of adult faces. Conditioning on subject labels during training allows the model to learn a variety of realistic speaking styles. VOCA also provides animator controls to alter speaking style, identity-dependent facial shape, and pose (i.e. head, jaw, and eyeball rotations) during animation. To our knowledge, VOCA is the only realistic 3D facial animation model that is readily applicable to unseen subjects without retargeting. This makes VOCA suitable for tasks like in-game video, virtual reality avatars, or any scenario in which the speaker, speech, or language is not known in advance. We make the dataset and model available for research purposes at http://voca.is.tue.mpg.de.