A good clothing model is a key component of a virtual human. Almost all current approaches to human modeling, detection, tracking, and parsing benefit from realistic garment models.
Physics-based clothing simulation is accurate but usually complicated, as it requires expert knowledge. We get around this by developing data-driven approaches that make use of both image data and 3D scans.
ClothNet [ ] is a conditional generative model learned directly from images of people in clothing. Given a body silhouette, the model produces different people with similar pose and shape but in different clothing styles, using a variational autoencoder followed by an image-to-image translation network that generates the texture of the outfit.
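To make the conditioning concrete, the sketch below shows a silhouette-conditioned variational autoencoder in the spirit of this idea. It is a minimal illustration, not the authors' architecture: the image resolution, channel counts, latent size, and layer choices are all assumptions, and the texture-generating image-to-image network is omitted.

```python
# Minimal sketch of a silhouette-conditioned VAE (assumed 64x64 inputs, not ClothNet itself).
import torch
import torch.nn as nn
import torch.nn.functional as F

class CondVAE(nn.Module):
    def __init__(self, z_dim=64):
        super().__init__()
        # Encoder sees the clothed-person image concatenated with the body silhouette.
        self.enc = nn.Sequential(
            nn.Conv2d(3 + 1, 32, 4, 2, 1), nn.ReLU(),   # 64 -> 32
            nn.Conv2d(32, 64, 4, 2, 1), nn.ReLU(),      # 32 -> 16
            nn.Flatten(),
        )
        self.fc_mu = nn.Linear(64 * 16 * 16, z_dim)
        self.fc_logvar = nn.Linear(64 * 16 * 16, z_dim)
        # Decoder maps a latent code plus the silhouette back to an image, so sampling
        # different z yields different outfits for the same body silhouette.
        self.fc = nn.Linear(z_dim, 64 * 16 * 16)
        self.dec = nn.Sequential(
            nn.ConvTranspose2d(64 + 1, 32, 4, 2, 1), nn.ReLU(),  # 16 -> 32
            nn.ConvTranspose2d(32, 3, 4, 2, 1), nn.Sigmoid(),     # 32 -> 64
        )

    def forward(self, img, sil):
        h = self.enc(torch.cat([img, sil], dim=1))
        mu, logvar = self.fc_mu(h), self.fc_logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)   # reparameterization trick
        h = self.fc(z).view(-1, 64, 16, 16)
        sil_small = F.interpolate(sil, size=(16, 16))              # re-inject the condition
        return self.dec(torch.cat([h, sil_small], dim=1)), mu, logvar
```

At test time, one would feed only a silhouette and a latent sample drawn from the prior to the decoder, which is what lets a single body silhouette be dressed in many styles.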
To dress people in 3D, the minimally clothed body shape is needed but often hard to acquire. Our BUFF model [ ] addresses this problem by estimating body shape under clothing from a sequence of 3D scans. In a motion sequence, different poses pull the clothing tight against the body in different regions. All frames of a sequence are brought into an unposed canonical space and fused into a single frame. By optimizing an objective on this fused frame that snaps the naked shape to nearby cloth vertices and ignores far-away cloth points, we recover a personalized shape of the person.
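The sketch below illustrates the core of that objective under simplifying assumptions: it optimizes free body vertices instead of the parameters of a statistical body model, and uses a hard snap radius as a stand-in for a robust data term. Function names, the radius, and the weights are hypothetical.

```python
# Minimal sketch of "snap to nearby cloth, ignore far-away cloth" on a fused scan.
# Not the BUFF implementation; a free-vertex gradient loop under assumed settings.
import numpy as np
from scipy.spatial import cKDTree

def fit_body_to_fused_scan(body_verts, fused_scan, template,
                           snap_radius=0.02, reg_weight=0.5,
                           step=0.3, n_iters=100):
    """body_verts, template: (N, 3); fused_scan: (M, 3). Units assumed to be meters."""
    verts = body_verts.copy()
    tree = cKDTree(fused_scan)
    for _ in range(n_iters):
        dists, idx = tree.query(verts)              # nearest fused cloth point per body vertex
        close = dists < snap_radius                 # only tight regions constrain the shape
        data_grad = np.zeros_like(verts)
        data_grad[close] = verts[close] - fused_scan[idx[close]]
        reg_grad = reg_weight * (verts - template)  # stay close to the template body elsewhere
        verts -= step * (data_grad + reg_grad)
    return verts
```

Because each pose exposes different tight regions, fusing the whole sequence before this fit gives far more constraints on the underlying shape than any single frame would.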
Now that body shape can be estimated from clothed scans, we can take a step further and segment clothing from the human body. ClothCap [ ] is a pipeline for capturing dynamic clothing on humans from 4D scans and transferring it to dress virtual avatars. With a novel multi-part mesh model, the approach cleanly segments clothing from the body and retargets it to different body shapes and poses.
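A simple way to picture the retargeting step is sketched below: once a garment is segmented as its own mesh part, each garment vertex can be stored as an offset from an associated body vertex and the offsets replayed on a new body. This is an illustrative simplification, not the ClothCap pipeline; the nearest-vertex binding and the shared-topology assumption are ours.

```python
# Minimal sketch of garment retargeting via per-vertex offsets (assumptions, not ClothCap).
import numpy as np
from scipy.spatial import cKDTree

def bind_garment(garment_verts, body_verts):
    """Associate each garment vertex with its nearest body vertex and keep the offset."""
    idx = cKDTree(body_verts).query(garment_verts)[1]
    offsets = garment_verts - body_verts[idx]
    return idx, offsets

def retarget_garment(idx, offsets, new_body_verts):
    """Dress a body of different shape or pose by reapplying the stored offsets.
    Assumes both bodies share the same mesh topology (e.g., a SMPL-like template)."""
    return new_body_verts[idx] + offsets
```

Keeping the garment and body as separate parts of a multi-part mesh is what makes this transfer possible: the garment geometry is expressed relative to the body rather than baked into a single fused surface.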