Generating High-Resolution Fashion Model Images Wearing Custom Outfits

At Zalando, we provide high-quality photographs of fashion models wearing the articles in our online selection. These photographs help our customers visualise the garments they browse and enhance their shopping experience.

But what if our customers wish to visualise an outfit they have put together themselves? Zalando offers a large and constantly changing assortment of garments, which makes it infeasible to photograph every possible outfit combination. To circumvent this limitation, we at Zalando Research are working on a “Fashion Renderer”, which creates a computer-generated image of a fashion model wearing an input outfit in an input body pose, as shown in the figure below. Note that the entire image, including the fashion model, is generated by our algorithm.

[Figure: Example generation]

In recent years, advances in Generative Adversarial Networks (GANs) [1] have made it possible to sample realistic images via implicit generative modelling. This development has opened new avenues in visual design and content creation, especially in fashion, where visualisation is a key component. GANs can be used to create personalised visual content, such as rendering an outfit on a human body [2], which can enrich the shopping experience on e-commerce platforms. In this project, we aim for a solution that generates high-resolution images of fashion models wearing desired outfits and standing in different poses.

Our initial approach augments StyleGAN [3] with embedding networks and applies it to a proprietary dataset of fashion model-outfit-pose images. A flowchart of our model is illustrated in the figure below.

Below are some additional results from our Fashion Renderer. For each row, a set of input articles and a body pose (represented as a heatmap around body keypoints) are used to generate the image of a fashion model in the last column.
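
The pose input mentioned above, a heatmap around body keypoints, can be sketched in a few lines. This is a minimal illustration and not necessarily how the Fashion Renderer encodes poses: the function name `keypoint_heatmaps`, the Gaussian shape, the `sigma` value, the 16 x 16 image size and the example keypoints are all assumptions made for the example.

```python
import math

def keypoint_heatmaps(keypoints, height, width, sigma=2.0):
    """Return one height x width heatmap (a list of rows) per keypoint.

    Each heatmap is a 2-D Gaussian centred on the keypoint with peak value 1.0.
    A keypoint given as None (for example, a joint that is not visible)
    produces an all-zero map.
    """
    maps = []
    for kp in keypoints:
        if kp is None:
            maps.append([[0.0] * width for _ in range(height)])
            continue
        cx, cy = kp
        maps.append([
            [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2.0 * sigma ** 2))
             for x in range(width)]
            for y in range(height)
        ])
    return maps

# Hypothetical pose: three keypoints as (x, y) pixel coordinates, one missing.
pose = [(8, 4), (8, 10), None]
heatmaps = keypoint_heatmaps(pose, height=16, width=16)
```

Stacking one such map per keypoint gives a multi-channel image that a convolutional network can consume alongside the article images.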

Throughout this project, we will address several challenges on our research roadmap. One is to increase the diversity and visual fidelity of the fashion articles we can render, which requires novel generative approaches. Another is to enhance the relatability and demographic diversity of the generated fashion models, which can help customers engage with the rendered results. Because the quality of a generated image can be subjective, we also work on image-quality metrics, both statistical and human-perception driven, that quantify how well our techniques work.
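
To give a flavour of what a statistical image-quality metric can look like, the sketch below computes a Fréchet distance between two sets of feature vectors, which is the idea behind the widely used Fréchet Inception Distance. It is an illustration only: the text above does not say which metrics we use, the sketch assumes diagonal covariances to stay short, and the function names and toy feature vectors are hypothetical. In practice the features would come from a pretrained image network.

```python
import math

def mean_and_variance(features):
    """Per-dimension mean and population variance of a list of feature vectors."""
    n = len(features)
    dim = len(features[0])
    mean = [sum(f[d] for f in features) / n for d in range(dim)]
    var = [sum((f[d] - mean[d]) ** 2 for f in features) / n for d in range(dim)]
    return mean, var

def frechet_distance_diagonal(real_features, generated_features):
    """Fréchet distance between two Gaussians fitted to the feature sets.

    Assumes diagonal covariances, so the general trace term
    Tr(S_r + S_g - 2 (S_r S_g)^(1/2)) reduces to a per-dimension sum.
    """
    mu_r, var_r = mean_and_variance(real_features)
    mu_g, var_g = mean_and_variance(generated_features)
    mean_term = sum((a - b) ** 2 for a, b in zip(mu_r, mu_g))
    cov_term = sum(a + b - 2.0 * math.sqrt(a * b) for a, b in zip(var_r, var_g))
    return mean_term + cov_term

# Hypothetical 2-D features for "real" and "generated" images.
real = [[0.0, 1.0], [2.0, 3.0]]
shifted = [[1.0, 1.0], [3.0, 3.0]]
same_score = frechet_distance_diagonal(real, real)
shifted_score = frechet_distance_diagonal(real, shifted)
```

A lower score means the two feature distributions are closer; identical sets score zero, and the score grows as the means or variances drift apart.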

REFERENCES

[1] Generative Adversarial Nets, Goodfellow et al., NeurIPS 2014

[2] Generating High-Resolution Fashion Model Images Wearing Custom Outfits, Yildirim et al., ICCV Workshop on CVFAD 2019

[3] A Style-Based Generator Architecture for Generative Adversarial Networks, Karras et al., CVPR 2019