Across Channel Deep Personalization

Research Project by Dr. Urs Bergmann

The aim of this research is to advance the state of the art in personalization systems, e.g. recommender systems. While Zalando has a lot of data across its customer population, a key problem for good personalization is the shortage of data on each individual customer. To alleviate this problem, we will develop flexible models that can leverage multiple customer or article contact channels (ad clicks, PDP clicks, purchases, etc.). A further goal of the research is to advance the state of the art in Bayesian neural networks, in order to yield the industry’s best personalization experience.

Why is this of interest for Zalando? Article sizes are unfortunately not well standardized. For example, an Adidas shoe in EU size 45 might correspond to a Converse shoe in EU size 42. Even worse, as humans have diverse foot shapes, this correspondence might hold for one customer, while another might prefer size 44 for the Adidas and size 42 for the Converse.

Shoe Detail Page by Urs Bergmann

The goal of the research is therefore to build models that infer a customer’s latent ‘size variables’ from multiple contact channels, e.g. confirmed purchases, returns, feedback on articles, etc. Given additional article data (potentially also from multiple contact channels, e.g. an X-ray image of a shoe, feedback from other customers and/or employees, etc.), the personalized fitting probabilities over an article’s sizes can then be calculated for a single individual – see the figure below. These personalized size recommendations lessen the return burden for both the company and our customers.

Personalized Fitting Scheme by Urs Bergmann
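As a rough illustration of the idea – and emphatically not Zalando’s actual model – the following sketch infers a posterior over a customer’s latent true shoe size from kept and returned orders, then scores candidate sizes of a new article. The grid, the Gaussian fit curve, and all offsets and noise parameters are illustrative assumptions:

```python
import math

# Hypothetical sketch: discretize the customer's latent true size over a grid,
# update a uniform prior with kept/returned orders, and compute a
# posterior-predictive fitting probability for a new article size.
GRID = [40 + 0.25 * i for i in range(33)]  # candidate true sizes 40.0 .. 48.0

def fit_curve(effective_size, true_size, sigma=0.5):
    # Assumed probability that an article fits: peaks when the article's
    # effective size matches the customer's latent true size.
    return math.exp(-0.5 * ((effective_size - true_size) / sigma) ** 2)

def posterior(observations, sigma=0.5):
    """observations: list of (ordered_size, article_offset, kept) tuples.
    article_offset models how much an article 'runs large' vs. its label;
    kept=True means a confirmed purchase, kept=False means a return."""
    weights = []
    for t in GRID:
        w = 1.0  # uniform prior over the grid
        for ordered, offset, kept in observations:
            p = fit_curve(ordered + offset, t, sigma)
            w *= p if kept else (1.0 - p)
        weights.append(w)
    z = sum(weights)
    return [w / z for w in weights]

def fit_probability(post, candidate_size, article_offset, sigma=0.5):
    # Marginalize the fit curve over the posterior on the latent size.
    return sum(p * fit_curve(candidate_size + article_offset, t, sigma)
               for t, p in zip(GRID, post))
```

For a customer who kept an EU 45 and returned an EU 43 (both true to label, offset 0), the posterior concentrates near 45, and `fit_probability` accordingly ranks size 45 above size 43 for the next true-to-label article. Channels such as article feedback would enter the same way: as additional likelihood factors per grid point.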

Why is this of interest for the scientific community? Bayesian probabilistic models are very well understood. However, recent advances in deep learning show that neural networks are currently superior in certain domains, e.g. image recognition and generation, and speech recognition. In general, deep learning seems superior when a lot of data from an unknown, highly non-linear function is available, with a strong signal and relatively weak noise – while Bayesian methods are superior in a regime with little data, a lot of noise, and a (potentially) known functional form.

That’s why the machine learning community is pushing to marry Bayesian methods with neural networks – and several high-impact breakthroughs have recently been published. Our research aims to contribute to this trend – in particular in the domain of recommendation problems.