Zalando boosts the customer experience with its Assistant, powered by GPT-4o mini
![The image displays the "Zalando" logo in clean, white text centered over an abstract background featuring bold orange and red tones with expressive brushstrokes, evoking energy and vibrancy.](https://images.ctfassets.net/kftzwdyauwt9/7J2cGP0on4JzV0ceVKBbjy/b5898c89006661a1150f7a97309e2793/oai_Zalando_hero.png?w=3840&q=90&fm=webp)
Zalando(opens in a new window), one of Europe’s largest online fashion and lifestyle platforms, serves over 50 million customers in 25 countries. With a vast catalog of apparel, shoes, and beauty products, the company has steadily expanded its offerings to become a go-to destination for fashion enthusiasts.
Zalando worked with OpenAI to develop the Zalando Assistant, an AI-powered tool that provides personalized content recommendations and streamlines product discovery. With GPT-4o mini and a robust evaluation framework, and compared to the previous version, the latest iteration of the Assistant has delivered:
A 23% increase in product clicks
A 40%+ increase in products added to the wishlist
Scaled availability to 25 markets featuring local languages
Scaling the Assistant for 25 markets
The first iteration of the Zalando Assistant launched in 4 German and English-speaking markets in 2023. Based on its success, the Zalando team set the goal of expanding the Assistant into 20+ markets in 2024.
This required refining and scaling the Assistant’s personalization capabilities. To serve markets across Europe, the team needed an AI solution that could perform well in over 20 languages.
They also needed the Assistant to follow nuanced instructions. GPT-3.5, which powered the first iteration of the Assistant, wasn’t best equipped for instruction-following tasks. For instance, when users asked for seasonal outfits or event-specific attire, the results were often too generic.
Working closely, the Zalando and OpenAI teams identified two investment areas: Improving the evaluation process—referred to as an evaluation framework or “evals”—and upgrading the model powering the Assistant.
Implementing component-specific evaluations and enhanced few-shot prompting
Guided by the OpenAI team’s advice, Zalando adopted a more granular approach to evaluating the Assistant. The new framework included component-specific evaluations to test individual parts of the system, such as routing and response generation, in isolation.
To further improve model assessment, the team enhanced the quality and accuracy of few-shot prompts. Few-shot prompting allows an LLM model to better understand quality benchmarks by showing clear examples of what a high or low rating should look like.
For example, by showing the model examples of what weak and strong responses look like, they improved its ability to evaluate output quality and align with user expectations.
These updates provided Zalando with actionable insights into the Assistant’s strengths and weaknesses, enabling them to make targeted improvements and confidently prepare for a model upgrade.
Migrating to GPT-4o mini for multilingual and instructional capabilities
With a refined evaluation framework in place, the next step was to transition the Assistant from operating on GPT-3.5 to GPT-4o mini, a model that is more cost-efficient and best equipped for multilingual and instruction-following tasks.
In just two weeks, the team migrated 50% of the Assistant’s traffic to the new model, with the remaining migration completed shortly after.
The transition proved pivotal. The Assistant could now operate in new languages such as French and Spanish, allowing Zalando to deliver localized, culturally relevant recommendations across its markets. The model also reduced latency and operational costs, ensuring scalability as user numbers grew.
![A beta "Fashion Assistant" chatbot suggests fashion items like festive dresses based on user preferences, with chat prompts, item images, prices, and feedback options.](https://images.ctfassets.net/kftzwdyauwt9/5c5JoSNCu9kkN4Jgk4GdJT/658f48dafe4df963d1862ae9c09b3fba/productui.png?w=3840&q=90&fm=webp)
Increased engagement, localization, and cost savings
The combination of GPT-4o mini and the improved evaluation process led to measurable outcomes that exceeded expectations in comparison to previous versions of the assistant:
Improved engagement: Product clicks increased by 23% within the recommendation carousel, and wishlist additions rose by 41%, demonstrating stronger user interaction.
Quality feedback: The number of recommendations deemed “unhelpful” by users decreased by 5%, reflecting higher satisfaction with the new Assistant’s picks.
Better cost efficiency: By transitioning from GPT-3.5 to GPT-4o mini, Zalando was able to scale its traffic 12x by deploying the Assistant to more markets, and without spending significantly more.
Localization at scale: The Assistant now performs seamlessly across all 25 of Zalando’s markets, supporting multiple languages and cultural contexts.
Building on this success, Zalando is progressing the Assistant with more conversational capabilities, with the ability to answer questions as detailed as, “What should I wear to my dad’s 60th birthday in November in Barcelona?” The company also continues to customize the Zalando Assistant for diverse linguistic and cultural contexts as it scales.