Reproducing Zepto's Multilingual Query System from the Ground Up

Explore Zepto's approach to resolving multilingual and misspelled queries using Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG). Here's a guide to constructing a comparable system.

Zepto, an online grocery platform, has introduced a multilingual query resolution system designed to improve search in online grocery ordering. The system uses Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) to interpret and correct user queries across different languages.

Powered by LLMs and RAG

The system uses advanced LLMs with stepwise prompting techniques to break down user queries into smaller, interpretable parts. This aids in better understanding the intent and context, allowing the system to resolve ambiguous or misspelled terms effectively.
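As a rough illustration of stepwise prompting, the sketch below assembles a prompt that asks the model to work through a query in numbered stages. The step wording and the `build_stepwise_prompt` helper are assumptions made for this example; they are not Zepto's actual prompt.

```python
# Hypothetical steps for illustration only; the real prompt is not public.
STEPS = [
    "Identify the language or script of the query.",
    "Split the query into terms and flag any that look misspelled or transliterated.",
    "Map each flagged term to its most likely English grocery term.",
    "Return the corrected query with a one-line justification.",
]


def build_stepwise_prompt(query: str) -> str:
    """Return a prompt that walks the model through the steps in order."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(STEPS, start=1))
    return (
        "You are a grocery search assistant.\n"
        "Work through the following steps in order:\n"
        f"{numbered}\n\n"
        f"Query: {query!r}"
    )


if __name__ == "__main__":
    print(build_stepwise_prompt("kothimbir"))
```

Keeping the steps in a list makes it easy to add, remove, or reorder stages without touching the prompt-assembly code.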

RAG, a hybrid method that combines traditional retrieval with the generative capabilities of LLMs, is also a key component. It first retrieves documents or product information relevant to the query, and the LLM then generates a refined, corrected query or suggestion grounded in what was retrieved. This approach improves accuracy across languages and reduces errors caused by spelling variations or language differences.
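The retrieve-then-generate flow can be sketched in plain Python. This is a simplified stand-in, not the article's implementation: retrieval here uses character-trigram overlap instead of vector embeddings, the catalog entries are made up, and `llm` is any callable that takes a prompt string and returns text.

```python
from typing import Callable, List, Set

# Illustrative catalog; product names are assumptions for this sketch.
CATALOG = [
    "Toned milk 1 L",
    "Fresh coriander leaves (dhaniya) 100 g",
    "Iodised salt 1 kg",
    "Classic salted potato chips",
]


def trigrams(text: str) -> Set[str]:
    """Character trigrams of the lowercased, space-padded text."""
    padded = f"  {text.lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def retrieve(query: str, docs: List[str], k: int = 2) -> List[str]:
    """Return the k documents with the highest trigram Jaccard similarity."""
    q = trigrams(query)

    def score(doc: str) -> float:
        d = trigrams(doc)
        return len(q & d) / len(q | d)

    return sorted(docs, key=score, reverse=True)[:k]


def resolve(query: str, llm: Callable[[str], str], docs: List[str] = CATALOG) -> str:
    """Retrieve candidate products, then ask the LLM to correct the query."""
    candidates = retrieve(query, docs)
    prompt = (
        "Candidate products:\n"
        + "\n".join(f"- {c}" for c in candidates)
        + f"\n\nUser query: {query}\n"
        "Return only the corrected query."
    )
    return llm(prompt)
```

Passing the model in as a callable keeps the sketch testable with a stub and lets a real chat model be swapped in later.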

End-to-End Query Resolution Pipeline

The pipeline runs from fuzzy matching, through spelling-correction suggestions, to a final adjustment of the output query. Even when a user types a misspelled item name, the system can still return the right product or suggestion, which helps conversion. Zepto reportedly saw a 7.5% uplift in conversions using this system.
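The fuzzy-matching stage can be approximated with the standard library alone. The sketch below uses `difflib.get_close_matches` to suggest catalog terms for a misspelled query; the catalog contents and the `suggest` helper are assumptions for illustration, and a production system would more likely rely on embeddings or a dedicated search index.

```python
from difflib import get_close_matches
from typing import List

# Illustrative search terms; not taken from any real catalog.
CATALOG_TERMS = [
    "coriander leaves",
    "coconut oil",
    "potato chips",
    "tomato ketchup",
]


def suggest(query: str, terms: List[str] = CATALOG_TERMS, cutoff: float = 0.6) -> List[str]:
    """Return up to three catalog terms that closely match the query, best first."""
    normalized = query.lower().strip()
    return get_close_matches(normalized, terms, n=3, cutoff=cutoff)


if __name__ == "__main__":
    for raw in ["tomato kechup", "cocnut oil", "zzzz"]:
        print(raw, "->", suggest(raw))
```

Raising `cutoff` makes the matcher stricter, which reduces wrong suggestions at the cost of missing heavier misspellings.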

Testing and Implementation

A dummy dataset has been created to thoroughly test the system, containing a variety of products, common brand names, multilingual and vernacular terms, and potentially ambiguous items. The necessary Python libraries for the implementation include langchain, langchain-groq, fastembed, langchain-chroma, and pandas.
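A dummy dataset of the kind described might look like the sketch below. The rows, field names, and the `validate` helper are assumptions for illustration rather than the article's data; the vernacular tags are common Hindi and Marathi names for the items.

```python
from typing import Dict, List

# Hypothetical product rows mixing English names with vernacular tags.
PRODUCTS: List[Dict] = [
    {"id": 1, "name": "Coriander Leaves", "category": "Vegetables",
     "tags": ["dhaniya", "kothimbir", "cilantro"]},
    {"id": 2, "name": "Potato", "category": "Vegetables",
     "tags": ["aloo", "batata"]},
    {"id": 3, "name": "Turmeric Powder", "category": "Spices",
     "tags": ["haldi"]},
    {"id": 4, "name": "Mango", "category": "Fruits",
     "tags": ["aam"]},
    # Potentially ambiguous next to the plain fruit above.
    {"id": 5, "name": "Mango Pickle", "category": "Condiments",
     "tags": ["aam", "achar"]},
]

REQUIRED_FIELDS = {"id", "name", "category", "tags"}


def validate(rows: List[Dict]) -> List[str]:
    """Return a list of human-readable problems; empty means the rows look sane."""
    problems = []
    seen_ids = set()
    for row in rows:
        missing = REQUIRED_FIELDS - row.keys()
        if missing:
            problems.append(f"row {row.get('id')} is missing {sorted(missing)}")
        if row.get("id") in seen_ids:
            problems.append(f"duplicate id {row.get('id')}")
        seen_ids.add(row.get("id"))
    return problems
```

A list of dictionaries like this can be loaded into pandas with `pandas.DataFrame(PRODUCTS)` when tabular inspection is needed.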

The implementation is walked through hands-on in code, from installing dependencies to the final similarity search. The embedding strategy represents each product as a single text document that combines its name, category, and tags. The components are chained together using LangChain Expression Language (LCEL) so that a query flows through to the final result.
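The one-document-per-product strategy amounts to simple string assembly before embedding. The sketch below shows one possible layout; the separator format and the `to_document` helper are assumptions, and the embedding and vector-store steps are left out.

```python
from typing import Dict, List


def to_document(product: Dict) -> str:
    """Combine name, category, and tags into one text string for embedding."""
    tags = ", ".join(product["tags"])
    return f"{product['name']} | category: {product['category']} | tags: {tags}"


def build_corpus(products: List[Dict]) -> List[str]:
    """Build one document per product, preserving input order."""
    return [to_document(p) for p in products]


if __name__ == "__main__":
    # Hypothetical rows for demonstration.
    sample = [
        {"name": "Coriander Leaves", "category": "Vegetables",
         "tags": ["dhaniya", "kothimbir"]},
        {"name": "Turmeric Powder", "category": "Spices", "tags": ["haldi"]},
    ]
    for doc in build_corpus(sample):
        print(doc)
```

Putting vernacular tags in the same string as the English name means a query such as "haldi" can land near the right product even though the product name never contains that word.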

Boosting User Satisfaction

This technology significantly boosts user satisfaction by minimizing friction caused by language and spelling issues during the online grocery ordering process, leading to faster and more accurate product discovery. It can correct misspellings and slang, understand multilingual queries, disambiguate queries, and provide structured, auditable outputs.
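One way to make the output structured and auditable is to return a typed record instead of free text. The field names in this sketch are assumptions chosen for illustration; the source does not specify the actual schema.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class QueryResolution:
    """Hypothetical record of how a single query was resolved."""
    original_query: str
    corrected_query: str
    detected_language: str
    reasoning: str
    matched_products: List[str] = field(default_factory=list)

    def to_json(self) -> str:
        """Serialize to JSON, keeping non-ASCII text readable for audit logs."""
        return json.dumps(asdict(self), ensure_ascii=False, sort_keys=True)


if __name__ == "__main__":
    record = QueryResolution(
        original_query="kothimbir",
        corrected_query="coriander leaves",
        detected_language="Marathi (romanized)",
        reasoning="'kothimbir' is a vernacular name for coriander.",
        matched_products=["Coriander Leaves"],
    )
    print(record.to_json())
```

Storing the original query and the reasoning next to the correction lets a reviewer trace why a given product was shown.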

In summary, Zepto's multilingual query resolution system leverages LLMs with RAG to create a robust system that detects and corrects misspellings, understands queries across various languages, improves search quality and product matching, and enhances overall user experience and conversion in grocery e-commerce.

Behind the Scenes

Harsh Mishra, an AI/ML Engineer, is the brain behind this system. He spends more time talking to Large Language Models than to actual humans, and he is passionate about GenAI, NLP, and making machines smarter. The system is robust and scalable, and it demonstrates a clear path to improving user experience and search conversion rates. It is being tested with a variety of challenging queries to see how it performs.

  1. Harsh Mishra, an AI/ML Engineer, has developed a system that uses Large Language Models (LLMs) and Retrieval-Augmented Generation (RAG) to improve the user experience in online shopping, particularly for grocery items.
  2. The same approach could extend beyond online grocery ordering to other search-driven sectors such as technology, entertainment, shopping, and social media.
  3. The end-to-end query resolution pipeline is designed to handle misspelled item names, understand multilingual queries, and disambiguate queries, thereby improving search quality and product matching.
  4. The system provides structured, auditable output and minimizes friction caused by language and spelling issues, which boosts user satisfaction on e-commerce platforms.
