Optimizing E-commerce Search: Toward a Generalizable and Rank-Consistent Pre-Ranking Model

Read original: arXiv:2405.05606 - Published 8/22/2024 by Enqiang Xu, Yiming Qiu, Junyang Bai, Ping Zhang, Dadong Miao, Songlin Wang, Guoyu Tang, Lin Liu, Mingming Li

Overview

In large e-commerce platforms, search systems typically have multiple modules, including recall, pre-ranking, and ranking phases.
The pre-ranking phase is a crucial lightweight module that filters out the majority of products before the more computationally expensive ranking phase.
Existing industry efforts have focused on enhancing ranking consistency, model structure, and generalization towards long-tail items in the pre-ranking model.
However, meeting system performance requirements remains a significant challenge.

Plain English Explanation

When you search for products on a large e-commerce platform, the search system goes through a series of steps to find the most relevant items. The first step is the recall phase, which quickly identifies a broad set of potentially relevant products. Then, the pre-ranking phase acts as a lightweight filter, quickly removing the majority of products that are unlikely to be good matches for your search.
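To make the funnel concrete, here is a minimal Python sketch of a three-stage cascade. The function `cascade_search`, its parameters, and the toy `word_overlap` scorer are hypothetical names chosen for illustration; they do not come from the paper, and a production system would use learned models and index-based retrieval rather than list scans.

```python
def cascade_search(query, catalog, recall_fn, pre_rank_fn, rank_fn,
                   pre_rank_keep=100, final_keep=10):
    """Run a toy recall -> pre-ranking -> ranking funnel (illustrative only)."""
    # Recall: cheap filter that keeps any plausibly relevant item.
    recalled = [item for item in catalog if recall_fn(query, item)]
    # Pre-ranking: lightweight scorer; only the best `pre_rank_keep` survive.
    survivors = sorted(recalled, key=lambda item: pre_rank_fn(query, item),
                       reverse=True)[:pre_rank_keep]
    # Ranking: expensive scorer, run on the small surviving set only.
    return sorted(survivors, key=lambda item: rank_fn(query, item),
                  reverse=True)[:final_keep]


def word_overlap(query, item):
    """Toy relevance score: number of query words that appear in the item."""
    return len(set(query.split()) & set(item.split()))


catalog = ["red shoe", "blue shoe", "red hat", "green sock"]
top = cascade_search("red shoe", catalog,
                     recall_fn=lambda q, i: word_overlap(q, i) > 0,
                     pre_rank_fn=word_overlap,
                     rank_fn=word_overlap,
                     pre_rank_keep=2, final_keep=1)
print(top)  # ['red shoe']
```

In this toy setup the same scorer is reused for both pre-ranking and ranking. In a real system the pre-ranking scorer is much cheaper than the ranking scorer, which is why keeping the two in agreement matters.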

The pre-ranking model is important because it helps the final ranking phase, which is more computationally intensive, focus on the most promising products. Existing industry efforts have tried to improve the pre-ranking model in a few key ways:

Ranking consistency: Making the pre-ranking model's results more aligned with the final ranking, so the most relevant products are consistently surfaced early in the process.
Model structure: Optimizing the design of the pre-ranking model itself to perform better.
Generalization: Improving the model's ability to handle long-tail or less popular products, not just the most common ones.

However, even with these improvements, meeting the overall performance requirements of the search system remains a significant challenge. This is where the research paper proposes a novel approach called GRACE.

Technical Explanation

The paper introduces a Generalizable and RAnk-ConsistEnt Pre-Ranking Model (GRACE) that aims to address the limitations of existing pre-ranking models. GRACE achieves three key objectives:

Ranking consistency: GRACE introduces multiple binary classification tasks that predict whether a product is within the top-k results as estimated by the final ranking model. This helps align the pre-ranking model's output with the final ranking.
Generalizability: GRACE uses contrastive learning to pre-train product representations on a subset of ranking model embeddings. This helps the pre-ranking model generalize better, especially to long-tail products.
Ease of implementation: GRACE's feature construction and online deployment are designed to be straightforward to implement in real-world systems.
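The first two objectives can be sketched in plain Python. This is a hedged illustration under stated assumptions, not the authors' implementation: the function names (`topk_labels`, `multi_task_bce`, `info_nce`), the choice of k values, the InfoNCE form of the contrastive loss, the cosine similarity, and the temperature are all assumptions, since the summary does not specify them.

```python
import math


def topk_labels(ranker_scores, ks):
    """For each k, label an item 1 if the ranking model places it in its top k."""
    # Sort item indices by ranking-model score, best first (ties keep index order).
    order = sorted(range(len(ranker_scores)),
                   key=lambda i: ranker_scores[i], reverse=True)
    rank_of = {item: pos for pos, item in enumerate(order)}
    return {k: [1 if rank_of[i] < k else 0 for i in range(len(ranker_scores))]
            for k in ks}


def multi_task_bce(logits_per_k, labels_per_k):
    """Mean binary cross-entropy over all top-k tasks, computed from raw logits."""
    total, count = 0.0, 0
    for k, labels in labels_per_k.items():
        for z, y in zip(logits_per_k[k], labels):
            # Numerically stable form of -[y*log(s(z)) + (1-y)*log(1-s(z))].
            total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
            count += 1
    return total / count


def info_nce(anchor, candidates, positive_index, temperature=0.1):
    """InfoNCE loss: pull the anchor toward one positive, away from the rest.

    Assumes all vectors are non-zero and have the same length.
    """
    def cosine(u, v):
        dot = sum(a * b for a, b in zip(u, v))
        return dot / (math.sqrt(sum(a * a for a in u)) *
                      math.sqrt(sum(b * b for b in v)))

    logits = [cosine(anchor, c) / temperature for c in candidates]
    m = max(logits)  # subtract the max before exponentiating for stability
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[positive_index]


labels = topk_labels([0.9, 0.1, 0.5, 0.7], ks=(1, 2))
print(labels)  # {1: [1, 0, 0, 0], 2: [1, 0, 0, 1]}
```

Each k defines its own point-wise binary task, so the extra objectives can sit alongside an existing point-wise pre-ranking loss. In the contrastive sketch, one plausible (assumed) pairing is a pre-ranking representation of a product as the anchor and the ranking model's embedding of the same product as the positive.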

The paper's extensive experiments show significant improvements in both offline metrics (a 0.75% increase in AUC) and online A/B testing (a 1.28% increase in CVR, the conversion rate).
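For readers unfamiliar with the offline metric, AUC can be read as the probability that a randomly chosen positive example is scored above a randomly chosen negative one. The sketch below computes it directly from that definition. The function name `auc` and the toy data are illustrative assumptions; the summary does not say how the paper computes AUC or whether the 0.75% gain is absolute or relative.

```python
def auc(labels, scores):
    """Pairwise AUC: fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly scored above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(pos) * len(neg))


print(auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.1]))  # 0.75
```

This quadratic-time version is meant for clarity; library implementations typically sort the scores once instead of comparing every pair.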

Critical Analysis

The paper provides a comprehensive and well-designed solution to the pre-ranking challenge in large e-commerce search systems. The key strengths of the GRACE approach are its focus on ranking consistency, generalization, and practical implementation.

However, the paper does not delve into certain potential limitations or areas for further research. For example, it would be interesting to understand the computational and memory footprint of the GRACE model compared to previous approaches, as these factors can be critical in real-world deployments. Additionally, the paper could explore the model's performance on other e-commerce platforms or industries beyond the specific use case presented.

Overall, the GRACE method seems to be a promising advancement in search ranking and retrieval. Future research could build on this work and address the remaining challenges in e-commerce search and ranking.

Conclusion

The paper presents a novel pre-ranking model called GRACE that addresses key challenges in large e-commerce search systems. By focusing on ranking consistency, generalizability, and practical implementation, GRACE demonstrates significant improvements in both offline and online metrics.

This research represents an important step forward in optimizing the pre-ranking phase, which is a crucial component of modern e-commerce search infrastructure. The GRACE approach could have broader implications for multi-stage retrieval systems beyond e-commerce, and its principles could be applied to other areas of search and recommendation.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Optimizing E-commerce Search: Toward a Generalizable and Rank-Consistent Pre-Ranking Model

Enqiang Xu, Yiming Qiu, Junyang Bai, Ping Zhang, Dadong Miao, Songlin Wang, Guoyu Tang, Lin Liu, Mingming Li

In large e-commerce platforms, search systems are typically composed of a series of modules, including recall, pre-ranking, and ranking phases. The pre-ranking phase, serving as a lightweight module, is crucial for filtering out the bulk of products in advance for the downstream ranking module. Industrial efforts on optimizing the pre-ranking model have predominantly focused on enhancing ranking consistency, model structure, and generalization towards long-tail items. Beyond these optimizations, meeting the system performance requirements presents a significant challenge. Contrasting with existing industry works, we propose a novel method: a Generalizable and RAnk-ConsistEnt Pre-Ranking Model (GRACE), which achieves: 1) Ranking consistency by introducing multiple binary classification tasks that predict whether a product is within the top-k results as estimated by the ranking model, which facilitates the addition of learning objectives on common point-wise ranking models; 2) Generalizability through contrastive learning of representation for all products by pre-training on a subset of ranking product embeddings; 3) Ease of implementation in feature construction and online deployment. Our extensive experiments demonstrate significant improvements in both offline metrics and online A/B test: a 0.75% increase in AUC and a 1.28% increase in CVR.

8/22/2024

Building a Scalable, Effective, and Steerable Search and Ranking Platform

Marjan Celikik, Jacek Wasilewski, Ana Peleteiro Ramallo, Alexey Kurennoy, Evgeny Labzin, Danilo Ascione, Tural Gurbanov, Géraud Le Falher, Andrii Dzhoha, Ian Harris

Modern e-commerce platforms offer vast product selections, making it difficult for customers to find items that they like and that are relevant to their current session intent. This is why it is key for e-commerce platforms to have near real-time scalable and adaptable personalized ranking and search systems. While numerous methods exist in the scientific literature for building such systems, many are unsuitable for large-scale industrial use due to complexity and performance limitations. Consequently, industrial ranking systems often resort to computationally efficient yet simplistic retrieval or candidate generation approaches, which overlook near real-time and heterogeneous customer signals, which results in a less personalized and relevant experience. Moreover, related customer experiences are served by completely different systems, which increases complexity, maintenance, and inconsistent experiences. In this paper, we present a personalized, adaptable near real-time ranking platform that is reusable across various use cases, such as browsing and search, and that is able to cater to millions of items and customers under heavy load (thousands of requests per second). We employ transformer-based models through different ranking layers which can learn complex behavior patterns directly from customer action sequences while being able to incorporate temporal (e.g. in-session) and contextual information. We validate our system through a series of comprehensive offline and online real-world experiments at a large online e-commerce platform, and we demonstrate its superiority when compared to existing systems, both in terms of customer experience as well as in net revenue. Finally, we share the lessons learned from building a comprehensive, modern ranking platform for use in a large-scale e-commerce environment.

9/5/2024

Advancing Re-Ranking with Multimodal Fusion and Target-Oriented Auxiliary Tasks in E-Commerce Search

Enqiang Xu, Xinhui Li, Zhigong Zhou, Jiahao Ji, Jinyuan Zhao, Dadong Miao, Songlin Wang, Lin Liu, Sulong Xu

In the rapidly evolving field of e-commerce, the effectiveness of search re-ranking models is crucial for enhancing user experience and driving conversion rates. Despite significant advancements in feature representation and model architecture, the integration of multimodal information remains underexplored. This study addresses this gap by investigating the computation and fusion of textual and visual information in the context of re-ranking. We propose Advancing Re-Ranking with Multimodal Fusion and Target-Oriented Auxiliary Tasks (ARMMT), which integrates an attention-based multimodal fusion technique and an auxiliary ranking-aligned task to enhance item representation and improve targeting capabilities. This method not only enriches the understanding of product attributes but also enables more precise and personalized recommendations. Experimental evaluations on JD.com's search platform demonstrate that ARMMT achieves state-of-the-art performance in multimodal information integration, evidenced by a 0.22% increase in the Conversion Rate (CVR), significantly contributing to Gross Merchandise Volume (GMV). This pioneering approach has the potential to revolutionize e-commerce re-ranking, leading to elevated user satisfaction and business growth.

8/13/2024

Generative Retrieval with Preference Optimization for E-commerce Search

Mingming Li, Huimu Wang, Zuxu Chen, Guangtao Nie, Yiming Qiu, Binbin Wang, Guoyu Tang, Lin Liu, Jingwei Zhuo

Generative retrieval introduces a groundbreaking paradigm to document retrieval by directly generating the identifier of a pertinent document in response to a specific query. This paradigm has demonstrated considerable benefits and potential, particularly in representation and generalization capabilities, within the context of large language models. However, it faces significant challenges in E-commerce search scenarios, including the complexity of generating detailed item titles from brief queries, the presence of noise in item titles with weak language order, issues with long-tail queries, and the interpretability of results. To address these challenges, we have developed an innovative framework for E-commerce search, called generative retrieval with preference optimization. This framework is designed to effectively learn and align an autoregressive model with target data, subsequently generating the final item through constraint-based beam search. By employing multi-span identifiers to represent raw item titles and transforming the task of generating titles from queries into the task of generating multi-span identifiers from queries, we aim to simplify the generation process. The framework further aligns with human preferences using click data and employs a constrained search method to identify key spans for retrieving the final item, thereby enhancing result interpretability. Our extensive experiments show that this framework achieves competitive performance on a real-world dataset, and online A/B tests demonstrate the superiority and effectiveness in improving conversion gains.

7/30/2024