Ads Recommendation in a Collapsed and Entangled World

Read original: arXiv:2403.00793 - Published 7/9/2024 by Junwei Pan, Wei Xue, Ximei Wang, Haibin Yu, Xun Liu, Shijie Quan, Xueming Qiu, Dapeng Liu, Lei Xiao, Jie Jiang

Overview

This paper examines ad recommendation in a setting where embeddings suffer from dimensional "collapse" and user interests become "entangled" across different tasks or scenarios.
The authors describe Tencent's ads recommendation system and propose practical approaches for learning robust, disentangled representations while preserving prior knowledge when encoding features.
The reported performance is based on Tencent's online advertising platform, which handles hundreds of billions of requests daily.

Plain English Explanation

As online platforms and services continue to grow, the models used to recommend ads to users are also getting more complex. However, this growth can lead to some unintended consequences. As these recommendation models become larger and more sophisticated, the underlying representations of users and items (e.g., products, content) can become "collapsed" and "entangled," making it harder to understand and manipulate the key factors driving recommendations.

The researchers in this paper tackle this challenge by developing a new ad recommendation system that uses advanced techniques like disentangled representation learning and knowledge adaptation. These techniques aim to tease apart the different factors influencing user preferences and item characteristics, making the recommendation process more transparent and controllable.

Because the reported results come from an online advertising platform that serves millions of ads to billions of users, the approach looks like a practical tool for improving ad recommendations as online platforms and services continue to grow and evolve.

Technical Explanation

The paper highlights two challenges in large-scale recommendation systems: the dimensional collapse of embeddings and interest entanglement across different tasks or scenarios. Collapse means that embeddings end up using only a few of their available dimensions, while entanglement means that a single representation has to serve several tasks or scenarios at once. Both make it harder to understand and manipulate the key factors driving recommendations.
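Dimensional collapse can be made concrete by looking at the singular values of an embedding table. The following is a minimal sketch, not the authors' analysis tool: it assumes NumPy, uses synthetic tables, and the helper names `singular_spectrum` and `spectrum_ratio` are hypothetical. The ratio of the sum of singular values to the largest one stays close to 1 when a table has collapsed and approaches the embedding dimension when all dimensions are used.

```python
import numpy as np


def singular_spectrum(embeddings: np.ndarray) -> np.ndarray:
    """Singular values of the mean-centered embedding matrix, largest first."""
    centered = embeddings - embeddings.mean(axis=0, keepdims=True)
    return np.linalg.svd(centered, compute_uv=False)


def spectrum_ratio(embeddings: np.ndarray) -> float:
    """Sum of singular values divided by the largest singular value.

    Close to 1 when almost all variance sits on one direction; close to
    the embedding dimension when variance is spread evenly.
    """
    s = singular_spectrum(embeddings)
    return float(s.sum() / s.max())


rng = np.random.default_rng(0)
n_ids, dim = 1000, 16  # illustrative sizes, not taken from the paper

# Hypothetical "healthy" table: every dimension carries independent signal.
healthy = rng.normal(size=(n_ids, dim))

# Hypothetical "collapsed" table: 16 nominal dimensions, but every row
# lies in the same 2-dimensional subspace.
basis = rng.normal(size=(2, dim))
collapsed = rng.normal(size=(n_ids, 2)) @ basis

healthy_ratio = spectrum_ratio(healthy)
collapsed_ratio = spectrum_ratio(collapsed)
print(f"healthy ratio:   {healthy_ratio:.2f}")
print(f"collapsed ratio: {collapsed_ratio:.2f}")
```

On real tables the spectrum usually decays gradually rather than dropping to zero, so it is more informative to compare the ratio across features or model sizes than to read it against a fixed threshold.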

To address this issue, the authors propose a novel ad recommendation system that leverages disentangled representation learning and knowledge adaptation techniques. The system consists of several key components:

Feature Encoding: The system encodes features of diverse types, specifically sequence features, numeric features, and pre-trained embedding features, into embedding representations while preserving prior knowledge.
Recommendation Model: The resulting representations, made robust against dimensional collapse and disentangled across tasks or scenarios, are used to train a recommendation model that makes ad predictions.
Knowledge Adaptation: To further improve performance, the system incorporates pre-trained embedding features, carrying over knowledge learned outside the recommendation model.
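One simple way to picture disentangled representations is to give each task its own embedding table, so that updates for one task cannot overwrite what another task has learned about the same feature. The sketch below is only an illustration under that assumption and is not the authors' design: the class `TaskEmbeddings`, the task names, and the feature id `ad_42` are all hypothetical.

```python
import random


class TaskEmbeddings:
    """One embedding table per task (hypothetical illustration)."""

    def __init__(self, tasks, dim, seed=0):
        self.dim = dim
        self.rng = random.Random(seed)
        # Separate dict per task: feature_id -> embedding vector.
        self.tables = {task: {} for task in tasks}

    def lookup(self, task, feature_id):
        """Return the task-specific vector, creating it on first use."""
        table = self.tables[task]
        if feature_id not in table:
            table[feature_id] = [
                self.rng.gauss(0.0, 0.1) for _ in range(self.dim)
            ]
        return table[feature_id]

    def sgd_step(self, task, feature_id, grad, lr=0.1):
        """Apply one gradient step to the vector of a single task only."""
        vec = self.lookup(task, feature_id)
        for i, g in enumerate(grad):
            vec[i] -= lr * g


emb = TaskEmbeddings(tasks=["click", "conversion"], dim=4)
click_before = list(emb.lookup("click", "ad_42"))
conversion_before = list(emb.lookup("conversion", "ad_42"))

# A gradient coming from the click task only touches the click table.
emb.sgd_step("click", "ad_42", grad=[1.0, -1.0, 0.5, 0.0])

click_after = list(emb.lookup("click", "ad_42"))
conversion_after = list(emb.lookup("conversion", "ad_42"))
print("click vector changed:     ", click_after != click_before)
print("conversion vector changed:", conversion_after != conversion_before)
```

The trade-off in a design like this is that fully separate tables share nothing across tasks, so a practical system would likely mix shared and task-specific parts rather than separate everything.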

The authors report performance on their online advertising platform, which handles hundreds of billions of requests daily, and introduce three analysis tools for studying feature correlation, dimensional collapse, and interest entanglement.

Critical Analysis

The paper presents a well-designed and comprehensive solution to the challenge of embedding collapse and entanglement in large-scale recommendation systems. The authors' use of disentangled representation learning and knowledge adaptation techniques is a promising approach to maintaining interpretability and control as these models become more complex.

One potential limitation of the research is that the reported performance is based on a single online advertising platform. While these results are encouraging, it would be valuable to see the approaches tested across a wider range of real-world settings to fully assess their performance and generalizability.

Additionally, the paper does not delve deeply into the potential societal implications of their approach. As recommendation systems become more powerful and influential, it will be important to consider how techniques like disentanglement and knowledge adaptation could impact issues such as fairness, transparency, and user agency.

Overall, this paper makes a valuable contribution to the field of recommendation systems, offering a novel solution to an important challenge. However, further research and discussion are needed to fully understand the broader implications and potential trade-offs of the proposed approach.

Conclusion

This paper presents a novel ad recommendation system that addresses the challenges of embedding collapse and entanglement in large-scale recommendation models. By leveraging disentangled representation learning and knowledge adaptation techniques, the system is able to maintain interpretability and control as the underlying models become more complex.

The researchers report results from their online advertising platform and summarize general design principles along with readily applicable solutions and analysis tools. This suggests that their system could be a valuable tool for improving ad recommendations as online platforms and services continue to grow and evolve.

While the paper offers a promising solution to an important problem, further research is needed to fully understand the broader implications and potential trade-offs of the proposed techniques. As recommendation systems become more powerful and ubiquitous, it will be crucial to consider how these advances can be responsibly developed and deployed to benefit both businesses and users alike.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Ads Recommendation in a Collapsed and Entangled World

Junwei Pan, Wei Xue, Ximei Wang, Haibin Yu, Xun Liu, Shijie Quan, Xueming Qiu, Dapeng Liu, Lei Xiao, Jie Jiang

We present Tencent's ads recommendation system and examine the challenges and practices of learning appropriate recommendation representations. Our study begins by showcasing our approaches to preserving prior knowledge when encoding features of diverse types into embedding representations. We specifically address sequence features, numeric features, and pre-trained embedding features. Subsequently, we delve into two crucial challenges related to feature representation: the dimensional collapse of embeddings and the interest entanglement across different tasks or scenarios. We propose several practical approaches to address these challenges that result in robust and disentangled recommendation representations. We then explore several training techniques to facilitate model optimization, reduce bias, and enhance exploration. Additionally, we introduce three analysis tools that enable us to study feature correlation, dimensional collapse, and interest entanglement. This work builds upon the continuous efforts of Tencent's ads recommendation team over the past decade. It summarizes general design principles and presents a series of readily applicable solutions and analysis tools. The reported performance is based on our online advertising platform, which handles hundreds of billions of requests daily and serves millions of ads to billions of users.

7/9/2024

Enhancing Taobao Display Advertising with Multimodal Representations: Challenges, Approaches and Insights

Xiang-Rong Sheng, Feifan Yang, Litong Gong, Biao Wang, Zhangming Chan, Yujing Zhang, Yueyao Cheng, Yong-Nan Zhu, Tiezheng Ge, Han Zhu, Yuning Jiang, Jian Xu, Bo Zheng

Despite the recognized potential of multimodal data to improve model accuracy, many large-scale industrial recommendation systems, including Taobao display advertising system, predominantly depend on sparse ID features in their models. In this work, we explore approaches to leverage multimodal data to enhance the recommendation accuracy. We start from identifying the key challenges in adopting multimodal data in a manner that is both effective and cost-efficient for industrial systems. To address these challenges, we introduce a two-phase framework, including: 1) the pre-training of multimodal representations to capture semantic similarity, and 2) the integration of these representations with existing ID-based models. Furthermore, we detail the architecture of our production system, which is designed to facilitate the deployment of multimodal representations. Since the integration of multimodal representations in mid-2023, we have observed significant performance improvements in Taobao display advertising system. We believe that the insights we have gathered will serve as a valuable resource for practitioners seeking to leverage multimodal data in their systems.

7/30/2024

On the Embedding Collapse when Scaling up Recommendation Models

Xingzhuo Guo, Junwei Pan, Ximei Wang, Baixu Chen, Jie Jiang, Mingsheng Long

Recent advances in foundation models have led to a promising trend of developing large recommendation models to leverage vast amounts of available data. Still, mainstream models remain embarrassingly small in size and naive enlarging does not lead to sufficient performance gain, suggesting a deficiency in the model scalability. In this paper, we identify the embedding collapse phenomenon as the inhibition of scalability, wherein the embedding matrix tends to occupy a low-dimensional subspace. Through empirical and theoretical analysis, we demonstrate a two-sided effect of feature interaction specific to recommendation models. On the one hand, interacting with collapsed embeddings restricts embedding learning and exacerbates the collapse issue. On the other hand, interaction is crucial in mitigating the fitting of spurious features as a scalability guarantee. Based on our analysis, we propose a simple yet effective multi-embedding design incorporating embedding-set-specific interaction modules to learn embedding sets with large diversity and thus reduce collapse. Extensive experiments demonstrate that this proposed design provides consistent scalability and effective collapse mitigation for various recommendation models. Code is available at this repository: https://github.com/thuml/Multi-Embedding.

6/7/2024

Async Learned User Embeddings for Ads Delivery Optimization

Mingwei Tang, Meng Liu, Hong Li, Junjie Yang, Chenglin Wei, Boyang Li, Dai Li, Rengan Xu, Yifan Xu, Zehua Zhang, Xiangyu Wang, Linfeng Liu, Yuelei Xie, Chengye Liu, Labib Fawaz, Li Li, Hongnan Wang, Bill Zhu, Sri Reddy

In recommendation systems, high-quality user embeddings can capture subtle preferences, enable precise similarity calculations, and adapt to changing preferences over time to maintain relevance. The effectiveness of recommendation systems depends on the quality of user embedding. We propose to asynchronously learn high fidelity user embeddings for billions of users each day from sequence based multimodal user activities through a Transformer-like large scale feature learning module. The async learned user representations embeddings (ALURE) are further converted to user similarity graphs through graph learning and then combined with user realtime activities to retrieve highly related ads candidates for the ads delivery system. Our method shows significant gains in both offline and online experiments.

6/26/2024