Posterior Probability Matters: Doubly-Adaptive Calibration for Neural Predictions in Online Advertising

arXiv:2205.07295

Published 5/28/2024 by Penghui Wei, Weimin Zhang, Ruijie Hou, Jinquan Liu, Shaoguo Liu, Liang Wang, Bo Zheng

Abstract

Predicting user response probabilities is vital for ad ranking and bidding. We hope that predictive models can produce accurate probabilistic predictions that reflect true likelihoods. Calibration techniques aim to post-process model predictions to posterior probabilities. Field-level calibration -- which performs calibration w.r.t. a specific field value -- is fine-grained and more practical. In this paper we propose a doubly-adaptive approach, AdaCalib. It learns an isotonic function family to calibrate model predictions with the guidance of posterior statistics, and field-adaptive mechanisms are designed to ensure that the posterior is appropriate for the field value to be calibrated. Experiments verify that AdaCalib achieves significant improvement on calibration performance. It has been deployed online and beats the previous approach.

Overview

Predicting user response probabilities is crucial for ad ranking and bidding.
Calibration techniques aim to post-process model predictions to posterior probabilities.
Field-level calibration is a fine-grained and practical approach.
This paper proposes AdaCalib, a doubly-adaptive approach to field-level calibration.

Plain English Explanation

When companies show ads to users online, they need to predict how likely the user is to engage with the ad (e.g., click on it). These user response probabilities are vital for deciding which ads to show and how much to bid for ad placements. Researchers hope that predictive models can produce accurate probability estimates that reflect the true likelihood of user actions.

Calibration techniques post-process model predictions so that they correspond to true posterior probabilities. Field-level calibration, which calibrates predictions separately for each value of a chosen feature field (for example, each ad category), is finer-grained and more practical than applying a single calibration to all predictions.
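To make the difference between global and field-level calibration concrete, here is a minimal sketch in Python. The field values (`cat_A`, `cat_B`), the logged samples, and the `calibration_ratio` helper are all made up for illustration and do not come from the paper. The point it shows is that predictions can look well calibrated on average while being clearly off for individual field values.

```python
from collections import defaultdict

def calibration_ratio(preds, labels):
    """Mean predicted probability divided by the observed response rate (1.0 is ideal)."""
    return (sum(preds) / len(preds)) / (sum(labels) / len(labels))

# Hypothetical logged impressions: (field value, predicted click probability, click label).
samples = (
    [("cat_A", 0.2, 1)] * 1 + [("cat_A", 0.2, 0)] * 9    # observed click rate 0.10
    + [("cat_B", 0.2, 1)] * 3 + [("cat_B", 0.2, 0)] * 7  # observed click rate 0.30
)

# Global view: all samples pooled together.
overall = calibration_ratio([p for _, p, _ in samples], [y for _, _, y in samples])

# Field-level view: one ratio per field value.
by_field = defaultdict(lambda: ([], []))
for field, p, y in samples:
    by_field[field][0].append(p)
    by_field[field][1].append(y)
per_field = {f: calibration_ratio(ps, ys) for f, (ps, ys) in by_field.items()}

print(overall)    # close to 1.0: looks calibrated globally
print(per_field)  # cat_A is over-predicted (about 2x), cat_B under-predicted (about 0.67x)
```

In this toy data a single global correction would change nothing, which is why a per-field-value correction is needed.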

This paper introduces AdaCalib, a new field-level calibration method. AdaCalib learns an adaptive function to calibrate the model's probability outputs, using guidance from posterior statistics. It also includes mechanisms to ensure the calibration is well-suited for the specific field value being calibrated. Experiments show AdaCalib significantly improves calibration performance, and it has been successfully deployed in production.

Technical Explanation

The paper proposes AdaCalib, a doubly-adaptive approach to field-level calibration of model predictions. AdaCalib learns an isotonic function family to calibrate the model's outputs, using guidance from posterior statistics to ensure the calibration is appropriate.
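As a rough illustration of calibrating with an isotonic (non-decreasing) function built from posterior statistics, the sketch below bins predictions, computes the observed response rate per bin, enforces monotonicity by pooling adjacent violators, and interpolates linearly between the resulting points. This is a classical, hand-rolled stand-in and not AdaCalib itself, which learns its function family with a neural model. All function names and the sample data are assumptions made for the example.

```python
from bisect import bisect_right

def bin_posterior_stats(preds, labels, n_bins):
    """Per non-empty equal-width bin: (mean prediction, observed rate, sample count)."""
    sums = [[0.0, 0.0, 0] for _ in range(n_bins)]
    for p, y in zip(preds, labels):
        i = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into the last bin
        sums[i][0] += p
        sums[i][1] += y
        sums[i][2] += 1
    return [(sp / n, sy / n, n) for sp, sy, n in sums if n > 0]

def pool_adjacent_violators(stats):
    """Merge neighbouring bins until the observed rates are non-decreasing."""
    blocks = []  # each block: [sum of predictions, sum of labels, count]
    for mean_p, rate, n in stats:
        blocks.append([mean_p * n, rate * n, n])
        while (len(blocks) > 1
               and blocks[-2][1] / blocks[-2][2] > blocks[-1][1] / blocks[-1][2]):
            sp, sy, cnt = blocks.pop()
            blocks[-1][0] += sp
            blocks[-1][1] += sy
            blocks[-1][2] += cnt
    return [(sp / n, sy / n) for sp, sy, n in blocks]

def make_calibrator(knots):
    """Piecewise-linear function through (mean prediction, posterior rate) knots."""
    xs = [x for x, _ in knots]
    ys = [y for _, y in knots]

    def calibrate(p):
        if p <= xs[0]:
            return ys[0]
        if p >= xs[-1]:
            return ys[-1]
        j = bisect_right(xs, p)  # xs[j - 1] <= p < xs[j]
        t = (p - xs[j - 1]) / (xs[j] - xs[j - 1])
        return ys[j - 1] + t * (ys[j] - ys[j - 1])

    return calibrate

# Hypothetical predictions and click labels for one field value.
preds = [0.05, 0.08, 0.15, 0.18, 0.25, 0.28, 0.35, 0.38]
labels = [0, 0, 1, 0, 0, 0, 1, 1]

knots = pool_adjacent_violators(bin_posterior_stats(preds, labels, n_bins=10))
calibrate = make_calibrator(knots)
print([round(calibrate(p), 3) for p in (0.05, 0.2, 0.3, 0.4)])
```

Because the knots are non-decreasing, the calibrated scores preserve the ranking order of the raw predictions, which is the main reason isotonic forms are attractive for ad ranking.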

Additionally, AdaCalib includes field-adaptive mechanisms to ensure the posterior probability estimates used for calibration are well-suited for the specific field value being calibrated. This fine-grained, field-level calibration is more practical than general calibration approaches.
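One simple way to picture why the posterior statistics need to adapt to the field value: a frequent field value has enough samples to support many fine bins, while a rare one needs coarser bins for its per-bin response rates to be reliable. The heuristic below is only an illustration of that idea. The `choose_n_bins` function, its candidate bin counts, the `min_per_bin` threshold, and the field names are all hypothetical and should not be read as the mechanism used in the paper.

```python
def choose_n_bins(n_samples, candidates=(2, 4, 8, 16), min_per_bin=50):
    """Largest candidate bin count that still averages min_per_bin samples per bin.

    Falls back to the smallest candidate when even that is too fine.
    """
    feasible = [c for c in candidates if n_samples / c >= min_per_bin]
    return max(feasible) if feasible else min(candidates)

# Hypothetical sample counts per field value.
field_counts = {"popular_category": 5000, "mid_category": 300, "rare_category": 40}
bins_per_field = {f: choose_n_bins(n) for f, n in field_counts.items()}
print(bins_per_field)
# {'popular_category': 16, 'mid_category': 4, 'rare_category': 2}
```

A fixed rule like this trades resolution for reliability by hand, whereas a learned mechanism could make that trade-off from the data.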

Experiments on real-world data demonstrate that AdaCalib achieves significant improvements in calibration performance compared to previous methods. The authors also report that AdaCalib has been successfully deployed in online production environments, outperforming earlier calibration techniques.

Critical Analysis

The paper provides a thorough technical explanation of the AdaCalib method and presents convincing experimental results. However, the authors do not discuss any potential limitations or caveats of their approach.

It would be helpful to know how AdaCalib performs on a wider range of datasets and model architectures, and whether there are any scenarios where it may struggle or fail to improve calibration. Additionally, the authors could explore potential trade-offs between the complexity of the AdaCalib model and its calibration performance.

Further research could also investigate the interpretability of the learned calibration functions and explore ways to make the calibration process more transparent to users and stakeholders.

Conclusion

This paper introduces AdaCalib, a field-level calibration method that significantly improves the calibration of user response predictions. By adaptively learning calibration functions and incorporating field-specific mechanisms, AdaCalib produces calibrated probabilities that better reflect the true likelihoods of user responses.

The successful deployment of AdaCalib in production environments demonstrates its practical value for applications like online advertising, where accurate probability estimates are crucial for effective ad ranking and bidding. As machine learning models continue to play a central role in high-stakes decision-making, techniques like AdaCalib will become increasingly important for ensuring the reliability and trustworthiness of these systems.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Confidence-Aware Multi-Field Model Calibration

Yuang Zhao, Chuhan Wu, Qinglin Jia, Hong Zhu, Jia Yan, Libin Zong, Linxuan Zhang, Zhenhua Dong, Muyu Zhang

Accurately predicting the probabilities of user feedback, such as clicks and conversions, is critical for advertisement ranking and bidding. However, there often exist unwanted mismatches between predicted probabilities and true likelihoods due to the rapid shift of data distributions and intrinsic model biases. Calibration aims to address this issue by post-processing model predictions, and field-aware calibration can adjust model output on different feature field values to satisfy fine-grained advertising demands. Unfortunately, the observed samples corresponding to certain field values can be seriously limited to make confident calibrations, which may yield bias amplification and online disturbance. In this paper, we propose a confidence-aware multi-field calibration method, which adaptively adjusts the calibration intensity based on confidence levels derived from sample statistics. It also utilizes multiple fields for joint model calibration according to their importance to mitigate the impact of data sparsity on a single field. Extensive offline and online experiments show the superiority of our method in boosting advertising performance and reducing prediction deviations.

5/22/2024

cs.LG

Calibrated Regression Against An Adversary Without Regret

Shachi Deshpande, Charles Marx, Volodymyr Kuleshov

We are interested in probabilistic prediction in online settings in which data does not follow a probability distribution. Our work seeks to achieve two goals: (1) producing valid probabilities that accurately reflect model confidence; and (2) ensuring that traditional notions of performance (e.g., high accuracy) still hold. We introduce online algorithms guaranteed to achieve these goals on arbitrary streams of data points, including data chosen by an adversary. Specifically, our algorithms produce forecasts that are (1) calibrated -- i.e., an 80% confidence interval contains the true outcome 80% of the time -- and (2) have low regret relative to a user-specified baseline model. We implement a post-hoc recalibration strategy that provably achieves these goals in regression; previous algorithms applied to classification or achieved (1) but not (2). In the context of Bayesian optimization, an online model-based decision-making task in which the data distribution shifts over time, our method yields accelerated convergence to improved optima.

6/6/2024

cs.LG

Online Calibrated and Conformal Prediction Improves Bayesian Optimization

Shachi Deshpande, Charles Marx, Volodymyr Kuleshov

Accurate uncertainty estimates are important in sequential model-based decision-making tasks such as Bayesian optimization. However, these estimates can be imperfect if the data violates assumptions made by the model (e.g., Gaussianity). This paper studies which uncertainties are needed in model-based decision-making and in Bayesian optimization, and argues that uncertainties can benefit from calibration -- i.e., an 80% predictive interval should contain the true outcome 80% of the time. Maintaining calibration, however, can be challenging when the data is non-stationary and depends on our actions. We propose using simple algorithms based on online learning to provably maintain calibration on non-i.i.d. data, and we show how to integrate these algorithms in Bayesian optimization with minimal overhead. Empirically, we find that calibrated Bayesian optimization converges to better optima in fewer steps, and we demonstrate improved performance on standard benchmark functions and hyperparameter optimization tasks.

6/27/2024

cs.LG stat.ML

Bayesian Adaptive Calibration and Optimal Design

Rafael Oliveira, Dino Sejdinovic, David Howard, Edwin Bonilla

The process of calibrating computer models of natural phenomena is essential for applications in the physical sciences, where plenty of domain knowledge can be embedded into simulations and then calibrated against real observations. Current machine learning approaches, however, mostly rely on rerunning simulations over a fixed set of designs available in the observed data, potentially neglecting informative correlations across the design space and requiring a large amount of simulations. Instead, we consider the calibration process from the perspective of Bayesian adaptive experimental design and propose a data-efficient algorithm to run maximally informative simulations within a batch-sequential process. At each round, the algorithm jointly estimates the parameters of the posterior distribution and optimal designs by maximising a variational lower bound of the expected information gain. The simulator is modelled as a sample from a Gaussian process, which allows us to correlate simulations and observed data with the unknown calibration parameters. We show the benefits of our method when compared to related approaches across synthetic and real-data problems.

5/24/2024

cs.LG stat.ML