Classification Diffusion Models: Revitalizing Density Ratio Estimation

Read original: arXiv:2402.10095 - Published 6/13/2024 by Shahar Yadin, Noam Elata, Tomer Michaeli

Overview

A prominent family of methods for learning data distributions is density ratio estimation (DRE)
DRE-based models can directly output the likelihood for any given input, a highly desired property lacking in most generative techniques
DRE methods have struggled to accurately capture the distributions of complex high-dimensional data like images, leading to reduced research attention
This work presents classification diffusion models (CDMs), a DRE-based generative method that adopts the formalism of denoising diffusion models (DDMs)

Plain English Explanation

Classification diffusion models (CDMs) are a new way to generate complex data like images. Most generative models cannot report how likely a given input is, and the methods that can have struggled to accurately capture the distribution of real-world, high-dimensional data like images.

CDMs address this problem by building on density ratio estimation (DRE). DRE-based models train a classifier to distinguish between real data samples and samples from a reference distribution. This allows the model to directly output the likelihood of any given input.
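The classifier idea behind DRE can be illustrated with a toy problem. The sketch below is not the paper's model: it assumes two hypothetical 1-D Gaussians (data `N(1, 1)` and reference `N(0, 1)`) and a hand-written logistic regression. With balanced classes, the logit of a well-trained classifier approximates the log density ratio, which for this pair is exactly `x - 0.5`.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_logistic(xs, ys, lr=0.5, steps=300):
    """Fit P(y=1 | x) = sigmoid(w*x + b) by full-batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            err = sigmoid(w * x + b) - y  # cross-entropy gradient w.r.t. the logit
            gw += err * x
            gb += err
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

random.seed(0)
n = 2000
data = [random.gauss(1.0, 1.0) for _ in range(n)]       # "real" samples, p = N(1, 1)
reference = [random.gauss(0.0, 1.0) for _ in range(n)]  # reference samples, q = N(0, 1)
xs = data + reference
ys = [1.0] * n + [0.0] * n

w, b = fit_logistic(xs, ys)

def log_ratio(x):
    """Classifier logit: an estimate of log p(x) - log q(x) for balanced classes."""
    return w * x + b

# Exact answer for this toy pair: log p(x) - log q(x) = x - 0.5
print(f"w = {w:.3f} (exact 1.0), b = {b:.3f} (exact -0.5)")
```

Because the reference density is known, adding `log q(x)` to the estimated log ratio gives an estimate of `log p(x)`, which is why DRE-based models can report likelihoods.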

The key innovation in CDMs is that they combine DRE with the framework of denoising diffusion models (DDMs). DDMs gradually add noise to clean data and then try to "denoise" it to generate new samples. CDMs use a classifier to predict the level of noise added to the data, which allows them to generate high-quality images while still being able to output likelihood scores.
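The noising step can be sketched as follows. This is a minimal illustration assuming the common DDPM parameterization `x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps` with a linear beta schedule and `T = 1000`; these choices are assumptions taken from standard practice, not details given in this summary. The sampled index `t` is the kind of noise-level label a classifier would be trained to predict from `x_t`.

```python
import math
import random

def linear_beta_schedule(T, beta_start=1e-4, beta_end=0.02):
    """Per-step noise variances, linearly spaced (an assumed, common choice)."""
    return [beta_start + (beta_end - beta_start) * t / (T - 1) for t in range(T)]

def cumulative_alphas(betas):
    """alpha_bar_t = product of (1 - beta_s) for s <= t."""
    out, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        out.append(prod)
    return out

def add_noise(x0, alpha_bar_t, rng):
    """Sample x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    eps = [rng.gauss(0.0, 1.0) for _ in x0]
    a, s = math.sqrt(alpha_bar_t), math.sqrt(1.0 - alpha_bar_t)
    xt = [a * xi + s * ei for xi, ei in zip(x0, eps)]
    return xt, eps

T = 1000
alpha_bars = cumulative_alphas(linear_beta_schedule(T))

rng = random.Random(0)
x0 = [1.0, -1.0, 0.5, 0.0]   # hypothetical clean signal
t = rng.randrange(T)         # noise-level label for this training example
xt, eps = add_noise(x0, alpha_bars[t], rng)
print(t, xt)
```

As `t` grows, `alpha_bar_t` shrinks toward zero, so `x_t` moves from the clean signal toward pure white Gaussian noise.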

Technical Explanation

Classification diffusion models (CDMs) are a DRE-based generative method that adopts the formalism of denoising diffusion models (DDMs). The key idea is to train a classifier that predicts the level of noise added to a clean signal, which the authors show is analytically connected to an MSE-optimal denoiser for white Gaussian noise.
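One way to see why such a connection exists is to combine Bayes' rule with Tweedie's formula. The derivation below is a sketch under stated assumptions, and its notation is not necessarily the paper's: $x_t = \sqrt{\bar{\alpha}_t}\, x_0 + \sqrt{1 - \bar{\alpha}_t}\, \varepsilon$ with $\varepsilon \sim \mathcal{N}(0, I)$, $p_t$ is the density of $x_t$, an extra class $T+1$ stands for pure noise $\mathcal{N}(0, I)$, the prior over noise levels is uniform, and $F(x, t) = \log P(t \mid x)$ is the log-probability output by an ideal classifier.

```latex
\begin{align*}
% Bayes' rule with a uniform prior over noise levels; the normalizer cancels
F(x, t) - F(x, T+1) &= \log p_t(x) - \log p_{T+1}(x),
  \qquad p_{T+1} = \mathcal{N}(0, I) \\
% the gradient of the standard Gaussian log-density is -x
\nabla_x \log p_t(x) &= \nabla_x F(x, t) - \nabla_x F(x, T+1) - x \\
% Tweedie's formula for the MSE-optimal noise estimate
\mathbb{E}[\varepsilon \mid x_t = x] &= -\sqrt{1 - \bar{\alpha}_t}\; \nabla_x \log p_t(x) \\
&= \sqrt{1 - \bar{\alpha}_t}\, \bigl( \nabla_x F(x, T+1) - \nabla_x F(x, t) + x \bigr)
\end{align*}
```

Under these assumptions, the gradient of the classifier's log-probabilities with respect to its input yields the MSE-optimal noise prediction, so a noise-level classifier can play the role of the denoiser in a diffusion sampler.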

This allows CDMs to generate high-quality images while also being able to output the likelihood of any input in a single forward pass, achieving state-of-the-art negative log likelihood (NLL) among methods with this property. The authors derive an analytical connection between the optimal denoiser and the optimal classifier, which forms the core of their approach.
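The single-pass likelihood can be illustrated on a toy problem where everything is known in closed form. The sketch below is not the paper's estimator: it assumes a uniform prior over noise levels, a final class of pure noise `N(0, 1)`, and hypothetical 1-D Gaussian data, and it uses the analytically Bayes-optimal classifier as a stand-in for a trained network. Under those assumptions, Bayes' rule gives `log p_0(x) = log N(x; 0, 1) + log P(0 | x) - log P(T+1 | x)`, so one evaluation of the classifier is enough.

```python
import math

def log_normal(x, var):
    """Log-density of a zero-mean 1-D Gaussian with variance `var`."""
    return -0.5 * (math.log(2.0 * math.pi * var) + x * x / var)

def class_log_probs(x, variances):
    """Bayes-optimal log P(t | x) under a uniform prior over noise levels.

    Stand-in for the log-softmax output of a trained noise-level classifier.
    """
    logs = [log_normal(x, v) for v in variances]
    m = max(logs)
    log_z = m + math.log(sum(math.exp(l - m) for l in logs))
    return [l - log_z for l in logs]

def log_likelihood(x, log_probs):
    """log p_0(x) = log N(x; 0, 1) + log P(0 | x) - log P(T+1 | x)."""
    return log_normal(x, 1.0) + log_probs[0] - log_probs[-1]

# Toy setup (all values hypothetical): clean data x_0 ~ N(0, 4).
data_var = 4.0
alpha_bars = [1.0, 0.7, 0.3, 0.05]  # index 0 is the clean signal
# x_t = sqrt(a) * x_0 + sqrt(1 - a) * eps has variance a * data_var + (1 - a)
variances = [a * data_var + (1.0 - a) for a in alpha_bars]
variances.append(1.0)               # final class: pure noise N(0, 1)

x = 1.3
log_probs = class_log_probs(x, variances)  # one "forward pass"
estimate = log_likelihood(x, log_probs)
exact = log_normal(x, data_var)
print(estimate, exact)
```

With an ideal classifier the two printed values agree up to floating-point error; with a learned classifier the result would only be as accurate as its predicted class probabilities.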

CDMs keep the DDM formalism but use the noise-level classifier where a DDM would use a denoiser. To the best of the authors' knowledge, this makes CDM the first DRE-based technique that can successfully generate complex, high-dimensional data like images.

Critical Analysis

The authors acknowledge that while CDMs can generate high-quality images and output likelihood scores, there are still some limitations to the approach. For example, the method relies on the assumption of white Gaussian noise, which may not always hold true for real-world data.

Additionally, the authors note that further research is needed to understand the full capabilities and limitations of CDMs, especially when it comes to generating accurate channel distributions and modeling more complex data distributions.

Overall, CDMs represent a promising step forward in DRE-based generative modeling, but there is still work to be done to fully harness the potential of this approach for complex, high-dimensional data generation.

Conclusion

In summary, classification diffusion models (CDMs) are a new DRE-based generative method that combines the strengths of density ratio estimation and denoising diffusion models. By training a classifier to predict the level of noise added to a clean signal, CDMs can generate high-quality images while also outputting likelihood scores, a highly desirable property lacking in most generative techniques.

While CDMs represent an important advancement in DRE-based generative modeling, there are still some limitations and open questions that require further research. Nevertheless, this work demonstrates the potential of this approach and could inspire new directions in the field of complex data generation.

Related Papers

Classification Diffusion Models: Revitalizing Density Ratio Estimation

Shahar Yadin, Noam Elata, Tomer Michaeli

A prominent family of methods for learning data distributions relies on density ratio estimation (DRE), where a model is trained to $\textit{classify}$ between data samples and samples from some reference distribution. DRE-based models can directly output the likelihood for any given input, a highly desired property that is lacking in most generative techniques. Nevertheless, to date, DRE methods have struggled to accurately capture the distributions of complex high-dimensional data like images, which led to reduced research attention over the years. In this work we present $\textit{classification diffusion models}$ (CDMs), a DRE-based generative method that adopts the formalism of denoising diffusion models (DDMs) while making use of a classifier that predicts the level of noise added to a clean signal. Our method is based on an analytical connection that we derive between an MSE-optimal denoiser for white Gaussian noise and a cross-entropy-optimal classifier for predicting the noise level. To the best of our knowledge, our method is the first DRE-based technique that can successfully generate images. Furthermore, it can output the likelihood of any input in a single forward pass, achieving state-of-the-art negative log likelihood (NLL) among methods with this property. Code is available on the project's webpage in https://shaharYadin.github.io/CDM/ .

6/13/2024

Discrete Diffusion Modeling by Estimating the Ratios of the Data Distribution

Aaron Lou, Chenlin Meng, Stefano Ermon

Despite their groundbreaking performance for many generative modeling tasks, diffusion models have fallen short on discrete data domains such as natural language. Crucially, standard diffusion models rely on the well-established theory of score matching, but efforts to generalize this to discrete structures have not yielded the same empirical gains. In this work, we bridge this gap by proposing score entropy, a novel loss that naturally extends score matching to discrete spaces, integrates seamlessly to build discrete diffusion models, and significantly boosts performance. Experimentally, we test our Score Entropy Discrete Diffusion models (SEDD) on standard language modeling tasks. For comparable model sizes, SEDD beats existing language diffusion paradigms (reducing perplexity by $25$-$75$%) and is competitive with autoregressive models, in particular outperforming GPT-2. Furthermore, compared to autoregressive models, SEDD generates faithful text without requiring distribution annealing techniques like temperature scaling (around $6$-$8\times$ better generative perplexity than un-annealed GPT-2), can trade compute and quality (similar quality with $32\times$ fewer network evaluations), and enables controllable infilling (matching nucleus sampling quality while enabling other strategies besides left to right prompting).

6/10/2024

Generative Modeling with Flow-Guided Density Ratio Learning

Alvin Heng, Abdul Fatir Ansari, Harold Soh

We present Flow-Guided Density Ratio Learning (FDRL), a simple and scalable approach to generative modeling which builds on the stale (time-independent) approximation of the gradient flow of entropy-regularized f-divergences introduced in recent work. Specifically, the intractable time-dependent density ratio is approximated by a stale estimator given by a GAN discriminator. This is sufficient in the case of sample refinement, where the source and target distributions of the flow are close to each other. However, this assumption is invalid for generation and a naive application of the stale estimator fails due to the large chasm between the two distributions. FDRL proposes to train a density ratio estimator such that it learns from progressively improving samples during the training process. We show that this simple method alleviates the density chasm problem, allowing FDRL to generate images of dimensions as high as $128\times128$, as well as outperform existing gradient flow baselines on quantitative benchmarks. We also show the flexibility of FDRL with two use cases. First, unconditional FDRL can be easily composed with external classifiers to perform class-conditional generation. Second, FDRL can be directly applied to unpaired image-to-image translation with no modifications needed to the framework. Our code is publicly available at https://github.com/clear-nus/fdrl.

6/6/2024

Physics-Informed Diffusion Models

Jan-Hendrik Bastek, WaiChing Sun, Dennis M. Kochmann

Generative models such as denoising diffusion models are quickly advancing their ability to approximate highly complex data distributions. They are also increasingly leveraged in scientific machine learning, where samples from the implied data distribution are expected to adhere to specific governing equations. We present a framework to inform denoising diffusion models of underlying constraints on such generated samples during model training. Our approach improves the alignment of the generated samples with the imposed constraints and significantly outperforms existing methods without affecting inference speed. Additionally, our findings suggest that incorporating such constraints during training provides a natural regularization against overfitting. Our framework is easy to implement and versatile in its applicability for imposing equality and inequality constraints as well as auxiliary optimization objectives.

5/24/2024