Blue noise for diffusion models

Read original: arXiv:2402.04930 - Published 5/3/2024 by Xingchang Huang, Corentin Salaun, Cristina Vasconcelos, Christian Theobalt, Cengiz Oztireli, Gurprit Singh


Overview

Current diffusion models use Gaussian noise during training and sampling, which may not effectively capture the frequency content reconstructed by the denoising network.
The paper introduces a novel class of diffusion models that incorporate correlated noise within and across images.
The proposed time-varying noise model uses blue noise to improve generation quality compared to Gaussian white noise.
The framework allows introducing correlation across images in a mini-batch to improve gradient flow.

Plain English Explanation

Current diffusion models use random, uncorrelated noise (Gaussian white noise) at every time step, both when training the model and when generating new samples. This may not be the best choice, because white noise is not tailored to the frequency content that the denoising network reconstructs.

To address this, the researchers propose a new type of diffusion model that uses correlated noise instead of pure random noise. Correlated noise means that the noise values are related to each other, rather than being completely independent. This can help the model better capture the frequency patterns it needs to reconstruct.

Specifically, the researchers introduce a time-varying noise model that incorporates this correlated noise. They use a type of correlated noise called blue noise, whose energy is concentrated at high frequencies with little low-frequency content. The paper reports that this blue noise-based approach improves image quality compared to using Gaussian white noise alone.
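To make the difference between white and blue noise concrete, the sketch below shapes Gaussian white noise in the Fourier domain so that low frequencies are suppressed. This is plain spectral shaping, swapped in for illustration; it is not the paper's mask-generation method, and the function names, image size, and cutoff are assumptions.

```python
import numpy as np

def radial_frequency(size):
    """Radial frequency (cycles per pixel) of every 2-D FFT bin."""
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    return np.sqrt(fx**2 + fy**2)

def spectrally_shaped_noise(size, rng, highpass=True):
    """Return a (size, size) noise image.

    highpass=False gives plain Gaussian white noise.
    highpass=True scales each frequency by its radius, which removes
    the DC term and attenuates low frequencies (blue-noise-like).
    """
    white = rng.standard_normal((size, size))
    if not highpass:
        return white
    shaped = np.fft.ifft2(np.fft.fft2(white) * radial_frequency(size)).real
    return shaped / shaped.std()  # rescale to unit variance

def low_frequency_energy_fraction(img, cutoff=0.1):
    """Fraction of spectral power at radial frequencies below `cutoff`."""
    power = np.abs(np.fft.fft2(img)) ** 2
    mask = radial_frequency(img.shape[0]) < cutoff
    return power[mask].sum() / power.sum()

rng = np.random.default_rng(0)
white = spectrally_shaped_noise(64, rng, highpass=False)
blue = spectrally_shaped_noise(64, rng, highpass=True)

white_frac = low_frequency_energy_fraction(white)
blue_frac = low_frequency_energy_fraction(blue)
print(f"low-frequency energy: white={white_frac:.4f}, blue-like={blue_frac:.4f}")
```

The blue-like sample should carry a much smaller share of its energy below the cutoff than the white sample, which is the property the explanation above refers to.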

Additionally, the researchers' framework allows for introducing correlations across images within a single batch during training. This can help improve the flow of gradients through the network, leading to better optimization and performance.

Technical Explanation

The paper proposes a novel class of diffusion models that incorporate correlated noise within and across images, in contrast to the standard Gaussian white noise used in most existing diffusion models.

The key technical contributions are:

Time-varying Noise Model: The researchers introduce a time-varying noise model that incorporates correlated noise, such as blue noise, into the diffusion process. This helps better capture the frequency content reconstructed by the denoising network.
Correlated Noise Mask Generation: The paper presents a method for efficiently generating correlated noise masks, which can be used to introduce noise correlations during training and sampling.
Cross-image Noise Correlation: The framework allows for introducing noise correlations across images within a single mini-batch, which can improve the flow of gradients and lead to better optimization.
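The first two contributions can be illustrated with a toy sketch: white noise is mapped through a matrix that blends the identity with a factor of a correlated-noise covariance, and the blend weight changes with the time step. Everything here is an assumption made for illustration, not the paper's formulation: the 1-D signal, the neighbour-anticorrelated covariance standing in for a blue-noise covariance, the linear schedule and its direction, and all function names.

```python
import numpy as np

def neighbor_anticorrelated_factor(n, rho=-0.4):
    """Cholesky factor of a 1-D covariance with unit variance and
    correlation `rho` between adjacent samples (toy stand-in for a
    blue-noise covariance; not the paper's mask construction)."""
    cov = np.eye(n) + rho * (np.eye(n, k=1) + np.eye(n, k=-1))
    return np.linalg.cholesky(cov)

def gamma_schedule(t, num_steps):
    """Hypothetical linear schedule: 0 at t=0 (fully correlated),
    1 at t=num_steps-1 (fully white)."""
    return t / (num_steps - 1)

def time_varying_noise(eps, corr_factor, gamma):
    """Map white noise `eps` through gamma * I + (1 - gamma) * L."""
    n = corr_factor.shape[0]
    mix = gamma * np.eye(n) + (1.0 - gamma) * corr_factor
    return mix @ eps

def adjacent_correlation(samples):
    """Mean empirical correlation between neighbouring entries.
    `samples` has shape (n, num_draws); rows are variables."""
    c = np.corrcoef(samples)
    return np.mean(np.diag(c, k=1))

n, num_steps = 32, 10
rng = np.random.default_rng(0)
L = neighbor_anticorrelated_factor(n)
eps = rng.standard_normal((n, 20000))  # many draws to estimate statistics

noise_early = time_varying_noise(eps, L, gamma_schedule(0, num_steps))
noise_late = time_varying_noise(eps, L, gamma_schedule(num_steps - 1, num_steps))

corr_early = adjacent_correlation(noise_early)
corr_late = adjacent_correlation(noise_late)
print(f"adjacent correlation: t=0 -> {corr_early:.3f}, t=T-1 -> {corr_late:.3f}")
```

In this toy, intermediate blend weights do not keep the per-sample variance at exactly one, so a real implementation would likely need to account for that; the sketch only shows how a single schedule parameter can move the noise between correlated and white.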

The researchers evaluate their method qualitatively and quantitatively on a variety of datasets and tasks, reporting improvements in Fréchet Inception Distance (FID) over existing deterministic diffusion models.
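For reference, FID is commonly defined as the Fréchet distance between two Gaussians fitted to Inception-network features of real and generated images. The symbols below are notation assumed here, not taken from the paper: \mu_r, \Sigma_r are the feature mean and covariance for real images, and \mu_g, \Sigma_g are those for generated images.

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \operatorname{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

Lower values indicate that the generated feature distribution is closer to the real one.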

Critical Analysis

The paper presents a thoughtful approach to addressing a potential limitation of current diffusion models, namely the use of Gaussian white noise, which may not be optimal for capturing the frequency content of the reconstructed images.

One potential caveat is that the paper does not provide a deep theoretical analysis of why correlated noise, such as blue noise, should lead to better performance. The authors mention that correlated noise has been used in computer graphics, but more discussion on the underlying principles and how they apply to diffusion models would be helpful.

Additionally, while the paper demonstrates improvements in the FID metric, it would be valuable to see evaluations on a broader range of metrics and tasks to fully assess the benefits of the proposed approach. Comparisons to other noise modeling techniques, such as consistent diffusion or fine-grained color guidance, would also provide a more comprehensive understanding of the method's strengths and limitations.

Overall, the paper introduces an interesting and potentially impactful direction for improving diffusion models, but further research and analysis would be valuable to fully understand the implications and broader applicability of the approach.

Conclusion

This paper presents a novel class of diffusion models that incorporate correlated noise, such as blue noise, within and across images during training and sampling. By moving beyond purely Gaussian white noise, the researchers report improvements in image generation quality over existing deterministic diffusion models, as measured by the FID metric.

The key contributions of this work are the time-varying noise model, the efficient generation of correlated noise masks, and the ability to introduce noise correlations across images during training. These advancements have the potential to lead to better-performing diffusion models, which could have far-reaching impacts in various generative modeling applications.

While further research is needed to fully understand the theoretical underpinnings and broader implications of this approach, the paper represents an important step forward in the ongoing effort to improve the capabilities and versatility of diffusion-based generative models.



Related Papers


Blue noise for diffusion models

Xingchang Huang, Corentin Salaun, Cristina Vasconcelos, Christian Theobalt, Cengiz Oztireli, Gurprit Singh

Most of the existing diffusion models use Gaussian noise for training and sampling across all time steps, which may not optimally account for the frequency contents reconstructed by the denoising network. Despite the diverse applications of correlated noise in computer graphics, its potential for improving the training process has been underexplored. In this paper, we introduce a novel and general class of diffusion models taking correlated noise within and across images into account. More specifically, we propose a time-varying noise model to incorporate correlated noise into the training process, as well as a method for fast generation of correlated noise mask. Our model is built upon deterministic diffusion models and utilizes blue noise to help improve the generation quality compared to using Gaussian white (random) noise only. Further, our framework allows introducing correlation across images within a single mini-batch to improve gradient flow. We perform both qualitative and quantitative evaluations on a variety of datasets using our method, achieving improvements on different tasks over existing deterministic diffusion models in terms of FID metric.

5/3/2024

One Noise to Rule Them All: Learning a Unified Model of Spatially-Varying Noise Patterns

Arman Maesumi, Dylan Hu, Krishi Saripalli, Vladimir G. Kim, Matthew Fisher, Soren Pirk, Daniel Ritchie

Procedural noise is a fundamental component of computer graphics pipelines, offering a flexible way to generate textures that exhibit natural random variation. Many different types of noise exist, each produced by a separate algorithm. In this paper, we present a single generative model which can learn to generate multiple types of noise as well as blend between them. In addition, it is capable of producing spatially-varying noise blends despite not having access to such data for training. These features are enabled by training a denoising diffusion model using a novel combination of data augmentation and network conditioning techniques. Like procedural noise generators, the model's behavior is controllable via interpretable parameters and a source of randomness. We use our model to produce a variety of visually compelling noise textures. We also present an application of our model to improving inverse procedural material design; using our model in place of fixed-type noise nodes in a procedural material graph results in higher-fidelity material reconstructions without needing to know the type of noise in advance.

4/26/2024


Quantum-Noise-Driven Generative Diffusion Models

Marco Parigi, Stefano Martina, Filippo Caruso

Generative models realized with machine learning techniques are powerful tools to infer complex and unknown data distributions from a finite number of training samples in order to produce new synthetic data. Diffusion models are an emerging framework that have recently overcome the performance of the generative adversarial networks in creating synthetic text and high-quality images. Here, we propose and discuss the quantum generalization of diffusion models, i.e., three quantum-noise-driven generative diffusion models that could be experimentally tested on real quantum systems. The idea is to harness unique quantum features, in particular the non-trivial interplay among coherence, entanglement and noise that the currently available noisy quantum processors do unavoidably suffer from, in order to overcome the main computational burdens of classical diffusion models during inference. Hence, we suggest to exploit quantum noise not as an issue to be detected and solved but instead as a very remarkably beneficial key ingredient to generate much more complex probability distributions that would be difficult or even impossible to express classically, and from which a quantum processor might sample more efficiently than a classical one. An example of numerical simulations for an hybrid classical-quantum generative diffusion model is also included. Therefore, our results are expected to pave the way for new quantum-inspired or quantum-based generative diffusion algorithms addressing more powerfully classical tasks as data generation/prediction with widespread real-world applications ranging from climate forecasting to neuroscience, from traffic flow analysis to financial forecasting.

6/13/2024


Physics-Informed Diffusion Models

Jan-Hendrik Bastek, WaiChing Sun, Dennis M. Kochmann

Generative models such as denoising diffusion models are quickly advancing their ability to approximate highly complex data distributions. They are also increasingly leveraged in scientific machine learning, where samples from the implied data distribution are expected to adhere to specific governing equations. We present a framework to inform denoising diffusion models of underlying constraints on such generated samples during model training. Our approach improves the alignment of the generated samples with the imposed constraints and significantly outperforms existing methods without affecting inference speed. Additionally, our findings suggest that incorporating such constraints during training provides a natural regularization against overfitting. Our framework is easy to implement and versatile in its applicability for imposing equality and inequality constraints as well as auxiliary optimization objectives.

5/24/2024