Discrete Nonparametric Causal Discovery Under Latent Class Confounding






Published 5/24/2024 by Bijan Mazaheri, Spencer Gordon, Yuval Rabani, Leonard Schulman



An acyclic causal structure can be described using a directed acyclic graph (DAG) with arrows indicating causation. The task of learning this structure from data is known as causal discovery. Diverse populations or changing environments can sometimes give rise to heterogeneous data. This heterogeneity can be thought of as a mixture model with multiple sources, each exerting their own distinct signature on the observed variables. From this perspective, the source is a latent common cause for every observed variable. While some methods for causal discovery are able to work around unobserved confounding in special cases, the only known ways to deal with a global confounder (such as a latent class) involve parametric assumptions. Focusing on discrete observables, we demonstrate that globally confounded causal structures can still be identifiable without parametric assumptions, so long as the number of latent classes remains small relative to the size and sparsity of the underlying DAG.



Overview

  • Causal structures can be represented using directed acyclic graphs (DAGs) with arrows indicating causation.
  • Learning these causal structures from data is called "causal discovery."
  • Heterogeneous data, such as from diverse populations or changing environments, can be modeled as a mixture with multiple "sources" - latent common causes for the observed variables.
  • Existing causal discovery methods have limitations in dealing with global confounders (like latent classes) without restrictive parametric assumptions.

Plain English Explanation

Imagine you're trying to understand how different factors influence each other in a complex system. You can think of this as a set of "causes" and "effects," where one thing leads to another in a specific direction. These causal relationships can be visualized using a diagram with arrows pointing from the causes to the effects, called a directed acyclic graph (DAG).
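As a rough illustration of the DAG idea, the sketch below stores a small causal graph as a mapping from each variable to its direct causes and uses Python's standard `graphlib` module to order the variables so that causes come before effects. The variable names and the helper `causal_order` are hypothetical and chosen only for this example; they do not come from the paper.

```python
from graphlib import TopologicalSorter

# Hypothetical example graph: each key is an effect,
# each value is the set of its direct causes (its parents in the DAG).
parents = {
    "exercise": set(),
    "diet": set(),
    "weight": {"exercise", "diet"},
    "blood_pressure": {"weight"},
}


def causal_order(parents):
    """Return the variables ordered so that every cause precedes its effects.

    Raises graphlib.CycleError if the arrows form a directed cycle,
    in which case the graph is not a DAG.
    """
    return list(TopologicalSorter(parents).static_order())


order = causal_order(parents)
print(order)
```

A graph with a cycle, such as `{"a": {"b"}, "b": {"a"}}`, makes `causal_order` raise `graphlib.CycleError`, which is one simple way to check the "acyclic" part of the definition.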

The process of figuring out these causal structures from the available data is called "causal discovery." This can be challenging, especially when the data comes from diverse sources or changes over time. In these cases, the data may have what's called "heterogeneity," where there are multiple underlying "sources" or factors influencing the observations.

Imagine you're studying the factors that affect a person's health, and the data comes from people in different countries or at different time periods. The populations may have their own unique characteristics that impact the data in distinct ways. These hidden factors can be thought of as "latent classes" - like invisible groups within the data - each exerting its own influence on the observed variables.
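The effect of a hidden class can be made concrete with a small exact calculation. In the sketch below, two binary variables X and Y have no causal link and are independent inside each latent class, yet they become dependent once the class label is hidden and the populations are pooled. All probabilities and names here are made-up assumptions for illustration, not values from the paper.

```python
from itertools import product

# Hypothetical setup: two latent classes (for example, two countries) of equal size.
class_weights = [0.5, 0.5]
# P(X = 1 | class) and P(Y = 1 | class). Within a class, X and Y are independent.
p_x1 = [0.9, 0.1]
p_y1 = [0.8, 0.2]


def bernoulli(p_one, value):
    """Probability that a binary variable with P(1) = p_one takes `value`."""
    return p_one if value == 1 else 1.0 - p_one


def pooled_joint(class_weights, p_x1, p_y1):
    """P(X = x, Y = y) after the class label is summed out (hidden)."""
    joint = {}
    for x, y in product([0, 1], repeat=2):
        joint[(x, y)] = sum(
            w * bernoulli(px, x) * bernoulli(py, y)
            for w, px, py in zip(class_weights, p_x1, p_y1)
        )
    return joint


joint = pooled_joint(class_weights, p_x1, p_y1)
p_x = joint[(1, 0)] + joint[(1, 1)]  # marginal P(X = 1)
p_y = joint[(0, 1)] + joint[(1, 1)]  # marginal P(Y = 1)

# A non-zero gap means X and Y look dependent in the pooled data,
# even though neither causes the other.
gap = joint[(1, 1)] - p_x * p_y
print(f"P(X=1, Y=1) - P(X=1) * P(Y=1) = {gap:.4f}")
```

Because the latent class touches every observed variable, this kind of spurious dependence shows up between all pairs of variables at once, which is what makes a global confounder harder to handle than a confounder of a single pair.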

While some existing methods can handle certain types of hidden confounders, the only known ways to deal with a single, global confounder (like a latent class) require parametric assumptions, that is, assumptions that the data follow a specific family of distributions. Such assumptions can be quite limiting, especially when working with discrete (non-continuous) variables.

The key insight from this research is that even with these global confounders, it's still possible to identify the causal structure, as long as the number of latent classes is relatively small compared to the size and sparsity of the underlying causal graph. This means that under certain conditions, we can uncover the causal relationships without needing to make those restrictive parametric assumptions.

Technical Explanation

The paper explores the problem of learning causal structures from heterogeneous data, where the observed variables are influenced by multiple, distinct "sources" that act as a latent common cause. This is related to the concept of causal representation learning from multiple distributions.

The authors show that even in the presence of a single, global confounder (such as a latent class), the causal structure can still be identifiable, as long as the number of latent classes is small compared to the size and sparsity of the underlying directed acyclic graph (DAG). This contrasts with existing causal discovery methods, which can work around unobserved confounding only in special cases, a limitation also studied in work on local causal discovery.
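One standard way to write this setting down is as a mixture of DAG-factorized distributions. The notation below is an assumption made for this summary rather than the paper's own: U is the latent class taking k values, v_1, ..., v_n are the observed discrete variables, and pa(v_i) denotes the observed parents of v_i in the DAG.

```latex
P(v_1, \dots, v_n)
  = \sum_{u=1}^{k} P(U = u) \prod_{i=1}^{n} P\bigl(v_i \mid \mathrm{pa}(v_i),\, U = u\bigr)
```

Only the left-hand side is observed. The identifiability question is whether the parent sets pa(v_i) can be recovered from it when U is a parent of every observed variable and no parametric form is imposed on the conditional distributions.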

The key insight is that by focusing on discrete (non-continuous) observables, the authors avoid the parametric assumptions that were previously the only known way to deal with global confounders. This is an interesting alternative to approaches that rely on parametric models with unobserved confounders, such as work on simultaneous inference.

The authors demonstrate their approach using both synthetic and real-world datasets, showing that it can recover the correct causal structure in the presence of a global confounder. This relates to the general problem of causal generative modeling, which has been explored in approaches like the FIP (fixed-point) method.

Critical Analysis

The paper presents a promising approach for causal discovery in the presence of global confounders, such as latent classes. By focusing on discrete observables, the authors are able to avoid the restrictive parametric assumptions required by some existing methods.

However, the authors acknowledge that their approach relies on the assumption that the number of latent classes is small relative to the size and sparsity of the underlying causal graph. In practice, this may not always be the case, and it would be valuable to explore the performance of the method when this assumption is violated.
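A crude way to see why the number of latent classes must stay small is to count free parameters. The sketch below is a back-of-the-envelope heuristic, not the paper's identifiability condition: it compares the number of free parameters in a k-class mixture of discrete Bayesian networks with the number of degrees of freedom in the observed joint distribution. Both function names are hypothetical.

```python
def mixture_parameter_count(parent_counts, num_classes, num_states=2):
    """Free parameters of a mixture of discrete Bayesian networks sharing one DAG.

    parent_counts[i] is the number of observed parents of variable i.
    Each variable needs (num_states - 1) free probabilities for every
    configuration of its parents, separately for every latent class.
    """
    per_class = sum(
        (num_states - 1) * num_states ** n_parents for n_parents in parent_counts
    )
    # (num_classes - 1) free mixing weights, plus one parameter set per class.
    return (num_classes - 1) + num_classes * per_class


def joint_parameter_count(num_variables, num_states=2):
    """Degrees of freedom of an unrestricted joint distribution."""
    return num_states ** num_variables - 1


# Hypothetical example: a chain of six binary variables, v1 -> v2 -> ... -> v6.
chain = [0, 1, 1, 1, 1, 1]
for k in (2, 6):
    print(k, mixture_parameter_count(chain, k), joint_parameter_count(len(chain)))
```

In this toy chain, two classes leave the model with far fewer parameters than the observed joint has degrees of freedom, while six classes already exceed it, so the observed distribution cannot pin the parameters down. Denser graphs use up this budget faster, which is consistent with the role the summary assigns to size and sparsity.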

Additionally, the paper does not provide a comprehensive discussion of the potential limitations or caveats of the proposed approach. For example, it would be helpful to understand how the method might perform in the presence of measurement error, missing data, or other common challenges in real-world causal discovery scenarios.

Furthermore, the ability to handle global counterfactual directions, explored in related work, could be an important consideration for the broader applicability of the method.

Overall, the paper presents an interesting and potentially valuable contribution to the field of causal discovery, but further research and analysis would be needed to fully assess the method's strengths, weaknesses, and practical applications.


Conclusion

This research addresses the challenge of learning causal structures from heterogeneous data, where the observed variables are influenced by multiple, distinct latent common causes or "sources." The key insight is that even in the presence of a single, global confounder (such as a latent class), the causal structure can still be identifiable, as long as the number of latent classes is relatively small compared to the size and sparsity of the underlying causal graph.

By focusing on discrete observables, the authors are able to circumvent the need for restrictive parametric assumptions that are often required when dealing with global confounders. This represents a promising approach for causal discovery in complex, real-world scenarios where hidden factors may be influencing the observed data.

While the method has some limitations and caveats that require further exploration, this research contributes to our understanding of causal inference in the presence of unobserved confounding and opens up new avenues for developing more robust and flexible causal discovery techniques.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers


Hybrid Global Causal Discovery with Local Search

Sujai Hiremath, Jacqueline R. M. A. Maasch, Mengxiao Gao, Promit Ghosal, Kyra Gan





Learning the unique directed acyclic graph corresponding to an unknown causal model is a challenging task. Methods based on functional causal models can identify a unique graph, but either suffer from the curse of dimensionality or impose strong parametric assumptions. To address these challenges, we propose a novel hybrid approach for global causal discovery in observational data that leverages local causal substructures. We first present a topological sorting algorithm that leverages ancestral relationships in linear structural equation models to establish a compact top-down hierarchical ordering, encoding more causal information than linear orderings produced by existing methods. We demonstrate that this approach generalizes to nonlinear settings with arbitrary noise. We then introduce a nonparametric constraint-based algorithm that prunes spurious edges by searching for local conditioning sets, achieving greater accuracy than current methods. We provide theoretical guarantees for correctness and worst-case polynomial time complexities, with empirical validation on synthetic data.



Personalized Binomial DAGs Learning with Network Structured Covariates


Boxin Zhao, Weishi Wang, Dingyuan Zhu, Ziqi Liu, Dong Wang, Zhiqiang Zhang, Jun Zhou, Mladen Kolar





The causal dependence in data is often characterized by Directed Acyclic Graphical (DAG) models, widely used in many areas. Causal discovery aims to recover the DAG structure using observational data. This paper focuses on causal discovery with multi-variate count data. We are motivated by real-world web visit data, recording individual user visits to multiple websites. Building a causal diagram can help understand user behavior in transitioning between websites, inspiring operational strategy. A challenge in modeling is user heterogeneity, as users with different backgrounds exhibit varied behaviors. Additionally, social network connections can result in similar behaviors among friends. We introduce personalized Binomial DAG models to address heterogeneity and network dependency between observations, which are common in real-world applications. To learn the proposed DAG model, we develop an algorithm that embeds the network structure into a dimension-reduced covariate, learns each node's neighborhood to reduce the DAG search space, and explores the variance-mean relation to determine the ordering. Simulations show our algorithm outperforms state-of-the-art competitors in heterogeneous data. We demonstrate its practical usefulness on a real-world web visit dataset.




Local Causal Structure Learning in the Presence of Latent Variables

Feng Xie, Zheng Li, Peng Wu, Yan Zeng, Chunchen Liu, Zhi Geng





Discovering causal relationships from observational data, particularly in the presence of latent variables, poses a challenging problem. While current local structure learning methods have proven effective and efficient when the focus lies solely on the local relationships of a target variable, they operate under the assumption of causal sufficiency. This assumption implies that all the common causes of the measured variables are observed, leaving no room for latent variables. Such a premise can be easily violated in various real-world applications, resulting in inaccurate structures that may adversely impact downstream tasks. In light of this, our paper delves into the primary investigation of locally identifying potential parents and children of a target from observational data that may include latent variables. Specifically, we harness the causal information from m-separation and V-structures to derive theoretical consistency results, effectively bridging the gap between global and local structure learning. Together with the newly developed stop rules, we present a principled method for determining whether a variable is a direct cause or effect of a target. Further, we theoretically demonstrate the correctness of our approach under the standard causal Markov and faithfulness conditions, with infinite samples. Experimental results on both synthetic and real-world data validate the effectiveness and efficiency of our approach.




Sample, estimate, aggregate: A recipe for causal discovery foundation models

Menghua Wu, Yujia Bao, Regina Barzilay, Tommi Jaakkola





Causal discovery, the task of inferring causal structure from data, promises to accelerate scientific research, inform policy making, and more. However, causal discovery algorithms over larger sets of variables tend to be brittle against misspecification or when data are limited. To mitigate these challenges, we train a supervised model that learns to predict a larger causal graph from the outputs of classical causal discovery algorithms run over subsets of variables, along with other statistical hints like inverse covariance. Our approach is enabled by the observation that typical errors in the outputs of classical methods remain comparable across datasets. Theoretically, we show that this model is well-specified, in the sense that it can recover a causal graph consistent with graphs over subsets. Empirically, we train the model to be robust to erroneous estimates using diverse synthetic data. Experiments on real and synthetic data demonstrate that this model maintains high accuracy in the face of misspecification or distribution shift, and can be adapted at low cost to different discovery algorithms or choice of statistics.

