Causal Discovery in Semi-Stationary Time Series

Read original: arXiv:2407.07291 - Published 7/11/2024 by Shanyun Gao, Raghavendra Addanki, Tong Yu, Ryan A. Rossi, Murat Kocaoglu

Overview

Examines the problem of causal discovery in semi-stationary time series data
Proposes a novel approach to identify causal relationships in non-stationary processes with gradually changing dynamics
Introduces a framework that combines change point detection and structural causal models to uncover causal structure

Plain English Explanation

This paper addresses a challenging problem in the field of causal inference: determining causal relationships in time series data where the underlying processes are gradually changing over time, rather than remaining constant. This type of "semi-stationary" data is common in many real-world scenarios, such as financial markets or climate systems, where the dynamics can shift slowly but significantly.

The key insight of the paper is to combine two powerful techniques: change point detection and structural causal models. By first identifying the points in time where the data-generating process undergoes a change, the researchers can then apply causal discovery methods to each "stationary" segment separately. This allows them to uncover the causal structure within each regime, as well as how that structure evolves over time.

The proposed framework is validated on both synthetic and real-world datasets, demonstrating its ability to accurately recover causal relationships in the presence of non-stationarity. This work represents an important advancement in the field of causal inference from time series data, building on previous efforts to discover causal models from time-varying data and detect causal changes over time.

Technical Explanation

The paper introduces a novel framework for causal discovery in semi-stationary time series, which combines change point detection and structural causal modeling. The key steps are:

Change Point Detection: The researchers first apply a change point detection algorithm to identify the times at which the underlying data-generating process undergoes a shift. This allows them to partition the time series into distinct "regimes" with relatively stable dynamics.
Causal Discovery: Within each regime, the framework then applies a structural causal model (SCM) learning algorithm to uncover the causal relationships among the variables. The SCM approach can capture more complex dependencies than traditional Granger causality methods.
Temporal Causal Modeling: By tracking how the causal structure evolves across the different regimes, the framework is able to model the temporal dynamics of the causal relationships. This is an important advancement over prior work that focused on learning causal models from static data or assumed stationarity.
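
The three steps above can be sketched in a few lines of Python. This is a minimal illustration of the segment-then-discover idea as this summary describes it, not the authors' algorithm (the paper's abstract describes a constraint-based method built on conditional independence tests). The toy data, the function names `simulate`, `lagged_effect`, `residual_cost`, and `find_change_point`, and the use of a least-squares lag-1 slope as a stand-in for a causal discovery step are all assumptions made for the example.

```python
import random


def simulate(n=400, switch=200, seed=0):
    """Toy data (an assumption): X drives Y at lag 1 and the effect flips sign at `switch`."""
    rng = random.Random(seed)
    x = [rng.gauss(0, 1) for _ in range(n)]
    y = [0.0] * n
    for t in range(1, n):
        coef = 0.8 if t < switch else -0.8
        y[t] = coef * x[t - 1] + 0.1 * rng.gauss(0, 1)
    return x, y


def lagged_effect(x, y):
    """Least-squares slope of y[t] on x[t-1]: a crude stand-in for causal discovery."""
    pairs = list(zip(x[:-1], y[1:]))
    return sum(a * b for a, b in pairs) / sum(a * a for a, _ in pairs)


def residual_cost(x, y):
    """Squared error left over after fitting one lag-1 slope to the segment."""
    b = lagged_effect(x, y)
    return sum((yt - b * xp) ** 2 for xp, yt in zip(x[:-1], y[1:]))


def find_change_point(x, y, margin=20):
    """Step 1: choose the split whose two separate fits leave the least residual."""
    candidates = range(margin, len(x) - margin)
    return min(candidates,
               key=lambda t: residual_cost(x[:t], y[:t]) + residual_cost(x[t:], y[t:]))


x, y = simulate()
cp = find_change_point(x, y)            # step 1: locate the regime boundary
before = lagged_effect(x[:cp], y[:cp])  # step 2: estimate the X -> Y effect per regime
after = lagged_effect(x[cp:], y[cp:])
# Step 3: compare regimes to see how the relationship changed over time.
print(f"change point near t={cp}: effect {before:+.2f} before, {after:+.2f} after")
```

A single fit over the whole series would average the two opposite effects toward zero, which is why the segmentation step matters in this toy setting. A real application would replace the exhaustive split search and the single slope with a proper change point detector and a multivariate causal discovery method.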

The authors evaluate their approach on synthetic data and on a real-world climate dataset. The results indicate that the framework can recover causal structure even in the presence of slowly varying non-stationarity. This represents an important step forward in the field of causal discovery from tabular time series data.

Critical Analysis

The paper presents a compelling approach to causal discovery in semi-stationary time series, and the experimental results are promising. However, there are a few potential limitations and areas for further research:

Sensitivity to Change Point Detection: The performance of the overall framework is heavily dependent on the accuracy of the change point detection algorithm. Errors in identifying regime boundaries could lead to biased causal estimates within each segment.
Computational Complexity: Applying causal discovery methods to each regime individually may become computationally intensive for long time series with many change points. Scalability could be an issue for real-world applications with large datasets.
Handling Non-linear Relationships: While the structural causal modeling approach can capture more complex dependencies than linear methods, the paper does not extensively discuss the framework's ability to handle highly non-linear or dynamic causal relationships.
Incorporation of Domain Knowledge: The paper focuses on a purely data-driven approach to causal discovery. Incorporating available domain knowledge about the underlying processes could potentially improve the accuracy and interpretability of the inferred causal models.
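
The first limitation, sensitivity to change point detection, can be made concrete with a short calculation. As a simplified illustration (an assumption for this example, not a model taken from the paper), suppose $Y_t = b_r X_{t-1} + \varepsilon_t$ within regime $r$, with $X$ having the same variance in both regimes. If a detected segment mistakenly contains $n_1$ samples from regime 1 and $n_2$ samples from regime 2, a single least-squares fit over that segment gives, approximately, a sample-weighted average of the two coefficients:

```latex
\hat{b} \;\approx\; \frac{n_1 b_1 + n_2 b_2}{n_1 + n_2},
\qquad\text{e.g.}\qquad
\frac{200(0.8) + 50(-0.8)}{250} = 0.48 .
```

In this toy setting, a boundary placed 50 samples too late pulls an effect of 0.8 down to roughly 0.48, and a larger overlap could cancel the effect entirely or flip its sign.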

Despite these potential limitations, this work represents a significant contribution to the field of causal inference from time series data. The proposed framework provides a principled way to uncover causal structures in the face of non-stationarity, which is a critical challenge in many real-world applications.

Conclusion

This paper introduces a novel framework for causal discovery in semi-stationary time series, which combines change point detection and structural causal modeling. By first identifying the points in time where the data-generating process undergoes a shift, the framework can then apply causal discovery methods to each "stationary" segment separately, allowing it to uncover the evolving causal structure over time.

The experimental results demonstrate the effectiveness of this approach in recovering causal relationships, even in the presence of slowly varying non-stationarity. This work represents an important advancement in the field of causal inference from time series data, with potential applications in a wide range of domains, such as finance, climate science, and systems biology.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Causal Discovery in Semi-Stationary Time Series

Shanyun Gao, Raghavendra Addanki, Tong Yu, Ryan A. Rossi, Murat Kocaoglu

Discovering causal relations from observational time series without making the stationary assumption is a significant challenge. In practice, this challenge is common in many areas, such as retail sales, transportation systems, and medical science. Here, we consider this problem for a class of non-stationary time series. The structural causal model (SCM) of this type of time series, called the semi-stationary time series, exhibits that a finite number of different causal mechanisms occur sequentially and periodically across time. This model holds considerable practical utility because it can represent periodicity, including common occurrences such as seasonality and diurnal variation. We propose a constraint-based, non-parametric algorithm for discovering causal relations in this setting. The resulting algorithm, PCMCI$_{\Omega}$, can capture the alternating and recurring changes in the causal mechanisms and then identify the underlying causal graph with conditional independence (CI) tests. We show that this algorithm is sound in identifying causal relations on discrete time series. We validate the algorithm with extensive experiments on continuous and discrete simulated data. We also apply our algorithm to a real-world climate dataset.

7/11/2024

Causal Inference from Slowly Varying Nonstationary Processes

Kang Du, Yu Xiang

Causal inference from observational data following the restricted structural causal models (SCM) framework hinges largely on the asymmetry between cause and effect from the data generating mechanisms, such as non-Gaussianity or non-linearity. This methodology can be adapted to stationary time series, yet inferring causal relationships from nonstationary time series remains a challenging task. In this work, we propose a new class of restricted SCM, via a time-varying filter and stationary noise, and exploit the asymmetry from nonstationarity for causal identification in both bivariate and network settings. We propose efficient procedures by leveraging powerful estimates of the bivariate evolutionary spectra for slowly varying processes. Various synthetic and real datasets that involve high-order and non-smooth filters are evaluated to demonstrate the effectiveness of our proposed methodology.

5/30/2024

Discovering Mixtures of Structural Causal Models from Time Series Data

Sumanth Varambally, Yi-An Ma, Rose Yu

Discovering causal relationships from time series data is significant in fields such as finance, climate science, and neuroscience. However, contemporary techniques rely on the simplifying assumption that data originates from the same causal model, while in practice, data is heterogeneous and can stem from different causal models. In this work, we relax this assumption and perform causal discovery from time series data originating from a mixture of causal models. We propose a general variational inference-based framework called MCD to infer the underlying causal models as well as the mixing probability of each sample. Our approach employs an end-to-end training process that maximizes an evidence-lower bound for the data likelihood. We present two variants: MCD-Linear for linear relationships and independent noise, and MCD-Nonlinear for nonlinear causal relationships and history-dependent noise. We demonstrate that our method surpasses state-of-the-art benchmarks in causal discovery tasks through extensive experimentation on synthetic and real-world datasets, particularly when the data emanates from diverse underlying causal graphs. Theoretically, we prove the identifiability of such a model under some mild assumptions.

6/26/2024

Temporally Disentangled Representation Learning under Unknown Nonstationarity

Xiangchen Song, Weiran Yao, Yewen Fan, Xinshuai Dong, Guangyi Chen, Juan Carlos Niebles, Eric Xing, Kun Zhang

In unsupervised causal representation learning for sequential data with time-delayed latent causal influences, strong identifiability results for the disentanglement of causally-related latent variables have been established in stationary settings by leveraging temporal structure. However, in nonstationary setting, existing work only partially addressed the problem by either utilizing observed auxiliary variables (e.g., class labels and/or domain indexes) as side information or assuming simplified latent causal dynamics. Both constrain the method to a limited range of scenarios. In this study, we further explored the Markov Assumption under time-delayed causally related process in nonstationary setting and showed that under mild conditions, the independent latent components can be recovered from their nonlinear mixture up to a permutation and a component-wise transformation, without the observation of auxiliary variables. We then introduce NCTRL, a principled estimation framework, to reconstruct time-delayed latent causal variables and identify their relations from measured sequential data only. Empirical evaluations demonstrated the reliable identification of time-delayed latent causal influences, with our methodology substantially outperforming existing baselines that fail to exploit the nonstationarity adequately and then, consequently, cannot distinguish distribution shifts.

8/2/2024