HUGO -- Highlighting Unseen Grid Options: Combining Deep Reinforcement Learning with a Heuristic Target Topology Approach

2405.00629

Published 5/24/2024 by Malte Lehna, Clara Holzhuter, Sven Tomforde, Christoph Scholz

Abstract

With the growth of Renewable Energy (RE) generation, the operation of power grids has become increasingly complex. One solution could be automated grid operation, where Deep Reinforcement Learning (DRL) has repeatedly shown significant potential in Learning to Run a Power Network (L2RPN) challenges. However, most existing DRL algorithms optimize the topology only through individual actions at the substation level. In contrast, we propose a more holistic approach that uses specific Target Topologies (TTs) as actions. These topologies are selected based on their robustness. As part of this paper, we present a search algorithm to find the TTs and upgrade our previously developed DRL agent CurriculumAgent (CAgent) to a novel topology agent. We compare the upgraded agent to the previous CAgent and increase its L2RPN score significantly, by 10%. Further, we achieve a 25% better median survival time with our TTs included. Later analysis shows that almost all TTs are close to the base topology, explaining their robustness.

Overview

As renewable energy generation grows, power grid operations have become increasingly complex.
Deep Reinforcement Learning (DRL) has shown promise in Learning to Run a Power Network (L2RPN) challenges, but existing DRL algorithms have focused on optimizing individual substation-level actions.
This paper proposes a more holistic approach by selecting specific Target Topologies (TTs) as actions, based on their robustness.
The paper presents a search algorithm to find the TTs and upgrades a previously developed DRL agent, CurriculumAgent (CAgent), to a novel "topology agent."

Plain English Explanation

As more and more renewable energy sources like wind and solar are added to the power grid, the overall system has become much more complex to manage and operate. One potential solution to this problem is to use deep reinforcement learning (DRL), a type of AI that can learn to make decisions on its own.

Previous research has shown that DRL can be effective at "Learning to Run a Power Network" (L2RPN), which means training an AI system to control a simulated power grid. However, most existing DRL approaches optimize only one action at a time, each limited to a single substation in the grid.

In contrast, this new paper proposes a more comprehensive approach. Instead of just optimizing individual actions, the researchers developed an algorithm to identify specific "target topologies" - configurations of the overall grid that are particularly robust and reliable. They then trained a DRL agent, which they call a "topology agent," to learn how to transition the grid between these target topologies in an optimal way.

By using this more holistic, topology-focused approach, the researchers were able to significantly improve the performance of their DRL agent compared to their previous agent. They saw a 10% increase in the agent's score on the L2RPN challenge, as well as a 25% higher median survival time, meaning the agent typically kept the simulated grid running longer.

The key insight was that focusing on optimal overall grid topologies, rather than just individual substation actions, allowed the DRL agent to make more strategic and robust decisions. Further analysis showed that the target topologies identified by the algorithm tended to be very similar to the base, "normal" grid topology, suggesting that small, incremental changes are often the most reliable way to keep the power system running smoothly.

Technical Explanation

The paper proposes a novel approach to using deep reinforcement learning (DRL) for the control of power grid operations. Whereas previous DRL algorithms for the Learning to Run a Power Network (L2RPN) challenge have focused on optimizing individual actions at the substation level, this paper takes a more holistic view.

The key innovation is the introduction of "Target Topologies" (TTs) as the actions that the DRL agent can take. These TTs are specific configurations of the overall power grid that have been selected for their robustness. The paper presents a search algorithm to identify these TTs, which are then used to train a novel "topology agent" built upon the authors' previously developed CurriculumAgent (CAgent) DRL framework.
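The summary does not spell out how the search algorithm works, so the following is only a minimal sketch of one way a robustness-based selection could look. It assumes a hypothetical encoding of a topology as bus assignments per substation and a hypothetical robustness score (the fraction of sampled grid states in which no line is overloaded). The names `Topology`, `robustness`, `select_target_topologies`, and `toy_max_line_loading` are illustrative and not taken from the paper; the toy loading function merely stands in for a real power-flow simulation and carries no physical meaning.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical encoding: substation id -> bus (1 or 2) of each connected element.
Topology = Dict[int, Tuple[int, ...]]
GridState = Dict[str, float]
LoadingFn = Callable[[Topology, GridState], float]


def robustness(
    topology: Topology,
    states: Sequence[GridState],
    max_line_loading: LoadingFn,
    threshold: float = 1.0,
) -> float:
    """Fraction of sampled states in which the maximum line loading stays below `threshold`."""
    if not states:
        return 0.0
    safe = sum(1 for state in states if max_line_loading(topology, state) < threshold)
    return safe / len(states)


def select_target_topologies(
    candidates: Sequence[Topology],
    states: Sequence[GridState],
    max_line_loading: LoadingFn,
    k: int = 2,
    threshold: float = 1.0,
) -> List[Topology]:
    """Return the k candidates with the highest robustness (ties keep input order)."""
    scored = [
        (robustness(topo, states, max_line_loading, threshold), index)
        for index, topo in enumerate(candidates)
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [candidates[index] for _, index in scored[:k]]


def toy_max_line_loading(topology: Topology, state: GridState) -> float:
    """Arbitrary stand-in for a power-flow simulation (assumption, not physics)."""
    moved = sum(bus == 2 for buses in topology.values() for bus in buses)
    factor = 0.9 if moved == 1 else 1.0 + 0.1 * moved
    return state["demand"] * factor
```

In a real setting, the loading function would be replaced by a simulation of the grid under each candidate topology, and the candidate set itself would have to come from some enumeration or sampling of substation configurations.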

Experiments show that upgrading CAgent to use the TT-based approach results in a 10% improvement in the agent's scores on the L2RPN challenge. Furthermore, the median survival time of the agent's grid operations is increased by 25% when using the TT-based approach.
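To make the two reported figures concrete, here is a small sketch of how a relative improvement and a median survival time could be computed from per-episode evaluation results. It assumes that both percentages are relative changes over the previous agent, which the summary does not state explicitly. The function names are hypothetical, and any numbers passed to them would be placeholders, not values from the paper.

```python
from statistics import median
from typing import Sequence


def relative_improvement(baseline: float, candidate: float) -> float:
    """Relative change of `candidate` over `baseline`, e.g. 0.10 for +10%."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (candidate - baseline) / abs(baseline)


def median_survival(steps_survived: Sequence[int]) -> float:
    """Median number of time steps survived across evaluation episodes."""
    if not steps_survived:
        raise ValueError("need at least one episode")
    return median(steps_survived)
```

The median is less sensitive than the mean to a few episodes that end very early or run to completion, which is one plausible reason to report it for survival times.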

Analysis of the identified TTs reveals that they tend to be quite similar to the base, "normal" grid topology. This suggests that small, incremental changes to the grid configuration are often the most reliable way to maintain stable operations, rather than more dramatic topological changes.
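One simple way to quantify "close to the base topology" is to count how many substations, and how many individual elements, sit on a different bus than in the base configuration. The sketch below assumes the same hypothetical encoding as a mapping from substation id to bus assignments; the paper may use a different distance measure. As a simplification, it also ignores that swapping the two bus labels within a substation yields an electrically equivalent configuration.

```python
from typing import Dict, List, Tuple

# Hypothetical encoding: substation id -> bus (1 or 2) of each connected element.
Topology = Dict[int, Tuple[int, ...]]


def changed_substations(base: Topology, target: Topology) -> List[int]:
    """Substations whose bus assignment differs between `base` and `target`."""
    if base.keys() != target.keys():
        raise ValueError("topologies must cover the same substations")
    return sorted(sub for sub in base if base[sub] != target[sub])


def element_distance(base: Topology, target: Topology) -> int:
    """Number of elements assigned to a different bus (a Hamming distance)."""
    total = 0
    for sub in changed_substations(base, target):
        if len(base[sub]) != len(target[sub]):
            raise ValueError(f"substation {sub} has mismatched element counts")
        total += sum(b != t for b, t in zip(base[sub], target[sub]))
    return total
```

The list returned by `changed_substations` also indicates which substation-level actions would be needed to move the grid from the base topology to a given TT, one reconfiguration per listed substation.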

Critical Analysis

The paper presents a compelling approach to using DRL for power grid control, with a clear focus on identifying robust grid topologies rather than just optimizing individual actions. This aligns well with the increasing complexity of power grids as renewable energy sources are integrated.

However, the paper does not address some potential limitations of the approach. For example, the search algorithm for identifying TTs is not evaluated in depth, and it is unclear how scalable or generalizable this process would be for larger, more complex grid systems. Additionally, the paper does not discuss how the identified TTs might need to be updated over time as the grid infrastructure and generation mix evolves.
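The scalability concern can be illustrated with a rough count of the search space. The sketch below assumes two busbars per substation and ignores line disconnections and validity constraints, so it yields an upper bound rather than an exact count. The function names are hypothetical.

```python
from math import prod
from typing import Sequence


def substation_configs(n_elements: int) -> int:
    """Upper bound on distinct two-busbar assignments for one substation.

    Each element sits on bus 1 or bus 2 (2**n options); swapping the two bus
    labels gives the same electrical configuration, hence 2**(n - 1).
    """
    if n_elements < 1:
        raise ValueError("a substation needs at least one element")
    return 2 ** (n_elements - 1)


def grid_topologies(elements_per_substation: Sequence[int]) -> int:
    """Upper bound on full-grid topologies: product over all substations."""
    return prod(substation_configs(n) for n in elements_per_substation)
```

Because the per-substation counts multiply, the number of candidate grid topologies grows exponentially with grid size, which is why any exhaustive search for TTs would quickly become impractical on larger systems.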

Further research could also explore the relationship between the TTs and other grid resilience metrics, such as the ability to withstand contingencies or natural disasters. It would be interesting to see if the TT-based approach provides benefits in these areas as well.

Overall, the paper presents a promising direction for DRL-based power grid control, but additional work is needed to fully understand the strengths, limitations, and broader applicability of the proposed topology-centric approach.

Conclusion

This paper introduces a novel, topology-focused approach to using deep reinforcement learning (DRL) for power grid control. By identifying and optimizing for specific "target topologies" (TTs) rather than just individual substation actions, the researchers were able to significantly improve the performance of their DRL agent on the Learning to Run a Power Network (L2RPN) challenge.

A later analysis showed that almost all of the identified TTs are close to the base topology, which the authors take as an explanation for their robustness. This suggests that small, incremental changes to the overall grid configuration tend to be more reliable than more dramatic topological shifts.

While further research is needed to fully understand the limitations and broader applicability of this approach, the paper presents a promising direction for using AI to help manage the transition to a more sustainable and resilient electricity system.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Fault Detection for agents on power grid topology optimization: A Comprehensive analysis

Malte Lehna, Mohamed Hassouna, Dmitry Degtyar, Sven Tomforde, Christoph Scholz

The topology optimization of transmission networks using Deep Reinforcement Learning (DRL) has increasingly come into focus. Various researchers have proposed different DRL agents, which are often benchmarked on the Grid2Op environment from the Learning to Run a Power Network (L2RPN) challenges. The environments have many advantages with their realistic chronics and underlying power flow backends. However, the interpretation of agent survival or failure is not always clear, as there are a variety of potential causes. In this work, we focus on the failures of the power grid to identify patterns and detect them a priori. We collect the failed chronics of three different agents on the WCCI 2022 L2RPN environment, totaling about 40k data points. By clustering, we are able to detect five distinct clusters, identifying different failure types. Further, we propose a multi-class prediction approach to detect failures beforehand and evaluate five different models. Here, the Light Gradient-Boosting Machine (LightGBM) shows the best performance, with an accuracy of 86%. It also correctly identifies failure and survival observations 91% of the time. Finally, we provide a detailed feature importance analysis that identifies critical features and regions in the grid.

6/26/2024

cs.LG cs.AI cs.SY eess.SY

Decentralized Coordination of Distributed Energy Resources through Local Energy Markets and Deep Reinforcement Learning

Daniel May, Matthew Taylor, Petr Musilek

As the energy landscape evolves toward sustainability, the accelerating integration of distributed energy resources poses challenges to the operability and reliability of the electricity grid. One significant aspect of this issue is the notable increase in net load variability at the grid edge. Transactive energy, implemented through local energy markets, has recently garnered attention as a promising solution to address the grid challenges in the form of decentralized, indirect demand response on a community level. Given the nature of these challenges, model-free control approaches, such as deep reinforcement learning, show promise for the decentralized automation of participation within this context. Existing studies at the intersection of transactive energy and model-free control primarily focus on socioeconomic and self-consumption metrics, overlooking the crucial goal of reducing community-level net load variability. This study addresses this gap by training a set of deep reinforcement learning agents to automate end-user participation in ALEX, an economy-driven local energy market. In this setting, agents do not share information and only prioritize individual bill optimization. The study unveils a clear correlation between bill reduction and reduced net load variability in this setup. The impact on net load variability is assessed over various time horizons using metrics such as ramping rate, daily and monthly load factor, as well as daily average and total peak export and import on an open-source dataset. Agents are then benchmarked against several baselines, with their performance levels showing promising results, approaching those of a near-optimal dynamic programming benchmark.

4/23/2024

eess.SY cs.AI cs.LG cs.MA cs.SY

End-to-End Reinforcement Learning of Curative Curtailment with Partial Measurement Availability

Hinrikus Wolf, Luis Bottcher, Sarra Bouchkati, Philipp Lutat, Jens Breitung, Bastian Jung, Tina Mollemann, Viktor Todosijević, Jan Schiefelbein-Lach, Oliver Pohl, Andreas Ulbig, Martin Grohe

In the course of the energy transition, the expansion of generation and consumption will change, and many of these technologies, such as PV systems, electric cars and heat pumps, will influence the power flow, especially in the distribution grids. Scalable methods that can make decisions for each grid connection are needed to enable congestion-free grid operation in the distribution grids. This paper presents a novel end-to-end approach to resolving congestion in distribution grids with deep reinforcement learning. Our architecture learns to curtail power and set appropriate reactive power to determine a non-congested and, thus, feasible grid state. State-of-the-art methods such as the optimal power flow (OPF) demand high computational costs and detailed measurements of every bus in a grid. In contrast, the presented method enables decisions under sparse information with just some buses observable in the grid. Distribution grids are generally not yet fully digitized and observable, so this method can be used for decision-making on the majority of low-voltage grids. On a real low-voltage grid the approach resolves 100% of violations in the voltage band and 98.8% of asset overloads. The results show that decisions can also be made on real grids that guarantee sufficient quality for congestion-free grid operation.

6/21/2024

cs.LG cs.AI cs.SY eess.SY

Learning Heuristics for Transit Network Design and Improvement with Deep Reinforcement Learning

Andrew Holliday, Ahmed El-Geneidy, Gregory Dudek

Transit agencies world-wide face tightening budgets. To maintain quality of service while cutting costs, efficient transit network design is essential. But planning a network of public transit routes is a challenging optimization problem. The most successful approaches to date use metaheuristic algorithms to search through the space of possible transit networks by applying low-level heuristics that randomly alter routes in a network. The design of these low-level heuristics has a major impact on the quality of the result. In this paper we use deep reinforcement learning with graph neural nets to learn low-level heuristics for an evolutionary algorithm, instead of designing them manually. These learned heuristics improve the algorithm's results on benchmark synthetic cities with 70 nodes or more, and obtain state-of-the-art results when optimizing operating costs. They also improve upon a simulation of the real transit network in the city of Laval, Canada, by as much as 54% and 18% on two key metrics, and offer cost savings of up to 12% over the city's existing transit network.

4/16/2024

cs.LG cs.AI cs.NE