Control Policy Correction Framework for Reinforcement Learning-based Energy Arbitrage Strategies

arXiv:2404.18821

Published 5/1/2024 by Seyed Soroush Karimi Madahi, Gargya Gokhale, Marie-Sophie Verwee, Bert Claessens, Chris Develder


Abstract

A continuous rise in the penetration of renewable energy sources, along with the use of single imbalance pricing, provides a new opportunity for balance responsible parties to reduce their cost through energy arbitrage in the imbalance settlement mechanism. Model-free reinforcement learning (RL) methods are an appropriate choice for solving the energy arbitrage problem due to their outstanding performance in solving complex stochastic sequential problems. However, RL is rarely deployed in real-world applications since its learned policy does not necessarily guarantee safety during the execution phase. In this paper, we propose a new RL-based control framework for batteries to obtain a safe energy arbitrage strategy in the imbalance settlement mechanism. In our proposed control framework, the agent initially aims to optimize the arbitrage revenue. Subsequently, in the post-processing step, we correct (constrain) the learned policy following a knowledge distillation process based on properties that follow human intuition. Our post-processing step is a generic method and is not restricted to the energy arbitrage domain. We use the Belgian imbalance price of 2023 to evaluate the performance of our proposed framework. Furthermore, we deploy our proposed control framework on a real battery to show its capability in the real world.
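To make the arbitrage objective in the abstract concrete, the following is a minimal sketch of how revenue could be computed for a battery schedule settled at a single imbalance price. It is not the authors' implementation. The function name `arbitrage_revenue`, the sign convention (positive power means discharging), and the 15-minute settlement period are assumptions made for illustration only.

```python
from typing import Sequence


def arbitrage_revenue(
    prices_eur_per_mwh: Sequence[float],
    power_mw: Sequence[float],
    period_hours: float = 0.25,  # assumed 15-minute settlement period
) -> float:
    """Revenue of a battery schedule settled at a single imbalance price.

    Assumed sign convention: power_mw > 0 is discharging (energy sold),
    power_mw < 0 is charging (energy bought). With a single price per
    period, both directions are settled at the same price.
    """
    if len(prices_eur_per_mwh) != len(power_mw):
        raise ValueError("prices and power must have the same length")
    # Energy per period (MWh) = power (MW) * period length (h).
    return sum(
        price * power * period_hours
        for price, power in zip(prices_eur_per_mwh, power_mw)
    )


# Hypothetical prices and schedule: charge at a negative price, then discharge.
prices = [-50.0, 20.0, 200.0, 120.0]
schedule = [-1.0, 0.0, 1.0, 1.0]
print(arbitrage_revenue(prices, schedule))  # 92.5
```

Charging during the negative-price period earns money in this example because the battery is paid to absorb energy, which is why that period contributes positively to the total.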


Overview

This paper describes the anonymous-acm package, which provides a way to anonymize submissions to the ACM LaTeX Master Article Template.
The package allows authors to hide their identities during the peer review process, while still maintaining the proper formatting and structure of the ACM article template.
The paper demonstrates how to use the anonymous-acm package and discusses its key features and benefits.

Plain English Explanation

The ACM LaTeX Master Article Template is a commonly used format for submitting academic papers to the Association for Computing Machinery (ACM). However, some publication venues require that the authors' identities be hidden during the peer review process to ensure an unbiased evaluation.

The anonymous-acm package provides a solution to this problem. It allows authors to anonymize their papers while still using the ACM article template. This means that the paper will have the same professional formatting and structure as a regular ACM paper, but the authors' names and other identifying information will be removed.

This is useful for papers that are submitted to conferences or journals that have a double-blind review process, where the reviewers do not know the identity of the authors. By using the anonymous-acm package, authors can ensure that their work is evaluated solely on its merits, without any potential biases introduced by knowing who wrote the paper.

Technical Explanation

The anonymous-acm package is designed to work with the ACM LaTeX Master Article Template. It provides a set of commands and options that allow authors to anonymize their papers while still maintaining the proper formatting and structure of the template.

The key features of the anonymous-acm package include:

Hiding the authors' names and affiliations in the title and header of the paper
Removing any references to the authors' previous work that could be used to identify them
Ensuring that the paper still complies with the ACM article template's formatting requirements, such as the use of specific fonts, layout, and section structures

The package also provides several options that allow authors to fine-tune the level of anonymity, such as whether to include acknowledgments or funding sources. This flexibility ensures that authors can strike the right balance between maintaining their anonymity and providing important contextual information about their work.

Critical Analysis

The anonymous-acm package provides a valuable tool for authors who need to submit their work to double-blind peer review processes. By ensuring that their identities are hidden, the package helps to mitigate potential biases and ensures that the papers are evaluated solely on their merits.

However, it is important to note that the package does not provide a complete solution for maintaining anonymity. Authors still need to be careful to avoid including any information in the paper, such as references to their previous work or acknowledgments, that could be used to identify them. Additionally, the package may not be suitable for all types of publications, such as those that require more detailed author information or different formatting requirements.

Further research could explore ways to enhance the anonymous-acm package, such as by integrating it with other tools for managing the peer review process or by developing additional features to ensure even stronger anonymity. Additionally, the package could be extended to support other academic paper templates beyond the ACM LaTeX Master Article Template.

Conclusion

The anonymous-acm package provides a valuable solution for authors who need to submit their work to double-blind peer review processes. By allowing them to anonymize their papers while still maintaining the proper formatting and structure of the ACM article template, the package helps to ensure that their work is evaluated solely on its merits, without any potential biases introduced by knowing the authors' identities.

While the package does not provide a complete solution for maintaining anonymity, it represents an important step forward in addressing the challenges associated with double-blind peer review. As the academic community continues to explore ways to improve the fairness and transparency of the publication process, tools like the anonymous-acm package will likely become increasingly important.

This summary was produced with help from an AI and may contain inaccuracies - check out the links to read the original source documents!

Related Papers

Reinforcement Learning for Efficient Design and Control Co-optimisation of Energy Systems

Marine Cauz, Adrien Bolland, Nicolas Wyrsch, Christophe Ballif

The ongoing energy transition drives the development of decentralised renewable energy sources, which are heterogeneous and weather-dependent, complicating their integration into energy systems. This study tackles this issue by introducing a novel reinforcement learning (RL) framework tailored for the co-optimisation of design and control in energy systems. Traditionally, the integration of renewable sources in the energy sector has relied on complex mathematical modelling and sequential processes. By leveraging RL's model-free capabilities, the framework eliminates the need for explicit system modelling. By optimising both control and design policies jointly, the framework enhances the integration of renewable sources and improves system efficiency. This contribution paves the way for advanced RL applications in energy management, leading to more efficient and effective use of renewable energy sources.

7/1/2024

cs.LG

Time-Varying Constraint-Aware Reinforcement Learning for Energy Storage Control

Jaeik Jeong, Tai-Yeon Ku, Wan-Ki Park

Energy storage devices, such as batteries, thermal energy storages, and hydrogen systems, can help mitigate climate change by ensuring a more stable and sustainable power supply. To maximize the effectiveness of such energy storage, determining the appropriate charging and discharging amounts for each time period is crucial. Reinforcement learning is preferred over traditional optimization for the control of energy storage due to its ability to adapt to dynamic and complex environments. However, the continuous nature of charging and discharging levels in energy storage poses limitations for discrete reinforcement learning, and time-varying feasible charge-discharge range based on state of charge (SoC) variability also limits the conventional continuous reinforcement learning. In this paper, we propose a continuous reinforcement learning approach that takes into account the time-varying feasible charge-discharge range. An additional objective function was introduced for learning the feasible action range for each time period, supplementing the objectives of training the actor for policy learning and the critic for value learning. This actively promotes the utilization of energy storage by preventing them from getting stuck in suboptimal states, such as continuous full charging or discharging. This is achieved through the enforcement of the charging and discharging levels into the feasible action range. The experimental results demonstrated that the proposed method further maximized the effectiveness of energy storage by actively enhancing its utilization.
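The time-varying feasible charge-discharge range mentioned above can be illustrated with a short sketch. This is not the method proposed in that paper, which learns the range with an additional objective. It only shows how the feasible range follows from the state of charge under assumed, simplified battery dynamics (no losses). All names, units, and the sign convention (positive power means discharging) are hypothetical.

```python
from typing import Tuple


def feasible_power_range(
    soc_mwh: float,
    capacity_mwh: float,
    max_power_mw: float,
    period_hours: float = 0.25,
) -> Tuple[float, float]:
    """Return (min_power, max_power) in MW for the next period.

    Negative values mean charging, positive values mean discharging.
    Assumes a lossless battery, which is a simplification.
    """
    if not 0.0 <= soc_mwh <= capacity_mwh:
        raise ValueError("state of charge must lie within [0, capacity]")
    # Discharging cannot remove more energy than is currently stored.
    max_discharge = min(max_power_mw, soc_mwh / period_hours)
    # Charging cannot add more energy than the remaining headroom.
    max_charge = min(max_power_mw, (capacity_mwh - soc_mwh) / period_hours)
    return -max_charge, max_discharge


def clip_power(requested_mw: float, low_mw: float, high_mw: float) -> float:
    """Project a requested action into the feasible range."""
    return max(low_mw, min(high_mw, requested_mw))


# Nearly empty 2 MWh battery with a 1 MW power rating.
low, high = feasible_power_range(soc_mwh=0.1, capacity_mwh=2.0, max_power_mw=1.0)
print((low, high))                 # (-1.0, 0.4)
print(clip_power(1.0, low, high))  # 0.4
```

The example shows why the range varies over time: a nearly empty battery can still charge at full power, but a full-power discharge request is cut back to what the stored energy allows.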

5/20/2024

cs.AI cs.LG


End-to-End Reinforcement Learning of Curative Curtailment with Partial Measurement Availability

Hinrikus Wolf, Luis Bottcher, Sarra Bouchkati, Philipp Lutat, Jens Breitung, Bastian Jung, Tina Mollemann, Viktor Todosijević, Jan Schiefelbein-Lach, Oliver Pohl, Andreas Ulbig, Martin Grohe

In the course of the energy transition, the expansion of generation and consumption will change, and many of these technologies, such as PV systems, electric cars and heat pumps, will influence the power flow, especially in the distribution grids. Scalable methods that can make decisions for each grid connection are needed to enable congestion-free grid operation in the distribution grids. This paper presents a novel end-to-end approach to resolving congestion in distribution grids with deep reinforcement learning. Our architecture learns to curtail power and set appropriate reactive power to determine a non-congested and, thus, feasible grid state. State-of-the-art methods such as the optimal power flow (OPF) demand high computational costs and detailed measurements of every bus in a grid. In contrast, the presented method enables decisions under sparse information with just some buses observable in the grid. Distribution grids are generally not yet fully digitized and observable, so this method can be used for decision-making on the majority of low-voltage grids. On a real low-voltage grid the approach resolves 100% of violations in the voltage band and 98.8% of asset overloads. The results show that decisions can also be made on real grids that guarantee sufficient quality for congestion-free grid operation.

6/21/2024

cs.LG cs.AI cs.SY eess.SY

Decentralized Coordination of Distributed Energy Resources through Local Energy Markets and Deep Reinforcement Learning

Daniel May, Matthew Taylor, Petr Musilek

As the energy landscape evolves toward sustainability, the accelerating integration of distributed energy resources poses challenges to the operability and reliability of the electricity grid. One significant aspect of this issue is the notable increase in net load variability at the grid edge. Transactive energy, implemented through local energy markets, has recently garnered attention as a promising solution to address the grid challenges in the form of decentralized, indirect demand response on a community level. Given the nature of these challenges, model-free control approaches, such as deep reinforcement learning, show promise for the decentralized automation of participation within this context. Existing studies at the intersection of transactive energy and model-free control primarily focus on socioeconomic and self-consumption metrics, overlooking the crucial goal of reducing community-level net load variability. This study addresses this gap by training a set of deep reinforcement learning agents to automate end-user participation in ALEX, an economy-driven local energy market. In this setting, agents do not share information and only prioritize individual bill optimization. The study unveils a clear correlation between bill reduction and reduced net load variability in this setup. The impact on net load variability is assessed over various time horizons using metrics such as ramping rate, daily and monthly load factor, as well as daily average and total peak export and import on an open-source dataset. Agents are then benchmarked against several baselines, with their performance levels showing promising results, approaching those of a near-optimal dynamic programming benchmark.

4/23/2024

eess.SY cs.AI cs.LG cs.MA cs.SY