Time-Varying Constraint-Aware Reinforcement Learning for Energy Storage Control

2405.10536

Published 5/20/2024 by Jaeik Jeong, Tai-Yeon Ku, Wan-Ki Park

Abstract

Energy storage devices, such as batteries, thermal energy storages, and hydrogen systems, can help mitigate climate change by ensuring a more stable and sustainable power supply. To maximize the effectiveness of such energy storage, determining the appropriate charging and discharging amounts for each time period is crucial. Reinforcement learning is preferred over traditional optimization for the control of energy storage due to its ability to adapt to dynamic and complex environments. However, the continuous nature of charging and discharging levels in energy storage poses limitations for discrete reinforcement learning, and time-varying feasible charge-discharge range based on state of charge (SoC) variability also limits the conventional continuous reinforcement learning. In this paper, we propose a continuous reinforcement learning approach that takes into account the time-varying feasible charge-discharge range. An additional objective function was introduced for learning the feasible action range for each time period, supplementing the objectives of training the actor for policy learning and the critic for value learning. This actively promotes the utilization of energy storage by preventing them from getting stuck in suboptimal states, such as continuous full charging or discharging. This is achieved through the enforcement of the charging and discharging levels into the feasible action range. The experimental results demonstrated that the proposed method further maximized the effectiveness of energy storage by actively enhancing its utilization.

Overview

This paper presents a time-varying constraint-aware reinforcement learning (RL) approach for controlling energy storage systems.
The proposed method aims to optimize the operation of energy storage devices under system uncertainties and dynamic constraints, in particular a feasible charge-discharge range that varies over time with the state of charge (SoC).
The researchers demonstrate the effectiveness of their approach through simulation experiments on a residential energy storage system.

Plain English Explanation

Energy storage systems, such as batteries, play a crucial role in modern power grids, helping to balance supply and demand and improve the efficiency and reliability of the electricity network. However, effectively controlling these storage systems can be challenging, as their operation is subject to various constraints, and the energy demand and supply can fluctuate over time.

The researchers in this paper have developed a new reinforcement learning-based approach to address this challenge. Reinforcement learning is a type of machine learning where an agent (in this case, the control system for the energy storage device) learns to make optimal decisions by interacting with its environment and receiving feedback in the form of rewards or penalties.

The key innovation in this paper is the incorporation of time-varying constraints into the reinforcement learning framework. This means that the control system is aware of and can adapt to changing limitations. Chief among them is the feasible charge-discharge range, which depends on the state of charge (SoC): a nearly full device can accept little additional charge, and a nearly empty one can deliver little. Other limits, such as maximum charging and discharging rates or the available supply from renewable sources like solar or wind power, can apply as well. By taking these dynamic constraints into account, the approach keeps the storage from getting stuck in unproductive states, such as sitting fully charged or fully discharged, and so makes fuller use of it than conventional continuous reinforcement learning does.
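To make the idea of a moving limit concrete, here is a minimal sketch of how a feasible charge-discharge range could be computed from the state of charge. It is not taken from the paper: the function name `feasible_action_range`, the lossless storage model, and the example capacity and power values are all assumptions chosen for illustration.

```python
def feasible_action_range(soc, capacity_kwh, max_power_kw, dt_hours=1.0):
    """Return (low, high) energy bounds in kWh for one time step.

    Positive values mean charging, negative values mean discharging.
    Assumption: an ideal, lossless storage device with soc in [0, 1].
    """
    if not 0.0 <= soc <= 1.0:
        raise ValueError("soc must lie in [0, 1]")
    rate_limit = max_power_kw * dt_hours      # energy the device can move per step
    headroom = (1.0 - soc) * capacity_kwh     # room left before the device is full
    stored = soc * capacity_kwh               # energy available to discharge
    high = min(rate_limit, headroom)          # largest feasible charge
    low = 0.0 - min(rate_limit, stored)       # largest feasible discharge
    return low, high


# Hypothetical 10 kWh device with a 3 kW limit and one-hour steps.
for soc in (0.0, 0.5, 0.9, 1.0):
    low, high = feasible_action_range(soc, capacity_kwh=10.0, max_power_kw=3.0)
    print(f"SoC={soc:.1f}: feasible action in [{low:.2f}, {high:.2f}] kWh")
```

The point of the sketch is that the same requested action can be valid at one time step and invalid at the next: at a mid-range SoC the device can charge or discharge at its full rate, while at a full SoC the upper bound collapses to zero.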

The researchers demonstrate the effectiveness of their approach through simulations of a residential energy storage system, showing that it can outperform other RL-based control strategies in terms of cost savings and adherence to the system's constraints.

Technical Explanation

The paper presents a time-varying constraint-aware reinforcement learning (TVCRL) framework for controlling energy storage systems. The key elements of the approach are:

State Representation: The state of the system is represented by the current energy level of the storage device, the time-varying constraints (e.g., maximum charging/discharging rates, available renewable energy), and other relevant system parameters.
Action Space: The agent chooses a continuous charging or discharging level for the energy storage device, within the bounds of the time-varying constraints.
Reward Function: The reward function encourages the agent to minimize the overall energy cost while adhering to the dynamic system constraints, such as maintaining the energy level within a desired range.
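Constraint-Aware Policy Learning: In addition to the usual objectives for training the actor (policy learning) and the critic (value learning), the researchers introduce an extra objective for learning the feasible action range at each time period. Charging and discharging levels are then enforced into that range, so the learned control policy respects the time-varying system constraints.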
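The elements listed above can be sketched as a toy environment step. This is an illustrative sketch, not the paper's implementation: the names `StorageState`, `feasible_range`, `enforce`, and `step`, the price-based reward, the lossless storage model, and the squared-violation term used as a stand-in for an auxiliary training objective are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class StorageState:
    soc: float    # state of charge in [0, 1]
    price: float  # electricity price for this step (currency per kWh)


def feasible_range(state, capacity_kwh, max_step_kwh):
    """SoC-dependent bounds on the action (kWh; + charge, - discharge)."""
    high = min(max_step_kwh, (1.0 - state.soc) * capacity_kwh)
    low = 0.0 - min(max_step_kwh, state.soc * capacity_kwh)
    return low, high


def enforce(raw_action, low, high):
    """Clip the actor's proposal into the feasible range.

    Also returns a squared-violation term. Assumption: a term like this
    could be added to the training loss so the actor learns the range.
    """
    action = max(low, min(high, raw_action))
    violation = (raw_action - action) ** 2
    return action, violation


def step(state, raw_action, next_price, capacity_kwh=10.0, max_step_kwh=3.0):
    """Apply one control step and return (next_state, reward, violation)."""
    low, high = feasible_range(state, capacity_kwh, max_step_kwh)
    action, violation = enforce(raw_action, low, high)
    reward = -state.price * action  # charging costs money, discharging earns it
    next_soc = state.soc + action / capacity_kwh
    next_soc = max(0.0, min(1.0, next_soc))  # guard against rounding drift
    return StorageState(next_soc, next_price), reward, violation


# The actor asks to charge 3 kWh, but the device is already 90% full.
state = StorageState(soc=0.9, price=0.20)
state, reward, violation = step(state, raw_action=3.0, next_price=0.35)
print(state, reward, violation)
```

In this sketch the violation term is zero whenever the proposed action already lies inside the feasible range, so it only pushes the policy when it asks for something the device cannot do.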

The researchers evaluate their TVCRL approach through simulation experiments on a residential energy storage system, comparing its performance to other RL-based control strategies, such as "Control Policy Correction Framework for Reinforcement Learning-Based Control", "Reinforcement Learning Approach to Dairy Farm Battery", and "Decentralized Coordination of Distributed Energy Resources through Local". The results demonstrate that the TVCRL approach can achieve significant cost savings while maintaining better adherence to the system's time-varying constraints.

Critical Analysis

The researchers have addressed an important challenge in the control of energy storage systems by incorporating time-varying constraints into the reinforcement learning framework. This is a valuable contribution, as real-world energy systems often face dynamic limitations and uncertainties that need to be accounted for in the control strategy.

One potential limitation of the study is the reliance on simulations rather than real-world experiments. While the simulations are based on realistic models, it would be beneficial to validate the approach in a physical energy storage system to understand its performance under real-world conditions, as discussed in the "CityLearn v2: Energy-Flexible and Resilient Occupant-Centric" and "Continual Model-Based Reinforcement Learning for Data-Efficient" studies.

Additionally, the paper could have provided more details on the specific time-varying constraints and how they were modeled, as well as the computational complexity and training requirements of the TVCRL approach. This information would help readers better understand the practical implementation challenges and the potential scalability of the method.

Overall, the proposed TVCRL framework is a promising approach for improving the control of energy storage systems, and the researchers have demonstrated its potential through simulation experiments. Further real-world validation and a more in-depth analysis of the method's capabilities and limitations would help strengthen the contribution of this work.

Conclusion

This paper presents a time-varying constraint-aware reinforcement learning (TVCRL) framework for controlling energy storage systems. The key innovation is the incorporation of dynamic constraints into the RL-based control strategy, allowing the system to optimize its operation while respecting time-varying limitations such as charging/discharging rates and available renewable energy.

The simulation results show that the TVCRL approach can outperform other RL-based control strategies in terms of cost savings and adherence to system constraints. This work contributes to the ongoing efforts to develop more intelligent and adaptive control systems for energy storage, which are crucial for the integration of renewable energy sources and the overall efficiency and reliability of modern power grids.

Related Papers

Reinforcement Learning for Efficient Design and Control Co-optimisation of Energy Systems

Marine Cauz, Adrien Bolland, Nicolas Wyrsch, Christophe Ballif

The ongoing energy transition drives the development of decentralised renewable energy sources, which are heterogeneous and weather-dependent, complicating their integration into energy systems. This study tackles this issue by introducing a novel reinforcement learning (RL) framework tailored for the co-optimisation of design and control in energy systems. Traditionally, the integration of renewable sources in the energy sector has relied on complex mathematical modelling and sequential processes. By leveraging RL's model-free capabilities, the framework eliminates the need for explicit system modelling. By optimising both control and design policies jointly, the framework enhances the integration of renewable sources and improves system efficiency. This contribution paves the way for advanced RL applications in energy management, leading to more efficient and effective use of renewable energy sources.

7/1/2024

cs.LG

Control Policy Correction Framework for Reinforcement Learning-based Energy Arbitrage Strategies

Seyed Soroush Karimi Madahi, Gargya Gokhale, Marie-Sophie Verwee, Bert Claessens, Chris Develder

A continuous rise in the penetration of renewable energy sources, along with the use of the single imbalance pricing, provides a new opportunity for balance responsible parties to reduce their cost through energy arbitrage in the imbalance settlement mechanism. Model-free reinforcement learning (RL) methods are an appropriate choice for solving the energy arbitrage problem due to their outstanding performance in solving complex stochastic sequential problems. However, RL is rarely deployed in real-world applications since its learned policy does not necessarily guarantee safety during the execution phase. In this paper, we propose a new RL-based control framework for batteries to obtain a safe energy arbitrage strategy in the imbalance settlement mechanism. In our proposed control framework, the agent initially aims to optimize the arbitrage revenue. Subsequently, in the post-processing step, we correct (constrain) the learned policy following a knowledge distillation process based on properties that follow human intuition. Our post-processing step is a generic method and is not restricted to the energy arbitrage domain. We use the Belgian imbalance price of 2023 to evaluate the performance of our proposed framework. Furthermore, we deploy our proposed control framework on a real battery to show its capability in the real world.

5/1/2024

eess.SY cs.AI cs.LG cs.SY

A Reinforcement Learning Approach to Dairy Farm Battery Management using Q Learning

Nawazish Ali, Abdul Wahid, Rachael Shaw, Karl Mason

Dairy farming consumes a significant amount of energy, making it an energy-intensive sector within agriculture. Integrating renewable energy generation into dairy farming could help address this challenge. Effective battery management is important for integrating renewable energy generation. Managing battery charging and discharging poses significant challenges because of fluctuations in electrical consumption, the intermittent nature of renewable energy generation, and fluctuations in energy prices. Artificial Intelligence (AI) has the potential to significantly improve the use of renewable energy in dairy farming, however, there is limited research conducted in this particular domain. This research considers Ireland as a case study as it works towards attaining its 2030 energy strategy centered on the utilization of renewable sources. This study proposes a Q-learning-based algorithm for scheduling battery charging and discharging in a dairy farm setting. This research also explores the effect of the proposed algorithm by adding wind generation data and considering additional case studies. The proposed algorithm reduces the cost of imported electricity from the grid by 13.41%, peak demand by 2%, and 24.49% when utilizing wind generation. These results underline how reinforcement learning is highly effective in managing batteries in the dairy farming sector.

5/16/2024

cs.LG cs.AI

Mixed-Integer Optimal Control via Reinforcement Learning: A Case Study on Hybrid Electric Vehicle Energy Management

Jinming Xu, Nasser Lashgarian Azad, Yuan Lin

Many optimal control problems require the simultaneous output of discrete and continuous control variables. These problems are usually formulated as mixed-integer optimal control (MIOC) problems, which are challenging to solve due to the complexity of the solution space. Numerical methods such as branch-and-bound are computationally expensive and undesirable for real-time control. This paper proposes a novel hybrid-action reinforcement learning (HARL) algorithm, twin delayed deep deterministic actor-Q (TD3AQ), for MIOC problems. TD3AQ combines the advantages of both actor-critic and Q-learning methods, and can handle the discrete and continuous action spaces simultaneously. The proposed algorithm is evaluated on a plug-in hybrid electric vehicle (PHEV) energy management problem, where real-time control of the discrete variables, clutch engagement/disengagement and gear shift, and continuous variable, engine torque, is essential to maximize fuel economy while satisfying driving constraints. Simulation outcomes demonstrate that TD3AQ achieves control results close to optimality when compared with dynamic programming (DP), with just 4.69% difference. Furthermore, it surpasses the performance of baseline reinforcement learning algorithms.

6/3/2024

eess.SY cs.AI cs.SY