Cite this article: DOU Jiaming, WANG Xiaojun, SI Fangyuan, et al. Extraction of active regulation flexibility rules and interpretable optimal scheduling for regional integrated energy systems based on deep reinforcement learning[J]. Power System Protection and Control, 2025, 53(23): 139-151.
|
DOI: 10.19783/j.cnki.pspc.250196
Received: 2025-02-27; Revised: 2025-07-14
Funding: Science and Technology Project of the Headquarters of State Grid Corporation of China (520201250010-075-ZN)
|
Extraction of active regulation flexibility rules and interpretable optimal scheduling for regional integrated energy systems based on deep reinforcement learning
DOU Jiaming1, WANG Xiaojun1, SI Fangyuan1, LIU Zhaoyan2, XI Yanna2, LIU Zhao1, SHENG Kangling1, DUAN Yuge1
(1. School of Electrical Engineering, Beijing Jiaotong University, Beijing 100044, China;
2. State Grid Beijing Electric Power Company, Beijing 100031, China)
Abstract:
The scheduling of regional integrated energy systems must fully exploit active regulation capability to cope with renewable energy fluctuations and diverse load conditions. Traditional methods rely heavily on precise modeling, struggle with high uncertainty, and lack both dynamic analysis of active regulation and interpretability of scheduling strategies. To address these challenges, this paper proposes a reinforcement learning method for active regulation flexibility rule extraction and interpretable scheduling. First, based on equipment regulation boundaries, response rates, and coupling relationships, flexibility rule metrics such as the power regulation capacities of electrical and thermal subsystem devices are quantitatively derived. Second, a reward function integrating the physical rules of active regulation flexibility is designed and embedded into an improved deep deterministic policy gradient (DDPG) framework: during policy updates, penalties for violating device operating boundaries and flexibility incentives are incorporated, while dynamic constraint construction, adaptive learning rate adjustment, and policy visualization enhance the physical consistency and interpretability of the learned policy. Simulation results show that the proposed method improves the regulation capability metric by 11.08% and 15.86% compared with quadratic programming and particle swarm optimization, respectively. Moreover, the extracted flexibility rules enable interpretable day-ahead regulation capability analysis, providing traceable physical grounds and supporting human-AI collaborative decision-making in scheduling.
Key words: active regulation capability; integrated energy system; flexibility; reinforcement learning; explainable AI; rule extraction
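
As a reading aid for the abstract's first step (quantifying flexibility rules from device regulation boundaries and response rates), the following is a minimal Python sketch of that idea; the Device fields, the one-hour interval dt, and the regulation_flexibility helper are illustrative assumptions, not the paper's actual formulation.

```python
from dataclasses import dataclass

@dataclass
class Device:
    """Operating state and limits of one controllable unit (e.g. a CHP unit)."""
    p: float      # current output (kW)
    p_min: float  # lower regulation boundary (kW)
    p_max: float  # upper regulation boundary (kW)
    ramp: float   # response rate (kW per hour)

def regulation_flexibility(dev: Device, dt: float = 1.0) -> tuple[float, float]:
    """Upward/downward power regulation capacity over a dispatch interval of dt hours.

    The attainable adjustment is capped both by the static boundary
    [p_min, p_max] and by the dynamic ramp limit ramp * dt.
    """
    up = min(dev.p_max - dev.p, dev.ramp * dt)
    down = min(dev.p - dev.p_min, dev.ramp * dt)
    return up, down

# Example: a 300 kW CHP unit running at 180 kW with a 60 kW/h response rate.
chp = Device(p=180.0, p_min=90.0, p_max=300.0, ramp=60.0)
print(regulation_flexibility(chp))  # (60.0, 60.0): ramp-limited in both directions
```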
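The second step shapes the DDPG reward with boundary-violation penalties and flexibility incentives. The sketch below illustrates one plausible form of such a shaped reward; the function name shaped_reward, the quadratic penalty term, and the weights lam and mu are assumptions for illustration, since the abstract does not give the exact design.

```python
import numpy as np

def shaped_reward(op_cost: float,
                  p: np.ndarray, p_min: np.ndarray, p_max: np.ndarray,
                  flex_up: np.ndarray, flex_down: np.ndarray,
                  lam: float = 10.0, mu: float = 0.1) -> float:
    """Physics-informed reward: negative operating cost, minus a quadratic
    penalty on device boundary violations, plus an incentive proportional
    to the remaining up/down regulation flexibility."""
    violation = np.maximum(p - p_max, 0.0) + np.maximum(p_min - p, 0.0)
    penalty = lam * float(np.sum(violation ** 2))
    incentive = mu * float(np.sum(flex_up + flex_down))
    return -op_cost - penalty + incentive

# Example: two devices, the first 10 kW above its upper boundary.
r = shaped_reward(op_cost=120.0,
                  p=np.array([310.0, 50.0]),
                  p_min=np.array([90.0, 0.0]),
                  p_max=np.array([300.0, 80.0]),
                  flex_up=np.array([0.0, 30.0]),
                  flex_down=np.array([60.0, 50.0]))
print(r)  # -1106.0: the boundary penalty dominates, steering the policy back inside limits
```

In a DDPG training loop this reward would be evaluated per transition before the critic and actor updates; the dynamic constraint construction and adaptive learning rate mentioned in the abstract would then act on p_min/p_max and on the optimizers, respectively.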