Cite this article: | DAI Zhihui, ZHANG Fuze, ZHANG Jinyue, et al. Construction method of a defect knowledge map of a relay protection device based on a MacBERT-BiLSTM-CRF model[J]. Power System Protection and Control, 2024, 52(20): 131-143. |
|
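Abstract: |
Over the course of power grid development, a large amount of text data on relay protection device defects has accumulated but has not yet been effectively mined or used. Moreover, clearing defects in relay protection devices depends too heavily on the professional competence of operating personnel, which makes on-site operation and maintenance difficult. To address these problems, a method for constructing a relay protection device defect knowledge graph based on a MacBERT-BiLSTM-CRF model is proposed. First, the recording characteristics of relay protection device defect texts are analyzed, and the unstructured text is processed by data cleaning, data annotation, and data augmentation. Second, a MacBERT-BiLSTM-CRF model is built on the basis of the BERT-BiLSTM-CRF model to carry out the entity extraction task. Third, relation extraction rules are defined for relay protection device defect texts and combined with the entity extraction model to complete the relation extraction task. Finally, the schema layer of the relay protection device defect knowledge graph is constructed, and the Neo4j graph database is used to store the data layer of the knowledge graph. Case studies show that the proposed data processing method yields a high-quality BIO-annotated dataset, and that the MacBERT-BiLSTM-CRF model achieves better entity extraction results than the conventional BERT-BiLSTM-CRF model. The relay protection device defect knowledge graph is constructed and visualized on the basis of the schema layer, and an application workflow for defect decision support together with a method for updating the knowledge graph is proposed. |
Keywords: relay protection device; defect text; entity extraction; relation extraction; knowledge graph |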
DOI:10.19783/j.cnki.pspc.240008 |
Received: 2024-01-02; Revised: 2024-04-07 |
Funding: Supported by the National Natural Science Foundation of China (No. 51877084) |
|
Construction method of a defect knowledge map of a relay protection device based on a MacBERT-BiLSTM-CRF model |
DAI Zhihui¹, ZHANG Fuze¹, ZHANG Jinyue², HAN Xiao¹ |
(1. Hebei Key Laboratory of Distributed Energy Storage and Microgrid (North China Electric Power University), Baoding 071003, China; 2. School of Control and Computer Engineering, North China Electric Power University, Baoding 071003, China) |
Abstract: |
Power grids have accumulated a large volume of text data on relay protection device defects, but these data have not yet been effectively mined or used. In addition, defect elimination for relay protection devices relies excessively on the expertise of operators, which makes field operation and maintenance difficult. To address these issues, this paper proposes a method for constructing a knowledge graph of relay protection device defects based on a MacBERT-BiLSTM-CRF model. First, the recording characteristics of relay protection device defect texts are analyzed, and the unstructured texts are cleaned, annotated, and augmented. Second, a MacBERT-BiLSTM-CRF model is constructed on the basis of the BERT-BiLSTM-CRF model to perform entity extraction. Third, relation extraction rules are defined for the defect texts and applied together with the entity extraction model to complete relation extraction. Finally, the schema layer of the knowledge graph of relay protection device defects is constructed, and the Neo4j graph database is used to store the data layer of the knowledge graph. Case studies show that the proposed data processing method yields a high-quality BIO-labeled dataset. Compared with the conventional BERT-BiLSTM-CRF model, the MacBERT-BiLSTM-CRF model achieves better entity extraction performance. The knowledge graph is constructed and visualized on the basis of the schema layer, and an application workflow for decision support on relay protection device defects and a method for updating the knowledge graph are proposed. |
Key words: relay protection device; defect text; entity extraction; relation extraction; knowledge graph |
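
The abstract mentions a BIO-labeled dataset for entity extraction. As a rough illustration only, the sketch below shows how character-level BIO tags could be produced from annotated entity spans and decoded back into spans. The sample sentence, the label names (`DEVICE`, `PART`, `DEFECT`), and the function names are hypothetical and are not taken from the paper; the authors' actual label set and tooling are not stated on this page.

```python
from typing import List, Tuple

# (start, end_exclusive, label); all label names used here are hypothetical.
Span = Tuple[int, int, str]


def spans_to_bio(text: str, spans: List[Span]) -> List[str]:
    """Return one BIO tag per character of `text` for non-overlapping spans."""
    tags = ["O"] * len(text)
    for start, end, label in spans:
        if not (0 <= start < end <= len(text)):
            raise ValueError(f"span out of range: {(start, end, label)}")
        if any(tag != "O" for tag in tags[start:end]):
            raise ValueError(f"overlapping span: {(start, end, label)}")
        tags[start] = f"B-{label}"          # first character of the entity
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"          # remaining characters
    return tags


def bio_to_spans(tags: List[str]) -> List[Span]:
    """Decode BIO tags into spans; stray I- tags with no open entity are ignored."""
    spans: List[Span] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing entity
        continues = tag.startswith("I-") and label == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


if __name__ == "__main__":
    # Hypothetical defect record: "line protection device, power plug-in, damaged".
    text = "线路保护装置电源插件损坏"
    spans = [(0, 6, "DEVICE"), (6, 10, "PART"), (10, 12, "DEFECT")]
    for char, tag in zip(text, spans_to_bio(text, spans)):
        print(char, tag)
```

Character-level tagging is used here because Chinese BERT-style tokenizers typically split text into single characters; a tagger such as the one described in the abstract would be trained to predict tag sequences of this shape.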