Cite this article: | QI Jun, QU Zhaoyang, LOU Jianlou, et al. A kind of attribute entity recognition algorithm based on Hadoop for power big data[J]. Power System Protection and Control, 2016, 44(24): 52-57. |
|
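Abstract: |
With the arrival of the big data era, traditional entity recognition techniques can no longer preprocess data effectively, because power grid data are large in volume and complex in type. Hadoop, which has emerged in recent years, handles big data well. This paper therefore proposes a Hadoop-based attribute entity recognition algorithm for power big data. The algorithm uses an improved discretization algorithm to select discrete points with high information accuracy, and a discretization evaluation index is also proposed. Finally, attribute entity recognition is carried out on the monitoring data of a wind turbine unit on the Hadoop platform. Experiments show that the algorithm performs well in terms of correctness and number of breakpoints, achieves a good speedup ratio, and is suitable for attribute entity recognition on power big data. |
Keywords: power big data; entity recognition; discretization algorithm; information accuracy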
DOI:10.7667/PSPC152053 |
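Received: 2015-11-25; Revised: 2016-01-19
Funding: National Natural Science Foundation of China (51277023); Key Science and Technology Research Project of the Social Development Division, Department of Science and Technology of Jilin Province (20150204084GX)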
|
A kind of attribute entity recognition algorithm based on Hadoop for power big data |
QI Jun,QU Zhaoyang,LOU Jianlou,WANG Chong |
(School of Information Science and Engineering, Northeast Dianli University, Jilin 132012, China;Information & Telecommunication Branch Company, State Grid East Inner Mongolia Electric Power Co., Ltd., Hohhot 010020, China) |
Abstract: |
In the era of big data, traditional entity recognition technologies cannot preprocess data effectively because power grid data are large in scale and complex in type. Hadoop technologies, which have risen in recent years, are better suited to processing big data. This paper therefore proposes a Hadoop-based attribute entity recognition algorithm for power big data. The algorithm uses an improved discretization algorithm to select discrete points with higher information accuracy and puts forward a discretization evaluation indicator. Finally, attribute entity recognition is performed on the monitoring data of wind turbines on the Hadoop platform. Experimental results show that the proposed algorithm performs well in terms of correctness and number of breakpoints and has a good speed-up ratio, so it can be applied to attribute entity recognition for power big data. This work is supported by National Natural Science Foundation of China (No. 51277023). |
Key words: power big data; entity recognition; discretization algorithm; information accuracy