Zhu Pengyu, Bao Peiming, Ji Genlin. Parallel Algorithm for Mining User Frequent Communication Relationship[J]. Computer Science, 2018, 45(2): 103-108
Parallel Algorithm for Mining User Frequent Communication Relationship
Received: 2017-04-26  Revised: 2017-06-29
DOI:10.11896/j.issn.1002-137X.2018.02.018
Keywords: Communication network, Frequent sub-graph, Frequent communication relationship
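Funding: This work was supported by the National Natural Science Foundation of China project "Research on Spatio-temporal Trajectory Pattern Mining Methods for Mobile Phone Users Considering User Relationships in a Cloud Computing Environment" (41471371).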
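Author  Affiliation  E-mail
Zhu Pengyu  School of Computer Science and Technology, Nanjing Normal University, Nanjing 210023  pyzhu2016@163.com
Bao Peiming  School of Computer Science and Technology, Nanjing Normal University, Nanjing 210023  baopeiming@163.com
Ji Genlin  School of Computer Science and Technology, Nanjing Normal University, Nanjing 210023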
Abstract:
      With the rapid development of mobile communication technology and the Internet, mobile communication devices have become tools that most people carry with them, and the data generated when these devices communicate with one another forms a communication network. This paper proposes PMFCS, a parallel algorithm for mining frequent communication sub-graphs from massive communication data. Building on the idea of frequent itemset mining (the Apriori algorithm) and sub-graph connection rules, the algorithm uses the parallel computing framework Spark to distribute all graphs, edge by edge, across the computing nodes. Each node counts the 1st-order candidate frequent sub-graphs, and the candidates are then aggregated to obtain the 1st-order frequent sub-graphs. PMFCS iteratively connects (k-1)th-order sub-graphs with 1st-order sub-graphs to generate kth-order candidate sub-graphs and then computes the frequency of those candidates, repeating until the set of kth-order frequent sub-graphs is empty. Experimental results show that the algorithm solves the frequent communication relationship mining problem quickly and effectively.
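The level-wise procedure described in the abstract can be illustrated with a minimal single-machine sketch. It is not the authors' PMFCS implementation and does not use Spark. It rests on several assumptions that the abstract does not state: each graph is a set of undirected edges between uniquely identified users, a sub-graph's frequency is the number of graphs containing all of its edges, and distribution is simulated by splitting the graphs into partitions rather than distributing individual edges. The names `mine_frequent_subgraphs`, `min_support`, and `num_partitions` are hypothetical.

```python
from collections import Counter
from itertools import chain


def mine_frequent_subgraphs(graphs, min_support, num_partitions=2):
    """Level-wise mining of frequent connected edge sets (illustrative only).

    graphs: iterable of edge sets; each edge is a (u, v) tuple with u < v.
    Returns a list whose element k-1 is the set of frequent k-edge sub-graphs.
    """
    graphs = [frozenset(g) for g in graphs]
    # Assumption: simulate computing nodes by splitting the graphs into partitions.
    partitions = [graphs[i::num_partitions] for i in range(num_partitions)]

    def frequent(candidates):
        # Each partition counts locally; the local counts are then summed.
        total = Counter()
        for part in partitions:
            local = Counter()
            for g in part:
                for c in candidates:
                    if c <= g:  # every edge of the candidate occurs in g
                        local[c] += 1
            total.update(local)
        return {c for c, n in total.items() if n >= min_support}

    # 1st-order candidates: every distinct single edge.
    frequent_1 = frequent({frozenset([e]) for g in graphs for e in g})

    levels = []
    current = frequent_1
    while current:
        levels.append(current)
        candidates = set()
        for sub in current:
            vertices = set(chain.from_iterable(sub))
            for single in frequent_1:
                (edge,) = single
                # Join only edges that extend the sub-graph and touch it.
                if edge not in sub and (edge[0] in vertices or edge[1] in vertices):
                    candidates.add(sub | single)
        current = frequent(candidates)
    return levels


if __name__ == "__main__":
    snapshots = [
        {(1, 2), (2, 3)},
        {(1, 2), (2, 3), (3, 4)},
        {(1, 2), (3, 4)},
    ]
    for k, level in enumerate(mine_frequent_subgraphs(snapshots, 2), start=1):
        print(k, sorted(sorted(s) for s in level))
```

Because user identifiers are assumed to be unique, containment reduces to a subset test on edge sets; a general frequent sub-graph miner over labelled graphs would need an isomorphism check instead.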