Xian Xuefeng, Cui Zhiming, Zhao Pengpeng, Liu Zhaobin, Gu Caidong. Location-awareness Publication Subscription System Based on Topic Model [J]. Computer Science, 2018, 45(3): 165-170
Location-awareness Publication Subscription System Based on Topic Model
Received: 2016-12-28  Revised: 2017-04-17
DOI:10.11896/j.issn.1002-137X.2018.03.026
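Keywords (Chinese record): Publish/subscribe, Probabilistic topic model, Topic mapping, Index
Keywords (English record): Publication/Subscription, LDA, Topic mapping, Index
Funding: This work was supported by the National Natural Science Foundation of China.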
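Author; Affiliation; E-mail
Xian Xuefeng; Jiangsu Province Support Software Engineering R&D Center for Modern Information Technology Application in Enterprise, Suzhou, Jiangsu 215104; sudaxxf@163.com
Cui Zhiming; Jiangsu Province Support Software Engineering R&D Center for Modern Information Technology Application in Enterprise, Suzhou, Jiangsu 215104, and Institute of Intelligent Information Processing and Application, Soochow University, Suzhou, Jiangsu 215006
Zhao Pengpeng; Institute of Intelligent Information Processing and Application, Soochow University, Suzhou, Jiangsu 215006; ppzhao@suda.edu.cn
Liu Zhaobin; Jiangsu Province Support Software Engineering R&D Center for Modern Information Technology Application in Enterprise, Suzhou, Jiangsu 215104
Gu Caidong; Jiangsu Province Support Software Engineering R&D Center for Modern Information Technology Application in Enterprise, Suzhou, Jiangsu 215104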
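Abstract (translated from the Chinese):
      With the rapid development of the mobile Internet and the spread of smartphones, location-aware publish/subscribe systems have attracted wide attention in both industry and academia. Existing systems mainly address the matching of subscriptions against events over massive spatial data, and their matching models rely chiefly on the similarity between spatial keywords; few studies consider semantic relevance. To explore and realize semantic querying and matching in publish/subscribe systems, this paper proposes a location-aware publish/subscribe system based on a topic model. First, the system uses a topic model to map the keywords in the publish/subscribe system to topics. Then, a two-step partition index structure, RPTM-trees, is designed and used to index the topic sets and spatial information of subscriptions. RPTM-trees partitions subscriptions in two steps, by the number of topics in the topic set and by the key topic, which gives it stronger partitioning power over subscriptions and thereby significantly improves the efficiency of query matching. Finally, experiments are conducted on a high-rate event stream and a data set of tens of millions of subscriptions; the results show that the proposed scheme is stable and efficient.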
English abstract:
      Location-aware publish/subscribe systems have drawn extensive academic and industrial attention with the rapid development of the mobile Internet and the popularity of smartphones. Existing location-aware publish/subscribe systems mainly focus on matching subscriptions against events over massive spatial data. Their matching models are based mainly on the similarity of spatial keywords, and semantic relevance is rarely considered. To explore how to realize semantic querying and matching in a publish/subscribe system, this paper proposes a location-aware publish/subscribe system based on a topic model. First, the system uses a topic model to map the keywords of the location-aware publish/subscribe system to topics. Second, it designs a two-step partition index structure, RPTM-trees, and uses it to index the topic sets and spatial information of subscriptions. Because RPTM-trees partitions and indexes subscriptions in two steps, based on the number of topics in each topic set and on the key topic, it achieves a stronger subscription partitioning ability and significantly improves the efficiency of query matching. Finally, experiments on a high-rate event stream and a data set of tens of millions of subscriptions indicate that the proposed solution is stable and efficient.
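The two ideas in the abstract, topic mapping of keywords and two-step partitioning of subscriptions, can be illustrated with a minimal sketch. This is not the authors' RPTM-trees: the abstract does not define the key topic, the matching rule, or the tree layout, so the sketch assumes that the key topic is the smallest topic id, that a subscription matches an event when its topic set is a subset of the event's topic set and the event location lies inside the subscription rectangle, and that a plain list scan stands in for the spatial tree. `KEYWORD_TOPIC`, `to_topics`, `Subscription`, and `TwoStepIndex` are hypothetical names, and the fixed lookup table stands in for a trained topic model such as LDA.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical keyword-to-topic table standing in for a trained topic model.
KEYWORD_TOPIC = {"pizza": 0, "pasta": 0, "cinema": 1, "movie": 1,
                 "sale": 2, "discount": 2}


def to_topics(keywords):
    """Map keywords to a set of topic ids, ignoring unknown keywords."""
    return frozenset(KEYWORD_TOPIC[k] for k in keywords if k in KEYWORD_TOPIC)


@dataclass(frozen=True)
class Subscription:
    sid: str
    topics: frozenset
    region: tuple  # (xmin, ymin, xmax, ymax)


class TwoStepIndex:
    """Partition subscriptions by topic count, then by key topic."""

    def __init__(self):
        # partitions[topic_count][key_topic] -> list of subscriptions
        self.partitions = defaultdict(lambda: defaultdict(list))

    def insert(self, sid, keywords, region):
        topics = to_topics(keywords)
        if not topics:
            return  # nothing to index without at least one known topic
        key_topic = min(topics)  # assumption: key topic = smallest topic id
        sub = Subscription(sid, topics, region)
        self.partitions[len(topics)][key_topic].append(sub)

    def match(self, keywords, x, y):
        event_topics = to_topics(keywords)
        hits = []
        for count, by_key in self.partitions.items():
            # Step 1: a subscription with more topics than the event
            # cannot be a subset of the event's topic set.
            if count > len(event_topics):
                continue
            # Step 2: only partitions whose key topic occurs in the event.
            for key_topic in event_topics:
                for sub in by_key.get(key_topic, ()):
                    xmin, ymin, xmax, ymax = sub.region
                    inside = xmin <= x <= xmax and ymin <= y <= ymax
                    if inside and sub.topics <= event_topics:
                        hits.append(sub.sid)
        return sorted(hits)


index = TwoStepIndex()
index.insert("s1", ["pizza"], (0, 0, 10, 10))
index.insert("s2", ["movie", "sale"], (0, 0, 10, 10))
index.insert("s3", ["pasta", "discount"], (20, 20, 30, 30))
print(index.match(["pasta", "discount"], 5, 5))
```

In this sketch an event mentioning "pasta" reaches a subscription for "pizza" because both keywords map to the same topic, which is the kind of semantic match that pure keyword similarity would miss. Both partitioning steps only prune candidates; the subset and rectangle checks still decide the final result.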