Journal of Hebei University (Natural Science Edition) ›› 2021, Vol. 41 ›› Issue (2): 212-217. DOI: 10.3969/j.issn.1000-1565.2021.02.015

Design of online aggregation optimization of big data based on MapReduce

LI Jun

  1. Office of Teaching Affairs, Chengdu Technological University, Chengdu 611730, China
  • Received:2020-06-01 Online:2021-03-25 Published:2021-04-07
  • About the author: LI Jun (born 1983), male, from Luxian County, Sichuan, is an assistant researcher at Chengdu Technological University; his main research area is computer program design.
    E-mail:lijun00s@163.com
  • Supported by:
    Provincial Pilot Project for the Reform of Education Systems and Mechanisms, Sichuan Provincial Department of Education (G5-08)

Abstract: To address the long execution time and the poor execution and delay-scheduling performance of online aggregation over big data, an optimized online aggregation program design based on MapReduce is proposed. A sharded aggregation method makes full use of the computing resources of every machine in the cluster, and a heuristic sub-join priority method optimizes the relational join operations that each node executes locally, so that the joins in big data online aggregation run in parallel. A dynamic switching mechanism for online aggregation built on a hybrid approximate query framework, together with a dynamic switching mechanism based on progressive approximate estimation, reduces the misjudgment rate of hybrid approximate query switching and improves the execution performance of online aggregation. Experimental results show that the online aggregation program designed with this method achieves short execution times at different data scales and has a significant advantage in basic frequent-query performance.

Key words: MapReduce, big data, aggregation optimization, switching mechanism
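
The abstract does not spell out how the progressive approximate estimate is computed. As a rough illustration of the general idea behind online aggregation (not the paper's algorithm), the sketch below consumes data shards one at a time, keeps a running mean with a normal-approximation confidence interval, and stops as soon as the interval is narrow enough. The function name `online_mean`, its parameters, and the stopping rule are all hypothetical, and the interval is only meaningful under the assumption that rows are spread randomly across shards.

```python
import math
import random
from statistics import NormalDist


def online_mean(partitions, rel_error=0.05, confidence=0.95, seed=0):
    """Progressively estimate the mean of all rows in `partitions`.

    Hypothetical sketch: shards are visited in random order and the loop
    stops early once the confidence-interval half-width is at most
    rel_error * |estimate|. Assumes rows are randomly spread over shards.
    Returns (estimate, half_width, shards_used).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # two-sided z-score
    order = list(range(len(partitions)))
    random.Random(seed).shuffle(order)

    n, mean, m2 = 0, 0.0, 0.0  # Welford running count, mean, sum of squares
    used = 0
    for used, idx in enumerate(order, start=1):
        for x in partitions[idx]:
            n += 1
            delta = x - mean
            mean += delta / n
            m2 += delta * (x - mean)
        if n > 1 and used < len(order):
            # Standard error of the mean from the sample variance so far.
            half_width = z * math.sqrt(m2 / (n - 1) / n)
            if mean != 0 and half_width <= rel_error * abs(mean):
                return mean, half_width, used

    if n == 0:
        raise ValueError("no rows to aggregate")
    # Every shard was read, so the result is exact rather than estimated.
    return mean, 0.0, used
```

Calling `online_mean(shards, rel_error=0.01)` on a list of numeric lists returns the current estimate, its interval half-width, and how many shards were read; a hybrid scheme could use the half-width to decide when to switch from approximate to exact processing, though the actual switching criterion is not given in the abstract.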

CLC number: