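Search result: Zhu Lihong, Yang Hebiao. Research and Implementation of Massive Structured Data Query System [J]. Computer Applications and Software, 2014, 31(2): 29-32.
Chinese title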
海量结构化数据查询系统的研究与实现
Column
Database technology
Abstract views
893
English title
RESEARCH AND IMPLEMENTATION OF MASSIVE STRUCTURED DATA QUERY SYSTEM
Authors
朱立红 (Zhu Lihong), 杨鹤标 (Yang Hebiao)
Affiliation (Chinese)
江苏大学计算机科学与通信工程学院 江苏 镇江 212013     
Affiliation (English)
School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang 212013, Jiangsu, China     
Keywords (Chinese)
数据库集群 数据分布 MapReduce Hadoop
Keywords
Database cluster Data distribution MapReduce Hadoop
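Funding
National Natural Science Foundation of China (61202110)
About the authors
Zhu Lihong, master's student; main research areas: cloud computing and data mining. Yang Hebiao, professor.
Abstract (Chinese)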
随着电子商务和信息技术的飞速发展,企业需要存储和处理的数据量正在以惊人的速度增长,而传统的关系型数据库管理系统已无法满足企业对大规模数据的处理需求,因此,基于云计算的海量结构化数据处理日益成为人们关注的热点。针对Hadoop云计算平台在处理结构化数据方面的不足,给出一种以异构的数据库集群作为底层的数据存储系统,以扩展的MapReduce框架作为任务的管理和执行容器的查询系统。为提高查询的效率,给出一种优化的查询和数据分布策略。实验表明,该查询系统的执行效率较Hive有很大的提升。
Abstract
With the rapid development of e-commerce and information technology, the amount of data that enterprises have to store and process is growing at an alarming speed, and traditional relational database management systems can no longer meet enterprises' demands for large-scale data processing. Massive structured data processing based on cloud computing is therefore attracting increasing attention. To address the shortcomings of the Hadoop cloud computing platform in processing structured data, we present a query system that uses a heterogeneous database cluster as the underlying data storage system and an extended MapReduce framework as the container for task management and execution. To improve query efficiency, we give an optimised query and data distribution strategy. Experiments show that the query system executes queries considerably more efficiently than Hive.
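The architecture described in the abstract (a database cluster as the storage layer, with MapReduce-style tasks managing query execution) can be illustrated with a minimal sketch. This is not the authors' implementation: the `orders` table, the three-shard layout, the CRC32 hash distribution, and all function names are assumptions made for illustration, and in-memory SQLite databases stand in for the heterogeneous database nodes. Each map task pushes an aggregation down to one node's own SQL engine, and the reduce task merges the partial results.

```python
import sqlite3
from collections import Counter
from zlib import crc32

NUM_SHARDS = 3  # hypothetical cluster size, chosen only for this sketch


def shard_for(key: str) -> int:
    """Hash-based data distribution: map a key to a shard deterministically."""
    return crc32(key.encode("utf-8")) % NUM_SHARDS


def build_cluster(rows):
    """Create one in-memory database per shard and distribute the rows."""
    shards = [sqlite3.connect(":memory:") for _ in range(NUM_SHARDS)]
    for db in shards:
        db.execute("CREATE TABLE orders (customer TEXT, amount INTEGER)")
    for customer, amount in rows:
        shards[shard_for(customer)].execute(
            "INSERT INTO orders VALUES (?, ?)", (customer, amount)
        )
    return shards


def map_phase(db):
    """Map task: push the aggregation down to the shard's own SQL engine."""
    return db.execute(
        "SELECT customer, SUM(amount) FROM orders GROUP BY customer"
    ).fetchall()


def reduce_phase(partials):
    """Reduce task: merge the per-shard partial sums into one result."""
    totals = Counter()
    for partial in partials:
        for customer, subtotal in partial:
            totals[customer] += subtotal
    return dict(totals)


def run_query(rows):
    """Distribute the rows, run one map task per shard, then reduce."""
    shards = build_cluster(rows)
    try:
        return reduce_phase(map_phase(db) for db in shards)
    finally:
        for db in shards:
            db.close()


ROWS = [("alice", 10), ("bob", 5), ("alice", 7), ("carol", 3), ("bob", 1)]
RESULT = run_query(ROWS)
print(RESULT)
```

Because rows are distributed by hashing the grouping key, every customer's rows land on a single shard, so the reduce step here only concatenates disjoint partial results. With a different distribution key the same reduce step would still be correct, since it sums subtotals per customer.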