Microsoft boasts its cloudy Hadoop big data mill's faster than yours

[Category: Databases (General) | Posted by: 店小二04 | Date: 2016 | Author: 红领巾]

Microsoft has overhauled its cloud-hosted Azure HDInsight Hadoop big data mill with extra security in the shape of enhanced authentication and identity management features plus a claimed 25 times performance boost in crunching big data queries.

Azure HDInsight is a service that lets users deploy and manage Apache Hadoop clusters on Microsoft’s Azure cloud, and has been developed in partnership with Hadoop specialist Hortonworks using the latter’s Hortonworks Data Platform.

It was also updated with support for Apache Spark just a few months back, adding support for in-memory processing to speed analytics jobs.

Much of the underlying framework of Azure HDInsight is thus open source software, which Redmond is very much in favour of these days.

In fact, the firm claims it has played an important part in making the Apache Hive data warehouse tool run faster, and this is where the significant performance gains have come from, thanks to something called Live Long and Process (LLAP) functionality.

LLAP keeps data compressed while running in-memory, and along with other enhancements, delivers a 25x performance improvement for big data queries, according to Microsoft. However, as is often the case with cloud services, this is currently offered only as a public preview.

Performance gains also come from updating the Spark platform support to Spark 2.0, which overhauls the core query engine with the ability to perform cache-efficient vectorised computations for up to 10x faster processing.
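
To give a rough feel for what "vectorised" means here, the sketch below contrasts processing one record at a time with processing contiguous column batches. It is a toy illustration in plain Python, not Spark's actual engine or API; the data, the function names and the `batch_size` parameter are all hypothetical, and Python itself will not show the cache benefits a real engine gets from this layout.

```python
from array import array

# Hypothetical toy data: each row is (price, quantity).
rows = [(2.0, 3), (1.5, 4), (10.0, 1)]


def total_row_at_a_time(rows):
    """Walk the table record by record, unpacking every field of each row."""
    total = 0.0
    for price, quantity in rows:
        total += price * quantity
    return total


def total_columnar(prices, quantities, batch_size=2):
    """Walk two contiguous columns in fixed-size batches.

    A real vectorised engine would run a tight loop over each batch so the
    values stay in CPU cache; here the batching only mirrors that shape.
    """
    total = 0.0
    for start in range(0, len(prices), batch_size):
        price_batch = prices[start:start + batch_size]
        quantity_batch = quantities[start:start + batch_size]
        total += sum(p * q for p, q in zip(price_batch, quantity_batch))
    return total


# Re-lay the same data out as typed, contiguous columns.
prices = array("d", (row[0] for row in rows))
quantities = array("q", (row[1] for row in rows))
```

Both functions compute the same revenue total for the toy table; only the memory layout and loop shape differ, which is the part a vectorised query engine exploits.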

Security is set to get a boost with new features that will be turned on in October. These include integration of Azure HDInsight with Azure Active Directory, the cloud-based version of Microsoft’s directory and identity management service, and implementation of Apache Ranger, an open source project that provides centralised policy control for Hadoop clusters.

Meanwhile, the data processed by Azure HDInsight can now be secured while at rest through server-side encryption in the Azure Data Lake Store or Azure Storage. Users can also choose to manage their own encryption keys for this, storing them in the Azure Key Vault.

Redmond also welcomed a new crop of third-party vendors to the Azure HDInsight tent. Two outfits called Cask and StreamSets have joined the partner programme that enables application code to run directly on the HDInsight clusters instead of being hosted elsewhere. This enables end users to access Hadoop and Spark clusters pre-integrated and pre-tuned with their big data application of choice, Microsoft said.
