Virtualization Technology News and Information
Alluxio Virtualizes Distributed Storage for Petabyte Scale Computing at In-Memory Speeds
Alluxio (formerly known as Tachyon), the world's first memory-centric virtual distributed storage system, today announced its open source version 1.0 release. The vision for Alluxio is to become the de facto storage unification layer for big data and other scale-out application environments, in the same way that Apache Spark became the standard computation layer.

Alluxio's memory-centric architecture provides orders-of-magnitude performance gains over existing solutions. It also improves manageability: developers interact with a single storage layer API without worrying about the configurations and complexities of the underlying storage and file systems. Co-created by Haoyuan Li, CEO of Alluxio, Inc. and a founding committer of Spark, Alluxio ushers in the next generation of storage virtualization for petabyte-scale computing.

"A storage unification layer that bridges computation frameworks and underlying storage systems is long overdue in the enterprise," said Haoyuan Li. "Alluxio is that unification layer with a memory-centric architecture. Alluxio enables any framework to access any data, from any storage at memory speeds."

Organizations can run any computation framework (e.g., Apache Spark, Apache Hadoop MapReduce, Presto) with any storage system (e.g., Amazon S3, EMC, Google Cloud Storage, NetApp) on any storage media (e.g., DRAM, SSD, HDD). As a memory-centric system, Alluxio delivers orders-of-magnitude performance gains and simpler manageability for existing configurations.
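To make the idea of a unification layer concrete, here is a minimal Python sketch of the general pattern: several backing stores are mounted under one namespace, and a small in-memory tier serves repeated reads. This is not Alluxio's API. Every name in it (`UnifiedStorage`, `DictStore`, `mount`, `max_entries`) is hypothetical, and the least-recently-used eviction policy is an assumption chosen for brevity.

```python
from collections import OrderedDict
from typing import Dict, Protocol


class UnderStore(Protocol):
    """Anything that can return the bytes stored at a path (hypothetical interface)."""

    def read(self, path: str) -> bytes: ...


class DictStore:
    """Stand-in for a remote store such as an object store; counts backend reads."""

    def __init__(self, objects: Dict[str, bytes]) -> None:
        self.objects = dict(objects)
        self.reads = 0

    def read(self, path: str) -> bytes:
        self.reads += 1
        return self.objects[path]


class UnifiedStorage:
    """One namespace over many stores, with an in-memory LRU tier in front."""

    def __init__(self, max_entries: int) -> None:
        self.max_entries = max_entries  # memory tier size, counted in objects
        self.mounts: Dict[str, UnderStore] = {}
        self.cache: "OrderedDict[str, bytes]" = OrderedDict()

    def mount(self, prefix: str, store: UnderStore) -> None:
        self.mounts[prefix] = store

    def read(self, path: str) -> bytes:
        # Memory tier hit: serve from RAM and mark as most recently used.
        if path in self.cache:
            self.cache.move_to_end(path)
            return self.cache[path]
        # Miss: route to the store with the longest matching mount prefix.
        for prefix in sorted(self.mounts, key=len, reverse=True):
            if path.startswith(prefix):
                data = self.mounts[prefix].read(path[len(prefix):])
                self.cache[path] = data
                if len(self.cache) > self.max_entries:
                    self.cache.popitem(last=False)  # evict least recently used
                return data
        raise FileNotFoundError(path)
```

A caller would mount each store once, for example `fs.mount("/s3/", DictStore({...}))`, and then read every path through `fs.read(...)` regardless of where the data lives. A repeated read of the same path is served from memory without touching the backing store.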

Only three years in existence, Alluxio has gained broad industry support as an open source project. With more than 200 contributors, 12,000 commits, and involvement from over 50 commercial organizations, Alluxio has surpassed many other open source projects in the same timeframe. Alluxio runs in production at some of the largest cloud providers for petabyte-scale workloads, in financial services to meet government regulations, for research by leading universities, and at technology vendors globally.

Intel recently published its findings on the diverse range of big data storage challenges that Alluxio can address.

"Big data analytics is driving new requirements for distributed memory across clusters for real-time streaming, interactive queries, analytics and graph processing," said Michael Greene, Intel vice president, Software and Services Group and general manager of System Technologies and Optimization. "We are excited to work with developer communities on Alluxio and to optimize Alluxio solutions on Intel platforms. Ultimately, this helps our customers create more innovative and high performance cloud and big data solutions."

In financial services, Alluxio brings many advantages. It helps banks make faster and better trading decisions through dramatic performance improvements and also helps satisfy regulatory requirements. Barclays, the global financial services firm with 48 million customers and clients, recently published a report about how it uses Alluxio to boost big data analytics performance without duplicating confidential customer information to disk.

Last summer, IBM Research published a study about using Tachyon for "ultra-fast big data processing" to overcome "critical bottlenecks for system workloads."

At some of the world's cloud computing giants, Alluxio allows business analysts to discover insights interactively by analyzing petabytes of data in near real time, improving the customer experience.

"As one of the largest Internet companies in the world, Baidu constantly faces the challenges of managing data at multi-petabyte scale. By adopting innovative technologies like Alluxio we are able to help our users extract meaningful and useful data almost instantly," said James Peng, Chief Architect at Baidu. "Our deployment of an Alluxio cluster has already reached 1,000 workers, which is one of the largest Alluxio clusters in the world. The tiered storage of Alluxio has provided us great flexibility in managing data in large-scale. We are seeing an average 10-fold, and up to 30-fold performance improvement in supporting interactive query system and other types of workloads. This greatly improved the speed in making important business decisions."

"As the cloud computing business for Alibaba Group, the world's leading e-commerce business, Alibaba manages many of the world's largest data centers, including the largest big data cluster ever built in China," said Wensong Zhang, CTO and Senior Research Fellow of AliCloud, founder of Linux Virtual Server. "With Alluxio combined with AliCloud OSS as well as other AliCloud cloud service products, our customers can leverage the technology trends of hardware to run important jobs at the fastest performance. We have been contributing to the Alluxio open source community and believe that Alluxio will play a critical role in the future of big data infrastructure."

Published Tuesday, February 23, 2016 7:27 PM by David Marshall