Alluxio (formerly
known as Tachyon), the world's first memory-centric virtual distributed
storage system, today announced its open source version 1.0 release. The
vision for Alluxio is to become the de facto storage unification layer
for big data and other scale-out application environments in the same
manner that Apache Spark became the standard computation layer.
Alluxio's memory-centric architecture provides orders-of-magnitude
performance gains over existing solutions and superior
manageability by allowing developers to interact with a single storage
layer API without worrying about the configurations and complexities of
underlying storage and file systems. Co-created by Haoyuan Li, who is CEO
of Alluxio, Inc. and a founding committer of Spark, Alluxio ushers in the
next generation of storage virtualization for petabyte-scale computing.
"A storage unification layer that bridges computation
frameworks and underlying storage systems is long overdue in the
enterprise," said Haoyuan Li. "Alluxio is that unification layer with a
memory-centric architecture. Alluxio enables any framework to access any
data, from any storage at memory speeds."
Organizations can run any computation framework (e.g. Apache
Spark, Apache Hadoop MapReduce, Presto) with any storage system (e.g.
Amazon S3, EMC, Google Cloud Storage, NetApp) and utilize any storage
media (DRAM, SSD, HDD, etc.). As a memory-centric system, Alluxio delivers
orders-of-magnitude performance gains and improves manageability for
existing configurations.
Only three years in existence, Alluxio has gained broad
industry support as an open source project. With more than 200
contributors, 12,000 commits, and over 50 commercial organizations
involved, Alluxio has outpaced many other open source projects over the
same timeframe. Alluxio runs in production at some of the largest cloud
providers for petabyte-scale workloads, in financial services to meet
government regulations, for research by leading universities, and at
technology vendors globally.
Intel recently published its findings on the diverse range of big data storage challenges that Alluxio can address.
"Big data analytics is driving new requirements for
distributed memory across clusters for real-time streaming, interactive
queries, analytics and graph processing," said Michael Greene, Intel
vice president, Software and Services Group and general manager of
System Technologies and Optimization. "We are excited to work with
developer communities on Alluxio and to optimize Alluxio solutions on
Intel platforms. Ultimately, this helps our customers create more
innovative and high performance cloud and big data solutions."
In financial services, Alluxio helps banks make faster and better
trading decisions through dramatic performance improvements and also
helps satisfy regulatory requirements.
Barclays, the global financial services firm with 48 million customers
and clients, recently published a report
about how it uses Alluxio to boost big data analytics performance
without duplicating confidential customer information to disk.
Last summer, IBM Research published a study about using Tachyon for "ultra-fast big data processing" to overcome "critical bottlenecks for system workloads."
For some of the world's cloud computing giants, Alluxio
allows business analysts to discover insights interactively by
analyzing petabytes of data in near real time to improve customer
experience.
"As one of the largest Internet companies in the world, Baidu
constantly faces the challenges of managing data at multi-petabyte
scale. By adopting innovative technologies like Alluxio we are able to
help our users extract meaningful and useful data almost instantly,"
said James Peng, Chief Architect at Baidu. "Our deployment of an Alluxio
cluster has already reached 1,000 workers, which is one of the largest
Alluxio clusters in the world. The tiered storage of Alluxio has
provided us great flexibility in managing data in large-scale. We are
seeing an average 10-fold, and up to 30-fold performance improvement in
supporting interactive query system and other types of workloads. This
greatly improved the speed in making important business decisions."
"As the cloud computing business for Alibaba Group, the
world's leading e-commerce business, Alibaba manages many of the world's
largest data centers, including the largest big data cluster ever built
in China," said Wensong Zhang, CTO and Senior Research Fellow of
AliCloud, founder of Linux Virtual Server. "With Alluxio combined with
AliCloud OSS as well as other AliCloud cloud service products, our
customers can leverage the technology trends of hardware to run
important jobs at the fastest performance. We have been contributing to
the Alluxio open source community and believe that Alluxio will play a
critical role in the future of big data infrastructure."