Developer(s) | Alluxio Open Foundation |
---|---|
Stable release | 1.8.1 / September 27, 2018 |
Type | Open Source Memory Speed Virtual Distributed Storage |
License | Apache License 2.0 |
Website | www |
Alluxio is an open source virtual distributed file system. Alluxio (initially named Tachyon) was developed in a doctoral thesis at the University of California, Berkeley AMPLab with grant funding from DARPA. Alluxio is an in-memory data layer between applications and data storage systems. The software is published under the Apache License.
Overview
The motivation for creating Alluxio stemmed from other research projects at AMPLab, notably Apache Mesos and Apache Spark, which focused on the compute and data layers, respectively. Haoyuan Li, then a Ph.D. student working on distributed systems, identified the need for innovation at the data layer. Li developed the first version of Alluxio with the goal of creating technology that simplifies the way application frameworks connect to disparate and heterogeneous storage systems.[1] Alluxio is now used commercially in cloud-based big data environments for applications such as analytics processing and machine learning. Common use cases[2] include:
- Improving application performance by caching frequently accessed data or caching data locally from remote sources
- Unifying data from multiple storage systems and/or locations
- Providing shared access to a single data set for multiple application frameworks
- Simplifying data access in hybrid cloud environments
History
- In 2012 Haoyuan Li (also known as “HY”) was a Ph.D. student focused on distributed systems at the University of California, Berkeley AMPLab when he developed the first version of Alluxio (then known as the Tachyon project). Tachyon was incorporated into the Berkeley Data Analytics Stack.[3]
- In 2013 Alluxio was initially released under the Apache open source license. The current source code repository can be found on GitHub.
- In November 2014 the first academic paper on Alluxio, “Tachyon: Reliable, Memory Speed Storage for Cluster Computing Frameworks”, was published at the ACM Symposium on Cloud Computing (SoCC).[4]
- In March 2015 Tachyon Nexus (later renamed Alluxio, Inc.) was founded to provide ongoing development and commercial support for Alluxio.[5]
- In February 2016 Tachyon was renamed Alluxio and open source version 1.0 was released.[6][7]
- In July 2018 version 1.8 was released.[8][9]
- Key Alluxio project members and contributors: [10]
- Haoyuan (“HY”) Li, creator
- Founding members: Bin Fan, Yupeng Fu, Calvin Jia, Gene Pang
Technology
Alluxio is a virtual distributed file system that creates a shared in-memory data layer between compute and storage. The software acts as an abstraction layer that presents a set of disparate data stores (file or object) as a single file system, providing standard APIs and consistent semantics for applications. The solution integrates three primary innovations:[1]
- Unified Namespace: also referred to as a global namespace, the Alluxio file system aggregates disparate data sources regardless of location. Data sources, stored in any file- or object-based file system, are virtualized and appear as a single namespace that can be mounted and accessed via the Alluxio file system.
- API Translation: Alluxio converts from client-side interface to native storage interface via server-side API translation.
- Intelligent Cache Management: Configuration settings and user-defined policies establish the framework for cache management (data fetching and replacement), resource utilization across media (DRAM, SSD, HDD), data placement for performance and reliability, and data consistency with persistent storage.
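The unified namespace and API translation ideas above can be illustrated with a minimal sketch. It is a toy model, not Alluxio's implementation or API: the `MountTable` class, its `mount` and `resolve` methods, and the host and bucket names are all hypothetical, chosen only to show how a single virtual path space might be mapped onto several backing stores by longest-prefix matching.

```python
class MountTable:
    """Toy unified namespace (hypothetical, not Alluxio's API).

    Maps virtual path prefixes to URIs in backing stores.
    """

    def __init__(self):
        self.mounts = {}

    def mount(self, virtual_prefix, store_uri):
        # Normalize so "/logs/" and "/logs" name the same mount point.
        key = "/" + virtual_prefix.strip("/")
        self.mounts[key] = store_uri.rstrip("/")

    def resolve(self, virtual_path):
        """Translate a virtual path into the backing-store URI."""
        path = "/" + virtual_path.strip("/")
        # Longest prefix first, so nested mounts shadow their parents.
        for prefix in sorted(self.mounts, key=len, reverse=True):
            if prefix == "/":
                rest = path
            elif path == prefix or path.startswith(prefix + "/"):
                rest = path[len(prefix):]
            else:
                continue
            return self.mounts[prefix] + ("" if rest == "/" else rest)
        raise KeyError(f"no mount covers {virtual_path!r}")


if __name__ == "__main__":
    table = MountTable()
    # Hypothetical backing stores: an HDFS cluster and an S3 bucket.
    table.mount("/", "hdfs://namenode:9000/alluxio")
    table.mount("/logs", "s3://example-bucket/raw-logs")

    print(table.resolve("/logs/2018/app.log"))
    # s3://example-bucket/raw-logs/2018/app.log
    print(table.resolve("/users/a.csv"))
    # hdfs://namenode:9000/alluxio/users/a.csv
```

An application in this sketch sees only paths such as `/logs/2018/app.log`. Which storage system actually holds the data is decided by the mount table, which is the property the unified namespace description relies on.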
Editions
Alluxio Open Source (AOS) is available for free download with no restrictions on the number of nodes to deploy on or duration of use.[11] Alluxio source code can be downloaded from GitHub. With over 800 contributors as of August 2018, Alluxio is one of the most popular open source big data projects in the world.[12]
Version History[13]
Version | Release Date |
---|---|
1.0 | 2016-02-23 |
1.1 | 2016-06-07 |
1.2 | 2016-07-18 |
1.3 | 2016-10-06 |
1.4 | 2017-01-13 |
1.5 | 2017-06-12 |
1.6 | 2017-09-25 |
1.7 | 2018-01-16 |
1.8 | 2018-07-09 |
Alluxio Community Edition (ACE) is a free edition based on Alluxio Open Source (AOS) and supports Alluxio Manager, a graphical user interface (GUI) for system management.
Alluxio Enterprise Edition (AEE) is a commercially supported edition available via subscription. AEE includes AOS plus additional enterprise features.[14]
References
- ^ a b Li, Haoyuan. "Alluxio: A Virtual Distributed File System | EECS at UC Berkeley". www2.eecs.berkeley.edu. Retrieved 2018-12-26.
- ^ "Alluxio - Open Source Memory Speed Virtual Distributed Storage". www.alluxio.org. Retrieved 2018-12-26.
- ^ Harris, Derrick (2014-08-02). "The lab that created Spark wants to speed up everything, including cures for cancer". gigaom.com. Retrieved 2018-12-26.
- ^ Li, Haoyuan; Ghodsi, Ali; Zaharia, Matei; Shenker, Scott; Stoica, Ion (2014-06-16). "Reliable, Memory Speed Storage for Cluster Computing Frameworks". Fort Belvoir, VA. doi:10.21236/ada611854.
- ^ Gage, Deborah (2015-03-17). "Andreessen Horowitz Invests $7.5M in Big-Data Startup Tachyon". WSJ. Retrieved 2018-12-26.
- ^ Asay, Matt (2016-02-23). "Big data developers' hallelujah moment for distributed storage". TechRepublic. Retrieved 2018-12-26.
- ^ "Alluxio Virtualizes Distributed Storage for Petabyte Scale Computing at In-Memory Speeds". Marketwired. February 23, 2016.
- ^ Ribeiro, Anna. "Alluxio v1.8 boosts cloud deployments for analytics and machine learning to ease migration from HDFS to cloud". WWPI. Retrieved 2018-12-26.
- ^ "Alluxio adds connectors for multi-cloud data migration". SearchStorage. Retrieved 2018-12-26.
- ^ "Alluxio - Open Source Memory Speed Virtual Distributed Storage". www.alluxio.org. Retrieved 2018-12-26.
- ^ "Alluxio - Open Source Memory Speed Virtual Distributed Storage". www.alluxio.org. Retrieved 2018-12-26.
- ^ Alluxio, formerly Tachyon, Unify Data at Memory Speed: Alluxio/alluxio, Alluxio, 2018-12-26, retrieved 2018-12-26
- ^ Alluxio, formerly Tachyon, Unify Data at Memory Speed: Alluxio/alluxio, Alluxio, 2018-12-26, retrieved 2018-12-26
- ^ "Products". Alluxio. Retrieved 2018-12-26.