Virtualization Technology News and Information
Accelerating Data-Intensive Applications with Cloud Native Local Storage

DaoCloud-HwameiStor 

By Mingming Zhou, Senior Software Engineer, DaoCloud and Simon YN Zhao, Senior Solution Architect, DaoCloud

HwameiStor is a software-defined, cloud-native local storage system that combines the advantages of local disks and commercial storage products: cost-effectiveness, high performance, and rich enterprise-level data management functions. Designed and developed for Kubernetes, HwameiStor has the following features:

  • Simple O&M: One-click containerized deployment with minimal server resources, and automated O&M
  • Elasticity: Storage resources (volumes, disks, nodes) scale dynamically on demand to support large-scale applications
  • Data management: Multiple volume types for HA, snapshots, cloning, restoration, failover, and one-click eviction for data migration

Use case for LLM Pre-Training

HwameiStor provides a simple, efficient way to accelerate dataset access (read) and quick checkpoint (CKPT) saves (write): object storage plus local storage.

hwameiStor-diagram 

HwameiStor first loads the remote dataset onto object storage nodes in the cluster, then moves the data to local storage on the training nodes. When the same dataset is loaded again during training, HwameiStor provides local storage access either by scheduling training tasks to the nodes that hold the data or by reloading the dataset.
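The loading order described above (local storage first, then in-cluster object storage, then the remote source) can be illustrated with a minimal Python sketch. This is not HwameiStor code: the function name `fetch_dataset` is hypothetical, and all three tiers are modeled as plain directories purely for illustration.

```python
import shutil
from pathlib import Path
from typing import Tuple


def fetch_dataset(name: str, local_dir: Path, object_dir: Path,
                  remote_dir: Path) -> Tuple[Path, str]:
    """Return (path, tier) for a dataset, filling faster tiers on a miss.

    Assumption: the directories are stand-ins. `remote_dir` models the
    remote source, `object_dir` the in-cluster object storage, and
    `local_dir` a local volume on the training node.
    """
    local_copy = local_dir / name
    if local_copy.exists():
        # Fastest path: the data is already on the training node.
        return local_copy, "local"

    object_copy = object_dir / name
    tier = "object"
    if not object_copy.exists():
        # First load: pull from the remote source into the cluster cache.
        shutil.copy2(remote_dir / name, object_copy)
        tier = "remote"

    # Stage the data on the training node for subsequent reads.
    shutil.copy2(object_copy, local_copy)
    return local_copy, tier
```

In this sketch the first call for a dataset reports the tier "remote", a repeat call on the same node reports "local", and a call from a different node reports "object" because the cluster cache is already populated.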

In the CKPT quick save scenario, HwameiStor uses memory and NVMe as the underlying storage to accelerate data writes, and asynchronously writes the CKPT data to object storage within the cluster so that it remains shareable and secure.
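The write path above, a fast local write followed by an asynchronous copy to object storage, follows a common pattern that can be sketched in Python. The class `CheckpointWriter` and the file naming are hypothetical, and both storage tiers are modeled as directories; the article does not describe how HwameiStor implements this internally.

```python
import queue
import shutil
import threading
from pathlib import Path


class CheckpointWriter:
    """Write checkpoints to fast storage, then copy them out in the background."""

    def __init__(self, fast_dir: Path, object_dir: Path) -> None:
        self.fast_dir = fast_dir      # stand-in for a memory- or NVMe-backed volume
        self.object_dir = object_dir  # stand-in for in-cluster object storage
        self._pending = queue.Queue()
        self._worker = threading.Thread(target=self._upload_loop, daemon=True)
        self._worker.start()

    def save(self, step: int, payload: bytes) -> Path:
        """Block only for the local write; the upload is queued."""
        path = self.fast_dir / f"ckpt-{step:08d}.bin"
        path.write_bytes(payload)
        self._pending.put(path)
        return path

    def _upload_loop(self) -> None:
        while True:
            path = self._pending.get()
            if path is None:          # sentinel: stop the worker
                break
            shutil.copy2(path, self.object_dir / path.name)

    def close(self) -> None:
        """Flush queued uploads and stop the background thread."""
        self._pending.put(None)
        self._worker.join()
```

The training loop only waits for `save` to finish the local write; calling `close` at the end of the job ensures every queued checkpoint has reached the shared tier.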

In this solution, object storage serves as the second-level cache for datasets in the cluster, and HwameiStor provides local storage for the training phase. The advantages are: simple deployment; minimal data loading overhead; best performance through local data access; and high stability without additional complexity or failure points.

Use case for infrastructure workloads

Middleware (message queues, Kafka, MySQL) and cloud native virtualization (KubeVirt) scenarios require underlying storage that offers data durability, performance, HA, and security, along with key capabilities that support virtual machine migration, snapshots, cloning, and recovery.

In edge computing environments such as K3s and KubeEdge, data and services are distributed to the edge, where server resources are extremely limited and the network is unreliable. The underlying storage must be stable and have a small resource footprint.

hwameiStor-storage 

To meet these storage requirements, HwameiStor pools disks of multiple types (NVMe, SSD, HDD) to provide high-performance local storage volumes for upper-layer applications. HwameiStor also provides an HA data volume type, along with services such as snapshots, cloning, and recovery, to help users achieve business continuity and meet requirements for data security, reliability, and high availability.
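As a rough illustration of disk pooling by type, the Python sketch below groups disks into per-type pools and carves volumes out of a chosen pool. The names `Disk` and `StoragePools` and the placement rule (pick the disk with the most free space) are assumptions made for this example, not HwameiStor's actual data model or scheduler.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Disk:
    name: str
    kind: str       # "NVMe", "SSD", or "HDD"
    free_gib: int


class StoragePools:
    """Group disks by type and allocate volumes from a chosen pool."""

    def __init__(self, disks: Iterable[Disk]) -> None:
        self.pools: Dict[str, List[Disk]] = defaultdict(list)
        for disk in disks:
            self.pools[disk.kind].append(disk)

    def free_capacity(self, kind: str) -> int:
        """Total free space, in GiB, across all disks of one type."""
        return sum(d.free_gib for d in self.pools.get(kind, []))

    def allocate(self, kind: str, size_gib: int) -> str:
        """Reserve space on one disk of the pool and return that disk's name.

        Assumed rule: a volume must fit on a single disk, and the disk
        with the most free space is chosen.
        """
        candidates = [d for d in self.pools.get(kind, []) if d.free_gib >= size_gib]
        if not candidates:
            raise ValueError(f"no {kind} disk has {size_gib} GiB free")
        disk = max(candidates, key=lambda d: d.free_gib)
        disk.free_gib -= size_gib
        return disk.name
```

An application that needs low latency would request a volume from the "NVMe" pool, while bulk data could go to "HDD"; the pool abstraction hides which physical disk ends up backing the volume.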

Advanced features for production:

  • QoS: Volume-based I/O rate limiting for performance stability
  • Data migration: Manual or automatic data volume migration when faults occur
  • Audit logs: Usage and operation logs at the cluster, node, and data volume levels
  • Dynamic expansion: Manual or automatic expansion of data volumes, disks, and nodes
  • Management tools: UI and command-line management and O&M tools
  • Data services: Business continuity tools such as snapshots, cloning, and other services
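Volume-based I/O rate limiting, as listed under QoS above, is often built on a token bucket. The Python sketch below shows that general technique only; the article does not say which mechanism HwameiStor uses, and the class name `TokenBucket` and its parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow up to `rate_per_sec` operations per second, with bursts up to `burst`."""

    def __init__(self, rate_per_sec: float, burst: float, clock=time.monotonic) -> None:
        self.rate = rate_per_sec
        self.burst = burst
        self.clock = clock            # injectable so the limiter can be tested
        self.tokens = float(burst)    # start with a full bucket
        self.last = clock()

    def try_acquire(self, n: float = 1) -> bool:
        """Take `n` tokens if available; return False when the I/O should wait."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= n:
            self.tokens -= n
            return True
        return False
```

Giving each volume its own bucket keeps one busy workload from starving its neighbors: requests that find the bucket empty are delayed until it refills, which caps the sustained I/O rate at `rate_per_sec`.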

HwameiStor is a CNCF Sandbox project: https://github.com/hwameistor.  To learn more about HwameiStor, stay tuned for KubeCon Europe 2024.

##

ABOUT THE AUTHORS

Mingming Zhou is a HwameiStor developer and maintainer.

Simon YN Zhao is a HwameiStor maintainer with 20 years of storage experience and 5 years of cloud native experience. He previously worked for HDS, Sun, and EMC.

Published Thursday, February 15, 2024 7:35 AM by David Marshall