CelerData announced the latest version of its
enterprise analytics platform, CelerData Version 3. CelerData is built on top
of the open-source project StarRocks, the fastest MPP SQL database, which was
recently donated to the Linux Foundation.
"The data
lakehouse has added critical capabilities to the data lake architecture by
introducing ACID control, table formats and data governance," said James Li,
CEO, CelerData. "However, analytics capabilities on the lakehouse are still
limited and cost prohibitive. Most query engines struggle to support
interactive ad-hoc queries, are not able to support real-time analytics, and
fall apart when facing a large number of concurrent users."
With the release
of CelerData V3, lakehouse users have the option to conduct high-performance
analytics without ingesting data into a central data warehouse.
Compared to
other common query engines, CelerData delivers at least 3 times the query
performance while significantly reducing infrastructure cost.
"Though
several challenges exist when it comes to the underlying infrastructure
supporting data lakes, organizations continue to look for solutions and
approaches that can address those challenges head-on. They understand the value
an organization can achieve when implementing a data lake the right way," said
Mike Leone, principal analyst, ESG. "Of all the data lake environment
challenges organizations experience today, our research shows the greatest
challenge is the management, optimization, and automation of data placement.
With CelerData's support for a lakehouse architecture through the integration
with common table formats such as Iceberg and Hudi, a data lakehouse can now
have the option to conduct high-performance analytics without ingesting data into
a central data warehouse."
Data
lakehouse users can perform analytics by querying across streaming data and
historical data in real time, without having to wait and combine streaming data
into batches for analysis. This greatly simplifies the data architecture and
improves the timeliness of lakehouse analytics. CelerData's advanced query
engine can support thousands of concurrent users at 10,000 QPS (queries per
second), enabling use cases previously not possible on the data lakehouse.
New features in CelerData V3 include:
- Cloud-native architecture
  - CelerData 3's cloud-native architecture leverages cloud object storage to
    improve reliability and reduce storage cost.
  - It also enables better workload and resource isolation, so that users can
    create different warehouses for different use cases.
  - With this feature, CelerData now supports multi-AZ availability in the
    cloud.
- High-performance data lake analytics
  - By integrating with open table formats such as Hudi, Iceberg, and Delta
    Lake, customers can now enjoy the industry-leading performance of the
    CelerData query engine on a data lake without data ingestion.
  - Unlike other data lake query engines, CelerData gives users the option to
    bring data into its own storage format on the lake for the best query
    performance.
  - A local caching layer can be enabled to improve remote I/O performance.
  - Multi-table materialized views can be created to further improve query
    performance.
- Real-time streaming analytics on the data lakehouse
  - Most enterprises use a separate platform for streaming analytics. With
    CelerData 3, streaming data analytics and data lake analytics are unified
    into one platform, eliminating the roadblocks to real-time insights on a
    data lakehouse.
- Multi-table materialized views simplify data pipelines
  - Materialized views can be built from multiple joined base tables to
    improve query performance.
  - Users can now ingest raw data and transform it within CelerData,
    significantly simplifying the data processing pipeline.
CelerData Version 3 will be
generally available in early April 2023.