StarTree unveiled a set of new
capabilities designed to help organizations handle evolving data
structures efficiently, improve query performance, and streamline user access management,
supporting faster and more reliable real-time analytics at scale.
As organizations
shift towards real-time analytics, they face the dual challenge of adapting to
an environment where everything is expected instantly and at an unprecedented
scale. The rapid evolution of table sizes, the growing number of tables, and
soaring ingestion and query rates amplify the complexity of managing dynamic
data structures. Apache Pinot has demonstrated this kind of scale in production:
LinkedIn serves upwards of 650,000 queries per second,
Stripe manages approximately 1 PB of data, and Uber achieves a 99th
percentile query latency of just 100 milliseconds. Unlike batch systems, which
benefit from stable, periodic data loads, real-time analytics requires
solutions that maintain performance, security, and reliability under
constantly changing conditions.
Changes in
streaming pipelines, such as schema shifts or data gaps, require immediate
resolution, and optimized queries are essential for sustaining throughput and
responsiveness in customer-facing applications. StarTree Cloud's new
capabilities address these unique needs, enabling organizations to manage
real-time data efficiently while maintaining stringent performance and security
standards.
New capabilities in
StarTree Cloud include:
- Pauseless Ingestion: Pauseless ingestion underscores StarTree's
dedication to data freshness at scale, ensuring that every second counts. As
organizations ingest tens of millions of messages per second, even brief delays
can impact data accuracy and decision-making. StarTree Cloud maintains
continuous data flow during segment building and upload phases, enabling
businesses to deliver real-time, reliable insights at scale.
- Performance Manager: Performance Manager is designed to help teams
scale up quickly and minimize time to value when using Apache Pinot. With an intuitive,
machine-learning-powered interface, it simplifies the process of optimizing
query performance. New users often face challenges navigating the extensive
range of indexing technologies available. Performance Manager addresses this by
analyzing query structures and metrics to recommend enhancements such as
indexes, bloom filters, derived columns, or star-tree indexes. Users can apply
these optimizations with a click, achieving immediate performance gains. This
automation not only accelerates onboarding and improves efficiency but also maximizes
cluster throughput, reducing manual effort and boosting overall system
performance.
- Schema Evolution: In real-time databases, where data flows continuously,
schema evolution in StarTree Cloud allows the system to accommodate new fields,
indexes, altered data types, or other structural modifications without
disrupting operations. This capability is essential for maintaining data
consistency and ensuring that applications relying on the database continue to
function smoothly despite the evolving nature of the input data.
- Data Backfill: This feature addresses incorrect or missing data by
enabling users to seamlessly reload data from past events, filling any gaps in
data flows. When data fails to load or stream correctly, backfill allows teams
to go back and reload the affected records, preserving
consistency across datasets. This capability is particularly valuable in
real-time analytics, where continuous data integrity is essential. By
automating the backfill process, organizations can easily maintain accurate,
up-to-date information.
- Role-Based Access Control (RBAC) Management: RBAC management enhances
security and simplifies user administration for real-time data analytics. This feature
allows organizations to easily assign and control user access based on roles,
ensuring secure, efficient access to sensitive data, even when that data is ingested
and analyzed within a sub-second window.
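As a rough illustration of the simplest case of schema evolution described in the list above (adding a new field without disturbing existing ones), the sketch below appends a column to a schema document. It assumes the open-source Apache Pinot schema JSON layout (`dimensionFieldSpecs`, `metricFieldSpecs`, `dateTimeFieldSpecs`). The `orders` schema and the helper `add_dimension_field` are hypothetical, and this is not StarTree Cloud's API. It does not cover altered data types or index changes.

```python
import copy
import json

# Field-spec groups used by the open-source Apache Pinot schema JSON layout.
FIELD_GROUPS = ("dimensionFieldSpecs", "metricFieldSpecs", "dateTimeFieldSpecs")


def add_dimension_field(schema, name, data_type, default_null_value=None):
    """Return a copy of `schema` with one new dimension column appended.

    Hypothetical helper: only additive changes are handled, so existing
    columns keep their names and types and the input schema is not mutated.
    """
    existing = {
        spec["name"]
        for group in FIELD_GROUPS
        for spec in schema.get(group, [])
    }
    if name in existing:
        raise ValueError(f"column {name!r} already exists")

    evolved = copy.deepcopy(schema)
    spec = {"name": name, "dataType": data_type}
    if default_null_value is not None:
        # Rows ingested before the change can fall back to this value.
        spec["defaultNullValue"] = default_null_value
    evolved.setdefault("dimensionFieldSpecs", []).append(spec)
    return evolved


# Hypothetical starting schema for an "orders" table.
orders_v1 = {
    "schemaName": "orders",
    "dimensionFieldSpecs": [{"name": "orderId", "dataType": "STRING"}],
    "metricFieldSpecs": [{"name": "amount", "dataType": "DOUBLE"}],
    "dateTimeFieldSpecs": [
        {
            "name": "ts",
            "dataType": "LONG",
            "format": "1:MILLISECONDS:EPOCH",
            "granularity": "1:MILLISECONDS",
        }
    ],
}

orders_v2 = add_dimension_field(orders_v1, "currency", "STRING", "UNKNOWN")
print(json.dumps(orders_v2["dimensionFieldSpecs"], indent=2))
```

Rejecting a duplicate column name and returning a copy keeps the change strictly additive, which is the property that lets existing queries keep working while the new field starts to populate.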
Kishore
Gopalakrishna, co-founder and CEO, StarTree, said, "StarTree's new
features bridge a crucial gap in real-time analytics, enabling organizations to
scale with the reliability and control typically seen in batch systems. By
tackling critical challenges in data management and security, StarTree helps
organizations better leverage real-time insights, enhancing their ability to
scale operations efficiently."
"Real-time
analytics presents a distinct challenge compared to batch-based analytics, as
it requires instant adaptation to data changes while maintaining stringent
performance demands," said Paul Nashawaty, Principal Analyst, AppDev, theCUBE
Research. "Industry research found that 75% of organizations struggle with the
complexity and latency associated with traditional real-time analytics
solutions. StarTree is bringing essential improvements to real-time analytical
databases, paving the way for broader adoption of real-time analytics across
many industries. It enables businesses to gain real-time insights from their
data and make faster, more informed decisions."
Availability
All of these capabilities are available in private preview in Q4 2024 for
StarTree Cloud customers and prospects, with general availability (GA) in Q1 2025.