Dremio announced ground-breaking innovations that deliver sub-second query
response times directly on cloud data lakes and support for thousands of
concurrent users and queries. In addition, Dremio now includes a built-in
integration with Microsoft Power BI, enabling users to instantly launch the
data visualization software from Dremio and immediately start querying data via
a direct connection.
"As the modern needs of businesses are forcing organizations to rethink their data
warehousing strategies, businesses are digitally transforming their technology
stack, embracing cloud data lakes and focusing on analytics initiatives," said
Mike Leone, senior analyst, ESG. "Businesses should get the BI and reporting
capabilities they're used to from a data warehouse and they should be able to
easily and cost-effectively bring all data to one place in an open format.
Dremio is among a wave of vendors delivering on the initial promise of a cloud
data lake, re-architecting platforms, offering managed services and more
tightly integrating with BI, data science, and machine learning platforms."
The latest Dremio product release enables companies to run production BI workloads,
including interactive dashboards, directly on Amazon S3 and Azure Data Lake
Storage (ADLS) - without having to move data into data warehouses, cubes,
aggregation tables or extracts. The new capabilities deliver simple,
self-service access to data and enable analysts to see results immediately,
eliminating their dependency on manual ETL processes or data engineering while
reducing the costs associated with data warehousing.
"The fact that organizations don't need to copy their data into a data warehouse for
BI workloads has been unthinkable for the last 30 years," said Tomer Shiran,
Dremio co-founder and chief product officer. "Today, our users can leverage
Dremio to power live dashboards and reports directly on S3 and ADLS, instead of
waiting weeks to have data moved into a data warehouse. We're removing limitations,
accelerating time to insight and empowering data teams."
Key new features of Dremio's cloud data lake
engine are designed to enable high-concurrency, low-latency SQL workloads,
including BI dashboards, directly on the cloud data lake. These include:
- Apache Arrow caching
- Dremio can now cache data reflections (physically optimized
representations of data) in the Apache Arrow format so the data can be
loaded directly into memory with zero compute processing overhead. This
eliminates the need to decode and decompress data at runtime, enabling
sub-second query response times for BI dashboards.
- Scale-out query planning
- Dremio supports horizontal scaling for coordinator nodes, in addition to
executor nodes, allowing companies to run high-concurrency workloads
consisting of thousands of simultaneous users and queries.
- Runtime filtering
- By automatically leveraging runtime intelligence from dimension tables,
Dremio drastically reduces the amount of data that must be read from a fact
table. This results in a performance speedup of more than 100x for star
schemas, workloads that have traditionally only been run on data
warehouses.
- Enhanced Power BI integration
- Microsoft and Dremio have partnered to develop a deeper integration
between Power BI and Dremio that enables users to launch Power BI Desktop
directly from the Dremio interface with the click of a button. Power BI
automatically connects to Dremio using a native connector, so users can
easily transition from building a dataset in Dremio to analyzing their
data in Power BI.
- External queries
- Dremio enables users to incorporate explicit SQL queries on their
relational databases within Dremio virtual datasets. This makes it easy to
join data between large datasets in a cloud data lake and smaller datasets
in existing relational databases.
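To illustrate the general idea behind the runtime filtering feature listed above, here is a minimal Python sketch. It is not Dremio's implementation or API: the function names, the `stores` and `sales` tables and their columns are all hypothetical, chosen only to show how join keys gathered from a small dimension table can limit which rows are read from a large fact table.

```python
# Illustrative sketch only: every name and value here is hypothetical,
# and this is not how Dremio exposes or implements runtime filtering.

def build_runtime_filter(dimension_rows, predicate, key):
    """Collect the join keys of dimension rows that pass the query's predicate."""
    return {row[key] for row in dimension_rows if predicate(row)}


def scan_fact_table(fact_rows, key, runtime_filter):
    """Yield only the fact rows whose join key appears in the runtime filter."""
    for row in fact_rows:
        if row[key] in runtime_filter:
            yield row


# Small dimension table (hypothetical data).
stores = [
    {"store_id": 1, "region": "EMEA"},
    {"store_id": 2, "region": "APAC"},
    {"store_id": 3, "region": "EMEA"},
]

# Larger fact table (hypothetical data).
sales = [
    {"store_id": 1, "amount": 10},
    {"store_id": 2, "amount": 20},
    {"store_id": 3, "amount": 30},
    {"store_id": 2, "amount": 40},
]

# Step 1: evaluate the dimension-side predicate first and keep the matching keys.
emea_ids = build_runtime_filter(stores, lambda r: r["region"] == "EMEA", "store_id")

# Step 2: use those keys to skip fact rows that cannot take part in the join.
emea_sales = list(scan_fact_table(sales, "store_id", emea_ids))
total = sum(row["amount"] for row in emea_sales)
```

In this toy example the filter is a plain in-memory set and every fact row is still visited; a real engine would more plausibly push such a filter into the scan so that non-matching data is never read from storage, which is where the savings described above would come from.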
Availability
The new product features are available today in both AWS and Azure.
Deployment options are available here.