Industry executives and experts share their predictions for 2021. Read them in this 13th annual VMblog.com series exclusive.
Machine Learning and Pervasive Intelligence
By Luke Han, Co-Founder and CEO at Kyligence
Achieving pervasive analytics requires the breaking down of the logistical
barriers that exist for every cloud and data platform. This means that a
logical data layer must exist outside of these platforms, one that also
integrates with, or extends, these platforms. A distributed architecture that
consists of a cloud data warehouse and cloud data storage service offerings is
what many organizations will have to turn to in 2021 to enable intelligence at
all levels of applications and analytics stacks. But data and application architects
need to think of "data warehousing" as a behavior, not as a product. The
resulting cloud-native thinking will enable the massive acceleration of
analytics needed to make them truly pervasive.
Embedded Analytics in SaaS
As an ever-growing proportion of
enterprise business functions use cloud services to operate, SaaS companies sit
on tons of data with tremendous potential value. It's a huge boulder of
potential energy sitting at the top of the hill. But until now, excessive
friction - in the form of cost, time, and effort - has kept this boulder from
tumbling down the hill, realizing its kinetic energy in the process.
For nearly all enterprises, the benefits
of cloud analytics are becoming more immediate and profitable. They can
quantify, analyze, and predict business performance across a wide scope of
functional areas and business processes. They can identify
potential up-sell and cross-sell opportunities and make informed decisions at a
very granular level about opportunities and threats.
But SaaS vendors may be the ultimate
beneficiaries of pervasive analytics as more and more companies turn to SaaS
solutions to streamline and modernize their operations while de-emphasizing
spending capital on data centers. The ability to offer useful insights at every
turn gives SaaS vendors a potential new revenue stream they can offer their
existing customer base. New SaaS analytics offerings also provide a competitive
advantage and extend the life and value of their core product. This will in
turn increase the stickiness of their solution as customers become accustomed
to - and then addicted to - the new wealth of easy-to-consume insights.
But these insights do come at a cost.
Pervasive analytics capabilities, whether they are dashboards for executives,
self-service analysis for business users, or reports for line-of-business
managers, create some familiar requirements for SaaS vendors:
Data Freshness and Accuracy
SaaS users inherently expect the data in their dashboards to be fresh and
accurate. With many SaaS vendors servicing customers in multiple time zones,
daily refreshes of analytics datasets may not be granular enough to be useful.
Rapid refresh of these datasets also enables data teams to identify any
anomalies, duplications, or errors early on.
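As a rough illustration of the kind of check a data team might run after each refresh, here is a minimal Python sketch. The `check_dataset` function, the `loaded_at` and `order_id` field names, and the one-hour freshness threshold are all assumptions made for this example, not part of any particular product.

```python
from datetime import datetime, timedelta, timezone


def check_dataset(rows, now, max_age=timedelta(hours=1), key="order_id"):
    """Report how stale a dataset is and which keys appear more than once."""
    # The newest load timestamp decides how fresh the whole dataset is.
    newest = max(row["loaded_at"] for row in rows)
    seen, duplicates = set(), []
    for row in rows:
        if row[key] in seen:
            duplicates.append(row[key])
        seen.add(row[key])
    age = now - newest
    return {"age": age, "is_fresh": age <= max_age, "duplicates": duplicates}


# Hypothetical sample data: three rows, one of them loaded twice.
now = datetime(2021, 1, 4, 12, 0, tzinfo=timezone.utc)
rows = [
    {"order_id": 1, "loaded_at": now - timedelta(minutes=50)},
    {"order_id": 2, "loaded_at": now - timedelta(minutes=20)},
    {"order_id": 2, "loaded_at": now - timedelta(minutes=20)},
]
report = check_dataset(rows, now)
print(report["is_fresh"], report["duplicates"])
```

The more often a check like this runs, the sooner a duplicated load or a stalled refresh shows up, which is the practical argument for refreshing more than once a day.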
Unified Data and Semantics
The line between business analysts and
data scientists will continue to blur. As software vendors productize
enterprise-grade machine learning software, business analysts are empowered to
conduct more advanced data science-inspired research. These data-savvy business
analysts demand a more powerful data service layer that provides a consolidated
view of data across the organization.
These business analysts need to analyze
at the 'speed of thought' and need to interact with their datasets without
breaking for coffee every time they submit a query. For today's information
workers and machine learning applications, the speed of thought is not fast
enough. With the general availability of machine learning algorithms and libraries,
we need a data service layer that can deliver insight at machine speed.
This means we need to look beyond today's
typical data service architecture, which usually includes data warehouses
and/or data lakes with some query engines, plus data science frameworks that
read and process large amounts of data. A consolidated data service layer
should be able to serve both human analytics and machine learning workloads,
with unified semantics across the enterprise, at speeds 10x or 100x faster
than today's most commonly used data technologies. This takes the form of a
Unified Semantic Layer and an updated notion of data warehousing.
Analytics Everywhere and Anywhere
We suspect that multi-cloud and
multi-platform strategies will become more popular in the enterprise. This is
the logical result of data gravity, where a created dataset becomes so
essential to a company, application or industry that analytics solutions must
grow around the data rather than the other way around.
The most common multi-cloud approach is
to choose different clouds for different applications. This approach allows
enterprises to choose the best of breed services from different cloud vendors.
In a broader context, multi-cloud can refer to a mix of public cloud, private
cloud, and on-premises architectures.
The challenge of this approach is that you now
have a new set of data silos on a grand scale with different supporting cloud
infrastructure. New regulatory requirements such as GDPR make it even harder to
connect those silos. An enterprise data service layer that is not only cloud
neutral, but also multi-cloud friendly, becomes essential to enable pervasive
analytics without undue operational friction.
Expect to see more customers interested in analyzing data in a multi-cloud
environment that supports multiple data platforms (data warehouses, data
lakes, cloud storage) and lets them merge, compare, and summarize data across
data platforms and cloud boundaries.
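To make the idea of summarizing across platforms concrete, here is a small Python sketch under stated assumptions. The platform names, column names, and the `SEMANTIC_MAP` structure are hypothetical; it only shows the general pattern of mapping each platform's columns onto shared business terms before combining the results.

```python
from collections import defaultdict

# Hypothetical mapping from shared business terms to each platform's columns.
SEMANTIC_MAP = {
    "warehouse_cloud_a": {"region": "rgn", "revenue": "rev_usd"},
    "lake_cloud_b": {"region": "sales_region", "revenue": "amount"},
}


def to_semantic(platform, row):
    """Rename one platform-specific row into the shared business terms."""
    mapping = SEMANTIC_MAP[platform]
    return {term: row[column] for term, column in mapping.items()}


def revenue_by_region(sources):
    """Sum revenue per region across every platform in `sources`."""
    totals = defaultdict(float)
    for platform, rows in sources.items():
        for row in rows:
            record = to_semantic(platform, row)
            totals[record["region"]] += record["revenue"]
    return dict(totals)


# Hypothetical sample rows from two platforms in two different clouds.
sources = {
    "warehouse_cloud_a": [
        {"rgn": "EMEA", "rev_usd": 120.0},
        {"rgn": "APAC", "rev_usd": 80.0},
    ],
    "lake_cloud_b": [{"sales_region": "EMEA", "amount": 30.0}],
}
totals = revenue_by_region(sources)
print(totals)
```

In this sketch the business user only ever asks for "revenue by region"; where each number physically lives is handled by the mapping, not by the person asking the question.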
This type of multi-cloud, multi-platform analytics
is foundational to pervasive analytics and therefore must be highly automated
and business-user friendly. We can't repeat the mistakes made with
technologies that require heavy data engineering efforts. We should expect a
tighter integration with enterprise data catalogs, a unified semantic layer
across cloud boundaries, and the advancement of data governance that is also
multi-cloud ready.
##
About the Author
Luke Han is Co-Founder and CEO at Kyligence and co-creator and PMC chair of the Apache Kylin project.
As head of Kyligence, he has been working to grow the Apache Kylin community,
expand its adoption, and build out a commercial software ecosystem through the
Kyligence Cloud product. Prior to Kyligence, he was the Big Data Product Lead
at eBay and Chief Consultant at Actuate China.