Virtualization Technology News and Information
Kyligence 2021 Predictions: Machine Learning and Pervasive Intelligence

Industry executives and experts share their predictions for 2021.  Read them in this 13th annual series exclusive.

Machine Learning and Pervasive Intelligence

By Luke Han, Co-Founder and CEO at Kyligence

Achieving pervasive analytics requires breaking down the logistical barriers that exist for every cloud and data platform. This means a logical data layer must exist outside of these platforms while also integrating with, or extending, them. A distributed architecture that consists of a cloud data warehouse and cloud data storage services is what many organizations will have to turn to in 2021 to enable intelligence at all levels of their application and analytics stacks. But data and application architects need to think of "data warehousing" as a behavior, not as a product. The resulting cloud-native thinking will enable the massive acceleration of analytics needed to make them truly pervasive.

Embedded Analytics in SaaS

As an ever-growing proportion of enterprise business functions use cloud services to operate, SaaS companies sit on tons of data with tremendous potential value. It's a huge boulder of potential energy sitting at the top of the hill. But until now, excessive friction - in the form of cost, time, and effort - has kept this boulder from tumbling down the hill and turning that potential into kinetic energy.

For nearly all enterprises, the benefits of cloud analytics are becoming more immediate and profitable. They can quantify, analyze, and predict business performance across a wide scope of functional areas and business processes. They can identify potential up-sell and cross-sell opportunities and make informed decisions at a very granular level about opportunities and threats.

But SaaS vendors may be the ultimate beneficiaries of pervasive analytics as more and more companies turn to SaaS solutions to streamline and modernize their operations while de-emphasizing capital spending on data centers. The ability to deliver useful insights at every turn gives SaaS vendors a potential new revenue stream from their existing customer base. New SaaS analytics offerings also provide a competitive advantage and extend the life and value of the core product. This will in turn increase the stickiness of the solution as customers become accustomed to - and then addicted to - the new wealth of easy-to-consume insights.

But these insights do come at a cost. Pervasive analytics capabilities, whether they are dashboards for executives, self-service analysis for business users, or reports for line-of-business managers, create some familiar requirements for SaaS vendors:

Data Freshness and Accuracy

SaaS users inherently expect the data in their dashboards to be fresh and accurate. With many SaaS vendors servicing customers in multiple time zones, daily refreshes of analytics datasets may not be granular enough to be useful. Rapid refresh of these datasets also enables data teams to identify anomalies, duplications, or errors early on.
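As a rough illustration of the freshness and duplication checks described above, here is a minimal Python sketch. The function names, the one-hour threshold, and the row layout are assumptions made for this example only; they are not part of any Kyligence product or API.

```python
from datetime import datetime, timedelta, timezone


def is_stale(last_refresh, now, max_age=timedelta(hours=1)):
    """Return True when the dataset is older than the allowed age.

    The one-hour default is an illustrative assumption, not a recommendation.
    """
    return now - last_refresh > max_age


def find_duplicates(rows, key):
    """Return key values that appear more than once, in first-seen order."""
    seen, dupes = set(), []
    for row in rows:
        value = row[key]
        if value in seen and value not in dupes:
            dupes.append(value)
        seen.add(value)
    return dupes


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    last_refresh = now - timedelta(hours=3)  # hypothetical refresh timestamp
    print("stale:", is_stale(last_refresh, now))
    print("duplicate ids:", find_duplicates([{"id": 1}, {"id": 1}], "id"))
```

Running a check like this after every refresh, rather than once a day, is what lets a data team catch a duplicated load or a stalled pipeline before customers in another time zone see it.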

Unified Data and Semantics

The line between business analysts and data scientists will continue to blur. As software vendors productize enterprise-grade machine learning software, business analysts are empowered to conduct more advanced, data science-inspired research. These data-savvy business analysts demand a more powerful data service layer, one built on a consolidated view of data across the organization.

These business analysts need to analyze at the "speed of thought" and interact with their datasets without breaking for coffee every time they submit a query. But for today's information workers and machine learning applications, even the speed of thought is not fast enough. With the general availability of machine learning algorithms and libraries, we need a data service layer that can deliver insight at machine speed.

This means we need to look beyond today's typical data service architecture, which usually includes data warehouses and/or data lakes with some query engines, plus data science frameworks that read and process large amounts of data. A consolidated data service layer should be able to serve both human analytics and machine learning workloads, with unified semantics across the enterprise, at speeds 10x or 100x faster than today's most commonly used data technologies. This takes the form of a Unified Semantic Layer and an updated notion of data warehousing.
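To make "unified semantics" a little more concrete, the following Python sketch shows the core idea in miniature: a metric is defined once and every consumer, whether a dashboard or a machine learning feature pipeline, evaluates that same definition. The `Metric` class, the `SEMANTIC_LAYER` registry, and the `total_revenue` metric are all hypothetical names invented for this illustration; a real semantic layer would also handle dimensions, joins, and query pushdown.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class Metric:
    """A single business metric, defined once for every consumer."""
    name: str
    column: str
    aggregate: Callable[[Iterable[float]], float]


# Hypothetical registry: the one place where "total_revenue" is defined.
SEMANTIC_LAYER = {
    "total_revenue": Metric("total_revenue", "amount", sum),
}


def evaluate(metric_name, rows):
    """Evaluate a registered metric over rows, whoever the caller is."""
    metric = SEMANTIC_LAYER[metric_name]
    return metric.aggregate(row[metric.column] for row in rows)


if __name__ == "__main__":
    orders = [{"amount": 10.0}, {"amount": 5.5}]
    # A dashboard and an ML feature pipeline would both call evaluate(),
    # so they cannot drift apart on what "total_revenue" means.
    print(evaluate("total_revenue", orders))
```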

Analytics Everywhere and Anywhere

We suspect that multi-cloud and multi-platform strategies will become more popular in the enterprise. This is the logical result of data gravity, where a created dataset becomes so essential to a company, application or industry that analytics solutions must grow around the data rather than the other way around.

The most common multi-cloud approach is to choose different clouds for different applications. This approach allows enterprises to choose best-of-breed services from different cloud vendors. In a broader context, multi-cloud can refer to a mix of public cloud, private cloud, and on-premises architectures.

The challenge of this approach is that you now have a new set of data silos on a grand scale, each with different supporting cloud infrastructure. New regulatory requirements such as GDPR make it even harder to connect those silos. An enterprise data service layer that is not only cloud neutral, but also multi-cloud friendly, becomes essential to enable pervasive analytics without undue operational friction.

Expect to see more customers interested in analyzing data in a multi-cloud environment that supports multiple data platforms (data warehouses, data lakes, cloud storage) and lets them merge, compare, and summarize data across platform and cloud boundaries.
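The merge, compare, and summarize step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the platform names `warehouse_a` and `lake_b` are hypothetical, the rows are already in memory, and a production system would push this work down to the platforms rather than pull raw data across cloud boundaries.

```python
from collections import defaultdict


def summarize_across_platforms(sources, key, value):
    """Merge rows from several platforms and total `value` per `key`.

    `sources` maps a platform name to its list of row dicts. The result
    keeps each platform's contribution (for comparison) plus a "total".
    """
    per_key = defaultdict(lambda: defaultdict(float))
    for platform, rows in sources.items():
        for row in rows:
            per_key[row[key]][platform] += row[value]
    return {
        k: {**by_platform, "total": sum(by_platform.values())}
        for k, by_platform in per_key.items()
    }


if __name__ == "__main__":
    sources = {
        "warehouse_a": [{"region": "EU", "sales": 100.0},
                        {"region": "US", "sales": 50.0}],
        "lake_b": [{"region": "EU", "sales": 25.0}],
    }
    for region, summary in summarize_across_platforms(
            sources, "region", "sales").items():
        print(region, summary)
```

Keeping the per-platform figures next to the total is a deliberate choice here: it supports the "compare" part of the workflow and makes a missing or lagging silo visible at a glance.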

This type of multi-cloud, multi-platform analytics is foundational to pervasive analytics and therefore must be highly automated and friendly to business users. We can't repeat the mistakes of technologies that require heavy data engineering effort. We should expect tighter integration with enterprise data catalogs, a unified semantic layer across cloud boundaries, and advances in data governance that is also multi-cloud ready.


About the Author

Luke Han 

Luke Han is Co-Founder and CEO at Kyligence and co-creator and PMC chair of the Apache Kylin project. As head of Kyligence, he has been working to grow the Apache Kylin community, expand its adoption, and build out a commercial software ecosystem through the Kyligence Cloud product. Prior to Kyligence, he was the Big Data Product Lead at eBay and Chief Consultant at Actuate China.

Published Wednesday, January 06, 2021 7:42 AM by David Marshall