Virtualization Technology News and Information
VMblog's Expert Interviews: GoodData Talks Enterprise Data, Analytics, AI and Machine Learning


There's a data explosion taking place right now, and that growth is only going to continue. Machine learning will be needed to automatically discover data and detect usage patterns across high-scale data lakes and other sources. Now is the time for data lakes to start proving their business value - not just storing massive quantities of data. To dive deeper into this topic, I reached out to Arvin Hsu, Senior Director of Data Science and Machine Learning at GoodData.

VMblog:  What is the current state of enterprise data as it pertains to machine learning?

Arvin Hsu:  As more, faster, and less structured data pours in from an estimated 50 billion IoT-connected devices by 2020 - adding to an already unprecedented amount of data - today's enterprises are facing the new challenge of making that data actionable and capable of creating meaningful change. 

Data lakes promised a path to make this happen, yet enterprises are abandoning them left and right. Fundamentally, this has been a reflection of the failed promise of extracting meaning from big data. The challenge of extracting concrete business value from terabytes of structured, unstructured, IoT, audio/visual data and more now lies in the hands of data scientists and machine learning (ML) engineers.

In order to meet this challenge, data scientists will need to develop more advanced cataloging tools to access data from many different sources, build visualization and discovery tools to help them understand the data at hand, and leverage automated, ML-driven meaning-extraction systems. The signals that will help businesses make better decisions, create better customer value, and optimize their workflows reside within the big data storage centers of these enterprises. It will be up to the AI teams to write the ML algorithms that can detect those signals in order to effect meaningful change.

VMblog:  When will enterprises start to fully utilize big data? 

Hsu:  Early adopters like Amazon, Google, and Uber are already fully utilizing their data. Other enterprises are all playing catch-up - building out their data engineering pipelines, signal detection ML algorithms, and the ML operationalization systems required to turn insights into action. I predict that organizations will start reaping the benefits of their data stories in 2019 and beyond, as major enterprises continue to adopt cloud computing, scalable ML architecture, and streamlined production systems. These benefits will come to fruition as better customer personalization, more efficient, streamlined workflows, and optimization improvements across a myriad of business processes.

VMblog:  What role does the continued emergence of cloud compute and ML play in this?

Hsu:  As more siloed enterprise data sources get migrated to the cloud, data access and data democratization will increase, allowing data intelligence gurus to get a more holistic view of customers, products, and business processes. Similarly, the shift towards cloud services for big data pipelines, real-time and streaming processing, and compute allows for easier integration of everything that's needed to create high-impact data products for the business.

Cloud compute platforms allow much easier, faster provisioning for big data processing initiatives, and incremental, serverless billing models make it much cheaper to implement both temporary sandboxes and production compute systems. Cloud-based data pipelines and compute models create an ease of distribution for end-users, whether they are internal business units or embedded into a customer-facing application. GoodData has built an end-to-end intelligence platform that specifically leverages all these benefits, ingesting enterprise data into the cloud to deliver analytics and intelligence to end-users embedded directly into their applications.

VMblog:  How can data scientists make data actionable and capable of creating meaningful change?

Hsu:  Data scientists need to start with understanding business processes, use cases, and pain points. This allows them to tie their data discovery and model building to concrete business value - usually decisions or actions that a business takes. Throughout the data discovery and model building process, data scientists need to continue to think about the type of impact the models they build will have, such as the cost of Type I Errors (false positives) and Type II Errors (false negatives). The best models don't maximize mathematical accuracy or prediction performance - they maximize business impact.
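
Editor's illustration (not part of the interview): the point about Type I and Type II errors can be sketched in a few lines of Python. The scores, labels, candidate thresholds, and per-error costs below are all made-up assumptions chosen for illustration, and the function names are hypothetical. The sketch shows three decision thresholds that have identical accuracy on this toy data but very different business cost once each error type is given its own price.

```python
from typing import Sequence, Tuple


def confusion_counts(
    scores: Sequence[float], labels: Sequence[int], threshold: float
) -> Tuple[int, int]:
    """Return (false_positives, false_negatives) when scores >= threshold are flagged."""
    fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
    fn = sum(1 for s, y in zip(scores, labels) if s < threshold and y == 1)
    return fp, fn


def business_cost(
    scores: Sequence[float],
    labels: Sequence[int],
    threshold: float,
    cost_fp: int,
    cost_fn: int,
) -> int:
    """Total cost: each Type I error costs cost_fp, each Type II error costs cost_fn."""
    fp, fn = confusion_counts(scores, labels, threshold)
    return fp * cost_fp + fn * cost_fn


def pick_threshold(
    scores: Sequence[float],
    labels: Sequence[int],
    candidates: Sequence[float],
    cost_fp: int,
    cost_fn: int,
) -> float:
    """Choose the candidate threshold with the lowest total business cost."""
    return min(
        candidates,
        key=lambda t: business_cost(scores, labels, t, cost_fp, cost_fn),
    )


# Hypothetical model scores and true outcomes (1 = event happened).
scores = [0.1, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 1, 0, 1, 1]
candidates = [0.2, 0.5, 0.8]

for t in candidates:
    fp, fn = confusion_counts(scores, labels, t)
    # Every candidate makes exactly two mistakes, so accuracy alone cannot rank them.
    print(f"threshold={t}: false positives={fp}, false negatives={fn}")

# Assumed costs: a missed event is ten times worse than a false alarm...
print(pick_threshold(scores, labels, candidates, cost_fp=1, cost_fn=10))
# ...versus a false alarm being ten times worse than a missed event.
print(pick_threshold(scores, labels, candidates, cost_fp=10, cost_fn=1))
```

With these assumed numbers, the low threshold wins when missed events are expensive and the high threshold wins when false alarms are expensive, even though all three thresholds are equally "accurate". In practice the costs would come from the business process the model supports.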

VMblog:  What kind of business use cases can be made possible by deeper text analytics, more free-form text, and enhanced sentiment analysis?

Hsu:  Enterprises have a wealth of untapped information in unstructured text fields. Comprising everything from qualitative problem descriptions to customer satisfaction reports, unstructured text can provide amazing insight into not only customer satisfaction, behavior and opinion, but also business processes, user feedback, order processing and more. Using new deep learning models to better extract meaning from unstructured text fields, and feeding that meaning as input into learning models, will enable enterprises to extract significant value from all those untapped resources.

VMblog:  GoodData is unique in that its analytics and data pipeline are built end-to-end in the cloud. How does this help companies eliminate the pain points of getting meaningful insights from their data?

Hsu:  GoodData's end-to-end platform provides a seamless and efficient value creation process from data ingestion all the way to embedded recommendations and other analytics. The typical friction points of multi-source data integration, ML productionalization, and endless BI dashboards are all obviated. GoodData focuses on changing business workflows by embedding action-oriented analytics at the point-of-work. This creates fundamental changes in the actions and decisions that businesses make and creates a direct, seamless link from data ingestion to business value.

VMblog:  How will cloud-based AI services help manage the unprecedented volume and diversity of enterprise data in 2019 and beyond?

Hsu:  AI helps us comb through massive amounts of data, separating the signal from the noise. As the AI industry matures to handle more big data use cases, and technologies develop to deal with these issues, more big data stores that have yet to be tapped into will yield nuggets of value as the AI algorithms "automatically mine" them. These advanced mining algorithms will be more capable of detecting significant signals that impact business-critical KPIs. This holds true not only for unstructured text, but also for IoT data, audio/video/image data, data stored in ER databases, and more. All of this, of course, is tied to the continued availability of more powerful and more affordable compute resources. From Google's TPUs to AWS's P3 GPU clusters, the compute necessary for big data AI and deep learning continues to become more affordable.

VMblog:  As major players such as AWS, GCP, and Azure continue to introduce greater cloud compute and deep learning resources, what do you see as the direction of these technologies, and what does that mean for the future of AI?

Hsu:  All the major cloud vendors are competing against each other to offer the most innovative, across-the-board solutions for big data and AI. The competition fuels innovation and the creation of better serverless offerings that use ML to extract business value. Enterprises will only benefit from integrating cloud-based architecture solutions into their roadmap, whether they choose to work with a single vendor or a multi-cloud solution. We will also continue to see specialization and diversification among the cloud vendors. Microsoft has a technological lead with its R-language offerings, while Google has a lead with its TensorFlow-based offerings. As technology and innovation evolve, we will continue to see the cloud vendors differentiate and specialize into different areas of ML and AI.


Published Monday, April 02, 2018 7:34 AM by David Marshall