Virtualization Technology News and Information
VMblog Expert Interview: NVIDIA Talks Trends in Big Data and Deep Learning AI


Big Data has never been more central to our lives. Consider the advanced analytics technology that enables value to be extracted from this resource, with results achieved in some challenging areas, such as research into COVID-19. So where is analytics heading next, and what kind of solutions will enable this? To find out more, VMblog spoke with Manuvir Das, head of Enterprise Computing at NVIDIA.

VMblog:  What is happening with the ‘datasphere’, and where is data going?

Manuvir Das:  It's no secret that data is growing exponentially; it has been for decades and is expected to continue to accelerate in coming years. Autonomous vehicles will hit the roads, and more robots and devices are being deployed in hospitals and factories. Companies are re-architecting their businesses around AI, and the arrival of fast, low-latency 5G networks will create countless new data centers close to the edge. 

Given the billions of dollars invested in all of these systems, it's imperative for enterprises to gain faster insights from their data. Accelerated computing platforms play a key role in creating the composable data center infrastructure needed to deliver the speed, security and efficiency that will be essential in the next decade of computing.

This shift is already having an important impact. Consider drug discovery, which has been critical to responding to the COVID-19 crisis. To help save lives and economies around the world, the healthcare industry has sped up processes, creating solutions in record time. It has taken less than a year to bring vaccines and treatments to patients, processes that had often taken a decade or more in the past.

As businesses deploy more data-driven services, it will become critical for companies to add compute capabilities at the edge, close to where the data is created. This will not only create more data growth, but will also be critical for intelligent retail services like checkout-free shopping, or managing fleets of robotaxis. We'll see shifts in IT infrastructure to support these new opportunities, with technologies like DPUs being added to speed applications and boost security. 

VMblog:  What has changed with data science tools that can make an impact on businesses/research now?

Das:  Data science jobs have been at the top of LinkedIn's careers rankings for a few years now, inspiring more and more people to learn the skills needed to meet the demand for this work. All of this interest has created a boom for data science, yet enterprises are still only just beginning to operationalize their data science and AI pipelines. 

NVIDIA has been investing in helping data scientists work more efficiently with GPU-accelerated software for the end-to-end data science and AI pipeline. This year, we helped bring GPU acceleration to Apache Spark 3.0, the world's leading data analytics platform. We're working with Cloudera to accelerate the Cloudera Data Platform with the RAPIDS suite of GPU-accelerated data science software libraries. BlazingSQL is also using RAPIDS in its high-performance SQL engine to help SQL experts analyze their databases at lightning speed.
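To make the Spark 3.0 acceleration concrete: the RAPIDS Accelerator is enabled through Spark configuration rather than code changes, so existing DataFrame and SQL jobs can run on GPUs unmodified. The sketch below shows the general shape of that configuration from PySpark; the jar path, dataset path, and resource amounts are placeholders that depend on your deployment, and running it requires a Spark cluster with the plugin jar and GPUs available.

```python
from pyspark.sql import SparkSession

# Sketch: enabling the RAPIDS Accelerator for Apache Spark 3.0.
# Paths and resource settings below are illustrative placeholders.
spark = (
    SparkSession.builder
    .appName("gpu-etl-sketch")
    .config("spark.jars", "/opt/sparkRapidsPlugin/rapids-4-spark.jar")  # placeholder path
    .config("spark.plugins", "com.nvidia.spark.SQLPlugin")  # routes supported SQL/DataFrame ops to GPU
    .config("spark.rapids.sql.enabled", "true")
    .config("spark.executor.resource.gpu.amount", "1")      # one GPU per executor
    .getOrCreate()
)

# Unmodified Spark code then runs GPU-accelerated where the plugin supports it.
df = spark.read.parquet("/data/events")  # placeholder dataset
df.groupBy("user_id").count().show()
```

The key design point is that acceleration is transparent: operations the plugin supports execute on the GPU, and anything unsupported falls back to the normal CPU path.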

Additionally, we're collaborating with VMware to enable businesses to run both modern AI workloads and existing applications on their infrastructure through NVIDIA NGC software. We're teaming up to bring pre-trained models, Helm charts and other AI software available on the NGC catalog into VMware vSphere, VMware Cloud Foundation and VMware Tanzu.

Turning data into insight through data science, and then putting that insight into production in an AI model is a journey with many steps. By intelligently accelerating these steps with our partners, we'll be able to help enterprises put their data to work to transform their industries.

VMblog:  Focusing on the role of supercomputers and Big Data, what is the significance of NVIDIA's RAPIDS open source APIs? 

Das:  RAPIDS GPU-accelerated software speeds up the data science pipeline, bringing accelerated computing to the most popular data science libraries and frameworks. This means that companies can not only gain faster insights from their data, but also boost their efficiency and speed time-to-solution.

With an estimated 70% or more of data science work spent on ETL (extract, transform, load), the faster data can be prepared, the faster customers can gain insights. This is why we developed the RAPIDS Accelerator for Apache Spark, which combines the power of the RAPIDS cuDF library and the scale of the Spark distributed computing framework.
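A core design choice behind cuDF is that it mirrors the pandas DataFrame API, so a typical ETL step looks the same whether it runs on CPU or GPU. The sketch below uses pandas with a small hypothetical sales table; on a machine with RAPIDS installed, swapping the import for `import cudf as pd` would run the same transformations on the GPU.

```python
import pandas as pd  # with RAPIDS installed, `import cudf as pd` runs this on GPU

# Hypothetical sales records, illustrating a typical ETL step:
# clean missing values, derive a column, then aggregate.
df = pd.DataFrame({
    "store": ["A", "A", "B", "B"],
    "units": [10, None, 5, 7],
    "price": [2.0, 2.0, 3.5, 3.5],
})

df["units"] = df["units"].fillna(0)            # clean: treat missing units as zero
df["revenue"] = df["units"] * df["price"]      # transform: derive revenue per row
summary = df.groupby("store", as_index=False)["revenue"].sum()  # aggregate per store
print(summary)
```

Because the API is shared, teams can prototype on laptops with pandas and move the same pipeline to GPUs at scale, which is where the ETL speedups come from.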

The openness of RAPIDS also speeds innovation at HPC scale. For example, the Oak Ridge National Laboratory is using BlazingSQL with RAPIDS to accelerate drug discovery research using NVIDIA GPUs on Summit, America's fastest supercomputer. ORNL researchers found that the GPU-accelerated open-source platform helped them process enormous datasets extremely quickly with scalable SQL queries.

It's also important to note the breakthrough efficiency that NVIDIA products are bringing to supercomputers. An NVIDIA DGX A100 SuperPOD won the top spot in the recent Green500 rankings, and in fact, four out of the five most efficient supercomputing systems on the Green500 list used NVIDIA technology. As data continues to grow, so will computing, which is why it is so essential for our supercomputers to be as efficient as possible.

VMblog:  Considering Big Data, AI and edge processing, can you give some real-world examples?

Das:  GPU-accelerated big data, AI and edge computing are helping businesses across every industry capture more market share and create new opportunities.

In marketing and professional services, Adobe was among the first to put big data to work with GPU-accelerated Spark 3.0 on Adobe Experience Cloud. The Adobe team gained more than 7x faster performance with NVIDIA-accelerated Spark 3.0 compared to running Spark on CPUs, while also saving more than 90% on computation costs.

Financial leader Capital One is integrating AI and machine learning into its customer-facing applications such as fraud monitoring and detection, call center operations and customer experience. From operations to services, AI is helping Capital One transform the business of finance.

At the edge, global supply chain solutions provider KION Group is using the NVIDIA EGX AI platform to develop AI applications for its intelligent warehouse systems, increasing the throughput and efficiency in its more than 6,000 retail distribution centers.

VMblog:  Looking to the future, are the many IoT endpoints an important part of the big data movement?

Das:  In his GTC Fall 2020 keynote, NVIDIA CEO Jensen Huang explained how the internet of human beings is rapidly becoming the Internet of Things. Trillions of devices, sensors and robots will be AI-enabled to offer a broad range of services. 

Soon, the kind of computing that used to be possible only in a data center will be available everywhere: in stores, on telephone poles, even in parking lots. As Jensen noted, the missing piece of IoT is AI. Together, edge computing and powerful AI software are essential to pioneering new breakthroughs.

In healthcare, AI-enabled IoT devices will help patients ask questions about their care. They will help factory workers stay safe and help us check out faster in stores. On a micro scale, the AI IoT will deliver our lunches, and at the macro level, it will boost efficiency in transportation with fleets of autonomous vehicles.

While these innovations might sound futuristic, companies are already pioneering these new products and services, working with NVIDIA software and platforms to bring their ideas to life. We expect AI IoT products and services to accelerate as projects come to market and inspire even more companies to leverage AI to better serve their customers.


Published Friday, December 11, 2020 7:45 AM by David Marshall