Industry executives and experts share their predictions for 2020. Read them in this 12th annual VMblog.com series exclusive.
By Bill Bryce, VP of Products, Univa
Five HPC predictions for 2020 and beyond
Back in the 1980s, the German chemical company BASF coined a slogan that resonates to this day - "We don't make a lot of the products you buy. We make a lot of the products you buy better." I see parallels between BASF and our business at Univa. To be clear, we don't build HPC applications - rather, we build tools that help HPC experts run applications more efficiently. In this capacity, we're fortunate to work with leading HPC users in disciplines from aerospace to life sciences to AI supercomputing, which gives us a unique vantage point on how HPC workloads are evolving. As we start the next decade of HPC innovation, I thought it would be interesting to make some predictions for 2020 and beyond by extrapolating existing trends.
Computing gets more diverse - A decade ago, GPU-optimized applications were few and far between. Today, fueled by better developer tools, GPU-optimized libraries, and dramatic performance gains, most leading HPC applications support GPUs. Without GPUs, AI models for computer vision, natural language processing, and language translation would be hard to imagine. Our customers at Univa now demand advanced GPU-aware scheduling features, reflecting how integral GPUs have become to modern workloads. While NVIDIA is the reigning data center GPU champ, the market is poised to get much more competitive. Intel re-entered the market in late 2019 with a slew of GPU-related announcements (oneAPI, Ponte Vecchio, Rambo Cache, and Gelato), and AMD did the same with new EPYC processors, Radeon GPUs, and GPU-related software. New technologies such as neuromorphic processors are already in production, promising to reduce power consumption for specific AI workloads by up to 10,000x by modeling neural networks directly in silicon. Look for a decade of increased diversity as GPUs, FPGAs, and new types of accelerators augment CPUs, delivering dramatic performance gains and picking up where Moore's Law left off.
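To make "GPU-aware scheduling" concrete, here is a minimal Python sketch - illustrative only, not Univa code - that queries nvidia-smi for per-GPU memory and utilization and picks the least-loaded device for the next job. A production scheduler tracks these resources cluster-wide and enforces policies, but the bookkeeping starts with exactly this kind of inventory.

```python
import subprocess

def gpu_inventory():
    """Query nvidia-smi for per-GPU index, name, total memory (MiB), and utilization (%)."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=index,name,memory.total,utilization.gpu",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True).stdout
    gpus = []
    for line in out.strip().splitlines():
        # Assumes GPU names contain no commas, which holds for NVIDIA data center parts.
        idx, name, mem, util = [field.strip() for field in line.split(",")]
        gpus.append({"index": int(idx), "name": name,
                     "memory_mib": int(mem), "utilization": int(util)})
    return gpus

def pick_gpu(gpus, min_memory_mib=8000):
    """Toy placement rule: least-utilized GPU with enough memory, or None."""
    eligible = [g for g in gpus if g["memory_mib"] >= min_memory_mib]
    return min(eligible, key=lambda g: g["utilization"]) if eligible else None

if __name__ == "__main__":
    print("Run the next job on:", pick_gpu(gpu_inventory()))
```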
Machine architectures matter again - Given the dominance of x86-64 in HPC, processor architecture had started to feel less relevant. Most HPC users run Linux distros such as Debian, Ubuntu, or CentOS, and the user experience is largely the same across all of them: the same apt-get, yum, or pip commands work identically on x86, Arm, and Power systems, auto-installing the same widely used components and frameworks. While containers have emerged as a preferred way of making HPC applications portable, they have also had the effect of making machine architectures important again, because the availability of containerized software varies widely by platform. On Docker Hub at present, there are almost 2.4 million Linux container images for x86. For Arm and Arm-64, there are ~40,000 images, and for IBM Power LE (little-endian), there are ~5,000 images. The emergence of GPU-optimized container registries such as NVIDIA's NGC points to where this is heading. As new CPU and GPU architectures proliferate, we can expect increased fragmentation in container ecosystems and registries as hardware, software, and cloud providers race to promote and deliver containers tailored to their own architectures and software ecosystems. HPC users will need management software that transparently supports multiple container formats and registries across multiple CPU and accelerator architectures.
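As a hedged illustration of what architecture-aware tooling has to do, the sketch below wraps the standard docker manifest inspect command (which prints an image's manifest list as JSON) to report which operating systems and CPU architectures an image is actually published for. The image name is just an example; any multi-architecture image on Docker Hub works.

```python
import json
import subprocess

def supported_platforms(image):
    """Return the (os, architecture) pairs published for a multi-arch image."""
    out = subprocess.run(["docker", "manifest", "inspect", image],
                         capture_output=True, text=True, check=True).stdout
    manifest = json.loads(out)
    # Single-architecture images return a plain manifest with no "manifests" list.
    entries = manifest.get("manifests", [])
    return sorted({(m["platform"]["os"], m["platform"]["architecture"])
                   for m in entries if "platform" in m})

if __name__ == "__main__":
    # The official ubuntu image is published for amd64, arm64, ppc64le, s390x, and others.
    for os_name, arch in supported_platforms("ubuntu:18.04"):
        print(f"{os_name}/{arch}")
```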
Clouds get cloudier - In HPC, cloud usually implies infrastructure-as-a-service (IaaS). While SaaS and PaaS offerings exist, the economics of running HPC in the cloud can be challenging, so HPC users often prefer to burst into IaaS cloud services to preserve flexibility and portability and to deploy virtualized or containerized software environments that match their on-premises environments. As new workloads emerge, however, this dynamic is changing. Analytic and data science applications are frequently accessed via interactive, collaborative notebooks, where analysts code to high-level libraries such as Keras or fast.ai that abstract away the underlying software frameworks and infrastructure. New domain-specific cloud offerings tailored to these workloads are emerging, including Paperspace Gradient, Onepanel, and FloydHub, where users can bring their own containers and data to avoid lock-in but pay only for the capacity they use. While on-premises and cloud HPC clusters are here to stay, we can expect cloud usage to become more fragmented as users leverage an increasingly diverse spectrum of specialized cloud services offering new types of consumption models.
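To show how little infrastructure detail surfaces in these notebook-centric workflows, here is a small Keras example with synthetic data standing in for a real dataset. The same few lines run unchanged on a laptop, an on-premises GPU node, or a hosted notebook service, which is precisely why the choice of underlying cloud becomes a packaging and pricing question rather than a coding one.

```python
import numpy as np
from tensorflow import keras

# Toy data standing in for a dataset pulled from local disk or cloud storage.
x = np.random.rand(1000, 20)
y = np.random.randint(0, 2, size=1000)

# A tiny binary classifier; nothing here references the hardware it runs on.
model = keras.Sequential([
    keras.layers.Dense(64, activation="relu", input_shape=(20,)),
    keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
model.fit(x, y, epochs=3, batch_size=32)
```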
HPC cloud spending emerges as a key concern -
While HPC users have been slow to embrace cloud historically, this is changing
fast. Recent research shows a
dramatic 60 percent increase in HPC cloud spending from just under $2.5 billion
in 2018 to approximately $4 billion in 2019. Managing
cloud spending is especially challenging in HPC, where users have an enormous
appetite for computing resources. Look for cloud spend-management and
budget-aware workload management tools to be increasingly important as the use
of HPC in the cloud accelerates.
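What budget-aware workload management might look like, reduced to a toy: the sketch below (all names, prices, and the admission rule are made up for illustration) bursts a job to the cloud only while its estimated cost still fits under a monthly cap, and queues the rest for on-premises capacity.

```python
from dataclasses import dataclass

@dataclass
class CloudJob:
    name: str
    node_hours: float          # estimated node-hours the job will consume
    rate_per_node_hour: float  # hypothetical on-demand price for the chosen instance type

def admit_jobs(jobs, spent_so_far, monthly_budget):
    """Toy budget-aware admission: cheapest jobs burst first until the cap is reached."""
    burst, queued = [], []
    projected = spent_so_far
    for job in sorted(jobs, key=lambda j: j.node_hours * j.rate_per_node_hour):
        cost = job.node_hours * job.rate_per_node_hour
        if projected + cost <= monthly_budget:
            projected += cost
            burst.append(job)
        else:
            queued.append(job)
    return burst, queued, projected

if __name__ == "__main__":
    jobs = [CloudJob("cfd-sweep", 500, 2.10), CloudJob("ml-training", 80, 3.40)]
    burst, queued, projected = admit_jobs(jobs, spent_so_far=9_000, monthly_budget=10_000)
    print("burst:", [j.name for j in burst], "queued:", [j.name for j in queued],
          "projected spend:", projected)
```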
HPC shifts to the edge - Today, most HPC workloads run in centralized data centers operated by governments, enterprises, and cloud providers. The internet of things (IoT), accelerated by 5G networks, will be a game-changer, promising network speeds as high as 20 Gbps, millisecond latency, and up to a million devices per square kilometer. It's easy to imagine autonomous vehicles, drones, and vast arrays of sensors generating far more telemetry than ever before. Backhauling all of this data to faraway clouds with tens of milliseconds of latency won't work - clouds will need to get much closer to the ground, hence the term fog computing. We already see the effects of data gravity in commercial HPC workloads such as algorithmic trading, streaming data analytics, and seismic analysis, where latency and data considerations require HPC processing close to the edge of the network. With faster networks, larger datasets, and a vast increase in the number of connected devices and sensors, this shift to the edge will only accelerate.
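A quick back-of-envelope calculation shows why backhaul latency forces compute toward the edge: even ignoring routing, queuing, and processing time, light in fiber covers only about 200 km per millisecond, so a distant cloud region can never meet a millisecond-class latency budget.

```python
# Propagation-only round-trip time over fiber, assuming light travels at
# roughly 200,000 km/s in glass (about 2/3 of c). Real latencies are higher
# once routing, queuing, and processing are added.
FIBER_KM_PER_MS = 200.0

def min_rtt_ms(distance_km):
    """Lower bound on round-trip time to a site `distance_km` away."""
    return 2 * distance_km / FIBER_KM_PER_MS

for site, km in [("metro edge site", 50), ("regional cloud", 500), ("distant cloud region", 2000)]:
    print(f"{site:22s} {min_rtt_ms(km):5.1f} ms round trip (propagation only)")
```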
About the Author
Bill Bryce brings 14 years of experience to his role as VP of Products at Univa, where he leads product management. Bill was instrumental in introducing agile development across Univa's distributed team, resulting in a doubling of engineering efficiency. He is also credited with the conception and development of key products, including Univa's cloud management products. He received his Bachelor of Mathematics (BMath) in Computer Science from the University of Waterloo.