Wallaroo.AI announced a strategic collaboration with Ampere Computing to create optimized hardware/software solutions that provide reduced energy consumption, greater efficiency, and lower cost per inference for cloud artificial intelligence (AI).
Ampere processors are inherently more energy efficient than traditional AI accelerators. Now, with an optimized low-code/no-code ML software solution and customized hardware, putting AI into production in the cloud has never been easier, more cost-effective on a cost-per-inference basis, or less energy-intensive.
"This
Wallaroo.AI/Ampere solution allows enterprises to deploy easily, improve
performance, increase energy efficiency, and balance their ML workloads across
available compute resources much more effectively," said Vid Jain, chief
executive officer of Wallaroo.AI, "all of which is critical to meeting the huge
demand for AI computing resources today also while addressing the
sustainability impact of the explosion in AI."
"Through
this collaboration, Ampere and Wallaroo.AI are combining Cloud Native hardware
and optimized software to make ML production within the cloud much easier and
more energy-efficient," said Jeff
Wittich, Chief Product Officer at Ampere. "That means more enterprises
will be able to turn AI initiatives into business value more quickly."
Breakthrough Cloud AI Performance
One of the key advantages of the collaboration is the integration of Ampere's built-in AI acceleration technology with Wallaroo.AI's highly efficient Inference Server, part of the Wallaroo Enterprise Edition platform for production ML.
Benchmarks have shown as much as a 6x improvement over containerized x86 solutions on certain models, such as the open source ResNet-50 model. Tests were run using an optimized version of the Wallaroo Enterprise Edition on Dpsv5-series Azure virtual machines, Arm64 instances built on Ampere® Altra 64-bit processors. However, the optimized solution will also be available for other cloud platforms.
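
For readers who want a rough, reproducible reference point for CPU inference of this kind, the sketch below measures ResNet-50 latency and throughput with ONNX Runtime on any CPU instance. It is not the Wallaroo inference server, and the model file name, batch shape, and run counts are illustrative assumptions only.

    # Minimal CPU inference throughput sketch (illustrative only; not the
    # Wallaroo inference server). Assumes a ResNet-50 ONNX file is available
    # locally, e.g. exported from torchvision or the ONNX model zoo.
    import time
    import numpy as np
    import onnxruntime as ort

    session = ort.InferenceSession("resnet50.onnx",
                                   providers=["CPUExecutionProvider"])
    input_name = session.get_inputs()[0].name

    # Standard ImageNet-shaped batch: 1 image, 3 channels, 224x224 pixels.
    batch = np.random.rand(1, 3, 224, 224).astype(np.float32)

    # Warm up so one-time initialization does not skew the measurement.
    for _ in range(10):
        session.run(None, {input_name: batch})

    runs = 100
    start = time.perf_counter()
    for _ in range(runs):
        session.run(None, {input_name: batch})
    elapsed = time.perf_counter() - start

    print(f"mean latency: {1000 * elapsed / runs:.2f} ms/inference")
    print(f"throughput:   {runs / elapsed:.1f} inferences/sec")

Running the same script on an x86 instance and an Ampere Altra-based Arm64 instance gives a like-for-like basis for the kind of comparison cited above, though absolute results will vary by instance size and runtime version.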
Benefits of Energy-Efficient AI
Reduced Hardware Needs/Costs - With a $15.7 trillion (U.S.) potential contribution to the global economy by 2030 (PwC), demand for AI has never been higher. However, the graphics processing units (GPUs) used to train AI models are in high demand, and the quantities required for AI, especially for large ML models like ChatGPT and other large language models (LLMs), mean they are often not a cost-effective solution for AI/ML. For many enterprises, a better alternative is to run software like the highly optimized Wallaroo.AI inference server, which can cost-efficiently run many AI/ML workloads with similar performance on currently available, advanced CPUs; a worked example of the cost-per-inference arithmetic follows below.
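
To make the cost-per-inference comparison concrete, here is a small worked example. The hourly rates and throughput figures are placeholder assumptions chosen for illustration, not measured or published numbers.

    # Hypothetical cost-per-inference comparison; all figures are
    # placeholders for illustration, not real cloud prices or benchmarks.
    def cost_per_inference(hourly_rate_usd, inferences_per_sec):
        # Dollars per single inference for an instance at steady, full load.
        return hourly_rate_usd / (inferences_per_sec * 3600)

    # Placeholder figures: a hypothetical CPU instance vs. a GPU instance.
    cpu = cost_per_inference(hourly_rate_usd=1.00, inferences_per_sec=400)
    gpu = cost_per_inference(hourly_rate_usd=4.00, inferences_per_sec=1200)
    print(f"CPU: ${cpu:.8f} per inference")  # ~$0.00000069
    print(f"GPU: ${gpu:.8f} per inference")  # ~$0.00000093

With these placeholder numbers, the lower-priced CPU instance comes out ahead per inference despite lower raw throughput, which is the kind of trade-off the announcement describes.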
Supporting Sustainability/ESG Goals - The MIT Technology Review reports that training a single AI model can use more energy than 100 U.S. homes consume in a year. This means the facility costs (power, cooling, etc.) of running GPUs can severely impact cloud providers as well as the power grid. Many clients of cloud providers also have environmental, social, and governance (ESG) or sustainability initiatives that would be negatively impacted by large-scale adoption of AI with GPUs. Using optimized inference solutions on CPUs like the Ampere Altra Family of processors allows organizations to realize greater efficiency for inference workloads, meeting their need for AI/ML performance while also addressing their ESG goals for greater sustainability.