Fungible Inc. and the San Diego Supercomputer Center (SDSC) announced
they have shattered the NVMe over TCP storage initiator performance
world record, achieving 10M IOPS.

Distributed AI, machine learning, and other data-centric workflows have
traditionally been constrained by the limitations of RDMA, iSCSI, and
Fibre Channel based storage protocols and products. The Fungible
solution uses the modern NVMe over TCP standard, together with
Fungible's Storage Initiator card and storage target, to unlock the
potential of the rest of the infrastructure. The record announced today
exceeds the prior record by more than 50%. The prior record was attained
using the Fungible Storage Cluster without the benefit of the Fungible
Storage Initiator cards. The cards delivered this increase in
performance while simultaneously freeing up a significant amount of
server resources to do other work.

The experiment was performed under the auspices of SDSC's Advanced
Technology Lab (ATL). The ATL's team of scientists and engineers surveys,
evaluates, and assembles the computing and storage technologies needed
for emerging scientific computing and data analysis systems.

"While impressive from a performance perspective, the results of this
testing
are more about expanding the scope of what AI, machine learning, data
analytics, and other data-centric environments can deliver," said Eric
Hayes, CEO of Fungible. "The Fungible Storage Initiator cards developed
on our standards-based Fungible DPU free
up tremendous amounts of server CPU resources to run application code,
and the application now has faster access to data than it ever has
before. Scale-out data centers, powered by Fungible, can now surpass
their performance goals economically, reliably and securely."

According to John Graham, UC San Diego senior development engineer
working at
SDSC and the Qualcomm Institute, "The Fungible solution has set a new
bar for storage performance in our environment. The results are
potentially transformational for large-scale scientific
cyberinfrastructure such as the Pacific Research Platform (PRP) and its
follow-on, the National Research Platform (NRP). With Fungible's
innovative DPU technology, we are able to deploy a high-performance
storage solution that achieves our planned density and cost
requirements," he said. "The PRP and NRP are unique, multi-institutional
distributed systems for conducting at-scale AI and data-intensive
computing for scientific research in a wide area environment."

"One of the challenges of doing distributed AI at scale is storage
performance, both raw bandwidth and IOPS," noted Frank Wuerthwein,
interim director of SDSC and principal investigator for the National
Research Platform. "Fungible's technology looks very promising in
delivering the storage performance we need to achieve our future goals
for a wide area, distributed AI and data science platform."

"We are proud that AMD EPYC processors and their high-performance
capabilities were able to help Fungible and SDSC showcase a new level of
storage performance," said Kumaran Siva, corporate vice president,
Server Software and Systems, AMD. "Achievements like this have profound
impacts on scale-out data centers around the world for scalability of
storage technologies."