ArcaStream and Excelero, leaders in scale-out software-defined storage, said
that their solutions were part of a new software-defined storage deployment at
Imperial College London, one of the world's leading university
research centres for science, engineering, medicine and business. By moving to
a single high-performance storage infrastructure that integrates legacy and new
resources, Imperial College London reduced storage complexity, increased
agility in managing capacity and performance, and gained significant ROI.
Backed by its new system, the university also improved usability, enacted a
simpler and more cost-effective charge-back policy, and embraced a future-proof
approach to staying ahead of the continual multi-petabyte-per-year growth of
its data holdings.

Expansion in Imperial College London's Research Computing Services (RCS) over
the past decade had resulted in over 30 separate and independently managed
islands of storage: silos that were difficult to access, manage and use. The
group needed to enable researchers to access data quickly and easily, so that
they could stay focused on their research projects, while integrating legacy
and future compute systems and managing data throughout its life cycle in line
with stringent regulatory compliance guidelines.

The new Research Data Store (RDS) is built around ArcaStream PixStor, a
high-performance, scalable storage platform based on the IBM Spectrum Scale
parallel file system, which combines flash, disk, tape, and cloud storage into
a single global namespace. The
infrastructure is geographically dispersed with a 5PB storage repository at a
primary site and a secondary site for disaster recovery, served by PixStor with
asynchronous replication and intelligent tiering to external storage targets.
RDS also includes Excelero's NVMesh, software that enables the sharing of NVMe
flash storage resources across any network and supports any local or
distributed file system. NVMesh provides a scalable NVMe tier for extreme
metadata performance. Users get the performance of local flash with the
convenience of centralised storage, while the overall total cost of ownership
for storage is reduced.

With this robust new infrastructure, Imperial College London's RDS now
simultaneously serves 2,000 existing HPC nodes and over 3,000 users, delivering
20GB/s of throughput with no loss of interactive user performance.

"The usability of the systems for interactive use in particular has improved
significantly," explained Matthew Harvey, RDS project lead and RCS Manager at
Imperial College. "Previously, there were frequent interruptions to interactive
use because the file system load for some compute jobs effectively squeezed out
users. Users would log into the system, type in their search criteria but it
could take more than 10 seconds to respond. Now that is a thing of the past."

The new RDS supports a charge-back strategy in which researchers cost out
storage as a service on their grants, rather than being charged based on
reserved capacity. More effective management of storage capacity also allowed
the College to avoid costly additions. The ArcaStream platform provides the
tools and insight needed to understand the access patterns of data on the file
system for each project
allocation. "This information governance is enabling us to store valuable data
more intelligently and economically," Harvey continued.