Virtualization Technology News and Information
BlueData Adds Deep Learning, GPU Acceleration, and Multi-Cloud Support for Big Data Workloads on Docker Containers

BlueData, provider of the leading Big-Data-as-a-Service (BDaaS) software platform, today announced the new fall release for BlueData EPIC and introduced initial availability on Google Cloud Platform (GCP) and Microsoft Azure. This release adds innovations and options for running Hadoop, Spark, and other Big Data workloads on Docker containers, delivering on requirements from its rapidly growing customer base, which includes many of the world's largest enterprises across multiple industries.

In recognition of its innovations for running Big Data workloads in containerized environments, BlueData just won the 2017 Datanami Editors' Choice Award for Best Big Data Product: Virtualization. This new fall release builds on the innovative functionality introduced in BlueData EPIC version 3.0, with support for deep learning use cases, GPU support, and flexible container placement policies. And by extending availability of BlueData EPIC from Amazon Web Services (AWS) to Azure and GCP, BlueData is the first and only BDaaS solution that can be deployed on-premises, in the public cloud, or in hybrid and multi-cloud architectures.

Deep Learning, GPU Support, and Flexible Container Placement

Earlier this year, BlueData added new capabilities to bring DevOps agility to distributed data science operations and machine learning use cases. The new fall release of the BlueData EPIC software platform provides the ability to jumpstart an even broader range of applications and use cases, including deep learning:

  • Streamlined operations for deep learning projects: With BlueData EPIC, data science teams can quickly get started in deep learning without the operational overhead of setting up, configuring, and managing these new environments. They can leverage pre-integrated Spark clusters, action scripts (e.g. to update all the nodes in a running environment with a single click), and web-based notebooks (e.g. JupyterHub, RStudio Server, Zeppelin) to automate the end-to-end lifecycle of data science operations. 

  • Support for GPU acceleration and TensorFlow: BlueData can now support clusters accelerated with Graphics Processing Units (GPUs), and provide the ability to run TensorFlow for deep learning on GPUs or on Intel architecture CPUs. By leveraging the advanced host tagging feature introduced in this release, administrators can specify placement of Docker containers running TensorFlow on infrastructure configured with GPUs or CPUs either in the public cloud or on-premises.

  • BigDL for distributed deep learning on Spark: The fall release now includes a pre-integrated application image for Intel's BigDL running on Docker containers. BigDL is a Spark-based framework for deep learning optimized for Intel CPU architecture. With BigDL, BlueData now offers a fast and economical path to deep learning by utilizing existing x86-based server infrastructure and the pre-integrated Spark clusters that BlueData EPIC provides out of the box.

This new release also brings new capabilities for container placement as well as additional enhancements in performance, monitoring, and security. Some of these features and benefits include:

  • Flexible container placement policies: Within BlueData EPIC, administrators can now define various roles for a given application image and control the placement of containers associated with a specific role to specific hosts. For example, containers with the Spark worker role can be placed on servers or instances containing a large amount of memory or a local SSD for fast storage access.

  • Improved utilization and performance for Big Data workloads: With new purpose-built features for flexible cluster role definition and host tagging, administrators can ensure that the right Big Data workload is assigned to the right underlying host. This in turn can help to optimize infrastructure utilization, allow for greater performance optimizations, provide better control over SLAs, and enable chargeback models for infrastructure consumption.

  • Support for Intel cache acceleration and SSD technology: BlueData now enables customers to leverage the power of Intel Cache Acceleration Software (CAS) and Intel Optane Solid State Drive (SSD) technology to further improve performance for Big Data jobs by maximizing the performance of local disk storage. With Intel CAS and high-performance SSDs, enterprises can reduce costs for latency-sensitive workloads and improve overall data center TCO.

  • Enhanced container-level monitoring: In the last release, BlueData introduced a new pluggable framework based on Elasticsearch, Metricbeat, and Kibana to provide fine-grained monitoring for CPU, memory, and other key metrics. Now, using this same framework, BlueData EPIC includes detailed monitoring for container-level disk I/O and network throughput.
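
The role-based placement and host tagging described above can be pictured with a short sketch. This is an illustration only, not BlueData EPIC's API or configuration format: the `Host` and `RolePolicy` classes, the tag names (`gpu`, `storage`, `memory`), and the `eligible_hosts` function are all hypothetical. It assumes that each host carries key/value tags and that each application role lists the tags a host must have before a container for that role may be placed on it.

```python
from dataclasses import dataclass, field


@dataclass
class Host:
    """A server or cloud instance with administrator-assigned tags (hypothetical model)."""
    name: str
    tags: dict = field(default_factory=dict)


@dataclass
class RolePolicy:
    """Placement constraint for one role of an application image (hypothetical model)."""
    role: str
    required_tags: dict = field(default_factory=dict)


def eligible_hosts(policy, hosts):
    """Return the names of hosts whose tags satisfy every required tag of the policy."""
    return [
        host.name
        for host in hosts
        if all(host.tags.get(key) == value
               for key, value in policy.required_tags.items())
    ]


# Example inventory; host names and tag values are made up for illustration.
HOSTS = [
    Host("gpu-01", {"gpu": "true", "storage": "ssd"}),
    Host("mem-01", {"memory": "high", "storage": "ssd"}),
    Host("std-01", {}),
]

# A TensorFlow role pinned to GPU hosts, and a Spark worker role pinned to SSD hosts.
TENSORFLOW_POLICY = RolePolicy("tensorflow-worker", {"gpu": "true"})
SPARK_WORKER_POLICY = RolePolicy("spark-worker", {"storage": "ssd"})

if __name__ == "__main__":
    print(eligible_hosts(TENSORFLOW_POLICY, HOSTS))
    print(eligible_hosts(SPARK_WORKER_POLICY, HOSTS))
```

In this sketch a role with no required tags can be placed on any host, which mirrors the idea that tagging only narrows placement for roles that need specific hardware.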

Additional details and other new enhancements are highlighted in an accompanying blog post. The fall release of BlueData EPIC will be generally available in October 2017.

Published Tuesday, September 26, 2017 10:06 AM by David Marshall