Amazon Web Services, Inc. (AWS) announced the general
availability of three new serverless analytics offerings that make it even
easier for customers to analyze vast amounts of data without having to
configure, scale, or manage the underlying infrastructure. Today's
announcements include new serverless offerings for three services: Amazon EMR,
so customers can run analytics applications using open-source big data
frameworks (Apache Spark and Hive) without managing the underlying
infrastructure; Amazon Managed Streaming for Apache Kafka (Amazon MSK), to
simplify real-time data ingestion and streaming; and Amazon Redshift, so
customers can run high-performance data warehousing and analytics workloads on
petabytes of data without managing clusters. Along with other serverless analytics
offerings from AWS such as Amazon QuickSight for business intelligence and AWS
Glue for data integration, the new offerings announced today make it
significantly easier and more cost-effective for customers to modernize their
infrastructure and analyze vast amounts of data without worrying about capacity
planning or incurring excess costs by overprovisioning for peak demand. There are
no upfront commitments or additional costs to use Amazon EMR Serverless, Amazon
MSK Serverless, and Amazon Redshift Serverless, and customers pay only for the
capacity their analytics workloads actually use.
"By offering the most serverless options for data analytics
in the cloud (including options for data warehousing, big data processing,
real-time data analysis, data integration, interactive dashboards and
visualizations, and more), we are making it even easier for customers to maximize
the value of their data to drive innovation, improve customer experiences, and
make better decisions faster," said Swami Sivasubramanian, vice president of
Database, Analytics, and Machine Learning at AWS. "With these new serverless
options, customers can run even the most variable and intermittent analytics
workloads and expand the use of analytics throughout their organizations
without worrying about provisioning or scaling capacity, or incurring excess
cost."
AWS customers choose from a wide variety of purpose-built
analytics services to derive maximum value from their organizations' data,
including Amazon EMR for processing vast amounts of unstructured data (using
open-source big data frameworks like Apache Spark and Hive), Amazon MSK for
ingesting real-time data streams, and Amazon Redshift for data warehousing.
While many customers appreciate the fine-grained control these services offer,
a subset of customers with highly variable or intermittent workloads would
prefer to have AWS manage the underlying infrastructure by automatically adding
or subtracting resources based on application demand. To remove the complexity
of scaling and managing infrastructure, AWS introduced the concept of
serverless, event-driven computing in 2014. Many customers have since adopted
serverless technologies on AWS (including Amazon Kinesis Data Streams for
real-time data streaming, AWS Glue for data integration, and Amazon QuickSight
for interactive dashboards and visualizations) to take advantage of benefits
like automatic provisioning, on-demand scaling, and pay-for-use pricing. With
the new serverless offerings for Amazon EMR, Amazon MSK, and Amazon Redshift,
AWS offers the broadest set of serverless analytics capabilities in the cloud,
making it even easier for customers to lower costs, expand analytics to more
users, and maximize their data's value.
- Serverless big data analytics with Amazon EMR Serverless: Tens of thousands of customers use Amazon EMR to run
open-source frameworks like Apache Spark and Hive for large-scale distributed
data processing jobs, interactive SQL queries, and machine learning
applications. Amazon EMR supports the most big data frameworks in the cloud,
enabling customers to run big data applications and petabyte-scale data
analytics faster, and at less than half the cost of on-premises solutions. With
Amazon EMR Serverless, customers can simply specify the framework they want to
run, and Amazon EMR Serverless automatically provisions, manages, and scales
the necessary compute and memory resources as workload demands change.
Customers can get started with Amazon EMR Serverless by simply selecting an
open-source framework and submitting their jobs using the Amazon EMR
application programming interface (API), the AWS Command Line Interface (AWS
CLI), or an integrated development environment (IDE) with Amazon EMR Studio.
Amazon EMR Serverless is generally available today to customers running Amazon
EMR in US East (N. Virginia), US West (Oregon), Asia Pacific (Tokyo), and
Europe (Ireland), with availability in additional AWS Regions coming soon. To
get started with Amazon EMR Serverless, visit aws.amazon.com/emr/serverless.
- Serverless data streaming with Amazon MSK Serverless: Today's organizations are increasingly adopting Apache
Kafka to capture and analyze real-time data streams from Internet of Things
(IoT) devices, website clickstreams, database logs, and many other sources
where dynamic data is continuously generated. Amazon MSK Serverless
provisions, manages, and scales clusters automatically, so customers no
longer have to worry about capacity planning or unpredictable streaming
workloads. To take advantage of Amazon MSK Serverless,
customers simply create a cluster in the Amazon MSK console, set up a private and
secure Apache Kafka endpoint, and use new or existing Apache Kafka clients to
stream data. Amazon MSK Serverless is generally available today to customers
running Amazon MSK in US East (Ohio), US East (N. Virginia), US West (Oregon),
Asia Pacific (Singapore), Asia Pacific (Sydney), Asia Pacific (Tokyo), Europe
(Frankfurt), Europe (Ireland), and Europe (Stockholm), with availability in
additional AWS Regions coming soon. To get started with Amazon MSK Serverless,
visit aws.amazon.com/msk/features/msk-serverless.
- Serverless data warehouse with Amazon Redshift Serverless: Tens of thousands of customers are collectively processing
more than two exabytes of data with Amazon Redshift every day. Amazon Redshift
offers up to 3x better price performance than other enterprise cloud data
warehouses, providing customers with faster data analytics at lower cost.
Amazon Redshift Serverless now makes it even easier to get insights from data
quickly without the need to manage data warehouse infrastructure. Customers
currently managing their own Amazon Redshift clusters can choose to move them
to the new serverless option using the Amazon Redshift console or API without
making changes to their applications. Amazon Redshift Serverless is generally
available today to customers running Amazon Redshift in US East (Ohio), US East
(N. Virginia), US West (Oregon), Asia Pacific (Singapore), Asia Pacific
(Sydney), Asia Pacific (Seoul), Asia Pacific (Tokyo), Europe (Frankfurt),
Europe (Ireland), Europe (London), and Europe (Stockholm), with availability in
additional AWS Regions coming soon. To get started with Amazon Redshift
Serverless, visit aws.amazon.com/redshift/redshift-serverless.
Amobee provides advertising
solutions that help customers unify audiences to optimize results across all
TV, connected TV, and digital media to drive customers' growth. "While we like
the flexibility that Amazon EMR provides to scale resources up or down
automatically based on workload requirements, some of our infrequent but heavy
jobs were disrupting existing clusters, necessitating us to create and manage
additional clusters for these jobs," said David Ortiz, senior manager of
Engineering at Amobee. "Amazon EMR Serverless allowed us to right-size the CPU
and memory resources that the jobs required without the overhead of any
additional processes, helping us streamline our workflows and cut costs by
providing just the right amount of capacity to meet workload demands precisely
when we need it."
Powered by Apache Kylin, Kyligence Cloud accelerates
organizations' business intelligence and analytics on big data. "To help
customers make critical business decisions from an extensive volume of data,
our platform loads and processes a significant amount of data using Spark jobs.
Doing this at scale became costly and required operational overhead," said Luke
Han, co-founder and CEO at Kyligence. "We adopted Amazon EMR Serverless to help
us eliminate the costs and administrative tasks of maintaining and tuning
clusters. Amazon EMR Serverless has helped us reduce that complexity by taking
over the time-consuming tasks of managing, tuning, and optimizing clusters for
performance as workload demand changes. And because it is less expensive than
our previous solution, we can pass cost savings on to our customers."
Glas Data provides simplified data management for the
agricultural sector. "We ingest and process data in real time using Amazon MSK
to inform automated data analytics and alerts for our customers. Our workloads
can be highly variable and unpredictable, with some actions generating only a
few messages that require a small amount of capacity, and others creating a much
larger number of messages that require significantly more capacity," said
Robert Sanders, CTO and founder at Glas Data. "This workload variability makes
it difficult to predict which action will be taken at what time, causing us to
monitor and adjust capacity constantly to avoid unexpected capacity
constraints. Amazon MSK Serverless automatically scales capacity up and down
based on workload requirements, removing the system administration overhead and
freeing us up to develop our solution without worrying about memory and storage
constraints or incurring excess costs."
NextGen Healthcare is a leading provider of innovative
healthcare technology solutions on a mission to improve the lives of those who
practice medicine and their patients. "Our NextGen Population Health solution
provides actionable insights directly to care teams via the aggregation and
transformation of multi-source data. Optimizing our systems to reduce manual
interventions like setting up and managing data warehouse infrastructure is critical
to our success," said Owen Zacharias, vice president of Application Delivery at
NextGen Healthcare. "With Amazon Redshift Serverless, we're no longer managing
complex warehouse orchestration systems. Amazon Redshift Serverless has
improved workload performance, and its auto-scaling capabilities allow us to
use the speed of Amazon Redshift for even our most dynamic workloads, while
only paying for what we use. We're excited to migrate additional workloads to
Amazon Redshift Serverless. It's a game changer."
Informatica provides an end-to-end cloud data management
platform that connects, manages, unifies, and governs data, empowering
enterprises to modernize and advance their data strategies. "Organizations
today are looking to expand data and analytics, but face challenges with data
silos, cost constraints, and infrastructure management," said Rik Tamm-Daniels,
GVP of Ecosystems at Informatica. "Amazon Redshift Serverless helps address
these challenges by automatically provisioning and scaling resources to meet
demand, making it easy to run analytics without the need to set up and manage
data warehouse infrastructure or the worry of incurring excess costs by
overprovisioning for peak demand. Together with our Intelligent Data Management
Cloud on AWS, Amazon Redshift Serverless helps us provide Informatica customers
with a serverless data and analytics foundation to power their most
business-critical initiatives."
The Rail Delivery Group (RDG) brings together the companies
that run Britain's railway into a single team to deliver a better railway
experience. "Amazon Redshift Serverless delivers high performance for our
teams, and because it automatically provisions and manages the underlying data
warehouse, more of our business users can quickly and easily get insights from
data," said Toby Ayre, head of Data and Analytics at Rail Delivery Group.
"Amazon Redshift Serverless automatically scales data warehouse capacity to
handle even our most demanding and unpredictable workloads, helping us lower
our costs and expand the use of analytics across our organization."
Huron is a global professional services firm that
collaborates with clients to create sound strategies, optimize operations,
accelerate digital transformation, and empower businesses and their people to
own their future. "We're thrilled to include Amazon Redshift Serverless as an
exciting addition to our data analytics workflow. This offering seamlessly
replaces several parts of our previous infrastructure, and its simplicity makes
it very easy to use," said Harry Gollakota, data engineer at Huron. "Amazon
Redshift Serverless drastically helps reduce data engineering latency and acts
as a force multiplier in accelerating development. Implementing Amazon Redshift
Serverless helped us cut through our data engineering backlog and now allows us
to spend more of our time gathering insights from the data."