Today, at AWS re:Invent, Amazon Web Services, Inc. (AWS) announced three new serverless options for its
suite of analytics services that make it easier to analyze data at any
scale without having to configure, scale, or manage the underlying
infrastructure. A new serverless option for Amazon Redshift
automatically sets up and scales resources in seconds, giving customers
the ability to run high-performance analytics workloads on petabytes of
data without having to manage data warehouse clusters. A new serverless
option for Amazon Managed Streaming for Apache Kafka (Amazon MSK)
quickly scales resources to vastly simplify real-time data ingestion and
streaming. Amazon EMR now provides a serverless option for customers to
run analytics applications using open-source big data frameworks like
Apache Spark, Hive, and Presto without having to provision, manage, and
scale the underlying infrastructure.
"Some customers want fine-grained control over every aspect of their
workloads, but other customers have asked AWS to take the guesswork out
of managing their analytics infrastructure so they can move faster and
expand the use of analytics in their organizations. Today, we are
helping customers reduce the complexity of managing their analytics
infrastructure by offering serverless versions of three popular
analytics services," said Rahul Pathak, Vice President of Analytics at
AWS. "This makes it significantly easier and more cost-effective for
customers to modernize their infrastructure and unify vast amounts of
data from a variety of endpoints. Now, customers can run analytics
workloads at any scale and quickly deliver insights to the people and
applications that need them, without having to even think about managing
infrastructure."
AWS customers use a wide variety of purpose-built analytics services to
make data-driven decisions, including Amazon Redshift for data
warehousing, Amazon MSK for processing real-time data streams, and
Amazon EMR for running Apache Spark, Hive, Presto, and other open-source
big data frameworks. These services offer powerful analytics
capabilities for a variety of use cases, but there is a subset of
customers who want to benefit from AWS analytics services but don't want
to put in the time needed to learn how to manage the underlying
clusters or servers. To remove the complexity of scaling and managing
infrastructure, AWS introduced the concept of serverless, event-driven
computing in 2014, and many customers have adopted serverless
technologies on AWS because they remove the need to configure, scale, or
manage servers or provision compute instances and storage to meet peak
capacity for their applications. The new serverless options announced
today extend these capabilities to AWS analytics engines to
automatically add or subtract resources to provide just the right amount
of capacity to meet the demands of data analytics at any scale, so
customers do not need to worry about constantly right-sizing clusters or
overprovisioning for peak capacity, saving them time and helping them
optimize costs. With today's announcements, customers can now enjoy the
automatic provisioning, on-demand scaling, and pay-as-you-go pricing of
serverless to lower costs, expand analytics to more users, and quickly
and easily get started with AWS analytics services, including:
- Serverless data warehouse with Amazon Redshift Serverless: Today,
tens of thousands of customers are collectively processing more than
two exabytes of data with Amazon Redshift every day. Amazon Redshift
offers up to 3x better price performance and up to 10x better query
performance than other enterprise cloud data warehouses, providing
customers with faster data analytics at lower cost. The new serverless
option for Amazon Redshift now makes it even easier to get insights from
data quickly without the need to set up, manage, or scale clusters.
Customers currently managing their own Amazon Redshift clusters can
easily move them to the new serverless option using the Amazon Redshift
console or the application programming interface (API) without making
changes to their applications. To learn more about the new serverless
option for Amazon Redshift, visit aws.amazon.com/redshift/redshift-serverless.
- Serverless data streaming with Amazon MSK Serverless: Today's
organizations are increasingly adopting Apache Kafka to capture and
analyze real-time data streams from IoT devices, website clickstreams,
database logs, and many other sources where dynamic data is continuously
generated. Amazon MSK Serverless now builds, manages, and scales
clusters automatically, so customers no longer have to worry about
capacity planning or unpredictable workloads. To get started with Amazon
MSK Serverless, customers simply create a cluster in the Amazon MSK
console, set up a private and secure Apache Kafka endpoint, and use new
or existing Apache Kafka clients to stream data. To learn more about
Amazon MSK Serverless, visit aws.amazon.com/msk/features/msk-serverless.
- Serverless big data analytics with Amazon EMR Serverless: Tens
of thousands of customers use Amazon EMR to run open-source frameworks
like Apache Spark, Hive, and Presto for large-scale distributed data
processing jobs, interactive SQL queries, and machine learning
applications. With Amazon EMR Serverless, customers simply specify the
framework they want to run, and Amazon EMR Serverless provisions,
manages, and scales the compute and memory resources up and down as
workload demands change. Customers can get started with Amazon EMR
Serverless by simply selecting an open-source framework and submitting
their job using Amazon EMR APIs, the AWS Command Line Interface (AWS
CLI), or the AWS Management Console. To learn more about Amazon EMR
Serverless, visit aws.amazon.com/emr/serverless.
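As an illustration of the Amazon EMR Serverless workflow described above (select a framework, then submit a job through the API), the following is a minimal Python sketch and not an official AWS sample. It assumes the AWS SDK for Python (boto3) with its "emr-serverless" client, and an application that has already been created for Apache Spark. The application ID, IAM role ARN, and Amazon S3 paths are hypothetical placeholders, as are the helper function names.

```python
from typing import Any, Dict, List, Optional


def build_spark_job_request(
    application_id: str,
    execution_role_arn: str,
    entry_point: str,
    arguments: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Build the keyword arguments for an EMR Serverless StartJobRun call."""
    return {
        "applicationId": application_id,
        "executionRoleArn": execution_role_arn,
        "jobDriver": {
            "sparkSubmit": {
                "entryPoint": entry_point,
                "entryPointArguments": list(arguments or []),
            }
        },
    }


def submit_spark_job(request: Dict[str, Any]) -> str:
    """Submit the job and return its run ID (needs boto3 and AWS credentials)."""
    import boto3  # imported lazily so the builder above works without the SDK

    client = boto3.client("emr-serverless")
    response = client.start_job_run(**request)
    return response["jobRunId"]


# Placeholder values (assumptions): replace with a real application ID,
# IAM execution role, and script location before submitting.
request = build_spark_job_request(
    application_id="00example123",
    execution_role_arn="arn:aws:iam::123456789012:role/ExampleJobRole",
    entry_point="s3://example-bucket/scripts/wordcount.py",
    arguments=["s3://example-bucket/output/"],
)
```

Calling submit_spark_job(request) would send the job to the service. Note that the request names no cluster size or instance type, which matches the description above of Amazon EMR Serverless provisioning and scaling compute and memory on the customer's behalf.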
Roche is one of the largest pharmaceutical companies in the world and the
leading provider of cancer treatments globally. "Amazon Redshift
Serverless helps us complete our data management without having to
manage clusters and optimizes our cost by provisioning just the right
amount of capacity to meet demand," said Dr. Yannick Misteli, Lead Cloud
Platform and ML Engineer at Roche. "Amazon Redshift Serverless is
reducing the operational burden, lowering costs, and enabling scale for
the Roche Go-to-Market domain. This simplification is a game changer,
helping us rapidly onboard and support a variety of analytics-heavy use
cases without friction."
Riot Games is a video game developer and publisher, renowned for creating
one of the world's most-played PC games: League of Legends. "We ingest
about 20 terabytes of data per day using Amazon MSK on AWS, and reducing
the time to query this data after it is produced is critical for us.
With Amazon MSK, we now have a mechanism for streaming data into our
ecosystem while eliminating the heavy lifting of running Apache Kafka on
our own," said Wesley Kerr, Sr. Principal Data Scientist at Riot Games.
"Amazon MSK Serverless will further streamline our operations, as it
allows us to keep up with changes in demand without having to take
scaling actions. As a result, our developers can worry less about
scaling Apache Kafka and focus more on offering the best gaming
experiences around the world."
Intuit is the global technology platform that helps consumers and small
businesses overcome their most important financial challenges, serving
more than 100 million customers worldwide with TurboTax, QuickBooks,
Mint, Credit Karma, and Mailchimp. "At Intuit we use Apache Kafka as a
central event bus that sits between thousands of decoupled microservices
that power our products," said Ritesh Bansal, Director of Engineering
at Intuit. "We recently migrated our self-managed Apache Kafka clusters
to Amazon MSK because it allows us to redirect engineering talent
towards innovations closer to our end customers. We're excited about
Amazon MSK Serverless, which will make managing our scale and capacity
much easier."
The Orchard, a Sony Music Entertainment subsidiary, collects, processes,
and distributes music from labels and artists to Spotify, Amazon Music,
and other streaming providers and physical retailers. "Amazon MSK has
helped us accelerate the pace at which we are launching production-ready
applications that process streaming data for The Orchard Suite," said
Farouk Umar, Engineering Manager at The Orchard. "Amazon MSK Serverless
enables teams that are not familiar with Apache Kafka scaling to benefit
from Amazon MSK, allowing us to fully decentralize Apache Kafka in our
organization and provide a better developer experience. As a result, we
are able to scale adoption of Apache Kafka faster, which helps us
accelerate adoption of our event-driven strategy."
Classmethod, Inc. is a leading cloud integrator with expertise in big data, mobile,
and artificial intelligence. "Our data integration platform service,
called Customer Story Analytics (CSA), integrates Amazon Redshift,
Amazon S3, Amazon Aurora, and other services to avoid data silos and
provide powerful, unified governance between data services," said Satoru
Ishikawa, Solution Architect, Data Integration Division at Classmethod.
"Amazon Redshift Serverless automates the sizing of compute and storage
and quickly scales to meet demand. This elastic serverless experience
mitigates manual operational costs, expands data access among
departments, and accelerates autonomy on data analytics and machine
learning, allowing us to scale the CSA business in new and exciting
ways."
Sedric is an AI risk and compliance excellence platform designed for the new
generation of fintech. "Ease of use and self-service data access are key
for our analytics initiatives. With Amazon Redshift Serverless, we don't
have to think about managing the data warehouse," said Tomer Levi, Vice
President of R&D at Sedric. "Data from Amazon S3 gets loaded 7x
faster for us than our previous solution, helping us get actionable
insights from millions of customer events. We are thrilled with the
performance improvements and cost optimizations we are seeing with
Amazon Redshift Serverless."
ZS Associates is a global professional services firm that helps companies
develop and deliver products for their customers. "We leverage AWS
heavily for our data analytics strategy and have had tremendous success
over the years. Our SaaS products depend on Amazon EMR versions to
upgrade Spark reliably and remove the undifferentiated heavy lifting,"
said Anirudh Vohra, Associate Director of Cloud Architecture at ZS.
"However, some of our workloads don't need the level of customization
offered by Amazon EMR on EC2, and we want to simply run certain Apache
Spark applications without worrying about managing and scaling servers
or clusters. We are excited about the launch of Amazon EMR Serverless
and look forward to porting our workloads with ad-hoc analytics needs
onto Amazon EMR Serverless."