Anyscale launched Anyscale Endpoints, a new service enabling developers to integrate fast, cost-efficient, and scalable large language models (LLMs) into their applications using popular LLM APIs.
Unveiled at Ray Summit 2023, the leading conference on LLMs and generative AI for developers, Endpoints is less than half the cost of comparable proprietary solutions for general workloads and up to 10X less expensive for specific tasks.
Previously, developers had to assemble machine learning pipelines, train their own models from scratch, and then secure, deploy, and scale them, resulting in high costs and slow time-to-market. Anyscale Endpoints lets developers use familiar API calls to seamlessly add "LLM superpowers" to their operational applications without the painstaking process of building a custom AI platform.
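As a sketch of what a "familiar API call" might look like, the snippet below assumes Endpoints exposes an OpenAI-style chat-completions HTTP API; the base URL, model identifier, and environment-variable name are illustrative assumptions, not details confirmed by this announcement.

```python
import json
import os
import urllib.request

# Illustrative values only -- actual endpoint and model names may differ.
API_BASE = "https://api.endpoints.anyscale.com/v1"
MODEL = "meta-llama/Llama-2-70b-chat-hf"

def build_chat_request(prompt: str, model: str = MODEL) -> dict:
    """Build an OpenAI-style chat-completions payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 0.7,
    }

def ask(prompt: str) -> str:
    """Send the request; assumes an API key in ANYSCALE_API_KEY."""
    req = urllib.request.Request(
        f"{API_BASE}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode(),
        headers={
            "Authorization": f"Bearer {os.environ['ANYSCALE_API_KEY']}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    # OpenAI-style responses put the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

Because the request shape mirrors the OpenAI API, existing application code can often be pointed at a service like this by changing only the base URL and model name.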
"Obstacles like infrastructure complexity, compute resources and cost have historically limited AI application developers when it comes to open-source LLMs," said Robert Nishihara, Co-founder and CEO of Anyscale. "With seamless access via a simple API to powerful GPUs at a market-leading price, Endpoints lets developers take advantage of open-source LLMs without the complexity of traditional ML infrastructure. As AI innovation continues to accelerate, Endpoints enables developers to harvest the latest developments of the open-source community and stay focused on what matters -- building the next generation of AI applications."
The Power of Open Source for LLMs
Demand for generative AI and high-quality LLM applications is growing rapidly. According to a new report from Bloomberg Intelligence, the generative AI market is poised to grow from $40 billion in 2022 to $1.3 trillion over the next decade.
Leading research firms like Gartner have commented on this dynamic.
Unmatched Price-Performance
As a testament to the unmatched scale and efficiency of the Anyscale Platform, Endpoints is offered at $1 per million tokens for state-of-the-art open-source LLMs like Llama-2 70B, and costs even less for other models. This dramatically expands access to LLM services for application developers. Anyscale is also typically able to add new models in hours, not weeks, so Anyscale Endpoints users have rapid access to the continuous innovation of the open-source community.
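To make the quoted $1-per-million-token rate concrete, here is a minimal back-of-envelope cost calculation; the monthly token volume is a made-up example, and real bills would depend on the actual models and rates in use.

```python
# Rate quoted in this announcement for Llama-2 70B on Endpoints.
PRICE_PER_MILLION_TOKENS = 1.00  # USD

def cost_usd(tokens: int, price_per_million: float = PRICE_PER_MILLION_TOKENS) -> float:
    """Token-based cost: (tokens / 1M) * price per million tokens."""
    return tokens / 1_000_000 * price_per_million

# Hypothetical app processing 50 million tokens per month at the Llama-2 70B rate:
monthly_bill = cost_usd(50_000_000)  # -> 50.0 (USD)
```

At this rate, even tens of millions of tokens per month cost tens of dollars, which is the scale of savings the release contrasts against proprietary alternatives.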
A Path to an AI Application Platform
LLMs provide significant value to companies because they can be tailored to specific use cases and fine-tuned with additional content and context to serve end users' needs. Fine-tuning helps users get the best combination of price and performance for their use case.
In addition to fine-tuning, Anyscale provides the ability to run the Endpoints service within the customer's existing cloud account on AWS (Amazon Web Services) or GCP (Google Cloud Platform). This not only improves security for activities like fine-tuning, but also lets customers reuse existing security controls and policies and use computing resources in their own cloud to process their proprietary data.
Anyscale Endpoints customers also have the option to upgrade to the full Anyscale AI Application Platform, which gives them the ability to fully customize an LLM, fine-grained control over their data, models, and end-to-end application architecture, and the ability to deploy multiple AI applications on the same infrastructure.
The new service seamlessly integrates with many popular Python and machine learning libraries and frameworks, including Weights & Biases, Arize, and Hugging Face, enabling developers to address many types of use cases across any cloud as their AI applications evolve.
Driving User Success
"Realchar.ai is about delivering immersive, realistic experiences for our users, not fighting infrastructure or upgrading open source models," said Shaun Wei, CEO and Co-founder at Realchar.ai, an Endpoints beta user. "Endpoints made it possible for us to introduce new services in hours, instead of weeks, and for a fraction of the cost of proprietary services. It also enables us to seamlessly personalize user experiences at scale."
"We use Anyscale Endpoints to power consumer-facing services that reach millions of Google Chrome and Microsoft Edge users," said Siddartha Saxena, Co-founder and CTO at Merlin. "Anyscale Endpoints gives us 5x-8x cost advantages over alternatives, making it easy for us to make Merlin even more powerful while staying affordable for millions of users."
Anyscale Endpoints is available today and will continue to evolve rapidly, powered by open-source innovation at both the AI infrastructure and model layers. To try Anyscale Endpoints or learn more, visit: https://endpoints.anyscale.com.