AI21 announced the integration of AI21 Maestro with NVIDIA's
NIM microservices, enabling
enterprises to build and self-host trustworthy AI solutions on AI21 Maestro
using any NIM-supported model.
AI21
Maestro integrates seamlessly with NVIDIA NIM microservices, allowing
businesses to connect multiple NIM-supported language models and let Maestro
dynamically select the best one when planning and executing each task. This
flexibility enables enterprises to build AI solutions that meet operational and
compliance requirements while optimizing for efficiency and performance, without
being locked into a single model.
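As a rough illustration of what per-task model selection across multiple NIM endpoints could look like (this is a hypothetical sketch, not AI21's actual implementation; the endpoint names, capability tags, and cost-based policy below are all invented for the example):

```python
from dataclasses import dataclass, field


@dataclass
class NimModel:
    # Hypothetical registry entry for a NIM-served model endpoint.
    name: str
    endpoint: str
    capabilities: set = field(default_factory=set)
    cost_per_1k_tokens: float = 0.0


def select_model(models, required_capabilities):
    """Pick the cheapest model whose capabilities cover the task's needs."""
    candidates = [m for m in models if required_capabilities <= m.capabilities]
    if not candidates:
        raise ValueError("no NIM-supported model covers this task")
    return min(candidates, key=lambda m: m.cost_per_1k_tokens)


# Hypothetical catalog of NIM endpoints; names and costs are placeholders.
catalog = [
    NimModel("llama-3.1-8b", "http://nim-a:8000/v1", {"chat", "summarize"}, 0.2),
    NimModel("llama-3.1-70b", "http://nim-b:8000/v1",
             {"chat", "summarize", "reasoning"}, 1.5),
]

choice = select_model(catalog, {"reasoning"})
print(choice.name)
```

In a real deployment the selection policy would weigh operational and compliance constraints alongside cost, but the routing idea is the same: each task is matched to whichever connected model best satisfies its requirements.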
Enterprises
see tremendous potential in generative AI, but many proofs of concept struggle to
reach production reliably. AWS' 2024 CDO agenda found that only 6% of generative AI
projects officially achieved this milestone, highlighting unmet needs with
today's approaches. Most enterprises either rely on LLMs to execute open-ended
instructions without control or guarantees of accuracy (a practice sometimes
described as "Prompt and Pray"), which can yield unpredictable and
unreliable results, or they use hard-coded chains, which offer
predictability but are rigid, labor-intensive, and brittle under change. Chain-of-thought
(CoT) reasoning models can also falter at complex, domain-specific tasks, often
lacking the consistency and precision enterprises require.
A
more robust and controlled approach is emerging to address these challenges. By
combining the adaptability of large language models with built-in safeguards,
domain-specific logic, and transparent processes, enterprises can finally
deploy AI solutions that are accurate, maintainable, and scalable. AI21's Maestro,
a Large Planning System (LPS), shifts AI from probabilistic models
to reliable, system-level intelligence. It integrates large language models
(LLMs) or large reasoning models (LRMs) into a structured framework that
systematically analyzes possible actions, plans tailored solutions,
intelligently scales compute, and validates results. By learning the enterprise
environment through offline simulations, the LPS optimizes decisions, executes
budget-aware plans, and ensures accuracy, efficiency, and
cost-effectiveness, turning AI into a trustworthy, enterprise-grade system.
Deployable on-premises or in a customer's VPC, Maestro works seamlessly with any LLM
or reasoning model, optimizing compute usage to balance cost and
performance.
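The plan, execute, and validate cycle with a compute budget can be sketched in miniature (again purely illustrative; the function names, the one-unit-per-attempt cost model, and the retry policy are assumptions for the example, not Maestro's actual internals):

```python
def run_plan(steps, validate, budget_units):
    """Execute plan steps in order, validating each result.

    `steps` is a list of zero-argument callables, `validate` checks a step's
    output, and each attempt is assumed to cost one budget unit. A step whose
    output fails validation is retried until the budget runs out.
    """
    results = []
    spent = 0
    for step in steps:
        while True:
            if spent >= budget_units:
                raise RuntimeError("compute budget exhausted before plan completed")
            spent += 1
            result = step()
            if validate(result):  # only validated outputs advance the plan
                results.append(result)
                break
    return results, spent


# Toy example: each "step" returns a number; validation requires non-negative.
plan = [lambda: 3, lambda: 7]
outputs, used = run_plan(plan, validate=lambda r: r >= 0, budget_units=5)
print(outputs, used)  # [3, 7] 2
```

The point of the sketch is the shape of the loop: execution is gated by an explicit budget and every intermediate result is checked before the plan proceeds, rather than trusting a single open-ended generation.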
"Enterprises
need AI solutions they can trust to execute complex tasks, not just generate
responses," said Ori Goshen, Co-CEO of AI21. "Maestro enables businesses to
deploy AI solutions that align with their strategic goals, delivering accuracy,
efficiency, and control while integrating seamlessly with their existing AI
infrastructure."
"As AI moves rapidly into running in production, enterprises are
optimizing their applications for high-performance inference to meet the
demands of AI agents and reasoning," said Amanda Saunders, director, Generative
AI Software at NVIDIA. "AI21's integration with NVIDIA NIM microservices
provides a flexible option for deploying NVIDIA-optimized AI models."