Virtualization Technology News and Information
AI21 to Integrate Maestro with NVIDIA NIM Microservices, Powering Self-Hosted Reliable AI Solutions

AI21 announced the integration of AI21 Maestro with NVIDIA's NIM microservices, enabling enterprises to build and self-host trustworthy AI solutions on AI21 Maestro using any NIM-supported model.

AI21 Maestro integrates seamlessly with NVIDIA NIM microservices, allowing businesses to connect multiple NIM-supported language models and let Maestro dynamically select the best one when planning and executing each task. This flexibility enables enterprises to build AI solutions that meet operational and compliance requirements while optimizing for efficiency and performance, without being locked into a single model.
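To make the multi-model setup concrete, here is a minimal sketch of a registry of NIM endpoints with a naive task-based selector. NIM microservices expose OpenAI-compatible endpoints, but the endpoint URLs, tags, and the selection heuristic below are illustrative assumptions; Maestro's actual planner is proprietary and considerably more sophisticated.

```python
# Hypothetical sketch: registering several NIM deployments and picking one
# per task. All names, URLs, and the heuristic are illustrative assumptions.
from dataclasses import dataclass, field


@dataclass
class NimEndpoint:
    model: str                      # NIM-supported model name
    base_url: str                   # OpenAI-compatible NIM endpoint URL
    strengths: set = field(default_factory=set)  # task tags this deployment suits


REGISTRY = [
    NimEndpoint("meta/llama-3.1-70b-instruct", "http://nim-a:8000/v1",
                {"reasoning", "summarization"}),
    NimEndpoint("mistralai/mixtral-8x7b-instruct-v0.1", "http://nim-b:8000/v1",
                {"extraction", "classification"}),
]


def select_endpoint(task_tag: str) -> NimEndpoint:
    """Return the first endpoint whose declared strengths cover the task;
    fall back to the first registered model otherwise."""
    for ep in REGISTRY:
        if task_tag in ep.strengths:
            return ep
    return REGISTRY[0]
```

A planner would then send the task's prompt to `select_endpoint(tag).base_url` using any OpenAI-compatible client; swapping models is just a change to the registry, which is the "no lock-in" property described above.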

Enterprises see tremendous potential in generative AI, but many proofs of concept struggle to reach production reliably. AWS's 2024 CDO agenda found that only 6% of generative AI projects officially achieved this milestone, highlighting unmet needs with today's approaches. Most enterprises either rely on LLMs to execute open-ended instructions without control or guarantees of accuracy (a practice sometimes described as "prompt and pray"), which can yield unpredictable and unreliable results, or they use hard-coded chains, which offer predictability but are rigid, labor-intensive, and brittle under change. Chain-of-thought (CoT) reasoning models can falter at complex, domain-specific tasks, often lacking the consistency and precision enterprises require.

A more robust and controlled approach is emerging to address these challenges. By combining the adaptability of large language models with built-in safeguards, domain-specific logic, and transparent processes, enterprises can finally deploy AI solutions that are accurate, maintainable, and scalable. AI21's Maestro, a Large Planning System (LPS), shifts AI from probabilistic models to reliable, system-level intelligence. It integrates large language models (LLMs) or large reasoning models (LRMs) into a structured framework that systematically analyzes possible actions, plans tailored solutions, intelligently scales compute, and validates results. By learning the enterprise environment through offline simulations, the LPS optimizes decisions, executes budget-aware plans, and ensures accuracy, efficiency, and cost-effectiveness, turning AI into a trustworthy, enterprise-grade system. Deployable on-premises or in a customer's VPC, Maestro works seamlessly with any LLM or reasoning model, optimizing compute usage to balance cost and performance.
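The plan, execute, and validate cycle described above is a general pattern that can be sketched independently of any vendor. The loop below is a generic, simplified illustration under an explicit attempt budget; the function names and re-planning-on-feedback behavior are assumptions for illustration, not AI21's implementation.

```python
# Generic plan -> execute -> validate loop with a bounded retry budget.
# This illustrates the pattern, not any specific product's internals.

def run_with_validation(task, plan_fn, execute_fn, validate_fn, max_attempts=3):
    """Plan a solution, execute it, and validate the result; on failure,
    re-plan using the validator's feedback until the budget is exhausted."""
    feedback = None
    for attempt in range(max_attempts):
        plan = plan_fn(task, feedback)            # tailor a plan (using prior feedback, if any)
        result = execute_fn(plan)                 # carry out the plan's steps
        ok, feedback = validate_fn(task, result)  # check the result against requirements
        if ok:
            return result
    raise RuntimeError(f"no validated result within {max_attempts} attempts")
```

The budget parameter is what makes the loop "budget-aware" in the simplest sense: compute spent on re-planning is capped rather than open-ended, and a failure surfaces explicitly instead of returning an unvalidated answer.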

"Enterprises need AI solutions they can trust to execute complex tasks, not just generate responses," said Ori Goshen, Co-CEO of AI21. "Maestro enables businesses to deploy AI solutions that align with their strategic goals-delivering accuracy, efficiency, and control while integrating seamlessly with their existing AI infrastructure."

"As AI moves rapidly into running in production, enterprises are optimizing their applications for high-performance inference to meet the demands of AI agents and reasoning," said Amanda Saunders, director, Generative AI Software at NVIDIA. "AI21's integration with NVIDIA NIM microservices provides a flexible option for deploying NVIDIA-optimized AI models."

Published Thursday, March 20, 2025 9:53 AM by David Marshall