Virtualization Technology News and Information
PromptQL Partners with UC Berkeley to Develop New Data Agent Benchmark for Reliability of Enterprise AI Agents

PromptQL announced a strategic research collaboration with the University of California, Berkeley to develop the first comprehensive data agent benchmark for enterprise reliability, designed specifically to evaluate general-purpose AI data agents in enterprise environments.

A recent McKinsey study found that 78% of organizations use AI in at least one business function; however, more than 80% say their organization hasn't seen a tangible impact on enterprise-level Earnings Before Interest and Taxes (EBIT). The partnership, led by Aditya Parameswaran, Professor and Co-Director of UC Berkeley's EPIC Data Lab, along with his students, addresses this fundamental challenge that organizations face when deploying AI systems in business-critical environments.

While existing agentic data benchmarks like GAIA, Spider, and FRAMES test specific AI tasks, they overlook the complexity, reliability demands, and messy, siloed data that define real business environments. The forthcoming data agent benchmark aims to offer a solution by creating a framework that reflects real-world complexities.

"Our customer conversations reveal a clear pattern: they're ready to move from proof-of-concepts to production AI, yet they lack the evaluation tools to make confident deployment decisions," said Tanmai Gopal, CEO of PromptQL. "The data agent benchmark changes that by using representative datasets from our work in telecom, healthcare, finance, retail, and anti-money laundering to reflect the real complexity of enterprise AI."

UC Berkeley's EPIC Data Lab brings expertise to this collaboration. Professor Parameswaran is a leading authority on the use of AI for next-generation usable data analysis tools and has received numerous prestigious awards. His research group has created widely adopted data tools with tens of millions of downloads.

"Current benchmarks suffer from what I call the '1% problem': they're built for tech giants and ignore the 99% of organizations grappling with real-world data complexity," Parameswaran said. "The data agent benchmark marks a shift toward evaluating AI based on the reliability, transparency, and practical value enterprises actually need. This collaboration bridges academic rigor with the production insights PromptQL brings from real deployments."

The data agent benchmark beta will be revealed later this year. Organizations interested in early access or contributing use-cases or datasets can reach out to the research team at epic-support@eecs.berkeley.edu.

Published Wednesday, June 04, 2025 4:19 PM by David Marshall