Today in Singapore, MLCommons and AI Verify signed a memorandum of
intent to collaborate on developing a common set of safety testing
benchmarks for generative AI models, with the aim of improving AI
safety globally.
A mature safety ecosystem includes collaboration across AI testing
companies, national safety institutes, auditors, and researchers. The
aim of the AI Safety benchmark effort that this agreement advances is to
provide AI developers, integrators, purchasers, and policy makers with a
globally accepted baseline approach to safety testing for generative
AI.
"There is significant interest in the generative AI community globally
to develop a common approach towards generative AI safety evaluations,"
said Peter Mattson, MLCommons President and AI Safety working group
co-chair. "The MLCommons AI Verify collaboration is a step-forward
towards creating a global and inclusive standard for AI safety testing,
with benchmarks designed to address safety risks across diverse
contexts, languages, cultures, and value systems."
The MLCommons AI Safety working group, a global group of academic
researchers, industry technical experts, policy and standards
representatives, and civil society advocates, recently announced a v0.5
AI Safety benchmark proof of concept (POC). AI Verify will develop
interoperable AI testing tools that will inform an inclusive v1.0
release, expected this fall. In addition, they are building a toolkit
for interactive testing to support benchmarking and red-teaming.
"Making first moves towards globally accepted AI safety benchmarks and
testing standards, AI Verify Foundation is excited to partner with
MLCommons to help our partners build trust in their models and
applications across the diversity of cultural contexts and languages in
which they were developed. We invite more partners to join this effort
to promote responsible use of AI in Singapore and the world," said Dr
Ong Chen Hui, Chair of the Governing Committee at AI Verify Foundation.
The AI Safety working group encourages global participation to help
shape the v1.0 AI Safety benchmark suite and beyond. To contribute,
please join the MLCommons AI Safety working group.