Cognite announced the launch of the
Cognite Atlas AI LLM & SLM Benchmark Report for Industrial Agents. The
first-of-its-kind report addresses the shortcomings of general
benchmark datasets by tailoring large language model (LLM) and small
language model (SLM) evaluations to focus on specialized industrial
tasks, ensuring the reliability, accuracy, and effectiveness of
industrial AI solutions.

"General LLM and SLM benchmark reports fail to capture the complexities
of industrial environments and don't align with the specialized needs of
industrial operations, where precision, safety, and domain expertise
are critical," said Knut Vidvei, Head of Product Management at Cognite,
speaking from the stage at IMPACT 2024. "With the Cognite Atlas AI LLM &
SLM Benchmark Report for Industrial Agents, we've tailored an evaluation
framework to real-world industrial tasks, ensuring AI agents are reliable
and effective, driving the advancement of industrial AI."

Cognite Atlas AI is an industrial agent workbench that extends Cognite
Data Fusion, the leading industrial Data and AI platform. With unmatched
data management and comprehensive AI capabilities, Cognite earned Frost &
Sullivan's Global Company of the Year Award in the digital industrial
platforms market and market powerhouse status in the Frost Radar: Digital
Industrial Platforms, solidifying the company as an authority on Data
and AI for industry.

The Cognite Atlas AI LLM & SLM Benchmark Report for Industrial Agents will
initially focus on natural language search as a key data retrieval tool
for industrial AI agents. The test set covers a wide range of data
models designed for sectors such as Oil & Gas and Manufacturing, with
real-life question-answer pairs to evaluate performance across different
scenarios. Answers are assessed using multiple evaluation metrics. These
benchmark datasets enable systematic evaluation of the system's
performance in answering complex questions, such as tracking open
safety-critical work orders in a facility.
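
As a rough illustration of how question-answer pairs can be scored, the Python sketch below runs a search function over a test set and averages two metrics. It is not the report's methodology: the announcement does not name its metrics, so exact match and token-level F1 are assumptions, and QAPair, evaluate, and answer_fn are hypothetical names introduced only for this example.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Callable


@dataclass
class QAPair:
    """One benchmark item: a natural language question and its reference answer."""
    question: str
    expected: str


def normalize(text: str) -> list[str]:
    # Lowercase, treat commas as separators, and split on whitespace.
    return text.lower().replace(",", " ").split()


def exact_match(predicted: str, expected: str) -> float:
    # 1.0 when the normalized token sequences are identical, else 0.0.
    return float(normalize(predicted) == normalize(expected))


def token_f1(predicted: str, expected: str) -> float:
    # Harmonic mean of token precision and recall (assumed metric).
    pred, gold = normalize(predicted), normalize(expected)
    if not pred or not gold:
        return float(pred == gold)
    overlap = sum((Counter(pred) & Counter(gold)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)


def evaluate(pairs: list[QAPair], answer_fn: Callable[[str], str]) -> dict[str, float]:
    # answer_fn stands in for the system under test (hypothetical interface).
    if not pairs:
        raise ValueError("the test set must contain at least one pair")
    em_total = 0.0
    f1_total = 0.0
    for pair in pairs:
        predicted = answer_fn(pair.question)
        em_total += exact_match(predicted, pair.expected)
        f1_total += token_f1(predicted, pair.expected)
    n = len(pairs)
    return {"exact_match": em_total / n, "token_f1": f1_total / n}
```

Calling evaluate with a list of QAPair items and any function that maps a question string to an answer string returns the averaged scores. Reporting a strict metric next to a partial-credit one helps separate fully correct answers from answers that retrieve only part of the expected result.
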

Future versions of the report will evaluate additional AI tools, such as
those for summarizing, analyzing, and reasoning with industrial data,
to assess the full performance of industrial AI agents.

The first Cognite Atlas AI LLM & SLM Benchmark Report for Industrial
Agents will be available to download for free on October 28, 2024. The
report will then be published regularly to help digital transformation
leaders use generative AI to carry out more complex operations with
greater accuracy.