Catchpoint, in partnership with Blameless,
has released its annual SRE industry report. This report contains special
contributions from Adrian Cockcroft and Steve McGhee and highlights findings
from 559 responses across reliability practitioners, managers, architects, and
executives. Now in its fifth year, this report has become the trusted source of
trends and insights for reliability-as-a-feature practices and continues to
enable organizations to make decisions based on valuable industry data.
The full 2023 SRE Report is available for download here (no
registration required).
"SRE teams are in continuous pursuit of delivering
increasingly reliable services," said Mehdi Daoudi, CEO of Catchpoint. "In a
post-pandemic world, there is an increased requirement for reliable digital
services across the Internet, what it means to accomplish our everyday tasks,
and the importance of Internet resilience. For this, we extend a hearty,
sincere thank you to reliability-as-a-feature practitioners worldwide who
continue to keep our everyday services running."
Key findings include:
- Organizations that operate with a "just culture" are 500%
more likely to be Elite-performing organizations
- Over half (54%) of organizations have three or
more different types of telemetry (e.g., infrastructure, application, or
network) feeding their observability frameworks. Yet almost half of the
organizations (46%) also said they receive little or no value from
artificial intelligence for IT operations (AIOps)
- Elite-performing organizations are 260% more likely
than Low-performing organizations to focus substantially on Customer
Experience reliability
- 59% of organizations say that maintaining
innovation velocity occasionally or often impacts employee productivity or
morale; a further 14% are unsure
- 59% of organizations say tool sprawl is a
nonexistent or minor problem. This challenges other research, which
simply equates tool sprawl with "how many tools are in the stack"
- Half of individual practitioners said they build up to 32% of their
tools in-house, while half of executives put the figure at up to 20%,
revealing a difference in perspective
- Median toil levels dropped by 5%, confirming a continued downward trend
"On the 20th anniversary of the creation of
the SRE function, with the ongoing evolution of DevOps practices, technologies,
and business requirements, it is important to understand how the practice is
evolving," said Gerardo Dada, CMO at Catchpoint. "The report aims to
continuously explore the challenges SREs face every day, best practices, and
areas to improve. We believe this is important, especially in a world where
site reliability is essential for practically every business."
"This report reveals a rich collection of insights that
tell us SRE practices lead to high performing teams and increased levels of
reliability," said Jim Gochee, CEO at Blameless. "It's wonderful to see
toil levels continue to go down and also interesting that the value of a
blameless culture and post-mortems for continuous learning are characteristics
of Elite performing orgs."