Software intelligence company Dynatrace announced the release
of a new, in-depth State of SRE report, based on an independent survey
of 450 site reliability engineers (SREs). The report highlights that
SREs are taking on a more strategic role, as organizations have a
growing need to ensure teams have the answers and intelligent automation
needed to accelerate digital transformation. The growth of new
technologies used in cloud-native development, however, has created an
explosion of complexity that is hindering these efforts. The "State of
SRE Report: 2022 Edition" is available for download here.
The research reveals:
- 88% of SREs say there is now more understanding of the strategic importance of their role than there was three years ago.
- SREs currently dedicate the largest amount of their time to reducing MTTR (mean time to recovery) (67%), building and maintaining automation code (60%), and ensuring security vulnerabilities are detected and eliminated quickly (58%).
- 68% of SREs expect their role in security to become more central in the future, as organizations continue using third-party libraries, such as Log4j, for cloud-native application development.
- 99% of SREs encounter challenges when defining and creating SLOs to evaluate service levels for applications and infrastructure. The most common challenges include:
  - Too many data sources (64%)
  - Difficulty finding the most relevant metrics for a service (54%)
  - The inability of monitoring tools to easily define and track SLO performance (36%)
- 68% of SREs say siloed teams and multiple tools make it difficult to align on a single version of 'the truth' about service levels.
"Reliability, experience, and security have become critical success
factors in a world where every second of downtime leads to lost revenue,
declining share prices, and lasting reputational damage," said Bernd
Greifeneder, Founder and Chief Technology Officer at Dynatrace. "This
makes SRE central to driving faster digital transformation. Most
organizations, however, remain relatively immature in their adoption of
SRE practices. At a time when demand far outstrips the supply of skilled
engineers, organizations should be doing everything in their power to
amplify the efforts of these teams. Despite this, manual steps and
unnecessary effort are a major distraction for SREs, which holds
organizations back. SREs must define a 'golden path,' a set of steps
development teams can take to navigate the complexity of cloud-native
delivery, to overcome these barriers and fully unleash digital
innovation."
Additional findings from the report include:
- 85% of organizations say their ability to scale SRE practices will be dependent on automation and AI capabilities.
- 71% of organizations are increasing the use of automation across every part of the lifecycle to reduce toil for developers and SREs.
- Organizations are primarily using automation in SRE to resolve security vulnerabilities (61%) and application failures (57%), increase the speed of delivery (56%), and predict SLO violations before they occur (55%).
- SREs say AIOps will enable teams to automate more processes critical to ensuring service levels are continually met (64%), prioritize problems that have the biggest impact on user satisfaction (63%), and prioritize security vulnerabilities to minimize downtime (62%).
- By 2025, 85% of SREs want to standardize on the same observability platform from Dev to Ops and security.
"SREs need a single, unified platform that enables reliability, security,
and automation by default," continued Greifeneder. "Self-service
observability and monitoring-as-code capabilities are key, allowing
development teams to build feedback loops into their applications in
just a few clicks. Through this, SREs will lead the charge in going
beyond basic automation to smart orchestration of customer experience
and business outcomes. That will empower organizations to drive digital
transformation faster than ever, through self-healing cloud applications
that quickly scale with business needs. As a result, SREs can be free
to focus on the things that are core to their role, enabling them to
create greater value by driving best practices for reliability,
resiliency, security, and performance, to ultimately deliver better
business outcomes."