To create a developer self-service
platform, static scripting languages are inadequate, and a systems design
approach to DevOps must be adopted.
By Venkat Thiruvengadam, Founder and CEO of DuploCloud
The
journey of infrastructure operations has been a continuous process of
evolution, and while progress has been made, true transformation will only be
realized when developers are given control. Although IT requirements have
appeared to be the driving force behind advancements in infrastructure
technologies, it is actually the need for enhanced developer productivity that
has fueled this shift. As developers hold a more direct influence on business
outcomes, IT must prioritize their needs. In the face of concerns about
security, building a walled garden around infrastructure will ultimately harm
productivity and lead to developer attrition.
Enterprise Computing: IT Operations with Scripting and CLI
In the
late '90s and early 2000s, companies left mainframes for on-premises
"productivity." Microsoft and Sun made great strides in enabling developers
with .NET and Java, respectively. Applications were getting fancier.
On
the infrastructure side, VMware disrupted the industry with virtualization,
Cisco with networking and EMC with storage technologies. Microsoft's monopoly
with Windows was strengthened by Active Directory. Over the years these
infrastructure technologies became so complex that it became a niche skill to
manage them.
The
purpose of this increasingly complex infrastructure was to ship software
faster, which, in turn, meant developers needed to be more productive. Ideally,
developers should have been given direct infrastructure access through an
abstraction layer that automated the low-level provisioning details. Instead,
IT tightened control.
Cloud Computing: DevOps with Infrastructure-as-Code
The
public cloud removed the need for customers to manage physical infrastructure
and delivered Infrastructure-as-a-Service (IaaS) far faster, with significantly
less complex implementations. While AWS started with IaaS, Microsoft started
with Platform-as-a-Service (PaaS), which proved to be a bit ahead of its time.
Subsequently, Microsoft course-corrected to focus on IaaS as well.
Nevertheless,
the vision was clear: siloed disciplines of server, network and storage admins
were eliminated. Infrastructure provisioning through code became the de facto
model and was significantly faster than its predecessor.
While
the core disruption was realized by the fundamental architecture of the cloud
itself, the improvement in the infrastructure operations tools has been less
disruptive. Terraform, CloudFormation, Chef and Puppet started to make
significant improvements in the automation space, but then came microservices,
which sharply increased the number of moving pieces.
Cloud
orchestrators, on the other hand, focused on the hybrid cloud by
normalizing all clouds down to IaaS, leaving hundreds of native cloud services
like DynamoDB, SQS, SNS, Kinesis and others out of scope. At best, they acted
as a facade over static templates that offered little flexibility or
self-service, because they constantly relied on administrators to update them.
Fundamentally, none of these infrastructure scripting tools are meant for
developers to consume.
Spinning
up new environments still takes days or weeks. Even at the most efficient
companies, the OpEx-to-CapEx ratio is still about 1:1. For example, a company
that spends a million dollars on AWS would require six to 10 DevOps engineers
to operate it. IT, including its newly renamed platform engineering function,
remains a big cost center.
What Needs to Change
The
need for a fundamental change in the approach to Infrastructure-as-Code, and in
platform engineering as a whole, is indisputable. More importantly, that change
will not come from companies whose core audience has been operators.
It
will come from cloud vendors or developers who have experienced the pain
first-hand and understand that you cannot build a developer self-service
platform with security guardrails from Terraform or other static scripting
languages. We need to shift to a systems design approach to DevOps.
Most
successful platforms, like Kubernetes, observability and security solutions,
and even the public clouds themselves, are built with a systems design
architecture. They all have an opinionated interface, often called the policy
model, and a state machine that translates higher-level user specifications
into lower-layer implementation details.
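The policy-model-plus-state-machine idea described above can be illustrated with a minimal sketch. Everything here is hypothetical and chosen for illustration: ServiceSpec, plan and apply_steps are not the API of Kubernetes or of any real platform, and the "observed state" is just an in-memory dictionary of replica counts.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceSpec:
    """Hypothetical high-level specification a user declares (the policy model)."""
    name: str
    replicas: int


def plan(desired: dict[str, ServiceSpec], actual: dict[str, int]) -> list[tuple[str, int]]:
    """Compare desired specs with observed replica counts.

    Returns (service name, replica delta) steps; a positive delta means
    "start this many", a negative delta means "stop this many".
    """
    steps = []
    for name in sorted(set(desired) | set(actual)):
        want = desired[name].replicas if name in desired else 0
        have = actual.get(name, 0)
        if want != have:
            steps.append((name, want - have))
    return steps


def apply_steps(actual: dict[str, int], steps: list[tuple[str, int]]) -> dict[str, int]:
    """Simulate carrying out the steps and return the new observed state."""
    result = dict(actual)  # leave the caller's view of the world untouched
    for name, delta in steps:
        count = result.get(name, 0) + delta
        if count:
            result[name] = count
        else:
            result.pop(name, None)  # nothing left running for this service
    return result
```

The user only declares what should exist; the loop works out the low-level steps and, once they are applied, a second call to plan returns no further work. A real system would run this comparison continuously against live infrastructure rather than a dictionary.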
There
is an "as-a-Service" theme to all of them: Infrastructure-as-a-Service,
container orchestration service and so on. They offer a reliable and consistent
way to manage and process many complex use cases without human intervention,
reducing the potential for errors and improving efficiency.
In
fact, DevOps automation needs to be looked at as a systems design problem. By
simply extrapolating Infrastructure-as-a-Service, one could argue that
DevOps-as-a-Service can be built on similar principles above the IaaS layer.
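As a rough sketch of what a layer above IaaS might look like, the example below expands a short developer-level request into lower-level resource definitions with security defaults already applied. The names AppRequest and expand, the resource types and the address ranges are all assumptions made for illustration; they do not describe any vendor's actual product or API.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class AppRequest:
    """Hypothetical developer-facing request: only what the developer cares about."""
    name: str
    port: int
    public: bool = False


def expand(request: AppRequest) -> list[dict]:
    """Translate a developer-level request into lower-level resource definitions.

    Guardrails are decided here, by the platform, not by the developer:
    private services only accept internal traffic, and disks are always encrypted.
    """
    if not 1 <= request.port <= 65535:
        raise ValueError(f"invalid port: {request.port}")
    # Illustrative ranges: anywhere for public apps, a private range otherwise.
    source = "0.0.0.0/0" if request.public else "10.0.0.0/8"
    return [
        {"type": "network_rule", "name": f"{request.name}-ingress",
         "port": request.port, "source": source},
        {"type": "load_balancer", "name": f"{request.name}-lb",
         "internal": not request.public},
        {"type": "compute", "name": f"{request.name}-vm",
         "encrypted_disk": True},
    ]
```

For instance, expand(AppRequest("orders", 8080)) yields three resource definitions from a one-line request, and the developer never writes the network rule or the encryption setting by hand.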
Ultimately,
the evolution of infrastructure operations has been driven by the need for
enhanced developer productivity rather than IT requirements. The shift towards
Infrastructure-as-Code (IaC) has enabled infrastructure provisioning through code
and significantly increased the speed of operations. However, even though the
siloed disciplines of server, network and storage admins were eliminated,
developers have still not been given direct infrastructure access through an
abstraction layer that automates the low-level provisioning details. The
solution lies in a systems design approach to DevOps, which offers a reliable
and consistent way to manage and process many complex use cases without human
intervention, reducing the potential for errors and improving efficiency.
Platforms such as Kubernetes, observability and security solutions, and the
public clouds themselves owe their success to a systems design architecture
and an "as-a-Service" theme. DevOps automation needs to be treated the same
way.
ABOUT THE AUTHOR
Venkat Thiruvengadam is CEO and
founder of DuploCloud. Venkat was an early engineer at Microsoft Azure, the
first developer and founding member in Azure's networking team. He wrote
significant parts of the Azure compute and network controller stack where he
saw Azure grow from a hundred-odd servers to millions of nodes in just a few
years. After leaving Microsoft, he realized that such hyperscale automation
techniques had not made their way outside of companies like AWS, Microsoft and
Google, which led him to form DuploCloud with a goal of bringing the hyperscale
automation techniques to Main Street IT.