The Zeitgeist of Cloud Native Microservices
Thin and Thick Definitions of Cloud Native Microservices
Some
philosophers believe that there are underlying forces that drive the cumulative
reason of humanity. [1] When these drivers are examined, the
explanation of how we got to where we are becomes less surprising and maybe a
bit undeniable. Indeed, our mental
ability to separate things and examine them might very well be our greatest
ability. [2] Can we use these techniques to examine the
underlying forces that brought us to our current views of the microservice
pattern? [3] What are the undeniable drivers which affect
us currently and have brought us to this present time -- the zeitgeist of
microservices?
Introduction
Why
should we care about the various drivers of microservices? What you don't know can hurt you, in terms of
efficiency and opportunity costs. We
have written elsewhere that cloud native concepts can be watered down so loosely as to mean very little [4]. With microservices, the opposite seems to be
true. The definition becomes so strict,
so technical, that the implementation load becomes too much to bear for small to mid-sized development shops.
Borrowing from philosophy, we can say that there are thin (descriptive
and more particular) and thick (evaluative and more encompassing of the
zeitgeist/spirit of the age) [5]
views of both cloud native and microservices.
Cloud Native Definition
The
thin definition of cloud native, interestingly enough, does not include
microservices as a requirement! [6] This definition puts orchestration, automation [7], self-healing, and cloud-aware applications at the forefront [8].
The
thick definition of cloud native [9]
includes microservices along with immutable infrastructure, declarative APIs,
etc.
Microservice Definition
The
thin definition of microservices concentrates on processes: specifically, one process per container [10]. This definition prioritizes the size of the
container (it should be small), the speed of the container (it should start up
fast), and the fact that it should be easily orchestrated (e.g. it should
expose readiness and liveness health checks).
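To make the thin definition concrete, here is a minimal sketch in Go (the endpoint names and port are illustrative conventions, not requirements) of the kind of single, small, fast-starting process an orchestrator such as Kubernetes expects to probe:

    package main

    import (
        "log"
        "net/http"
    )

    func main() {
        // Liveness: the process is up and able to respond at all.
        http.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
            w.WriteHeader(http.StatusOK)
        })
        // Readiness: the process is ready to receive traffic
        // (e.g. its dependencies are reachable).
        http.HandleFunc("/readyz", func(w http.ResponseWriter, r *http.Request) {
            w.WriteHeader(http.StatusOK)
        })
        log.Fatal(http.ListenAndServe(":8080", nil))
    }

An orchestrator can then be configured to probe these endpoints, restarting the container when liveness fails and withholding traffic when readiness fails.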
The
three forces that drive the thick definition of microservices are people,
agility, and coarse-grained deployments.
The agile story of people developing software has progressed from top-down analysis to small, iterative development.
Instead of fighting against how our organizations affect the software we
write, we now make software fit the contours of the organizations that we are a
part of. The way we share our software,
whether within sibling organizations or in the greater world, benefits from stronger boundaries that eliminate dependency clashes.
Microservice Rate of Change
The
facilitation of your software's rate of change may be the most foundational of
all architectural principles. That is to
say that the separation of concepts and artifacts based on how often they
change may very well be the principle that binds all other principles of
architecture. [11] Rate of change is probably the most essential
driver in the push for microservices. It
can be found in how people are organized (Conway's law), in how modern software is developed (the Agile movement), and in how artifacts are deployed (Continuous Deployment).
An
interesting scenario that highlights the importance of rate of change in a
production application is the application of a security patch. With a microservice, the change management
and deployment strategies are limited to one service, and not the whole
application. A versioned security patch is pushed out through a pipeline, and it does not require a lock-step deployment of the rest of the application. The cycle time and mean time to recovery are dramatically reduced because of this. [12] When we look at a monolith, the security
patch must be pushed out with a deployment of the whole application. Keeping in mind that security patches are high priority and must be deployed immediately, even a six-month to two-year deployment cycle *must* be interrupted by a security patch deployment. For security patches at least (and probably other high-priority changes), there ends up being an implicit rate of change (the rate of the patches) alongside the public "official" release cycle. In practice this means that, for larger monoliths, the change rate of the whole monolith is erratic and coupled to the rate of security patches.
One Process, Many Libraries
The
discussion of how many processes should be in a microservice versus how many
libraries are deployed with a microservice seems to be where the thin and thick
definitions of microservices overlap with each other. On one side you have the idea that the
separation of concerns should be implemented by creating a whole new, single, self-contained runtime process, and therefore a new service. On the other side you have the ability to
separate concerns by creating a new library that shares the same tool chain as
projects up or downstream, complete with its own versioning, product team, and
deployment process. You can then use
that library in a larger, but still single, microservice process.
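As a small illustration of this second approach, the following Go sketch (it assumes only that the service is built with Go modules enabled) prints the independently versioned libraries that were compiled into a single running process:

    package main

    import (
        "fmt"
        "runtime/debug"
    )

    func main() {
        // Read the module information embedded in this binary at build time.
        info, ok := debug.ReadBuildInfo()
        if !ok {
            fmt.Println("no build info available (binary not built with module support)")
            return
        }
        fmt.Printf("main module: %s\n", info.Main.Path)
        // Each dependency is a separately versioned library, possibly owned by a
        // sibling product team, yet all of them run inside this one process.
        for _, dep := range info.Deps {
            fmt.Printf("  %s %s\n", dep.Path, dep.Version)
        }
    }

Each entry carries its own version and release cadence, pinned in the service's go.mod and go.sum files, even though only one process is deployed.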
Since
the technical rate of change can be managed by either versioned separation of
libraries within a larger process or versioned separation of the code that
implements a process, it isn't the deciding factor here. That leads us to explore the driving forces
behind the organizational rate of change.
Process Separation Using Conway's Law
Conway's
law, which says that the structure of your software will mirror your organizational structure, is another area of interest for the development of libraries. Will you have multiple product teams, each
responsible for one or more libraries?
Do they each have their own versioning and deployment pipelines? How does a project that is downstream get
notified that a library has a new version?
How does feedback from very important downstream projects make its way
back to the upstream projects? How long
does this communication take? Will all sibling organizations develop software
using the same programming language and tool chain? Will they all upgrade their programming
language versions in lockstep? If these questions can be answered well enough
to fit the service level of the project being developed, it seems that a bigger
process with more libraries is a viable path. [13] If not, the natural boundaries of one product
team, one process seem to be the better solution.
There
are many isolation strategies that can be used for your microservice
runtime. The first thing to note is that
either containers (logical abstraction, each environment shares the same
kernel) or light VMs (virtual hardware abstraction, each environment has its
own kernel) can serve as the isolation method between services.[14] There are different levels of deployment
tooling for each of these methods with pros and cons to each, but theoretically
both methods will work for microservices.
Along
with the determination of the isolation method for a service, the process
separation also needs to be addressed.
The thin and thick definitions of microservices both favor one process
per container or one process per deployment (e.g. a pod) which may use one or
many versioned libraries. There are
exceptions such as sidecars for handling some secondary process like
logging. With these exceptions it seems
that the many process (one main process, with sidecar processes) and many
libraries method of process separation is the best practice. What is frowned upon is having many process types (e.g. new processes spawned by the initial container process) and leaving those processes unsupervised and unorchestrated, or supervised by homegrown methods. A major tenet of cloud native microservices is having an external, mature supervisor watching over them instead of baking that supervision code in.
Scalability
also drives the push for microservices.
If the general desire is to scale out instead of scale up, the idea is
to have more machines running many containers/VMs, versus fewer (more powerful) machines running many processes. Relying on one very powerful machine is definitely frowned upon. Given that we must have redundancy of machines anyway, the scale-out strategy, with orchestration at the abstract container/VM level instead of the process level, seems to be the way to go.
Product Teams and Conway's Law
Organizational
drivers are the strongest forces behind microservices, although it may not seem
so at first. Organizational
requirements, politics, and competencies drive the different rates of change[15] of
software development. These changes act
like a river where change flows downstream.
The upper hierarchy of an organization or multi-organizational initiative
may try to shape the river, but the boundaries of the internal sibling
organizations act like the riverbeds, shores, and dams of the river. The software reflects how the change in the
organization actually flows, not how we want it to flow. This is the realization of Conway's law. Inverting Conway's law and developing
software so that it facilitates the organization it is developed in, instead of
ignoring organizational drivers, results in something like a microservice. That is to say, the software will look like code separated into orchestrated processes composed of versioned libraries.
The
inversion of Conway's law has software developed by autonomous, cross-functional
product teams. Product teams incorporate
the early stages of software, such as the gathering of requirements, the middle
stages of implementation, and the later stages of deployment. This requires a cross-functional team that
includes some form of project management, developers, DevOps, and infrastructure
knowledge. These teams are smaller but
stretched more vertically (having different skill sets) versus horizontally
(having the same skill set but more redundancy).
With
its focus on iterative development, agile development was a driver for the
formation of the product team. During
earlier stages of agile development in large projects, the teams were separated
into silos. This meant having project
managers, developers, quality assurance, and operations members on separate
teams. Given that agile processes are
measured by how fast their iterations are, teams that combined the skill sets
were explored and found to have faster iterations. One way to look at microservices is that they
are the result of faster iterations of sibling product teams which create
software for one another.[16]
Another
driver for the formation of teams in this way is covered by Daniel Pink in his
book "Drive". After a certain amount of
monetary compensation has been administered, autonomy, purpose (as well as
mastery) take over as drivers for a fulfilling work life.[17] People in smaller groups have more autonomy
and, in product teams, have more responsibility for their software which gives
a higher sense of purpose.
A
benefit of the thick, people-oriented idea of microservice development is the
facilitation of bounded context[18]
across larger organizations. Terms take
on slightly different meanings throughout an organization (e.g. a receipt
number on a receipt given by a sales organization can also serve as a support
tracking number to a support organization).
Having product teams with full responsibility for a service includes local understanding of, authority over, and responsibility for those terms, along with any conflict-resolution duties when the terms are implemented in the software.
We now have a lower (thin) boundary of a microservice, which is software deployed at the process level, and an upper (thick) boundary, which is any software developed by a product team. How do the product teams implement their services and interconnect with one another? This is the subject of the next
section.
Dependencies, Deployments, Surfaces
When
product teams develop software for one another, whether the software is a
container or a library used by a container, a big source of pain for the
consumer of that software is deployment.
Coarse-grained deployments serve as a way to reduce some of the
dependency problems that come with the deployment of software created by
others.[19]
When
consuming libraries within the same programming language, dependency management
tends to be limited to versioned lock files that can be checked into source
control, and associated with the downstream service's version. When consuming another organization's
deployable binary artifacts, dependency management becomes much more
challenging. This is because the
consuming projects have their own tool chains and tool chain versions (e.g. interpreters, compilers, etc.) that often conflict with the upstream project's tool chains. To resolve this, microservice builds are deployed in a coarse-grained fashion. This means that the dependencies are built from a specific version and kept in an immutable container image, which can then be consumed.
Containers
add the benefit of multi-stage builds, which build on the strategy of separating dependencies. With multi-stage builds we can produce a static binary (e.g. a Golang binary artifact) in one stage and copy it into a downstream build (e.g. a Ruby build), which keeps us from having the Golang tool chain in the same environment as the Ruby tool chain.
The benefits increase with each tool chain dependency (e.g. Rust, C++, etc.). With compiled languages and static
binaries, the multi-stage build benefits are much greater, since the static
binaries can be passed down and used in each build. With interpreted languages, upstream
multi-stage preparation is not as beneficial.
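As a rough sketch of how this plays out in practice (the image tags, paths, and file names below are illustrative assumptions rather than a prescribed layout), a multi-stage container build might look like the following:

    # Stage 1: build a static Go binary; the Golang tool chain lives only here.
    FROM golang:1.20 AS gobuild
    WORKDIR /src
    COPY . .
    RUN CGO_ENABLED=0 go build -o /helper ./cmd/helper

    # Stage 2: the Ruby service image receives only the finished binary,
    # never the Golang tool chain.
    FROM ruby:3.2
    WORKDIR /app
    COPY Gemfile Gemfile.lock ./
    RUN bundle install
    COPY . .
    COPY --from=gobuild /helper /usr/local/bin/helper
    CMD ["bundle", "exec", "ruby", "service.rb"]

The final image contains the Ruby tool chain and the compiled helper, but none of the Golang build dependencies, which is exactly the separation described above.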
A
long running driver for microservices is the age-old battle for the surface of
the interface, as described by Richard Gabriel in his "Worse is Better" essay[20]. Gabriel gives an account of two sides in the development of Unix: the MIT style and the New Jersey style. The essay mourns the victory of the "worse" New Jersey style, used in the development of Unix, over the purer MIT style, which prioritized interfaces over internals. The latest iteration of this battle could very well be taken up by microservices being an "interface" that is "simple, both in implementation and interface," while adhering to the principle that it is "more important for the interface to be simple than the implementation." With this view, microservices are the latest
attempt at having our cake and eating it too.
This
is especially true with the Golang/K8s implementation of microservices. This is because all areas of the surface of
the interface are handled by the Golang/Kubernetes triad: CLIs and CLI generators, ConfigMaps as malleable APIs, and Protobufs, which ease the generation of client APIs in the various languages.
The leveraging of generators keeps all three APIs in sync and creates
the illusion of a unified surface that updates in lockstep.
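As a small example of the ConfigMap part of that surface, a Go service might read a ConfigMap that has been mounted into its pod as a file (the mount path and file name are assumptions set in the Deployment manifest, not fixed conventions):

    package main

    import (
        "fmt"
        "log"
        "os"
    )

    func main() {
        // A ConfigMap mounted at this (assumed) path acts as a malleable API:
        // operators edit the ConfigMap, Kubernetes updates the mounted file,
        // and the service picks up the new configuration.
        data, err := os.ReadFile("/etc/myservice/settings.yaml")
        if err != nil {
            log.Fatalf("could not read configuration: %v", err)
        }
        fmt.Printf("current configuration:\n%s\n", data)
    }

Because this surface is just declarative data, it can be kept in sync with the CLI and generated client APIs by the same generators.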
Security, Distributed Systems, Ambiguity
No
description of microservices would be complete without including critiques, and
this description is no exception.
Microservices can be, and are, overhyped. They are not a panacea, and are overkill for
small organizations. Distributed systems
are difficult to manage, and microservices require extra tooling
(orchestration, observability, service meshes, etc.) in order to be managed
appropriately.
Conclusion
We
have seen that there are many forces at play (technical rate of change, organizational rate of change, the state of the art in deployment, etc.) when we set out to write software.
Microservices are a byproduct of these forces. Any particular attempt to resolve these forces by an organization of non-trivial size will be driven toward some kind of pattern that resembles the thick, evaluative definition of the microservice pattern.
W. Watson - Principal Consultant, Vulk Coop
W.
Watson has been professionally developing software for 25 years. He has spent numerous years studying game theory and other business disciplines in pursuit
of the perfect organizational structure for software co-operatives. He also
founded the Austin Software Cooperatives meetup group and Vulk Coop as an
alternative way to work on software as a group. He has a diverse background
that includes service in the Marine Corps as a computer programmer, and
software development in numerous industries including defense, medical,
education, and insurance. He has spent the last couple of years developing
complementary cloud native systems such as the cncf.ci dashboard. He currently
works on the Cloud Native Network Function (CNF) Conformance test suite
(https://github.com/cncf/cnf-conformance), the CNF Testbed
(https://github.com/cncf/cnf-testbed), and the cloud native networking
principles (https://networking.cloud-native-principles.org/) initiatives.
Recent speaking experiences include ONS NA, KubeCon NA 2019, and Open Source
Summit 2020.
[1] https://en.wikipedia.org/wiki/Zeitgeist
[2] "The
activity of dissolution is the power and work of the Understanding, the most
astonishing and mightiest of powers, or rather the absolute power". The
Phenomenology of Spirit, Georg Wilhelm Friedrich Hegel, 1807
[3] "Each
living pattern resolves some system of forces or allows them to resolve
themselves. Each pattern creates an
organization which maintains that portion of the world in balance", pg 134.
Christopher Alexander, The Timeless Way of Building. Oxford University Press,
1979
[4] https://vmblog.com/archive/2020/09/08/a-break-from-the-past-why-cnfs-must-move-beyond-the-nfv-mentality.aspx
[5] https://iep.utm.edu/thick-co/
[ 6] "Cloud
native is not about microservices or infrastructure as code.", Garrison,
Justin; Nova, Kris. Cloud Native Infrastructure: Patterns for Scalable
Infrastructure and Applications in a Dynamic Environment . O'Reilly Media.
Kindle Edition.
[7] "Cloud
native is about autonomous systems that do not require humans to make
decisions. It still uses automation, but only after deciding the action needed.
Only when the system cannot automatically determine the right thing to do
should it notify a human." Garrison, Justin; Nova, Kris. Cloud Native
Infrastructure: Patterns for Scalable Infrastructure and Applications in a
Dynamic Environment. O'Reilly Media. Kindle Edition.
[ 8] "Applications
with these characteristics need a platform that can pragmatically monitor, gather
metrics, and then react when failures occur. Cloud native applications do not
rely on humans to set up ping checks or create syslog rules. They require
self-service resources abstracted away from selecting a base operating system
or package manager, and they rely on service discovery and robust network
communication to provide a feature-rich experience.", Garrison, Justin; Nova,
Kris. Cloud Native Infrastructure: Patterns for Scalable Infrastructure and
Applications in a Dynamic Environment. O'Reilly Media. Kindle Edition.
[9] "Cloud-native
technologies empower organizations to build and run scalable applications in
modern, dynamic environments such as public, private, and hybrid clouds.
Containers, service meshes, microservices, immutable infrastructure, and
declarative APIs exemplify this approach. These techniques enable loosely
coupled systems that are resilient, manageable, and observable. Combined with
robust automation, they allow engineers to make high-impact changes frequently
and predictably with minimal toil." CNCF Cloud Native Definition v1.0, Cloud Native Computing Foundation.
[10] "The
best way to think of a container is
as a method to package a service,
application, or job. It's an RPM on steroids, taking the application and adding
in its dependencies, as well as providing a standard way for its host system to manage its runtime
environment . Rather than a single container running multiple processes, aim
for multiple containers, each running one
process. These processes then become
independent, loosely coupled
entities. This makes containers a nice match for microservice application
architectures." Morris, Kief. Infrastructure as Code: Managing Servers in the
Cloud (Kindle Locations 1708-1711). O'Reilly Media. Kindle Edition
[11] "O'Neill's
A Hierarchical Concept of Ecosystems.
O'Neill and his co-authors noted that ecosystems
could be better understood by observing the rates of change of different components.
Hummingbirds and flowers are quick, redwood trees slow, and whole redwood
forests even slower. Most interaction
is within the same pace level -- hummingbirds and flowers pay attention to each other,
oblivious to redwoods, who are oblivious to them. Meanwhile the forest is
attentive to climate change but not to the hasty fate of individual trees."
Brand, Stewart. How Buildings Learn (p. 33). Penguin Publishing Group. Kindle
Edition.
[12] "Metrics are best used by the team to
help itself, and should be continually reviewed to decide whether they are
still providing value. Some common
metrics used by infrastructure teams include: Cycle time The time taken from a need being identified to
fulfilling it. This is a measure of
the efficiency and speed of change management. [...] Mean time to recover (MTTR) The time
taken from an availability problem
(which includes critically degraded performance or functionality) being identified to a resolution, even where it's a workaround. This is a measure of the efficiency and speed of problem resolution.
Mean
time between failures (MTBF) The time
between critical availability issues.
This is a measure of the stability of the system, and the quality of the change
management process. Although it's a valuable metric, over-optimizing for MTBF is a common cause of poor performance on other
metrics", Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud
(Kindle Locations 2805-2807). O'Reilly Media. Kindle Edition.
[13] "Integration Models
The design and implementation of pipelines for testing how systems and
infrastructure elements integrate depends on the relationships between them,
and the relationships between the teams responsible for them. There are several
typical situations: Single team One team owns all of the elements of the system and
is fully responsible for managing changes to them. In this case, a single pipeline, with fan-in as needed, is often sufficient. Group
of teams A group of teams works together on a single system with
multiple services and/or infrastructure elements. Different teams own different
parts of the system, which all integrate together. In this case, a single fan-in pipeline may work up to a
point, but as the size of the group
and its system grows, decoupling may become necessary. Separate
teams with high coordination Each team (which may itself be a group of
teams) owns a system, which integrates
with systems owned by other teams. A
given system may integrate with multiple systems. Each team will have its own
pipeline and manage its releases
independently. But they may have a close enough relationship that one team
is willing to customize its systems and releases to support another team's requirements.
This is often seen with different groups
within a large company and with close vendor relationships. Separate
teams with low coordination As with the previous situation, except one of the teams is a vendor with
many other customers. Their release process is designed to meet the requirements of many teams, with little or no customizations
to the requirements of individual customer teams. "X as a Service" vendors,
providing logging, infrastructure, web analytics, and so on, tend to use this
model." Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud
(Kindle Locations 4892-4907). O'Reilly Media. Kindle Edition.
[14] "Cloud native is not about running applications in
containers. When Netflix pioneered cloud native infrastructure, almost all its
applications were deployed with virtual-machine images, not containers. The way
you package your applications does not mean you will have the scalability and
benefits of autonomous systems." Garrison, Justin; Nova, Kris. Cloud Native
Infrastructure: Patterns for Scalable Infrastructure and Applications in a
Dynamic Environment. O'Reilly Media. Kindle Edition.
[15] Stine, Matt. Migrating to Cloud-Native Application Architectures, O'Reilly, 2015, p. 16. "As we decouple the business domain into independently deployable bounded contexts of capabilities, we also decouple the associated change cycles. As long as the changes are restricted to a single bounded context, and the service continues to fulfill its existing contracts, those changes can be made and deployed independent of any coordination with the rest of the business. The result is enablement of more frequent and rapid deployments, allowing for a continuous flow of value."
[16] Stine, Matt. Migrating to Cloud-Native Application Architectures, O'Reilly, 2015, p. 16. "Microservices represent the
decomposition of monolithic business systems into independently deployable
services that do "one thing well." That one thing usually represents a
business capability, or the smallest, "atomic" unit of service that delivers
business value."
[17] "We need an upgrade. And the science shows the
way. This new approach has three essential elements: (1) Autonomy -- the desire to direct our own lives; (2) Mastery -- the urge to make progress and get better at something that matters; and (3) Purpose -- the yearning to do what we do in
the service of something larger than ourselves.", Pink, Daniel H.. Drive: The
Surprising Truth About What Motivates Us (p. 204). Penguin Publishing Group.
Kindle Edition.
[18] Stine, Matt. Migrating to Cloud-Native Application Architectures, O'Reilly, 2015, pp. 16-17. Development can be accelerated by scaling the development organization itself. It's very difficult to build software faster by adding more people due to the overhead of communication and coordination. Fred Brooks taught us years ago that adding more people to a late software project makes it later. However, rather than placing all of the developers in a single sandbox, we can create parallel work streams by building more sandboxes through bounded contexts.
[19] "The
benefits of decoupling runtime requirements from the host
system are particularly powerful for
infrastructure management. It creates a clean separation of concerns between infrastructure
and applications. The host system only needs to have the container runtime software
installed, and then it can run nearly any container image. Applications,
services, and jobs are packaged into containers along with all of their
dependencies [...]. These dependencies can include operating system packages,
language runtimes, libraries, and system files. Different containers may
have different, even conflicting dependencies, but still run on the same host without issues. Changes
to the dependencies can be made without any changes to the host
system." Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud
(Kindle Locations 1652-1658). O'Reilly Media. Kindle Edition.
[20] https://www.dreamsongs.com/RiseOfWorseIsBetter.html