A Break From the Past, Why CNFs Must Move Beyond the NFV Mentality

By Jeffrey Saelens, Principal Engineer, and W. Watson, Principal Consultant, Vulk Coop

Introduction

What does ideology mean? In common pejorative use, another person's ideology is simply "that which is wrong". Other definitions range from ideas imposed from the top ("a rational system of ideas to oppose the irrational impulses of the mob") to ideas that erupt from the bottom ("fantasies that hide inconvenient truths") [1].

The term cloud native has been used along a spectrum of ideological definitions, starting with "no one knows what cloud native means" (that which is wrong) and ending with treating cloud native as a panacea (a fantasy). The answer is somewhere in the middle. A clue to how to reason about cloud native ideas in this middle ground can be found in the paper "Programming-in-the-large versus programming-in-the-small" by Frank DeRemer and Hans Kron, written back in 1975 [2]. The problem that has plagued programming for 40+ years is figuring out how to reason about, define, and direct several programs, components, or modules as a system. Reasoning about cloud native principles as a whole is analogous to programming in the large, while the various domains of cloud native (containerization, CI/CD, orchestration, observability, service meshes, networking, etc.) [3] can be associated with programming in the small. Some aspects of the former are found in the latter.

With respect to networking, the right question to ask is "how do we take cloud native principles and apply them to the individual domain of networking?" For cloud native in the large, networking means service mesh networking: service discovery, health checking, routing for REST services, load balancing, authentication and authorization, and the generation of observability metrics and traces [4]. These are mostly layer 7 concerns, but they really address networking the "components or modules" as a system. For cloud native in the small, networking includes concerns down to layer 2, with implications for layer 1. Here, in the words of Ed Warnicke, the packet is the payload.

[Image: Cloud Native Buzzword Bingo]

What About Cloud Native Buzzword Bingo?

Any critique of an architectural offering should surface that offering's trade-offs, and critiquing the promise of cloud native should be no different. One way to keep ourselves honest about the terms and definitions of cloud native is to force ourselves to define what we mean by them. We can do this by playing buzzword bingo during our talks, papers, and conferences, to keep us from leaning too heavily on buzzwords [5]. When describing cloud native networks while playing, a question to ask ourselves is "does this description sound more like an enterprise application concept, a traditional SDN networking concept, or neither, because the terms are not well defined?"


SDN - Champion of All Buzzwords

Software defined networking (SDN) might just be the champion of all buzzwords in the networking space. A proposed paradigm shift in how network operations were to be run, not unlike the recent pivot to cloud native, SDN was supposed to fundamentally change how the industry approached networking. Unfortunately, SDN buckled under its own hype and under a business model that threatened the very vendors pushing solutions into the market. It is hard to identify at this point what problem SDN set out to solve, as its scope has crept continually since its proposal. Reducing complexity, automating network operations, and providing a common interface to multiple network vendors' equipment are some of the primary things SDN promised to deliver. This desire to please everyone resulted in a confusing landscape of anything and everything claiming to be "SDN". Given this, it is hard to tell what SDN even entails. Is it a complex protocol like OpenFlow or NETCONF, or custom Python manipulating CLIs and APIs? Furthermore, what is the difference between SDN, orchestration, and policy-driven networking? This confusion makes at least two of the core goals, reducing complexity and interoperability between vendors, very hard to achieve. None of this is to say there have been no SDN success stories; however, those successes are overwhelmingly greenfield deployments of both hardware and software. That is often a suitable approach in the enterprise space, but it creates challenges in multi-vendor communication service providers' (CSP) networks with large brownfields.

NFV - 50 Shades of Virtualization

Network Function Virtualization (NFV) was another major proposal for shifting how networking at large is tackled. Unlike SDN, NFV had a clearly defined goal: reduce OPEX. It would achieve this by enabling network operators to migrate to a commercial off-the-shelf (COTS) model for their hardware base and simply license virtual network functions (VNFs) from vendors in a multi-vendor ecosystem. This ideal turned out to be quite a reach. Endless variations between CSPs' infrastructure, and physical network function code simply shoved into a virtual machine and labeled a VNF, made the early days of NFV a continuous cycle of pain and frustration.

Compounding matters, NFV deployment models and architectures often directly mirrored their legacy physical counterparts, with additional complexity built in. Using the packet core as an example, line cards often mapped one-to-one to a single VM consuming an entire server. This perpetuated the appliance-based approach within NFV, making integration extremely challenging. The appliances were built with very specific expectations about infrastructure tuning and the type of dataplane available to them: some wanted SR-IOV, others a DPDK-enabled virtual switch to handle packet treatment before reaching the VNF. The pain that arose from this infinite mutability was likely a major catalyst pushing the cloud native community toward standardizing on immutable infrastructure.

Finally, there is the provisioning side of NFV. ETSI's MANO architecture for VNF orchestration and life-cycle management came out very early in the NFV journey. While providing a general sense of how things "could" be done, its lack of specificity made using it as an established standard challenging. There was room for endless interpretation of how a network function virtualization orchestrator (NFVO) was supposed to communicate with the virtual network function manager (VNFM) and the virtual infrastructure manager (VIM). Every vendor sought to innovate in this space, and each had a different take. This complexity delayed the maturity of the overall virtualization effort within the risk-averse CSP space, causing some providers to question whether they should skip it entirely, with the cloud native tsunami on the horizon.

SDN and NFV: What Went Wrong?

The unique challenges and complexity of both the SDN and NFV spaces have made that ever-elusive OPEX reduction hard to achieve. Too often, SDN and NFV are used interchangeably despite solving very different technical problems. They have been lumped into the primordial soup we refer to as buzzword bingo, losing their gravity. The lessons learned in these spaces can and should be carried forward. One of the biggest deltas between the SDN/NFV and cloud native approaches is the concept of declarative consumption models. Even in the more successful SDN and NFV deployments, the layer presented to operators is incredibly imperative. This means operators have to hire multiple experts for a single deployment, across all their teams. First, each team needs someone who knows at a granular level what the VNF configuration and network service are supposed to encompass. Second, these teams need someone who understands the modeling languages of the MANO stack they chose to deploy and what the API interactions between the stack's components entail. Third, each organization needs someone who deeply understands virtualization in order to tune and troubleshoot the infrastructure. An alternative approach would be to jettison this mindset and instead put the needed skill sets on the appropriate teams. These teams would sit at different points on the declarative-versus-imperative spectrum and focus on their specializations: each team would imperatively define its own domain, creating abstractions that allow itself and other teams to declaratively consume the work of others (see the sketch below).
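To make the distinction concrete, here is a minimal sketch in Go (the port and VLAN values are invented for illustration). The imperative style scripts every step and owns the ordering; the declarative style states only the desired end state and leaves the steps to a reconciler.

    package main

    import "fmt"

    // Imperative consumption: the operator scripts every step, in order,
    // and is responsible for what happens when a step fails or is skipped.
    func imperativeConfigure() {
        fmt.Println("create vlan 100")
        fmt.Println("add port eth1 to vlan 100")
        fmt.Println("enable port eth1")
    }

    // Declarative consumption: the operator states the desired end state;
    // a reconciler (not shown) computes and performs whatever steps close
    // the gap between actual and desired.
    type PortState struct {
        Port    string
        VLAN    int
        Enabled bool
    }

    func main() {
        imperativeConfigure()
        desired := PortState{Port: "eth1", VLAN: 100, Enabled: true}
        fmt.Printf("desired state: %+v\n", desired)
    }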

What You Think You Want

Service providers desire speedy deployments and changes to code with minimal toil. They want to apply security patches without having to upgrade everything in lock step. They want resilient infrastructure. What many service providers think they want from cloud native is Kubernetes. Kubernetes brings the benefits of orchestration, and running Kubernetes seems to legitimize the containerization already done on their workloads. But there are problems with adopting Kubernetes without changing your development process: this premature adoption puts the cart before the horse. Many of the desires of service providers are addressed higher up the CNCF's cloud native trail map.

[Image: Cloud Native Trail Map [6]]

What You Really Want

What service providers really want, if you filter out the noise, is uninterrupted service delivery to their customers and the ability to continuously deploy new services without impacting their infrastructure. Translated into a cloud native paradigm, what service providers really want are the properties of agility: speedy deployments and the ability to change code with minimal toil. This is supported by a strong continuous delivery process, which is often skipped on the trail to cloud native.

With continuous delivery, you have a process (a pipeline) that applies a series of tests at different stages. The first stage creates an artifact from the source code; the second stage tests the integration of that artifact with other artifacts, configuration, and resources in a test environment. The third stage and beyond can be manual tests, deployment into production, or deployment into other environments. Sometimes the later stages include tests of how well stacks of infrastructure elements (servers, switches, databases, i.e. any grouping of elements that must be modified all at once) work together. A sketch of such a pipeline follows.
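As a minimal sketch (not tied to any particular CI system; the stage names and artifact tag are invented), a pipeline is an ordered series of stages that an artifact must pass through, failing fast at the first broken stage:

    package main

    import (
        "fmt"
        "log"
    )

    // Stage is one step in a delivery pipeline: it either promotes the
    // artifact toward production or fails the run.
    type Stage struct {
        Name string
        Run  func(artifact string) error
    }

    func main() {
        pipeline := []Stage{
            {"build", func(a string) error { return nil }},            // compile source into an immutable artifact
            {"integration-test", func(a string) error { return nil }}, // exercise the artifact plus config in a test environment
            {"deploy", func(a string) error { return nil }},           // promote the exact same artifact onward
        }

        artifact := "registry.example.com/cnf:1.4.2" // hypothetical image tag
        for _, stage := range pipeline {
            if err := stage.Run(artifact); err != nil {
                log.Fatalf("stage %q failed: %v", stage.Name, err) // fail fast: pull the pain forward
            }
            fmt.Printf("stage %q passed for %s\n", stage.Name, artifact)
        }
    }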

Continuous Delivery

The reason you need continuous delivery is to pull the pain forward: the more you practice something, the easier it becomes. For service providers this means preferring software and hardware options that lend themselves to the continuous delivery process. It means promoting, contributing to, and using open source, open hardware, bare metal switches, and commodity solutions that allow for completely automated deployments [7] with separate artifacts [8], configuration [9], and environments [10] [11]. If you use proprietary software and hardware, it can be harder to automate deployments. Service providers need to demand the ability to make completely automated deployments, with separate artifacts, configuration, and environments, from their vendors in order to get the benefits of an agile process. The sketch below shows this separation.
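Following the footnote definitions [7][8][9][10], the separation looks like this in a minimal sketch (all names invented): the artifact is built once and reused unchanged, and only the configuration differs per environment.

    package main

    import "fmt"

    // Environment is configuration plus resources; it is everything
    // besides the artifacts [10].
    type Environment struct {
        Name      string            // e.g. "test" or "production"
        Config    map[string]string // environment-specific settings [9]
        Resources []string          // servers, switches, etc.
    }

    // Deploy is the act of putting an environment-agnostic artifact [8]
    // into an environment [7].
    func Deploy(artifact string, env Environment) {
        fmt.Printf("deploying %s to %s with config %v\n", artifact, env.Name, env.Config)
    }

    func main() {
        artifact := "registry.example.com/cnf:1.4.2" // hypothetical, built exactly once
        test := Environment{Name: "test", Config: map[string]string{"peer": "lab-switch"}, Resources: []string{"lab-server"}}
        prod := Environment{Name: "production", Config: map[string]string{"peer": "core-switch"}, Resources: []string{"rack-42"}}
        Deploy(artifact, test) // same artifact...
        Deploy(artifact, prod) // ...different configuration and environment
    }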

Immutable Infrastructure

If infrastructure is immutable, it is easily reproduced, consistent, and disposable; it has a repeatable provisioning process; and it has no configuration or artifacts that can be modified in place. Kubernetes allows for immutable infrastructure above the orchestration level. The orchestrator itself needs to be in a continuous delivery process as well, in order to get agile benefits for everything below the application level, which is a great source of pain for service providers. Changes that happen frequently in production should be intelligently automated, based on conditions that trigger the change, as the sketch below illustrates. This is a significant change from the previous NFV mentality.
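A minimal sketch of the replace-don't-patch rule at the heart of immutability (the types and image tags are invented): an instance that drifts from the desired image is replaced wholesale, never modified in place.

    package main

    import "fmt"

    // Instance is a running piece of infrastructure built from an image.
    type Instance struct {
        ID    string
        Image string
    }

    // reconcile never patches a running instance; anything that differs
    // from the desired image is replaced with a fresh instance built
    // from that image. The trigger is a condition (image drift), not a
    // human typing commands on a box.
    func reconcile(desiredImage string, running []Instance) []Instance {
        next := make([]Instance, 0, len(running))
        for _, inst := range running {
            if inst.Image != desiredImage {
                fmt.Printf("replacing %s (%s -> %s)\n", inst.ID, inst.Image, desiredImage)
                inst = Instance{ID: inst.ID + "-new", Image: desiredImage}
            }
            next = append(next, inst)
        }
        return next
    }

    func main() {
        running := []Instance{{"cnf-a", "cnf:1.4.1"}, {"cnf-b", "cnf:1.4.2"}}
        reconcile("cnf:1.4.2", running)
    }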

Challenges

There are many challenges in applying cloud native principles to networking. One of them is identifying the requirements of the diverse service provider community. One requirement for larger service providers is the ability to procure solutions from multiple vendors, which means vendors cannot sell solutions that do not play nicely with other vendors' solutions. Past examples of interoperability failures include SNMP [12] and NETCONF/YANG; success stories include HTTP and IPv4 [13]. The deployment process should be integrated with the procurement process to help facilitate multiple organizations working together, working with Conway's law [14] and not against it. For larger service providers, this should include automated performance, security, and compliance testing stages in the continuous delivery process.

Network Service Mesh

Network Service Mesh [15] is one solution that lends itself to declarative configuration for layer 2 and layer 3 payloads. This means it lends itself to a comprehensive continuous delivery process as well, since the configuration can be saved in source control, versioned, and tested, and is generally easier to reason about [16]. This helps with the requirements for easily repeatable deployments and for the separation of network service artifacts, network configuration, and network environments, which is needed for testing. It also has abstractions between the network service developer, the operator (i.e., sneaky network people), and even the consumer (i.e., Sarah, the application developer) that are easier to reason about.

[Image: Declarative spectrum]

When an application developer consumes a cloud native networking function, network service mesh allows them to consume it using a declarative API. If an operator combines cloud native network functions into a service chain using network service mesh, they combine the services using a declarative API and then expose those services as a declarative API for the application developer. When a cloud native network function developer creates networking software using the network service mesh, they expose that software using a declarative API.
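As a purely hypothetical illustration (the types and names below are invented for this sketch and are not NSM's actual API), the same declarative pattern repeats at every layer: each persona declares what they want by name and leaves the realization to the mesh.

    package main

    import "fmt"

    // NetworkService is a hypothetical declaration of a service chain:
    // the operator names the chain and lists the CNFs composing it.
    type NetworkService struct {
        Name  string
        Chain []string // ordered CNFs, e.g. firewall -> vpn-gateway
    }

    // connect is a hypothetical stand-in for a declarative request: the
    // application developer asks for a service by name and never sees
    // the imperative steps that wire it up.
    func connect(pod, service string) {
        fmt.Printf("%s requests connection to network service %q\n", pod, service)
    }

    func main() {
        // The operator declares the chain once and exposes it by name...
        secureAccess := NetworkService{Name: "secure-access", Chain: []string{"firewall", "vpn-gateway"}}
        fmt.Printf("operator exposes %q as chain %v\n", secureAccess.Name, secureAccess.Chain)

        // ...and Sarah, the application developer, consumes it by name.
        connect("sarahs-app-pod", secureAccess.Name)
    }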

Conclusion

For CNFs to qualify as cloud native, the ideology behind what it means to actually be cloud native must be taken into consideration. Simply repackaging network functions in containers and deploying them with Kubernetes will not be enough to give service providers what they "want" or "need". While NSM may not be a silver bullet, it at least approaches the problem with cloud native considerations at the forefront, rather than relegating them to a nice-to-have afterthought.


About the Authors

Jeffrey Saelens - Principal Engineer


Jeffrey Saelens is a Principal Architect in the telecommunications industry. Starting his career in the US Army, Jeffrey was a Green Beret who focused on communications and systems engineering. After leaving the military, he dove into the service provider world, focusing heavily on NFV and SDN transformations within data center, core, and access networks. Currently, Jeffrey seeks out ways of implementing cloud native philosophies within the service provider space.

 

W. Watson - Principal Consultant, Vulk Coop


W. Watson has been professionally developing software for 25 years. He has spent numerous years studying game theory and other business expertise in pursuit of the perfect organizational structure for software co-operatives. He also founded the Austin Software Cooperatives meetup group and Vulk Coop as an alternative way to work on software as a group. He has a diverse background that includes service in the Marine Corps as a computer programmer and software development in numerous industries, including defense, medical, education, and insurance. He has spent the last couple of years developing complementary cloud native systems such as the cncf.ci dashboard. He currently works on the Cloud Native Network Function (CNF) Conformance test suite (https://github.com/cncf/cnf-conformance), the CNF Testbed (https://github.com/cncf/cnf-testbed), and the cloud native networking principles (https://networking.cloud-native-principles.org/) initiatives. Recent speaking experiences include ONS NA, KubeCon NA 2019, and Open Source Summit 2020.



[1] https://en.wikipedia.org/wiki/Ideology

[2] https://dl.acm.org/doi/10.1145/390016.808431

[3] https://github.com/cncf/trailmap

[4] https://blog.envoyproxy.io/service-mesh-data-plane-vs-control-plane-2774e720f7fc

[5] https://en.wikipedia.org/wiki/Buzzword_bingo

[6] https://github.com/cncf/trailmap

[7] A deployment is the act of putting artifacts into an environment.

[8] An artifact is the result of compiling code: a binary. It is environment agnostic.

[9] Configuration is specific to an environment. It is used to communicate information about the environment to artifacts.

[10] An environment is configuration plus resources (such as a server or switch). It is everything besides the artifacts. An environment usually has a name such as 'test' or 'production'.

[11] "Open code availability. Perhaps the next most important technical consideration is that a protocol have freely available implementation code. This may have been the case when deciding between IPv4 and IPX, the latter of which at the time was, in many ways, the technically superior of the two."  https://www.ietfjournal.org/what-makes-for-a-successful-protocol/

[12] "SNMP implementations vary across platform vendors. In some cases, SNMP is an added feature, and is not taken seriously enough to be an element of the core design. Some major equipment vendors tend to over-extend their proprietary command line interface (CLI) centric configuration andcontrol systems." https://en.wikipedia.org/wiki/Simple_Network_Management_Protocol#Implementation_issues

[13] "If we apply those definitions, then a protocol such as HTTP is defined as wildly successful because it exceeded its design in both purpose and scale. Another example of a wildly successful protocol is IPv4. Although it was designed for all purposes ("Everything over IP and IP over Everything"), it has been deployed on a far greater scale than it was originally designed to meet." https://www.ietfjournal.org/what-makes-for-a-successful-protocol/

[14] https://en.wikipedia.org/wiki/Conway%27s_law

[15] https://networkservicemesh.io/

[16] "Declarative configuration is different from imperative configuration , where you simply take a series of actions (e.g., apt-get install foo ) to modify the world. Years of production experience have taught us that maintaining a written record of the system's desired state leads to a more manageable, reliable system. Declarative configuration enables numerous advantages, including code review for configurations as well as documenting the current state of the world for distributed teams. Additionally, it is the basis for all of the self-healing behaviors in Kubernetes that keep applications running without user action." Hightower, Kelsey; Burns, Brendan; Beda, Joe. Kubernetes: Up and Running: Dive into the Future of Infrastructure (Kindle Locations 892-896). Kindle Edition.

Published Tuesday, September 08, 2020 7:35 AM by David Marshall