You shouldn't (have to) think about where your workloads run

By Shaun O'Meara, Global Field CTO, Mirantis

We'd like to think that we're all truly cloud native, that we don't need to think about where our workloads are running, and that we can let Kubernetes or some other workload scheduler do all the thinking. Unfortunately, it's not true; at some level we still need to think about it -- but we shouldn't have to.

There are lots of reasons we still think about where our workloads are running:

  • Security: You want to make sure the systems on which your workloads run are free from malware or vulnerabilities that can enable attackers to gain access. Right now that means having to pay (understandable) attention to the hosts themselves.
  • Cost: Any time a workload is fired up, it incurs a cost. The idea is to minimize those costs as much as possible, either by keeping workloads on-premises or by right-sizing public cloud instances.
  • Regulations: Depending on your industry, you may have additional requirements to consider, whether governance-related or geographical. This requires careful planning and management of where workloads are deployed and the data centers in which they are located.
  • Latency: Latency-sensitive workloads need to run as "close" to the user as possible. This is still an emerging concern, and users currently manage it by creating additional edge clusters closer to the user.

That's a lot, but think about how far we've already come! In reality, some companies haven't even come this far; they are still running monolithic workloads in on-premises data centers, doling out individual physical or virtual servers and increasing capacity by ordering hardware that will then be racked and stacked by overworked, overstressed IT personnel.

But imagine a situation where we didn't have to do that, where we could just send our workloads out into the ether and have them run without worrying about them. To extend the metaphor we've been using for years to describe getting the appropriate resources within a cloud, Infrastructure as a Service, we can call it Hybrid Data Center as a Service.

To reach this level of workload nirvana, Data Center as a Service would need:

  • Intelligent workload routing: The first thing we need is for workloads with specific requirements to be routed automatically to nodes, or even clusters, that satisfy those requirements. Workloads that need to remain in Europe, for example, will be scheduled to a European cluster, and workloads that require specialized hardware will be scheduled to servers that have that hardware. (A minimal sketch of this kind of region pinning follows this list.)
  • Automatic server management: In order for intelligent workload routing to be effective, the appropriate resources must be available. Ideally, when additional resources such as servers, storage, or networking are needed, Data Center as a Service will provision them in a way that conforms with the requirements. For example, a job requested from the United States that deals with European users, and thus (according to our business rules) needs to be executed in Europe, might call for the system to spin up a new bare metal node and add it to one of the available European clusters. Or, if no European clusters were available, it might go ahead and create one on newly commissioned European servers. Similarly, when the last European job ends, it might decommission that cluster (after some pre-determined buffer time).
  • Service mesh: With all of these routing options for workloads, requests will, of course, need to be routed as well. Service mesh technologies will need to implement business rules about which requests go where, and those rules will need to be kept up to date. (A sketch of one such routing rule also appears after this list.)
  • Virtualized and containerized workloads: While much new development is architected as containers, legacy applications will be around for many years. (Last year, states were begging for COBOL programmers to help with their unemployment compensation systems!) The ideal system will be able to handle not just containerized workloads, but also these legacy systems, which can often be run as virtual machines.
  • Security: Security should cover all aspects of running workloads, from end-to-end TLS (which can be handled by the service mesh) to data encryption and node-based security issues. The actual platform running your workloads must also be secure.
  • Logging, monitoring and alerting: While the ideal system enables you, as an operator, to be much more hands-off, that doesn't mean you are completely divorced from the day-to-day running of your system. Quite the opposite. You should be able to see exactly what is happening at any given moment, and the system should alert you to any problems. Ideally, the system will alert you to issues before they become problems and, in some cases, even handle them for you. For example, if patterns indicate that a server is about to fail, the system could create a new node, drain workloads from the questionable node to the new one, and shut the questionable node down before there are problems. (The drain step is sketched after this list as well.)
  • Developer guardrails: The best production system technology can offer won't do you any good if your developers are creating insecure or suboptimal software. The system should provide a way for developers to begin with trusted base images and to scan for vulnerabilities. It should also prevent workloads from moving to production unless they have been verified as having been through this process. (A toy version of such a gate closes out the sketches below.)
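To make the intelligent workload routing requirement concrete, here is a minimal sketch of how region pinning can already be expressed with stock Kubernetes scheduling primitives and the official Python client. It assumes cluster nodes carry the well-known topology.kubernetes.io/region label; the deployment name, image, and region values are illustrative only.

```python
# Minimal sketch: pin a workload to European nodes using standard Kubernetes
# node affinity. Assumes a reachable cluster whose nodes carry the well-known
# topology.kubernetes.io/region label; names and regions are illustrative.
from kubernetes import client, config

def build_eu_deployment(name: str = "billing-api") -> client.V1Deployment:
    # Only schedule onto nodes labeled with an EU region.
    affinity = client.V1Affinity(
        node_affinity=client.V1NodeAffinity(
            required_during_scheduling_ignored_during_execution=client.V1NodeSelector(
                node_selector_terms=[
                    client.V1NodeSelectorTerm(
                        match_expressions=[
                            client.V1NodeSelectorRequirement(
                                key="topology.kubernetes.io/region",
                                operator="In",
                                values=["eu-central-1", "eu-west-1"],  # illustrative
                            )
                        ]
                    )
                ]
            )
        )
    )
    container = client.V1Container(name=name, image="registry.example.com/billing-api:1.0")
    template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"app": name}),
        spec=client.V1PodSpec(containers=[container], affinity=affinity),
    )
    spec = client.V1DeploymentSpec(
        replicas=2,
        selector=client.V1LabelSelector(match_labels={"app": name}),
        template=template,
    )
    return client.V1Deployment(
        api_version="apps/v1",
        kind="Deployment",
        metadata=client.V1ObjectMeta(name=name),
        spec=spec,
    )

if __name__ == "__main__":
    config.load_kube_config()  # or load_incluster_config() inside a cluster
    client.AppsV1Api().create_namespaced_deployment(namespace="default", body=build_eu_deployment())
```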
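For the service mesh requirement, the business rule "European users' requests go to the European deployment" might look something like the following. This assumes Istio, which is only one of several mesh options; the hostnames and the x-user-region header are hypothetical.

```python
# Hypothetical routing rule: send requests tagged as EU traffic to the EU service.
# Assumes Istio is installed; hostnames and the x-user-region header are illustrative.
from kubernetes import client, config

virtual_service = {
    "apiVersion": "networking.istio.io/v1beta1",
    "kind": "VirtualService",
    "metadata": {"name": "billing-api", "namespace": "default"},
    "spec": {
        "hosts": ["billing-api.example.com"],
        "http": [
            {   # EU-tagged requests go to the European deployment
                "match": [{"headers": {"x-user-region": {"exact": "eu"}}}],
                "route": [{"destination": {"host": "billing-api-eu.default.svc.cluster.local"}}],
            },
            {   # everything else falls through to the default deployment
                "route": [{"destination": {"host": "billing-api-us.default.svc.cluster.local"}}],
            },
        ],
    },
}

if __name__ == "__main__":
    config.load_kube_config()
    client.CustomObjectsApi().create_namespaced_custom_object(
        group="networking.istio.io",
        version="v1beta1",
        namespace="default",
        plural="virtualservices",
        body=virtual_service,
    )
```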
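The "drain workloads away from a suspect node" step in the logging, monitoring and alerting requirement is already expressible with the Kubernetes eviction API. A rough sketch, assuming the failing node has been identified by whatever analytics the platform runs (not shown here):

```python
# Rough sketch: cordon a suspect node and evict its pods so the scheduler
# reschedules them elsewhere. A production drain would also skip DaemonSet
# and mirror pods; that detail is omitted for brevity.
from kubernetes import client, config

def drain_node(node_name: str) -> None:
    core = client.CoreV1Api()
    # Cordon: mark the node unschedulable so no new pods land on it.
    core.patch_node(node_name, {"spec": {"unschedulable": True}})
    # Evict every pod currently on the node; their controllers recreate them elsewhere.
    pods = core.list_pod_for_all_namespaces(field_selector=f"spec.nodeName={node_name}")
    for pod in pods.items:
        eviction = client.V1Eviction(
            metadata=client.V1ObjectMeta(
                name=pod.metadata.name, namespace=pod.metadata.namespace
            )
        )
        core.create_namespaced_pod_eviction(
            name=pod.metadata.name, namespace=pod.metadata.namespace, body=eviction
        )

if __name__ == "__main__":
    config.load_kube_config()
    drain_node("worker-eu-03")  # illustrative node name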
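Finally, a toy version of the developer guardrails gate: refuse to promote an image to production unless it comes from a trusted registry and its scan report is clean. The registry name and report shape are invented purely for illustration; in practice this check would live in the CI/CD pipeline or an admission controller.

```python
# Toy promotion gate: the registry name and scan-report shape are hypothetical,
# standing in for whatever scanner and policy engine an organization actually uses.
import json
import sys

TRUSTED_REGISTRY = "registry.example.com/approved/"   # illustrative
BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}

def image_is_allowed(image: str, scan_report_path: str) -> bool:
    # Rule 1: the image must come from the trusted registry.
    if not image.startswith(TRUSTED_REGISTRY):
        print(f"refused: {image} is not from {TRUSTED_REGISTRY}")
        return False
    # Rule 2: the vulnerability scan must contain no blocking findings.
    # Assumed report shape: {"vulnerabilities": [{"id": "...", "severity": "HIGH"}, ...]}
    with open(scan_report_path) as handle:
        report = json.load(handle)
    blockers = [
        v for v in report.get("vulnerabilities", [])
        if v.get("severity", "").upper() in BLOCKING_SEVERITIES
    ]
    if blockers:
        print(f"refused: {len(blockers)} blocking vulnerabilities found")
        return False
    return True

if __name__ == "__main__":
    image_ref, report_file = sys.argv[1], sys.argv[2]
    sys.exit(0 if image_is_allowed(image_ref, report_file) else 1)
```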

All of this technology exists as open source projects; it simply needs to be combined and made available for users so that all workloads, whether they themselves are cloud-native or not, can take advantage of what we all want cloud native computing to be.

To implement these technologies, we need to create a clear delineation between the orchestration of the workloads and the infrastructure that provides the compute resources. The workloads need to be agnostic to the underlying infrastructure and only consume common, open-standards-based technologies like containers and Kubernetes. The infrastructure needs to be made accessible through a common set of standardized APIs that ensure developers can consume the resources they need in a consistent manner from any provider, whether on-premises or in the public cloud. This needs to be achieved while ensuring that security, identity management, logging and monitoring, and lifecycle management are implemented in a clear and consistent manner that meets the needs of the organization.
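One way to picture that delineation is a thin, provider-agnostic interface that the orchestration layer calls without knowing who fulfills it. The sketch below is hypothetical (the interface, method names, and providers are invented for illustration) and simply shows the shape such a standardized API could take:

```python
# Hypothetical sketch of a provider-agnostic infrastructure API.
# The interface and both providers are invented for illustration; a real
# implementation would sit behind whatever standardized API the platform exposes.
from abc import ABC, abstractmethod
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    region: str
    flavor: str  # e.g. "bare-metal-large" or "vm-4cpu-16gb"

class InfrastructureProvider(ABC):
    """What the orchestration layer is allowed to know about infrastructure."""

    @abstractmethod
    def create_node(self, region: str, flavor: str) -> Node: ...

    @abstractmethod
    def delete_node(self, node: Node) -> None: ...

class OnPremProvider(InfrastructureProvider):
    def create_node(self, region: str, flavor: str) -> Node:
        # Would drive bare metal provisioning (PXE, IPMI, etc.) in a real system.
        return Node(name=f"onprem-{region}-01", region=region, flavor=flavor)

    def delete_node(self, node: Node) -> None:
        print(f"decommissioning {node.name}")

class PublicCloudProvider(InfrastructureProvider):
    def create_node(self, region: str, flavor: str) -> Node:
        # Would call a cloud provider's instance API in a real system.
        return Node(name=f"cloud-{region}-01", region=region, flavor=flavor)

    def delete_node(self, node: Node) -> None:
        print(f"terminating {node.name}")

def scale_up_for_eu_work(provider: InfrastructureProvider) -> Node:
    # The orchestration layer only ever sees the interface, never the provider.
    return provider.create_node(region="eu-central", flavor="bare-metal-large")

if __name__ == "__main__":
    print(scale_up_for_eu_work(OnPremProvider()))
    print(scale_up_for_eu_work(PublicCloudProvider()))
```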

##

To hear more about cloud native topics, join the Cloud Native Computing Foundation and cloud native community at KubeCon+CloudNativeCon North America 2021, October 11-15, 2021.

ABOUT THE AUTHOR

Shaun O'Meara, Global Field CTO, Mirantis

Shaun O'Meara has been designing and building Enterprise IT Infrastructure Solutions for 20 years. His work with customers, advising on the journey to cloud and assisting in the development of cloud solutions, has given him a wide scope to learn and try new and diverse technologies. Currently Field CTO at Mirantis, he previously served as the company's EMEA Lead and Senior Systems Architect.

Published Tuesday, September 14, 2021 7:31 AM by David Marshall