Virtualization Technology News and Information
Pushing your platform full stream ahead

Time and again, big-ticket events cause sites and platforms to crash, whether it's major UK retailers like John Lewis and GAME failing to put the right infrastructure in place to scale fast enough to handle the influx of visitors wanting a PS5, or the entire internet in South Korea crashing due to the popularity of Netflix show Squid Game. The Glastonbury Festival ticket site crashed almost immediately after launching in November, with revellers already fighting for the few available tickets. Similarly, global ticket sales website Ticketmaster crashed due to high demand for Taylor Swift concert tickets, and more recently it crashed yet again as partygoers attempted to buy Eurovision tickets.

Didn't we fix all of this when Covid-19 rapidly accelerated digital transformation across every industry? There's a big difference between migrating to the cloud, and making your cloud work for less.

Always online: stretching the bandwidth

Where once the internet was limited to sharing text information, today it's used for everything from video streaming and gaming to data gathering and complex calculations. To complicate matters, the number of people using the internet is also increasing.

There are 8 billion people on Earth right now - 6.92 billion of them are smartphone users. 86% of the world has access to the internet at all times from their pockets, constantly messaging, shopping, streaming, gaming, reading, exploring. And it isn't only consumers who are online more, so are businesses.

Global Covid-19 lockdowns accelerated the world's digital transformation journey. In fact, according to McKinsey, companies digitised many activities 20 to 25 times faster during Covid-19. Companies had to adjust processes to allow employees to work remotely where possible. But can the internet handle this rapid growth in active users? The short answer is yes, but that's not what many are experiencing.

Traffic is growing exponentially and the amount of data produced by each individual is also increasing rapidly, yet many companies lack the infrastructure to scale online at pace.

Kubernetes: not quite the golden solution

The ongoing shift to the cloud led to the rise of Kubernetes (or K8s), which describes itself as an open-source system for automating deployment, scaling, and management of containerised applications. Dubbed by some the cloud's operating system, K8s has supported the world's invasion of the web, allowing companies to scale online, but it has also created a new, complex digital world of metrics.

This is one of the key issues with K8s - by default, it generates a huge number of metrics, and that number grows continuously. Many businesses can't decide which metrics are important today, which may be important in the future and which will probably never be important. Too afraid to stop monitoring data that could one day, in a hundred years, be important, businesses instead attempt to monitor it all. To quantify how big an issue this is, let's focus on K8s version 1.24.0. Every node exports between 2,000 and 3,000 time series, without counting application metrics. As the number of K8s nodes and running containers increases, so too does the number of series, very quickly reaching the millions. Considering only 25% of K8s metrics are ever used, storing and analysing these huge volumes of unused data is a waste of time and resources.
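To make the arithmetic concrete, here is a rough back-of-the-envelope sketch in Python. The per-node range and the 25% usage figure come from the paragraph above; the cluster sizes and the function name are assumptions chosen purely for illustration, and application metrics are left out.

```python
# Back-of-the-envelope estimate of infrastructure series in a K8s cluster.
# The per-node range (2,000-3,000) and the 25% usage share come from the
# article; the node counts below are hypothetical.

SERIES_PER_NODE_LOW = 2_000
SERIES_PER_NODE_HIGH = 3_000
USED_SHARE = 0.25


def estimate_series(nodes: int) -> dict:
    """Return low/high series estimates and the unused portion of each."""
    low = nodes * SERIES_PER_NODE_LOW
    high = nodes * SERIES_PER_NODE_HIGH
    return {
        "low": low,
        "high": high,
        "unused_low": int(low * (1 - USED_SHARE)),
        "unused_high": int(high * (1 - USED_SHARE)),
    }


if __name__ == "__main__":
    for nodes in (10, 100, 1_000):
        est = estimate_series(nodes)
        print(
            f"{nodes:>5} nodes: {est['low']:,}-{est['high']:,} series, "
            f"of which {est['unused_low']:,}-{est['unused_high']:,} may never be used"
        )
```

Under these assumptions, a 1,000-node cluster already produces 2 to 3 million series before a single application metric is added.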

One solution to this problem could be a universal monitoring standard, but no such thing exists. Instead, several standards are often used within a single company, as each person prefers a specific model. This results in a chaotic collection of data with no structure, no uniformity and no compatibility - and even worse, it means you end up with even more data.

Drowning in data: the crux of the problem

Those in the industry are still struggling to solve the K8s multi-metric issue, to no avail. But while solutions continue to be sought to reduce the number of metrics created, there are changes businesses can make to reduce the amount of RAM and disk space needed to maintain high cardinality series - such as those generated when hundreds of thousands of people attempt to buy Glastonbury tickets.
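"High cardinality" here means a large number of unique label combinations for a metric, since each combination is stored as its own time series. The sketch below is a simplified illustration of that upper bound; the label names and counts are hypothetical and are not taken from any real ticketing system.

```python
from math import prod


def max_series(label_values: dict[str, int]) -> int:
    """Upper bound on series for one metric: product of distinct values per label."""
    return prod(label_values.values())


# Hypothetical labels on a single request-counter metric for a ticketing site.
bounded = {"endpoint": 20, "status_code": 5, "region": 4}

# Adding one per-user label multiplies the bound by the number of users.
unbounded = {**bounded, "user_id": 200_000}

print(max_series(bounded))    # 400
print(max_series(unbounded))  # 80000000
```

The point of the comparison is that a single unbounded label, such as a per-user identifier, dominates the memory and disk footprint far more than the number of metrics does.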

Looking at that example more closely, consider the amount of time series data created by a single person purchasing tickets. The user logs into the website, which requires a database check and confirmation; the platform admits a certain number of applicants based on system availability, verifies payment details with a bank and updates internal systems on ticket availability; attendee onboarding is then activated, including regular email newsletters that need to be sent at pre-agreed points ahead of the live event. The amount of time series data that needs to be collected, monitored, analysed and processed for this one purchase is huge - now consider hundreds of thousands of people attempting to do the same thing. Approaches like optimising data structures and using intelligent algorithms to compress data are effective ways for companies to reduce the energy - and therefore cost - required for data processing and storage.
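As one illustration of what compressing time series data can look like, the sketch below applies plain delta encoding to a run of timestamps. This is a generic, well-known technique picked for the example, not a description of how any particular monitoring product stores data, and the 15-second interval and starting timestamp are assumptions.

```python
from itertools import accumulate


def delta_encode(values: list[int]) -> list[int]:
    """Keep the first value, then store each value as the difference from its predecessor."""
    if not values:
        return []
    return [values[0]] + [b - a for a, b in zip(values, values[1:])]


def delta_decode(deltas: list[int]) -> list[int]:
    """Rebuild the original values with a running sum."""
    return list(accumulate(deltas))


# Hypothetical timestamps (seconds) collected every 15 seconds.
timestamps = [1680500000 + 15 * i for i in range(6)]

encoded = delta_encode(timestamps)
print(encoded)  # [1680500000, 15, 15, 15, 15, 15]

# The encoding is lossless: decoding returns the original series.
assert delta_decode(encoded) == timestamps
```

Because regularly collected samples turn into small, repetitive numbers, they need far fewer bits to store than the raw values, which is where the savings in disk and memory come from.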

Data is essential but expensive. When websites and applications are hit with peak traffic events, the volume of data and metrics created jumps considerably, often overloading a system and causing it to crash. However, it's not practical for a business to cover the cost of high bandwidth all the time when it likely won't be used. This is why it's so important to be able to scale up or down quickly and efficiently. If your current approach to data monitoring was not built with scalability in mind, you will not be able to maintain uptime during high-traffic times.
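Scaling up or down can be reduced to a simple proportional rule. The sketch below follows the ratio formula documented for the Kubernetes Horizontal Pod Autoscaler (desired replicas = ceil(current replicas x current metric / target metric)); the function name, the replica bounds and the sample numbers are assumptions, and details such as tolerances and stabilisation windows are deliberately left out.

```python
from math import ceil


def desired_replicas(
    current_replicas: int,
    current_metric: float,
    target_metric: float,
    min_replicas: int = 1,   # hypothetical lower bound
    max_replicas: int = 50,  # hypothetical upper bound
) -> int:
    """Proportional scaling: grow or shrink replicas by the metric-to-target ratio."""
    if target_metric <= 0:
        raise ValueError("target_metric must be positive")
    raw = ceil(current_replicas * current_metric / target_metric)
    # Clamp so a traffic spike or a quiet period cannot push the count out of range.
    return max(min_replicas, min(max_replicas, raw))


# Ticket sale opens: 900 requests/s per replica against a target of 300.
print(desired_replicas(4, 900.0, 300.0))   # 12

# Sale is over: 50 requests/s per replica against the same target.
print(desired_replicas(12, 50.0, 300.0))   # 2
```

The same rule handles both directions, which is the point: capacity follows demand during the rush and falls back afterwards, rather than being paid for all year.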


To learn more about the transformative nature of cloud native applications and open source software, join us at KubeCon + CloudNativeCon Europe 2023, hosted by the Cloud Native Computing Foundation, which takes place from April 18-21.


Roman Khavronenko, Co-Founder, VictoriaMetrics


Roman is a software engineer with experience in distributed systems, monitoring and high-performance services. The idea to create VictoriaMetrics took shape when Roman and Aliaksandr were working at the same company. Prior to joining the VictoriaMetrics team, Roman worked as an engineer at Cloudflare in London. He lives in Austria with his wife and son.

Published Monday, April 03, 2023 7:30 AM by David Marshall