Virtualization Technology News and Information
The Importance of Resilience in a Distributed World

[ This article is part of a series promoting FailoverConf -- a virtual event dedicated to resilience hosted by Gremlin on April 21. Join! ]

By Shahar Fogel, CEO of Rookout

Few jobs require adapting to change as much as being a software developer. In just the past 5 years, we've seen radical changes such as the rise of microservices, containers, and serverless environments. In parallel, we've seen the explosion of remote work and distributed teams across various time zones. While this opens the talent pool, and employees often appreciate the flexibility, we haven't talked enough about how it impacts how we build, maintain, and fix software.

I don't want to focus too much on the current global crisis, other than to say that even companies that have been resistant to remote work are now realizing the importance of working remotely, and doing so effectively. We live in a global world enabled by software, so it seems safe to say that this trend will continue to grow. But a major problem is that development workflows haven't been updated at the same pace -- we still deploy, maintain, and fix software very much the same way we did a decade ago. This hurts our ability to increase velocity and stay resilient to change.

From what I can see, there are many challenges particular to debugging software in this new paradigm. Working remotely, and hence debugging software remotely, is made difficult by everything from a lack of proper tooling and access to broken person-to-person communication that makes collaborating on software harder. From a developer's perspective, getting the full picture of a problem a customer is experiencing, for example, remains very difficult.

As developers try to keep pace with the ever-increasing complexity, speed, and scale of their software, they are realizing that they are the new bottleneck, still stuck working with the same development and debug tools from a simpler age. The skillset developers require to address these challenges is constantly growing, along with the time and energy that they need just to get started -- to understand the problem space and assess how the available tools can be optimally applied.

This often creates what we call the "engineering dilemma," where we must decide whether to fly slow or to fly blind: whether to take the time to get the data we need out of our systems, or to move forward with limited information. But in today's data-driven world, we should be making the best-informed decisions possible, and we should be able to make them quickly. In other words, we shouldn't have to sacrifice reliability to innovate quickly, or vice versa.

To accomplish this, we need to shed some of our old mindsets in favor of democratizing data and access. And we need to embrace tools and workflows that make it easier to extract this data and deliver it to the key stakeholders and decision makers. This process is very much broken today, which is why we believe strongly in our mission at Rookout. Join us at FailoverConf to hear how other modern tech companies are thinking about the importance of resilience! It's a free virtual event. Should be fun :)


About the Author

Shahar Fogel 

Shahar Fogel is the CEO of Rookout. He has spent the last two decades leading data-driven businesses, products, and R&D teams, from early-stage start-ups to government organizations. Shahar is passionate about software architecture and observability, and has worked as a cybersecurity team lead, product manager, and VC investor. He is a Cambridge University MBA alumnus.

Published Friday, April 17, 2020 7:32 AM by David Marshall