Virtualization Technology News and Information
SLOconf 2023 Q&A: Honeycomb Showcases Its Leading Observability Platform for High-performing Engineering Teams


SLOconf is the only event dedicated to the practice and application of Service Level Objectives (SLOs). Taking place May 15-18, SLOconf 2023 is a virtual event now in its third year. The agenda will include more than 70 speakers with presentations laser-focused on all aspects of SLOs. 

In this exclusive pre-show Q&A, we're speaking with Alayshia Knighten, Ecosystems and Partnerships Engineer at Honeycomb, a leading observability platform used by high-performing engineering teams to investigate the behavior of cloud applications.

VMblog:  To kick things off, give VMblog readers a quick overview of the company. 

Alayshia Knighten:  Honeycomb provides observability for high-performing engineering teams so they can quickly understand what their code does in the hands of real users in unpredictable and highly complex cloud environments. As an observability platform, Honeycomb is vastly different from traditional Application Performance Monitoring (APM) tools because it has the ability to sift through billions of rows of complete telemetry data grouped by any arbitrary number of dimensions. This enables faster debugging, higher uptime, better-performing services, more time for innovation, and ultimately, happier developers and end users.

VMblog:  What made you sponsor SLOconf 2023? Is this a must-sponsor event for your company?     

Knighten:  Honeycomb SLOs make it possible to trigger alerts on issues that matter most to your business so you can quickly debug them. They answer important questions like, "How much monthly downtime is tolerable? What performance impact is acceptable before users are negatively impacted? Should we focus on new features or tech debt?" Our actionable SLOs help teams define, measure, validate, and adjust engineering priorities collaboratively. SLO error budgets give teams the leeway needed to prioritize or de-prioritize production issues.
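To make the "how much monthly downtime is tolerable?" question concrete, here is a minimal sketch of error-budget arithmetic. It is an illustration only, not Honeycomb's implementation: the function names, the 30-day window, and the example numbers are all assumptions.

```python
def allowed_downtime_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of full downtime an availability target tolerates in the window.

    slo_target is a fraction such as 0.999 (hypothetical example value).
    """
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(total_events: int, failed_events: int, slo_target: float) -> float:
    """Fraction of the error budget still unspent (negative once overspent)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if total_events == 0:
        return 1.0  # no traffic yet, so nothing has been spent
    allowed_failures = (1.0 - slo_target) * total_events
    return 1.0 - failed_events / allowed_failures


if __name__ == "__main__":
    print(allowed_downtime_minutes(0.999))
    print(budget_remaining(1_000_000, 250, 0.999))
```

Under these assumptions, a 99.9% target over 30 days leaves roughly 43.2 minutes of tolerable downtime, and 250 failures out of a million requests would spend about a quarter of the budget. A team with most of its budget left has room to ship features; a team that has overspent has a clear signal to prioritize reliability work.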

Honeycomb's focus on understanding individual customer experiences is especially highlighted in our approach to SLOs. Most tools use time series data to measure availability but are limited to aggregating all customers into one measure: for the second that just occurred, was the system "good" or was it "bad?" There may be hundreds or thousands of customer experiences buried within those aggregate time series measures that you just can't see or respond to.

Honeycomb's event-based approach means that every individual service request is evaluated against the service-level indicator (SLI) criteria you define. If even one request failed, while thousands of other simultaneous requests succeeded, you'll know about it. 
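The difference between the two approaches can be sketched in a few lines of Python. This is a simplified illustration, not Honeycomb's actual SLI engine: the `RequestEvent` fields, the 500 ms latency threshold, and the "a second is good if most of its requests were good" rule are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class RequestEvent:
    second: int         # which one-second bucket the request landed in
    duration_ms: float  # how long the request took
    status: int         # HTTP-style status code


def sli_good(event: RequestEvent, max_ms: float = 500.0) -> bool:
    """Hypothetical SLI: the request did not error and was fast enough."""
    return event.status < 500 and event.duration_ms <= max_ms


def event_compliance(events: list[RequestEvent]) -> float:
    """Event-based view: every request is judged against the SLI on its own."""
    return sum(sli_good(e) for e in events) / len(events)


def bucket_compliance(events: list[RequestEvent], threshold: float = 0.5) -> float:
    """Aggregate view: each second is marked good or bad as a whole."""
    buckets: dict[int, list[bool]] = {}
    for e in events:
        buckets.setdefault(e.second, []).append(sli_good(e))
    good_seconds = sum(
        1 for results in buckets.values()
        if sum(results) / len(results) >= threshold
    )
    return good_seconds / len(buckets)


def failed_events(events: list[RequestEvent]) -> list[RequestEvent]:
    """The individual requests that missed the SLI, available for debugging."""
    return [e for e in events if not sli_good(e)]


if __name__ == "__main__":
    # 999 healthy requests and one failure, all within the same second.
    events = [RequestEvent(0, 120.0, 200) for _ in range(999)]
    events.append(RequestEvent(0, 130.0, 503))
    print(event_compliance(events), bucket_compliance(events), failed_events(events))
```

With 999 healthy requests and one failure in the same second, the aggregate view reports that second as entirely good, while the event-based view records the single failed request and keeps it available for investigation.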

VMblog:  What is your message to attendees of the show? 

Knighten:  Being on call doesn't have to take over your personal life. You shouldn't need to sleep next to your pager or be woken up by constant alerts. And avoiding these negative experiences is so simple: implement SLOs that matter to your business. The freedom to explore and receive alerts on things you actually care about is important. 

It's also essential to align SLOs with overall business objectives so that all stakeholders understand what the engineering priorities are (and why!) and how engineering affects overall business goals.

VMblog:  What market needs or problems is your company solving for these attendees? 

Knighten:  As organizations face turbulent economic headwinds, engineering teams are expected to do more with less. This unprecedented pressure to innovate and release new features faster is compounded by stringent end-user expectations and increasingly complex tech stacks. As a result, modern engineering teams can no longer rely on "good enough" legacy APM tools that were built for predictable, monolithic systems and aren't architected for today's complex and unpredictable distributed cloud environments. Honeycomb is the only observability platform to entirely sidestep the data correlation problem across logs, metrics, and traces, by uniquely architecting its datastore to be datatype agnostic.

VMblog:  What sets you apart from the competition? 

Knighten:  Today, how code is written often differs from reality. End users have varied environments and dynamic software use cases, creating unpredictable bugs and anomalies. Honeycomb's ability to quickly analyze high-cardinality data is crucial to discovering novel problems. Honeycomb gives engineering teams the power to detect patterns in seconds across billions of data points representing how users are experiencing their code in real-time. It never aggregates or discards data.

We were the first observability platform to launch fully executing Natural Language Querying using generative AI for our new capability, Query Assistant. Query Assistant enables developers at all levels to ask questions in plain English instead of a query language. Generative AI then builds a relevant, modifiable query, eliminating the prerequisite for advanced knowledge of query-based languages like SQL.

Our CTO Charity Majors often says that the best developer tools are the ones that get out of your way and become invisible. Observability shouldn't require engineers to master complicated tools or languages that force you to constantly switch context and piece together clues to get answers. The only thing observability tools should encourage you to focus on is your own curiosity about what's happening in your system. This is where Honeycomb truly stands out from the pack. 

VMblog:  What are the trends your company is seeing that we should be aware of in 2023 and beyond? 

Knighten:  Developers at legacy organizations are seeing the benefits that a modern approach to observability offers them beyond the logs and metrics that they've had to use previously. We're particularly excited about eBPF and the opportunities it brings to support out-of-the-box auto-instrumentation while still providing the rich context and flexibility users expect of observability tools. 

OpenTelemetry is rapidly gaining traction as the preferred way to instrument applications for observability. Interest from the developer community in creating an open standard for telemetry has been on the rise for quite some time, and in 2023, we see the possibility that OpenTelemetry will surpass Kubernetes as the fastest-developing and most important CNCF project. Vendors who continue to push their own bespoke and proprietary instrumentation libraries and agents as the default way to use their products will soon find themselves on the wrong side of what consumers are demanding. In 2023, using OpenTelemetry to instrument your applications for observability, regardless of the tools you're using, will become the de facto standard.

VMblog:  Does your company have a speaking track at the event? If so, can you tell us about the session so people can get them on their schedules?

Knighten:  I'll be speaking about SLI negotiation tactics for engineers. As engineers, we have our own Survival Level Indicators (SLIs) that measure and define what happens when, for example, the rockstar engineer who performs essential tasks A and B hasn't taken a vacation in nine months. Over time, not meeting SLIs can take its toll on engineers. How do we give ourselves the opportunity for grace and the ability to ask, "Does this fit me as a person?"

In this session, I will review different strategies to identify human burnout versus company personal objectives. I will also talk about how we can improve ourselves and survive in the high-risk climate in tech. The talk will give engineers and managers the courage to care for themselves and their teams. Sometimes, it's hard to tell whether we, or our friends, are okay. In this discussion, we will review how to identify "Houston, we have a problem" moments, ways to address our problems, and overall strategies for strengthening who we are.

My colleague Jessica Kerr is also presenting how our use of SLOs has evolved over time here at Honeycomb. 

SLOs are part of our product, so we've cared about them for a long time, and we put a lot of conscious effort into how we use them (especially Fred Hebert, Staff SRE, who is co-author and possibly co-presenter). As we change our internal best practices, we regularly re-evaluate our SLOs, trading off alert fatigue against customer experience. We also know how our customers use SLOs, so we know that other companies could benefit from the kind of thought Honeycomb's SRE team puts into this.


Published Wednesday, May 10, 2023 12:26 PM by David Marshall