Virtualization Technology News and Information
VMblog Expert Interview: Nobl9 Customer Experience Survey Highlights Service Level Objectives as Key to Ecommerce Success


E-commerce is changing the way consumers shop, giving them millions of options to purchase goods online. This creates intense competition for sellers, who must invest in solutions to deliver exceptional customer experiences. Recently, Nobl9 conducted a digital customer experience survey to understand how consumers interact with the applications they use. For e-commerce organizations, the value of implementing Service Level Objectives (SLOs) for reliability management stands out as an innovative technique to drive customer loyalty by providing dependable shopping experiences. I sat down with Dan Ruby, VP of Marketing at Nobl9, to understand the goals of the survey and what the results tell us.

VMblog: Let's start with the premise of the survey. Why did you feel a customer experience survey was important for Nobl9, and what kinds of information did you hope to reveal? 

Dan Ruby: As a provider of SLO-based reliability solutions, the Nobl9 team is always thinking about customer experience - for both our own customers, and our customers' customers who expect high-quality service. Even our website describes Nobl9's offering as, "Customer-Facing Reliability, Powered by Service-Level Objectives." Effectively, in this survey we wanted to know how reliable the average person finds the sites and apps they interact with on a daily basis, and how that reliability or lack thereof impacts their views of the app, willingness to engage with the company, and overall purchasing behavior.

We wanted to know, for example, how many issues with a single application a user would tolerate before switching to an alternative. We were also curious what customers tend to do when they experience a glitch in an app - would they immediately delete it, simply try again later, or maybe leave a negative review? Not only is this great data for Nobl9 to have, it also reveals how crucial it is for all application providers to focus on delivering seamless experiences, because our survey results prove that customers really do care how reliable their tools and services are. The hope is that this survey can help us communicate better with our customers about their users' behavior to show them why SLOs remain critical to business outcomes.

VMblog: Can you explain the link between reliability and customer experience?

Ruby: Every application, every company, every product owner should be thinking of reliability as customer experience. You can pack all the flash and features you want into a digital experience, but at the end of the day, the experience part is driven as much by whether you can actually do something with an app as by what the app says you can do. Everyone has had bad experiences with an app - slow to load, crashes a lot, loses data, rejects their perfectly correct password, and so on. You downloaded the app because the sizzle of all the things it can do piqued your interest, and then you deleted the app because trying to do all of those things was actually quite difficult.

I think organizations understand this perfectly well - per CIO's 2024 State of the CIO report, 22% of CEOs view "Improve the customer experience" as one of the top 3 IT priorities, along with improving IT and business collaboration. But improving the customer experience requires understanding it, particularly from a reliability perspective. Keeping things siloed and putting a requirement of nines of uptime doesn't give you the insights you need to really make for a great experience.

In terms of customer satisfaction, key metrics like a company's Net Promoter Score (NPS) are tied to reliability. So, too, are more bottom-line business outcomes. Whether it's latency, availability, or microservice availability, any situation where there's a noticeable decrease in the customer's reliability experience is simply another reason for them not to complete whatever action they wanted to complete. This leads to higher rates of cart abandonment and user churn, which in turn drive down metrics like lifetime customer value.

One really tangible example of this is the impact of page load times or latency on conversion rates. We know that even a one-second delay loading a page can lead to 7% fewer conversions, and that means revenue takes a hit. For a site generating $100,000 per day, this 7% reduction translates to roughly $2.5 million in lost sales annually.
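The arithmetic behind that figure can be sketched in a few lines. This is only an illustration built from the numbers quoted above; the function name, the 365-day year, and the assumption that revenue falls in direct proportion to conversions are ours, not part of the survey or the cited statistic.

```python
def annual_revenue_loss(daily_revenue: float, conversion_drop: float,
                        days_per_year: int = 365) -> float:
    """Estimate yearly revenue lost when conversions fall by a fixed fraction.

    Assumes revenue scales linearly with conversions (a simplification).
    """
    return daily_revenue * conversion_drop * days_per_year


# Figures from the interview: $100,000 per day and a 7% drop in conversions.
loss = annual_revenue_loss(daily_revenue=100_000, conversion_drop=0.07)
print(f"Estimated annual loss: ${loss:,.0f}")
```

Under these assumptions the estimate lands at a little over $2.5 million a year, in line with the figure Ruby cites.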

With SLOs, you can even go beyond this. Running SLOs on ratio metrics like conversion rate, cart abandonment rate, etc. can help you identify when something is causing your business outcomes to suffer. Digging into your product's SLOs when this happens can help you identify what your SLIs should be - you may find that you're being too stringent and wasting money on targets that don't actually negatively impact customer experience. You should also be able to identify what the break point of sorts is where a metric does have that impact.

VMblog: What were some of the significant results of your survey?

Ruby: We conducted this survey in April 2024 and received over 300 responses, giving us confidence that this data is statistically meaningful. The first key point is that customers are experiencing quite a few issues with their applications. In fact, only 12% of people did not have an issue in their daily use of apps in the past year. We found that almost 60% of respondents experienced slow load times over the past year, and worse, 60% also had an app crash completely. At the same time, over 40% experienced an app forcibly logging them out. These three issues are also the ones customers found most frustrating, with app crashes in the top spot, slow load times second, and forced log-outs in third place.

More interesting is how these customers respond. Nobl9's survey found that when experiencing an issue, most people would unsurprisingly close the app and try again later. This is not necessarily a good thing for a company seeking to retain users - but the good news is that almost 40% of people would either try the app from a different device, or check the status page to see if the application as a whole was down. But one thing that struck me is how few users actually took an action that the developers or product owners could see - 10% reached out to the company, and 11% checked social media to see if the app was down and potentially complained publicly. Assuming the company is monitoring all social mentions, that leaves 79% of frustrating, user-impacting app issues that are just never seen internally. How can a dev or IT team be expected to fix issues that are invisible to traditional monitoring and aren't brought to their attention by the users?

At a higher level, survey data shows that 40% of users are unlikely or very unlikely to continue working with a company whose applications do not work properly, so ensuring consistent performance is critical. The result I think teams will be most interested in is how many issues their customers will tolerate before giving up on the product altogether. Our survey found that over 70% of users would abandon an app completely after just 1-5 issues - 6% after just one issue.

This suggests that businesses have very slim margins of error if they want customers to continue using their products. But context here is important; our survey revealed that major outages are not the main cause of concern for bottom lines. Actually, 53% of respondents would feel less frustrated about experiencing reliability issues if they knew the application had a major outage. In that case, at least they know more or less why their experience was subpar, and that it's not just them feeling singled out by issues.

This tells us that ongoing, smaller reliability incidents are the bigger source of customer churn. These hiccups in performance can drive customers away, often without the business's knowledge; another key result from the survey is that customers are very hesitant to give feedback. All respondents indicated they were unlikely to leave reviews on apps whether they liked them or not.

VMblog: As a provider of SLO solutions at Nobl9, what does this data tell you about reliability practices?

Ruby: The main takeaway for us is that businesses need to move beyond traditional reliability strategies; modernizing your approach with SLO-based reliability is a necessity, not a nice-to-have. IT departments have of course recognized the link between reliability and user retention for a long time, turning to traditional reliability practices like improving Mean Time To Recovery (MTTR) as a primary KPI, prioritizing the number of nines of uptime their applications have, and reducing the number of catastrophic outages. But MTTR is reactive, not proactive, and nines of uptime has its own issues - most of these day-to-day reliability issues don't necessarily constitute an outage. You can have all the nines of uptime, but if micro-outages are affecting your customers' experience, they don't matter. I doubt many people would hear "Hey, I know you're unhappy with the performance of our app, but look, we have five nines of availability!" and suddenly change their outlook on the app's performance.

Unlike traditional techniques, SLOs provide you with a system to monitor errors, not outages. You can strategically dial in your acceptable error budget and get alerts when your Service Level Indicators (SLIs) are burning that error budget more quickly than expected.
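To make the idea of "burning" an error budget concrete, here is a minimal sketch. It does not use Nobl9's API; the `SLO` class, the `burn_rate` and `should_alert` helpers, the sample request counts, and the alert threshold of 2.0 are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass


@dataclass
class SLO:
    name: str
    target: float  # e.g. 0.99 means 99% of requests should be "good"

    @property
    def error_budget(self) -> float:
        """Fraction of requests allowed to fail while still meeting the target."""
        return 1.0 - self.target


def burn_rate(slo: SLO, good: int, total: int) -> float:
    """Ratio of the observed error rate to the budgeted error rate.

    1.0 means errors arrive exactly at the budgeted pace; values above 1.0
    mean the budget will run out before the SLO window ends.
    """
    if total == 0:
        return 0.0
    observed_error_rate = (total - good) / total
    return observed_error_rate / slo.error_budget


def should_alert(slo: SLO, good: int, total: int, threshold: float = 2.0) -> bool:
    """Alert when the budget burns faster than the (assumed) threshold."""
    return burn_rate(slo, good, total) > threshold


# Hypothetical checkout SLO: 99% of requests succeed.
checkout = SLO(name="checkout-availability", target=0.99)

# 300 failures out of 10,000 requests is a 3% error rate against a 1% budget.
fast_burn = burn_rate(checkout, good=9_700, total=10_000)
# 50 failures out of 10,000 requests is a 0.5% error rate, well inside the budget.
slow_burn = burn_rate(checkout, good=9_950, total=10_000)

print(f"fast burn: {fast_burn:.1f}x, alert={should_alert(checkout, 9_700, 10_000)}")
print(f"slow burn: {slow_burn:.1f}x, alert={should_alert(checkout, 9_950, 10_000)}")
```

Note that neither scenario is an outage in the up/down sense; the alert fires only because errors are consuming the budget faster than planned.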

This has a couple of benefits. For one, your definition of "reliability" goes from a binary up/down metric to one that is actually indicative of the day-in day-out customer experience. And two, by setting your SLIs strategically based on the customer impact of errors, you can dial in your IT spend by focusing on KPIs that impact customer outcomes.

In an ecommerce system, for example, the app may be running, but the checkout process is slow to load and customers are abandoning carts at a high rate. With an SLO tracking the checkout process's latency, app and product owners can very quickly recognize a key issue that hurts the customer experience. On the other hand, maybe your app's login authentication service is throwing errors. But you know that the service is set up to retry authentication automatically, taking a matter of milliseconds. This means that a single error doesn't actually meaningfully impact the customer experience, so you can set your SLO to allow for a bigger error budget here, saving you some IT spend by focusing on the customer's perspective.
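The contrast between the two services can be sketched as follows. Everything here is hypothetical: the sample data, the 1,000 ms latency threshold, and the 95% and 90% targets are made up to illustrate a strict SLO on checkout next to a looser one on login, and are not recommendations from Nobl9.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def good_ratio(events: Iterable[T], is_good: Callable[[T], bool]) -> float:
    """Share of events that count as 'good' for a Service Level Indicator."""
    events = list(events)
    if not events:
        return 1.0
    return sum(1 for e in events if is_good(e)) / len(events)


# Hypothetical checkout page load times in milliseconds.
checkout_latencies_ms = [180, 220, 950, 240, 1900, 210, 230, 2600, 200, 250]
# Hypothetical login attempts: True = succeeded first time, False = needed an automatic retry.
login_first_try_ok = [True] * 94 + [False] * 6

# Strict target where customers feel every slow request, looser target
# where automatic retries hide individual errors from the customer.
CHECKOUT_TARGET = 0.95  # 95% of checkouts load in under 1000 ms
LOGIN_TARGET = 0.90     # 90% of logins succeed on the first attempt

checkout_sli = good_ratio(checkout_latencies_ms, lambda ms: ms < 1000)
login_sli = good_ratio(login_first_try_ok, lambda ok: ok)

checkout_met = checkout_sli >= CHECKOUT_TARGET
login_met = login_sli >= LOGIN_TARGET

print(f"checkout: {checkout_sli:.0%} good, SLO met: {checkout_met}")
print(f"login:    {login_sli:.0%} good, SLO met: {login_met}")
```

In this made-up data the app is "up" the whole time, yet the checkout SLO is missed while the login SLO is comfortably met despite its errors, which is the distinction the example above is drawing.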

Organizations realize already that the day-in day-out customer experience is key to making their users happy and retaining them as customers. And there are a lot of tools out there that pull data around customer experience. SLOs are the last step of modernizing a reliability strategy - in the case of Nobl9, getting them set up becomes almost trivial. Connect the existing data sources, run some analysis on historical data to help inform your SLI/SLO parameters, group infrastructure and services into a project, and start making reliability holistic and customer-centric. SLOs don't need to be new instrumentation; the data is already there, but without viewing it through the SLO lens, it's just fragmented, siloed raw data that is far from actionable.

VMblog: It seems that every organization could benefit from stronger reliability management. Tell us why this is particularly important for e-commerce companies.

Ruby: Ecommerce companies are facing immense competitive pressures that companies in many smaller industries are not.

There were almost 27 million global ecommerce sites as of 2023 - nearly 14 million in the United States alone. Amazon and other massive sellers stand out, but even extremely specialized sellers are up against stiff competition given just how many options are out there. It really has never been more important for ecommerce companies to focus on site reliability in order to gain new customers and keep existing ones.

There aren't a ton of levers ecommerce companies can pull to differentiate themselves. You can compete on price, within the constraints of manufacturers' pricing guidelines. You can try to differentiate on shipping speed and costs, or product availability and variety. But to a certain extent, these are often kind of top-of-funnel elements. Someone may launch your app or go to your site because a Google product search shows your listing and price. Once they get there, reliability either becomes a blocker to them completing a purchase, or they never notice it because everything goes smoothly, and they buy from you.

An SLO approach is particularly powerful for ecommerce companies because their applications are made up of a large collection of services that must all work properly for the overarching app to work. Even a simple app hosted on a public cloud platform might include Kubernetes clusters to automate scaling; external services such as CAPTCHA for logins, a payment gateway, and a CDN to host images and videos; and internal microservices like an authentication server, a shopping cart, and a search feature.

Traditional reliability practices tend to silo these elements, with little insight into their mutual impact. One endpoint monitoring tool might look at servers, while another tool pulls infrastructure data, another monitors containers, and so on. Making strategic reliability decisions with this toolset means that every part of the application is necessarily held to the same or similar standards. Reducing outages is important for ecommerce companies, but traditional reliability fails to account for the nuanced performance of an app on a day-to-day basis.

Consider that failures in microservices supporting an ecommerce site's various functions, like the all-important shopping cart, are what generate incomplete transactions and dissatisfied consumers. Long load times are especially problematic for ecommerce sites that depend on ushering consumers from search through checkout without a hitch. Just a few seconds of delay may cause users to abandon shopping carts; the average ecommerce site loses half of its visitors if pages take longer than 3 seconds to load. Frequent app crashes also frustrate users, as our survey shows, and they become less engaged or look elsewhere if an app repeatedly goes down.

VMblog: Is there anything else our readers should know? How will this survey inform Nobl9's strategy moving forward?

Ruby: This survey, and all of our customer data-gathering, is directly shaping the future of the Nobl9 platform. We just released SLO Details 2.0, which is a significant upgrade to our primary user interface. All of the new features were driven by user feedback, like the new Overview section that lets users focus on the most important metrics for their Primary Objective. Other new updates are coming soon that will help customer-facing organizations feed even more types of metrics into their SLOs, from disparate sources, so insights from anything touching the customer can be leveraged from within Nobl9.

We will continue to center customer feedback in all decision-making and will likely conduct more surveys like this in the future to keep current data on hand as user needs evolve. I would encourage everyone to join our frequent webinars and workshops, which are great resources to improve your reliability management. And lastly, because customer feedback is important even with SLOs, we have a running "Suggestion Box" for folks to drop comments, concerns, or anything they'd like to share.

##

About Dan Ruby

Daniel Ruby is the VP of Marketing at Nobl9. Ruby is a dynamic marketing executive with a focus on B2B marketing, and has significant experience building teams and driving successful, data-driven programs for a range of startups and mid-sized organizations. As the Director of Online Marketing for Localytics, Ruby was the first marketing hire and scaled his team to a full-fledged marketing department with domain specialists focused on mobile apps. Ruby also has a background in journalism and spent several years guest lecturing marketing courses at Bentley University, bringing this dynamic skill set to his current role at Nobl9. Ruby holds a BA in Broadcast Journalism from University of Missouri-Columbia and an MBA in International Business from Brandeis University.

Published Thursday, June 13, 2024 7:29 AM by David Marshall