Observability, Data and the Road to Digital Resilience

Helen Yu 07/02/2024

Digital resilience is an essential capability in 2024.

Designing a walk-in closet, I thought, was going to be a fun, creative experience. I vividly remember my excitement as I uploaded my room dimensions to create the perfectly organized space. Little did I know that the virtual journey to my dream closet would turn into a series of unexpected obstacles.

Five in-person visits with a closet designer left me increasingly frustrated when their system kept freezing. Even adding a mirror door became a giant hurdle the system simply couldn't handle. We had to resort to ordering the remaining parts in a separate transaction, disrupting what should have been an exciting process.

If you've ever shopped online or faced system downtime during a transaction, my experience might resonate with you. Even five years ago, most of us would have shrugged it off. Today, shoppers expect more: a smooth, frictionless digital buying experience at every step of the sales journey, from first look to completed transaction. A bad experience costs more than the price of the sale; it shapes how a brand is viewed.

More than ever, digital resilience is a key success factor.

Buying is Personal

Author and coach Tony Robbins shared in a speech that people don’t buy products, they buy feelings. So when a customer feels frustrated or stressed, they can form strong opinions about a retailer. The big test, of course, comes during Cyber Week, a pivotal period for e-commerce, when high traffic tests the resilience of digital systems like no other time of year. According to the Splunk digital resilience study, Cyber Monday in 2023 witnessed global online consumer spending reaching $1.14 trillion, underscoring how critical reliable digital experiences are.

Digital resilience, then, is a way to dramatically reduce customer dissatisfaction. But how do you define the term? What does it mean? Is it measurable?

In a recent CXO Spice conversation, Mala Pillutla, GVP of Global Observability Strategy at Splunk, defined digital resilience as the ability to not only recover but thrive during disruptions, especially in scenarios like Black Friday. She underscored the importance of good observability practices for swift issue identification and resolution, ensuring reliability and exceptional customer experiences. Derek Dykens, global retail and hospitality advisor at Splunk, highlighted the impact of slow web performance: a mere 2-second delay leads to an 87% customer departure rate during high-traffic events, up from 40% in 2017.

Digital Resilience Impacts Top and Bottom-Line Growth

The study highlights the need for efficient troubleshooting, enhanced visibility into customer experience, and avoiding service performance degradation from new deployments. The consequences of poor digital experiences are severe, with downtime costs exceeding $150,000 per hour for nearly two-thirds of organizations.

In its study, Splunk shares seven e-commerce brands’ success stories. Embracing observability practices, which Splunk defines as the ability to measure a system’s internal states by examining its outputs, is the catalyst for digital resilience through high traffic events.

Let’s take a look:

  • RentTheRunway.com faced challenges with limited visibility across a complex microservices architecture, leading to outages. With Splunk, they achieved complete visibility, reducing Mean Time to Recovery (MTTR) by 94%, preventing unplanned downtime, and ultimately enhancing customer experiences.

  • DANA, one of the largest e-wallet providers in Indonesia, struggled with reactive and fragmented monitoring in a fast-paced digital payment environment. Implementing Splunk's centralized observability platform with full-fidelity tracing, they increased business resilience, performed proactive troubleshooting, and achieved faster incident resolution.

  • Rappi needed scalable observability tools to cope with the skyrocketing demand for its delivery services during the pandemic. Leveraging Splunk's observability tools, Rappi's IT team ensured a smooth purchase experience for 7.5 million weekly active users, detecting problems quickly and achieving a 90%+ faster Mean Time to Resolution (MTTR).

  • Lenovo faced unexpected web traffic spikes during cloud migration. They turned to Splunk Observability Cloud. The result: significantly less time spent troubleshooting, 100% uptime, and the capacity to handle a 300% increase in web traffic on Black Friday 2020.

  • Stripe's bold mission to increase the GDP of the internet faced challenges in ensuring high availability during peak times like Black Friday and Cyber Monday. With Splunk's security and observability platform, Stripe achieved 99.9998% availability during Cyber Week, supporting billions of daily payment processing transactions.

  • Tesco had to scale its online business during the COVID-19 pandemic, facing unprecedented demand for groceries and household essentials. Tesco partnered with Splunk and doubled online delivery slots, maintained secure and reliable systems, and ensured zero downtime during a 30% surge in online traffic over Christmas.

  • PUMA Group lacked insight into customer orders on its e-commerce websites, resulting in a poor customer experience and missed sales opportunities. Implementing Splunk, PUMA achieved real-time monitoring, quick investigation, and problem rectification, leading to a 15-minute average time to detect issues and enhanced revenue.
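The availability and MTTR percentages in the stories above are easier to appreciate as arithmetic. Here is a minimal Python sketch. The function names are my own, and the 60-minute baseline MTTR is a hypothetical figure chosen purely for illustration; it does not come from the Splunk study.

```python
SECONDS_PER_WEEK = 7 * 24 * 60 * 60  # 604,800 seconds


def allowed_downtime_seconds(availability_percent, period_seconds=SECONDS_PER_WEEK):
    """Seconds of downtime that still meet the availability target over the period."""
    return period_seconds * (1 - availability_percent / 100)


def reduced_mttr(baseline_minutes, reduction_percent):
    """MTTR after a percentage reduction from a baseline."""
    return baseline_minutes * (1 - reduction_percent / 100)


# 99.9998% availability over one week leaves roughly 1.2 seconds of downtime.
print(round(allowed_downtime_seconds(99.9998), 2))

# Hypothetical 60-minute baseline MTTR, reduced by 94%.
print(round(reduced_mttr(60, 94), 1))
```

Put that way, 99.9998% availability across a full week means barely more than a second of total downtime, and a 94% reduction would turn an assumed hour-long recovery into a few minutes.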

Unpacking Observability

In the context of digital systems and businesses, observability refers to the practice of monitoring and understanding the behavior of applications, infrastructure, and other components to ensure their reliability, performance, and security. Observability involves collecting and analyzing telemetry data, which includes metrics, logs, traces, and other relevant information.
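To make metrics, logs, and traces concrete, here is a minimal sketch using only the Python standard library. It is not Splunk's API or any vendor's SDK. The `checkout` function, the `span` helper, and the metric name are hypothetical, and a real system would send this data to an observability platform instead of keeping it in memory.

```python
import json
import logging
import time
import uuid
from collections import Counter
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("checkout")

metrics = Counter()  # metrics: running counts of things that happened
spans = []           # traces: timed operations that share one trace_id


@contextmanager
def span(name, trace_id):
    """Record how long a named step took, tagged with its trace_id."""
    start = time.perf_counter()
    try:
        yield
    finally:
        duration_ms = (time.perf_counter() - start) * 1000
        spans.append({"trace_id": trace_id, "name": name, "duration_ms": duration_ms})


def checkout(cart_items):
    """Hypothetical checkout that emits all three kinds of telemetry."""
    trace_id = uuid.uuid4().hex
    metrics["checkout.requests"] += 1  # metric
    with span("checkout", trace_id):  # trace: outer step
        with span("price_cart", trace_id):  # trace: inner step
            total = sum(cart_items.values())
        # log: a structured event that can be joined to the trace by trace_id
        log.info(json.dumps({"event": "checkout_completed",
                             "trace_id": trace_id, "total": total}))
    return trace_id, total


trace_id, total = checkout({"mirror_door": 120.0, "shelf": 45.5})
print(total, metrics["checkout.requests"], len(spans))
```

The shared `trace_id` is the design choice that matters here: it is what lets a team move from a log line to the timing of every step in that same customer request.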

My conversations with Mala and Derek on CXO Spice highlighted the role of observability as a catalyst for digital resilience, particularly in the context of retail and customer-centric businesses.

The Business Case for Observability

Observability serves as a cornerstone for organizations, offering unified visibility into their digital infrastructure. This holistic perspective enables continuous monitoring of diverse aspects, empowering companies to detect and address issues that might affect the user experience, thereby fostering customer loyalty.

Effective observability is crucial during high-traffic events such as Cyber Monday or Super Bowl Sunday. Early issue detection and the ability to predict potential problems allow organizations to proactively address challenges. For example, a retailer with robust observability practices can anticipate and resolve issues swiftly, ensuring system reliability even during unexpected surges in demand.
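Early issue detection can start very simply. The sketch below flags a page-load time that is far above the recent baseline. This is one basic approach (a rolling mean plus a multiple of the standard deviation), not the method used by any particular product, and the class name, window size, and threshold are assumptions for illustration.

```python
from collections import deque
from statistics import mean, stdev


class LatencyWatch:
    """Flag latency readings that sit far above the recent baseline."""

    def __init__(self, window=30, threshold=3.0, min_samples=5):
        self.samples = deque(maxlen=window)  # most recent readings only
        self.threshold = threshold           # how many deviations count as unusual
        self.min_samples = min_samples       # readings needed before alerting

    def observe(self, latency_ms):
        """Return True when latency_ms is unusually high for the current window."""
        alert = False
        if len(self.samples) >= self.min_samples:
            baseline = mean(self.samples)
            spread = stdev(self.samples)
            # Use a floor of 1 ms so a perfectly flat window does not alert on noise.
            alert = latency_ms > baseline + self.threshold * max(spread, 1.0)
        # The reading joins the window either way, so a long incident will
        # gradually raise the baseline. A production detector would handle that.
        self.samples.append(latency_ms)
        return alert


watch = LatencyWatch()
readings_ms = [200, 210, 195, 205, 198, 202, 2200]  # the last page load is 2 seconds slower
print([watch.observe(r) for r in readings_ms])
```

Only the final reading should raise an alert, since it is the only one well outside the range the earlier page loads established.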

Observability is not only about reactive problem-solving; it plays a pivotal role in supporting innovation. By providing insights that inform decision-making, observability enables teams to innovate faster with confidence. This is especially true when observability strategies encompass both security and innovation, contributing to the overall digital resilience of the system.

Good observability practices lead to faster incident resolution. Teams can analyze consolidated telemetry data to identify root causes swiftly, minimizing downtime and disruption to business operations. Adopting a cohesive observability strategy across development, infrastructure, and security teams equips organizations to respond effectively to incidents. Because the telemetry data is consolidated, everyone examines the same information, which speeds recovery from outages.
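If you want to track whether incident resolution is actually getting faster, the measurement itself is straightforward. Here is a minimal sketch in Python. The `Incident` record and its field names are hypothetical, the sample incidents are made up, and I am assuming time to resolve is measured from when the problem started. Some teams measure it from detection instead.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Incident:
    started: datetime   # when the problem began
    detected: datetime  # when the team noticed it
    resolved: datetime  # when service was restored


def mean_duration(durations):
    """Average a collection of timedelta values."""
    durations = list(durations)
    if not durations:
        raise ValueError("need at least one incident")
    return sum(durations, timedelta()) / len(durations)


def mean_time_to_detect(incidents):
    return mean_duration(i.detected - i.started for i in incidents)


def mean_time_to_resolve(incidents):
    return mean_duration(i.resolved - i.started for i in incidents)


# Two made-up incidents for illustration only.
first = datetime(2024, 1, 8, 9, 0)
second = datetime(2024, 1, 15, 14, 0)
incidents = [
    Incident(first, first + timedelta(minutes=10), first + timedelta(minutes=50)),
    Incident(second, second + timedelta(minutes=20), second + timedelta(minutes=70)),
]

print(mean_time_to_detect(incidents))   # average of 10 and 20 minutes
print(mean_time_to_resolve(incidents))  # average of 50 and 70 minutes
```

Tracking detection and resolution separately shows where the time goes: a long gap before detection points to a visibility problem, while a long gap after it points to troubleshooting.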

Where To Start?

Digital resilience is measured by an organization's ability to adapt, respond, and recover from disruptions or unexpected events in the digital landscape. It's about ensuring that digital systems remain robust and can continue to deliver a positive customer experience even in the face of challenges, such as sudden spikes in traffic, system failures, or cyber threats.

Observability practices are essential to digital resilience. Your next step? Start with a specific technology, service, or team open to innovation. Beginning with a focused approach allows for a gradual integration of observability tools and practices, addressing cultural challenges and promoting a smoother transition.

In a world where digital experiences define customer satisfaction, observability is not just a tool; it's the linchpin for digital resilience. In a time of great change and uncertainty, digital resilience is an intentional strategy to endure by way of delivering a more frictionless, more positive customer experience.

Helen Yu

Innovation Expert

Helen Yu is a Global Top 20 thought leader in 10 categories, including digital transformation, artificial intelligence, cloud computing, cybersecurity, internet of things and marketing. She is a Board Director, Fortune 500 Advisor, WSJ Best Selling & Award Winning Author, Keynote Speaker, Top 50 Women in Tech and IBM Top 10 Global Thought Leader in Digital Transformation. She is also the Founder & CEO of Tigon Advisory, a CXO-as-a-Service growth accelerator, which multiplies growth opportunities from startups to large enterprises. Helen collaborated with prestigious organizations including Intel, VMware, Salesforce, Cisco, Qualcomm, AT&T, IBM, Microsoft and Vodafone. She is also the author of Ascend Your Start-Up.
