The Rise of AIOps: How Data, Machine Learning, and AI Will Transform Performance Monitoring

December 17, 2018
 

AppDynamics surveyed 6,000 global IT leaders about application performance monitoring and AIOps. Read on to discover the trends shaping the space.


Over the last decade, application environments have exploded in complexity.

Gone are the days of managing monoliths. Today’s IT professionals are tasked with ensuring the performance and reliability of distributed systems across virtualized and multi-cloud environments. And while it may be true that the emergence of this modern application environment has provided the speed and flexibility professionals demand, these numerous services have unleashed a deluge of data on the enterprise IT environment.

Application performance monitoring (APM) solutions have proven essential in helping leaders take back control by providing the real-time insights needed to take action. But as the volume of data in IT ecosystems increases, many professionals are finding it challenging to take a proactive approach to managing it all. While automating tasks has helped teams free up some bandwidth for operations and planning, automation alone is no match for today’s increasingly complex environments. What’s needed is a strategy focused on reducing the burden of mounting IT operations responsibilities, and surfacing the insights that matter the most so that businesses can take the right action.

So, what are forward-thinking IT professionals doing to stay ahead of the curve?

Many are applying what’s being called an AIOps approach to the challenge of application environment complexity. This approach leverages advances in machine learning and artificial intelligence (AI) to proactively solve problems that arise in the application environment. Even though relatively new, the approach is gaining momentum and has been the focus of several Gartner AIOps reports. And for good reason: Using AI to identify potential challenges within the application environment doesn’t just help IT professionals get ahead of problems — it helps companies avoid revenue-impacting outages that jeopardize the customer experience, the business, and the brand.

In order to fully understand the rise of AIOps and why it has developed the momentum it has, we wanted to dig deeper to uncover the actual challenges faced by IT professionals, and how they’re managing them in an increasingly complex application environment. To accomplish that, AppDynamics undertook a study of 6,000 global IT leaders in Australia, Canada, France, Germany, the United Kingdom, and the United States. Their responses answered three key questions about the shift in the performance space:

(1) What’s the current enterprise approach to managing increasing application environment complexity?

(2) How are global IT leaders taking a proactive approach to identifying problems in the application environment?

(3) How broadly is AI identified as a potential solution to reducing complexity in IT ecosystems?

Let’s see what the research revealed.

The Demand for Proactive Application Performance Monitoring Tools

Today, midsize to large companies use an average of eight different cloud providers for various enterprise applications and services. As a result, IT professionals are managing an ever-increasing set of tasks that have the potential to become disconnected if not managed properly. What’s more, within these highly distributed systems, IT leaders must grapple with the impact of new code being deployed, as well as the virtually infinite potential outcomes associated with doing so. Without a unified view of how all of these elements interact, there’s significant potential for issues to arise that impact performance and, ultimately, the customer experience.

New research from AppDynamics underscores the cause for concern: 48% of enterprises surveyed say they’re releasing new features or code at least monthly, but their current approach to monitoring only provides a siloed view of the quality and impact of each release. In fact, of those enterprises that release on that cadence, a massive 91% say that monitoring tools only provide data on how each release drives the performance of their own area of responsibility.

Research from AppDynamics indicates performance monitoring remains siloed.

Should these findings raise eyebrows? Absolutely.

That’s because they indicate that for the vast majority of those surveyed, a holistic view of business and customer value is still difficult to achieve. And that puts innovation — as well as modern, best-in-class software development practices like continuous delivery — at serious risk.

But that’s where leveraging data about the application environment using machine learning, as well as AI, can make a massive difference. Instead of merely ingesting data from every dimension of the application environment, these tools can help IT professionals build a more proactive approach to APM.

And, by all accounts, that’s what most global IT leaders want.

According to research findings from AppDynamics, 74% of those surveyed said they want to use monitoring and analytics tools proactively to detect emerging business-impacting issues, optimize user experience, and drive business outcomes like revenue and conversion. But according to our research, 42% of respondents are still using monitoring and analytics tools reactively to find and resolve technical issues. There are indications, however, that this approach is extremely problematic for businesses. Beyond serving as a pain point for IT professionals in terms of capacity and resource planning, reactive monitoring can, in some cases, cost businesses hundreds of thousands of dollars in lost revenue.

The majority of IT professionals want to use monitoring tools more proactively.

How Reactive Monitoring Hurts Performance, Revenue, and Brand

From e-commerce to banking, booking flights to watching movies on Netflix, applications have permeated people’s lives. As a result, consumers have high expectations for application performance that businesses must deliver on. If not, they risk jeopardizing brand loyalty and, as our research revealed, their bottom line.

“As the broader technology landscape undergoes its own dramatic change, forcing businesses to double down on their customer focus, managing the performance of applications has never been more critical to the bottom line.” — Jason Bloomberg, The Rebirth of Application Performance Management

IT professionals have long relied on the mean time to repair (MTTR) metric to evaluate the overall health of an application environment. The longer it takes to resolve an issue, the greater the potential for it to turn into a significant business problem, particularly in an increasingly fast-paced digital world. However, in this latest AppDynamics research, we made a startling discovery: Most organizations are grappling with a high average MTTR. Respondents reported that it took an average of one business day, or seven hours, to resolve a system-wide issue.

But that wasn’t the most alarming finding.

Our research also revealed that many enterprise IT teams weren’t notified about performance issues via monitoring tools at all. In fact:

  • 58% find out from users calling or emailing their organization’s help desk
  • 55% find out from an executive or non-IT team member at their company who informs IT
  • 38% find out from users posting on social networks

AppDynamics research reveals how performance problems are being discovered in the enterprise.

To fully appreciate the impact of a seven-hour MTTR on a business, AppDynamics asked survey respondents to report the total number of dollars lost during an hour-long outage, and used that figure to extrapolate the typical cost of an average, day-long outage. For the United States and United Kingdom, the cost of an average outage totals $402,542 and $212,254, respectively (the United Kingdom figure was converted into United States dollars).

United States

AppDynamics research revealed that companies in the United States on average lose $402,542 for a single service outage.

United Kingdom

The high cost of a performance outage in the United Kingdom.
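The extrapolation described above, from an hourly loss to the cost of a full outage, can be sketched in a few lines of Python. This is only an illustration: it assumes losses accrue linearly over the seven-hour MTTR, which the report does not state explicitly, and the function names are hypothetical. The hourly figures it prints are back-calculated from the published totals, not reported by the survey.

```python
# Illustrative sketch only: assumes outage losses scale linearly with duration.
# Function names are hypothetical; the totals and the 7-hour MTTR come from the article.

def outage_cost(hourly_loss: float, mttr_hours: float) -> float:
    """Estimate the cost of one outage from an hourly loss and a repair time."""
    if hourly_loss < 0 or mttr_hours < 0:
        raise ValueError("hourly_loss and mttr_hours must be non-negative")
    return hourly_loss * mttr_hours


def implied_hourly_loss(total_cost: float, mttr_hours: float) -> float:
    """Back out the hourly loss implied by a total outage cost (linear assumption)."""
    if mttr_hours <= 0:
        raise ValueError("mttr_hours must be positive")
    return total_cost / mttr_hours


# Average outage cost in USD, as published in the article.
REPORTED_OUTAGE_COST_USD = {
    "United States": 402_542,
    "United Kingdom": 212_254,
}
MTTR_HOURS = 7  # "one business day, or seven hours"

for country, total in REPORTED_OUTAGE_COST_USD.items():
    hourly = implied_hourly_loss(total, MTTR_HOURS)
    print(
        f"{country}: about ${hourly:,.0f} per hour implied, "
        f"${outage_cost(hourly, MTTR_HOURS):,.0f} per seven-hour outage"
    )
```

Under that linear assumption, the published totals correspond to roughly $57,506 per hour for the United States and $30,322 per hour for the United Kingdom. The same function shows how quickly costs grow when a company has more than one outage or a longer MTTR.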

It’s important to note that these figures reflect the total cost for a single outage in the enterprise — if a company has more than one, that figure can rise dramatically. In fact, a substantial 97% of global IT leaders surveyed said they’d had performance issues related to business-critical applications in the last six months alone.

Of the 6,000 global IT leaders AppDynamics surveyed, 97% said they’d experienced performance issues with business-critical applications in the last six months.

In addition to the impact on a company’s bottom line, global IT leaders reported that reactive performance monitoring had created stressful war room situations and damaged their brand. Of those surveyed, 36% said they had to pull developers and other teams off other work to analyze and fix problems as they presented themselves, and nearly a quarter of respondents said slow root cause analyses drained resources.

The takeaway here is clear: global IT leaders need to build a more proactive approach to APM in order to lower MTTR and protect their bottom line. But in today’s increasingly complex application environment, that’s easier said than done.

Unless, of course, you’re developing an AIOps strategy to manage it.

The Risk of Not Adopting an AIOps Strategy

AppDynamics research showed that the overwhelming majority of IT professionals want a more proactive approach to APM, but one of the main ways of achieving that, the adoption of an AIOps strategy, isn’t being widely pursued by global IT teams in the near term.

In fact, the global IT leaders AppDynamics surveyed reported that although they believe AIOps will be critical to their monitoring strategy, only 15% identified it as a top priority for their business in the next two years.

AppDynamics research reveals that the vast majority of IT professionals surveyed don’t have an AIOps strategy in place in the near term.

What’s more, the capabilities that respondents identified as essential to APM in the next five years are precisely those that match AIOps use cases. For example:

Intelligent alerting that can be trusted to indicate an emerging issue.
49% of respondents identified this feature as core to their performance monitoring capabilities in the next five years. By ingesting data from any application environment, AIOps platforms and technology can play a pivotal role in not just automating existing IT tasks, but identifying and managing new ones based on potential problems detected in the application environment.

Automated root cause analysis and business impact assessment.
44% of respondents said solving problems quickly and understanding their impact on the business would play a crucial part in their performance management in the years ahead. With the help of AIOps technology, this can be achieved, providing increased agility in the face of potential service disruptions or threats, and without additional drain on resources.

Automated remediation for common issues.
42% of survey respondents said that they needed to build automated remediation into their strategy for performance monitoring. With AIOps, it’s easy to automate remediation not only for known issues but for unknown issues, too. That’s because an AIOps platform not only ingests data from your application environment but also provides more intelligent insights as a result.

Leading the Way With AIOps Strategy and Platforms

Despite increasingly complex application environments, few of the global IT leaders surveyed are prioritizing the development of an AIOps strategy, which would allow them to implement the platforms and practices needed to identify issues proactively, before they become system-wide problems. Instead, global IT leaders report an average MTTR that hovers at a full business day, which has the potential to cost companies hundreds of thousands of dollars in lost revenue with each incident.

What’s more, AppDynamics research findings also make it clear that many global IT leaders are struggling to integrate monitoring activities into the purview of the broader business. This can cause significant delays in MTTR, as noted, as well as make companies vulnerable to service disruptions that can cause irreparable harm to the customer experience, and the enterprise as a whole.

While IT leaders have expressed a desire for a more proactive approach to monitoring, this research indicates that there’s still plenty of work to be done on numerous fronts. But the first step is clear: IT leaders must prioritize the development of an AIOps strategy and related technology. In doing so, they’ll simplify the demands of an increasingly complex application environment, and build a stronger connection from IT to the business as a whole.


Editor’s Note: In this piece, the term “global IT leaders” refers to the respondents surveyed for this report. The term “IT professionals” refers to people in the IT or related professions as a whole.
