How Garmin attacks IT and business problems with Application Performance Management (APM)

If you’ve used a GPS product before you’re probably familiar with Garmin. They are the leading worldwide provider of navigation devices and technologies. In addition to their popular consumer GPS products, Garmin also offers devices for planes, boats, outdoor activities, fitness, and cars. Because end user experience is so important to Garmin and its customers, ensuring the speed of its web and mobile applications is one of its highest priorities.

This blog post summarizes the content of a case study that can be found by clicking here.

The reality in today’s IT-driven world is that there are many different issues to solve, and therefore many different use cases for the products designed to help with them. Garmin has shown its industry leadership by using AppDynamics to address many use cases with a single product. This shows a level of maturity that most organizations strive to achieve.

“We never had anything that would give us a good deep dive into what was going on in the transactions inside our application – we had no historical data, and we had no insight into database calls, threads, etc.” – Doug Strick

AppDynamics flow map showing part of Garmin’s production environment.

Real-Time and Historic Visibility in Production

You just can’t test everything before your application goes to production. Real users will find ways to break your application and/or make it slow to a crawl. When this happens you need historical data to compare against as well as real-time information to quickly remediate problems.

Garmin found that AppDynamics, an application performance management solution designed for high-volume production environments, best suited its needs. Garmin’s operations team feels comfortable leaving AppDynamics on in the production environment, allowing them to respond more quickly and collect historical data.

The following use cases are examples of how Garmin is attacking IT and business problems to better serve their company and their customers.

Use Case 1: Memory Monitoring

Before they started using AppDynamics, Garmin was unable to track application memory usage over time. Using AppDynamics they were able to identify a major memory issue impacting customers and ensure a fast and stable user experience. Click here to read the details in the case study.

Chart of heap usage over time.

Use Case 2: Automated Remediation Using Application Run Book Automation

AppDynamics’ Application Run Book Automation feature allows organizations to trigger certain tasks, such as restarting a JVM or running a script, with thresholds in AppDynamics. Garmin was able to use this feature very effectively one weekend while waiting for a fix from a third party. Click here to read the details in the case study.
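The idea of tying a task to a threshold can be sketched in a few lines of Python. This is only an illustration of the concept, not how AppDynamics implements it: the `RunbookRule` class, the `heap_used_pct` metric name, and the `restart_jvm` placeholder are all made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RunbookRule:
    """Hypothetical rule: run `action` when `metric` rises above `threshold`."""
    metric: str
    threshold: float
    action: Callable[[str], str]


def restart_jvm(node: str) -> str:
    # Placeholder only: a real action would invoke a restart script on the node.
    return f"restart requested for {node}"


def evaluate(rules: List[RunbookRule], node: str, metrics: Dict[str, float]) -> List[str]:
    """Run every action whose threshold was crossed and return the results."""
    results = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is not None and value > rule.threshold:
            results.append(rule.action(node))
    return results
```

A rule such as `RunbookRule("heap_used_pct", 90.0, restart_jvm)` would then fire only for nodes whose reported heap usage is above 90 percent.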

Use Case 3: Code-level Diagnostics

“We knew we needed a tool that could constantly monitor our production environment, allowing us to collect historical data and trend performance over time,” said Strick. “Also, we needed something that would give us a better view of what was going on inside the application at the code level.”

With AppDynamics, Garmin’s application teams have been able to rapidly resolve several issues originating in the application code. Click here to read the details in the case study.

Use Case 4: Bad Configuration in Production

Garmin also uses AppDynamics to verify that new code releases perform well. Because Garmin has an agile release schedule, new code releases are very frequent (every 2 weeks) and it’s important to ensure that each release goes smoothly. One common problem that occurs when deploying new code is that configurations for test are not updated for the production environment, resulting in performance issues. Click here to read the details in the case study.

AppDynamics flow map showing production nodes making services calls to test environments.
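As a rough illustration of the kind of check the flow map makes visible, the sketch below flags calls from production nodes to backends whose hostnames look like test systems. The hostname convention (`test`, `qa`, `staging`, or `dev` appearing as a label) and the function name are assumptions for the example; a real environment would need its own rules.

```python
import re
from typing import Iterable, List, Tuple

# Assumed naming convention: non-production hosts carry one of these labels.
NON_PROD_LABELS = {"test", "qa", "staging", "dev"}


def misrouted_calls(calls: Iterable[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return (source_node, backend_host) pairs whose backend looks non-production."""
    flagged = []
    for source, backend in calls:
        # Split the hostname on dots and hyphens and compare whole labels,
        # so "latest.example.com" is not mistaken for a test host.
        labels = set(re.split(r"[.\-]", backend.lower()))
        if labels & NON_PROD_LABELS:
            flagged.append((source, backend))
    return flagged
```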

Use Case 5: Real-time Business Metrics (RTBM)

IT metrics alone don’t tell the whole story. Business problems caused by pricing issues or by issues with external service providers can hurt the business and never show up as an IT metric. That’s why AppDynamics created RTBM and why Garmin adopted it so early in their deployment of AppDynamics. With Real-Time Business Metrics, organizations like Garmin can collect, monitor, and visualize transaction payload data from their applications. Strick used this capability to measure the revenue flow in Garmin’s eCommerce application. Click here to read the details in the case study.

Real-time order totals for web orders.

Conclusion

“AppDynamics has given us visibility into areas we never had visibility into before… Some issues that used to take several hours now take under 30 minutes to resolve, or a matter of seconds if we’re using Run Book Automation.” – Doug Strick

Garmin has shown how to get real value out of its APM tool of choice and you can do the same. Click here to try AppDynamics for free in your environment today.

Forrester Talks Automation and Application Performance Management

Jean-Pierre (JP) Garbani of the analyst firm Forrester recently wrote a research paper titled “Technology Spotlight: Automate Application Performance Management – Automate Your Incident Management Process With Run Book Automation”. In this paper JP argues that IT must embrace automation if it is to be successful moving forward. He goes on to describe Run Book Automation (RBA) and how it applies to Application Performance Management (APM).

One of the most interesting parts of the research note is a graphical depiction of the intersection between APM and RBA. While I can’t show that image in this blog post, I can say that it is a spot-on depiction of how RBA works within AppDynamics. For those who are not familiar, AppDynamics decided to lead the APM industry in a new direction by listening to our customers and making RBA an important and fully integrated part of our product.

If you take nothing else away from this blog post or from JP’s research paper, you need to understand this key insight: APM-based RBA works so much better than traditional RBA because APM understands exactly which application nodes are impacted at any given time and can perform run book remediation on those nodes without any input from a user.

The Old Way

Traditional RBA requires that you write tests to act as triggers for run book workflows. You would run these tests against pre-defined sets of infrastructure and application components when there is a problem so that your runbooks can fix any known issues. If your application and infrastructure change, you need to manually modify the list of components that are getting tested. This manual update process has been the downfall of RBA and other technologies (CMDB anyone?) as people have come to realize the time investment required to keep everything current. This problem is amplified by today’s dynamic application technologies like virtualization and cloud computing.

Classic RBA Process

Process flow and timeline of classic RBA.

The New Way

Forrester and AppDynamics agree that the answer to the traditional RBA problem described above is to use APM to dynamically track the current state of application and infrastructure components, as well as to identify the problems that trigger run book workflows for resolution. Identifying issues from within the application and in real time is a giant step forward from the pre-determined interval testing of traditional RBA. And when you combine this capability with real-time business metrics, you get a new capability that enables the business to react immediately to problems that have nothing to do with IT.

APM RBA Process

Process flow and timeline of APM based RBA.
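The difference between the two approaches can be shown with a small Python sketch. Instead of testing a hand-maintained list of components, the targets are derived from whatever the monitoring data currently reports. The snapshot format, metric names, and thresholds below are assumptions made up for the example, not an AppDynamics API.

```python
from typing import Dict, List


def impacted_nodes(
    snapshot: Dict[str, Dict[str, float]],
    max_response_ms: float,
    max_error_rate: float,
) -> List[str]:
    """Return the nodes that currently breach either limit, sorted by name.

    `snapshot` maps each node currently known to monitoring to its latest
    metrics, so nodes added or removed since the last run are handled
    without editing any list by hand.
    """
    return sorted(
        node
        for node, metrics in snapshot.items()
        if metrics["response_ms"] > max_response_ms
        or metrics["error_rate"] > max_error_rate
    )
```

Only the nodes returned here would receive the run book action, which is the point of letting the monitoring data pick the targets.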

Application run book automation can be used by any organization large or small. If there are issues within your environment that have known fixes then you can use application RBA to automatically detect and remediate those problems within seconds. Stop wasting your time doing repetitive tasks and try AppDynamics for free today. If you’d like to read the Forrester research paper in its entirety you can download it for free by clicking here.

My Top 3 Automated Tasks for Finding and Fixing Problems

Automation sets apart organizations at the top of their game from the rest of the pack. The limiting factor in most organizations is that they are usually too busy putting out fires and keeping up with all of their other obligations to expend the effort required to envision and build out their automation strategy. With that in mind I have created a small list of the automation tasks that I feel provide the most value to an organization. Along with these tasks I explain the type of effort involved and reward associated with each one. All of the information presented is based upon my 15 years of troubleshooting within enterprise operations and applications environments.

IT Operations

1. Collect troubleshooting metrics – To me this is the no-brainer of automation tasks. For each particular type of problem you always want certain information to help resolve that issue. This is also the easiest of my top three to implement but may provide less value than the other two. Here are some examples…

  • Hung/Unresponsive JVM/CLR – Initiate and store thread dump, restart application on offending node.
  • Slow transactions plus high server CPU utilization – Collect process listing to determine CPU contributors and spin up extra instance of application.
  • Transactions throwing excessive errors – Search log files for errors and send the list to the appropriate personnel; depending on the error type, possibly probe individual components deeper (see #2 below).
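One simple way to organize the examples above is a lookup from symptom to an ordered list of steps. The Python sketch below is only an illustration: the symptom keys and step functions are hypothetical placeholders that return a description instead of doing real work.

```python
from typing import Callable, Dict, List


# Placeholder steps: each one only reports what a real step would do.
def take_thread_dump(node: str) -> str:
    return f"thread dump stored for {node}"


def restart_application(node: str) -> str:
    return f"application restarted on {node}"


def collect_process_listing(node: str) -> str:
    return f"process listing collected on {node}"


def start_extra_instance(node: str) -> str:
    return f"extra instance started alongside {node}"


def search_logs_for_errors(node: str) -> str:
    return f"error list from {node} sent to support"


# Order matters: diagnostics are captured before any disruptive action.
PLAYBOOK: Dict[str, List[Callable[[str], str]]] = {
    "hung_jvm": [take_thread_dump, restart_application],
    "slow_and_high_cpu": [collect_process_listing, start_extra_instance],
    "excessive_errors": [search_logs_for_errors],
}


def run_playbook(symptom: str, node: str) -> List[str]:
    """Run the steps registered for a symptom in order; unknown symptoms do nothing."""
    return [step(node) for step in PLAYBOOK.get(symptom, [])]
```

Keeping the thread dump ahead of the restart matters, because the restart destroys the evidence you wanted to collect.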

Application Operations

2. Probe application components – This one is really useful for figuring out difficult application problems but requires more effort to set up than #1. The basic concept is that you need to find out from the application support team what steps they would take manually to troubleshoot their application if it were slow or broken. The usual responses are things like “Check this log file for this word”, “Run this query against this database and if the output is -3 then I know what the problem is”, “Hit this URL to see what the response looks like. If the page does not return properly I know there is a problem with this component”, etc…

It may seem like a lot of work to set up this type of automated probing at first but once it is set up it becomes an invaluable troubleshooting tool. Imagine that the application response times get slow so these troubleshooting measures are automatically invoked and you get an email with the exact root cause within minutes of performance degradation. With this type of automation there is usually a known resolution based upon the probing results so that could be automated too. How handy is that?
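Here is a minimal Python sketch of that probing idea, assuming each manual check can be wrapped in a function that returns True when it finds a problem. The probe names and the helper `log_contains` are invented for the example; real probes would read the actual log file, run the actual query, or request the actual URL.

```python
from typing import Callable, Dict, Iterable, List


def log_contains(lines: Iterable[str], keyword: str) -> bool:
    """The 'check this log file for this word' probe."""
    return any(keyword in line for line in lines)


def run_probes(probes: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run every probe and return the names of those that report a problem.

    A probe that raises is reported too, since a check that cannot run
    is itself worth knowing about.
    """
    findings = []
    for name, probe in probes.items():
        try:
            if probe():
                findings.append(name)
        except Exception as exc:
            findings.append(f"{name} (probe failed: {exc})")
    return findings
```

The returned list is what would go into the email, and each finding with a known fix is a candidate for an automated remediation step.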

Business Operations

3. Alert the business to changing conditions – This is where the IT staff has the ability to really make an impression and impact on the business. One of the most overlooked aspects of monitoring and automation is the ability to gather, baseline, alert, and act based upon pure business metrics. Here’s an example scenario…

Bob’s business has an e-commerce website that sells many different products. They use AppDynamics Pro to track the quantity of each item sold along with the total revenue of all sales throughout the business day (using the information point functionality). One day Bob gets an alert that the latest and greatest widgets are selling far below their typical volume. The automation engine has already searched the prices of major competitors and found that one competitor lowered prices and is undercutting the business. With this actionable information Bob is able to immediately match the pricing of the competition, and sales rates return to normal before too much damage is done.
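The baseline-and-alert part of that scenario can be sketched with the Python standard library. This is a simplified illustration, not the baselining AppDynamics actually uses: it assumes a plain mean and standard deviation over recent history, and the function name and `sigmas` parameter are made up for the example.

```python
from statistics import mean, stdev
from typing import Sequence


def below_baseline(history: Sequence[float], current: float, sigmas: float = 2.0) -> bool:
    """Return True when `current` is more than `sigmas` standard deviations
    below the mean of `history` (for example, units sold per hour)."""
    if len(history) < 2:
        raise ValueError("need at least two historical values to form a baseline")
    baseline = mean(history)
    spread = stdev(history)
    return current < baseline - sigmas * spread
```

With hourly widget sales of 100, 104, 98, 102, and 96, an hour with 60 sales trips the alert while an hour with 97 does not.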

Obviously business operations automation can be the most complex but also the most rewarding. Reaching out to the business and having a conversation about their activities and processes can seem daunting but I can tell you from personal experience that the business is more than willing to participate in initiatives that save them time and money. This type of conversation also normally leads to the business asking about other ways to collaborate to make them more effective so it is a great way to improve the overall level of communication with the business.

AppDynamics Pro enables a new level of automation because it knows exactly what is going on inside of your applications from both a business and IT perspective. Your level of automation is limited only by your imagination. I recommend that you start out small with a single automation use case and build outward from there. You can use AppDynamics Pro for free to try out our application runbook automation functionality on your specific use case by clicking here.

Application Runbook Automation – A Detailed Walk Through

On Monday AppDynamics announced a new feature called Application Runbook Automation (RBA). The response to this announcement has been great and many people want to see the details on how we implement RBA within AppDynamics, so I’m going to walk you through it step by step in this blog post.

If you don’t already know WHY we built Application RBA, please read “Don’t Be An ‘Also-Ran’ – Application Runbook Automation for World Class IT”.

Let’s jump right in…

Don’t be an “Also-Ran” – Application Runbook Automation for World Class IT

Your applications are the lifeblood of your business–the culmination of countless hours of development, testing, bug fixes, more testing, and finally deployment to production. No matter how good your coding and testing practices are, though, eventually your applications will slow down, start to throw errors, and crash. How quickly and effectively you remediate these business impacting problems is the difference between world class IT organizations and the “also rans.”

Application Runbook Automation from AppDynamics is a game changing technology that can remediate business impact in a matter of seconds. It can represent a savings of hundreds of thousands of dollars per incident. Read more to find out how…

This table shows the business impact in USD of reducing MTTR from hours to minutes to seconds by taking advantage of AppDynamics and application run book automation.
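The arithmetic behind a table like this is simple enough to sketch in Python. The revenue figure below is purely hypothetical and is not taken from the table; plug in your own number.

```python
def revenue_at_risk(revenue_per_hour: float, mttr_minutes: float) -> float:
    """Revenue exposed during an outage, assuming revenue accrues evenly over time."""
    return revenue_per_hour * mttr_minutes / 60.0


# Hypothetical application earning 60,000 USD per hour.
HOURLY_REVENUE = 60_000.0

two_hours = revenue_at_risk(HOURLY_REVENUE, 120)       # manual recovery
thirty_minutes = revenue_at_risk(HOURLY_REVENUE, 30)   # faster diagnosis
thirty_seconds = revenue_at_risk(HOURLY_REVENUE, 0.5)  # automated remediation
```

Under that assumption the exposure drops from 120,000 USD to 30,000 USD to 500 USD as MTTR moves from hours to minutes to seconds.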

 

So What’s a Runbook Really?

In case you’ve been locked in an IT dungeon for your whole career, let’s take a minute to talk about runbooks. Runbooks are the cure for insomnia, the crappy part of application support where you document how to stop and restart your awesome application that will never, ever fail. But just in case, some of you are also forced by the evil management regime to document troubleshooting procedures along with remediation steps for the common problems that could cause your application to misbehave.

All of this documentation (Runbook) is handed off to your friendly operations center staff so that it can be stuffed away in a giant repository never to be heard from again. After all, the job of the NOC (network operations center) is just to receive and route alerts to the proper people–so why even write out those runbooks in the first place?

You know how to restart your application when you get paged at 3 AM, no documentation required. You can recover your application within 30-120 minutes of initial impact, half asleep with one hand tied behind your back. Unfortunately, during that time your company just lost revenue and customers and tarnished its reputation. Tough break, but that’s just the way it is today. But there’s a better way!

Hasn’t Runbook Automation (RBA) Been Around for a While?

Yep, RBA is that expensive software sitting on your corporate shelf right now. It can’t help you with the problem described above. It’s become niche software for infrastructure support personnel who know nothing about your applications (and who are offended that those applications consume so much of their resources; kidding, of course). Traditional RBA is a colossal failure that knows nothing about your users, business transactions, and applications. It’s this lack of application awareness that has relegated RBA to being used by a handful of infrastructure support personnel in your company.

Application Runbook Automation is a Business Differentiator

The time has come to fulfill the promise of automated remediation of business impact. Consider this:

  • AppDynamics understands exactly what components are being used at any given time by your applications.
  • It knows exactly what your end users are doing all the time and tracks and measures that activity through your application components.
  • It knows what transactions are slow or failing and exactly what code, configuration, and nodes are causing the problems.

Using this granular level of problem detection and isolation, AppDynamics application RBA understands business impact and takes the appropriate action to remediate within seconds of detection. Many problems are detected automatically, and many remediations are available out of the box so you don’t have to code your own. You can also re-use your existing RBA processes and scripts to take advantage of all the hard work that has been done in the past. You can even integrate your Puppet and Chef tools directly with AppDynamics. Why re-invent the wheel?!

Application RBA is powerful and flexible. You have the choice to take action automatically or to be prompted for authorization if you’re not ready to trust in the judgement of machines just yet. Your troubleshooting and remediation actions can be executed on only the impacted nodes, on all associated nodes in a tier, or at the overall application level. Application awareness and flexibility empower you to manage your applications the way you need to for your unique business.
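To make the scope and approval choices concrete, here is a small Python sketch. It is an illustration of the idea only: the `Scope` enum, the topology format (tier name mapped to node names), and the `approve` callback are assumptions for the example rather than AppDynamics constructs.

```python
from enum import Enum
from typing import Callable, Dict, List, Optional, Set


class Scope(Enum):
    IMPACTED = "impacted"        # only the nodes showing the problem
    TIER = "tier"                # every node in any tier with an impacted node
    APPLICATION = "application"  # every node in the application


def select_targets(scope: Scope, topology: Dict[str, List[str]], impacted: Set[str]) -> List[str]:
    """Expand the impacted nodes to the requested scope, sorted by name."""
    if scope is Scope.IMPACTED:
        return sorted(impacted)
    if scope is Scope.TIER:
        return sorted(
            node
            for nodes in topology.values()
            if impacted & set(nodes)
            for node in nodes
        )
    return sorted(node for nodes in topology.values() for node in nodes)


def remediate(
    targets: List[str],
    action: Callable[[str], str],
    approve: Optional[Callable[[List[str]], bool]] = None,
) -> List[str]:
    """Run `action` on each target. If `approve` is given, ask first and
    do nothing when approval is refused; with no `approve`, act automatically."""
    if approve is not None and not approve(targets):
        return []
    return [action(node) for node in targets]
```

Passing no `approve` callback corresponds to fully automatic action, while passing one models the prompt for authorization.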

Monitoring, Meet Management

Management has long been the missing dimension from the promise of APM. The lack of focus on this problem by the monitoring industry needs to change and AppDynamics is leading the way. When most applications were run on a handful of servers it was acceptable to overlook the management aspect and just focus on pointing out where the problems were. With modern applications scaling to hundreds and thousands of nodes it’s becoming impossible for humans to keep ahead of the management challenges without using machine automation.

If you need to remediate business impact caused by slow or crashed applications as fast as possible, Application Runbook Automation from AppDynamics is just what you need.

Try AppDynamics with Application Runbook Automation for free today.