How Garmin attacks IT and business problems with Application Performance Management (APM)

If you’ve used a GPS product before, you’re probably familiar with Garmin. They are the leading worldwide provider of navigation devices and technologies. In addition to their popular consumer GPS products, Garmin also offers devices for planes, boats, outdoor activities, fitness, and cars. Because end-user experience is so important to Garmin and its customers, ensuring the speed of its web and mobile applications is one of its highest priorities.

This blog post summarizes the content of a case study that can be found by clicking here.

The reality in today’s IT-driven world is that there are many different issues to solve, and therefore many different use cases for the products designed to help with them. Garmin has shown its industry leadership by using AppDynamics to address many use cases with a single product. This shows a level of maturity that most organizations strive to achieve.

“We never had anything that would give us a good deep dive into what was going on in the transactions inside our application – we had no historical data, and we had no insight into database calls, threads, etc.” – Doug Strick

AppDynamics flow map showing part of Garmin’s production environment.

Real-Time and Historic Visibility in Production

You just can’t test everything before your application goes to production. Real users will find ways to break your application and/or make it slow to a crawl. When this happens you need historical data to compare against as well as real-time information to quickly remediate problems.

Garmin found that AppDynamics, an application performance management solution designed for high-volume production environments, best suited its needs. Garmin’s operations team feels comfortable leaving AppDynamics on in the production environment, allowing them to respond more quickly and collect historical data.

The following use cases are examples of how Garmin is attacking IT and business problems to better serve their company and their customers.

Use Case 1: Memory Monitoring

Before they started using AppDynamics, Garmin was unable to track application memory usage over time. Using AppDynamics they were able to identify a major memory issue impacting customers and ensure a fast and stable user experience. Click here to read the details in the case study.

Chart of heap usage over time.
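
To make the idea of tracking memory over time concrete, here is a minimal sketch that estimates how fast heap usage is growing from a series of evenly spaced samples. The function name and the sample values are hypothetical and are not taken from the case study or from AppDynamics; a steadily positive slope is only a hint of a possible leak, not proof of one.

```python
def heap_growth_per_sample(samples):
    """Least-squares slope of heap usage over evenly spaced samples.

    `samples` is a list of heap readings (for example, MB used).
    The result is the average change per sampling interval.
    """
    n = len(samples)
    if n < 2:
        return 0.0  # a trend needs at least two points
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    num = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(samples))
    den = sum((i - mean_x) ** 2 for i in range(n))
    return num / den


# Hypothetical readings taken once per hour.
growth = heap_growth_per_sample([100, 110, 120, 130])
```

Without historical data there is nothing to fit a trend to, which is why collecting these samples continuously in production matters.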

Use Case 2: Automated Remediation Using Application Run Book Automation

AppDynamics’ Application Run Book Automation feature allows organizations to trigger certain tasks, such as restarting a JVM or running a script, with thresholds in AppDynamics. Garmin was able to use this feature very effectively one weekend while waiting for a fix from a third party. Click here to read the details in the case study.
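
The general pattern of threshold-triggered remediation can be sketched in a few lines. This is not the AppDynamics API; the `RunBookRule` class, the `heap_used_pct` metric name, and the `restart_jvm` placeholder are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class RunBookRule:
    """Hypothetical rule: run `action` when `metric` exceeds `threshold`."""
    metric: str
    threshold: float
    action: Callable[[], str]


def evaluate(rules: List[RunBookRule], metrics: Dict[str, float]) -> List[str]:
    """Run the action of every rule whose metric is above its threshold."""
    results = []
    for rule in rules:
        value = metrics.get(rule.metric)
        if value is not None and value > rule.threshold:
            results.append(rule.action())
    return results


def restart_jvm() -> str:
    # Placeholder: a real runbook step would invoke a restart script here.
    return "restart-jvm"


rules = [RunBookRule("heap_used_pct", 90.0, restart_jvm)]
triggered = evaluate(rules, {"heap_used_pct": 95.0})
```

Keeping the rule (what to watch) separate from the action (what to do) makes it easy to add a temporary rule while waiting for a proper fix, and to remove it afterwards.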

Use Case 3: Code-level Diagnostics

“We knew we needed a tool that could constantly monitor our production environment, allowing us to collect historical data and trend performance over time,” said Strick. “Also, we needed something that would give us a better view of what was going on inside the application at the code level.”

With AppDynamics, Garmin’s application teams have been able to rapidly resolve several issues originating in the application code. Click here to read the details in the case study.

Use Case 4: Bad Configuration in Production

Garmin also uses AppDynamics to verify that new code releases perform well. Because Garmin has an agile release schedule, new code releases are very frequent (every 2 weeks) and it’s important to ensure that each release goes smoothly. One common problem when deploying new code is that test-environment configurations are not updated for production, resulting in performance issues. Click here to read the details in the case study.

AppDynamics flow map showing production nodes making services calls to test environments.
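
A flow map catches this problem after deployment. A complementary check can be run before release; the sketch below scans a configuration for service URLs whose host looks like a test environment. The marker list and the example hostnames are assumptions, so adapt them to your own naming convention.

```python
from urllib.parse import urlparse

# Assumed naming convention: non-production hosts contain one of these labels.
TEST_MARKERS = ("test", "qa", "staging", "dev")


def find_test_endpoints(config):
    """Return the config entries whose URL host looks like a test environment."""
    suspicious = {}
    for key, url in config.items():
        host = urlparse(url).hostname or ""
        # Split "pricing-test.example.com" into pricing, test, example, com.
        labels = host.lower().replace("-", ".").split(".")
        if any(label in TEST_MARKERS for label in labels):
            suspicious[key] = url
    return suspicious


# Hypothetical production configuration with one leftover test endpoint.
prod_config = {
    "orders": "https://orders.example.com/api",
    "pricing": "https://pricing-test.example.com/api",
}
leftovers = find_test_endpoints(prod_config)
```

Matching whole host labels rather than substrings avoids false positives on names such as "latest" or "device".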

Use Case 5: Real-time Business Metrics (RTBM)

IT metrics alone don’t tell the whole story. Pricing issues or issues with external service providers can hurt the business without ever showing up as an IT metric. That’s why AppDynamics created RTBM and why Garmin adopted it so early in their deployment of AppDynamics. With Real-Time Business Metrics, organizations like Garmin can collect, monitor, and visualize transaction payload data from their applications. Strick used this capability to measure the revenue flow in Garmin’s eCommerce application. Click here to read the details in the case study.

Real-time order totals for web orders.
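
As a rough illustration of turning transaction payload data into a business metric, the sketch below sums order totals into one bucket per minute. The `(timestamp, total)` layout and the sample orders are assumptions made for this example, not Garmin’s data or the way AppDynamics extracts payloads.

```python
from collections import defaultdict
from datetime import datetime


def revenue_per_minute(orders):
    """Sum order totals into one bucket per minute.

    `orders` is an iterable of (iso_timestamp, total) pairs.
    """
    buckets = defaultdict(float)
    for timestamp, total in orders:
        minute = datetime.fromisoformat(timestamp).replace(second=0, microsecond=0)
        buckets[minute.isoformat()] += total
    return dict(buckets)


# Hypothetical web orders.
orders = [
    ("2013-06-01T10:00:05", 100.0),
    ("2013-06-01T10:00:40", 50.0),
    ("2013-06-01T10:01:10", 25.0),
]
totals = revenue_per_minute(orders)
```

A sudden drop in a series like this can point to a pricing or third-party problem even while CPU, memory, and response times all look healthy.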

Conclusion

“AppDynamics has given us visibility into areas we never had visibility into before… Some issues that used to take several hours now take under 30 minutes to resolve, or a matter of seconds if we’re using Run Book Automation.” – Doug Strick

Garmin has shown how to get real value out of its APM tool of choice and you can do the same. Click here to try AppDynamics for free in your environment today.

Deploying APM in the Enterprise Part 2 – APM Maturity As You’ve Probably Never Seen It Before

Welcome to Part 2 of the series “Deploying APM in the Enterprise”. In Part 1 I provided some background on what this series is all about and why you should continue reading. Part 2 will provide more foundation and my take on APM maturity. This will not be one of those boring theoretical maturity models but instead will come from the reality of fighting fires and utilizing APM tools in an enterprise situation. Whether you’re in a small shop or a giant Fortune 100, these same principles will apply. The biggest difference should be the amount of budget you have to work with (sorry small shops, the large enterprises tend to have a lot more money in the budget).

The way I think about APM maturity is by the questions and comments that are associated with capabilities and overall maturity. Here’s an example… When my kid was 3 years old he asked, “Where do babies come from?” This is a very obvious indicator of where he belongs in the “Life Maturity Model”. Anyone asking this question probably belongs at that same level. An important aspect to remember about any maturity model is that you can be (and most organizations are) associated with multiple levels of maturity at any given point in time. So let’s take a look at my version of an APM maturity model!

Hirsch’s APM Maturity Model:

Level 0 – WTF Just Happened?

  • We just got a bunch of phone calls that the Website/Application is slow. Really?
  • CPU, memory, disk, and network all look great. Why is it still so slow?
  • You start making phone calls or start/join a conference call asking
    • Did you change anything?
    • Do you see anything in the log files?
    • Are we having network problems?
    • Can someone get the DBA on the line? It has to be the database!
    • It just started working again. Did anyone change anything?
    • Is it fixed?
  • 3 AM phone call from the help desk... It’s broken again. Damn!
  • How does a business transaction relate to IT infrastructure?

Level 1 – Ouch, too much information!

  • Our new monitoring tool sure does provide a lot of data. Look at all these charts to spend hours digging through.
  • It took a long time to set up all of those alert thresholds but I bet it will be worth it.
  • Why so many alerts? Did everything break at the same time?
  • Is anything really wrong? I don’t know, go test the site/app to find out.
  • It worked great in dev/test/qa. What’s different about prod?
  • We profiled our code in dev and it is still slowing down in prod. Why???
  • It’s still slow for our customers? It looks fine from the office.
  • Our APM tool is okay for testing but we wouldn’t dare use it in production.
  • Does anyone know what dependencies exist between our applications?
  • I heard about something called DevOps. Any idea what it is?

Level 2 – Whew, that’s getting better!

  • We’re still getting a lot of alerts but now we know if apps are slow or broken.
  • We don’t set alert thresholds very often; our tooling alerts us automatically when important metrics deviate from their baselines.
  • Looks like some of the functions in our app are always slow. Let’s focus on optimizing the ones that are used the most or are the most important.
  • We built a dashboard for our app to show when it gets slow or breaks.
  • We can see everything going on in test and prod and know what’s different between environments.
  • We know if any of our end users are impacted because we monitor every business transaction.
  • Yep, the problem is on line 45 of the DoSomething method.
  • We automatically deploy monitoring with our apps. It’s part of our build/release process.
  • Our applications and their dependencies automatically get mapped by our tools. No need to guess what will break if we make a change.
  • Wouldn’t it be cool if we could automatically react to that spike in workload so our site won’t slow down or crash?
  • I wonder if the business felt any impact from that problem?
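
The baseline-driven alerting mentioned in the list above can be sketched simply. This is one common approach (mean plus a multiple of the standard deviation), not a description of how any particular APM product computes its baselines; the function name, the `sigmas` default, and the sample response times are assumptions.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, current, sigmas=3.0):
    """Return True when `current` is more than `sigmas` standard
    deviations above the mean of `history`."""
    if len(history) < 2:
        return False  # not enough data to form a baseline
    baseline = mean(history)
    spread = stdev(history)
    return current > baseline + sigmas * spread


# Hypothetical response times in milliseconds for one business transaction.
history = [100, 102, 98, 101, 99]
is_slow = deviates_from_baseline(history, 150)
```

Because the threshold is derived from each metric’s own history, nobody has to hand-tune a static limit per transaction, which is what removes most of the Level 1 alert noise.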

Level 3 – That’s Right, We Bad!

(That’s a reference from the movie Stir Crazy)

  • We built a business AND technology dashboard so that everyone could see if there was any impact at any given time.
  • All of our monitoring tools are integrated and provide a holistic view of the health of each component as well as the entire application.
  • Whenever there is an application slowdown (or when we predict there will be a slowdown) due to spikes in user activity, our tooling automatically adapts and spins up new instances until the spike ends.
  • When any of our application nodes are not working properly our tooling automatically removes the bad node and replaces it with a new functional node.
  • The data derived from our APM toolset is used by many different functional groups within the organization spanning both technology and business.
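
The automatic scaling described in the list above usually comes down to a small sizing rule fed by monitoring data. Here is a minimal sketch of proportional scaling, assuming load is reported as an average utilization percentage; the target, the minimum, and the maximum are hypothetical values, and real tooling would add cooldowns and health checks.

```python
import math


def desired_instances(current_instances, avg_load_pct, target_load_pct=60.0,
                      min_instances=2, max_instances=20):
    """Size the pool so that average load moves toward the target,
    clamped to the allowed range."""
    needed = math.ceil(current_instances * avg_load_pct / target_load_pct)
    return max(min_instances, min(max_instances, needed))


# Four instances running at 90% load against a 60% target.
scale_to = desired_instances(4, 90.0)
```

The same rule scales back down when the spike ends, since a low average load yields a smaller instance count, never below the configured minimum.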

You can probably identify with one or more of the maturity levels described above, but what’s really important is figuring out how to advance your capabilities so that you can keep progressing to higher levels. Software tools are obviously part of developing higher levels of maturity, but good processes and well-trained people are also critical components of success.

In part 3 of this series we will discuss how to get started down the path of deploying APM in the enterprise so that you can advance your monitoring maturity and realize significant value in your business. I’m sure you can think of other comments or questions you have heard that relate to the APM maturity levels I’ve listed above. I’d love to hear your feedback in the comments section.