A Newbie Guide to APM

Today’s blog post is headed back to the basics. I’ve been using and talking about APM tools for so many years sometimes it’s hard to remember that feeling of not knowing the associated terms and concepts. So for anyone who is looking to learn about APM, this blog is for you.

What does the term APM stand for?

APM is an acronym for Application Performance Management. You’ll also hear the term Application Performance Monitoring used interchangeably and that is just fine. Some will debate the details of monitoring versus management and in reality there is an important difference but from a terminology perspective it’s a bit nit-picky.

What’s the difference between monitoring and management?

Monitoring is a term used when you are collecting data and presenting it to the end user. Management is when you have the ability to take action on your monitored systems. Management tasks can include restarting components, making configuration changes, collecting more information through the execution of scripts, etc… If you want to read more about the management functionality in APM tools click here.

What is APM?

There is a lot of confusion about the term APM. Most of this confusion is caused by software vendors trying to convince people that their software is useful for monitoring applications. In an effort to create a standard definition for grouping software products, Gartner introduced a definition that we will review here.

Gartner lists five key dimensions of APM in their terms glossary found here… http://www.gartner.com/it-glossary/application-performance-monitoring-apm

End user experience monitoringEUM and RUM are the common acronyms for this dimension of monitoring. This type of monitoring provides information about the response times and errors end users are seeing on their device (mobile, browser, etc…). This information is very useful for identifying compatibility issues (website doesn’t work properly with IE8), regional issues (users in northern California are seeing slow response times), and issues with certain pages and functions (the Javascript is throwing an error on the search page).

prod-meuem_a-960x0 (2)

Screen_Shot_2014-08-04_at_4.29.55_PM-960x0 (2)

Runtime application architecture discovery modeling and display – This is a graphical representation of the components in an application or group of applications that communicate with each other to deliver business functionality. APM tools should automatically discover these relationships and update the graphical representation as soon as anything changes. This graphical view is a great starting point for understanding how applications have been deployed and for identifying and troubleshooting problems.

Screen_Shot_2014-07-17_at_3.42.47_PM-960x0 (3)

User-defined transaction profiling – This is functionality that tracks the user activity within your applications across all of the components that service those transactions. A common term associated with transaction profiling is business transactions (BT’s). A BT is very different from a web page. Here’s an example… As a user of a website I go to the login page, type in my username and password, then hit the submit button. As soon as I hit submit a BT is started on the application servers. The app servers may communicate with many different components (LDAP, Database, message queue, etc…) in order to authenticate my credentials. All of this activity is tracked and measured and associated with a single “login” BT. This is a very important concept in APM and is shown in the screenshots below.

Screen_Shot_2014-08-04_at_1.30.15_PM-960x0

Component deep-dive monitoring in application context – Deep dive monitoring is when you record and measure the internal workings of application components. For application servers, this would entail recording the call stack of code execution and the timing associated with each method. For a database server this would entail recording all of the SQL queries, stored procedure executions, and database statistics. This information is used to troubleshoot complex code issues that are responsible for poor performance or errors.

Screen_Shot_2014-08-07_at_11.08.00_AM-960x0

Analytics – This term leaves a lot to be desired since it can be and often is very liberally interpreted. To me, analytics (in the context of APM) means baselining, and correlating data to provide actionable information. To others analytics can be as basic as providing reporting capabilities that simply format the raw data in a more consumable manner. I think analytics should help identify and solve problems and be more than just reporting but that is my personal opinion.

business-impact-analytics2-1-960x0

performance-analytics-960x0

Do I need APM?

APM tools have many use cases. If you provide support for application components or the infrastructure components that service the applications then APM is an invaluable tool for your job. If you are a developer the absolutely yes, APM fits right in with the entire software development lifecycle. If your company is adopting a DevOps philosophy, APM is a tool that is collaborative at it’s core and enables developers and operations staff to work more effectively. Companies that are using APM tools consider them a competitive advantage because they resolve problems faster, solve more issues over time, and provide meaningful business insight.

How can I get started with APM?

First off you need an application to monitor. Assuming you have access to one, you can try AppDynamics for free. If you want to understand more about the process used in most companies to purchase APM tools you can read about it by clicking here.

Hopefully this introduction has provided you with a foundation for starting an APM journey. If there are more related topics that you want me to write about please let me know in the comments section below.

Check out our complementary ebook, Top 10 Java Performance Problems!

Monitoring the Real End User Experience

Web application performance is fundamentally associated in the mind of the end user; with a brands reliability, stability, and credibility. Slow or unstable performance is simply not an option when your users are only a click away from taking their business elsewhere. Understanding your users, the locations they are coming from, and the devices/browsers they are using is crucial to ensuring customer loyalty and growth.

Today’s modern web applications are architected to be highly interactive, often executing complex client side logic in order to provide a rich and engaging experience to the user. This added complexity means it is no longer good enough to simply measure the effects users have on the back-end. It is also necessary to measure and optimize the client-side performance to ensure the best possible experience for your users.

Determining the root cause for poor user experience is a costly and time consuming activity which requires visibility into page composition, JavaScript error diagnostics, network latency metrics and AJAX/iframe performance.

Let’s take a look at a few of the key features available in AppDynamics 3.7 which simplify troubleshooting these problems.

End User Experience dashboard:

The first view we will look at reports EUM data by geographic location showing which regions have the highest loads, the longest response times, the most errors, etc.

1geo
The dashboard is split into three main panels:

  • A main panel in the upper left that displays geographic distribution on a map or a grid
  • A panel on the right displaying summary information: total end user response time, page render time, network time and server time
  • Trend graphs in the lower part of the dashboard that dynamically display data based on the level of information displayed in the other two panels

The geographic region for which the data is displayed throughout the dashboard is based on the region currently showed on the map or in the summary panel. For example, if you zoom down from global view to France in the map, the summary panel and the graphs display data only for France.

This view is key to understanding the geographical impact of any network & CDN latency performance issues. You can also see which are the busiest geographies, driving the highest throughput of your application. 

Browsers and Devices:

From the Browsers and Devices tabs you can see the distribution of devices, browsers and browser versions providing an understanding of which are the most popular access mechanisms for the users of your application and geographic-split by region.  From here we can isolate if a particular browser or device is delivering a reduced experience to the end user and help plan the best areas for optimisation.

2browserdevice

Troubleshooting End User Experience:

The user response breakdown shown below, is the first place we look to troubleshoot why a user is experiencing slow response times. It provides a full breakdown of where the overall time is being spent during the various stages of a page render, highlighting issues such as network latency, poor page design, too much time parsing HTML or downloading and executing JavaScript. 

Screen Shot 2013-07-25 at 9.47.14 AM

Response time metric breakdown

First Byte Time is the interval between the time that a user initiates a request and the time that the browser receives the first response byte.

Server Connection Time is the interval between the time that a user initiates a request and the start of fetching the response document from the server. This includes time spent on redirects, domain lookups, TCP connects and SSL handshakes.

Response Available Time is the interval between the beginning of the processing of the request on the browser to the time that the browser receives the response. This includes time in the network from the user’s browser to the server.

Front End Time is the interval between the arrival of the first byte of text response and the completion of the response page rendering by the browser.

Document Ready Time is the time to make the complete HTML document (DOM) available.

Document Download Time is the time for the browser to download the complete HTML document content.

Document Processing Time is the time for the browser to build the Document Object Model (DOM)

Page Render Time is the time for the browser to complete the download of remaining resources, including images, and finish rendering the page.

AppDynamics EUM reports on three different kinds of pages:

  • A base page represents what an end user sees in a single browser window. 
  • An iframe is a page embedded in another page.
  • An Ajax request is a request for data sent from a page asynchronously.

Notifications can be configured to trigger on any of these.

JavaScript error detection

JavaScript error detection provides alerting and identification of the root cause of JavaScript errors in minutes, highlighting the JavaScript file, line # and exception messages for all errors seen by your real users.

Screen Shot 2013-07-25 at 9.45.01 AM

Server-side correlation

If the above isn’t enough and you want to look into the execution within the datacentre, you can drill in-context from any of the detailed browser snapshots directly into the corresponding call stack trace in the application servers behind. This provides end-to-end visibility of a user’s interaction with your web application, from the browser all the way through the datacentre and deep into the database.

4breakdown

Deployment and scalability:

Deployment is simple – all you have to do is add a few lines of JavaScript to the web pages you want to monitor. We’ll even auto-inject this JavaScript on certain platforms at runtime. With its elastic public cloud architecture, AppDynamics EUM is designed to support billions of devices and user sessions per day, making it a perfect fit for enterprise web applications.

See Everything:

With AppDynamics you’ll get visibility into the performance of pages, AJAX requests and iframes, and you can see how performance varies by geographic region, device and browser type. In addition, you’ll get a highly granular browser response time breakdown (using the Navigation Timing API) for each snapshot, allowing you to see exactly how much time is spent in the network and in rendering the page. And if that’s not enough, you’ll see all JavaScript errors occurring at the end user’s browser down to the line number.

If you don’t currently know exactly what experience your users are getting when they access your applications or your users are complaining and you don’t know why then why not checkout AppDynamics End User Monitoring for free at appdynamics.com