Just how useful is JMX Monitoring?

November 01 2011
 

A simple example of how both application and JMX monitoring combined can help organizations find the root cause of application performance issues.


I’ve always been a skeptic of JMX monitoring, largely because I felt it was often wrongly positioned as the cornerstone of application monitoring software. To me, an application is a collection of business services or transactions that users perform, which cause application logic (code) to request and process data. Without visibility into the performance of business transactions and code execution, JMX monitoring can be seen as just another infrastructure monitoring tool for the JVM. However, when application and JMX monitoring are combined into a single tool, they can offer powerful capabilities for managing application performance.

What is JMX Monitoring?

JMX monitoring is done by querying data from “Managed Beans” (MBeans) that are exposed by the JVM over a JMX port (the JMX console). An MBean represents a resource running inside a JVM and provides data on the configuration and usage of that resource. MBeans are typically grouped into “domains” to indicate which component a resource belongs to, and in a single JVM you’ll usually see several domains. For example, for an application running on Tomcat you’d see the “Catalina” and “java.lang” domains: “Catalina” represents resources and MBeans relating to the Apache Tomcat container, while “java.lang” represents the same for the JVM runtime (e.g. HotSpot). You might also see custom domains that belong to an application, given how trivial it is to write custom MBeans via the JMX interface.

A big problem today in managing application performance is that every application is unique and different. Some applications are small and require few resources, while others are large and require significant resources. Whilst application code is often optimized during the development lifecycle, less can be said for the JVM runtime and container. Yes, many organizations may tweak the heap size or perhaps the garbage collector, but many will just deploy their application with the default configuration. Most of the time this is fine; however, as you’ll see with the customer example below, JVM/container configuration can often be the root cause of slow performance because the application doesn’t have enough resource available to it.
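As a quick illustration of how discoverable these runtime settings are, the sketch below uses the standard platform MXBeans (the same data the java.lang JMX domain exposes) to report the effective maximum heap and the garbage collectors a JVM is actually running with. Nothing here is specific to any monitoring product.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;

public class RuntimeConfigCheck {
    public static void main(String[] args) {
        // Max heap actually granted to this JVM (reflects -Xmx or the platform default)
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        long maxHeapMb = memory.getHeapMemoryUsage().getMax() / (1024 * 1024);
        System.out.println("Max heap: " + maxHeapMb + " MB");

        // Which garbage collectors are in use (defaults unless overridden at startup)
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println("GC: " + gc.getName()
                    + ", collections: " + gc.getCollectionCount()
                    + ", time: " + gc.getCollectionTime() + " ms");
        }
    }
}
```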

Database Connection Pool Bottleneck

This example came from a recent customer who solved a major bottleneck in their application, and it shows the power of combined application and JMX monitoring in action. The customer had identified that a business transaction, “Retrieve Carrier,” was taking over 40 seconds to complete. In the screenshot below, you can see how that individual business transaction executed across the infrastructure. Note in particular that 38 seconds was spent accessing the database “OracleDB1.” At this stage, you might assume it’s just a slow SQL statement that needs an index or two.

If we drill deeper into the “ws” JVM that invokes “OracleDB1,” we can see that the main hotspot responsible for this response time is the Spring bean factoryPersistanceTarget:findAdminlevelCarriers, which makes several JDBC calls.

If we then drill down into this JDBC activity, we can see the individual JDBC calls that were made, and that 19.2 seconds was spent on “Get Pooled Connection.” This tells us that the “Retrieve Carrier” business transaction was waiting 19.2 seconds for a connection to the Oracle database, causing a significant amount of latency. Notice how the SELECT query itself only took 99ms, meaning the issue here isn’t query or database related.
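To illustrate where that time goes, here is a hedged sketch of the pattern in plain JDBC. The table, column, and DataSource wiring are hypothetical, but it shows how the wait inside dataSource.getConnection() can dwarf the query itself when the pool is exhausted.

```java
import javax.sql.DataSource;
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;

public class PoolWaitTimer {

    // Hypothetical query for illustration only; the real transaction ran its own SELECT
    private static final String SQL = "SELECT * FROM carriers WHERE admin_level = ?";

    public static void runQuery(DataSource dataSource, int adminLevel) throws Exception {
        long start = System.currentTimeMillis();
        Connection connection = dataSource.getConnection(); // blocks if the pool is exhausted
        long acquired = System.currentTimeMillis();
        try {
            PreparedStatement statement = connection.prepareStatement(SQL);
            statement.setInt(1, adminLevel);
            ResultSet rs = statement.executeQuery();
            while (rs.next()) {
                // process rows...
            }
            long done = System.currentTimeMillis();
            System.out.println("Pool wait: " + (acquired - start) + " ms, "
                    + "query + fetch: " + (done - acquired) + " ms");
        } finally {
            connection.close(); // returns the connection to the pool
        }
    }
}
```

In the customer’s case, the “pool wait” portion of this breakdown was 19.2 seconds while the query portion was 99ms.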

We can confirm this connection pool bottleneck by looking at the screenshot below, which shows metrics from the JVM’s JMX console displaying the utilization of the database connection pool being invoked by the application. The green line on the chart shows the maximum number of database connections allowed (50), while the blue line shows the number of active database connections the application initiated over time. You can see that occasionally the number of active connections equaled the maximum allowed; when this happens, business transactions have to wait until connections are released. This is why some “Retrieve Carrier” business transactions waited around 20 seconds for a database connection while others didn’t. In this instance, the customer increased the database connection pool size to 75, which completely removed the bottleneck.
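If you wanted to watch that same utilization yourself over JMX, a sketch like the one below would do it. The service URL and ObjectName are assumptions and depend on your container and pool implementation; a DBCP-backed Tomcat DataSource MBean typically exposes attributes such as numActive and maxActive, but check your own JMX console for the exact names.

```java
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class PoolUtilizationProbe {
    public static void main(String[] args) throws Exception {
        // Assumption: remote JMX on port 9010; the ObjectName below is a hypothetical
        // example of a Tomcat/DBCP DataSource MBean and will differ per deployment.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://localhost:9010/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbsc = connector.getMBeanServerConnection();
            ObjectName pool = new ObjectName(
                    "Catalina:type=DataSource,host=localhost,context=/app,"
                    + "class=javax.sql.DataSource,name=\"jdbc/OracleDB1\"");

            Integer numActive = (Integer) mbsc.getAttribute(pool, "numActive");
            Integer maxActive = (Integer) mbsc.getAttribute(pool, "maxActive");
            System.out.println("Active connections: " + numActive + " / " + maxActive);

            if (numActive.equals(maxActive)) {
                System.out.println("Pool exhausted: transactions will queue for a connection");
            }
        } finally {
            connector.close();
        }
    }
}
```

Polling these two attributes over time gives you exactly the green and blue lines described above.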

This is just a simple example of how application and JMX monitoring combined can help organizations find the root cause of application performance issues. However, JMX monitoring on its own only tells you half the story. Without application monitoring, you’ll be blind to business impact and to which specific business transactions and application code are exhausting JVM resources. You can get started for free today by downloading AppDynamics Lite 2.0, which provides both application and JMX monitoring for a single JVM. You can also request a free 30-day trial of AppDynamics Pro, which provides more powerful features for managing performance across your entire distributed application.

App Man.

Sandy Mappic
