Healthcare Reform and Application Performance Monitoring

Regardless of your political views, healthcare reform is truly, and no pun intended, reforming healthcare in the United States. Everyone is probably familiar with the Affordable Care Act (ACA) of 2010, or “Obamacare”, which was enacted to increase the quality and affordability of healthcare in the United States. Another piece of legislation that affects the healthcare industry was enacted in 2009 and is commonly known as the “Stimulus”. Among the many provisions of the “Stimulus”, or the American Recovery and Reinvestment Act (ARRA), are new regulations around Healthcare IT (HIT), chief among them Meaningful Use (MU).

Broken out into 3 stages, the MU program provides financial incentives for the “meaningful use” of certified Electronic Health Record (EHR) technology. To receive an EHR incentive payment, providers have to show that they are using their certified EHR technology by meeting certain measurement thresholds that range from recording patient information as structured data to exchanging summary care records.

The HIT Industry is Slow to Change

While the ARRA provides financial incentives to hospitals and eligible professionals to automate medical records (let’s call this the carrot), it also penalizes hospitals and eligible professionals that do not demonstrate and attest to MU by reducing Medicare and Medicaid reimbursements over time (let’s call this the stick).

MU will require change and, based on my years of experience as an HIT consultant and application provider, the healthcare industry is slow to adopt change.

Prior to joining AppDynamics, I participated in a major system upgrade for a large hospital system. This upgrade was necessary for Meaningful Use attestation. The hospital was three major releases behind the current release of its EHR software, and the features required for MU attestation were only available on the latest release. By the way, this upgrade latency is not uncommon in HIT.

Because of the severity of the change and the complexity of the environment, contingency plans were put in place to ensure the hospital could continue to care for patients should any of the EHR components fail to upgrade properly. However, nothing could prepare the team for what happened next.

A Problem Arises

Two days after the final upgrade outage, and just as everyone was ready to head home after a number of sleepless nights, a frantic call came into the upgrade command center. The call came from the nurse shift supervisor of the emergency department (ED). If you have ever met an ED nurse, you will understand it when I say that an ED nurse is not someone you want to upset.

The vast array of people caring for patients in an emergency department can be overwhelming. In order to provide visibility and bring order to a very intense operation, the ED relies on a number of critical tools. One such tool is the Tracking Board application. The Tracking Board application provides visibility into length of stay, staff assignment, room assignment, lab order tracking, patient criticality and many more vital data points, all of which can have a major effect on patient safety – the top priority for any healthcare professional.

The ED tracking board was unusable and the ED was operating in the dark. Without the visibility provided by the Tracking Board application, the ED was at a standstill. While far from ideal, in such situations the ED normally shuts down. But because this particular hospital is the only Level I trauma center in the region, this wasn’t an option. And because the other modules of the software were working properly and had technical dependencies on one another, a downgrade wasn’t an option either.

The command center became a war room: clinical analysts from the hospital, project managers from both sides of the implementation, a large ensemble of high-level clinical staff and executives from the hospital, the entire infrastructure team, DBAs, interface engine administrators, and developers from 3 different continents were all locked in and given a clear instruction: “Don’t leave until this issue is resolved”.

Minutes turned into hours, then into days. The situation in the emergency room was coincidentally turning into an emergency itself. The Chief Medical Information Officer (CMIO) of the hospital brought the vendor project manager to tears, and the ED nurses were gathering their torches and pitchforks and marching against the IT department. All appeared lost, and after days of outage and close to 1,000 man-hours spent trying to find the root cause of the problem, everyone was ready to walk out.

The patients, however, could not walk away. Many of them had life-threatening conditions, and the queue outside the ED was only growing longer. Patient safety is job #1 for everyone in healthcare, and the unavailability of the ED Tracking Board application was affecting every patient’s safety!

AppDynamics to the Rescue! 

Clearly, it was time for an intervention. Unlike the TV reality series, this intervention didn’t come in the form of over-emotional family members, but in the form of an APM solution. AppDynamics was deployed and quickly generated a flow map of the entire application environment. Within minutes, business transactions (BT) from within the software itself and from all adjacent systems that interface with the Tracking Board application began to pour in.

A business transaction (BT) is a key feature of AppDynamics which, in simple terms, allows AppDynamics users to map the application based on how its users experience it.

A key BT captured by AppDynamics was one happily named “UpdateCycle”. As reported by the Tracking Board application vendor, the “UpdateCycle” BT was responsible for querying its own database, the interface engine, and a variety of disjointed data sources, and for updating an operational dashboard displayed via digital signage throughout the ED.

As the team monitored the application via AppDynamics, looking for clues as to why the application was failing, we noticed that the UpdateCycle transaction volume was 100x what was expected. In general, the Tracking Board dashboards update every minute for each of the viewers. Considering there were ~10 viewers at any given time, the system was designed to support tens of transactions per minute, and it was failing because we were receiving thousands of transactions per minute.
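
The arithmetic behind that observation is simple enough to sketch. The snippet below is illustrative only: the function names and the observed figure of 1,000 calls per minute are assumptions chosen to match the numbers in this story, not anything taken from the Tracking Board application or from AppDynamics.

```javascript
// Hypothetical sanity check: compare observed load with the design load.
// Assumes each viewer triggers one UpdateCycle call per refresh interval.
function expectedCallsPerMinute(viewers, refreshIntervalSeconds) {
  return viewers * (60 / refreshIntervalSeconds);
}

// How many times larger the observed load is than the design load.
function loadRatio(observedPerMinute, viewers, refreshIntervalSeconds) {
  return observedPerMinute / expectedCallsPerMinute(viewers, refreshIntervalSeconds);
}

const expected = expectedCallsPerMinute(10, 60); // 10 viewers, one refresh per minute
const ratio = loadRatio(1000, 10, 60);           // assumed observation: 1,000 calls per minute
console.log(`expected ${expected}/min, observed load is ${ratio}x the design load`);
```

With ten viewers refreshing once a minute the design load is about ten calls per minute, so anything in the thousands is roughly two orders of magnitude off and points at the callers rather than at server capacity.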

A faulty client-side configuration was overloading the server and causing it to generate slow responses back to the clients. The listener was working overtime, getting a response back every few seconds and trying to update the ED tracking board constantly, resulting in constant updates to the signage stations and webpages and making the system inoperable.

Using AppDynamics, the team was able to locate the root cause within one hour of deployment and the change itself took less than 5 minutes. The web server was restarted and all was calm in the kingdom of the ED.

From the moment “yours truly” recommended the use of AppDynamics until everyone left the war room for good, less than 4 hours had elapsed. Let me say this again, 4 hours was all it took to download the software, install it, allow for traffic capturing and resolution!

Take a few minutes to get complete visibility into the performance of your production applications with AppDynamics today.

Introducing Nodetime for Node.js Monitoring

Node.js is rapidly becoming one of the most popular platforms for building fast, scalable web applications. According to a W3C tech survey, adoption of Node.js doubled in the last year alone, and the Node.js application server is currently the 14th most popular in the world. Today a range of organizations use Node.js to power their mobile applications, including LinkedIn, Walmart and Klout. With Nodetime you can monitor, troubleshoot and diagnose performance issues in your Node.js applications.

Nodetime reveals the internals of your application and infrastructure through profiling and proactive monitoring enabling detailed analysis, fast troubleshooting, performance and capacity optimization. Monitor realtime and historical state of the application by following multiple application metrics. Over three thousand organizations use Nodetime to monitor their Node.js applications, including Condé Nast, Kabam and Change.org.

“We’re very excited to see AppDynamics pursuing the latest and most innovative technologies with this acquisition,” said Nic Johnson, Web Architect at FamilySearch, a customer of both AppDynamics and Nodetime. “Both Nodetime and AppDynamics are essential parts of our toolset, and we’re very excited to see them united in the same product and the same great company. This will mean great things both for AppDynamics customers and for the Node.js community as a whole.”

Nodetime Dashboard for at-a-glance metrics of application health:

[Screenshot: Nodetime Dashboard]

CPU profiler showing a backtrace to find the root cause of a performance problem:

[Screenshot: Nodetime CPU profiler backtrace]

Nodetime metrics cover operating system state, garbage collection activity, application capacity, transactions and database calls for supported libraries, such as HTTP, File System, Socket.io, Redis, MongoDB, MySQL, PostgreSQL, Memcached and Cassandra.

Explore any application or OS metric via the Nodetime metric browser:

[Screenshot: Nodetime metric browser]

Get started for free with Node.js performance monitoring from Nodetime.

Nodetime installation is extremely easy:

npm install nodetime

Once the Nodetime module is available, simply add the following to your Node application:


// Start the Nodetime profiler; replace 'xxx' with your own account key.
require('nodetime').profile({
  accountKey: 'xxx',
  appName: 'MyApp'
});

Adding application performance monitoring for Node.js applications has never been easier. Enjoy!

As always, please feel free to comment if you think I have missed something or if you have a request for content in an upcoming post.

Thoughts? Let us know on Twitter @AppDynamics!

Intelligent Alerting for Complex Applications – PagerDuty & AppDynamics

Today AppDynamics announced integration with PagerDuty, a SaaS-based provider of IT alerting and incident management software that is changing the way IT teams are notified, and how they manage incidents in their mission-critical applications.  By combining AppDynamics’ granular visibility of applications with PagerDuty’s reliable alerting capabilities, customers can make sure the right people are proactively notified when business impact occurs, so IT teams can get their apps back up and running as quickly as possible.

You’ll need a PagerDuty and AppDynamics license to get started – if you don’t already have one, you can sign up for free trials of PagerDuty and AppDynamics online.  Once you complete this simple installation, you’ll start receiving incidents in PagerDuty created by AppDynamics out-of-the-box policies.

Once an incident is filed it will have the following list view:

[Screenshot: incident list view]

When the ‘Details’ link is clicked, you’ll see the details for this particular incident including the Incident Log:

[Screenshot: incident details with the Incident Log]

If you are interested in learning more about the event itself, simply click ‘View message’ and all of the AppDynamics event details are displayed, showing which policy was breached, the violation value, severity, etc.:

[Screenshot: AppDynamics event details in the incident message]

Let’s walk through some examples of how our customers are using this integration today.

Say Goodbye to Irrelevant Notifications

Is your work email address on some sort of group alias, so that you get several, maybe even dozens, of notifications a day that aren’t particularly relevant to your responsibilities or are intended for other people on your team? I know mine is. Imagine a world where your team members only receive notifications that relate to their individual roles, and only the people who are actually on call get them. With AppDynamics & PagerDuty you can now build in alerting logic that routes specific alerts to specific teams and only sends messages to the people who are actually on call. App response time way above the normal value? Send an alert to the app support engineer who is on call, not all of their colleagues. Not having to sift through a bunch of irrelevant alerts means that when one does come through you can be sure it requires YOUR attention right away.
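
To make the routing idea concrete, here is a minimal sketch of that logic. It is not the PagerDuty or AppDynamics API: the schedule layout, team names, engineer names and the `routeAlert` function are all hypothetical, invented purely for illustration.

```javascript
// Hypothetical on-call rotations per team (hours are 0-24, end exclusive).
const schedules = {
  'app-support': [
    { engineer: 'alice', startHour: 0, endHour: 12 },
    { engineer: 'bob', startHour: 12, endHour: 24 },
  ],
  infrastructure: [{ engineer: 'carol', startHour: 0, endHour: 24 }],
};

// Hypothetical mapping from alert type to the team that owns it.
const routes = { 'response-time': 'app-support', 'node-down': 'infrastructure' };

// Return the single engineer on call for the team that owns this alert type.
function routeAlert(alertType, hourOfDay) {
  const team = routes[alertType];
  if (!team) return null; // unknown alert types notify nobody in this sketch
  const shift = schedules[team].find(
    (s) => hourOfDay >= s.startHour && hourOfDay < s.endHour
  );
  return shift ? { team, engineer: shift.engineer } : null;
}

console.log(routeAlert('response-time', 14)); // { team: 'app-support', engineer: 'bob' }
```

The point of the design is that one alert produces exactly one notification, addressed to whoever holds the shift, rather than a broadcast to an alias.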

Automatic Escalations

If you are only sending a notification and assigning an incident to one person, what happens if that person is out of the office or doesn’t have access to the internet / phone to respond to the alert?  Well, the good thing about the power of PagerDuty is that you can build in automatic escalations.  So, if you have a trigger in AppDynamics to fire off a PagerDuty alert when a node is down, and the infrastructure manager isn’t available, you can automatically escalate and re-assign / alert a backup employee or admin.
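
An escalation chain of this kind can be sketched in a few lines. The policy structure, responder names and timeouts below are assumptions made for the example; PagerDuty's real escalation policies are configured in its own UI and are not modeled here.

```javascript
// Hypothetical escalation policy: ordered responders, each with a timeout.
const policy = [
  { responder: 'infrastructure-manager', timeoutMinutes: 15 },
  { responder: 'backup-admin', timeoutMinutes: 15 },
  { responder: 'it-director', timeoutMinutes: 30 },
];

// Given the minutes elapsed without an acknowledgement, return who should be
// notified now. Once every timeout has expired, stay on the final responder.
function currentResponder(escalationPolicy, minutesUnacknowledged) {
  let elapsed = 0;
  for (const level of escalationPolicy) {
    elapsed += level.timeoutMinutes;
    if (minutesUnacknowledged < elapsed) return level.responder;
  }
  return escalationPolicy[escalationPolicy.length - 1].responder;
}

console.log(currentResponder(policy, 5));  // infrastructure-manager
console.log(currentResponder(policy, 20)); // backup-admin
```

If the infrastructure manager has not acknowledged the "node down" incident within the first 15 minutes, the same incident moves to the backup admin without anyone having to intervene.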

The Sky is Falling!  Oh Wait – We’re Just Conducting Maintenance…

Another potentially annoying situation for IT teams is the flood of alerts that get fired off during a maintenance window.  PagerDuty has the concept of a maintenance window so your team doesn’t get a bunch of doomsday messages during maintenance.  You can even set up a maintenance window with one click if you prefer to go that route.

Either way, no new incidents will be created during this time period… meaning your team will be spared having to open, read, and file the alerts and update / close out the newly-created incidents in the system.
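
Conceptually, a maintenance window is just a time filter applied before incidents are created. The sketch below shows that idea under stated assumptions: the window and alert objects, their field names and the dates are hypothetical and do not reflect PagerDuty's internal data model.

```javascript
// Hypothetical maintenance windows, expressed as UTC start/end timestamps.
const windows = [
  { start: Date.parse('2013-04-20T02:00:00Z'), end: Date.parse('2013-04-20T04:00:00Z') },
];

// True if the timestamp falls inside any window (end is exclusive).
function inMaintenance(timestamp, maintenanceWindows) {
  return maintenanceWindows.some((w) => timestamp >= w.start && timestamp < w.end);
}

// Only alerts raised outside a maintenance window become incidents.
function incidentsToCreate(alerts, maintenanceWindows) {
  return alerts.filter((a) => !inMaintenance(a.raisedAt, maintenanceWindows));
}

const alerts = [
  { name: 'node down', raisedAt: Date.parse('2013-04-20T03:00:00Z') },
  { name: 'slow response', raisedAt: Date.parse('2013-04-20T05:00:00Z') },
];
console.log(incidentsToCreate(alerts, windows).map((a) => a.name)); // [ 'slow response' ]
```

The "node down" alert raised at 03:00 falls inside the window and is dropped, while the alert raised an hour after the window closes still produces an incident.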

We’re confident this integration of the leading application performance management solution with the leading IT incident management solution will save your team time and make them more productive.  Check out the AppDynamics and PagerDuty integration today!

Introducing AppDynamics for PHP

It’s been about 12 years since I last scripted in PHP. I pretty much paid my way through college building PHP websites for small companies that wanted a web presence. Back then PHP was the perfect choice, because nearly all the internet service providers had PHP support for free if you registered domain names with them. Java and .NET weren’t an option for a poor smelly student like me, so I just wrote standard HTML with embedded scriptlets of PHP code and bingo–I had dynamic web pages.

Today, 244 million websites run on PHP, which is almost 75% of the web. That’s a pretty scary statistic. If only I’d kept coding PHP back when I was 21, I’d be a billionaire by now! PHP is a pretty good example of how open-source technology can go viral and infect millions of developers and organizations worldwide.

Turnkey APMaaS by AppDynamics

Since we launched our Managed Service Provider program late last year, we’ve signed up many MSPs that were interested in adding Application Performance Management-as-a-Service (APMaaS) to their service catalogs.  Wouldn’t you be excited to add a service that’s easy to manage but more importantly easy to sell to your existing customer base?

Service providers like Scicom definitely were (check out the case study), because they are being held responsible for the performance of their customers’ complex, distributed applications, but oftentimes don’t have visibility inside the actual application.  That’s like being asked to officiate an NFL game with your eyes closed.

The sad truth is that many MSPs still think that high visibility in app environments equates to high configuration, high cost, and high overhead.

Thankfully this is 2013.  People send emails instead of snail mail, play Call of Duty instead of Pac-Man, listen to Pandora instead of cassettes, and can have high visibility in app environments with low configuration, low cost, and low overhead with AppDynamics.

Not only do we have a great APM service to help MSPs increase their Monthly Recurring Revenue (MRR), we make it extremely easy for them to deploy this service in their own environments, which, to be candid, is half the battle.  MSPs can’t spend countless hours deploying a new service.  It takes focus and attention away from their core business, which in turn could endanger the SLAs they have with their customers.  Plus, it’s just really annoying.

Introducing: APMaaS in a Box

Here at AppDynamics, we take pride in delivering value quickly.  Most of our customers go from nothing to full-fledged production performance monitoring across their entire environment in a matter of hours in both on-premise and SaaS deployments.  MSPs are now leveraging that same rapid SaaS deployment model in their own environments with something that we like to call ‘APMaaS in a Box’.

At a high level, APMaaS in a Box is a large cardboard box with air holes and a fragile sticker wherein we pack a support engineer, a few management servers, an instruction manual, and a return label…just kidding…sorry, couldn’t resist.

Simply put, APMaaS in a Box is a set of files and scripts that allows MSPs to provision multi-tenant controllers in their own data center or private cloud and provision AppDynamics licenses for customers themselves…basically it’s the ultimate turnkey APMaaS.

By utilizing AppDynamics’ APMaaS in a Box, MSPs across the world are leveraging our quick deployment, self-service license provisioning, and flexibility in the way we do business to differentiate themselves and gain net new revenue.

Quick Deployment

Within 6 hours, MSPs like NTT Europe who use our APMaaS in a Box capabilities will have all the pieces they need in place to start monitoring the performance of their customers’ apps.  Now that’s some rapid time to value!

Self-Service License Provisioning

MSPs can provision licenses directly through the AppDynamics partner portal.  This gives you complete control over who gets licenses and makes it very easy to manage this process across your customer base.

Flexibility

An MSP can get started on a month-to-month basis with no commitment.  Only paying for what you sell eliminates the cost of shelfware.  MSPs can also sell AppDynamics however they would like to position it and can float licenses across customers.  NTT Europe uses a 3-tier service offering so customers can pick and choose the APM services they’d like to pay for.  Feel free to get creative when packaging this service for customers!

Conclusion

As more and more MSPs move up the stack from infrastructure management to monitoring the performance of their customers’ distributed applications, choosing an APM partner that understands the Managed Services business is of utmost importance.  AppDynamics’ APMaaS in a Box capabilities align well with internal MSP infrastructures, and our pricing model aligns with the business needs of Managed Service Providers – we’re a perfect fit.

MSPs who continue to evolve their service offerings to keep pace with customer demands will be well positioned to reap the benefits and future revenue that come along with staying ahead of the market.  To paraphrase The Great One, MSPs need to “skate where the puck is going to be, not where it has been.”  I encourage all you MSPs out there to contact us today to see how we can help you skate ahead of the curve and take advantage of the growing APM market with our easy-to-use, easy-to-deploy APMaaS in a Box.  If you don’t, your competition will…

AppDynamics & Splunk – Better Together

A few months ago I saw an interesting partnership announcement from Foursquare and OpenTable.  Users can now make OpenTable reservations at participating restaurants from directly within the Foursquare mobile app.  My first thought was, “What the hell took you guys so long?” That integration makes sense on so many levels, I’m surprised it hadn’t already been done.

So when AppDynamics recently announced a partnership with Splunk, I viewed that as another no-brainer.  Two companies with complementary solutions making it easier for customers to use their products together – makes sense right?  It does to me, and I’m not alone.

I’ve been demoing a prototype of the integration for a few months now at different events across the country, and at the conclusion of each walk-through I’d get some variation of the same question, “How do I get my hands on this?”  Well, I’m glad to say the wait is over – the integration is available today as an App download on Splunkbase.  You’ll need a Splunk and AppDynamics license to get started – if you don’t already have one, you can sign up for free trials of Splunk and AppDynamics online.

The Top 5 Advantages of SaaS-based Application Performance Management

Software-as-a-Service (SaaS) has seen a lot of success and adoption in the past five years, but not as much in the field of application performance management (APM) as in other markets. With Cloud computing gaining momentum, you’re likely to see SaaS APM adoption increase significantly as more applications are deployed to the Cloud. Gartner also recently made SaaS a mandatory requirement for APM vendors to be included in their 2012 APM Magic Quadrant, so SaaS-based APM is definitely becoming hot right now!

Here are the top 5 advantages that SaaS-based APM can offer:

1. TIME-TO-VALUE

SaaS-based APM can be deployed within your organization in the time it takes you to read this article. Think about that for a second – you get to experience the full benefits of APM in just a few minutes with no interaction from sales people or technical consultants. All you need to do is sign up for an account, take a free trial, and evaluate whether APM can meet your needs or solve your problems.

Many cloud providers are now actively partnering with APM vendors to embed agents within the servers they provision for customer applications. I personally know of a company that solved a 6-month-old production issue within an hour of deploying SaaS-based APM. How about that for ROI and time to value!

2. COST – LICENSES, MAINTENANCE, ADMINISTRATION, HARDWARE

Simply put, subscription-based licenses are cheaper, more flexible and less risky than perpetual licenses. Annual maintenance is included in the subscription, as is the cost of managing and supporting the APM infrastructure required to monitor your applications. You don’t need to buy hardware to run your APM management server, and you don’t need to pay someone to manage it either – you simply deploy your agents and you’re all done. There’s now no need to sign up to a multi-million dollar 3-year APM ELA with a vendor; rather, you can pay as you go. If the APM software rocks, you renew your subscription. If the APM software sucks, you go elsewhere.

3. EASE OF USE

When a customer signs up for a SaaS account and evaluates APM for the first time, there is no pre-sales or technical consultant sitting next to them to configure or demo the solution. The experience from account registration to application monitoring is a journey taken alone by the customer.

First impressions are everything with SaaS. Therefore, the learning curve of APM in this context must be faster and easier, so the APM solution can sell itself to the customer.

SaaS-based APM solutions are also much younger than traditional on-premise software, meaning the technology, UI design principles, and concepts applied are more advanced and more interactive for the user. Try comparing the UI of an iPhone with a Nokia phone from 5 years ago and you’ll see my point.

First generation APM solutions were typically written for developers by developers. Today the value of APM touches many different user skill sets. It is therefore no surprise that SaaS-based APM can appeal to and be adopted by development, operations and business users.

4. MIGRATING TO THE LATEST RELEASE

When an APM vendor announces a new release of its software with lots of cool features, it’s normally down to the customers themselves to migrate to the new release. If things go well, they might spend several days or perhaps a few weeks performing the migration. If things go badly, they might end up spending several weeks working hand in hand with the vendor to complete the migration.

With SaaS-based APM, the vendors themselves are responsible for the migration. Customers simply log in and they get the latest version and features automatically. They get to harness APM innovation as soon as it’s ready, rather than having to wait weeks or months to find the time to migrate by themselves. If anything goes wrong, then it’s the vendor who spends the time and money to fix it, rather than the customer.

Customers today will typically upgrade their APM software once a year because of the time and effort. With SaaS-based APM, they can receive multiple upgrades and always be on the latest version.

5. SCALABILITY

Enterprises and Cloud providers can manage lots of applications, which can span several thousand servers. It is one thing for a customer to deploy APM across two applications and a hundred servers in their organization. It is another thing to deploy it across fifty applications and a thousand servers.

Scaling APM has never been easy. The more agents you deploy, the more management servers you need to collect, process, and manage the data. How quickly can you purchase, provision, and maintain the APM management infrastructure when you’ve got hundreds of applications you want to monitor?

With SaaS-based APM, you let the vendor take care of that for you. I know of a SaaS-based APM user that monitors over 6,000 servers in their organization. Compare that with the largest APM on-premise deployment you know of and you can see why SaaS-based APM is a better scalability option.

So there you have it–five compelling reasons why you should consider SaaS-based APM in your organization. SaaS-based APM isn’t for everyone, though. I typically see less adoption in financial services customers where data privacy and security controls are much tighter.

Appman.

Gartner positions AppDynamics as a Leader in 2012 APM Magic Quadrant

Application Performance Monitoring (APM) has been my life and world for almost a decade. I used APM as a developer, sold it as a sales engineer, built it as a product manager and now I’m evangelizing it as a superhero. In that time, I’ve seen APM evolve from being a pure JavaEE monitoring tool in 2002 that a few developers might use, to a full blown IT monitoring platform in 2012 that aligns development, operations and the business.

Today, the APM market has advanced tenfold, with help from analysts like Gartner, who research APM and take literally hundreds of inquiry calls a year from buyers. As industry and technology trends like SOA, Agile, web 2.0, cloud computing, devops and big data evolve, so do the market requirements for APM. For APM to deliver the promised benefits, it must enable users to monitor and manage modern applications. If modern buyers commonly require X, Y and Z from APM, then APM vendors must offer X, Y and Z to be considered relevant in the market and to be recognized in analyst research and reports such as the Gartner Magic Quadrant.

For example, let’s take a look at the inclusion criteria from 2012 for a vendor to be included in the Gartner Application Performance Monitoring Magic Quadrant (and if you’d like to get a complimentary copy, be our guest):

  • The vendor’s APM product must include all five dimensions of APM, including application runtime; application architecture discovery and modeling; deep-dive monitoring of one or more key application component types (e.g., database, application server); user-defined transaction profiling; and analytics applied to metric aggregation, trending and pattern discovery techniques.
  • The APM product must provide compiled Java or .NET code instrumentation in a production environment.
  • The vendor should have at least 50 customers that use its APM products actively in a production environment.
  • The APM offering must include part of or the entire solution as a service. This includes managed service provider hosting, regardless of other commercial arrangements, or SaaS delivery through its own distribution channels.
  • Total revenue (including new licenses, updates, maintenance, subscriptions, SaaS, hosting and technical support) must have exceeded $5 million in 2011.
  • Customer references must be located in at least three of the following geographic locations: North America, South America, EMEA, the Asia/Pacific region and/or Japan.
  • The vendor references must monitor more than 200 production application server instances in a production environment.

Raising the APM bar:

The rationale for Gartner’s 2012 APM MQ inclusion criteria is available here. A vendor must provide a broad set of APM functionality, supporting all five dimensions of APM rather than just a few.

Other inclusion criteria I liked from above were that APM vendors must provide compiled Java or .NET code instrumentation in a production environment, that the offering must include part or all of the solution as a service, and that vendor references must now monitor over 200 application server instances in a production environment. These items pretty much hit the sweet spots of AppDynamics, in that we monitor some of the largest production Java and .NET applications in the world, and we offer all 5 dimensions of APM in a single product, which can be deployed either on-premise or via SaaS. Our largest Java deployment is over 6,000 nodes and our largest .NET deployment is now over 5,000 nodes – this is how easy our APM solution is to deploy and scale.

The adoption of public cloud, combined with the fact that APM buyers are looking to simplify their APM purchases, implementations and maintenance, means that AppDynamics is well positioned to capitalize on these opportunities.

Love or hate the Gartner Magic Quadrant, every vendor wants to be part of it, because everyone wants to be known as a leader in their field. To do this, vendors must meet or exceed Gartner’s inclusion criteria as well as their very detailed requirements matrix, which puts pressure on each vendor to constantly innovate, execute and demonstrate a compelling vision.

AppDynamics named a Leader in 2012:

What I’ve witnessed at AppDynamics since I joined back in 2011 has been nothing short of amazing. We’ve kicked a lot of ass in the last year and have had a lot of fun doing it. You could say AppDynamics being positioned as a leader in the 2012 MQ was perhaps the recognition we deserved for breaking the rules of traditional APM. We believe our MQ position represents a clear testament to our technology, tremendous customer success and disruption in the marketplace. We’re enormously proud and privileged at AppDynamics to be recognized as a leader, but we know our job isn’t done yet. We want to make APM easy to deploy, easy to use and affordable for everyone. If we do this, there’ll be more organizations in the world leveraging the benefits of APM than ever before, which translates to faster applications for everyone. Not a bad thing at all.

You can sign up for a free 30-day trial of AppDynamics Pro right here, and see for yourself why we’ve become a leader in just two years.

App Man.

Gartner does not endorse any vendor, product or service depicted in its research publications, and does not advise technology users to select only those vendors with the highest ratings. Gartner research publications consist of the opinions of Gartner’s research organization and should not be construed as statements of fact. Gartner disclaims all warranties, expressed or implied, with respect to this research, including any warranties of merchantability or fitness for a particular purpose.

How Monitoring Analytics can make DevOps more Agile

The word “analytics” is an interesting and often abused term in the world of application monitoring. For the sake of correctness, I’m going to reference Wikipedia in how I define analytics:

Analytics is the discovery and communication of meaningful patterns in data.

Simply put, analytics should make IT’s life easier. Analytics should point out the bleeding obvious from all the monitoring data available, and guide IT so they can effectively manage the performance and availability of their application(s). Think of analytics as “doing the hard work” or “making sense” of the data being collected, so IT doesn’t have to spend hours figuring out for themselves what is being impacted and why.

Discovery
This is about how effectively a monitoring solution can self-learn the environment it’s deployed in, so it’s able to baseline what is normal and abnormal for the environment. This is really important as every application and business transaction is different. A key reason why many monitoring solutions fail today is that they rely on users to manually define what is normal and abnormal using static or simplistic global thresholds. The classic examples are “alert me if server CPU > 90%” and “alert me if response times are > 2 seconds,” both of which normally result in a full inbox (which everyone loves) or an alert storm for IT to manage.

Communication
The communication bit of analytics is just as important as the discovery bit. How well can IT interpret and understand what the monitoring solution is telling them? Is the data shown actionable–or does it require manual analysis, knowledge or expertise to arrive at a conclusion? Does the user have to look for problems on their own or does the monitoring solution present problems by itself? A monitoring solution should provide answers rather than questions.

One thing we did at AppDynamics was make analytics central to our product architecture. We’re about delivering maximum visibility through minimal effort, which means our product has to do the hard work for our users. Our customers today are solving issues in minutes versus days thanks to the way we collect, analyze and present monitoring data. If your applications are agile, complex, distributed and virtual then you probably don’t want to spend time telling a monitoring solution what is normal, abnormal, relevant or interesting. Let’s take a look at a few ways AppDynamics Pro is leveraging analytics:

Seeing The Big Picture
Seeing the bigger picture of application performance allows IT to quickly prioritize whether a problem is impacting an entire application or just a few users or transactions. For example, in the screenshot to the right we can see that in the last day the application processed 19.2 million business transactions (user requests), of which 0.1% experienced an error. 0.4% of transactions were classified as slow (> 2 SD), 0.3% were classified as very slow (> 3 SD) and 94 transactions stalled. The interesting thing here is that AppDynamics used analytics to automatically discover, learn and baseline what normal performance is for the application. No static, global or user-defined thresholds were used – the performance baselines are dynamic and relative to each type of business transaction and user request. So if a credit card payment transaction normally takes 7 seconds, then this shouldn’t be classified as slow relative to other transactions that may only take 1 or 2 seconds.
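To make the idea of a per-transaction baseline concrete, here is a minimal sketch in Python. It assumes the baseline is simply the mean of recent response times and that "slow" and "very slow" mean more than 2 and 3 standard deviations above it. The function name and sample data are hypothetical, and this is an illustration of the concept rather than how AppDynamics actually computes its baselines.

```python
from statistics import mean, stdev

def classify(response_time, history):
    """Classify one response time against a baseline learned from history.

    Assumption: baseline = mean of history, and the 2 SD / 3 SD cut-offs
    are measured above that mean.
    """
    baseline = mean(history)
    sd = stdev(history)
    if response_time > baseline + 3 * sd:
        return "very slow"
    if response_time > baseline + 2 * sd:
        return "slow"
    return "normal"

# Hypothetical response times (seconds) for two transaction types
payment_history = [6.8, 7.1, 7.0, 6.9, 7.2, 7.0]
search_history = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8]

print(classify(7.1, payment_history))  # a 7.1 s payment is normal for payments
print(classify(7.1, search_history))   # a 7.1 s search is far outside its baseline
```

The same 7.1 second response is judged differently depending on which transaction produced it, which is the point of a relative baseline over a single global threshold.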

The big picture here is that application performance generally looks OK, with 99.3% of business transactions having a normal end user experience with an average response time of 123 milliseconds. However, if you look at the data shown, 0.7% of user requests were either slow or very slow, which is almost 140,000 transactions. This is not good! The application in this example is an e-commerce website, so it’s important we understand exactly what business transactions were impacted out of those 140,000 that were classified as slow or very slow. For example, a slow search transaction isn’t the same as a slow checkout or order transaction – different transactions, different business impact.
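As a quick sanity check on those numbers, the percentages can be turned back into transaction counts. The sketch below uses only the figures quoted above; because the displayed percentages are rounded to one decimal place, the real counts could differ somewhat from what it prints.

```python
total = 19_200_000  # business transactions processed in the last day

def share(total, tenths_of_percent):
    """Count of transactions for a percentage given in tenths of a percent.

    Integer arithmetic avoids floating-point rounding.
    """
    return total * tenths_of_percent // 1000

slow = share(total, 4)       # 0.4% classified as slow
very_slow = share(total, 3)  # 0.3% classified as very slow

print(slow, very_slow, slow + very_slow)  # 76800 57600 134400
```

That works out to roughly 134,000 slow or very slow requests, in the same ballpark as the figure quoted above.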

Understanding the real Business Impact
The below screenshot shows business transaction health for the e-commerce application sorted by number of very slow requests. Analytics is used in this view by AppDynamics so it can automatically classify and present to the user which business transactions are erroneous, slow, very slow and stalling relative to their individual performance baseline (which is self-learned). At a quick glance, you can see two business transactions–“Order Calculate” and “OrderItemDisplayView”–are breaching their performance baseline.

This information helps IT determine the true business impact of a performance issue so they can prioritize where and what to troubleshoot. You can also see that the “Order Calculate” transaction had 15,717 errors. Clicking on this number would reveal the stack traces of those errors, thus allowing the APM user to easily find the root cause. In addition, we can see the average response time of the “Order Calculate” transaction was 576 milliseconds and the maximum response time was just over 64 seconds, along with 10,393 very slow requests. If AppDynamics didn’t show how many requests were erroneous, slow or very slow, then the user could spend hours figuring out the true business impact of such an incident. Let’s take a look at those very slow requests by clicking on the 10,393 link in the user interface.

Seeing individual slow user business transactions
As you can probably imagine, using average response times to troubleshoot business impact is like putting a blindfold over your eyes. If your end users are experiencing slow transactions, then you need to see those transactions to effectively troubleshoot them. For example, AppDynamics uses real-time analytics to detect when business transactions breach their performance baseline, so it’s able to collect a complete blueprint of how those transactions executed across and inside the application infrastructure. This enables IT to identify root cause rapidly.

In the screenshot above you can see all “OrderCalculate” transactions have been sorted in descending order by response time, thus making it really easy for the user to drill into any of the slow user requests. You can also see, looking at the summary column, that AppDynamics continuously monitors the response time of business transactions using moving averages and standard deviations to identify real business impact. Given the results our customers are seeing, we’d say this is a pretty proven way to troubleshoot business impact and application performance. Let’s drill into one of those slow transactions…
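For readers who want to see what "moving averages and standard deviations" can look like in practice, here is a small Python sketch of a rolling baseline. The class name, window size and threshold are assumptions made for illustration, not AppDynamics internals.

```python
from collections import deque
from statistics import mean, stdev

class RollingBaseline:
    """Flags values that exceed a moving average by a number of standard deviations."""

    def __init__(self, window=20, threshold_sd=2.0):
        self.samples = deque(maxlen=window)  # only the most recent samples are kept
        self.threshold_sd = threshold_sd

    def observe(self, value):
        """Return True if value breaches the current baseline, then record it."""
        breached = False
        if len(self.samples) >= 2:  # stdev needs at least two samples
            limit = mean(self.samples) + self.threshold_sd * stdev(self.samples)
            breached = value > limit
        self.samples.append(value)
        return breached

# Hypothetical response times (seconds) with one obvious outlier
baseline = RollingBaseline(window=20, threshold_sd=2.0)
times = [0.5, 0.6, 0.5, 0.55, 0.6, 0.5, 5.0, 0.55]
flags = [baseline.observe(t) for t in times]
print(flags)
```

One design choice worth noting: this sketch adds breaching values to the window as well, so a sustained slowdown gradually becomes the new normal. A real product may well handle that differently.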

Visualizing the flow of a slow transaction
Sometimes a picture says a thousand words, and that’s exactly what visualizing the flow of a business transaction can do for IT. IT shouldn’t have to look through pages of metrics or GBs of log files to correlate and guess why a transaction may be slow. AppDynamics does all that for you! Look at the screenshot below, which shows the flow of an “OrderCalculate” transaction that takes 63 seconds to execute across 3 different application tiers. You can see the majority of time is spent calling the DB2 database and an external 3rd party HTTP web service. Let’s drill down to see what is causing that high amount of latency.

Automating Root Cause Analysis
Finding the root cause of a slow transaction isn’t trivial, because a single transaction can invoke several thousand lines of code–kind of like finding a needle in a haystack. Call graphs of transaction code execution are useful, but it’s much faster and easier if the user can shortcut to hotspots. AppDynamics uses analytics to do just that by presenting code hotspots to the user automatically so they can pinpoint the root cause in seconds. You can see in the below screenshot that almost 30 seconds (18.8+6.4+4.1+0.6) was spent in a web service call “calculateTaxes” (which was called 4 times) with another 13 seconds being spent in a single JDBC database call (user can click to view SQL query). Root cause analysis with analytics can be a powerful asset for any IT team.
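The hotspot idea boils down to totaling time per call and ranking the results. Here is a minimal Python sketch using the timings from the example above. The data structure and the extra "renderOrderView" entry are hypothetical, and a real agent would work from instrumented call graphs rather than a flat list.

```python
from collections import defaultdict

# (call name, seconds) pairs. The calculateTaxes timings and the 13-second
# JDBC call come from the example above; the last entry is made up.
calls = [
    ("calculateTaxes", 18.8),
    ("calculateTaxes", 6.4),
    ("calculateTaxes", 4.1),
    ("calculateTaxes", 0.6),
    ("JDBC query", 13.0),
    ("renderOrderView", 0.2),  # hypothetical filler call
]

def hotspots(calls, top=3):
    """Return (name, total seconds, call count) for the slowest calls, worst first."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for name, seconds in calls:
        totals[name] += seconds
        counts[name] += 1
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return [(name, round(total, 1), counts[name]) for name, total in ranked[:top]]

for name, total, count in hotspots(calls, top=2):
    print(f"{name}: {total} s across {count} call(s)")
```

Ranking by total time rather than by the single slowest call is what surfaces "calculateTaxes" here, since its four calls together account for 29.9 seconds.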

Verifying Server Resource or Capacity
It’s true that application performance can be impacted by server capacity or resource constraints. When a transaction or user request is slow, it’s always a good idea to check what impact OS and JVM resources are having. For example, was the server maxed out on CPU? Was Garbage Collection (GC) running? If so, how long did GC run for? Was the database connection pool maxed out? All these questions require a user to manually look at different OS and JVM metrics to understand whether resource spikes or exhaustion occurred during the slowdown. This is pretty much what most sysadmins do today to triage and troubleshoot servers that underpin a slow running application. Wouldn’t it be great if a monitoring solution could answer these questions in a single view, showing IT which OS and JVM resources were deviating from their baseline during the slowdown? With analytics it can.

AppDynamics introduced a new set of analytics in version 3.4.2 called “Node Problems” to do just this. The above screenshot shows this view, whereby node metrics (e.g. OS, JVM and JMX metrics) are analyzed to determine if any were breaching their baseline and contributing to the slow performance of the “OrderCalculate” transaction. The screenshot shows that % CPU idle, % memory used and MB memory used have deviated only slightly from their baseline (denoted by blue dotted lines in the charts). Server capacity on this occasion was therefore not a contributing factor to the slow application performance. Hardware metrics that did not deviate from their baseline are not shown, thus reducing the amount of data and noise the user has to look at in this view.
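The filtering behavior described here, showing only the metrics that moved away from their baseline, can be sketched in a few lines of Python. The metric names, baseline values and the 2 SD cut-off below are all hypothetical, and this is not the actual "Node Problems" implementation.

```python
def deviating_metrics(observed, baselines, threshold_sd=2.0):
    """Return (metric, deviation in SDs) for metrics outside their baseline, worst first.

    baselines maps each metric name to a (mean, standard deviation) pair.
    """
    result = []
    for name, value in observed.items():
        base_mean, base_sd = baselines[name]
        if base_sd == 0:
            continue  # sketch only: no spread to compare against, so skip
        score = abs(value - base_mean) / base_sd
        if score > threshold_sd:
            result.append((name, round(score, 2)))
    return sorted(result, key=lambda item: item[1], reverse=True)

# Hypothetical baselines (mean, SD) and values observed during the slowdown
baselines = {
    "cpu_idle_pct": (80.0, 5.0),
    "memory_used_pct": (60.0, 4.0),
    "gc_time_ms": (200.0, 50.0),
}
observed = {"cpu_idle_pct": 65.0, "memory_used_pct": 70.0, "gc_time_ms": 220.0}

print(deviating_metrics(observed, baselines))
```

In this made-up data, GC time stays within its baseline and is dropped from the result, so the user only has to look at the two metrics that actually moved.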

Analytics makes IT more Agile
If a monitoring solution is able to discover abnormal patterns and communicate these effectively to a user, then this significantly reduces the amount of time IT has to spend managing application performance, thus making IT more agile and productive. Without analytics, IT can become a slave to data overload, big data, alert storming and silos of information that must be manually stitched together and analyzed by teams of people. In today’s world, “manually” isn’t cool or clever. If you want to be agile then you need to automate the way you manage application performance, or you’ll end up with the monitoring solution managing you.

If your current monitoring solution requires you to manually tell it what to monitor, then maybe you should be evaluating a next generation monitoring solution like AppDynamics.

App Man.

AppDynamics recognized by Forrester in APM market overview

Interest in the Application Performance Management (APM) category is very high right now. To stay one step ahead of their clients, the industry analysts who cover the category and write research to advise their clients have been very busy. In December alone, there were six different analyst reports being researched by the major analyst firms.

Forrester published the results of their research in the second week of December with the report: Market Overview: Application Performance Management, Q4 2011. Forrester clients can access the report at www.forrester.com. In this report, Forrester provides very sound advice on why APM exists and what it should do for clients. Forrester has created their own “Reference Model” for APM and evaluated the vendor landscape against those criteria.

Raison d’etre for APM

Forrester VP and Principal Analyst, JP Garbani, gives readers very pragmatic advice on the raison d’etre for APM. Simply put, APM’s job is to:

1) Alert IT to application performance and availability issues before a full-scale outage occurs

2) Isolate or pinpoint the problem source

3) Provide deep-diagnostics to enable IT to determine the root cause

For several years now, JP Garbani has been at the forefront of proclaiming that modern APM solutions should enable IT organizations to manage apps not by gauging the health of their servers or servlets, but instead by assessing what the customer or end-user cares about most – whether their Business Transaction completes quickly and doesn’t make them wait. He states that this has become even more critical as applications have gotten more distributed and complex.