Exploring the AppDynamics Integration with OpenShift Container Platform

This was originally posted on OpenShift’s blog.

Introduction

Regardless of whether you are developing traditional applications or microservices, you will inevitably have a requirement to monitor the health of your applications, and the components upon which they are built.

Furthermore, users of OpenShift Container Platform will often have their own existing enterprise standard monitoring infrastructure into which they will be looking to integrate applications they build. As part of Red Hat’s Professional Services organisation, one of the monitoring platforms I encounter whilst working with OpenShift in the field is the AppDynamics SaaS offering.

In this post, I will run through how we can take a Source-to-Image (S2I) builder, customise it to add the monitoring agent, and then use it as the basis of a Fuse Integration Services application written in Apache Camel using the Java CDI approach.

Register with AppDynamics

Before getting into the process of modifying the S2I builder image and building the application, the first thing we need to do is register with the AppDynamics platform. If you’re an existing consumer of this service, then this step obviously isn’t required!

Either way, once registered, we need to download the Java agent. In this example, I’ve used the Standalone JVM agent, but there are many more options to choose from, and one of those may better suit your requirements.

Adding the Agent to your Image

There are two primary ways you can go about adding the AppDynamics Java agent to your image.

Firstly, you can use Source-To-Image (S2I) to add the Java agent to the standard fis-java-openshift base image at the same time as pulling in all your other dependencies – mainly source code and libraries.

Secondly, you can extend the fis-java-openshift S2I builder image itself, add your own layer containing the Java agent, and use this new image as the basis for your builds.

Using S2I

When using S2I to create an image, OpenShift can execute a number of scripts as part of the process. In this context, the two scripts we are interested in are assemble and run.

In the fis-java-openshift image, the S2I scripts are located in /usr/local/s2i. We can override the actions of these scripts by adding an .s2i/bin directory into the code repository, and creating our new scripts there.
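For illustration, here is a minimal sketch of how that layout might look in an application repository (the listing is illustrative; only the assemble and run wrappers shown in the next sections are required):

# the override scripts live in .s2i/bin at the root of the application repository
$ find .s2i -type f
.s2i/bin/assemble
.s2i/bin/run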

Assemble

The assemble script is the script that pulls in the Java agent and unpacks it ready for use. Whilst we override it to carry out this task, we still need it to perform its original duties in addition to any customisations we might add:

#!/bin/bash

# run original assemble script
/usr/local/s2i/assemble

# install appdynamics agent
curl -H "Accept: application/zip" https://codeload.github.com/benemon/fis-java-appdynamics-plugin/zip/master -o /deployments/appdynamics.zip

pushd /deployments
unzip appdynamics.zip
    pushd fis-java-appdynamics-plugin-master/
    mv appdynamics/ ../
    popd
rm -rf fis-java-appdynamics-plugin-master/
rm -f appdynamics.zip
popd

As can be seen above, we actually get this script to execute the original assemble script before we add the AppDynamics agent – this way, if the Maven build fails, we haven’t wasted any time downloading any artifacts we’re not going to use.

Run

The run script is the script that sets up the environment to allow us to use the AppDynamics Java agent, and, you’ve guessed it, runs our app! Just as with the assemble script, we still want run to carry on executing our application once our customisations are complete. Therefore, all we do here is get it to check for the presence of an environment variable and, if it’s found, configure the environment to use AppDynamics.

#!/bin/bash

# only enable the agent if AppDynamics account details have been supplied
if [ x"$APPDYNAMICS_AGENT_ACCOUNT_NAME" != "x" ]; then
    # create a writable log directory and attach the agent to the JVM
    mkdir -p /deployments/logs
    export JAVA_OPTIONS="-javaagent:/deployments/appdynamics/javaagent.jar -Dappdynamics.agent.logs.dir=/deployments/logs $JAVA_OPTIONS"
fi

# hand over to the original run script
exec /usr/local/s2i/run

In this case, we’re looking for a variable called APPDYNAMICS_AGENT_ACCOUNT_NAME. After all, if we haven’t configured any credentials for the Java agent, then it can’t connect to the AppDynamics SaaS anyway.

Template

Finally, we can use a Template to pull all of these components together, kick off the build process, and deploy our application.

The S2I process is possibly the simpler of the two methods outlined here for adding the AppDynamics Java agent to your application, but it does present some points of which you need to be aware:

  • The Java agent needs to be hosted somewhere accessible to your build process. It also needs to be version controlled separately from the build, which adds extra build management overhead.
  • It will be downloaded every single time you run a build – not the most efficient way of deploying it if you have an effective CI / CD pipeline and are doing multiple builds per hour!
  • Whilst it’s simpler to configure, it can present confusing problems during the build process. For example, if your assemble script creates some directories for your application to use (logging directories, for example), you may need to think about how your build and application are being executed, and who owns what in that process.

Despite these minor issues, this is still a powerful (and useful!) mechanism, and as such I have provided a sample repository that allows you to execute an S2I build that should pull in the Java agent and run it alongside an application.

NOTE: If you’re still interested in using the S2I process, and want to know more about how to configure the Java agent with environment variables, skip ahead to ‘Adding the AppDynamics agent to a FIS application’.

Extending the Fuse Integration Services (FIS) Base Image

My preference for using the AppDynamics Java agent with applications built on FIS (and for similar use cases) is to add it into the base image once, so that it is accessible to any application based on that image.

Dockerfile

In this example, this is done by creating a new Docker image based on fis-java-openshift:latest, with the Java agent stored in this project as an artifact to be added to that image:

FROM registry.access.redhat.com/jboss-fuse-6/fis-java-openshift:latest

USER root

ADD appdynamics/ /opt/appdynamics/

RUN chgrp -R 0 /opt/appdynamics/
RUN chmod -R g+rw /opt/appdynamics/
RUN find /opt/appdynamics/ -type d -exec chmod g+x {} +

#jboss from FIS
USER 185

In this Dockerfile, we are adding the content of the appdynamics directory in our Git repository to the fis-java-openshift base image, and adjusting its group ownership and permissions so that the non-root JBoss user in that image can read and write it.

OpenShift

In order to consume this Dockerfile and turn it into a usable image, we have a number of options. By far the simplest is to execute an oc new-build command against the repository hosting the Dockerfile. In the case of this image, this would be:

oc new-build https://github.com/benemon/fis-java-appdynamics/#0.2-SNAPSHOT --context-dir=src/main/docker

Note the use of the --context-dir switch pointing to the directory containing the Dockerfile. This informs OpenShift that it needs to look in a sub-directory, not the root of the Git repository, for its build artifacts.

Once we’ve executed the above command, we can tail the OpenShift logs from the CLI (or view them from the Web Console), and see the Dockerfile build taking place. The output will be similar to this:

[vagrant@rhel-cdk ~]$ oc logs -f fis-java-appdynamics-1-build           
I0709 09:03:35.440237       1 source.go:197] Downloading "https://github.com/benemon/fis-java-appdynamics" ...
Step 1 : FROM registry.access.redhat.com/jboss-fuse-6/fis-java-openshift
 ---> 771d26abb75d
Step 2 : USER root
 ---> Using cache
 ---> c66c5f1378be
Step 3 : ADD appdynamics/ /opt/appdynamics/
 ---> ef153cb350d8
Removing intermediate container 44c776871f6f
Step 4 : RUN chown -R 185:185 /opt/appdynamics/
 ---> Running in 861f8c27225e
 ---> ee1ac493f88d
Removing intermediate container 861f8c27225e
Step 5 : USER 185
 ---> Running in 1d9fe0a02e6a
 ---> 73f598d8a0e9
Removing intermediate container 1d9fe0a02e6a
Step 6 : ENV "OPENSHIFT_BUILD_NAME" "fis-java-appdynamics-1" "OPENSHIFT_BUILD_NAMESPACE" "dev1" "OPENSHIFT_BUILD_SOURCE" "https://github.com/benemon/fis-java-appdynamics" "OPENSHIFT_BUILD_COMMIT" "d025f9961896b25fcae479d62779ae455df334d3"
 ---> Running in 510a4b51db5a
 ---> c4e938d189eb
Removing intermediate container 510a4b51db5a
Step 7 : LABEL "io.openshift.build.commit.message" "Updated the FIS build artifacts" "io.openshift.build.source-location" "https://github.com/benemon/fis-java-appdynamics" "io.openshift.build.source-context-dir" "src/main/docker" "io.openshift.build.commit.author" "Benjamin Holmes \u003canonymous@email.com\u003e" "io.openshift.build.commit.date" "Sat Jul 9 11:26:11 2016 +0100" "io.openshift.build.commit.id" "d025f9961896b25fcae479d62779ae455df334d3" "io.openshift.build.commit.ref" "master"
 ---> Running in 213844392db7
 ---> 44fede9609fd
Removing intermediate container 213844392db7
Successfully built 44fede9609fd
I0709 09:04:06.573966       1 docker.go:118] Pushing image 172.30.49.26:5000/dev1/fis-java-appdynamics:latest ...
I0709 09:04:10.970516       1 docker.go:122] Push successful

NOTE: As an alternative to a standard Dockerfile build, we can use the Kubernetes Fluent DSL to generate the BuildConfig and ImageStream objects as part of a Template that tells OpenShift to do a Dockerfile build based on the supplied project content. Using the Kubernetes DSL is optional (you are more than welcome to define the objects manually), but for a Java developer it is a simple process to understand; it allows you to version control your whole image build process, and it falls nicely into the ‘configuration as code’ discipline so prominent in the DevOps world. An example of how to use the Fluent DSL is supplied in the GitHub repository for the AppDynamics base image.

Whichever process you decide upon (the supplied GitHub repository contains artifacts for both builds), OpenShift will generate a number of Kubernetes objects. What we are interested in here is the Image Stream…

apiVersion: v1
kind: ImageStream
metadata:
...
  generation: 1
  labels:
    app: fis-java-appdynamics
  name: fis-java-appdynamics
  namespace: dev1
...
spec: {}
status:
  dockerImageRepository: 172.30.49.26:5000/dev1/fis-java-appdynamics
  tags:
...
    tag: latest

…and the BuildConfig:

apiVersion: v1
kind: BuildConfig
metadata:
...
  labels:
    app: fis-java-appdynamics
  name: fis-java-appdynamics
  namespace: dev1
...
spec:
  output:
    to:
      kind: ImageStreamTag
      name: fis-java-appdynamics:latest
  postCommit: {}
  resources: {}
  source:
    contextDir: src/main/docker
    git:
      uri: https://github.com/benemon/fis-java-appdynamics
    secrets: []
    type: Git
  strategy:
    dockerStrategy:
      from:
        kind: ImageStreamTag
        name: fis-java-openshift:latest
    type: Docker
  triggers:
  - github:
      secret: 9Y66CCaSoOipX2pgeEXs
    type: GitHub
  - generic:
      secret: IrYOFwVX0pZKSkceG4D_
    type: Generic
  - type: ConfigChange
  - imageChange:
      lastTriggeredImageID: registry.access.redhat.com/jboss-fuse-6/fis-java-openshift:latest
    type: ImageChange
status:
  lastVersion: 1

Please note that lines have been removed from the above objects for the sake of brevity.

Once the build of the fis-java-appdynamics image has completed successfully, we will have a new base image present in our namespace that contains the AppDynamics agent plugin.
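If you want a quick sanity check before moving on, the new builder image can be confirmed from the CLI (a minimal sketch using the object name created above):

# confirm the builder Image Stream and its tags exist in the current project
oc get imagestream fis-java-appdynamics
oc describe imagestream fis-java-appdynamics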

Magic.

Adding the AppDynamics agent to a FIS application

Given that I have elected to follow the second method of creating a new base image with the AppDynamics Java agent added to it, I now need a way of configuring it.

NOTE: These steps are much the same as those you would perform when using the S2I builder process. There are subtle differences, however: for example, in the S2I approach the addition of JAVA_OPTIONS is performed by the .s2i/bin/run script, as opposed to the DeploymentConfig in the Template in the sample repository here.

Configuring the Agent

The AppDynamics agent follows a similar agent model to many other profiling tools, in that it is added to a JVM using the -javaagent switch. When thinking in terms of immutable containers, we obviously want this whole configuration process to be as loosely coupled from the application image as possible.

With this in mind, the simplest way to configure the AppDynamics Java agent is via environment variables. This is helpful, as the AppDynamics agent prioritises environment variables over any other form of configuration available to it (such as controller-info.xml within the agent distribution). The AppDynamics Agent Configuration guide has further information.

One option we have here is to hard-code all of the requisite environment variables into an application DeploymentConfig. However, in this brave new world of immutable containers, short-lived cloud application workloads, and CI/CD pipelines, we can be a bit cleverer than that.

Using a Template for the application, we can still define all of the environment variables required by the AppDynamics agent, but we can also use a mixture of templated parameters and Kubernetes’ Downward API to allow the container to introspect itself at runtime and feed useful information about itself to the agent.

Therefore, we can produce a Template whose DeploymentConfig section includes an environment variable block that looks a little like this:

      - name: JAVA_OPTIONS
        value: '-javaagent:/opt/appdynamics/javaagent.jar'
      - name: TZ
        value: Europe/London
      - name: APPDYNAMICS_CONTROLLER_HOST_NAME
        value: ${APPDYNAMICS_CONTROLLER_HOST_NAME}
      - name: APPDYNAMICS_CONTROLLER_PORT
        value: ${APPDYNAMICS_CONTROLLER_PORT}
      - name: APPDYNAMICS_CONTROLLER_SSL_ENABLED
        value: ${APPDYNAMICS_CONTROLLER_SSL_ENABLED}
      - name: APPDYNAMICS_AGENT_ACCOUNT_NAME
        value: ${APPDYNAMICS_AGENT_ACCOUNT_NAME}
      - name: APPDYNAMICS_AGENT_ACCOUNT_ACCESS_KEY
        value: ${APPDYNAMICS_AGENT_ACCOUNT_ACCESS_KEY}
      - name: APPDYNAMICS_AGENT_APPLICATION_NAME
        value: ${SERVICE_NAME}
      - name: APPDYNAMICS_AGENT_TIER_NAME
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.namespace
      - name: APPDYNAMICS_AGENT_NODE_NAME
        valueFrom:
          fieldRef:
            apiVersion: v1
            fieldPath: metadata.name

Note the use of the Downward API references for APPDYNAMICS_AGENT_TIER_NAME and APPDYNAMICS_AGENT_NODE_NAME.

NOTE: If you would like to try this with my sample template, you should execute the following command against your OpenShift environment:

oc create -f https://raw.githubusercontent.com/benemon/camel-cxf-cdi-java-example/appdynamics/openshift/cxf-cdi-java-example.yml

This creates the template within the current OpenShift project. This should be the same project in which you have done the fis-java-appdynamics build, otherwise OpenShift won’t be able to locate the new base image!

When we present this via the OpenShift Web Console, we are shown a much more user-friendly version of the above, allowing you to key in your AppDynamics account details without the need to store them in potentially troublesome static configuration files within the container.

[Image: the template’s parameter form in the OpenShift Web Console]

Once all the mandatory fields have been completed, click on ‘Create’, and a screen will be presented confirming the successful creation of all template objects.
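Alternatively, if you would rather script the instantiation than use the Web Console, the same Template can be driven from the CLI. The sketch below assumes the template object is named cxf-cdi-java-example and uses the parameters referenced earlier; substitute the values from your own AppDynamics account:

oc new-app --template=cxf-cdi-java-example \
  -p SERVICE_NAME=greeting-service \
  -p APPDYNAMICS_CONTROLLER_HOST_NAME=<account>.saas.appdynamics.com \
  -p APPDYNAMICS_CONTROLLER_PORT=443 \
  -p APPDYNAMICS_CONTROLLER_SSL_ENABLED=true \
  -p APPDYNAMICS_AGENT_ACCOUNT_NAME=<account-name> \
  -p APPDYNAMICS_AGENT_ACCOUNT_ACCESS_KEY=<access-key>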

Once this Template has been instantiated successfully, OpenShift will start a build against the source code branch, using fis-java-appdynamics as the S2I builder image.

Be aware that this project repository contains a standard Maven settings.xml which can be used to define how Maven resolves the build dependencies. If you experience long build times, this file can be updated to resolve against a local Maven repository, such as Sonatype Nexus or JFrog Artifactory.

After the build has completed successfully, a new Pod will be started, running the application with its embedded Java agent (parts omitted for brevity):

Executing /deployments/bin/run ...
Launching application in folder: /deployments
Running  java  -javaagent:/opt/appdynamics/javaagent.jar
…
Install Directory resolved to[/opt/appdynamics]
log4j:WARN No appenders could be found for logger (com.singularity.MissingMethodGenerator).
log4j:WARN Please initialize the log4j system properly.
[Thread-0] Sat Jul 09 05:29:26 BST 2016[DEBUG]: AgentInstallManager - Full Agent Registration Info Resolver is running
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: AgentInstallManager - Full Agent Registration Info Resolver found env variable [APPDYNAMICS_AGENT_APPLICATION_NAME] for application name [greeting-service]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: AgentInstallManager - Full Agent Registration Info Resolver found env variable [APPDYNAMICS_AGENT_TIER_NAME] for tier name [dev1]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: AgentInstallManager - Full Agent Registration Info Resolver found env variable [APPDYNAMICS_AGENT_NODE_NAME] for node name [greeting-service-1-fbcaa]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: AgentInstallManager - Full Agent Registration Info Resolver using selfService [true]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: AgentInstallManager - Full Agent Registration Info Resolver using selfService [true]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: AgentInstallManager - Full Agent Registration Info Resolver using application name [greeting-service]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: AgentInstallManager - Full Agent Registration Info Resolver using tier name [dev1]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: AgentInstallManager - Full Agent Registration Info Resolver using node name [greeting-service-1-fbcaa]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[DEBUG]: AgentInstallManager - Full Agent Registration Info Resolver finished running
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: AgentInstallManager - Agent runtime directory set to [/opt/appdynamics/ver4.1.7.1]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: AgentInstallManager - Agent node directory set to [greeting-service-1-fbcaa]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: JavaAgent - Using Java Agent Version [Server Agent v4.1.7.1 GA #9949 ra4a2721d52322207b626e8d4c88855c846741b3d 18-4.1.7.next-build]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: JavaAgent - Running IBM Java Agent [No]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: JavaAgent - Java Agent Directory [/opt/appdynamics/ver4.1.7.1]
[Thread-0] Sat Jul 09 05:29:26 BST 2016[INFO]: JavaAgent - Java Agent AppAgent directory [/opt/appdynamics/ver4.1.7.1]
Agent Logging Directory [/opt/appdynamics/ver4.1.7.1/logs/greeting-service-1-fbcaa]
Running obfuscated agent
Started AppDynamics Java Agent Successfully.
…
Registered app server agent with Node ID[8494] Component ID[6859] Application ID [4075]

Verifying Successful Integration

Once the application has started successfully, and the agent has registered itself with AppDynamics, you should be able to see your application on the AppDynamics Dashboard:

[Image: the application registered on the AppDynamics dashboard]

Drilling down into the application in the Dashboard also confirms that the Downward API has done its job, and we’ve automatically pulled in both the pod name and the Kubernetes namespace.

[Image: drill-down showing the pod name and Kubernetes namespace as node and tier]

Testing Integration

In order to get something a bit more meaningful out of the AppDynamics platform, I’ve put together a small test harness in SoapUI that simply runs a load test against the Fuse application’s RESTful endpoint:

[Image: SoapUI load test running against the Fuse application’s RESTful endpoint]

We can see these requests arriving at the application in OpenShift’s container logs, viewed either via the Web Console or via the CLI.
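For example, from the CLI (the pod name is taken from the agent output shown earlier; yours will differ):

# follow the logs of the running application pod
oc logs -f greeting-service-1-fbcaa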

Once the test harness has completed its cycle, going back to the AppDynamics dashboard starts to give us a glimpse of something a bit more useful to us from an application monitoring and operations point of view:

[Image: AppDynamics dashboard showing application load and response times after the test run]

We can even drill down into the Web Service endpoints themselves, and examine the levels of load each is experiencing.

[Image: per-endpoint load levels for the web service]

Application Scaling

One of the really nice things about using OpenShift, the Downward API, and AppDynamics in this way is that it even gives us useful information about health, request distribution and throughput when we scale out the application. Here the application has been scaled to 3 nodes:

[Image: AppDynamics view of the application scaled to three nodes]
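For reference, the scale-out itself is a single command; the sketch below assumes the DeploymentConfig is named greeting-service, matching the service created from the template earlier:

# scale the Fuse application out to three replicas
oc scale dc/greeting-service --replicas=3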

We can also look at the load and response times being experienced by users of the application service. Whilst this particular view gives an amalgamation of data, it’s a simple operation to drill down into an individual JVM to see how it’s performing.

Conclusion

I have barely scratched the surface of what we can monitor, log, and alert on with OpenShift, Fuse Integration Services, and AppDynamics. Hopefully, though, this post gives you a glimpse of what is possible using the tools provided by the OpenShift Container Platform, and a template for integrating not only AppDynamics, but also other useful toolsets that follow a similar agent model.

Resources

The full source repository for fis-java-appdynamics is here: https://github.com/benemon/fis-java-appdynamics/tree/0.2-SNAPSHOT

The full source repository for the FIS/Camel application based on the fis-java-appdynamics image is here: https://github.com/benemon/camel-cxf-cdi-java-example/tree/appdynamics

The full source repository for the FIS/Camel application with the Java agent added using S2I is here: https://github.com/benemon/camel-cxf-cdi-java-example/tree/appdynamics-s2i

Please note the branches and tags in these repositories.

Technical deep dive on what’s impacting Healthcare.gov

There has been a wealth of speculation about what might be wrong with the healthcare.gov website from some of the best and brightest technology has to offer. It seems like everyone has an opinion about what is wrong, yet there are still very few facts available to the public. One of our engineers at AppDynamics, Evan Kyriazopoulos, did some research using our End User Monitoring platform and has some interesting findings that I will share with you today.

What We Already Know

Healthcare.gov is an architecture of distributed services built by a multitude of contractors, and integrates with many legacy systems and insurance companies. CGI Federal (the main contractor) said that it was working with the government and other contractors “around the clock” to improve the system, which it called “complex, ambitious and unprecedented.” We’ve put together a logical architecture diagram based upon the information that is currently in the public domain. This logical diagram could represent hundreds or even thousands of application components and physical devices that work in coordination to service the website.

[Image: logical architecture diagram of healthcare.gov]

How We Tested

Since we don’t have access to the server side of healthcare.gov to inject our JavaScript end user agent, we had to inject it from the browser side for a single-user perspective. Using GreaseMonkey for Firefox, we injected our small and lightweight JavaScript code. This code allowed us to measure both server and browser response times of any pages that we visited. All data was sent to our AppDynamics controller for analysis and visualization.

The Findings

What we found after visiting multiple pages was that the problems with response time are pretty evenly divided between server response time issues and browser processing time issues. In the screenshot below you can see the average of all response times for the web pages we visited during this exploratory session. The “First Byte Time” part of the graphic is the measure of how long the server took to get its initial response to the end user browser. Everything shown in blue/green hues represents server side response times and everything in purple represents browser client processing and rendering. You can see really bad performance on both the client and server side on average.

[Image: session overview showing average server-side and browser response times]

Now let’s take a look at some individual page requests to see where their problems were. The image below shows an individual request of the registration page and the breakdown of response times across the server and client sides. This page took almost 71 seconds to load, with 59 seconds attributed to the server side and almost 12 seconds attributed to client-side data loading and rendering. There is an important consideration when looking at this data. Without access to either the server side to measure its response time, or to all the network segments connecting the server to the client to measure latency, we cannot fully determine why the client side is slow. We know there are JS and CSS file optimizations that should be made, but if the server side and/or network connections are slow then the impact will be seen and measured at the end user browser.

[Image: breakdown of a slow registration page request]

Moving past registration… The image below shows a “successful” load of a blank profile page. This is the type of frustrating web application behavior that drives end users crazy. It appears as though the page has loaded successfully (to the browser at least) but that is not really the case.

[Image: a “successfully” loaded but blank profile page]

When we look at the list of our end user requests we see that there are AJAX errors associated with the myProfile web page (right below the line highlighted in blue).

[Image: end user request list showing AJAX errors for the myProfile page]

So the myProfile page successfully loaded in ~3.4 seconds and it’s blank. Those AJAX requests are to blame since they are actually responsible for calling other service components that provide the profile details (SOA issues rearing their ugly head). Let’s take a look at one of the failed AJAX requests to see what happened.

Looking at the image below you can see there is an HTTP 503 status code as the response to the AJAX request. For those of you who don’t have all of the HTTP status codes memorized this is a “Service Unavailable” response. It means that the HTTP server was able to accept the request but can’t actually do anything with it because there are server side problems. Ugh… more server side problems.

[Image: AJAX request detail showing an HTTP 503 “Service Unavailable” response]

Recommendations

Now that we have some good data about where the problems really lie, what can be done to fix them? The server side of the house must be addressed first as it will have an impact on all client side measurements. There are 2 key problems from the server side that need resolving immediately.

1) Functional Integration Errors

When hundreds of services developed by different teams come together, they exchange data through APIs using XML (or similar) formats (note the AJAX failures shown above). Ensuring that all data exchange is accurate and error-free requires proper monitoring and extensive integration testing. Healthcare.gov was obviously launched without proper monitoring and testing, and that’s a major reason why many user interactions are failing or throwing errors, and why insurance companies are getting incomplete data forms.

To be clear, this isn’t an unusual problem. We see this time and time again with many companies who eventually come to us looking for help solving their application performance issues.

To find and fix these problems at least 5x faster, a product like AppDynamics is required. We identify the source of errors within and across services much faster in test or production environments so that developers can fix them right away. The screenshots below show an example of an application making a failed service call to a third party, and just how easy it is to identify with AppDynamics. In this case there was a socket timeout exception waiting for PayPal (i.e. the PayPal service never answered our web services call at all).

[Image: transaction flow showing a failed call to a third-party service]

[Image: error identification in AppDynamics]

[Image: PayPal socket timeout exception detail]

2) Scalability Bottlenecks

The second big problem is that there are performance bottlenecks in the software that need to be identified and tuned quickly. Here is some additional information that almost certainly applies to healthcare.gov:

a) Website workloads can be unpredictable, especially so with a brand new site like healthcare.gov. Bottlenecks will occur in some services as they get overwhelmed by requests or start to have resource conflicts. When you are dealing with many interconnected services, you must use monitoring that tracks all of your transactions by tagging them with a unique identifier and following them across any application components or services that are touched. In this way you will be able to immediately see when, where, and why a request has slowed down.

b) Each web service is really a mini application of its own. There will be bottlenecks in the code running on every server and service. To find and remediate these code issues quickly, you must have a monitoring tool that automatically finds the problematic code and shows it to you in an intuitive manner. We put together a list of the most common bottlenecks in an earlier blog post that you can access by clicking here.

Finding and fixing bottlenecks and errors in custom applications is why AppDynamics exists. Our customers typically resolve their application problems in minutes instead of taking days or weeks trying to use log files and infrastructure monitoring tools.

AppDynamics has offered to help fix the healthcare.gov website for free because we think the American people deserve a system that functions properly and doesn’t waste their time. Modern, service oriented application architectures require monitoring tools designed specifically to operate in those complex environments. AppDynamics is ready to answer the call to action.

Intelligent Alerting for Complex Applications – PagerDuty & AppDynamics

Today AppDynamics announced integration with PagerDuty, a SaaS-based provider of IT alerting and incident management software that is changing the way IT teams are notified, and how they manage incidents in their mission-critical applications.  By combining AppDynamics’ granular visibility of applications with PagerDuty’s reliable alerting capabilities, customers can make sure the right people are proactively notified when business impact occurs, so IT teams can get their apps back up and running as quickly as possible.

You’ll need a PagerDuty and AppDynamics license to get started; if you don’t already have one, you can sign up for free trials of PagerDuty and AppDynamics online.  Once you complete this simple integration setup, you’ll start receiving incidents in PagerDuty created by AppDynamics’ out-of-the-box policies.

Once an incident is filed, it will appear in the following list view:

[Image: PagerDuty incident list view]

When the ‘Details’ link is clicked, you’ll see the details for this particular incident including the Incident Log:

[Image: incident details including the Incident Log]

If you are interested in learning more about the event itself, simply click ‘View message’ and all of the AppDynamics event details are displayed, showing which policy was breached, the violation value, severity, etc.:

[Image: AppDynamics event details within the PagerDuty incident]

Let’s walk through some examples of how our customers are using this integration today.

Say Goodbye to Irrelevant Notifications

Is your work email address included in some sort of group email alias, and do you get several, maybe even dozens, of notifications a day that aren’t particularly relevant to your responsibilities or are intended for other people on your team?  I know I do.  Imagine a world where your team only receives messages that are relevant to their individual roles, and where messages only get sent to people who are actually on call.  With AppDynamics and PagerDuty you can now build in alerting logic that routes specific alerts to specific teams and only sends messages to the people who are actually on call.  App response time way above the normal value?  Send an alert to the app support engineer who is on call, not all of their colleagues.  Not having to sift through a bunch of irrelevant alerts means that when one does come through, you can be sure it requires YOUR attention right away.

[Image: PagerDuty on-call schedules]

Automatic Escalations

If you are only sending a notification and assigning an incident to one person, what happens if that person is out of the office or doesn’t have access to the internet or phone to respond to the alert?  Well, the good thing about PagerDuty is that you can build in automatic escalations.  So, if you have a trigger in AppDynamics to fire off a PagerDuty alert when a node is down, and the infrastructure manager isn’t available, you can automatically escalate and re-assign the alert to a backup employee or admin.

[Image: PagerDuty escalation policy]

The Sky is Falling!  Oh Wait – We’re Just Conducting Maintenance…

Another potentially annoying situation for IT teams is all of the alerts that get fired off during a maintenance window.  PagerDuty has the concept of a maintenance window so your team doesn’t get a bunch of doomsday messages during maintenance.  You can even set up a maintenance window with one click if you prefer to go that route.

[Image: setting up a PagerDuty maintenance window]

Either way, no new incidents will be created during this time period… meaning your team will be spared having to open, read, and file the alerts and update / close out the newly-created incidents in the system.

We’re confident this integration of the leading application performance management solution with the leading IT incident management solution will save your team time and make them more productive.  Check out the AppDynamics and PagerDuty integration today!