The Digital Enterprise – Problems and Solutions

According to a recent article featured in Wall Street and Technology, Financial Services (FS) companies have a problem. The article explains that FS companies built more data center capacity than they needed when profits were up and demand was rising. Now that profits are lower and demand has not risen as expected, those data centers are partially empty and very costly to operate.

FS companies are starting to outsource their IT infrastructure and this brings a new problem to light…

“It will take a decade to complete the move to a digital enterprise, especially in financial services, because of the complexity of software and existing IT architecture. ‘Legacy data and applications are hard to move’ to a third party, Bishop says, adding that a single application may touch and interact with numerous other applications. Removing one system from a datacenter may disrupt the entire ecosystem.”

Serious Problems

The article calls out a significant problem that FS companies face now, and will keep facing for the next decade, but it doesn’t mention a solution.

The problem is that you can’t just pick up an application and move it without impacting other applications. Based upon my experience working with FS applications, I see multiple related problems:

  1. Disruption of other applications
  2. Baselining performance and availability before the move
  3. Ensuring performance and availability after the move

All of these problems increase risk and the chance that users will be impacted.

Solutions

1. Disruption of other applications – The solution to this problem is easy in theory and traditionally difficult in practice. The theory is that you need to understand all of the external interactions with the application you want to move.

One solution is to use ADDM (Application Discovery and Dependency Mapping) tools that scan your infrastructure looking for application components and the various communications to and from them. This method works okay (I have used it in the past) but typically requires a lot of manual data manipulation after the fact to improve the accuracy of the discovered information.

ADDM product view of application dependencies.

Another solution is to use an APM (Application Performance Management) tool to gather the information from within the running application. The right APM tool will automatically see all application instances (even in a dynamically scaled environment) as well as all of the communications into and out of the monitored application.

APM visualization of an application and its components with remote service calls.

APM tool list of remote application calls with response times, throughput and errors.

A combination of these two types of tools would provide the best of both: accurate, easy-to-consume information (the APM strength) along with the flexibility to cover all of the one-off custom application processes that an APM tool might not support (the ADDM strength).
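To make the idea concrete, here is a minimal sketch of what you can do with dependency data once you have it, whichever tool produced it. The application names and the list of observed calls are invented for illustration; a real list would come from an ADDM scan or an APM export. The sketch answers two questions: what does an application call directly, and which applications would be affected, directly or indirectly, if it moved.

```python
from collections import defaultdict

# Hypothetical observed calls as (caller, callee) pairs. These names are
# made up; real data would come from an ADDM or APM tool.
OBSERVED_CALLS = [
    ("web-portal", "payments"),
    ("payments", "ledger"),
    ("payments", "fraud-check"),
    ("reporting", "ledger"),
    ("fraud-check", "customer-db"),
]


def build_maps(calls):
    """Return (outbound, inbound) adjacency maps from caller/callee pairs."""
    outbound = defaultdict(set)
    inbound = defaultdict(set)
    for caller, callee in calls:
        outbound[caller].add(callee)
        inbound[callee].add(caller)
    return outbound, inbound


def direct_dependencies(app, calls):
    """Applications that `app` calls directly."""
    outbound, _ = build_maps(calls)
    return sorted(outbound.get(app, ()))


def impacted_by_move(app, calls):
    """Applications that call `app` directly or through other applications."""
    _, inbound = build_maps(calls)
    seen = set()
    stack = [app]
    while stack:
        current = stack.pop()
        for caller in inbound.get(current, ()):
            if caller not in seen:
                seen.add(caller)
                stack.append(caller)
    seen.discard(app)  # a call cycle could otherwise list the app itself
    return sorted(seen)


print(impacted_by_move("ledger", OBSERVED_CALLS))
print(direct_dependencies("payments", OBSERVED_CALLS))
```

The answer is only as good as the list of observed calls, which is why the accuracy of the discovery method matters so much.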

2. Baselining performance and availability before the move – It’s critically important to understand the performance characteristics of your application before you move. This provides the baseline you need for comparison after the move. The last thing you want to do is degrade application performance and user satisfaction by moving an application. The solution here is to leverage the APM tool referenced in solution #1. Baselining is a core strength of APM and should be approached from multiple perspectives:

  1. Overall application throughput, response times, and availability
  2. Individual business transaction throughput and response times
  3. External dependency throughput and response times
  4. Application error rate and type

Application overview with baseline information.

Business transaction overview and baseline information.
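For readers who want to see what a baseline boils down to, the following is a rough sketch and not how any particular APM product computes it. It assumes you have a list of request samples for one measurement window, each a response time in milliseconds plus an error flag; the `Baseline` fields and function names are my own, chosen to mirror the perspectives listed above.

```python
import math
from dataclasses import dataclass


@dataclass
class Baseline:
    throughput_per_min: float
    mean_ms: float
    p95_ms: float
    error_rate: float


def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(pct * len(sorted_values) / 100))
    return sorted_values[rank - 1]


def compute_baseline(samples, window_minutes):
    """Summarize (response_time_ms, is_error) samples from one time window."""
    if not samples or window_minutes <= 0:
        raise ValueError("need at least one sample and a positive window")
    times = sorted(ms for ms, _ in samples)
    errors = sum(1 for _, is_error in samples if is_error)
    return Baseline(
        throughput_per_min=len(samples) / window_minutes,
        mean_ms=sum(times) / len(times),
        p95_ms=percentile(times, 95),
        error_rate=errors / len(samples),
    )


# Hypothetical ten-minute window: 18 fast calls, one slow call, one failure.
samples = [(100, False)] * 18 + [(200, False), (1000, True)]
print(compute_baseline(samples, window_minutes=10))
```

The same function can be run per business transaction or per external dependency by filtering the samples first, which gives you one baseline for each perspective.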

3. Ensuring performance and availability after the move – Now that your application has moved to an outsourcer, it’s more important than ever to understand performance and availability. Invariably your application performance will degrade and the finger pointing between you and your outsourcer will begin. That is, unless you are using an APM tool to monitor your application. The whole point of APM tools is to end finger pointing and to reduce mean time to restore service (MTRS) as much as possible. By using APM after the application move you will provide the highest possible level of service to your customers.

Comparison of two application releases. Granular comparison to understand before and after states. – Key Performance Indicators

Comparison of two application releases. Granular comparison to understand before and after states. – Load, Response Time, Errors
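A before-and-after comparison can be as simple as the sketch below. It is an illustration under stated assumptions: the metric names and the 10% tolerance are arbitrary choices of mine, and it only handles metrics where a higher value is worse (response times, error rate). Throughput would need the opposite check.

```python
def compare_to_baseline(before, after, tolerance=0.10):
    """Return metrics that worsened by more than `tolerance` (a fraction).

    `before` and `after` map metric names to values where higher is worse,
    for example response time in milliseconds or error rate.
    """
    regressions = {}
    for metric, old in before.items():
        new = after.get(metric)
        if new is None:
            continue  # metric was not measured after the move
        if old == 0:
            worsened = new > 0
            change = float("inf") if worsened else 0.0
        else:
            change = (new - old) / old
            worsened = change > tolerance
        if worsened:
            regressions[metric] = change
    return regressions


# Hypothetical numbers measured before and after a move.
before = {"mean_ms": 150.0, "p95_ms": 200.0, "error_rate": 0.05}
after = {"mean_ms": 160.0, "p95_ms": 300.0, "error_rate": 0.05}
print(compare_to_baseline(before, after))
```

In this made-up case the mean barely moved but the 95th percentile grew by half, which is the kind of detail that settles a finger-pointing argument quickly.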

If you’re considering, or are in the process of, transitioning to a digital enterprise you should seriously consider using APM to solve a multitude of problems. You can sign up for a free trial of AppDynamics and get started today.

The Poor Misguided CMDB

It’s not an exciting or glamorous subject but it’s an absolutely critical concept for properly managing your applications and infrastructure. CMDB, CMS, SIS, EIEIO (joking), or anything else you want to call it these days is a concept that has been poorly implemented from the very beginning. The concept itself is sound: have a single source of truth that describes your application and infrastructure environments to enable IT operations efficiency (at least that’s the core concept in my mind).

CMDBs Are Awesome, but Not Really

Back in my days working for global financial services institutions, I relied heavily on the CMDB as a starting point for many different activities. I say it was only a starting point because invariably the information within the CMDB was wrong. It was either originally input wrong, not updated regularly enough, or updated with incorrect information. No matter what the real cause, my single source of truth became a partial source of truth that always required extensive verification. Not very efficient!

This is how a traditional CMDB represents component relationships.

Getting back to my reliance upon the CMDB… Here are the types of activities that required me to query the CMDB:

  • Change controls – Anytime I needed to change anything on a production server I needed to understand what applications had components existing on that server.
  • Application upgrades – I needed to know all of the application components at that exact moment to make sure they all got the update.
  • Application migrations – There were times when we simply needed to move the application from one data center to another. This required a complete understanding of all application components, flows, and dependencies.
  • Performance troubleshooting – When I was asked to get involved with a performance problem, one of the first things I wanted to understand was all of the components that made up the application and any external dependencies.

There are many more uses for the data in the CMDB but those were my top use cases. As I said before, invariably the CMDB was wrong. There were usually components missing from the CMDB, and components in the CMDB that were no longer part of the application, and incorrect dependencies, and, and, and…

Salvation by Auto Discovery and Dependency Mapping, but Not Really

So what’s a good IT department supposed to do about this problem? Buy a discovery and dependency mapping tool, of course. And that’s exactly what we did. We explored the market and brought in the best (relative) tool for the job. It was one of those agentless tools that make deployment way faster and easier in a large enterprise like mine. The problem, as I would later realize, is that agentless discovery tools only see what’s going on when they log in and scan the host. Under normal conditions you can scan your environment maybe once or twice a day without completely overwhelming the tool. What that means is that all of those transient (short-lived) service calls into or out of each application are missed by the discovery tool unless they happen to be running at nearly the exact time of the scan.

Discovery and dependency mapping tools did a better job at maintaining and visualizing relationships but still fell short.

To add further insult to injury, most organizations don’t want a bunch of scanning activity going on during heavy business hours, so the scans are typically relegated to the middle of the night when there is little or no load on the applications being scanned. This amplifies the transient communication dependency mapping problem. Now the vendors who sell these solutions will claim that there are ways to deal with this issue if you just use their network sniffer in conjunction with their agentless appliance. I won’t comment much on this but I will say that this creates another slew of deployment problems from a political and technical perspective, and the thought of ever trying it again makes me wince in pain. (Where did I put my therapist’s phone number again?)
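The scan-window problem is easy to demonstrate with a toy model. In the sketch below everything is hypothetical: the endpoint names, the connection times (minutes since midnight), and the single 02:00 scan. A connection is visible to a scan only if it is open at the moment the scan runs.

```python
def seen_by_scans(connections, scan_times):
    """Remote endpoints visible to point-in-time scans.

    connections: (remote, opened_at, closed_at) in minutes since midnight.
    scan_times: minutes since midnight at which the scanner inspects the host.
    """
    seen = set()
    for remote, opened_at, closed_at in connections:
        if any(opened_at <= t < closed_at for t in scan_times):
            seen.add(remote)
    return seen


def seen_continuously(connections):
    """Remote endpoints visible to an observer that watches all day."""
    return {remote for remote, _, _ in connections}


# Hypothetical day of traffic for one host.
CONNECTIONS = [
    ("customer-db", 0, 1440),          # persistent connection, open all day
    ("fx-rates-service", 600, 601),    # one-minute call at 10:00
    ("batch-settlement", 1020, 1025),  # five-minute job at 17:00
]
NIGHTLY_SCANS = [120]  # a single scan at 02:00

missed = seen_continuously(CONNECTIONS) - seen_by_scans(CONNECTIONS, NIGHTLY_SCANS)
print(sorted(missed))
```

Adding more scans helps only if one of them happens to land inside a short-lived connection, which is exactly the luck the paragraph above describes.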

The Application Knows, Really It Does

What better source of understanding application components and dependencies is there than the application itself? Let’s explore this for a moment. If you can live inside of the application and see all of the socket connections opening and closing then you absolutely know what else the application is communicating with. Imagine if there was a system that could automatically see all of these connections that open and close at any given time and draw a picture of the application and all of its dependencies at that exact moment in time or any point in the past. And imagine if this system had a published API that allowed your other systems to query it for this information. Regardless of transient or persistent connection types, you would have the ability to know all of the components of your application and all of its external dependencies. This is exactly what AppDynamics does out of the box.
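As a thought experiment, the core of such a system can be sketched in a few lines. To be clear, this is not AppDynamics code or its API; the class and method names are invented, and a real agent would hook the runtime’s socket layer instead of being called by hand. The point is only that recording open and close events lets you answer dependency questions for any moment, past or present.

```python
class DependencyHistory:
    """Records connection open/close events reported from inside an application."""

    def __init__(self):
        self._intervals = []  # each entry: [remote, opened_at, closed_at or None]
        self._open = {}       # remote -> index of its currently open interval

    def opened(self, remote, at):
        if remote not in self._open:
            self._open[remote] = len(self._intervals)
            self._intervals.append([remote, at, None])

    def closed(self, remote, at):
        index = self._open.pop(remote, None)
        if index is not None:
            self._intervals[index][2] = at

    def dependencies_at(self, at):
        """Remote endpoints with a connection open at time `at`."""
        return sorted({
            remote for remote, start, end in self._intervals
            if start <= at and (end is None or at < end)
        })

    def dependencies_between(self, start_at, end_at):
        """Remote endpoints with a connection overlapping [start_at, end_at)."""
        return sorted({
            remote for remote, start, end in self._intervals
            if start < end_at and (end is None or end > start_at)
        })


# Hypothetical events, with times in minutes since midnight.
history = DependencyHistory()
history.opened("customer-db", at=0)
history.opened("fx-rates-service", at=600)
history.closed("fx-rates-service", at=601)
print(history.dependencies_at(120))
print(history.dependencies_between(0, 1440))
```

Exposing `dependencies_at` and `dependencies_between` over a network API is what would let a change-control or migration tool ask the application itself instead of a stale record.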

Discovery of application components and dependencies from within the application, in real time, provides the most accurate information possible.

I believe that the CMDB of old should be an ecosystem of information points that provide the truth at the moment it is requested. Forrester calls this a SIS (service information system) in their research paper titled “Reinvent The Obsolete But Necessary CMDB”, which is worth reading if you’re interested. The SIS isn’t some vendor tool; instead it’s an architectural construct that should be different for each company based upon their tools and requirements. From my perspective it is incredibly difficult and inefficient to manage a datacenter or group of applications without implementing this type of concept.

If you’ve already got AppDynamics deployed, consider using it as a significant source of truth about your applications. If you’re stuck with an outdated CMDB, consider shifting your architecture and check out how AppDynamics can help with a free trial.