What’s up with the Network and Storage teams?

A few weeks ago I was presenting at CMG Performance and Capacity 2013 and during my presentation we (myself and a few audience members) got slightly side-tracked. Our conversation somehow took a turn and became a question of why it was so hard to get performance data from the network and storage teams. Audience members were asking me why, when they requested this type of data, they were typically stonewalled by these organizations.

I didn’t have a good answer for this question and in fact I have run into the same problem. Back when I was working in the Financial Services sector I was part of a team that was building a master dashboard that collected data from a bunch of underlying tools and displayed it in a drill-down dashboard format. It was, and still is, a great example of how to provide value to your business by bringing together the most relevant bits of data from your tooling ecosystem.

This master dashboard was focused on applications and included as many of the components involved in delivering each application as possible. Web servers, application servers, middleware, databases, OS metrics, business metrics, etc… were all included and the key performance indicators (KPIs) for each component were available within the dashboard. The entire premise of this master dashboard relied upon getting access to the tools that collected the data via API or through database queries.

The only problems that our group faced in getting access to the data we needed were with the network and storage teams. Why was that? Was it because these teams did not have the data we needed? Was it because these teams did not want anyone to see when they were experiencing issues? Was it for some other reason?

I know the network team had the data we required because they eventually agreed to provide the KPIs we had asked for. This was great, but the process was very painful and took way longer than it should have. Still, we eventually got access to the data. The big problem is that we never got access to the storage data. To this day I still don’t know why we were blocked at every turn but I’m hoping that some of the readers of this blog can share their insight.

Back in the day, when my team was chasing the storage team for access to their monitoring data, there weren’t really any tools that we could find for performance monitoring of storage arrays besides the tools that came with the arrays. These days I would have been able to get the data I needed for NetApp storage by using AppDynamics for Databases (which includes NetApp performance monitoring capabilities). You can read more about it by clicking here.

Have you been stonewalled by the network or storage or some other team? Did you ever get what you were after? Based upon my experiences talking with a lot of different folks at different organizations this seems to be a significant problem today. Are you on a network or storage team? Does your team freely share the data they have? Please share your experience, insight, or questions in the comments below. Just to clarify, I hold no ill will against any of you network or storage professionals out there. I’d just like to hear some opinions and gain some perspective.

Deploying APM in the Enterprise Part 7: Dashboards and Reports – Get the Business on Your Side

Welcome back to my series on Deploying APM in the Enterprise. In Part 6: Spread the APM Love we talked about ways to increase user adoption of your APM tool and thereby make your organization more successful.

This week we focus on Dashboards and Reports. I’m a huge believer in unlocking the information and intelligence contained within your monitoring tools, and turning that data into actionable information. Over time your tooling ecosystem will collect a wealth of data that can be used for capacity planning, business intelligence, development planning, and many other business and IT activities that require information about usage patterns and operational statistics. By unlocking this value, and surfacing it to the business and IT organizations in a meaningful way, you take another step up the maturity ladder and solidify your rockstar status within your organization. A huge benefit of providing useful information to the business is that they will fight for your tools and projects if they are ever in question!

Dashboards

Dashboards should be used to provide real-time insight into critical business and IT indicators. A failed server doesn’t always mean that there is impact to the business.

Several of our customers are using AppDynamics dashboards to better understand business activity. For example, we have multiple e-commerce customers that are tracking revenue and order volumes in real time.

Let’s take a look at some different perspectives within the organization and explore what type of dashboard each role requires.

Management

Most managers don’t need to know the sordid details about the infrastructure that supports the business applications. For each application within the manager’s realm of responsibility they need to understand key business indicators and have an overall indication of their application performance and health. Take a look at the dashboard below…

[Image: management dashboard]

This is the type of dashboard that managers want to keep an eye on throughout the workday. Any impact to the business will be easily identified on this dashboard so that the manager can make sure the impact is being addressed. There should be alerts associated with these metrics so that the operations center is notified of business impact but it’s not always an IT problem when business metrics deviate from normal behavior. It’s possible that sales volume is impacted by a competitor offering lower prices on the same product. This will show up as business impact and there is nothing that the IT staff can do to fix it. This is a business problem that needs to be addressed by the appropriate department.
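To make "deviate from normal behavior" a little more concrete, here is a minimal sketch of a baseline check on a business metric. It is only an illustration, not how any particular monitoring tool computes its baselines: the function name, the three-standard-deviation threshold, and the sample order volumes are all assumptions made up for this example.

```python
from statistics import mean, stdev


def deviates_from_baseline(history, current, threshold=3.0):
    """Return True when `current` is more than `threshold` standard
    deviations away from the mean of `history`.

    `history` is a list of recent samples of one business metric
    (for example, orders per minute). All names are hypothetical.
    """
    if len(history) < 2:
        # Not enough samples to estimate a spread, so never alert.
        return False
    baseline = mean(history)
    spread = stdev(history)
    if spread == 0:
        # A perfectly flat history: any change at all is a deviation.
        return current != baseline
    return abs(current - baseline) / spread > threshold


# Hypothetical order volumes for the last six minutes.
orders_per_minute = [118, 122, 120, 119, 121, 120]

print(deviates_from_baseline(orders_per_minute, 121))  # within normal range
print(deviates_from_baseline(orders_per_minute, 60))   # sharp drop in orders
```

Note that a check like this only tells you that the business metric moved. It says nothing about whether the cause is an IT problem or a competitor's price cut, which is exactly why the alert should start a conversation rather than assign blame.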

Of course any IT infrastructure problems that impact the business will also show up on the manager’s dashboard but only by way of business metrics that have deviated from normal behavior. This fosters communication between IT and the business which should lead to improved cooperation and coordination over time.

Application Support

Application support rides the fence between the business and IT. If something goes wrong with their application, the business makes sure that app support hears about it and gets the problem resolved in a timely manner. Application support teams need a view of key business indicators (similar to what the managers are looking at) as well as key IT indicators so they know when there are problems with the application or the infrastructure they depend upon.

Operations

The operations teams need to know when infrastructure components are nearing capacity, throwing errors, or failing completely. It is their responsibility to ensure the proper functionality of the infrastructure and as such they need an infrastructure-centric view of the components they are responsible for. This is a very classic enterprise monitoring view that has been long established and still has its place in the modern monitoring world. With that said, it’s beneficial to show the operations teams a few key business indicators so they know how urgent it is to replace that failed piece of hardware.

Reports

Given that dashboards are best utilized to provide real-time status information, reports are the go-to solution for information that drives action in the future. Reports can be about the application, infrastructure, business, or anything else that makes sense given the data you have to work with. What I want to focus on in this blog are the insights I have picked up over the years when developing and utilizing reports at my past companies.

  • Reports should contain information that is actionable. There is little value in receiving a report that you cannot use to decide if action is required.
  • Reports need to be concise. Very few people are interested in reading a 50 page report. Try to keep them to a few pages or at least summarized in the first couple of pages with supporting details in the rest of the report.
  • Don’t send an avalanche of reports. People usually don’t have the time or desire to read multiple reports per day. Ideally you should send reports only when there is something in the report that needs attention.
  • If you can, include the source of the report and a description of what the report is used for. I’ve received reports in the past where I didn’t know what system sent them and why I needed to see them.
  • Know your audience. Make sure the business gets business related information and IT gets technology related information.
  • Don’t blast reports to email groups (usually). Most reports need to be seen by only a few people in your organization. Email distribution groups tend to contain way more people than are truly interested in your report. No need to clog up the email system with a 50 page PDF sent to 2000 addresses.

Reports are important and can help your organization avoid issues down the road but you need to implement them carefully or they become just another piece of internal spam. The best report that I can ever receive is the one that only shows up when there is a problem brewing within the next few weeks, clearly identifies the problem, and I am the right person to make sure it gets addressed. That would be reporting done right!
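As a rough sketch of that "only shows up when there is a problem brewing" idea, the snippet below fits a straight line through weekly usage samples and decides whether a capacity report is worth sending. Everything here is an assumption for illustration: the function names, the four-week horizon, and the use of a simple linear trend (real capacity planning usually needs something smarter than a straight line).

```python
def weeks_until_full(weekly_usage_pct, capacity_pct=100.0):
    """Estimate how many weeks remain until usage reaches capacity.

    Fits a least-squares line through the weekly samples and
    extrapolates it. Returns None when there are too few samples or
    when usage is flat or shrinking. Hypothetical helper.
    """
    n = len(weekly_usage_pct)
    if n < 2:
        return None
    xs = range(n)
    x_mean = (n - 1) / 2
    y_mean = sum(weekly_usage_pct) / n
    slope_num = sum((x - x_mean) * (y - y_mean)
                    for x, y in zip(xs, weekly_usage_pct))
    slope_den = sum((x - x_mean) ** 2 for x in xs)
    slope = slope_num / slope_den
    if slope <= 0:
        return None
    intercept = y_mean - slope * x_mean
    # Week index where the fitted line crosses capacity, measured
    # from the most recent sample (index n - 1).
    weeks = (capacity_pct - intercept) / slope - (n - 1)
    return max(weeks, 0.0)


def should_send_report(weekly_usage_pct, horizon_weeks=4):
    """Send the report only if capacity runs out within the horizon."""
    remaining = weeks_until_full(weekly_usage_pct)
    return remaining is not None and remaining <= horizon_weeks


# Hypothetical disk usage (percent) sampled once a week.
print(should_send_report([70, 75, 80, 85, 90]))  # growing fast
print(should_send_report([40, 41, 42, 43]))      # growing slowly
```

The design choice that matters is the gate itself: the report is generated on a schedule, but it only lands in someone's inbox when the projection crosses the horizon.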

Dashboards and reports need to be implemented properly to get the most out of your monitoring investments. You can amplify the value of all of your monitoring investments by combining, analyzing, and displaying the data contained within each disparate repository. The most mature organizations will build out their monitoring ecosystem and then invest in analytics to derive business and technology value and to create the best dashboards and reports possible.

Thanks for taking the time to read this series. The final post will be next week and will focus on how to keep up the momentum and stay relevant when it comes to monitoring within your organization.

Click here to read Deploying APM in the Enterprise Part 8: Stay Thirsty (and Relevant) My Friends

Spot the Difference in AppDynamics 3.4

Okay, admit it. The title Spot the Difference in AppDynamics 3.4 caught your attention, especially after we just recently announced our free End User Monitoring features, and now you’re here to find out what other insanely cool stuff we’ve been working on to enhance the experience for APM aficionados, customers and people like you! I’m proud to say that AppDynamics continues to innovate by leaps and bounds which enables our customers to be more successful in how they manage application performance and availability.

Here’s an example of us staying ahead of the curve. I was scouring Twitter feeds several weeks back and found this tweet from a Java applications guy,


Hmmmm, a proverbial question indeed.

So what exactly changes with an application release?

Flashback to 2008…I ran into a painfully similar situation back at LG. We had a team of consultants working on Java performance optimizations for eight months, executing test cases, refactoring application code, disabling trigger objects and even redesigning some of the use case workflows. Well, it turns out the build and deployment team pushed the entire application onto a completely different set of infrastructure, and when the new test results came back we almost choked when we saw 20-30% regressions in application performance. That night the performance team drank itself into oblivion, and you can imagine what happened next.

For the next two weeks everyone on the project was now heads down trying to identify what changes caused such a massive slowdown to the system. Remember, we didn’t have an APM solution like AppDynamics, and had to resort to using four different performance monitoring tools, since we didn’t have a single solution that provided us with complete end-to-end visibility. Those were tumultuous times when Ops, Dev and IT teams engaged in heated arguments about who was at fault.

All of this could have been easily avoided if we had something like AppDynamics 3.4 Agile Release Analysis to compare application releases visually. Having this type of comparative analysis capability comes in handy when you’re trying to understand why a user login transaction, for instance, is slower with your latest agile release than what it was a week ago. You could always conduct code diffs on your releases, but sometimes it’s incredibly valuable to be able to visualize performance infrastructure changes at a glance rather than having to spend hours or even days manually verifying what differences actually occurred.

Agile Release Analysis

AppDynamics release analysis rolls up application performance metrics so you can compare KPIs of not only the application, but also individual business transactions, as well as their flow and execution. The application might have gotten slower, but being able to identify which piece of the overall application or specific transaction is affecting performance is much more useful to operations and IT.
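To illustrate the general idea of comparing releases per business transaction (this is not the product's implementation or API, just a hand-rolled sketch), the snippet below takes average response times from two releases and lists the transactions that got noticeably slower. The function name, the 10% tolerance, and the sample numbers are all hypothetical.

```python
def find_regressions(before, after, tolerance_pct=10.0):
    """Compare per-transaction response times between two releases.

    `before` and `after` map a business transaction name to its
    average response time in milliseconds. Returns a list of
    (name, percent_slower) tuples, worst regression first.
    """
    regressions = []
    for name, old_ms in before.items():
        new_ms = after.get(name)
        if new_ms is None or old_ms <= 0:
            # Transaction missing from the new release, or no usable baseline.
            continue
        change_pct = (new_ms - old_ms) / old_ms * 100.0
        if change_pct > tolerance_pct:
            regressions.append((name, round(change_pct, 1)))
    return sorted(regressions, key=lambda item: item[1], reverse=True)


# Hypothetical average response times (ms) for two releases.
release_a = {"login": 200.0, "search": 350.0, "checkout": 800.0}
release_b = {"login": 260.0, "search": 355.0, "checkout": 1000.0}

for name, pct in find_regressions(release_a, release_b):
    print(f"{name}: {pct}% slower")
```

Even a crude comparison like this narrows "the application got slower" down to specific transactions, which is where the troubleshooting should start.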

Speaking of comparing Application Flow Maps, in order for this feature to be truly useful, we needed to address the visibility challenges customers were facing with our original flow map. We had the right idea, but it was too rigid. Monitoring several thousand tiers became blinding for some customers, leaving App Ops with a cluttered forest, and no trees. So to stay true to our company mission of offering customers maximum visibility with minimal effort, we improved the viewing experience so you can now navigate your entire application just like you’d navigate the entire world with Google Maps. The next time you’re notified there’s an issue with a transaction that’s traversing through a particular node in a cluster, you can feel confident it’ll be easy to spot and zoom in on. The application flow maps now change color in real-time depending on the SLA and performance of application tiers and the flows that connect them. For example, here is a screenshot of the new flow map with baseline comparison enabled:

Zoom in and out of your App Flow Map

 

Introducing Role-Based Perspectives

Let’s assume the role of a database administrator for a moment. Sure, they might look at Java or .NET code once in a blue moon, but their technical domain is managing the backend of the application. So in 3.4 we decided to streamline the troubleshooting process from a role-based perspective to help our respective backend admins “get to the point” and view information pertinent to their role.

“Show me the top SQL calls for my database.” Done.

“Okay, now which business transactions do my SQL queries impact?” You can think of it as a “Where Used” lookup that allows the DBA to analyze what they’re interested in from a role-based perspective and see how queries are contributing to business transaction performance. The Backend Specialists – DBA, Message Broker, ESB and Security Administrators – can now say with conviction, “I optimized my backend service that was impacting the user experience of several mission critical business transactions.” That’s what I call getting bang for the buck performance optimizations!
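Conceptually, a “Where Used” lookup is just an inverted index: instead of asking which queries a transaction runs, you ask which transactions run a given query. The sketch below shows that inversion with plain dictionaries. It is an illustration of the concept only, not how AppDynamics stores or exposes this data, and the transaction and query names are made up.

```python
from collections import defaultdict


def build_where_used(transaction_queries):
    """Invert a transaction -> queries mapping into query -> transactions.

    `transaction_queries` maps a business transaction name to the list
    of SQL statements it executes. Hypothetical data shape.
    """
    where_used = defaultdict(set)
    for transaction, queries in transaction_queries.items():
        for query in queries:
            where_used[query].add(transaction)
    return where_used


# Hypothetical business transactions and the SQL they execute.
transaction_queries = {
    "login": ["SELECT * FROM users WHERE id = ?"],
    "checkout": [
        "SELECT * FROM users WHERE id = ?",
        "INSERT INTO orders VALUES (?, ?)",
    ],
}

where_used = build_where_used(transaction_queries)
print(sorted(where_used["SELECT * FROM users WHERE id = ?"]))
```

With the index in hand, a DBA who tunes one slow query can immediately see every business transaction that stands to benefit.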

Reporting on Application AND Business Activity.

AppDynamics 3.4 also includes a new PDF reporting engine so you can generate, save and share reports detailing application health metrics. We’ve introduced several prepackaged reports as well as the ability to create your own custom report. All of the aforementioned cool 3.4 features offer some amazing technical benefits that speak to those in DevOps, but at the end of the day, someone is going to ask how this is impacting your business’s bottom line. So in 3.4, we now allow you to build a dashboard that monitors business activity for your most revenue-critical apps. Our technology is already pretty slick in that it auto-discovers and baselines your application’s performance without the need for manual instrumentation. However, there may be users or scenarios where you want to monitor the activity and revenue of your application, rather than, say, its performance or throughput. No problem – you can exploit “Information Points” in AppDynamics to track any business metric or value which is part of a business transaction. Our intuitive dashboard builder lets you expose any metric with drag and drop widgets, meaning you can create powerful business views in minutes.

There are a number of other powerful features and product enhancements in AppDynamics 3.4 you can start exploring today by registering here for a 30-day free trial. If you’re already an AppDynamics customer, we look forward to hearing more feedback on how we can improve the experience even further and want to say thanks again! We hope you’re as delighted as we are about our latest release.