
Managing Agility in a DevOps Ecosystem



I recently had the opportunity to participate in the SPS Commerce DevOps day event. With over 100 engineers gathered to discuss topics ranging from remote team management, automation frameworks, and managing Docker deployments to how dynamic monitoring can enable a more agile culture, it was one of the most interesting and collaborative days I’ve had the opportunity to be a part of.

Bridget Kromhout (@bridgetkromhout), a rock star Cloud Foundry Engineer at Pivotal, kicked off the day by discussing strategies for building globally distributed teams and embracing collaboration across time zones. Bridget is a great speaker, and the session was packed full of compelling anecdotes, but two specific takeaways resonated with me because they align closely with how I think about APM.

  1. Tools are only valuable if they are embraced and leveraged broadly across a team as part of ongoing processes.
  2. DevOps is an ecosystem of technologies; there is no one tool to rule them all. Value increases substantially when teams, and tools, act collaboratively.

Let’s take a look at how these fundamental tenets apply to APM, and why they are so critical to increasing agility within a DevOps ecosystem.

Deployment

Before we can realize value from APM, we need to start by addressing deployment. AppDynamics adopts an open model that encourages customers to integrate into their existing deployment strategy rather than redefine a new mechanism. With available Chef recipes, Puppet modules, and the PowerShell remote management module, it’s now a trivial exercise to deploy APM across small or extremely large server farms. At AppSphere 2015, I presented a demonstration of leveraging Puppet to deploy and bootstrap AppDynamics across a distributed application stack hosted in Amazon EC2 in just 30 seconds.

Dynamic Discovery

Dynamic discovery, and scoring, is what drives our customers’ ability to increase their release velocity while at the same time maintaining quality. By tying into our Jenkins pipelines, we can tell exactly when code is deployed, what release or build was pushed, and its impact on performance or availability. In one example, immediately after we pushed build number 313 to our Tomcat web server, AppDynamics detected and automatically alerted that a JDBC connection pool was exhausted.

Because we caught, and alerted on, the condition in real time, our team was able to quickly back out the release and restore service to end users.
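
The idea of tying an alert back to the build that triggered it can be illustrated with a minimal sketch. This is not the AppDynamics or Jenkins API; the `Deployment` and `Alert` types, the `suspect_build` helper, the 15-minute window, and the timestamps are all assumptions made up for illustration. It simply picks the most recent deployment that happened shortly before an alert fired.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional, Sequence


@dataclass(frozen=True)
class Deployment:
    build: int              # build number reported by the CI pipeline
    deployed_at: datetime   # when the build reached the server


@dataclass(frozen=True)
class Alert:
    message: str
    raised_at: datetime     # when the monitoring tool raised the alert


def suspect_build(
    deployments: Sequence[Deployment],
    alert: Alert,
    window: timedelta = timedelta(minutes=15),  # hypothetical correlation window
) -> Optional[Deployment]:
    """Return the latest deployment made within `window` before the alert, if any."""
    candidates = [
        d for d in deployments
        if timedelta(0) <= alert.raised_at - d.deployed_at <= window
    ]
    return max(candidates, key=lambda d: d.deployed_at, default=None)


# Hypothetical timeline: build 313 goes out, and an alert fires three minutes later.
deployments = [
    Deployment(build=312, deployed_at=datetime(2016, 1, 1, 9, 0)),
    Deployment(build=313, deployed_at=datetime(2016, 1, 1, 14, 0)),
]
alert = Alert("JDBC connection pool exhausted", datetime(2016, 1, 1, 14, 3))

culprit = suspect_build(deployments, alert)
print(culprit.build if culprit else "no recent deployment")  # prints 313
```

A fixed time window is the simplest possible heuristic; a real pipeline integration would more likely attach the build number to the monitoring event directly.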

Alerting

It’s easy to alert.  What’s more valuable is when our tools enable processes that allow operations teams to be quickly notified of an issue, understand the exact context of the event, and immediately dive into a diagnostic.

One example is the AppDynamics integration with ServiceNow, which is freely available to all AppDynamics customers. Not only does AppDynamics create a ticket with the full contextual details of the problem, but it also provides an Incident URL so users can launch directly from ServiceNow back into the exact error condition in AppDynamics.

This is a prime example of how collaboration across tools directly improves collaboration across teams.

Remediation

Mean Time to Restore (MTTR) is a common operational metric, and often drives SLAs. A powerful capability within AppDynamics is Auto Remediation. Not only can AppDynamics generate intelligent alerts, but in many cases it can also automate the execution of operational playbooks.
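
For readers less familiar with the metric, MTTR is simply the average time between detecting an incident and restoring service. The sketch below shows the arithmetic; the `mean_time_to_restore` helper and the incident timestamps are hypothetical and are not taken from any AppDynamics report.

```python
from datetime import datetime, timedelta
from typing import Sequence, Tuple

# Each incident is a (detected_at, restored_at) pair.
Incident = Tuple[datetime, datetime]


def mean_time_to_restore(incidents: Sequence[Incident]) -> timedelta:
    """Average of (restored_at - detected_at) over all incidents."""
    if not incidents:
        raise ValueError("at least one incident is required")
    total = sum(
        (restored - detected for detected, restored in incidents),
        timedelta(0),  # start value so the sum stays a timedelta
    )
    return total / len(incidents)


# Hypothetical incidents: one restored in 10 minutes, one in 30 minutes.
incidents = [
    (datetime(2016, 1, 1, 14, 3), datetime(2016, 1, 1, 14, 13)),
    (datetime(2016, 1, 2, 9, 0), datetime(2016, 1, 2, 9, 30)),
]

print(mean_time_to_restore(incidents))  # prints 0:20:00
```

Automating a playbook shortens the restore side of each pair, which is why remediation has such a direct effect on this number.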

Understanding Business Impact

APM is rapidly evolving beyond performance management. With AppDynamics Analytics, we now have the ability to correlate and understand the direct impact of performance on our business health.

Real-time Analytics is a powerful capability to start understanding the effect of rapid deployments. Not only can we see the impact our changes have on performance, but we can also define our business performance indicators and evaluate, in real time, the quality of our release in terms of actual business value.
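
As a rough illustration of judging a release by a business indicator rather than by response time alone, the sketch below compares a conversion rate before and after a deployment. The choice of conversion rate as the indicator, the `release_regressed` helper, the 10% threshold, and the sample numbers are all assumptions for the example, not features of AppDynamics Analytics.

```python
from typing import Tuple

# A sample is (orders, sessions) observed over some time window.
Sample = Tuple[int, int]


def conversion_rate(orders: int, sessions: int) -> float:
    """Orders per session; defined as 0.0 when there were no sessions."""
    return orders / sessions if sessions else 0.0


def release_regressed(
    before: Sample,
    after: Sample,
    max_relative_drop: float = 0.10,  # hypothetical tolerance: a 10% relative drop
) -> bool:
    """True if the conversion rate fell by more than the tolerated fraction."""
    baseline = conversion_rate(*before)
    current = conversion_rate(*after)
    if baseline == 0.0:
        return False  # no baseline to regress from
    return (baseline - current) / baseline > max_relative_drop


# Hypothetical windows around a release: 5.0% conversion before, 3.0% after.
print(release_regressed(before=(50, 1000), after=(30, 1000)))  # prints True
# A small dip from 5.0% to 4.8% stays within tolerance.
print(release_regressed(before=(50, 1000), after=(48, 1000)))  # prints False
```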

Wrap-up

When planning an APM rollout, it’s important to consider not only the capabilities that can be delivered, but also how best to drive organizational adoption and collaboration. Collaboration should be architected across platforms, tools, and teams. By designing APM to exist cooperatively as part of the overall DevOps ecosystem, we will substantially increase the value our customers realize from APM.