Monitoring Asynchronous Business Transactions

This article originally appeared on LinkedIn

Essential to Successful Ecommerce

Back in the day, everybody expected synchronous behavior from business applications and web sites. Click a link and wait for a response – while in the meantime, the technology is doing whatever it can to respond to your request, blocking other activities until it’s complete.

Today, in contrast, asynchronous behavior is the norm. Not only does it improve performance compared to synchronous interactions, since many activities can proceed in parallel, but such systems are also easier to use.

Modern technology users – in other words, just about everyone – are now accustomed to asynchronous behavior on the user interface (UI). I call this shift in user expectations the Facebook effect, as Facebook’s UI is fully asynchronous. When you first go to facebook.com, all that loads is an empty UI framework. From there on, Facebook pushes information to various parts of the screen – often with no clicking needed.

Where Facebook leads, the entire digital world follows. Asynchronous behavior now finds a central place in most end-to-end business transactions, both to support dynamic UIs and to streamline and accelerate less visible steps in the business process.

Furthermore, as organizations scale up their digital efforts, asynchrony becomes an essential tool for removing performance bottlenecks. It’s no wonder asynchronous interactions are the default interaction pattern in the cloud.

Monitoring Business Transactions that Contain Asynchronous Interactions

Ensuring the performance of any business transaction is important to both end users and the business, regardless of whether it includes any asynchronous activities.

When a transaction does include such an interaction, however, monitoring the transaction’s end-to-end performance becomes tricky, as poorly performing asynchronous activities can complete many seconds after the user believes the transaction has finished.

And of course, if the asynchronous step in the transaction fails completely, then the user may not even be aware there’s a problem – the worst possible scenario for the business.

To address these issues, AppDynamics has added the ability to monitor asynchronous elements of business transactions to its platform.

The admin begins this configuration by setting a transaction demarcator that signifies the end of the asynchronous interaction. For example, if a business transaction contains a method call that spawns threads and then assembles the response to the client once the threads complete, then the admin identifies that method as the logical endpoint in the transaction processing sequence.

In other situations, the admin can specify the tier where the end-to-end transaction processing will complete. As a result, the admin can determine the logical transaction response time for transactions that perform asynchronous back end calls, such as JMS or web service interactions. When the last of the threads that the business transaction spawned terminates on that tier, the AppDynamics agent considers the transaction complete.

Here’s an example. In the screenshot below, the solid lines represent synchronous interactions, while the dotted lines are asynchronous.

In the example above, two ecommerce service nodes interact synchronously with a database via JDBC and with inventory services via Web Service calls. In the meantime, the ecommerce nodes also push asynchronous requests onto three queues. Order processing services then pick up requests from two of the queues and send them to fulfillment after processing.

The order processing services must properly deal with messages in both the order and fulfillment queues, or fulfillment will slow down or possibly never take place at all. By properly monitoring the order processing services, AppDynamics can identify such problems before they adversely impact the business or the customer.

The Intellyx Take

The ecommerce scenario in the example above is perhaps the most common use case for AppDynamics’ asynchronous monitoring, but it is just one of many scenarios where this capability would be useful.

Business processes in general are long-running, and therefore, any technology that automates such processes will be asynchronous. In addition to commerce and other supply chain-related processes involving ordering and fulfillment, any type of case management process will likely benefit from such monitoring.

For example, any software that supports call centers or other support personnel will likely have asynchronous interactions, whether the software offers support ticketing, field dispatch, or any number of other enterprise scenarios.

In today’s digital world, the typical business transaction centers on a purchase – but there are many other types of business transactions that are equally important to the organization. With the addition of its asynchronous monitoring capability, AppDynamics is able to monitor any such transaction.

Intellyx advises companies on their digital transformation initiatives and helps vendors communicate their agility stories. AppDynamics is an Intellyx client. Intellyx retains full editorial control over the content of this article.

A non-technical guy’s take on Business Transaction Monitoring

I began my journey in the Application Performance Management (APM) space a little over two years ago. Transitioning from a security background, the biggest thing I was concerned about was picking up the technology quickly. Somewhere between JVMs, CLRs, JMX, IIS, and APIs I was a little overwhelmed.

The thing that caught my eye most in the early stages of learning about the APM space (outside of AppDynamics’ growth & the growing IT Ops market) was the term Business Transaction Monitoring.

Gartner defines one of its key criteria as “User Defined Transaction Profiling – The tracing of user-grouped events, which comprise a transaction as they occur within the application as they interact with components discovered in the second dimension; this is generated in response to a user’s request to the application.”

I know what you’re thinking… “That is what caught your eye about APM?!” Actually, yes! Despite Gartner’s Merriam-Webster-esque response, I was able to understand the intended definition of what the term is attempting to communicate: end-to-end visibility is critical. Or, in other words, pockets of visibility (silos) are bad. Unfortunately, so many organizations have silos that keep them in the dark about understanding and managing their customer’s digital experience. That’s often because as companies grow, they become plagued by structure: individual toolsets purposed with proving innocence rather than achieving a resolution, and a lack of decentralization.

In talking with hundreds of customers over the past two years, it’s amazing how many mature application teams still struggle with end-to-end visibility. The market only makes the problem worse since every solution provider out there claims to provide true end-to-end visibility. So, when you’re researching potential application monitoring & analytics solutions, make sure to ask the vendor what information they’re collecting & how they’re presenting that data to the end user. For example, if you look under the hood of a potential app monitoring solution and you find out they take a “machine-centric” approach by displaying JMX metrics & CPU usage & render that data to you in a fat client UI, then you’re probably looking at the wrong solution. Compare that against a purpose-built APM solution that counts, measures, and scores every single user’s interaction with your mobile / web app and presents that data in the vehicle of a business transaction. It’ll be like finally getting your eyes checked and getting prescription glasses!

Alright, let’s get to the core message of this post. A solid APM strategy must be based on business transactions. Why? The end user’s experience (BT) is the only constant unit of measurement in today’s applications. Architectures, languages, cloud platforms, and frameworks all come and go, but the user experience stays the same. With such dynamic changes occurring in the app landscape of today, it’s also important that your APM solution can dynamically instrument in those environments. If you have to tell your monitoring solution what to monitor anytime there’s a new code release, then you’re wasting time. The reason so many companies have chosen AppDynamics is because AppDynamics has architected every feature of its platform around the concept of a BT. AppDynamics provides Unified Monitoring through business transactions. One can pivot & view data across all tiers of an app including code, end user browser experience, machine data, and even database queries in just a few clicks.

I wish I had a helpful analogy for a Business Transaction, but I’m going to have to settle for encouraging you to view a demo of our solution. Or, download our free trial, install the app agent, generate load, and watch how we automatically categorize similar requests into BTs. If you have any questions, then feel free to reach out to me or anyone else in our sales org.

The title of this blog was a non-technical guy’s take on BTs, so I’ll simplify my ranting with the following conclusion: Your business is likely being disrupted and now defined by software. If so, having a tool like AppDynamics is key to being able to manage your app-first approach. Your APM solution (hopefully you’re already using AppDynamics) must have a proven approach to handling complex apps by focusing on unit groupings that provide end-to-end visibility, called “Business Transactions” or BTs. A proper approach to monitoring with BTs is critical because it marries the 10,000-foot view of the end user’s digital interaction with the business (what the business wants) and the 1-foot view of class/method visibility (what your app teams need). A successful BT monitoring strategy will enable your business to effectively monitor, manage, and scale your critical apps, and provide rich context for your business to make intelligent, data-driven decisions.

3 Reasons Financial Services Companies Need APM Right Now

Financial Services companies operate in a difficult environment. Many of their applications are absolutely vital to the proper workings of the global economy. Theirs is one of the most heavily regulated industries in the world, and they are a constant target of hackers. Their systems need to be available, performant, and secure while generating the revenue sought by Wall Street and their shareholders.

In this article we’ll take a look at 3 factors that impact revenue and how APM is used to mitigate risk.

1. Customers Hate to Wait

Losing a customer is a bad thing no matter the circumstances. Losing a customer due to poor application performance and stability is preventable and should never happen! If you lose customers, you either need to win them back or attract new customers.

Fred Reichheld of Bain & Company reports that:

  • Over a five-year period businesses may lose as many as 1/2 of their customers
  • Acquiring a new customer can cost 6 to 7 times more than retaining an existing customer
  • Businesses that boosted customer retention rates by as little as 5% saw increases in their profits ranging from 5% to a whopping 95%

Based on this research, you should do everything in your power to retain your existing customers. Providing a great end user experience and level of service is what every customer expects.

APM software tracks every transaction flowing through an application, recognizes when there are problems and can even automatically fix the issue.

FS Transaction View

Understanding the components involved in servicing each transaction is the first step in proper troubleshooting and problem resolution.

2. Customer Loyalty = Revenue Growth

On the flipside, better performance means more customer loyalty and, as a result, revenue gains. The Net Promoter methodology, developed by Satmetrix in cooperation with Bain & Company and Fred Reichheld, is a standard way to measure customer satisfaction: an indexed measure of customer loyalty. In their Net Promoter whitepaper, Satmetrix found a direct correlation between high customer loyalty scores and the rate of revenue growth: the higher the customer loyalty score a company achieved, the higher its rate of revenue growth over a five-year period.

With applications playing a dominant role as the most common interaction between company and customer, it is imperative that customers have a great experience every time they use your application. Slow transactions, errors, and unavailable platforms leave customers dissatisfied and will reduce your loyalty score. Over time, this will have a significant impact on revenue.

So if we accept the premise that performance should be top-of-mind for anyone with a critical banking or FS application, what do we do next? How do we improve our application management strategy to prevent loss of revenue and improve customer loyalty? The answer: by taking a transaction-based approach to application performance management.

APM software tracks the performance of all transactions, dynamically baselines their normal performance, and alerts when transactions deviate from their normal behavior. In this manner you’re able to identify performance problems as they begin instead of waiting until customers are frustrated and abandoning your applications.

FS Business Transaction List

List of business transactions classified by their level of deviation from normal performance.

3. Transactions = Money

Transactions are the lifeblood of banking. From making an online payment or converting currency to buying or selling stock, just about everything a bank does involves transactions. Furthermore, a significant portion of banks’ revenue comes from transaction fees for activities ranging from ATM withdrawals to currency conversion and credit card usage. For these fee-based transactions, the faster you can ring the cash register (response time of business transactions), the more money you will make and the better likelihood that your customer will come back to you for their next transaction.

With this in mind, it is imperative that IT organizations take a user-centric, or rather, transaction-centric approach to managing application performance.

APM software provides context that enables you to understand the business impact of failed and/or slow transactions. The data gathered by APM software can be used to focus on improving functionality that is used most often or responsible for the most revenue.

FS Prioritization

Having various data perspectives allows application support and development teams to prioritize what needs to be updated in the next release.

If you aren’t using an APM tool yet, or if your APM tool isn’t providing the value that it should be, then you need to take a free trial of AppDynamics and see what you and your customers have been missing.

IT holds more business influence than they realise

A ‘well oiled’ organization is one where IT and the rest of the business are working together and on the same page. In order to achieve this there needs to be good communication, and for good communication there needs to be a common language.

In most organizations, while IT are striving to achieve their goal of 99.999% availability, the rest of the business is looking to drive additional revenue, increase user satisfaction, and reduce customer churn.

Ultimately everyone should be working towards a common goal: SUCCESS. Unfortunately, different teams define success in different ways, and this lack of alignment often results in mistrust between IT departments and the rest of the business.

Measuring success

Let’s look at how various teams within a typical organization define success today:

Operations:
IT ops teams are responsible for reducing risk, ensuring the application is available and that the ‘lights are green’. The number ‘99.9’ can be either IT Ops’ best friend or its worst enemy. Availability targets such as these are often the only measure of ops success or failure, meaning many of the other things you are doing often go unnoticed.

Availability targets don’t show business insight, or the positive impact you’re having on the business. For instance, how much did performance improve after you implemented that change last week? Has the average order size increased? How many additional orders can the application process since re-platforming?

Development:
Dev teams are focussed on change. The Business demands they release more frequently, with more features, fewer defects, fewer resources, and often less sleep! Dev teams are often targeted according to the number of updates and changes they can release. But nobody is measuring the success of these changes. Can anyone in your dev team demonstrate the impact of your last code release? Did revenues climb? Were users more satisfied? Were more orders placed?

‘The Business’:
The business is focussed on targets; last month’s achievements and end of year goals. This means they concentrate on the past and the future, but have little or no idea what impact IT is having on the business in the present. Consulting a data warehouse to gather ‘Business Intelligence’ at the end of the month does not allow you to keep your finger on the pulse of the business.

With everyone focussing on different targets there is no real alignment to the overall business goals between different parts of an organization. One reason for this disconnect is due to the lack of meaningful shared metrics. More specifically, it’s access to these metrics in real-time that is the missing link.

If I asked how much revenue has passed through your application since reading this blogpost, or what impact your last code release had on customer adoption, how quickly could you find the answers? How quickly could anyone in your organization find the answers?

What if answers to these questions only took seconds?

Monitoring the Business in Real-time

In a previous post, I introduced AppDynamics Real-time Business Metrics which enables you to easily collect, auto-baseline, alert, and report on the Business data that is flowing through your applications… as it’s really happening.

This post demonstrates how to configure AppDynamics to extract all checkout revenue values from every business transaction and make this available as a new metric “Checkout Revenue” which can be reported in real-time just like any other AppDynamics metric.

With IT Ops, Dev, and Business Owners all supporting business-critical applications that are responsible for generating revenue, checkout revenue is a great example of a business metric that every team could use to measure success.

Let’s look at a few examples of how this could change the way you do business, if everyone was jointly focussed on the same business metric.

Outage cost
The example below shows the revenue per minute vs. the response time per minute of an application. This application has obviously suffered an outage that lasted approximately 50 minutes, and the impact it has had on the business in lost revenue is clear. The short spike in revenue seen after the outage indicates users who returned to complete their transactions, but this is not enough to recover the lost revenue for the period.

RtBM - outage

Impact of agile releases
This example shows the result of a performance improvement program that has taken place. The overall response time has improved by over a second across three code releases and you can clearly see the additional revenue that has been generated as a result of the code releases.

RtBM - agile releases

Here a 1 second improvement in response time has increased the revenue being generated by the online booking system by more than 30%. The value a development team is delivering back to the business is clearly visible with each new release, allowing developers to focus on the areas that drive the most return and quantify the value they are delivering.

Marketing campaign
This example is a little more complex. At midday there is a massive increase in the number of people visiting this eCommerce website due to an expensive TV advertising campaign. The increased load on the system has resulted in a small increase in the overall response time but nothing too significant. However, despite the increased traffic to the site, the revenue has not improved. If we take a look at the Number of Checkouts, which is a second Business Metric that has been configured, it’s clear the advertising campaign has driven additional users to the site, but these users have not generated additional revenue.

RtBM - marketing

Common metrics for common success

With traditional methods of measuring success in different ways, it’s impossible to align towards a common goal. This creates siloed working environments that make it impossible for teams to collaborate and prioritise.

By enabling all parts of the business to focus on the business metrics that really matter, organizations benefit from being able to proactively prioritise and resolve issues when they occur. It helps IT truly align with the priorities and needs of the business, allowing them to speak the same language and manage the bottom line. For example, after implementing AppDynamics Real-time Business Metrics, Doug Strick, who is the Internet Application Admin at Garmin, said the following:

“We can now understand how the application is growing over time. This data will prove invaluable in guiding future decisions in IT.”
-Doug Strick, Internet Application Admin

AppDynamics Real-time Business Metrics enables you to identify business challenges and react to them immediately, instead of waiting hours, days, or even weeks for answers, by correlating performance, user experience, and business metrics in real-time and in one place.

If you want to capture business performance and measure your success against it in real-time, you can get started today with Real-time Business Metrics by signing up and taking a free trial of AppDynamics Pro here.

PHP Performance Crash Course, Part 2: The Deep Dive

In the first post of this series I covered some basic tips for optimizing performance in PHP applications. In this post we are going to dive a bit deeper into the principles and practical tips for scaling PHP.

Top engineering organizations think of performance not as a nice-to-have, but as a crucial feature of their product. Those organizations understand that performance has a direct impact on the success of their business.

Ultimately, scalability is about the entire architecture, not minor code optimizations. Oftentimes people get this wrong and naively focus on the edge cases. Solid architectural decisions like doing blocking work in the background via tasks, proactively caching expensive calls, and using a reverse proxy cache will get you much further than arguing about single quotes versus double quotes.

Just to recap the core principles for performant PHP applications from Part 1:

  • Update to the latest stable version of PHP
  • Use an opcode cache
  • Use autoloading
  • Optimize your sessions
  • Use a distributed data cache
  • Do blocking work in the background
  • Leverage HTTP caching

The first few tips don’t really require elaboration, so I will focus on what matters.

Optimize your sessions

In PHP it is very easy to move your session store to Memcached:

1) Install the Memcached extension with PECL

pecl install memcached

2) Customize your php.ini configuration to change the session handler


session.save_handler = memcached
session.save_path = "localhost:11211"

If you want to support a pool of memcache instances you can separate with a comma:

session.save_handler = memcached
session.save_path = "10.0.0.10:11211,10.0.0.11:11211,10.0.0.12:11211"

The Memcached extension has a variety of configuration options available; see the full list on GitHub. The ideal configuration I have found when using a pool of servers:

session.save_handler = memcached
session.save_path = "10.0.0.10:11211,10.0.0.11:11211,10.0.0.12:11211"

memcached.sess_prefix = "session."
memcached.sess_consistent_hash = On
memcached.sess_remove_failed = 1
memcached.sess_number_of_replicas = 2
memcached.sess_binary = On
memcached.sess_randomize_replica_read = On
memcached.sess_locking = On
memcached.sess_connect_timeout = 200
memcached.serializer = "igbinary"

That’s it! Consult the documentation for a complete explanation of these configuration directives.
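Once the handler is in place, sessions behave exactly as before from the application’s point of view. Here’s a quick sanity check (a minimal sketch, assuming the localhost setup from the first example above):

// Confirm the memcached session handler is active
echo ini_get('session.save_handler'); // prints "memcached"

session_start();
$_SESSION['user_id'] = 42; // now stored in Memcached, not on local disk
session_write_close();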

Leverage caching

Any data that is expensive to generate or query and long lived should be cached in-memory if possible. Common examples of highly cacheable data include web service responses, database result sets, and configuration data.

Using the Symfony2 HttpFoundation component for built-in http caching support

I won’t attempt to explain http caching. Just go read the awesome post from Ryan Tomayko, Things Caches Do, or the more in-depth guide to http caching from Mark Nottingham. Both are stellar posts that every professional developer should read.

With the Symfony2 HttpFoundation component it is easy to add support for caching to your http responses. The component is completely standalone and can be dropped into any existing php application to provide an object oriented abstraction around the http specification. The goal is to help you manage requests, responses, and sessions. Add “symfony/http-foundation” to your Composer file and you are ready to get started.

Expires based http caching flow

use Symfony\Component\HttpFoundation\Response;

$response = new Response('Hello World!', 200, array('content-type' => 'text/html'));

$response->setCache(array(
    'etag' => 'a_unique_id_for_this_resource',
    'last_modified' => new DateTime(),
    'max_age' => 600,
    's_maxage' => 600,
    'private' => false,
    'public' => true,
));

If you use both the request and response from the http foundation you can check your conditional validators from the request easily:


use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\Response;

$request = Request::createFromGlobals();

$response = new Response('Hello World!', 200, array('content-type' => 'text/html'));

if ($response->isNotModified($request)) {
    // Sends a 304 Not Modified response with an empty body
    $response->send();
}

Find more examples and complete documentation from the very detailed Symfony documentation.

Caching result sets with Doctrine ORM

If you aren’t using an ORM or some form of database abstraction you should consider it. Doctrine is the most fully featured database abstraction layer and object-relational mapper available for PHP. Of course, adding abstractions comes at the cost of performance, but I find Doctrine to be extremely fast and efficient if used properly. If you leverage the Doctrine ORM you can easily enable caching result sets in Memcached:


$memcache = new Memcache();
$memcache->connect('localhost', 11211);

$memcacheDriver = new \Doctrine\Common\Cache\MemcacheCache();
$memcacheDriver->setMemcache($memcache);

$config = new \Doctrine\ORM\Configuration();
$config->setQueryCacheImpl($memcacheDriver);
$config->setMetadataCacheImpl($memcacheDriver);
$config->setResultCacheImpl($memcacheDriver);

$entityManager = \Doctrine\ORM\EntityManager::create(array('driver' => 'pdo_sqlite', 'path' => __DIR__ . '/db.sqlite'), $config);

// Cache this result set in Memcached for 60 seconds
$query = $entityManager->createQuery('select u from Entities\User u');
$query->useResultCache(true, 60);

$users = $query->getResult();

Find more examples and complete documentation from the very detailed Doctrine documentation.

Caching web service responses with Guzzle HTTP client

Interacting with web services is very common in modern web applications. Guzzle is the most fully featured http client available for PHP. Guzzle takes the pain out of sending HTTP requests and the redundancy out of creating web service clients. It’s a framework that includes the tools needed to create a robust web service client. Add “guzzle/guzzle” to your Composer file and you are ready to get started.

Not only does Guzzle support a variety of authentication methods (OAuth 1 and 2, HTTP Basic, etc.), it also supports best practices like retries with exponential backoff as well as http caching.


$memcache = new Memcache();
$memcache->connect('localhost', 11211);

$memcacheDriver = new \Doctrine\Common\Cache\MemcacheCache();
$memcacheDriver->setMemcache($memcache);

$client = new \Guzzle\Http\Client('http://www.test.com/');

$cachePlugin = new \Guzzle\Plugin\Cache\CachePlugin(array(
    'storage' => new \Guzzle\Plugin\Cache\DefaultCacheStorage(
        new \Guzzle\Cache\DoctrineCacheAdapter($memcacheDriver)
    )
));
$client->addSubscriber($cachePlugin);

$response = $client->get('http://www.wikipedia.org/')->send();

// response will come from cache if server sends 304 not-modified
$response = $client->get('http://www.wikipedia.org/')->send();

Following these tips will allow you to easily cache all your database queries, web service requests, and http responses.

Moving work to the background with Resque and Redis

Any process that is slow and not important for the immediate http response should be queued and processed via non-blocking background tasks. Common examples are sending social notifications (like Facebook, Twitter, LinkedIn), sending emails, and processing analytics. There are a lot of systems available for managing messaging layers or task queues, but I find Resque for PHP dead simple. I won’t provide an in-depth guide as Wan Qi Chen has already published an excellent blog post series about getting started with Resque. Add “chrisboulton/php-resque” to your Composer file and you are ready to get started. A very simple introduction to adding Resque to your application:

1) Define a Redis backend

Resque::setBackend('localhost:6379');

2) Define a background task

class MyTask
{
    public function perform()
    {
        // Work work work
        echo $this->args['name'];
    }
}

3) Add a task to the queue

Resque::enqueue('default', 'MyTask', array('name' => 'AppD'));

4) Run a command line task to process the tasks with five workers from the queue in the background

$ QUEUE=* COUNT=5 bin/resque

For more information read the official documentation or see the very complete tutorial from Wan Qi Chen.

Monitor production performance

AppDynamics is application performance management software designed to help dev and ops troubleshoot performance problems in complex production applications. The application flow map allows you to easily monitor calls to databases, caches, queues, and web services, with code-level detail into performance problems:

Symfony2 Application Flow Map

Take five minutes to get complete visibility into the performance of your production applications with AppDynamics Pro today.

If you prefer slide format, these posts were inspired by a recent tech talk I presented.

As always, please feel free to comment if you think I have missed something or if you have a request for content in an upcoming post.

Fat or Fit – You Choose

There was a time when I thought more was better. More french fries are better; more ice cream is better; more of everything is always better. I followed this principle as a way of life for many years until one day I woke up and realized that I was obese and slow. I realized that more is not always better, and that moderation and balance were the key to a happy and healthy life. The same exact rule applies to IT monitoring, and in this blog I’ll explain how to identify high-fat, high-calorie, low-nutritional-content monitoring solutions.


Fat, Dumb, and Happy

The worst offenders in the battle against data center obesity are those companies that claim to be “always-on”. Always gathering mountains of data when there are no problems to solve is equivalent to entering a hotdog eating contest every day of the year. Why would you do that to yourself? Your IT bloating will reach epic proportions in no time and your CIO/CTO will eventually start asking why you are spending so much money on all of that storage space and all of those monitoring servers.

Let’s use an example to explore this scenario. A user of your application logs in, searches for some new running shoes, adds their favorite ones to their cart, checks out and happily disappears into the ether awaiting their shoe delivery so they can get ready for their local charity fun run. This same pattern is repeated for many users of your application.

Scenario 1: “Always on” bloated consumerism – Your monitoring software:

  • Tracked the response time of each function the user performed (small amount of data)
  • Tracked the execution details of many of the method calls involved in each and every function (lots and lots of data)
  • Sent all of this across the network to be compiled and stored
  • This happens for every single function that every single user executes, regardless of whether there is a problem or not.

Smart and Fit

Scenario 2: The intelligent fitness pro – Your monitoring software:

  • Tracked the response time of each function the user performed (small amount of data)
  • Periodically tracked the execution details of all the method calls involved in each function so that you have a reference point for “good” transactions (small amount of data)
  • Tracked the execution details of all method calls for every bad (or slow) function (business transaction) so that you have the information you need to solve the problem (small – medium data)
  • The built-in analytics decide when slow business transactions are impacting your users and automatically collect all the appropriate details.

How often do you look at deep granular monitoring details when there are no application issues to resolve? I was an application monitoring expert at a major investment bank and I never looked at those details when there were no problems. AppDynamics is a new breed of monitoring tool that is based upon intelligent analytics to keep your data center fast and fit. I think John Martin from Edmunds.com said it best in his case study “AppDynamics intelligence just says, ‘Hey something interesting is going on, I’m going to collect more data for you’.”

Smart people choose smart tools. You owe it to yourself to take a free trial of AppDynamics today and make us prove our value to you.

PHP Performance Crash Course, Part 1: The Basics

We all know performance is important, but performance tuning is too often an afterthought. As a result, taking on a performance tuning project for a slow application can be pretty intimidating – where do you even begin? In this series I’ll tell you about the strategies and technologies that (in my experience) have been the most successful in improving PHP performance. To start off, however, we’ll talk about some of the easy wins in PHP performance tuning. These are the things you can do that’ll get you the most performance bang for your buck, and you should be sure you’ve checked off all of them before you take on any of the more complex stuff.

Why does performance matter?

The simple truth is that application performance has a direct impact on your bottom line.

Follow these simple best practices to start improving PHP performance:

Update PHP!

One of the easiest ways to improve performance and stability is to upgrade your version of PHP. PHP 5.3.x was released in 2009. If you haven’t migrated to PHP 5.4, now is the time! Not only do you benefit from bug fixes and new features, but you will also see faster response times immediately. See PHP.net to get started.

Once you’ve finished upgrading PHP, be sure to disable any unused extensions in production such as xdebug or xhprof.

Use an opcode cache

PHP is an interpreted language, which means that every time a PHP page is requested, the server will interpret the PHP file and compile it into something the machine can understand (opcode). Opcode caches preserve this generated code in a cache so that it only needs to be interpreted on the first request. If you aren’t using an opcode cache you’re missing out on a very easy performance gain. Pick your flavor: APC, Zend Optimizer, XCache, or eAccelerator. I highly recommend APC, written by the creator of PHP, Rasmus Lerdorf.
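As a rough sketch, enabling APC is just a few php.ini directives (the values below are illustrative starting points, not tuned recommendations for your codebase):

extension = apc.so
apc.enabled = 1
apc.shm_size = 128M   ; shared memory available for cached opcodes
apc.stat = 0          ; skip file modification checks in production (restart PHP on deploy)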

Use autoloading

Many developers writing object-oriented applications create one PHP source file per class definition. One of the biggest annoyances in writing PHP is having to write a long list of needed includes at the beginning of each script (one for each class). PHP re-evaluates these require/include expressions each time a file containing them is loaded into the runtime. Using an autoloader will enable you to remove all of your require/include statements and benefit from a performance improvement. You can even cache the class map of your autoloader in APC for a small performance improvement.
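For example, registering a minimal autoloader takes only a few lines (a sketch assuming one class per file under a hypothetical lib/ directory; in practice, Composer’s generated autoloader does this for you):

// Load classes on demand: Foo_Bar or Foo\Bar resolves to lib/Foo/Bar.php
spl_autoload_register(function ($class) {
    $path = __DIR__ . '/lib/' . str_replace(array('_', '\\'), '/', $class) . '.php';
    if (is_file($path)) {
        require $path;
    }
});

// No long include list needed; the class file is required on first use
$user = new Entities\User();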

Optimize your sessions

While HTTP is stateless, most real-life web applications require a way to manage user data. In PHP, application state is managed via sessions. The default configuration for PHP is to persist session data to disk. This is extremely slow and not scalable beyond a single server. A better solution is to store your session data in a database and front it with an LRU (Least Recently Used) cache using Memcached or Redis. If you are super smart you will realize you should limit your session data size (to fit the 4096-byte cookie limit) and store all session data in a signed or encrypted cookie.

Use a distributed data cache

Applications usually require data. Data is usually structured and organized in a database. Depending on the data set and how it is accessed it can be expensive to query. An easy solution is to cache the result of the first query in a data cache like Memcached or Redis. If the data changes, you invalidate the cache and make another SQL query to get the updated result set from the database.
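That cache-aside pattern looks something like this (a minimal sketch using the Memcached extension; the key name, TTL, and $db connection are illustrative assumptions, not part of any particular app):

$cache = new Memcached();
$cache->addServer('localhost', 11211);

$key = 'products:top10';
$products = $cache->get($key);

if ($products === false) {
    // Cache miss: run the expensive query, then cache the result for 5 minutes
    $stmt = $db->query('SELECT * FROM products ORDER BY sales DESC LIMIT 10'); // $db: your PDO connection
    $products = $stmt->fetchAll();
    $cache->set($key, $products, 300);
}

// When the underlying data changes, invalidate so the next read repopulates:
// $cache->delete($key);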

I highly recommend the Doctrine ORM for PHP which has built-in caching support for Memcached or Redis.

There are many use cases for a distributed data cache from caching web service responses and app configurations to entire rendered pages.

Do blocking work in the background

Oftentimes web applications have to run tasks that take a while to complete. In most cases there is no good reason to force the end-user to wait for the job to finish. The solution is to queue blocking work to run in background jobs. Background jobs are jobs that are executed outside the main flow of your program, and usually handled by a queue or message system. The benefits come in terms of both end-user experience and scaling, since long-running jobs are written to and processed from a queue. I am a big fan of Resque for PHP, a simple toolkit for running tasks from queues, and there are a variety of other tools that provide queuing or messaging systems that work well with PHP.

I highly recommend Wan Qi Chen’s excellent blog post series about getting started with background jobs and Resque for PHP.

User location update workflow with background jobs

Leverage HTTP caching

HTTP caching is one of the most misunderstood technologies on the Internet. Go read the HTTP caching specification. Don’t worry, I’ll wait. Seriously, go do it! They solved all of these caching design problems a few decades ago. It boils down to expiration or invalidation, and when used properly it can save your app servers a lot of load. Please read the excellent HTTP caching guide from Mark Nottingham. I highly recommend using Varnish as a reverse proxy cache to alleviate load on your app servers.
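At its simplest, expiration-based caching is just a matter of sending the right headers before any output (a minimal sketch; renderPage() is a hypothetical stand-in, and the HttpFoundation examples in Part 2 of this series give you a cleaner API for the same idea):

// Let shared caches (like Varnish) serve this response for 10 minutes
header('Cache-Control: public, max-age=600, s-maxage=600');
header('Expires: ' . gmdate('D, d M Y H:i:s', time() + 600) . ' GMT');

// renderPage() stands in for your expensive page generation,
// which now runs at most once per TTL per shared cache
echo renderPage();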

Optimize your favorite framework


Deep diving into the specifics of optimizing each framework is outside of the scope of this post, but these principles apply to every framework:

  • Stay up-to-date with the latest stable version of your favorite framework
  • Disable features you are not using (I18N, Security, etc)
  • Enable caching features for view and result set caching

Learn how to profile code for PHP performance

Xdebug is a PHP extension for powerful debugging. It supports stack and function traces, profiling information, memory allocation tracking, and script execution analysis, allowing developers to easily profile PHP code.
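Profiling is switched on through php.ini (a sketch for Xdebug 2.x; the trigger mode shown here avoids profiling every request, since the profiler is expensive, and the output directory is just an example):

xdebug.profiler_enable = 0
xdebug.profiler_enable_trigger = 1    ; profile only requests that pass XDEBUG_PROFILE
xdebug.profiler_output_dir = /tmp/profiles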

WebGrind is an Xdebug profiling web frontend in PHP5. It implements a subset of the features of kcachegrind and installs in seconds and works on all platforms. For quick-and-dirty optimizations it does the job. Here’s a screenshot showing the output from profiling:

Check out Chris Abernethy’s guide to profiling PHP with XDebug and Webgrind.

XHProf is a function-level hierarchical profiler for PHP with a reporting and UI layer. XHProf is capable of reporting function-level inclusive and exclusive wall times, memory usage, CPU times, and number of calls for each function. Additionally, it supports the ability to compare two runs (hierarchical DIFF reports) or aggregate results from multiple runs.
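Basic usage wraps the code path you want to measure (a minimal sketch; runApplication() is a hypothetical stand-in, and the bundled xhprof_lib UI is what turns the raw data into reports):

xhprof_enable(XHPROF_FLAGS_CPU | XHPROF_FLAGS_MEMORY);

runApplication(); // stand-in for the code being profiled

$profilerData = xhprof_disable();
// Persist $profilerData (e.g. with the bundled XHProfRuns_Default class)
// and open it in the XHProf web UI for inclusive/exclusive timings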

AppDynamics is application performance management software designed to help dev and ops troubleshoot problems in complex production apps.

Complete Visibility

Get started with AppDynamics today and get in-depth analysis of your application’s performance.

PHP application performance is only part of the battle

Now that you have optimized the server-side, you can spend time improving the client side! In modern web applications most of the end-user experience time is spent waiting on the client side to render. Google has dedicated many resources to helping developers improve client side performance.

See us live!

If you are interested in hearing more best practices for scaling PHP in the real world join my session at LonestarPHP in Dallas, Texas or International PHP Conference in Berlin, Germany.

Orbitz Deploys AppDynamics Across 2,700 Servers In Just 15 Days

Here at AppDynamics we’re very proud of how easy our solution is to deploy and maintain. We also tout the fact that in many cases there is no configuration required to gain the insight needed to solve complex application problems. All of this is absolutely true, but does it mean that AppDynamics doesn’t have enough functionality to support complex deployments that make little to no use of common frameworks? Absolutely not! But don’t take our word for it; see for yourself in this presentation by Orbitz (Geoff Kramer and Nick Jasieniecki) at one of our Chicago User Group meetings. In case you don’t know what Orbitz does, I think this quote from Geoff Kramer sums it up quite well: “Orbitz unlocks the joy of travel for our customers. You can’t do that if you are having site problems.”

If you don’t have 50 minutes to spare right now, I will summarize the video’s key points in this blog post.

Why Alerts Suck and Monitoring Solutions need to become Smarter

I have yet to meet anyone in Dev or Ops who likes alerts. I’ve also yet to meet anyone fast enough at acknowledging an alert to prevent an application from slowing down or crashing. In the real world alerts just don’t work; nobody has the time or patience anymore, alerts are truly evil, and no-one trusts them. The most effective alert today is an angry end user phone call, because Dev and Ops physically hear and feel the pain of someone suffering 🙂

Why? There is little or no intelligence in how a monitoring solution determines what is normal or abnormal for application performance. Today, monitoring solutions are only as good as the users that configure them, which is bad news because humans make mistakes, configuration takes time, and time is something many of us have little of.

It’s therefore no surprise to learn that behavioral learning and analytics are becoming key requirements for modern application performance monitoring (APM) solutions. In fact, Will Cappelli from Gartner recently published a report on IT Operational Analytics and pattern-based strategies in the data center. The report covered the role of Complex Event Processing (CEP), behavior learning engines (BLEs), and analytics as a means for monitoring solutions to deliver better intelligence and quality information to Dev and Ops. Rather than just collect, store, and report data, monitoring solutions must now learn and make sense of the data they collect, thus enabling them to become smarter and deliver better intelligence back to their users.

Change is constant for applications and infrastructure thanks to agile cycles, so monitoring solutions must also change so they can adapt and stay relevant. For example, if the performance of a business transaction in an application is 2.5 secs one week, and that drops to 200ms the week after because of a development fix, then 200ms should become the new performance baseline for that same transaction; otherwise the monitoring solution won’t learn or alert on any performance regression. If the end user experience of a business transaction goes from 2.5 secs to 200ms, then end user expectations change instantly, and users become used to an instant response. Monitoring solutions have to keep up with user expectations, otherwise IT will become blind to the one thing that impacts customer loyalty and experience the most.
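To make the idea concrete, here is a minimal sketch of a self-adjusting baseline (illustrative only, not how any particular product implements it): keep an exponentially weighted moving average of response times, and flag samples that deviate far from it.

// Illustrative rolling baseline: each new sample pulls the baseline toward
// current behavior, so a fix that drops response time from 2500ms to 200ms
// becomes the new "normal" after enough samples.
function updateBaseline($baseline, $responseTimeMs, $weight = 0.05)
{
    return (1 - $weight) * $baseline + $weight * $responseTimeMs;
}

function isAbnormal($baseline, $responseTimeMs, $tolerance = 2.0)
{
    return $responseTimeMs > $baseline * $tolerance;
}

$baseline = 2500; // last week's normal, in milliseconds
foreach (array(210, 195, 205, 198, 202) as $sample) {
    $baseline = updateBaseline($baseline, $sample);
}
// The baseline is now drifting toward ~200ms; a regression back to
// 2500ms would trip isAbnormal() and raise an alert.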

What is normal application performance?

Peter Drucker proclaimed: “If you can’t measure it, you can’t manage it.” Do you know what’s “normal” for your mission-critical application? Actually, wait a second: with Halloween having just finished up, maybe the following Young Frankenstein reference is more appropriate. Whenever I focus on the word “normal,” the first thing that pops into my head (pardon the pun) is that famous scene from Young Frankenstein:

DR. FREDERICK FRANKENSTEIN: Abby Normal?

IGOR: I’m almost sure that was the name.

DR. FREDERICK FRANKENSTEIN: [chuckles] Are you saying that I put an abnormal brain into a seven and a half foot long, fifty-four inch wide GORILLA?

[grabs Igor and starts throttling him]

DR. FREDERICK FRANKENSTEIN: Is that what you’re telling me?