PHP Performance Crash Course, Part 2: The Deep Dive

In my first post on this series I covered some basic tips for optimizing performance in php applications. In this post we are going to dive a bit deeper into the principles and practical tips in scaling PHP.

Top engineering organizations think of performance not as a nice-to-have, but as a crucial feature of their product. Those organizations understand that performance has a direct impact on the success of their business.

Ultimately, scalability is about the entire architecture, not some minor code optimizations. Often times people get this wrong and naively think they should focus on the edge cases. Solid architectural decisions like doing blocking work in the background via tasks, proactively caching expensive calls, and using a reverse proxy cache will get you much further than arguing about single quotes or double quotes.

Just to recap some core principles for performant PHP applications:

The first few tips don’t really require elaboration, so I will focus on what matters.

Optimize your sessions

In PHP it is very easy to move your session store to Memcached:

1) Install the Memcached extension with PECL

pecl install memcached

2) Customize your php.ini configuration to change the session handler


session.save_handler = memcached
session.save_path = "localhost:11211"

If you want to support a pool of memcache instances you can separate with a comma:

session.save_handler = memcached
session.save_path = "10.0.0.10:11211,10.0.0.11:11211,10.0.0.12:11211"

The Memcached extension has a variety of configuration options available, see the full list on Github. The ideal configuration I have found if using a pool of servers:

session.save_handler = memcached
session.save_path = "10.0.0.10:11211,10.0.0.11:11211,10.0.0.12:11211"

memcached.sess_prefix = “session.”
memcached.sess_consistent_hash = On
memcached.sess_remove_failed = 1
memcached.sess_number_of_replicas = 2
memcached.sess_binary = On
memcached.sess_randomize_replica_read = On
memcached.sess_locking = On
memcached.sess_connect_timeout = 200
memcached.serializer = “igbinary”

That’s it! Consult the documentation for a complete explanation of these configuration directives.

Leverage caching

Any data that is expensive to generate or query and long lived should be cached in-memory if possible. Common examples of highly cacheable data include web service responses, database result sets, and configuration data.

Using the Symfony2 HttpFoundation component for built-in http caching support

I won’t attempt to explain http caching. Just go read the awesome post from Ryan Tomako, Things Caches Do or the more in-depth guide to http caching from Mark Nottingham. Both are stellar posts that every professional developer should read.

With the Symfony2 HttpFoundation component it is easy to add support for caching to your http responses. The component is completely standalone and can be dropped into any existing php application to provide an object oriented abstraction around the http specification. The goal is to help you manage requests, responses, and sessions. Add “symfony/http-foundation” to your Composer file and you are ready to get started.

Expires based http caching flow

use SymfonyComponentHttpFoundationResponse;

$response = new Response(‘Hello World!’, 200, array(‘content-type’ => ‘text/html’));

$response->setCache(array(
‘etag’ => ‘a_unique_id_for_this_resource’,
‘last_modified’ => new DateTime(),
‘max_age’ => 600,
‘s_maxage’ => 600,
‘private’ => false,
‘public’ => true,
));

If you use both the request and response from the http foundation you can check your conditional validators from the request easily:


use SymfonyComponentHttpFoundationRequest;
use SymfonyComponentHttpFoundationResponse;

$request = Request::createFromGlobals();

$response = new Response(‘Hello World!’, 200, array(‘content-type’ => ‘text/html’));

if ($response->isNotModified($request)) {
$response->send();
}

Find more examples and complete documentation from the very detailed Symfony documentation.

Caching result sets with Doctrine ORM

If you aren’t using an ORM or some form of database abstraction you should consider it. Doctrine is the most fully featured database abstraction layer and object-relational mapper available for PHP. Of course, adding abstractions comes at the cost of performance, but I find Doctrine to be exteremly fast and efficient if used properly. If you leverage the Doctrine ORM you can easily enable caching result sets in Memcached:


$memcache = new Memcache();
$memcache->connect('localhost', 11211);

$memcacheDriver = new DoctrineCommonCacheMemcacheCache();
$memcacheDriver->setMemcache($memcache);

$config = new DoctrineORMConfiguration();
$config->setQueryCacheImpl($memcacheDriver);
$config->setMetadataCacheImpl($memcacheDriver);
$config->setResultCacheImpl($memcacheDriver);

$entityManager = DoctrineORMEntityManager::create(array(‘driver’ => ‘pdo_sqlite’, ‘path’ => __DIR__ . ‘/db.sqlite’), $config);

$query = $em->createQuery(‘select u from EntitiesUser u’);
$query->useResultCache(true, 60);

$users = $query->getResult();

Find more examples and complete documentation from the very detailed Doctrine documentation.

Caching web service responses with Guzzle HTTP client

Interacting with web services is very common in modern web applications. Guzzle is the most fully featured http client available for PHP. Guzzle takes the pain out of sending HTTP requests and the redundancy out of creating web service clients. It’s a framework that includes the tools needed to create a robust web service client. Add “guzzle/guzzle” to your Composer file and you are ready to get started.

Not only does Guzzle support a variety of authentication methods (OAuth 1+2, HTTP Basic, etc), it also support best practices like retries with exponential backoffs as well as http caching.


$memcache = new Memcache();
$memcache->connect('localhost', 11211);

$memcacheDriver = new DoctrineCommonCacheMemcacheCache();
$memcacheDriver->setMemcache($memcache);

$client = new GuzzleHttpClient(‘http://www.test.com/’);

$cachePlugin = new GuzzlePluginCacheCachePlugin(array(
‘storage’ => new GuzzlePluginCacheDefaultCacheStorage(
new GuzzleCacheDoctrineCacheAdapter($memcacheDriver)
)
));
$client->addSubscriber($cachePlugin);

$response = $client->get(‘http://www.wikipedia.org/’)->send();

// response will come from cache if server sends 304 not-modified
$response = $client->get(‘http://www.wikipedia.org/’)->send();

Following these tips will allow you to easily cache all your database queries, web service requests, and http responses.

Moving work to the background with Resque and Redis

Any process that is slow and not important for the immediate http response should be queued and processed via non-blocking background tasks. Common examples are sending social notifications (like Facebook, Twitter, LinkedIn), sending emails, and processing analytics. There are a lot of systems available for managing messaging layers or task queues, but I find Resque for PHP dead simple. I won’t provide an in-depth guide as Wan Qi Chen’s has already published an excellent blog post series about getting started with Resque. Add “chrisboulton/php-resque” to your Composer file and you are ready to get started. A very simple introduction to adding Resque to your application:

1) Define a Redis backend

Resque::setBackend('localhost:6379');

2) Define a background task

class MyTask
{
public function perform()
{
// Work work work
echo $this->args['name'];
}
}

3) Add a task to the queue

Resque::enqueue('default', 'MyTask', array('name' => 'AppD'));

4) Run a command line task to process the tasks with five workers from the queue in the background

$ QUEUE=* COUNT=5 bin/resque

For more information read the official documentation or see the very complete tutorial from Wan Qi Chen:

Monitor production performance

AppDynamics is application performance management software designed to help dev and ops troubleshoot performance problems in complex production applications. The application flow map allows you to easily monitor calls to databases, caches, queues, and web services with code level detail to performance problems:

Symfony2 Application Flow Map

Take five minutes to get complete visibility into the performance of your production applications with AppDynamics Pro today.

If you prefer slide format these posts were inspired from a recent tech talk I presented:

As always, please feel free to comment if you think I have missed something or if you have a request for content in an upcoming post.

PHP Performance Crash Course, Part 1: The Basics

We all know performance is important, but performance tuning is too often an afterthought. As a result, taking on a performance tuning project for a slow application can be pretty intimidating – where do you even begin? In this series I’ll tell you about the strategies and technologies that (in my experience) have been the most successful in improving PHP performance. To start off, however, we’ll talk about some of the easy wins in PHP performance tuning. These are the things you can do that’ll get you the most performance bang for your buck, and you should be sure you’ve checked off all of them before you take on any of the more complex stuff.

Why does performance matter?

The simple truth is that application performance has a direct impact on your bottom line:

 

Follow these simple best practices to start improving PHP performance:

 

Update PHP!

One of the easiest improvements you can make to improve performance and stability is to upgrade your version of PHP. PHP 5.3.x was released in 2009. If you haven’t migrated to PHP 5.4, now is the time! Not only do you benefit from bug fixes and new features, but you will also see faster response times immediately. See PHP.net to get started.

Once you’ve finished upgrading PHP, be sure to disable any unused extensions in production such as xdebug or xhprof.

Use an opcode cache

PHP is an interpreted language, which means that every time a PHP page is requested, the server will interpet the PHP file and compile it into something the machine can understand (opcode). Opcode caches preserve this generated code in a cache so that it will only need to be interpreted on the first request. If you aren’t using an opcode cache you’re missing out on a very easy performance gain. Pick your flavor: APC, Zend Optimizer, XCache, or Eaccellerator. I highly recommend APC, written by the creator of PHP, Rasmus Lerdorf.

Use autoloading

Many developers writing object-oriented applications create one PHP source file per class definition. One of the biggest annoyances in writing PHP is having to write a long list of needed includes at the beginning of each script (one for each class). PHP re-evaluates these require/include expressions over and over during the evaluation period each time a file containing one or more of these expressions is loaded into the runtime. Using an autoloader will enable you to remove all of your require/include statements and benefit from a performance improvement. You can even cache the class map of your autoloader in APC for a small performance improvement.

Optimize your sessions

While HTTP is stateless, most real life web applications require a way to manage user data. In PHP, application state is managed via sessions. The default configuration for PHP is to persist session data to disk. This is extremely slow and not scalable beyond a single server. A better solution is to store your session data in a database and front with an LRU (Least Recently Used) cache with Memcached or Redis. If you are super smart you will realize you should limit your session data size (4096 bytes) and store all session data in a signed or encrypted cookie.

Use a distributed data cache

Applications usually require data. Data is usually structured and organized in a database. Depending on the data set and how it is accessed it can be expensive to query. An easy solution is to cache the result of the first query in a data cache like Memcached or Redis. If the data changes, you invalidate the cache and make another SQL query to get the updated result set from the database.

I highly recommend the Doctrine ORM for PHP which has built-in caching support for Memcached or Redis.

There are many use cases for a distributed data cache from caching web service responses and app configurations to entire rendered pages.

Do blocking work in the background

Often times web applications have to run tasks that can take a while to complete. In most cases there is no good reason to force the end-user to have to wait for the job to finish. The solution is to queue blocking work to run in background jobs. Background jobs are jobs that are executed outside the main flow of your program, and usually handled by a queue or message system. There are a lot of great solutions that can help solve running backgrounds jobs. The benefits come in terms of both end-user experience and scaling by writing and processing long running jobs from a queue. I am a big fan of Resque for PHP that is a simple toolkit for running tasks from queues. There are a variety of tools that provide queuing or messaging systems that work well with PHP:

I highly recommend Wan Qi Chen’s excellent blog post series about getting started with background jobs and Resque for PHP.

User location update workflow with background jobs

Leverage HTTP caching

HTTP caching is one of the most misunderstood technologies on the Internet. Go read the HTTP caching specification. Don’t worry, I’ll wait. Seriously, go do it! They solved all of these caching design problems a few decades ago. It boils down to expiration or invalidation and when used properly can save your app servers a lot of load. Please read the excellent HTTP caching guide from Mark Nottingam. I highly recommend using Varnish as a reverse proxy cache to alleviate load on your app servers.

Optimize your favorite framework

php_framework_overlay

Deep diving into the specifics of optimizing each framework is outside of the scope of this post, but these principles apply to every framework:

  • Stay up-to-date with the latest stable version of your favorite framework
  • Disable features you are not using (I18N, Security, etc)
  • Enable caching features for view and result set caching

Learn to how to profile code for PHP performance

Xdebug is a PHP extension for powerful debugging. It supports stack and function traces, profiling information and memory allocation and script execution analysis. It allows developers to easily profile PHP code.

WebGrind is an Xdebug profiling web frontend in PHP5. It implements a subset of the features of kcachegrind and installs in seconds and works on all platforms. For quick-and-dirty optimizations it does the job. Here’s a screenshot showing the output from profiling:

Check out Chris Abernethy’s guide to profiling PHP with XDebug and Webgrind.

XHprof is a function-level hierarchical profiler for PHP with a reporting and UI layer. XHProf is capable of reporting function-level inclusive and exclusive wall times, memory usage, CPU times and number of calls for each function. Additionally, it supports the ability to compare two runs (hierarchical DIFF reports) or aggregate results from multiple runs.

AppDynamics is application performance management software designed to help dev and ops troubleshoot problems in complex production apps.

Complete Visibility

Get started with AppDynamics today and get in-depth analysis of your applications performance.

PHP application performance is only part of the battle

Now that you have optimized the server-side, you can spend time improving the client side! In modern web applications most of the end-user experience time is spent waiting on the client side to render. Google has dedicated many resources to helping developers improve client side performance.

See us live!

If you are interested in hearing more best practices for scaling PHP in the real world join my session at LonestarPHP in Dallas, Texas or International PHP Conference in Berlin, Germany.