Is Your Database Dragging Down your PHP Application?

Here’s a quiz from our last PHP application performance post: How much of a PHP application’s execution time is used up accessing the database? 50%? 75%? You may recall the real answer, since it is so (some might say surprisingly) big: some 90% of PHP execution time happens in the database layer. So it makes sense to look to the database to reap some big performance gains. Let’s take a look at how some of the decisions you make about your database can affect performance.

Is NoSQL a No-Brainer?

Whether it’s MongoDB, Cassandra, or one of the other flavors, developers often turn to NoSQL databases as a solution to database performance. But NoSQL databases aren’t a magic bullet. As with so many application questions, it all depends on the use case. NoSQL databases perform extremely well in certain, but not all, use cases. It depends on whether the dataset for your specific application is better represented by a non-relational model. If it is, then find a NoSQL solution that matches your data and offers the tuning and scalability you need. If not, keep reading.

Your SQL, MySQL, Everybody’s SQL

We’re going to focus this discussion on the MySQL database, since it’s the one most frequently used with PHP. But almost everything we say about it will apply, at least in broad strokes, to other relational databases. We’re not going to dive into database configuration either, as that’s a whole other topic. What follows are some of the top things a PHP developer should be looking at to make sure the database is performing at the highest possible level.

Why Be Normal?

The web is full of discussions about database normalization, but let’s go with the short explanation that these are formalized ways to structure a database to minimize redundancy and therefore keep the database cleaner. But neat and clean do not always equate to fast. There are times when denormalizing — combining tables together, even if it means having duplicate information — can be more efficient for your application and improve performance. It might seem counterintuitive, but sometimes being a little messy is the quickest way to get what you want out of your database.

Let’s look at a customer database application as a simplified example. This application needs to track what company each customer works for. This would typically be stored as two tables with a foreign key constraint between them, such as:

CREATE TABLE `company` (
`id`    int(11) AUTO_INCREMENT NOT NULL,
`name`  varchar(512) NULL,
PRIMARY KEY(`id`)
);
CREATE TABLE `customer` (
`id`   int(11) AUTO_INCREMENT NOT NULL,
`fullName`  varchar(512) NOT NULL,
`email`   varchar(512) NULL,
`companyID` int(11) NULL,
PRIMARY KEY(`id`)
);
ALTER TABLE `customer`
ADD CONSTRAINT `fk_customer_company`
FOREIGN KEY(`companyID`)
REFERENCES `company`(`id`);

Searching for customers and including their company name in the results will now require either two queries — not efficient — or a single query with a join, such as:

SELECT cst.*, cpy.name as companyName FROM customer cst LEFT JOIN company cpy ON cst.companyID = cpy.id;

If the application will typically need to have a company name when looking up customers, then it’s obviously inefficient to make separate queries or use database joins each time. The better solution for performance is to combine these two tables into a single table, such as this:

CREATE TABLE customer (
id           int(11) AUTO_INCREMENT NOT NULL,
fullName     varchar(512) NOT NULL,
email        varchar(512) NULL,
companyName  varchar(512) NULL,
PRIMARY KEY(id)
);

This simpler query structure allows for much faster lookups. But it comes at a cost, as now company names are not standardized, and the same company could be entered differently between two customers. If it’s is important to be exact — in a billing application, for example, that needs to know what company to bill — this lack of precision could be catastrophic. But in many cases, “company” would is just another piece of data about the person. If the spelling of company — or any other descriptive information — is not critical, it’s worth it from a performance perspective to combine tables.

To Index Or Not To Index?

Indexes can speed lookups up. But too may indexes can slow things down. So a critical database performance question is whether to index a column or not.

The primary key of the database is already indexed. Database indexes are lookup tables for anything else. A general rule of thumb or starting point is to identify the columns referenced in the “where” clauses of queries, and seriously consider indexing those columns. Again, this is a general rule, and needs to be balanced with how many indexes overall are being created and whether that number will negatively impact performance.

Again using the customer database example, if the application needs to search for a customer by email address, then index the email column of the table by executing the following SQL:

CREATE INDEX idx_customer_email ON customer(email);<c/ode>

This will significantly increase the speed of customer search by email address. Multiple columns can be included in the index as well — so if the application needs to look for customers by name within a particular company, an index like this would speed the search up:

CREATE INDEX idx_customer_dual ON customer(fullName, companyName);

Be Wary Of Your Queries

Queries themselves can be a source of slowdowns, for a variety of reasons. “Query tuning” is the process of finding slow queries and fixing them to make them faster.

What makes a query slower than it should be? Maybe an unnecessary subquery was used, or a “group by” clause is used that can be refactored out. Learning to use the EXPLAIN command can help one understand the performance of queries. The primary goal is to find queries that are slow and rework them, testing the performance at each iteration.

A common query pitfall involves SQL functions. Whenever possible, queries should avoid wrapping a column name inside of a function, within a “where” clause. Here’s an example of how not to look for all records created in the last month:

SELECT * FROM user WHERE NOW() < DATE_ADD(dateCreated, INTERVAL 1 MONTH);

Written this way, MySQL has to perform a full table scan, since it can’t benefit from any index that exists on the dateCreated column. Since the dateCreated column is being used in a function to modify, it is valued on every access point. Now look at this way of re-writing this query:

SELECT * FROM user WHERE dateCreated > DATE_SUB(NOW(), INTERVAL 1 MONTH);

Here, all the columns are moved to one side of the comparison, and all the functions to the other side, so the comparison can use an index now. This way of writing the query delivers the same results, but performs much better than the first version.

Tuning even further, in this specific case performance can be increased a hair more by the removal of the NOW() function from needing to be called, potentially on every comparison unless the database optimizes that out. Instead, the current date can be output via PHP in a query that looks like this:

SELECT * FROM user WHERE dateCreated > DATE_SUB(’2014–10–31’, INTERVAL 1 MONTH);

These examples are a broad look at query tuning, but are a good starting point for looking for performance improvement opportunities.

Cash In On Caching

No discussion on database performance would be complete without looking at caching. Typically, the caching layer that PHP developers are concerned with is the ‘user cache.’ This is a programmable caching layer, usually provided by software such as Memcached, APC/APCu, or Redis.

The common thread with these caching mechanisms is simple, in-memory, key-value lookups, which are much faster than database queries. Therefore one of the most common techniques to speed up a database is to leverage caching lookups. Once the the data is retrieved, it’s stored in the cache for some period. Future queries will retrieve the stored data and not care about it being slightly older or somewhat stale data.

This can be done generically in code via a helper function, such as this one that assumes a SQL query that will only ever return one row:

function cachedSingleRowQuery($query, Array $bind, $expiry, PDO $db, Memcached $cache) {
$key = 'SingleQuery' . md5($query) . md5(implode(',', $bind);
if (!($obj = $cache->get($key))) {
$result = $db->prepare($query)->execute($bind);
$obj = $result->fetchObject();
$cache->set($key, $obj, $expiry);
}
return $obj;
}

This function checks first to see if there is a cached version of the data that already exists — by making a unique lookup key from a hash of the query — and returns it if it does, or if it doesn’t, executes the query and caches the output for the specified amount of time, so it’ll be there for the next time it’s queried.

Another technique is to use a write-through cache, where the final copy of the data is stored in a database, but once the data is queried it is cached forever. To avoid the problem of stale data, the software proactively updates the cache any time the database layer is updated, or just deletes it, so it will be re-created the next time it’s requested.

All that said, here’s one caching caveat to pay attention to: If you’re using any framework or application layer for the application, don’t reinvent the wheel. Almost all modern frameworks insert a caching layer over the top of the database access layer or the object model automatically, in the way that the framework expects. So you’ll already be getting the benefit of efficient caching.

Add It All Up For The Highest Performance

Even from this brief discussion, you can see that there’s not one thing that makes or breaks your database performance. It’s a number of things that all have to be done well. But as we said at the outset, given that 90% of application execution time is in the database, it’s well worth the time and effort to make sure everything about your database is working in the direction of top performance.

Gain better visibility and ensure optimal PHP application performance, try AppDynamics for FREE today!

Why Every PHP Application Should Use an OpCache

PHP 5.5 introduced opcode caching into the core via OPCache.  OPCache was previously known as Zend Optimizer+, and although free, was closed source. Zend decided to open source the implementation, and include it in the core PHP distribution. OPCache is also available as an extension through pecl, and is compatible all the way back to PHP 5.2. While other opcode caching solutions like APC exist, now that OPCache is bundled with PHP, it will likely become the standard going forward.

What is an opcode cache, and how does it work? Every time a PHP script is requested, the PHP script will be parsed and compiled into opcode which then is executed in the Zend Engine. This is what allows PHP developers to skip the compilation step required in other languages like Java or C# — you can make changes to your PHP code and see those changes immediately. However, the parsing and compiling steps increase your response time, and in a non-development environment are often unnecessary, since your application code changes infrequently.

When an opcode cache is introduced, after a PHP script is interpreted and turned into opcode, it’s saved in shared memory, and subsequent requests will skip the parsing and compilation phases and leverage the opcode stored in memory, reducing the execution time of PHP.

How much benefit can you expect from an opcode cache?  Like many things in life, the answer is it depends. To test the benefit of OPCache, we have taken an existing PHP demo application used at AppDynamics, and installed OPCache. The OPCache settings were fairly straightforward, but we opted to use 0 for the refresh rate, which means a script will never be checked to see if it’s updated. While applicable for a production environment, it means you must delete the opcache cache when deploying new code.

The demo application is a simple e-commerce site built on top of Symfony 2 and PHP 5.4, leveraging a MySQL database, memcache and a backend Java service. For the test, the demo application is running on a medium ec2 instance (database, memcached, and Java services are on separate instances) with a small but steady amount of load on four different pages within the application.

In order to understand the performance benefit of enabling OPCache, the AppDynamics PHP agent was installed. The PHP agent auto-discovers application topology, and tracks metrics and flow maps for business transactions, app services, and backends in your web application by injecting instrumentation in the PHP-enabled web server instance at runtime. By leveraging the metrics collected by AppDynamics, we can see the decrease in response time OPCache provides.

Once OPCache was enabled on the application, there was a 14% reduction in response time for the application overall. AppDynamics has a feature called “compare releases” which allows you to select to separate time ranges and compare key metrics. In the screenshot below, we are comparing two small time ranges – March 14th from 9:00am to 12:00pm and March 14th from 1:00pm to 4:00pm, as OPCache was enabled at 12:10pm on March 14th.

While a 14% decrease in response time is good, especially considering the minimal amount of work required to install and enable OPCache, it may be less than you were expecting. The overall application decrease in response time obscures the variation seen across different pages within the application.

AppDynamics analyzes a concept called a business transaction, which represents an aggregation of similar user requests to accomplish a logical user activity. In this demo application, we were generating load on four specific business transactions: View Product, Search, Login, and Loop. Using the compare releases functionality from AppDynamics, instead of focusing on the individual business transactions, we see a lot of variation between the different business transactions in response time once OPCache was introduced.

Let’s look at each business transaction and determine why some transactions saw a large reduction in response time, while others experience a moderate or minimal decrease in response time.

The login business transaction saw a substantial decrease in response time, 74%.

The Login business transaction is relatively simple, as shown by the AppDynamics flow map below (a flow map graphically represents the tiers, nodes, and backends and the process flows between them in a managed application). The transaction goes through a standard Symfony controller and renders a basic html form to login — there are no databases or external services involved. On this particular business transaction, a majority of the response time was spent parsing and compiling the PHP. Once those steps are removed via an OPCache, the response time drops dramatically.

The Product View business transaction experience a similar decrease in response time at 74%.

The Product View business transaction relies on both memcache and MySQL database, although only 2% of the request time is spent outside of PHP (after OPCache was turned on, this increased to 8%), and hence we see a large benefit from opcode caching like we saw in the Login business transaction.

The Search business transaction response time dropped by only 8%.

Looking at the flow map, the majority of the response time is spent on the Java backend service, with a small amount of time spent in the network. Enabling OPCode  resulted in a 70% reduction in response time in PHP, but since PHP was only 11% of the overall response time, the effect was muted.

Before:After:

The Loop business transaction is not part of the demo application, but was added specifically for this test. The Loop business transaction saw only a 6% decrease in time.

Loop is 3 lines of code – it loops 10 millions times and increments a counter on each loop. The amount of time it takes to parse and compile the code is small compared with the time it takes to actually execute the opcode, hence the small decrease in response time form enabling opcode caching.

To illustrate the difference we can review a call graph of each transaction. AppDynamics captures snapshots of certain requests, and a snapshot contains the call graph. Looking at the call graphs below for Loop and Login, we see Login has a lot more PHP code to parse and compile:

 

Loop

 

Login

In summary, opcode caches provide a quick way to decrease the latency of your PHP application and should always be enabled in production PHP environments. The decrease in response time will primarily depend on two things: 1. The amount of time the request spends in PHP. If your application spends a lot of time waiting for a database to return results or relies on slow third party web services, the decrease in response time from an opcode cache will be on the lower side. 2. If your PHP scripts are very basic, including only the minimal amount of code to process the request, as compared to using a framework, then the reduction in response time will also be limited. Get started by upgrading to PHP 5.5 or installing Zend OpCache today.

Take five minutes to get complete visibility and control into the performance of your production applications with AppDynamics Pro today.

Tracking PHP Application Events with AppDynamics

Event Tracking

All too often PHP engineers find themselves repeating the same tasks to triage their application problems. Issues can range from poorly written code to database bottlenecks, slow remote service API calls, or machine issues including I/O bottlenecks — whether hardware or network related.

In certain cases, these issues are nearly impossible to discover due to the nonexistence of a mechanism for tracking and reporting particular events that may impact the performance of your application when those events are not directly related to the application code itself.

For example, imagine the frustration when a recent PHP upgrade causes a fatal error. What if routine configuration changes to your maintenance scripts also impacts your ability to read from your database?

Perhaps switching database table engines from MyISAM to to InnoDB is causing application slowdown. The numerous types of events outside of the normal development workflow can compromise the integrity of your application’s user experience while at the same time creating unwanted frustration.

Types of Events

Event tracking is an integral part of maintaining true and transparent insight into the various events that revolve around the performance of your application.  One of my favorite core APM features is Event Tracking: the ability to track a change in the state of your application that is of potential interest. Some examples of the various actions you can track are:

  • Upgrading your PHP framework

  • Application deployments AND rollbacks

  • Switching database table engines (e.g. MyISAM to InnoDB)

  • Changes/upgrades to hardware

  • Upgrades to your OS, MySQL, web server, etc.

  • PHP.ini changes

  • Installing/upgrading PHP extensions

I think you get the idea – you want to track anything that could potentially impact the performance of your application.

AppDynamics Event Tracking

The AppDynamics Event Tracking feature can be accessed by clicking the Events link in the main navigation menu.

Once clicked, you’re presented with a view of all events that have occurred in your application at that time. In this example, we’re presented with Health Rule Violations and an instance of a server being restarted. To narrow down what you’re looking for, you have the option of using an advanced search filter. Select ‘Show Filters’ and you will see a list of choices to the left of the event list.

Compare Releases

‘Compare Releases’ shows the real power of AppDynamics and is the reason why it remains one of my favorite core APM features. Under the Analyze menu item, click ‘Compare Releases’ and you’ll be shown a screen comparing your application between two different time periods. A unique column here is the ‘Events’ column displaying any events registered during the specified time range to give you further insight into what may have been previously overlooked. In this specific example, we’re comparing the application’s KPIs between two different weeks. You can see that our error rate decreased the week later with no health rule violations registered as events. The screenshot shows a definitive performance improvement between the two time periods.

We encourage you to explore the Events feature further. You will see how you can combine both the power of Change Releases and our Alert & Respond feature to execute custom scripts based upon triggered events..  an added bonus, the Events feature is also accessible by a RESTful API that allows you to register a change event from anywhere at anytime.

Take five minutes to get complete visibility into the performance of your production applications with AppDynamics Pro today.

 

PHP Performance Crash Course, Part 2: The Deep Dive

In my first post on this series I covered some basic tips for optimizing performance in php applications. In this post we are going to dive a bit deeper into the principles and practical tips in scaling PHP.

Top engineering organizations think of performance not as a nice-to-have, but as a crucial feature of their product. Those organizations understand that performance has a direct impact on the success of their business.

Ultimately, scalability is about the entire architecture, not some minor code optimizations. Often times people get this wrong and naively think they should focus on the edge cases. Solid architectural decisions like doing blocking work in the background via tasks, proactively caching expensive calls, and using a reverse proxy cache will get you much further than arguing about single quotes or double quotes.

Just to recap some core principles for performant PHP applications:

The first few tips don’t really require elaboration, so I will focus on what matters.

Optimize your sessions

In PHP it is very easy to move your session store to Memcached:

1) Install the Memcached extension with PECL

pecl install memcached

2) Customize your php.ini configuration to change the session handler


session.save_handler = memcached
session.save_path = "localhost:11211"

If you want to support a pool of memcache instances you can separate with a comma:

session.save_handler = memcached
session.save_path = "10.0.0.10:11211,10.0.0.11:11211,10.0.0.12:11211"

The Memcached extension has a variety of configuration options available, see the full list on Github. The ideal configuration I have found if using a pool of servers:

session.save_handler = memcached
session.save_path = "10.0.0.10:11211,10.0.0.11:11211,10.0.0.12:11211"

memcached.sess_prefix = “session.”
memcached.sess_consistent_hash = On
memcached.sess_remove_failed = 1
memcached.sess_number_of_replicas = 2
memcached.sess_binary = On
memcached.sess_randomize_replica_read = On
memcached.sess_locking = On
memcached.sess_connect_timeout = 200
memcached.serializer = “igbinary”

That’s it! Consult the documentation for a complete explanation of these configuration directives.

Leverage caching

Any data that is expensive to generate or query and long lived should be cached in-memory if possible. Common examples of highly cacheable data include web service responses, database result sets, and configuration data.

Using the Symfony2 HttpFoundation component for built-in http caching support

I won’t attempt to explain http caching. Just go read the awesome post from Ryan Tomako, Things Caches Do or the more in-depth guide to http caching from Mark Nottingham. Both are stellar posts that every professional developer should read.

With the Symfony2 HttpFoundation component it is easy to add support for caching to your http responses. The component is completely standalone and can be dropped into any existing php application to provide an object oriented abstraction around the http specification. The goal is to help you manage requests, responses, and sessions. Add “symfony/http-foundation” to your Composer file and you are ready to get started.

Expires based http caching flow

use SymfonyComponentHttpFoundationResponse;

$response = new Response(‘Hello World!’, 200, array(‘content-type’ => ‘text/html’));

$response->setCache(array(
‘etag’ => ‘a_unique_id_for_this_resource’,
‘last_modified’ => new DateTime(),
‘max_age’ => 600,
‘s_maxage’ => 600,
‘private’ => false,
‘public’ => true,
));

If you use both the request and response from the http foundation you can check your conditional validators from the request easily:


use SymfonyComponentHttpFoundationRequest;
use SymfonyComponentHttpFoundationResponse;

$request = Request::createFromGlobals();

$response = new Response(‘Hello World!’, 200, array(‘content-type’ => ‘text/html’));

if ($response->isNotModified($request)) {
$response->send();
}

Find more examples and complete documentation from the very detailed Symfony documentation.

Caching result sets with Doctrine ORM

If you aren’t using an ORM or some form of database abstraction you should consider it. Doctrine is the most fully featured database abstraction layer and object-relational mapper available for PHP. Of course, adding abstractions comes at the cost of performance, but I find Doctrine to be exteremly fast and efficient if used properly. If you leverage the Doctrine ORM you can easily enable caching result sets in Memcached:


$memcache = new Memcache();
$memcache->connect('localhost', 11211);

$memcacheDriver = new DoctrineCommonCacheMemcacheCache();
$memcacheDriver->setMemcache($memcache);

$config = new DoctrineORMConfiguration();
$config->setQueryCacheImpl($memcacheDriver);
$config->setMetadataCacheImpl($memcacheDriver);
$config->setResultCacheImpl($memcacheDriver);

$entityManager = DoctrineORMEntityManager::create(array(‘driver’ => ‘pdo_sqlite’, ‘path’ => __DIR__ . ‘/db.sqlite’), $config);

$query = $em->createQuery(‘select u from EntitiesUser u’);
$query->useResultCache(true, 60);

$users = $query->getResult();

Find more examples and complete documentation from the very detailed Doctrine documentation.

Caching web service responses with Guzzle HTTP client

Interacting with web services is very common in modern web applications. Guzzle is the most fully featured http client available for PHP. Guzzle takes the pain out of sending HTTP requests and the redundancy out of creating web service clients. It’s a framework that includes the tools needed to create a robust web service client. Add “guzzle/guzzle” to your Composer file and you are ready to get started.

Not only does Guzzle support a variety of authentication methods (OAuth 1+2, HTTP Basic, etc), it also support best practices like retries with exponential backoffs as well as http caching.


$memcache = new Memcache();
$memcache->connect('localhost', 11211);

$memcacheDriver = new DoctrineCommonCacheMemcacheCache();
$memcacheDriver->setMemcache($memcache);

$client = new GuzzleHttpClient(‘http://www.test.com/’);

$cachePlugin = new GuzzlePluginCacheCachePlugin(array(
‘storage’ => new GuzzlePluginCacheDefaultCacheStorage(
new GuzzleCacheDoctrineCacheAdapter($memcacheDriver)
)
));
$client->addSubscriber($cachePlugin);

$response = $client->get(‘http://www.wikipedia.org/’)->send();

// response will come from cache if server sends 304 not-modified
$response = $client->get(‘http://www.wikipedia.org/’)->send();

Following these tips will allow you to easily cache all your database queries, web service requests, and http responses.

Moving work to the background with Resque and Redis

Any process that is slow and not important for the immediate http response should be queued and processed via non-blocking background tasks. Common examples are sending social notifications (like Facebook, Twitter, LinkedIn), sending emails, and processing analytics. There are a lot of systems available for managing messaging layers or task queues, but I find Resque for PHP dead simple. I won’t provide an in-depth guide as Wan Qi Chen’s has already published an excellent blog post series about getting started with Resque. Add “chrisboulton/php-resque” to your Composer file and you are ready to get started. A very simple introduction to adding Resque to your application:

1) Define a Redis backend

Resque::setBackend('localhost:6379');

2) Define a background task

class MyTask
{
public function perform()
{
// Work work work
echo $this->args['name'];
}
}

3) Add a task to the queue

Resque::enqueue('default', 'MyTask', array('name' => 'AppD'));

4) Run a command line task to process the tasks with five workers from the queue in the background

$ QUEUE=* COUNT=5 bin/resque

For more information read the official documentation or see the very complete tutorial from Wan Qi Chen:

Monitor production performance

AppDynamics is application performance management software designed to help dev and ops troubleshoot performance problems in complex production applications. The application flow map allows you to easily monitor calls to databases, caches, queues, and web services with code level detail to performance problems:

Symfony2 Application Flow Map

Take five minutes to get complete visibility into the performance of your production applications with AppDynamics Pro today.

If you prefer slide format these posts were inspired from a recent tech talk I presented:

As always, please feel free to comment if you think I have missed something or if you have a request for content in an upcoming post.

PHP Performance Crash Course, Part 1: The Basics

We all know performance is important, but performance tuning is too often an afterthought. As a result, taking on a performance tuning project for a slow application can be pretty intimidating – where do you even begin? In this series I’ll tell you about the strategies and technologies that (in my experience) have been the most successful in improving PHP performance. To start off, however, we’ll talk about some of the easy wins in PHP performance tuning. These are the things you can do that’ll get you the most performance bang for your buck, and you should be sure you’ve checked off all of them before you take on any of the more complex stuff.

Why does performance matter?

The simple truth is that application performance has a direct impact on your bottom line:

 

Follow these simple best practices to start improving PHP performance:

 

Update PHP!

One of the easiest improvements you can make to improve performance and stability is to upgrade your version of PHP. PHP 5.3.x was released in 2009. If you haven’t migrated to PHP 5.4, now is the time! Not only do you benefit from bug fixes and new features, but you will also see faster response times immediately. See PHP.net to get started.

Once you’ve finished upgrading PHP, be sure to disable any unused extensions in production such as xdebug or xhprof.

Use an opcode cache

PHP is an interpreted language, which means that every time a PHP page is requested, the server will interpet the PHP file and compile it into something the machine can understand (opcode). Opcode caches preserve this generated code in a cache so that it will only need to be interpreted on the first request. If you aren’t using an opcode cache you’re missing out on a very easy performance gain. Pick your flavor: APC, Zend Optimizer, XCache, or Eaccellerator. I highly recommend APC, written by the creator of PHP, Rasmus Lerdorf.

Use autoloading

Many developers writing object-oriented applications create one PHP source file per class definition. One of the biggest annoyances in writing PHP is having to write a long list of needed includes at the beginning of each script (one for each class). PHP re-evaluates these require/include expressions over and over during the evaluation period each time a file containing one or more of these expressions is loaded into the runtime. Using an autoloader will enable you to remove all of your require/include statements and benefit from a performance improvement. You can even cache the class map of your autoloader in APC for a small performance improvement.

Optimize your sessions

While HTTP is stateless, most real life web applications require a way to manage user data. In PHP, application state is managed via sessions. The default configuration for PHP is to persist session data to disk. This is extremely slow and not scalable beyond a single server. A better solution is to store your session data in a database and front with an LRU (Least Recently Used) cache with Memcached or Redis. If you are super smart you will realize you should limit your session data size (4096 bytes) and store all session data in a signed or encrypted cookie.

Use a distributed data cache

Applications usually require data. Data is usually structured and organized in a database. Depending on the data set and how it is accessed it can be expensive to query. An easy solution is to cache the result of the first query in a data cache like Memcached or Redis. If the data changes, you invalidate the cache and make another SQL query to get the updated result set from the database.

I highly recommend the Doctrine ORM for PHP which has built-in caching support for Memcached or Redis.

There are many use cases for a distributed data cache from caching web service responses and app configurations to entire rendered pages.

Do blocking work in the background

Often times web applications have to run tasks that can take a while to complete. In most cases there is no good reason to force the end-user to have to wait for the job to finish. The solution is to queue blocking work to run in background jobs. Background jobs are jobs that are executed outside the main flow of your program, and usually handled by a queue or message system. There are a lot of great solutions that can help solve running backgrounds jobs. The benefits come in terms of both end-user experience and scaling by writing and processing long running jobs from a queue. I am a big fan of Resque for PHP that is a simple toolkit for running tasks from queues. There are a variety of tools that provide queuing or messaging systems that work well with PHP:

I highly recommend Wan Qi Chen’s excellent blog post series about getting started with background jobs and Resque for PHP.

User location update workflow with background jobs

Leverage HTTP caching

HTTP caching is one of the most misunderstood technologies on the Internet. Go read the HTTP caching specification. Don’t worry, I’ll wait. Seriously, go do it! They solved all of these caching design problems a few decades ago. It boils down to expiration or invalidation and when used properly can save your app servers a lot of load. Please read the excellent HTTP caching guide from Mark Nottingam. I highly recommend using Varnish as a reverse proxy cache to alleviate load on your app servers.

Optimize your favorite framework

php_framework_overlay

Deep diving into the specifics of optimizing each framework is outside of the scope of this post, but these principles apply to every framework:

  • Stay up-to-date with the latest stable version of your favorite framework
  • Disable features you are not using (I18N, Security, etc)
  • Enable caching features for view and result set caching

Learn to how to profile code for PHP performance

Xdebug is a PHP extension for powerful debugging. It supports stack and function traces, profiling information and memory allocation and script execution analysis. It allows developers to easily profile PHP code.

WebGrind is an Xdebug profiling web frontend in PHP5. It implements a subset of the features of kcachegrind and installs in seconds and works on all platforms. For quick-and-dirty optimizations it does the job. Here’s a screenshot showing the output from profiling:

Check out Chris Abernethy’s guide to profiling PHP with XDebug and Webgrind.

XHprof is a function-level hierarchical profiler for PHP with a reporting and UI layer. XHProf is capable of reporting function-level inclusive and exclusive wall times, memory usage, CPU times and number of calls for each function. Additionally, it supports the ability to compare two runs (hierarchical DIFF reports) or aggregate results from multiple runs.

AppDynamics is application performance management software designed to help dev and ops troubleshoot problems in complex production apps.

Complete Visibility

Get started with AppDynamics today and get in-depth analysis of your applications performance.

PHP application performance is only part of the battle

Now that you have optimized the server-side, you can spend time improving the client side! In modern web applications most of the end-user experience time is spent waiting on the client side to render. Google has dedicated many resources to helping developers improve client side performance.

See us live!

If you are interested in hearing more best practices for scaling PHP in the real world join my session at LonestarPHP in Dallas, Texas or International PHP Conference in Berlin, Germany.