Top 6 Database Performance Metrics to Monitor in Enterprise Applications

The previous article presented an introduction to SQL and NoSQL databases. This article builds on those topics by reviewing six of the top performance metrics to capture when assessing the health of the database in your enterprise application.

Specifically, this article reviews the following:

  • Business Transactions
  • Query Performance
  • User and Query Conflicts
  • Capacity
  • Configuration
  • NoSQL Databases

Business Transactions

Business Transactions provide insight into real user behavior: they capture real-time performance that real users are experiencing as they interact with your application. As mentioned in the previous article, measuring the performance of a business transaction involves capturing the response time of a business transaction holistically as well as measuring the response times of its constituent tiers. These response times can then be compared with the baseline that best meets your business needs to determine normalcy.

If you were to measure only a single aspect of your application, I would encourage you to measure the behavior of your business transactions. While container metrics can provide a wealth of information and can help you determine when to auto-scale your environment, your business transactions determine the performance of your application. Instead of asking for the CPU usage of your application server you should be asking whether or not your users can complete their business transactions and if those business transactions are behaving optimally.

As a little background, business transactions are identified by their entry-point, which is the interaction with your application that starts the business transaction.

Once a business transaction is defined, its performance is measured across your entire application ecosystem. The performance of each business transaction is evaluated against its baseline to assess normalcy. For example, we might determine that a business transaction is behaving abnormally if its response time is more than two standard deviations slower than the average response time in its baseline, as shown in figure 1.

 


Figure 1 Evaluating BT Response Time Against its Baseline

The baseline used to assess the business transaction is held constant for the hour in which the business transaction is running, but the baseline itself is refined by every business transaction execution. For example, if you have chosen a baseline that compares business transactions against the average response time for the hour of the day and the day of the week, then once the current hour is over, all business transactions executed in that hour are incorporated into the baseline for that hour next week. Through this mechanism an application can evolve over time without requiring the original baseline to be thrown away and rebuilt; you can think of it as a window moving over time.
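
As a minimal sketch of the comparison described above, consider the following SQL. The table and column names (bt_execution, bt_baseline, response_ms, avg_ms, stddev_ms) are hypothetical and used purely for illustration; an APM product performs this evaluation internally rather than through a query you would run yourself.

SELECT bt.id, bt.response_ms
FROM bt_execution bt
JOIN bt_baseline b
  ON b.hour_of_day = HOUR(bt.started_at)
 AND b.day_of_week = DAYOFWEEK(bt.started_at)
WHERE bt.response_ms > b.avg_ms + 2 * b.stddev_ms;  -- more than two standard deviations slower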

In summary, business transactions are the most reflective measurement of the user experience, so they are the most important metric to capture.

Query Performance

The most obvious place to look for poor query performance is in the query itself. Problems can result from queries that take too long to identify the required data or bring the data back. Look for these issues in queries:

Selecting more data than needed

It is not enough to write queries that return the appropriate rows; queries that return too many columns can cause slowness both in selecting the rows and in retrieving the data. It is better to list the required columns than to write SELECT *. When the query selects specific fields, the plan may identify a covering index, which can speed up the results. A covering index includes all the fields used in the query, meaning the database can generate the results from the index alone without going to the underlying table. Additionally, listing only the required columns reduces the data that's transmitted, which also benefits performance.
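
As an illustrative sketch (the orders table and its columns are hypothetical), the query below lists only the columns it needs, so an index containing those columns can cover it entirely:

-- Covering index: contains every column the query touches
CREATE INDEX idx_orders_cust ON orders (customerID, orderDate, total);

-- Avoid: SELECT * FROM orders WHERE customerID = 42;
SELECT customerID, orderDate, total
FROM orders
WHERE customerID = 42;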

Inefficient joins between tables

Joins cause the database to bring multiple sets of data into memory and compare values, which can generate many database reads and consume significant CPU. Depending on how the tables are indexed, the join may require scanning all the rows of both tables. A poorly written join on two large tables that requires a complete scan of each one is very computationally expensive. Other factors that slow down joins include joining on columns of different data types, which requires conversions, or a join condition that includes LIKE, which prevents the use of indexes. Avoid defaulting to a full outer join; use inner joins when appropriate to bring back only the desired data, as sketched below.
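
The sketch below, again using hypothetical orders and customer tables, contrasts an inner join on indexed, same-typed key columns with join conditions that tend to defeat indexes:

-- Prefer an inner join on indexed columns of the same data type
SELECT o.id, o.total, c.fullName
FROM orders o
INNER JOIN customer c ON c.id = o.customerID;

-- Patterns that force scans or conversions (illustrative only):
--   ... ON CAST(o.customerCode AS CHAR) = c.customerCode     -- data type conversion in the join
--   ... ON c.name LIKE CONCAT('%', o.companyFragment, '%')   -- LIKE in the join condition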

Too few or too many indexes

When there aren’t any indexes that the query optimizer can use, the database needs to resort to table scans to produce query results, which generates a large amount of disk input/output (I/O). Proper indexes also reduce the need for sorting results. Indexes on non-unique values do not provide as much help as unique indexes in generating results. If the keys are large, the indexes become large as well, and using them creates more disk I/O. Most indexes are intended to help the performance of data retrieval, but it is important to realize that indexes also impact the performance of data inserts and updates, as all associated indexes must be updated.
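
A small, hypothetical example of that trade-off: the index below speeds up the read, but every insert or update on the table must now maintain it as well.

CREATE INDEX idx_orders_status ON orders (status);

-- Faster thanks to the index:
SELECT id, total FROM orders WHERE status = 'SHIPPED';

-- Slightly slower, because idx_orders_status must also be updated:
INSERT INTO orders (customerID, status, total) VALUES (42, 'NEW', 19.99);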

Too much literal SQL causing parse contention

Before any SQL query can be executed, it must be parsed, which checks syntax and permissions before generating the execution plan. Because parsing is expensive, databases save the SQL they’ve parsed to reuse it and eliminate the parsing time. Queries that use literal values cannot be shared, as the WHERE clauses differ. This results in each query being parsed and added to the shared pool. Because the pool has limited space, some saved queries are discarded to make room. If those queries recur, they need to be parsed again.
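
One common remedy is bind variables. The sketch below uses MySQL-style prepared-statement syntax against a hypothetical orders table; the exact syntax varies by database, but the idea of sharing one parsed statement across many executions is general.

-- Literal values: each distinct statement is parsed separately
SELECT * FROM orders WHERE customerID = 42;
SELECT * FROM orders WHERE customerID = 43;

-- Bind variable: one parsed statement, reused for every value
PREPARE get_orders FROM 'SELECT * FROM orders WHERE customerID = ?';
SET @cid = 42;
EXECUTE get_orders USING @cid;
DEALLOCATE PREPARE get_orders;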

User and Query Conflicts

Databases are designed to be multi-user, but the activities of multiple users can cause conflicts.

Page/row locking due to slow queries

To ensure that queries produce accurate results, databases must lock tables to prevent inserts and updates from occurring while a read query is running. If a report or query is slow, users who need to modify database values may experience slowness and delays in completing their updates. Lock hints help the database use the least disruptive locks. Separating reporting from transactional databases is also an efficient solution.
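
As a hedged illustration of both suggestions (syntax differs by vendor, and reading uncommitted data is only acceptable for reports that can tolerate it), the orders table here is hypothetical:

-- SQL Server-style lock hint: the report reads without blocking writers
SELECT status, COUNT(*) FROM orders WITH (NOLOCK) GROUP BY status;

-- MySQL equivalent: relax the isolation level for the reporting session only
SET SESSION TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT status, COUNT(*) FROM orders GROUP BY status;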

Transactional locks and deadlocks

Deadlocks occur when two transactions are blocked because each one needs a resource held by the other. With a normal lock, a transaction is simply blocked until the resource is released; a deadlock, however, cannot resolve itself. Databases therefore monitor for deadlocks and choose to terminate one of the blocked transactions, freeing the resource and allowing the other transaction to proceed. The terminated transaction is rolled back.
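
A classic deadlock arises when two transactions acquire the same two locks in opposite order. The sketch below is purely illustrative; account is a hypothetical table, and the two columns represent two separate sessions.

-- Session A                               Session B
-- BEGIN;                                  BEGIN;
-- UPDATE account SET ... WHERE id = 1;    UPDATE account SET ... WHERE id = 2;
-- UPDATE account SET ... WHERE id = 2;    UPDATE account SET ... WHERE id = 1;
--   (waits for Session B's lock on row 2)   (waits for Session A's lock on row 1: deadlock)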

Batch activities causing resource contention for online users

Batch processes typically perform bulk operations such as loads of large amounts of data or generating complex analytical reports. These operations are resource-intensive and can impact performance for online users. The best solution for this issue is to ensure that batch operations are run when online usage is low, such as at night, or to use separate databases for transactional processing and analytical reporting.

Capacity

Not all database performance issues are database issues. Some problems result from running the database on inadequate hardware.

Not enough CPUs or CPU speed too slow

More CPUs can share the server workload, resulting in improved performance. The performance the database experiences is affected not only by the database itself but also by other processes running on the same server, so it is important to review the overall load as well as database usage. As CPU utilization varies throughout the day, metrics should be examined for periods of low usage, average usage, and peak usage to best assess whether additional CPU resources will be beneficial.

Slow disk without enough IOPS

Disk performance can be stated in terms of input/output operations per second (IOPS). Combined with the I/O size, this gives a measure of the disk's throughput in megabytes per second. Throughput is also affected by the disk's latency, which is how long it takes a request to complete. These metrics depend on the technology of the disk storage. Traditional hard disk drives (HDDs) have a rotating disk and are typically slower than solid state drives (SSDs) or flash memory, which have no moving parts. Until recently, an SSD was more expensive than an HDD, but costs have come down, making it a competitive option.

Full or misconfigured disks

Databases obviously require significant disk access, so incorrectly configured disks have a considerable performance impact. Disks should be suitably partitioned, with system data such as catalogs and logs separated from user data. Highly active tables should be separated to avoid contention. Increase parallelism by placing databases and indexes on different disks. Don’t arrange the operating system and swap space on the same disk as the database.

Not enough memory

Limited or poorly allocated physical memory impacts database performance. The more memory that is available, typically the better the performance will be. Monitor paging and swapping. Set up several page spaces on multiple, non-busy disks. Make sure the paging space allocated is sufficient for database requirements; each database vendor can provide guidance on this matter.

Slow network

Network speeds can affect how quickly retrieved data is returned to the end user or calling process. Use broadband for connecting to remote databases. In some cases, choosing TCP/IP instead of named pipes for the connection protocol can significantly increase performance.

Configuration

Every database has a large number of configuration settings, and the default values may not be enough to give your database the performance it needs. Check all parameter settings, looking in particular for the following issues:

Buffer cache too small

The buffer cache improves performance by keeping frequently accessed data in memory, eliminating disk I/O. When the cache is too small, data is flushed from the cache more frequently, and if it is needed again it must be reread from disk. Besides the slowness of the disk read, this puts additional work on the I/O devices and can become a bottleneck. In addition to allocating enough space to the buffer cache, tuning SQL queries can help them use the buffer cache more efficiently.
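
As a MySQL/InnoDB-specific sketch (other databases expose equivalent settings under different names), you can check how often reads miss the buffer pool and, where supported, resize it:

-- A high ratio of Innodb_buffer_pool_reads (disk reads) to
-- Innodb_buffer_pool_read_requests (logical reads) suggests the cache is too small
SHOW GLOBAL STATUS LIKE 'Innodb_buffer_pool_read%';

-- Resize the buffer pool (dynamic in MySQL 5.7+; otherwise set innodb_buffer_pool_size in my.cnf)
SET GLOBAL innodb_buffer_pool_size = 8589934592;  -- 8 GB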

No query caching

Query caching stores both database queries and their result sets. When an identical query is executed, the data is quickly retrieved from memory rather than requiring the query to be executed again. Updates to data invalidate the results, so query caching is only effective on static data. In some cases, a query cache can become a bottleneck rather than a benefit to performance. Huge caches can cause contention when they are locked for updates.
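
On databases that offer a query cache, you can inspect whether it is enabled and how it is behaving. The statements below are MySQL-specific and apply only to versions before 8.0, which removed the query cache entirely:

SHOW VARIABLES LIKE 'query_cache%';   -- is the cache enabled, and how large is it?
SHOW GLOBAL STATUS LIKE 'Qcache%';    -- hits, inserts, low-memory prunes, free blocks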

I/O contention due to temporary table creation on disk

Databases need to create temporary tables when performing certain query operations, such as executing a GROUP BY clause. When possible, the temporary tables are created in memory. However, in some cases creating the temporary table in memory is not feasible, such as when the data contains BLOB or TEXT objects; in those cases, the temporary tables are created on disk. A large amount of disk I/O is required to create the temporary table, populate it with records, select the needed data from it, and drop the table when the query is complete. To avoid the potential performance impact, the temporary database should be separated from the main database space. Rewriting queries can also reduce the need for temporary tables, for example by using derived tables instead. A derived table, which selects directly from the result of another SELECT statement, allows data to be joined in memory rather than on disk.
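
A minimal sketch of the derived-table approach, using hypothetical customer and orders tables:

SELECT c.fullName, t.orderTotal
FROM customer c
JOIN (
    SELECT customerID, SUM(amount) AS orderTotal
    FROM orders
    GROUP BY customerID
) t ON t.customerID = c.id;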

NoSQL Databases

NoSQL has much appeal because of its ability to handle large amounts of data very rapidly. However, some disadvantages should be assessed when weighing whether NoSQL is right for your use case. It is worth remembering that NoSQL stands for "Not Only SQL." This clearer definition accepts the premise that NoSQL is not always the right solution, nor does it necessarily replace SQL across the board. Here are five reasons why:

Finicky Transactions

It is hard to keep entries consistent with NoSQL. When handling structured data, NoSQL does not always ensure that changes to different tables (or documents) are made at the same time; if a process crashes, the data can be left in an inconsistent state. A classic example of a transaction that must stay consistent is double-entry accounting: every debit must be balanced by a corresponding credit, and vice versa. If the data on both sides is not consistent, the entry cannot be made. NoSQL may not "balance the books" properly.

Complex Databases

Supporters of NoSQL tend to point to its efficient code, simplicity, and speed. All of these factors line up when database tasks are simple. However, when databases become more complicated, NoSQL begins to break down. SQL has the advantage when database demands are complicated because it has mature, industry-standard interfaces, whereas each NoSQL setup has its own unique interface.

Consistent JOINS

When executing a JOIN in SQL, there is a tremendous amount of overhead because the system must pull data from different tables and align them with keys. NoSQL can seem like a dream because there are no JOINs: everything is stored together in one place, in the same table or document, and when data is retrieved, all of the key-value pairs come back at the same time. The problem is that this can create several copies of the same data. Those copies all have to be updated, and NoSQL does not have built-in functionality to help in this situation.

Flexibility in Schema Design

NoSQL was unique when it emerged on the scene because it did not require a schema. In previous database models, programmers had to think about the columns needed to accommodate all of the potential data entries in each row. With NoSQL, entries can have a varying set of fields, or none at all. This flexibility allows programmers to ramp up applications quickly. However, it can be problematic when several groups are working on the same program, or when new teams of developers take over a project. After some developers have modified the database using the freedom of NoSQL, there may be a wide variety of key-value pair implementations.

Resource Intensive

NoSQL databases are commonly much more resource intensive than relational databases. They require much more CPU reserves and RAM allocation. For that reason, most shared hosting companies do not offer NoSQL. You must sign up for a VPS or run your own dedicated server. On the other hand, SQL is made to run on one server. This works out fine in the beginning, but as database demands increase, the hardware must expand as well. The problem is that a single server with huge capacity is much more expensive than a variety of smaller servers. The price increase is exponential. This provides one reason NoSQL has found a home in enterprise computing scenarios, such as those used by Google and Facebook.

Conclusion

This article presented a top-6 list of metrics that you might want to measure when assessing the health of your database. In summary, those top-6 items were:

  • Business Transactions
  • Query Performance
  • User and Query Conflicts
  • Capacity
  • Configuration
  • NoSQL Databases

In the next article, we are going to pull all of the topics in this series together to present the approach that AppDynamics took to implementing its APM strategy. This is not a marketing article, but rather an explanation of why certain decisions and optimizations were made and how they can provide you with a robust view of the health of your applications and database.

Is Your Database Dragging Down your PHP Application?

Here’s a quiz from our last PHP application performance post: How much of a PHP application’s execution time is used up accessing the database? 50%? 75%? You may recall the real answer, since it is so (some might say surprisingly) big: some 90% of PHP execution time happens in the database layer. So it makes sense to look to the database to reap some big performance gains. Let’s take a look at how some of the decisions you make about your database can affect performance.

Is NoSQL a No-Brainer?

Whether it’s MongoDB, Cassandra, or one of the other flavors, developers often turn to NoSQL databases as a solution to database performance. But NoSQL databases aren’t a magic bullet. As with so many application questions, it all depends on the use case. NoSQL databases perform extremely well in certain, but not all, use cases. It depends on whether the dataset for your specific application is better represented by a non-relational model. If it is, then find a NoSQL solution that matches your data and offers the tuning and scalability you need. If not, keep reading.

Your SQL, MySQL, Everybody’s SQL

We’re going to focus this discussion on the MySQL database, since it’s the one most frequently used with PHP. But almost everything we say about it will apply, at least in broad strokes, to other relational databases. We’re not going to dive into database configuration either, as that’s a whole other topic. What follows are some of the top things a PHP developer should be looking at to make sure the database is performing at the highest possible level.

Why Be Normal?

The web is full of discussions about database normalization, but let’s go with the short explanation that these are formalized ways to structure a database to minimize redundancy and therefore keep the database cleaner. But neat and clean do not always equate to fast. There are times when denormalizing — combining tables together, even if it means having duplicate information — can be more efficient for your application and improve performance. It might seem counterintuitive, but sometimes being a little messy is the quickest way to get what you want out of your database.

Let’s look at a customer database application as a simplified example. This application needs to track what company each customer works for. This would typically be stored as two tables with a foreign key constraint between them, such as:

CREATE TABLE `company` (
`id`    int(11) AUTO_INCREMENT NOT NULL,
`name`  varchar(512) NULL,
PRIMARY KEY(`id`)
);
CREATE TABLE `customer` (
`id`   int(11) AUTO_INCREMENT NOT NULL,
`fullName`  varchar(512) NOT NULL,
`email`   varchar(512) NULL,
`companyID` int(11) NULL,
PRIMARY KEY(`id`)
);
ALTER TABLE `customer`
ADD CONSTRAINT `fk_customer_company`
FOREIGN KEY(`companyID`)
REFERENCES `company`(`id`);

Searching for customers and including their company name in the results will now require either two queries — not efficient — or a single query with a join, such as:

SELECT cst.*, cpy.name as companyName FROM customer cst LEFT JOIN company cpy ON cst.companyID = cpy.id;

If the application will typically need to have a company name when looking up customers, then it’s obviously inefficient to make separate queries or use database joins each time. The better solution for performance is to combine these two tables into a single table, such as this:

CREATE TABLE customer (
id           int(11) AUTO_INCREMENT NOT NULL,
fullName     varchar(512) NOT NULL,
email        varchar(512) NULL,
companyName  varchar(512) NULL,
PRIMARY KEY(id)
);

This simpler query structure allows for much faster lookups. But it comes at a cost: company names are no longer standardized, and the same company could be entered differently for two customers. If exactness is important (in a billing application, for example, that needs to know which company to bill), this lack of precision could be catastrophic. But in many cases, "company" is just another piece of data about the person. If the spelling of the company name, or any other descriptive information, is not critical, combining tables is worth it from a performance perspective.

To Index Or Not To Index?

Indexes can speed lookups up, but too many indexes can slow things down. So a critical database performance question is whether or not to index a column.

The primary key of the table is already indexed; database indexes act as lookup tables for anything else. A general rule of thumb, or starting point, is to identify the columns referenced in the WHERE clauses of your queries and seriously consider indexing those columns. Again, this is a general rule, and it needs to be balanced against how many indexes are being created overall and whether that number will negatively impact write performance.

Again using the customer database example, if the application needs to search for a customer by email address, then index the email column of the table by executing the following SQL:

CREATE INDEX idx_customer_email ON customer(email);

This will significantly increase the speed of customer search by email address. Multiple columns can be included in the index as well — so if the application needs to look for customers by name within a particular company, an index like this would speed the search up:

CREATE INDEX idx_customer_dual ON customer(fullName, companyName);

Be Wary Of Your Queries

Queries themselves can be a source of slowdowns, for a variety of reasons. “Query tuning” is the process of finding slow queries and fixing them to make them faster.

What makes a query slower than it should be? Maybe an unnecessary subquery was used, or a “group by” clause is used that can be refactored out. Learning to use the EXPLAIN command can help one understand the performance of queries. The primary goal is to find queries that are slow and rework them, testing the performance at each iteration.
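
For example, against the customer table defined earlier, EXPLAIN shows whether a query will use an index or scan the whole table (the email address is just an example value, and the exact output columns vary between MySQL versions):

EXPLAIN SELECT id, fullName FROM customer WHERE email = 'jane@example.com';
-- With idx_customer_email in place, the plan shows type=ref and the index name;
-- without it, type=ALL indicates a full table scan.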

A common query pitfall involves SQL functions. Whenever possible, queries should avoid wrapping a column name inside of a function, within a “where” clause. Here’s an example of how not to look for all records created in the last month:

SELECT * FROM user WHERE NOW() < DATE_ADD(dateCreated, INTERVAL 1 MONTH);

Written this way, MySQL has to perform a full table scan, since it can't benefit from any index that exists on the dateCreated column. Because dateCreated is wrapped inside a function, the expression has to be evaluated for every row, so the index on that column cannot be used. Now look at this way of re-writing the query:

SELECT * FROM user WHERE dateCreated > DATE_SUB(NOW(), INTERVAL 1 MONTH);

Here, the column stands alone on one side of the comparison and all the functions are on the other side, so the comparison can now use an index. This way of writing the query delivers the same results but performs much better than the first version.

Tuning even further, in this specific case performance can be increased a hair more by removing the call to NOW(), which might otherwise be evaluated on every comparison unless the database optimizes it away. Instead, the current date can be supplied by PHP, producing a query that looks like this:

SELECT * FROM user WHERE dateCreated > DATE_SUB('2014-10-31', INTERVAL 1 MONTH);

These examples are a broad look at query tuning, but are a good starting point for looking for performance improvement opportunities.

Cash In On Caching

No discussion on database performance would be complete without looking at caching. Typically, the caching layer that PHP developers are concerned with is the ‘user cache.’ This is a programmable caching layer, usually provided by software such as Memcached, APC/APCu, or Redis.

The common thread among these caching mechanisms is simple, in-memory, key-value lookups, which are much faster than database queries. Therefore one of the most common techniques to speed up a database is to leverage cached lookups: once the data is retrieved, it's stored in the cache for some period, and future requests retrieve the stored copy, accepting that it may be slightly stale.

This can be done generically in code via a helper function, such as this one that assumes a SQL query that will only ever return one row:

function cachedSingleRowQuery($query, array $bind, $expiry, PDO $db, Memcached $cache) {
    // Build a unique cache key from the query text and its bound values
    $key = 'SingleQuery' . md5($query) . md5(implode(',', $bind));
    if (!($obj = $cache->get($key))) {
        // Cache miss: run the query, then store the row for $expiry seconds
        $stmt = $db->prepare($query);
        $stmt->execute($bind);
        $obj = $stmt->fetchObject();
        $cache->set($key, $obj, $expiry);
    }
    return $obj;
}

This function first checks whether a cached version of the data already exists, using a unique lookup key built from a hash of the query and its bound values, and returns it if so. Otherwise, it executes the query and caches the result for the specified amount of time, so it will be there the next time it's requested.

Another technique is to use a write-through cache, where the final copy of the data is stored in a database, but once the data is queried it is cached forever. To avoid the problem of stale data, the software proactively updates the cache any time the database layer is updated, or just deletes it, so it will be re-created the next time it’s requested.

All that said, here’s one caching caveat to pay attention to: If you’re using any framework or application layer for the application, don’t reinvent the wheel. Almost all modern frameworks insert a caching layer over the top of the database access layer or the object model automatically, in the way that the framework expects. So you’ll already be getting the benefit of efficient caching.

Add It All Up For The Highest Performance

Even from this brief discussion, you can see that there’s not one thing that makes or breaks your database performance. It’s a number of things that all have to be done well. But as we said at the outset, given that 90% of application execution time is in the database, it’s well worth the time and effort to make sure everything about your database is working in the direction of top performance.

To gain better visibility and ensure optimal PHP application performance, try AppDynamics for FREE today!

Storage is Killing Your Database Performance

The other day I had the opportunity to speak with a good friend of mine who also happens to be a DBA at a global Financial Services company. We were discussing database performance and I was surprised when he told me that the most common cause of database performance issues (from his experience) was a direct result of contention on shared storage arrays.

After recovering from my initial surprise I had an opportunity to really think things through and realized that this makes a lot of sense. Storage requirements in most companies are growing at an ever increasing pace (big data anyone?). Storage teams have to rack, stack, allocate, and configure new storage quickly to meet demand and don’t have the time to do a detailed analysis on the anticipated workload of every application that will connect to and use the storage. And therein lies the problem.

Workloads can be really unpredictable and can change considerably over time within a given application. Databases that once played nicely together on the same spindles can become the worst of enemies and sink the performance of multiple applications at the same time. So what can you do about it? How can you know for sure if your storage array is the cause of your application/database performance issues? Well, if you use NetApp storage then you’re in luck!

AppDynamics for Databases remotely connects (i.e. no agent required) to your NetApp controllers and collects the performance and configuration information that you need to identify the root cause of performance issues. Before we take a look at the features, let’s look at how it gets set up.

The Config

Step 1: Prepare the remote user ID and privileges on the NetApp controller. The following commands are used for the configuration.

useradmin role add AppD_Role -a api-*,login-http-admin
useradmin group add AppD_Group -r AppD_Role
useradmin user add appd -g AppD_Group

Note: Make sure you set a password for the appd user.

Step 2: Configure AppDynamics to monitor the NetApp controller. Notice that we configure AppDynamics with the username and password created in step 1.

[Screenshot: AppDynamics collector configuration for the NetApp controller]

Step 3: Enjoy your awesome new monitoring (yep, it's that easy).

The Result

After an incredibly difficult 2 minutes of configuration work we are ready for the payoff. In the AppDynamics for Databases main menu you will see a section for all of your NetApp agents.

[Screenshot: NetApp agents listed in the AppDynamics for Databases main menu]

Let’s do a “drill-up” from the NetApp controller to our impacted database. Clicking into our monitored instance we see the following activity screen.

[Screenshot: NetApp controller activity screen for the monitored instance]

By clicking on the purple latency line inside the red box in the image above, we can drill into the volume that has the highest response time. Notice in the screen grab below that there is a link at the bottom of the page that lets us drill up into the database attached to this storage volume. This relationship is built automatically by AppDynamics for Databases.

[Screenshot: NetApp volume activity with a drill-up link to the associated database]

Clicking on the "Launch In Context" link, we are immediately transferred to the Oracle instance activity page shown below.

[Screenshot: Oracle instance activity page]

In just the same manner as we can drill up from storage to database, we can also drill down from database to storage. Notice the screen grab below from an Oracle instance activity screen. Clicking on the "View NetApp Volume Activity" link launches the NetApp activity screen shown earlier for the volumes associated with this Oracle instance. It's that easy to switch between the views you need to solve your application's performance issues.

Imagine being able to detect an end user problem, drill down through the code execution, identify the slow SQL query, and isolate the storage volume that is causing the poor performance. That’s exactly what you can do with AppDynamics.

[Screenshot: end-to-end drill-down from application code to the slow SQL query and the NetApp storage volume]

Storage monitoring in AppDynamics for Databases is another powerful feature that enables application support, database support, and storage support to get on the same page and restore service as quickly as possible. If you have databases connected to NetApp storage you need to take a free trial of AppDynamics for Databases today.

Here’s a short video demonstration of AppDynamics for NetApp…