5 .NET Performance Issues Where Dev and Ops Play the Blame Game


I spend a large portion of my time working with customers to review performance issues in production environments, and some colleagues asked me to share a few of the more interesting findings.

Before launching into a list of offenders, I think it’s worth explaining how often I hear the phrase “We need to prove it’s not our tin but their code” and how this relates to the discoveries here. Typically, when a developer sits down to write some code, they do it in isolation: they apply theory and best practice, and throw in a bit of experience, to come up with the best code they can (hopefully) for the task.

What happens, too often, is that when the code reaches a production environment it’s executed in a way not predicted by the developer, architect or designer. Quite often the dynamics of a live environment vary drastically from those of the test and pre-production environments in terms of data and patterns of use. Just recently a client told me how their eCommerce site goes from a normal state of “browsing products” to a “buy-only” state during Black Friday or another major event. Their application behaved very differently under each circumstance.

So, all that aside, let’s take a look at my top 5 from this year…

1. Object Relational Mapping (ORM)

ORM is a broad term, but I apply it to anything that abstracts a developer from writing raw SQL – this includes LINQ, Entity Framework and NHibernate.

In a nutshell, something like Entity Framework allows a developer to create a neat design of their object and use syntax within .NET to query and manipulate the objects, which then results in queries and inserts into an underlying datastore. Figure 1 shows the type of view and code a developer would use.

Figure 1 – Entity Framework and the underlying code

Very few developers get a chance to review what goes on under the hood. I can normally tell when Entity Framework is being used because it generates a lot of square brackets!

In fact, the best Entity Framework-generated query I’ve encountered covered 21 pages when pasted into Word! To quote my customer at the time, “No sane human being would write that”. Most of the SQL was there to cater for NULL cases.

In addition to creating lengthy SQL calls, ORMs can also create a lot of SQL calls, especially in production environments where data volumes vary drastically from those in development and developers didn’t foresee how their code would behave. A client sent me a screenshot of their application spending close to 25 seconds executing about 2,000 queries against their database. This was all code developed with Entity Framework.

BUT – before going on I need to point out – I like Entity Framework. It speeds up development time and improves collaboration between teams – just be careful with it and ensure you monitor what it does in the backend.
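The “lots of small queries” pattern described above is often called N+1: one query for a list, then one more query per item in it. The sketch below is not Entity Framework; it uses Python’s built-in sqlite3 module with a made-up customers/orders schema (both table names and the helper functions are assumptions for illustration) purely to show how the query count grows with the data, and how a single joined query returns the same answer.

```python
import sqlite3


def build_db():
    """Create a hypothetical in-memory schema with 100 customers and 100 orders."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
        """
    )
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [(i, f"customer-{i}") for i in range(1, 101)],
    )
    conn.executemany(
        "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
        [(i, 10.0 * i) for i in range(1, 101)],
    )
    conn.commit()
    return conn


def totals_n_plus_one(conn):
    """One query for the customers, then one more query per customer."""
    queries = 1
    customers = conn.execute("SELECT id, name FROM customers").fetchall()
    totals = {}
    for customer_id, name in customers:
        row = conn.execute(
            "SELECT COALESCE(SUM(total), 0) FROM orders WHERE customer_id = ?",
            (customer_id,),
        ).fetchone()
        queries += 1
        totals[name] = row[0]
    return totals, queries


def totals_single_query(conn):
    """The same result from one joined, grouped query."""
    rows = conn.execute(
        """
        SELECT c.name, COALESCE(SUM(o.total), 0)
        FROM customers c
        LEFT JOIN orders o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        """
    ).fetchall()
    return dict(rows), 1
```

With 100 customers the first function issues 101 queries and the second issues one. In a test database with a handful of rows the difference is invisible, which is how this tends to slip through to production.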

2. Data Volumes in Queries

Another client recently sent me the following screenshot to ask what a SNIReadSyncOverAsync call was. I’ve seen this a few times; essentially, this is the .NET code that runs after a query has executed in order to read back the data you’ve just queried.

I put together the following as a neat example of how this works (no EF here, I’m afraid).

The query we’re executing retrieves all 1.6 million postcodes in the UK! The underlying code shows that the query itself is fast; the DBA’s done a great job. Retrieving and reading the data is the time-consuming part, and that’s what SNIReadSyncOverAsync is. You’ll see this a lot in scenarios where you’re retrieving a lot of data or reading from a cursor over a slow network connection.
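To make the “fast query, slow read” split concrete, here is a minimal sketch that times the two phases separately. It is an assumption-laden stand-in: it uses Python’s sqlite3 with an in-memory table of made-up postcodes, so there is no network involved and the absolute numbers say nothing about SQL Server. The helper names are hypothetical; the point is only that executing a statement and reading its rows are separate costs, and the second one scales with the amount of data returned.

```python
import sqlite3
import time


def build_postcode_db(rows):
    """Create an in-memory table holding `rows` synthetic postcodes."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE postcodes (code TEXT)")
    conn.executemany(
        "INSERT INTO postcodes VALUES (?)",
        ((f"PC{i:07d}",) for i in range(rows)),
    )
    conn.commit()
    return conn


def time_query_and_read(conn, sql):
    """Return (seconds to execute, seconds to read all rows, row count)."""
    start = time.perf_counter()
    cursor = conn.execute(sql)          # the "query" phase
    executed = time.perf_counter()
    count = 0
    for _ in cursor:                    # the "read the results" phase
        count += 1
    finished = time.perf_counter()
    return executed - start, finished - executed, count
```

Calling `time_query_and_read(conn, "SELECT code FROM postcodes")` and then the same statement with `LIMIT 100` appended shows the read phase shrinking with the row count. Asking for less data (paging, filtering, fewer columns) is usually the first thing to try when the read phase dominates.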

3. Web Service and WCF calls

In a similar manner to our ORMs making multiple database calls, I’ve seen production code behave the same way with Web Service and WCF calls. In one case, a front-end UI component spent 8 seconds making close to 300 backend web service calls simply to render a grid control (just think of all that redundant SOAP information passed to and fro).

The same happens in WCF when you don’t consider how many calls your code may need to make in a production environment.
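The grid example comes down to one round trip per row versus one round trip per page. The sketch below uses a hypothetical in-process `FakePriceService` that only counts calls; it is not a real SOAP or WCF client, and the method names are assumptions. It shows the shape of the fix: give the service a batch operation and have the UI call it once.

```python
class FakePriceService:
    """Hypothetical stand-in for a remote service; counts round trips."""

    def __init__(self, prices):
        self._prices = prices
        self.round_trips = 0

    def get_price(self, product_id):
        """One remote call per product."""
        self.round_trips += 1
        return self._prices[product_id]

    def get_prices(self, product_ids):
        """One remote call for a whole batch of products."""
        self.round_trips += 1
        return {pid: self._prices[pid] for pid in product_ids}


def render_grid_chatty(service, product_ids):
    """Builds the grid rows with one service call per row."""
    return [(pid, service.get_price(pid)) for pid in product_ids]


def render_grid_batched(service, product_ids):
    """Builds the same rows from a single batched call."""
    prices = service.get_prices(product_ids)
    return [(pid, prices[pid]) for pid in product_ids]
```

For a 300-row grid the chatty version makes 300 round trips and the batched version makes one, each carrying its own envelope and network latency in a real deployment.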

Another trap to be aware of in WCF is the throttling settings. Early versions of WCF had very low default throttling settings, which meant that high-volume sites could suddenly hit the buffers. Take a quick look at what your WCF applications are using, such as:

<serviceThrottling maxConcurrentCalls="500"
                   maxConcurrentInstances="100"
                   maxConcurrentSessions="200" />
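To see why a low throttle makes a busy site “hit the buffers”, the following sketch models a concurrent-call limit with a semaphore. It is an illustration of the general idea only, not how WCF implements throttling; `run_with_throttle` and its parameters are made up for the example. Calls beyond the limit simply queue until a slot frees up.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_with_throttle(total_calls, max_concurrent_calls, work_seconds=0.01):
    """Run `total_calls` simulated requests; return (peak concurrency, completed)."""
    throttle = threading.BoundedSemaphore(max_concurrent_calls)
    state_lock = threading.Lock()
    state = {"active": 0, "peak": 0, "completed": 0}

    def handle_call(_):
        with throttle:  # callers over the limit block here
            with state_lock:
                state["active"] += 1
                state["peak"] = max(state["peak"], state["active"])
            time.sleep(work_seconds)  # simulated service work
            with state_lock:
                state["active"] -= 1
                state["completed"] += 1

    with ThreadPoolExecutor(max_workers=total_calls) as pool:
        list(pool.map(handle_call, range(total_calls)))
    return state["peak"], state["completed"]
```

With 20 callers and a limit of 4, no more than 4 requests are ever in flight, so the rest spend their time queued rather than working. That queueing is what shows up as unexplained latency when the limit is lower than production traffic needs.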

Older implementations of the framework also hit issues with the maximum number of threads available, which shows up as a lot of “WaitOneNative” while your code is stuck in a “holding pattern”.

4. File and Disk IO

If you’re doing a lot of file manipulation, consider using some of the Win32 API calls instead of managed code. A few years back I was tasked with writing a quick utility to work recursively through files scattered amongst ~100,000 directories and update some text in them. When we ran the tests with my initial code, it took close to a day to execute! By swapping over to Win32 APIs, I shortened the time to around 30 minutes!
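The utility itself isn’t shown here, and I won’t guess at which Win32 calls it used. As a rough analogue of the same task, here is a Python sketch that walks a tree with `os.scandir`, which returns each entry’s type along with the directory listing and so usually avoids a separate stat call per file. The function name and the `.txt` suffix filter are assumptions for illustration.

```python
import os


def update_text_in_tree(root, old, new, suffix=".txt"):
    """Replace `old` with `new` in matching files under `root`; return files changed."""
    changed = 0
    pending = [root]
    while pending:
        current = pending.pop()
        with os.scandir(current) as entries:
            for entry in entries:
                if entry.is_dir(follow_symlinks=False):
                    pending.append(entry.path)  # visit subdirectory later
                elif entry.is_file(follow_symlinks=False) and entry.name.endswith(suffix):
                    with open(entry.path, "r", encoding="utf-8") as handle:
                        text = handle.read()
                    if old in text:  # only rewrite files that need it
                        with open(entry.path, "w", encoding="utf-8") as handle:
                            handle.write(text.replace(old, new))
                        changed += 1
    return changed
```

Across ~100,000 directories, the per-entry overhead of enumerating and inspecting files is multiplied many times over, so it is worth measuring where the time goes before and after a change like this.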

5. Task.WaitAll and The Task Parallel Library

I’ll cover this in another post, but I’m starting to see a huge increase in the use of the Task Parallel Library (TPL) in applications such as insurance aggregators: sites that compile information from multiple insurers to quote insurance premiums. The same applies to a number of travel sites that retrieve flight or hotel details from multiple providers.

What happens from a DevOps perspective is that the production code you’re managing gets much harder to understand and maintain. Even visualising what’s going on under the hood becomes far more challenging than it is for single-threaded applications. In short, you want to reduce the amount of time your application spends waiting on tasks to complete, so keep an eye on how long it spends there.
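The aggregator pattern, fanning out to several providers and waiting for all of them, can be sketched as follows. This is Python’s `concurrent.futures` rather than the TPL, and `fetch_quotes` and the provider callables are hypothetical. What it illustrates is that a wait-for-all call takes as long as the slowest provider unless you bound it with a timeout, and that measuring the time spent in the wait is cheap to do.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def fetch_quotes(providers, timeout_seconds):
    """Call every provider in parallel and wait up to `timeout_seconds` for them.

    `providers` maps a provider name to a zero-argument callable returning a quote.
    Returns (quotes that arrived, names that timed out, seconds spent waiting).
    """
    started = time.perf_counter()
    pool = ThreadPoolExecutor(max_workers=max(1, len(providers)))
    futures = {pool.submit(call): name for name, call in providers.items()}

    # Blocks until every future finishes or the timeout expires.
    done, not_done = wait(list(futures), timeout=timeout_seconds)
    waited = time.perf_counter() - started

    pool.shutdown(wait=False)  # do not block on stragglers
    quotes = {futures[f]: f.result() for f in done if f.exception() is None}
    timed_out = sorted(futures[f] for f in not_done)
    return quotes, timed_out, waited
```

Logging the `waited` value per request, along with which providers timed out, gives you the “time spent waiting on tasks” figure directly instead of having to infer it from overall response times.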

The Enter and ReliableEnter methods of the Monitor object are similar examples of code hanging around before it can continue. In these cases, code is “locking and blocking”, i.e. one thread holds the lock on an object and other threads are locked out until it is released. See https://msdn.microsoft.com/en-us/library/hf5de04k(v=vs.110).aspx for more information on this one.
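“Locking and blocking” is easy to measure once you wrap the lock. The sketch below uses Python’s `threading.Lock` as a stand-in for .NET’s Monitor; `TimedLock` and `run_contended` are hypothetical names. It accumulates how long callers spend blocked while acquiring the lock, which is the time a profiler would attribute to the enter call.

```python
import threading
import time


class TimedLock:
    """A lock that records how long callers were blocked acquiring it."""

    def __init__(self):
        self._lock = threading.Lock()
        self.total_wait_seconds = 0.0
        self.acquisitions = 0

    def __enter__(self):
        started = time.perf_counter()
        self._lock.acquire()  # blocks while another thread holds the lock
        waited = time.perf_counter() - started
        # Safe to update here: we now hold the lock ourselves.
        self.total_wait_seconds += waited
        self.acquisitions += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self._lock.release()
        return False


def run_contended(workers, iterations, hold_seconds=0.001):
    """Have several threads fight over one lock; return (count, acquisitions, wait)."""
    lock = TimedLock()
    counter = {"value": 0}

    def work():
        for _ in range(iterations):
            with lock:
                counter["value"] += 1
                time.sleep(hold_seconds)  # simulated work done under the lock

    threads = [threading.Thread(target=work) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter["value"], lock.acquisitions, lock.total_wait_seconds
```

The longer the work done while holding the lock, the more the total wait grows with the number of threads. Shrinking the locked section is usually a better fix than adding threads.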

