A UNIX Bigot Learns About .NET and Azure Performance – Part 1

This blog post is the beginning of my journey to learn more about .NET and Microsoft Azure as they apply to performance monitoring. I’ve long admitted to being a UNIX bigot, but recently I’ve seen a lot of good things going on at Microsoft. As a performance monitoring geek, I feel compelled to understand these technologies at a deep enough level to provide good guidance when asked by peers and acquaintances.

The Importance of .NET

Here are some of the reasons why .NET is so important:

  • In a 2013 Computerworld article, C# was listed as the #6 most important programming language to learn, with ASP.NET ranking at #14.
  • In a 2010 Forrester article .NET was cited as the top development platform used by respondents.
  • .NET is also a very widely used platform in financial services. An article published by WhiteHat Security stated that “.NET, Java and ASP are the most widely used programming languages at 28.1%, 25% and 16% respectively” among financial services companies.
    The Rise of Azure

    .NET alone is pretty interesting from a statistical perspective, but the rise of Azure in the cloud computing PaaS and IaaS world is a compounding factor. In a “State of the Cloud” survey conducted by RightScale, Azure was found to be the third most popular public cloud computing platform for enterprises. In a report published by Capgemini, 73% of respondents globally stated that Azure was part of their cloud computing strategy, with strong support across the retail, financial services, energy/utilities, public, and telecommunications/media verticals.

    Developer influence

    Not to be underestimated in this .NET/Azure world is the influence that developers will have on overall adoption levels of each technology platform. Microsoft has created an integration between Visual Studio (the IDE used to develop on the .NET platform) and Azure that makes it extremely easy to deploy .NET applications onto the Azure cloud. Ease of deployment is one of the key factors in the success of new enterprise technologies and Microsoft has definitely created a great opportunity for itself by ensuring that .NET apps can be easily deployed to Azure through the interface developers are already familiar with.

    The fun part is yet to come

    Before I started my research for this blog series I didn’t realize how far Microsoft had come with their .NET and Azure technologies. To me, if you work in IT operations you absolutely must understand these important technologies and embrace the fact that Microsoft has really entrenched itself in the enterprise. I’m looking forward to learning more about the performance considerations of .NET and Azure and sharing that information with you in my follow-up posts. Keep an eye out for my next post as I dive into the relevant IIS and WMI/Perfmon performance counters and details.

    How to Run AppDynamics in Microsoft Azure

    “Is it possible to run the AppDynamics controller within my own Microsoft Azure IaaS?”

    I hear this question fairly regularly and would like to walk you through how to host the controller in your own Azure cloud. First off, the pros of hosting the AppDynamics controller in Azure:

  • Full control and ownership of the data collected by AppDynamics.
  • Additional security around access to the data (for example, locking it down to a corporate VPN only).
  • Easy integration between AppDynamics and your internal services, such as Active Directory for authentication or an internal bug tracking system for alert notifications. These would typically require opening custom ports when you leverage the AppDynamics SaaS environment.

    AppDynamics works by placing an agent on your servers which reports to the controller. It’s common to have several agents monitoring your applications. To further ease of use, we monitor Java, .NET, PHP, Node.js, and now C++, all in one single pane of glass. Your Azure architecture might look something like this:

    A unique feature of AppDynamics is flexible deployment. Typically, legacy APM solutions rely on on-premise deployment, whereas newer companies are SaaS-only. With AppDynamics you can run the controller on-premise, leverage the AppDynamics SaaS option, or deploy a hybrid mixture of the two.

    To run the controller in your Azure IaaS, you can leverage the security of the on-premise deployment option and install the controller the same way you would in your own datacenter. This allows you to keep full control over your data and be the gatekeeper for access to that data.

    Important to note:

  • Properly size the controller — you can estimate the CPU/memory/disk requirements based on the number of agents you are going to deploy. This is covered in the AppDynamics online documentation.
  • Configure the VM for maximum I/O.

    The second point is very important because the controller installs a database which requires high I/O throughput. The recommended best practice is to treat the VM the same as you would a VM running SQL Server: http://msdn.microsoft.com/en-us/library/azure/dn133149.aspx

    If you forget to do this, you run the risk that the performance of the controller will degrade. This will not slow down your monitored applications, as the agents are implemented to be non-blocking. However, the slowness will cause the controller UI to lag and make it hard to visualize the collected data.

    Hope this helps and you can choose the option which works the best for your organization! Try it out now, for FREE!

    How Do you Monitor Your Logging?

    Applications typically log additional data such as exceptions to different data sources. Windows event logs, local files, and SQL databases are most commonly used in production. New applications can take advantage of leveraging big data instead of individual files or SQL.

    One of the most surprising things we find when we start monitoring applications is that logging is often not configured properly in production environments. There are two types of misconfiguration errors we’ve seen often in the field:

    1. The logging configuration was copied from staging settings.

    2. While deploying the application to production, logging wasn’t fully configured and failed to log any data.

    To take a closer look, I have a couple of sample applications to show how these problems can manifest themselves. The sample applications were implemented using ASP.NET MVC 5, run in Windows Azure, and use the Microsoft Enterprise Library Exception Handling and Logging blocks to log exceptions to a SQL database. I have no specific preference regarding logging framework or storage; I just wanted to demonstrate problems similar to what we’ve seen with different customers.

    Situation #1 Logging configuration was copied from staging to production and points to the staging SQL database

    When we installed AppDynamics and it automatically detected the application flowmap, I noticed the application talks to the production UserData database and… a staging database for logging.

    The other issue was the extremely slow response time when calling the logging database. The following snapshot explains the slow performance; as you can see, there’s an exception occurring while trying to run an ADO.NET query:

    The exception details confirm the application was not able to connect to the database, which is expected — the production environment is located in a DMZ and usually can’t reach the staging network.

    To restate what we see above — this is a failure while trying to log the original exception, which could be anything from a user not being able to log into the website to a failed checkout.

    At the same time, the impact is even higher because the application spends 15 seconds trying to connect to the logging database and time out, all while the user is waiting.

    Situation #2 During deployment the service account wasn’t granted permissions to write to the logging database

    This looks similar to the example above, but when we drill inside the error we can see the request had an internal exception during processing:

    The exception says the service account didn’t have permission to run the stored procedure “WriteLog”, which logs entries to the logging database. From a performance perspective, the overhead of a security failure is lower than that of the timeouts in the example above, but the result is the same — we won’t be able to see the originating exception.

    Such problems are usually caused by not fully documenting or automating the application deployment/configuration process.

    These are one-time issues: once fixed, they stay fixed on that machine. However, the next time you deploy the application to a new server or VM, they will happen again until you fix the deployment process.

    Let’s check the EntLibLogging database — it has no rows.

    Here’s some analysis to explain why this happened:

    1. We found exceptions thrown while the application was logging an error

    2. This means there was an original error and the application was trying to report it via logging

    3. Logging failed, which means the original error was never reported!

    4. And… logging doesn’t record its own failures anywhere, which means from a logging perspective the application has no problems!!

    This is logically correct — if you can’t log data to the storage database, you can’t log anything. Typically, loggers are implemented similar to the following example:
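    For illustration, here is a minimal sketch of that pattern, with hypothetical names rather than the actual Enterprise Library implementation:

```csharp
using System;

// Hypothetical logger sketch: it writes the exception to the logging
// database and deliberately swallows any failure of its own, so a broken
// logging store makes the original error disappear without a trace.
public static class ExceptionLogger
{
    public static void Write(Exception original)
    {
        try
        {
            WriteToDatabase(original); // e.g. executes the "WriteLog" stored procedure
        }
        catch (Exception)
        {
            // Logging is the last resort: if it fails, nothing else happens.
            // The original exception is silently lost.
        }
    }

    static void WriteToDatabase(Exception ex)
    {
        // Placeholder for the ADO.NET call to the logging database;
        // simulates the failure modes described above.
        throw new InvalidOperationException("Cannot connect to logging database");
    }
}
```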

    Logging is the last option in this case, and when it fails nothing else happens, as you can see in the code above.

    Just to clarify, AppDynamics was able to report these exceptions because the agent instruments common methods like ADO.NET calls, HTTP calls, and other exit calls as well as error handlers, which helped in identifying the problem.

    Going back to our examples: what if the deployment and configuration process is now fixed and fully automated so there can’t be a manual mistake? Do you still need to worry? Unfortunately, these issues happen more often than you’d expect. Here is another real example.

    Situation #3 What happens when the logging database fills up?

    Everything is configured correctly, but at some point the logging database fills up. In the screenshot above you can see this happened around 10:15pm. As a result, the response time and error rates spiked.

    Here is one of the snapshots collected at that time:

    You can see that in this situation it took over 32 seconds trying to log data. Here are the exception details:

    The worst part is that at 10:15pm the application was not able to report its own problems because the database was completely full, which may incorrectly be interpreted as the application being healthy: it is “not failing” because there are no new log entries.

    We’ve seen many times that the logging database isn’t treated as a critical piece of the application, so it gets pushed down the priority list and is often overlooked. Logging is part of your application logic and should fall into the same category as the application itself. It’s essential to document, test, properly deploy, and monitor your logging.

    This problem can usually be avoided entirely, unless your application receives an unexpected surge of traffic due to a sales event, new release, marketing campaign, etc. Other than the rare Slashdotting effect, your database should never reach full capacity and stop logging. Without sufficient room in your database, your application’s performance is in jeopardy and you won’t know it, since your monitoring framework isn’t notifying you. Because these issues are still possible, albeit usually during a large load surge, it’s important to continuously monitor your logging; you wouldn’t want an issue to occur during an important event.

    Key points:

    • Logging adds a new dependency to the application

    • Logging can fail to log the data – there could be several reasons why

    • When this happens, you won’t be notified about the original problem or the logging failure, and the performance issues will compound

    This would never happen to your application, would it?

    If you’d like to try AppDynamics check out our free trial and start monitoring your apps today! Also, be sure to check out my previous post, The Real Cost of Logging.

    The Real Cost of Logging

    In order to manage today’s highly dynamic application environments, many organizations turn to their logging system for answers – but reliance on these systems may be having an undesired impact on the applications themselves.

    The vast majority of organizations use some sort of logging system — it could log errors, traces, informational messages, or debugging information. Applications can write logs to a file, database, Windows event log, or big data store, and there are many logging frameworks and practices in use.

    Logging brings good insight into application behavior, especially failures. However, by being part of the application, logging also participates in the execution chain, which can have its disadvantages. While working with customers, we often see the negative consequences when logging alone introduces an adverse impact to the application.

    Most of the time the overhead of logging is negligible. It only matters when the application is under significant load — but these are the times when it matters the most. Think about Walmart or Best Buy during Black Friday and Cyber Monday. Online sales are particularly crucial for these retail companies during this period and this is the time when their applications are under most stress.

    To better explain the logging overhead I created a lightweight .NET application that:

    1. is implemented using ASP.NET
    2. performs lightweight processing
    3. has an exception built in
    4. always handles the exception within a try…catch statement
    5. either logs the exception using log4net or ignores it, depending on the test

    In my example I used log4net, as I recently diagnosed a similar problem with a customer who was using it; however, log4net could be replaced with any other framework you use.
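    The catch block at the heart of the tests looked roughly like this sketch (the class and method names are placeholders, not the exact sample code, and it assumes the log4net package is referenced):

```csharp
using System;
using log4net;

// Sketch of the sample application's handler: the built-in exception is
// always handled, and depending on the test it is either logged via
// log4net (Test #2) or ignored (Test #1, the baseline).
public class ProcessingService
{
    static readonly ILog Log = LogManager.GetLogger(typeof(ProcessingService));

    public static bool LoggingEnabled = true; // Test #1: false, Test #2: true

    public void Process()
    {
        try
        {
            DoLightweightWork(); // always throws the built-in exception
        }
        catch (Exception ex)
        {
            if (LoggingEnabled)
                Log.Error("Processing failed", ex); // synchronous append to the log file
            // the exception is handled either way and the request completes
        }
    }

    void DoLightweightWork()
    {
        throw new InvalidOperationException("Built-in test exception");
    }
}
```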

    Test #1

    First, we set up a baseline by running the application with exception logging disabled in the catch statement.

    Test #2

    Next, I enabled exception logging to a local file and ran the same load test.

    As you can see not only is the average response time significantly higher now, but also the throughput of the application is lower.

    The AppDynamics snapshot is collected automatically when there is a performance problem or failure and includes full call graph with timings for each executed method.

    By investigating the call graph AppDynamics produces, we see that the log4net FileAppender renders error information to a file using a FileStream. On the right you can see the duration of each call; the most time was spent in WriteFileNative, as it was competing with similar requests trying to append error details to the log file.

    Test #3

    I often come across attempts to make exception logging asynchronous by using the ThreadPool. Below is how the performance looks in this setup under exactly the same load.

    This is a clever concept and works adequately for low-throughput applications, but as you can see the average response time is still in a similar range to the non-asynchronous version, and the throughput achieved is slightly lower.

    Why is this? With logging running on separate threads, the resources are still consumed: there are fewer threads available and the number of context switches is greater.
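    The asynchronous variant can be sketched as follows (a hypothetical wrapper, again assuming log4net is referenced; not the exact test code):

```csharp
using System;
using System.Threading;
using log4net;

// Sketch of the asynchronous variant: the call returns immediately,
// but the queued work item still consumes a thread-pool thread and
// contends for the same log file, so the resources are spent regardless.
public static class AsyncLogger
{
    static readonly ILog Log = LogManager.GetLogger(typeof(AsyncLogger));

    public static void Error(string message, Exception ex)
    {
        ThreadPool.QueueUserWorkItem(_ => Log.Error(message, ex));
    }
}
```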

    In my experience, logging to a file is exceptionally common. Other types of storage could offer better performance, but they always need further testing, and logging to a file is the easier solution.

    Summary

    While logging is important and helps with application support and troubleshooting, logging should be treated as part of the application logic. This means logging has to be designed, implemented, tested, monitored and managed. In short, it should become a part of full application lifecycle management.

    How well do you understand your logging framework and its performance cost?

    Take five minutes to get complete visibility into the performance of your production applications with AppDynamics today.

    Diving Into What’s New in Java & .NET Monitoring

    In the AppDynamics Spring 2014 release we added quite a few features to our Java and .NET APM solutions. With the addition of service endpoints, an improved JMX console, JVM crash detection and crash reports, additional support for many popular frameworks, and async support, we have the best APM solution in the marketplace for Java and .NET applications.

    Added support for frameworks:

    • TypeSafe Play/Akka

    • Google Web Toolkit

    • JAX-RS 2.0

    • Apache Synapse

    • Apple WebObjects

    Service Endpoints

    With the addition of service endpoints, customers with large SOA environments can define specific service points at which to track metrics and get associated business transaction information. Service endpoints help service owners monitor and troubleshoot their own specific services within a larger set of services:

    JMX Console

    The JMX console has been greatly improved, adding the ability to manage complex attributes, execute MBean methods, and update MBean attributes:

    JVM Crash Detector

    The JVM crash detector has been improved to provide crash reports with dump files that allow tracing the root cause of JVM crashes:

    Async Support

    We added improved support for asynchronous calls, along with a waterfall timeline for better clarity into where time is spent during requests:

    AppDynamics for .NET applications has been greatly improved with better integration and support for Windows Azure, ASP.NET MVC 5, improved Windows Communication Foundation support, and RabbitMQ support.

    Take five minutes to get complete visibility into the performance of your production applications with AppDynamics today.

    Instrumenting .NET applications with AppDynamics using NuGet

    Introduction

    One of the coolest things to come out of the .NET stable at AppD this week was the NuGet package for Azure Cloud Services. NuGet makes it a breeze to deploy our .NET agent along with your web and worker roles from inside Visual Studio. For those unfamiliar with NuGet, more information can be found here.

    Our NuGet package ensures that the .NET agent is deployed at the same time the role is published to the cloud. After adding it to the project, you’ll never have to worry about deploying the agent when you swap your hosting environment from staging to production in Azure, or when Azure changes the machine under your instance. For the remainder of the post I’ll use a web role to demonstrate how to quickly install our NuGet package, the changes it makes to your solution, and how to edit the configuration by hand if needed. Even though I’ll use a web role, things work exactly the same way for a worker role.

    Installation

    So, without further ado, let’s take a look at how to quickly instrument .NET code in Azure using AppDynamics’ NuGet package for Windows Azure Cloud Services. NuGet packages can be added via the command line or the GUI. To use the command line, bring up the Package Manager Console in Visual Studio as shown below.


    In the console, type ‘install-package AppDynamics.WindowsAzure.CloudServices’ to install the package. This will bring up the following UI, where you can enter the information the agent needs to talk to the controller and upload metrics. You should have received this information in your welcome email from AppDynamics.


    The ‘Application Name’ is the name of the application in the controller under which the metrics reported by this agent will be stored. When ‘Test Connection’ is checked, we verify the information entered by trying to connect to the controller; an error message is displayed if the test connection is unsuccessful. That’s it: enter the information, click Apply, and we’re done. Easy peasy. No more adding files one by one or modifying scripts by hand. Once deployed, instances of this web role will start reporting metrics as soon as they receive any traffic. Oh, and by the way, if you prefer a GUI to typing commands in the console, the same thing can be done by right-clicking on the solution in Visual Studio and choosing ‘Manage NuGet Packages’.

    Anatomy of the package

    If you look closely at the solution explorer, you’ll notice that a new folder called ‘AppDynamics’ has been created. On expanding the folder, you’ll find the following two files:

    • The installer for the latest and greatest .NET agent
    • Startup.cmd

    The startup script makes sure that the agent gets installed as part of the deployment process on Azure. Besides adding these files, we also change the ServiceDefinition.csdef file to add a startup task, as shown below.


    If you need to change the controller information you entered in the GUI while installing the package, you can edit the startup section of the csdef file shown above. The application name, controller URL, port, account key, etc. can all be changed. On re-deploying the role to Azure, the new values take effect.
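    The startup section in ServiceDefinition.csdef follows the standard Azure Cloud Services schema and looks roughly like the sketch below; the environment variable names here are illustrative, so check the file the package generates rather than copying these values:

```xml
<ServiceDefinition name="MyCloudService"
    xmlns="http://schemas.microsoft.com/ServiceHosting/2008/10/ServiceDefinition">
  <WebRole name="MyWebRole">
    <Startup>
      <!-- Installs the AppDynamics .NET agent before the role starts.
           Variable names below are illustrative; the package writes its own. -->
      <Task commandLine="AppDynamics\Startup.cmd" executionContext="elevated" taskType="simple">
        <Environment>
          <Variable name="ControllerUrl" value="mycontroller.example.com" />
          <Variable name="ControllerPort" value="8090" />
          <Variable name="ApplicationName" value="MyAzureApp" />
        </Environment>
      </Task>
    </Startup>
  </WebRole>
</ServiceDefinition>
```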

    Next Steps

    Microsoft Developer Evangelist Bruno Terkaly blogged about monitoring the performance of multi-tiered Windows Azure-based web applications. Find out more on the Microsoft Developer Network.

    Find out more in our step-by-step guide on instrumenting .NET applications using AppDynamics Pro. Take five minutes to get complete visibility into the performance of your production applications with AppDynamics Pro today.

    As always, please feel free to comment if you think I have missed something or if you have a request for content in an upcoming post.

    AppDynamics Pro on the Windows Azure Store

    Over a year ago, AppDynamics announced a partnership with Microsoft and launched AppDynamics Lite on the Windows Azure Store. With AppDynamics Lite, Windows Azure users were able to easily monitor their applications at the code level, allowing them to identify and diagnose performance bottlenecks in real time. Today we’re happy to announce that AppDynamics Pro is now available as an add-on in the Windows Azure Store, which makes it easier for developers to get complete visibility into their mission-critical applications running on Windows Azure. Here’s what’s new:

    • Easier/simplified buying experience in Windows Azure Store
    • Tiered pricing based on number of agents and VM size
    • Easy deployment from Visual Studio with NuGet
    • Out-of-the-box support for more Windows Azure services

    “AppDynamics is one of only a handful of application monitoring solutions that works on Windows Azure, and the only one that provides the level of visibility required in our distributed and complex application environments,” said James Graham, project manager at MacMillan Publishers. “The AppDynamics monitoring solution provides insight into how our .NET applications perform at a code level, which is invaluable in the creation of a dynamic, fulfilling user experience for our students.”

    Easy buying experience

    Purchasing the AppDynamics Pro add-on in the Windows Azure Store takes only a couple of minutes. In the Azure portal, click NEW at the bottom left of the screen and then select STORE. Search for AppDynamics, then choose your plan, add-on name, and region.


    Tiered pricing

    AppDynamics Pro for Windows Azure features new tiered pricing based on the size of your VM (extra small, small or medium, large, or extra large) and the number of agents required (1, 5 or 10). This new pricing allows organizations with smaller applications to pay less to store their monitoring data than those with larger, more heavily trafficked apps. The cost is added to your monthly Windows Azure bill, and you can cancel or change your plan at any time.

    AppDynamics on Windows Azure Pricing

    Deploying with NuGet

    Use the AppDynamics NuGet package to deploy AppDynamics Pro with your solution from Visual Studio. For detailed instructions check out the how-to guide.


    Monitoring with AppDynamics

    • Monitor the health of your Windows Azure applications
    • Troubleshoot performance problems in real time
    • Rapidly diagnose the root cause of performance problems
    • Dynamically scale your Windows Azure application up and down based on performance metrics


    Additional platform support

    AppDynamics Pro automatically detects and monitors most Azure services out-of-the-box, including web and worker roles, SQL, Azure Blob, Azure Queue and Windows Azure Service Bus. In addition, AppDynamics Pro now supports MVC 4. Find out more in our getting started guide for Windows Azure.

    Get started monitoring your Windows Azure app by adding the AppDynamics Pro add-on in the Windows Azure Store.

    Covis Software GmbH isolates problematic Web Service in 4 minutes with AppDynamics Pro

    I received another X-Ray from Covis Software GmbH in Germany, a provider of CRM solutions. They’ve been managing their .NET application performance with AppDynamics Pro in production for several months now. In the X-Ray below (as documented by the customer), Covis was able to rapidly identify a poorly performing remote web service call which was impacting their application, business transactions, and the end user experience of their customers.

    If you would like to get started with AppDynamics you can download AppDynamics Lite (our free version) or you can take a free 30-day trial of AppDynamics Pro.

    Users share their AppDynamics Lite for .NET Experiences

    A week has passed since we officially launched our free version of AppDynamics Lite for .NET to the public, and so far it’s been garnering attention and positive reviews from the APM community. For example, someone from a major Canadian telecommunications company downloaded .NET Lite and told us how quickly they were able to gain value from using it.

    We’re equally impressed and delighted that they’re heavily pushing our Lite product internally and are on the path to becoming an AppDynamics Pro customer. Even though we hear praise from customers often, it’s always satisfying to hear about our products delivering real, quantifiable results in under 24 hours.

    We also received this blog review from another .NET Lite user just the other day. They deployed AppDynamics Lite to monitor the overall performance of their site, EveryMarket.ru, an open e-commerce market that connects resellers with a community of buyers for products that are discounted when purchased in bulk. They are also able to monitor outbound web service calls to VK.com, Russia’s version of Facebook, to ensure requests to the social site are responsive.

    EveryMarket.ru Describes the Entire .NET Lite Experience

    AppDynamics Lite looks nice and catchy. Right after installation, we immediately started to obtain information about our application and its performance metrics. The following screenshot nicely illustrates that there are some hidden errors occurring for three of the business transactions in the list. We had actually seen these errors before, but it was quite hard to estimate their frequency and severity. Now it can be easily understood from this view and from the Business Transactions view, though the latter is limited to only 20 transactions.

    The first thing I did after the installation was decrease the snapshot frequency from 100 to 50. It was more out of curiosity than any particular purpose. It’s nice to see the historical information about errors and requests presented on one screen. We then changed the default sorting to “By timestamp” in order to see errors by recency.

    In this case, the error in question didn’t involve a SQL query. The Bad Request details were quite simple and repeated what we could find in a set of stack trace logs. However, the main advantage for us is that AppDynamics can catch these errors and notify our team immediately by email. Once inside the tool, the information is well organized for faster root cause diagnosis.

    In general, I found it very useful that AppDynamics operates on business transaction logic. The semantics of a transaction are easy to understand, and one can get a detailed overview of a precise transaction. And of course the “red bar” matters: in the case of many errors, it is straightforward that “more red” means “look first”.

    From these two examples you can see that our .NET Lite users are able to install and run our product with ease while gaining insight into their application’s health and the performance bottlenecks they can improve for faster runtimes. If you have a success story using AppDynamics Lite, please let us know!

    AppDynamics Customers “Jam” at First EMEA User Conference in Mallorca

    If you’d asked me 12 months ago whether I’d be sitting in a luxury hotel on the Spanish island of Mallorca, hosting AppDynamics’ first-ever EMEA customer conference with 50+ attendees, I would have kindly asked you to put your crack pipe down. Yet over the past few days that’s exactly what we did. Due to the phenomenal growth of AppDynamics in 2011, we felt it was only right to host our first EMEA customer conference, “App Jam.” Here’s a write-up of the event, along with some media to help you understand the experience.

    Day One:

    Welcome to App Jam:

    App Jam kicked off in style with an intro video. Our CEO Jyoti Bansal then took to the stage, welcomed the audience, and explained the reasoning behind App Jam. “Our customers are passionate – they need a community that harnesses that vision,” said Jyoti. “We want to bring the best minds together, and help you achieve the best possible deployments and benefits from AppDynamics.” The key principles of App Jam were for customers to Share, Learn, Guide and Jam with each other over two days.

    Jyoti then talked about his experiences as Chief Architect at Wily Technology, and why first generation APM was no longer serving the needs of customers in today’s modern distributed applications. Jyoti then talked about the core design principles and cultural values behind AppDynamics as a company, specifically outlining the importance of solving real problems, and why it’s critical for AppDynamics to make our customers successful. “When you buy software from AppDynamics, you’re getting a partner, not just software.”

    Jyoti then finished with a call to action for attendees, asking them to be very open with feedback: “Tell us what you need, help us solve your real problems – we want to become the undisputed leader in APM and can’t do this without you.”

    Using AppDynamics in a highly distributed Java System Z environment:
    This session was led by Stephan Ehlen, Systems Architect at Provinzial, and Mirko Novakovic, CEO of Codecentric AG. Running Java on System Z might sound like a complex, expensive, and unnatural proposition to many. For Provinzial, Application Performance Management (APM) played a critical role in ensuring their applications and business were able to perform on System Z. Stephan talked about the limitations and challenges he faced around ensuring application performance and scalability.

    Mirko then gave great insight into the common problems and resource constraints of using Java on System Z, specifically around remote invocations, session management, and data access. “Using SOAP might sound cool, but it’s actually a very expensive activity when you’re forced to marshal and unmarshal data between JVMs.” He then talked about the need to use binary protocols so that serialization overhead and latency are kept to a minimum.

    Stephan then gave a brief history of their experiences using toolsets from IBM, HP and dynaTrace before they eventually settled on AppDynamics in 2011. Stephan then highlighted the fact that “AppDynamics was able to auto-discover 95% of our application out-of-the-box, meaning I can spend more time managing application performance than telling my APM solution what to monitor.”

    “With System Z the truth is always in Production,” added Mirko, explaining that developers can only test components of an application in isolation using tools like profilers. Overall, a great, deep insight into the world of managing Java application performance on System Z.

    AppDynamics 2012 Product Roadmap
    Next up was Bhaskar Sunkara, our VP of Product Experience. This was probably one of the most exciting product roadmap sessions I’ve ever witnessed – the level of interaction and excitement in the room from customers was simply staggering. Seeing customers engaged, firing questions at Bhaskar left, right and center, was a sure sign that our roadmap was compelling. Bhaskar’s uncanny ability to pre-empt questions and give customers confident answers that AppDynamics would deliver was pretty inspiring. For obvious reasons I can’t reveal our product roadmap here in this blog; needless to say, this session went down extremely well with attendees.

    Managing Agile .NET Applications in Production
    After a brief break, Kristof Berger, Chief Software Architect at Covis, took to the stage. Kristof began by outlining his team’s agile methodology around 4-week sprint cycles, and why delivering releases on time with quality was so important – “When we commit features to the business, it’s all about hitting deadlines,” said Kristof.

    Kristof then talked about monitoring prior to AppDynamics, and how his dev team had built custom performance collectors into the application. Operations used traditional monitoring solutions like SCOM to view this performance data. “Monitoring server health is a misleading indicator,” Kristof said. “Legacy tools are not suitable for distributed production environments.” Kristof pointed out that troubleshooting customer issues in production was a long, drawn-out, manual process whereby development would have to request log files and performance data, which often took several hours.

    AppDynamics was able to give both Dev and Ops complete visibility of their application, business transactions, and code execution in production. “We reduced our MTTR from an average of 16 hours to under 30 minutes, with most issues now taking 10 to 15 minutes to resolve,” Kristof said. “AppDynamics fundamentally changed the culture between Dev and Ops. Our teams no longer blame each other; they sit down together and collaborate using the same information to find root cause with AppDynamics.”

    Kristof then walked the audience through several examples of how AppDynamics had helped his team resolve slowdowns within their business rules engine – identifying lock contention around SQL queries, as well as reducing the memory footprint of specific business transactions that were persisting too much data.

    Customer Panel – Battle of the Architects 

    The last session of day one was a customer panel between Johan Sandberg from SEB (Performance Architect), Andrew Mulholland from Expedia (Operations Architect) and Kristof from Covis (Software Architect). Having different roles on the panel made the discussions and perspectives very interesting. For example, Kristof made the point that “Operations need to make more of an effort to better understand how applications are built,” rather than just being responsible for their deployment. Johan also said “Non-functional requirements like performance should not be optional throughout the application lifecycle,” explaining how production is often the first place where application performance is tested.

    Time to Jam
    With the day sessions over, attendees were left to “Jam” with cocktails on the hotel roof, which in our case happened to overlook the bay of Mallorca. This was a great opportunity for everyone to network, share experiences, and learn from the range of APM customers and deployments present.

    Overall, a great first day which ended with cocktails and dinner overlooking the coast of Mallorca (as shown by the photos below :))

    Day Two:

    AppDynamics Tips and Tricks
    Bhaskar kicked off day two with a tips and tricks session on how best to use AppDynamics, covering some of the new features and functionality of the recent AppDynamics 3.4 release, specifically around policy management, end user monitoring, information points and regression analysis.

    EAN’s Journey to the Cloud:
    Andrew Mulholland, Operations Architect, led the first customer session of day two by talking about Expedia’s journey to the Cloud. Andrew used a car racing analogy throughout his presentation to describe the goals, requirements, challenges, processes and people associated with migrating to the public cloud. EAN’s application serves 10,000 partners around the world, so for Andrew it’s all about delivering performance and a great end-user experience across multiple geographies. EAN’s motto last year was “2 seconds is too long”–this year it’s become “Deliver 1000 hotels from anywhere in under 1 second.”

    Andrew talked about the importance of moving data closer to the customer, with cloud being a natural fit for that. Building data centers around the world was ruled out because “They’re a commodity, and being competitive is about the application and its performance.” With Cloud, EAN gets capacity on-demand around the world without having to sign long-term contracts with managed service providers. To do this, EAN would need several Cloud providers to cover partners and customers in the US, EMEA and Asia-Pac. The OpenStack alliance and the need for standards across Cloud providers are of great interest and importance to EAN.

    “You can’t drive with slick tires on a wet race track, and in the cloud it’s the same thing. Preparing for failure is critical,” said Andrew, highlighting the constant need to embrace failure and learn from it. EAN aren’t planning to migrate their entire application to the cloud at once: “We want to take small steps and offer a beta for some partners first.” Andrew also mentioned the hidden hazards of cloud, especially around tax, data privacy and security, along with the need to standardize on technology and automate processes.

    “Race drivers rely on their dashboard gauges and dials throughout a race,” said Andrew, “It’s exactly the same for us running a website.”

    Monitoring therefore becomes key to success in the Cloud, especially in EAN’s case where performance was a key driver. “AppDynamics’ ability to auto-discover and monitor application nodes on the fly makes it a perfect fit for Cloud applications,” said Andrew. “We don’t have time to manually tell our monitoring solution what to monitor; we have to be agile, and automation is key to that.” Andrew finished by saying that his migration would be a continuous process, with feedback from both partners and their monitoring solutions being king.

    Customer Success
    Hatim Shafique, VP of Customer Success at AppDynamics, then delivered a 30-minute session on why AppDynamics is laser-focused on making customer deployments successful. “I see it like each of you going to the doctor’s every year for a check-up to make sure you’re OK and healthy. We want your deployments to be the same so you’re successful.” Hatim then covered his team’s structure, along with processes such as customer feedback surveys, just one of the KPIs AppDynamics uses to measure customer satisfaction. Overall, a great session which was well received by the audience.

    Enabling APM for Dev, Ops and the Business:
    The final session came from Benoit Villaumie (Lead Architect) and Guillaumie Postaure (Infra Manager) at Karavel, France’s number-one travel portal. Benoit began by talking about their transition from a monolithic application architecture to an SOA architecture. “We hit a brick wall in terms of scalability with the original architecture, as it became too expensive to maintain and scale,” said Benoit. Moving to an SOA architecture in 2009 eased the maintainability and scalability side of their applications–but as Guillaumie pointed out, it became incredibly complex to manage as a result: “We have 80 applications with 2 releases a week. This kind of constantly changing environment is difficult to manage.”

    Karavel had a history of architecture issues, from slow SQL queries and 3rd party web services to open source framework bugs and resource leaks. “Firefighting was long and painful for us,” said Guillaumie. “We’d manually try to piece together log files and heap dumps from multiple servers.”

    Benoit then talked about their AppDynamics experience: “We downloaded the free version and deployed it in production–after 2 hours we requested a Pro trial. We deployed Pro across our 200 Tomcat production instances with no consultants or services.” The net impact was a 10x improvement in response times, with some business transactions dropping from 5 seconds to under 500 milliseconds. Mean time to detect (MTTD) dropped from hours to minutes, and mean time to resolve (MTTR) dropped from days to hours. One user (Adrien B) called AppDynamics “a magic wand to spot issues quickly.” Benoit added, “When issues occur it’s no longer a ping pong match between Dev and Ops.”

    Benoit then walked the audience through five different roles at Karavel using AppDynamics. For each role Benoit outlined the job title and key use cases, accompanied by a series of screenshots visualizing which AppDynamics views/data were relevant. “AppDynamics is like a Rosetta Stone for Dev, Ops and the Business,” Guillaume said. “Today we have over 50 people at Karavel using AppDynamics.” This session was a perfect example of what’s possible when an organization embraces APM; it was also another example of how development and operations culture can change from the blame game to team collaboration.

    AppDynamics Exec Panel

    The final session of App Jam allowed attendees to ask our AppDynamics Exec team any question. Questions ranged from product features to integration points, along with one particularly interesting question: “What aren’t you planning to do?” Jyoti smiled and replied “I can categorically say, we won’t be providing support for Cobol,” which was greeted with laughter from the audience.

    With that, App Jam came to an end after two days of sessions and Jamming. We received lots of positive feedback from attendees – and we’re expecting App Jam 2013 to be bigger and better! A big thank you to all our customers, speakers and partners who attended. Finally, I’d like to apologize for not attending App Jam in my usual attire; several attendees expressed their dissatisfaction at not being able to meet the real App Man. Don’t worry, folks – there’ll be plenty of App Man time next year, I promise!

    App Man.

    How was App Jam for you? We’d love to hear your comments and feedback below.