Proactively manage your user experience with EUM

As companies embark on digital transformation to better serve their customers, managing the performance of, and satisfaction with, each user interaction becomes ever more critical to the success of the business. When we look at today’s breakout companies, like Uber, Airbnb, and Slack, it’s evident that software is at the core of their success in each industry. Consumers have gone from visiting a bank branch to make a transfer to using their desktop or mobile device to complete it instantaneously. If the web application is not responding, it can be the equivalent of walking into a long line at the branch and walking out, leaving a negative impression of the brand. In a digital business, making each customer interaction with your digital storefront successful should be at the core of the business objectives.

So, how do we ensure that every digital interaction is successful and performs quickly? The answer lies in using multiple approaches to manage the end-user experience. In our tool arsenal, we have real-user monitoring tools and synthetic monitoring tools that, when combined with APM capabilities, can help us quickly identify poorly performing transactions and triage the root cause to minimize end-user impact. Each tool covers a core area of the web application, and together they give visibility into the whole experience. With real-user monitoring tools, understanding the performance and sequence of every step in the customer path is critical to identifying areas of opportunity in the funnel and increasing conversions. Real-user monitoring provides a measurement breadth that is impossible to achieve with synthetic tools alone and puts back-end performance into a user context. On the synthetic side, a repeatable and reproducible set of measurements focused on visual timings can be of great value in baselining the user experience between releases, proactively capturing errors before users are impacted, benchmarking against competitors, and holding 3rd party content providers accountable for the performance of their content. Synthetic measurements also allow script assertions to validate that the expected content on a page is delivered in a timely way, and to alert accurately when there are deviations from the baseline or errors occur.
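To illustrate the idea of a script assertion combined with a baseline check, here is a minimal Python sketch. It is not how any particular synthetic monitoring product works; `SyntheticResult`, `evaluate`, the three-sigma tolerance, and the sample timings are all assumptions made up for this example.

```python
from dataclasses import dataclass
from statistics import mean, stdev


@dataclass
class SyntheticResult:
    """One synthetic page measurement (fields are illustrative only)."""
    visually_complete_ms: float
    page_text: str


def evaluate(result, expected_text, baseline_ms, tolerance_sigmas=3.0):
    """Return a list of alert strings; an empty list means the run looks healthy."""
    alerts = []
    # Script assertion: the expected content must be present on the page.
    if expected_text not in result.page_text:
        alerts.append(f"content assertion failed: {expected_text!r} not found")
    # Baseline check: flag runs far slower than the historical timings.
    threshold = mean(baseline_ms) + tolerance_sigmas * stdev(baseline_ms)
    if result.visually_complete_ms > threshold:
        alerts.append(
            f"slow run: {result.visually_complete_ms:.0f} ms exceeds "
            f"baseline threshold {threshold:.0f} ms"
        )
    return alerts


# Hypothetical timings from earlier runs of the same script.
baseline = [1200.0, 1250.0, 1180.0, 1230.0, 1210.0]
healthy = SyntheticResult(1240.0, "Welcome back. Transfer funds")
degraded = SyntheticResult(2600.0, "Service temporarily unavailable")

print(evaluate(healthy, "Transfer funds", baseline))
print(evaluate(degraded, "Transfer funds", baseline))
```

Because the synthetic run always uses the same script and environment, a simple statistical threshold like this is meaningful in a way it would not be for noisy real-user data.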

In a recent survey sponsored by AppDynamics, over 50% of people who manage web and mobile web applications identified 3rd party content as a factor in delivering a good end-user experience. Our modern web and mobile sites most often contain some kind of 3rd party resource, from analytics tracking to social media integration, all the way to authentication functionality. Understanding how each of these components affects the end-user experience is critical to maintaining a healthy and well-performing site. Using a real-user monitoring tool like the AppDynamics Browser EUM solution, you can visualize slow content that may be affecting a page load and identify the provider. The challenge now is how to verify that this provider is living up to its performance claims, and how to hold it accountable.

Third party benchmarking is a capability that a synthetic monitoring solution best provides. With a synthetic transaction, you can control many variables that are impossible to control in a real-user measurement. A synthetic measurement always uses the same browser and browser version, connectivity profile, and hardware configuration, and is free from spyware, viruses, and adware. Using this clean-room environment, you can see the consistent performance of a page along with every resource it loads, and track each element downloaded from multiple synthetic locations worldwide. Then, when your monitoring system picks up an unusually high number of slow transactions, you can drill down directly and isolate the cause to either a core site slowdown or a 3rd party slowdown, and compare performance across synthetic locations to determine whether it’s a user- or geography-specific issue or something happening across the board.
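As a rough illustration of isolating a core site slowdown from a 3rd party slowdown, the Python sketch below groups per-resource download times by host. The URLs, the timings, and the helper names `time_by_party` and `slowest_host` are hypothetical; a real synthetic tool would supply its own resource timing data.

```python
from collections import defaultdict
from urllib.parse import urlparse


def time_by_party(resources, first_party_host):
    """Sum download time (ms) per host, split into first- and third-party."""
    totals = {"first_party": defaultdict(float), "third_party": defaultdict(float)}
    for url, duration_ms in resources:
        host = urlparse(url).hostname or ""
        # Treat the site's own domain and its subdomains as first-party.
        is_first = host == first_party_host or host.endswith("." + first_party_host)
        totals["first_party" if is_first else "third_party"][host] += duration_ms
    return totals


def slowest_host(totals):
    """Return (party, host, total_ms) for the host with the largest total time."""
    candidates = [
        (party, host, ms)
        for party, hosts in totals.items()
        for host, ms in hosts.items()
    ]
    return max(candidates, key=lambda item: item[2])


# Hypothetical resource timings from one synthetic page load.
resources = [
    ("https://www.example.com/index.html", 180.0),
    ("https://static.example.com/app.js", 220.0),
    ("https://cdn.analytics.test/track.js", 1450.0),
    ("https://social.test/widget.js", 310.0),
]

totals = time_by_party(resources, "example.com")
print(slowest_host(totals))
```

Running the same grouping on measurements from each synthetic location would show whether the slow host is slow everywhere or only in one geography.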

In managing the user experience, having all pertinent data in real time on a consolidated system can be the difference between a 5-minute performance degradation and a 5-hour site outage while multiple sources of discordant information are compiled and rationalized. The intersection of data from real-user monitoring and synthetic monitoring can bring context to performance events by correlating user session information, like engagement and conversion, with changes in the performance of 3rd party content or end-user error rates. A 360-degree view of the customer experience will help ensure a positive experience for your customers.

Interested in learning more? Make sure to watch our Winter ’16 Release webinar

 

Why It’s So Difficult for Devs to Create Apps for the Apple Watch

Well, that didn’t take long. According to an article on Reuters, some people are reporting their first impressions of the Apple Watch and, unsurprisingly, they are already complaining about the performance of its apps.

As an engineer who started my career working on real-time embedded systems for industrial automation, I can tell you what a Herculean task it can be to get a tiny piece of hardware to run very sophisticated software on a comparatively low-spec CPU, all while executing quickly enough to provide a responsive end-user experience. The tradeoffs among cost, CPU capability and speed, power drain, and software functionality are nearly impossible to navigate while satisfying all of your goals. There’s the old product manager’s joke that you’ve got three options: cheap, fast, and good, but you can only have two.

For developers creating applications for the Apple Watch or creating extensions for the Apple Watch for existing applications these complaints show two things:

  1. You have to completely re-think the nature of an Apple Watch app UI/UX
  2. You have to be laser focused on every aspect of your app that contributes to the performance of your app on the Apple Watch

Re-imagining UI/UX of Apps for Wearables

One of the complaints in the article illustrates this point perfectly: “The maps app, surely the answer to wandering pedestrians’ dream, is so slow it makes me want to pull out my paper Rand McNally,” says The Wall Street Journal’s Geoffrey Fowler.

This seems to be a perfect example of not carefully examining the use case for an application and making sure that the UI/UX of the app has been properly redesigned to match the experience of the user. If I’m wandering around on foot, as Fowler supposes, then it makes no sense to have a complete maps UI on my watch, which would surely require a ton of data to be transferred from the phone to the watch and cause it to run very slowly.

Rather, the prime use case for pedestrian use of a maps app would be navigation, where the watch app prompts me only with navigation cues such as “turn left on Main St. in 25 meters”. In that case, the UI could be as simple as a left-pointing arrow whose length is proportional to the distance remaining and a bit of text with the name of the next cross street.

As a developer or application product manager, your mantra should be Fit for Purpose. Don’t try and port your entire app UI to the watch. Only put that portion of the UI on the watch that is explicitly required for a particular action. Let the phone do what it does well, and let the watch do only the micro interaction required for a part of the workflow specific to using the watch.

Let’s look at another complaint, from Nilay Patel of theverge.com, who is quoted as saying, “There’s virtually nothing I can’t do faster or better with access to a laptop or a phone except perhaps check the time.” Though the statement may be sarcastic, it shows how easy it is to underestimate the wearable UX.

To show just how silly Patel’s comment is, let’s look at the fitness use case. This is an example of where I think the Apple Watch has the potential to make health and fitness tracking appeal to the general public rather than just the niche market of fitness enthusiasts or quantified-self geeks.

Let’s consider the act of jogging, cycling, or even just walking. Now a lot of people already bring their phones with them when they exercise, and many of them even use specialized fitness apps like Strava, RunKeeper, MapMyFitness, MyFitnessPal etc. to track their activity, share it with friends via social media or even just listen to music while they are doing their activity.

But the key thing is that once the activity is started, the phone is usually tucked away in a pocket, a cycling jersey, a fanny pack, or even an armband, and anyone who has tried knows that it’s a pain in the neck to take your phone out to interact with the app while you are doing the activity.

Now, for a small niche of enthusiasts like myself and my triathlete friends, we will also wear specialized fitness trackers from Garmin or Timex on our wrists or attached to the handlebars of our bikes so we can see our data like pace, cadence, distance, elapsed exercise time, and power output.

But the vast majority of the general public is not going to shell out another $300+ for a single-purpose fitness tracking wearable. With the Apple Watch, on the other hand, one can easily imagine a significant number of people who would be willing to spend that money for a multipurpose device that connects to the apps on the phone and lets them interact with them naturally and easily while they are doing their various activities.

There’s no way, for example, that I’m going to pull out my very expensive and fragile iPhone (or even risk mounting it on my handlebars) while I’m bombing down Panoramic Highway on Mt. Tam at 40 mph on my road bike in order to check my pace, but it would be quite simple for me to take a quick glance at my Apple Watch to see the data.

Laser Focus on App and WatchKit Extension Performance

OK, for now, let’s assume that you’ve already carefully thought through your use cases, and now you are building your app and/or the WatchKit extension for your app.

According to the reviews published on Wednesday, the apps need upgrades to load more quickly. Again, the article says that Patel claims loading an app required the watch to pull tremendous amounts of data from iPhones.

No matter how carefully you’ve thought through the scenarios, you still need to know exactly what your app is doing and all of the factors that are contributing to the performance of the app during development, test, and production.

You need to understand how much data you are transferring and how long it’s taking.

Now you could try and instrument the app yourself with a bunch of timers and variables on your method calls, but that would be a waste of your time as a developer when you should be focused on implementing the features your users want that make sense for the watch.
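To make the cost of that do-it-yourself approach concrete, here is a minimal sketch of hand-rolled instrumentation. It is written in Python purely for illustration (a real watch app would be written in Swift or Objective-C), and `timed`, `fetch_route_summary`, and the payload are hypothetical names invented for this example, not part of any SDK.

```python
import time
from collections import defaultdict
from contextlib import contextmanager

timings = defaultdict(list)          # operation name -> durations in seconds
bytes_transferred = defaultdict(int)  # operation name -> total payload bytes


@contextmanager
def timed(name):
    """Record how long the wrapped block takes under the given name."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[name].append(time.perf_counter() - start)


def fetch_route_summary():
    """Hypothetical stand-in for a phone-to-watch data transfer."""
    payload = b"turn left on Main St. in 25 meters"
    time.sleep(0.01)  # simulate transfer latency
    return payload


# Every call you care about needs its own wrapper and byte counter.
with timed("fetch_route_summary"):
    data = fetch_route_summary()
bytes_transferred["fetch_route_summary"] += len(data)

print(timings["fetch_route_summary"], bytes_transferred["fetch_route_summary"])
```

Multiply this bookkeeping by every method and network call in the app, plus the work of shipping the numbers somewhere useful, and the maintenance burden becomes clear.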

Fortunately, this is where AppDynamics has you covered. With AppDynamics Mobile Real User Monitoring, you can get detailed data about how every aspect of your app and the WatchKit extension is performing and track the interactions of the extension with the Apple Watch.

Moreover, the AppDynamics Mobile RUM solution provides you with all of this information through a single portal where you can see how your users are distributed geographically, all the network requests and errors that are occurring, how long they are taking, how much data is being transferred, and you can dig down into any line of code and stack trace to see what’s happening in your app and how that is contributing to the end user experience.

Now get cracking on those apps and extensions for the Apple Watch, and sign up for a free trial of AppDynamics Mobile Real User Monitoring to make sure your apps are delivering the performance your customers deserve and expect.

Improving the quality of mobile applications with Mobile APM

Mobile applications now account for 15 percent of all Internet traffic, with 1.5 billion users worldwide. Done right, a simple iPhone application can reach 50 million users in days. When you have a successful mobile application it is difficult to ensure all of your users have a great experience. While good performance gets a high rating and strong sales in the App Store, poor performance will impact the application’s rating—and can cost a business tens of thousands of dollars for every second of delayed response or app failure.

The reality is that oftentimes the organization responsible for infrastructure and monitoring has limited influence with the mobile application owners. Most mobile application owners have visibility into the server-side applications, but limited visibility into the mobile experience. In a recent report from Gartner, Jonah Kowall calls out the growth in the mobile space:

By 2018, 25% of those organizations with current, five-dimensional APM investments will extend them to mobile devices and applications, up from under 1% today.

Monitoring native mobile applications is increasingly challenging due to the sheer number of platforms and devices and the variability of connectivity. In order to truly understand your users, you need to leverage end user experience monitoring (EUEM) to gain visibility into how your actual users perceive your application. You need to utilize Mobile APM to gain visibility into application crashes and hangs, understand the interactions between mobile applications and the server side, and understand your users through analytics. AppDynamics makes it easy to get complete visibility across your web and mobile applications.

Find out more about the key challenges in improving the quality of mobile applications in a Gartner analyst report written by the leading APM analyst Jonah Kowall.

Take five minutes to get complete visibility into the performance of your mobile applications with AppDynamics Mobile RUM today.

For an introduction to AppDynamics Mobile Real-User Monitoring, watch our On-Demand Webinar now.

Announcing AppDynamics Application Intelligence Platform

AppDynamics is excited to unveil details behind our new Application Intelligence Platform which is the technology foundation for the AppDynamics portfolio of performance monitoring, automation and analytics solutions (transaction analytics was also announced today).

Today’s software-defined businesses need the ability to see everything in their application environment, act with certainty, and know faster. See, Act, and Know are the three core pillars of Application Intelligence.

Making sense of all the data, transactions, events and user interactions in these environments is increasingly difficult as the volume, variety and velocity of data grows. Traditional approaches to solving this problem are typically siloed in their view, provide limited business correlation, and present only static data. In order to keep up with today’s complexity, what’s needed is a platform that can collect, understand, and act on the data and translate that into actionable information about your business.

The AppDynamics Application Intelligence Platform is uniquely able to deliver rich performance data, learning, and analytics, combined with the flexibility to adapt to virtually any infrastructure or software environment. The platform was designed for the modern day application, including SOA architectures, frequent agile releases, cloud deployments, big data technologies and mobile delivery – and maintains a unique low-overhead architecture. This gives you the ability to instantly see the impact of production changes as you work, and adapt to optimize performance and minimize bottlenecks while ensuring the revenues are maximized.

Today’s consumer-driven world moves faster than ever. Everything from your customer experience to revenue depends on your business-critical applications performing at their highest level. AppDynamics delivers real-time access to every aspect of your performance, so you can anticipate problems, resolve them automatically, and make smarter, more certain business decisions. This includes a detailed understanding of the user experience, business data such as real-time revenue metrics, and deep application performance and infrastructure data. All of it is collected within the context of business and operational transactions.

Platform Design Principles:

The Application Intelligence Platform is the underlying architecture that we use to deliver all of the capabilities a modern business needs to keep their business running at peak performance and use application intelligence as a competitive advantage.

The core design principles behind the AppDynamics Application Intelligence Platform include:

Scalability:

The interactions customers have with the business are highly dynamic, environments are larger and more hybrid and organizations are demanding IT transform the business faster than ever. AppDynamics monitors many of the largest, most complex enterprise application environments in the world, supporting environments up to 10,000 nodes on a single management server. This is critical for customers who do not want to manage and maintain a farm of servers.

Lightweight:

Our platform was designed to be a production tool from the very beginning, giving it a unique low-overhead architecture. Intelligent data collection ensures the lowest production overhead in the market. AppDynamics correlates end user business transaction details with completion status (success, error, exception), response times, and all other data points measured at any given time. It automatically analyzes the entire data set to provide information from which to draw conclusions and take the appropriate action.

Self-learning:

Auto-instrumentation and minimal configuration enable simple and intuitive operation for any size of business. AppDynamics has a unique ability to instrument your application, learning its transactions, code execution behavior, and normal performance patterns. It can adapt and adjust instrumentation automatically when you change your application, with no manual configuration needed.

Open and Extensible:

The AppDynamics platform is open, extensible, and interoperable to fit any business need. Through the AppDynamics Exchange our community comes together to share knowledge and contribute back over 100 extensions that provide deep integrations to the tools you already use like Splunk, Apica, PagerDuty, and Amazon Web Services. From monitoring Amazon Web Services costs to MEAS mainframes our extensions allow you to leverage the tools you already have in place. We believe you own your data and it should be easy to consume and analyze which is why we provide REST APIs and SDKs available on GitHub that make it as easy as possible to get started.

Flexible:

The platform can be deployed in any operating model, including on-premise, SaaS, private cloud, or a hybrid combination. It’s always the exact same platform and there are no cost implications associated regardless of deployment model.

Secure:

The platform provides enterprise-grade security with Role-Based Access Control (RBAC), LDAP, and SaaS certifications that guarantee security and compliance.

What services are delivered on the Application Intelligence Platform?

The AppDynamics Application Intelligence Platform gives companies the foundation they need to monitor performance and extract meaningful analytics from their business applications. Software-defined businesses must be certain that their most complex, business-critical applications are performing at the highest level; and be certain that the data and information generated by these applications can be harnessed for ongoing business advantage and impact. AppDynamics’ Application Intelligence platform enables customers to monitor, respond, and analyze their application environment.

Monitor

  • Application Performance Management
  • Mobile Application Monitoring
  • End User Experience Management
  • Database/NoSQL/BigData Monitoring
  • Infrastructure Monitoring

Respond

  • Run Book Automation
  • Cloud Auto-Scaling
  • Alerting

Analyze

  • Transaction Analytics (Beta)
  • Real-Time Business Metrics
  • Operational Analytics

The Application Intelligence Platform delivers this functionality by collecting data from across the application environment, processing & correlating it, and converting that data into knowledge for customers.

Collecting Data:

  • Instrumentation allows us to watch every line of code.
  • Distributed transaction tracing follows the business transaction across all tiers of the application, not just limited to a single node.
  • Transaction auto-learning engine inspects execution code, payload, libraries and methods so you are not left in the dark.
  • Real-time service discovery renders architecture topology to clearly and automatically map critical relationships and dependencies.
  • The smart agent filters and transfers data to the management server securely and in real-time.

Processing & Correlating Data

  • Real-time stream processing allows for complex event processing of metric and event streams.
  • Time series clustering and analytics provides the ability to index and manage time series by auto rolling up, purging or clustering data sets by time increments.
  • Behavioral learning is an engine that continuously adjusts dynamic baselines for an automated manageable data set.
  • Unstructured and structured big data indexing creates a data warehouse for different types of application data which is seamlessly correlated and processed.
  • Event correlation shows the relationship of performance, change, and business events.
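The idea of a continuously adjusted dynamic baseline can be sketched in a few lines of Python. This is only an illustration of the general technique (a rolling mean plus a standard deviation band); the class name, window size, and three-sigma threshold are assumptions for the example and say nothing about how the AppDynamics engine is actually implemented.

```python
from collections import deque
from statistics import mean, stdev


class DynamicBaseline:
    """Rolling baseline that flags values far above recent behavior."""

    def __init__(self, window=20, sigmas=3.0, min_samples=5):
        self.samples = deque(maxlen=window)  # old samples fall off automatically
        self.sigmas = sigmas
        self.min_samples = min_samples

    def observe(self, value):
        """Return True if value is anomalous versus the baseline, then learn it."""
        anomalous = False
        if len(self.samples) >= self.min_samples:
            threshold = mean(self.samples) + self.sigmas * stdev(self.samples)
            anomalous = value > threshold
        self.samples.append(value)
        return anomalous


# Hypothetical response times in milliseconds, with one spike.
response_times = [100, 102, 98, 101, 99, 100, 400, 101]
baseline = DynamicBaseline()
flags = [baseline.observe(ms) for ms in response_times]
print(flags)
```

One simplification to note: the sketch learns from anomalous values too, so a spike temporarily widens the band. A production engine would likely treat outliers more carefully and account for time-of-day or day-of-week patterns.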

Converting Data Into Knowledge

  • Intuitive user interface (UI) allows for a simple and easy-to-use platform for business users, developers and operations teams.

  • Dynamic flow maps create a clear and understandable visual representation of the application topology, so organizations understand what is happening in the big picture.

  • Cross-correlated drill-down is available from anywhere in the UI, enabling pinpoint accuracy across multiple nodes and machines.

  • Real-time correlation of performance metrics and business metrics creates a business understanding of data never before delivered.

  • Compare and analyze the side-by-side performance of different versions of an application.

  • Custom drag and drop HTML5 dashboards can be configured to any specification to deliver clear actionable results.

  • Query language allows easy, targeted data discovery.

In today’s competitive and demanding environment, organizations need to leverage Application Intelligence for ongoing business advantage. The AppDynamics Application Intelligence Platform enables customers to see everything across all front and back-end interactions: business transaction and code-level visibility with no blind spots.

Maximize digital business performance through an open, integrated solution that delivers real-time monitoring, actionable business and operational insights, and automated issue resolution across applications, infrastructure, and user experience. Take five minutes to get complete visibility and control into the applications that power your software-defined business. Get started with the AppDynamics Application Intelligence Platform today.

Diving Into AppDynamics Mobile APM & End User Monitoring

In the AppDynamics Spring 2014 release we greatly improved our End User Monitoring solution and added Mobile Application Performance Monitoring for iOS and Android applications. Through our end user experience dashboard you can understand the experience of your users in real time, whether they arrive via browsers or mobile devices.

We greatly improved visibility in the geography drill-down, breaking down requests by web vs. mobile and iOS vs. Android.

We added more granular client-side metrics with a new waterfall timing visualization for browser snapshots.

We also added server-side correlation for browser snapshots, for end-to-end visibility from the client side to the server side.

We enhanced our browser snapshots with a better drill-down for discovering JavaScript errors.

We also expose all client-side metrics in the metrics browser so you can easily track and correlate them over time.

AppDynamics added support for iOS and Android applications with complete end user monitoring, crash reporting, network request snapshots, and device and user analytics.

By instrumenting network requests from mobile apps to the server side, you gain end-to-end visibility from the device all the way to the server side with Network Request Metrics. With AppDynamics Mobile Snapshots with Server-side Correlation you can correlate network errors and exceptions on the server side.

With AppDynamics Mobile APM Crash Reporting you can proactively detect and respond to application crashes, hangs, and failed network requests, and discover crash metrics by device, geography, app version, OS version, and so on. Gain complete visibility with iOS Crash Reporting with symbolication.

With User & Device Analytics you can understand your users with breakdowns by device, connection, carrier, OS version, and app version.

Through the AppDynamics metric browser you can track custom metrics and get complete visibility into the performance of your mobile apps.

Take five minutes to get complete visibility into the performance of your production applications with AppDynamics today.

A Newbie Guide to APM

Today’s blog post is headed back to the basics. I’ve been using and talking about APM tools for so many years sometimes it’s hard to remember that feeling of not knowing the associated terms and concepts. So for anyone who is looking to learn about APM, this blog is for you.

What does the term APM stand for?

APM is an acronym for Application Performance Management. You’ll also hear the term Application Performance Monitoring used interchangeably and that is just fine. Some will debate the details of monitoring versus management and in reality there is an important difference but from a terminology perspective it’s a bit nit-picky.

What’s the difference between monitoring and management?

Monitoring is a term used when you are collecting data and presenting it to the end user. Management is when you have the ability to take action on your monitored systems. Management tasks can include restarting components, making configuration changes, collecting more information through the execution of scripts, etc… If you want to read more about the management functionality in APM tools click here.

What is APM?

There is a lot of confusion about the term APM. Most of this confusion is caused by software vendors trying to convince people that their software is useful for monitoring applications. In an effort to create a standard definition for grouping software products, Gartner introduced a definition that we will review here.

Gartner lists five key dimensions of APM in their terms glossary found here… http://www.gartner.com/it-glossary/application-performance-monitoring-apm

End user experience monitoring – EUM and RUM are the common acronyms for this dimension of monitoring. This type of monitoring provides information about the response times and errors end users are seeing on their device (mobile, browser, etc…). This information is very useful for identifying compatibility issues (website doesn’t work properly with IE8), regional issues (users in northern California are seeing slow response times), and issues with certain pages and functions (the JavaScript is throwing an error on the search page).

Runtime application architecture discovery modeling and display – This is a graphical representation of the components in an application or group of applications that communicate with each other to deliver business functionality. APM tools should automatically discover these relationships and update the graphical representation as soon as anything changes. This graphical view is a great starting point for understanding how applications have been deployed and for identifying and troubleshooting problems.

User-defined transaction profiling – This is functionality that tracks the user activity within your applications across all of the components that service those transactions. A common term associated with transaction profiling is business transactions (BT’s). A BT is very different from a web page. Here’s an example… As a user of a website I go to the login page, type in my username and password, then hit the submit button. As soon as I hit submit a BT is started on the application servers. The app servers may communicate with many different components (LDAP, Database, message queue, etc…) in order to authenticate my credentials. All of this activity is tracked and measured and associated with a single “login” BT. This is a very important concept in APM and is shown in the screenshots below.

Screen_Shot_2014-08-04_at_1.30.15_PM-960x0
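The “login” example can be modeled with a tiny data structure to show what it means for many component calls to roll up into one BT. This Python sketch is purely illustrative: the `BusinessTransaction` class and the timings are invented for the example, and it assumes the calls happen one after another so their durations can simply be added.

```python
from dataclasses import dataclass, field


@dataclass
class BusinessTransaction:
    """One user-facing unit of work and the component calls made on its behalf."""
    name: str
    calls: list = field(default_factory=list)  # (component, duration_ms) pairs

    def record(self, component, duration_ms):
        self.calls.append((component, duration_ms))

    @property
    def total_ms(self):
        # Assumes sequential calls; overlapping calls would need real timestamps.
        return sum(ms for _, ms in self.calls)

    def slowest_component(self):
        return max(self.calls, key=lambda call: call[1])[0]


# Hypothetical timings for a single "login" request.
login = BusinessTransaction("login")
login.record("app server", 12.0)
login.record("LDAP", 35.0)
login.record("database", 140.0)
login.record("message queue", 8.0)

print(login.name, login.total_ms, login.slowest_component())
```

The point of the grouping is that a slow database call shows up as a slow “login”, which is the unit the user actually experiences, rather than as an isolated database metric.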

Component deep-dive monitoring in application context – Deep dive monitoring is when you record and measure the internal workings of application components. For application servers, this would entail recording the call stack of code execution and the timing associated with each method. For a database server this would entail recording all of the SQL queries, stored procedure executions, and database statistics. This information is used to troubleshoot complex code issues that are responsible for poor performance or errors.
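For a feel of what per-method timing data looks like, the sketch below uses Python’s standard-library profiler, `cProfile`, on a made-up request handler. The functions `handle_request`, `run_sql_query`, and `render_page` are hypothetical, and a deterministic profiler like this is only a stand-in: production APM agents typically use much lighter-weight instrumentation.

```python
import cProfile
import pstats
import time


def run_sql_query():
    """Hypothetical slow database call."""
    time.sleep(0.02)


def render_page():
    """Hypothetical fast rendering step."""
    return "<html></html>"


def handle_request():
    run_sql_query()
    return render_page()


# Record every function call made while handling one request.
profiler = cProfile.Profile()
profiler.enable()
handle_request()
profiler.disable()

# Show the most expensive calls by cumulative time.
stats = pstats.Stats(profiler)
stats.sort_stats("cumulative").print_stats(5)
```

In the printed table, `run_sql_query` should dominate the cumulative time of `handle_request`, which is the kind of signal deep-dive monitoring uses to point at the responsible method.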

Analytics – This term leaves a lot to be desired since it can be and often is very liberally interpreted. To me, analytics (in the context of APM) means baselining, and correlating data to provide actionable information. To others analytics can be as basic as providing reporting capabilities that simply format the raw data in a more consumable manner. I think analytics should help identify and solve problems and be more than just reporting but that is my personal opinion.

Do I need APM?

APM tools have many use cases. If you provide support for application components or the infrastructure components that service the applications, then APM is an invaluable tool for your job. If you are a developer, then absolutely yes: APM fits right in with the entire software development lifecycle. If your company is adopting a DevOps philosophy, APM is a tool that is collaborative at its core and enables developers and operations staff to work more effectively. Companies that are using APM tools consider them a competitive advantage because they resolve problems faster, solve more issues over time, and provide meaningful business insight.

How can I get started with APM?

First off you need an application to monitor. Assuming you have access to one, you can try AppDynamics for free. If you want to understand more about the process used in most companies to purchase APM tools you can read about it by clicking here.

Hopefully this introduction has provided you with a foundation for starting an APM journey. If there are more related topics that you want me to write about please let me know in the comments section below.

Check out our complimentary ebook, Top 10 Java Performance Problems!

APM vs aaNPM – Cutting Through the Marketing BS

Marketing; mixed feelings! I’m really looking forward to the new Super Bowl ads (some are pure marketing genius) in a few weeks, but I really dislike all of the confusion that marketing tends to create in the technology world. In today’s blog post I’m going to attempt to cut through all of the marketing BS and clearly articulate the differences between APM (Application Performance Management) and aaNPM (Application Aware Network Performance Management). I’m not going to try to convince you that you need one instead of the other. This isn’t a sales pitch, it’s an education session.

Definitions

APM – Gartner has been hard at work trying to clearly define the APM space. According to Gartner, APM tools should contain the following 5 dimensions:

  • End-user experience monitoring (EUM) – (How is the application performing according to what the end user sees)
  • Runtime application architecture discovery modeling and display (A view of the logical application flow at any given point in time)
  • User-defined transaction profiling (Tracking user activity across the entire application)
  • Component deep-dive monitoring in application context (Application code execution, SQL query execution, message queue behavior, etc…)
  • Analytics

aaNPM – Again, Gartner has created a definition for this market segment which you can read about here

“These solutions allow passive packet capture of network traffic and must include the following features, in addition to packet capture technology:

  • Receive and process one or more of these flow-based data sources: NetFlow, sFlow and Internet Protocol Flow Information Export (IPFIX).
  • Provide roll-ups and dashboards of collected data into business-relevant views, consisting of application-centric performance displays.
  • Monitor performance in an always-on state, and generate alarms based on manual or automatically generated thresholds.
  • Offer protocol analysis capabilities to decode and understand multiple applications, including voice, video, HTTP and database protocols. The tool must provide end-user experience information for these applications.
  • Have the ability to decrypt encrypted traffic if the proper keys are provided to the solution.”

Reality

So what do these definitions mean in real life?

APM tools are typically used by application support, operations, and development teams to rapidly identify, isolate, and repair application issues. These teams usually have a high-level understanding of how networks operate but not nearly the detailed knowledge required to resolve network-related issues. They live and breathe application code, integration points, and component (server, OS, VM, JVM, etc…) metrics. They call in the network team when they think there is a network issue (hopefully based upon some high-level network indicators). These teams usually have no control over the network and must follow the network team’s process to get a physical device connected to the network. APM tools do not help you solve network problems.

aaNPM tools are network packet sniffers by definition. The majority of these tools (NetScout, Cisco, Fluke Networks, Network Instruments, etc.) are hardware appliances that you connect to your network. They need to be connected to each segment of the network where you want to collect packets or they must be fed filtered and aggregated packet streams by NPB (Network Packet Broker) devices. aaNPM tools contain a wealth of network level details in addition to some valuable application data (EUM metrics, transaction details, and application flows). aaNPM tools help network engineers solve network problems that are manifesting themselves as application problems. aaNPM tools are not capable of solving code issues as they have no visibility into application code.

If I were on a network support team I would want to understand if applications were being impacted by network issues so I could prioritize remediation properly and have a point of reference if I was called out by an application support team.

Network and Application Team Convergence

I’ve been asked if I see network and application teams converging in a similar way that dev and ops teams are converging as a result of the DevOps movement. I have not seen this with any of the companies I get to interact with. Based on my experience working in operations at various companies in the past, network teams and application support teams think in very different ways. Is it impossible for these teams to work in unison? No, but I see it as unlikely over the next few years at least.

But what about SDN (Software Defined Networking)? I think most network engineers see SDN as a threat to their livelihood, whether that is true or not. No matter the case, SDN will take a long time to make its way into operational use, and in the meantime network and application teams will remain separate.

I hope this was helpful in cutting through the marketing spin that many vendors are using to expand their reach. When it comes right down to it, your use case may require APM, aaNPM, a combination of both, or some other technology not discussed today. Technology is made to solve problems. I highly recommend starting out by defining your problems and then exploring the best solutions available to help solve your problem.

If you’ve decided to explore your APM tool options you can try AppDynamics for free by clicking here.

Scaling our End User Monitoring Cloud

Why End User Monitoring?

In a previous post, my colleague Tom Levey explained the value of Monitoring the Real End User Experience. In this post, we will dive into how we built a service to scale to billions of users.

The “new normal” for enterprise web applications includes multiple application tiers communicating via a service-oriented architecture that interacts with several databases and third-party web services. The modern application has multiple clients, from browser-based desktops to native applications on mobile. At AppDynamics, we believe that application performance monitoring should cover all aspects of your application, from the client side to the server side all the way back to the database. The goal of end user monitoring is to provide insight into client-side performance and capture errors from modern JavaScript-intensive applications. The challenge of building an end user monitoring service is that every single request needs to be instrumented. This means that for every request your application processes, we will process a beacon. With clients like FamilySearch, Fox News, BackCountry, ManPower, and Wowcher, we have to handle millions of concurrent requests.

AppDynamics End User Monitoring enables application owners to:

  • Monitor Their Global Audience and track End User Experience across the World to pinpoint which geo-locations may be impacted by poor Application Performance
  • Capture end-to-end performance metrics for all business transactions – including page rendering time in the Browser, Network time, and processing time in the Application Infrastructure
  • Identify bottlenecks anywhere in the end-to-end business transaction flow to help Operations and Development teams triage problems and troubleshoot quickly
  • Compare performance across all browser types – such as Internet Explorer, Firefox, Google Chrome, Safari, iOS and Android
  • Track JavaScript errors

“Fox News already depends upon AppDynamics for ease-of-use and rapid troubleshooting capability in our production environment,” said Ryan Jairam, Internet Operations Lead at Fox News. “What we’ve seen with AppDynamics’ End-User Monitoring release is an even greater ability to understand application performance, from what’s happening on the browser level to the network all the way down to the code in the application. Getting this level of insight and visibility for an application as complex and agile as ours has been a tremendous benefit, and we’re extremely happy with this powerful new addition to the AppDynamics Pro solution.”

EUM Cloud Service

The End User Monitoring cloud is our super-scalable platform for data analysis and processing end user requests. In this post we will discuss the underlying architecture and some of the design challenges of building a cloud service capable of supporting billions of requests. Once End User Experience monitoring is enabled in the controller, your application’s requests are automatically instrumented with a very small piece of JavaScript that allows AppDynamics to capture critical performance metrics.

The JavaScript agent leverages the Web Episodes JavaScript timing library and the W3C Navigation Timing specification to capture the end user experience metrics. Once the metrics are collected, they are pushed to the End User Monitoring cloud via a beacon for processing.

The EUM (End User Monitoring) Cloud Service is our on-demand, cloud-based, multi-tenant SaaS infrastructure that acts as an aggregator for all EUM metrics traffic. EUM metrics from the end user browsers of every customer are reported to the EUM Cloud Service. The raw information received from the browser is verified, aggregated, and rolled up at the EUM Cloud Service. All the AppDynamics Controllers (SaaS or on-premise) connect to the EUM Cloud Service to download metrics every minute, for each application.
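The roll-up step can be pictured with a small sketch. This is only an illustration of per-minute, per-application aggregation and not the actual EUM Cloud implementation; the `Beacon` fields, the `roll_up` function and the choice of count/average/maximum as the aggregates are all assumptions made for the example.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Beacon:
    account: str        # hypothetical customer (tenant) identifier
    app: str            # hypothetical application identifier
    timestamp_s: float  # epoch seconds at which the beacon was received
    response_ms: float  # end user response time reported by the browser


def roll_up(beacons):
    """Aggregate raw beacons into one record per (account, app, minute)."""
    buckets = defaultdict(list)
    for b in beacons:
        minute_start = int(b.timestamp_s // 60) * 60  # start of the minute bucket
        buckets[(b.account, b.app, minute_start)].append(b.response_ms)
    return {
        key: {
            "count": len(values),
            "avg_ms": sum(values) / len(values),
            "max_ms": max(values),
        }
        for key, values in buckets.items()
    }
```

A controller polling once a minute would then only need the records whose key matches its own account and application, which is far less data than the raw beacons.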

Design Challenges

On-Demand highly available

End users access customer web applications from anywhere in the world, at any time of day and in different time zones. Whenever an AppDynamics-instrumented web page is accessed, the browser reports EUM metrics to the EUM Cloud Service. This requires a highly available on-demand system that can be accessed from different geo locations and different time zones.

Extremely Concurrent usage

All end users of all AppDynamics customers using the EUM solution continuously report browser information to the same EUM Cloud Service. The EUM Cloud Service processes all the reported browser information concurrently, generating metrics and collecting snapshot samples continuously.

High Scalability

The usage pattern for different applications varies throughout the day; the number of records to be processed at the EUM Cloud varies with different applications at different times. The EUM Cloud Service automatically scales up to handle any surge in incoming records and scales down again when the load drops.
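As a rough sketch of what scaling with load means in practice, the function below picks an instance count from the incoming record rate. It is not the policy the EUM Cloud Service uses; the per-instance capacity and the minimum and maximum bounds are hypothetical parameters, and on AWS a decision like this would normally be delegated to an auto scaling policy rather than hand-written.

```python
import math


def desired_instances(records_per_sec, capacity_per_instance,
                      min_instances=2, max_instances=100):
    """Return how many instances to run for the current record rate.

    capacity_per_instance is the (assumed) number of records one instance
    can process per second. The result is clamped to the given bounds.
    """
    if capacity_per_instance <= 0:
        raise ValueError("capacity_per_instance must be positive")
    needed = math.ceil(records_per_sec / capacity_per_instance)
    return max(min_instances, min(max_instances, needed))
```

The lower bound keeps some redundancy during quiet periods, and the upper bound caps cost if a surge turns out to be abnormal.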

Multi Tenancy support

The EUM Cloud Service processes EUM metrics reported from different applications for different customers, so the cloud service must provide multi-tenancy. The reported browser information is partitioned by customer and by each customer’s applications. The EUM Cloud Service provides a mechanism for different customer controllers to download aggregated metrics and snapshots based on customer and application identification.

Cost

The EUM Cloud Service needs to be able to scale dynamically based on demand. The problem with supporting massive scale on fixed infrastructure is that we would have to pay for hardware up front and overprovision to handle huge spikes. One of the motivating factors in choosing Amazon Web Services is that costs scale linearly with demand.

Architecture

The EUM Cloud Service is hosted on Amazon Web Services infrastructure for horizontal scaling. The service has two functional components – collector and aggregator. Multiple instances of these components work in parallel to collect and aggregate the EUM metrics received from the end user browser/device. The transient metric data is stored in Amazon S3 buckets. All metadata related to applications and other configuration is stored in Amazon DynamoDB tables.

A single page load sends one or more beacons: one for the base page, one for each iframe onload, and one per Ajax request. JavaScript errors that occur after page load are also sent as error beacons.
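To make the beacon arithmetic concrete, here is a minimal sketch that counts the beacons one page view would generate. The function name is hypothetical, and it assumes one error beacon per post-load JavaScript error, which the text does not state explicitly.

```python
def beacons_for_page_load(iframes, ajax_requests, js_errors_after_load=0):
    """Count beacons for one page view: base page + iframes + Ajax + errors.

    Assumes each post-load JavaScript error produces its own error beacon.
    """
    base_page = 1
    return base_page + iframes + ajax_requests + js_errors_after_load
```

For example, a page with two iframes and five Ajax calls would produce eight beacons under these assumptions, which is why beacon volume can be several times the page view count.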

The functionality of the nodes is to receive the metric data from the browser and process it for the controller:

  • Resolve the geo information (the country/region/city the request came from) and add it to the metric using an in-process MaxMind geo-resolver.
  • Parse the User-Agent information and add browser, device and OS information to the metrics.
  • Validate the incoming browser-reported metrics and discard invalid metrics.
  • Mark the metrics/snapshots as SLOW or VERY SLOW based on a dynamic standard deviation algorithm or a static threshold.
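The validation and classification steps above can be sketched as follows. This is a simplified stand-in and not the production algorithm: the validity rule (a numeric response time between zero and ten minutes), the 3 and 4 standard deviation cut-offs, and the function names are all assumptions chosen for illustration.

```python
from statistics import mean, pstdev

MAX_PLAUSIBLE_MS = 10 * 60 * 1000  # assumed upper bound: ten minutes


def is_valid(metric):
    """Return True if the beacon carries a plausible response time (assumed rule)."""
    value = metric.get("response_ms")
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return 0 <= value <= MAX_PLAUSIBLE_MS


def classify(response_ms, history_ms, slow_sigmas=3.0, very_slow_sigmas=4.0):
    """Label a response time against a baseline built from recent history."""
    baseline = mean(history_ms)
    spread = pstdev(history_ms)  # population standard deviation of the history
    if response_ms > baseline + very_slow_sigmas * spread:
        return "VERY_SLOW"
    if response_ms > baseline + slow_sigmas * spread:
        return "SLOW"
    return "NORMAL"
```

Because the baseline is recomputed from recent history, the same absolute response time can be normal for one page and slow for another. A static threshold would replace the two computed cut-offs with fixed millisecond values.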

Load Testing

For maximum scalability, we leverage Amazon Web Services’ global presence for optimal performance in every region (Virginia, Oregon, Ireland, Tokyo, Singapore, Sao Paulo). In our most recent load test, we tested the system as a whole at about 6.5 billion requests per day. The system is designed to scale out easily as load grows, and it has handled many billions of requests per day without breaking a sweat.
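To put the load test figure in perspective, the daily total can be converted to an average request rate. The calculation below assumes the load is spread evenly over 24 hours, which real traffic is not, so peak rates would be higher.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def average_requests_per_second(requests_per_day):
    """Convert a daily request total into an average per-second rate."""
    return requests_per_day / SECONDS_PER_DAY


avg = average_requests_per_second(6.5e9)
print(f"{avg:,.0f} requests/second on average")  # 75,231 requests/second on average
```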

Check out your end user experience data in AppDynamics

Find out more about AppDynamics Pro and get started monitoring your application with a free 15-day trial.

As always, please feel free to comment if you think I have missed something or if you have a request for content in an upcoming post.

Monitoring the Real End User Experience

Web application performance is fundamentally associated, in the mind of the end user, with a brand’s reliability, stability, and credibility. Slow or unstable performance is simply not an option when your users are only a click away from taking their business elsewhere. Understanding your users, the locations they are coming from, and the devices/browsers they are using is crucial to ensuring customer loyalty and growth.

Today’s modern web applications are architected to be highly interactive, often executing complex client side logic in order to provide a rich and engaging experience to the user. This added complexity means it is no longer good enough to simply measure the effects users have on the back-end. It is also necessary to measure and optimize the client-side performance to ensure the best possible experience for your users.

Determining the root cause for poor user experience is a costly and time consuming activity which requires visibility into page composition, JavaScript error diagnostics, network latency metrics and AJAX/iframe performance.

Let’s take a look at a few of the key features available in AppDynamics 3.7 which simplify troubleshooting these problems.

End User Experience dashboard:

The first view we will look at reports EUM data by geographic location showing which regions have the highest loads, the longest response times, the most errors, etc.

The dashboard is split into three main panels:

  • A main panel in the upper left that displays geographic distribution on a map or a grid
  • A panel on the right displaying summary information: total end user response time, page render time, network time and server time
  • Trend graphs in the lower part of the dashboard that dynamically display data based on the level of information displayed in the other two panels

The geographic region for which the data is displayed throughout the dashboard is based on the region currently shown on the map or in the summary panel. For example, if you zoom in from the global view to France on the map, the summary panel and the graphs display data only for France.

This view is key to understanding the geographical impact of any network & CDN latency performance issues. You can also see which are the busiest geographies, driving the highest throughput of your application. 

Browsers and Devices:

From the Browsers and Devices tabs you can see the distribution of devices, browsers and browser versions, split by geographic region, which shows the most popular access mechanisms for the users of your application. From here we can isolate whether a particular browser or device is delivering a reduced experience to the end user, and plan the best areas for optimisation.

Troubleshooting End User Experience:

The user response breakdown shown below is the first place we look to troubleshoot why a user is experiencing slow response times. It provides a full breakdown of where the overall time is being spent during the various stages of a page render, highlighting issues such as network latency, poor page design, too much time parsing HTML or downloading and executing JavaScript.

Response time metric breakdown

First Byte Time is the interval between the time that a user initiates a request and the time that the browser receives the first response byte.

Server Connection Time is the interval between the time that a user initiates a request and the start of fetching the response document from the server. This includes time spent on redirects, domain lookups, TCP connects and SSL handshakes.

Response Available Time is the interval between the beginning of the processing of the request on the browser to the time that the browser receives the response. This includes time in the network from the user’s browser to the server.

Front End Time is the interval between the arrival of the first byte of text response and the completion of the response page rendering by the browser.

Document Ready Time is the time to make the complete HTML document (DOM) available.

Document Download Time is the time for the browser to download the complete HTML document content.

Document Processing Time is the time for the browser to build the Document Object Model (DOM).

Page Render Time is the time for the browser to complete the download of remaining resources, including images, and finish rendering the page.
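The definitions above can be related to the timestamps exposed by the W3C Navigation Timing API. The sketch below shows one plausible mapping; the attribute names (navigationStart, requestStart, responseStart, responseEnd, domContentLoadedEventStart, loadEventEnd) come from the specification, but which attributes AppDynamics actually uses for each metric is an assumption here, as is the `breakdown` function itself.

```python
def breakdown(t):
    """Derive the response time metrics from Navigation Timing marks (ms).

    t maps Navigation Timing attribute names to millisecond timestamps.
    The attribute chosen for each metric is an assumed mapping.
    """
    dom_ready = t["domContentLoadedEventStart"]
    return {
        "first_byte": t["responseStart"] - t["navigationStart"],
        "server_connection": t["requestStart"] - t["navigationStart"],
        "response_available": t["responseStart"] - t["requestStart"],
        "front_end": t["loadEventEnd"] - t["responseStart"],
        "document_ready": dom_ready - t["responseStart"],
        "document_download": t["responseEnd"] - t["responseStart"],
        "document_processing": dom_ready - t["responseEnd"],
        "page_render": t["loadEventEnd"] - dom_ready,
    }


# Hypothetical timings for a single page load, relative to navigationStart.
sample = {
    "navigationStart": 0,
    "requestStart": 120,
    "responseStart": 320,
    "responseEnd": 400,
    "domContentLoadedEventStart": 900,
    "loadEventEnd": 1500,
}
metrics = breakdown(sample)
```

Under this mapping the metrics nest: First Byte Time is Server Connection Time plus Response Available Time, and Front End Time is Document Ready Time plus Page Render Time.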

AppDynamics EUM reports on three different kinds of pages:

  • A base page represents what an end user sees in a single browser window. 
  • An iframe is a page embedded in another page.
  • An Ajax request is a request for data sent from a page asynchronously.

Notifications can be configured to trigger on any of these.

JavaScript error detection

JavaScript error detection provides alerting and identification of the root cause of JavaScript errors in minutes, highlighting the JavaScript file, line # and exception messages for all errors seen by your real users.
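One way to get from individual error beacons to the view described above is to group them by file, line and message and rank the groups by frequency. The sketch below assumes each error beacon is a dictionary with "file", "line" and "message" keys; that shape and the `top_errors` name are illustrative only.

```python
from collections import Counter


def top_errors(error_beacons, limit=5):
    """Group error beacons by (file, line, message) and return the most frequent."""
    counts = Counter(
        (e["file"], e["line"], e["message"]) for e in error_beacons
    )
    return counts.most_common(limit)
```

Ranking by count puts the errors that affect the most real users at the top of the list.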

Server-side correlation

If the above isn’t enough and you want to look into the execution within the datacentre, you can drill in-context from any of the detailed browser snapshots directly into the corresponding call stack trace in the application servers behind. This provides end-to-end visibility of a user’s interaction with your web application, from the browser all the way through the datacentre and deep into the database.

Deployment and scalability:

Deployment is simple – all you have to do is add a few lines of JavaScript to the web pages you want to monitor. We’ll even auto-inject this JavaScript on certain platforms at runtime. With its elastic public cloud architecture, AppDynamics EUM is designed to support billions of devices and user sessions per day, making it a perfect fit for enterprise web applications.

See Everything:

With AppDynamics you’ll get visibility into the performance of pages, AJAX requests and iframes, and you can see how performance varies by geographic region, device and browser type. In addition, you’ll get a highly granular browser response time breakdown (using the Navigation Timing API) for each snapshot, allowing you to see exactly how much time is spent in the network and in rendering the page. And if that’s not enough, you’ll see all JavaScript errors occurring at the end user’s browser down to the line number.

If you don’t currently know exactly what experience your users are getting when they access your applications, or your users are complaining and you don’t know why, then why not check out AppDynamics End User Monitoring for free at appdynamics.com?

Manpower Group Sees Real Results from End User Monitoring

Some companies talk about monitoring their end user experience and other companies take the bull by the horns and get it done. For those who have successfully implemented EUM (RUM, EUEM, or whatever your favorite acronym is) the technology is rewarding for both the company and the end user alike. I recently had the opportunity to discuss AppDynamics EUM with one of our customers and the information shared with me was exciting and gratifying.

The Environment

ManpowerGroup monitors their intranet and internet applications with AppDynamics. These applications are used for internal operations as well as customer-facing websites in support of their global business, and they are accessed from around the world, 24×7. We’re talking about business-critical, revenue-generating applications!

I asked Fred Graichen, Manager of Enterprise Application Support, why he thought ManpowerGroup needed EUM.

“One of the key components for EUM is to shed light on what is happening in the “last mile”. Our business involves supporting branch locations. Having an EUM tool allows us to compare performance across all of our branches. This also helps us determine whether any performance issues are localized. Having the insight into the difference in performance by location allows us to make more targeted investments in local hardware and network infrastructure.”

Meaningful Results

Turning on a monitoring tool doesn’t mean you’ll automagically get the results you want. You also need to make sure your tool is integrated with your people, processes, and technologies. That’s exactly what ManpowerGroup has done with AppDynamics EUM. They have alerts based upon EUM metrics that get routed to the proper people. They are then able to correlate the EUM information with data from other (Network) monitoring tools in their root cause analysis. Below is an EUM screen shot from ManpowerGroup’s environment.

By implementing AppDynamics EUM, ManpowerGroup has been able to:

  • Identify locations that are experiencing the worst performance.
  • Successfully illustrate the difference in performance globally as well. (This is key when studying the impact of latency etc. on an application that is accessed from other countries but hosted in a central datacenter.)
  • Quickly identify when a certain location is seeing performance issues and correlate that with data from other monitoring solutions.

But what does all of this mean to the business? It means that ManpowerGroup has been able to find and resolve problems faster for their customers and employees. Faster application response time combined with happier customers and more productive employees all contribute to a healthier bottom line for ManpowerGroup.

ManpowerGroup is using AppDynamics EUM to bring a higher level of performance to its employees, customers, and shareholders. Sign up for a free trial today and begin your journey to a healthier bottom line.