The New Serverless Agent For AWS Lambda

To control costs and reduce the burden of infrastructure management in the cloud, more companies are using services like AWS Lambda to deploy serverless functions. Due to the unpredictable nature of end-user demand in today’s digital-first world, serverless functions that can be spun up as needed can also help resolve unplanned scaling issues.

But that’s not to say these serverless workloads don’t impact the overall performance of your application environment. In fact, since these workloads are transient in nature, they represent a real challenge for teams who need to correlate an issue across their application environment, or see the impact that serverless applications are having on end users—or even on the business itself.

How AppDynamics Helps

Today, we’re announcing a new family of application agents that help our customers who use serverless microservices gain more visibility and insight into the performance of their application and its impact on the broader ecosystem.

In the same way that we collect and baseline metrics and events for traditional applications, we can now help serverless users gain deep insight into response times, throughput and exception rates in applications built on any mixture of serverless and conventional runtimes. This brings our industry-leading ability to visualize end-user and business impact into the serverless realm, helping teams prioritize issue-resolution efforts and optimize the performance of these ephemeral workloads.

What We Do

The first iteration of AppDynamics’ Serverless Agent family targets Java microservices running in AWS Lambda, and is available as a beta program for qualified customers. Here’s how it works:

The Serverless Agent for AWS Lambda allows our customers to instrument their Lambda code at entry (when it is invoked from an external request source) and at exit (when it invokes an external downstream service), and to ingest incoming or populate outgoing correlation headers. Our streamlined approach to collecting metrics and events from serverless functions also means you never have to worry about missing an important data point, or about slowing down your otherwise healthy serverless functions.
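To make the entry/exit instrumentation idea concrete, here is a conceptual sketch in Python. This is not the AppDynamics agent API (the beta targets Java, and the header name and helper below are illustrative assumptions); it only shows what timing an invocation at entry and exit, and propagating a correlation header, looks like in principle:

```python
import time
import uuid

# Hypothetical header name, for illustration only.
CORRELATION_HEADER = "x-correlation-id"

def instrumented(handler):
    """Wrap a function-style handler: time the invocation (entry/exit)
    and carry a correlation ID from incoming to outgoing headers."""
    def wrapper(event, context):
        # Entry: read the inbound correlation ID, or start a new one.
        correlation_id = event.get("headers", {}).get(
            CORRELATION_HEADER, str(uuid.uuid4()))
        start = time.time()
        try:
            result = handler(event, context)
        finally:
            elapsed_ms = (time.time() - start) * 1000
            print(f"exit: {elapsed_ms:.1f} ms, correlation={correlation_id}")
        # Exit: stamp the correlation ID on the outgoing response so the
        # next hop can continue the distributed transaction.
        result.setdefault("headers", {})[CORRELATION_HEADER] = correlation_id
        return result
    return wrapper

@instrumented
def handler(event, context):
    return {"statusCode": 200, "headers": {}, "body": "ok"}
```

The real agent does this transparently, without requiring you to wrap handlers yourself.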

You can find out more and sign up for AppDynamics’ AWS Lambda beta program on our community site.

Migrating to AWS Without Losing (Too Much) Sleep

As my fellow CIOs are well aware, the rapid changes to our digital economy can seem daunting.

Despite the challenges of our digital world, Wyndham Hotels & Resorts, the world’s largest hotel franchisor, executed a significant digital transformation requiring change from our North American hotel owners that ultimately enabled them to provide better service to their guests. We succeeded without losing loyalty from our owners or guests, and with the right partnerships, support and strategy in place – you can too.

New research shows that only 22 percent of technologists are optimistic that their organization is ready for the rapid pace of technological advancement and 67 percent of IT professionals worry that technology innovation is outpacing society’s ability to harness it and adapt to change.

In light of these statistics, it’s easy to feel like these obstacles could create real problems for your customers and your business.

But they don’t have to.

For those of you in travel and hospitality, you understand the criticality of seamless digital experiences, especially on mobile. Consumers can book their flight, order room service and schedule guided tours – and owners can change rates, respond to guest requests, and see what competitors are doing – all from the touch of their phone.

At Wyndham Hotels & Resorts, we are on a mission to democratize travel for guests and the people who serve them with a portfolio of iconic hotel brands strategically designed to offer a wide range of compelling experiences and price points for guests.

We know our hotel owners are the platform for our growth, and together we serve the everyday traveler; as the stewards of our brands, owners are vital to delivering the experiences our guests want. That’s why we believe in providing our owners with the best technology resources. We knew we needed to evolve our digital platforms to meet our owners’ changing needs, enabling them to give the best travel experiences possible to their guests.

Supporting that business mission, here are three keys to a successful digital transformation.

Evaluate your company’s strategic direction

The first step in our digital transformation was to re-evaluate our existing capabilities and direction. After reviewing all of our options and conducting plenty of due diligence around scale, security and performance, we decided to migrate many applications in our existing datacenter to AWS so we could innovate faster.

While we knew leveraging the speed and flexibility of AWS was the best way for our owners to receive the reliability and accessibility they need to manage digital information about their properties (from photos, to descriptions, to rates), we knew we had to mitigate any negative impacts a migration of this size and complexity could have.

We needed the right partner to protect our brand and accelerate the process.

Choose the right partner

Wyndham prides itself on partnering with like-minded companies that are customer-centric, nimble and experts in their fields.

We found AppDynamics just as our teams were being drawn to an AWS-based approach, and we saw the potential of cross-enterprise monitoring. By partnering with AppDynamics, we were able to migrate 8,400 hotels across 18 brands with confidence: rapidly and securely adopting AWS’ cloud-native architectures, maintaining performance throughout the migration, ensuring an equal or better customer experience in our new hybrid environments, and driving business objectives while maximizing value.

The powerful level of visibility and real-time diagnosis provided by AppDynamics helped us deliver high quality service to our owners before, during and after the move to AWS.

Put the customer experience first

AppDynamics was not only critical to the success of our cloud migration; it also supported our larger digital footprint by enabling us to segment the customer experience by category, track user sessions to understand how much time users spend at each step, filter and analyze by geography and device type, and receive proactive alerts and analysis before the customer was ever impacted.

These additional insights gave us a holistic view of our applications and how our users interact with our apps, down to the design and feature level. This enabled us to focus on other critical improvements, accelerating user acceptance testing (UAT) during the migration process while maintaining application performance throughout.

Since our migration to AWS with the support of AppDynamics, we’ve seen an increase of 75 percent in mobile bookings. Together, AppDynamics and AWS are hugely strategic to our company’s continued innovation.

The digital economy today presents the opportunity to become an agent of transformation for your business.

With the right partners, culture and leadership in place, you too can drive the innovation necessary to keep up with the rapid pace of technological change, while maintaining and growing customer loyalty and your business at the same time…allowing everyone to migrate peacefully and without losing (too much) sleep.

Expanding Amazon Web Services Monitoring with AppDynamics

Enterprises are increasingly moving their applications to the cloud, and Amazon Web Services (AWS) is the leading cloud provider. In this blog, I will provide some additional details on our expanded support for AWS monitoring.

20 AWS Monitoring Extensions

The AppDynamics platform is highly extensible to monitor various technology solutions that are not discovered natively. Our extensions collect metrics for specific AWS components and pass them to the AppDynamics Controller, where they can be tracked, used in health rules, and visualized in dashboards.

Here is the list of all the AWS monitoring extensions:

  1. AWS SQS Monitoring Extension
  2. AWS EC2 Monitoring Extension
  3. AWS ElastiCache Monitoring Extension
  4. AWS DynamoDB Monitoring Extension
  5. AWS AutoScaling Monitoring Extension
  6. AWS Billing Monitoring Extension
  7. AWS EBS Monitoring Extension
  8. AWS OpsWorks Monitoring Extension
  9. AWS ELB Monitoring Extension
  10. AWS RDS Monitoring Extension
  11. AWS ElasticMapReduce Monitoring Extension
  12. AWS RedShift Monitoring Extension
  13. AWS Route53 Monitoring Extension
  14. AWS SNS Monitoring Extension
  15. AWS StorageGateway Monitoring Extension
  16. AWS CloudSearch Monitoring Extension
  17. AWS Lambda Monitoring Extension
  18. AWS Custom Namespace Monitoring Extension
  19. AWS S3 Monitoring Extension
  20. AWS API Gateway Monitoring Extension


Customers can leverage all the core functionalities of AppDynamics (e.g. dynamic baselining, health rules, policies, actions, etc.) for all of these AWS metrics while correlating them with the performance of the application using the AWS environment.

AppDynamics Customers using AWS Monitoring

With AppDynamics, many of our customers have accelerated their application migration to the cloud, while others continue to monitor cloud applications as application complexity explodes with the move toward microservices and dynamic web services. As the workload on these cloud applications grew, AppDynamics came to the rescue by elastically scaling AWS resources to meet the exponential demand.


For example, Nasdaq accelerated their application migration to AWS and gained complete visibility into their complex application ecosystem. Heather Abbott, Senior Vice President of Corporate Solutions Technology at Nasdaq, summarized their experience with AppDynamics: “The ability to trace a transaction visually and intuitively through the interface was a major benefit AppDynamics delivered. This visibility was especially valuable when Nasdaq was migrating a platform from its internal infrastructure to the AWS Cloud. We used AppDynamics extensively to understand how our system was functioning on AWS, a completely new platform for us.”

Amazon ECS or EKS? AppDynamics Supports Both

You’ve probably read our recent blog on monitoring Amazon Elastic Container Service for Kubernetes (EKS) with AppDynamics, which integrates seamlessly into EKS environments.

There are multiple approaches to running Docker containers in AWS, and we support them. The two main flavors are Amazon Elastic Container Service (ECS) and EKS. Here’s how they differ, and why you may choose one over the other.


Amazon ECS is the Docker-compatible container orchestration solution from Amazon Web Services. It allows you to run and scale containerized applications on EC2 instances. Provided by Amazon as a service, ECS consists of multiple built-in components that enable administrators to create clusters, tasks and services.

EKS makes it easy to deploy, manage, and scale containerized applications using Kubernetes on AWS. It manages the complexities of installing and operating Kubernetes clusters, and brings additional value to enterprises, including choice and portability, high availability, and network isolation and performance. For more details on the power of EKS, see our blog on monitoring EKS with AppDynamics.

Which is best for you, ECS or EKS? For many enterprises that containerize their applications and want a simple way to deploy them in AWS, ECS may be the way to go. Perhaps they don’t need all the features that Kubernetes provides, or simply don’t want the additional complexity. They do, however, want to run Docker with support for AWS native services, such as Elastic Load Balancers, CloudTrail, CloudWatch, and so on.

EC2 vs. Fargate

Once you’ve decided that ECS is the way to go, there’s one more decision to make: whether to use the EC2 launch type and have more granular control over instance types, server images and clusters, or have AWS Fargate manage all the placement and availability of your containers. In the examples below, we’ll use the EC2 launch type, as this will allow us to look at the underlying concepts of ECS. But it’s definitely worth checking out Fargate, which has the potential to greatly simplify administration of your container infrastructure. As I write this, Fargate support is only available for ECS, but Amazon has announced that Fargate will also support EKS before the end of the year.


Key Concepts in ECS

An ECS Container Instance is simply an EC2 instance running the ECS Container Agent and registered with an ECS cluster. To get started, use Amazon’s ECS-optimized AMI and the default cluster that’s created for you when you first use the ECS service. In practice, you would use ECS clusters to separate your workloads into different environments (for example, dev, staging and production) and also to distribute them across Availability Zones. As the name implies, ECS Container Instances provide the services needed to run your containers, and can be configured with the correct access permissions, connectivity and resources.

ECS Task Definitions are used to specify the container(s) you wish to deploy to the Container Instances, and how the ECS Container Agent should run them. In addition to setting the location of the container images, which typically are pulled from the Elastic Container Registry (ECR) service or the Docker Hub/Store, the task definition also controls resource limits, environment variables, networking mode, data volumes—in fact, all the things you specify when you launch a container from the command line. So, no surprises there.

Finally, ECS provides a service scheduler for long-running services or applications, with an ability to run one-off jobs. The scheduler supports two strategies: Replica, which places and maintains tasks across the cluster, as specified by the Task Definition; and Daemon, which can be used to place exactly one task on each container instance. You would typically use the Replica strategy to scale out application components, while the Daemon strategy is ideal for running the AppDynamics Server Agent to support container visibility. (More on Daemon in a minute.)
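As a hedged illustration of the two strategies (the cluster, service and task names below are assumptions, not from this post), here is how the choice might look when building the parameters for a boto3 `create_service` call:

```python
# Sketch: choosing between the Replica and Daemon scheduling strategies
# when creating an ECS service. Names like "appd-server-agent" are
# illustrative placeholders.

def service_params(name, task_def, daemon=False, desired_count=2):
    """Build kwargs for ecs_client.create_service(). A Daemon service
    places exactly one task per container instance, so desiredCount
    is omitted in that case."""
    params = {
        "cluster": "default",
        "serviceName": name,
        "taskDefinition": task_def,
        "schedulingStrategy": "DAEMON" if daemon else "REPLICA",
    }
    if not daemon:
        params["desiredCount"] = desired_count
    return params

# An application tier scales out with replicas...
web = service_params("web-tier", "web-task:1", desired_count=4)
# ...while the AppDynamics Server Agent runs once per instance.
agent = service_params("appd-server-agent", "appd-agent-task:1", daemon=True)

# With credentials configured, these would be submitted as:
# ecs_client = boto3.client("ecs"); ecs_client.create_service(**web)
```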

AppDynamics in ECS

To understand how AppDynamics integrates with ECS environments, let’s quickly compare the concept of a container and task in ECS with a node and tier in AppDynamics.  Here’s a quick recap of Apps, Tiers and Nodes in AppD:

  • App: a business service or application that contains a set of related services and business transactions, typically something that people or systems interact with. An app could be a mobile banking application with its associated backend, an airline booking application, and so on. In general, multiple pieces of functionality are tied together to provide business value.

  • Tier: a particular service that’s a part of an application. For example, an identification service or an order service. An application generally has multiple tiers.

  • Node: a member of a tier. All nodes essentially run the same code, usually with a load balancer (tier) in front of them.

So how does this relate to ECS containers and tasks? In AppDynamics, it’s natural to group similar containers deployed as a service at the tier level; this mirrors Kubernetes, where a tier maps naturally to a service deployment and the nodes are instances of that service. In this model, we normally configure node reuse with a suitable prefix to handle the dynamic nature of typical services, which scale up and down naturally. For APM purposes, we are usually interested in the aggregate behavior and availability of the service as a whole, rather than that of individual (and often short-lived) containers.
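Node reuse with a prefix is typically configured through Java agent system properties; the following is a sketch only, and the exact property names and the tier prefix shown should be verified against the AppDynamics agent documentation for your version:

```
# Illustrative agent options for reusing node names in dynamic
# container environments (verify property names in the AppD docs):
JAVA_OPTS="$JAVA_OPTS \
  -Dappdynamics.agent.reuse.nodeName=true \
  -Dappdynamics.agent.reuse.nodeName.prefix=order-tier"
```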

Monitoring in ECS with AppDynamics

Using AppDynamics to monitor applications deployed with ECS is straightforward. Simply follow these main steps:

  1. Include the appropriate APM agent in each container you want to monitor. We’ll show a simple example that you can try, one using a pre-built image freely available from the AppDynamics Docker Store with the Java Agent already installed and configured.

  2. Create a task that acts as a recipe for the deployment of your containers. Define all the info for each container, including environment variables, in the task. Note: one unfortunate limitation of ECS is that you can’t share environment variables across all containers as you would with an environment file. Instead, you must define the agent’s environment variables in each container definition. Fortunately, you can do all this in the JSON task definition, so it is easily scriptable.

  3. Deploy the AppDynamics Server Agent with Docker Visibility enabled on each ECS Container Instance to provide full container visibility for your APM monitoring. If you’re using ECS with the EC2 launch type, you’ll usually want to take advantage of the Daemon scheduling strategy to ensure the Server Agent is deployed once on each Container Instance, much like a Kubernetes DaemonSet. If you’re using ECS with Fargate, this option isn’t available, so you would include the Server Agent as part of your task definition.
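The environment-variable note in step 2 is easy to script. Below is a hedged sketch that generates the same agent settings into each container definition; the APPDYNAMICS_* names follow the conventions used by the AppDynamics Docker images, but verify them against your agent’s documentation, and the host, account and image names are placeholders:

```python
import json

# Shared agent settings, repeated into every container definition
# (placeholder values throughout).
AGENT_ENV = {
    "APPDYNAMICS_CONTROLLER_HOST_NAME": "mycontroller.example.com",
    "APPDYNAMICS_CONTROLLER_PORT": "443",
    "APPDYNAMICS_AGENT_ACCOUNT_NAME": "my-account",
    "APPDYNAMICS_AGENT_APPLICATION_NAME": "ecs-demo",
}

def container_def(name, image, tier):
    """One container definition with the agent env vars plus its tier name."""
    env = dict(AGENT_ENV, APPDYNAMICS_AGENT_TIER_NAME=tier)
    return {
        "name": name,
        "image": image,
        "essential": True,
        "environment": [{"name": k, "value": v} for k, v in sorted(env.items())],
    }

task = {
    "family": "appd-demo",
    "containerDefinitions": [
        container_def("web", "example/tomcat-appd-java:latest", "web-tier"),
    ],
}
print(json.dumps(task, indent=2))
```

The resulting JSON can be registered with `aws ecs register-task-definition` or the equivalent SDK call.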

A Simple Example

Here’s a simple example that shows how to deploy a Docker container with the AppDynamics Java APM Agent pre-installed. You can find examples on the Docker Store, and the source code is available on GitHub, if you want to build your own. To keep things simple, you may prefer to copy your images to Amazon’s Elastic Container Registry (ECR) and load them from there.

Here you can see the task and container definitions used to deploy the base image, which contains the Java APM Agent using the standard environment variables for configuring the agent. The standard base container from the AppDynamics Docker Store has Tomcat pre-installed with our Java agent. On startup, you will see output from the agent indicating that it is initialized and ready to begin instrumenting your application.

Step 1. Task Definition

Step 2. Container Definition

Step 3. Runtime Output from the Container

Now here’s how to deploy the AppDynamics Server Agent, again using a base image from the AppDynamics Docker Store, to provide integrated container visibility:

Step 1. Task Definition

Step 2. Container Definition

Step 3. Runtime Output from the Server Agent Container

Amazon ECS includes a command line interface compatible with Docker Compose but, if you wish, you can also use the AWS Management Console.

Try this for yourself with your AppDynamics account (sign up for a free trial using our SaaS service). If you follow these steps and then view the logs for the Tomcat container, you’ll see the AppDynamics Agent start up and connect to the Controller, ready to report business transactions and application metrics. Drop an application war file into the Tomcat working directory, and you’re up and running!

Monitor Amazon EKS with AppDynamics

On the heels of announcing the general availability of AppDynamics for Kubernetes at KubeCon Europe, we’ve partnered with Amazon Web Services (AWS) to bring Amazon EKS to the broader Kubernetes community. AppDynamics provides enterprise-grade, end-to-end performance monitoring for applications orchestrated by Kubernetes.

Amazon EKS, AWS’s managed Kubernetes service, shoulders the heavy lifting of installing and operating your own Kubernetes clusters. Beyond the operational agility and simplicity in managing Kubernetes clusters, Amazon EKS brings additional value to enterprises, including the following:

1. Choice and Portability: Built on open source and upstream Kubernetes, EKS passes CNCF’s conformance tests, enabling enterprises to run applications confidently on EKS without having to make changes to the app or learn new Kubernetes tooling. You can choose where to run applications on various Kubernetes deployment venues—on-premises, AWS clusters managed with kops, Amazon EKS, or any other cloud provider.

2. High Availability: EKS deploys the control plane in at least two availability zones, monitors the health of the master nodes, and re-instantiates the master nodes, if needed, automatically. Additionally, it patches and updates Kubernetes versions.

3. Network Isolation and Performance: Worker nodes run in the subnets within your VPC, giving you control over network isolation via security groups.

Amazon EKS brings VPC networking to Kubernetes pods and removes the burden of running and managing an overlay networking fabric. The CNI plugin runs as a DaemonSet on every node and allocates an IP address to every pod from the pool of secondary IP addresses attached to the worker node’s elastic network interface (ENI). Communication between the control plane and worker nodes occurs over the AWS networking backbone, resulting in better performance and security.

Monitoring Amazon EKS with AppDynamics

EKS makes it easier to operate Kubernetes clusters; however, performance monitoring remains one of the top challenges in Kubernetes adoption. In fact, according to a recent CNCF survey, 46% of enterprises reported monitoring as their biggest challenge. Organizations deploying containers in the public cloud cite monitoring as a particular challenge, perhaps because cloud providers’ monitoring tools may not play well with the tools organizations already use to monitor on-premises resources.

We are therefore excited that AppDynamics and AWS have teamed up to accelerate your EKS adoption.

How Does it Work?

AppDynamics seamlessly integrates into EKS environments. The machine agent runs as a DaemonSet on EKS worker nodes, and application agents are deployed alongside your application binaries within the application pods. Out-of-the-box integration gives you the deepest visibility into EKS cluster health, AWS resources and Docker containers, and provides insights into the performance of every microservice deployed—all through a single pane of glass.


Unified, end-to-end monitoring helps AppDynamics’ customers expedite root-cause analysis, reduce MTTR, and confidently adopt modern application architectures such as microservices. AppDynamics provides a consistent approach to monitoring applications orchestrated by Kubernetes, whether the clusters are deployed on Amazon EKS or on-premises, enabling enterprises to leverage their existing people, processes and tools.

Correlate Kubernetes performance with business metrics: For deeper visibility into business performance, organizations can create tagged metrics, such as customer conversion rate or revenue per channel correlated with the performance of applications on the Kubernetes platform. Health rules and alerts based on business metrics provide intelligent validation so that every code release can drive business outcomes.

Get Started Today

To get started with enterprise-grade monitoring of EKS, follow these easy steps:

1. Sign up for a free AppDynamics trial and configure the environment’s ConfigMap. Sample configuration and instructions are available on our GitHub page.

2. Create the EKS cluster and worker nodes, and configure kubectl with the EKS control plane endpoint. Deploy your Kubernetes services and deployments.

3. Start end-to-end performance monitoring with AppDynamics!

The AppD Approach: Capturing 5M metrics/minute with AWS and Aurora

AppDynamics customers have always benefited from flexibility when it comes to where the AppDynamics Controller is hosted, be it in our SaaS environment managed by AppDynamics or hosted “on-premises” under the complete control of the enterprise. While it has always been possible to host an AppDynamics Controller in AWS, achieving large scale required effort. With an AWS-based Controller and an Aurora backend, our current benchmarks easily reach the 5M metrics/minute mark with minimal effort.

This solution is a game-changer for on-premises customers who are open to hosting the AppD platform in AWS, as it allows them to run a large-scale Controller without procuring physical hardware. (Update: as of June 2018, new AppDynamics SaaS customers also use this solution.)

So Why Aurora?

High Performance and Scalability

Aurora provides higher performance than MySQL, allowing AppDynamics to scale the Controller to handle more metrics than was previously possible in a cloud environment.

Before Aurora, it took some effort and expense to build a Controller at scale in AWS: setting up higher-performance storage behind your database, tweaking file systems, and so on. With AWS and Aurora, we easily start at 1M metrics/minute and, with minimal effort, scale to over 5M metrics/minute and 10K agents. Our Professional Services team can provide guidance for deployments larger than 10K agents.

With AWS, you can start small and scale up as needed, thereby avoiding the risk of underutilizing expensive hardware.

Fully Managed

With Aurora, there is no need to worry about database backups, as the Amazon Relational Database Service (RDS) manages this for you.  It also handles software-patching and minor upgrades, thus lowering the total cost of ownership.

High Availability and Durability

With the multi-AZ deployment option, Amazon Aurora offers 99.99% availability.  Aurora automatically replicates the data across multiple availability zones, and can typically failover to a new instance in less than 30 seconds.  The failover process is automatic and seamless, requiring no changes to the Controller itself. It’s possible to failover not only to another Availability Zone, but also to another Region, if needed.

AppD does support High Availability configurations for on-premises deployments, which require two physical machines, each with its own storage. If one machine has an issue, you can quickly switch to the second system.

In AWS with Aurora, a single machine is utilized to host the Controller application server, which is separate from the Aurora backend. If the application server has an issue, you can spin up another one based on an image as fast as AWS can start the EC2 instance (usually less than 90 seconds).

The benefit: There is no longer a need for a second, barely utilized machine waiting to do something. You can spin up a new machine on demand, as needed.

Solid Security

Amazon Aurora is highly secure. Network isolation is achieved through the use of an Amazon Virtual Private Cloud (VPC). Aurora also supports encryption for data at rest, with no measurable performance impact.

How Does It Work?

One Click Deployment with CloudFormation

The AppDynamics CloudFormation template provides a comprehensive solution for automating the process of creating the virtual infrastructure required to host a Controller in AWS. With a single button-click, you can create all the security groups, elastic network interfaces, EC2 instances, and virtual infrastructure required.

Installation Using Enterprise Console

Installation of the Controller is done via the AppD Enterprise Console command-line interface (CLI); the installer includes a new option to specify Aurora as the database type.


./submit-job --platform-name myplatform --service controller --job install \
  --args controllerProfile=large \
  controllerPrimaryHost=<hostname> \
  controllerTenancyMode=single \
  controllerRootUserPassword="<password>" \
  mysqlRootPassword="<password>" \
  controllerAdminUsername="admin" \
  controllerAdminPassword="<password>" \
  databaseType=Aurora \
  controllerDBPort=3388 \
  controllerDBHost="<aurora hostname>"


AppDynamics version 4.4.3 makes it easy to deploy a Controller in AWS with Aurora. This solution offers a host of benefits for AppD customers, including vastly improved scalability, reliability, availability, security and performance. For these reasons, AppDynamics SaaS now leverages Aurora as well.

And with support for large-scale Controllers, the entire AppD platform can now be deployed at scale in AWS, including the Events Service Cluster, EUM Server and other components.

Want to learn more? The product documentation has additional info on this great new capability from AppD.

Himanshu Sharma, co-author, is Senior DevOps Engineer on the AppDynamics Platform team, and lead on the Aurora-backed controller.

Derek Mitchell is part of AppDynamics Global Services team, which is dedicated to helping enterprises realize the value of business and application performance monitoring. AppDynamics’ Global Services’ consultants, architects, and project managers are experts in unlocking the cross-stack intelligence needed to improve business outcomes and increase organizational efficiency.

The AppD Approach: How to Monitor the AWS Cloud and Save Money

The AppDynamics platform is highly extensible, allowing our customers to monitor a variety of key Amazon Web Services (AWS) metrics. We have 20 unique AWS extensions that capture stats for everything from Auto Scaling, which optimizes app performance and cost, to Storage Gateway, which enables on-prem apps to use AWS cloud storage. We’re always fine-tuning these cloud-monitoring extensions and making improvements where necessary. In some cases, we’ll integrate these features into our core APM product to keep it best in class.

What can AppD’s extensions do for you? Each efficiently gathers metrics from all Regions and Availability Zones in the AWS Global Infrastructure. Using Amazon CloudWatch APIs, our extensions pull metrics from specific AWS components and pass them to the AppDynamics Controller for tracking, and for creating health rules and dashboard visualizations. These cloud-monitoring extensions give our customers greater insight into how their apps and businesses are running on AWS.

The AWS EC2 Monitoring Extension, for instance, retrieves data from Amazon CloudWatch on EC2 instances—including CPU, network and IO utilization—and displays this information in the AppDynamics Metric Browser. Similarly, the Billing Monitoring Extension captures billing statistics, while the S3 Monitoring Extension pulls in data on S3 bucket size, the number of objects in a bucket, HTTP metrics, and more.

AppDynamics users can also leverage the AWS cloud connector extension to automatically scale up or down in the cloud based on a variety of rules-based policies, such as the health of business transactions, the end-user experience, database and remote services, error rates, and overall app performance.

This cloud-monitoring extension helped one AppD customer avoid a Black Friday ecommerce meltdown by implementing health rules to automatically scale up EC2 resources when certain load and response time metrics were breached, and scale down when those metrics returned to normal. Another plus: By adding an authorization step to these workflows—one that asked permission before spinning instances up or down—the customer paid only for the EC2 resources it needed.

Simple Setup

It’s easy to edit an AppDynamics AWS extension’s config file, extract the performance metrics you need, and show the data on your dashboard. Once you provide the necessary information in config.yml (see below), the extension will do the rest. You can select time ranges for data gathering, include/exclude specific metrics, and specify how you’d like your stats aggregated: average, max, min, sum, or sample count:
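As an illustration, a config.yml for one of these extensions might look something like the following. Treat this as a sketch rather than the exact schema: field names vary by extension, so check the extension’s README for the version you install.

```yaml
# Illustrative config.yml fragment for an AWS monitoring extension.
accounts:
  - awsAccessKey: "<access key>"
    awsSecretKey: "<secret key>"
    displayAccountName: "MyAccount"
    regions: ["us-east-1"]
metricsConfig:
  includeMetrics:
    - name: "CPUUtilization"
      statType: "ave"            # average, max, min, sum, or sample count
  metricsTimeRange:
    startTimeInMinsBeforeNow: 10
    endTimeInMinsBeforeNow: 0
```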

How AWS Extensions Save You Money

Many of our customers are gathering AWS metrics in the AppD platform. Collecting these metrics requires the use of Amazon API calls, which can get expensive when used excessively. As reported by AWS, a recent study by migration analytics firm TSO Logic found that most organizations are overpaying for cloud services, and that 35% of an average company’s cloud computing bill is wasted cost.

AWS regularly releases new products, all of which use CloudWatch APIs. The good news is that AppDynamics offers a special extension that monitors products using those APIs. By helping you collect only the metrics you need, AppD can help you manage your AWS bill.

EC2 Example

CloudWatch monitoring has two levels of pricing: Basic and Detailed. In Basic monitoring, metrics are updated in five-minute intervals; in Detailed, metrics are updated every minute.

Let’s say each EC2 instance reports seven metrics, and you want to monitor all EC2 instances in one AWS region. To do so, you install the EC2 Monitoring Extension on your machine agent and add your access key and secret key to the config.yml file. Then add the AWS region you’d like to monitor.

The extension makes a call to AWS to get the list of instances for each region listed in the config file, as well as the metrics associated with them. To get a metric value, AppDynamics calls CloudWatch to get the value associated with each instance.

Simply put, API calls add up in a hurry. Let’s look at an example:

1 list call + 7 metric calls = 8 calls per minute
x 60 minutes/hour            = 480 calls
x 24 hours/day               = 11,520 calls
x 30 days                    = 345,600 calls
x 20 instances               = 6,912,000 calls

CloudWatch pricing gives you one million free API requests each month. This means you’ll be charged for the remaining ~6 million calls.
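A quick back-of-the-envelope script (pure arithmetic, no AWS calls) confirms the numbers above, and shows how much a lower polling frequency helps:

```python
# Reproduce the call-volume math above: 20 instances, 7 metrics each,
# 1 list call per instance poll, polled once per minute for 30 days.
def monthly_calls(instances, metrics_per_instance, polls_per_hour):
    calls_per_poll = 1 + metrics_per_instance  # 1 list call + 1 call per metric
    return instances * calls_per_poll * polls_per_hour * 24 * 30

FREE_TIER = 1_000_000  # free CloudWatch API requests per month

every_minute = monthly_calls(20, 7, 60)       # poll every minute
every_five = monthly_calls(20, 7, 12)         # poll every 5 minutes instead

print(every_minute, max(0, every_minute - FREE_TIER))  # 6,912,000 total
print(every_five, max(0, every_five - FREE_TIER))
```

Dropping from one-minute to five-minute polling cuts the bill-relevant overage from roughly 5.9 million calls to under 400,000.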

How can AppD help? By letting you specify how frequently your AWS extensions should make API calls. This feature allows you to dramatically reduce the number of AWS calls, while still monitoring all of your instances. It has been very effective for several AppD clients and is being added over time to all of our AWS extensions.

Select Your Cloud-Monitoring Metrics

Metric selection is another key feature of our AWS extensions. This is important because CloudWatch, by default, provides several metrics that may not be necessary for monitoring your environment.

AppDynamics’ extensions let you select only the metrics you’d like to monitor. In addition to better managing your CloudWatch bill, this also allows you to better manage which data is important to your business, and which isn’t.

You can monitor individual instances as well: Choose the instance you’d like to monitor, and the extension will only make calls regarding that instance. Since this feature allows you to monitor just those instances you use regularly, it saves money.

AppDynamics is always refining its AWS extensions, making improvements where necessary and integrating some of these features into our core APM product. Again, if there’s an AWS metric you need, we collect it. If you have specific AWS needs, contact your account manager. Want to learn more about AppDynamics? Learn more here or schedule a demo today.

Understanding the Momentum Behind .NET Core

Three years ago Satya Nadella took over as CEO of Microsoft, determined to spearhead a renewal of the iconic software maker. He laid out his vision in a famous July 10, 2014 memo to employees in which he declared that “nothing was off the table” and proclaimed his intention to “obsess over reinventing productivity and platforms.”

How serious was Nadella? In the summer of 2016, Microsoft took the bold step of releasing .NET Core, a free, cross-platform, open-source version of its globally popular .NET development platform. With .NET Core, .NET apps could run natively on Linux and macOS as well as Windows.

For customers, .NET Core solved a huge portability problem. .NET shops could now easily modernize monolithic on-premises enterprise applications by breaking them up into microservices and moving them to cloud platforms like Microsoft Azure, Amazon Web Services, or Google Cloud Platform. They had been hearing about the benefits of containerization: speed, scale and, most importantly, the ability to create an application and run it anywhere. Their developers loved Docker’s ease of use and installation, as well as the automation it brought to repetitive tasks. But just moving a large .NET application to the cloud had presented daunting obstacles. The task of lifting and shifting the large system-wide installations that supported existing applications consumed massive amounts of engineering manpower and often did not deliver the expected benefits, such as cost savings. Meanwhile, the dependency on the Windows operating system limited cloud options, and microservices remained a distant dream.

.NET Core not only addressed these challenges, it was also ideal for containers. In addition to starting a container with an image based on Windows Server, engineers could also use much smaller Windows Nano Server images or Linux images. This meant engineers had the freedom to work across platforms; they were no longer required to deploy server apps solely on Windows Server images.

Typically, the adoption of a new developer platform would take time, but .NET Core experienced a large wave of early adoption. Then, in August 2017, .NET Core 2.0 was released, and adoption increased exponentially. The number of .NET Core users reached half a million by January 2018. By achieving almost full feature parity with .NET Framework 4.6.1, .NET Core 2.0 took away all the pain that had previously existed in shifting from the traditional .NET Framework to .NET Core. Libraries that hadn’t existed in .NET Core 1.0 were added to .NET Core 2.0. And because .NET Core implemented all 32,000 APIs in .NET Standard 2.0, most applications could reuse their existing code.

Engineering teams that had struggled with DevOps initiatives found that .NET Core allowed them to accelerate their move to microservices architectures and to put in place a more streamlined path from development to testing and deployment. Lately, hiring managers have started telling their recruiters to be sure to mention the opportunity to work with .NET Core as an enticement to prospective hires—something that never would have happened with .NET.

At AppDynamics, we’re so excited about the potential of .NET Core that we’ve tripled the size of the engineering team working on .NET. And, just last month, we announced a beta release of support for .NET Core 2.0 on Windows using the new .NET micro agent released in our Winter ‘17 product release. This agent provides improved microservices support as more customers choose .NET Core to implement multicloud strategies. Reach out to your account team to participate in this beta.

Stay tuned for my next blog posts on how to achieve end-to-end visibility across all your .NET apps, whether they run on-premises, in the cloud, or in multi-cloud and hybrid environments.

The AppD Approach: IoT and AWS Greengrass

As both data and processing power rise on the edge of the network, monitoring the performance of edge devices becomes increasingly important. In addition to deploying the AppDynamics IoT monitoring platform to monitor C/C++ and Java apps, end-to-end visibility can be extended to applications running in an AWS Greengrass core by using the AppDynamics IoT RESTful APIs. The easiest way to do this today is with a Lambda function. We recently demonstrated this at AWS re:Invent using Cisco IOx and Cisco Kinetic together with AWS Greengrass on a Cisco Industrial Integrated Services router.

The best thing about this approach is that it opens up a new ecosystem of edge applications to the benefits of unified application monitoring. It helps customers resolve incidents faster, reduce downtime, and lower operations costs. Meanwhile, the combined strengths of AWS Greengrass and AppDynamics’ IoT Monitoring Platform allow the very large volumes of data generated by the Internet of Things to be mined for business insights and harnessed to achieve business objectives.

AWS Greengrass is designed to simplify the implementation of local processing on edge devices. A software runtime, it lets companies execute compute, messaging, data caching, sync, and machine learning (ML) inference instructions even when connectivity to the cloud is temporarily unavailable. Since its release, it has helped accelerate adoption of IoT by making it easier for developers to create and test applications in the cloud using their programming language of choice and then deploy the apps to the edge.

Once the apps are deployed, AppDynamics’ IoT Monitoring Platform provides deep visibility, in real-time, by letting developers capture application performance data, errors and exceptions, and business data. Since the AppDynamics solution is designed for flexible integration at the edge, Lambda functions can be individually instrumented, or a dedicated Lambda function can be written to provide insight into all the Lambdas running. This allows for a wide range of edge applications to monitor any key metric that makes sense to the business.
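As a rough illustration, a dedicated Greengrass Lambda might package a custom metric as a beacon and POST it to the AppDynamics IoT collector over REST. This is a minimal sketch: the collector URL, app key, and payload shape below are hypothetical placeholders, so consult the AppDynamics IoT REST API documentation for the real endpoint and schema.

```python
import json
import urllib.request

# Hypothetical collector endpoint and app key -- placeholders only;
# see the AppDynamics IoT REST API docs for actual values.
COLLECTOR_URL = "https://iot-collector.example.com/v1/application/<appKey>/beacons"

def build_beacon(device_id, metric_name, value, ts_ms):
    """Package a single custom metric as an IoT beacon payload (illustrative shape)."""
    return {
        "deviceInfo": {"deviceId": device_id, "deviceType": "greengrass-core"},
        "customEvents": [{
            "eventType": metric_name,
            "timestamp": ts_ms,
            "doubleProperties": {"value": value},
        }],
    }

def send_beacon(beacon):
    """POST the beacon to the collector; called from the Lambda handler."""
    req = urllib.request.Request(
        COLLECTOR_URL,
        data=json.dumps([beacon]).encode(),
        headers={"Content-Type": "application/json"},
    )
    urllib.request.urlopen(req)

def handler(event, context):
    # Report how much sensor data this invocation ingested.
    beacon = build_beacon(
        "plc-gateway-01", "bytes_ingested",
        float(event.get("bytes", 0)), event.get("ts", 0),
    )
    send_beacon(beacon)
    return {"status": "reported"}
```

The same pattern scales from instrumenting one Lambda to a dedicated reporter Lambda that aggregates metrics from all the Lambdas running on the core.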

In the demo at AWS re:Invent, we instrumented an edge application running on a manufacturing floor that was reading sensor data from a programmable logic controller (PLC) over a Modbus interface and reporting it back to the cloud. A key success metric was how much edge computing reduced the large inbound data volume to the much smaller, meaningful volume pushed to the cloud. AppDynamics provided real-time verification by tracking the volume of data ingested into the Lambda functions, as well as the data processed and sent on to the various cloud applications, including AWS Cloud.

Learn more about AppDynamics IoT monitoring and please send us any feedback or questions.

This Is How Amazon’s Servers Rarely Go Down [Infographic]

Amazon Web Services (AWS), Amazon’s best-in-class cloud services offering, had downtime of only 2.5 hours in 2015. You may think their uptime of roughly 99.97 percent had something to do with an engineering team of hundreds, a budget of billions, or dozens of data centers across the globe—but you’d be wrong. Amazon’s website, video, and music offerings, and even AWS itself, all leverage multiple AWS products to achieve that availability, and those are the same products available to the rest of us. With some clever engineering and good service decisions, anyone can get uptime numbers close to Amazon’s for only a fraction of the cost.

But before we discuss specific techniques to keep your site constantly available, we need to accept a difficult reality: Downtime is inevitable. Even Google went offline in 2015, and if one of the world’s largest websites can’t achieve 100 percent uptime, you can be sure your company can’t either. Instead of trying to prevent downtime altogether, reframe your thinking: do everything you can to keep your service as usable as possible while failure occurs, then recover from it as quickly as possible.

Here’s how to architect an application to isolate failure, recover rapidly from downtime, and scale in the face of heavy load. (Though this is only a brief overview: there are plenty of great resources online for more detailed descriptions. For example, don’t be afraid to dive into your cloud provider’s documentation. It’s the single best source for discovering all the amazing things they can do for you.)

Architecture and Failure Mitigation

Let’s begin by considering your current web application. If your primary database were to go down, how many services would be affected? Would your site be usable at all? How quickly would customers notice?

If your answers are “everything,” “not at all,” and “immediately,” you may want to consider a more distributed, failure-resistant application architecture. Microservices—that is, many small applications that work together to act like a larger app—are an extremely popular engineering paradigm, in part because the failure of an individual service is far less visible to clients.

For example, consider a basic shop application. If it were all one big service, failure of the database would take the entire site offline; no one could use it at all, even just to browse products or plan purchases. But now let’s say you have microservices instead of a monolith. Instead of a single shop application, perhaps you have an authentication service to log in users, a product service to browse the shop, and an order fulfillment service to charge customers and ship goods. A failure in the order fulfillment database means that only customers trying to check out see errors.

Losing an element of your operation isn’t ideal, but it’s not anywhere near as bad as having your entire site unavailable. Only a small fraction of customers will be affected, while everyone else can happily browse your store as if nothing was going wrong. And with proper logging, you can note the prospects that had failed requests and reach out to them personally afterward, apologizing for the downtime and hopefully still converting them into paying customers.

This is all possible with a monolithic app, but microservices distribute failure and better isolate it to specific parts of a system. You won’t prevent downtime; instead, you’ll make it affect fewer people, which is a much more achievable goal.
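The isolation idea above can be sketched in a few lines. This toy example (service names and routes are invented for illustration) shows how a failing order service degrades one route to an error response while browsing and login keep working:

```python
# Toy sketch of failure isolation: each service is independent, so a failing
# order service leaves browsing and login untouched. Illustrative only.
class ServiceDown(Exception):
    pass

def auth_service(user):
    return {"user": user, "token": "abc123"}

def product_service():
    return ["widget", "gadget"]

def order_service(cart):
    # Simulate the order fulfillment database being offline.
    raise ServiceDown("order fulfillment database is offline")

def handle_request(route, **kwargs):
    """Route a request; a downstream failure becomes a scoped error, not a site-wide outage."""
    try:
        if route == "/login":
            return 200, auth_service(kwargs["user"])
        if route == "/products":
            return 200, product_service()
        if route == "/checkout":
            return 200, order_service(kwargs["cart"])
    except ServiceDown as exc:
        return 503, {"error": str(exc)}  # only this route degrades

print(handle_request("/products"))           # browsing still works
print(handle_request("/checkout", cart=[]))  # only checkout returns 503
```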

Databases, Automatic Failover, and Preventing Data Loss

It’s 2 a.m. and a database stops working. What happens to your website? What happens to the data in your database? How long will you be offline?

This used to be the sysadmin nightmare scenario: pray that the last backup was usable and recent, that downtime would last only a few hours, and that only a day’s worth of data had perished. But nowadays the story is very different, thanks in part to Amazon, but also to the power and flexibility of most database software.

If you use the AWS Relational Database Service (RDS), you get daily backups for free, and restoration of a backup is just a click away. Better yet, with a multi-availability zone database, you’re likely to have no downtime at all and the entire database failure will be invisible.

With a multi-AZ database, Amazon keeps an up-to-date copy of your database in another availability zone: a logically separate datacenter from wherever your primary database is. An internet outage, a power blip, or even a comet can take out the primary availability zone, and Amazon will detect the downtime and automatically promote the copy to be your main database. The failover is automatic and typically completes within a minute or two—chances are, you won’t even experience any data loss.

But availability zones are geographically close together. All of Amazon’s us-east-1 datacenters are in Virginia, only a few miles from each other. Let’s say you also want to protect against the complete failure of all systems in the United States and keep a current copy of your data in Europe or Asia. Here, RDS offers cross-region read replicas that leverage the underlying database technology to create consistent database copies that can be promoted to full-fledged primaries at the touch of a button.

Both MySQL and PostgreSQL, the two most popular relational database systems on the market and both available as RDS database engines, offer native capabilities to ship database events to external follower databases as they occur. Here, RDS takes advantage of a feature anyone can use, though with Amazon’s strong consumer focus, it’s significantly easier to set up in RDS than to configure manually. Typically, data is shipped to followers at the same time it is committed to the primary. Across a continent, however, you’re looking at a data-loss window of roughly 200 to 500 milliseconds, because an event must be sent from your primary database and read by the follower.

Still, for a consistent cross-continental backup, 500 milliseconds of potential loss is much better than hours. So the next time your database fails in the middle of the night, your monitoring service won’t even wake you. Instead, you can read about it in the morning—if you can even detect that it occurred. And that means no downtime and no unhappy customers.

Auto Scaling, Repeatability, and Consistency

Amazon’s software-as-a-service (SaaS) offerings, such as RDS, are extremely convenient and very powerful. But they’re far from perfect. Generally, AWS products are much slower to provision compared to running the software directly yourself. Plus, they tend to be several software versions behind the most recent releases.

In databases, this is a fine tradeoff. You create new databases rarely, so slow provisioning doesn’t matter, and you want extremely stable, well-tested, slightly older software. If you try to stay on the bleeding edge, you’ll just end up bloody. But for other services, being locked into Amazon’s product offerings makes less sense.

Once you have an RDS instance, you need some way for customers to get their data into it and for you to interact with that data once it’s there. Specifically, you need web servers. And while Amazon’s Elastic Beanstalk (AWS’ platform to deploy and scale web applications) is conceptually good, in practice it is extremely slow, has middling software support, and can make problems painfully difficult to debug.

But AWS’ primary offering has always been the Elastic Compute Cloud (EC2). Running EC2 nodes is fast and easy, and supports any kind of software your application needs. And, unsurprisingly, EC2 offers exceptional tools to mitigate downtime and failure, including auto scaling groups (ASGs). With an ASG, Amazon keeps as many servers up as you specify, even across availability zones. If a server becomes unresponsive or passes other thresholds defined by you (such as amount of incoming traffic or CPU usage), new nodes will automatically spin up.

New servers by themselves do you no good. You need a process to make sure new nodes are provisioned correctly and consistently, so a new server joining your auto scaling group also has your web software and credentials to access your database. Here, you can take advantage of another Amazon tool, the Amazon Machine Image (or AMI). An AMI is a saved copy of an EC2 instance. Using an AMI, AWS can spin up a new node that is an exact copy of the machine that generated the AMI.

Packer, by HashiCorp, is a free, open-source tool that makes it easy to create and save AMIs, and there are plenty of other tools that simplify AMI creation. AMIs are the fundamental building blocks of EC2; with clever AMI use, you’ll be able to create new, functional servers in less than five minutes.
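To make this concrete, here is a minimal hypothetical Packer template (HCL2). The AMI ID, names, and install commands are placeholders, not a tested configuration:

```hcl
# Hypothetical Packer template baking a web-server AMI; IDs and names are placeholders.
source "amazon-ebs" "web" {
  region        = "us-east-1"
  instance_type = "t3.micro"
  source_ami    = "ami-0123456789abcdef0"   # base image placeholder
  ssh_username  = "ec2-user"
  ami_name      = "web-server-{{timestamp}}"
}

build {
  sources = ["source.amazon-ebs.web"]

  # Bake the web server into the image so new ASG nodes come up ready to serve.
  provisioner "shell" {
    inline = [
      "sudo yum -y install nginx",
      "sudo systemctl enable nginx",
    ]
  }
}
```

Running `packer build` against a template like this boots a temporary EC2 instance, applies the provisioners, and snapshots the result as a reusable AMI.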

It’s common to need additional provisioning and configuration even after an AMI is started—perhaps you want to make sure the latest version of your application is downloaded onto your servers from GitHub, or that the most recent security patches have been applied to your installed packages. In cases such as these, a provisioning system is a necessity. Chef and Puppet are the two biggest players in this space, and both offer excellent integrations with AWS. The ideal use case here is an AMI with credentials to automatically connect to your Chef or Puppet provisioning system, which then ensures the newly created node is as up to date as possible.


Final Thoughts

By relying on auto scaling groups, AMIs, and a sensible provisioning system, you can create a system that is completely repeatable and consistent. Any server could go down and be replaced, or 10 more servers could enter your load balancer, and the process would be seamless, automatic, and almost invisible to you.

And that’s the secret to why Amazon’s services rarely go down. It’s not the hundreds of engineers, or dozens of datacenters, or even the clever products: it’s the automation. Failure happens, but if you detect it early, isolate it as much as possible, and recover from it seamlessly—all without requiring human intervention—you’ll be back on your feet before you even know a problem occurred.

There are plenty of potential concerns with powerful automated systems like this. How do you ensure new servers are ones provisioned by you, and not an attacker trying to join nodes to your cluster? How do you make sure transmitted copies of your databases aren’t compromised? How do you prevent a thousand nodes from accidentally starting up and dropping a massive AWS bill into your lap? This overview of the techniques AWS leverages to prevent downtime and isolate failure should serve as a good jumping-off point to those more complicated concepts. Ultimately, downtime is impossible to prevent, but you can keep it from broadly affecting your customers. Working to keep failure contained and recovery as rapid as possible leads to a better experience both for you and your users.
