Why Idempotency and Ephemerality Matter in a Cloud-Native World

The first time I heard “idempotent” and “ephemeral,” I had no idea what they meant. Perhaps I should have, though, because idempotent and ephemeral patterns are not new to the computing world.

In mathematics and computer science, idempotence is the property of an operation that produces the same outcome no matter how many times it is applied. Ephemerality is the quality of being short-lived or transitory. In a Cloud Native environment, we expect consistency and portability while presuming that infrastructure will be impermanent, including containers that are transient and disposable. These shifts are influencing providers across multiple layers of the OSI Model.
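To make the definition of idempotence concrete, here is a minimal Python sketch. The record layout and the two function names (set_status, add_payment) are hypothetical and chosen only for illustration.

```python
def set_status(record: dict, status: str) -> dict:
    """Idempotent: applying it once or many times yields the same record."""
    updated = dict(record)
    updated["status"] = status
    return updated


def add_payment(record: dict, amount: int) -> dict:
    """Not idempotent: every repeat changes the outcome again."""
    updated = dict(record)
    updated["balance"] = updated.get("balance", 0) + amount
    return updated


account = {"balance": 100, "status": "open"}

once = set_status(account, "closed")
twice = set_status(once, "closed")
print(once == twice)  # True: repeating the call changes nothing

paid_once = add_payment(account, 25)
paid_twice = add_payment(paid_once, 25)
print(paid_once == paid_twice)  # False: a duplicate call is applied twice
```

The second function is the kind of operation that a duplicate message can corrupt, which is where the story in the next section begins.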

Idempotent

Again, “idempotent” threw me the first time I heard it. I had not felt so perplexed by the meaning of a word since childhood. I was a staff developer and excited to use JMS for the first time. My team was starting to modernize a financial service client’s system to be event-based, implementing ActiveMQ as the message broker. We were determining transaction boundaries and suffering from duplicate messages that were wreaking havoc on our final calculations. One of the project’s senior developers suggested I make the endpoint “idempotent.” I gave him a blank look, as if he were speaking pig Latin.

Someone recommended I get a copy of what soon became one of my favorite books, Enterprise Integration Patterns. The book’s authors, Gregor Hohpe and Bobby Woolf, describe a lot of the system-to-system design patterns that we depend on today. In the case of the financial service client, the design pattern was Idempotent Receiver (Consumer). We deployed into one of the client’s data centers and, barring a severe application infrastructure fault, we were under the impression that the center’s infrastructure would be there indefinitely.

By today’s standards, our application infrastructure was fragile. Our duplicate check implementation was very stateful: it depended on state held by the running service, so if the service had stopped we would have been open to duplicates again. In the next iteration of our application, we designed the idempotent service to be more robust, something we would’ve done sooner had we not assumed that our infrastructure was stable rather than ephemeral.
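As an illustration of the more robust design, here is a minimal Python sketch of an Idempotent Receiver whose duplicate check survives a restart. This is not the code from that project: the class name, the table layout, and the use of SQLite as the durable store are all assumptions made for the example, and it presumes every message carries a unique ID.

```python
import os
import sqlite3
import tempfile


class IdempotentReceiver:
    """Applies each message at most once, keyed by a unique message ID."""

    def __init__(self, db_path: str) -> None:
        # A file-backed store keeps the record of seen IDs across restarts.
        self.conn = sqlite3.connect(db_path)
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS processed (message_id TEXT PRIMARY KEY)"
        )
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS ledger "
            "(account TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
        )
        self.conn.commit()

    def handle(self, message_id: str, account: str, amount: int) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        try:
            # One transaction: the ID is recorded and the change applied together.
            with self.conn:
                self.conn.execute(
                    "INSERT INTO processed (message_id) VALUES (?)", (message_id,)
                )
                self.conn.execute(
                    "INSERT OR IGNORE INTO ledger (account, balance) VALUES (?, 0)",
                    (account,),
                )
                self.conn.execute(
                    "UPDATE ledger SET balance = balance + ? WHERE account = ?",
                    (amount, account),
                )
            return True
        except sqlite3.IntegrityError:
            # The primary key rejected a message ID we have already seen.
            return False

    def balance(self, account: str) -> int:
        row = self.conn.execute(
            "SELECT balance FROM ledger WHERE account = ?", (account,)
        ).fetchone()
        return row[0] if row else 0

    def close(self) -> None:
        self.conn.close()


def demo() -> int:
    """Deliver a duplicate before and after a simulated restart."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "receiver.db")
        receiver = IdempotentReceiver(path)
        receiver.handle("msg-1", "acct-42", 100)
        receiver.handle("msg-1", "acct-42", 100)   # redelivery: ignored
        receiver.close()                           # the service stops

        restarted = IdempotentReceiver(path)       # new process, same store
        restarted.handle("msg-1", "acct-42", 100)  # still ignored
        restarted.handle("msg-2", "acct-42", 50)
        total = restarted.balance("acct-42")
        restarted.close()
        return total
```

Recording the message ID and applying the change in the same transaction is the design choice that matters here: a crash between the two steps can neither lose a message nor apply it twice.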

Ephemeral

Cloud providers are looking to capture more workloads while providing their clients with better ROI. When Google launched its Preemptible Virtual Machine in 2015, it was responding to a demand for lower-cost instances. Because these instances are so short-lived, I had a hard time wrapping my head around which workloads would be appropriate for a Preemptible Virtual Machine, or even an Amazon Spot Instance, which offers spare compute capacity in the AWS cloud at steep discounts.

The rise of preemptible or spot instances shows the upsurge in ephemeral computing. As enterprises grapple with the nuances of cloud cost, one avenue for lower-cost services is to have the compute live for a shortened, or ephemeral, period.

Even before preemptible or spot instances, there was a baseline understanding that compute capacity would exist for a finite period of time. Still, because cloud availability was high, some organizations treated traditional instances as if they would last indefinitely. For a planned hardware upgrade, the cloud vendor would give its customers advance notice to switch workloads over to another instance. For unplanned events like outages, a service designed to be multi-region or multi-zone would suffice.

Today, it feels like we are designing workloads to cope with Chaos Monkey at every level, including infrastructure. There’s a growing understanding that our workload infrastructure likely won’t be there in a predictable format. As a result, we’re building more robust services to cope with this unpredictability. These changes spotlight the importance of keeping workloads portable in case we have to switch to another instance, region, zone or provider.
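One common way to cope with an instance that can vanish mid-job is to checkpoint progress somewhere that outlives the instance. The Python sketch below shows the idea under stated assumptions: the function names are hypothetical, the checkpoint is a local JSON file standing in for durable storage, and the stop_after parameter exists only to simulate a preemption.

```python
import json
import os
import tempfile


def load_checkpoint(path: str) -> int:
    """Return the index of the next item to process, or 0 on a fresh start."""
    try:
        with open(path) as handle:
            return json.load(handle)["next_index"]
    except FileNotFoundError:
        return 0


def save_checkpoint(path: str, next_index: int) -> None:
    """Write to a temp file, then rename, so a preemption never leaves half a file."""
    directory = os.path.dirname(path) or "."
    with tempfile.NamedTemporaryFile("w", dir=directory, delete=False) as handle:
        json.dump({"next_index": next_index}, handle)
        temp_name = handle.name
    os.replace(temp_name, path)


def run_job(items, path, process, stop_after=None) -> bool:
    """Process items from the last checkpoint onward.

    stop_after simulates the instance being reclaimed after that many items.
    Returns True when the whole job has finished.
    """
    start = load_checkpoint(path)
    done = 0
    for index in range(start, len(items)):
        if stop_after is not None and done == stop_after:
            return False  # reclaimed before finishing
        process(items[index])
        save_checkpoint(path, index + 1)
        done += 1
    return True
```

If the instance disappears after process() runs but before the checkpoint is saved, the replacement instance will process that item again, so process() itself needs to be idempotent. That is where the two ideas in this post meet.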

Software is Eating (Feeding) the Cloud Native World

Marc Andreessen’s famous quote—“software is eating the world”—proves equally true in the Cloud Native space. The most prolific push for generic hardware has been led by public cloud vendors. Similar to enterprises making the move to x86, cloud vendors have been pushing to make all parts of their stack as generic as possible. In case of failure or expansion, vendors can swap a generic part in and out with ease.

With the generic hardware approach, a good amount of logic moves to the software stack. The rationale here is that if hardware is ephemeral, reconstituting the compute, storage, and even networking would be both seamless and consistent with software-defined storage and networking. Applying this to the public/hybrid cloud market, a software-driven solution that’s robust, scalable and portable becomes a core component of Cloud Native.

Save Us, Software!

Configuration control and consistency are moving down the stack: from application to application infrastructure, and now down to infrastructure. With advances in software-defined infrastructure (SDI), the trifecta of load-balancing, clustering and replication can be applied to multiple parts of the stack.

Networking

Medium has a very well-written article on the different layers of software-powered networking, from software-defined networking (SDN) to container networking. With the ever-widening adoption of the Container Network Interface (CNI), containerized applications can have a more consistent approach to network connectivity. For example, with Cisco Application Centric Infrastructure as a robust SDN platform, coupled with a service mesh, enterprises have a consistent and recreatable way of discovering and participating in services. AppDynamics can provide insight into this increasingly complex networking landscape as well.

Storage

Not long ago, storage in the cloud world was viewed as non-ephemeral. But as offerings, practices and architectures have begun to shift for some cloud storage products, there’s now a delineation between ephemeral and non-ephemeral storage. Although one of the pillars of a twelve-factor application is to run stateless processes, some sort of state needs to be written somewhere, and a popular place is to disk. Advances in software-defined storage (SDS), with projects such as Ceph and Gluster, provide object and file storage capabilities, respectively. Similar to the delineation of SDN and CNI, there is SDS and Container Storage Interface (CSI). For example, Portworx, a popular cloud-native storage vendor, coupled with commodity cloud or on-premises storage, allows for greater portability and storage consistency from the infrastructure to the container/application level.

One More ‘y’ Term

A successful Cloud Native implementation requires another key component: observability. Without proper visibility into the stack, it’s nearly impossible to know when and how to react to an ephemeral infrastructure action to maintain idempotency.

Idempotency + Ephemerality + Observability = Cloud Native

Despite the inherent challenges with observability, insight into the system is crucial. Relating changes in ephemeral infrastructure to overall sentiment and KPIs can be a challenge as well. With AppDynamics, it’s much easier to validate and advance your investment in the software-defined world.

AppDynamics provides insight on KPIs for a cloud migration/infrastructure change.

 

AppDynamics delivers deep insights into containers running across an enterprise infrastructure.

A Look to the Future

Every month it seems like a new project is accepted into the Cloud Native Computing Foundation, which is very exciting. As enterprises march toward infrastructure nirvana, where organizations can recreate robust and consistent infrastructure in an ephemeral world, Cloud Native computing will be an important part of the equation. With the power to create cloud computing almost anywhere, it’s important not to lose sight of non-functional requirements such as security. Cisco’s Tetration Platform, which addresses security and operational challenges for a multicloud data center, can protect hybrid cloud workloads holistically.

Look to AppDynamics for help with navigating the Cloud Native world!

The AppD Approach: Leveraging Docker Store Images with Built-In AppDynamics

In my previous blog post, we explored some of the best and worst practices of Docker, taking a hands-on approach to refactoring an application, always with containers and monitoring in mind. In that project, we chose to use physical agents from the AppDynamics download site as our monitoring method. But this time we are going to take things one step further: using images from the Docker Store to improve the same application.

Modern applications are very complex, of course, and we will show the three most common ways to use AppDynamics Docker Store Images to monitor your app, all while adhering to Docker best practices. We will continue to use this repo and move between the “master” and “docker-store-images” branches. If you haven’t read my previous post, I recommend doing so first, as we will build on the source code used there.

First Things First: The Image

Over at the AppDynamics page on the Docker Store (login required), we have three types of images for Java applications, each with our agents on them. In this project, we will work solely with the Machine Agent and Java images but, in principle, the scenarios and implementations are language-agnostic. The images are based on OpenJDK, Tomcat, and Jetty. Since our application uses Tomcat, we will use that image (store/appdynamics/java:4.3.7.1_tomcat9-jre8).

You can see how every image is versioned with (store/appdynamics/<language>:<agent-version>_<server-runtime environment>).
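To show how the naming scheme breaks down, here is a small Python sketch that splits an image reference into its parts. The function and field names are hypothetical; the parsing assumes only the pattern described above, where the suffix after the underscore is absent for the plain agent image.

```python
from typing import NamedTuple, Optional


class AgentImage(NamedTuple):
    language: str
    agent_version: str
    runtime: Optional[str]  # None when the tag has no server/runtime suffix


def parse_agent_image(reference: str) -> AgentImage:
    """Split '<repo>/<language>:<agent-version>[_<runtime>]' into its parts."""
    repository, _, tag = reference.partition(":")
    if not tag:
        raise ValueError(f"no tag in image reference: {reference!r}")
    language = repository.rsplit("/", 1)[-1]
    agent_version, _, runtime = tag.partition("_")
    return AgentImage(language, agent_version, runtime or None)


image = parse_agent_image("store/appdynamics/java:4.3.7.1_tomcat9-jre8")
print(image.language, image.agent_version, image.runtime)
# java 4.3.7.1 tomcat9-jre8
```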

By inspecting the image (docker inspect store/appdynamics/java:4.3.7.1_tomcat9-jre8), we are able to verify important environment variables, including the fact that we’re running Java 8. We’re also able to identify the JAVA_HOME variable, which will play an important role (more on this below). Furthermore, we can verify that Tomcat is installed with the correct versions and paths. Lastly, we notice the command to start the agent is simply the catalina.sh run command. On startup, the image runs Tomcat and the agent. (This is important to note as we dive deeper.)

Finally, if you plan to use a third-party image in your production application, the image must be trusted. This means it must not modify any other containers at runtime. Having the OS-level agent force itself into a containerized app, and intrusively modify code at runtime, defeats one of the main advantages of containerization: the ability to run the container anywhere without worrying about how your code executes. Always keep this in mind when evaluating container monitoring software. (AppDynamics’ images pass this test, by the way.)
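The output of docker inspect is a JSON array, so the checks described above can also be scripted. The Python sketch below pulls out the environment variables and the start command. The function name is hypothetical, and the SAMPLE text is a trimmed, made-up stand-in for real output: its values are assumptions based on the paths mentioned in this post, not a capture from the actual image.

```python
import json


def summarize_inspect(inspect_json: str) -> dict:
    """Extract environment variables and the start command from inspect output."""
    config = json.loads(inspect_json)[0]["Config"]  # inspect prints a JSON array
    env = {}
    for entry in config.get("Env") or []:
        key, _, value = entry.partition("=")  # entries look like KEY=VALUE
        env[key] = value
    return {
        "env": env,
        "java_home": env.get("JAVA_HOME"),
        "cmd": config.get("Cmd") or [],
    }


# Hypothetical, trimmed-down sample for illustration; real output is far longer.
SAMPLE = """
[{"Config": {
    "Env": ["JAVA_HOME=/docker-java-home/jre", "CATALINA_HOME=/usr/local/tomcat"],
    "Cmd": ["catalina.sh", "run"]
}}]
"""

summary = summarize_inspect(SAMPLE)
print(summary["java_home"])
print(summary["cmd"])
```

To run it against a real image, the JSON text could come from subprocess.run(["docker", "inspect", image_name], capture_output=True, text=True, check=True).stdout, assuming the Docker CLI is installed and the image has been pulled.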

Here are the three most practical migrations you’re likely to face:

Scenario 1: A Perfect World

This is the best-case scenario, but also the least practical. In a perfect world, we’d be able to pull the image as a top layer, pass in only environment variables, and see our application discovered within minutes. However, in our situation, we can’t do this because we have a custom startup script that we want to run when our container starts. In this example, we’ve chosen to use Dockerize (https://github.com/jwilder/dockerize) to simplify the process of converting our application to use Docker, but of course there are many situations where you might need some custom start logic in your containers. If you’re not using Dockerize in your script, simply pull the image and pass in the environment variables that name the individual components. Since the agents run on startup, this method will be seamless.

Scenario 2: Install Tomcat

Ideally, we’d like to make as few changes as possible. The problem here is that we have a unique startup script that needs to run when each project is started. In this scenario, your workaround is to use the agent image that doesn’t have Tomcat—in other words, use store/appdynamics/java:4.3.7.1 and install Tomcat on the image. With this approach, you remove the overlapping start commands and top-level agent. The downside is reinstalling Tomcat on every image rebuild.

Scenario 3: Refactor Start Script

Here’s the most common scenario when migrating from a physical agent, and the one we found ourselves in. A specific run script brings up all of your applications. Refactoring your apps to pull from the image and start your application would be too time-consuming, and would ask too much of the customer. The solution: Combine the two start scripts.

In our scenario, we had a directory responsible for the server, and another responsible for downloading and installing the agents. Since we were using Tomcat, we decided to leverage the image that already has both Tomcat and our monitoring software installed (store/appdynamics/java:4.3.7.1_tomcat9-jre8). (We went with the official Tomcat image because it’s the one used by the AppDynamics image.)

In our startup script, AD-Capital-Docker/ADCapital-Tomcat/startup.sh, we used Dockerize to spin up all the services. You’ll notice that we added a couple of environment variables, ${APPD_JAVAAGENT} and ${APPD_PROPERTIES}, to each start command. In our existing version, these changes enable the script to see if AppD properties are set and, if so, to start the application agent.

The next step was to refactor the startup script to use our new image. (To get the agent start command, simply pull the image, run the container, and run ps -ef at the command line.)

Since Java was installed to a different location, we had to put its path in our start command, replacing “java” with “/docker-java-home/jre/bin/java”. This approach allowed us to ensure that our application was using the Java provided from the image.

Next, we needed to make sure we were starting the services using Tomcat, and with the start command from the AppDynamics agent image. By using the command from above, we were able to replace our Catalina startup:

-cp ${CATALINA_HOME}/bin/bootstrap.jar:${CATALINA_HOME}/bin/tomcat-juli.jar org.apache.catalina.startup.Bootstrap

…with the agent startup:

-Djava.util.logging.config.file=/usr/local/tomcat/conf/logging.properties
-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager
-Djdk.tls.ephemeralDHKeySize=2048
-Djava.protocol.handler.pkgs=org.apache.catalina.webresources
-javaagent:/opt/appdynamics/javaagent.jar -classpath /usr/local/tomcat/bin/bootstrap.jar:/usr/local/tomcat/bin/tomcat-juli.jar
-Dcatalina.base=/usr/local/tomcat -Dcatalina.home=/usr/local/tomcat
-Djava.io.tmpdir=/usr/local/tomcat/temp org.apache.catalina.startup.Bootstrap start

If you look closely, though, not all of the services were using Tomcat on startup. The last two services simply needed to start the agent. By reusing the same environment variable (APPD_JAVA_AGENT) and redefining it as the path of the agent jar, we covered those services as well. And with that, we had our new startup script.
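The actual startup.sh is a shell script and is not reproduced here. As a rough model of the logic it ends up with, the Python sketch below assembles the Tomcat start command from the pieces shown above. The function names are hypothetical, and it assumes APPD_JAVA_AGENT holds the path to the agent jar, with the agent flag left out when the variable is not set.

```python
from typing import List, Mapping

JAVA_BIN = "/docker-java-home/jre/bin/java"  # the Java shipped in the image
TOMCAT_HOME = "/usr/local/tomcat"


def agent_args(env: Mapping[str, str]) -> List[str]:
    """Return the -javaagent flag only when an agent jar path is configured."""
    agent_jar = env.get("APPD_JAVA_AGENT")
    return [f"-javaagent:{agent_jar}"] if agent_jar else []


def tomcat_command(env: Mapping[str, str]) -> List[str]:
    """Build the argument list that starts Tomcat, with the agent if configured."""
    classpath = f"{TOMCAT_HOME}/bin/bootstrap.jar:{TOMCAT_HOME}/bin/tomcat-juli.jar"
    return [
        JAVA_BIN,
        f"-Djava.util.logging.config.file={TOMCAT_HOME}/conf/logging.properties",
        "-Djava.util.logging.manager=org.apache.juli.ClassLoaderLogManager",
        "-Djdk.tls.ephemeralDHKeySize=2048",
        "-Djava.protocol.handler.pkgs=org.apache.catalina.webresources",
        *agent_args(env),
        "-classpath",
        classpath,
        f"-Dcatalina.base={TOMCAT_HOME}",
        f"-Dcatalina.home={TOMCAT_HOME}",
        f"-Djava.io.tmpdir={TOMCAT_HOME}/temp",
        "org.apache.catalina.startup.Bootstrap",
        "start",
    ]


command = tomcat_command({"APPD_JAVA_AGENT": "/opt/appdynamics/javaagent.jar"})
print(" ".join(command))
```

Services that do not run Tomcat could reuse agent_args on its own and append it to their existing java command, which mirrors how the last two services were handled.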

[startup.sh (BEFORE)]

[startup.sh (AFTER)]

Not only did this approach allow us to get rid of our AppDynamics directory, it also enabled a seamless transition to monitoring via Docker images.

Attaining Nirvana: The Four Levels of Cloud Maturity

Cloud adoption is atop every CIO’s priority list, and for good reason. Technology stacks are advancing at lightning speeds. Application architectures of the past decade are aging fast and being replaced with modern, public and private cloud-based ones. But while cloud adoption is inevitable, the vast majority of organizations are still searching for an effective application migration strategy, notes Gartner.

If you feel like you are falling behind the competition in your cloud journey, there’s no need to panic. A structured and comprehensive migration model, combined with a smart investment strategy, will go a long way toward ensuring success. Our cloud maturity model is based on insights we’ve gleaned from hundreds of conversations with CIOs about their cloud adoption strategies, as well as numerous customer migrations we’ve supported successfully. We’ve identified common patterns—or “maturity levels”—in the adoption process. By understanding this maturity model, you may find it easier to develop your own cloud strategy.

Below we describe four levels of cloud maturity. It is important to note that progression does not require adopting every level along the way. Some organizations skip levels, jumping from Level 1 to Level 3, for instance, and bypassing Level 2 altogether. Not all organizations need to end up at Level 4, and most will have different applications in their portfolio at different levels of maturity at the same time. For example, since customer-facing apps that generate revenue need to be the most agile and responsive, it makes sense to migrate them to Level 3 or Level 4, which are optimized for rapid application delivery and scale. On the other hand, older apps in maintenance mode can be kept at Level 1 or Level 2 without incurring too much additional investment. Also, keep in mind that companies are adopting a DevOps operational philosophy to accomplish these transformational tasks (more on this below).

Hybrid apps, where parts of the application continue to run on-premises due to immovable mainframe systems or data gravity—Dave McCrory’s concept where data becomes so large it’s nearly impossible to move—are a reality as well.

Level 1: Traditional Data Center Apps

Traditional data center apps run in classic virtual machines (such as VMware) or on bare metal in a private data center, and are typically monitored by the IT Ops team. There are pros and cons to these architectures and deployments, of course. Advantages include total control over your hardware, software, and data; less reliance on external factors such as internet connectivity; and possibly a lower total cost of ownership (TCO) when deployed at scale. Disadvantages include large upfront costs that often require capital expenditure (CapEx), as well as maintenance responsibilities. They also result in longer implementation times that begin with hardware and software procurement before any code is written. Given these drawbacks, it’s quite likely that by 2020 a corporate “no cloud” policy will be extremely rare.

Level 2: Lifted and Shifted Apps

Given the cloud’s promise of elasticity and agility, nearly every IT organization has embarked on a cloud adoption journey. Those who haven’t actually started migrating their production workloads are definitely prototyping or experimenting in this area. However, cloud migration seldom runs smoothly. Oftentimes an organization will take the same virtual machines it was running on-premises and “lift and shift” them to the cloud. The target environment is typically a public cloud provider such as Amazon AWS, Microsoft Azure, or Google Cloud, or at times a private data center configured as a private cloud.

A sound migration strategy? Not really. Despite the expected cost savings of this approach, a company often finds it’s far more expensive to run its applications in the cloud in the same way it did on-premises. The large VM configurations that these applications were architected for are very expensive in the cloud. In other cases, the underlying infrastructure lacks the support it had on-premises. For example, some application servers that rely on multicasting capabilities to communicate with each other no longer work in the cloud when purely lifted and shifted. These shortcomings demand an application refactoring.

Lastly—and perhaps most importantly—these traditional data center applications were written with certain hardware reliability assumptions that may no longer be valid in the cloud. On-premises hardware is typically custom built and designed for higher reliability, whereas the cloud is built on a tenet that hardware is cheap and unreliable, while the software layer delivers application resiliency and robustness. This is another reason why application refactoring becomes necessary.

Level 3: Refactored Apps

Once an organization realizes that some of its lifted-and-shifted applications are fundamentally hamstrung in the cloud, it needs to refactor its apps. Several modern platforms that are purpose-built for running apps in cloud environments are a perfect choice for refactored programs. These modern platforms generally include application platform-as-a-service (PaaS) technologies or cloud native services. The typical PaaS technologies include Pivotal Cloud Foundry, Red Hat OpenShift, AWS Elastic Beanstalk, Azure PaaS, and Google App Engine. The cloud native offerings include hundreds of managed services offered by AWS, Azure and Google Cloud, such as database, Kubernetes, and messaging services, to name a few.

In a traditional data center apps environment, the IT organization is responsible for deploying, managing, and scaling the application code. By comparison, these modern platforms automate tasks and abstract out a lot of complexities; applications become elastic, dynamically consuming computing resources to meet varying workloads. The modern approach frees the IT organization from maintaining something that has no intellectual property benefit to its business, allowing it to focus on business problems as opposed to infrastructure issues.

Level 3 is where organizations begin to see signs of the cloud’s true potential. This is also where companies can dabble in container technology, and start to build the organizational muscle to run apps in modern architectures. However, not all is perfect in this world; organizations need to use caution when adopting some cloud-native services, which can result in vendor lock-in and poorer application portability, and portability is a top priority for some companies.

And while it may appear the cloud journey is now complete, obstacles remain. Refactored apps are often the same code from monolithic apps, only broken down into more manageable components. These smaller components can now scale automatically using PaaS or cloud-native services, but their code was not originally architected for truly modular, stateless deployments that would make them linearly scalable. Yes, a car may run faster after getting a custom performance chip upgrade, but in order to negotiate race-track turns at high speeds, it needs to be built from the ground up for high performance.

Level 4: Microservices—the Nirvana State

Microservices, as the name suggests, is an architecture in which the application is a collection of small, modular and independently deployable services that scale on demand to meet application workload requirements. These services are usually architected either using container and orchestration technologies (Docker and Kubernetes, or AWS, Azure, and Google container services) or serverless computing (AWS Lambda, Azure Functions, or Google Cloud Functions).

This level is regarded as the “Nirvana State” because applications built on microservices architectures are ultra-scalable and fault tolerant, ensuring an ultra-responsive and uninterrupted end-user experience of the application. Organizations can distribute smaller services to nimble DevOps teams, which run independently and enjoy the freedom to innovate faster, bringing new features to market in less time.

It requires a big commitment, though: a company must either rearchitect its applications from the core, or write all new applications with the microservices approach.

While these granular services offer many benefits, and the individual services are simpler to manage, combining them into broader business applications can get complicated, particularly from a monitoring and troubleshooting standpoint.

Where DevOps Fits In

Although this maturity model explores cloud adoption through a technology lens, organizational evolution is absolutely critical to the success of the adoption journey. Moving from siloed Dev and IT organizations to a DevOps model is essential for achieving all the benefits of the higher levels of maturity.

A DevOps team generally consists of 8 to 10 people, including developers and site reliability engineers (SREs). Each super-agile team is responsible for only a few services, not the entire application. Teams take great pride in the responsiveness of their services, as well as the speed at which they deliver new functionality. The key tenet here is: You build it, you own it, you run it! That’s what makes DevOps different from the traditional Dev and Ops split.

Nirvana Takes Time—and Hard Work

Cloud adoption is a journey where the adoption of microservices on cloud platforms (public or private) can lead to greater agility, significant cost savings, and superior elasticity for organizations. The road may seem treacherous at times, but don’t get discouraged! Everyone is on the same path. The global IT sector is poised for another year of explosive growth—5.0 percent, CompTIA forecasts—and is embracing fast-paced innovation. Taking a considered approach and adopting DevOps practices is the fastest way to achieving the Nirvana State.

Learn more about how AppDynamics can help you on the path to cloud maturity.

Good Migrations: Five Steps to Successful Cloud Migration

Unless you’ve been living in a cave for the last decade, you’ve seen cloud computing spread like fire across every industry. You also probably know that the cloud plays a pivotal role when it comes to digital transformation. Whether “the app is the business” is a well-worn subject or not, it doesn’t change the fact that companies are spinning up their apps faster because they can scale, test, and optimize them in real production environments in the cloud. But, like any technology, the cloud isn’t perfect. If it isn’t configured specifically to your application and business needs, you can find yourself dealing with performance issues, unhappy users, and one splitting headache.

On the flip side, developing a successful cloud migration strategy can be a rewarding experience that reverberates across your enterprise. The fact that 48 of the Fortune 50 Corporations have announced plans to adopt the cloud or have done so already speaks to the fact that the cloud isn’t just good for business. It’s a must-have, basic requirement. Like Wi-Fi. And a laptop.

In our new eBook, Good Migrations: Five Steps to Successful Cloud Migration, we focus on the right steps in migration, and also the varying ways enterprises can take those steps. Simply put, no two enterprises — nor their apps — are alike. So there’s no single one-size-fits-all cloud migration plan or solution that works for every enterprise. As for company happy hours? Those seem to work pretty well for everyone.

Here are just a few of the valuable points covered:

Why Migrate at All

Every company has different reasons for migrating to the cloud. So, you do need to ask yourself a few questions about why you want (or need) to move to the cloud. It’s critical that you focus on precisely what your app and your business need. Define what cloud environment fits your objectives. Determine how you’ll make the move. Plan for every phase: before, during, and yes, after you migrate.

Where Your App Lives Impacts How it Lives

New environments and IT configurations come with new rules. So you’ll scrutinize your app through a different lens after migration. You have to understand every system that connects to and interacts with it. You’re not reinventing the wheel, but you will likely need to modify it. Yes, it’s a time investment, but it’s one that will ultimately save you time in the long run.

The Phases of Cloud Migration

Migration is a process that rolls out in phases. Here’s a summary of them:

  1. Choose your providers: Choose one or several, depending on your needs.
  2. Assign responsibilities to the providers: Who does what? Who is in control of the app?
  3. Adjust the internal configuration: Manage IT expectations and apps that aren’t migrated.
  4. Get users on board: If your team isn’t comfortable with the technology and aligned with the change, the migration won’t make a difference.

Rising Currency in the Cloud

Migrating an app or two isn’t going to deliver the ROI the cloud is capable of. You need to go big or go home (that is, if you determine that the cloud is appropriate for your enterprise at all). When you implement a cloud migration on a large scale across your enterprise, you can create considerable value. To measure that value you need to set clear goals that can be measured across a variety of operations. In time, you’ll be able to calculate how much time and money your company saves, spends, and earns in the cloud.

Migration is a Journey, Not a Destination

Your apps are never truly complete. You’re always improving them. The same is true with cloud migration. Priorities change. Business inevitably changes. Users change. Like your apps and your business in general, you’ll continue to evaluate your cloud environment, making sure your teams are in alignment, along with tweaking networks and devices.

Learn More

To learn more so you can make informed decisions, be sure to download the eBook Good Migrations: Five Steps to Successful Cloud Migration.