Tales from the Field: Migrating from Log4J to Log4J2

The AppDynamics Java agent, like many Java applications, uses logging extensively. We have used Log4J as our logging framework for many years. And while the latest release of Log4J was in 2012, and the Apache Foundation announced end-of-life for Log4J in August 2015, we didn't upgrade to Log4J2 because we needed to maintain support for the Java 5 VM and had other competing priorities. However, our recent move from a monolithic repository to a product-specific repository has made the upgrade possible.

Log4J2 is full of enticing features. For example, the framework dramatically increases the speed of logging and reduces memory usage by providing garbage-free logging. Through native support for async logging, we can further reduce the already small amount of time we spend logging when running on customer applications. Since compression is also a native feature, our agent can tolerate more logging while simultaneously reducing file storage requirements. Together, these features allow us to add more frequent and better-quality logging that contains actionable information for our customer success and dev teams to help our customers.
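Both the async logging and the native compression mentioned above are driven purely by configuration. As an illustrative sketch (not our actual agent configuration), a log4j2.xml might look like this — a filePattern ending in .gz makes Log4J2 gzip rolled files, and AsyncRoot moves logging off the application thread:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<Configuration status="warn">
  <Appenders>
    <!-- The .gz suffix in filePattern enables native gzip compression on rollover -->
    <RollingFile name="AgentLog" fileName="logs/agent.log"
                 filePattern="logs/agent-%d{yyyy-MM-dd}-%i.log.gz">
      <PatternLayout pattern="%d{ISO8601} [%t] %-5level %logger{36} - %msg%n"/>
      <Policies>
        <SizeBasedTriggeringPolicy size="20 MB"/>
      </Policies>
      <DefaultRolloverStrategy max="5"/>
    </RollingFile>
  </Appenders>
  <Loggers>
    <!-- AsyncRoot hands events to a background thread, keeping logging off the hot path -->
    <AsyncRoot level="info">
      <AppenderRef ref="AgentLog"/>
    </AsyncRoot>
  </Loggers>
</Configuration>
```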

Migration Goals and Challenges

So, what did we want to accomplish with the migration? And what were the challenges we faced during the move?

Let’s start with challenges:

– We had to namespace the framework packages to isolate our use of Log4J2 from our customers' logging frameworks. We also needed to make the source Java 5 compatible, because standard Log4J2 requires Java 1.6 and above.

– Since almost every class uses logging, we had to find a way to make these changes incremental and (relatively) easy to review to maintain the high quality that’s necessary in a production monitoring agent.

– We had to be able to fall back to Log4J (which is proven to work) in case Log4J2 initialization fails.
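To make the first challenge concrete: the post describes programmatically refactoring packages, and one common way to achieve that kind of namespacing (a sketch of the general technique, not necessarily the tool we used) is the Maven Shade plugin's relocation feature, which rewrites package names inside the bundled jar:

```xml
<plugin>
  <groupId>org.apache.maven.plugins</groupId>
  <artifactId>maven-shade-plugin</artifactId>
  <executions>
    <execution>
      <phase>package</phase>
      <goals><goal>shade</goal></goals>
      <configuration>
        <relocations>
          <!-- Hypothetical target package; the real internal namespace differs -->
          <relocation>
            <pattern>org.apache.logging.log4j</pattern>
            <shadedPattern>com.example.agent.repackaged.log4j</shadedPattern>
          </relocation>
        </relocations>
      </configuration>
    </execution>
  </executions>
</plugin>
```

Relocation rewrites both the class files and the bytecode references to them, so the repackaged framework can never collide with a customer's own Log4J classes on the classpath.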

Our first goal was to repackage the jar with Java 5 compatible source. This step was easy: we programmatically refactored all the classes to namespace their packages, then manually fixed a few issues involving APIs that only Java 6 and above supports, such as String.isEmpty().
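Fixes like the String.isEmpty() one are mechanical. For illustration, a hypothetical compatibility helper (not our actual code) replaces the Java 6+ call with an equivalent check that compiles on Java 5:

```java
// Java 5-compatible replacement for the Java 6+ String.isEmpty() API.
public final class StringCompat {
    private StringCompat() {}

    // Equivalent to s.isEmpty(), but valid on Java 5.
    public static boolean isEmpty(String s) {
        return s.length() == 0;
    }

    public static void main(String[] args) {
        System.out.println(isEmpty(""));    // prints "true"
        System.out.println(isEmpty("log")); // prints "false"
    }
}
```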

The second step was to test the framework in a compatible environment. We used a Docker container with Java 5 installed and created a test application mirroring our agent structure. This step took time because we needed to figure out how configuration and customization were going to work with our agent. For example, one of our features is agent error safety: we silence the logs and remove instrumentation if our agent code experiences too many internal errors. Another feature is reusing node names: we buffer the logging events and only write them to file after we know the node name from our UI. Using the test application, we were able to mimic all of these features in preparation for the migration.
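The node-name buffering behavior can be sketched as follows — a deliberately simplified, hypothetical model of the real appender, just to show the hold-then-flush idea:

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch: buffer log events in memory until the node name is
// known, then flush them with the node name attached (hypothetical API).
public class BufferingLog {
    private final List<String> buffer = new ArrayList<String>();
    private final List<String> written = new ArrayList<String>();
    private String nodeName; // null until the UI assigns it

    public void log(String message) {
        if (nodeName == null) {
            buffer.add(message); // node name not known yet: hold the event
        } else {
            written.add(nodeName + ": " + message);
        }
    }

    // Called once the node name arrives from the UI; flush everything buffered.
    public void setNodeName(String name) {
        nodeName = name;
        for (String message : buffer) {
            written.add(nodeName + ": " + message);
        }
        buffer.clear();
    }

    public List<String> writtenLines() {
        return written;
    }
}
```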

To enable reversibility, we keep both frameworks present at the same time. We used the bridge pattern to extract logging into a separate shared package. This allows multiple logging frameworks to coexist in the code base, and we can easily switch between them at runtime. It also allows us to upgrade logging frameworks in the future, providing high flexibility and changeability. This step was significant because we had to change the build scripts and touch every single file that uses a logger.
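The bridge pattern described above can be sketched like this (hypothetical names; the real shared package is internal): agent code logs against an abstraction, and the concrete framework behind it can be selected at runtime, including falling back to the proven Log4J implementor if Log4J2 initialization fails.

```java
// Bridge pattern sketch: the agent depends only on AgentLogger, never on a
// concrete logging framework.
interface AgentLogger {
    void info(String message);
}

// One implementor per framework; trivial stand-ins for illustration.
class Log4jBackedLogger implements AgentLogger {
    public void info(String message) {
        System.out.println("[log4j] " + message);
    }
}

class Log4j2BackedLogger implements AgentLogger {
    public void info(String message) {
        System.out.println("[log4j2] " + message);
    }
}

public class LoggerFactory {
    // Select the implementor at runtime; fall back to Log4J when Log4J2
    // failed to initialize.
    public static AgentLogger getLogger(boolean log4j2Initialized) {
        return log4j2Initialized ? new Log4j2BackedLogger()
                                 : new Log4jBackedLogger();
    }

    public static void main(String[] args) {
        AgentLogger logger = getLogger(true);
        logger.info("agent started");
    }
}
```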

Lastly, we moved over the Log4J2 versions of the custom appenders we created in the second step, copied the configuration code over, and with that, we had successfully upgraded our logging framework!

Log4J2 Log Correlation Support in 4.4

While working with Log4J2, we also took the opportunity to add support for it to our log correlation feature.

Log Correlation enables the user to designate a spot in their log appender pattern for us to insert our business transaction (BT) request GUID at runtime. Any call made to the logger in the context of a BT will dynamically inject the GUID, regardless of whether the line ultimately ends up in a file or on the console. The presence of these GUIDs in the log output empowers log processing applications, including our own Log Analytics product, but also others such as Splunk. Using them, we can correlate the lines logged by an individual transaction to the snapshot data we collected about that request on the APM side, all without requiring any changes to the customer application. Conversely, it also enables users of our Controller to easily transition from a BT snapshot to the exact lines that occurred during that BT request in their logs.
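Mechanically, this is the same idea as an MDC/ThreadContext lookup: the appender pattern reserves a key, and the agent binds the current BT's GUID to the thread for the duration of the request. A minimal, framework-free sketch (the key and class names here are hypothetical, for illustration only):

```java
// Simplified sketch of MDC-style GUID injection: the agent binds a
// per-transaction GUID to the current thread, and the layout picks it up.
public class GuidContext {
    private static final ThreadLocal<String> GUID = new ThreadLocal<String>();

    public static void enterTransaction(String guid) { GUID.set(guid); }
    public static void exitTransaction()             { GUID.remove(); }

    // Roughly what a pattern such as "%X{requestGUID} %msg" would expand to.
    public static String format(String message) {
        String guid = GUID.get();
        return (guid == null ? "" : guid + " ") + message;
    }
}
```

Because the lookup is thread-local, lines logged outside any BT simply render without a GUID, which is why no change to the customer application is needed.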

In addition to the new support for Log4J2, the supported logging frameworks include Log4J, Logback, and SLF4J.

Final Thoughts

Working toward a product-wide upgrade is daunting at first. However, once it's broken up into small independent steps, it becomes much more manageable: it's harder to run a 10k than to run ten 1ks. The upgrade went smoothly because each step made changes to the product while keeping it functional and ready to ship, which also sped up build verification and code review.

To learn more, see our documentation on business transaction and log correlation.

Want to see how AppDynamics Log Analytics works? Start a free trial today.

Haojun Li is a co-author of this blog post. Haojun is a software engineer who has been with AppDynamics for approximately 5 months. He is a recent graduate of UC Berkeley with a degree in Computer Science and Statistics. He enjoys sailing and road biking on the weekend.

3 Reasons Why You Should be Using Smarter Log Analytics

Application complexity is increasing rapidly, and so is the need for, and dependence on, software to provide exceptional customer experience and business agility. Machine-generated data holds the key to providing insights as enterprises rush to fix customer issues, optimize applications, and add a plethora of new features on a regular basis. In a hyper-competitive, quickly evolving marketplace, enterprises must find a way to resolve issues quickly or risk becoming irrelevant.

In this blog, I'm going to discuss an alternative approach to traditional big data and log-only solutions. Rather than looking for a needle in a haystack, or struggling to find answers when the application code has changed and you no longer have the right logs to give you granular insights, why not use a solution that leverages the power of an application performance management solution and logs together to provide comprehensive IT operational intelligence?

Without further ado, here are the top 3 reasons to power your operational insights using the AppDynamics Log Analytics solution:

1. Comprehensive IT operational visibility with APM & logs

Modern enterprise applications are complex and generate tons of data. AppDynamics, the leading application intelligence company, auto-discovers and monitors end-to-end business transaction performance with transaction tag-and-follow. This approach identifies customer-impacting issues quickly and provides deep visibility into application health, making AppDynamics the Application Performance Management (APM) solution of choice for enterprises running applications in production with minimum overhead.

However, at times trouble goes beyond application code. APM tools are generally superior at pinpointing errors, and for the most part, IT and DevOps professionals use APM to drill down from end-user clients to application code-level details to isolate a performance issue, then dig into logs to get more insight and full context around it. Multiple vendors compete here, and precious time and resources are wasted switching from one tool to another while relying on time-based correlations to stitch the information together. Customers also have to deal with multiple dashboards built separately on different platforms.

Our unique approach brings the best of both worlds together to provide comprehensive IT operations intelligence in real time. Business transaction data from APM and log data as part of one platform gives the power back to IT and DevOps professionals, who can use the two together to complete the picture between application and infrastructure components.


2. Industry’s first auto-correlation between APM and Logs

But why stop there? Bringing the two data sets together without source code modification or the burden of owning, managing, and deploying multiple tools is a huge leap and of tremendous value to enterprise IT teams. The value proposition increases severalfold with the industry's first auto-correlation of transactions and log data. AppDynamics application agents automatically identify business transactions and tag, trace, and follow each transaction end-to-end with a unique identifier. Smart correlation inserts the same unique identifier related to a single business transaction and automatically ties it to all associated logs.

Leveraging the power of this intelligent correlation, IT and DevOps professionals can pinpoint errors, get full context and forensics behind the root cause, and ultimately achieve faster mean time to resolution. In a time when every customer interaction is an important one with business consequences, auto-correlation will quickly pinpoint that a particular "place order" business transaction took seven seconds because the messaging service queue was full and the request only succeeded on the seventh attempt. It eliminates the need to refine and correlate logs from different tiers, and it adds business context.

3. Unified Analytics Solution to power IT operational intelligence

Leverage the AppDynamics Unified Analytics platform for real-time insights into IT operations, customer experience, and business outcomes. AppDynamics Log Analytics is part of the Unified Analytics platform, enabling unprecedented, comprehensive operational visibility in real time in production.

Search logs and transaction data in one platform to connect the dots. Use the same platform to rapidly visualize and analyze large datasets and gain granular real-time insights. Build interactive custom dashboards and create metric alerts to act on information quickly. Set up automated reports to share learnings with managers and team members.

To summarize with two words: “Smart Logs.”

  • Smart logs to get real-time insights into your machine data
  • Smart logs to resolve application issues faster
  • Smart logs to gain comprehensive IT operational intelligence

In my next blog, I’ll write about how AppDynamics is using our Log Analytics solution to gain deep, comprehensive IT operations insights into our SaaS customer environment.