Return To Sender — Black Friday Brings New Digital Challenges

Return to sender, dress is a no. Not my style — even with a discount code

Welcome to the busiest time of the year for retailers

This week marks one of the busiest sales periods for retail, with Black Friday a central focus for consumers and the media alike. During the holiday period, it’s estimated that most traditional retail brands will make between 70% and 90% of total revenue in-store. But if you’re like me, you can’t think of many things worse than being stuck in a queue this Friday, and it seems that many other consumers agree. According to Adobe, last year Thanksgiving and Black Friday netted $4.45 billion in U.S. online purchases, with smartphones accounting for a record $1.5 billion of that amount. This year, it’s predicted that Black Friday will be bigger than ever: for the first time, sales are expected to exceed $3 billion, an increase of 11.5% over last year, with online and smartphone-based sales smashing all records.

There’s no doubt that the retail sector is at the forefront of digital transformation. Mobile is now an integral part of the shopping experience in areas ranging from fashion to sporting goods. The focus for retail this year has been on how to utilize digital technologies to personalize omni-channel customer experiences in order to ensure that products, offers, and content remain relevant.

So, all’s good in retail in regards to digital transformation then?

The stats above are compelling, and many retailers will have a bumper holiday period this year, but all’s not rosy. Why is this the case? The clue is in my “digitally transformed” Elvis quote at the start of the post. You see, online and mobile channels have made shopping seamless and convenient, with flexible delivery options, which means that you don’t have to wait very long for your bargains to arrive.

But that’s not all. In the search for differentiation and customer loyalty, many digital retailers have now shifted focus to providing seamless and convenient returns. For example, Zappos, an online shoe and clothing shop and a subsidiary of Amazon, has led the way with a 365-day returns policy and free two-way shipping. Today, if you don’t like a product you’ve bought, it’s simple: just return it. Even better, if you can’t decide which new sweater to buy, just buy five and send back the four that you don’t want. Shoppers have never had it better.

But for retailers, this is where a new challenge caused by the digital transformation begins to emerge. According to the National Retail Federation, the total of U.S. annual returns in 2015 reached $260.5 billion, and in the UK, £600m of products bought over last year’s Black Friday and Cyber Monday entered into returns processes. The obvious immediate impact for retailers is on revenue, but costs also mount as returns are repackaged and redistributed to the correct distribution centers or stores, especially during the holidays. By then, many products are out of season and either cannot be sold or must be discounted even further, magnifying the impact on retail revenue and profit.

What’s the solution to the returns challenge?

According to Shorr, who conducted research into the returns challenge last year, returns are increased by poor product portrayal or fit, especially in fashion. So, the first rule is for retailers to make sure that their online channels accurately portray size, colors, and other product attributes. Beyond this though, it’s about digital retailers making sure that they have full visibility of the customer journey. Visibility into all transactions up to the purchase is essential, whether online or in-store, followed by visibility through the entire returns process.

Today, this journey happens through interconnected applications, increasingly via mobile apps for the customer and via returns or supply chain apps for the employees managing the returns process. This means that the adoption of an intelligent application performance management solution that can provide end-to-end transaction performance visibility of these applications is critical, as any slowdown causes customer experience issues and impacts directly on revenue. On top of this, utilizing deep application analytics like Business iQ can help digital retailers to understand in real-time where products are being returned, by which customer, and at what value — helping them to improve and fine-tune their returns process for future holiday seasons.

So, while the world will be watching the revenues generated this Black Friday and Cyber Monday, I’ll be on the lookout for this year’s returns figures to see how retail is adapting to the new challenges that digital transformation brings. But I wish all retailers a great holiday season, and as for me, I can’t wait to bag a bargain this Friday!

Learn More

To learn more about the challenges that digital retailers face, please download our report, An App Is Not Enough.

Turning Black Friday Gold [Infographic]

Holiday season for retailers this year means a little more than just stocking online inventory and extending shopping hours. Today’s software-defined retailers need to rely on the few days of holiday shopping season to determine their business’ ultimate success. That requires a little more insight into the performance and uptime of online transactions, and the right solution to monitor and manage their applications as well.

We partnered with InternetRetailer.com to see exactly how digital retailers are preparing for events like Black Friday and Cyber Monday, and if those results determined a win or loss.

One finding we came across was extremely striking: “Those that had a site performance troubleshooting process in place were 92% more likely to meet or exceed their Black Friday revenue numbers”

Your customers no longer have to wait for your website to recover from downtime and blackouts, which makes performance monitoring and application intelligence even more necessary.

Interested in checking out the full report? Find it here!

Black Friday Preparations from AppDynamics

How Retailers Can Have a Successful Black Friday [INFOGRAPHIC]

Partnering with InternetRetailer.com, we wanted to see exactly how e-commerce sites prepared for Black Friday and Cyber Monday, and how those preparations led to the success (or lack thereof) during the busy holiday season.

One stat that jumps out at me is extremely telling:

“Those that had a site performance troubleshooting process in place were 92% more likely to meet or exceed their Black Friday revenue numbers”

When your business’ bottom line is directly tied to the success over a few days a year, uptime and optimal performance have never been more highly coveted. Your customers have too many options these days to be tolerant of poor performance and downtime. My colleague, Alex Fedotyev, stated, “Your business no longer runs on software, your business is software.” This quote rings especially true for the e-commerce industry.

Interested in checking out the full report? Find it here!


How Black Friday and Cyber Monday Were Saved with Performance and Monitoring Tools

Ah, the holidays. A great occasion to spend quality time with your family, eat a little too much, and indulge in the fantastic shopping deals. Online shopping during Black Friday and Cyber Monday accounts for a critical chunk of a retailer’s yearly revenue. Any downtime or abnormally slow performance will directly impact the bottom line of the business. So how should these e-commerce companies protect this increased income by preparing their sites and applications to handle the drastic increase in load?

We partnered with InternetRetailer.com to conduct a survey of senior retail executives to see how e-commerce sites prepared for the holiday season, and how these preparations were crucial in surpassing their revenue goals. We went into the survey looking for specific answers, such as: If these sites invested in performance monitoring tools, did they fare better through the holidays with less downtime? Were sites with a performance troubleshooting process in place more likely to exceed their goals? How has mobile e-commerce traffic grown year over year?

Download the full report here

The results were quite surprising. We knew performance mattered, as these e-commerce sites lose money with every minute of downtime. However, we didn’t realize exactly how crucial having monitoring tools and a troubleshooting process was.

Some interesting facts:

  • Retailers that had a site performance troubleshooting process were 92% more likely to have met or exceeded their revenue numbers
  • 70% of retailers who exceeded their revenue goals added new site performance and monitoring tools prior to the holidays
  • 24% reported their holiday mobile traffic increased by over 41% from the year before

The mobile piece is quite interesting. Though we know the mobile experience is a crucial stage in the buying cycle — notably for discovering and browsing — it still isn’t typically the final device used for the purchase stage. However, as the mobile traffic increases, the user experience still remains an absolutely vital cog in the purchase decision. In a previous study we conducted with the University of London, we found 86% of customers who have a bad first experience on a mobile app will never return to the app for a second chance.

The report highlights not only interesting facts derived from the survey, but also insight from some of the top e-commerce companies in the world. Charles Hunsinger, CIO at Harry and David, details their troubleshooting process: “We have several different tools in place, and they give us a lot of information to understand, if there is an issue, where it is coming from.”

On the mobile angle, Heather Dettmann, Digital Business Manager at Finish Line, states, “We know that customers will continue to explore and purchase products through a variety of touch-points, and we’ll continue to be everywhere that they are …. and deliver the best possible customer experience.”

Interested in checking out the full report? Click here for your FREE copy!

How Omni-Channel is Your E-Commerce Application Monitoring?

Like it or not, an ever-expanding holiday season is here, and the words moratorium and code freeze are being heard around the halls of many retailers who are gearing up for their Super Bowl of shopping. Just this time last year I was responsible for middleware performance of a leading online and mobile payment provider, and I remember the excitement, and stress, of heading into this time of year. With mobile (mCommerce) and ecommerce shopping trends skyrocketing, expectations and demand for the business and technology we worked so hard to build were extremely high.

This year alone, eMarketer predicts Business to Consumer (B2C) e-commerce sales will increase over 20%, to upwards of $1.5 trillion, and by 2017, that number will jump to $2.3 trillion. Mobile trends are predicted to show equally impressive growth; mCommerce in the US alone is estimated to grow from $52 billion to $86.8 billion between 2014 and 2016 (eMarketer). Moreover, with the growth comes the challenge, as IT organizations are pushed to understand, manage, and support heterogeneous customer access points and technology stacks in parallel.

With the competitive nature of e-commerce, managing and optimizing end-user experience has never been more important. Studies co-authored by Bing and Google (http://radar.oreilly.com/2009/06/bing-and-google-agree-slow-pag.html) show a direct correlation between page speed and revenue, determining that even a 500 ms delay can have a significant impact on almost any Key Performance Indicator (KPI).


AppDynamics is deeply involved in helping online, and offline, retailers guarantee service delivery during these critical times. While mobile is certainly the wave of the future, improving end-user experience does not start or stop at the browser and mobile device. The brick and mortar point of sale (POS) remains an essential element of many retailers’ omni-channel strategy.

Whether it is a custom solution or built on top of a 3rd party platform like Retalix, AppDynamics provides complete monitoring visibility, fault isolation, and diagnostics end-to-end across the entire POS system. Below is an example of a Retalix environment where AppDynamics is providing real-time, end-to-end application performance monitoring coverage, including: in-store registers, self-checkout systems, price scanners, backend services, integrations with external systems, and backend databases.

Using AppDynamics, we’ve enabled IT to monitor, map, and score 100% of the Retalix Point of Sale (POS) transactions in real-time, as well as get out-of-the-box visibility into the exact business functions being executed. If and when there is a problem, the operations team is equipped to take immediate action by:

  • Receiving a notification identifying what performance degradation is occurring, and which stores and business services are impacted; e.g., a response-time problem at critical severity affects all mid-west stores.
  • Identifying the business function impact: understanding the problem affects the item scan feature in the self-checkout aisles
  • Isolating precisely what components are degraded, by visualizing the end-to-end application topology; knowing the failure lies within the store or regional servers, a network link, shared backend services or databases.
  • Quickly diagnosing the issue down to a source line of code or SQL query.

    Consumer trends are driving the business to rapidly expand commerce over multiple channels, and as highlighted above, our goal is always to enable IT to follow suit with an equally comprehensive monitoring plan. Failure to do so, despite best intentions, leads to what we call the Frank-Monitor problem: a series of loosely coupled or decoupled point solutions that fail to integrate into an effective end-to-end capability.

    So when you are considering how best to define, initiate, or develop, your monitoring strategy, make sure to ask yourself just how omni are you, actually?

    Because despite the best efforts to stitch together disparate solutions into a cohesive understanding of system performance, it almost always ends up a science project!

    Want to give AppDynamics a try for your e-commerce platform? Download a FREE trial today!

    Top 10 Reasons Why eCommerce Apps Will Fail This Black Friday

    My wife is a shopaholic and serial checkout killer. Every week she spends several hours browsing and carefully adding items to shopping carts. Every now and then she’ll interrupt my sports program with an important announcement: “My checkout just failed.” Take this example Mrs Appman sent me during the month of September:

    Checkout Fail

    The fact I work in the APM industry means it is my responsibility to fix these problems immediately (for my wife). As Black Friday is coming up, I thought I’d share with you what typically goes wrong under the covers when our customers’ e-commerce applications go bang during a critical time.

    It is worth mentioning that nearly all our customers perform some form of performance or load testing prior to the Black Friday period. Many actually simulate the load from the previous year on test environments designed to reproduce Black Friday activity. While 75% of bottlenecks can be avoided in testing, unfortunately a few surface in production as a result of applications being too big and complex to manage to test under real world scenarios. For example, most e-commerce applications these days span more than 500 servers distributed across many networks and service providers.

    Here are the top 10 reasons why eCommerce applications will fail this Black Friday:

    1. Database Connection Pool

    Nearly every checkout transaction will interact with one or more databases. Connections to this resource are therefore sacred and can often be deadly when transaction concurrency is high. Most application servers come with default connection pool sizes of between 10 and 20. When you consider that transaction throughput for e-commerce applications can easily exceed 100,000 transactions per minute (trx/min), you soon realize that default pool configurations aren’t going to cut it. When a database connection pool becomes exhausted, incoming checkout requests simply wait until a connection becomes available, or time out. Take this screenshot for example:

    Connection Pool Issue
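
As a rough back-of-the-envelope sketch of why a default pool of 10 to 20 connections falls short, Little's law estimates the number of connections in use as the arrival rate multiplied by the time each request holds a connection. The 100,000 trx/min figure comes from the text above; the hold times, the function name, and the assumption of one connection per transaction are illustrative guesses, not measurements.

```python
import math

def connections_needed(trx_per_min: float, hold_time_ms: float) -> int:
    """Little's law: concurrent connections = arrival rate x hold time.

    Assumes each transaction holds exactly one connection for hold_time_ms
    (a simplification; real checkouts may use several).
    """
    per_second = trx_per_min / 60.0
    return math.ceil(per_second * hold_time_ms / 1000.0)

# Hypothetical hold times; measure your own before sizing a pool.
fast = connections_needed(100_000, hold_time_ms=10)   # quick, indexed queries
slow = connections_needed(100_000, hold_time_ms=200)  # slow queries
print(fast, slow)
```

Even with quick queries the estimate sits at the top of the default range, and a slowdown in query time multiplies the number of connections needed, which is how a pool that looked adequate in testing can run dry under peak load.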

    2. Missing Database Indexes

    This root cause is somewhat related to exhausted connection pools. Simply put, slow-running SQL statements hold onto a database connection for longer, so connections aren’t returned to the pool as often as they should be. The number 1 root cause of slow SQL statements is missing indexes on database tables, which is often caused by miscommunication between the developers who write SQL and the DBAs who configure and maintain the database schemas which hold the data. The classic symptom is the “full table scan” query execution, where a database operation must scan through all the data in a table before a result is returned. Here is an example of what this looks like in AppDynamics:

    Missing Index
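
The difference between a full table scan and an indexed lookup can be seen in a small sketch. It uses SQLite from Python's standard library purely as a stand-in; the `orders` table, its columns, and the index name are hypothetical, and a production database will report its query plans differently.

```python
import sqlite3

def query_plan(conn, sql, params=()):
    """Return the plan 'detail' strings SQLite reports for a query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return [row[-1] for row in rows]  # the last column holds the description

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 1000, i * 1.5) for i in range(10_000)],
)

lookup = "SELECT * FROM orders WHERE customer_id = ?"

plan_before = query_plan(conn, lookup, (42,))  # no index: scans the table
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")
plan_after = query_plan(conn, lookup, (42,))   # index present: searches it

print(plan_before)
print(plan_after)
```

Checking the plan for the handful of queries on the checkout path before the holidays is a cheap way to catch a missing index that developers and DBAs each assumed the other had covered.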

    3. Code Deadlock

    High transaction concurrency often means application server threads have to contend more for application resources and objects. Most e-commerce applications have some form of atomicity built in to their transactions, so that order and stock volumes are kept in check as thousands of users fight over special offers and low prices. If access to application resources is not properly managed, some threads can end up in deadlock, which can often cause an application server and all its user transactions to hang and time out. One example of this was last year, when an e-commerce customer was using a non-thread-safe cache. Three threads tried to perform a get, set and remove on the same cache at the same time, causing code deadlock to occur and impacting around 2,500 checkout transactions, as the below screenshot shows.

    deadlock
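
One common way to make a shared cache safe is to serialize every operation behind a single lock. The sketch below is a generic illustration in Python, not the fix the customer applied (the source does not say what that was); the `LockedCache` class and the `stock` key are hypothetical. Because only one lock exists and it is never held while waiting for another, the get, set and remove calls cannot deadlock against each other.

```python
import threading

class LockedCache:
    """A dict-backed cache whose operations are serialized by one lock."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def remove(self, key):
        with self._lock:
            self._data.pop(key, None)

    def increment(self, key, amount=1):
        # The read-modify-write runs under the lock, so no update is lost.
        with self._lock:
            self._data[key] = self._data.get(key, 0) + amount
            return self._data[key]

def hammer(cache, key, times):
    for _ in range(times):
        cache.increment(key)

cache = LockedCache()
workers = [
    threading.Thread(target=hammer, args=(cache, "stock", 1000)) for _ in range(8)
]
for worker in workers:
    worker.start()
for worker in workers:
    worker.join()
print(cache.get("stock"))
```

A single coarse lock trades some throughput for simplicity; under very high concurrency a purpose-built concurrent map is usually the better choice.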

    4. Socket Timeout Exceptions

    Server connectivity is an obvious root cause; if you check your server logs using a tool such as Sumo Logic or Splunk, you’ll probably see hundreds of these events. They represent network problems or routing failures where a checkout transaction is attempting to contact one or more servers in the application infrastructure. Most of the time the services you are connecting to aren’t your own, for example a shipping provider, credit card processor, or fraud detector. On high traffic days like Black Friday it isn’t just your site experiencing a surge in traffic; often entire networks are saturated due to intense demand. After a period of time (often 30-45 secs) the checkout transaction will just give up, time out, and return an error to the user. No Connectivity = No Revenue. Here is an example of what it looks like:

    socket timeout exception
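
Rather than letting a checkout hang for 30 to 45 seconds, one option is a short per-call timeout with a small, bounded number of retries. The following is a minimal sketch of that idea; `call_with_retries`, `FlakyProvider`, and the 2-second timeout are hypothetical names and values chosen for illustration, and the fake provider stands in for a real network call.

```python
import socket
import time

def call_with_retries(remote_call, per_try_timeout_s, max_tries, backoff_s=0.0):
    """Call remote_call(timeout) up to max_tries times.

    Returns (result, tries_used). Re-raises the last socket.timeout if every
    try times out, so the caller can fail the checkout quickly and clearly.
    """
    last_error = None
    for attempt in range(1, max_tries + 1):
        try:
            return remote_call(per_try_timeout_s), attempt
        except socket.timeout as exc:
            last_error = exc
            if attempt < max_tries:
                time.sleep(backoff_s)
    raise last_error

class FlakyProvider:
    """Hypothetical shipping provider that times out a few times, then answers."""

    def __init__(self, failures):
        self.failures = failures
        self.calls = 0

    def quote(self, timeout):
        self.calls += 1
        if self.calls <= self.failures:
            raise socket.timeout("no response within %.1fs" % timeout)
        return {"shipping": 4.99}

provider = FlakyProvider(failures=2)
result, tries = call_with_retries(provider.quote, per_try_timeout_s=2.0, max_tries=3)
print(result, tries)
```

With these illustrative settings the worst case is three tries of two seconds each, so the user gets an answer or a clear error in roughly six seconds instead of waiting out a long default timeout.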

    5. Garbage Collection

    Caches are an easy way to speed up applications. The closer data is to application logic (in memory), the faster it executes. It is therefore no surprise that as memory has gotten bigger and cheaper, most companies have adopted some form of in-memory caching to eliminate database access for frequently used results. The days of 64GB and 128GB heaps are now upon us, which means the impact of things like Garbage Collection is more deadly to end users. Maintaining cache data and efficiently creating/persisting user objects in memory becomes paramount for eliminating frequent garbage collection cycles. Just because you have GBs of memory to play with doesn’t mean you can be lazy in how you create, maintain and destroy objects. Here are a few screenshots that show how garbage collection can kill your e-commerce application:

    Garbage Collection
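
One way to avoid being lazy about object lifetimes is to cap how many entries a cache may hold and evict the least recently used ones. The sketch below shows the idea in Python as a language-neutral illustration; `BoundedCache` and the product SKUs are hypothetical, and heap and collector behavior on a JVM will differ from Python's.

```python
from collections import OrderedDict

class BoundedCache:
    """Least-recently-used cache with a hard cap on the number of entries."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._data = OrderedDict()

    def get(self, key, default=None):
        if key not in self._data:
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.max_entries:
            self._data.popitem(last=False)  # evict the least recently used

    def __len__(self):
        return len(self._data)

products = BoundedCache(max_entries=3)
for sku in ["A", "B", "C"]:
    products.put(sku, {"sku": sku})
products.get("A")                 # touch A, so B becomes the oldest entry
products.put("D", {"sku": "D"})   # cache is full, so B is evicted
print(len(products))
```

A cap like this keeps the cache's memory footprint predictable, so a traffic spike fills the cache to its limit rather than growing the heap without bound.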


    6. Transactions with High CPU Burn

    It’s no secret that inefficient application logic will require more CPU cycles than efficient logic. Unfortunately the number 1 solution to slow performance in the past was for eCommerce vendors to buy more servers. More servers = More Capacity = More Transaction Throughput. While this calculation sounds good, the reality is that not all e-commerce transactions are CPU bound. Adding more capacity just masks inefficient code in the short term, and can cost you significant amounts of money in the long term. If you have specific transactions in your eCommerce application that hog or burn CPU, then you might want to consider tuning those before you whip out your check book with Oracle or Dell. For example:

    High CPU Burn

    7. 3rd Party Web Services

    If your e-commerce application is built around a distributed SOA architecture then you’ll have multiple points of failure, especially if several of those services are provided by a 3rd party where you have no visibility. For example, most payment and credit card authorization services are provided by 3rd party vendors like PayPal, Stripe, or Braintree. If these services slow down or fail then it’s impossible for checkout transactions to complete. You therefore need to monitor these services religiously so when problems occur you can rapidly identify whether it is your code or connectivity or someone else’s outage. Here is an example of how AppDynamics can help you monitor your 3rd party web services:

    Transaction Flow


    8. Crap Recursive Code

    This is similar to #6 but burns time instead of resources. For example, many e-commerce transactions will request data from multiple sources (caches, databases, web services) at the same time. Every one of these round trips could be expensive and may involve network time along the way. I’ve seen a single eCommerce search transaction call the same database multiple times instead of performing a single operation using a stored procedure on the database. Recursive remote calls may only take 10-50 milliseconds each, but if they are invoked multiple times per transaction they can add seconds to your end user experience. For example, here is that search transaction that took x seconds and made 13,000 database calls.
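
The cost of one round trip per item versus one round trip per request can be sketched as follows. This uses SQLite from Python's standard library as a stand-in and a batched `IN (...)` query in place of the stored procedure mentioned above; the `products` table and the `CountingConnection` wrapper are hypothetical.

```python
import sqlite3

class CountingConnection:
    """Wraps a sqlite3 connection and counts statements sent through it."""

    def __init__(self, conn):
        self.conn = conn
        self.calls = 0

    def execute(self, sql, params=()):
        self.calls += 1
        return self.conn.execute(sql, params)

raw = sqlite3.connect(":memory:")
raw.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT)")
raw.executemany(
    "INSERT INTO products VALUES (?, ?)",
    [(i, "item-%d" % i) for i in range(1, 201)],
)

wanted = list(range(1, 101))

# Chatty version: one round trip per product.
chatty = CountingConnection(raw)
names_chatty = [
    chatty.execute("SELECT name FROM products WHERE id = ?", (pid,)).fetchone()[0]
    for pid in wanted
]

# Batched version: one round trip for the whole list.
batched = CountingConnection(raw)
placeholders = ",".join("?" for _ in wanted)
rows = batched.execute(
    "SELECT id, name FROM products WHERE id IN (%s) ORDER BY id" % placeholders,
    wanted,
).fetchall()
names_batched = [name for _, name in rows]

print(chatty.calls, batched.calls)
```

Both versions return the same data; only the number of round trips differs, and at 10-50 milliseconds per remote call that difference is what the end user feels.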


    9. Configuration Change

    As much as we’d like to think that production environments are “locked down” with change control process, they are not. Accidents happen, humans make mistakes and hotfixes occasionally get applied in a hurry at 2am 😉 Application server configuration can be sensitive just like networks, or any other pieces of the infrastructure. Being able to audit, report and compare configuration change across your application gives you instant intelligence that a change may have caused your eCommerce application to break. For example, AppDynamics can record any application server change and show you the time and values that were updated to help you correlate change with slowdowns and outages, see below screenshot.
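
The idea of comparing configuration across two points in time can be sketched in a few lines. This is only an illustration of the concept, not how AppDynamics records changes; the `diff_config` helper and the setting names are hypothetical.

```python
def diff_config(before, after):
    """Return {key: (old, new)} for every setting added, removed, or changed.

    A missing setting is reported as None on the side where it is absent.
    """
    missing = object()
    changes = {}
    for key in sorted(set(before) | set(after)):
        old = before.get(key, missing)
        new = after.get(key, missing)
        if old != new:
            changes[key] = (
                None if old is missing else old,
                None if new is missing else new,
            )
    return changes

# Hypothetical snapshots of application server settings.
yesterday = {"db.pool.max": 50, "http.threads": 200, "cache.enabled": True}
today = {"db.pool.max": 10, "http.threads": 200, "gc.policy": "parallel"}

changes = diff_config(yesterday, today)
for key, (old, new) in changes.items():
    print(key, old, "->", new)
```

Lining a diff like this up against the time a slowdown began is often the fastest way to confirm or rule out a 2am hotfix as the cause.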


    10. Out of Stock Exception

    “I’m sorry, the product you requested is no longer in stock”. This basically means you were too slow and you’ll need to wait until 2014 for the same offer. Remember to set an alarm next year for Black Friday 😉


    In addition, AppDynamics can also monitor the revenue and performance of your checkout transactions over time, which helps Dev and Ops teams monitor and alert on the health of the business:

     Correlating revenue and performance

    The good news is that AppDynamics Pro can identify all of the above defects in minutes. You can take a free trial here and be deployed in production in under 30 minutes! If you send us a few screenshots of your findings in production like the above we’ll send you a $250 Amazon gift certificate for your hard work!

    Steve.

    Why did your Checkout Fail? AppDynamics knows

    The reason for this blog is purely down to a real-life incident which one of our e-commerce customers shared with us this week. It’s based around a use case that pretty much anyone can relate to – the moment your checkout transaction spectacularly fails. You sit there, looking at a big fat error message and think “WTF – did my transaction complete or did the company steal my money?” A minute later you’re walking a support team through exactly what happened: “I just clicked Checkout and got an error…honestly…I waited and never got a response.”

    What’s different in this story is that the support team had access to AppDynamics as they were talking to a customer on the phone…and the customer got to find out the real reason their checkout failed. How often does that happen? Never, until now. Here is the story as documented by the customer.

    Black Friday and Cyber Monday thru the eyes of an APM solution

    A week has passed since Black Friday, so I thought it would be a good idea to summarise what we saw at AppDynamics from monitoring one of several e-commerce applications in production.

    Firstly, things went pretty well for our customers, who experienced between 300 and 500% increase in transaction volume over the holiday period on their applications. That’s a pretty big spike in traffic for any application, so it’s always good to look at those spikes and see what impact they had on application performance.

    Here’s a screenshot which shows the load (top) and response time (bottom) of a major e-commerce production application during the Thanksgiving period. The dotted line in both charts represents the dynamic baseline of normal activity. You can see on Black Friday (23rd) and Cyber Monday (26th) that transaction throughput was peaking between 24,000 and 31,000 tpm on the application, spiking between 150 and 200% over the normal load the application experiences throughout the rest of the year.

    Application response time during the period had one blip during the first minutes of Black Friday (9pm PST/Midnight EST), with no major performance issues following thru into Cyber Monday. The blip related to the web container thread pool becoming exhausted during peak load when the Black Friday promotions went live. Below you can see throughput was hitting 23,000 tpm.

    Two business transactions, “Product Display” and “Checkout”, were breaching their performance baselines during that period. Looking at the average response times of 516ms and 733ms tells one story; looking at the maximum response time and the number of slow/very slow transactions (calculated using SD, the standard deviation from the baseline) tells a completely different story.
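
To illustrate how a standard-deviation rule separates outliers from a healthy-looking average, here is a minimal sketch. The baseline samples and the thresholds of 3 and 4 standard deviations are assumptions chosen for illustration, not the product's actual settings.

```python
import statistics

def classify(response_ms, baseline_ms, slow_sd=3.0, very_slow_sd=4.0):
    """Label a response time by its distance from the baseline mean.

    slow_sd and very_slow_sd are illustrative defaults, not product settings.
    """
    mean = statistics.mean(baseline_ms)
    sd = statistics.pstdev(baseline_ms)
    if response_ms > mean + very_slow_sd * sd:
        return "very slow"
    if response_ms > mean + slow_sd * sd:
        return "slow"
    return "normal"

# Hypothetical baseline samples in milliseconds (mean 500).
baseline = [480, 500, 520, 510, 490, 530, 470, 500]
labels = [classify(ms, baseline) for ms in (516, 570, 66_000)]
print(labels)
```

An average close to the baseline can hide a long tail of requests that land far outside it, which is why counting slow and very slow transactions tells a different story from the mean.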

    Let’s take a look at the execution of one individual “Product Display” business transaction that was classified as very slow with a 66 second response time.

    When we drill into the code execution and SQL activity we can see a simple SELECT SQL query had a response time of 588ms. The problem in this transaction was that this query was invoked 102 times, resulting in a whopping 59.9 seconds of latency. It’s therefore no surprise that thread concurrency inside the JVM was high, waiting for transactions like these to complete. If these queries are simply pulling back product data, then there is no reason why a distributed cache can’t be used to store the data instead of making expensive calls to a remote database like DB2.
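
The arithmetic, and the effect a cache would have on it, can be worked through in a short sketch. The 102 calls and 588ms come from the text above; the in-process `ReadThroughCache` is a simplified stand-in for a distributed cache, and the assumption that the calls fetch only 6 distinct products is purely hypothetical.

```python
def total_latency_s(calls, per_call_ms):
    """Latency added by `calls` sequential calls of per_call_ms each."""
    return calls * per_call_ms / 1000.0

class ReadThroughCache:
    """Serves repeat lookups from memory; only misses reach the loader."""

    def __init__(self, loader):
        self.loader = loader
        self.loads = 0
        self._data = {}

    def get(self, key):
        if key not in self._data:
            self.loads += 1
            self._data[key] = self.loader(key)
        return self._data[key]

uncached = total_latency_s(102, 588)

# Hypothetical: the 102 calls ask for only 6 distinct products.
cache = ReadThroughCache(loader=lambda product_id: {"id": product_id})
for i in range(102):
    cache.get(i % 6)
cached = total_latency_s(cache.loads, 588)

print(uncached, cached)
```

Under that assumption only the first lookup of each product pays the 588ms database cost, so the database time for the transaction drops from about a minute to a few seconds.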

    Let’s look at the other “Checkout” transaction which was breaching during the performance spike. Here is a checkout which took 9.1 seconds and deviated significantly from its performance baseline. You can see from the screenshot below the latency or bottleneck is again coming from the DB2 database:

    Hardly surprising given most application scalability issues these days still relate to data persistence between the JVM and database. So let’s drill down into the JVM for this transaction and understand what exactly is being invoked in the DB2 database:

    Above is the code execution of that transaction and you immediately see 8.5 seconds of latency is spent in an EJB call which is performing an update. Let’s take a look at the invoked queries as part of that update:

    Nice: a simple update query was taking 8.4 seconds. Notice all the other SQL queries associated with a single execution of the “Checkout” transaction. The application during this performance spike was clearly database bound, and as a result a few code changes were made overnight that reduced the number of database calls the application was making. We had one retail e-commerce customer last year who found a similar bottleneck; a fix was applied that reduced the number of database calls per minute from 500,000 to a little under 150,000. While the problem may at first appear to be a database issue (for the DBA), it was actually application logic, and the developers were responsible for resolving it.

    You can see in the first screenshot that application response time was stable throughout the rest of the Thanksgiving period; no spikes or outages occurred for this customer and all was well. While every customer will do their best to catch performance defects in pre-production and test, sometimes it’s not possible to reproduce or simulate real application usage or patterns, especially in large scale high throughput production environments. This is where Application Performance Management (APM) solutions like AppDynamics can help, by monitoring your application in production so you can see what’s happening. Get started today with a free 30-day trial.

    Appman