CDN fault-tolerance

This is a post for web developers.

What do you do when your CDN fails to serve a resource to a visitor?

For a year or two, we've had a message at the top of our site that more or less said "If you're seeing this message, our stylesheet has failed to load. Try clearing your cache, wait a few minutes and try again, etc." Of course, you wouldn't see this message if our stylesheet had loaded, because the stylesheet has a rule to hide it.

That's a reasonable first step. At least we're communicating that we know something is wrong. But we still got customers emailing in a few times a month because the CDN just wouldn't work no matter what they tried. Maybe a CDN server in their location was having an outage; traffic should get routed around that, but failures still happen. Maybe the customer's DNS was failing. There are lots of possibilities, and they were almost always impossible to track down because eventually, within a day or two at the most, it would always start working for them again.

Clicky isn't usable without CSS (I know, I know...), and is slightly broken without javascript (e.g. graphs). This kind of problem is just annoying, so I finally resolved to fix it. As any page on our site is loading, we use a few inline javascript tests to see if our master javascript file and master CSS file have loaded. If we detect a failure for either of those, then we try to load the resource a second time, directly from our web server. The code looks like this:




As you can see, this goes right at the top of the BODY tag. There are two reasons we need to wait until the BODY tag. First, we are creating an element to test its visibility immediately. We have to wait until BODY to do that. Second, if there is a CSS failure, we need to inject a new CSS element into the HEAD tag, but we can't do that until we are closed out of the HEAD tag. So, we wait until BODY.

Here's how the code works.

First, we create an empty element, #cdnfailtest, which we'll be testing the visibility of. Then we bust into javascript for the testing.

We want to use jQuery to test if this faux element is actually visible (because jQuery makes that very simple), so before doing that, we check if our javascript file has loaded by testing for window.jQuery. If that test fails, then our minified javascript wasn't loaded, so we need to load it from the web server. Unfortunately, we have no choice but to use document.write() in order to guarantee the script file is loaded immediately.
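The jQuery check described above could be sketched roughly like this. It's an illustration under assumptions, not the exact code running on our site: the function name `ensureScriptLoaded` and the path `/js/master.js` are hypothetical, and `win` and `doc` stand in for `window` and `document` so the logic can be tried outside a browser.

```javascript
// Sketch only: request the script from the origin server when the CDN copy
// did not load. `fallbackSrc` is a hypothetical path, not our real one.
function ensureScriptLoaded(win, doc, fallbackSrc) {
  // The master javascript file is assumed to include jQuery, so a missing
  // window.jQuery is treated as "the file never arrived".
  if (win.jQuery) {
    return false; // already loaded, nothing to do
  }
  // Write a plain script tag into the document while it is being parsed.
  // "<\/script>" is escaped so an inline script block is not closed early.
  doc.write('<script src="' + fallbackSrc + '"><\/script>');
  return true; // a fallback load was requested
}

// In a page, inline near the top of <body>:
// ensureScriptLoaded(window, document, '/js/master.js');
```

The return value is only there to make the behavior easy to check; an inline version would not need it.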

I did notice during testing, however, that even with document.write(), the javascript file wasn't always immediately available. Whether or not it worked varied randomly from one refresh to the next. So we wrap the rest of the code within a setTimeout() call to delay it slightly (500 milliseconds).

After the 500 milliseconds, we start the CSS visibility test. We use jQuery to find the element #cdnfailtest and see if it's visible. (If jQuery still isn't loaded, we assume something is wrong and force load the CSS from the web server anyways, just to be safe.) If the element is visible, that means the CSS did not load, because otherwise its rules would hide the element.
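Here is a rough sketch of the delayed visibility test, written under assumptions rather than copied from our site. The function names and the `onFailure` callback are hypothetical, and `win` stands in for `window`. The only jQuery call used is `.is(':visible')`.

```javascript
// Sketch only: decide whether the stylesheet failed to load.
function cssFailedToLoad(win) {
  // If jQuery is still missing, assume something is wrong and report failure.
  if (!win.jQuery) {
    return true;
  }
  // The stylesheet hides #cdnfailtest, so a visible element means no CSS.
  return win.jQuery('#cdnfailtest').is(':visible');
}

// Sketch only: run the check after a delay (500 ms in this post) and call
// the hypothetical `onFailure` callback if the CSS looks like it is missing.
function checkCssAfterDelay(win, delayMs, onFailure) {
  win.setTimeout(function () {
    if (cssFailedToLoad(win)) {
      onFailure();
    }
  }, delayMs);
}

// In a page:
// checkCssAfterDelay(window, 500, function () { /* load CSS from web server */ });
```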

This is the part where we need the HEAD element to be already fully declared, so we can attach and inject the CSS file into it. That's what the rest of the code does. Since CSS is fine to load asynchronously, we use the more modern method of adding it to the DOM (which we can't use for the javascript file, because even when "async" is not specified, the browser still seems to load it that way, i.e. parsing continues while the file is loaded in the background). So the page may still look funny for a moment, but that should only last a second or two.
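The injection step might look something like the sketch below. Again, this is an approximation under assumptions: `injectStylesheet` and the path `/css/master.css` are hypothetical names, and `doc` stands in for `document`.

```javascript
// Sketch only: append a <link> for the fallback stylesheet to <head>.
// `href` is a hypothetical path on the web server, not our real one.
function injectStylesheet(doc, href) {
  var link = doc.createElement('link');
  link.rel = 'stylesheet';
  link.type = 'text/css';
  link.href = href;
  // HEAD is already closed by the time this runs from inside BODY.
  doc.getElementsByTagName('head')[0].appendChild(link);
  return link;
}

// In a page:
// injectStylesheet(document, '/css/master.css');
```

The browser fetches the stylesheet in the background once the element is attached, which is why the page can look unstyled for a moment.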

Anyways, this is our solution to this problem. If you have had to come up with your own solution, we'd be interested to see how yours works.
Jul 18 2013 5:04pm





Copyright © 2014, Roxr Software Ltd