Wednesday, August 19, 2009

Site optimization - Leveraging Content Delivery Networks (CDN) for blazingly quick sites

OK, first off, there are a lot of different content delivery network providers around. I'm going to be using Amazon CloudFront in this post because I like their pricing structure ($0.17 per GB) and REST-based API.

The problem: You have a web site hosted in the US. Whenever you connect to the site, all the resources are downloaded from that single host. If you pay for cheap hosting, chances are your download speeds are deplorable. Add to that the fact that if you hit the site from outside the US, you get worse performance due to latency. How do you make your site faster without spending big?

We all know that slow sites drive visitors away. If they click a link to your site, you'd better try your damnedest to get your landing page up on their screen as fast as possible - before they hit 'back' and go to your competitor's site.

Content Delivery Networks (CDN) are set up exactly for this purpose - to get your content onto your customer's computer as quickly as possible.

CDNs basically replicate your content onto "edge locations": servers that sit on crazy fast backbone networks and are located on different continents around the world. When a customer requests some content, their request is routed to the geographically closest edge location, and the response is sent back. Ridiculously fast.

CloudFront is Amazon's implementation of this. To use it you'll also need to register for an S3 account. There are a few considerations that you need to be aware of before going down this path, so it's worthwhile reading on to get an overview.

A typical implementation will see you serve images, video, audio, Flash, whatever content from the CDN. Your actual HTML page, however, should be served from your normal web host (not mandatory, but easier this way). Whenever you reference a resource, e.g. an IMG such as:

<img src="/Images/Home.png" />

You just need to direct this to your CDN via:

<img src="http://yourdomain.cloudfront.net/Images/Home.png" />

That way, when the customer's browser requests the image, the route to the closest edge location is determined, and that edge location fulfils the request.
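
If you have a lot of references to change, a small helper in your templating code saves doing it by hand. Here's a minimal Python sketch of the idea. The `cdn_url` function and the `CDN_BASE` constant are made-up names for illustration, and the host is just the placeholder from the example above.

```python
# Placeholder distribution host, copied from the example above.
CDN_BASE = "http://yourdomain.cloudfront.net"


def cdn_url(path, base=CDN_BASE):
    """Turn a site-relative resource path into the matching CDN URL."""
    # Normalise the slashes so "/Images/Home.png" and "Images/Home.png"
    # both end up with exactly one slash after the host.
    return base.rstrip("/") + "/" + path.lstrip("/")


def img_tag(path):
    """Build an IMG tag whose src points at the CDN instead of the web host."""
    return '<img src="{}" />'.format(cdn_url(path))


if __name__ == "__main__":
    print(img_tag("/Images/Home.png"))
```

Keeping the host in one constant also means you can point everything back at your own server by changing a single line.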

Pretty easy so far.

To get your image onto the edge server, you need to pop it in the origin server, i.e. your S3 bucket. Once you've uploaded it, you create a new CloudFront distribution for that bucket so that Amazon knows the content you've uploaded is to be placed on their CDN.

To do this, it's best to check out the API documentation.

Price is always a consideration. You get charged for storage, and charged for transfers (not just from your edge location to the customer, but also from your S3 bucket to the edge location).
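
To get a feel for how those three charges add up, here's a rough Python sketch of a monthly estimate. Only the $0.17 per GB figure comes from this post, and I'm assuming it applies to transfer from the edge to the customer. The storage and bucket-to-edge rates in the example are made-up placeholders, not Amazon's prices, so plug in the real numbers from their pricing page.

```python
def monthly_cost(stored_gb, origin_to_edge_gb, edge_to_user_gb,
                 storage_rate, origin_to_edge_rate, edge_to_user_rate=0.17):
    """Estimate a monthly bill in dollars from the three charges.

    All rates are dollars per GB. Only the 0.17 default is taken from
    the post; the other rates must be supplied by the caller.
    """
    storage = stored_gb * storage_rate                  # sitting in the S3 bucket
    fill = origin_to_edge_gb * origin_to_edge_rate      # bucket -> edge location
    delivery = edge_to_user_gb * edge_to_user_rate      # edge location -> customer
    return storage + fill + delivery


if __name__ == "__main__":
    # The 0.15 and 0.10 rates below are placeholders for illustration only.
    total = monthly_cost(stored_gb=2, origin_to_edge_gb=1, edge_to_user_gb=10,
                         storage_rate=0.15, origin_to_edge_rate=0.10)
    print("Estimated monthly cost: ${:.2f}".format(total))
```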

By default, content will expire after being at an edge location for 24 hours. It is only copied to an edge location when that location receives a request for it. At that point the edge location quickly copies the resource from the bucket, and then serves it in response to the request.
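
The copy-on-first-request and 24 hour expiry behaviour is easier to see in code. This is a toy Python model of one edge location, not how CloudFront is actually built: the `EdgeLocation` class is a made-up name, a plain dict stands in for the S3 bucket, and the clock is injectable so the expiry can be tried out without waiting a day.

```python
import time

TTL_SECONDS = 24 * 60 * 60  # the default 24 hour expiry described above


class EdgeLocation:
    """Toy model of a single edge location that fills itself on demand."""

    def __init__(self, origin, ttl=TTL_SECONDS, clock=time.time):
        self.origin = origin        # dict standing in for the S3 bucket
        self.ttl = ttl
        self.clock = clock
        self.cache = {}             # key -> (content, time it was copied)
        self.origin_fetches = 0     # how often we had to go back to the bucket

    def get(self, key):
        now = self.clock()
        entry = self.cache.get(key)
        if entry is not None and now - entry[1] < self.ttl:
            return entry[0]         # still fresh: serve straight from the edge
        # First request, or the copy has expired: fetch from the bucket.
        content = self.origin[key]
        self.origin_fetches += 1
        self.cache[key] = (content, now)
        return content


if __name__ == "__main__":
    edge = EdgeLocation({"Images/Home.png": b"png bytes"})
    edge.get("Images/Home.png")     # copied from the bucket
    edge.get("Images/Home.png")     # served from the edge
    print("Trips to the bucket:", edge.origin_fetches)
```

The `origin_fetches` counter is the number to watch: every increment is a bucket-to-edge transfer you pay for.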

Interestingly, if you have big files (e.g. video), browsers tend to chunk the request and download it in parts. If the browser requests from byte 0, then the entire resource is copied to the edge location. If the browser requests some other arbitrary byte, then only that chunk of the file is copied.
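
Browsers ask for those parts with an HTTP `Range` header such as `bytes=1024-2047`. The Python sketch below just encodes the rule described above as I understand it. The function name is made up, it only handles a single range, and it says nothing about how CloudFront works internally.

```python
def bytes_copied_to_edge(range_header, size):
    """Return (first, last) byte offsets copied from the bucket to the edge.

    range_header is the request's Range value (or None if absent) and
    size is the length of the resource in bytes.
    """
    if not range_header:
        return (0, size - 1)                 # plain request: whole file
    unit, _, spec = range_header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is handled in this sketch")
    first, _, last = spec.strip().partition("-")
    if first == "":
        raise ValueError("suffix ranges like bytes=-500 are not handled here")
    start = int(first)
    end = min(int(last), size - 1) if last else size - 1
    if start == 0:
        return (0, size - 1)                 # starts at byte 0: whole file is copied
    return (start, end)                      # anything else: just that chunk


if __name__ == "__main__":
    print(bytes_copied_to_edge("bytes=0-1023", 5000))
    print(bytes_copied_to_edge("bytes=1024-2047", 5000))
```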

It's not all smooth sailing. If you have secure content that you want to distribute, e.g. user-pays downloads, then you're out of luck. CloudFront only serves content, and provides no authentication, authorisation or restriction of content. Furthermore it will strip all query parameters, so even if you're trying to be clever, you'll get tripped up here.
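
To see why the query parameter trick fails, here's a small Python sketch of what stripping the query string does to a URL. The download path and the `token` parameter are invented for the example. The point is that a "secret" URL and a guessed one collapse to the same object.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_query(url):
    """Return the URL with its query string removed, as described above."""
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", parts.fragment))


if __name__ == "__main__":
    # Hypothetical URLs: one with a valid-looking token, one with a guess.
    paid = "http://yourdomain.cloudfront.net/Downloads/file.zip?token=abc123"
    guess = "http://yourdomain.cloudfront.net/Downloads/file.zip?token=nope"
    # Both end up as the same plain URL, so the token cannot gate access.
    print(strip_query(paid) == strip_query(guess))
```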

In summary, if you're looking to increase the scalability and speed of your site, check out a CDN. Taking advantage of a globally distributed content network will let your users - wherever they're geographically located - hit your site quickly, which drives more throughput and more profit.
