Website optimization


Many people think of web performance as optimizing a website for search engines (SEO). That’s a real part of it, but on today’s web there’s another kind of website optimization, and that’s the one I’m going to talk about today. So what is web performance?

Web performance is about speeding up websites by minimizing the number of bytes and requests sent to the user. It’s about stripping away all the data that isn’t needed and that the end user doesn’t care about.

Who am I?

My name is Lucas. I’ve lived in Denmark all my life, but decided to move to the Netherlands because I got the most awesome job: Performance Engineer, which basically means optimizing the websites we have. We currently serve more than 190 million unique visitors a month from all around the world, but we only have one datacenter, in the Netherlands, to do it. So how do we do it? That’s what I’m going to talk about in this post.

The basics

You don’t need to know much to get started with optimizing websites. Plenty of tools today can give you insight into how well your website performs in different areas. Google PageSpeed is the best known, together with Yahoo YSlow. GTmetrix is another way to check performance, and working through its recommendations can do a lot to speed up your website.

JavaScript & CSS

One example of website optimization is minifying JavaScript and CSS files, which removes all the whitespace the code doesn’t need. Whitespace doesn’t sound like something that takes a lot of bandwidth, but it does. One of the best examples is jQuery, one of the best-known JavaScript libraries on the internet. Uncompressed and unminified, the file is around 247 kilobytes. That is actually quite a lot to send over the web, because even on a high-speed connection the effective bandwidth you get for a transfer like this is much lower than the advertised speed. Latency and connection setup dominate, so whether you have a 5 Mbit or a 100 Mbit connection, it won’t go much faster.
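
To make the whitespace point concrete, here is a minimal sketch of a CSS minifier in Python. It is deliberately naive (the `minify_css` name and the sample stylesheet are made up for illustration) and it would mangle CSS that has meaningful spaces inside strings or selectors, so treat it as a demonstration of the idea rather than a replacement for a real minifier.

```python
import re

def minify_css(css: str) -> str:
    """Naive CSS minifier: drops comments and unneeded whitespace."""
    css = re.sub(r"/\*.*?\*/", "", css, flags=re.DOTALL)  # remove /* comments */
    css = re.sub(r"\s+", " ", css)                         # collapse whitespace runs
    css = re.sub(r"\s*([{};:,])\s*", r"\1", css)           # no spaces around punctuation
    return css.strip()

# Hypothetical sample stylesheet.
source = """
/* Main navigation */
.nav {
    margin: 0 auto;
    color: #333;
}
"""

small = minify_css(source)
print(small)
print(len(source), "->", len(small), "characters")
```

Minifying JavaScript safely takes more than regular expressions, because the minifier has to understand the language, so a dedicated tool is the better choice there.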

Images

Another example is optimizing images, which can be one of the biggest wins in web performance. Images contain a lot of data. In fact, an image file is just a sequence of bytes that together make up the image, much like a JavaScript or CSS file, and it can be compressed and stripped down in a similar way.

An image file contains the data for the image itself, the part the user sees, and this data is very important. But it can also contain information about where the image came from, which camera was used, which software was used to edit it, and things like color profiles that describe how the colors in the image should be shown. Most of this data isn’t needed to show the image on a web page, so what can we do about it?
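
As a rough illustration of stripping that extra data, here is a sketch that removes descriptive chunks from a PNG file using only Python’s standard library. The function names and the choice of which chunks count as metadata are my own assumptions. The sketch is conservative on purpose: it leaves color-profile chunks alone, since removing those can change how colors are rendered.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
# Ancillary chunks that only describe the image (text, timestamp, EXIF).
METADATA_CHUNKS = {b"tEXt", b"zTXt", b"iTXt", b"tIME", b"eXIf"}

def iter_chunks(png: bytes):
    """Yield (chunk_type, chunk_data) for every chunk in a PNG byte string."""
    if png[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    pos = 8
    while pos < len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        chunk_type = png[pos + 4:pos + 8]
        data = png[pos + 8:pos + 8 + length]
        yield chunk_type, data
        pos += 12 + length  # 4 length + 4 type + data + 4 CRC

def make_chunk(chunk_type: bytes, data: bytes) -> bytes:
    """Serialize one chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(chunk_type + data) & 0xFFFFFFFF
    return struct.pack(">I", len(data)) + chunk_type + data + struct.pack(">I", crc)

def strip_metadata(png: bytes) -> bytes:
    """Return a copy of the PNG without the descriptive metadata chunks."""
    kept = [PNG_SIGNATURE]
    for chunk_type, data in iter_chunks(png):
        if chunk_type not in METADATA_CHUNKS:
            kept.append(make_chunk(chunk_type, data))
    return b"".join(kept)
```

To use it, read a file’s bytes, pass them through `strip_metadata`, and write the result back out. The pixel data is untouched, so the image looks exactly the same.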

JPEG is a lossy file type, which means that every time you re-save the image, more of its quality is destroyed. But it also means you can decide which quality it should have. 100% is the best-looking image you can get, that is, 100% of its current quality (remember, it’s lossy). Optimizing images to between 60% and 85% quality normally strips out a lot of data. Often more than 80% of the file size can be saved just by setting the quality to 80%, which actually isn’t bad: you can rarely tell when you simply look at the image. If you look closely you’ll see it’s a little blurry around the edges between different colors, but that hardly matters if your site loads 4 times faster and could make you more money.

PNG is a lossless file type, which means you can’t set the quality of the image. That isn’t the whole story, though, because there are three common types of PNG: PNG32, PNG24 and PNG8. PNG32 is full color including transparency, PNG24 is full color without transparency, and PNG8 is limited to a maximum of 256 colors but can contain transparency.
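
These three labels map onto two fields in the PNG header: the bit depth and the color type. As a small sketch (the `png_kind` helper is a name I made up), this is how you could check which kind of PNG you have:

```python
import struct

def png_kind(png: bytes) -> str:
    """Classify a PNG as PNG8, PNG24 or PNG32 from its IHDR chunk."""
    if png[:8] != b"\x89PNG\r\n\x1a\n" or png[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    # IHDR data starts at byte 16: width (4), height (4), bit depth (1), color type (1).
    width, height, bit_depth, color_type = struct.unpack(">IIBB", png[16:26])
    if color_type == 3:                      # palette-based, at most 256 colors
        return "PNG8"
    if color_type == 2 and bit_depth == 8:   # truecolor, no alpha channel
        return "PNG24"
    if color_type == 6 and bit_depth == 8:   # truecolor with alpha channel
        return "PNG32"
    return "other"                           # grayscale or 16-bit variants
```

Only the first 26 bytes of the file are needed, so this is cheap to run over a whole directory of images to find full-color PNGs that might be candidates for conversion.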

But you can do more than choose between those three types. It’s often said that PNG images can’t be optimized at all, but that’s not quite right. PNG image data is compressed with DEFLATE, the same algorithm gzip uses, and encoders typically default to compression level 6, which is about the middle of the compression scale. So you can recompress PNG images at the highest compression setting, but this isn’t recommended. Take a 1-megabyte PNG image and run the most exhaustive compression algorithms on it, and it can take anywhere from 5 to 20 minutes depending on the data in the image. No one wants to wait that long.
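
The trade-off between compression level, size and time can be sketched with Python’s `zlib` module, which implements DEFLATE. The pixel data below is a synthetic gradient I generated as a stand-in, not a real image, and a real PNG encoder also filters each row before compressing, so the numbers only show the general trend.

```python
import time
import zlib

# Synthetic stand-in for raw pixel data: 512 rows of 512 RGB pixels.
row = bytes((x * 3 + c) % 256 for x in range(512) for c in range(3))
pixels = b"".join(bytes((b + y) % 256 for b in row) for y in range(512))

def deflate_size(data: bytes, level: int) -> int:
    """Size in bytes of the data after DEFLATE at the given level (0-9)."""
    return len(zlib.compress(data, level))

for level in (1, 6, 9):
    start = time.perf_counter()
    size = deflate_size(pixels, level)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"level {level}: {len(pixels)} -> {size} bytes in {elapsed_ms:.1f} ms")
```

Dedicated PNG optimizers go much further than level 9 by trying many filter and compression combinations, which is where the long running times come from.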

That’s why another method should be used. We could use the compressor from 7-Zip but keep the output in the standard gzip-compatible format. This is a much faster process, and it often compresses better than most other approaches. “Faster” here means maybe 2 or 3 minutes, which is fast for this kind of optimization, and you can save over half the size of your image by doing it.

Another way to optimize PNG images is to convert them to JPEG if they don’t contain any transparency, and then apply the JPEG optimization described above to the newly converted image.

GIF is an old format and actually takes up a lot of space. An easy way to optimize GIF images is to convert them to PNG8, since both formats are limited to 256 colors. The process is fast and can save a lot of bytes.

Setting caching headers, gzip headers and Vary headers

Some easy optimizations for the web are the three headers above: caching, gzip and Vary. This is generally something you configure on your server or VPS, but what is it good for?

Setting caching headers means sending headers that the end user’s browser will read and understand. These headers tell the browser how long it may keep each file type in its cache before fetching it again. Many websites today don’t set these headers, but they’re really important. When a user requests a website, the browser downloads a bunch of resources, and if no caching headers are set, these resources will often be downloaded again on the next visit or page view. Setting caching headers will very often prevent this, so the user may only download your files once a week or so. That saves the visitor a good amount of load time and bandwidth, and it also saves money on your hosting bill.
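
Normally you would set this in your web server configuration, but the logic is easy to sketch in Python. The lifetimes and the `cache_headers` helper below are assumptions for illustration, not recommendations for every site.

```python
from pathlib import PurePosixPath

# Hypothetical cache lifetimes per file type, in seconds.
MAX_AGE_BY_EXTENSION = {
    ".css": 7 * 24 * 3600,    # one week
    ".js": 7 * 24 * 3600,     # one week
    ".png": 30 * 24 * 3600,   # thirty days
    ".jpg": 30 * 24 * 3600,   # thirty days
    ".html": 0,               # always revalidate pages
}

def cache_headers(path: str) -> dict:
    """Pick a Cache-Control header based on the file extension of the path."""
    extension = PurePosixPath(path).suffix.lower()
    max_age = MAX_AGE_BY_EXTENSION.get(extension)
    if max_age is None:
        return {}  # unknown type: leave caching to the server defaults
    if max_age == 0:
        return {"Cache-Control": "no-cache"}
    return {"Cache-Control": f"public, max-age={max_age}"}

print(cache_headers("/static/site.css"))
print(cache_headers("/index.html"))
```

With a one-week lifetime on stylesheets and scripts, a returning visitor’s browser can reuse its cached copies instead of downloading them again on every page view.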

Setting gzip headers is pretty simple: you let the server send files compressed with gzip if the browser supports it. This works well for all text files and for SVG images, but it won’t help with image formats such as JPEG and PNG, which are already compressed. Enabling gzip can save up to 80% of the data sent to the visitor.
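
Here is a small sketch of that negotiation in Python. The `maybe_gzip` function is a made-up name, and the check is simplified: it only looks for `gzip` in the request’s Accept-Encoding header and ignores quality values such as `gzip;q=0`.

```python
import gzip

def maybe_gzip(body: bytes, accept_encoding: str):
    """Return (body, headers), gzip-compressed only if the client accepts it."""
    offered = [item.split(";")[0].strip().lower() for item in accept_encoding.split(",")]
    headers = {"Vary": "Accept-Encoding"}
    if "gzip" in offered:
        headers["Content-Encoding"] = "gzip"
        return gzip.compress(body), headers
    return body, headers

# Hypothetical, very repetitive HTML body.
html = b"<li class='item'>hello</li>\n" * 200
compressed, response_headers = maybe_gzip(html, "gzip, deflate")
print(len(html), "->", len(compressed), "bytes", response_headers)
```

Repetitive text like HTML, CSS and JavaScript compresses very well, which is why the savings on text files are so large.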

Setting Vary headers sounds a little more technical, but it’s actually basic: all that needs to be sent is Vary: Accept-Encoding. Understanding why you should set it is pretty important, though. People often seem to forget these headers because they find them unnecessary. The Vary header exists mainly because of caching proxies. A caching proxy caches content from the server and stores it under a hash key generated from the request. The next time a visitor requests the page, the caching proxy checks whether the requested hash is already in its own memory or on disk before going to the web server.

You can send different Vary headers from the server, but the most generic one is Vary: Accept-Encoding. All it does is make the browser’s Accept-Encoding header part of the hash, which means that Chrome, Maxthon, Firefox and Safari users will often get the same cached content. If the Vary header also matches on User-Agent, which differs by operating system, browser and browser version, the cache hit rate will be much lower than if you only look at the Accept-Encoding header.
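
The effect on the cache hit rate can be sketched with a toy cache key in Python. This is not how any particular proxy implements it; the `cache_key` function and the header values are invented for illustration.

```python
def cache_key(url: str, request_headers: dict, vary: str) -> tuple:
    """Build a cache key from the URL plus the request headers named in Vary."""
    lowered = {name.lower(): value for name, value in request_headers.items()}
    names = sorted(part.strip().lower() for part in vary.split(",") if part.strip())
    return (url,) + tuple((name, lowered.get(name, "")) for name in names)

# Two hypothetical browsers that accept the same encodings.
chrome = {"Accept-Encoding": "gzip, deflate", "User-Agent": "Chrome/120"}
firefox = {"Accept-Encoding": "gzip, deflate", "User-Agent": "Firefox/121"}
url = "/static/site.css"

same = cache_key(url, chrome, "Accept-Encoding") == cache_key(url, firefox, "Accept-Encoding")
split = cache_key(url, chrome, "Accept-Encoding, User-Agent") == cache_key(url, firefox, "Accept-Encoding, User-Agent")
print("shared entry with Vary: Accept-Encoding:", same)
print("shared entry with Vary: Accept-Encoding, User-Agent:", split)
```

With only Accept-Encoding in the key, both browsers share one cached copy. Adding User-Agent gives every browser version its own entry, so the proxy has to go back to the web server far more often.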


6 replies to this post
    • Varnish is a good caching proxy indeed, but in this specific use case it doesn’t fit our needs.

      I use it for all my personal projects. And I can see a lot of wins with it.

      I even have a low TTL on the cache itself, around 5 minutes, but in 5 minutes it’s possible to get a lot of requests.

      A small test I made was to benchmark an application using JMeter with 1000 concurrent users, downloading all resources and so on. Using pure PHP, I could handle 70 requests per second, which is OK, but the CPU was almost dying.

      Enabling Varnish Cache made the application handle 600 requests per second at a CPU utilization of 10%. The reason I couldn’t hit more than 600 requests per second was in fact that the network connection from the computer I tested from was way too slow.

      So yes, Varnish Cache is good, but there are also other great alternatives!