Keith Irwin

Loading Images Faster

Apr 5, 2016

Web Development


One of the biggest problems I've had with this site is how long large images take to load. That's because the browser downloads the whole image, even if CSS shrinks it on the page. The picture of me on the homepage, for example, is a 1225 x 1536 behemoth squeezed into half the page. On my 4" mobile phone, this image is displayed at a width of 150 pixels, but the full 1 MB photo is still downloaded. Thus the problem. 

I looked at several packages on npm, such as image-resizer and express-resizer, but they all seemed to work by resizing the image with ImageMagick on the fly. I doubt that resizing the image on every request would really improve my load times. Plus it would wear down my poor server. Plus my host does not have ImageMagick. 

Obviously, I need to save smaller versions (thumbnails) of all my images and serve the appropriate one for a given viewport. OK, let's get started! 

Creating the thumbnails

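I don't want to have to think about all that resizing, so I'll write a script to do it for me. Given a list of thumbnail widths, the script has to do a few things: create a .small folder next to the images if there isn't one, create any missing thumbnail that would be significantly smaller than its original, delete thumbnails whose fullsize image no longer exists, and delete thumbnails whose width is no longer in the list.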

The script must perform all these tasks so I can add or delete images and the thumbnails will update automatically. It is written in Python 3 and uses Pillow. The thumbnails are saved in a hidden .small folder, in the same directory as the original image. Each thumbnail is named after the original image with its width inserted before the extension, such as image.300.jpg for a thumbnail of image.jpg at a width of 300px. 

#!/usr/bin/python3
import sys, os, re
from PIL import Image

# List of resolutions, from biggest to smallest
resolutions = ['6000','4000','3000','2000','1500','900','500','300','100']

# Arguments
if len(sys.argv)>1:
	# Target directory
	folder = os.path.abspath(os.path.expanduser(sys.argv[1]))
	if not os.path.isdir(folder): sys.exit(folder+" is not a directory")
	# Verbosity
	verbose=quiet=False
	if len(sys.argv)>2:
		if sys.argv[2]=='-v': verbose=True
		elif sys.argv[2]=='-q': quiet=True
else: sys.exit("Please supply a directory")

for path,dirs,files in os.walk(folder):
	for filename in files:
		filepath = os.path.join(path,filename)
		if filename[-4:].lower() not in ['.jpg','.png','.bmp','.gif','.ico']:
			if not quiet: print("Warning: "+filepath+" not a JPG, PNG, BMP, GIF, or ICO.  Does it belong here?")
		else:
			
			# Check smaller images
			if os.path.basename(path) == '.small':
				split = filename.split('.')
				if len(split) < 3:
					if not quiet: print("WARNING: "+filepath+" does not have two .'s.  Is it fullsize?")
				else:
					if len(split) > 3:
						split = ['.'.join(split[:-2]), split[-2], split[-1]]

					# Ensure fullsize image exists or delete
					if not os.path.isfile(os.path.join(os.path.dirname(path), split[0]+'.'+split[2] )):
						if verbose: print(filepath+" has no fullsize version.  Deleting...")
						os.remove(filepath)

					# Ensure correct resolution or delete
					elif split[1] not in resolutions:
						if verbose: print(filepath+" not at correct resolution.  Deleting...")
						os.remove(filepath)

			# Check fullsize images
			else:
				if not os.path.isdir(os.path.join(path,'.small')):
					os.mkdir(os.path.join(path,'.small'))
				image = Image.open(filepath)
				for res in resolutions:
					resint=int(res)
					if image.width - resint > image.width/10:
						split = filename.split('.')
						if len(split) > 2:
							split = ['.'.join(split[:-1]), split[-1]]

						# Check for smaller resolution, or create
						smaller=os.path.join(path,'.small',split[0]+'.'+res+'.'+split[1])
						if not os.path.isfile(smaller):
							if verbose: print(smaller+" does not exist.  Resizing...")
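							# Resize a copy so every thumbnail is made from the original, not from the previous thumbnail
							thumb = image.copy()
							thumb.thumbnail((resint,99999))
							thumb.save(smaller)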

if not quiet: print("All done!")

The script takes a directory as its first argument and works on all image files in it and its subdirectories. Note that a thumbnail is only created if it's significantly narrower than the original image (by more than 10% of its width). That way, a 3000px thumbnail won't be generated for a 3005px image; that's just wasteful! 
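The width rule and the naming scheme can be sketched on their own, without touching any files. This is a minimal illustration rather than part of the script above: the names widths_to_generate and thumbnail_name are made up for this example, and it assumes the same list of widths and the same 10% cutoff.

```python
import os

# Same widths as the script above, as integers (assumed for this sketch)
RESOLUTIONS = [6000, 4000, 3000, 2000, 1500, 900, 500, 300, 100]

def widths_to_generate(original_width, resolutions=RESOLUTIONS):
    """Return the widths that are more than 10% narrower than the original."""
    return [w for w in resolutions if original_width - w > original_width / 10]

def thumbnail_name(filename, width):
    """Insert the width before the extension: image.jpg -> image.300.jpg."""
    stem, ext = os.path.splitext(filename)
    return "{}.{}{}".format(stem, width, ext)

if __name__ == "__main__":
    # A 3005px image skips the 3000px thumbnail and starts at 2000px
    print(widths_to_generate(3005))
    print(thumbnail_name("image.jpg", 300))
```

For the 1225px homepage photo, this rule would pick the 900, 500, 300 and 100 widths.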

Now, we can use a cron job to ensure the script runs every week, keeping our thumbnails up to date. The script takes an optional second argument, -q or -v, to make its output quiet or verbose. Don't forget to make it executable with chmod! 

@weekly ~/bin/gen_thumbs.py ~/path/to/img -q

I'll make sure that the server fails gracefully if the thumbnails are missing or out of date. 

The server

I'm using Express, and my static middleware looks like this:

app.use('/static', express.static(__dirname+'/static'));

I'm going to add another function to app.use() to run before express.static(). Here's some code I prepared earlier:

app.use('/static',
	function(req,res,next){ // resize images
		if (req.url.slice(0,5)=='/img/') {
			if (!req.query.width || isNaN(req.query.width)) { next(); }
			else {
				var path = req.url.slice(0,req.url.lastIndexOf('/')+1),
					filename = req.url.slice(path.length,req.url.lastIndexOf('.')),
					extension = req.url.slice(path.length+filename.length,req.url.indexOf('?'));
				fs.readdir(__dirname+'/static'+path+'.small',function(err,files){
					if (err) { if(prod){console.log(err);} next(); }
					else if (!files.length) { next(); }
					else {
						var images = files.filter(function(file){
							return file.slice(0,file.lastIndexOf('.',file.length-5))==filename;
						});
						if (!images.length) { next(); }
						else {
							var sizes = images.map(function(file){
								return parseInt(file.slice(file.lastIndexOf('.',file.length-5)+1,file.lastIndexOf('.')),10);
							}).sort(function(a,b){return a-b;});
							var queryWidth = parseInt(req.query.width,10);
							if (queryWidth>sizes[sizes.length-1]){ next(); }
							else {
								sizes.push(queryWidth);
								sizes.sort(function(a,b){return a-b;});
								var width = sizes[sizes.indexOf(queryWidth)+1];
								req.url = path+'.small/'+filename+'.'+width+extension;
								next();
							}
						}
					}
				});
			}
		} else { next(); }
	},
	express.static(__dirname+'/static'));

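Here we check whether an image is being requested with a URL query parameter named ‘width’. If so, the middleware looks in our .small folder for smaller versions and changes req.url so the built-in static middleware serves the smallest thumbnail that is at least that wide instead of the original. It looks messy, but it fails fast by calling next() in any of these cases: the width parameter is missing or not a number, the .small folder can't be read or is empty, there are no thumbnails for the requested image, or the requested width is larger than the largest thumbnail.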

In any of these cases, the full image will be returned, and our poor users will have to wait an extra second. 
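The selection step is easier to follow outside the nested callbacks. Below is a rough Python restatement of that logic, not the Express middleware itself; the function names available_widths and pick_thumbnail_width are invented for this sketch, and it assumes thumbnails are named as described earlier (for example me.300.jpg).

```python
def available_widths(stem, files):
    """Collect the thumbnail widths for `stem` from names like 'stem.300.jpg'."""
    widths = []
    for name in files:
        parts = name.rsplit(".", 2)  # e.g. ['me', '300', 'jpg']
        if len(parts) == 3 and parts[0] == stem and parts[1].isdigit():
            widths.append(int(parts[1]))
    return sorted(widths)

def pick_thumbnail_width(requested, widths):
    """Return the smallest width that is >= requested, or None to fall back."""
    candidates = [w for w in widths if w >= requested]
    return min(candidates) if candidates else None

if __name__ == "__main__":
    files = ["me.100.jpg", "me.300.jpg", "me.900.jpg", "other.300.jpg"]
    widths = available_widths("me", files)
    print(pick_thumbnail_width(150, widths))   # a 150px slot gets the 300px file
    print(pick_thumbnail_width(1000, widths))  # nothing big enough: use the original
```

Returning None here plays the role of calling next() without touching req.url, so the original image is served.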

The client

All that's left is serving the correct thumbnail for the job. I really don't want to rewrite every <img> tag on every page. So I added this script to the base template, which gets loaded on all pages:

<script>
	$(function() {
		$('img').attr('src',function(i,url){
			if (url.indexOf('//')==-1) {
				return url+'?width='+$(this).width();
			} else {
				return url;
			}
		});
	});
</script>

If you don't speak jQuery, this code simply finds all internal images (those whose URL doesn't contain //) and appends our width query to the source URL. The width is the rendered width of the image element. We can trust the server-side code to choose the correct thumbnail. 

The dilemma

Now all that's left is to test! Here's what happens when I load my homepage in Chrome:

A screenshot of a page load test in chrome development tools, showing the fullsize image loading for 2.19s after the smaller version loads in 662ms

Well, the good news is that the small thumbnail loads first. The bad news is that the fullsize image loads anyway! The browser starts fetching the original src before my script gets a chance to rewrite it. I've achieved my goal of loading pages faster, but somehow I'm not quite satisfied. Some browsers might even load the fullsize version first and defeat the purpose completely. I guess I just wasted an afternoon, right? 

Some Stack Overflow questions led me to believe that there's no way around this. 

Well, there is one way, but it will cause all images to fail if the user has JavaScript disabled. It's 2016, and JavaScript is 100% ubiquitous so long as you aren't wearing a tinfoil hat. Let's do it! 

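All we need to do is change the server-side code to refuse the request if there's no width parameter (I'm using status 301, though that is strictly a redirect code and a 4xx code would be the more conventional error). I also added a case that serves the fullsize image for ?width=full, or any other non-numeric value. 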

…

if (req.url.slice(0,5)=='/img/') {
	if (!req.query.width) { res.sendStatus(301); }
	else if (isNaN(req.query.width)) { next(); }
	else {

…

I also replaced all the next(); fallbacks with:

req.url = path+filename+extension+"?width=full";
next();
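Taken together, the revised rules boil down to three outcomes. Here is a rough Python restatement of that decision, not the Express code itself; the decide function and its return values are invented for illustration, and float() only approximates JavaScript's isNaN check.

```python
def decide(width_param, widths):
    """Map a ?width= value and the available thumbnail widths to an outcome."""
    if not width_param:
        # No width parameter: refuse the request (the post uses status 301)
        return ("status", 301)
    try:
        requested = float(width_param)
    except ValueError:
        # ?width=full or any other non-numeric value: serve the original
        return ("full", None)
    candidates = [w for w in widths if w >= requested]
    if not candidates:
        # No thumbnail is wide enough: serve the original
        return ("full", None)
    return ("thumbnail", min(candidates))

if __name__ == "__main__":
    widths = [100, 300, 900]
    for param in (None, "full", "150", "2000"):
        print(param, decide(param, widths))
```

Only requests with no width parameter at all are refused; everything else ends in either a thumbnail or the fullsize image.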

OK, let's see if that helps:

A screenshot of a page load test in chrome development tools, showing the fullsize image throwing a 301 error in 634ms and smaller version loading in 438ms

Much better! Sometimes the request for the fullsize image still takes a while to come back due to latency, but at least the page renders at top speed and the full megabyte image isn't being downloaded. Your data plan is saved! 

So everything works after all. Boris from GoldenEye knows how it feels. 

A GIF of Boris saying 'I am invincible!'
