Add cache busting hashes with Gulp

I recently restarted my series of blog posts about my new development process with Git and Gulp by talking about how I went about upgrading to Gulp 4. Now that we’ve upgraded, let’s get back to adding new time-saving, website-enhancing functionality.

This one is simple: cache busting hashes. If you’re not familiar, this is the term for adding a parameter to the URL of a file so that the browser sees it as a different file, even though it may not be. There are different techniques for doing this, such as incrementing version numbers or using the release date, but for me the easiest is simply a hash of the file contents. This means that the URL will only ever change when the contents change, which is perfect, and it is super easy to implement with Gulp.
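To illustrate the idea before bringing in any plugin, here is a minimal sketch using only Node's built-in crypto module. The helper name addCacheBustingHash is made up for this example, and it is not how gulp-hash-src is implemented internally; it only shows that the same contents always give the same URL, while changed contents give a new one.

```javascript
const crypto = require("crypto");

// Hypothetical helper (not part of any plugin): append an MD5 hash of the
// file contents to the URL as a query parameter.
function addCacheBustingHash(url, contents, queryName = "cbh") {
  const hash = crypto.createHash("md5").update(contents).digest("hex");
  // Use "&" if the URL already has a query string, otherwise start one.
  const separator = url.includes("?") ? "&" : "?";
  return url + separator + queryName + "=" + hash;
}

const v1Url = addCacheBustingHash("/js/script.js", "console.log('v1');");
const v1Again = addCacheBustingHash("/js/script.js", "console.log('v1');");
const v2Url = addCacheBustingHash("/js/script.js", "console.log('v2');");

console.log(v1Url === v1Again); // true: same contents, same URL
console.log(v1Url === v2Url);   // false: changed contents, new URL
```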

For this, I used the plugin gulp-hash-src. So the first step is to install it…

npm install --save-dev gulp-hash-src

…and then include it at the top of your Gulp file…

var hashsrc = require("gulp-hash-src");

Now that we’ve got that, we can start to call it. First I’ll give you the example I use, then I’ll talk you through the options…

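// Add cache busting hashes to the asset references in the built index.php
var index = function() {
  return gulp.src("build/index.php")
    // hash the references to JavaScript files
    .pipe(hashsrc({build_dir:"build",src_path:"js",exts:[".js"]}))
    // hash the references to stylesheets
    .pipe(hashsrc({build_dir:"build",src_path:"css",exts:[".css"]}))
    // write the updated file back to the build folder
    .pipe(gulp.dest("build"));
};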

In this example, I am processing the “index.php” file that has already been output into the build folder. I’m calling the gulp-hash-src plugin with the following options…

build_dir – this is where the assets are located, which is the “build” folder for me
src_path – this is where the assets came from, which I have split into “js” and “css” separately
exts – this is an array of file extensions to process, but I make one call for “.js” and one for “.css” because they come from different source folders

Other options available include…

hash – the hash type to use, but I’m happy with the default “md5”
hash_len – if you’re not happy with the length you can shorten it
enc – the character encoding, but I’m happy with the default “hex”
regex – a regular expression that can be used to limit the files which match and are processed, but I’m happy with the default
analyze – ties in with “regex” above
query_name – I’m happy with the default “cbh” but you can change if required
verbose – helpful for debugging, set to true if required
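If you did want to change some of those defaults, the options simply go into the same object that is passed to hashsrc(...). The snippet below is only a sketch of such an object: the option names are the ones listed above, but the values (an 8 character hash and a query name of "v") are made up for illustration, and I'm assuming hash_len takes a number of characters.

```javascript
// Example options object only; pass it to hashsrc(...) in the pipeline,
// e.g. .pipe(hashsrc(cssHashOptions))
const cssHashOptions = {
  build_dir: "build",  // where the built assets are located
  src_path: "css",     // where the assets came from
  exts: [".css"],      // only process stylesheet references
  hash_len: 8,         // assumption: shorten the hash to 8 characters
  query_name: "v",     // hypothetical replacement for the default "cbh"
  verbose: true        // helpful while checking what gets matched
};

console.log(cssHashOptions);
```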

What this will do is process your input file(s), find tags which match the regular expression, and then calculate a hash of each file’s contents to update the URL. For example…

<script src="/js/script.js"></script>

…becomes…

<script src="/js/script.js?cbh=3972257cb5f8fddb3d122ef4a4275bf9"></script>
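To make that transformation concrete, here is a rough sketch of the same kind of rewrite in plain Node. It is not the plugin's actual code: the function name hashAssetUrls, the simplified regular expression (double-quoted src and href values only) and the in-memory files object standing in for the build folder are all assumptions for the example.

```javascript
const crypto = require("crypto");

// Hypothetical stand-in for the build folder: URL path -> file contents.
const files = { "/js/script.js": "hello" };

// Hypothetical helper: add a content hash to every .js/.css reference
// whose contents we know about.
function hashAssetUrls(html, assets, queryName = "cbh") {
  // Match src="..." or href="..." values that end in .js or .css
  // and do not already carry a query string.
  const pattern = /\b(src|href)="([^"?]+\.(?:js|css))"/g;
  return html.replace(pattern, (match, attr, url) => {
    const contents = assets[url];
    if (contents === undefined) {
      return match; // unknown file: leave the tag untouched
    }
    const hash = crypto.createHash("md5").update(contents).digest("hex");
    return `${attr}="${url}?${queryName}=${hash}"`;
  });
}

console.log(hashAssetUrls('<script src="/js/script.js"></script>', files));
```

The real plugin reads the files from disk and has a more flexible regular expression, so treat this purely as an illustration of the principle.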

As you can see, the URL has been updated and will always match the file contents. As long as the file contents remain the same, so does the URL, but when the contents change, the URL will change. This means that you can put really long browser caching in place, which is excellent for performance for returning visitors, knowing that the visitor will automatically receive the new version of the file when it changes.
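How you set up that long caching depends on your server, so the following is only a sketch of the decision itself, written in JavaScript to match the rest of the post. The helper cacheControlFor is hypothetical, as is the choice of one year for hashed URLs and "no-cache" for everything else.

```javascript
// Hypothetical helper: choose a Cache-Control header value for a request URL.
function cacheControlFor(requestUrl, queryName = "cbh") {
  // The base URL is only needed so that relative paths can be parsed.
  const { searchParams } = new URL(requestUrl, "http://localhost");
  return searchParams.has(queryName)
    ? "public, max-age=31536000" // hashed URL: cache for one year (in seconds)
    : "no-cache";                // e.g. the page itself: always revalidate
}

console.log(cacheControlFor("/js/script.js?cbh=3972257cb5f8fddb3d122ef4a4275bf9"));
console.log(cacheControlFor("/index.php"));
```

The page that contains the hashed URLs needs to be revalidated, otherwise visitors would keep seeing the old hashes and never request the new files.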

You don’t have to limit this to JavaScript files and stylesheets; it could also be used on images, for example.