Book Review: Refactoring HTML

Despite years of progress by web standards advocates, and a significant improvement in the quality of the HTML on the web, many of us still end up grappling with outmoded, broken HTML on a regular basis. When confronted with a large site filled with broken pages it can be hard to know where to start. Elliotte Rusty Harold’s Refactoring HTML offers a step by step recipe book for migrating such sites to clean, semantic code.

Harold’s is a well known name in the XML world, and that background shows through in how he approaches the book. While a general audience will probably find useful content, the reader needs to be prepared for a series of command-line and Java-based examples. Tools like tidy are featured prominently, as is the use of regular expressions to seek out broken code to fix and, in the music-to-my-ears category, automated testing.

If you’re equipped to do so, following these steps will lead to much cleaner, more manageable sites, but I found myself wondering how many of those comfortable with command line tools and regular expressions are in the market for a book like this.

In general I suspect the key audience for this will be IT departments inside large organisations tasked with refreshing or extending an intranet. For those developers, who maybe don’t spend much of their time working with HTML and like the idea of using scripting tools similar to those in their regular workflow, this book’s worth a look. If you’re already familiar with current trends in web development, then there are probably other ways of picking up on the scattering of techniques that might be new to you.

Disclaimer: I was sent a copy of this book for review by the publisher. You can find it at amazon US, amazon UK and all sorts of other places.