User Friction & Site Performance Blog | Blue Triangle

How to Optimize HTML to Boost Web Performance

Written by Kristina Ravensbergen | May 17, 2019 4:18:00 PM

Why optimize HTML?

HTML is the backbone of the internet. It is the document type that builds the structure of a website. Without HTML, JavaScript wouldn’t have a page to run in, CSS wouldn’t have anything to style, and images wouldn’t have a place to load. The power of HTML lies in its versatility, mainly because it can link to and load other files, which is what the “hypertext” in HTML refers to.

When HTML takes a long time to load, parse, and download external files, user experience can suffer. Page load times (Page Onload) grow longer, and the longer users have to wait, the more likely they are to abandon the page.

There are many ways you can optimize HTML to avoid these outcomes, including syntax-level optimizations that vary by browser. These change over time because HTML syntax itself changes, and different browsers adopt the updates at different rates.

In 2014, the W3C published HTML5 as a Recommendation. Since then, it has gone through two version updates. As of December 2017, HTML5.2 is the current standard, and version 5.3 is in the works.

I won't be covering those granular optimizations today. Instead, I want to give you a deeper understanding of:

  • How to make sure HTML gets delivered quickly regardless of the browser type, and
  • Which syntactical pieces affect modern HTML parsing the most.

I’m also going to explain why these optimizations are recommended, which will get a little technical, so I’ve included an overview at the beginning of each section, and then a TL;DR at the end.

Let’s get started.

 

Best practices for fast HTML delivery

  • Clean up HTML so it is concise
  • Compress HTML server-side
  • Use non-standard optimizations as needed

HTML gets delivered like any other file on the internet – over a network in data packets, which have limited room for data. Here’s what the process looks like:

  1. On a new connection, the server can send up to 10 TCP packets in the first roundtrip.
  2. The server waits for the client (i.e., browser) to acknowledge the data.
  3. If the server receives confirmation from the client that it received the data, the server will double how much data it sends for each successive trip.

10 TCP packets is equivalent to about 14.3KB. So if the HTML is larger than 14.3KB, it will take multiple roundtrips to deliver the base file. Ideally, you would be able to fit multiple files into that first roundtrip, like CSS delivered with server push, in order to complete the critical rendering path in a shorter amount of time.

Reducing the size of the HTML file helps reach this goal. There are two main ways to do so:

  • Clean up excess HTML code to shorten the file length.
  • Compress the HTML file so that a smaller file is delivered.
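As a rough worked example of the arithmetic above, assume each TCP packet carries about 1,460 bytes of payload (a common value on Ethernet links, but an assumption here rather than a guarantee) and that the server doubles what it sends on each acknowledged roundtrip:

```latex
\begin{align*}
\text{Roundtrip 1: } & 10 \times 1460\ \text{B} = 14{,}600\ \text{B} \approx 14.3\ \text{KB} \\
\text{Roundtrip 2: } & 20 \times 1460\ \text{B} = 29{,}200\ \text{B} \approx 28.5\ \text{KB} \quad (\text{cumulative} \approx 42.8\ \text{KB}) \\
\text{Roundtrip 3: } & 40 \times 1460\ \text{B} = 58{,}400\ \text{B} \approx 57.0\ \text{KB} \quad (\text{cumulative} \approx 99.8\ \text{KB})
\end{align*}
```

Under these assumptions, a 50KB HTML file would need three roundtrips, while one trimmed to under 14.3KB would arrive in the first.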

HTML Delivery Tip #1: Clean up HTML so it is concise

Following W3C specifications for markup makes HTML more maintainable and readable. The practices that reduce HTML file length the most are listed below.

Don’t use inline styles.

Link to a stylesheet in the <head> of the document instead of using inline styles. The type attribute does not need to be declared, so the reference to the external stylesheet can look something like this:

<link rel="stylesheet" href="styles.css">

Don't use inline scripts.

Link directly to a JavaScript file instead. When a browser sees a <script> tag in HTML, it also assumes JavaScript, so the type attribute does not need to be declared. The script tag should be succinct and look something like this:

<script src="script.js"></script>

Reduce blank lines and unnecessary indentation.

Mozilla recommends indenting with 2 spaces rather than a tab, which many editors display as the equivalent of 4 spaces, and only separating blocks of code with a blank line when there is a good reason. You can also use a tool like HTML Tidy to strip out whitespace and extra blank lines from valid HTML.
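To make the three cleanup steps concrete, here is a small before-and-after sketch. The two fragments are alternatives, not one document, and the file names styles.css and script.js as well as the paragraph content are placeholders rather than anything from a real site:

```html
<!-- Before: styles repeated inline, an inline script, deep indentation, extra blank lines -->
<body>
    <p style="color: #333; margin-bottom: 1em;">First paragraph</p>

    <p style="color: #333; margin-bottom: 1em;">Second paragraph</p>

    <script type="text/javascript">
        console.log("page loaded");
    </script>
</body>

<!-- After: the rules move to styles.css (linked from the <head>) and the code to script.js -->
<body>
  <p>First paragraph</p>
  <p>Second paragraph</p>
  <script src="script.js"></script>
</body>
```

The second fragment is shorter, and the styles and script it references can be cached separately from the HTML.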

HTML Delivery Tip #2: Compress HTML server-side

GZIP compression or a similar compression model allows less data to be sent to an end user’s browser to construct the same page. As a rough guide, the total compressed page size is often about half the uncompressed page size, although the exact ratio depends on the content.

If you’re not compressing HTML and other files, your site is likely slower than your competitors’ sites.

HTML Delivery Tip #3: Use non-standard optimizations as needed

There are some kinds of optimizations done regularly to other files that are not standard for HTML.

Minification

Minification deletes all unnecessary whitespace and all new line characters, and is not common practice in HTML. While you can minify HTML if you wish to do so, it makes the document harder to read and maintain, which matters most if the page changes often.
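As an illustration of that trade-off, here is the same hypothetical list before and after minification. The links are made up for the example:

```html
<!-- Readable source -->
<ul>
  <li><a href="/one/">One</a></li>
  <li><a href="/two/">Two</a></li>
</ul>

<!-- The same markup after minification -->
<ul><li><a href="/one/">One</a></li><li><a href="/two/">Two</a></li></ul>
```

The minified version saves a handful of bytes per line, at the cost of being much harder to scan when you need to edit it by hand.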

Caching

Caching is not always used for HTML either, because HTML files tend to change frequently.

That being said, it is possible to cache HTML. Caching rules allow you to dictate where users’ browsers will request the document from – the cache or the server. Use caution, because you don’t want to serve up an old version of a website. Static HTML pages, like blog posts, can usually be cached without adverse effects.

Best practices for fast HTML parsing

  • Get critical rendering files early
  • Load files in the right order
  • Load render-blocking scripts asynchronously
  • Use valid markup and include essential tags

Once the HTML document has been delivered to a browser, several steps need to happen in the background before anything shows on the screen. This is known as the critical rendering path – the minimum steps that the browser has to take before the first pixel displays.

HTML parsing is included in the critical rendering path. The faster HTML parsing can occur, the quicker DOM construction can happen, and the faster the rendering will occur. I will probably cover this content in a separate blog post in the future, but for now, I’ll explain the critical rendering path in conjunction with optimizations for the HTML portion of this path.

HTML Parsing Tip #1: Get critical rendering files early

The critical rendering path does not exist in a vacuum – it can be affected by the load order of the files that build a page. This includes, at minimum, HTML and CSS, but often includes JavaScript as well. For that reason, you want to load external CSS in the <head> tag of the document and load any JavaScript that is critical for styling or updating the content above the fold as early as possible.

For both critical CSS and JS, you can also use preload and HTTP/2 server push to get these files faster, although Chrome has since removed support for server push, which makes preload the more dependable of the two. CSS and JS are also typically static, which makes them excellent candidates for caching.
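A minimal sketch of preload, assuming a hypothetical critical.js that is only referenced at the bottom of the page and would otherwise be discovered late (a stylesheet would use as="style" in the same way):

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Preload sketch</title>
  <link rel="stylesheet" href="styles.css">
  <!-- Ask the browser to start fetching the script now, before the parser reaches its <script> tag -->
  <link rel="preload" href="critical.js" as="script">
</head>
<body>
  <p>Above-the-fold content</p>
  <!-- Preload only fetches the file; it still has to be referenced to run -->
  <script src="critical.js"></script>
</body>
</html>
```

Preload is best kept for the few files the first render depends on, since every preloaded file competes for bandwidth with everything else on the page.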

HTML Parsing Tip #2: Load files in the right order

Load order matters between external CSS and JS files, too. Both HTML and CSS have to be parsed for the page to render. When the browser reads through – parses – the HTML, it goes from top to bottom. When it runs into CSS, the browser can start parsing it.

However, the default behavior of the browser when it sees a <script> tag is to stop parsing the HTML, download the script, parse it, and execute it. This is because the browser expects the script to affect the structure of the HTML, which in turn affects the way the page renders. It also means that if the parser hasn't reached the <link> tag for the CSS yet, the browser may not start downloading that file until the JavaScript has been processed. This leads to two best practices for JavaScript placement in HTML:

  • If you must load JavaScript in the <head> tag of the document, load it after external CSS.
  • Load all other JavaScript at the bottom of the <body> tag, after the HTML content.

Finally, limit the number of files that need to load for rendering to happen. This can mean deferring third-party content that would otherwise load early in the page and slow rendering.
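The two placement rules above can be sketched as a page skeleton. The file names are placeholders, and whether any script belongs in the <head> at all depends on the page:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>Load order sketch</title>
  <!-- 1. External CSS first, so the browser finds it before any script pauses the parser -->
  <link rel="stylesheet" href="styles.css">
  <!-- 2. Only a script that must run before the content renders goes here, after the CSS -->
  <script src="critical.js"></script>
</head>
<body>
  <h1>Page content</h1>
  <p>The parser reaches this content without waiting for non-critical scripts.</p>
  <!-- 3. All other JavaScript loads at the bottom of the <body>, after the content -->
  <script src="script.js"></script>
</body>
</html>
```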

HTML Parsing Tip #3: Load render-blocking scripts asynchronously

When a browser sees a <script> tag in HTML, it stops HTML parsing until the script is downloaded (if external), parsed, and executed. This is known as synchronous behavior because it all happens in the main processing thread of the browser. However, there are two attributes you can assign scripts to change this default behavior – async and defer.

The async attribute – short for “asynchronous”

When a browser encounters an asynchronous script in the HTML, the script is downloaded in parallel, outside the main processing thread, allowing HTML parsing to continue. The only part of an asynchronous script that interrupts HTML parsing is its execution, which happens as soon as the script is ready.

The async attribute should be denoted like this:

<script async src="script.js"></script>

The defer attribute – for deferral of execution

Downloading and parsing a deferred script is also asynchronous, taking place in a separate processing thread. However, a script with the defer attribute will only execute once the HTML is done being parsed, at which point the document is considered ready.

The defer attribute should be denoted like this:

<script defer src="script.js"></script>

Almost every JavaScript file should load asynchronously because asynchronous scripts do not stop HTML parsing. Note that you can only load JavaScript asynchronously or defer its execution if it’s called externally in a <script> tag. Keep in mind that async scripts run as soon as they finish downloading, so their execution order is not guaranteed, while deferred scripts run in the order they appear in the document. Use the defer attribute when scripts depend on each other or on the fully parsed document, and only when the script does not need to alter the initial rendering of the page.

Support for async and defer differs depending on the browser, but most browsers support both. When both attributes are listed in a script tag, the async attribute takes priority in browsers that support it, and defer acts as a fallback.
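Here is a short sketch of how the two attributes might sit side by side. The script names are hypothetical: analytics.js stands in for a script that does not touch the page, and app.js for one that needs the parsed document:

```html
<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>async and defer sketch</title>
  <link rel="stylesheet" href="styles.css">
  <!-- async: downloads while parsing continues, then runs as soon as it is ready -->
  <script async src="analytics.js"></script>
  <!-- defer: downloads while parsing continues, then runs once the HTML has been parsed -->
  <script defer src="app.js"></script>
</head>
<body>
  <p id="greeting">By the time app.js runs, this paragraph is already in the DOM.</p>
</body>
</html>
```

Neither script holds up the parser while it downloads, which is why both can sit in the <head> here.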

HTML Parsing Tip #4: Use valid markup and include essential tags

Valid HTML5 markup is specified by the W3C. They also have an HTML validation tool you can use to see syntax and style errors in your code. There will almost always be some errors, but excessive errors in your document should be a concern. Browsers rely on HTML standardization to read and understand what an HTML document contains and how to display it, but poor document structure and poor use of syntax can slow down how quickly the page can display.

To make sure browsers can easily read your HTML, you should:

  • Include essential tags and attributes.
  • Close all tags that require closure.
  • Use descriptive tags in favor of generic ones.

Include essential tags and attributes.

Declare doctype

The doctype declaration should happen at the very top of the HTML document, outside of any other tags and above the <html> tag. This lets the browser know what it’s looking at as soon as the page is delivered. For most cases, <!DOCTYPE html> will be appropriate, which defaults to the current HTML version. However, other doctypes can be declared depending on how the document will primarily be used.

Declare the document language

Letting the browser know what language the content is written in removes guesswork when it handles the text, for example when it picks fonts or when a screen reader chooses a pronunciation. Use as short a language declaration as possible inside the <html> tag.

For example, if your document is in Japanese, you would write:

<html lang="ja">

You can exclude the country code in this example because Japanese is overwhelmingly associated with a single country, so the region adds no useful information.

Declare what character encoding the browser should use.

For character encoding, the current standard is to use UTF-8, which avoids the vulnerabilities of UTF-7. Character encoding is declared as a <meta> tag attribute within the <head> of the document.

If the character encoding is not declared with the charset attribute (or by the server in an HTTP header), the browser has to guess how to read the file. For that reason, you should include the <meta> tag with charset attribute immediately after the opening <head> tag so that it is one of the first things the browser reads.

In summary, the beginning of the HTML document should include something like this at minimum:

<!DOCTYPE html>
<html lang="en-US">
<head>
<meta charset="utf-8">
<title>Page title</title>
</head>

Close all tags that require closure

In HTML, a few elements, called void elements, never take a closing tag. Most elements, however, require one. Although a browser can usually read HTML with missing closing tags, leaving tags open can result in disproportionately poor performance because the browser ends up building a more deeply nested DOM than intended, with later elements placed inside the unclosed ones.

An appropriately opened and closed element looks like this:

<body></body>

Void elements that do not require a closing tag in HTML include <area>, <base>, <br>, <col>, <embed>, <hr>, <img>, <input>, <link>, <meta>, <param>, <source>, <track>, and <wbr>.
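As a sketch of what an unclosed tag does to the document structure, compare these two alternative fragments. The card class and text are made up for the example:

```html
<!-- Unclosed: the first <div> is never closed, so the parser places the second card inside it -->
<div class="card">
  <p>First card</p>
<div class="card">
  <p>Second card</p>
</div>

<!-- Closed: the two cards are siblings, as intended -->
<div class="card">
  <p>First card</p>
</div>
<div class="card">
  <p>Second card</p>
</div>
```

In the first fragment the browser still produces a page, but the DOM is nested one level deeper than the author meant, and any CSS or JavaScript that expects sibling cards will behave differently.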

Favor descriptive element types and avoid generic ones

HTML5 includes new elements that are more specific to certain kinds of content. Using these descriptive element names gives the browser a more rigid set of rules for reading and styling the content contained in the element than using a generic element would. This can cut down on the number of rules necessary in CSS and reduce redundant class attributes.

For instance, a navigation bar containing a set of links for the main navigation on the site can be denoted with the new <nav> element instead of with a <div> element:

<nav>
  <a href="/link1/">Link 1</a> |
  <a href="/link2/">Link 2</a> |
  <a href="/link3/">Link 3</a>
</nav>

Takeaways and the TL;DR

HTML can make or break your site. The way it’s delivered and structured determines how quickly the browser renders a webpage and how well it renders. With that in mind, we determined that reducing the amount of data that gets delivered with the HTML file allows the browser to start reading the HTML sooner, and that following best practices for document structure helps the browser read it faster.

Here are the HTML optimization recommendations made in this article (the TL;DR):

  • Avoid inline JavaScript and CSS.
  • Reduce unnecessary whitespace and blank lines.
  • Compress HTML on the server with GZIP or similar.
  • Get critical rendering files – like above-the-fold styles – early in the page load with preload and server push.
  • Always load external CSS before JS in the <head>.
  • Place synchronous JS at the bottom of the <body>.
  • Load scripts asynchronously whenever possible.
  • Validate your HTML.
  • Always include essential elements, like <!DOCTYPE>, <html> with the lang attribute, and <meta> with the charset attribute.
  • Favor descriptive element types over generic ones.

And as always, test changes before you deploy them!

<< Part 1: How to Optimize Images to Improve Web Performance

< Part 2: How to Optimize CSS for Better Web Performance