Lighthouse is a critical, open-source tool used by web developers to benchmark their web applications and sites over time and to gauge how well they are performing relative to Google’s expectations. There are separate scores within Lighthouse:
The performance score is what impacts your site. While following recommended best practices for code and SEO generally lead to better performance outcomes too, we will be focusing on what goes into the performance score here, and answer the following questions:
Google’s expectations for performance are important – they use performance to rank web applications and sites in their search engine results, and tell developers to follow the guidelines set in Lighthouse. Google’s dominance in the market as a search engine makes it critical for web developers to heed the numbers reported for each performance metric that goes into Lighthouse and to understand how the metrics work.
Luckily the brains behind Lighthouse have made it their mission to provide clear guidelines for performance on sites. The documentation is extensive, and we will cover the highlights here.
Understanding that Lighthouse is a tool used by Google and others is important. The fact that Lighthouse is a tool means that different uses of the tool result in different outcomes.
Developers can run Lighthouse locally on their browsers, but there are other methods for using it, too. They can also track the same metrics that are in a Lighthouse score across their user base with real-user monitoring, or synthetically using a synthetic monitoring provider.
There are a few distinct ways to run Lighthouse on a computer:
Like any software tool, Lighthouse generally gets better results on faster, newer computers than on slower, older ones. Here are a few aspects that can affect Lighthouse scores in different situations when you run it on a computer:
Lighthouse documents how the metrics it uses are measured in a browser. They also document how much each metric affects scoring, as well as what constitutes a “good” result.
Different kinds of metrics go into calculating the total score, all of which broadly relate to performance. Each metric will be measured, with different results corresponding to different scores out of 100. That metric’s score is given a weighting toward the overall score out of 100.
There are two broad categories for these metrics:
Each time Lighthouse runs, it will measure one load of the page or web application on a desktop and report both a desktop score and a mobile score based on its mobile simulated throttling calculations.
The metrics Lighthouse uses change depending on the version. The current version is Lighthouse v6, which surfaces well with the Web Vitals project Google announced earlier this year. With each version, the weighting of metrics that are still being used typically shifts and the current version has shifted to reflect the Core Web Vitals in that project.
Two metrics currently have the highest weights, accounting for 25% of the total score each. They are:
These are followed by three metrics that control 15% of the total score each:
Finally, Cumulative Layout Shift (CLS) account for just 5% of the total score.
This is a huge departure from Lighthouse v5. In that version, TTI was the most crucial score for performance, holding a third of the total weight. That weight has been halved.
Each of these metrics have their own set of parameters, which you can read about in the documentation for Lighthouse. There are a lot of considerations for each that are too extensive to cover here. But let’s talk about what a “good” score is.
“Good” is a range of values where Lighthouse says performance is expected for current web conditions.
The minimum score out of 100 needed for a "good" result is 90, but the exact requirements for that minimum good score vary by metric. Feedback may be less than “good”. Here are the different feedback ranges as the score decreases:
The time allowed for a good TTI score will be greater than a good FCP score due to the nature of how applications and webpages load. For example, a good Largest Contentful Paint happens within the first 2.5 seconds of the load, while a good Cumulative Layout Shift happens within 0.1 seconds.
What constitutes a good score varies by metric and is set by Lighthouse. Although it is not clear how the ranges correlate to search results in Google, the value ranges correspond to user experience studies performed by Google. A site or application with a good score will also rank higher than one with a poor score. To get a good overall Lighthouse Score, most of the metrics must meet or exceed the set standard for “good”.
Here is a quick rundown of the maximum time for each metric on the Lighthouse v6 calculator that still results in a minimum score of 90:
Lighthouse recommends (1) using simulated throttling that matches Google’s “Slow 3G” speed for the mobile portion, (2) running multiple measurements, and (3) using a consistent environment over time for benchmarking.
Something important to note is that variability is inherent in Lighthouse measurements, and not all can be mitigated by following these strategies. In our next post in this series, we will address some of the issues inherent with Lighthouse and the decision to use simulated throttling for measurements.
Measuring Lighthouse metrics with any other tool will result in different numbers. Monitoring these metrics synthetically without Lighthouse or monitoring them in real user traffic will result in different measured values than the values measured through Lighthouse.
For instance, Lighthouse generates its own measures of LCP through the simulated throttling mechanism. Here are two cases where that matters:
While Lighthouse results might correlate to the results generate by these other tools, the results for each tool are relative to themselves.
Still, because each metric has a specific method for calculation, improving the measurements for a metric using one tool should result in some improvement when measuring the same metric with a different tool. To continue the example, improving LCP for your real users should also improve LCP when you measure it during a synthetic page load or using Lighthouse.
Lighthouse performance metrics are centered around user experience. Tracking them with real user monitoring is a great idea because you can correlate them to business outcomes. Using Blue Triangle, that correlation becomes even more powerful — performance gets tied to outcomes on individual pages and on specific paths through your web application.
Monitoring the same metrics over time with synthetic monitors is also a good idea. The consistency synthetics provide will allow you to see instantly when a metric that your users rely on is suffering.
We believe knowing the strengths and limitations of the different tools at your disposal is essential in an ever-changing environment like the web. In our next part in this series, we will talk about preparing for some of the changes Google will be making to its performance scoring in the future, as well as what exact performance aspects influence the SEO of sites.
Let us know if you have any questions!