Tutorial: Using BigQuery to Analyze CrUX Data

Last blog I gave some examples of how we can use the Chrome User Experience report (CrUX) to gain some insights about site speed. This blog I will continue to show you how to use bigquery to compare your site with the competitors.

Image for post
Image for post

Prerequisite:

  • Log into Google Cloud,
  • Create a project for the CrUX work
  • Avigate to BigQuery console
  • Add the chrome-ux-report dataset and explore the way the tables are structured in 'preview'
Image for post
Image for post

Step one: Figure out what is the origin of your site and the competitor site

like syntax is preferred (Take care of the syntax difference between Standard SQL and T-SQL)

-- created by: Jacqui Wu -- data source: Chrome-ux-report(202003) -- last update: 12/05/2020 SELECT DISTINCT origin FROM `chrome-ux-report.all.202003` WHERE origin LIKE '%yoursite'

Step two: Figure out what should be queried in the select clause?

What we can query from CrUX?

Image for post
Image for post

The specific elements that Google is sharing are:

  • “Origin”, which consists of the protocol and hostname, as we used in step one, which can make sure the URL link
  • Effective Connection Type (4G, 3G, etc), which can be queried as the network
  • Form Factor (desktop, mobile, tablet), which can be queried as the device
  • Percentile Histogram data for First Paint, First Contentful Paint, DOM Content Loaded and onLoad (these are all nested, so if we want to query them, we need to unnest them)

Here I create a SQL query of FCP percentage in different sites, which measures the time from navigation to the time when the browser renders the first bit of content from the DOM.

Image for post
Image for post

This is an important milestone for users because it provides feedback that the page is actually loading.

SQL queries:

-- created by: Jacqui Wu -- data source: Chrome-ux-report(202003) in diffrent sites -- last update: 12/05/2020 -- Comparing fcp metric in Different Sites SELECT origin, form_factor.name AS device, effective_connection_type.name AS conn, "first contentful paint" AS metric, bin.start/1000 AS bin, SUM(bin.density) AS volume FROM( SELECT origin, form_factor, effective_connection_type, first_contentful_paint.histogram.bin as bins FROM `chrome-ux-report.all.202003` WHERE origin IN ("your site URL link", "competitor A site URL link", "competitor B site URL link") ) CROSSS JOIN UNNEST(bins) AS bin GROUP BY origin, device, conn, bin

Step 3: Export the results to the Data Studio(Google visualization tool)

Image for post
Image for post

Here are some tips may be useful

  1. Line chart is preferred for comparing different sites in Visual Selection
  2. Set x-axis to bin(which we already calculate it to seconds) and y-axis to percentage of fcp
  3. Set filter(origin, device, conn) in Filtering section

Wrapping up

This post explored the data pipeline we can use CrUX report to analyze the site performance. In the future, I will write more about CrUX.

If you are interested in or have any problems with CrUX or Business Intelligence, feel free to contact me.

Or you can connect with me through my LinkedIn.

Originally published at http://jacquiwu.com on May 13, 2020.

Written by

A current Data Analyst in a subsidiary under Webjet, with experience in applying data science techniques to business.

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store