A to Z of Google Analytics - A Brief Guide
What does Google Analytics have in common with a glass of water? More than you might think!
Navigating Google Analytics (GA) can feel overwhelming, but mastering its tools is crucial for understanding your website’s performance and optimising marketing efforts.
This A to Z guide simplifies some key concepts and functionalities, from building audiences and tracking events to using attribution models and segmenting data effectively.
A
We start with audiences.
Audiences are segments of users who share attributes. GA has two pre-built audiences, ‘All Users’ and ‘Purchasers,’ but there’s much scope to build audiences according to whatever criteria work best for your target markets.
Audiences have two uses: in GA, we can use them to compare how different cohorts of users experience your site, and if GA is linked to Google Ads, the audiences are available as targeting options for advertising.
Attribution is an often poorly understood but crucial concept in GA and digital marketing.
Attribution means “giving credit” in marketing. If people click on an ad one day but don’t purchase until the next day when they visit your site through organic search, which channel should receive credit for the purchase, paid or organic?
This depends on how you view it, or which attribution model is used.
GA has three attribution models:
‘Session’, also known as ‘last-touch’, attribution considers the current visit to your site. In the above example, credit would go to organic because that was the session where the purchase happened. This can be viewed by using dimensions with names starting with ‘Session’, like ‘Session source’ and ‘Session campaign’.
‘First user’ is a slightly odd name for what is actually ‘first-touch’ attribution. This concerns what first brought a user to your site and gives credit accordingly - in our example, if the ad was the first time the user visited your site, then this would be given credit because this was the ‘first-touch’. Dimensions starting with ‘First user’ will show this attribution.
GA has a third, data-driven attribution model. This attempts to apportion credit using a sophisticated algorithm, and conversions are given as fractions rather than whole numbers. However, this model relies on a lot of data to become reliable, so it won’t be useful for smaller organisations.
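To make the first two models concrete, here is a minimal Python sketch of the ad-then-organic example above. The journey data and function names are hypothetical, and this is not how GA calculates attribution internally; it only illustrates which visit each model gives credit to.

```python
# Hypothetical journey from the example: an ad click on day one,
# then an organic search visit on day two that ends in a purchase.
journey = [
    {"day": 1, "channel": "paid", "purchased": False},
    {"day": 2, "channel": "organic", "purchased": True},
]


def session_attribution(sessions):
    """Credit the session in which the purchase happened ('Session' dimensions)."""
    for session in sessions:
        if session["purchased"]:
            return session["channel"]
    return None  # no purchase, so nothing to credit


def first_user_attribution(sessions):
    """Credit whatever first brought the user to the site ('First user' dimensions)."""
    if not any(s["purchased"] for s in sessions):
        return None
    first_visit = min(sessions, key=lambda s: s["day"])
    return first_visit["channel"]


print(session_attribution(journey))     # organic
print(first_user_attribution(journey))  # paid
```

The same purchase is credited to two different channels depending on the model, which is why it matters which set of dimensions a report uses.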
B
Be careful when setting up a GA property. Many settings can impact the data collected and stored.
One of those settings is the enhanced measurement events, which Google recommends turning on. Unfortunately, many of those automated events do not work at all or not as you might expect – the form tracking is particularly bad as it doesn’t recognise multi-page forms!
C
For more mature sites, setting up content groups can be a really powerful way to segment pages into distinct groups. For example, at its most basic level, a charity website will probably want to group pages into ‘fundraising’, ‘donating’, ‘services’, and ‘about’ groups, while a clothing brand will want to group by gender and product lines.
This can be set up manually through Google Tag Manager, though it is typically deployed in the dataLayer with other page data for a truly scalable solution.
Once the content group data reaches GA, it is available as a dimension that allows for quick analysis and insight, which is far superior to awkward filtering.
Also, C is for cardinality. This is the number of unique values in a dataset.
Let’s look at the below dataset:
[Image: a table showing a dataset of attributes for fruit]
Fruit ID has a high cardinality because every row has a unique ID number. Fruit Category, compared to the other columns, has a low cardinality because there are only two categories (Pome and Citrus). Fruit Name and Fruit Colour have moderate cardinality in this dataset (four unique values each).
GA has a cardinality limit of 50,000 values, so when setting up custom dimensions, care must be taken to ensure datasets with high cardinality, such as IDs and order numbers, are deployed correctly and only where they will be genuinely useful.
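Cardinality is easy to check before sending a new dimension to GA. The sketch below uses made-up fruit rows (not the original table, only something with the same shape as described above) and counts the unique values in each column.

```python
# Illustrative rows only; these are assumptions, not the article's table.
rows = [
    {"fruit_id": 1, "name": "Apple", "colour": "Red", "category": "Pome"},
    {"fruit_id": 2, "name": "Apple", "colour": "Green", "category": "Pome"},
    {"fruit_id": 3, "name": "Pear", "colour": "Green", "category": "Pome"},
    {"fruit_id": 4, "name": "Orange", "colour": "Orange", "category": "Citrus"},
    {"fruit_id": 5, "name": "Lemon", "colour": "Yellow", "category": "Citrus"},
    {"fruit_id": 6, "name": "Lemon", "colour": "Yellow", "category": "Citrus"},
]


def cardinality(data, column):
    """Number of unique values found in one column."""
    return len({row[column] for row in data})


for column in ("fruit_id", "name", "colour", "category"):
    print(column, cardinality(rows, column))
```

Running the same count over a real export of order numbers or user IDs is a quick way to see whether a planned custom dimension is likely to cause trouble.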
D
Default channel grouping is an incredibly useful way to segment data in GA, but if marketing tracking isn’t set up correctly, its value in analysis can be compromised.
The default channel group is most prominent in the User and Traffic acquisition reports. It groups the different traffic sources into broad segments, such as Organic Search, Paid Social, and Display.
However, for GA to perform this segmentation correctly, your setup must follow Google’s schema for defining manual traffic, specifically when generating UTM parameters (more on that later).
If Google’s schema isn’t followed, GA will allocate traffic to the wrong channel grouping, place it in the ‘Unassigned’ grouping, or, if no parameters are provided, assign it to ‘Direct’—try getting useful insight from that!
We’ve mentioned dimensions already, but it’s worth clarifying what they mean. A dimension is an attribute of data.
This sounds abstract, but only because it’s so intuitive that we rarely have to actively think about how we label things in our everyday lives.
As I type, a glass of water is on my desk. We could give it several labels or dimensions, such as ‘Material’ (glass), ‘Category’ (beer), and ‘Brand’ (Adnams).
Dimensions serve the same purpose in GA. They are ways to label the data our site generates, with different types of data (page, event, user) labelled with different dimensions.
E
“Events, dear boy, events” was British Prime Minister Harold Macmillan’s response when asked what his greatest challenge as leader was. Events are just as central to GA: they are the most important part of it.
In GA, an event is any trackable interaction a user has with a site. This includes page views, clicks, and conversions (known as ‘key events’, see below). GA is described as event-based analytics because its data is an aggregation of all the events that users trigger on a site and that are reported back to GA.
Another important E is explorations. This is the third section in the left-hand menu and enables GA users to build their analysis using various techniques, including free-form tables, funnels, and paths. The user explorer can be found here, too.
F
There’s so much data to explore, so we need to use filters to focus on the data we need to answer our questions and discover new insights.
Depending on the report or exploration we are working in, we can apply filters to dimensions and metrics - convenient for stripping out statistical outliers from datasets!
GA also understands regular expressions, which can be game-changing when we need to set up sophisticated filtering. So it’s a good idea to learn a little Regex (or ask AI for help!).
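As a taste of what a regex filter can do, here is a small Python sketch that keeps only blog and donation pages from a list of page paths. The paths are hypothetical, and GA's regex engine may differ from Python's in more advanced features, so treat this as a way to test the idea rather than a guarantee of identical behaviour.

```python
import re

# Hypothetical page paths, as they might appear in a page path dimension.
page_paths = [
    "/blog/ga4-tips",
    "/blog/utm-guide",
    "/shop/shoes",
    "/donate",
    "/donate/thank-you",
]

# One pattern covers two sections of the site: anything under /blog or /donate.
pattern = re.compile(r"^/(blog|donate)(/.*)?$")

matching = [path for path in page_paths if pattern.search(path)]
print(matching)
# ['/blog/ga4-tips', '/blog/utm-guide', '/donate', '/donate/thank-you']
```

Without regex, the same result would need several separate 'contains' filters.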
G
GDPR, or, to give it its full name, General Data Protection Regulation, is the European Union’s data privacy regulation that governs how Europeans' data can be processed, stored, and transferred. The UK enacted mirror regulation after leaving the EU, while many countries and regions across the world are enacting similar privacy regulations.
Why is this important for GA? Google expects sites to follow the privacy regulations of the countries where they do business. Because very few sites actively block users located in the EU, GDPR has become the de facto standard.
GDPR requires sites to obtain opt-in consent from users for tracking through a cookie banner and for users to have the ability to change or withdraw consent at any time.
While we can no longer expect 100% of site data to be available in GA, building trust is vital in a society where people are increasingly aware of how their data is collected and used.
H
It’s the screen that loads first when logging into GA, but the home screen is a bit of an… anti-climax. That said, it has become more useful since Google introduced benchmarking to the main card.
I
While it’s not as useful in a hybrid world, blocking out internal traffic is still worth doing, even if it only screens out data from your office-based teams.
The setting is hidden deep in the admin section, but once found, the relevant IP address(es) must be defined as internal. Once defined, data from those IP addresses will not be shown with the regular site traffic.
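The logic behind an internal traffic rule is simple to picture. The sketch below assumes the office ranges are written in CIDR notation and uses reserved documentation addresses as stand-ins; the real matching happens inside GA once the rule is defined, so this is only an illustration.

```python
import ipaddress

# Hypothetical office addresses (these ranges are reserved for documentation).
INTERNAL_RANGES = [
    ipaddress.ip_network("203.0.113.0/24"),    # a whole office range
    ipaddress.ip_network("198.51.100.17/32"),  # a single fixed address
]


def is_internal(ip):
    """True if the address falls inside any range we defined as internal."""
    address = ipaddress.ip_address(ip)
    return any(address in network for network in INTERNAL_RANGES)


print(is_internal("203.0.113.42"))  # True
print(is_internal("192.0.2.1"))     # False
```

This also shows why the approach is weaker in a hybrid world: a colleague working from home arrives from an address outside these ranges and is counted as regular traffic.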
J
GA is just a marketing tool. It’s very good at telling us how our site visitors found our site and giving us a good idea of what marketing and communications are working well or not, but it’s not the only tool you should use to track performance and determine strategy.
Given the limitations set by consent rates (see above), it’s not the complete picture. GA isn’t a CRM, a finance system, or a substitute for marketing mix modelling. It isn’t even particularly good at telling us why people engage with our site and content. Consider using tools like HotJar and Microsoft Clarity to generate heatmaps, run A/B tests, and deploy surveys on-site.
Combining data from other tools with GA data is the only way to build a complete picture of our customers.
K
We know what events are, so it won’t surprise us to learn that key events are special events. In other words, these are the conversions that ultimately determine whether our marketing efforts have resulted in real business outcomes like purchases and leads.
Marking those conversions as key events in GA is important not only for visibility in GA itself but also if we import GA key events into Google Ads to optimise paid campaigns.
Purchases are automatically marked as key events, but service businesses typically need to set up and mark generate_lead or sign_up events as key events.
L
None of us like to be told what we can’t do, but Google must put some barriers in place. Be aware of the limits when setting up a GA property and events.
Without careful consideration, it’s surprisingly easy to hit the 50 custom dimensions limit, and the 25-parameter limit for events is another trap which trips people up.
See the pre-defined section for advice on getting around this…
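A tracking plan can be checked against these limits before anything is built. This is a minimal sketch using the two limits quoted above; the plan itself and the function name are hypothetical, and the limits should be confirmed against Google's current documentation.

```python
# Limits as quoted in this guide; check Google's documentation for current values.
MAX_CUSTOM_DIMENSIONS = 50
MAX_PARAMS_PER_EVENT = 25


def check_plan(custom_dimensions, events):
    """Return warnings for a planned GA setup.

    custom_dimensions: list of dimension names we intend to register.
    events: dict mapping an event name to the list of parameters it sends.
    """
    warnings = []
    if len(custom_dimensions) > MAX_CUSTOM_DIMENSIONS:
        warnings.append(
            f"{len(custom_dimensions)} custom dimensions planned, "
            f"limit is {MAX_CUSTOM_DIMENSIONS}"
        )
    for name, params in events.items():
        if len(params) > MAX_PARAMS_PER_EVENT:
            warnings.append(
                f"event '{name}' sends {len(params)} parameters, "
                f"limit is {MAX_PARAMS_PER_EVENT}"
            )
    return warnings


# Hypothetical plan: one event is deliberately over the parameter limit.
plan_events = {
    "generate_lead": ["form_id", "form_name"],
    "mega_event": [f"param_{i}" for i in range(30)],
}
print(check_plan(["content_group", "author"], plan_events))
# ["event 'mega_event' sends 30 parameters, limit is 25"]
```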
M
Metrics are the quantitative measures that bring GA to life. The most basic one is Event count, which does what it says on the tin: it counts events. Creating custom metrics is rarely necessary because, often, a simple count of whatever event will be satisfactory.
N
Building on the earlier point about what GA is and isn’t good at, one notable feature missing from GA is heatmaps.
This might seem like a glaring omission, given that GA is meant to help us understand how people navigate our website.
Heatmaps normally incorporate click, scroll, and mouse movement tracking, which is projected onto screenshots of pages. The trouble with this is that only clicks fit the event-based model of tracking that GA relies on and which can be plotted on a table or a graph, whereas scrolls and movement aren’t well suited to this type of analysis.
This is why GA doesn’t provide heatmaps and why sites should use other services like HotJar or Microsoft Clarity for heatmaps.
O
GA does have a real-time overview in the Reports section, but what it can show is very limited, and it is often unfit for detailed analysis, which needs whole-day data that is refreshed overnight. In short, real-time is okay for monitoring traffic acquisition around big, of-the-moment events and for debugging, but probably not much else.
P
Here’s the secret sauce when setting up a GA property – pre-defined events and dimensions.
In nearly all circumstances, Google has already considered what people want to track on a website and has pre-defined events and dimensions to use.
Purchase events and the e-commerce dimensions are well known, but it goes beyond this.
A classic need is to track specific types of clicks on a site, like CTAs or the header menu. There’s no need to set up a custom event with custom dimensions for this: just set up a ‘click’ event with an outbound dimension value of “false”, and use the different ‘link’ dimensions to send over the details required.
This is a lifesaver when we are restricted in how many custom dimensions we can define in GA!
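To picture what such an event might carry, here is a sketch of a ‘click’ event built as a plain Python dictionary. The parameter names (link_url, link_id, link_classes, link_domain, outbound) are the ones I understand GA's own click event to use, but treat them as assumptions and verify them against Google's event reference; the helper function and the values are hypothetical.

```python
import json


def build_click_event(link_url, link_id, link_classes, link_domain):
    """Describe an internal link click using only pre-defined parameters."""
    return {
        "event": "click",
        "outbound": "false",  # internal click, as described above
        "link_url": link_url,
        "link_id": link_id,
        "link_classes": link_classes,
        "link_domain": link_domain,
    }


# Hypothetical CTA in the page header.
cta_click = build_click_event(
    link_url="https://www.example.com/donate",
    link_id="header-donate-cta",
    link_classes="btn btn-primary",
    link_domain="www.example.com",
)
print(json.dumps(cta_click, indent=2))
```

Because every field maps to a dimension GA already knows about, none of the 50 custom dimension slots are used up.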
Q
Q is for… BigQuery (yes, I know…). By default, GA retains data for two months, and even when adjusted in the settings, it can only retain 14 months. This is a big problem for organisations that need multi-year comparisons.
The solution is to export GA data into Google’s BigQuery data warehouse service. This will cost very little per month for most sites, and the data can then be used in Google Looker Studio reports.
If you need the data in your own data warehouse or available in non-Google systems like PowerBI, this is difficult and best contracted out to specialist agencies.
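Once the export is running, multi-year questions become a matter of writing a query. The sketch below only builds the SQL text in Python; it assumes the usual export layout of daily events_YYYYMMDD tables, and the project and dataset names are placeholders. Running the query would need a BigQuery client, which is not shown here.

```python
from datetime import date


def build_event_count_query(project, dataset, start, end):
    """Build SQL that counts events per day and event name across a date range."""
    table = f"{project}.{dataset}.events_*"  # wildcard over the daily tables
    return (
        "SELECT event_date, event_name, COUNT(*) AS event_count\n"
        f"FROM `{table}`\n"
        f"WHERE _TABLE_SUFFIX BETWEEN '{start:%Y%m%d}' AND '{end:%Y%m%d}'\n"
        "GROUP BY event_date, event_name\n"
        "ORDER BY event_date, event_name"
    )


# Placeholder names; replace with your own project and dataset.
query = build_event_count_query(
    "my-project", "analytics_123456789", date(2022, 1, 1), date(2023, 12, 31)
)
print(query)
```

A two-year range like this is exactly what GA's own retention settings cannot provide.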
R
The reports section of GA is the front desk of operations. Here, the most used reports, such as acquisition and e-commerce data, are available to give a big-picture overview of what is happening. Admins can create and amend reports to fit your business needs.
S
Google Signals is an optional setting in GA that needs to be turned on, but once it’s authorised, it enables demographic and interest reporting and some advertising targeting features.
How does it do this? Google Signals allows Google to associate your site visitors with their Google accounts provided they have turned on Ads Personalization in their personal Google account. For this reason, before turning this feature on, it’s advised to discuss the privacy implications with legal and update your site privacy policy to mention Google Signals explicitly.
If you notice a red or orange triangle at the top of your GA reports, you’ve encountered data sampling.
Sampling is common practice in analysis, and Google provides an excellent summary of how sampling works:
“If you wanted to estimate the number of trees in a 100-acre area where the distribution of trees was fairly uniform, you could count the number of trees in 1 acre and multiply by 100, or count the trees in a half acre and multiply by 200 to get an accurate representation of the entire 100 acres.”
To balance accuracy with speed – because nobody likes waiting forever for reporting to generate – GA will use sampling to generate a report when the number of events used to create that report would exceed 10 million data points if the whole dataset were used.
Should we trust sampled data? Broadly, yes, we can have a good degree of confidence that the sample is representative of the entire dataset because GA algorithms are powerful enough to take a truly representative sample.
If you are unsure about the accuracy of sampled data, you can increase the sample percentage or avoid sampling altogether by reducing the date range, reducing filtering, or exporting the data and conducting analysis on another system (exporting is always unsampled).
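The tree-counting idea can be tried out in a few lines. This sketch invents a forest of 100 acres with a fairly uniform number of trees per acre, counts only 10 of them, and scales the result up. The numbers are made up and GA's own sampling method is not shown here; the point is only that a sample of a uniform dataset lands close to the true total.

```python
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical forest: 100 acres with a fairly uniform 35 to 45 trees per acre.
trees_per_acre = [random.randint(35, 45) for _ in range(100)]
true_total = sum(trees_per_acre)


def estimate_total(values, sample_size):
    """Count a random sample and scale it up to the whole population."""
    sample = random.sample(values, sample_size)
    return sum(sample) * len(values) / sample_size


estimate = estimate_total(trees_per_acre, sample_size=10)
error = abs(estimate - true_total) / true_total
print(f"true total: {true_total}, estimate: {estimate:.0f}, error: {error:.1%}")
```

Increasing sample_size shrinks the error, which mirrors the effect of raising the sample percentage or shortening the date range in GA.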
T
We’ve already discussed setting up GA, which is usually done through Google Tag Manager for most businesses.
GTM is Google’s no-code solution to getting GA and just about any other tracking script (Meta Pixel, TikTok Pixel, HotJar) live on your site. It does so in a GDPR-compliant way by integrating your cookie banner with Google Consent Mode, which then governs when and how those scripts are deployed.
GTM is a powerful tool, but the learning curve is steep!
U
As we already know, getting your data segmented into the right channel group is vital, and UTMs are the key.
UTMs are parameters we can append to the end of links to our site, which we add to ads, emails or social posts. Here’s an example:
https://www.example.com?utm_source=facebook&utm_medium=paid&utm_campaign=bigcampaigncom
GA automatically understands the five standard UTM parameters (source, medium, campaign, term, content), and needs at least source and medium to segment the data to the right channel group.
It’s well worth investing resources in either setting up a spreadsheet or paying for a ready-made solution to ensure your links consistently generate UTM parameters that provide useful data in GA.
And one more thing about UTMs: never use them on your site's internal links. This causes issues with SEO and GA!
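If a spreadsheet feels fragile, a few lines of code can do the same job. This is a minimal sketch using Python's standard library; the function name and the lower-casing rule are my own assumptions about what keeps values consistent, not a Google requirement.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit


def add_utm(url, source, medium, campaign=None, term=None, content=None):
    """Append UTM parameters to a landing page URL.

    Source and medium are required; the other three are optional.
    Values are trimmed and lower-cased so 'Facebook' and 'facebook'
    do not end up as two separate sources in reports.
    """
    candidates = {
        "utm_source": source,
        "utm_medium": medium,
        "utm_campaign": campaign,
        "utm_term": term,
        "utm_content": content,
    }
    params = {key: value.strip().lower() for key, value in candidates.items() if value}
    parts = urlsplit(url)
    # Keep any query string the URL already has, then add the UTMs.
    query = "&".join(q for q in (parts.query, urlencode(params)) if q)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, query, parts.fragment))


print(add_utm("https://www.example.com/", "Facebook", "paid", "spring_sale"))
# https://www.example.com/?utm_source=facebook&utm_medium=paid&utm_campaign=spring_sale
```

Generating every campaign link through one function (or one spreadsheet formula) is what keeps the naming consistent enough for the channel grouping to work.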
U is also for users. Why are there different counts of users? Total and Active should be very close, but it is possible for a user to visit the site and immediately leave before taking any further action, such as scrolling or clicking – Total users include these not-really-users, while Active users ignore them.
V
Values are essentially the data you send to, and see in, GA. All the different dimensions and metrics are values that GA aggregates in the reports to generate something that will (hopefully) be of value to you.
W
If this sounds complicated to pick up and learn while juggling your seemingly never-ending to-do list, work with an expert.
X
Business continuity should be high on the list of priorities for any organisation, and GA is no different. What’s your strategy for when an admin exits?
Here’s a quick exercise: 1) make a list of everyone with access to your GA and what level of access they have; 2) how many administrators are on the list? 3) if there are fewer than three, you must add more administrators.
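The exercise is small enough to do on paper, but for a larger organisation it can be run against an exported access list. The names, email addresses, and role labels below are hypothetical.

```python
# Hypothetical access list, typed up from the GA access management screen.
access_list = [
    {"email": "alex@example.com", "role": "Administrator"},
    {"email": "sam@example.com", "role": "Editor"},
    {"email": "agency@example.org", "role": "Editor"},
    {"email": "jo@example.com", "role": "Administrator"},
]

MIN_ADMINS = 3  # the threshold suggested in the exercise above

admins = [user["email"] for user in access_list if user["role"] == "Administrator"]
needs_more_admins = len(admins) < MIN_ADMINS

print(f"{len(admins)} administrators: {', '.join(admins)}")
if needs_more_admins:
    print(f"Add at least {MIN_ADMINS - len(admins)} more administrator(s).")
```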
Y
Carrying on the business continuity theme, always know that the data from your site sent to GA is your data. After all, you paid for your site, you paid for the marketing, you paid for the user's generation of the data, and you are officially the data controller.
This is why it’s important to ensure that people in your organisation are administrators in GA and that agencies and partners are granted access as editors or below.
If you appoint an outsider to create and develop your GA, be sure a contract clearly states that your organisation owns your GA account and all the data within it and that your people are provided with administration access.
YouTube is full to bursting with how-tos, demos, and advice on all things GA. Look out for channels including MeasureSchool and Analytics Mania for detailed step-by-step instructions and technical details on how to create, build, and develop your GA correctly.
Z
Last but not least, time zones. You wouldn’t think anyone could get this wrong, but in GA, it’s more confusing than simply selecting the time zone where your business’s base of operations is.
The problem is that the drop-down for selecting the time zone provides both UTC and City / Location (‘London’) options, which appear to be the same. You should pick the latter because it adjusts for daylight saving time, whereas the UTC option stays fixed throughout the year, which will cause confusion for half the year!
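The difference between the two options is easy to see in code. This sketch uses Python's zoneinfo module (Python 3.9 or later, with time zone data available on the system) to compare midday in London with UTC in winter and in summer; the dates are arbitrary examples.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

london = ZoneInfo("Europe/London")

# Midday in London on an arbitrary winter day and summer day.
winter = datetime(2024, 1, 15, 12, 0, tzinfo=london)
summer = datetime(2024, 7, 15, 12, 0, tzinfo=london)

for label, moment in (("January", winter), ("July", summer)):
    as_utc = moment.astimezone(timezone.utc)
    print(label, "offset from UTC:", moment.utcoffset(), "| UTC time:", as_utc.strftime("%H:%M"))
# January offset from UTC: 0:00:00 | UTC time: 12:00
# July offset from UTC: 1:00:00 | UTC time: 11:00
```

In winter the two options agree, which is why they look identical in the drop-down; in summer a property set to UTC would put every event an hour earlier than the London clock.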
There’s way more to GA than is covered in this A to Z, but hopefully, this has given you some great ideas and knowledge on how to use GA to its fullest. Need expert help? You can reach out to me and we can discuss your analytics needs and your advertising strategy and execution.
(Article image courtesy of https://unsplash.com/@yosuke_ota)