A/B Test: what it is and how to use it to improve the site

Reading time: 11 minutes

The A/B test is one of the most widely used methods for verifying the effectiveness of the changes made to a site. It is useful above all because it reveals how real users respond when presented with two different options for style, graphics and content. Let’s find out more about the A/B test and learn how to set it up in the best way for our site.

What is an A/B Test

Also known as bucket testing or split-run testing, the A/B test is a controlled experiment with two variants, called A and B, in which two versions of the same page are set up that differ by a single change, so as to gather practical data on which one outperforms the other in user preference and actions.

In summary, it allows us to compare two versions of a page that differ in a single variable, so as to test the response of the subjects (a sample of the target audience) to variant A or B and determine which is the most effective.

The purpose of the test

Widely used in web analytics (we have already cited it as a strategic resource for creating a perfect landing page), this tool is a form of statistical hypothesis testing, more precisely “two-sample hypothesis testing”.

The goal is as simple as it is strategic: to identify the changes, even apparently small ones, within a web page that increase or maximize an outcome of interest, such as the click-through rate of an advertising banner. For this reason it is decisive in web design and in the study of the user experience.
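To make the statistical side concrete, here is a minimal sketch of a two-sample test on conversion rates (a pooled two-proportion z-test). The function name and the visitor and conversion counts are invented for illustration; real testing tools may use different methods.

```python
import math

def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for H0: A and B convert at the same rate."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # erfc(|z| / sqrt(2)) equals the two-sided tail area of the standard normal
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical data: 5,000 visitors per variant, 200 vs 250 conversions
z, p = two_proportion_z_test(conv_a=200, n_a=5000, conv_b=250, n_b=5000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

A small p-value suggests that the difference between the two variants is unlikely to be due to chance alone; the threshold (often 0.05) is a convention to be fixed before the test starts.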

Exploiting the test for CRO

If implemented correctly, the test allows us to experiment with the individual elements of the site that we intend to vary and determine whether they are effective for our audience. The most practical and widespread use is the analysis of calls to action, elements that need more than good design to produce results.

The lack of effectiveness of the call to action can depend on various reasons, including changes in user behavior, shifts in the audience or the general way in which users interact with our website.

Making small changes and testing them is a critical process that is part of CRO (conversion rate optimization): it involves running scientifically valid tests on the elements of our site that users interact with, with the aim of achieving a significant increase in performance once all the tests are complete.

Do not rely on fate for site changes

It goes without saying that changes to the site should not be made at random, intervening in a non-strategic way on individual elements and hoping for the best. By doing so, we reduce the chances of obtaining concrete and measurable results, because we do not know what to attribute any observed effects to.

Especially for large sites in very complex sectors, it is not possible to draw appropriate conclusions about what to change and what not to change within the pages if we do not know the way target users interact with the website, and that is where the A/B test comes into play.

Understanding the goals first

Conversions do not just happen: to get real results (for example, real people who buy our service or product) we need to invest significant time in studying how users behave on the site and in making appropriate changes to stimulate their actions.

The brutal reality of digital marketing is this, Brian Harnish of Search Engine Journal tells us: we must “know what the data say about the changes before we can make the changes”. The whole raison d’être of these tests is to give us concrete results on the changes before making them, grounding our assumptions about optimizations in a set of real data.

A highly experienced marketer knows from experience which changes to make and can therefore reduce the time needed for the testing and analysis that improve a website’s performance, whereas a less experienced marketer will run random tests that do not always achieve the desired result.

A practical example: the test to improve affiliation campaigns

The article also provides an example to understand how split testing works, analyzing one of the ways “to earn money on the Internet today”, namely affiliate marketing, which you can do by offering “customers a product or a free service provided as an affiliate” that earns a commission for each sale.

If we gather enough information about our customers, we can build a relationship of trust with them; this study process also teaches us a lot about the way they use the website and lets us implement changes that help users do what they came to do, namely purchase our service or product.

A/B Test, the elements to verify

Harnish also lists the elements that may offer the highest potential ROI during testing, which include, among others:

  • All the CTAs on the site.
  • The overall background color of the site.
  • The colors of the elements of the entire page.
  • The photographs on the pages.
  • Contents and their structure.
  • Any item on the page that requires user interaction.

Changes to site elements are critical adaptations that can lead to significant increases in performance: for example, if we think that a conversion button is not working properly, we can show a variant (a different color, or modified text) to a sample of actual users of the site, whose behavior gives us factual evidence on the effectiveness of the new solution.

With the A/B test we can verify the performance of one element at a time, varying it and comparing it to an alternative version that contains modifications designed around the hypothesized behavior of users.

A different approach to site interventions

The expert’s advice is to approach A/B tests with an open mind, without preconceptions about the elements to be tested or the possible results, because in this way we can really find out what we do not know.

For example, we can run a test on how the site shows contact details (such as the phone number), using distinct elements within the same segment of the audience sample to determine which ones get the best results. Or we can run a content test to make sure we don’t waste time and effort on content that simply won’t work.

This approach allows us to have a real and empirical demonstration of how well the site works in every single element, and in some cases we can achieve really excellent results on final conversions simply by implementing a variation that improves something that was not working anymore.

How to create an A/B test

Creating a successful A/B test involves research on several fronts, starting with the size and current status of the website.

In many cases, small sites or sites with a small audience do not need this type of experiment, although even when we already know the most common best practices for our industry, a test can still indicate which changes are worth making. Rather than launching into this operation, a small site would get better results (and with less effort) by interviewing its customers, discovering weaknesses and optimizing the site based on the feedback received.

The case of large sites is different (over 10,000 visitors per month and with hundreds of pages, says the article), because running A/B tests can help us find out which version will give the best conversion among those in the sample. This activity becomes essential for even larger sites, where even a single element can lead to a great return in conversions.

What we need to do is to establish a general baseline of website traffic, which allows us to continue with evaluating the elements of the website and what we need to change to run an effective test.
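As a rough illustration of why the traffic baseline matters, the sketch below estimates how many visitors each variant needs, and how many days that takes, using the standard normal-approximation formula for comparing two proportions. The 4% baseline rate, the 25% relative lift, the 5% significance level and the 80% power are assumptions chosen for the example, and the function names are hypothetical; only the figure of 10,000 visitors per month comes from the text above.

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant to detect the given relative lift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)           # desired statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

def days_needed(n_per_variant, daily_visitors, variants=2):
    """Days required if all daily traffic is split evenly across the variants."""
    return math.ceil(n_per_variant * variants / daily_visitors)

# Assumed inputs: 4% baseline conversion, hoping to detect a 25% relative lift
n = sample_size_per_variant(baseline_rate=0.04, relative_lift=0.25)
# About 10,000 visitors per month is roughly 333 per day
print(n, "visitors per variant,", days_needed(n, daily_visitors=333), "days")
```

Under these assumptions, a site at the 10,000-visitor threshold needs several weeks to detect even a fairly large improvement, which is consistent with the advice that very small sites may be better served by customer interviews.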

Deciding which elements to test

We have reached a decisive stage, that of deciding which elements to vary and submit to the test, a step that in turn requires trial and (often) error in order to obtain the right result.

For instance, if we believe that for our industry a red CTA is more visible than the blue one we are currently using, we will change the color of the buttons during the A/B test and find out from users’ reactions which one really converts best. The usefulness of the A/B test is that it allows us to find out what our users really respond to, rather than changing the site based on simple hypotheses.

This is an important distinction to make, because often the work of optimizing the site is based on assumptions (however realistic or studied), and unless we have spoken directly with customers we do not know what they are responding to. Instead, the A/B test turns the theory into a high precision tool: after a successful test, we will know what works best for our users.

The usability tests

The user test is the most essential phase of the usability test, which in turn is a crucial tool in evaluating the usability of the site. In fact, the study of users and audience should always be part of an ongoing process, rather than being conducted randomly.

The purpose of this test is to understand in a more systematic way how people interact with our site, and the usability test allows us to recognize whether users encounter any difficulties in the conversion funnel. The second step is to start a screening procedure for user analysis, which can have one goal or many.

For example, the user screening can be designed to:

  • Find out how users actually scan the page.
  • Assess what is really attracting their attention.
  • Address any deficiencies in existing content.
  • Find out what the weaknesses of the buttons are and how to correct them.

There are various tools that allow you to perform A/B tests and help define the parameters that we may decide to vary.

Examples of A/B tests on the content

As we said earlier, content is one of the possible subjects of testing, especially if we are not sure which type and form may have the greatest impact on our audience.

For instance, if readers are accustomed to a longer text and are prone to interaction when they find it – as in the case of many scholars and people in academic fields – our goal is to produce a text that looks like an essay. However, if the audience is composed of more informal readers, it would be better to use shorter lines of text and paragraphs.

And so, we can do a test by making two pages – one with long, formal content, the other with short, more appealing text – to find out which offers the best results.

Or, we can study the effect of changes to font size, color or spacing, to find out whether these factors affect readability and, possibly, comprehension and appreciation of the content.

Examples of split tests on buttons

Tests may also concern seemingly minor details, such as the size of the CTA button: if fewer users click on the button than desired, or if people fill out the form with incorrect information, we can intervene to understand where the problem lies and test different elements of the form, including the button.

For example, we can improve the text of the form to get better and more accurate conversions; many experts in this process recommend giving users a time frame for the response, with a message such as “We will reply to your message within 24-48 hours”, an intervention known as “setting user expectations”.

A small solution of this kind also builds trust, as it tells people that they will receive a guaranteed response (a promise that obviously has to be kept), without fear that their request will disappear into oblivion.

Varying the brand’s communication

The way we communicate is as important as the content that is interpreted by the audience: as Harnish reminds us, you cannot “know exactly what your audience will respond to without testing live variables” and you can’t know in advance “what will make the reaction you desire happen”.

This is why it is important to “test real-time variables and variations of these variables”.

One of the most important tests involves modifying the message, in particular titles, sentences in the content, taglines and the wording of calls to action, because our communication may not be as effective as we want (and think).

Verifying A/B test results

The split tests can reveal things about our users that we may never have suspected and are, as mentioned, a very useful tool to understand if, how and when to make changes to the site that can give concrete improvements (also in terms of profitability).

The key to the success of such tests is a solid methodology: after planning and fine-tuning the final elements, we must run the tests and, above all, verify users’ responses within the set time frame. Only in this way can we be (more) sure of making interventions that will work and give the expected results.

A/B testing and SEO, Google’s advice

In conclusion, it is worth paying attention to the possible negative effects that testing variations in page content or page URLs can have on Google Search performance. An official Google guide, which appeared online in September 2022, warns against these problems and suggests solutions to minimize the SEO risks of A/B testing.

The document first clarifies what is meant by site testing, which is “when you try different versions of your website (or part of your website) and collect data on how users react to each version.” Therefore, two types of testing fall under this definition:

  • A/B testing, which, as mentioned, involves testing two (or more) variations of a change. For example, we can test different fonts on a button to see if this increases button clicks.
  • Multivariate testing, on the other hand, involves testing more than one type of change at a time, looking for the impact of each change and potential synergies between changes. For example, we might try different fonts for a button, but also try changing (and not changing) the font of the rest of the page at the same time: is the new font easier to read and therefore should be used everywhere, or does using a different font on the button than on the rest of the page help attract attention?
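The difference between the two approaches can be sketched by listing the page versions each one needs. In this illustrative snippet (the factor names and font values are invented), a multivariate test of two changes with two options each requires every combination, whereas an A/B test would compare only two of them.

```python
from itertools import product

def multivariate_combinations(factors):
    """Return every combination of options as a list of {factor: option} dicts."""
    names = list(factors)
    option_lists = [factors[name] for name in names]
    return [dict(zip(names, combo)) for combo in product(*option_lists)]

# Hypothetical factors taken from the font example above
factors = {
    "button_font": ["current", "new"],
    "body_font": ["current", "new"],
}

for version in multivariate_combinations(factors):
    print(version)
```

Because the number of versions multiplies with each added factor, a multivariate test generally needs more traffic than a simple A/B test to reach a reliable conclusion for every combination.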

Google reminds us that we can use software to compare user behavior across different page variants (parts of a page, whole pages, or entire multipage flows) and monitor which version is most effective with our users. In addition, we can test by creating multiple versions of a page, each with its own URL: when users try to access the original URL, we redirect some of them to each of the variant URLs and then compare the users’ behavior to see which page is most effective.

Again, we can run tests without changing the URL by inserting variations dynamically into the page, even using JavaScript to decide which variation to display.
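Whichever mechanism displays the variation, each visitor should keep seeing the same version for the whole test. One common way to achieve this, sketched here in Python for consistency with the other examples (the same logic could run in JavaScript in the browser), is to hash a stable visitor identifier. The function name, the identifier format and the experiment name are assumptions made for illustration.

```python
import hashlib

def assign_variant(visitor_id, experiment, variants=("A", "B")):
    """Deterministically map a visitor to one variant of a named experiment."""
    # Hashing experiment + visitor keeps assignments independent across experiments
    key = f"{experiment}:{visitor_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(variants)
    return variants[bucket]

# The same visitor always gets the same answer for the same experiment
print(assign_variant("visitor-42", "cta-color"))
print(assign_variant("visitor-42", "cta-color"))
```

Since the assignment depends only on the identifier and the experiment name, no per-visitor state has to be stored to keep the experience consistent between visits.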

Testing sites and Google, the aspects to consider

Depending on the types of content we are testing, the guide says, it may not even be “very important whether Google crawls or indexes some of the content variations in the course of the testing activity”. Small changes, such as the size, color or position of a button or image, or the text of the call-to-action may have a surprising impact on users’ interactions with the page, but “often have little or no impact on the snippet or search result ranking of that page.”

Also, if Googlebot crawls the site “often enough to detect and index the experiment,” it will likely index any updates we make rather quickly at the conclusion of the test.

Google’s best practices for testing sites

The paper also goes into more technical and practical details, providing a set of best practices to follow to avoid negative effects on site behavior in Google Search when testing variations on the site.

  • Do not cloak test pages

Do not show one set of URLs to Googlebot and a different set to humans: in other words, “don’t do Cloaking”, a tactic that violates Google’s guidelines regardless of whether we are running a test or not. The risk of these violations is “causing the site to be demoted or removed from Google’s search results, likely not the desired result of the test.” It counts as cloaking whether we do it through server logic, robots.txt or any other method; as an alternative, Google suggests using links or redirects as described below. If we use cookies to control the test, we should keep in mind that Googlebot generally does not support cookies: therefore, it will only see the version of the content accessible to users with browsers that do not accept cookies.

  • Using rel="canonical" links

If we are testing with multiple URLs, we can use the rel="canonical" attribute on all alternate URLs to indicate that the original URL is the preferred version. Google recommends using rel="canonical" rather than a noindex meta tag because it “more closely matches the intention in this situation.” For example, if we are testing variations of our home page, we don’t want “search engines not to index the home page, but only to understand that all test URLs are duplicates or variations similar to the original URL and should be grouped together, with the original URL as canonical.” Using noindex instead of canonical in such a situation could sometimes have unexpected negative effects.

  • Using 302 redirects and not 301 redirects

If we are running a test that redirects users from the original URL to a variant URL, Google invites us to use a 302 redirect (temporary) and not a 301 redirect (permanent). This tells search engines that the redirect is temporary (it will only be active as long as the experiment is running) and that they should keep the original URL in their index, rather than replacing it with the target URL of the redirect (the test page). JavaScript-based redirects are also fine.

  • Run the experiment only as long as necessary

The amount of time needed for a reliable test varies depending on factors such as conversion rates and the amount of traffic the website receives, and a good testing tool tells us when we have collected enough data to draw a reliable conclusion. Once the test is finished, we need to update the site with the desired content variants and remove all test elements, such as alternate URLs or test scripts and markup, as soon as possible. If Google discovers “a site running an experiment for an unnecessarily long time”, it may interpret this as an attempt to deceive search engines and act accordingly, especially if the site offers a content variant to a large percentage of its users.
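The canonical and redirect recommendations above can be sketched with two small helpers. This is only an illustration under assumed names and URLs (example.com and both function names are hypothetical); in practice the tag and the redirect would be produced by the site’s own framework or testing tool.

```python
from html import escape

def canonical_link_tag(original_url):
    """Build the <link> tag to place in the <head> of every variant page."""
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'

def temporary_redirect(variant_url):
    """Return the status code and headers for sending a visitor to a test variant.

    302 marks the redirect as temporary, so search engines keep the original URL.
    """
    return 302, {"Location": variant_url}

# Hypothetical URLs for a home page test
print(canonical_link_tag("https://example.com/"))
status, headers = temporary_redirect("https://example.com/home-variant-b")
print(status, headers["Location"])
```

When the experiment ends, both pieces should be removed together with the variant URLs, in line with the advice not to keep test elements in place longer than necessary.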