What You Can and Can’t A/B Test for SEO

Liam Blackledge
15 October 2023
Article Summary

SEO A/B testing splits similar pages into control and variant groups to measure how Google responds to on-page changes. This guide covers prerequisites, viable test elements, and why most sites don’t qualify for true split testing.

A/B testing for SEO is the practice of splitting a group of similar pages into a control set and a variant set, making a change to the variant pages, and measuring the difference in organic performance. Unlike traditional conversion rate optimisation testing, where you show different versions of a page to different users, SEO split testing measures how Google responds to changes. You’re testing what the search engine sees, not what the visitor clicks.

It sounds straightforward, but there’s a significant gap between the concept and the reality. Most articles on this topic focus heavily on what you can test. Fewer are honest about the prerequisites, the limitations, and the fact that the majority of websites don’t have the structure or traffic volume to run true SEO A/B tests at all. At Gorilla Marketing, we help clients make data-backed technical SEO decisions, and that means being upfront about when testing works and when it doesn’t.

How Is SEO A/B Testing Different from CRO Testing?

Standard A/B testing, the kind used in UX and conversion rate optimisation, works by randomly splitting users. Half your visitors see version A, half see version B. You measure which version gets more clicks, form fills, or purchases. The user is the variable.

SEO A/B testing flips this entirely. You can’t show Google two versions of the same page and ask which one it prefers. Google crawls and indexes one version of each URL. So instead of splitting users, you split pages.

Here’s how it works in practice. You take a set of similar, templatised pages, like product pages, location pages, or category pages. You divide them into two groups: a control group that stays the same, and a variant group where you make a specific change. Both groups should have comparable traffic levels and ranking patterns before the test starts. After making the change to the variant pages, you monitor organic traffic to both groups over several weeks. If the variant group’s traffic diverges significantly from what was predicted based on the control group’s trend, you have a result.
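As a rough illustration of the grouping step, here is a minimal Python sketch. It assumes you already have baseline organic sessions per URL (the URLs and numbers below are made up), and it uses a simple pair-and-shuffle approach to keep the two groups comparable. Dedicated testing platforms may bucket pages differently.

```python
import random

def split_pages(pages, seed=42):
    """Split pages into control and variant groups with similar traffic.

    `pages` maps URL -> organic sessions over a baseline period
    (hypothetical input; use whatever baseline metric you trust).
    """
    rng = random.Random(seed)
    # Rank by traffic so that neighbouring pages are comparable.
    ranked = sorted(pages, key=lambda url: pages[url], reverse=True)
    control, variant = [], []
    # Take pages two at a time and randomly send one to each group.
    for i in range(0, len(ranked), 2):
        pair = ranked[i:i + 2]
        rng.shuffle(pair)
        control.append(pair[0])
        if len(pair) == 2:
            variant.append(pair[1])
    return control, variant

# Made-up baseline data for eight product pages.
sessions = [900, 850, 400, 390, 120, 110, 40, 35]
pages = {f"/product/{n}": s for n, s in enumerate(sessions)}

control, variant = split_pages(pages)
control_total = sum(pages[url] for url in control)
variant_total = sum(pages[url] for url in variant)
print(control_total, variant_total)
```

Pairing pages of similar traffic before randomising keeps one group from ending up with all the high-traffic pages, which matters most when the page count is small.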

The key distinction: CRO testing measures human behaviour on the page. SEO testing measures how search engine algorithms respond to on-page changes. Different mechanism, different methodology, different requirements.

What Do You Need Before You Can Run SEO Split Tests?

This is where most sites fall out of the running. SEO A/B testing has steep prerequisites that articles on the topic often gloss over.

Enough Templatised Pages

You need a large set of pages that follow the same template. Product pages, city pages, listing pages, blog posts with a consistent format. The pages need to be structurally similar enough that you can isolate the variable you’re testing. If every page on your site is unique, there’s no clean way to create matched control and variant groups.

The minimum depends on your traffic levels, but most platforms that run these tests recommend at least 50-100 pages in each group. Some suggest more. Fewer pages means more noise and less reliable results.

Sufficient Organic Traffic

Each page group needs enough organic traffic to produce statistically significant results within a reasonable timeframe. If your templatised pages get a handful of visits per week each, you’ll be waiting months for data that still might not be conclusive.

There’s no universal threshold, but as a rough guide, you want the combined group traffic to be in the thousands of sessions per month. Sites with low organic traffic are better off using other methods to inform their SEO decisions.
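The rough guides above can be turned into a quick sanity check. This is only a sketch: the default thresholds (50 pages per group, 1,000 sessions per month) are assumptions taken from the low end of the ranges mentioned in this section, not hard rules, and the function name is made up for illustration.

```python
def feasibility_issues(pages_per_group, monthly_sessions,
                       min_pages_per_group=50, min_monthly_sessions=1000):
    """Return a list of reasons a split test is unlikely to be reliable.

    Both thresholds are rough, adjustable assumptions, not fixed rules.
    """
    issues = []
    if pages_per_group < min_pages_per_group:
        issues.append(
            f"only {pages_per_group} pages per group "
            f"(rough minimum: {min_pages_per_group})"
        )
    if monthly_sessions < min_monthly_sessions:
        issues.append(
            f"only {monthly_sessions} organic sessions per month "
            f"(rough minimum: {min_monthly_sessions})"
        )
    return issues

small_site = feasibility_issues(pages_per_group=30, monthly_sessions=800)
large_site = feasibility_issues(pages_per_group=100, monthly_sessions=5000)
print(small_site)
print(large_site)
```

An empty list doesn’t guarantee a conclusive test. It only means the site clears the most basic bar.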

A Clear Hypothesis

This applies to any kind of testing, but it’s worth stating. “Let’s change the title tags and see what happens” isn’t a test. A proper hypothesis looks more like: “Adding the product category to our title tags will increase organic CTR for our product pages, leading to higher organic traffic.” You need a specific change, a measurable outcome, and a reason you believe the change will work.

Time and Patience

SEO tests take longer than CRO tests. You’re waiting for Google to recrawl the changed pages, reindex them, and for enough time to pass that any ranking or CTR changes show up in the data. Most SEO A/B tests need a minimum of two to four weeks, and some need longer. Seasonal trends, algorithm updates, and external factors can all muddy the results.

What You Can Test for SEO

With the right setup, several on-page elements are viable candidates for split testing.

Title Tags

Title tags are the most accessible and most commonly tested element. They directly affect what appears in search results, which influences click-through rate. Changes to title tag format, keyword placement, the inclusion of numbers or dates, or adding brand names are all testable.

Title tag tests are popular for a reason. The change is simple to implement at scale across templatised pages, Google recrawls and reflects title tag changes relatively quickly, and CTR impact shows up in Google Search Console data. If you’re going to start anywhere, this is the place.
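To show what a title tag test looks like at template level, here is a small Python sketch based on the earlier hypothesis of adding the product category to the title. The page fields, URLs, and title formats are invented for the example. In practice this logic would live in your CMS or template layer.

```python
def title_for(page, in_variant):
    """Build a title tag; variant pages get the category added."""
    if in_variant:
        return f"{page['name']} - {page['category']} | {page['brand']}"
    return f"{page['name']} | {page['brand']}"

# Hypothetical product data keyed by URL.
pages = {
    "/p/trail-runner-2": {
        "name": "Trail Runner 2",
        "category": "Running Shoes",
        "brand": "Example Store",
    },
    "/p/road-racer": {
        "name": "Road Racer",
        "category": "Running Shoes",
        "brand": "Example Store",
    },
}
variant_urls = {"/p/trail-runner-2"}

titles = {url: title_for(page, url in variant_urls) for url, page in pages.items()}
for url, title in titles.items():
    print(url, "->", title)
```

Keeping the control and variant formats in one function makes it harder to change a second variable by accident.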

Meta Descriptions

Meta descriptions don’t directly affect rankings, but they do influence CTR from search results. Testing different description formats, calls to action, or the inclusion of specific details like pricing or location can show whether one approach drives more clicks than another.

The caveat: Google rewrites meta descriptions frequently. Your carefully crafted test description might not be the one that appears in results. This makes meta description tests less reliable than title tag tests, though still worth trying if you have the page volume.

Heading Structure

Changes to H1 tags, H2 structure, or the addition of specific heading patterns can be tested. This overlaps with content structure testing, since headings signal to Google what a page is about.

For instance, you might test whether adding FAQ-style H2s to product pages leads to more long-tail traffic. Or whether restructuring H1s to include a specific modifier changes how pages rank for variant keywords.

Schema Markup

Adding or modifying structured data is a clean variable for testing. You can test whether adding FAQ schema, product schema, review schema, or other markup types affects CTR through enhanced search result appearances. Since schema markup changes are code-level and don’t alter visible content, they’re relatively easy to implement and isolate.
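As an example of how contained a schema change can be, the sketch below generates an FAQPage JSON-LD block using schema.org’s FAQPage, Question, and Answer types. The question and answer text are placeholders, and whether Google shows an enhanced result for any given page is its own decision.

```python
import json

SCRIPT_OPEN = '<script type="application/ld+json">'
SCRIPT_CLOSE = "</script>"

def faq_jsonld(faqs):
    """Return a JSON-LD script tag for a list of (question, answer) pairs."""
    data = {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in faqs
        ],
    }
    return SCRIPT_OPEN + json.dumps(data, indent=2) + SCRIPT_CLOSE

# Placeholder content for illustration only.
snippet = faq_jsonld([("Do you ship to the UK?", "Yes, within 3 working days.")])
print(snippet)
```

For a split test, you would output this block only on variant pages and leave the control template untouched.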

Internal Linking

Testing different internal linking patterns is possible on templatised pages. You might add contextual links to a specific set of pages and measure whether they gain organic traffic compared to the control group. Or test whether adding breadcrumb navigation or related product links to a subset of pages changes their performance.

Internal link tests are trickier to isolate because links affect both the page they’re on and the page they point to, but they’re still viable with careful design.

Content Structure and Length

Adding content blocks, expanding thin pages, restructuring information hierarchy, or changing content format are all testable if you can apply the change consistently across a group of similar pages. This works particularly well for e-commerce category pages or location pages where you can add descriptive content to half the set and compare.

What You Can’t Easily Test (or Shouldn’t)

Here’s the part most guides skip or mention briefly. Some SEO factors are genuinely difficult or impossible to A/B test, and understanding why saves you from wasting time on flawed experiments.

Backlinks and Off-Page Authority

You can’t split test backlinks. Building links to half your product pages and not the other half isn’t a controlled experiment. Link acquisition is unpredictable by nature, the quality and relevance of links varies wildly, and you can’t standardise the treatment across groups. Link authority is also cumulative and flows between pages through internal links, so it doesn’t lend itself to clean page-level split testing.

If you want to understand the impact of link building, you’re looking at before-and-after analysis, not A/B testing.

Algorithm Changes

Google’s algorithm updates happen to your entire site at once. You can’t test against them because there’s no control group. When a core update rolls out, every page is affected by the same change at the same time.

What you can do is analyse the impact of updates after they happen, looking at which page types gained or lost visibility. But that’s forensic analysis, not testing.

Site-Wide Technical Changes

Changes that apply universally, like switching to a new CMS, implementing a CDN, changing your URL structure, or migrating to HTTPS, affect every page simultaneously. There’s no way to apply these changes to half your pages. You can measure the before-and-after impact, but it’s not a split test.

Page Speed Improvements

While site speed matters for SEO, testing speed improvements is problematic. Server-side changes affect all pages. Even if you could serve faster versions of some pages, the measurement is complicated by caching, user location, device variation, and the fact that speed improvements tend to be site-wide infrastructure changes.

Brand and Domain-Level Signals

Your domain’s age, authority, brand recognition, and overall trust are not page-level variables. You can’t A/B test your way to a more authoritative domain. These factors operate above the page level and influence everything on your site equally.

How to Measure Results

Running the test is only half the challenge. Measuring and interpreting results properly is where many SEO tests fall apart.

Statistical Significance Matters

A small uptick in traffic to your variant pages doesn’t necessarily mean your change worked. Organic traffic fluctuates naturally. You need enough data to be confident the difference isn’t just noise.

Most SEO testing platforms use a forecasting model. They predict what the variant group’s traffic would have been based on the control group’s performance, then compare the prediction against actual results. The gap between predicted and actual, combined with confidence intervals, tells you whether the result is statistically significant.
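The idea behind that forecasting step can be sketched in a few lines of Python. This version is deliberately simplified: it assumes the variant group’s traffic would have stayed at the same ratio to the control group’s traffic as in the pre-test period. Real platforms use more sophisticated models, and the weekly session numbers here are invented.

```python
def estimate_uplift(control_pre, variant_pre, control_test, variant_test):
    """Estimate relative uplift of the variant group against a forecast.

    Each argument is a list of sessions per period (for example, per week).
    The forecast assumes the pre-test variant/control ratio would have held.
    """
    ratio = sum(variant_pre) / sum(control_pre)
    predicted_total = sum(sessions * ratio for sessions in control_test)
    actual_total = sum(variant_test)
    return (actual_total - predicted_total) / predicted_total

# Invented weekly sessions: four pre-test weeks, two test weeks.
control_pre = [1000, 1100, 900, 1000]
variant_pre = [500, 550, 450, 500]
control_test = [1200, 1300]
variant_test = [700, 675]

uplift = estimate_uplift(control_pre, variant_pre, control_test, variant_test)
print(f"Estimated uplift: {uplift:.1%}")
```

In this example the control group grew during the test, so the forecast for the variant group rises with it. Only traffic beyond that forecast counts as uplift.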

A common threshold is 95% confidence. Loosely, that means that if the change had no real effect, random variation alone would produce a difference this large only about 5% of the time. Don’t draw conclusions from results that haven’t reached significance.
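One simple way to check significance without a dedicated platform is a permutation test on per-page changes. The sketch below is an illustration of that technique, not how any particular tool works. It assumes you have each page’s relative traffic change over the test period, and the figures are made up.

```python
import random
from statistics import mean

def permutation_p_value(control, variant, n_permutations=5000, seed=0):
    """Two-sided permutation test on the difference in group means.

    Shuffles the group labels repeatedly and counts how often a
    difference at least as large as the observed one appears by chance.
    """
    rng = random.Random(seed)
    observed = abs(mean(variant) - mean(control))
    pooled = list(control) + list(variant)
    n_variant = len(variant)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_variant]) - mean(pooled[n_variant:]))
        if diff >= observed:
            extreme += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (extreme + 1) / (n_permutations + 1)

# Made-up relative traffic changes per page (0.08 means +8%).
control_changes = [-0.02, 0.01, 0.00, -0.01, 0.02, 0.01, -0.03, 0.00]
variant_changes = [0.08, 0.10, 0.07, 0.12, 0.09, 0.11, 0.06, 0.10]

p_value = permutation_p_value(control_changes, variant_changes)
print(f"p-value: {p_value:.4f}")
```

A p-value below 0.05 corresponds to the 95% threshold. With only eight pages per group, though, a result like this still deserves caution.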

What to Measure

The primary metric is usually organic traffic to the page groups, measured through analytics and Google Search Console data. But depending on what you’re testing, you might also look at:

Organic CTR for title tag and meta description tests

Impressions for changes that might affect ranking breadth

Average position for changes to on-page content or headings

Conversions from organic traffic if you want to connect SEO changes to business outcomes

Avoid measuring too many things at once. Pick a primary metric that directly relates to your hypothesis.
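If CTR is your primary metric, how you aggregate it across a page group matters. The sketch below pools clicks and impressions across the group rather than averaging each page’s CTR, so that low-impression pages don’t skew the figure. The row format is an assumption modelled on the clicks and impressions columns in Search Console exports.

```python
def group_ctr(rows):
    """Pooled CTR for a page group: total clicks / total impressions."""
    clicks = sum(row["clicks"] for row in rows)
    impressions = sum(row["impressions"] for row in rows)
    return clicks / impressions if impressions else 0.0

def mean_page_ctr(rows):
    """Unweighted average of per-page CTRs, shown for comparison."""
    ctrs = [row["clicks"] / row["impressions"] for row in rows if row["impressions"]]
    return sum(ctrs) / len(ctrs) if ctrs else 0.0

# Invented data: one busy page and one page with almost no impressions.
rows = [
    {"page": "/p/a", "clicks": 50, "impressions": 1000},
    {"page": "/p/b", "clicks": 1, "impressions": 2},
]

pooled = group_ctr(rows)
averaged = mean_page_ctr(rows)
print(f"pooled: {pooled:.3f}, averaged: {averaged:.3f}")
```

Here a single click on a page with two impressions drags the unweighted average far above the pooled figure, which is why the pooled version is usually the safer primary metric.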

Account for External Factors

Seasonality, competitor activity, algorithm updates, and other external events can all distort your results. Running both a control and variant group helps control for most of these, since both groups should be equally affected. But if something unusual happens during your test period, such as a major algorithm update, you may need to extend the test or discard the results.

Common Mistakes in SEO A/B Testing

Even with the right setup, several mistakes can undermine your tests.

Testing too many things at once. If you change the title tag, add schema markup, and restructure the content simultaneously, you won’t know which change drove the result. Test one variable at a time.

Not running tests long enough. Two weeks might feel like enough, but SEO changes can take time to fully propagate. Ending a test too early means you might catch a temporary fluctuation rather than a genuine trend. Three to four weeks is a safer minimum.

Ignoring the control group. If your control group’s traffic also changed significantly during the test, your variant results need to be interpreted relative to that shift, not in absolute terms. The control group exists to account for external factors.

Drawing conclusions from small sample sizes. If you split 20 pages into two groups of 10, even a tool showing “significant” results should be treated with scepticism. Small groups produce noisy data.

Assuming results are permanent. An SEO test shows what happened during the test period under the conditions that existed at the time. Algorithm updates, competitive changes, and seasonal shifts can all alter the outcome if you were to rerun the test later.

When SEO A/B Testing Isn’t Practical

Here’s the honest reality: most websites can’t run true SEO split tests. And that’s fine.

If your site has fewer than a few hundred templatised pages, you probably don’t have enough volume for reliable results. If your organic traffic is modest, you’ll struggle to reach statistical significance in a reasonable timeframe. If your pages are mostly unique (like a services site or a small blog), there’s no clean way to create matched groups.

Small and medium-sized businesses, service companies, and sites with fewer than a hundred pages are almost always in this category. That doesn’t mean they can’t make informed SEO decisions. It just means they need different approaches.

Time-Based Testing as an Alternative

For sites that can’t run page-level split tests, time-based testing is the next best option. Make a change across a group of pages, then compare organic performance for a period before the change against a period after. It’s less rigorous because there’s no simultaneous control group, so you can’t fully account for external factors. But it’s better than making changes blind.

The approach works best for changes that are likely to have a noticeable impact, such as rewriting title tags across all blog posts or adding schema markup to all product pages. Subtle changes will be lost in the noise of natural traffic variation.

Using Google Search Console for Lightweight Testing

Even without a formal testing setup, Google Search Console gives you before-and-after data on impressions, clicks, CTR, and average position. If you change title tags on a batch of pages, you can compare their Search Console metrics from before the change to the weeks after. It’s not a controlled experiment, but it’s practical data that most sites can access.
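A before-and-after comparison like this is straightforward to script. The sketch below assumes daily rows with a date, clicks, and impressions, similar to what you might export from Search Console. The field names, dates, and numbers are illustrative, and it normalises by day so periods of different lengths can be compared.

```python
from datetime import date

def summarise(rows):
    """Clicks per day and pooled CTR for a set of daily rows."""
    days = len({row["date"] for row in rows})
    clicks = sum(row["clicks"] for row in rows)
    impressions = sum(row["impressions"] for row in rows)
    return {
        "clicks_per_day": clicks / days if days else 0.0,
        "ctr": clicks / impressions if impressions else 0.0,
    }

def before_after(rows, change_date):
    """Split rows at the change date and summarise each period."""
    before = [row for row in rows if row["date"] < change_date]
    after = [row for row in rows if row["date"] >= change_date]
    return summarise(before), summarise(after)

# Invented daily totals for a batch of pages.
rows = [
    {"date": date(2023, 9, 1), "clicks": 40, "impressions": 1000},
    {"date": date(2023, 9, 2), "clicks": 44, "impressions": 1000},
    {"date": date(2023, 9, 3), "clicks": 55, "impressions": 1000},
    {"date": date(2023, 9, 4), "clicks": 53, "impressions": 1000},
]

before, after = before_after(rows, change_date=date(2023, 9, 3))
print(before, after)
```

Because there’s no control group, a difference here only tells you that something changed, not why. Check for seasonality or algorithm updates in the same window before crediting your edit.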

Rely on Industry Evidence

Platforms like SearchPilot publish case studies from large-scale SEO tests they’ve run across enterprise sites. While these results won’t map perfectly to your site, they provide directional evidence. If a title tag format change produced a 5-10% traffic uplift across thousands of pages on a retail site, it’s reasonable to test a similar approach on yours, even without formal split testing infrastructure.

Google’s Official Guidance on SEO Testing

Google has addressed A/B testing directly, and their guidance is worth knowing.

No cloaking. You must not show Googlebot a different version of the page than what users see. If you’re running a CRO test that shows different content to different users, make sure Googlebot sees the same page as regular users. Serving search engines a specific version designed to manipulate rankings violates Google’s guidelines.

Use 302 redirects if redirecting. If your test involves redirecting some URLs to variant pages, use 302 (temporary) redirects, not 301s (permanent). A 302 tells Google the redirect is temporary and the original URL should stay in the index. A 301 signals a permanent move, which could consolidate signals away from your original page.

Use canonical tags appropriately. If variant pages exist at separate URLs, the canonical tag should point to the original page. This prevents Google from indexing the test variant as a separate, competing page.

Don’t run tests indefinitely. Google’s guidelines note that running a test for an unreasonable length of time could be seen as an attempt to deceive the search engine. Run your test, collect your data, and either implement the winning version or revert.

These guidelines are mostly relevant to tests where variant pages exist as separate URLs. For page-level split testing where changes are made directly to existing URLs (the more common approach in SEO testing), the main concern is simply not cloaking.
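For tests that do use separate variant URLs, the redirect and canonical guidance comes down to two small pieces of output. The Python sketch below shows them as plain functions. The function names and URLs are hypothetical, and how you wire them into your server or templates depends on your stack.

```python
from html import escape

def temporary_redirect(variant_url):
    """Status code and headers for a temporary (302) redirect to a variant."""
    return 302, [("Location", variant_url)]

def canonical_link(original_url):
    """Canonical tag for a variant page, pointing back at the original URL."""
    return f'<link rel="canonical" href="{escape(original_url, quote=True)}">'

# Hypothetical URLs for illustration.
status, headers = temporary_redirect("/product/trail-runner-2-b")
tag = canonical_link("https://example.com/product/trail-runner-2")
print(status, headers)
print(tag)
```

The 302 status keeps the original URL in the index for the duration of the test, and the canonical tag tells Google which URL the variant belongs to.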

Making the Most of SEO Testing Without Enterprise Tools

You don’t need a dedicated platform or thousands of pages to bring a testing mindset to your SEO content and technical SEO work. The core principle is simple: make one change at a time, measure the impact, and let data inform your next move.

Start with title tags. They’re the easiest element to change, the fastest to show results, and the simplest to measure through Search Console. Pick a batch of similar pages, update the title format, and track impressions and CTR over the following weeks. Even without statistical rigour, you’ll get useful signals.

Document everything. Record what you changed, when, and what happened. Over time, these small experiments build a library of what works for your specific site, which is more valuable than any generic best practice.
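A spreadsheet is enough for this, but if you prefer something scriptable, here is a minimal sketch of an experiment log in Python. The field names are suggestions rather than a standard, and the example entry is made up.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields

@dataclass
class Experiment:
    changed_on: str          # ISO date the change went live
    pages: str               # which pages or template were changed
    change: str              # the single variable that was changed
    hypothesis: str          # what you expected and why
    primary_metric: str      # the one metric you will judge it on
    result: str = "pending"  # filled in once the data is in

def to_csv(experiments):
    """Serialise experiments to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(Experiment)])
    writer.writeheader()
    for experiment in experiments:
        writer.writerow(asdict(experiment))
    return buffer.getvalue()

log = [
    Experiment(
        changed_on="2023-10-02",
        pages="product pages (variant group)",
        change="added product category to title tag",
        hypothesis="category in title raises organic CTR",
        primary_metric="organic CTR",
    )
]
csv_text = to_csv(log)
print(csv_text)
```

Recording the hypothesis and primary metric before the results come in makes it harder to rationalise an ambiguous outcome afterwards.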

If your site does have the scale for formal testing, the investment can be substantial. Dedicated SEO testing platforms exist, and larger sites with template-driven page structures get the most out of them. But even at that scale, the fundamentals are the same: clear hypothesis, single variable, sufficient data, honest interpretation. For a structured approach to making these decisions, a proper digital strategy framework ensures testing fits into broader optimisation goals rather than being a series of disconnected experiments.

Liam Blackledge
Liam has been in the SEO industry since 2019, cutting his teeth as an SEO Executive before levelling up by joining Gorilla at Manager level in 2023. Specialising in technical SEO, site architecture and content strategy, Liam manages a portfolio of clients across multiple sectors and takes a hands-on approach to every campaign he runs. When he’s not buried in Search Console, he’s either hard at work at the snooker table, or telling anyone who’ll listen that he’s going to start back at the gym.
