A/B testing compares two versions of a webpage to see which one performs better. You show Version A to half your visitors and Version B to the other half. Then you measure which version gets more signups, clicks, or sales.
That's the whole concept. Three sentences.
The execution is where people get lost. So here's everything you need: how testing works, which tools are worth your money, what the research says about win rates, and how to avoid the mistakes that waste most teams' time.
What is A/B testing?
You have a webpage. Maybe it's your homepage, a pricing page, or a signup form. You think a different headline might get more people to click. A/B testing lets you find out without guessing.
Here's the process:
- Pick what to change. A headline, a button, a call-to-action, an image. One thing.
- Create Version B. Your current page is Version A (the original). Your change becomes Version B.
- Split traffic. Half your visitors see A, half see B. The split is random.
- Wait for data. The test runs until you have enough visitors for a reliable answer.
- Read the results. One version wins, or they perform about the same. Both outcomes are useful.
| Step | What happens | What you do |
|---|---|---|
| Setup | Pick a page and an element to test | 3 minutes in most tools |
| Split | Visitors randomly see Version A or B | Nothing. The tool handles this |
| Measure | Conversions are tracked per version | Check back in a week or two |
| Decide | One version gets more conversions | Keep the winner, or test again |
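The "split" step deserves a closer look, because it has to be both random and sticky: a returning visitor should always see the same version. Here's a minimal sketch of how a testing tool might do that. The function and IDs are illustrative, not Kirro's actual implementation:

```python
import hashlib

def assign_variant(visitor_id: str, test_id: str) -> str:
    """Deterministically bucket a visitor into version A or B.

    Hashing visitor_id together with test_id gives a stable,
    effectively random 50/50 split: the same visitor always sees
    the same version, and different tests split independently.
    """
    digest = hashlib.sha256(f"{test_id}:{visitor_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"

# The same visitor gets the same version on every page load.
print(assign_variant("visitor-42", "headline-test"))
```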
The term "split testing" means the same thing. Some people use the two terms interchangeably, others draw a technical distinction between them. For most purposes, they're the same idea.
With Kirro, setup takes about three minutes. Paste a small script on your site, open the visual editor, click the element you want to change, and hit start. No code. No developer.
What can you test?
Almost anything visible on a page. Headlines, button text, images, hero sections, form layouts, pricing displays, page structure. If a visitor can see it, you can test it.
How do you know if a test worked?
After enough visitors have seen both versions, the tool runs the numbers. You'll see something like: "Version B gets 23% more signups. Kirro is confident this works." Or: "Both versions performed about the same." Both answers are useful. A "no difference" result means your original was already decent. That's good information.
The key phrase is "enough visitors." Run a test with 50 visitors and you're basically flipping a coin. Run it with 5,000 and you're getting a real answer. How many you need depends on your traffic and the size of the change you're testing. (More on sample size later.)
When you're testing multiple elements at once (a headline AND a button AND an image), that's called multivariate testing. It needs much more traffic. Most small businesses don't need it. Start with one change at a time.
Why A/B testing matters
According to BuiltWith data compiled by Convert.com, only 0.2% of all websites use any A/B testing tool. Even among the top 10,000 highest-traffic sites, just 32% run tests.
That means the vast majority of businesses are guessing what works. And guessing is expensive.
Here's what happens when companies actually test:
| Example | What was tested | Result | Source |
|---|---|---|---|
| Bing (Microsoft) | Ad headline layout | 12% revenue increase (~$100M/year in the US) | HBR, 2017 |
| Obama campaign | Signup page variations | 140% more signups, $75M in added donations | TrueList |
| Bing page speed | 100ms faster load time | $18M additional annual revenue | HBR, 2017 |
These are big companies with big numbers. But the principle scales down perfectly.
Say your landing page converts at 2% and gets 10,000 visitors a month. That's 200 conversions. A better headline pushes it to 2.5%. Now you're getting 250 conversions. Same traffic, 50 more customers every month, zero extra ad spend. Over a year, that's 600 additional customers from a single headline test.
And the wins compound. Run 10-15 tests a year, find 3-4 winners, and each improvement stacks on top of the last. A 2% conversion rate becomes 2.5%, then 2.8%, then 3.1%. Your website gets better at turning visitors into customers without any increase in traffic budget.
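If you want to sanity-check that math, here's the arithmetic from the two paragraphs above as a short script. The three relative lifts in the compounding loop are illustrative numbers chosen to match the 2% to 3.1% progression:

```python
visitors_per_month = 10_000

# One winning headline test: 2.0% -> 2.5% conversion rate
before = visitors_per_month * 0.020   # 200 conversions/month
after = visitors_per_month * 0.025    # 250 conversions/month
print(f"Extra customers per month: {after - before:.0f}")        # 50
print(f"Extra customers per year:  {(after - before) * 12:.0f}") # 600

# Compounding: three winners in a year, each a relative lift
rate = 0.020
for lift in (0.25, 0.12, 0.11):
    rate *= 1 + lift
print(f"Compounded conversion rate: {rate:.1%}")  # about 3.1%
```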
This is why Ron Kohavi, who ran experimentation at Microsoft, Amazon, and Airbnb, calls controlled experiments "the best scientific way to establish causality." You're not guessing what works. You're measuring it.
Our take: Most A/B tests don't produce winners. At Google and Bing, only 10-20% of experiments show positive results. That's normal. The 10-20% that DO win more than pay for the effort. Testing isn't about winning every time. It's about finding the wins that matter.
Booking.com runs roughly 1,000 tests at any given time. Google runs over 10,000 per year. They don't do this because every test is a slam dunk. They do it because the ones that hit, hit big.
Frequently asked questions
What is A/B testing?
A/B testing shows two versions of a webpage (or email, or ad) to different groups of visitors. You measure which version gets more conversions, whether that's signups, purchases, or clicks. The version that performs better wins.
It's the simplest way to make data-backed decisions about your website instead of guessing. You don't need a statistics degree or a dedicated CRO team. With modern tools like Kirro, you can set up and run a test in a few minutes.
How long should an A/B test run?
At minimum, two weeks. Even if you hit your target sample size faster, you need at least one full business cycle to account for weekday vs weekend differences.
The biggest mistake is stopping a test early because it "looks like" a winner. Repeatedly checking results before the test is done, and stopping at the first significant-looking reading, inflates your false positive rate from 5% to 26%. Let the test run.
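You can see the peeking effect in a quick simulation: run many A/A tests (both versions identical, so any "winner" is a false positive) and declare significance the moment any of 20 interim checks crosses the usual 5% threshold. This is an illustrative sketch, not Evan Miller's exact setup:

```python
import math
import random

def peeking_false_positive_rate(n_sims=2000, n_per_arm=2000, checks=20):
    """Simulate A/A tests (no real difference) with repeated peeks.

    Counts how often ANY interim check looks significant at the
    5% level, i.e. the inflated false positive rate."""
    z_crit = 1.96               # two-sided 5% threshold
    step = n_per_arm // checks  # visitors between peeks
    false_positives = 0
    for _ in range(n_sims):
        a = b = n = 0
        significant = False
        for _ in range(n_per_arm):
            a += random.random() < 0.05  # both arms convert at 5%
            b += random.random() < 0.05
            n += 1
            if n % step == 0:            # a "peek"
                pooled = (a + b) / (2 * n)
                se = math.sqrt(pooled * (1 - pooled) * 2 / n)
                if se > 0 and abs(a - b) / n / se > z_crit:
                    significant = True
        false_positives += significant
    return false_positives / n_sims

print(peeking_false_positive_rate())  # roughly 0.2, not 0.05
```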
What is a good sample size for A/B testing?
There's no universal number. It depends on your current conversion rate, the minimum improvement you'd find meaningful, and your desired confidence level.
Rough benchmark: most tests need at least 350-1,000 conversions per version. Sites with fewer than 10,000 monthly visitors should focus on testing big, obvious changes (headlines, layout) rather than subtle tweaks. Use our free sample size calculator to get a number for your situation.
What's the difference between A/B testing and multivariate testing?
A/B testing compares two versions with one change. Multivariate testing changes multiple elements at the same time and measures how they interact.
The tradeoff: multivariate tests reveal more but need significantly more traffic (often 10x or more). In practice, less than 1% of all tests run are multivariate. For most businesses, A/B testing is the right starting point. Test one thing, get a clear answer, move on.
Ready to stop guessing and start measuring? Try Kirro free for 30 days. Set up your first test in 3 minutes. No credit card. No setup guide. See what actually works.
A/B Testing Tools & Software
The A/B testing tools market is approaching $1 billion and growing at about 14% per year. Yet only 11.5% of the top million websites actually run a testing tool. That's a lot of sites leaving money on the table.
The market itself is lopsided. Most tools were built for enterprise teams with dedicated optimization specialists and six-figure budgets. If you're a marketer or founder at a smaller company, the options look expensive and complicated. They don't have to be.
When Google Optimize shut down in September 2023, an estimated 2-3 million websites lost the only simple, free testing tool that worked inside their Google stack. That gap still hasn't been properly filled. Enterprise tools are too expensive. Developer tools assume you write code.
Kirro was built for that gap. EUR 99/month, unlimited tests, unlimited visitors, visual editor, GA4 integration. No per-visitor pricing that punishes you for growing. Try it free for 30 days.
How to pick the right tool
A/B testing tools fall into four categories. Knowing which one fits saves you from buying a race car when you need a bicycle.
| Category | Who it's for | Price range | How it works | Examples |
|---|---|---|---|---|
| Visual editor | Marketers, founders, small teams | $0-$1,200/year | Point-and-click changes on your live site. No code. | Kirro, Crazy Egg, VWO |
| Full-stack platform | Mid-market to enterprise CRO teams | $3,500-$100,000+/year | Visual editor plus server-side testing, personalization, and analytics. | VWO, Convert, Optimizely, AB Tasty |
| Developer-first | Engineering teams, feature flagging | $0-$5,000/year | Code-based. Tests live in your codebase, not a visual editor. | GrowthBook, Statsig, LaunchDarkly |
| Server-side only | Large sites needing zero-flicker performance | $10,000+/year | Tests run on your server before the page loads. Fastest, but requires dev work. | Optimizely Full Stack, Kameleoon |
Visual editor tools are the starting point for most small teams. You click on a headline, change it, and hit start. Kirro fits here. So do VWO's basic plan and Crazy Egg.
Full-stack platforms bundle everything: visual editor, server-side tests, personalization, heatmaps, session recordings. VWO and Optimizely live here. You pay for the bundle whether you use all of it or not.
Developer-first tools are free or cheap but need someone who can write code. GrowthBook is open source and genuinely free. PostHog and Statsig have generous free tiers. If you have a developer on the team and a data warehouse, these are worth a look. If you don't, skip them.
Pricing models matter too. Some tools charge per seat (per person who logs in). Others charge per MTU (monthly tested users, meaning visitors who see a test). MTU pricing means your bill goes up as your traffic grows. Seat pricing stays flat. Kirro charges a flat EUR 99/month regardless of traffic or team size. That's unusual. Most tools in this space charge more as you grow.
One thing every "best tools" roundup skips: only about 1 in 8 A/B tests produces a clear winner. The tool you pick matters less than actually running tests consistently. A EUR 99/month tool used weekly beats a $36,000/year tool used quarterly.
Find the right guide for your question
Each of these posts goes deep on a specific angle. Here's where to start based on what you're looking for.
Want a full buyer's guide? Our A/B testing software comparison reviews 13 tools side by side. Total cost of ownership, real pricing (not just the sticker), and tradeoffs nobody else mentions. It's the deep dive.
Short on time? The best A/B testing tools post is the 5-minute version. Five picks organized by use case: best for small teams, best for enterprise, best for ecommerce, best for developers. Clear winner recommendation upfront.
Need redirect testing? Split testing (sending visitors to completely different URLs) is a different job than element-level A/B testing. Our split testing software guide covers 8 tools that handle real URL redirects and explains when you need that instead of standard A/B testing.
Building a mobile app? Mobile testing has unique headaches: app store review delays, version fragmentation, and smaller sample sizes. The mobile app A/B testing guide covers the best tools and what makes mobile different from web testing.
The bigger picture
A/B testing tools are one piece of the conversion puzzle. The full CRO stack includes heatmaps, session recordings, surveys, and analytics alongside testing. Our CRO tools guide breaks down the full toolkit.
For the fundamentals of A/B testing itself (how it works, what to test first, how to read results), start with our A/B testing pillar guide. Everything in this section lives under that umbrella.
CRO Tools & Software
Most "CRO tools" articles list 15 products and tell you to pick a few from each category. That's advice built for teams with dedicated optimization specialists and five-figure budgets. If you're a marketer or founder at a smaller company, you need a different approach.
Here's the reality: only 0.2% of websites use any A/B testing tool at all. The gap isn't knowledge. It's that the industry keeps selling full "CRO platforms" to people who need one or two tools.
A/B testing tools and CRO tools are different things, even though people use the terms interchangeably. An A/B testing tool does one job: it shows two versions of a page to different visitors and tells you which one performs better. A CRO tool is a broader category that includes heatmaps, session recordings, surveys, form analytics, and personalization alongside testing. Think of A/B testing as one wrench. CRO tools are the whole toolbox.
The question is whether you need the toolbox or just the wrench. For most small teams, the answer is the wrench plus one or two free tools you probably already have access to.
The 3-tool stack that covers 90% of what you need
This isn't a theoretical framework. It's what actually works for small teams:
Google Analytics 4 handles the "measure" part. Track traffic, conversions, funnels, and user behavior. It's free and it's the foundation.
Microsoft Clarity handles the "watch" part. Free heatmaps and session recordings with no limits on traffic or sessions. See where people click, how far they scroll, and where they rage-click. No credit card, no caps, no catch. Hotjar still owns 55% of the heatmap market, but Clarity does the same core job for zero dollars.
Kirro handles the "test" part. Change a headline, swap a button, try a new hero section. EUR 99/month, unlimited tests, unlimited visitors, 9KB script. The visual editor means no developer needed.
Total cost: EUR 99/month. The enterprise equivalent (Optimizely + Hotjar Business + GA4 360) runs $50,000+ per year. Same workflow. Different price tag.
Our take: The VWO and AB Tasty merger in January 2026 signals where the industry is heading. PE-backed consolidation pushes tools upmarket and prices up. The gap for small teams keeps getting wider. Building your stack from focused, affordable tools protects you from that trend.
Which post should you read first?
Every post in this cluster covers a different part of the CRO toolkit. Start with the one that matches where you are right now.
Choosing tools by category? Our CRO tools guide breaks down all six types of conversion optimization tools (analytics, heatmaps, testing, surveys, form analytics, personalization). It includes a decision framework based on your traffic level and budget, plus honest recommendations for each category. Start here if you're building your stack from scratch.
Ready to buy software? The CRO software buyer's guide is for people who already know what they need and want to compare specific products. It covers pricing, stacks by budget tier, and the trade-offs between all-in-one platforms and best-of-breed setups. Includes the post-merger VWO/AB Tasty picture.
Need to understand user behavior first? Our session replay tools roundup compares every major heatmap and recording tool, starting with free options. If you want to see what visitors actually do on your site before deciding what to test, start there.
Thinking about switching from Hotjar? The Hotjar alternatives guide covers why teams leave (pricing, session limits on the free plan) and what to use instead. Includes a head-to-head with Microsoft Clarity and the full comparison with FullStory, Mouseflow, and Lucky Orange.
The thread connecting all of this
Most businesses get stuck because they either skip straight to testing (without knowing what to test) or buy observation tools and never act on what they find.
The posts in this cluster cover the full picture, from choosing the right tool categories to comparing specific products. The parent A/B Testing pillar page ties this into the broader testing and experimentation strategy.
When you're ready to start testing, Kirro's free trial gives you 30 days with the full product. No features locked. No credit card required. Pair it with Clarity and GA4, and you've got the same CRO workflow the enterprise teams use.
Competitor Comparisons
The A/B testing market just got a lot smaller. VWO and AB Tasty merged in January 2026, creating a $100M+ revenue company backed by private equity. Optimizely keeps moving upmarket. The tools are consolidating into bigger, pricier bundles.
And the gap for small teams keeps getting wider.
High pricing already restricts about 35% of potential market growth among small and mid-size businesses. If that describes you, the comparisons below cut through the noise. No feature spreadsheets. Just: here's what each tool costs, who it's built for, and whether it fits your situation.
How to pick the right comparison
Every post in this cluster covers a different buying scenario. Here's which one to read based on where you are right now.
Comparing the two biggest names? Our VWO vs Optimizely comparison uses real G2 data, pricing breakdowns, and 796 verified TrustRadius reviews. VWO wins on ease of use and price. Optimizely wins on enterprise features. For teams under 10 people, the honest answer is: both are overkill.
Leaving Optimizely? The best Optimizely alternatives guide covers seven replacements, organized by why you're switching. Price, complexity, privacy, open source. Each reason points to a different tool. Updated for the VWO/AB Tasty merger.
Lost Google Optimize? When Google shut down Optimize in September 2023, 500,000+ websites lost their testing tool. Our Google Optimize alternatives guide ranks 10 replacements with real pricing and honest drawbacks.
Need to know what Optimizely actually costs? Optimizely pricing breaks down the real numbers. Spoiler: it starts at $36,000/year, and total cost runs 35-50% above the license fee. None of this is on their website.
Evaluating Convert? Our Convert A/B testing review covers the privacy-focused mid-market option. Strong on data compliance and support. Starts at $299/month. But the visual editor's limitations and a HIPAA compliance gap are worth knowing about before you commit.
The market right now
Here's how the A/B testing tool market breaks down in 2026:
| Tier | Tools | Starting price | Built for |
|---|---|---|---|
| Enterprise | Optimizely, Adobe Target | $36,000+/year | 50+ person teams, dedicated CRO specialists |
| Mid-market (post-merger) | VWO/AB Tasty, Convert | $299-599/month | Agencies, mid-size companies, privacy-focused teams |
| SMB | Kirro, Mida, Zoho PageSense | EUR 99-299/month | Marketers, founders, small teams |
| Developer/open-source | GrowthBook, Statsig | Free-$150/month | Engineering-led product teams |
The mid-market is where the most change happened. VWO and AB Tasty combining means fewer choices at the $300-500/month price point. And PE-backed consolidation usually pushes prices up, not down.
If you're a small team or solo marketer, the parent A/B Testing pillar page covers the full picture, from methodology to tools to strategy.
Our take: The tool matters less than actually using it. CXL analyzed 28,304 experiments and found that the companies producing results weren't the ones with the fanciest tools. They were the ones running tests consistently. Only 1 in 8 A/B tests creates a significant lift. That means you need volume. A EUR 99/month tool used weekly beats a $36,000/year tool used quarterly.
Ready to stop comparing and start testing? Kirro's free trial gives you 30 days with the full product. No credit card. No feature limits. Three minutes to set up.
Platform-Specific Testing
A/B testing on your own website is straightforward. You control the traffic split, the test duration, and what "winning" means. Platform testing is different. Amazon, Meta, Google Ads, and Webflow each have built-in testing features, but they all play by their own rules.
The biggest difference most guides skip: on ad platforms, the algorithm decides who sees what. A 2025 study published in the Journal of Marketing found that platform algorithms create "divergent delivery," where different ads get shown to different types of people. Your "winning" ad might only look better because the algorithm showed it to more receptive users, not because the creative itself was stronger.
That's a problem if you're trying to learn what actually works.
What you can (and can't) test on each platform
| Platform | What you can test | What you canât test | Big limitation |
|---|---|---|---|
| Amazon | Titles, images, bullet points, A+ Content | Pricing, layout, storefront structure | Requires Brand Registry + undisclosed traffic minimum |
| Meta | Ad creative, audiences, placements | Landing page experience, post-click behavior | Algorithm redistributes budget mid-test |
| Google Ads | Ad copy, bidding strategies, asset groups | Cross-campaign comparisons, landing pages | One asset group at a time (10 groups = 40-60 weeks) |
| Webflow | Any page element via Webflow Optimize or third-party tools | Backend logic, pricing, checkout flows | Webflow Optimize starts at $299/month |
Our take: Every platform tests the ad or the listing. None of them test what happens after someone clicks. That's where most conversions are won or lost. A dedicated tool like Kirro fills that gap: you test your actual landing page, checkout flow, or homepage with a clean 50/50 traffic split that you control.
Pick the right guide for your situation
If you sell on Amazon and want to test product listings, our Amazon A/B testing guide walks through Manage Your Experiments step by step, including the eligibility requirements Amazon doesn't make obvious and the third-party alternatives when their tool falls short.
Running Facebook or Instagram ads? The Meta A/B testing guide covers how their Experiments tool changed in 2025, when to use it vs. testing landing pages externally, and how to avoid the algorithm interfering with your results.
For Google Ads, the Google Ads A/B testing guide explains campaign experiments, the new Performance Max asset testing beta, and the math behind why PMax testing takes so long.
If your site runs on Webflow, Webflow A/B testing compares your options: Webflow Optimize ($299/month), Optibase (from $19/month), and external tools like Kirro that work with any site, Webflow included.
The handoff problem
Here's the thing nobody in the ad platform world talks about: the ad got someone to click. Great. Now what?
If the landing page doesn't convert, the best ad creative in the world doesn't matter. Platform tests stop at the click. Your website is where the actual conversion happens, and that's where you need a separate testing tool.
Think of it as a relay race. The ad platform runs the first leg. Your landing page runs the second. Most teams only time the first runner.
All of these guides live under the A/B Testing & Experimentation pillar, alongside our tools and methodology deep dives.
SEO A/B Testing
Most testing guides treat SEO A/B testing and regular A/B testing as the same thing. They're not. The difference changes everything: which tools you need, how long tests take, how many pages you need, and whether your site even qualifies.
Regular A/B testing (the kind Kirro does) shows half your visitors one version of a page and the other half a different version. Same URL, two experiences. Youâre testing how people behave.
SEO testing can't work that way. Google is one crawler. You can't show it two versions of the same URL. So instead, you group similar pages (say, 200 product pages), change half of them, and compare organic traffic between the two groups over weeks. You're testing how the search engine behaves.
That distinction is why SEO testing is its own discipline. It needs different tools, larger sites, and more patience. For a full breakdown of how it works, read our SEO A/B testing guide. It covers the methodology, the tools, the real failure rates (75% of tests are inconclusive), and what smaller sites can do instead.
Who SEO testing is for (and who it isn't)
SEO split testing is built for publishers, large e-commerce sites, and marketplaces with hundreds or thousands of pages on the same template. Etsy, Pinterest, and Booking.com run it because they have the page volume to produce reliable results.
Most small and mid-size sites don't have that volume. And that's fine. If your site has fewer than 100 similar pages, the math doesn't work for a proper split test. The noise in your traffic data drowns out any real signal.
The better move? Test what happens after visitors arrive. Regular A/B testing on your landing pages, headlines, and calls to action works at any traffic level. Try Kirro free and test your highest-traffic page today. That's where the fastest wins are for most businesses.
Where to start reading
Our complete SEO A/B testing guide covers the full picture: how the methodology works step by step, what you can actually test (title tags, headings, structured data, internal links), why most tests show no clear result, and the honest alternatives for sites that don't have enterprise-scale traffic. It also breaks down every major SEO testing tool and when each one makes sense.
If you're new to testing in general, start with the A/B testing pillar page for the fundamentals. Want to understand how A/B testing affects conversion rates? That guide covers what to expect by industry. And if the term "split testing" is new, our split testing explainer starts from zero.
The bottom line: SEO testing is powerful for sites with the right setup. For everyone else, regular A/B testing delivers faster, cheaper results with way less complexity.
Testing Methodology
Only 1 in 7 A/B tests reaches statistical significance. That's an 86% failure rate. And yet 58% of companies have no framework for deciding what or how to test.
Those two stats are connected. Bad methodology is why most tests fail. Not bad ideas.
This section covers the science behind reliable testing. Every method below has a specific use case. Some need thousands of visitors. Some work with a few hundred. Some give you fast answers. Others give you precise ones. The trick is matching the method to your situation.
Kirro uses Bayesian statistics and handles the math for you. You get answers like "Version B has an 89% chance of being better," not p-values.
This cluster is part of the A/B Testing & Experimentation pillar.
Which method do I need?
Start here. Don't read all 18 articles. Answer these three questions, then go to the one that matches.
Question 1: How much traffic does your page get?
- Under 1,000 visitors/month: Focus on big, obvious changes. Test one thing at a time. Read split testing meaning for the basics, then jump to landing page split testing for a step-by-step playbook.
- 1,000 to 10,000 visitors/month: Standard A/B testing works. You can detect meaningful differences with enough patience. Start with how to design a marketing experiment and use our sample size calculator to check timing.
- 10,000+ visitors/month: You have options. A/B tests, multivariate testing, even multi-armed bandits. Read on.
Question 2: What's your goal?
- "I need to learn which version is better." Classic A/B test. You want precision and confidence. This is what most teams need most of the time.
- "I need to optimize revenue right now." Multi-armed bandits send more traffic to the winning version while the test runs. Less learning, more earning. Good for short campaigns and sales events.
- "I need to test several elements at once." Multivariate testing lets you test headlines, images, and buttons simultaneously. But you'll need serious traffic (think 50,000+ visitors) to get reliable results.
Question 3: How comfortable are you with statistics?
- "Not at all." That's fine. Bayesian A/B testing gives you probabilities in plain language. Kirro shows "89% chance Version B wins" instead of confusing p-values. Start there.
- "I know the basics." Dig into the sample size formula and A/B testing conversion rate benchmarks to set realistic expectations.
- "I want the deep stats." Go for type 1 vs type 2 errors, power analysis, null hypothesis testing, and minimum detectable effect.
Our take: A December 2025 Harvard Business Review study found that traditional significance testing demands 24 to 55 times more data than you actually need for a good business decision. Speed often matters more than certainty. Most small teams are better off running more tests with slightly less precision than running fewer "perfect" tests.
The core methods
Standard A/B testing splits traffic 50/50 between your current page and one change. It's the workhorse. Reliable, easy to understand, works at almost any traffic level. If you're new to testing, split testing meaning explains the concept, and landing page split testing walks through a real example. For the stats behind it, see A/B testing conversion rate.
Bayesian A/B testing updates results as visitors arrive instead of making you wait for a fixed sample. Kirro uses this approach because the results make sense to non-statisticians. "89% chance Version B wins" is a sentence your boss can act on. Our Bayesian A/B testing guide covers when it helps and when it's overkill.
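To make "89% chance Version B wins" concrete, here's a generic Bayesian comparison using Beta distributions and Monte Carlo sampling. This is a textbook sketch with uniform priors, not Kirro's published internals, and the conversion numbers are made up:

```python
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=100_000):
    """Estimate P(B's true conversion rate > A's) by sampling
    from each variant's Beta posterior (uniform Beta(1,1) prior)."""
    wins = 0
    for _ in range(draws):
        rate_a = random.betavariate(1 + conv_a, 1 + n_a - conv_a)
        rate_b = random.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += rate_b > rate_a
    return wins / draws

# 1,000 visitors per version: A converted 40, B converted 52
print(f"P(B beats A) = {prob_b_beats_a(40, 1000, 52, 1000):.0%}")  # around 89%
```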
Multivariate testing tests combinations of changes. Different headlines paired with different images paired with different buttons. Powerful, but hungry for traffic. Our multivariate testing guide includes the traffic calculator so you can check if your site qualifies before committing.
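The traffic hunger is easy to see on the back of an envelope: combinations multiply, and every combination needs its own sample. The per-variant figure below is an illustrative placeholder, not a universal rule:

```python
headlines, images, buttons = 3, 3, 2
combinations = headlines * images * buttons  # 18 variants
visitors_per_variant = 3_000                 # illustrative minimum per cell
total = combinations * visitors_per_variant
print(f"{combinations} combinations -> ~{total:,} visitors needed")  # ~54,000
```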
Multi-armed bandits automatically shift traffic toward whichever version is winning. Less learning, faster revenue. Good for flash sales or time-limited campaigns where waiting for full statistical confidence would cost you money. Deep dive: multi-armed bandit testing.
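The most common bandit algorithm, Thompson sampling, is surprisingly small: for each new visitor, sample a plausible conversion rate for every variant from its Beta posterior and show the variant that sampled highest. Winners naturally attract more traffic. A minimal sketch with made-up running totals:

```python
import random

def thompson_pick(arms):
    """arms maps variant name -> (conversions, visitors) so far.
    Returns the variant to show the next visitor."""
    best, best_draw = None, -1.0
    for name, (conversions, visitors) in arms.items():
        # One draw from this arm's Beta posterior (Beta(1,1) prior)
        draw = random.betavariate(1 + conversions, 1 + visitors - conversions)
        if draw > best_draw:
            best, best_draw = name, draw
    return best

arms = {"A": (18, 400), "B": (31, 410)}
print(thompson_pick(arms))  # "B" most of the time, "A" occasionally
```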
Sequential testing lets you stop a test early (or keep it running longer) based on ongoing results, without inflating your false positive rate. It solves the "peeking problem" that Evan Miller famously showed raises error rates from 5% to 26%. Full guide: sequential testing.
CUPED (variance reduction) uses data you already have about your visitors to reduce the noise in your results. The practical result: 30 to 40% smaller sample sizes for the same precision. If your tests always seem to take too long, this is probably the fix. Guide: CUPED and variance reduction.
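Here's the core CUPED adjustment on synthetic data: use each visitor's pre-experiment metric (X) to strip predictable variation out of the in-experiment metric (Y). This sketch needs Python 3.10+ for statistics.covariance, and the data is fabricated with a deliberately strong covariate, so the variance drop is larger than typical:

```python
import random
import statistics as st

random.seed(1)

# X = a pre-experiment metric per visitor (e.g. past weekly visits);
# Y = the in-experiment metric, partly predicted by X.
x = [random.gauss(10, 3) for _ in range(5_000)]
y = [0.8 * xi + random.gauss(0, 1.5) for xi in x]

# CUPED: y_adj = y - theta * (x - mean(x)), theta = cov(x, y) / var(x)
theta = st.covariance(x, y) / st.variance(x)
x_mean = st.mean(x)
y_adj = [yi - theta * (xi - x_mean) for xi, yi in zip(x, y)]

print(f"variance before CUPED: {st.variance(y):.2f}")      # ~8
print(f"variance after CUPED:  {st.variance(y_adj):.2f}")  # ~2.3
```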
The statistics that actually matter
Every testing method above relies on the same handful of statistical concepts. You don't need to calculate them (that's the tool's job). But knowing what they mean helps you avoid the most common A/B testing mistakes.
Sample size is "how many visitors do I need?" Too few and your test can't tell a real winner from random noise. Our sample size formula guide breaks down the math, and the free calculator does it for you.
Minimum detectable effect is "what's the smallest improvement worth finding?" If you'd only act on a 20% improvement, don't set up a test designed to detect 2% changes. It'll take forever. MDE guide.
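The standard two-proportion sample size formula ties those two concepts together: plug in your baseline rate and your minimum detectable effect, and out comes the visitors needed per variant. This is a sketch of the classic frequentist approximation; your tool or our calculator may use slightly different conventions:

```python
import math
from statistics import NormalDist

def sample_size_per_variant(baseline, mde_relative, alpha=0.05, power=0.80):
    """Visitors needed per variant for a two-sided test of two
    proportions (classic frequentist approximation)."""
    p1 = baseline
    p2 = baseline * (1 + mde_relative)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # 0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return math.ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)

# 2% baseline, detecting a 25% relative lift (2.0% -> 2.5%)
print(sample_size_per_variant(0.02, 0.25))  # ~13,800 visitors per variant
```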
Type 1 and type 2 errors are the two ways a test can lie to you. A type 1 error says B wins when it doesnât (false alarm). A type 2 error misses a real winner (missed opportunity). Understanding both helps you set up tests that balance speed with accuracy.
Statistical power is the probability your test will actually detect a real difference. Low power means you'll miss winners. Microsoft runs 10,000+ experiments annually and still obsesses over power calculations. If it matters to them at that scale, it matters to you. Power analysis guide.
For the full theoretical foundation (what a null hypothesis is, how to think about probability): null hypothesis in A/B testing.
Implementation and architecture
Picking the right statistical method gets you halfway. The other half is the practical setup: where the test runs, how visitors get assigned to versions, and what happens when cookies disappear.
Experiment design covers the full process: forming a clear guess about what will happen, choosing the right metric, picking the page, and setting up controls. How to design a marketing experiment walks through this start to finish.
Client-side vs server-side testing is about where the test runs. Client-side (in the browser) is easier to set up but can cause page flicker. Server-side (on your server) is invisible to visitors but needs developer involvement. Most small teams start client-side and it works fine. Client-side vs server-side A/B testing helps you decide.
Cookieless testing matters more every year. Safari already blocks third-party cookies. Chrome offers users a choice. If your testing tool relies on third-party cookies, you're losing data on a growing chunk of visitors. Cookieless A/B testing covers the alternatives.
Feature flags vs A/B testing confuses a lot of teams. Feature flags let developers turn features on and off. A/B tests measure which version performs better. They solve different problems, and some platforms bundle them together. If you're wondering whether you need a feature flag tool or a testing tool, feature flags vs A/B testing sorts it out. (Short answer for most marketers: you need a testing tool.)
AI-powered testing is the newest addition to the toolkit. AI can help prioritize what to test, generate variations, and analyze results faster. But it's not magic, and the fundamentals still apply. AI A/B testing separates the real applications from the hype.
Start somewhere
Most teams overthink the methodology and underthink the action. Microsoft found that a 1% improvement to Bing's revenue equals over $10 million per year. Those gains came from running thousands of simple tests, not from picking the "perfect" statistical method.
Pick a high-traffic page. Change one thing. Run the test. Three minutes to set up in Kirro. The methodology guides above are here for when you want to go deeper. But the first test? Just run it.