How do I test different audience segments?

Alexandre Airvault
January 13, 2026

Part 1: Build a test plan that won’t lie to you

Start with a single question (your hypothesis)

“Testing audiences” sounds simple, but most audience tests fail because too many things change at once. Before you touch targeting, write down one clear hypothesis in plain English, such as: “Past purchasers will convert at a lower cost than all visitors,” or “In-market segments will drive higher conversion value than affinity segments,” or “A lookalike built from high-intent leads will outperform interest targeting.” If you can’t describe the expected outcome, you’ll struggle to interpret the results once the data comes in.
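If it helps to make the hypothesis concrete, you can write the plan down as a tiny structured record before you launch anything. This is only a sketch in plain Python; the field names and example values are illustrative, not a Google Ads construct:

```python
from dataclasses import dataclass

@dataclass
class AudienceTestPlan:
    """One audience test: one hypothesis, one primary metric, one decision rule."""
    hypothesis: str        # plain-English prediction you are willing to be wrong about
    segment_a: str         # control segment or strategy
    segment_b: str         # challenger segment or strategy
    primary_metric: str    # e.g. "cost_per_conversion", "conversion_value", "qualified_lead_rate"
    action_if_a_wins: str  # what you will actually change afterwards
    action_if_b_wins: str

plan = AudienceTestPlan(
    hypothesis="Past purchasers will convert at a lower cost than all visitors",
    segment_a="All visitors (remarketing list)",
    segment_b="Past purchasers (customer list)",
    primary_metric="cost_per_conversion",
    action_if_a_wins="Keep the current structure and revisit customer-list quality",
    action_if_b_wins="Give past purchasers a dedicated ad group with tailored creative and budget",
)
```

If you can't fill in the last two fields, the test isn't ready to run.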

Decide whether you’re measuring efficiency, scale, or quality

Audience tests can optimize for different “wins,” and you need to choose one primary success metric upfront. If your primary goal is efficiency, you’ll look at cost per conversion (or cost per lead) and conversion rate. If your goal is scale, you’ll focus on incremental conversions and total volume without blowing past your target CPA or ROAS. If your goal is quality, you’ll need a conversion setup that reflects quality (for example, qualified leads, revenue-based values, or downstream conversion actions); otherwise you’ll just be optimizing for the easiest leads, not the best ones.
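Whichever “win” you choose, make sure the primary metric is computed the same way for every segment you compare. A minimal scorecard sketch in plain Python (the segment totals below are made-up examples):

```python
def segment_scorecard(cost: float, clicks: int, conversions: float, conv_value: float) -> dict:
    """Compute the three 'win' definitions from raw segment totals."""
    return {
        # Efficiency: what each conversion costs, and how often clicks convert
        "cpa": cost / conversions if conversions else float("inf"),
        "conversion_rate": conversions / clicks if clicks else 0.0,
        # Scale: raw volume (compare against the other segment at similar spend)
        "conversions": conversions,
        # Quality proxy: value returned per unit of spend (needs value-based conversion tracking)
        "roas": conv_value / cost if cost else 0.0,
    }

# Example: two segments observed on the same Search campaign
in_market = segment_scorecard(cost=1200.0, clicks=3400, conversions=80, conv_value=9600.0)
affinity  = segment_scorecard(cost=1150.0, clicks=5100, conversions=65, conv_value=5200.0)
```

Pick one key from that dictionary as the primary metric before launch; the others are context, not tie-breakers.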

Match the testing method to the campaign type (this is where people go wrong)

Not all campaign types treat audiences the same way. In some campaigns, audiences can be strict gates (your ads only show to those users). In others, audiences behave more like signals that guide automation but don’t restrict delivery. The “right” way to test audiences depends on which of those two worlds you’re in, because your setup determines whether you’re running a clean segment test or just giving the system suggestions.

Part 2: Three reliable ways to test different audience segments

Method 1: Use “Observation” to compare segments without restricting reach (ideal for Search)

If you want to understand how different segments behave while keeping your keyword targeting intact, run your audience test in “Observation.” This lets your ads continue serving based on your existing targeting, while you break out performance for users who also belong to selected audience segments. You can then use what you learn to adjust bids (when applicable) or to justify building separate, targeted structures later.

This approach is especially useful in Search because it avoids the most common mistake: accidentally shrinking reach by layering audience targeting on top of keywords and then concluding that the audience “doesn’t work” when volume collapses. With Observation, you can compare audience segments side-by-side on the same keywords, ads, landing pages, and budgets, which produces cleaner insight.
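If you'd rather pull that Observation breakdown programmatically instead of reading it in the UI, a rough sketch with the official Google Ads API Python client (google-ads) could look like the following. The customer ID and campaign name are placeholders, and the exact resource and field availability should be verified against your API version:

```python
# Sketch: pull Observation-mode audience performance for a Search campaign
# via the Google Ads API (assumes the `google-ads` Python client is installed
# and a valid google-ads.yaml configuration exists).
from google.ads.googleads.client import GoogleAdsClient

client = GoogleAdsClient.load_from_storage("google-ads.yaml")
ga_service = client.get_service("GoogleAdsService")

query = """
    SELECT
      ad_group.name,
      ad_group_criterion.display_name,
      metrics.clicks,
      metrics.cost_micros,
      metrics.conversions,
      metrics.conversions_value
    FROM ad_group_audience_view
    WHERE campaign.name = 'Search - Brand'
      AND segments.date DURING LAST_30_DAYS
"""

for batch in ga_service.search_stream(customer_id="1234567890", query=query):
    for row in batch.results:
        cost = row.metrics.cost_micros / 1_000_000
        conv = row.metrics.conversions
        cpa = cost / conv if conv else float("inf")
        print(f"{row.ad_group.name} | {row.ad_group_criterion.display_name} | "
              f"conv={conv:.1f} cpa={cpa:.2f}")
```

Because this is Observation data, each row describes users who matched your keywords and also belonged to that segment; it is a diagnostic view, not a gated test.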

Method 2: Use “Targeting” to run true segment-vs-segment splits (common for Display, Video, Demand Gen)

When you need a true test where the audience is the gate (meaning the campaign or ad group only serves to that segment), use “Targeting.” This is how you run classic audience A/B tests like “in-market vs custom segment,” “all visitors vs cart abandoners,” or “lookalike narrow vs broad.”

The key is to keep everything else identical: creative, bids, landing pages, conversion actions, and (as much as possible) budget. If you change creative at the same time you change audience, you’re no longer testing audiences—you’re testing an entire strategy change.

Method 3: Use Experiments to A/B test audience strategies with proper splits and decisioning

If you want the cleanest read possible—especially when budgets are meaningful—use the Experiments framework. Experiments are built to compare a base campaign versus a trial campaign over a defined period, with traffic or budget split between the two so you can measure impact and then apply the winner.

Practically, this is how experienced teams test audience changes without “peeking” at noisy early data and making premature optimizations. It’s also the right choice when you’re testing audience changes that interact with other systems, like bidding strategy changes, landing page changes, or broader structural changes that would be messy to compare manually.

  • Use custom experiments when you want to test audience-related settings (and other controlled changes) by splitting traffic/budget between the original and the experiment.
  • Use video experiments when the real question is which video creative performs best (audience tests often fail on video because the ad is the variable that matters most).
  • Use Performance Max experiments when you’re trying to measure uplift or compare approaches involving Performance Max (these tests often need more time to become decisive, so plan for a longer runway; see the rough sample-size sketch after this list).
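How long is “a longer runway”? A rough planning aid is the standard two-proportion sample-size formula (this is general statistics, not a Google Ads feature): it estimates how many clicks each arm of a 50/50 split needs before a realistic conversion-rate lift becomes detectable.

```python
from statistics import NormalDist

def clicks_needed_per_arm(cvr_base: float, relative_lift: float,
                          alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate clicks per arm to detect `relative_lift` on top of `cvr_base`
    at the given significance and power (defaults: 95% confidence, 80% power)."""
    p1 = cvr_base
    p2 = cvr_base * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)   # two-sided significance
    z_beta = NormalDist().inv_cdf(power)            # statistical power
    p_bar = (p1 + p2) / 2
    n = ((z_alpha * (2 * p_bar * (1 - p_bar)) ** 0.5
          + z_beta * (p1 * (1 - p1) + p2 * (1 - p2)) ** 0.5) ** 2
         / (p1 - p2) ** 2)
    return int(n) + 1

# Example: 3% baseline CVR, hoping to detect a 15% relative improvement
print(clicks_needed_per_arm(0.03, 0.15))  # roughly 24,000 clicks per arm
```

If your campaigns can’t realistically deliver that volume within the test window, either test a bigger expected difference or accept a longer runtime.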

Part 3: How to keep automation from polluting your audience test

Optimized targeting: treat it like an “audience expansion engine,” not a segment

In Display, Video action, and Demand Gen-style setups, optimized targeting can expand beyond the audiences you select to find additional converters. That’s not “bad,” but it changes what your test means. If your question is “Which audience segment is best?” you generally want to reduce expansion effects so you’re comparing like-for-like. If your question is “What’s the best way to drive conversions at my goal?” then leaving optimized targeting on can be appropriate, and your audiences become starting signals rather than strict constraints.

Also be disciplined about timing. Optimized targeting needs learning time, and performance can fluctuate early. For new campaigns, don’t evaluate too soon—wait until you have meaningful conversion volume and enough runtime to smooth out day-to-day volatility.

Performance Max audience signals: great for guidance, but they are not hard targeting

Audience signals in Performance Max are suggestions that help the system learn faster; they don’t guarantee delivery only to those users. That means you should avoid framing your analysis as “this audience did better” unless you’re using a controlled experiment design where the only meaningful difference is the signals you’re providing (and you’re reading results in the right place, over a long enough period).

If you want to “test segments” in a Performance Max context, do it as a structured experiment (not by swapping signals every few days). Otherwise, you’ll mostly be measuring learning disruption rather than segment performance.

Lookalike segments (Demand Gen): a powerful, testable way to scale first-party performance

If you have strong first-party seed lists (for example, customer lists or high-intent site visitors), lookalike segments are one of the most practical audience tests because you can choose different reach/similarity settings (narrow, balanced, broad) and compare outcomes. Make sure your seed lists are large enough and plan ahead: lookalike eligibility and refresh behavior mean you should create the campaign and segment in advance so your test window isn’t wasted on ramp-up.

Part 4: Read results correctly and turn them into actions

Use confidence, not gut feel (and don’t end tests early)

When you’re reviewing experiment results, focus on whether the difference is statistically meaningful—not just whether one line is temporarily higher this week. Experiments reporting provides confidence intervals and significance indicators so you can see whether the observed lift is likely real or just noise. If the result isn’t decisive yet, the right move is often to let it run longer (rather than “optimizing” mid-test and invalidating your comparison).
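Experiments reporting surfaces that confidence for you, but if you’re comparing segments outside the Experiments framework (for example, two Targeting-mode ad groups running side by side), you can run the same sanity check yourself on exported totals. A minimal two-proportion z-test sketch (example numbers are illustrative):

```python
from statistics import NormalDist

def conversion_rate_p_value(clicks_a: int, conv_a: float,
                            clicks_b: int, conv_b: float) -> float:
    """Two-sided p-value for 'segment A and segment B have the same conversion rate'."""
    p_a, p_b = conv_a / clicks_a, conv_b / clicks_b
    p_pool = (conv_a + conv_b) / (clicks_a + clicks_b)        # pooled conversion rate
    se = (p_pool * (1 - p_pool) * (1 / clicks_a + 1 / clicks_b)) ** 0.5
    z = (p_a - p_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Example: cart abandoners vs all visitors after four weeks
p = conversion_rate_p_value(clicks_a=8200, conv_a=295, clicks_b=7900, conv_b=240)
print(f"p-value = {p:.3f}")  # a value above 0.05 means: keep the test running
```

It only takes a few lines, and it keeps “segment B is winning” conversations grounded in whether the gap could plausibly be noise.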

Know what “audience performance” data is actually showing you

Audience reporting is strongest when you treat it as a diagnostic lens: who is converting, at what efficiency, and at what scale. In Observation-based setups, you’re reading overlapping behavior (users who matched your existing targeting and also matched the audience). In Targeting-based setups, you’re reading gated behavior (only users in that audience could have seen the ads). Those are different questions—so interpret them differently.

Avoid the most common audience-testing mistakes (quick checklist)

  • Changing creative while testing audiences: keep ads and landing pages consistent unless the test is explicitly “audience + message match.”
  • Mixing signals and constraints: don’t compare a tightly targeted ad group to an expanded/optimized one and call it an “audience test.”
  • Judging too early: give the system time to learn, and give your test enough conversions to be meaningful.
  • Optimizing mid-test: if you change bids, budgets, or targeting during the test, you’ve changed the experiment.
  • Picking the wrong success metric: if lead quality matters, don’t run the test on a “cheap lead” conversion action and hope quality follows.

What to do after you find a winner

Once you’ve identified a segment that clearly outperforms (or scales better), promote it from “test” to “structure.” That usually means creating dedicated ad groups or campaigns with tailored messaging and landing pages, shifting budget toward the winners, and using exclusions where appropriate to prevent overlap and wasted spend. If you’re using automation-heavy campaign types, the best “next step” is often improving your inputs (stronger first-party segments, cleaner conversion goals, and more relevant creative) so the system can scale what worked without drifting away from your intent.

Quick reference: testing audience segments in Google Ads, stage by stage. For each stage: what to decide or do, how to test it, and the relevant Google Ads features and docs.

1. Define your audience test hypothesis
What to decide or do: Turn “test audiences” into one clear hypothesis in plain English, e.g. “Past purchasers will convert at a lower CPA than all visitors,” “In-market segments will drive higher conversion value than affinity segments,” or “A lookalike from high-intent leads will beat interest targeting.”
How to test: Before changing any targeting or segments, write down which two segments or strategies you’re comparing, which metric should improve (CPA, ROAS, conversion value, lead quality, etc.), and what you would do if A wins vs B wins (budget shifts, new campaigns, exclusions).
Features & docs: Use consolidated audience reporting to see how existing audience segments perform and to inform better hypotheses; make sure your audiences are set up correctly via audience setup in Google Ads Editor.

2. Choose the primary success metric (efficiency, scale, or quality)
What to decide or do: Decide what “winning” means before you launch: efficiency (CPA, CPL, conversion rate), scale (incremental conversions and volume at or near target CPA/ROAS), or quality (qualified leads, revenue-based conversions, or downstream actions).
How to test: Map your business goal to a specific conversion action and value strategy; avoid testing on a “cheap lead” conversion if you actually care about quality, and create or import a quality-aligned conversion instead.
Features & docs: Use conversion goals and values inside campaigns, guided by Performance Max campaign inputs and goals (the same principles apply to other campaign types); ensure your first-party segments are available via your data segments.

3. Match test method to campaign type
What to decide or do: Decide whether audiences are gates (hard targeting: only those users can see ads) or signals (soft guidance to automation: delivery can go beyond them). Your testing method must respect which world you’re in.
How to test: For Search and Shopping, audiences usually layer on top of keywords, so use them mainly as observation or bid signals; for Display, Video, and Demand Gen, audiences can be strict gates, so you can run cleaner segment-vs-segment splits.
Features & docs: Learn how “Targeting” vs “Observation” behaves in different campaign types with the Targeting and Observation settings article.

4. Method 1: Use “Observation” to compare segments without shrinking reach (ideal for Search)
What to decide or do: Understand how different segments behave on the same keywords, ads, and budgets without limiting who can see your ads.
How to test: Add audience segments to Search (or Shopping) ad groups in Observation mode; keep keyword targeting and budgets unchanged; compare performance by audience in the Audiences reporting view (CPA, CVR, value per conversion); optionally apply bid adjustments or build future, dedicated structures based on what you learn.
Features & docs: Use the Targeting and Observation settings guide to confirm you’ve set “Observation,” not “Targeting,” for Search ad groups; view performance by audience in consolidated audience reporting.

5. Method 2: Use “Targeting” for true segment-vs-segment tests (Display, Video, Demand Gen)
What to decide or do: Run clean A/B tests where each campaign or ad group only serves one audience segment (the audience is the gate).
How to test: Create separate ad groups or campaigns per segment (e.g., in-market vs affinity, all visitors vs cart abandoners, narrow vs broad lookalike); set audiences to Targeting so only those users can be served; keep creative, bids, landing pages, and conversion actions identical so the only variable is the audience; compare performance, then reallocate budget or build out winners as their own structures.
Features & docs: Configure audience mode correctly using the Targeting and Observation settings article; for first-party audiences used in these tests, rely on your data segments.

6. Method 3: Use Experiments for controlled A/B tests (including audience strategies)
What to decide or do: Compare a base campaign vs a trial campaign with proper traffic/budget splits, confidence reporting, and one clear decision at the end.
How to test: Use custom experiments when testing audience settings, structures, or combinations with other controlled changes (bidding, landing pages, etc.); use video experiments when creative is the main question so you don’t confuse an audience test with an ad test; use Performance Max experiments when testing PMax-related audience or structural changes, and plan for longer run times.
Features & docs: Learn how experiments work from custom experiments and setting up a custom experiment; for creative-focused tests, use the video experiment workflow; monitor statistical outcomes and significance with experiment monitoring.

7. Control automation: optimized targeting and expansion
What to decide or do: Decide whether your test question is “Which audience segment is best?” (minimize expansion) or “What’s the best way to hit my CPA/ROAS goal?” (allow optimized targeting to help scale).
How to test: In Display, Video action, and Demand Gen, adjust the optimized targeting setting at the ad group level; turn it off for cleaner segment-vs-segment comparisons; keep it on if you’re testing an overall conversion-driving strategy and audiences are just starting signals; give optimized targeting enough time and conversions before judging results.
Features & docs: Manage expansion behavior via the optimized targeting setting; use audience insights to understand which audiences the system is finding.

8. Performance Max: audience signals vs hard targeting
What to decide or do: Remember that in Performance Max, audience signals are suggestions, not strict limits; your test is about strategies and signals, not “pure” segment performance.
How to test: Treat audience signals as a way to guide learning and don’t expect delivery to stay inside those segments only; when testing different audience signal sets, use structured Performance Max experiments instead of swapping signals frequently in place; read results at the campaign level over a long enough period, not from short-term fluctuations.
Features & docs: Understand how audience suggestions work from the audience signals for Performance Max article; see how audience signals fit into the overall campaign in the Performance Max campaigns documentation.

9. Lookalike segments (Demand Gen): testing reach vs similarity
What to decide or do: Use first-party seed lists to create lookalike segments with different reach/similarity settings (narrow, balanced, broad) and compare outcomes.
How to test: Ensure seed lists are high-intent and large enough (over 100 matched users); create separate ad groups or campaigns using narrow vs balanced vs broad lookalike segments; launch campaigns and lookalike segments a few days before the test window so lists can populate and refresh; compare efficiency, scale, and quality across reach levels and reallocate budget accordingly.
Features & docs: Follow setup and best practices in lookalike segments for Demand Gen; combine lookalikes with first-party and custom audiences using your data segments.

10. Read results correctly (confidence, not gut feel)
What to decide or do: Focus on whether differences are statistically meaningful, not just “higher this week”; don’t end tests early or change variables mid-experiment.
How to test: Run experiments for a sufficient duration (often 2–12 weeks depending on volume); use experiment reporting to check confidence intervals and significance indicators; if results aren’t decisive, extend the test instead of editing bids, budgets, or targeting mid-flight.
Features & docs: Use experiment monitoring to review lift, confidence, and statistical outcomes; reference custom experiments and experiment setup for guidance on durations and splits.

11. Interpret “audience performance” correctly
What to decide or do: Understand what your numbers actually represent: Observation means overlapping behavior on your existing targeting; Targeting means gated behavior from that audience only.
How to test: In Observation setups, read audience data as diagnostics (who converts, at what efficiency, on your existing keywords and placements); in Targeting setups, treat performance as the result of that segment being the gate; don’t compare a tightly targeted ad group to an expanded/optimized one and call it a pure “audience test.”
Features & docs: Use the Targeting and Observation settings doc to align how you read each view; leverage consolidated audience reporting to slice performance by segment and targeting mode.

12. Avoid common audience testing mistakes
What to decide or do: Quick checklist: don’t change creative while testing audiences (unless testing “audience + message match”); don’t mix strict constraints with expanded/optimized setups and call it a clean test; don’t judge too early or with too few conversions; don’t optimize mid-test (bids, budgets, targeting); don’t pick a metric that ignores lead or customer quality.
How to test: Lock creatives, bids, and landing pages when the goal is purely audience performance; keep optimization work (bid strategy changes, budget reallocations) for after the test is complete and interpreted; if you care about quality, ensure the conversions used in experiments reflect downstream value, not just form fills.
Features & docs: Structure controlled tests with custom experiments to avoid accidental mid-test changes; for creative-only testing (so it doesn’t contaminate audience tests), use dedicated video experiments.

13. After finding a winning segment
What to decide or do: Promote winners from “test” to “structure” and give them the support they deserve.
How to test: Build dedicated campaigns or ad groups for winning segments with tailored messaging and landing pages; shift budget toward winners and use exclusions to reduce overlap and wasted spend where appropriate; for automation-heavy campaigns (like Performance Max, Demand Gen, and Video), focus on improving inputs: stronger first-party lists, better audience signals, cleaner conversion goals, and more relevant creative.
Features & docs: Use Audience Manager and reporting to scale high-performing segments; for scaling via automation, align your inputs with the guidance in Performance Max campaigns and audience signals.

When you’re testing different audience segments in Google Ads, the hard part is less “adding audiences” and more running a clean comparison. Start with a clear hypothesis and success metric (CPA, ROAS, or lead quality), pick the right method for your campaign type (Observation in Search to compare behavior without shrinking reach, Targeting in Display/Video/Demand Gen for true segment-vs-segment splits, or Experiments for controlled A/B tests), and limit variables like creative, bids, and landing pages so you’re actually measuring the audience and not everything else. If you want help keeping those tests disciplined and actionable, Blobr connects to your Google Ads account and uses specialized AI agents to continuously analyze performance (including audiences), surface what’s working or wasting budget, and turn best practices into a prioritized list of changes you can review and apply while staying fully in control.
