Your CMO asks: "What's our total reach?"
The honest answer takes a data warehouse, a methodology document, and the courage to admit that some of those numbers are estimates.
Every platform defines "reach" differently. Meta counts unique users who saw your ad. YouTube counts unique viewers. TV panels estimate households. Radio uses diary-based sampling. Outdoor counts traffic past a billboard. When someone asks "what's our total reach?" they're asking you to add up numbers that were measured in fundamentally different ways.
That's a data-engineering problem, not a media-buying problem.
Why reach numbers don't add up
Different units of measurement
| Channel | What they count | Unit |
|---|---|---|
| Meta Ads | Unique accounts who saw the ad | People |
| Google Ads (Display) | Unique cookies who saw the ad | Devices |
| YouTube | Unique viewers | Google accounts |
| TV (Nielsen) | Estimated household viewership | Households |
| Radio (diary) | Self-reported listening | People (recalled) |
| OOH / Outdoor | Traffic counts past the location | Opportunities-to-see |
| Print | Readership surveys | People (estimated) |
Adding Meta's unique users to TV's estimated households to radio's diary responses is like adding kilometers to pounds. The units don't match.
Deduplication across channels
If the same person saw your Meta ad, your YouTube pre-roll, and your TV spot, they should count as 1 unique reach, not 3. But no platform can deduplicate against other platforms; each counts only within its own silo.
At scale, total "reach" reported by summing platforms can overstate actual unique reach by 30-60%.
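A toy illustration of the overlap problem, using made-up audience members. In practice no platform exposes identities across silos, which is exactly why the real figure has to be estimated rather than computed like this:

```python
# Hypothetical audiences: each set holds the people one platform reached.
meta = {"ana", "ben", "cho", "dev", "eli"}
youtube = {"ben", "cho", "fay", "gus"}
tv = {"ana", "cho", "gus", "hal", "ivy"}

# What you get by adding up the three platform reports.
summed_reach = len(meta) + len(youtube) + len(tv)

# What you actually want: each person counted once.
unique_reach = len(meta | youtube | tv)

overstatement = summed_reach / unique_reach - 1
print(f"summed={summed_reach}, unique={unique_reach}, overstated by {overstatement:.0%}")
```

With these invented sets the summed figure is 14 against 9 unique people. The size of the gap depends entirely on how much the audiences overlap.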
The methodology: a hierarchy approach
We've built this for clients who need to report global reach across digital and traditional media. The approach:
Step 1: Pick an anchor channel
Choose your highest-confidence channel as the anchor. Usually digital (Meta or Google Display) because:
- Unit is closest to "actual people"
- Data is deterministic, not survey-based
- Daily granularity
- Available via API
Step 2: Build conversion factors for other channels
For each non-anchor channel, define a conversion factor that translates its native unit into the anchor unit:
-- seeds/reach_conversion_factors.csv
channel,native_unit,conversion_factor,confidence,notes
meta_ads,unique_users,1.0,high,anchor channel - no conversion needed
google_display,unique_cookies,0.75,medium,cookie-to-person ratio estimate
youtube,unique_viewers,0.85,medium,some cross-device duplication
tv_nielsen,households,2.3,low,average household size for target demo
radio_diary,recalled_listeners,0.6,low,diary overreporting adjustment
ooh,opportunities_to_see,0.15,very_low,visibility factor for the format
These factors are debatable. That's the point: documenting them makes the assumptions explicit. A reach number without methodology is fiction. A reach number with documented conversion factors is an estimate you can defend.
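As a rough sketch of how such a seed file gets used, the snippet below parses a shortened copy of it and converts a native figure into estimated people. The helper names `load_factors` and `to_people` and the 400,000-household input are illustrative assumptions, not part of any real pipeline:

```python
import csv
import io

# Same columns as the seed file; only three channels are shown for brevity.
SEED_CSV = """channel,native_unit,conversion_factor,confidence,notes
meta_ads,unique_users,1.0,high,anchor channel
tv_nielsen,households,2.3,low,average household size for target demo
ooh,opportunities_to_see,0.15,very_low,visibility factor for the format
"""


def load_factors(csv_text):
    """Return {channel: (conversion_factor, confidence)} from seed CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {
        row["channel"]: (float(row["conversion_factor"]), row["confidence"])
        for row in reader
    }


def to_people(channel, native_reach, factors):
    """Translate a channel's native reach into estimated people."""
    factor, confidence = factors[channel]
    return native_reach * factor, confidence


factors = load_factors(SEED_CSV)

# Hypothetical native figure, not real campaign data.
people, confidence = to_people("tv_nielsen", 400_000, factors)
print(round(people), confidence)
```

Keeping the confidence label attached to every converted number means downstream reports can never show the estimate without its caveat.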
Step 3: Aggregate in the warehouse
-- models/marts/mart_global_reach.sql
WITH channel_reach AS (
    SELECT
        report_date,
        campaign_name,
        'meta_ads' AS channel,
        unique_reach AS native_reach,
        unique_reach * 1.0 AS estimated_people_reach,
        'high' AS confidence
    FROM {{ ref('stg_meta_ads__reach') }}
    UNION ALL
    SELECT
        report_date,
        campaign_name,
        'tv_nielsen',
        household_reach,
        household_reach * 2.3,
        'low'
    FROM {{ ref('stg_nielsen__tv_reach') }}
    UNION ALL
    SELECT
        report_date,
        campaign_name,
        'youtube',
        unique_viewers,
        unique_viewers * 0.85,
        'medium'
    FROM {{ ref('stg_google_ads__youtube_reach') }}
)
SELECT
    report_date,
    campaign_name,
    SUM(estimated_people_reach) AS gross_reach,
    -- Apply cross-channel deduplication factor
    SUM(estimated_people_reach) * 0.65 AS estimated_unique_reach,
    AVG(CASE
        WHEN confidence = 'high' THEN 3
        WHEN confidence = 'medium' THEN 2
        WHEN confidence = 'low' THEN 1
        ELSE 0
    END) AS avg_confidence_score
FROM channel_reach
GROUP BY 1, 2
The 0.65 cross-channel deduplication factor is an estimate based on media-mix modeling research. For your specific audience and channel mix, it might be higher or lower. Document it. Defend it. Update it when you have better data.
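Because the deduplication factor is a judgment call, it helps to see how far the headline number moves when the factor does. A minimal sensitivity sketch, where the gross reach figure and the alternative factors 0.55 and 0.75 are illustrative assumptions:

```python
def unique_reach(gross_reach, dedup_factor):
    """Estimated unique reach for a given cross-channel deduplication factor."""
    if not 0 < dedup_factor <= 1:
        raise ValueError("dedup_factor must be in (0, 1]")
    return gross_reach * dedup_factor


gross = 6_500_000  # hypothetical gross reach after unit conversion

# Illustrative alternatives around the 0.65 default.
for factor in (0.55, 0.65, 0.75):
    estimate = unique_reach(gross, factor)
    print(f"factor={factor:.2f} -> estimated unique reach {estimate:,.0f}")
```

If a plausible swing in the factor changes the story you would tell the CMO, that is a sign the factor deserves better evidence before the number goes on a slide.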
Handling manual data inputs
Not every channel has an API. TV reach data arrives as an Excel file from your media agency. Radio data comes quarterly. Print is annual.
These manual inputs deserve the same schema enforcement as API feeds:
-- models/staging/stg_manual__tv_reach.sql
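SELECT
    -- SAFE variants yield NULL on a malformed cell instead of failing the whole model
    SAFE.PARSE_DATE('%Y-%m-%d', report_date) AS report_date,
    campaign_name,
    SAFE_CAST(household_reach AS INT64) AS household_reach,
    market_region,
    CASE
        WHEN household_reach IS NULL THEN 'MISSING'
        WHEN SAFE.PARSE_DATE('%Y-%m-%d', report_date) IS NULL THEN 'INVALID'
        WHEN SAFE_CAST(household_reach AS INT64) IS NULL THEN 'INVALID'
        WHEN SAFE_CAST(household_reach AS INT64) < 0 THEN 'INVALID'
        ELSE 'VALID'
    END AS validation_status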
FROM {{ source('manual_uploads', 'tv_reach_reports') }}
Validation rules catch errors before they corrupt your dashboard. A negative reach number, a missing date, a campaign name that doesn't match your taxonomy: catch each of these in staging, not in the board meeting.
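The same checks can also run before the file ever reaches the warehouse. The sketch below is one possible pre-upload validator. The function name, the extra `INVALID_DATE` and `UNKNOWN_CAMPAIGN` statuses, and the campaign names are assumptions added for illustration:

```python
from datetime import datetime


def validate_row(row, known_campaigns):
    """Return a validation status for one manually uploaded TV reach row."""
    # A missing or malformed date fails strptime and is flagged.
    try:
        datetime.strptime(row.get("report_date") or "", "%Y-%m-%d")
    except ValueError:
        return "INVALID_DATE"

    raw = row.get("household_reach")
    if raw is None or str(raw).strip() == "":
        return "MISSING"

    # Anything that is not a whole number (e.g. "1,200" or "n/a") is invalid.
    try:
        reach = int(str(raw).strip())
    except ValueError:
        return "INVALID"
    if reach < 0:
        return "INVALID"

    # Campaign names must match the agreed taxonomy exactly.
    if row.get("campaign_name") not in known_campaigns:
        return "UNKNOWN_CAMPAIGN"
    return "VALID"


campaigns = {"spring_launch"}  # hypothetical taxonomy
row = {"report_date": "2024-03-01", "campaign_name": "spring_launch", "household_reach": "120000"}
print(validate_row(row, campaigns))
```

Returning a status instead of raising keeps bad rows visible in a review queue rather than silently dropping them.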
Regional considerations
For global companies, reach measurement gets harder:
- Markets without reliable measurement. Some regions don't have Nielsen equivalents. Options: use proxy markets (a similar-sized market with measurement), apply Bayesian models based on media spend, or report with explicit "estimated" flags.
- Currency and unit differences. Some markets report in thousands, others in actuals. Normalize in staging.
- Timezone alignment. A campaign running globally accumulates reach across timezones. Define whether "daily reach" means UTC, local-market, or campaign-timezone.
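To see why the timezone convention in the last bullet matters, consider one late-evening impression. The sketch uses fixed UTC offsets to stay dependency-free. A real pipeline would more likely use IANA zone names so that daylight saving is handled, and the function name here is an illustrative assumption:

```python
from datetime import datetime, timedelta, timezone


def reporting_date(event_utc, utc_offset_hours=0):
    """Calendar date an impression falls on under a fixed UTC offset."""
    local = event_utc.astimezone(timezone(timedelta(hours=utc_offset_hours)))
    return local.date()


# One impression served at 22:30 UTC on 1 March.
impression = datetime(2024, 3, 1, 22, 30, tzinfo=timezone.utc)

print(reporting_date(impression))      # UTC convention
print(reporting_date(impression, 9))   # local-market convention at UTC+9
```

The same impression lands on 1 March under UTC and on 2 March in a UTC+9 market, so daily figures only reconcile if every source uses the convention you documented.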
-- Handle regional normalization
SELECT
    report_date,
    market_region,
    CASE
        WHEN reporting_unit = 'thousands' THEN native_reach * 1000
        WHEN reporting_unit = 'millions' THEN native_reach * 1000000
        ELSE native_reach
    END AS normalized_reach
FROM {{ ref('stg_regional__media_reach') }}
What the dashboard looks like
The final output: a reach dashboard that the CMO can actually use.
- Top line: estimated unique reach with confidence band (not a single number — a range)
- By channel: breakdown showing each channel's contribution with its confidence level
- Over time: weekly or monthly trend, highlighting when new channels were added
- By region: for global campaigns, reach by market with "measured" vs "estimated" flags
- Methodology tab: the conversion factors, deduplication assumptions, and data sources — visible, not hidden
The confidence band is the key innovation. Instead of reporting "total reach: 4.2M," report "estimated unique reach: 3.4M – 5.1M (confidence: medium)." This is more honest and, paradoxically, more credible. A precise-looking number with no methodology is less trustworthy than a range with documented assumptions.
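One simple way to produce such a band is to widen each channel's estimate according to its confidence label and then apply the deduplication factor. The uncertainty percentages and channel figures below are placeholders chosen for illustration, not measured values, and `reach_band` is a hypothetical helper:

```python
# Hypothetical relative uncertainty per confidence label; tune to your own data.
UNCERTAINTY = {"high": 0.05, "medium": 0.15, "low": 0.30, "very_low": 0.50}


def reach_band(channels, dedup_factor=0.65):
    """Return (low, mid, high) estimated unique reach.

    channels: list of (estimated_people_reach, confidence_label) pairs.
    Low and high ends are summed per channel, a deliberately conservative
    choice that treats the channel errors as moving together.
    """
    low = sum(reach * (1 - UNCERTAINTY[label]) for reach, label in channels)
    mid = sum(reach for reach, _ in channels)
    high = sum(reach * (1 + UNCERTAINTY[label]) for reach, label in channels)
    return tuple(round(total * dedup_factor) for total in (low, mid, high))


# Hypothetical per-channel estimates, already converted to people.
channels = [
    (3_000_000, "high"),    # e.g. the anchor channel
    (2_000_000, "medium"),
    (1_500_000, "low"),
]
low, mid, high = reach_band(channels)
print(f"estimated unique reach: {low:,} to {high:,} (midpoint {mid:,})")
```

This is a heuristic range, not a statistical confidence interval. What makes it defensible is that every input to it appears on the methodology tab.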
The uncomfortable truth
Perfect reach measurement is impossible. The channels are too different, the data is too heterogeneous, and cross-channel deduplication remains an unsolved problem at the individual level.
What's possible: a defensible estimate with documented methodology, consistent measurement over time, and the ability to compare periods against each other. The absolute number is always approximate. The trend is reliable.
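The reason the trend holds up better than the absolute number is simple arithmetic: a constant multiplicative factor cancels out of a period-over-period comparison. A small sketch with invented gross reach figures:

```python
def period_change(gross_prev, gross_curr, dedup_factor):
    """Relative change in estimated unique reach between two periods."""
    prev = gross_prev * dedup_factor
    curr = gross_curr * dedup_factor
    return (curr - prev) / prev


# Hypothetical gross reach for two consecutive periods.
for factor in (0.55, 0.65, 0.75):
    change = period_change(5_000_000, 5_600_000, factor)
    print(f"dedup factor {factor:.2f}: change {change:+.1%}")
```

Whichever factor you pick, the reported growth is the same. That holds only while the methodology stays fixed and the real audience overlap does not shift much between periods, which is the practical argument for versioning the factors and annotating any change on the dashboard.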
Your CMO doesn't actually need to know that reach was exactly 4,237,891. They need to know whether reach is growing or shrinking, which channels are contributing, and whether the campaign was worth the spend. A warehouse with honest methodology answers all three.
We build media-measurement frameworks that unify digital and traditional channels into a single reach model — with documented methodology, validation rules, and confidence bands. Book a discovery call if your "total reach" number is a fiction nobody believes.