97% of your website visitors leave without filling out a form, clicking a CTA, or doing anything that identifies them. They're anonymous sessions in GA4. Ghost rows in PostHog. Traffic you paid for, learned nothing from.
Deanonymization tools change that equation. For B2B companies, they turn anonymous page views into named contacts with titles, companies, and LinkedIn URLs. The question isn't whether this data is useful — it obviously is. The question is what you do with it once you have it.
How it works
Three mechanisms, depending on the tool:
1. IP-to-company matching
The simplest and most common approach. Every web request carries an IP address. Databases like IPinfo, MaxMind, and Clearbit map IP ranges to company names.
- Accuracy: company-level only. You know Acme Corp visited your site. You don't know who at Acme Corp.
- Coverage: decent for mid-to-large companies with dedicated IP ranges. Poor for small companies on shared ISPs and anyone working from home on residential internet.
- Cost: IPinfo Lite is free. Commercial APIs run $100-500/month.
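To make the lookup concrete, here is a minimal sketch of IP-to-company matching using only Python's standard library. The range table, the company names, and the `company_for_ip` helper are all hypothetical; commercial providers ship far larger datasets and their own client libraries.

```python
import ipaddress
from typing import Optional

# Hypothetical range table: (network, company name). The networks below are
# reserved documentation ranges, not real company allocations.
IP_RANGES = [
    (ipaddress.ip_network("192.0.2.0/24"), "Acme Corp"),
    (ipaddress.ip_network("198.51.100.0/25"), "Globex"),
]


def company_for_ip(ip: str) -> Optional[str]:
    """Return the company whose range contains `ip`, or None if unmatched."""
    try:
        addr = ipaddress.ip_address(ip)
    except ValueError:
        return None  # malformed address
    for network, company in IP_RANGES:
        if addr in network:
            return company
    return None  # residential, shared ISP, or simply not in the table


print(company_for_ip("192.0.2.17"))
```

A linear scan is fine for a sketch; a production lookup would more likely use a sorted range index or the provider's database format.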
2. Identity graph / co-op pixel networks
Tools like RB2B use a network of publisher pixels across the web. When a user visits a site in the network and identifies themselves (by logging in or filling out a form), the network associates their browser or device with their identity. When that same browser later visits your site, the network matches it to that identity.
- Accuracy: individual-level. You get name, email, title, LinkedIn.
- Coverage: depends on the network size. RB2B claims 40-60% match rates for US B2B traffic. Reality varies.
- Cost: RB2B starts around $200/month. Clearbit Reveal, Demandbase, and 6sense are enterprise-priced.
3. Email-hash matching
Some tools match browser cookies to hashed email databases. When the hashed email tied to a visitor's cookie appears in the database (from data partnerships, publisher networks, or opt-in sources), the tool links that visitor to the matching record.
- Accuracy: individual-level, but match rates are lower than identity graphs.
- Cost: usually bundled in enterprise tools.
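The matching step can be sketched in a few lines. This assumes SHA-256 over a trimmed, lowercased address, which is a common convention but not something any specific vendor is confirmed to use here; `HASHED_DB` and both helper functions are hypothetical.

```python
import hashlib
from typing import Optional


def hash_email(email: str) -> str:
    """Normalize (trim, lowercase) an email address and return its SHA-256 hex digest."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Hypothetical partner database keyed by hashed email.
HASHED_DB = {
    hash_email("jane.doe@example.com"): {"name": "Jane Doe", "company": "Example Inc"},
}


def match_hashed_email(hashed: str) -> Optional[dict]:
    """Return the record for a hashed email, or None when there is no match."""
    return HASHED_DB.get(hashed)


record = match_hashed_email(hash_email("  Jane.Doe@Example.com "))
print(record)
```

Normalizing before hashing matters: without it, the same address with different casing or stray whitespace produces a different hash and never matches.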
The pipeline we built
For one client (SecureW2), we implemented RB2B → warehouse → CRM:
Website visitor ──→ RB2B pixel ──→ RB2B webhook
                                        │
                                        ▼
                           Cloud Function (parse JSON)
                                        │
                                        ▼
                                  Pub/Sub topic
                                        │
                                        ▼
                              BigQuery (raw events)
                                        │
                                        ▼
                        dbt models (stage, route, score)
                                        │
                                        ▼
                         Salesforce sync (via Hightouch)
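As a rough illustration of the Cloud Function step, the sketch below validates a webhook body and wraps it in an envelope ready to publish. The envelope keys (`event_id`, `received_at`, `payload`) mirror the columns the staging model reads; the `build_raw_event` helper and the choice of a UUID for `event_id` are assumptions, and the actual Pub/Sub publish call is left out so the example runs on the standard library alone.

```python
import json
import uuid
from datetime import datetime, timezone
from typing import Optional


def build_raw_event(body: bytes, now: Optional[datetime] = None) -> Optional[bytes]:
    """Wrap a webhook body in an envelope; return None if it is not a JSON object."""
    try:
        payload = json.loads(body)
    except ValueError:  # covers both JSONDecodeError and UnicodeDecodeError
        return None
    if not isinstance(payload, dict):
        return None
    received_at = (now or datetime.now(timezone.utc)).isoformat()
    envelope = {
        "event_id": str(uuid.uuid4()),    # assumed: generated on receipt
        "received_at": received_at,
        "payload": json.dumps(payload),   # kept as a JSON string for later extraction
    }
    # Pub/Sub message data is bytes, so serialize to UTF-8 JSON.
    return json.dumps(envelope).encode("utf-8")


message = build_raw_event(b'{"email": "jane.doe@example.com", "company": "Example Inc"}')
print(message)
```

Keeping the payload untouched at this stage means parsing mistakes can be fixed later in dbt without replaying webhooks.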
Staging: parse the webhook payload
RB2B sends a JSON payload for each identified visitor:
-- models/staging/stg_rb2b__visitor_events.sql
-- One row per identified visitor, parsed from the raw webhook JSON.
SELECT
    event_id,
    TIMESTAMP(received_at) AS identified_at,
    JSON_EXTRACT_SCALAR(payload, '$.email') AS email,
    JSON_EXTRACT_SCALAR(payload, '$.first_name') AS first_name,
    JSON_EXTRACT_SCALAR(payload, '$.last_name') AS last_name,
    JSON_EXTRACT_SCALAR(payload, '$.title') AS job_title,
    JSON_EXTRACT_SCALAR(payload, '$.company') AS company_name,
    JSON_EXTRACT_SCALAR(payload, '$.linkedin_url') AS linkedin_url,
    JSON_EXTRACT_SCALAR(payload, '$.page_url') AS page_visited,
    -- SAFE_CAST returns NULL instead of failing the model on non-numeric values
    SAFE_CAST(JSON_EXTRACT_SCALAR(payload, '$.company_size') AS INT64) AS company_size
FROM {{ source('rb2b', 'raw_webhook_events') }}
-- Rows without an email cannot be matched to CRM contacts downstream
WHERE JSON_EXTRACT_SCALAR(payload, '$.email') IS NOT NULL
Routing: CRM lookup + classification
Not every identified visitor deserves action. Route based on who they are:
-- models/intermediate/int_rb2b__routed_visitors.sql
SELECT
    v.*,
    sf_contact.contact_id AS existing_contact_id,
    sf_opp.opportunity_id AS active_opportunity_id,
    sf_opp.amount AS opportunity_value,
    CASE
        WHEN sf_opp.opportunity_id IS NOT NULL
            THEN 'ACTIVE_OPPORTUNITY' -- visiting during a deal = high signal
        WHEN sf_contact.contact_id IS NOT NULL
            THEN 'EXISTING_CONTACT' -- known contact, re-engaging
        WHEN v.company_size >= 200
            THEN 'NEW_ENTERPRISE_VISITOR' -- unknown, large company
        WHEN v.company_size >= 50
            THEN 'NEW_MID_MARKET_VISITOR'
        ELSE 'NEW_VISITOR'
    END AS routing_category,
    CASE
        WHEN sf_opp.opportunity_id IS NOT NULL THEN 'NOTIFY_OWNER_IMMEDIATELY'
        WHEN v.company_size >= 200 AND v.page_visited LIKE '%pricing%' THEN 'CREATE_LEAD'
        WHEN sf_contact.contact_id IS NOT NULL THEN 'UPDATE_ACTIVITY'
        ELSE 'ADD_TO_NURTURE'
    END AS routing_action
FROM {{ ref('stg_rb2b__visitor_events') }} v
LEFT JOIN {{ ref('stg_salesforce__contacts') }} sf_contact
    ON LOWER(v.email) = LOWER(sf_contact.email)
LEFT JOIN {{ ref('stg_salesforce__opportunities') }} sf_opp
    ON sf_contact.account_id = sf_opp.account_id
    AND sf_opp.stage NOT IN ('Closed Won', 'Closed Lost')
WHERE TRUE
-- An account can have several open opportunities (or duplicate contacts), which
-- would fan out the join. Keep one row per event, preferring the largest deal.
-- Assumes event_id is unique per visitor event.
QUALIFY ROW_NUMBER() OVER (PARTITION BY v.event_id ORDER BY sf_opp.amount DESC) = 1
When someone from a company with an active $200K opportunity visits your pricing page, that's not a casual browse. That's buying behavior. Notify the account owner within minutes.
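Routing rules like these are easy to get subtly wrong, so it can help to mirror them in a small pure function and unit-test the edge cases before they reach the CRM. The sketch below restates the two CASE expressions in Python; the `route_visitor` name and its boolean arguments are hypothetical stand-ins for the joined Salesforce columns.

```python
from typing import Optional, Tuple


def route_visitor(
    company_size: Optional[int],
    page_visited: Optional[str],
    has_contact: bool,
    has_open_opportunity: bool,
) -> Tuple[str, str]:
    """Return (routing_category, routing_action) using the same rule order as the model."""
    size = company_size or 0   # a NULL size fails every >= comparison in SQL
    page = page_visited or ""  # a NULL page never matches LIKE '%pricing%'

    if has_open_opportunity:
        category = "ACTIVE_OPPORTUNITY"
    elif has_contact:
        category = "EXISTING_CONTACT"
    elif size >= 200:
        category = "NEW_ENTERPRISE_VISITOR"
    elif size >= 50:
        category = "NEW_MID_MARKET_VISITOR"
    else:
        category = "NEW_VISITOR"

    if has_open_opportunity:
        action = "NOTIFY_OWNER_IMMEDIATELY"
    elif size >= 200 and "pricing" in page:
        action = "CREATE_LEAD"
    elif has_contact:
        action = "UPDATE_ACTIVITY"
    else:
        action = "ADD_TO_NURTURE"

    return category, action


print(route_visitor(500, "https://example.com/pricing", True, False))
```

Working through cases this way surfaces one thing worth a deliberate decision: with the rules in this order, a known contact at a 200+ person company who views pricing is routed to CREATE_LEAD rather than UPDATE_ACTIVITY, which may or may not be what your CRM process expects.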
The free alternative: PostHog + IPinfo
If RB2B is too expensive or you want company-level identification without individual data:
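-- PostHog captures IP (configurable)
-- Join to IPinfo for company matching
SELECT
    ph.session_id,
    ph.page_url,
    ph.session_start,
    ip.company_name,
    ip.company_domain,
    ip.employee_count,
    ip.industry
FROM {{ ref('stg_posthog__sessions') }} ph
-- BigQuery only allows range conditions like this on an INNER JOIN; the WHERE
-- clause below already discarded unmatched sessions, so the result is the same.
INNER JOIN {{ ref('stg_ipinfo__ip_companies') }} ip
    -- Match the IP against the whole range, not only its first address.
    -- Assumes the staging model also exposes ip_range_end, and that both bounds
    -- are strings in the same IP version as ph.ip_address.
    ON NET.SAFE_IP_FROM_STRING(ph.ip_address)
        BETWEEN NET.SAFE_IP_FROM_STRING(ip.ip_range_start)
            AND NET.SAFE_IP_FROM_STRING(ip.ip_range_end)
WHERE ip.company_name IS NOT NULL
    AND ip.company_type = 'business' -- exclude ISPs, universities
You get the company, not the individual. But it's effectively free and requires no third-party pixel.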
The legal landscape
US: largely permissible for B2B. No federal law prohibits using IP-to-company matching or identity-graph data for business purposes. California's CCPA requires honoring opt-out requests, and its temporary B2B exemption expired on January 1, 2023, so business contact data is now in scope.
EU/UK: GDPR makes individual-level deanonymization risky without explicit consent. Company-level (IP-to-company) is generally acceptable as legitimate interest. Individual identification via RB2B-style tools requires careful legal review and a clear consent mechanism.
Practical guidance: start with company-level identification everywhere. Layer individual identification only for US traffic where your legal counsel approves. Always honor opt-out requests immediately.
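One way to enforce that guidance in code is a single gate that every identification path passes through. The sketch below is illustrative rather than legal advice: the `identification_level` function, its return values, and the `counsel_approved_us` flag are all hypothetical names.

```python
from typing import Optional


def identification_level(
    country_code: Optional[str],
    opted_out: bool,
    counsel_approved_us: bool = False,
) -> str:
    """Return 'none', 'company', or 'individual' for a visitor."""
    if opted_out:
        return "none"  # opt-outs win over everything else
    if (country_code or "").upper() == "US" and counsel_approved_us:
        return "individual"
    return "company"  # default everywhere: company-level only


print(identification_level("DE", opted_out=False, counsel_approved_us=True))
```

Defaulting to company-level and requiring an explicit flag for individual-level means a missing country code or an unreviewed region fails toward the more conservative option.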
Realistic match rates
Vendor claims vs. reality:
| Tool | Claimed match rate | Realistic rate | Level |
|---|---|---|---|
| RB2B | 40-60% | 20-40% (varies by traffic mix) | Individual |
| Clearbit Reveal | 30-50% | 15-30% | Individual + company |
| IPinfo | N/A (deterministic) | 60-80% of business traffic | Company only |
| PostHog + IPinfo Lite | N/A | 40-60% of business traffic | Company only |
Match rates depend heavily on your traffic composition. If most visitors are US-based, at mid-to-large companies, on corporate networks — rates are higher. If traffic is global, includes SMBs, or is heavily mobile — lower.
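For planning purposes, the arithmetic is simple enough to sanity-check in a few lines. The sketch below applies a low and high match rate to a visitor count; the 10,000 monthly visitors figure is a made-up example, and the 20-40% band is the realistic RB2B range from the table above.

```python
from typing import Tuple


def identified_range(monthly_visitors: int, low_rate: float, high_rate: float) -> Tuple[int, int]:
    """Return the (low, high) number of visitors you can expect to identify."""
    if not 0.0 <= low_rate <= high_rate <= 1.0:
        raise ValueError("rates must satisfy 0 <= low_rate <= high_rate <= 1")
    return round(monthly_visitors * low_rate), round(monthly_visitors * high_rate)


# Hypothetical site with 10,000 monthly visitors at a 20-40% match rate.
print(identified_range(10_000, 0.20, 0.40))  # (2000, 4000)
```

Running the same calculation against the vendor's claimed rate shows how far a forecast can drift if you plan on the marketing number.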
The honest take
Deanonymization is powerful and imperfect. You'll identify a subset of visitors, not all of them. The subset is biased toward larger companies on corporate networks. Small-company visitors, remote workers, and mobile users will remain anonymous.
Don't build your strategy around 100% identification. Build it around the 20-40% you do identify — and make sure that data actually reaches your sales team in a format they can act on. A beautifully identified visitor sitting in a BigQuery table nobody queries is worth exactly nothing.
We build visitor-identification pipelines from webhook to warehouse to CRM — including the routing logic that turns a page view into a signal your sales team can act on. Book a discovery call if 97% of your traffic is walking away anonymous.