RFM analysis answers three questions about every customer:
- Recency — how recently did they buy?
- Frequency — how often do they buy?
- Monetary — how much do they spend?
Score each on a scale of 1-5. Combine the scores. Your 555s are your best customers. Your 111s are gone. The 511s bought recently but only once — they're your conversion opportunity. The 155s used to be great but haven't bought in months — they're your churn risk.
It's one of the oldest segmentation frameworks in analytics. It works because it's grounded in behavior, not demographics or personas. What someone does matters more than who they are.
The model
Three dbt models. That's it.
1. Base metrics per customer
-- models/intermediate/int_rfm__customer_metrics.sql
SELECT
customer_id,
MAX(order_date) AS last_order_date,
COUNT(DISTINCT order_id) AS total_orders,
SUM(order_total) AS total_revenue,
DATE_DIFF(CURRENT_DATE(), MAX(order_date), DAY) AS days_since_last_order,
DATE_DIFF(CURRENT_DATE(), MIN(order_date), DAY) AS customer_tenure_days,
SAFE_DIVIDE(SUM(order_total), COUNT(DISTINCT order_id)) AS avg_order_value
FROM {{ ref('mart_orders') }}
WHERE order_status = 'completed'
GROUP BY 12. Score with percentile-based quintiles
-- models/intermediate/int_rfm__scored.sql
SELECT
customer_id,
days_since_last_order,
total_orders,
total_revenue,
avg_order_value,
-- Recency: lower days = better = higher score
5 - NTILE(5) OVER (ORDER BY days_since_last_order ASC) + 1 AS recency_score,
-- Frequency: more orders = higher score
NTILE(5) OVER (ORDER BY total_orders ASC) AS frequency_score,
-- Monetary: more revenue = higher score
NTILE(5) OVER (ORDER BY total_revenue ASC) AS monetary_score
FROM {{ ref('int_rfm__customer_metrics') }}
WHERE total_orders > 0NTILE(5) splits customers into 5 equal-sized buckets. No arbitrary thresholds — the boundaries adapt to your data distribution. If your top quintile has 50+ orders and the bottom has 1-2, the model reflects that automatically.
3. Segment and label
-- models/marts/mart_rfm_segments.sql
SELECT
customer_id,
days_since_last_order,
total_orders,
total_revenue,
avg_order_value,
recency_score,
frequency_score,
monetary_score,
CONCAT(
CAST(recency_score AS STRING),
CAST(frequency_score AS STRING),
CAST(monetary_score AS STRING)
) AS rfm_score,
CASE
-- Champions: recent, frequent, high-value
WHEN recency_score >= 4 AND frequency_score >= 4 AND monetary_score >= 4
THEN 'Champions'
-- Loyal: frequent and high-value, recency varies
WHEN frequency_score >= 4 AND monetary_score >= 4
THEN 'Loyal Customers'
-- Potential Loyalists: recent, moderate frequency
WHEN recency_score >= 4 AND frequency_score >= 2 AND frequency_score <= 3
THEN 'Potential Loyalists'
-- New Customers: very recent, low frequency
WHEN recency_score >= 4 AND frequency_score = 1
THEN 'New Customers'
-- At Risk: used to be good, not recent
WHEN recency_score <= 2 AND frequency_score >= 3 AND monetary_score >= 3
THEN 'At Risk'
-- Can't Lose Them: were champions, going dormant
WHEN recency_score <= 2 AND frequency_score >= 4 AND monetary_score >= 4
THEN 'Cant Lose Them'
-- Hibernating: low everything
WHEN recency_score <= 2 AND frequency_score <= 2
THEN 'Hibernating'
-- Need Attention: mid-range across the board
WHEN recency_score = 3 AND frequency_score = 3
THEN 'Need Attention'
-- About to Sleep: below average recency, low-mid frequency
WHEN recency_score <= 3 AND frequency_score <= 2
THEN 'About to Sleep'
ELSE 'Other'
END AS segment_name
FROM {{ ref('int_rfm__scored') }}What each segment tells you
| Segment | RFM Pattern | Size (typical) | Action |
|---|---|---|---|
| Champions | 5-5-5, 5-4-5, etc. | 5-10% | Reward, ask for referrals, early access |
| Loyal Customers | *-4-4, *-5-4 | 10-15% | Upsell, loyalty programs |
| Potential Loyalists | 4-2-, 5-3- | 10-15% | Nurture to increase frequency |
| New Customers | 5-1-* | 5-10% | Onboard well, don't overwhelm |
| At Risk | 2-4-4, 1-3-3 | 10-15% | Win-back campaign, personal outreach |
| Can't Lose Them | 1-5-5, 2-4-5 | 3-5% | Urgent — these are high-value going dormant |
| Hibernating | 1-1-1, 2-1-1 | 15-25% | Low-cost re-engagement or accept the loss |
| About to Sleep | 3-2-, 2-2- | 10-15% | Gentle re-engagement before they go dormant |
The "Can't Lose Them" segment is the one that pays for the analysis. These are customers who used to be your best — high frequency, high spend — but haven't purchased recently. Every day without action increases the chance they're gone for good.
Visualization
RFM data is naturally visual. Three approaches:
Segment distribution (bar chart)
-- How many customers in each segment?
SELECT
segment_name,
COUNT(*) AS customers,
SUM(total_revenue) AS segment_revenue,
AVG(total_revenue) AS avg_revenue_per_customer
FROM {{ ref('mart_rfm_segments') }}
GROUP BY 1
ORDER BY segment_revenue DESCRFM heatmap (the classic view)
Plot Recency (x-axis) vs Frequency (y-axis), color by count or revenue. Each cell is one RFM combination. Bright cells = many customers. The top-left (high recency, high frequency) is your sweet spot.
-- Heatmap data: recency × frequency, colored by customer count
SELECT
recency_score,
frequency_score,
COUNT(*) AS customer_count,
SUM(total_revenue) AS total_revenue,
AVG(monetary_score) AS avg_monetary_score
FROM {{ ref('int_rfm__scored') }}
GROUP BY 1, 2
ORDER BY 1 DESC, 2 ASCMovement tracking (the advanced view)
How are customers migrating between segments over time? This requires snapshotting:
-- models/snapshots/rfm_snapshot.sql
{% snapshot rfm_segment_snapshot %}
{{
config(
target_schema='snapshots',
unique_key='customer_id',
strategy='check',
check_cols=['segment_name'],
)
}}
SELECT customer_id, segment_name, CURRENT_DATE() AS snapshot_date
FROM {{ ref('mart_rfm_segments') }}
{% endsnapshot %}Now you can answer: "How many Champions from Q1 became At Risk in Q2?" That's the real insight — not the static picture, but the flow.
Activation: making RFM operational
Segments sitting in a dashboard are interesting. Segments pushing to your marketing tools are valuable.
Email campaigns (via reverse ETL)
-- models/activation/rfm_email_audiences.sql
SELECT
c.email,
r.segment_name,
r.days_since_last_order,
r.total_revenue,
CASE
WHEN r.segment_name = 'Cant Lose Them'
THEN 'winback_urgent'
WHEN r.segment_name = 'At Risk'
THEN 'winback_standard'
WHEN r.segment_name = 'Potential Loyalists'
THEN 'nurture_to_loyal'
WHEN r.segment_name = 'New Customers'
THEN 'onboarding_series'
WHEN r.segment_name = 'Champions'
THEN 'referral_ask'
END AS email_campaign
FROM {{ ref('mart_rfm_segments') }} r
JOIN {{ ref('dim_customers') }} c ON r.customer_id = c.customer_id
WHERE email_campaign IS NOT NULLPush to SendGrid, Klaviyo, or whatever email tool you use via Hightouch or Census. Each segment gets a tailored campaign — not a generic blast.
Ad audiences
Same pattern — push "At Risk" and "Can't Lose Them" segments to Google Ads and Meta as custom audiences. Win-back ads targeted at people who already know your brand are 3-5x more efficient than cold prospecting.
CRM enrichment
Add the RFM segment and score to each contact in Salesforce or HubSpot. When a sales rep opens a contact record, they see: "Champion — 12 orders, $47K lifetime value, last purchase 8 days ago." Context changes the conversation.
Common mistakes
1. Fixed thresholds instead of percentiles
Don't define "high frequency" as ">10 orders." Use NTILE() or percentile-based scoring. Fixed thresholds break when your data distribution changes — after a holiday season, after a product launch, after a year of growth. Percentiles adapt automatically.
2. Not excluding outliers
One customer with 500 orders will compress everyone else into the bottom quintile. Consider capping at the 99th percentile or excluding obvious B2B/wholesale accounts from B2C segmentation.
3. Running it once
RFM is a moving picture, not a snapshot. Run it on a schedule (weekly or monthly). Track how customers move between segments. A customer who was a Champion 3 months ago and is now At Risk is a more urgent signal than a customer who's been Hibernating for a year.
4. Not connecting it to action
The most common RFM failure: a beautiful analysis that lives in a slide deck and never reaches the email tool, the ad platform, or the CRM. Build the activation layer from day one. If the segment doesn't trigger an action, it's trivia.
The stack
| Layer | Tool | What it does |
|---|---|---|
| Data source | Shopify / Stripe / your order system | Raw transaction data |
| Ingestion | Fivetran | Syncs to BigQuery on schedule |
| Warehouse | BigQuery | Stores all historical transactions |
| Transformation | dbt | The 3 RFM models + activation models |
| Visualization | Sigma or Looker | Heatmaps, segment distribution, movement tracking |
| Activation | Hightouch → SendGrid + Google Ads + Salesforce | Pushes segments to operational tools |
Total infrastructure cost: under $300/month for most e-commerce and B2B companies.
We build RFM models and customer segmentation pipelines in BigQuery + dbt — from raw transactions to activated audiences in your email and ad tools. If your customer data is sitting in a warehouse without segmentation, book a discovery call.