What is a calibrated lead score?

A score that means what it says: contacts scored 30 should convert at roughly 30% over time. Ace records every prediction and checks it against what actually happened at 24 hours and at 7 days (the contact responded, booked an appointment, or went cold), with closed-won outcomes tracked separately. It then re-fits the calibration layer on a schedule so the probabilities keep matching reality.

What predictive models does Follow Up Ace run?

Four model families in production: conversion (probability of a closed deal — the Ace Win Score), response (probability of responding to outreach), churn (probability an engaged contact goes cold — the Ace Churn Risk), and appointment propensity (probability of booking an appointment). They are trained in BigQuery ML on a leakage-audited labeled corpus and re-evaluated weekly.

How often are the models retrained?

The labeled training corpus grows weekly as new leads enter and deals close. The models are re-evaluated on a fresh held-out slice weekly and retrained when due, behind a promotion guard: a candidate model that evaluates worse than the deployed one is never promoted. Calibration is re-fit daily against labeled outcomes, and scores sync to contacts and FUB fields nightly.
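
As a rough illustration of the promotion guard, here is a minimal Python sketch. The `Evaluation` type, the `promotion_decision` function, and the higher-is-better metric are assumptions made for this example, not Ace's actual implementation. Letting a tie through is also an assumption, since the guard is only described as blocking candidates that evaluate worse.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Evaluation:
    model_id: str
    metric: float  # hypothetical held-out metric where higher is better


def promotion_decision(candidate: Evaluation, deployed: Evaluation) -> dict:
    """Keep the deployed model whenever the candidate evaluates worse."""
    if candidate.metric < deployed.metric:
        alert: Optional[str] = (
            f"candidate {candidate.model_id} regressed: "
            f"{candidate.metric:.3f} < {deployed.metric:.3f}"
        )
        return {"promote": False, "serving": deployed.model_id, "alert": alert}
    # Assumption: a candidate that ties or improves is allowed through.
    return {"promote": True, "serving": candidate.model_id, "alert": None}


decision = promotion_decision(
    Evaluation("candidate-week-12", 0.70),
    Evaluation("deployed-week-11", 0.74),
)
print(decision["serving"], decision["alert"])
```

Both models would need to be scored on the same held-out slice for the comparison to be fair.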

Can a predictive score replace agent judgment?

No. The scores allocate attention — who to call first, who to save — but they don't decide what to say or replace relationship judgment. Engagement behavior is a strong but imperfect proxy for intent, so low scorers should still get nurture touches.

How Ace's Predictive Lead Scores Work

What data trains the models?

Your account's own CRM activity — calls, texts, emails, notes, stage changes, IDX website behavior, deal outcomes — captured via Follow Up Boss webhooks and engineered into features with strict as-of discipline (a training example only sees data that existed before its cutoff, so the model can't cheat by peeking at the future). Cross-tenant comparisons are separate, anonymized, and only published for cohorts of 10+ teams.

The four model families

Ace runs four predictive model families in production, trained in BigQuery ML on labeled outcomes from real Follow Up Boss activity. Each family answers one question:

Model	Predicts	Where you see it
Conversion	Probability the contact converts to a closed deal	Ace Win Score field, Expected GCI (est.), opportunity ranking
Response	Probability the contact responds to outreach	Ranked daily queue, reach-out timing, Seller Radar contactability
Churn	Probability an engaged contact goes cold	Ace Churn Risk field, at-risk surfaces, save-lists
Appointment	Probability the contact books an appointment	Richer contact scores, on-demand single-contact scoring

The features feeding these models are engineered from your webhook stream — engagement velocity, inbound/outbound direction, response latency, channel behavior, deal context, IDX website activity — with strict as-of discipline: a training example only sees data that existed before its cutoff. That single rule is what separates a real predictive model from one that quietly "predicts" the past.
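
To make the as-of rule concrete, here is a minimal Python sketch. The `Event` shape, the feature names, and the 7-day window are assumptions chosen for illustration, not Ace's real schema. The only point it demonstrates is the filter `e.at < cutoff`: anything that happened at or after the cutoff is invisible to the training example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Event:
    contact_id: str
    kind: str        # hypothetical labels such as "call", "text", "email"
    direction: str   # "inbound" or "outbound"
    at: datetime


def features_as_of(events: list[Event], contact_id: str, cutoff: datetime,
                   window_days: int = 7) -> dict[str, Optional[float]]:
    """Build features using only events strictly before `cutoff`."""
    visible = [e for e in events
               if e.contact_id == contact_id and e.at < cutoff]
    window_start = cutoff - timedelta(days=window_days)
    recent = [e for e in visible if e.at >= window_start]
    inbound = sum(1 for e in recent if e.direction == "inbound")
    last_seen = max((e.at for e in visible), default=None)
    return {
        "events_in_window": float(len(recent)),
        "inbound_share": inbound / len(recent) if recent else 0.0,
        "days_since_last_event": (
            (cutoff - last_seen).total_seconds() / 86400.0
            if last_seen is not None else None
        ),
    }


cutoff = datetime(2024, 1, 10)
events = [
    Event("c1", "call", "outbound", datetime(2024, 1, 8)),
    Event("c1", "text", "inbound", datetime(2024, 1, 9, 12)),
    Event("c1", "email", "inbound", datetime(2024, 1, 11)),  # after cutoff
]
print(features_as_of(events, "c1", cutoff))
```

The email on January 11 is excluded because it did not exist yet at the cutoff, so the example sees two events rather than three.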

The loop that keeps the numbers honest

Most "AI lead scores" in real estate are static formulas. Ace's are predictions that get graded. The full loop:

Predict. Every scored contact gets conversion, response, churn, and appointment probabilities, and each prediction is recorded.
Label — hourly. The system checks each recorded prediction against what actually happened: did the contact respond or book an appointment within 24 hours? Within 7 days? Or go cold? Closed-won outcomes feed a separate, longer-horizon track.
Recalibrate — daily. Those labeled outcomes re-fit the calibration layer (isotonic regression — a monotone mapping from raw model output to observed outcome rates). This is what makes "Win Score 30" mean ≈30% instead of "higher than 25".
Serve — nightly. The latest calibrated scores sync onto every contact and diff-update the FUB custom fields.
Retrain — weekly. The labeled corpus grows as new leads arrive and deals close; the deployed models are re-evaluated on a fresh held-out slice, and retrained when due — behind a promotion guard: a candidate model that evaluates worse than the deployed one is never promoted, and the regression raises an internal alert instead.
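
The Recalibrate step can be illustrated with a small pool-adjacent-violators fit, the standard algorithm behind isotonic regression. This is a sketch in plain Python under stated assumptions: the names `fit_isotonic` and `calibrate` are hypothetical, the sample data is made up, and the result is a step function, whereas library implementations often interpolate between points.

```python
import bisect


def fit_isotonic(raw_scores, outcomes):
    """Fit a non-decreasing step function from raw score to outcome rate."""
    # Pool tied raw scores first so each distinct score forms one block.
    grouped = {}
    for score, outcome in zip(raw_scores, outcomes):
        total, count = grouped.get(score, (0.0, 0))
        grouped[score] = (total + outcome, count + 1)

    # Each block is [sum of outcomes, count, highest raw score covered].
    blocks = []
    for score in sorted(grouped):
        total, count = grouped[score]
        blocks.append([float(total), count, score])
        # Merge backwards while the previous block's rate exceeds this one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            total, count, upper = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = upper

    uppers = [b[2] for b in blocks]
    rates = [b[0] / b[1] for b in blocks]
    return uppers, rates


def calibrate(raw_score, uppers, rates):
    """Map a raw score to the observed rate of the block that covers it."""
    idx = bisect.bisect_left(uppers, raw_score)
    return rates[min(idx, len(rates) - 1)]


raw = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]
happened = [0, 0, 1, 0, 1, 1]  # made-up labeled outcomes
uppers, rates = fit_isotonic(raw, happened)
print(calibrate(0.35, uppers, rates))  # 0.5
```

In the sample data the raw scores 0.3 and 0.4 are out of order relative to their outcomes, so they are pooled into one block whose calibrated value is their shared observed rate.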

For the deeper technical story — the data warehouse, the leakage audits, and why we publish what's still dark — read the predictive scoring & BI flywheel deep-dive.

What the numbers mean in practice

conversionProb → Ace Win Score

The calibrated probability of a closed deal, shown as 0–100. Use it as your primary sort key. Because it's calibrated, a Smart List of 20-scored contacts should convert at roughly half the rate of a 40-scored list: both the ordering and the magnitude are meaningful.
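
One way to check a claim like this against your own history is a reliability table: group contacts by score and compare the average predicted probability with the share that actually closed. The sketch below is illustrative only; `reliability_table`, the 10-point bucket width, and the sample data are assumptions, not an Ace export format.

```python
def reliability_table(scores, converted, bucket_width=10):
    """Compare mean predicted probability with observed rate per score bucket."""
    buckets = {}
    for score, did_convert in zip(scores, converted):
        # A score of exactly 100 is folded into the top bucket.
        low = min(int(score // bucket_width) * bucket_width, 100 - bucket_width)
        total_score, wins, n = buckets.get(low, (0.0, 0, 0))
        buckets[low] = (total_score + score, wins + int(did_convert), n + 1)

    rows = []
    for low in sorted(buckets):
        total_score, wins, n = buckets[low]
        rows.append({
            "bucket": f"{low}-{low + bucket_width}",
            "n": n,
            "mean_predicted": total_score / n / 100,
            "observed_rate": wins / n,
        })
    return rows


# Made-up history: ten contacts scored 30, three of which closed.
scores = [30] * 10
converted = [True] * 3 + [False] * 7
for row in reliability_table(scores, converted):
    print(row)
```

For a well-calibrated score, `mean_predicted` and `observed_rate` stay close in every bucket that has enough contacts to be meaningful.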

churnProb → Ace Churn Risk

The probability an engaged contact goes cold, bucketed Low / Medium / High with conservative cutpoints — "High" is deliberately rare so it stays actionable rather than becoming alert noise.
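
A minimal sketch of the bucketing follows. The cutpoint values here are placeholders invented for the example; the actual cutpoints are not stated in this article. The shape is what matters: a high bar for "High" keeps that bucket small.

```python
# Hypothetical cutpoints, chosen only to illustrate a conservative "High".
MEDIUM_CUTPOINT = 0.35
HIGH_CUTPOINT = 0.70


def churn_bucket(churn_prob: float) -> str:
    """Map a calibrated churn probability to Low / Medium / High."""
    if not 0.0 <= churn_prob <= 1.0:
        raise ValueError("churn_prob must be between 0 and 1")
    if churn_prob >= HIGH_CUTPOINT:
        return "High"
    if churn_prob >= MEDIUM_CUTPOINT:
        return "Medium"
    return "Low"


for prob in (0.10, 0.50, 0.85):
    print(prob, churn_bucket(prob))
```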

expectedGci → Ace Expected GCI (est.)

Win probability × your account's real median commission per closed deal (from your own closed-deal history in the warehouse). Always an estimate, always labeled. It converts probability into dollars at stake, which is how team leads should rank follow-up.
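
The arithmetic is simple enough to sketch. The function name `expected_gci` and the commission figures below are made up for illustration; the only thing taken from the definition above is the formula, win probability times the median commission of the account's own closed deals.

```python
from statistics import median
from typing import Optional


def expected_gci(win_prob: float,
                 closed_deal_commissions: list[float]) -> Optional[float]:
    """Win probability times the median commission per closed deal."""
    if not closed_deal_commissions:
        return None  # no closed-deal history, so no estimate
    return win_prob * median(closed_deal_commissions)


# Made-up closed-deal history with a median commission of 9,000.
history = [8000.0, 9000.0, 12000.0]
estimate = expected_gci(0.30, history)
print(f"Expected GCI (est.): {estimate:,.0f}")
```

With a 30% Win Score and a 9,000 median commission, the estimate comes to about 2,700. It is a modeled figure, not booked revenue.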

responseProb and appointment propensity

These two mostly work behind the scenes — ordering the daily queue, timing reach-outs, and powering on-demand scoring when you ask Ace about a single contact. Appointment propensity answers the question agents actually care about mid-funnel: "who's likely to get on my calendar this week?" Appointments themselves are tracked as first-class outcomes — set, held, no-show — which is also what makes the model's own report card honest.

The scores drive the experience — not just the fields

The FUB custom fields are the exported half. Inside the product, the same consolidated model suite (Win Score, Churn Risk, Opportunity, Seller intent, Expected GCI (est.)) runs the day-to-day experience:

The contact launchpad. Open a contact in the embedded assistant and it leads with the model's read — a headline ("at risk of going cold", "one of your strongest opportunities"), the key signals, and quick actions that match.
What's Next suggestions. Ace's suggested next moves are drafted against the model signals, so a high-churn contact gets a save play and an active seller gets a listing conversation — not a generic follow-up.
The embed Intelligence dashboard. Your book ranked by the same scores, with a "Save first" list: engaged contacts the model flags as most likely to go cold, strongest first.
Team dashboards. Team Cockpit aggregates the models per agent — average Win Score of each agent's book, churn exposure, modeled GCI (est.) — next to production and response speed.

So if a number in a FUB Smart List and a suggestion inside Ace ever seem to agree suspiciously well — that's because they're the same number.

What the scores can't do (read this part)

They measure behavior, not souls. The models are trained largely on engagement behavior, which is a strong but imperfect proxy for intent. The serious buyer who never replies to anything will score low — keep nurturing low scorers; use the scores to allocate your personal attention, not as an exclusion filter.

Estimates are labeled. Expected GCI carries "(est.)" everywhere it appears because it is a modeled estimate, not booked revenue.

Scores don't write your messages. They tell you who to call first, not what to say. Judgment, relationships, and Fair-Housing-compliant communication stay with the humans (with Ace's compliance scan as the backstop on automated sends).

Frequently asked questions

What is a calibrated lead score?

A score whose value is a real probability: contacts scored 30 convert at roughly 30% over time. Ace records every prediction, labels it against actual outcomes at 24 hours and 7 days (plus a separate closed-won track), and re-fits the calibration daily.

How often do models retrain?

The labeled corpus refreshes weekly, models are re-evaluated on a fresh held-out slice weekly and retrained when due, calibration re-fits daily, and scores sync to contacts and FUB fields nightly.

What data is used, and is my data shared?

Models score your contacts using your account's own CRM activity. Cross-tenant benchmarks (like You vs Market) are a separate, anonymized system that only publishes cohort statistics backed by 10 or more teams. See Revenue Guard & You vs Market.

Why did my contact's Win Score change overnight?

One of three things happened: new activity fired a real-time rescore, the nightly model sync delivered fresh scores, or the daily recalibration adjusted the probability mapping. All three are the system working as designed. Check the Ace Last Analyzed field for the timestamp.

Where do the scores show up?

In the FUB custom fields (field guide), the contact launchpad and What's Next suggestions inside the embedded assistant, the embed Intelligence dashboard, the ranked daily queue, the admin dashboards (including per-agent aggregates in Team Cockpit), and via the MCP connector tools your AI assistant can call.

Put a calibrated Win Score on every contact

Win Score and Churn Risk populate free on every connected account. Connect Follow Up Boss once and see your own database ranked.
