RevOps Impact Newsletter

Evidence-Based Lead Scoring Models

Jeff Ignacio
Feb 06, 2026

Every RevOps team I work with has a lead scoring model. Almost none of them trust it.

The pattern is always the same. Somebody built the model eighteen months ago. They assigned points based on a mix of intuition, sales feedback, and whatever the marketing team believed about their ideal customer profile at the time. The scores went live, MQL thresholds were set, and then everyone moved on to the next fire.

Fast forward to today. The product has changed. The ICP has shifted. Two new competitors emerged. The content library tripled in size. The behaviors that signaled buying intent a year ago might mean something completely different now. But the scoring model? Frozen in time.

This is the lead scoring decay problem, and it is more common than anyone wants to admit. The good news is that fixing it does not require a data science team, a six-figure predictive analytics platform, or an ML engineering hire. It requires your CRM data, a spreadsheet, and a few hours of focused work.

I am going to walk you through the exact process.

Why Lead Scores Decay

Lead scoring models are built on assumptions about which attributes and behaviors predict revenue. Those assumptions have a shelf life. And at most companies, nobody is checking the expiration date.

Think about what changes in a typical twelve-month window at a company scaling from $10M to $100M ARR. The sales team expands and starts working new segments. Marketing launches campaigns targeting a different persona. The product ships features that attract a different type of buyer. Pricing changes. The competitive landscape shifts. New content gets published that attracts different audiences. The buying committee evolves as the deal size grows.

Each of those changes alters the relationship between lead characteristics and actual deal outcomes. A job title that used to convert at 15% might drop to 6%. A content download that once correlated with pipeline creation might now attract researchers who never buy. Meanwhile, a new behavior pattern, maybe repeated visits to your integrations page, starts showing up in closed-won deals and your scoring model has no idea.

The result is predictable. Sales stops trusting the scores. Marketing keeps optimizing for MQLs that do not convert. Pipeline reviews become arguments about lead quality. And the RevOps team is stuck in the middle, defending a model they know is stale.

The fix is straightforward: validate your scoring model against actual conversion data on a regular cadence, and adjust the weights based on what the data shows. Here is how to do it with tools you already have.

Step 1: Build Your Analysis Dataset

The foundation of this entire exercise is a clean export from your CRM. You need enough historical data to identify patterns, and you need the right fields to test your assumptions.

  • Time range: Pull 12 months of leads minimum. If you have lower volume (fewer than 200 closed-won deals in that period), extend to 18 or 24 months. You need enough outcomes for the conversion rates to be meaningful.

  • Outcome field: Create a binary column where 1 = closed-won and 0 = closed-lost or disqualified. Exclude leads that are still open or stuck in early stages. You want clean outcomes to train against.

  • Attribute fields: These are the firmographic and demographic characteristics of the lead. Include company size (employee count or revenue range), industry, job title or seniority level, geography, and lead source. Pull anything you track consistently.

  • Behavioral fields: These are the actions the lead took before becoming an opportunity. Include page visits (especially pricing, demo, and product pages), content downloads, email engagement metrics, webinar attendance, free trial signups, and any product usage data you have access to.

A critical principle here: include fields you suspect do not matter. The entire point of this exercise is to let the data override your assumptions. If you only export the fields you already believe are important, you are just going to rebuild the same flawed model with a fresh coat of paint.

Export this as a CSV. Clean up any obvious garbage (blank rows, test records, duplicate entries). You should end up with something in the range of 500 to 5,000 rows depending on your volume. If you are below 200 total leads with outcomes, this exercise can still be useful, but treat the results as directional rather than definitive.
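
If you prefer to do that cleanup in code rather than by hand, a rough pandas sketch looks like the following. The file name and columns (deal_result for the outcome, email for catching test records) are placeholders; swap in whatever your CRM export actually contains.

import pandas as pd

# Load the CRM export. File and column names are placeholders; use whatever
# your CRM actually produces.
leads = pd.read_csv("lead_export.csv")

# Keep only leads with a resolved outcome: 1 = closed-won, 0 = closed-lost or disqualified.
# Leads that are still open do not belong in the analysis dataset.
leads = leads[leads["deal_result"].isin([0, 1])].copy()

# Strip the obvious garbage: fully blank rows, internal test records, exact duplicates.
leads = leads.dropna(how="all")
leads = leads[~leads["email"].str.contains("@yourcompany.com", na=False)]
leads = leads.drop_duplicates()

print(f"{len(leads)} leads with outcomes, {int(leads['deal_result'].sum())} closed-won")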

Step 2: The Spreadsheet Method

This is the path that requires zero technical skills beyond basic spreadsheet literacy. It is also surprisingly powerful for the amount of effort involved.

The core technique is simple: calculate the conversion rate for each value of each attribute, then compare those rates to your overall baseline.

  • Calculate your baseline: Take the total number of closed-won leads and divide by the total number of leads in your dataset. This is your overall conversion rate. Write it down. Everything else gets measured against this number.

  • Segment and compare: For each attribute field, calculate the conversion rate by segment. What is the win rate for Enterprise leads vs. Mid-Market vs. SMB? For VP-level contacts vs. Directors vs. Managers? For leads from organic search vs. paid ads vs. events?

Use COUNTIFS to do this efficiently. In a new tab, list each unique value for an attribute in column A. In column B, use COUNTIFS to count total leads with that value. In column C, use COUNTIFS to count closed-won leads with that value. In column D, divide column C by column B to get the segment conversion rate.

Now calculate the lift: divide each segment’s conversion rate by the baseline rate. If your baseline is 8% and VP-level contacts close at 14%, that is a 1.75x lift. If individual contributors close at 3%, that is a 0.38x lift. The lift number is your friend here because it normalizes everything to a common scale regardless of the underlying percentages.

Do this for every attribute and every behavioral field. The output is a table showing which factors actually correlate with winning and by how much.
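
For those who would rather script the same math than maintain a COUNTIFS tab, here is a minimal pandas sketch of the same layout, assuming the cleaned dataset from Step 1 with a deal_result outcome column. The attribute names are placeholders.

# Overall baseline: closed-won leads divided by total leads.
baseline = leads["deal_result"].mean()

# The pandas equivalent of the COUNTIFS tab: total leads, closed-won leads,
# conversion rate, and lift vs. baseline for every value of one attribute.
def lift_table(df, attribute, outcome="deal_result"):
    table = df.groupby(attribute)[outcome].agg(total="count", won="sum")
    table["conversion_rate"] = table["won"] / table["total"]
    table["lift"] = table["conversion_rate"] / baseline
    return table.sort_values("lift", ascending=False)

# Repeat for every attribute and behavioral field you exported.
for column in ["seniority", "company_size", "lead_source", "visited_pricing_page"]:
    print(lift_table(leads, column), "\n")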

Assign points proportional to lift: Convert your lift values into a scoring scale. A simple approach: multiply the lift by 10 to get point values. A 1.75x lift becomes 17.5 points. A 0.38x lift becomes 3.8 points. Round as needed. For behavioral fields that are binary (did they visit the pricing page or not), the same math applies.
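
Continuing that sketch, the lift-times-ten conversion is one extra column (the attribute name is again a placeholder):

# Convert lift into points: multiply by 10, so a 1.75x lift becomes 17.5 points
# and a 0.38x lift becomes 3.8 points.
seniority = lift_table(leads, "seniority")
seniority["points"] = (seniority["lift"] * 10).round(1)
print(seniority[["conversion_rate", "lift", "points"]])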

What you end up with is a scoring model where every weight is backed by actual conversion data from your pipeline. It took a few hours of spreadsheet work. No code required.

The template included with this article has this structure pre-built. Download it, swap in your data, and the formulas do the rest.

Step 3: The AI-Assisted Method

If you want to go further, or if you want to validate what the spreadsheet method found, there is a faster path that uses AI tools most RevOps professionals already have access to.

Export the same CSV dataset from Step 1. Open Claude, ChatGPT, or whichever AI assistant your team uses. Upload the file and use a prompt along these lines:

“Analyze this lead dataset. The column [deal_result] contains the outcome, where 1 means closed-won and 0 means closed-lost. Identify which attributes and behaviors most strongly correlate with a closed-won outcome. Rank them by predictive strength. Then suggest a point-based scoring model with weights for each significant factor.”

What you will typically get back is a ranked list of factors with their correlation strengths, a suggested scoring table, and often some observations about interactions between variables that you would not have spotted manually. For example, the AI might flag that VP-level titles from companies with 200+ employees convert at 3x the rate of VP-level titles from smaller companies, suggesting you need a compound score for that combination.

From there, you can take it one step further. Ask the AI to build you a simple calculator or app where you input a lead’s attributes and get a score back. This is the “vibe coded” piece. You are not deploying a production ML model. You are getting a working tool built from a conversation that your team can use to pressure-test scores against the evidence.
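
To make the calculator idea concrete, the tool that conversation produces usually boils down to something like the sketch below: a weight table plus a function that adds up points. Every field name and point value here is invented for illustration; yours come from your own analysis.

# A hypothetical weight table; replace it with the point values your analysis produced.
WEIGHTS = {
    "seniority": {"VP": 17.5, "Director": 12.0, "Manager": 8.0, "IC": 3.8},
    "company_size": {"Enterprise": 15.0, "Mid-Market": 11.0, "SMB": 6.0},
    "visited_pricing_page": {True: 14.0, False: 0.0},
}

def score_lead(lead: dict) -> float:
    # Sum the points for every attribute value we have a weight for;
    # unknown or missing values simply contribute zero.
    return sum(table.get(lead.get(field), 0.0) for field, table in WEIGHTS.items())

print(score_lead({"seniority": "VP", "company_size": "Enterprise",
                  "visited_pricing_page": True}))  # 46.5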

The real value of this approach is speed and pattern detection. The spreadsheet method gives you the same directional answers for individual factors, and you should absolutely do it first so you understand the data yourself. The AI method lets you run the analysis in minutes, explore interactions between variables that would take hours to test manually, and surface non-obvious combinations that a human reviewing pivot tables would likely miss. Use the spreadsheet method to build your intuition and the AI method to stress-test it.

Step 4: Validate Before You Deploy

Before you rip out your current scoring model and replace it, run a validation pass. Skipping this step is how teams end up replacing one set of bad assumptions with a different set of bad assumptions.

Take your new scores and apply them retroactively to a holdout sample. Pull a set of leads that were not in your analysis dataset (the most recent quarter works well for this). Score each lead using your new model. Then check: do the leads with the highest scores actually have the highest win rates? Do the lowest-scored leads have the lowest?

Plot this in a simple chart. Split your scored leads into quartiles (top 25%, second 25%, third 25%, bottom 25%) and compare the actual conversion rate of each quartile. If your model is working, you should see a clear staircase pattern: each quartile converts at a progressively lower rate as you move from top to bottom.

If the staircase is noisy or flat, something is off. The most common culprits are data quality issues, fields with too many missing values diluting the signal, or a sample size that was too small for the patterns to be statistically meaningful. Go back to your dataset and check for these problems before adjusting the model itself.

A practical sanity check: compare your new model’s quartile separation to your current model’s. Apply both scoring approaches to the same holdout sample and see which one produces a steeper staircase. If the new one shows sharper differentiation between high-score and low-score leads, you have a clear and defensible improvement. Share that comparison with sales leadership. It makes the case for adopting the new scores much easier when you can show the math side by side rather than asking people to trust a new methodology on faith.
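
If you would rather script the staircase and the side-by-side comparison than build the chart by hand, a sketch along these lines works. It assumes a holdout file scored by both models; every column name is a placeholder.

import pandas as pd

# Split a scored holdout sample into quartiles and compare actual conversion rates.
# Ranking first avoids qcut errors when many leads share the same score.
def quartile_staircase(df, score_column, outcome="deal_result"):
    quartiles = pd.qcut(df[score_column].rank(method="first"), q=4,
                        labels=["Bottom 25%", "Third 25%", "Second 25%", "Top 25%"])
    return df.groupby(quartiles)[outcome].mean()

holdout = pd.read_csv("holdout_leads.csv")  # most recent quarter, scored by both models
print("New model:")
print(quartile_staircase(holdout, "new_score"))
print("Current model:")
print(quartile_staircase(holdout, "current_score"))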

Step 5: Build the Revisit Rhythm

This is where most teams fall down. They build a scoring model, deploy it, and forget about it. The model decays. Trust erodes. And twelve months later, someone writes another article about fixing lead scoring.

Break the cycle with a simple operating cadence.

  • Weekly (5 minutes): Glance at the distribution of scores across new leads entering the funnel. Are scores clustering in unexpected ways? Is a disproportionate share of leads landing in the “high score” bucket? Distribution shifts are an early warning sign that something in your inputs has changed.

  • Monthly (30 minutes): Pull the leads that were scored 90 days ago. What percentage of high-scored leads actually converted? What percentage of low-scored leads surprised you by closing? Track these hit rates over time. If your high-score conversion rate drops by more than 20% relative to the prior period, it is time for a deeper review. A sketch of this check follows the list below.

  • Quarterly (2 to 3 hours): Re-run the full analysis from Step 2 or Step 3 on fresh data. Compare the new factor rankings to the previous quarter. Look for signals that are gaining or losing predictive power. Update your weights accordingly. Document what changed and why.

  • Trigger-based (immediate): If your company launches a new product, enters a new market segment, changes pricing, or shifts the ICP definition, re-run the analysis immediately. Your old model was trained on a different reality. Waiting for the next quarterly review means running on stale assumptions for months.
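
For the monthly check in particular, the math is simple enough to script once and rerun. The sketch below assumes a fresh export of scored leads with score_date, score, and deal_result columns; the file name, the high-score threshold, and the prior-period rate are all placeholders for your own values.

import pandas as pd
from datetime import date, timedelta

HIGH_SCORE_THRESHOLD = 40.0   # whatever counts as "high score" in your model
prior_period_hit_rate = 0.12  # last period's high-score conversion rate

scored_leads = pd.read_csv("scored_leads.csv")
scored_leads["score_date"] = pd.to_datetime(scored_leads["score_date"])
today = pd.Timestamp(date.today())

# Leads scored roughly 90 days ago (a 90-to-120-day window), so outcomes have had time to land.
cohort = scored_leads[scored_leads["score_date"].between(today - timedelta(days=120),
                                                         today - timedelta(days=90))]
hit_rate = cohort.loc[cohort["score"] >= HIGH_SCORE_THRESHOLD, "deal_result"].mean()

# Flag a deeper review if the hit rate fell more than 20% relative to the prior period.
if hit_rate < 0.8 * prior_period_hit_rate:
    print(f"High-score conversion {hit_rate:.1%} is down more than 20% vs. prior period.")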

Assign a specific person on the RevOps team to own this cadence. Put the monthly and quarterly reviews on the calendar. Treat them like any other operating rhythm you run for pipeline reviews or forecasting. If nobody owns it, it will not happen.

What This Looks Like in Practice

I worked with a client recently where the existing lead scoring model had been untouched for fourteen months. Marketing was generating MQLs at target volume, but sales was rejecting over 40% of them as unqualified. The finger-pointing was constant.

We ran this exact exercise. Exported twelve months of leads, calculated segment conversion rates, and compared them to the scoring weights in the existing model. Three findings stood out.

First, webinar attendance had been assigned 15 points in the original model. The actual data showed webinar attendees converted at roughly the same rate as the baseline. Those 15 points were noise.

Second, repeat visits to the integrations documentation page (three or more visits within a two-week window) correlated with a 3.2x lift over baseline. The original model did not track this behavior at all.

Third, the “Enterprise” company size tier was scored uniformly, but when we broke it out by industry, the conversion rates varied by a factor of four. Manufacturing enterprises closed at nearly 20%. Media enterprises closed at under 5%. The flat score was masking a huge signal.

We rebuilt the model based on the evidence, deployed the updated scores, and within six weeks the MQL rejection rate dropped from 40% to under 15%. Sales started trusting the scores because the scores started reflecting reality.

The entire analysis took one afternoon.

You can do it!

You do not need a data science team to build a lead scoring model that works. You need CRM data with clean outcomes, a few hours with a spreadsheet or AI assistant, and a willingness to let the evidence override opinions that may have been true once and are not true anymore.

The spreadsheet template attached to this article will get you started. Export your data, plug it in, read the lift factors, and have an honest conversation with your sales and marketing leaders about what the numbers say. You will almost certainly find that some of your highest-weighted factors are not actually predictive, and that some behaviors you have been ignoring are strong signals hiding in plain sight.

Then put the revisit rhythm on the calendar and actually do it. The model you build today will start decaying the moment you deploy it. That is not a failure of the model. It is the nature of a business that changes over time. The difference between teams that get value from lead scoring and teams that abandon it is not sophistication. It is maintenance. It is someone owning the cadence and keeping the model honest.

Go build something you can defend with data. Your sales team will thank you for it.

Paid subscribers get the Lead Scoring Analysis Template with pre-built formulas that calculate conversion rates, lift vs. baseline, and suggested point values for every factor in your pipeline👇
