Build a Local Lead Scoring System with n8n and Embeddings

Score inbound leads automatically by comparing their profiles and behaviour against your best customers using local embeddings and n8n.

Robson Pereira | May 31, 2026 | 10 min read
n8n workflow showing lead scoring pipeline with embedding comparison nodes.

Lead scoring helps small businesses focus sales effort where it matters most. Instead of manual qualification or expensive cloud CRMs, you can build a private scoring system using n8n and local embedding models that compares inbound leads against your best existing customers.

Why embeddings for lead scoring

Embeddings convert text (company description, job title, website content) into vectors. By comparing the vector of a new lead against vectors of your best customers, you get a similarity score that reflects how closely the lead matches your ideal customer profile.

This approach is:

  • **Private** — no lead data leaves your server
  • **Custom** — scored against your actual best customers, not generic demographics
  • **Automatic** — n8n processes each lead as it arrives

For the technical foundations of embeddings, see Choosing the Best Embedding Model for Local Search.
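
To make the comparison concrete, here is a minimal sketch of cosine similarity in plain Python. The three-dimensional vectors are invented for illustration; a real embedding model returns vectors with hundreds of dimensions, and the function name is just a placeholder.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)

# Toy vectors, invented for illustration only.
best_customer = [0.9, 0.1, 0.3]
similar_lead = [0.8, 0.2, 0.4]
unrelated_lead = [-0.1, 0.9, -0.5]

# The lead that points in nearly the same direction scores higher.
print(cosine_similarity(best_customer, similar_lead)
      > cosine_similarity(best_customer, unrelated_lead))  # prints True
```

A score near 1 means the two texts point in almost the same direction in embedding space; scores near 0 or below mean they have little in common.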

Architecture

```
Lead arrives (web form, email, CRM API)
  ↓
n8n extracts lead data (company, industry, role, website)
  ↓
Scrape website content (optional, via HTTP node)
  ↓
Generate embedding via local model
  ↓
Compare against reference embeddings of best customers
  ↓
Calculate similarity score (cosine similarity)
  ↓
Classify: Hot / Warm / Cool / Cold + route to appropriate sales queue
```

Step-by-step build

Step 1: Define your ideal customer profile

Gather 20-50 records of your best customers. For each, collect:

  • Company name and description
  • Industry and size
  • Key decision-maker titles
  • Website content summary

Generate embeddings for each using your local model. Store the reference vectors in a SQLite database alongside the customer metadata.
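
One way to store the reference vectors is sketched below with Python's built-in `sqlite3` module. The table name, column names, and file name are assumptions made for illustration, and each vector is stored as a JSON-encoded list for simplicity rather than as a binary blob.

```python
import json
import sqlite3

def open_reference_store(path="reference_customers.db"):
    """Open (or create) the SQLite file holding reference customers."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS reference_customers (
               name TEXT PRIMARY KEY,
               industry TEXT,
               description TEXT,
               embedding TEXT NOT NULL  -- JSON-encoded list of floats
           )"""
    )
    return conn

def save_reference(conn, name, industry, description, embedding):
    """Insert or overwrite one customer's metadata and vector."""
    conn.execute(
        "INSERT OR REPLACE INTO reference_customers VALUES (?, ?, ?, ?)",
        (name, industry, description, json.dumps(embedding)),
    )
    conn.commit()

def load_references(conn):
    """Return every reference as {"name": ..., "embedding": [...]}."""
    rows = conn.execute("SELECT name, embedding FROM reference_customers")
    return [{"name": n, "embedding": json.loads(e)} for n, e in rows]
```

Keying the table on the customer name means saving the same customer again simply overwrites the old vector, which suits periodic re-embedding.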

Step 2: Build the n8n ingestion workflow

Create a webhook that receives new lead data. The payload should include:

```json
{
  "company": "Example Corp",
  "industry": "Healthcare Tech",
  "description": "SaaS platform for clinic management",
  "source": "website_form",
  "contact_role": "CTO"
}
```

Step 3: Generate lead embedding

Use an HTTP node to call your embedding model endpoint:

```bash
curl -X POST http://localhost:11434/api/embeddings \
  -H "Content-Type: application/json" \
  -d '{
    "model": "nomic-embed-text",
    "prompt": "Example Corp: SaaS platform for clinic management. Industry: Healthcare Tech. Contact: CTO"
  }'
```
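
If you prefer to test the endpoint from a script before wiring up the HTTP node, the same call can be sketched in Python with only the standard library. The function names and the 30-second timeout are illustrative choices, and the sketch assumes the response carries the vector in an `embedding` field, as Ollama's `/api/embeddings` endpoint does.

```python
import json
import urllib.request

EMBED_URL = "http://localhost:11434/api/embeddings"  # endpoint from the curl call above

def build_embedding_request(text, model="nomic-embed-text"):
    """Build the POST request without sending it."""
    body = json.dumps({"model": model, "prompt": text}).encode("utf-8")
    return urllib.request.Request(
        EMBED_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def embed(text, model="nomic-embed-text", timeout=30):
    """Send the request and return the vector as a list of floats."""
    request = build_embedding_request(text, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        payload = json.load(resp)
    # Assumption: the vector is returned under the "embedding" key.
    return payload["embedding"]
```

Splitting request construction from sending makes it easy to inspect the exact payload when a call fails.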

Step 4: Score against reference embeddings

Use a Code node (Python or JavaScript) to compute cosine similarity:

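```python
import numpy as np

# Assumes an earlier node attached the lead embedding and the list of
# reference embeddings to the first input item.
lead_vector = np.array(items[0]["json"]["embedding"])
reference_vectors = items[0]["json"]["reference_embeddings"]

scores = []
for ref in reference_vectors:
    ref_vector = np.array(ref["embedding"])
    # Cosine similarity: dot product divided by the product of the norms.
    similarity = np.dot(lead_vector, ref_vector) / (
        np.linalg.norm(lead_vector) * np.linalg.norm(ref_vector)
    )
    scores.append({"customer": ref["name"], "score": float(similarity)})

# Best-matching reference customers first.
items[0]["json"]["scores"] = sorted(scores, key=lambda x: x["score"], reverse=True)
# Highest similarity to any single reference customer.
items[0]["json"]["max_score"] = max(s["score"] for s in scores)

return items
```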

Step 5: Route based on score

| Score range | Classification | Action                            |
|-------------|----------------|-----------------------------------|
| > 0.8       | Hot            | Notify senior sales within 1 hour |
| 0.6 - 0.8   | Warm           | Add to nurture sequence           |
| 0.4 - 0.6   | Cool           | Add to monthly newsletter         |
| < 0.4       | Cold           | Log for analytics only            |
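
The thresholds in the table can be expressed as a small function, sketched below. The table does not say which tier a score of exactly 0.8, 0.6, or 0.4 belongs to, so the boundary handling here is an assumption: 0.8 falls into Warm because Hot is defined as strictly greater than 0.8, and each lower bound is inclusive.

```python
def classify(score):
    """Map a cosine similarity score to the tiers in the routing table."""
    if score > 0.8:
        return "Hot"
    if score >= 0.6:
        return "Warm"
    if score >= 0.4:
        return "Cool"
    return "Cold"

# Actions copied from the routing table.
ACTIONS = {
    "Hot": "Notify senior sales within 1 hour",
    "Warm": "Add to nurture sequence",
    "Cool": "Add to monthly newsletter",
    "Cold": "Log for analytics only",
}

tier = classify(0.72)
print(tier, "->", ACTIONS[tier])  # prints: Warm -> Add to nurture sequence
```

Treat the cut-offs as starting points. Similarity scores from a given embedding model often cluster in a narrow band, so it is worth checking them against a sample of leads whose quality you already know.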

Keeping scores accurate

Review your ideal customer profile quarterly. Remove customers who churned or were low-value. Add new high-value customers. Re-generate reference embeddings after each update.
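
A quarterly review boils down to comparing the stored reference list with the current list of best customers. The helper below is a hypothetical sketch of that comparison; the function name and the shape of the returned plan are assumptions, not part of n8n.

```python
def plan_refresh(stored_names, current_best_names):
    """Work out which reference customers to drop, add, and re-embed."""
    stored = set(stored_names)
    current = set(current_best_names)
    return {
        "remove": sorted(stored - current),    # churned or low-value customers
        "add": sorted(current - stored),       # new high-value customers
        "re_embed": sorted(stored & current),  # kept, but descriptions may have changed
    }

plan = plan_refresh(["A", "B", "C"], ["B", "C", "D"])
print(plan)  # prints {'remove': ['A'], 'add': ['D'], 're_embed': ['B', 'C']}
```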

For operational guidance on running these systems, see Private AI Productivity Systems for Founders.

Conclusion

A local lead scoring system with n8n and embeddings is a powerful addition to any small business sales process. It keeps lead data private, scores against your actual best customers, and runs automatically as leads arrive. Start with 20 reference records and expand as you validate the approach.

FAQ

Do I need many reference customers?

Start with 20-50 high-quality records. Quality matters more than quantity — use customers who genuinely are your best fit, not just any customer.

How often should I regenerate embeddings?

Regenerate reference embeddings quarterly, or whenever you update your ideal customer profile.

Can this score leads from different sources?

Yes. Normalise the input text to a consistent format before generating the embedding, regardless of whether the lead came from a web form, email, or CRM import.
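
A minimal sketch of such a normalisation step is shown below. It assumes every source has already been mapped to the field names of the webhook payload in Step 2, and it produces the same text layout as the prompt in Step 3; the function name is illustrative.

```python
def lead_to_text(lead):
    """Render a lead dict as 'Company: description. Industry: X. Contact: Y'."""
    def clean(value):
        # Collapse runs of whitespace; missing fields become empty strings.
        return " ".join(str(value or "").split())

    company = clean(lead.get("company"))
    description = clean(lead.get("description")).rstrip(".")
    industry = clean(lead.get("industry"))
    role = clean(lead.get("contact_role"))

    parts = [f"{company}: {description}" if description else company]
    if industry:
        parts.append(f"Industry: {industry}")
    if role:
        parts.append(f"Contact: {role}")
    return ". ".join(p for p in parts if p)

lead = {
    "company": "Example Corp",
    "industry": "Healthcare Tech",
    "description": "SaaS platform for clinic management",
    "source": "website_form",
    "contact_role": "CTO",
}
print(lead_to_text(lead))
```

The `source` field is deliberately left out of the text so that the channel a lead arrived through does not influence its similarity score.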
