
Automate Invoice Processing with n8n and Local AI

Extract line items, totals, and vendor details from PDF invoices using n8n and a local vision-capable LLM — no cloud APIs required.

Robson Pereira · May 31, 2026 · 14 min read
Screenshot of an n8n workflow processing invoice PDFs with local AI nodes.


Invoice processing is one of the most practical automation wins for small businesses. Every invoice needs data extracted, categorised, approved, and filed. Doing it by hand is slow; sending financial documents to a third-party AI is risky. n8n plus a local LLM gives you the middle path.

Why local AI matters for financial documents

Invoices contain sensitive information: business names, bank details, tax identifiers, and purchase data. Sending them to a cloud API for processing creates data exposure, compliance overhead, and dependency on an external service. Running extraction locally keeps everything under your control.

For the security fundamentals around self-hosted data, read How to Secure a Self-Hosted AI Server.

What you need

  • n8n running on your own infrastructure (Docker or direct install)
  • Ollama with a vision-capable local model (Llama 3.2 Vision, LLaVA, or Qwen-VL)
  • A watched folder or email inbox where invoices arrive
  • A database or spreadsheet for storing extracted data

Model selection for invoice extraction

| Model | Parameters | VRAM | Extraction quality | Speed |
|-------|-----------|------|-------------------|-------|
| LLaVA 7B | 7B | 8 GB | Good for structured invoices | Medium |
| Llama 3.2 Vision 11B | 11B | 12 GB | Excellent, handles handwriting | Slow |
| Qwen-VL 7B | 7B | 8 GB | Very good, strong OCR | Medium |

Build the extraction workflow

Step 1: Watch for new invoices

Set up a trigger node that watches a folder or polls an email inbox for new PDF attachments. n8n's **Email Trigger (IMAP)** node works well for this.
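
If invoices land in a local folder rather than a mailbox, the polling logic behind a folder trigger can be sketched in a few lines of Python. This is only an illustration: the function name `find_new_invoices` and the in-memory `seen` set are assumptions, not part of n8n, and a real workflow would persist the list of processed files between runs.

```python
from pathlib import Path


def find_new_invoices(folder: str, seen: set[str]) -> list[Path]:
    """Return PDFs in `folder` that have not been processed yet, oldest first."""
    pdfs = [
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    ]
    new = [p for p in pdfs if p.name not in seen]
    # Sort by modification time so invoices are handled in arrival order.
    new.sort(key=lambda p: p.stat().st_mtime)
    # Remember what was returned so the next poll skips these files.
    seen.update(p.name for p in new)
    return new
```

Calling the function on a schedule with the same `seen` set returns each PDF exactly once.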

Step 2: Convert PDF to image

Vision models need an image input, not raw PDF. Use a local tool like **pdftoppm** or a Python script node in n8n to convert each page:

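```python
import glob
import os
import subprocess

# Assumes the PDF has already been saved to disk and that fileName holds its path.
pdf_path = items[0]["binary"]["data"]["fileName"]
prefix = os.path.splitext(pdf_path)[0]

# Render every page at 300 DPI; pdftoppm writes <prefix>-<page>.png, one file per page.
subprocess.run(["pdftoppm", "-png", "-r", "300", pdf_path, prefix], check=True)

# Collect the generated page images in page order.
pages = sorted(glob.glob(glob.escape(prefix) + "-*.png"))
items[0]["json"]["image_path"] = pages[0] if pages else None
items[0]["json"]["image_paths"] = pages

return items
```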

Step 3: Extract with the AI agent node

Configure an AI Agent node with your local vision model and a prompt like:

> Extract the following fields from this invoice image: vendor_name, vendor_address, invoice_number, invoice_date, total_amount, tax_amount, currency, line_items (array of {description, quantity, unit_price, line_total}). Return the result as valid JSON.
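
To test the prompt outside n8n before wiring it into the AI Agent node, you can call Ollama's HTTP API directly. The sketch below assumes Ollama is listening on its default local port and that a vision model is pulled under the tag `llama3.2-vision`; the helper names `build_payload` and `extract_invoice` are illustrative only.

```python
import base64
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"

PROMPT = (
    "Extract the following fields from this invoice image: vendor_name, "
    "vendor_address, invoice_number, invoice_date, total_amount, tax_amount, "
    "currency, line_items (array of {description, quantity, unit_price, "
    "line_total}). Return the result as valid JSON."
)


def build_payload(image_bytes: bytes, model: str = "llama3.2-vision") -> dict:
    """Build the request body for a single-image extraction call."""
    return {
        "model": model,
        "prompt": PROMPT,
        # Ollama expects images as base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "format": "json",  # ask the model for JSON-only output
        "stream": False,   # return one complete response instead of chunks
    }


def extract_invoice(image_path: str, model: str = "llama3.2-vision") -> dict:
    """Send one page image to the local model and parse its JSON reply."""
    with open(image_path, "rb") as f:
        payload = build_payload(f.read(), model)
    request = urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # The generated text is in the "response" field and should itself be JSON.
    return json.loads(body["response"])
```

Keeping the payload builder separate from the network call makes the prompt and request shape easy to inspect without a running model.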

Step 4: Validate and store

Add a validation step to check that critical fields (invoice number, total amount) are present before writing to your accounting database. Flag incomplete extractions for human review.
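
A minimal sketch of that check, assuming the extraction arrives as a Python dict keyed by the field names from the prompt. The function name `validate_extraction` and the choice to return a list of problems are illustrative; adapt the required fields to your own accounting rules.

```python
REQUIRED_FIELDS = ("invoice_number", "total_amount")


def validate_extraction(data: dict) -> list[str]:
    """Return a list of problems; an empty list means the record can be stored."""
    # A field counts as missing when it is absent, None, or an empty string.
    problems = [
        f"missing {name}"
        for name in REQUIRED_FIELDS
        if data.get(name) in (None, "")
    ]

    total = data.get("total_amount")
    if total not in (None, ""):
        try:
            float(total)
        except (TypeError, ValueError):
            # Catches values such as "abc" or a nested object.
            problems.append("total_amount is not a number")

    return problems
```

Any record with a non-empty problem list can be routed to human review instead of the database.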

Handling multi-page invoices

Not all invoices are single-page. Loop through pages, extract each one, and merge the results. The AI agent can cross-reference page 2's line items with page 1's header information if you pass both extracts in the context.
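
One simple way to merge per-page results is sketched below. It assumes each page was extracted into a dict with the same field names, that header fields should take the first non-empty value found, and that line items simply accumulate in page order. The name `merge_pages` is hypothetical, and invoices that print running subtotals on every page would need a stricter rule for `total_amount`.

```python
def merge_pages(pages: list[dict]) -> dict:
    """Combine per-page extracts into one invoice record."""
    merged: dict = {"line_items": []}
    for page in pages:
        for key, value in page.items():
            if key == "line_items":
                # Line items accumulate across pages in page order.
                merged["line_items"].extend(value or [])
            elif value not in (None, "") and merged.get(key) in (None, ""):
                # Header fields: keep the first non-empty value seen.
                merged[key] = value
    return merged
```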

For advice on designing reliable extraction pipelines, read Build a Local RAG Pipeline That Actually Answers Questions.

Approval workflow

Add an approval step before any invoice is paid or recorded:

1. **Auto-approve** — invoices under a threshold (e.g. £100) that match expected patterns

2. **Flag for review** — invoices with unusual amounts, new vendors, or extraction confidence below 80%

3. **Manual confirm** — send a notification to the finance team with a summary and approval link
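
The three tiers above can be sketched as a small routing function. Everything here is an assumption made for illustration: the `Invoice` fields, the `confidence` score (a local model does not necessarily report one, so you may need to derive it from your validation checks), and the rule that a known vendor stands in for "expected patterns". Detecting unusual amounts is left out because it depends on your own history.

```python
from dataclasses import dataclass

AUTO_APPROVE_LIMIT = 100.0  # e.g. £100, as in the list above
MIN_CONFIDENCE = 0.80       # review anything scored below 80%


@dataclass
class Invoice:
    vendor_name: str
    total_amount: float
    confidence: float  # hypothetical score between 0 and 1


def route_invoice(invoice: Invoice, known_vendors: set[str]) -> str:
    """Return "review", "auto_approve", or "manual_confirm"."""
    # New vendors and low-confidence extractions always go to a human first.
    if invoice.vendor_name not in known_vendors or invoice.confidence < MIN_CONFIDENCE:
        return "review"
    # Small invoices from known vendors can skip the approval queue.
    if invoice.total_amount < AUTO_APPROVE_LIMIT:
        return "auto_approve"
    # Everything else needs explicit sign-off from the finance team.
    return "manual_confirm"
```

In n8n the returned label would typically drive a Switch node that sends each invoice down the matching branch.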

Conclusion

Invoice processing with n8n and local AI is achievable today. The combination gives you accurate data extraction, full data privacy, and a workflow that scales with your business. Start with one vendor's invoices, tune the prompt, then expand to more formats.

FAQ

Can this handle scanned invoices?

Yes, as long as the scan is readable. Vision models handle printed text well. Handwritten fields are less reliable.

What if the extraction is wrong?

Build a manual correction step into the workflow. Flag low-confidence extractions and let a human review and edit before the data enters your accounting system.

Do I need a GPU?

A GPU helps significantly with vision models. CPU-only inference on an 11B vision model will be slow.
