Tutorials
Automate Invoice Processing with n8n and Local AI
Extract line items, totals, and vendor details from PDF invoices using n8n and a local vision-capable LLM — no cloud APIs required.

Automate Invoice Processing with n8n and Local AI
Invoice processing is one of the most practical automation wins for small businesses. Every invoice needs data extracted, categorised, approved, and filed. Doing it by hand is slow; sending financial documents to a third-party AI is risky. n8n plus a local LLM gives you the middle path.
Why local AI matters for financial documents
Invoices contain sensitive information: business names, bank details, tax identifiers, and purchase data. Sending them to a cloud API for processing creates data exposure, compliance overhead, and dependency on an external service. Running extraction locally keeps everything under your control.
For the security fundamentals around self-hosted data, read How to Secure a Self-Hosted AI Server.
What you need
- n8n running on your own infrastructure (Docker or direct install)
- Ollama with a vision-capable local model (Llama 3.2 Vision, LLaVA, or Qwen-VL)
- A watched folder or email inbox where invoices arrive
- A database or spreadsheet for storing extracted data
Model selection for invoice extraction
| Model | Parameters | VRAM | Extraction quality | Speed |
|-------|-----------|------|-------------------|-------|
| LLaVA 7B | 7B | 8 GB | Good for structured invoices | Medium |
| Llama 3.2 Vision 11B | 11B | 12 GB | Excellent, handles handwriting | Slow |
| Qwen-VL 7B | 7B | 8 GB | Very good, strong OCR | Medium |
Build the extraction workflow
Step 1: Watch for new invoices
Set up a trigger node that watches a folder or polls an email inbox for new PDF attachments. n8n's **Email Trigger (IMAP)** node works well for this.
Step 2: Convert PDF to image
Vision models need an image input, not raw PDF. Use a local tool like **pdftoppm** or a Python script node in n8n to convert each page:
```python
import subprocess, os
pdf_path = items[0]["binary"]["data"]["fileName"]
output_path = pdf_path.replace(".pdf", ".png")
subprocess.run(["pdftoppm", "-png", "-r", "300", pdf_path, output_path.replace(".png", "")])
items[0]["json"]["image_path"] = output_path
return items
```
Step 3: Extract with the AI agent node
Configure an AI Agent node with your local vision model and a prompt like:
> Extract the following fields from this invoice image: vendor_name, vendor_address, invoice_number, invoice_date, total_amount, tax_amount, currency, line_items (array of {description, quantity, unit_price, line_total}). Return the result as valid JSON.
Step 4: Validate and store
Add a validation step to check that critical fields (invoice number, total amount) are present before writing to your accounting database. Flag incomplete extractions for human review.
Handling multi-page invoices
Not all invoices are single-page. Loop through pages, extract each one, and merge the results. The AI agent can cross-reference page 2's line items with page 1's header information if you pass both extracts in the context.
For advice on designing reliable extraction pipelines, read Build a Local RAG Pipeline That Actually Answers Questions.
Approval workflow
Add an approval step before any invoice is paid or recorded:
1. **Auto-approve** — invoices under a threshold (e.g. £100) that match expected patterns
2. **Flag for review** — invoices with unusual amounts, new vendors, or extraction confidence below 80%
3. **Manual confirm** — send a notification to the finance team with a summary and approval link
Conclusion
Invoice processing with n8n and local AI is achievable today. The combination gives you accurate data extraction, full data privacy, and a workflow that scales with your business. Start with one vendor's invoices, tune the prompt, then expand to more formats.
FAQ
Can this handle scanned invoices?
Yes, as long as the scan is readable. Vision models handle printed text well. Handwritten fields are less reliable.
What if the extraction is wrong?
Build a manual correction step into the workflow. Flag low-confidence extractions and let a human review and edit before the data enters your accounting system.
Do I need a GPU?
A GPU helps significantly with vision models. CPU-only inference on an 11B vision model will be slow.


