How to Add Local Documents to Open WebUI with RAG and Ollama
Build a private document chatbot in Open WebUI with Ollama embeddings, local PDFs, and practical RAG tuning tips for better answers.

If you already run Open WebUI and Ollama, the next big upgrade is turning your chat interface into a private document assistant. With retrieval-augmented generation (RAG), you can ask questions about manuals, runbooks, project notes, PDFs, and markdown files without sending anything to a cloud service.
This guide shows you how to build a practical local-docs workflow in Open WebUI, keep it accurate, and avoid the common mistakes that make RAG feel flaky. If you still need the base stack, start with "How to Deploy Open WebUI and Ollama on a Private LAN with Docker Compose" and "Docker Setup for Local AI Tools". For hardening tips, pair this with "How to Secure a Self-Hosted AI Server".
Why use Open WebUI for local document search?
Open WebUI gives you a simple front end for chat, document upload, and knowledge bases. Ollama handles the local model runtime, which means your prompts and your uploaded documents stay inside your own environment.
That matters for three reasons:
- You can keep sensitive PDFs, SOPs, and internal notes off third-party SaaS platforms.
- You can use the same interface for normal chat and document Q&A.
- You can update your knowledge base as your files change, instead of copying and pasting chunks into every prompt.
RAG works best when you treat it like a small information system, not magic. The model still needs clean files, a solid embedding model, and a sensible way to test whether retrieval is actually working.
What you need before you start
You need:
- A running Open WebUI instance
- A running Ollama instance
- At least one chat model, such as `llama3.1:8b` or another local model you trust
- One embedding model for document indexing
- A small set of documents to test with first
If your machine is memory-constrained, start with a modest chat model and a lightweight embedding model. A good first pass is:
```bash
ollama pull llama3.1:8b
ollama pull nomic-embed-text
```
If you prefer a larger embedding model, you can switch later, but start simple so you can debug the pipeline without extra variables.
Step 1: Make sure Ollama is reachable
Open WebUI needs to talk to Ollama over HTTP. Before you touch the UI, verify the Ollama API is alive and the models are installed.
```bash
curl http://localhost:11434/api/tags
```
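You should see the models you pulled. If both services run in Docker Compose, Open WebUI reaches Ollama by its service name (for example `http://ollama:11434`), not `localhost`, because `localhost` inside the Open WebUI container refers to that container itself.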
You can also test embeddings directly:
```bash
curl http://localhost:11434/api/embeddings -H 'Content-Type: application/json' -d '{"model":"nomic-embed-text","prompt":"Open WebUI can index local documents."}'
```
If that request fails, fix Ollama first. RAG only works when the embedding endpoint works reliably.
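To make the model check repeatable, the `/api/tags` output can be piped through a small helper. This is a minimal sketch: `missing_models` is a hypothetical function name, and it assumes each installed model appears in the response as a `"name":"..."` field. It matches by prefix, so `nomic-embed-text` also matches `nomic-embed-text:latest`.

```bash
# Reads the /api/tags JSON on stdin and prints every required model
# (given as arguments) that is not listed in it.
missing_models() {
  local tags_json model
  # Remove whitespace so the match works for compact and pretty-printed JSON.
  tags_json="$(tr -d ' \n\t')"
  for model in "$@"; do
    # Fixed-string prefix match on the "name" field.
    if ! printf '%s' "$tags_json" | grep -Fq "\"name\":\"$model"; then
      echo "$model"
    fi
  done
}

# Usage:
#   curl -s http://localhost:11434/api/tags | missing_models llama3.1:8b nomic-embed-text
```

No output means every required model was found. Any printed name is a model you still need to pull.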
Step 2: Create a knowledge base in Open WebUI
In recent Open WebUI releases, knowledge bases live under the Workspace area in a Knowledge section, where you can create a collection and upload files to it. The exact menu location can vary by version.
Use a clean naming scheme so collections stay understandable over time. Good examples:
- `engineering-runbooks`
- `home-lab-docs`
- `client-project-alpha`
- `hr-policies`
For your first test, keep the set small. Upload three to five documents that are easy to verify, such as:
- a short markdown file with known facts
- one PDF manual
- one text export from a notebook or wiki
- a runbook with step numbers and commands
That makes it obvious whether retrieval is working because you already know the answers.
Step 3: Prepare documents for better retrieval
RAG quality depends heavily on the source material. If your documents are messy, the answers will be messy too.
Use clean text when possible
Markdown, plain text, and well-structured notes usually outperform giant unprocessed PDFs. Before uploading, convert whatever you can into a cleaner format.
Useful tools:
```bash
pandoc handbook.docx -t markdown -o handbook.md
pdftotext manual.pdf manual.txt
```
If you have scanned PDFs, OCR them first:
```bash
ocrmypdf --deskew --clean input-scan.pdf output-ocr.pdf
```
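If you are not sure whether a PDF is a scan, one rough heuristic is to count the words that text extraction returns. The sketch below assumes `pdftotext` (from poppler) is installed. `has_text_layer` is a hypothetical helper, and its default threshold of 50 words is an arbitrary starting point.

```bash
# Reads extracted text on stdin and succeeds if it holds at least
# MIN_WORDS words (default 50). Very short output usually means a scan.
has_text_layer() {
  local min_words="${1:-50}"
  local count
  count="$(wc -w | tr -d '[:space:]')"
  [ "$count" -ge "$min_words" ]
}

# Usage: "-" makes pdftotext write to stdout.
#   pdftotext manual.pdf - | has_text_layer 50 || echo "manual.pdf probably needs OCR"
```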
Split huge documents
A 300-page PDF is not ideal as a first RAG source. If possible, split large books, vendor manuals, and policy bundles into smaller topic-specific files.
Instead of one giant `ops-manual.pdf`, try:
- `backups.md`
- `networking.md`
- `troubleshooting.md`
- `restore-procedures.md`
Smaller files make it easier for retrieval to land on the right passage.
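If the large document is already markdown, splitting can be scripted. The following is a sketch, not a general-purpose tool. `split_by_heading` is a hypothetical helper that starts a new file at every top-level `# ` heading and skips lines inside fenced code blocks, so shell comments are not mistaken for headings. It discards text before the first heading and appends to existing files, so point it at an empty output directory.

```bash
# Split SRC into one file per top-level "# Heading" inside OUTDIR.
split_by_heading() {
  local src="$1" outdir="$2"
  mkdir -p "$outdir"
  awk -v outdir="$outdir" '
    BEGIN { fence = sprintf("%c%c%c", 96, 96, 96) }   # three backticks
    index($0, fence) == 1 { in_fence = !in_fence }    # track fenced code blocks
    !in_fence && /^# / {
      # File name from heading text: lowercase, other characters become hyphens.
      name = tolower(substr($0, 3))
      gsub(/[^a-z0-9]+/, "-", name)
      gsub(/^-+|-+$/, "", name)
      if (out != "") close(out)
      out = outdir "/" name ".md"
    }
    out != "" { print >> out }
  ' "$src"
}

# Usage:
#   split_by_heading ops-manual.md split/
```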
Remove junk before upload
Before indexing, delete:
- repeated headers and footers
- page numbers
- duplicated tables of contents
- irrelevant screenshots with tiny text
- outdated drafts
The cleaner the input, the more useful the retrieved context.
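Part of that cleanup can be automated for text produced by `pdftotext`. The filter below is a minimal sketch. `strip_page_noise` is a hypothetical name, and the pattern only covers bare page numbers and lines such as "Page 3 of 40". It also removes any legitimate line that holds nothing but a number, so compare the result with the original before uploading.

```bash
# Reads text on stdin and writes a cleaned copy to stdout.
strip_page_noise() {
  # Drop form feeds (pdftotext emits one at each page break), then delete
  # lines that contain only "12", "Page 12", or "Page 12 of 300".
  tr -d '\f' |
    sed -E '/^[[:space:]]*([Pp]age[[:space:]]+)?[0-9]+([[:space:]]+of[[:space:]]+[0-9]+)?[[:space:]]*$/d'
}

# Usage:
#   strip_page_noise < manual.txt > manual-clean.txt
```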
Step 4: Upload documents and test with a simple question
After creating the collection, upload your files and ask a question with a very obvious answer.
Good first questions look like this:
- “What is the backup retention policy in this document?”
- “Which port does the service use?”
- “List the shutdown steps in order.”
- “What is the default username mentioned in the manual?”
You are testing retrieval, not the model’s creativity. Start with questions where the answer should be found in one or two passages.
If the answer is wrong, ask yourself:
1. Was the fact actually in the document?
2. Did the document upload finish successfully?
3. Is the embedding model working?
4. Is the collection too broad?
Step 5: Ask better RAG questions
The quality of your question matters almost as much as the quality of your files.
A weak prompt:
> Tell me about backups.
A better prompt:
> In the uploaded backup runbook, what is the nightly backup schedule, where are the backups stored, and what is the restore verification step?
The second prompt gives the retriever a much clearer target. It also reduces the chance that the model answers from general knowledge instead of your documents.
A useful pattern is:
```text
Answer only from the uploaded documents.
If the documents do not contain the answer, say so clearly.
Quote the relevant section before summarizing it.
```
That framing nudges the model to stay grounded.
Step 6: Use a sensible Docker Compose baseline
If you want a repeatable setup, keep Open WebUI and Ollama on the same Docker network with persistent volumes.
```yaml
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    restart: unless-stopped
    ports:
      - "11434:11434"
    volumes:
      - ollama:/root/.ollama
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: unless-stopped
    depends_on:
      - ollama
    ports:
      - "3000:8080"
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
      - WEBUI_SECRET_KEY=change-me
    volumes:
      - open-webui:/app/backend/data
volumes:
  ollama:
  open-webui:
```
A few notes:
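- Keep the data volumes persistent, or your documents and chat history will disappear when the containers are recreated, for example after an image update.
- Put the containers on a private LAN or behind a reverse proxy if you expose the service. The `11434:11434` mapping publishes Ollama's unauthenticated API on every host interface, so bind it to `127.0.0.1` or drop the mapping if only Open WebUI needs to reach Ollama.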
- Set a strong secret key and apply your normal homelab security posture.
If you already have a working stack, you probably only need to confirm the `OLLAMA_BASE_URL` value and make sure the data volume is durable.
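The `change-me` placeholder should be replaced with a random value. One way to generate it, sketched with standard tools only (`gen_secret_key` is a hypothetical helper name):

```bash
# Print 32 random bytes from /dev/urandom as 64 lowercase hex characters.
gen_secret_key() {
  head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n'
  echo
}

# Usage: store the key in a .env file next to the Compose file.
#   echo "WEBUI_SECRET_KEY=$(gen_secret_key)" >> .env
```

Docker Compose reads `.env` from the project directory for variable substitution, so the environment entry can then be written as `WEBUI_SECRET_KEY=${WEBUI_SECRET_KEY}`, which keeps the secret out of the Compose file itself.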
Step 7: Keep your knowledge base accurate over time
RAG gets stale if nobody maintains it. Treat your documents like code.
A good maintenance routine looks like this:
- delete or archive outdated PDFs
- version your runbooks
- re-upload only the documents that changed
- use one collection per topic or team when possible
- test the most important questions after every update
For example, if your backup policy changes, update the `backups.md` file and re-index that document before you trust the answers again.
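To find out which files changed since the last upload, you can keep a checksum manifest. The sketch below uses POSIX `cksum`. `changed_docs` and `accept_changes` are hypothetical helper names, and the manifest should live outside the documents directory so it is not counted as a document itself.

```bash
# Print files under DIR that are new or modified compared with MANIFEST.
# The fresh snapshot is left in MANIFEST.new until you accept it.
changed_docs() {
  local dir="$1" manifest="$2"
  [ -f "$manifest" ] || : > "$manifest"
  # cksum prints "checksum size path" for each file.
  find "$dir" -type f -exec cksum {} + | sort > "$manifest.new"
  # comm -13 keeps lines that appear only in the new snapshot.
  sort "$manifest" | comm -13 - "$manifest.new" | cut -d' ' -f3-
}

# Promote the snapshot once the listed files have been re-uploaded.
accept_changes() {
  mv "$1.new" "$1"
}

# Usage:
#   changed_docs ./docs ./docs.manifest   # lists files to re-upload
#   accept_changes ./docs.manifest
```

Deleted files do not show up in this list, so removing outdated documents from the collection remains a manual step.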
Troubleshooting Open WebUI RAG
The embedding model does not appear
Pull it manually in Ollama first:
```bash
ollama pull nomic-embed-text
```
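Then restart Open WebUI, confirm that the embedding engine in the admin document settings points at Ollama rather than the built-in default, and try the upload again. Setting names vary by version. If you switch embedding models later, re-index the existing documents, because vectors produced by different models are not comparable.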
Answers ignore the documents
Usually this means one of three things:
- the collection is empty
- retrieval returned poor chunks
- the prompt is too broad
Try a more specific question and a smaller document set.
PDFs produce bad answers
That usually happens when the PDF is a scan, has tiny text, or contains too much noise. Run OCR, convert to text, or split the file before uploading.
The answer is partially correct but missing details
Ask for the exact section, not just the summary:
> Show me the paragraph that mentions the restore location, then explain it in one sentence.
That often improves retrieval and reduces hallucination.
My collection is growing too large
Create separate knowledge bases for separate purposes. A single bucket for every document is convenient at first, but it becomes harder to retrieve the right source later.
A simple test plan you can reuse
Whenever you add a new document collection, run this quick check:
1. Ask one question with a known answer.
2. Ask one question that should not be in the docs.
3. Ask one multi-part question that requires two facts from different sections.
4. Compare the model’s answer with the source file.
If the first answer is wrong, fix ingestion. If the second answer is confidently invented, tighten your prompt. If the third answer misses one fact, shorten the documents or split them more aggressively.
FAQ
Is Open WebUI RAG fully private?
It can be fully private if your Open WebUI instance, Ollama server, and document storage stay inside your own infrastructure. Avoid public exposure unless you know exactly how your authentication and reverse proxy are configured.
Which embedding model should I use first?
Start with a lightweight local model such as `nomic-embed-text`. It is easy to pull, easy to test, and good enough for many homelab and small-team document workflows.
Can I use PDFs, DOCX, and markdown together?
Yes, but you will usually get the best results from clean markdown or text. PDFs work too, especially if they are text-based rather than scanned images.
How big should my knowledge base be?
Start small. A focused collection of a few documents is easier to test and easier to trust than a giant mixed library.
Do I need a vector database?
Open WebUI handles the document retrieval workflow for you, including a bundled vector store (ChromaDB by default at the time of writing), so most home lab users do not need to manage a separate vector database just to get started.
Final thoughts
Open WebUI becomes much more useful once you connect it to a carefully prepared set of local documents. The winning formula is simple: use a working Ollama embedding model, keep your source files clean, build small collections, and test with questions you can verify manually.
Once that foundation is solid, you can expand into backups, runbooks, internal wikis, and project docs without handing your knowledge base to a cloud platform.


