How to Set Up a Local AI Chat Server with Open WebUI and Ollama

Build a private ChatGPT alternative on your own hardware with Open WebUI and Ollama, including Docker deployment, user accounts, and team access.

Robson Pereira · May 31, 2026 · 9 min read

[Image: Open WebUI running in a browser, showing a local AI chat interface with document upload enabled.]

A local AI chat server powered by Open WebUI and Ollama is the closest you can get to a private ChatGPT that runs entirely on your own hardware. This guide walks through the full setup: Docker deployment, model selection, user accounts, and security hardening so you can share the server with a team or family.

For the fundamentals of running models locally, read "How to Run Llama 3 Locally with Ollama".

What you need

A machine running Linux (or Windows with WSL2) with at least 16 GB of RAM
A GPU with 8 GB or more of VRAM for acceptable interactive speed
Docker and Docker Compose installed
50 GB of free disk space for models and data

For hardware sizing, see "Best Hardware for Self-Hosted AI".

Docker Compose setup

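Create a project directory and add a docker-compose.yml that defines two services, ollama and open-webui. Services defined in the same Compose file join the project's default network automatically, so Open WebUI can reach Ollama by service name (http://ollama:11434 on Ollama's default port).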

Use named volumes for model storage and application data so upgrades do not delete your state. Set restart policies so the services survive host reboots.
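As a rough sketch of what such a file could contain, the small Python script below writes a docker-compose.yml with both services, named volumes, and restart policies. Several details are assumptions rather than part of this guide: the image tags, the volume names, the 3000:8080 port mapping, and the NVIDIA GPU reservation (which needs the NVIDIA Container Toolkit on the host). Treat it as a starting point and compare it with each project's current Docker instructions before relying on it.

```python
from pathlib import Path

# Sketch of a docker-compose.yml for the two services.
# Assumptions (not from the article): image tags, volume names,
# the NVIDIA GPU reservation, and the 3000:8080 port mapping.
COMPOSE_YAML = """\
services:
  ollama:
    image: ollama/ollama
    restart: unless-stopped
    volumes:
      - ollama-models:/root/.ollama      # pulled models live here
    deploy:                              # NVIDIA only; remove on CPU-only hosts
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]

  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    restart: unless-stopped
    depends_on:
      - ollama
    ports:
      - "3000:8080"                      # host 3000 -> container 8080
    environment:
      # Both services join the project's default network, so the
      # service name "ollama" resolves as a hostname.
      - OLLAMA_BASE_URL=http://ollama:11434
    volumes:
      - open-webui-data:/app/backend/data

volumes:
  ollama-models:
  open-webui-data:
"""


def write_compose(directory: str = ".") -> Path:
    """Write the sketch above to <directory>/docker-compose.yml."""
    target = Path(directory) / "docker-compose.yml"
    target.write_text(COMPOSE_YAML, encoding="utf-8")
    return target


if __name__ == "__main__":
    print(f"wrote {write_compose()}")
```

The YAML inside the string is the part that matters; the Python wrapper only makes it easy to regenerate the same file on another host. Ollama's port is deliberately not published to the host in this sketch, so only Open WebUI can reach it.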

First launch

Start the stack with `docker compose up -d`. Open WebUI will be available at http://localhost:3000, assuming host port 3000 is mapped to the Open WebUI container. On first visit, you are prompted to create an account; the first account created becomes the admin.

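After signing in, open the admin settings and check the Ollama connection. Open WebUI does not scan the network for Ollama; it uses the configured connection URL, which can be preset with the OLLAMA_BASE_URL environment variable. When both containers share a Docker network, that URL is http://ollama:11434.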

Pull and select models

From the Open WebUI admin panel, you can pull models through Ollama. Start with a model that fits your hardware: a 7B or 8B parameter model is a good starting point for most systems.

Browse models, check their sizes, and pull the ones you want. Open WebUI lets users switch between available models in the chat interface.
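If you prefer to check installed models and their sizes from a script, Ollama's HTTP API lists them at /api/tags. The sketch below assumes the API is reachable at http://localhost:11434 (Ollama's default port), which is only true if that port is published from the container or the script runs somewhere that can reach it. The function names are made up for this example.

```python
import json
import urllib.request

# Assumption: Ollama's default port, reachable from where this script runs.
OLLAMA_URL = "http://localhost:11434"


def fetch_models(base_url: str = OLLAMA_URL) -> dict:
    """Return the JSON body of GET /api/tags (models already pulled)."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


def summarize_models(payload: dict) -> list[str]:
    """Format each model as 'name  size GiB', largest first."""
    models = sorted(payload.get("models", []), key=lambda m: m["size"], reverse=True)
    return [f"{m['name']:<30} {m['size'] / 2**30:6.1f} GiB" for m in models]


if __name__ == "__main__":
    try:
        for line in summarize_models(fetch_models()):
            print(line)
    except OSError as exc:  # connection refused, timeout, DNS failure
        print(f"could not reach Ollama at {OLLAMA_URL}: {exc}")
```

Sorting by size makes it easy to spot which models are eating the disk space set aside for them.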

Enable document RAG

Open WebUI's document upload feature relies on an embedding model. Set one in the admin settings under Documents. The embedding model converts uploaded files into vector representations; when you ask a question, the passages whose vectors are closest to it are retrieved and passed to the chat model as context.
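To make the idea of searching vectors concrete, here is a toy illustration of similarity-based retrieval. It is not Open WebUI's implementation: the three-dimensional vectors are invented by hand, whereas a real embedding model produces vectors with hundreds of dimensions, and the helper names are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_chunks(query_vec, chunks, k=2):
    """Return the k chunk texts whose vectors are closest to the query."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Made-up 3-dimensional "embeddings" for three document passages.
chunks = [
    ("Backups run nightly at 02:00.", [0.9, 0.1, 0.0]),
    ("The office is closed on public holidays.", [0.0, 0.2, 0.9]),
    ("Restore a backup with the restore script.", [0.8, 0.3, 0.1]),
]
# Stands in for the embedding of a question such as "How do backups work?"
query = [1.0, 0.2, 0.0]

if __name__ == "__main__":
    # The two backup-related passages rank above the unrelated one.
    print(top_chunks(query, chunks))
```

Only the retrieved passages are placed in the prompt, which is why the choice of embedding model affects answer quality even though it never generates text itself.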

For advanced RAG configuration, see "Build a Local RAG Pipeline That Actually Answers Questions".

User management

Enable registration in the admin settings to let team members create accounts. Open WebUI supports role-based permissions: admins control model access and system settings, while regular users chat and upload documents.

Set up user limits if your hardware cannot support many concurrent sessions. Monitor GPU memory usage to catch resource contention early.
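One way to watch GPU memory, assuming an NVIDIA card with nvidia-smi on the PATH, is to poll its CSV query output and flag GPUs that are nearly full. The 90% threshold and the function names below are illustrative choices, not recommendations from this guide.

```python
import subprocess

# nvidia-smi prints one "used, total" line per GPU, in MiB, with these flags.
QUERY = [
    "nvidia-smi",
    "--query-gpu=memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_memory(csv_text: str) -> list[tuple[int, int]]:
    """Parse 'used, total' MiB pairs, one line per GPU."""
    pairs = []
    for line in csv_text.strip().splitlines():
        used, total = (int(field) for field in line.split(","))
        pairs.append((used, total))
    return pairs


def over_threshold(pairs, threshold=0.9):
    """Return indexes of GPUs whose memory use exceeds the threshold fraction."""
    return [i for i, (used, total) in enumerate(pairs) if used / total > threshold]


if __name__ == "__main__":
    try:
        out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    except (OSError, subprocess.CalledProcessError) as exc:
        print(f"nvidia-smi unavailable: {exc}")
    else:
        pairs = parse_memory(out)
        for i, (used, total) in enumerate(pairs):
            print(f"GPU {i}: {used} / {total} MiB")
        for i in over_threshold(pairs):
            print(f"GPU {i} is above 90% memory use")
```

Run it from cron or a systemd timer and send the output wherever you already collect alerts.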

Security hardening

Before exposing the server beyond your local network:

Set a strong, unique password for the admin account
Enable HTTPS through a reverse proxy
Set up regular backups of the application database
Keep Ollama and Open WebUI updated
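For the backup item, a minimal sketch using Python's built-in SQLite online backup API is shown below. It assumes Open WebUI is using its default SQLite database, a file named webui.db inside its data directory; the host paths in the example are hypothetical and depend on how your volume is mounted. It copies only the database, not uploaded files or pulled models.

```python
import sqlite3
from datetime import datetime
from pathlib import Path


def backup_sqlite(src: Path, dest_dir: Path) -> Path:
    """Copy a SQLite database to a timestamped file using the online backup API."""
    if not src.is_file():
        # Avoid sqlite3.connect() silently creating an empty database.
        raise FileNotFoundError(src)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / f"{src.stem}-{datetime.now():%Y%m%d-%H%M%S}{src.suffix}"
    source = sqlite3.connect(str(src))
    target = sqlite3.connect(str(dest))
    try:
        source.backup(target)  # consistent snapshot even if the app is writing
    finally:
        target.close()
        source.close()
    return dest


if __name__ == "__main__":
    # Hypothetical host paths; point these at your own data volume and backup disk.
    db = Path("/srv/open-webui/data/webui.db")
    if db.is_file():
        print(backup_sqlite(db, Path("/srv/backups/open-webui")))
    else:
        print(f"{db} not found; set the path to your webui.db")
```

The backup API is used instead of a plain file copy because copying a database file while it is being written can produce a corrupt snapshot.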

For TLS and authentication, read "Caddy Reverse Proxy for Self-Hosted AI with Automatic TLS".

Conclusion

An Open WebUI and Ollama stack is the most approachable way to run a private AI chat server for multiple users. The setup takes under an hour, and the result is a usable chat interface backed by models that run entirely on your hardware.

FAQ

Can I access the server from my phone?

Yes. Open WebUI has a responsive web interface that works on mobile browsers.

How many users can use it at once?

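This depends mostly on your GPU memory. The model weights are loaded once, and each concurrent session needs additional memory for its own context. A 7B model at 4-bit quantisation leaves room for 2–3 concurrent users with good response times.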
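A back-of-the-envelope estimate shows where a figure like that can come from. The numbers below are assumptions for illustration only: an 8 GiB GPU, a Llama-2-7B-like shape (32 layers, 32 key/value heads, head dimension 128), a 16-bit KV cache, a 4096-token context per session, and 0.5 GiB of runtime overhead. Models with grouped-query attention, shorter contexts, or a quantised cache need less per session, and the runtime's own concurrency settings also cap parallel requests.

```python
GIB = 2**30


def weights_bytes(params: float, bits: int) -> float:
    """Memory for the model weights at a given quantisation width."""
    return params * bits / 8


def kv_cache_bytes(layers, kv_heads, head_dim, context_tokens, bytes_per_value=2):
    """Memory for one session's KV cache: keys and values for every layer and token."""
    return 2 * layers * kv_heads * head_dim * bytes_per_value * context_tokens


def max_sessions(vram_gib, params, bits, per_session_bytes, overhead_gib=0.5):
    """How many full-context sessions fit after weights and fixed overhead."""
    free = vram_gib * GIB - weights_bytes(params, bits) - overhead_gib * GIB
    return max(0, int(free // per_session_bytes))


if __name__ == "__main__":
    # Assumed Llama-2-7B-like shape with a 4096-token context per session.
    per_session = kv_cache_bytes(layers=32, kv_heads=32, head_dim=128, context_tokens=4096)
    print(f"weights:     {weights_bytes(7e9, 4) / GIB:.2f} GiB")
    print(f"per session: {per_session / GIB:.2f} GiB")
    print(f"sessions on 8 GiB: {max_sessions(8, 7e9, 4, per_session)}")
```

Under these assumptions the weights take about 3.26 GiB and each full-context session takes 2 GiB, so two sessions fit on an 8 GiB card. Halving the context to 2048 tokens doubles the headroom.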

Do I need internet access to use it?

After the initial model download, everything runs locally. No internet connection is required for inference.
