Tutorials

text-generation-webui Setup: Install oobabooga for Local LLMs

A complete guide to installing and configuring text-generation-webui (oobabooga) with model loading, extensions, and API server for local AI.

Robson PereiraMay 31, 202610 min read

oobabooga text-generation-webui interface showing chat mode with active model.

text-generation-webui Setup: Install oobabooga for Local LLMs

text-generation-webui, commonly called oobabooga, is one of the most versatile local LLM interfaces available. It supports multiple model backends, a rich extension system, training and fine-tuning capabilities, and an OpenAI-compatible API server — all in one open-source package.

If you are new to local inference engines, read Ollama vs vLLM vs llama.cpp: Choosing a Local Inference Engine to understand the backend options.

One-click installer

The recommended installation method is the one-click installer, which handles Python environment setup, dependency installation, and CUDA toolkit detection:

```bash

git clone https://github.com/oobabooga/text-generation-webui

cd text-generation-webui

./start_linux.sh --listen

```

On Windows, use the included start_windows.bat. The installer downloads Miniconda if needed and sets up an isolated environment.

Model loading and backends

oobabooga supports multiple model loaders:

**Transformers** for full-precision Hugging Face models
**llama.cpp** for GGUF quantised models
**ExLlamaV2** for exl2 quantised models
**AutoGPTQ** for GPTQ quantised models
**AWQ** for AWQ quantised models

This flexibility means you can load almost any openly available model without conversion. The web UI lets you switch loaders, adjust GPU layers, and configure quantisation parameters before loading.

Chat modes and parameters

The interface offers chat mode, instruct mode, notebook mode, and a default chat format. Each mode adjusts how the model receives prompts. Chat mode is best for general conversation; instruct mode works well for task-oriented prompts.

Adjust temperature, top-p, top-k, repetition penalty, and context length in the Parameters tab. Save presets for different tasks — creative writing, code generation, summarisation — and switch between them without reloading the model.

Extensions and plugins

oobabooga's extension system adds significant functionality:

**Gallery** for image generation and viewing
**Character** for persona-based roleplay
**Multimodal** for vision-capable models
**Silero TTS** for text-to-speech
**OpenAI API** extension to expose an OpenAI-compatible endpoint
**Superboogav2** for enhanced UI features

Browse the extensions tab and enable the ones relevant to your workflow. Each extension adds interface elements and API endpoints.

Training and fine-tuning

oobabooga includes LoRA and QLoRA training capabilities. You can fine-tune models on custom datasets directly from the web UI. This is useful for adapting a base model to specialised domains, writing styles, or organisational knowledge.

For RAG-based approaches without fine-tuning, see Build a Local RAG Pipeline That Actually Answers Questions.

API server mode

Start oobabooga with the --api flag to enable the OpenAI-compatible API. This lets you connect Open WebUI, AnythingLLM, or custom applications to the same model instance running in the web UI.

```bash

python server.py --api --listen

```

The API mode supports chat completions and completions endpoints, including streaming responses.

Conclusion

text-generation-webui is the Swiss Army knife of local LLM interfaces. The multiple backend support, extension system, and training capabilities make it a strong choice for users who want more than a chat interface. The initial setup takes longer than simpler runners, but the flexibility is worth it for advanced workflows.

FAQ

What hardware does oobabooga need?

It depends on the model size. A 7B model needs about 6GB VRAM at 4-bit. Larger models need proportionally more.

Can I use oobabooga without a GPU?

Yes, with the Transformers or llama.cpp loader in CPU mode. Expect slow generation speeds.

Is oobabooga secure for public exposure?

Not out of the box. Add a reverse proxy with authentication, TLS, and rate limiting before opening it to the internet.

text-generation-webui Setup: Install oobabooga for Local LLMs

text-generation-webui Setup: Install oobabooga for Local LLMs

One-click installer

Model loading and backends

Chat modes and parameters

Extensions and plugins

Training and fine-tuning

API server mode

Conclusion

FAQ

What hardware does oobabooga need?

Can I use oobabooga without a GPU?

Is oobabooga secure for public exposure?

Related articles

How to Add Local Documents to Open WebUI with RAG and Ollama

How to Deploy Open WebUI and Ollama on a Private LAN with Docker Compose

How to Build a Self-Hosted AI Workstation with Docker and Multiple Model Runners