News

[TechCrunch] GitHub Copilot's Token Billing Backlash: What It Means for Self-Hosted AI

GitHub is switching Copilot from flat-rate to token-based billing on June 1, sparking developer fury — and making self-hosted coding assistants more compelling than ever.

Robson PereiraMay 31, 20264 min read
GitHub Copilot logo with token billing and cost calculation visual metaphor.

[TechCrunch] GitHub Copilot's Token Billing Backlash: What It Means for Self-Hosted AI

GitHub is ending the flat-rate era for Copilot. Starting **June 1, 2026**, the company will switch Copilot from a simple per-seat subscription to a **token-based consumption model** — and developers are not happy about it.

What's changing

Under the new system, users will be charged based on how many tokens they consume while using Copilot, rather than a flat monthly rate. For smaller companies and individual developers who rely heavily on AI-assisted coding, this could mean significantly higher costs. The change affects all Copilot tiers and has been met with widespread criticism across social media and developer forums.

One user on social media quoted by TechCrunch summed up the sentiment: **"What a joke."**

TechCrunch reports that "the golden age of Microsoft's GitHub Copilot appears to be at an end — for the little guy, at least." While larger enterprises may absorb the change without much trouble, smaller teams and independent developers could find themselves priced out.

Why this matters for self-hosted AI

This billing shift creates a powerful incentive to explore alternatives — particularly **self-hosted AI coding assistants** that run on your own hardware with no per-token costs.

The self-hosted ecosystem now offers several mature Copilot alternatives:

  • **OpenCode** — an open-source AI coding agent that pairs with local LLMs via Ollama, llama.cpp, or OpenAI-compatible backends. No tokens, no metering, no subscription.
  • **Continue.dev** — an open-source autocomplete and chat plugin for VS Code and JetBrains that connects to local models.
  • **Hermes Agent** — an open-source terminal agent by Nous Research that works with any LLM provider and can run entirely locally.
  • **Claude Code / Codex CLI** — while these have their own pricing, they offer per-query models rather than per-token billing.

For teams already running local LLMs through Ollama or Open WebUI, the incremental cost of adding AI code completion is effectively zero.

The bigger trend

Copilot's pricing shift is part of a broader industry pattern. As AI API providers mature, many are moving toward granular token-based pricing that can substantially increase costs for power users. We've seen similar shifts from OpenAI, Anthropic, and now Microsoft.

For anyone building a practice or business around AI-assisted development, this trend reinforces the case for **local-first AI**. Running models on your own hardware eliminates API costs, keeps your code private, and insulates you from pricing changes. Our guide on Local AI for Software Developers walks through the full setup.

What the self-hosted community is saying

On r/LocalLLaMA and r/selfhosted, developers have pointed out that models like DeepSeek Coder, Qwen2.5-Coder, and CodeGemma now offer competitive code generation quality — especially when quantised — at a fraction of the operating cost of any token-billed service.

As one commenter put it: "If I'm going to pay per token anyway, I'd rather pay for electricity and run it myself."

**Source:** TechCrunch — 'What a joke': GitHub Copilot's new token-based billing spurs consternation among devs

Related articles