
Self-Hosted AI Agent Security: Deploy Microsoft's Agent Governance Toolkit

Microsoft's Agent Governance Toolkit brings zero-trust policy enforcement, execution sandboxing, and audit trails to autonomous AI agents. Here is how to deploy it on your own infrastructure.

Robson Pereira · May 30, 2026 · 12 min read

Architecture diagram of Microsoft Agent Governance Toolkit for self-hosted AI agent security.


Your AI agents call tools, browse the web, query databases, and delegate to other agents. Once deployed, they make decisions autonomously — and you need answers to three questions: *Is this action allowed? Which agent did this? Can you prove what happened?*

Microsoft's **Agent Governance Toolkit**, now in public preview at 3,400+ GitHub stars and trending this week, answers all three. It is an open-source (MIT), framework-agnostic governance layer that adds policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering to autonomous AI agents — and yes, you can run it on your own infrastructure.

For the self-hosted AI community, this is the missing security layer that makes production agent deployments practical without surrendering control to a cloud provider.

What Agent Governance solves

The toolkit covers the full OWASP Agentic Top 10 — all ten categories, which is a first for any open-source project. The three core problems it addresses map directly to what breaks in self-hosted multi-agent systems:

1. Policy enforcement

An agent with access to `send_email` and `query_database` should not be able to `drop_table`. Traditional IAM roles control *which services* an agent can reach, not *what it does* once connected. Agent Governance introduces **scoped policy definitions** that constrain tool calls, API usage, and data access at runtime — before the action reaches the underlying service.
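To make the idea of scoping concrete, here is a minimal sketch in plain Python that does not use the toolkit's API. The `ROLE_TOOL_SCOPES` table, the `dba` role, and the `is_tool_allowed` function are hypothetical names chosen for illustration. The toolkit expresses such rules as policies rather than a hard-coded dictionary.

```python
# Illustrative only: a deny-by-default tool scope check, not the toolkit's API.
# Role names and the scope table are assumptions made for this sketch.
ROLE_TOOL_SCOPES = {
    "support": {"send_email", "query_database"},
    "dba": {"query_database", "drop_table"},
}


def is_tool_allowed(role: str, tool: str) -> bool:
    """Return True only if the tool is explicitly listed for the role."""
    return tool in ROLE_TOOL_SCOPES.get(role, set())


if __name__ == "__main__":
    print(is_tool_allowed("support", "send_email"))
    print(is_tool_allowed("support", "drop_table"))
```

The key design choice is deny by default: an unknown role or an unlisted tool is rejected without needing an explicit deny rule.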

2. Agent identity and attribution

In a multi-agent system, five agents might share a single API key. When something goes wrong, "an agent did it" is not an incident response. The toolkit assigns **cryptographic identities** to agents and binds every action to an auditable principal across distributed traces.
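The attribution idea can be sketched with the standard library alone. The example below is a simplification and not the toolkit's identity layer: it uses a shared-secret HMAC per agent, where a production setup would more likely use asymmetric keys or signed JWTs. The `AGENT_KEYS` table, the `billing-agent-v1` identifier, and both function names are assumptions.

```python
import hashlib
import hmac
import json

# Hypothetical per-agent secrets for the sketch; a real deployment would
# load keys from a key store rather than hard-coding them.
AGENT_KEYS = {
    "support-agent-v1": b"demo-key-1",
    "billing-agent-v1": b"demo-key-2",
}


def sign_action(agent_id: str, action: dict) -> str:
    """Return a hex HMAC-SHA256 tag that binds the action to agent_id."""
    payload = json.dumps({"agent_id": agent_id, "action": action}, sort_keys=True)
    return hmac.new(AGENT_KEYS[agent_id], payload.encode(), hashlib.sha256).hexdigest()


def verify_action(agent_id: str, action: dict, tag: str) -> bool:
    """Check that tag was produced with agent_id's key for this exact action."""
    if agent_id not in AGENT_KEYS:
        return False
    return hmac.compare_digest(sign_action(agent_id, action), tag)
```

Because each agent signs with its own key, a tag produced by one agent does not verify under another agent's identity, which is what makes "which agent did this?" answerable.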

3. Tamper-proof audit trails

Regulators, auditors, and compliance teams need proof. Agent Governance writes every decision (allowed and denied) to an immutable audit log with cryptographic binding, so you can reconstruct exactly what each agent did, when, and why.
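One common way to get this kind of cryptographic binding is a hash chain, where each entry includes the hash of the previous one. The sketch below shows that general technique with the standard library. It is not the toolkit's log format, the field names are assumptions, and timestamps are left out to keep the example deterministic. Strictly speaking, a hash chain makes tampering detectable rather than impossible.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """Hash a log entry body in a stable, key-sorted JSON encoding."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()


def append_entry(log: list, agent_id: str, tool: str, allowed: bool) -> dict:
    """Append a decision to the log, chaining it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS_HASH
    body = {"agent_id": agent_id, "tool": tool, "allowed": allowed, "prev_hash": prev_hash}
    entry = {**body, "hash": _digest(body)}
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash and link; any edited entry breaks the chain."""
    prev_hash = GENESIS_HASH
    for entry in log:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if entry["prev_hash"] != prev_hash or entry["hash"] != _digest(body):
            return False
        prev_hash = entry["hash"]
    return True
```

Changing an old record, for example flipping a denied decision to allowed, makes `verify_chain` return `False` unless every later hash is also recomputed.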

Architecture overview

The toolkit is designed around three layers that work together regardless of your agent framework (LangChain, CrewAI, AutoGen, Semantic Kernel, or raw tool-calling code):

```
┌─────────────────────────────────────────────────┐
│                   Your Agent                    │
├─────────────────────────────────────────────────┤
│              Agent Governance SDK               │
│         (Python, TypeScript, .NET, Go)          │
├─────────────────────────────────────────────────┤
│ Policy Engine  │    Identity    │     Audit     │
│   (OPA/Rego)   │     (JWTs)     │  (Immutable)  │
├─────────────────────────────────────────────────┤
│   Execution Sandbox    │    Circuit Breakers    │
│    (Restricted env)    │  (Rate limit, retry)   │
└─────────────────────────────────────────────────┘
```

Every tool call by your agent passes through the **Policy Engine** before execution. The identity layer attaches a signed JWT to every request. The **Execution Sandbox** constrains what the agent can do at the OS level (namespace isolation, filesystem restrictions, network rules). **Circuit breakers** prevent runaway agents from overwhelming upstream services.
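Of these components, the circuit breaker is the easiest to illustrate in isolation. The class below is a minimal sketch of the general pattern, not the toolkit's implementation. The class name, parameters, and thresholds are assumptions, and the clock is injectable so the behaviour can be tested without sleeping.

```python
import time


class CircuitBreaker:
    """Open after max_failures consecutive failures; allow a retry after reset_seconds."""

    def __init__(self, max_failures: int = 3, reset_seconds: float = 30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_seconds = reset_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # time the circuit opened, or None while closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_seconds:
                raise RuntimeError("circuit open: call rejected")
            # Cool-down elapsed: permit one trial call (half-open state).
            # A single further failure re-opens the circuit immediately.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        self.failures = 0  # any success closes the circuit fully
        return result
```

While the circuit is open, calls fail fast without touching the upstream service, which is what keeps a looping agent from hammering a dependency that is already struggling.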

Quick start on your own infrastructure

The toolkit ships as a `pip install` package, so it works in any self-hosted environment — bare metal, Docker, Kubernetes, or a homelab server.

Step 1: Install

```bash
pip install agent-governance-toolkit
```

Step 2: Define a policy

Policies are written in Rego, the policy language of the Open Policy Agent (OPA) project. The policy below blocks a customer-support agent from writing to the database and allows its read-only queries while the agent stays under 100 requests:

```rego
package agent_policy

# Enables the `if`, `contains`, and `in` keywords on OPA versions before 1.0
import rego.v1

# Deny database write operations for support agents
deny contains msg if {
    input.agent_role == "support"
    input.tool == "query_database"
    "write" in input.tool_args.operations
    msg := "Support agents cannot write to the database"
}

# Allow read-only queries with rate limiting
allow contains msg if {
    input.agent_role == "support"
    input.tool == "query_database"
    "read" in input.tool_args.operations
    input.request_count < 100
    msg := "Allowed (read-only, under rate limit)"
}
```
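To reason about which inputs trip which rule, it can help to mirror the two rules in plain Python and try a few cases. The function below is only a mental model of the policy above. It does not run OPA, and the name `evaluate` is an assumption. The input shape follows the fields the policy reads (`agent_role`, `tool`, `tool_args.operations`, `request_count`).

```python
def evaluate(inp: dict) -> dict:
    """Mirror the deny/allow rules of the Rego policy and return both message sets."""
    deny, allow = set(), set()
    is_support_query = (
        inp.get("agent_role") == "support" and inp.get("tool") == "query_database"
    )
    operations = inp.get("tool_args", {}).get("operations", [])

    if is_support_query and "write" in operations:
        deny.add("Support agents cannot write to the database")

    # As in Rego, a missing request_count means the allow rule does not fire.
    under_limit = "request_count" in inp and inp["request_count"] < 100
    if is_support_query and "read" in operations and under_limit:
        allow.add("Allowed (read-only, under rate limit)")

    return {"deny": deny, "allow": allow}


def is_permitted(inp: dict) -> bool:
    """Combine the sets: any deny wins, and an empty allow set means no."""
    result = evaluate(inp)
    return bool(result["allow"]) and not result["deny"]
```

Two cases are worth noticing. A request with both `read` and `write` operations matches both rules, so the caller has to let deny override allow. A request at or over the rate limit matches neither rule, so the caller also needs a default-deny fallback.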

Step 3: Wrap your agent calls

```python
from agent_governance import GovernanceClient

client = GovernanceClient(
    policy_file="policies/support.rego",
    agent_id="support-agent-v1",
    identity_provider="local",
)

tool_args = {"query": "SELECT name FROM users", "operations": ["read"]}

# Before every tool call, check the policy
decision = client.check(
    tool="query_database",
    args=tool_args,
    context={"agent_role": "support", "request_count": 42},
)

if decision.allowed:
    # your_tool_function is a placeholder for the real tool implementation
    result = your_tool_function(tool_args)
    client.audit_log(action="executed", decision=decision)
else:
    client.audit_log(action="denied", decision=decision)
    raise PermissionError(decision.reason)
```

Step 4: Add sandboxing for untrusted agents

For agents that run user-provided code or browse the web, add execution sandboxing:

```python
from agent_governance.sandbox import Sandbox

sandbox = Sandbox(
    read_paths=["/data/input"],
    write_paths=["/data/output"],
    network_rules={"allow": ["api.internal:443"], "deny": ["*"]},
    max_cpu_seconds=30,
    max_memory_mb=512,
)

# `agent` and `task` come from your own agent setup
with sandbox:
    result = agent.run(task)  # constrained inside the sandbox
```
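For intuition about what limits like `max_cpu_seconds` and `max_memory_mb` mean at the OS level, here is a rough standard-library sketch that applies comparable caps to a child process with `resource.setrlimit`. It is Unix-only, it is not how the toolkit implements its sandbox, and it covers only CPU time and address space, not filesystem or network isolation. The function name `run_limited` is an assumption.

```python
import resource
import subprocess
import sys


def run_limited(code: str, max_cpu_seconds: int = 30, max_memory_mb: int = 512):
    """Run a Python snippet in a child process with CPU and memory caps (Unix only)."""

    def apply_limits():
        # Runs in the child after fork and before exec, so only the child is limited.
        for kind, value in (
            (resource.RLIMIT_CPU, max_cpu_seconds),
            (resource.RLIMIT_AS, max_memory_mb * 1024 * 1024),
        ):
            _, hard = resource.getrlimit(kind)
            if hard != resource.RLIM_INFINITY:
                value = min(value, hard)  # never ask for more than is permitted
            resource.setrlimit(kind, (value, value))

    return subprocess.run(
        [sys.executable, "-c", code],
        preexec_fn=apply_limits,
        capture_output=True,
        text=True,
        timeout=max_cpu_seconds + 5,  # wall-clock backstop on top of the CPU cap
    )
```

A snippet that stays within the caps runs normally. One that tries to allocate far more memory than allowed fails inside the child with a `MemoryError` and a non-zero exit code, leaving the parent process unaffected.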

Integrating with self-hosted agent stacks

The toolkit works with every major agent framework used in self-hosted setups:

LangChain / LangGraph

Add governance as a callable that wraps your tool executor. The toolkit ships with a `LangChainGovernanceMiddleware` that intercepts `ToolMessage` routing.

CrewAI

Define a `GovernanceTask` subclass that checks policies before executing each task step. The audit log doubles as CrewAI's callback for observability.

AutoGen

Use the `AgentGovernancePlugin` that registers as an AutoGen tool pre-hook — every tool invocation goes through policy check before reaching the registered function.

Custom agent (Hermes, Claude Code, Codex)

For raw tool-calling agents, wrap the tool dispatch loop. The pattern is the same regardless of agent: check → execute → audit. See the official documentation for framework-specific integration guides.
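The check → execute → audit pattern can be sketched as a small framework-independent dispatcher. Everything here is hypothetical scaffolding: `governed_dispatch`, the `check` callable returning an `(allowed, reason)` pair, and the `audit` callable stand in for whatever policy client and log sink you actually use.

```python
from typing import Any, Callable, Dict, Tuple


def governed_dispatch(
    tool_name: str,
    tool_args: dict,
    tools: Dict[str, Callable[..., Any]],
    check: Callable[[str, dict], Tuple[bool, str]],
    audit: Callable[[dict], None],
) -> Any:
    """Run one tool call through check -> execute -> audit."""
    allowed, reason = check(tool_name, tool_args)
    if not allowed:
        audit({"tool": tool_name, "outcome": "denied", "reason": reason})
        raise PermissionError(reason)
    try:
        result = tools[tool_name](**tool_args)
    except Exception as exc:
        # Record failures too, so the log shows attempts as well as successes.
        audit({"tool": tool_name, "outcome": "failed", "reason": repr(exc)})
        raise
    audit({"tool": tool_name, "outcome": "executed", "reason": reason})
    return result
```

Routing every tool call in the agent loop through one function like this keeps the policy check in a single place, so a newly registered tool cannot skip it by accident.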

What the self-hosted community gains

Running agents on your own hardware already gives you data sovereignty and customisation. Adding Agent Governance Toolkit closes the security gap that kept production deployments cautious:

- **Multi-tenant isolation**: run agents for different teams or clients on the same hardware with zero-trust boundaries
- **Runaway agent protection**: circuit breakers and sandbox limits prevent one misbehaving agent from taking down your entire stack
- **Compliance without cloud dependency**: SOC 2, HIPAA, and GDPR audit trails generated entirely on your infrastructure
- **Policy-as-code**: version-controlled, reviewable, testable security rules instead of ad hoc approval flows

Production deployment patterns

Docker Compose

```yaml
version: "3.8"

services:
  governance-engine:
    image: ghcr.io/microsoft/agent-governance-engine:latest
    volumes:
      - ./policies:/policies
      - ./audit:/audit
    ports:
      - "8080:8080"
    environment:
      - GOVERNANCE_IDENTITY_PROVIDER=local
      - GOVERNANCE_AUDIT_BACKEND=sqlite

  your-agent:
    build: .
    environment:
      - GOVERNANCE_ENDPOINT=http://governance-engine:8080
```

Kubernetes

The toolkit provides a Helm chart that deploys the governance engine as a sidecar to your agent pods — every tool call is intercepted at the pod level before reaching the network.

Getting started

1. **Read the docs:** microsoft.github.io/agent-governance-toolkit

2. **Install:** `pip install agent-governance-toolkit`

3. **Try the quick start:** Copy the Rego policy above and run it against your own agent

4. **Join the community:** The Discord server has active discussions about policy patterns and self-hosted deployments

For context on why agent security matters for local setups, read "How to Secure a Self-Hosted AI Server" and "Team Collaboration with Local LLMs: Multi-User Workflows for Private AI".

FAQ

Do I need a Microsoft account to use this?

No. The toolkit is fully open-source (MIT) and runs entirely on your infrastructure. No telemetry, no cloud dependency.

Does it work with local-only models like Ollama?

Yes. The governance layer is framework-agnostic. It intercepts tool calls, not model inference. You can pair it with any local model runtime.

How much overhead does the policy engine add?

Microsoft benchmarks show sub-10ms policy evaluation for typical Rego rule sets. The sandbox layer adds startup overhead (hundreds of ms to set up namespaces), but per-call overhead is negligible.
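Latency figures depend on hardware and rule complexity, so it is worth measuring on your own setup. The helper below is a generic timing sketch using only the standard library. `measure_latency_ms` is a hypothetical name, and `check` can be any callable that wraps your policy call.

```python
import statistics
import time


def measure_latency_ms(check, sample_input, runs: int = 1000) -> dict:
    """Time repeated calls to check(sample_input) and summarise in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        check(sample_input)
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    p99_index = min(len(samples) - 1, int(len(samples) * 0.99))
    return {
        "median_ms": statistics.median(samples),
        "p99_ms": samples[p99_index],
        "max_ms": samples[-1],
    }
```

Look at the 99th percentile and the maximum as well as the median, since a policy check sits on the path of every tool call and occasional slow evaluations add up in long agent runs.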

Can I use it with existing OPA deployments?

Yes. The toolkit's policy engine uses standard OPA (Open Policy Agent) under the hood. You can author policies in any Rego-compatible editor and reuse existing OPA rule sets.

Source

**GitHub:** https://github.com/microsoft/agent-governance-toolkit

**Documentation:** https://microsoft.github.io/agent-governance-toolkit

