Self-Hosted AI Agent Security: Deploy Microsoft's Agent Governance Toolkit
Microsoft's Agent Governance Toolkit brings zero-trust policy enforcement, execution sandboxing, and audit trails to autonomous AI agents. Here is how to deploy it on your own infrastructure.

Your AI agents call tools, browse the web, query databases, and delegate to other agents. Once deployed, they make decisions autonomously — and you need answers to three questions: *Is this action allowed? Which agent did this? Can you prove what happened?*
Microsoft's **Agent Governance Toolkit**, now in public preview with 3,400+ GitHub stars and trending this week, answers all three. It is an open-source (MIT), framework-agnostic governance layer that adds policy enforcement, zero-trust identity, execution sandboxing, and reliability engineering to autonomous AI agents, and you can run it on your own infrastructure.
For the self-hosted AI community, this is the missing security layer that makes production agent deployments practical without surrendering control to a cloud provider.
What Agent Governance solves
The toolkit covers the full OWASP Agentic Top 10 — all ten categories, which is a first for any open-source project. The three core problems it addresses map directly to what breaks in self-hosted multi-agent systems:
1. Policy enforcement
An agent with access to `send_email` and `query_database` should not be able to `drop_table`. Traditional IAM roles control *which services* an agent can reach, not *what it does* once connected. Agent Governance introduces **scoped policy definitions** that constrain tool calls, API usage, and data access at runtime — before the action reaches the underlying service.
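To make the idea of a scoped policy concrete, here is a minimal sketch in plain Python. It does not use the toolkit's API: `ToolScope` and `is_allowed` are hypothetical names that only illustrate a default-deny check on tool calls before they reach the underlying service.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToolScope:
    """Hypothetical scope: the tools one agent role may call."""
    role: str
    allowed_tools: frozenset = field(default_factory=frozenset)


def is_allowed(scope: ToolScope, tool: str) -> bool:
    """Default-deny: a tool call passes only if it is explicitly listed."""
    return tool in scope.allowed_tools


support_scope = ToolScope(
    role="support",
    allowed_tools=frozenset({"send_email", "query_database"}),
)

for tool in ("send_email", "query_database", "drop_table"):
    verdict = "allow" if is_allowed(support_scope, tool) else "deny"
    print(f"{tool}: {verdict}")
```

The point of the sketch is the default: anything not named in the scope is refused, so a new tool stays blocked until someone adds it on purpose.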
2. Agent identity and attribution
In a multi-agent system, five agents might share a single API key. When something goes wrong, "an agent did it" is not an incident response. The toolkit assigns **cryptographic identities** to agents and binds every action to an auditable principal across distributed traces.
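As an illustration of per-agent attribution, the sketch below signs each action with a key that belongs to exactly one agent, using only the Python standard library. This is an assumption-laden stand-in, not the toolkit's identity mechanism: `AGENT_KEYS`, `sign_action`, and `verify_action` are hypothetical names, and HMAC is used here only because it needs no third-party packages.

```python
import hashlib
import hmac
import json
import secrets

# Hypothetical registry: one secret key per agent instead of one shared API key.
AGENT_KEYS = {
    "support-agent-v1": secrets.token_bytes(32),
    "billing-agent-v1": secrets.token_bytes(32),
}


def sign_action(agent_id: str, action: dict) -> str:
    """Return a hex HMAC-SHA256 tag binding the action to the agent's key."""
    payload = json.dumps(action, sort_keys=True).encode("utf-8")
    return hmac.new(AGENT_KEYS[agent_id], payload, hashlib.sha256).hexdigest()


def verify_action(agent_id: str, action: dict, tag: str) -> bool:
    """Check that the tag was produced with this agent's key for this action."""
    expected = sign_action(agent_id, action)
    return hmac.compare_digest(expected, tag)


action = {"tool": "query_database", "query": "SELECT name FROM users"}
tag = sign_action("support-agent-v1", action)
```

A tag made by one agent does not verify under another agent's key or for a modified action, which is what turns "an agent did it" into a named principal. Because HMAC is symmetric, whoever verifies can also forge; a production setup would more likely use asymmetric signatures.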
3. Tamper-proof audit trails
Regulators, auditors, and compliance teams need proof. Agent Governance writes every decision (allowed and denied) to an immutable audit log with cryptographic binding, so you can reconstruct exactly what each agent did, when, and why.
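One common way to make a log tamper-evident is a hash chain, where each entry's hash covers the previous entry's hash. The sketch below shows that general technique with the standard library; it is not taken from the toolkit, and `append_entry` and `verify_chain` are hypothetical helpers.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    """Hash the previous entry's hash together with this record."""
    body = json.dumps(record, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, record: dict) -> None:
    """Append a record whose hash covers the previous entry's hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"record": record, "prev": prev_hash, "hash": _digest(prev_hash, record)})


def verify_chain(log: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash or entry["hash"] != _digest(prev_hash, entry["record"]):
            return False
        prev_hash = entry["hash"]
    return True


audit_log: list = []
append_entry(audit_log, {"agent": "support-agent-v1", "tool": "query_database", "decision": "allowed"})
append_entry(audit_log, {"agent": "support-agent-v1", "tool": "drop_table", "decision": "denied"})
```

Changing any stored record makes `verify_chain` return `False`. A bare chain does not detect entries dropped from the tail, so the latest hash is usually anchored somewhere the writer cannot modify.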
Architecture overview
The toolkit is designed around three layers that work together regardless of your agent framework (LangChain, CrewAI, AutoGen, Semantic Kernel, or raw tool-calling code):
```
┌─────────────────────────────────────────────────┐
│                   Your Agent                    │
├─────────────────────────────────────────────────┤
│              Agent Governance SDK               │
│         (Python, TypeScript, .NET, Go)          │
├─────────────────────────────────────────────────┤
│ Policy Engine  │ Identity      │ Audit          │
│ (OPA/Rego)     │ (JWTs)        │ (Immutable)    │
├─────────────────────────────────────────────────┤
│ Execution Sandbox      │ Circuit Breakers       │
│ (Restricted env)       │ (Rate limit, retry)    │
└─────────────────────────────────────────────────┘
```
Every tool call by your agent passes through the **Policy Engine** before execution. The identity layer attaches a signed JWT to every request. The **Execution Sandbox** constrains what the agent can do at the OS level (namespace isolation, filesystem restrictions, network rules). **Circuit breakers** prevent runaway agents from overwhelming upstream services.
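The circuit-breaker idea can be sketched in a few lines of Python. This shows the general pattern only, not the toolkit's implementation; the `CircuitBreaker` class and its parameters (`max_failures`, `reset_after`, `clock`) are assumptions made for the example.

```python
import time


class CircuitBreaker:
    """Opens after `max_failures` consecutive errors, then rejects calls
    until `reset_after` seconds have passed."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self._clock = clock
        self._failures = 0
        self._opened_at = None

    def call(self, func, *args, **kwargs):
        if self._opened_at is not None:
            if self._clock() - self._opened_at < self.reset_after:
                raise RuntimeError("circuit open: call rejected")
            # Cool-down elapsed: allow one trial call (half-open state).
            self._opened_at = None
            self._failures = self.max_failures - 1
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                self._opened_at = self._clock()
            raise
        self._failures = 0  # any success resets the failure count
        return result
```

Wrapping an upstream call as `breaker.call(fetch, url)` means a failing service is left alone for `reset_after` seconds instead of being retried in a tight loop by an agent that does not know when to stop.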
Quick start on your own infrastructure
The toolkit ships as a pip-installable Python package, so it works in any self-hosted environment: bare metal, Docker, Kubernetes, or a homelab server.
Step 1: Install
```bash
pip install agent-governance-toolkit
```
Step 2: Define a policy
Policies use the Rego language (from the Open Policy Agent project). Here is a policy that blocks database writes by a customer-support agent and allows read-only queries under a request limit:
```rego
package agent_policy

# Enables the `contains`, `if`, and `in` keywords on OPA versions before 1.0
import rego.v1

# Deny database write operations for support agents
deny contains msg if {
    input.agent_role == "support"
    input.tool == "query_database"
    "write" in input.tool_args.operations
    msg := "Support agents cannot write to the database"
}

# Allow read-only queries with rate limiting
allow contains msg if {
    input.agent_role == "support"
    input.tool == "query_database"
    "read" in input.tool_args.operations
    input.request_count < 100
    msg := "Allowed (read-only, under rate limit)"
}
```
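For readers less familiar with Rego's set-building rules, the plain-Python sketch below mirrors the two rules so you can reason about expected outcomes. It is not how OPA evaluates policies, and `evaluate` is a hypothetical helper written for this illustration.

```python
def evaluate(inp: dict) -> dict:
    """Mirror the two Rego rules: collect deny and allow messages as sets."""
    deny, allow = set(), set()
    operations = inp.get("tool_args", {}).get("operations", [])
    count = inp.get("request_count")
    is_support_query = (
        inp.get("agent_role") == "support" and inp.get("tool") == "query_database"
    )

    # Rule 1: any write operation by a support agent is denied.
    if is_support_query and "write" in operations:
        deny.add("Support agents cannot write to the database")

    # Rule 2: reads are allowed while the request count stays under 100.
    # A missing count leaves the rule unsatisfied, as an undefined value would in Rego.
    if is_support_query and "read" in operations and count is not None and count < 100:
        allow.add("Allowed (read-only, under rate limit)")

    return {"deny": deny, "allow": allow}
```

Note that a request listing both `"read"` and `"write"` satisfies both rules at once, so whatever combines the two sets has to decide which one wins; letting any deny message override an allow is the usual safe choice.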
Step 3: Wrap your agent calls
```python
from agent_governance import GovernanceClient

client = GovernanceClient(
    policy_file="policies/support.rego",
    agent_id="support-agent-v1",
    identity_provider="local",
)

# Arguments for the tool call, reused for the policy check and the call itself
args = {"query": "SELECT name FROM users", "operations": ["read"]}

# Before every tool call, check the policy
decision = client.check(
    tool="query_database",
    args=args,
    context={"agent_role": "support", "request_count": 42},
)

if decision.allowed:
    result = your_tool_function(args)  # placeholder for your own tool implementation
    client.audit_log(action="executed", decision=decision)
else:
    client.audit_log(action="denied", decision=decision)
    raise PermissionError(decision.reason)
```
Step 4: Add sandboxing for untrusted agents
For agents that run user-provided code or browse the web, add execution sandboxing:
```python
from agent_governance.sandbox import Sandbox

sandbox = Sandbox(
    read_paths=["/data/input"],
    write_paths=["/data/output"],
    network_rules={"allow": ["api.internal:443"], "deny": ["*"]},
    max_cpu_seconds=30,
    max_memory_mb=512,
)

with sandbox:
    result = agent.run(task)  # constrained inside the sandbox
```
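To show what a `read_paths` restriction has to guard against, here is a small application-level sketch of a path confinement check. It is not the sandbox's implementation, which would enforce limits at the OS level; `is_within` and `READ_PATHS` are hypothetical names, and the example assumes POSIX-style paths.

```python
from pathlib import Path


def is_within(path: str, allowed_roots: list) -> bool:
    """True if `path`, after resolving `..` and symlinks, lies under an allowed root."""
    resolved = Path(path).resolve()
    for root in allowed_roots:
        root_resolved = Path(root).resolve()
        # Compare whole path components, not string prefixes.
        if resolved == root_resolved or root_resolved in resolved.parents:
            return True
    return False


READ_PATHS = ["/data/input"]

for candidate in ("/data/input/report.csv", "/data/input/../secrets/key", "/data/input_evil/x"):
    print(candidate, "->", is_within(candidate, READ_PATHS))
```

Resolving before comparing is what stops `..` traversal, and comparing components rather than string prefixes is what stops `/data/input_evil` from passing as `/data/input`. A check like this is only a first filter, since the filesystem can change between the check and the actual access.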
Integrating with self-hosted agent stacks
The toolkit works with every major agent framework used in self-hosted setups:
LangChain / LangGraph
Add governance as a callable that wraps your tool executor. The toolkit ships with a `LangChainGovernanceMiddleware` that intercepts `ToolMessage` routing.
CrewAI
Define a `GovernanceTask` subclass that checks policies before executing each task step. The audit log doubles as CrewAI's callback for observability.
AutoGen
Use the `AgentGovernancePlugin` that registers as an AutoGen tool pre-hook — every tool invocation goes through policy check before reaching the registered function.
Custom agent (Hermes, Claude Code, Codex)
For raw tool-calling agents, wrap the tool dispatch loop. The pattern is the same regardless of agent: check → execute → audit. See the official documentation for framework-specific integration guides.
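The check → execute → audit pattern can be sketched without any framework. Everything below is a stand-in written for illustration: `governed_dispatch`, the `Checker` and `Auditor` signatures, and the example tools are assumptions, not part of the toolkit.

```python
from typing import Any, Callable, Dict, List, Tuple

# Hypothetical signatures: a checker returns (allowed, reason);
# an auditor records (tool, outcome, reason).
Checker = Callable[[str, Dict[str, Any]], Tuple[bool, str]]
Auditor = Callable[[str, str, str], None]


def governed_dispatch(
    tool: str,
    args: Dict[str, Any],
    tools: Dict[str, Callable[..., Any]],
    check: Checker,
    audit: Auditor,
) -> Any:
    """Run one tool call through check -> execute -> audit."""
    allowed, reason = check(tool, args)
    if not allowed:
        audit(tool, "denied", reason)
        raise PermissionError(reason)
    try:
        result = tools[tool](**args)
    except Exception as exc:
        audit(tool, "failed", repr(exc))
        raise
    audit(tool, "executed", reason)
    return result


# Example wiring with stand-in implementations.
events: List[Tuple[str, str, str]] = []


def allow_reads_only(tool: str, args: Dict[str, Any]) -> Tuple[bool, str]:
    if tool == "query_database":
        return True, "read-only tool"
    return False, f"tool '{tool}' is not permitted"


def record(tool: str, outcome: str, reason: str) -> None:
    events.append((tool, outcome, reason))


TOOLS = {"query_database": lambda query: f"rows for: {query}"}
```

Calling `governed_dispatch("query_database", {"query": "SELECT 1"}, TOOLS, allow_reads_only, record)` returns the tool's result and records an `executed` event, while a call to an unlisted tool raises `PermissionError` and records `denied`. Keeping the checker and auditor as plain callables makes it straightforward to swap the stand-ins for real ones later.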
What the self-hosted community gains
Running agents on your own hardware already gives you data sovereignty and customisation. Adding Agent Governance Toolkit closes the security gap that kept production deployments cautious:
- **Multi-tenant isolation** — run agents for different teams or clients on the same hardware with zero-trust boundaries
- **Runaway agent protection** — circuit breakers and sandbox limits prevent one misbehaving agent from taking down your entire stack
- **Compliance without cloud dependency** — SOC 2, HIPAA, and GDPR audit trails generated entirely on your infrastructure
- **Policy-as-code** — version-controlled, reviewable, testable security rules instead of ad hoc approval flows
Production deployment patterns
Docker Compose
```yaml
version: "3.8"
services:
  governance-engine:
    image: ghcr.io/microsoft/agent-governance-engine:latest
    volumes:
      - ./policies:/policies
      - ./audit:/audit
    ports:
      - "8080:8080"
    environment:
      - GOVERNANCE_IDENTITY_PROVIDER=local
      - GOVERNANCE_AUDIT_BACKEND=sqlite
  your-agent:
    build: .
    environment:
      - GOVERNANCE_ENDPOINT=http://governance-engine:8080
```
Kubernetes
The toolkit provides a Helm chart that deploys the governance engine as a sidecar to your agent pods — every tool call is intercepted at the pod level before reaching the network.
Getting started
1. **Read the docs:** microsoft.github.io/agent-governance-toolkit
2. **Install:** `pip install agent-governance-toolkit`
3. **Try the quick start:** Copy the Rego policy above and run it against your own agent
4. **Join the community:** The Discord server has active discussions about policy patterns and self-hosted deployments
For context on why agent security matters for local setups, read *How to Secure a Self-Hosted AI Server* and *Team Collaboration with Local LLMs: Multi-User Workflows for Private AI*.
FAQ
Do I need a Microsoft account to use this?
No. The toolkit is fully open-source (MIT) and runs entirely on your infrastructure. No telemetry, no cloud dependency.
Does it work with local-only models like Ollama?
Yes. The governance layer is framework-agnostic. It intercepts tool calls, not model inference. You can pair it with any local model runtime.
How much overhead does the policy engine add?
Microsoft benchmarks show sub-10ms policy evaluation for typical Rego rule sets. The sandbox layer adds startup overhead (hundreds of ms to set up namespaces), but per-call overhead is negligible.
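Overhead depends on your hardware and rule sets, so it is worth measuring in your own environment. The helper below is a generic timing sketch using the standard library; `measure_latency_ms` and `dummy_check` are hypothetical names, and `dummy_check` is only a placeholder for a real policy check.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(check: Callable[[], object], runs: int = 1000) -> dict:
    """Time `check` repeatedly and report median and 95th-percentile latency in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        check()
        samples.append((time.perf_counter() - start) * 1000.0)
    samples.sort()
    p95_index = min(len(samples) - 1, int(0.95 * len(samples)))
    return {"median_ms": statistics.median(samples), "p95_ms": samples[p95_index]}


# Stand-in for a real policy check; replace with a call into your own setup.
def dummy_check() -> bool:
    return "write" not in ["read"]


print(measure_latency_ms(dummy_check, runs=200))
```

Reporting a high percentile next to the median matters here, because an agent that makes many tool calls per task tends to feel the slow tail more than the average.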
Can I use it with existing OPA deployments?
Yes. The toolkit's policy engine uses standard OPA (Open Policy Agent) under the hood. You can author policies in any Rego-compatible editor and reuse existing OPA rule sets.
Source
**GitHub:** https://github.com/microsoft/agent-governance-toolkit
**Documentation:** https://microsoft.github.io/agent-governance-toolkit
