How to Run AI Agents Safely: Permissions, Approval Gates, and Governance

An AI agent deleted a company’s entire database last week.

I have 9 agents running real operations at my business. Nothing like that has happened.

Not because my setup is perfect. Because it has rules.

The actual risk

When people worry about AI agents, they usually imagine a rogue AI. An agent that decides to do something it was never asked to do.

That’s not the real risk.

The real risk is simpler: an agent that does exactly what it was asked, in a context where doing exactly that causes real damage.

“Clean up the old customer records” + broad database access + no approval gate = deleted production data.

The model didn’t malfunction. The agent followed instructions. The problem was that no one had defined what “clean up” meant, what access was appropriate, and what required a human sign-off before running.

The three-layer governance system

I run my agent team with a three-layer system. It took me a few painful sessions to figure out, sessions where I gave an agent too much latitude and had to undo its work. Nothing catastrophic, but enough to make me build proper guardrails.

Layer 1: The approval matrix

Every agent in my system has a written approval matrix. Two categories:

Autonomous (things the agent can do without asking):

Draft content, research, write files, prepare reports
Update internal documents
Comment on tasks, log work, create drafts

Needs human (things that require my explicit approval):

Send anything to real people (newsletters, emails, messages)
Post publicly on any platform
Delete or archive anything
Spend money or trigger billing
Make changes that are hard to reverse

This is the single most important piece. Before you give an agent any tool, decide which category each action falls into.

The agent’s job is to do the autonomous work and surface the needs-human work for you to approve. Not to make that judgment call itself.
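As a rough illustration, the matrix above can be written down as a lookup table. This is a minimal sketch, not my production config: the action names are hypothetical, and the choice to treat any unlisted action as needs-human is an assumption I would recommend rather than something the matrix itself dictates.

```python
from enum import Enum


class Decision(Enum):
    AUTONOMOUS = "autonomous"
    NEEDS_HUMAN = "needs_human"


# Hypothetical action names, grouped the way the two categories are described.
APPROVAL_MATRIX = {
    "draft_content": Decision.AUTONOMOUS,
    "research": Decision.AUTONOMOUS,
    "update_internal_doc": Decision.AUTONOMOUS,
    "log_work": Decision.AUTONOMOUS,
    "send_email": Decision.NEEDS_HUMAN,
    "post_publicly": Decision.NEEDS_HUMAN,
    "delete": Decision.NEEDS_HUMAN,
    "spend_money": Decision.NEEDS_HUMAN,
}


def classify(action: str) -> Decision:
    """Look up an action. Anything not listed defaults to needing a human."""
    return APPROVAL_MATRIX.get(action, Decision.NEEDS_HUMAN)
```

The default matters more than the entries. An action nobody thought to list, such as a hypothetical "drop_table", falls into needs-human instead of slipping through as autonomous.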

Layer 2: Scoped access

Agents only get access to the tools they actually need.

My content agent has no database access. No billing access. No customer data access. It has file access to the content directories it works in, API access to the content tools it uses, and nothing else.

The failure mode (an agent deletes production data) requires that the agent have production database access. If the content agent doesn’t have database access, it cannot delete your database no matter what instructions it follows.

Tool assignment should be conservative and specific. If you’re not sure an agent needs a tool, it doesn’t get that tool.
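One way to picture scoped access is a per-agent allowlist checked before any tool runs. The sketch below assumes hypothetical tool and agent names and stubs the tools out as simple functions; in practice the stronger version is to never issue the credentials at all, so treat the in-process check as an illustration of the idea only.

```python
from typing import Callable, Dict, Set

# Hypothetical tools. Real ones would wrap file, API, or database clients.
TOOLS: Dict[str, Callable[[str], str]] = {
    "read_content_file": lambda arg: f"read {arg}",
    "call_content_api": lambda arg: f"called content API with {arg}",
    "run_sql": lambda arg: f"executed {arg}",
}

# Each agent gets an explicit allowlist. Nothing is granted by default.
AGENT_TOOLS: Dict[str, Set[str]] = {
    "content_agent": {"read_content_file", "call_content_api"},
}


def call_tool(agent: str, tool: str, arg: str) -> str:
    """Run a tool only if it is on the agent's allowlist."""
    allowed = AGENT_TOOLS.get(agent, set())
    if tool not in allowed:
        raise PermissionError(f"{agent} has no access to {tool}")
    return TOOLS[tool](arg)
```

Here the content agent can read its files, but a call to `run_sql` raises `PermissionError` regardless of what the instruction said.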

Layer 3: Written rules per agent

Each agent in my system has a SOUL.md, a file that defines who the agent is, what it does, and, critically, what it does not do.

The SOUL.md includes an explicit “hard rules” section. Things like:

Never post to external channels without approval
Never send messages to real people without human sign-off
Never delete files without explicit confirmation
All actions that affect live systems go through the approval queue first

This isn’t just about safety. It also means the agent operates consistently. You’re not relying on each task’s instructions to remind it of the rules. The rules are baked into the identity it reads at the start of every session.
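To make "reads at the start of every session" concrete, here is a small sketch of loading a SOUL.md and putting it ahead of the task. The file layout (a "Hard rules" heading followed by dash-prefixed lines) and both function names are assumptions for illustration; your agent framework may load identity files differently.

```python
from typing import List


def extract_hard_rules(soul_text: str) -> List[str]:
    """Return the dash-prefixed lines under a 'Hard rules' heading."""
    rules = []
    in_section = False
    for line in soul_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # Any heading either opens or closes the hard-rules section.
            in_section = stripped.lstrip("#").strip().lower() == "hard rules"
            continue
        if in_section and stripped.startswith("- "):
            rules.append(stripped[2:])
    return rules


def build_session_prompt(soul_text: str, task: str) -> str:
    """Place the whole SOUL.md ahead of the task so every session starts with it."""
    return f"{soul_text.strip()}\n\n# Task\n{task}"


# Hypothetical SOUL.md contents.
EXAMPLE_SOUL = """\
# Content agent

## Hard rules
- Never post to external channels without approval
- Never delete files without explicit confirmation

## Style
- Keep drafts short
"""
```

Extracting the rules separately is optional. It is handy if you want to check that every agent’s file actually contains a hard rules section before a session starts.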

What changed when I added this system

Before: I was giving agents broad tasks and trusting them to stay in bounds. They mostly did. But “mostly” is not a governance architecture.

After: I know exactly what each agent can and can’t do. When something touches real systems, it goes through an approval queue. The agent does the work; I approve the action before it goes live.
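The approval queue can be very small. The sketch below is an assumed shape, not the tool I use: the agent submits a description plus a deferred function, and nothing with side effects runs until a human calls `approve`. The class and method names are hypothetical.

```python
from itertools import count
from typing import Callable, Dict, List, Tuple


class ApprovalQueue:
    """Holds proposed actions until a human approves or rejects them."""

    def __init__(self) -> None:
        self._ids = count(1)
        # Maps action id to (agent, description, deferred side effect).
        self.pending: Dict[int, Tuple[str, str, Callable[[], str]]] = {}
        self.log: List[str] = []

    def submit(self, agent: str, description: str, run: Callable[[], str]) -> int:
        """Called by the agent. Stores the action without running it."""
        action_id = next(self._ids)
        self.pending[action_id] = (agent, description, run)
        return action_id

    def approve(self, action_id: int) -> str:
        """Called by the human. Runs the stored action and logs the decision."""
        agent, description, run = self.pending.pop(action_id)
        result = run()
        self.log.append(f"approved: {agent}: {description}")
        return result

    def reject(self, action_id: int) -> None:
        """Called by the human. Discards the action without running it."""
        agent, description, _ = self.pending.pop(action_id)
        self.log.append(f"rejected: {agent}: {description}")
```

The design choice that matters is that `submit` takes a function rather than a result. The work of preparing the action is done, but the action itself has not happened yet.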

The practical effect: I can give agents access to real business infrastructure (content systems, ops tooling, scheduling) without worrying that one bad instruction causes irreversible damage.

The mental model I use: treat your AI agents like new employees who are enthusiastic and capable but who haven’t yet developed the judgment to know which actions need escalation. You give them real work. You review the high-stakes decisions. Over time you build confidence in specific tasks, and the approval gate for those tasks becomes lighter.

The hardest part

Most people don’t set up governance because they’re doing small tasks in chat mode. Governance feels like overhead for asking “write me an email.”

It becomes relevant the moment you give agents persistent access to real systems.

Email accounts. Databases. Publishing pipelines. Payment systems. Customer-facing channels.

The moment you do that, and most people who are serious about agents get there quickly, the approval matrix and scoped access aren’t optional. They’re the thing standing between “9 agents running for 6 months without a data loss incident” and “Claude deleted our database.”

What to set up first

If you’re running agents with access to real systems, start here:

Write an approval matrix: two columns, autonomous and needs-human. Keep it short. Fill in the obvious categories. You can add to it later.
Audit tool access: for each agent, list every tool and access it has. Remove anything it doesn’t need for its specific job.
Add a hard rules section to each agent’s config: explicit prohibitions, not suggestions. “This agent never deletes files” should be written down, not assumed.
Set up a review step for irreversible actions: anything that can’t be easily undone goes through human review before running. This is the approval gate.
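The access audit in particular is easy to mechanize once you have two lists per agent: what it holds and what its job needs. A minimal sketch, with hypothetical agent and tool names:

```python
from typing import Dict, Iterable, List


def audit_access(
    granted: Dict[str, Iterable[str]],
    needed: Dict[str, Iterable[str]],
) -> Dict[str, List[str]]:
    """Return, per agent, the tools it holds but does not need."""
    excess = {}
    for agent, tools in granted.items():
        # An agent with no entry in `needed` is treated as needing nothing.
        extra = set(tools) - set(needed.get(agent, ()))
        if extra:
            excess[agent] = sorted(extra)
    return excess


# Hypothetical inventory of what each agent currently has and what its job requires.
GRANTED = {
    "content_agent": ["read_content_file", "call_content_api", "run_sql"],
    "ops_agent": ["read_logs"],
}
NEEDED = {
    "content_agent": ["read_content_file", "call_content_api"],
    "ops_agent": ["read_logs"],
}
```

Anything the function returns is a candidate for removal. With the inventory above, that is the content agent’s `run_sql`.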

None of this requires sophisticated tooling. A markdown file with a two-column table and a few hard rules is more protection than most agent setups have.
