Don’t Give AI the Keys: Information Security 101 for your AI agents

We're giving AI agents the privileges of critical infrastructure without the guardrails that govern it.

Mar 26, 2026

Imagine walking up to an Inbox-Zero dream. Only to find out that a byproduct of this is a notification on your phone from a payment you didn’t approve and you didn’t want to do. You didn’t click on a suspicious link. You didn’t change your password. And the latter is not 12345. You told an agent —over a messaging app— to “clean up my messages,” and the same agent, with broad access, followed a chain of permissions, acted on a forwarded instruction, and executed a transfer.

After the viral launch of OpenClaw, similar things happened1. OpenClaw’s innovation is to fold agentic AI into the same messaging apps you use with friends and family —WhatsApp, Signal, Messages— so that a natural‑language instruction can now travel from chat to calendar, email, banking APIs, and all the apps the user provides access to. The result is dazzling convenience: tell an agent to “handle it,” and it handles it. The result is also terrifyingly fragile: we are handing over access to sensitive systems in a way that doesn’t honour that these are critical infrastructure.

My thesis is that OpenClaw is not necessarily unsafe. But due to ignorance, negligence or naiveté we are ignoring the principles, frameworks and rules that Information Security, a discipline that is four decades old, offers and that, if followed, would make the deployment of these agents safer. We are deploying agentic flows with the privileges and reach that zero-tolerance systems (banks, hospitals, air‑traffic control) enjoy, while leaving the guardrails and frameworks that govern the latter aside.

In this article, I explore what it would look like to apply the same frameworks that govern zero-tolerance architectures2 to this example. I also would like to encourage you to see these principles as something that has the potential of making these systems more resilient, and not necessarily something that slows down or halts innovation.

What Information Security as a discipline has given us

Information security developed over 40+ years ago to answer one question: how do we give systems the ability to act, without giving them the ability to cause harm?

We rarely think of it explicitly, but we humans run systems on zero-tolerance architecture for some time now: banks, hospitals, traffic systems (think of airport or train controllers). They are the critical infrastructure that keep our modern world spinning. Similarly to good health, we only realise how they silently and invisibly power our modern world, when they fail.

It collectively took us decades and costly and tragic mistakes3 to come up with an answer to the above question. But the answer is nothing less than a set of principles that govern how critical infrastructure in our world operates. The technical term for this type of critical infrastructure is zero-tolerance, to describe the fact that banks, hospitals, governments, public transport are all fields where we collectively won’t accept any mistake4.

The 5 key principles of information security

1. Principle of Least Privilege

Principle: Give any user or system only the minimum access it needs to do its specific job. Nothing less, but more importantly nothing more.

For example, a hospital billing system should be able to read patient invoices. It should not be able to modify medical records.

Why it matters for AI agents: an AI agent that can send emails should not also have access to your file system, calendar, and contacts unless it genuinely needs all of them.

Even though deploying agents with broad, permissive access it’s easier and faster to set up, the Principle of Least Privilege says: scope access to the task. If it only needs to read, don’t give it write. If it only needs this folder, don’t give it the whole drive.

2. Role-Based Access Control

Principle: Permissions are assigned to roles, not to individuals or systems. A “viewer” role can read but not write. An “admin” role can configure but not delete. Nobody sits down and configures each individual’s permissions from scratch. They get a role, and the role carries the permissions.

For example: In a hospital, a nurse has a “nurse” role which has the rights to read patient vitals, and update care notes. A billing clerk has a “billing” role which allows to read invoices but no access to medical records. A surgeon has a “surgeon” role with access to the full surgical history and right to prescribe medicine.

Why it matters for AI agents: An agent acting as a “scheduler” should have a scheduler’s permissions only. The same agent shouldn’t be able to both book a meeting and wire money, even if it theoretically could.

Roles are defined once and assigned. This matters at scale: when a new user or system joins, you assign a role rather than manually configuring access from scratch. It also simplifies auditing because you review a handful of roles, not thousands of individual permission configurations. If something goes wrong, you ask: “which role had that access?” not “which of the 10,000 users had that setting enabled?”

3. Separation of Duties

Principle: No single person or system should have end-to-end control over a sensitive process.

For example: In banking, the person who approves a transaction is never the same person who executes it.

Why it matters for AI agents: When an agent can both decide and execute there is no checkpoint. Separation of duties would require a human approval step between decision and action, especially for irreversible ones.

This is sometimes called the “four-eyes principle.” The logic is that corruption or error requires collusion. One compromised actor (human or AI) can’t unilaterally execute a harmful action, because a second checkpoint must be passed. The aim is to have systems where no single actor needs to be trusted completely.

4. Audit Trails and Logging

Principle: Every action taken by every user or system is recorded: who did it, what they did, when, and from where. This record cannot be altered after the fact.

For example: Your bank statement is a form of audit trail. Every transaction is logged including details like when, how much, to whom and you can’t delete last Tuesday’s transfer to make it disappear.

Why it matters for AI agents: Unless you make agents to keep track, the doings of agents is opaque: they act, and unless something breaks visibly, no one knows exactly what they did. An audit trail changes that because when something goes wrong, you can trace it. They also deter bad behavior because actors (humans and AI) know they are being watched.

5. Zero Trust Architecture

Plain language: “Never trust, always verify.” Older security models assumed that anything inside the network perimeter was safe. Zero Trust assumes nothing is safe by default. Every request for access must be authenticated, regardless of where it comes from.

For example: Think of a government building where employees with valid ID badges still have to re-scan at every internal door, not just the entrance. Being “inside” the building doesn’t mean you can go anywhere. Every door is its own checkpoint. The badge that opens the cafeteria is not the badge that opens the server room.

Why it matters for AI agents: An agent that has been granted access once should not have that access assumed forever. Zero Trust would require ongoing verification: is this agent still acting within its defined scope? Is this action consistent with its role?5

What applying info sec principles would look like

Let’s go back to the unauthorized transfer as collateral action in the Inbox-Zero example. Applying the above principles, would return:

Least Privilege: the agent handling your inbox has one job and the access that goes with it: read, archive, and delete emails. You could maybe include write drafts. But it has no connection to your bank simply because it is not necessary to the goal of cleaning up the inbox.
Separation of Duties: even if the agent somehow reached a financial action, it couldn’t execute it alone. A payment should definitely require a second step: a human confirmation. The agent proposes, a person approves. The decision and the execution are separated. The transfer doesn’t happen until you say so.
Zero Trust: the forwarded payment instruction (the one that the agent interpreted as needed in order to clear the inbox) doesn’t get a free pass. Every request to act gets verified: is this consistent with this agent’s role? Is this the kind of action it’s authorized to take? The answer, for a wire transfer initiated by an inbox manager, is no.

What do you notice?

None of this is a new application, or a new device. It is the result of slowing down and taking the collaboration with agents equally seriously and deliberately as we would do with a colleague.

A useful mental model: treat agents the way you’d treat a new employee. If you have a healthy sense of mistrust toward a new hire — taking it step by step, scoping what they can take over before expanding their access — apply the same logic to agentic systems. The same way we expect new people to prove themselves, agents also need to prove themselves. We seem to be skipping this step with our agents. We are not requiring them to prove themselves, we are lowering our guards precisely when we should be raising them.

We are ignoring these principles. Why?

I see three possible candidates to answer this question:

Cultural lens: Our culture values and praises the build fast and break things mentality. We value doing far more than we value observing, waiting or pausing. The latter are associated with laziness, risk-aversion or lack of skills. None of those are winning traits. Pushed a bit farther, and with the seductive novelty of agentic possibilities, we end up suspending the judgement we would never suspend elsewhere.
Structural lens: You might be familiar with the Maslow pyramid of human needs. The most important insight of this concept is that there is a ranking in terms of needs. I will likely only bother about my fashion style if my basic needs are already met. Applied to organizations and the deployment of agents with broad access, one could argue that an organization is only realistically in the position of applying InfoSec principles once certain “basics” are sorted: identity management, audit infrastructure and definition of roles. In the absence of this, deploying agents does not add capabilities but amplifies vulnerabilities. The foundation has to come first.
Literacy lens: Most people don’t have a solid understanding of what they’re handing over when they connect an agent to their systems. They think about it like installing an app. But an app has a defined, static set of permissions. An agent can make decisions. It interprets instructions, follows chains of logic, and acts. Often in ways no one anticipated. If we don’t understand the basics, we don’t comprehend fully what we are doing.

Slow down to go further

Even though we might be blinded by the possibilities and promises of these agentic systems, the collaboration between humans and machines is not new. Our modern life relies on it.

What is new is the speed at which we are handing over access to systems we haven’t properly understood or scoped, in domains we haven’t properly thought through.

As AI systems grow more capable, lets grow alongside them. That starts with holding agents to the standards and these could be the ones I have outlined above.

This is an essential human task. The perception that information security is too technical to engage with is precisely the kind of thinking that keeps these frameworks on the shelf, hands over the keys and hopes for the best.

The good news is that we don’t need to invent anything new. The frameworks exist. We just need to take this technology as seriously as we have learned to take every other system that acts in the world on our behalf — with boundaries proportional to its capability.

As Elmira pointed out, following these principles make agents safer and organizations more resilient. A system that is properly scoped, audited, and governed is a system you can trust, debug, and improve over time. Security and capability grow together, and should not grow apart.

Invitation to an in person event in Berlin

We’re doing something on April 23rd in Berlin that I am truly excited about:

Three people I deeply admire are joining me to think through AI, Memory and Migration:

→ Roshan Melwani (Oxford Institute for Technology and Justice)

→ Manuela Verduci (Kiron Digital Learning Solutions)

→ Mekonnen Mesghena (Heinrich Böll Foundation)

A human rights lawyer, a social entrepreneur, and a policy thinker, each looking at the same question from a different angle.

The evening explores how two of our oldest human instincts — the need to move and to remember — intersect with our newest technology. And it asks what happens when we let algorithms touch the stories that make us who we are.

We’ve also prepared something so that attendees don’t just listen. Our goal is for you to feel what’s at stake — and then sit with that feeling in a room full of others who felt it too.

Come join the conversation.

→ Register here

References

Ken Institute. (2024). “The Worst Engineering Disasters Due to Mechanical Errors.” Ken Institute Blog. Available at: https://keninstitute.com/the-worst-engineering-disasters-due-to-mechanical-errors/

Roelen, A., Kinnersly, S., and Drogoul, F. (2004). Review of Root Causes of Accidents Due to Design (EEC Note No. 14/04). EUROCONTROL Experimental Centre, Brétigny-sur-Orge, France. Available at: https://www.eurocontrol.int/sites/default/files/library/027_Root_Causes_of_Accidents_Due_to_Design.pdf

Wikipedia contributors. (n.d.). "Zero trust architecture." Wikipedia, The Free Encyclopedia. Retrieved March 25, 2026, from https://en.wikipedia.org/wiki/Zero_trust_architecture

Elmira Gazizova, Personal conversation with the author. Berlin, March 19, 2026.

For example, X user @summeryue0 (do note that she works as a Safety Researcher at Meta, so this might happen to the best of us) ran to her Mac mini to turn it off like a bomb. Or, project Agents of Chaos, where twenty researchers interacted with six agents powered by frontier models (Claude Opus 4.6, GPT 5.4, etc) were deployed in a live, multi-party lab environment from January 28 to February 17, 2026. The result: a sobering catalog of failures in security, privacy, trust models, and governance.

Summer Yue@summeryue0

Nothing humbles you like telling your OpenClaw “confirm before acting” and watching it speedrun deleting your inbox. I couldn’t stop it from my phone. I had to RUN to my Mac mini like I was defusing a bomb.

3:25 AM · Feb 23, 2026 · 10M Views

2.35K Replies · 1.69K Reposts · 17.5K Likes

More on zero tolerance architecture here

Some of example of tragedies that led to changes in system architecture: 1981-Hyatt Regency walkway collapse. A design change to suspended walkways in Kansas City contributed to 114 deaths and over 200 injuries, showing how a small structural revision can create catastrophic load-path failure., 1984 — Bhopal gas disaster. A toxic release at a pesticide plant killed thousands and exposed how weak process safety, maintenance failures, and poor containment can turn a plant into a mass-casualty system., 1986 — Chernobyl. A safety test, design flaws, and disabled protections led to reactor destruction and a massive radioactive release. It strongly reinforced the need for defense-in-depth, fail-safe defaults, and independent barriers., 2003 — Columbia Plane disaster. Damage to the shuttle’s thermal protection system during launch became fatal on reentry, showing that “minor” damage in one phase can destroy the whole mission later.

More technically, it is an applied posture for systems where any unauthorized action or irreversible error is unacceptable. It requires preventative controls, enforced segregation of duties, immutable audit trails, and human checkpoints for non‑reversible operations so that a single failure cannot produce catastrophic outcomes. The result: these systems don’t collapse and run reliably and smoothly most of the time.

This whole section would not have been possible without the kind and insightful conversation I had with Elmira Gazizova, AI Adoption Lead at keyIT SA on 19th March. Thank you, Elmira.

Phi / AI

Discussion about this post

Ready for more?