Giving an AI Agent Its Own Microsoft 365 Account

This is the second post in our series about building an AI Executive Assistant with OpenClaw. If you haven’t read the overview yet, start there for the big picture. Here, we’re going deep on one of the most critical building blocks: the Microsoft 365 integration.

Every Executive Assistant needs email and calendar access. That’s table stakes. But giving an AI agent access to Microsoft 365 isn’t as simple as handing over login credentials and hoping for the best. We needed something much more deliberate.

Why We Built a Custom M365 Plugin

The first question we faced was straightforward: how do you give an AI agent access to email, calendar, tasks, and contacts in a way that’s both useful and safe?

We didn’t want blanket access. We didn’t want the agent poking around in places it shouldn’t be, or lacking context about why it has access to a particular mailbox. So instead of taking shortcuts, we built a proper Microsoft 365 plugin from the ground up.

The plugin gives OpenClaw its own dedicated email address, its own calendar, its own task lists, and its own contacts. This is important. The agent isn’t borrowing someone else’s identity. It has its own workspace within the M365 ecosystem, which makes everything cleaner and more auditable.

Shared Access with Clear Boundaries

Beyond its own accounts, the EA also needs access to shared resources. It monitors my inbox and calendar, as well as several global group mailboxes: the info@ addresses, support queues, and similar shared accounts that every company has.

But here’s the key design decision: every shared account comes with explicit context. The plugin is configured so the agent has a clear understanding of what each resource is intended for, what’s shared versus what’s its own, and what purpose each shared account serves.

For example, when the agent processes an email from the info@ mailbox, it knows this is a public-facing address that receives inquiries from potential clients, partners, and vendors. It handles those messages differently than it would handle a message in my personal inbox, which might be about a board meeting or a private conversation. Without this contextual awareness, an AI agent with access to multiple mailboxes is just a disaster waiting to happen.
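To make the idea concrete, here is a minimal sketch of what a per-mailbox context registry might look like. The addresses, field names, and schema are illustrative assumptions, not the plugin's actual configuration format:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MailboxContext:
    address: str
    ownership: str    # e.g. "agent-owned", "shared", "monitored-personal"
    purpose: str      # plain-language description the agent reads
    default_tone: str

# Hypothetical registry; every mailbox the agent can touch is declared here.
MAILBOXES = {
    "info@example.com": MailboxContext(
        address="info@example.com",
        ownership="shared",
        purpose="Public-facing inbox for inquiries from clients, partners, and vendors.",
        default_tone="formal, representing the company",
    ),
    "exec@example.com": MailboxContext(
        address="exec@example.com",
        ownership="monitored-personal",
        purpose="Executive's personal inbox; may contain sensitive topics.",
        default_tone="match the existing thread",
    ),
}

def context_for(address: str) -> MailboxContext:
    """Look up the declared context before the agent touches a mailbox."""
    if address not in MAILBOXES:
        raise PermissionError(f"No context declared for {address}; refusing access.")
    return MAILBOXES[address]
```

The key property is the fail-closed lookup: a mailbox without declared context is a mailbox the agent refuses to work in.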

The Permission Layer

Microsoft 365 has a robust permissions system built in, and we use it. The agent’s access to shared resources follows the same permission levels that M365 provides. If a shared calendar is read-only for the EA, it stays read-only. If a mailbox grants send-on-behalf permissions, those are respected and nothing more.

This matters because it means the security model isn’t something we invented from scratch. It builds on the enterprise-grade permissions infrastructure that organizations already trust and maintain. Our plugin layer sits on top of this, adding the contextual intelligence that M365 alone can’t provide.
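The enforcement idea can be sketched as a simple gate that mirrors the tenant's grants in the plugin layer, so the agent can never exceed them. The permission names loosely follow Exchange concepts (read-only, send-on-behalf); the code itself is a hypothetical illustration:

```python
# Hypothetical mapping from an M365-style permission level to the actions
# the plugin will allow the agent to attempt.
ALLOWED_ACTIONS = {
    "read-only": {"read"},
    "send-on-behalf": {"read", "send_on_behalf"},
    "full-access": {"read", "send_on_behalf", "send_as", "move", "delete"},
}

def check_permission(granted_level: str, action: str) -> None:
    """Raise before the agent attempts anything the grant doesn't cover."""
    allowed = ALLOWED_ACTIONS.get(granted_level, set())
    if action not in allowed:
        raise PermissionError(
            f"Action '{action}' exceeds granted level '{granted_level}'"
        )
```

Because the gate is derived from the grants M365 already records, there is no second source of truth to drift out of sync.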

Communication Etiquette

Access to email is one thing. Knowing how to use it is another.

One of the biggest investments we made was in communication intelligence. We built and integrated a comprehensive communication etiquette system that teaches the agent how to adjust its tone, formality, and approach based on the situation.

The agent knows the difference between emailing a coworker about a project update, replying to the CEO’s family member who’s trying to coordinate a dinner, and participating in a group thread with external partners. It reads the room, or rather, it reads the recipient list and the conversation history, and adapts.

This extends to group conversations as well. If the agent is added to a thread, it assesses who’s participating, what the established tone is, and adjusts accordingly. If a new participant joins and the dynamics shift, the agent shifts too. If the conversation starts formal and drifts casual, it follows along naturally.

This isn’t hardcoded behavior with rigid rules for each scenario. We built a flexible etiquette framework with principles, examples, and guidelines that the agent applies contextually. Think of it less like a decision tree and more like training a new hire on your company culture. You give them the principles and trust them to apply good judgment.
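One way to picture "principles, not rules": the etiquette guidance is assembled into the drafting context handed to the model, rather than branched on in code. The guideline text and function names below are illustrative assumptions:

```python
# Hypothetical etiquette principles; in practice these would be richer and
# include worked examples, per the framework described above.
ETIQUETTE_PRINCIPLES = [
    "Match the formality already established in the thread.",
    "On first contact on the executive's behalf, introduce yourself "
    "and state clearly that you are an AI agent.",
    "Mirror shifts in tone when new participants join or the thread drifts.",
]

def build_drafting_context(thread_summary: str, recipients: list[str]) -> str:
    """Compose the context passed to the email-writing sub-agent."""
    lines = ["Communication etiquette principles:"]
    lines += [f"- {p}" for p in ETIQUETTE_PRINCIPLES]
    lines.append(f"Recipients: {', '.join(recipients)}")
    lines.append(f"Thread so far: {thread_summary}")
    return "\n".join(lines)
```

The model then applies the principles with judgment, the way a new hire would, instead of walking a decision tree.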

Importantly, email composition isn’t handled by the primary agent at all. We have a dedicated sub-agent whose entire job is writing emails. This sub-agent is an expert on our communication etiquette, the different tones and styles for different contexts, how emails should be formatted, and all the nuance that good email communication requires. For example, when the EA contacts someone on my behalf for the first time, the sub-agent knows it needs to introduce itself and make it clear that it’s an AI agent representing me. That’s the kind of detail that gets lost when a general-purpose agent is trying to juggle everything at once, but a specialist handles it naturally every time.

Here’s a real example of how all of this comes together. I’m driving to a meeting, but the previous meeting ran over and I’m going to be late. I can’t pull over to look up who I’m meeting, find their contact information, and write an appropriate email while I’m behind the wheel. So I send a voice message to the EA: “Let the people I’m meeting know I’m running late.” That’s it. The agent checks the calendar, identifies the meeting, finds the attendees, looks up their contact details, determines the appropriate tone based on who these people are, and sends an email on my behalf apologizing and letting them know I’ll be there soon. The whole thing takes less than a minute, and I never had to take my hands off the wheel. This is the kind of moment where the M365 integration, communication etiquette, and calendar awareness all come together and the EA genuinely earns its keep.

That said, let’s be realistic. The etiquette system works well most of the time, but “most of the time” is the key qualifier. The agent can nail the tone in a client email on Monday and then handle a nearly identical situation with a slightly different tone on Wednesday. There’s an inherent inconsistency in how LLMs apply guidelines, and it’s something we’re still working to tighten up. The plugin and framework give it strong guardrails, and within those guardrails the behavior is solid. But it’s not yet at the level where you can set it and completely forget about it.

Mailbox Organization as a Safety Net

Current AI agents have a weakness: context loss. If the agent’s memory or context gets disrupted (which can happen), it needs to recover quickly without dropping the ball.

Our solution was to build strict organizational processes around how the agent manages its mailboxes and calendar, backed by a dedicated sub-agent for inbox processing. Whenever a new message comes in, whether it’s an email, a calendar notification, or something else, it gets handed to this specialist first. The inbox sub-agent evaluates the message against its built-in rules and guidelines, decides what to do with it, and either handles it directly or escalates it to the primary agent. This means the primary agent isn’t spending its context and attention on every incoming notification. The specialist triages everything.

Every item is categorized by status: already processed, currently in process, or needs to be processed. Folders, labels, and flags all follow a consistent system.
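A minimal sketch of that status taxonomy, assuming illustrative folder names rather than the plugin's actual conventions:

```python
from enum import Enum

class Status(Enum):
    PROCESSED = "processed"
    IN_PROCESS = "in-process"
    NEEDS_PROCESSING = "needs-processing"

# Hypothetical folder mapping: the mailbox itself encodes the state.
FOLDER_FOR_STATUS = {
    Status.NEEDS_PROCESSING: "Inbox",           # untouched mail stays in Inbox
    Status.IN_PROCESS: "EA/In Progress",
    Status.PROCESSED: "EA/Done",
}

def recover_state(messages: dict[str, Status]) -> list[str]:
    """After context loss: list message ids still needing attention,
    reconstructed from folder state alone."""
    return [mid for mid, status in messages.items() if status is not Status.PROCESSED]
```

Because the state lives in the mailbox rather than in the agent's head, a fresh agent instance can rebuild its to-do list from folders alone.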

The result is that even if the agent loses context entirely, it can open its mailboxes and immediately understand the state of things. What’s been handled, what’s pending, what needs attention right now. It’s essentially an external memory system that doesn’t depend on the AI’s internal context staying intact.

This turned out to be one of the most important design decisions we made. In practice, context issues happen more often than you’d like, and having this organizational backbone means they’re inconveniences rather than catastrophes.

Promise Detection

One of the more clever capabilities that emerged from the M365 integration is promise detection. The EA continuously monitors my communications (emails, calendar notes, meeting summaries) and watches for commitments.

If I write to someone saying “let’s meet next Thursday at 2pm” but don’t create a calendar event, the EA catches it. It extracts the agreement, creates the calendar entry, and makes sure nothing falls through the cracks.
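In practice an LLM does the extraction, but a cheap deterministic pre-filter can flag candidate sentences for it. The patterns below are examples only, not an exhaustive or production list:

```python
import re

# Illustrative patterns for commitment-like phrases in outgoing mail.
COMMITMENT_PATTERNS = [
    re.compile(r"\blet'?s meet\b", re.IGNORECASE),
    re.compile(r"\bI('ll| will) (send|get back|follow up)\b", re.IGNORECASE),
    re.compile(r"\b(next|this) (monday|tuesday|wednesday|thursday|friday)\b",
               re.IGNORECASE),
]

def candidate_commitments(body: str) -> list[str]:
    """Return sentences that look like they contain a commitment."""
    sentences = re.split(r"(?<=[.!?])\s+", body)
    return [s for s in sentences if any(p.search(s) for p in COMMITMENT_PATTERNS)]
```

Flagged sentences then go to the model for real extraction (who, what, when), which keeps the expensive step focused on likely hits.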

This alone has saved us from missed meetings more times than I’d like to admit. It’s the kind of task that a human EA excels at, and now the AI does too, except it never gets tired or distracted.

Or at least that’s the idea. In reality, promise detection doesn’t catch everything. Sometimes the agent notices a commitment and acts on it perfectly. Other times, it reads the exact same kind of message and does nothing. There’s no clear reason why it triggers on one and misses the other. And when it misses, it fails silently. Nobody gets notified that something was skipped. If you also forget about it, the commitment just falls through the cracks, which is exactly what the system was designed to prevent.

This is the core design challenge of building on top of LLMs: they’re non-deterministic systems, and any architecture that assumes they’ll behave the same way every time is going to break. The system has to be designed from the ground up to account for the fact that the agent might handle the same input differently each time. That means building verification layers, fallback mechanisms, and monitoring that don’t rely on the agent itself being reliable. We haven’t fully solved this for promise detection yet, but the path forward is clear: the architecture around the LLM needs to be robust enough that it doesn’t matter if the agent occasionally misses something, because the system catches it anyway.
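One shape such a verification layer could take, sketched with hypothetical names: every flagged commitment is written to a ledger, and a periodic deterministic job checks each entry against the calendar instead of trusting the agent to have acted:

```python
def unresolved_commitments(ledger: list[dict],
                           calendar_subjects: set[str]) -> list[dict]:
    """Return ledger entries with no matching calendar event, for escalation.

    Exact subject matching is a simplification; a real check would use
    fuzzier matching on participants and dates.
    """
    return [c for c in ledger if c["subject"] not in calendar_subjects]
```

The point is that the safety net is ordinary code with predictable behavior: a miss by the agent becomes a visible escalation instead of a silent failure.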

What We Learned

Building this integration taught us several things. First, giving an AI agent access to enterprise tools requires the same rigor you’d apply to onboarding a human employee. Maybe more. Clear permissions, clear context, clear boundaries.

Second, communication intelligence isn’t optional. An agent that can access email but doesn’t know how to write appropriately is worse than no agent at all.

Third, building resilience against context loss from day one saved us enormous headaches later. If you’re building anything similar, don’t treat this as an afterthought.

And fourth, be prepared for the agent to forget things you’ve agreed on. We’ve had situations where we explicitly told the agent to change how it handles a certain type of email. It confirms, does it correctly once, and then reverts to the old behavior the next time. It’s genuinely surprised when you point out the change was agreed upon. This is a core limitation of current LLMs: agreements made in conversation don’t reliably persist into future behavior unless they’re encoded into the actual framework or plugin configuration.

In the next post, we’ll cover Micro Apps, the system we built to give our EA the ability to create, manage, and work with custom applications on the fly, without waiting on development cycles.

Klemens Arro

Author

Leading the AI Lab as CEO, Klemens writes to demystify what happens behind the code. He connects high-level strategy with the curiosity that drives the industry forward, all while keeping the robots in check.