Who starts the work?

this+that team

Two paradigms for AI assistants

Most AI assistants today are built for the same moment. You have a goal in your head, and you ask the agent to execute it. Write the quarterly review. Draft the reply. Schedule the meeting. Summarize the thread. The user provides intent; the agent provides execution. This is the paradigm Claude, ChatGPT, Microsoft Copilot, and most enterprise assistants are optimized for.

The other half of the workday looks different. A colleague emails about a project. A customer asks a question. A calendar invite arrives. A Slack mention pings. You have not formed intent yet; the work simply showed up. Most knowledge workers spend more time in this mode than in the goal-driven one. The agent’s job here is different too: monitor what is arriving, judge what matters, surface what needs you, and act on the rest with guardrails.

We call these the user-initiated and externally-initiated paradigms. They are not the same product.

Why externally-initiated is harder

Goal-driven agents are bounded. You tell them what to do, they do it, they stop. Externally-initiated agents have to do three things that goal-driven agents do not: monitor continuously across channels, judge importance without an explicit goal, and act safely when no one asked them to.

Each of those is a real engineering and trust problem. Continuous monitoring means real-time ingest from Gmail, Outlook, Slack, Teams, and any other channel where work arrives. Judgment about importance means a model trained on a lot of signal about what gets attention versus what gets ignored. Safe action without explicit prompts means trust gates and human-in-the-loop review on anything sensitive: sends, schedules, modifications, and anything else that crosses a domain boundary.

Most assistants are built for goals first because goals are easier.

The trust scaffolding that makes it work

If an agent is going to act on signals you did not trigger, the trust architecture has to be visible and tight. Ours has four layers, all shipped today.

  • Identity and domain gates. Rules that scope which senders count as clients, which domains can trigger which actions, which contacts get auto-replied to.
  • Reviewable plans. Before any non-trivial workflow runs, the agent shows the plan in plain language. You approve or revise before generation.
  • Drafts, not sends. On anything outbound (an email reply, a Slack message, a calendar invite), the agent prepares and waits. You send.
  • Provenance everywhere. Every task in the DoBox, every claim in the assistant, links back to the source message. You can verify without re-asking.

This is what makes externally-initiated agents trustworthy enough to use. Without it the failure modes are worse than the productivity gains.
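
As a rough illustration of how the first, third, and fourth layers might compose, here is a minimal sketch. It is not our production code: the rule table, the action names, and the `propose` helper are all hypothetical, chosen only to show a domain gate, a drafts-not-sends rule, and a provenance link in one place.

```python
from dataclasses import dataclass

# Hypothetical rule table: which sender domains may trigger which actions.
DOMAIN_RULES = {
    "client.example": {"draft_reply", "create_task"},
    "partner.example": {"create_task"},
}

# Hypothetical set of outbound actions. These are only ever prepared
# as drafts; a human does the sending.
OUTBOUND = {"draft_reply", "draft_invite"}


@dataclass
class Message:
    message_id: str
    sender: str
    body: str


@dataclass
class Proposal:
    action: str
    source_message_id: str  # provenance: link back to the triggering message
    status: str  # "blocked", "draft_awaiting_user", or "done"


def sender_domain(sender: str) -> str:
    """Return the lowercased domain part of an email address."""
    return sender.rsplit("@", 1)[-1].lower()


def propose(message: Message, action: str) -> Proposal:
    """Gate an action by sender domain, and hold outbound actions as drafts."""
    allowed = DOMAIN_RULES.get(sender_domain(message.sender), set())
    if action not in allowed:
        status = "blocked"  # identity and domain gate
    elif action in OUTBOUND:
        status = "draft_awaiting_user"  # drafts, not sends
    else:
        status = "done"  # internal, non-outbound action
    return Proposal(action, message.message_id, status)
```

In this sketch a reply to a known client domain stops at `draft_awaiting_user`, the same request from an unknown domain is `blocked`, and every proposal carries the ID of the message that triggered it.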

Lobster, Cowork, and where we differ

Microsoft’s Project Lobster and Copilot Cowork are excellent products for the user-initiated paradigm. Lobster has been articulated by its lead, Omar Shahine, as agents-as-new-hires. Trust earned and scoped. Graduated capabilities. That framing is right and we share it.

The difference is who starts the work. Lobster and Cowork excel when you bring the intent. We focus on the moments when intent comes from somewhere else: a message, a meeting, a system event. Both paradigms are valid. Most knowledge workers need both. The products that serve the two halves do not have to be the same product.

Where the paradigms converge

Two surfaces matter for both paradigms: the inbox and the desktop. The inbox is where externally-initiated work arrives and where most of the response happens. The desktop is where ambient monitoring and OS-level handoffs become possible.

We ship our AI-native inbox this month, along with the assistant page: chat, email cards, Slack threads, calendar invites, and draft compose, all in one product. The desktop is in flight, on the same surface bet Microsoft has made with Lobster and Cowork.

If you think of AI assistants as a two-paradigm category instead of a single product type, the building blocks of the future stack become clearer. A goal-driven agent for when you have intent. An externally-initiated agent for when work finds you. Shared surfaces, inbox and desktop, where they meet. We are building the second half of that picture.