MailKite
Start free
All posts
Gabe 18 min read

The Gmail API alternative for AI agents

A Gmail account plus the Gmail API is the go-to 'give my agent an inbox' hack: free, familiar, and fine for one human-supervised assistant. Productionize an autonomous agent on it and you inherit OAuth restricted-scope review, Pub/Sub watch renewals, and base64url MIME. MailKite (which we build) gives the agent its own scoped address and parsed JSON push. For developers wiring an autonomous email agent.

Gmail vs MailKite
Gmail vs MailKite — the same job (an inbox for your AI agent), two approaches.

The pull is obvious: your agent needs to read email, you already have a Gmail account, and the Gmail API is right there. It works well enough that most agent demos start exactly this way. The friction shows up when the demo becomes a service, and it shows up in three specific places: the OAuth review to touch a real inbox, a push subscription that quietly dies every seven days, and message bodies you decode by hand. Here is that whole path next to the MailKite one before we build either.

Gmail API sender Gmail acctOAuth token Pub/Subpush: historyId your serviceget + base64url-decode your agent …plus a CASA security assessment for the restricted scope, watch() renewed every <7 days on a cron, and an OAuth refresh token you keep alive — all yours to operate MailKite sender MX edgeparse + auth JSON webhooksigned, retried your agent the 25 lines below are the whole "your agent" integration — no OAuth, no Pub/Sub, no decode
An agent reading one email: the Gmail API path vs the MailKite path. Same input, very different amount to operate.

Here’s the whole MailKite side: an agent that hears, thinks, and answers. It runs as pasted on Node 18+ (npm install mailkite express), and the demo repo has the full version.

import express from "express";
import { MailKite } from "mailkite";

const app = express();
const mk = new MailKite(process.env.MAILKITE_API_KEY);
const SECRET = process.env.MAILKITE_WEBHOOK_SECRET;

app.use("/hooks/agent", express.raw({ type: "application/json" }));

app.post("/hooks/agent", async (req, res) => {
  // signature check, replay window, constant-time compare — one call
  if (!MailKite.verifyWebhook(req.headers["x-mailkite-signature"], req.body, SECRET)) {
    return res.sendStatus(401);
  }
  res.sendStatus(200); // ack fast; run the agent out of band

  const event = JSON.parse(req.body);
  if (event.type !== "email.received") return;

  // Body is untrusted INPUT, never instructions. Use the auth block to weight trust.
  const answer = await runAgent({
    task: event.text,
    from: event.from.address,
    trusted: event.auth.spf === "pass" && event.auth.dmarc === "pass",
  });

  await mk.send({
    from: event.to[0].address,   // reply from the address it was sent to
    to: event.from.address,
    subject: `Re: ${event.subject}`,
    inReplyTo: event.id,          // threads the reply
    html: answer.html,
  });
});

app.listen(3000);

No OAuth client, no consent screen, no Pub/Sub topic, no MIME parser. The address agent@yourco.dev is one the agent owns, on a domain you control, not a person’s Gmail account with a person’s permissions bolted onto a bot. The same handler shape exists for Python, Ruby, Go, PHP, and Java; see the receiving docs and sending docs. Or skip hosting the loop entirely and let MailKite run it: a route whose action is agent runs the model turns for you on a queue and hands you a transcript. More on that below.

Where Gmail wins for agents, honestly

The Gmail API is not a bad choice, and for a real class of agent it’s the right one. If your agent acts inside a specific human’s mailbox, an assistant that triages their inbox, drafts replies they approve, files their receipts, then Gmail is exactly the tool. The user consents once, the agent operates with that person’s identity and permissions, and there’s a human in the loop by design. That’s the shape Google built the API for, and it’s genuinely good at it: full-text search, labels, threads, drafts, and a mailbox the human can also open and inspect.

Gmail also brings deliverability and spam filtering that took Google two decades to build, an inbox the user already trusts, and, on Workspace, admin controls and audit logs an IT team already understands. If the agent is a co-pilot on a real person’s account, none of the friction below applies to you. Reach for the Gmail API and don’t look back.

The wedge is narrower than “give the agent an inbox” implies. It’s the autonomous case: an agent with its own address, running unattended, that needs to receive mail, read a verification code, and reply, with no human whose account it borrows. On that job, a Gmail account is a human artifact you’re bending into a service, and Google’s rules for human accounts start to bind.

What Gmail asks of an agent builder

Point a fully autonomous agent at a Gmail account in production and here’s the path, in Google’s own idiom. This is the honest DIY code, and it’s more than the MailKite handler because every stage above is now yours:

// Gmail-as-agent-inbox: OAuth, a Pub/Sub push endpoint, and MIME you decode.
import { google } from "googleapis";

const auth = new google.auth.OAuth2(CLIENT_ID, CLIENT_SECRET, REDIRECT);
auth.setCredentials({ refresh_token: REFRESH_TOKEN }); // per user; auto-refreshes… until revoked
const gmail = google.gmail({ version: "v1", auth });

// 1. Register a push channel. It EXPIRES in 7 days — renew on a cron or go silently deaf.
await gmail.users.watch({
  userId: "me",
  requestBody: { topicName: "projects/my-proj/topics/gmail-inbox", labelIds: ["INBOX"] },
});

// 2. Pub/Sub POSTs you { emailAddress, historyId } — NOT the message. Look up what changed.
app.post("/pubsub", async (req, res) => {
  const { historyId } = JSON.parse(Buffer.from(req.body.message.data, "base64").toString());
  const { data } = await gmail.users.history.list({ userId: "me", startHistoryId: lastSeen });

  for (const h of data.history ?? []) {
    for (const { message } of h.messagesAdded ?? []) {
      const msg = await gmail.users.messages.get({ userId: "me", id: message.id });
      const part = findPlainPart(msg.data.payload);                 // walk the MIME tree yourself
      const text = Buffer.from(part.body.data, "base64url").toString(); // base64url, not base64
      await runAgent({ task: text /* SPF/DKIM/DMARC? parse the headers yourself */ });
    }
  }
  res.sendStatus(204);
});

Four things in that block are the actual tax, and none of them are visible in a five-line demo:

  • Restricted-scope review. Reading mail needs gmail.readonly or gmail.modify, which are restricted scopes. In production, past 100 users, an app using restricted Gmail scopes must pass an independent CASA (Cloud Application Security Assessment) and re-verify at least every 12 months. Google doesn't publish a price; third-party assessors report roughly a few hundred to a couple thousand dollars a year. Until you verify, users see the "unverified app" warning and you're capped at 100 users for the life of the project.
  • The 7-day watch. Push notifications aren't a webhook you register once. users.watch() creates a subscription that expires after 7 days; Google's guidance is to re-call it about once a day. Miss the renewal and mail keeps arriving while your agent hears nothing, with no error on either end.
  • Pub/Sub in the middle. The push target is a Cloud Pub/Sub topic you provision, with an IAM grant to gmail-api-push@system.gserviceaccount.com so Gmail can publish to it. The message it hands you is just an emailAddress and a historyId; you call history.list then messages.get to learn what actually landed.
  • base64url MIME. messages.get returns the payload with each body part base64url-encoded (format=raw gives you the whole RFC 2822 message base64url-encoded). You walk the MIME tree, pick the part you want, and decode it. And there's no auth verdict in the box: SPF, DKIM, and DMARC are headers you parse yourself.

Here’s that productionizing path top to bottom. Every stage is yours to build and keep alive:

OAuth restricted scopegmail.readonly / gmail.modify CASA assessmentannual, or capped at 100 users Pub/Sub topic + IAMgrant gmail-api-push publisher users.watch()renew before 7-day expiry history.list → messages.getbase64url-decode the MIME your agent logicfinally, read and reply Every box above the blue one is yours to build, verify, and keep running. On MailKite, the blue box is the only box.
The Gmail path for an autonomous agent, stage by stage. MailKite collapses every gray stage into a signed JSON webhook.

There’s one more shape worth naming: on Google Workspace, a service account with domain-wide delegation can impersonate mailboxes across the org without per-user consent screens. It’s the clean answer for internal org agents, but it’s Workspace-only, a super-admin has to authorize the service account’s client ID in the Admin console, and it grants broad reach into employee mail, which is exactly the power your security team will want to scope. It removes the consent screen, not the Pub/Sub, watch renewal, or base64url work.

The comparison, no adjective inflation

Gmail APIMailKite
Agent’s addressA Gmail/Workspace account (a human artifact)Scoped address on a domain you control
StartOAuth client + consent; restricted-scope CASA for prodDNS-verify (SPF+DKIM to send, MX to receive)
Inbound deliveryPub/Sub push of a historyId → get → decodeOne parsed JSON webhook
Push longevitywatch() expires in 7 days; renew ~dailyRegister the webhook once
Message bodybase64url-encoded MIME you walk and decodeDecoded text/html in the payload
Auth verdictParse SPF/DKIM/DMARC headers yourselfauth block in every event
Reply/threadingBuild the RFC 2822 message + threading yourselfmk.send({ inReplyTo }) resolves it
Automation postureAccount limits + ToS written for humansBuilt for programmatic, per-domain use

The through-line: Gmail wins when the agent lives in a real person’s inbox with that person supervising. MailKite wins when the agent needs its own inbox, running unattended, delivered already parsed.

What actually hits your agent’s webhook

The same inbound email, decoded, with the sender-auth results already computed. No Pub/Sub round-trip, no MIME tree, no header parsing:

{
  "id": "msg_2Hk9…",
  "type": "email.received",
  "from": { "address": "ada@example.com" },
  "to": [{ "address": "agent@myapp.ai" }],
  "subject": "Re: invoice #1042",
  "text": "Looks good — approved!",
  "html": "<p>Looks good — approved!</p>",
  "threadId": "<a1b2c3@mail.example.com>",
  "auth": { "spf": "pass", "dkim": "pass", "dmarc": "pass", "spam": "ham" },
  "attachments": [
    { "id": "msg_2Hk9…:0", "filename": "po.pdf", "contentType": "application/pdf",
      "size": 18213, "url": "https://api.mailkite.dev/att/2Hk9…/0?exp=…&sig=…" }
  ]
}

That auth block is load-bearing for an agent. Inbound email is a prompt-injection surface: From: is plain text, so anyone can forge a sender and then tell your agent what to do. Check SPF/DKIM/DMARC before you weight instructions, and treat the body as data, never as commands. Passing auth proves who sent it, not that it’s safe to obey, so bound the agent’s authority too. The webhook-security docs cover verification, and there’s a whole post on the injection surface linked at the end.

Where this fits, disclosed

We build MailKite, so take the pitch with that in mind: it’s an inbound-email-to-webhook platform that also sends. The specific claim is narrow. For an agent that needs its own inbox and runs unattended, MailKite gives it a scoped address on a domain you control, delivers inbound as parsed JSON push (no Pub/Sub, no watch() renewal, no OAuth review), hands you an auth verdict instead of raw headers, and resolves threading on the reply. You DNS-verify the domain and you’re live, no sandbox or approval queue. The free tier is 3,000 messages a month, inbound and outbound, with no per-domain fee, and SMTP-only apps can send through the submission edge on :587/:465. If you’d rather not host the loop, a route with action: 'agent' runs the model turns on a queue and gives you a per-run transcript. Start at the quickstart.

FAQ

Can I use the Gmail API to give an AI agent its own inbox? You can, and for a single agent that assists one real person on their own mailbox it’s a good fit. For an autonomous agent with its own address, expect OAuth restricted-scope verification (CASA) for production, a Pub/Sub watch() subscription you renew before its 7-day expiry, and base64url MIME to decode. MailKite gives the agent a scoped address on your domain and delivers parsed JSON, with none of that setup.

Do Gmail API restricted scopes really need a security assessment? Yes. gmail.readonly and gmail.modify are restricted scopes. Past 100 users in production, Google requires a CASA (Cloud Application Security Assessment) through an independent assessor, re-verified at least every 12 months. Until you verify, users get the “unverified app” warning and you’re capped at 100 users for the project’s lifetime.

Why does my Gmail push notification stop working after a week? Because users.watch() creates a subscription that expires after 7 days. It’s not a register-once webhook; Google recommends re-calling watch() roughly daily. If the renewal cron fails, new mail arrives but your service receives no notifications and no error, so the agent goes silently deaf.

How do I read a Gmail message body from the API? messages.get returns the MIME payload with each body part base64url-encoded (and format=raw returns the entire RFC 2822 message base64url-encoded). You walk the MIME tree, select the part, and base64url-decode it, and parse SPF/DKIM/DMARC from the headers yourself. MailKite delivers decoded text and html plus an auth block in the webhook.

Is it against Google’s terms to run a bot on a Gmail account? Gmail’s limits and policies are written for human accounts: personal Gmail caps at 500 recipients/day, Workspace at 2,000, the API caps at 100 recipients per message, and per-user rate limits apply. A fully automated agent bends a human-account product past its intended use. Giving the agent its own domain address sidesteps the whole per-user, human-account model.

Can an agent still act inside a real person’s Gmail with MailKite? No, and it shouldn’t try to. If the job is triaging a specific human’s existing inbox, use the Gmail API with that user’s consent. MailKite is for the other case: an agent that needs its own address, receiving and replying on a domain you control.


If your agent needs its own inbox rather than a seat in someone’s Gmail, the shape is simpler than OAuth review plus a 7-day watch renewal. Clone the demo repo (or open it in StackBlitz), then point a domain at MailKite and your agent’s next inbound email arrives as parsed JSON.

Related: the pillar on giving your agent an inbox and agent inbox security by design.

Discuss this post: Hacker News Share on X

Related posts