Receiving email is the part nobody warns you about
Sending email is a solved problem. Receiving it — turning real-world MIME into something your app can use — is where everyone quietly loses a week. Here's why inbound is the hard direction, how each option punts, and what one clean webhook looks like instead.
Sending email is a solved problem. Pick a provider, call one endpoint, a message goes out, you feel productive. There’s a whole generation of lovely send APIs now, and they earned the praise.
Receiving email is where the ground gives way.
The moment a feature needs the other direction — a reply that reopens a support ticket, an inbound attachment to process, an “email us your receipt” address, an agent that reads its own mail — you leave the paved road. You’re suddenly responsible for parsing arbitrary MIME produced by thirty years of mail clients that agree on nothing. And most providers hand you that swamp as an afterthought, bolted onto their send product.
This is the post I wish I’d read before I started.
Why inbound is the hard direction
When you send, you control the message. You build clean data, hand it to a library, and it produces well-formed MIME. When you receive, you inherit whatever thirty years of email clients, marketing tools, and someone’s Exchange server from 2009 decided to emit. You don’t get to reject it for being ugly. You have to parse it.
And real-world MIME is genuinely nasty:
- It’s a tree, not a string. A single email is multipart/alternative (text + HTML) wrapped in multipart/mixed (body + attachments), sometimes wrapped again for signatures. You walk the tree to find “the body.”
- Encodings everywhere. quoted-printable, base64, and charset labels that lie. The classic symptom: a £ or an emoji turns into mojibake such as Â£ because something decoded UTF-8 bytes as Latin-1.
- Attachments are inline base64 stuffed into the same payload, so a 10 MB PDF becomes a ~13 MB string you now have to pull apart, decode, and store.
- Headers are trivially forged. From: is just text. Anyone can put anything there. Trust it and you’ve built a spoofing hole. You need SPF/DKIM/DMARC results to know if the sender is real, and computing those is its own project.
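To make the first three bullets concrete, here is a minimal Node.js sketch. The node shape (contentType, children, body, filename) and the helper names findBody and base64Length are hypothetical, chosen for illustration rather than taken from any real parser's output.

```javascript
// Hypothetical, already-parsed MIME tree; real parsers use their own shapes.
const message = {
  contentType: "multipart/mixed",
  children: [
    {
      contentType: "multipart/alternative",
      children: [
        { contentType: "text/plain", body: "Looks good, approved!" },
        { contentType: "text/html", body: "<p>Looks good, approved!</p>" },
      ],
    },
    { contentType: "application/pdf", filename: "po.pdf", body: "<base64 data>" },
  ],
};

// Depth-first walk: return the first non-attachment leaf of the wanted type.
function findBody(node, wantedType) {
  if (node.children) {
    for (const child of node.children) {
      const hit = findBody(child, wantedType);
      if (hit !== undefined) return hit;
    }
    return undefined;
  }
  if (node.filename) return undefined; // assumption: named leaves are attachments
  return node.contentType === wantedType ? node.body : undefined;
}

// Charset mismatch: "£" is two bytes in UTF-8 (0xC2 0xA3).
const poundBytes = Buffer.from("£", "utf8");
const decodedWrong = poundBytes.toString("latin1"); // each byte becomes its own character
const decodedRight = poundBytes.toString("utf8");

// base64 maps every 3 input bytes to 4 output characters (before line breaks).
function base64Length(byteCount) {
  return Math.ceil(byteCount / 3) * 4;
}

console.log(findBody(message, "text/html"));
console.log(JSON.stringify(decodedWrong), "vs", JSON.stringify(decodedRight));
console.log(base64Length(10 * 1024 * 1024), "base64 characters for a 10 MiB file");
```

The base64 figure is a lower bound: MIME bodies also insert a line break every 76 characters, which is how a 10 MB attachment ends up as roughly 13 MB or more on the wire.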
None of this is your app’s problem. It’s plumbing. But it sits directly between you and the one thing you wanted: the contents of the email.
This isn’t a niche gripe, either. The founder of Mailgun once described building the product specifically to eliminate MIME for developers — because of how much malformed, broken MIME arrives over SMTP. When the people who ran a mail company for a living call the format the enemy, believe them.
The incumbents each punt in a different way
Every existing option leaves you holding some version of the bag:
- SendGrid Inbound Parse posts you multipart/form-data: a form upload, not JSON. You run a form parser, pull headers back out of string fields, and handle attachments as file parts. Developers hit the same walls repeatedly: undocumented payload shapes you reverse-engineer with test emails, attachments arriving corrupt or mislabeled, and the £-becomes-garbage encoding bug as a rite of passage. Worst of all, if your endpoint returns a 4xx or has a DNS hiccup, it drops the email immediately, with no retry.
- Mailgun Routes exists because MIME is awful, but now you maintain a rule engine: match_recipient("support@myapp\.ai"), expression syntax you keep in sync with your app. Inbound also moved behind a paywall, and its convenience fields (like stripped body text) have a habit of quietly eating real message content.
- Cloudflare Email Routing is a great edge and free, but there’s no native parse-to-webhook. You write an Email Worker and parse the raw MIME yourself, inside a CPU budget that a big base64 attachment will blow through, against a 25 MB limit. It’s a fine trigger/router; it is not a “parsed message as JSON” API. And it was built to forward, not to let your app reply.
- Resend and other send-first tools added inbound, but the webhook is typically metadata-only — you get an event, then make a second API call to fetch the body and attachments. Two round-trips to read one email.
- Self-hosting Postfix or Haraka lets you parse everything exactly how you want. You’ll also inherit deliverability as a full-time job. Fresh VPS IPs start life on blocklists, and even flawless SPF/DKIM/DMARC lands you in spam often enough that the consensus from people who’ve done it is simply: don’t.
Notice the shape. Every option makes you own the MIME parsing, own the retry/reliability story, make a second call, or own IP reputation — usually more than one. The “just receive an email” task quietly becomes: pick a lesser-evil vendor, learn its wrapper, re-implement decoding for the parts it botched, compute your own auth results, and figure out attachment storage. For a checkbox on a feature list.
What it should look like instead: one webhook, the whole message
We built MailKite’s inbound as a first-class product, not a bolt-on. When mail arrives at any address on your domain, we parse the entire MIME tree at the edge and POST you one webhook with the whole message already extracted as JSON:
{
  "id": "msg_2Hk9…",
  "type": "email.received",
  "from": { "address": "ada@example.com" },
  "to": [{ "address": "support@myapp.ai" }],
  "subject": "Re: invoice #1042",
  "text": "Looks good — approved!",
  "html": "<p>Looks good — approved!</p>",
  "threadId": "<a1b2c3@mail.example.com>",
  "auth": { "spf": "pass", "dkim": "pass", "dmarc": "pass", "spam": "ham" },
  "attachments": [
    {
      "filename": "po.pdf",
      "contentType": "application/pdf",
      "size": 18213,
      "url": "https://api.mailkite.dev/att/2Hk9…?exp=…&sig=…"
    }
  ]
}
Notice what’s not there: no MIME tree, no inline base64 blob, no charset guessing. text and html are already decoded — the £ is a £. Attachments are pulled out and handed to you as a short-lived signed url you fetch on demand, so a 13 MB PDF never rides along in your webhook body. Threading is resolved. And auth tells you up front whether SPF, DKIM, and DMARC passed.
That last field matters more than it looks. The moment your app does something with an inbound email — files a ticket, sends a reply, hands it to an agent — a forged From: becomes an authorization decision. Having auth in the payload means you decide how much to trust the sender without computing it yourself or trusting blindly.
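One way to turn that decision into code is a small policy function. This is a sketch under assumptions: only the "pass" and "ham" values appear in the sample payload above, so treating every other value as untrusted is a guess, and the senderTrust name and its trust/verify/reject tiers are illustrative choices rather than MailKite's documented behavior.

```javascript
// Hypothetical policy: map the auth block of an email.received event to a trust tier.
function senderTrust(event) {
  const auth = event.auth || {};
  // Default-deny: anything other than an explicit "ham" verdict is rejected (assumption).
  if (auth.spam !== "ham") return "reject";
  // DMARC pass means SPF or DKIM passed and aligned with the From: domain.
  if (auth.dmarc === "pass") return "trust";
  // SPF or DKIM alone authenticate some domain, not necessarily the one in From:.
  if (auth.spf === "pass" || auth.dkim === "pass") return "verify";
  return "reject";
}

const sample = {
  from: { address: "ada@example.com" },
  auth: { spf: "pass", dkim: "pass", dmarc: "pass", spam: "ham" },
};

console.log(sample.from.address, "->", senderTrust(sample));
```

A handler might let "trust" reopen a ticket directly, route "verify" to a confirmation step, and drop or quarantine "reject". Where those lines sit is a product decision; the point is that the decision is made from the auth results, not from the From: text.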
Receiving your first email
Point a domain at MailKite (add the MX record, verify), set a webhook URL, and inbound mail to any address on that domain gets parsed and POSTed to you. The whole handler is: verify the signature, then read the fields.
// Express
import express from "express";
import { MailKite } from "mailkite";

const SECRET = process.env.MAILKITE_WEBHOOK_SECRET;
const app = express();

// Capture the RAW body: verify the exact bytes, not a re-serialized object.
app.use("/hooks/mailkite", express.raw({ type: "application/json" }));

app.post("/hooks/mailkite", (req, res) => {
  // Recomputes the HMAC, compares in constant time, and rejects anything
  // outside the ±5-minute replay window.
  const sig = req.headers["x-mailkite-signature"];
  if (!MailKite.verifyWebhook(sig, req.body, SECRET)) {
    return res.sendStatus(401);
  }

  // express.raw() leaves req.body as a Buffer, so decode it before parsing.
  const event = JSON.parse(req.body.toString("utf8"));
  if (event.type === "email.received") {
    console.log("from", event.from.address, "·", event.subject);
    // event.text / event.html are already decoded.
    // …create a ticket, reply, store it, hand it to an agent.
  }

  res.sendStatus(200); // ack fast; do the heavy work out of band
});

app.listen(3000);
Two rules that save real debugging later. First, verify against the raw bytes: JSON round-tripping changes the bytes and breaks the HMAC. Second, ack fast: return 200, then do slow work in a queue, because a slow or failed response triggers a retry and you don’t want a duplicate delivery just because your handler took nine seconds. The same handler exists for Python, Ruby, Go, PHP, and Java; see the receiving docs and webhook security.
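Both rules are easy to demonstrate in isolation. The sketch below uses a generic HMAC-SHA256 over the body as a stand-in. It is not MailKite's actual signing scheme (use MailKite.verifyWebhook for that), the secret is made up, and the in-memory seen set and queue array are placeholders for durable storage. Deduplicating on the payload's id assumes a retried delivery carries the same id.

```javascript
// CommonJS require so the snippet runs as a plain Node script.
const { createHmac, timingSafeEqual } = require("node:crypto");

// Generic HMAC-SHA256 helper (illustrative; not the provider's real scheme).
function sign(bodyBytes, secret) {
  return createHmac("sha256", secret).update(bodyBytes).digest();
}

function matches(bodyBytes, secret, expectedDigest) {
  const actual = sign(bodyBytes, secret);
  return actual.length === expectedDigest.length && timingSafeEqual(actual, expectedDigest);
}

const secret = "whsec_example"; // hypothetical secret
// Bytes as they arrived on the wire; note the extra spacing.
const rawBody = Buffer.from('{ "id": "msg_1",  "type": "email.received" }', "utf8");
const signature = sign(rawBody, secret);

// Rule 1: parse-then-stringify yields different bytes, so the digest no longer matches.
const roundTripped = Buffer.from(JSON.stringify(JSON.parse(rawBody.toString("utf8"))), "utf8");
console.log("raw bytes verify:", matches(rawBody, secret, signature));
console.log("re-serialized verify:", matches(roundTripped, secret, signature));

// Rule 2: acknowledge immediately, queue the work, and drop repeats by id.
const seen = new Set(); // stand-in for a durable store
const queue = [];       // stand-in for a real job queue

function acceptDelivery(event) {
  if (seen.has(event.id)) return "duplicate";
  seen.add(event.id);
  queue.push(event);
  return "queued";
}

const event = JSON.parse(rawBody.toString("utf8"));
console.log(acceptDelivery(event), acceptDelivery(event));
```

In a real handler, acceptDelivery would run after signature verification and just before the 200 response, with a worker draining the queue separately.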
You get all of this across unlimited domains on the free tier — 3,000 messages a month, inbound and outbound sharing one quota, no daily cap, with automatic retries, HMAC-signed payloads, and graceful metered overage instead of a hard cutoff. Spin up an inbox for every side project; pay only when one takes off.
FAQ
What’s the hardest part of receiving email programmatically?
Parsing real-world MIME: emails are nested multipart trees with mixed encodings (quoted-printable, base64) and inline attachments, plus forgeable headers. Decoding all of that correctly — and computing SPF/DKIM/DMARC to know if the sender is genuine — is the work that surprises people. A good inbound API does it for you and hands you decoded text/html and an auth result.
Is SendGrid Inbound Parse the same as this?
It’s the closest incumbent, but it POSTs multipart form data, is known to mangle certain encodings and attachments, and drops the message with no retry if your endpoint returns an error. Here the message arrives already fully parsed as JSON, and failed deliveries retry.
How do I stop someone from spoofing an inbound email?
Don’t trust the From: header — it’s plain text. Use the SPF/DKIM/DMARC results in the auth object to decide how much to trust the sender, and always verify the webhook signature so you know the request genuinely came from your provider and not an attacker POSTing to your endpoint.
Do attachments come inline in the webhook?
Not by default — a large base64 attachment inline bloats every webhook and can blow request-size limits. Each attachment arrives as a short-lived signed url you fetch on demand. (Zero-retention and encrypted domains are the exception: those receive attachment content inline as base64, since there’s nothing stored to link to.)
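A sketch of fetching attachments on demand, assuming the url field can be retrieved with a plain GET and that Node 18+ global fetch is available. The downloadAttachments name and the injectable fetchImpl parameter are illustrative. The inline-base64 case for zero-retention domains is skipped because its field name isn't shown in this post.

```javascript
// Download each attachment while its signed URL is still valid.
// fetchImpl is injectable so the function can be exercised without network access.
async function downloadAttachments(event, fetchImpl = fetch) {
  const results = [];
  for (const att of event.attachments || []) {
    if (!att.url) continue; // e.g. inline delivery; handle separately
    const res = await fetchImpl(att.url);
    if (!res.ok) {
      throw new Error(`attachment ${att.filename} failed with HTTP ${res.status}`);
    }
    const bytes = Buffer.from(await res.arrayBuffer());
    results.push({ filename: att.filename, contentType: att.contentType, bytes });
  }
  return results;
}
```

Because the links are short-lived, it is safer to download from the queued job soon after the webhook arrives than to store the URL and fetch it later.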
Can Cloudflare Email Routing do this?
It gives you the raw message and leaves parsing to you inside a CPU-limited Worker, exactly where a big attachment will exhaust your budget. It’s a fine trigger/router; it is not a “parsed message as JSON” API, and it can’t reply from your domain.
Sending was never the hard part. Receiving is — and it’s the direction that unlocks support inboxes, reply-by-email, and agents that read their own mail. Point a domain at MailKite and receive your first parsed email in a few minutes.
Related: You can’t prompt your way out of prompt injection — how we designed an agent inbox that’s ACL-gated by design, so a fooled agent still can’t do damage.