WhatsApp Cloud API vs On-Premise API: Which Should You Use in 2026?
May 21, 2026
In this article: The 60-second answer · Two architectures · Full comparison table · The deprecation timeline · Throughput & rate limits · Real cost breakdown · The inbound webhook nobody talks about · What changes when you migrate · FAQ
The 60-second answer
In 2026, this is not a choice. The WhatsApp On-Premise API hit official end-of-life on October 23, 2025. Meta closed new number registrations on January 15, 2024, shipped the final security patch on the end-of-life date, and now treats the On-Premise stack as unsupported software. If you are still running a self-hosted Docker container, you are on dead infrastructure.
The WhatsApp Cloud API, where Meta hosts the server, handles scaling, ships patches, and pushes inbound messages to your webhook URL, has been the default for every new WhatsApp Business number since mid-2022. It is cheaper, faster to provision, offers higher throughput, and gets new features first. There is no scenario in 2026 where On-Premise is the correct answer for a new integration.
On-Premise EOL: Oct 23, 2025 · No new patches, ever · Cloud API: default since 2022 · Cloud API infra cost: $0 · Cloud throughput: 1,000 msg/s · On-Premise peak: ~30–50 msg/s
Two architectures, one platform name
Both the Cloud API and On-Premise API are technically part of the "WhatsApp Business Platform." The branding is identical. The architecture is completely different — and understanding the architecture is the only way to understand why the Cloud API won.
WhatsApp Cloud API (Meta-hosted)
Your application sends an HTTP POST to https://graph.facebook.com/v21.0/{phone_number_id}/messages. Meta receives it, routes the message through its WhatsApp backend, delivers it to the customer, and pushes delivery status and inbound replies back to your registered webhook URL. You never touch a server. Authentication is a permanent access token plus HMAC-SHA256 payload signatures. Media is hosted on Meta's CDN with 90-day retention.
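To make the outbound call concrete, here is a minimal sketch that builds the POST request described above using only Python's standard library. The helper name `build_text_message_request` and the placeholder values are illustrative assumptions, and the body shows a plain text message only; check Meta's current reference for other message types.

```python
import json
import urllib.request

# API version copied from the URL quoted above; adjust to the version you target.
GRAPH_BASE = "https://graph.facebook.com/v21.0"


def build_text_message_request(phone_number_id: str, access_token: str,
                               to: str, body: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for a plain text message."""
    payload = {
        "messaging_product": "whatsapp",
        "to": to,
        "type": "text",
        "text": {"body": body},
    }
    return urllib.request.Request(
        url=f"{GRAPH_BASE}/{phone_number_id}/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder values (hypothetical); substitute your own ID and token.
    req = build_text_message_request("PHONE_NUMBER_ID", "ACCESS_TOKEN",
                                     "447700900123", "Hello from the Cloud API")
    print(req.get_method(), req.full_url)
    # To actually send it: urllib.request.urlopen(req, timeout=10)
```

Keeping request construction separate from sending makes the payload easy to inspect and test before any traffic reaches Meta.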
WhatsApp On-Premise API (self-hosted, deprecated)
You — or your BSP — run two Docker images published by Meta (whatsapp/coreapp and whatsapp/web) on your own infrastructure. The coreapp connects outbound to Meta's relay, caches messages in a MySQL database, and exposes a REST API on port 443. Your team manages certificate rotation, MySQL replication, media storage, backups, and every security patch Meta releases. The minimum hardware spec was 4 vCPU and 8 GB RAM per phone number.
If you're still on On-Premise: The last security patch shipped on October 23, 2025. No future patches are planned. You are running software with known, unpatched vulnerabilities, zero feature updates, and no Meta support. Migration is not optional — it is overdue.
Full comparison table
| Dimension | Cloud API | On-Premise API |
| --- | --- | --- |
| Status in 2026 | Active (default) | EOL Oct 23, 2025 |
| Who hosts the server | Meta | You or your BSP |
| Hardware footprint | None | 4 vCPU + 8 GB RAM minimum per number |
| Setup time (via BSP) | Under 10 minutes | 2–6 weeks (container deploy + cert install) |
| Authentication | Access token + HMAC-SHA256 | Client certificate + admin bearer token |
| Throughput (default) | 80 msg/s per number | 30–50 msg/s (hardware-limited) |
| Throughput (max) | 1,000 msg/s on request | 80 msg/s with multi-node + load balancer |
| Inbound webhooks | HTTP POST to your URL, Graph API format | HTTP POST to your URL, On-Premise REST format |
| Security patches | Pushed by Meta automatically | Manual pull-and-deploy; none since Oct 2025 |
| Media hosting | Meta CDN, 90-day retention | Your storage, your retention policy |
| New features (Flows, etc.) | Day one | Backported late or never |
| Infrastructure cost/month | $0 | $400–$1,200 per phone number |
| Per-conversation cost | Identical Meta pricing grid | Identical Meta pricing grid |
| Graph API unification | Yes (shared surface with FB + IG) | No (separate REST surface) |
| Data residency | Meta data centers (US/EU) | Your choice |
| New number registration | Open | Closed since Jan 15, 2024 |
The deprecation timeline
Most comparison articles gloss over the dates. These dates are the whole story. If you're still on On-Premise, knowing exactly when each door closed tells you how urgent your situation is.
August 1, 2018
On-Premise API launched
WhatsApp Business API ships with On-Premise only. Self-hosted Docker containers required. The enterprise WhatsApp story begins.
May 19, 2022
Cloud API reaches general availability
Announced at Meta's Conversations event. Cloud API becomes available to all businesses. Meta immediately recommends Cloud as the preferred integration path for all new numbers.
October 27, 2023
On-Premise API officially deprecated
Meta publishes the deprecation notice. End-of-life date set for October 23, 2025. The clock starts. Every On-Premise operator is put on a mandatory migration timeline.
January 15, 2024
New number registrations on On-Premise closed
Meta stops accepting new WhatsApp Business numbers on the On-Premise stack. All new numbers must use Cloud API. Existing On-Premise numbers can still operate — temporarily.
October 23, 2025
Official end-of-life: last security patch
The final security patch shipped. On-Premise containers are now unsupported software. No future patches, no bug fixes, no feature backports. Any On-Premise operator still running in 2026 is running on a frozen, unpatched codebase.
2026 and beyond
Cloud API: the only supported path
Cloud API is the sole supported WhatsApp Business API integration. WhatsApp Flows, Calling API (launched July 2025), new interactive message types — all land on Cloud first, and only on Cloud going forward.
Throughput and rate limits: the numbers that actually matter
Daily messaging tiers are identical between Cloud and On-Premise — they are set by Meta's phone number quality scoring system, not by which API you use. Both APIs share the same four tiers:
Tier 1: 1,000 unique business-initiated customers per 24 hours (default for all new numbers)
Tier 2: 10,000 unique customers per 24 hours
Tier 3: 100,000 unique customers per 24 hours
Tier 4: Unlimited per 24 hours
Session messages (replies within a customer-opened 24-hour window) have no daily cap at any tier. Where Cloud API dramatically outperforms On-Premise is per-second throughput:
Cloud API: 1,000 msg/s per phone number (on request). 80 msg/s out of the box, scalable to 1,000 msg/s on request for high-volume senders. Meta absorbs the load; no infra changes are required on your side.
On-Premise API: 30–50 msg/s practical limit (single node). The 80 msg/s cap applies in theory, but single-node Docker on a 4 vCPU VM typically saturates at 30–50 msg/s. Reaching 80 msg/s requires a MySQL replica, a multi-node setup, and a load balancer.
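For a Black Friday campaign sending 100,000 messages in a one-hour window, the Cloud API's default 80 msg/s clears the queue in about 21 minutes with no configuration. A single On-Premise node at 30 msg/s needs roughly 56 minutes, which leaves almost no headroom for retries or a second burst. Getting more out of On-Premise requires engineering work upfront (multi-node deployment, MySQL replication, a load balancer in front of the coreapp) and still tops out at the 80 msg/s ceiling, far below the 1,000 msg/s the Cloud API offers on request.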
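The Black Friday scenario is simple arithmetic, and it is worth running with your own numbers. The sketch below uses the throughput figures quoted in this section; the function name `seconds_to_send` is illustrative, and the calculation assumes a perfectly steady send rate with no retries or throttling.

```python
def seconds_to_send(total_messages: int, rate_per_second: float) -> float:
    """Time to drain a queue at a constant send rate (idealized, no retries)."""
    if rate_per_second <= 0:
        raise ValueError("rate_per_second must be positive")
    return total_messages / rate_per_second


# Throughput figures quoted in this section, in messages per second.
RATES = {
    "On-Premise single node (low end)": 30,
    "On-Premise single node (high end)": 50,
    "Cloud API default": 80,
    "Cloud API on request": 1000,
}

if __name__ == "__main__":
    campaign_size = 100_000
    for label, rate in RATES.items():
        minutes = seconds_to_send(campaign_size, rate) / 60
        print(f"{label}: {minutes:.1f} min")
```

With these inputs, the low-end On-Premise node takes most of the hour, while the Cloud API default finishes in roughly a third of it.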
What you actually pay
Meta's per-conversation pricing is identical for both APIs. The difference is entirely in fixed infrastructure costs. Most comparison articles skip this part because it's unflattering for BSPs still running On-Premise stacks.
Cloud API — monthly cost per number
Server hosting: $0
Database (MySQL): $0
Certificate management: $0
Security patching / DevOps: $0
Media storage: $0 (Meta CDN)
Meta conversation fees: per volume
Fixed infra overhead: $0 / month
On-Premise API — monthly cost per number
Server hosting (4 vCPU / 8 GB): $150–$400
MySQL (replicated): $80–$200
TLS cert management: $20–$60
Engineering on-call: $150–$400
Media storage / CDN: $30–$140
Meta conversation fees: per volume
Fixed infra overhead: $430–$1,200 / month
These infrastructure costs apply per phone number. An agency managing 10 client WhatsApp numbers on On-Premise was burning $4,300–$12,000/month before paying Meta a single conversation fee. Cloud API eliminates that entire line item.
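If you want to check the totals or plug in your own line items, the sum is easy to reproduce. This sketch uses the per-number ranges from the On-Premise breakdown above; the dictionary keys and the `fixed_overhead` helper are illustrative names, not part of any API.

```python
from __future__ import annotations

# (low, high) monthly cost in USD per phone number, from the breakdown above.
ON_PREMISE_MONTHLY = {
    "server_hosting": (150, 400),
    "mysql_replicated": (80, 200),
    "tls_cert_management": (20, 60),
    "engineering_on_call": (150, 400),
    "media_storage_cdn": (30, 140),
}


def fixed_overhead(phone_numbers: int) -> tuple[int, int]:
    """Return the (low, high) monthly infrastructure cost for N phone numbers."""
    low = sum(lo for lo, _ in ON_PREMISE_MONTHLY.values())
    high = sum(hi for _, hi in ON_PREMISE_MONTHLY.values())
    return low * phone_numbers, high * phone_numbers


if __name__ == "__main__":
    for n in (1, 10):
        low, high = fixed_overhead(n)
        print(f"{n} number(s): ${low:,} to ${high:,} per month")
```

The ten-number case reproduces the agency figure quoted above, before any Meta conversation fees.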
The piece every comparison article misses: inbound webhooks on the Cloud API
Every article about Cloud API vs On-Premise focuses on outbound — sending campaigns, templates, broadcasts. Almost none of them explain what happens to inbound messages on the Cloud API, and this is where most teams hit problems after migrating.
When a customer sends a message to your WhatsApp Business number connected via the Cloud API, Meta's infrastructure fires an HTTP POST request to your registered webhook URL. This happens within milliseconds. The payload is a structured JSON object in Meta's Graph API format. Your server — or a webhook platform — must be:
Publicly reachable over HTTPS at all times
Capable of responding with HTTP 200 within 20 seconds (otherwise Meta retries)
Verifying the HMAC-SHA256 signature in the X-Hub-Signature-256 header before processing
Parsing the Graph API payload format — which is different from the On-Premise REST format
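The signature check is the requirement teams most often get wrong, so here is a minimal sketch of it. It assumes the header value has the form `sha256=<hex digest>` and that the HMAC key is your Meta app secret; the function name `is_valid_signature` is illustrative. Compute the digest over the raw request bytes, before any JSON parsing.

```python
import hashlib
import hmac


def is_valid_signature(app_secret: str, raw_body: bytes, header_value: str) -> bool:
    """Check an X-Hub-Signature-256 header against the raw request body."""
    prefix = "sha256="
    if not header_value or not header_value.startswith(prefix):
        return False
    expected = hmac.new(app_secret.encode("utf-8"), raw_body,
                        hashlib.sha256).hexdigest()
    received = header_value[len(prefix):]
    # Constant-time comparison on bytes avoids leaking timing information.
    return hmac.compare_digest(expected.encode("utf-8"),
                               received.encode("utf-8"))
```

Reject the request (for example with HTTP 401) when this returns False, and only then hand the body to your parser.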
Here is what a Cloud API inbound message payload looks like when normalized through SocialHook's WhatsApp webhook integration — delivered to your endpoint in a consistent format regardless of message type:
cloud-api-inbound-payload.json — delivered <50ms via SocialHook
{
  "platform": "whatsapp",
  "event": "message.received",
  "timestamp": 1747123456,
  "from": "+44 7700 900 123",
  "conversation_id": "conv_4m9x...",
  "message": {
    "type": "text",
    "body": "I just moved to the Cloud API. When does onboarding start?",
    "id": "wamid.HBgL..."
  },
  "delivery": {
    "status": "delivered",
    "latency_ms": 38
  }
}
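On the receiving side, a handler for this normalized event can stay small. The sketch below reads only the fields shown in the sample payload; the `InboundMessage` class and `parse_event` function are illustrative names, and whether every event type carries a `body` field is an assumption, so the code falls back to an empty string.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class InboundMessage:
    platform: str
    event: str
    sender: str
    message_type: str
    body: str
    message_id: str


def parse_event(raw: str) -> InboundMessage:
    """Parse a normalized inbound event using the field names in the sample."""
    data = json.loads(raw)
    message = data["message"]
    return InboundMessage(
        platform=data["platform"],
        event=data["event"],
        sender=data["from"],
        message_type=message["type"],
        # Assumption: non-text messages may omit "body".
        body=message.get("body", ""),
        message_id=message["id"],
    )
```

A frozen dataclass keeps the parsed event immutable, which is convenient when the same object is passed to several downstream handlers.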
SocialHook's role in a Cloud API stack: Meta's Cloud API delivers inbound events to a single endpoint. SocialHook sits between Meta and your server — handling webhook verification (HMAC-SHA256), normalizing the Graph API payload format, adding automatic 3x retry with exponential backoff if your server returns a non-200, and logging every delivery with full status history. Your server receives a clean, consistent JSON event. It does not matter whether the message came from WhatsApp, Facebook Messenger, or Instagram DMs — the payload format is identical across all three. One endpoint. Three channels. Flat $50/month.
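For readers who want to see what retry with exponential backoff means in practice, here is a generic sketch. It is not SocialHook's implementation: the one-second base delay, the doubling factor, and the reading of "3x" as three retries after the first attempt are all assumptions, and `deliver_with_retry` is an illustrative name.

```python
import time
from typing import Callable


def deliver_with_retry(send: Callable[[], int],
                       max_retries: int = 3,
                       base_delay: float = 1.0,
                       sleep: Callable[[float], None] = time.sleep) -> bool:
    """Call send() once, then retry up to max_retries times on a non-200 status.

    send() is expected to return the HTTP status code of one delivery attempt.
    """
    for attempt in range(max_retries + 1):
        if send() == 200:
            return True
        if attempt < max_retries:
            # Delay doubles each time: 1s, 2s, 4s with the default settings.
            sleep(base_delay * (2 ** attempt))
    return False
```

Passing `sleep` as a parameter lets you test the schedule without waiting, and makes it easy to add jitter later.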
What actually changes when you migrate from On-Premise to Cloud API
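Migrating your phone number from On-Premise to Cloud API is supported and preserves your number, so customers notice no change. Your server-side code, however, needs updates in four places (authentication, inbound payload parsing, media download, and the outbound endpoint) before you shut down the old infrastructure. Most migration guides bury these in footnotes.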
1. Authentication method (breaking change)
On-Premise used client TLS certificates and an admin bearer token. Cloud API uses a permanent access token for outbound and HMAC-SHA256 for inbound payload verification. You need to update your auth headers and implement the signature verification check on your webhook handler before going live.
2. Inbound payload structure (breaking change)
The On-Premise REST API and the Cloud API Graph API use different JSON structures for inbound webhook events. Field names, nesting, and metadata differ. Your payload parser will need a full rewrite — or you use a normalization layer like SocialHook that abstracts this away from your application code entirely.
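If you write the parser yourself, most of the work is unwrapping the nesting. The sketch below assumes the Cloud API webhook layout as documented by Meta at the time of writing (`entry` → `changes` → `value` → `messages`, with the timestamp sent as a string); verify it against real payloads from your own number. The function name `extract_messages` and the flat output shape are illustrative choices.

```python
from typing import Any, Dict, List


def extract_messages(payload: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Flatten inbound messages out of a Cloud API webhook payload."""
    flattened: List[Dict[str, Any]] = []
    for entry in payload.get("entry", []):
        for change in entry.get("changes", []):
            if change.get("field") != "messages":
                continue
            value = change.get("value", {})
            # Status-only notifications carry no "messages" key and are skipped.
            for msg in value.get("messages", []):
                flattened.append({
                    "from": msg.get("from"),
                    "id": msg.get("id"),
                    "type": msg.get("type"),
                    "body": msg.get("text", {}).get("body"),
                    "timestamp": int(msg["timestamp"]) if "timestamp" in msg else None,
                })
    return flattened
```

Using `.get()` with defaults at every level means an unexpected or partial payload yields an empty list rather than an exception in your webhook handler.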
3. Media download logic
On-Premise stored incoming media on your server. Cloud API stores media on Meta's CDN with a 90-day retention policy. When a customer sends an image or document, your webhook payload contains a media ID. You must call the Cloud API to retrieve the download URL, then fetch the file. Existing code that reads from a local path will break.
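The two-step media flow looks like this in outline. The sketch assumes the media lookup returns a JSON object with a `url` field and that both requests use the same bearer token; `download_media` and `_http_get` are illustrative names, and the API version matches the one used earlier in this article.

```python
import json
import urllib.request
from typing import Callable

GRAPH_BASE = "https://graph.facebook.com/v21.0"


def _http_get(url: str, access_token: str) -> bytes:
    """GET a URL with a bearer token and return the raw response body."""
    req = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {access_token}"})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return resp.read()


def download_media(media_id: str, access_token: str,
                   http_get: Callable[[str, str], bytes] = _http_get) -> bytes:
    """Resolve a media ID to its download URL, then fetch the file bytes."""
    # Step 1: look up the media ID to get the download URL.
    meta = json.loads(http_get(f"{GRAPH_BASE}/{media_id}", access_token))
    # Step 2: fetch the file itself from that URL with the same token.
    return http_get(meta["url"], access_token)
```

The `http_get` parameter exists so you can substitute a fake in tests or swap in your preferred HTTP client without changing the flow.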
4. Outbound API endpoint
On-Premise exposed a REST API on port 443 of your container. Cloud API uses the Meta Graph API endpoint at graph.facebook.com. The request format, authentication header, and URL structure all differ. Update your HTTP client configuration and request bodies accordingly.
5. Shut down infrastructure
After migration and validation, decommission your Docker containers, MySQL instance, and associated monitoring. This is where you reclaim the $400–$1,200/month. Verify no application code still points to the old local endpoint before decommissioning.
Is On-Premise ever the right answer in 2026?
We read every argument for keeping On-Premise in 2026. Here is the honest analysis:
Data residency requirements
The strongest historical argument for On-Premise was data sovereignty — keeping message metadata within your own VPC, in a specific jurisdiction. Cloud API sends data through Meta's infrastructure (US and EU data centers). For companies under certain government or financial regulations, this was a genuine constraint. In 2026, most of these scenarios are resolved through Meta's GDPR-compliant EU data residency settings, or through Meta's Enterprise API tiers with explicit Data Processing Agreements. The data residency argument rarely survives legal review anymore — and it doesn't survive the security argument of running unpatched software.
Legacy BSP stacks
Some BSPs have not finished migrating their entire On-Premise fleet to Cloud API. If your WhatsApp stack is managed by a BSP and they haven't moved you yet, push them. Hard. They are operating unsupported software on your behalf. Any security incident on their On-Premise stack post-October 2025 is their liability — and potentially yours.
The verdict
There is no new-build scenario in 2026 where On-Premise is the correct choice. For existing On-Premise deployments with data residency constraints — consult your legal team about Meta's DPA terms and EU data center options, then migrate as quickly as that review permits. Running unpatched software is not a valid compliance posture for any regulated industry.
If you're building a new WhatsApp integration today: Start with the Cloud API. Use a BSP or webhook platform for setup. Get your phone number registered, your webhook endpoint configured, and your first inbound message flowing to your server. The entire process — including connecting to SocialHook so your webhook receives clean JSON — takes under 15 minutes. See the quickstart guide.
FAQ
Common questions
Is the WhatsApp On-Premise API still supported in 2026?
No. Meta officially deprecated it on October 27, 2023, and the final end-of-life date was October 23, 2025. No security patches have shipped since that date, no new phone numbers can be registered on On-Premise (that closed January 15, 2024), and no new features will ever be backported. Anyone still running a self-hosted container in 2026 is on unsupported, unpatched software.
What is the WhatsApp Cloud API and how does it differ from On-Premise?
The WhatsApp Cloud API is Meta's hosted version of the WhatsApp Business Platform. You call graph.facebook.com to send messages — Meta runs all the infrastructure. On-Premise required you to run Docker containers on your own servers, manage MySQL, rotate TLS certificates, and ship security patches yourself. Cloud API eliminates all of that operational overhead.
What are the throughput limits on the WhatsApp Cloud API?
Cloud API supports 80 messages per second per phone number by default, scalable to 1,000 messages per second on request for high-volume senders. Daily messaging tiers (1K / 10K / 100K / unlimited unique customers per 24h) are identical between Cloud and On-Premise — they're controlled by Meta's quality scoring system, not the API version.
Do I need my own server to receive inbound WhatsApp messages on the Cloud API?
Yes — or you need a webhook platform. When a customer messages your number, Cloud API fires an HTTP POST to your registered webhook URL. You need a publicly reachable server (or a service like SocialHook) to receive, verify the HMAC-SHA256 signature, and process the payload. Without a working webhook endpoint, inbound messages never reach you programmatically.
How does Cloud API pricing compare to On-Premise pricing?
Meta's per-conversation pricing is identical for both. The real difference is fixed infrastructure cost. On-Premise required $400–$1,200/month per phone number in hosting, MySQL, monitoring, and engineering. Cloud API has $0 infrastructure cost — Meta absorbs it entirely. Total cost of ownership on Cloud API is typically 30–60% lower than On-Premise at the same message volume.
What changes on my server when I migrate from On-Premise to Cloud API?
Three breaking changes: (1) Authentication — On-Premise used client certificates; Cloud API uses an access token and HMAC-SHA256 signature verification. (2) Payload structure — Cloud API uses the Graph API JSON format, which differs from On-Premise's REST structure. (3) Media handling — On-Premise stored media on your server; Cloud API stores it on Meta's CDN, requiring an API call to retrieve download URLs. Your webhook handler, payload parser, and media logic all need updating.
Connect your WhatsApp Cloud API number to SocialHook and get every inbound message delivered to your server as clean JSON — verified, normalized, retried on failure — in under 50ms. No infra. No Docker. No on-call rotation.