A customer hits Pay now on a Shopify store running AutoCom. Four seconds later, the merchant’s warehouse printer is already spitting out a shipping label and the customer’s phone buzzes with a WhatsApp: “Your order #4052 is confirmed — out for dispatch tomorrow morning.”
Four seconds sounds fast, but it’s actually a lot of budget when you break it down. Most of it is spent waiting on other people’s servers. Here’s exactly where the time goes, and how we got it down from the 14 seconds it took in our first prototype.
The budget
Here’s the median path, measured over the last 30 days across roughly 120,000 orders:
| Step | Median | p95 |
|---|---|---|
| Shopify webhook → our edge | 180ms | 420ms |
| Signature verify + dedupe | 3ms | 8ms |
| Persist order event | 12ms | 28ms |
| Queue dispatch (Redis) | 2ms | 5ms |
| Carrier rate-shop (parallel) | 680ms | 1,400ms |
| Carrier label request | 950ms | 2,100ms |
| WhatsApp template send | 420ms | 980ms |
| Webhook ACK | — | — |
| Total (observed) | 2.2s | 4.9s |
Two observations.
First: we spend roughly 98% of our time waiting on other people’s APIs. Our own code accounts for about 40ms of the 2.2-second median path. Everything else is network round-trips to Shopify, Delhivery, and Meta. This matters enormously for how you optimise — trying to shave CPU cycles off our handler is pointless. Making better use of the blocked time is everything.
Second: p95 is 2.2× the median. The tail is almost entirely carrier APIs having a bad day. We can’t make Delhivery faster, but we can stop letting their slow days block the customer path — which is where most of the engineering work lives.
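For the curious, the 3ms “signature verify + dedupe” row in the table is the standard Shopify webhook check: compare the `X-Shopify-Hmac-Sha256` header against an HMAC of the raw body, and drop replays by `X-Shopify-Webhook-Id`. A minimal sketch — the in-memory set stands in for whatever shared store you would actually use:

```python
import base64
import hashlib
import hmac

def verify_shopify_hmac(secret: str, raw_body: bytes, header_value: str) -> bool:
    """Shopify's X-Shopify-Hmac-Sha256 header is the base64-encoded
    HMAC-SHA256 of the raw request body, keyed by the app's shared secret."""
    digest = hmac.new(secret.encode(), raw_body, hashlib.sha256).digest()
    expected = base64.b64encode(digest).decode()
    return hmac.compare_digest(expected, header_value)  # constant-time compare

seen_webhook_ids: set[str] = set()  # in-memory stand-in for a shared dedupe store

def is_duplicate(webhook_id: str) -> bool:
    """Shopify retries undelivered webhooks, so the same X-Shopify-Webhook-Id
    can arrive more than once; process each exactly once."""
    if webhook_id in seen_webhook_ids:
        return True
    seen_webhook_ids.add(webhook_id)
    return False
```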
Three things that bought us most of the time
1. Rate-shop in parallel, then commit to a winner
Our first version picked a carrier synchronously: ask Delhivery for a rate, then Shiprocket, then BlueDart, and whichever came back cheapest got the label. That’s 2–3 seconds of serial waiting.
Now: we hit all three carriers concurrently with a 900ms timeout. The first response back that meets the merchant’s policy (usually cheapest or fastest) wins. The late responses are discarded. The merchant’s config decides what “wins” means; we just run the race.
This single change cut median latency by about 1.4 seconds. The code is shockingly simple — it’s an `asyncio.gather` with a `wait_for` on each branch. The hard part was convincing ourselves we were OK discarding a response that might have been better. In practice, the difference between “best rate” and “first acceptable rate” is tiny, and the latency win is huge.
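A sketch of that race, with `fetch_rate` as a hypothetical stand-in for the real carrier clients and fixed delays in place of real network latency:

```python
import asyncio

# Hypothetical stand-in for a real carrier HTTP call; `delay` simulates latency.
async def fetch_rate(carrier: str, delay: float, rate: float) -> dict:
    await asyncio.sleep(delay)
    return {"carrier": carrier, "rate": rate}

async def rate_shop(requests, timeout: float = 0.9):
    """Fire all carrier rate requests concurrently; each branch gets its own
    deadline via wait_for, so one slow carrier can't stall the whole request.
    Branches that miss the deadline come back as TimeoutError and are dropped."""
    tasks = [asyncio.wait_for(fetch_rate(c, d, r), timeout) for c, d, r in requests]
    results = await asyncio.gather(*tasks, return_exceptions=True)
    quotes = [q for q in results if isinstance(q, dict)]
    # Merchant policy decides the winner; "cheapest acceptable" shown here.
    return min(quotes, key=lambda q: q["rate"]) if quotes else None

winner = asyncio.run(rate_shop([
    ("Delhivery", 0.10, 52.0),
    ("Shiprocket", 0.05, 48.0),
    ("BlueDart", 3.0, 45.0),  # too slow: dropped despite the better rate
]))
# BlueDart misses the 900ms deadline, so Shiprocket wins on price.
```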
2. Fire-and-forget the WhatsApp
The customer doesn’t need to wait for Meta to accept the template before we tell Shopify the order is confirmed. We now:
- Persist the order
- Queue the WhatsApp as a background job
- ACK the webhook
- Send the WhatsApp from a worker with its own retry policy
This moved the WhatsApp call out of the critical path entirely. From Shopify’s point of view, we ACK in ~1.3 seconds. The customer gets the message somewhere in the next 400–900ms, usually before they’ve finished reading the Shopify “thank you” page.
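The reshuffled critical path can be sketched like this; `persist_order` and `send_whatsapp` are hypothetical stubs, and a plain `queue.Queue` stands in for the Redis queue:

```python
import json
import queue
import threading
import time

db: dict = {}          # stands in for our order-event store
sent: list[str] = []   # records what the "Meta client" was asked to send
jobs: queue.Queue = queue.Queue()  # stands in for the Redis queue

def persist_order(order: dict) -> None:
    db[order["id"]] = order

def send_whatsapp(order_id: str) -> bool:
    sent.append(order_id)  # real version calls Meta's template-send API
    return True

def handle_webhook(order: dict) -> str:
    """Critical path: persist, enqueue, ACK. Nothing here waits on Meta."""
    persist_order(order)
    jobs.put(json.dumps({"type": "whatsapp", "order_id": order["id"]}))
    return "200 OK"  # Shopify gets its ACK now

def worker(max_attempts: int = 3) -> None:
    """Background worker with its own retry policy; a failed send can
    never delay the webhook ACK."""
    while True:
        job = json.loads(jobs.get())
        for attempt in range(max_attempts):
            if send_whatsapp(job["order_id"]):
                break
            time.sleep(2 ** attempt)  # exponential backoff between retries
        jobs.task_done()

threading.Thread(target=worker, daemon=True).start()
handle_webhook({"id": "4052"})
jobs.join()  # the message goes out moments later, off the critical path
```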
3. Warm the carrier auth tokens
Carrier APIs require OAuth tokens that expire on various schedules (Delhivery: 24h, Shiprocket: 10 days, BlueDart: 1h). Our first version fetched them lazily — the first request after expiry would block waiting on a token refresh.
Now: a background cron refreshes every token at 60% of its TTL and writes it to Redis with a longer TTL than the carrier’s own. The label request handler never waits on auth. This doesn’t show up in median latency but it kills a nasty class of latency spikes that used to happen randomly every few hours.
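A sketch of the warming logic, with an in-memory dict standing in for Redis and a stubbed `fetch_token`; the 60%-of-TTL rule is the only real policy here:

```python
import time

# Carrier token lifetimes from the section above, in seconds.
CARRIER_TTLS = {
    "delhivery": 24 * 3600,
    "shiprocket": 10 * 24 * 3600,
    "bluedart": 3600,
}

token_cache: dict[str, tuple[str, float]] = {}  # stands in for Redis

def fetch_token(carrier: str) -> str:
    """Hypothetical OAuth refresh against the carrier; stubbed here."""
    return f"{carrier}-token"

def refresh_due(carrier: str, now: float) -> bool:
    """A token is due for refresh once 60% of its carrier TTL has elapsed."""
    cached = token_cache.get(carrier)
    if cached is None:
        return True
    _, fetched_at = cached
    return now - fetched_at >= 0.6 * CARRIER_TTLS[carrier]

def warm_tokens(now: float) -> None:
    """Run from cron, e.g. warm_tokens(time.time()). The label handler then
    reads token_cache and never blocks on an OAuth round-trip. In Redis we'd
    SET with a TTL slightly longer than the carrier's own, so a missed cron
    tick degrades to a slightly stale (still valid) token, not a cache miss."""
    for carrier in CARRIER_TTLS:
        if refresh_due(carrier, now):
            token_cache[carrier] = (fetch_token(carrier), now)
```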
The things we still can’t fix
Some latency just isn’t ours to optimise:
- Carrier API cold-start quirks. BlueDart’s rate endpoint regularly takes 3+ seconds on the first call after a quiet period. We hedge against this by calling all three carriers in parallel, so BlueDart’s slow days don’t block us.
- Shopify webhook latency from the US. Webhooks originating from us-east-1 reach us in Mumbai in 180–220ms. That’s already below the intercontinental floor. The only way to get faster is to move our ingest to Shopify’s region, which we’re not currently willing to do for two reasons — data residency and operational complexity.
- WhatsApp’s internal routing. Meta’s own infra occasionally takes 2–3 seconds to fan out a template send. When that happens, the customer’s message is late by a couple of seconds. We can’t do anything about it.
The number that matters
The customer only feels one number: how long between hitting “Pay” and getting confirmation. That’s the Shopify ACK, currently median 1.3s. The rest — label generation, printer queueing, carrier coordination — happens in the background while the customer is still reading the confirmation page.
Optimising for the number users actually feel is almost always a better use of engineering time than optimising for the number your dashboards show.