Bulk-provisioning 1,000 eSIMs without hitting rate limits
Exponential backoff, idempotency keys, and how to design a queue that survives partial failures.
A corporate travel manager messages you on a Friday afternoon. Their team of 350 sales reps is flying to a conference in Singapore on Monday. Each needs a SIM. Can you provision them by end of weekend?
Sure. You're a reseller. You have an API integration with an eSIM provider. You write a loop that hits the provisioning endpoint 350 times.
Two minutes in, the API starts returning 429 Too Many Requests. Twenty minutes in, you've successfully provisioned 89 SIMs and you have no idea which ones. You're going to have a bad weekend.
This post shows how to design bulk provisioning correctly.
The problem
Most eSIM APIs (Firsty included) rate-limit at a window-based threshold. Firsty's general endpoints allow 1000 requests per 15 minutes. Bursting past that returns 429 errors with a Retry-After header.
A naive bulk script:
```javascript
for (const employee of employees) {
  await provisionEsim(employee);
}
```
This is slow (350 seconds at 1 second per call) and starts failing the moment you exceed the rate window.
A naive parallel script is worse:
```javascript
await Promise.all(employees.map(e => provisionEsim(e)));
```
This hits the API as fast as Node can spawn requests. The first batch might succeed. The rest fail with 429. You retry the failures. Some succeed, some fail again. You have no idea which employees got SIMs.
What you actually need
A queue that:
- Respects the rate limit (stays under 1000 per 15 minutes)
- Uses idempotency to prevent duplicate provisioning on retry
- Recovers gracefully from failures
- Reports progress so you can tell the customer how many are done
- Is resumable: if your script crashes at SIM 217, you can restart and pick up where you left off
This is rate-limited job processing. The pattern works for any bulk API operation.
Idempotency keys first
Before we touch rate limits, fix the duplicate provisioning problem. Every provisioning request should include a unique idempotency key in the X-Idempotency-Key header:
```javascript
const idempotencyKey = `${customerId}-${employeeId}`;

await axios.post('/api/v3/esims', {}, {
  headers: {
    Authorization: `Bearer ${token}`,
    'X-Idempotency-Key': idempotencyKey,
  }
});
```
If you retry the same request with the same idempotency key within 48 hours, Firsty returns the original response instead of creating a duplicate. The key should be deterministic: same logical operation, same key.
If a request with the same key is still being processed, you get 409 Conflict. The other idempotency-related status to handle is 422 Unprocessable Entity.
Common mistake: using a UUID generated at request time as the idempotency key. This doesn't help because retries generate a new UUID. The key must be derived from the logical operation, not the request.
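One way to keep keys deterministic is to build them only from identifiers of the logical operation. The sketch below is illustrative: makeIdempotencyKey and its step argument are hypothetical names, not part of Firsty's API, and it assumes the identifiers themselves contain no hyphens that could make two different operations collide.

```javascript
// Hypothetical helper: derive the key from the logical operation, never from
// the individual request attempt. `step` separates multiple calls made for
// the same employee (for example, creating the eSIM vs. ordering a package).
function makeIdempotencyKey(batchId, employeeId, step) {
  const parts = [batchId, employeeId, step];
  for (const part of parts) {
    if (typeof part !== 'string' || part.length === 0) {
      throw new Error('every key component must be a non-empty string');
    }
  }
  return parts.join('-');
}

// A retry of the same operation produces the same key...
const firstAttempt = makeIdempotencyKey('batch42', 'emp007', 'esim');
const retryAttempt = makeIdempotencyKey('batch42', 'emp007', 'esim');

// ...while a different step for the same employee gets its own key.
const packageStep = makeIdempotencyKey('batch42', 'emp007', 'pkg');
```

Because nothing random or time-based goes into the key, a crashed-and-restarted script sends exactly the same key as the attempt it is repeating.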
Window-based rate limiting
Firsty's rate limit is 1000 requests per 15 minutes. That's roughly 67 requests per minute on average, but the window is what matters: you could burst 1000 requests in 30 seconds and then be locked out for 14.5 minutes.
The cleanest pattern is to pace yourself to use the window evenly. A token bucket works well:
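```javascript
class TokenBucket {
  constructor(capacity, refillPerSecond) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillPerSecond = refillPerSecond;
    this.lastRefill = Date.now();
  }

  // Resolves once a token is available; await this before each request.
  async acquire() {
    while (true) {
      this.refill();
      if (this.tokens >= 1) {
        this.tokens -= 1;
        return;
      }
      // Sleep roughly the time it takes to refill one token, then re-check.
      const waitMs = (1 / this.refillPerSecond) * 1000;
      await new Promise(r => setTimeout(r, waitMs));
    }
  }

  // Add tokens for the time elapsed since the last refill, up to capacity.
  refill() {
    const now = Date.now();
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefill = now;
  }
}

// A small burst capacity plus a refill of 990 per 15 minutes allows at most
// 10 + 990 = 1000 requests in any window. A capacity of 1000 would permit a
// full burst on top of the refill and overshoot the limit.
const bucket = new TokenBucket(10, 990 / 900);
```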
Before each API call:
```javascript
await bucket.acquire();
await provisionEsim(employee);
```
This automatically paces your requests. You never exceed the rate limit because acquire() blocks when the bucket is empty.
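A quick sanity check on bucket parameters: in the worst case a token bucket releases a full bucket at once plus everything it refills during the window, so capacity plus per-window refill is the number to compare against the limit. A minimal sketch (the function names are illustrative, not from any library):

```javascript
// Upper bound on requests a token bucket can release in one window:
// a full bucket drained immediately, plus all tokens refilled in the window.
function worstCasePerWindow(capacity, refillPerWindow) {
  return capacity + refillPerWindow;
}

// True when the bucket parameters can never exceed the provider's limit.
function fitsLimit(capacity, refillPerWindow, limit) {
  return worstCasePerWindow(capacity, refillPerWindow) <= limit;
}

const fullBucket = worstCasePerWindow(1000, 1000); // full burst plus full refill
const smallBurst = worstCasePerWindow(10, 990);    // small burst, slightly slower refill
```

By this measure a bucket that starts full at 1000 and also refills 1000 per window can release up to 2000 requests in its first 15 minutes, while a 10-token burst with 990 refilled per window tops out at 1000.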
Exponential backoff on 429
Even with rate limiting, you'll occasionally get 429s: network jitter, clock skew between you and the server, or other clients sharing your IP can all push you over. Handle them with the Retry-After header, falling back to exponential backoff when it is absent:
```javascript
async function withRetry(fn, maxRetries = 5) {
  let lastErr;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Only rate-limit responses are retried; everything else propagates.
      if (err.response?.status !== 429) throw err;
      lastErr = err;
      // Prefer the server's hint (in seconds); otherwise back off 1s, 2s, 4s, ...
      const retryAfter = parseInt(err.response.headers['retry-after'], 10) || (2 ** attempt);
      const jitter = Math.random() * 0.3;
      const waitMs = (retryAfter + jitter) * 1000;
      console.log(`Rate limited, waiting ${waitMs}ms before retry ${attempt + 1}`);
      await new Promise(r => setTimeout(r, waitMs));
    }
  }
  throw lastErr;
}
```
Three details that matter:
- Honor the Retry-After header if present. The server is telling you exactly when to retry.
- Add jitter (random delay) to prevent thundering herd if multiple workers are retrying simultaneously.
- Cap the retries. A permanent 429 (account suspension, for example) shouldn't retry forever.
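The delay calculation can also be pulled into a pure function, which makes it easy to test. The sketch below is one possible shape, not Firsty's recommended client: parseRetryAfter and retryDelayMs are hypothetical names, the 60-second cap is an assumed default, and the random source is injectable so the result is predictable in tests. It also covers the HTTP-date form of Retry-After, which the HTTP specification permits alongside a plain number of seconds.

```javascript
// Parse a Retry-After value, which HTTP allows to be either a number of
// seconds or an HTTP-date. Returns seconds to wait, or null if unusable.
function parseRetryAfter(value, nowMs = Date.now()) {
  if (value === undefined || value === null || value === '') return null;
  const text = String(value).trim();
  if (/^\d+$/.test(text)) return parseInt(text, 10);
  const dateMs = Date.parse(text);
  if (Number.isNaN(dateMs)) return null;
  return Math.max(0, (dateMs - nowMs) / 1000);
}

// Delay before the next retry: the server's hint when present, otherwise
// exponential backoff (1s, 2s, 4s, ...) capped at maxSeconds, plus jitter.
function retryDelayMs(attempt, retryAfterHeader, options = {}) {
  const { maxSeconds = 60, random = Math.random, nowMs = Date.now() } = options;
  const hinted = parseRetryAfter(retryAfterHeader, nowMs);
  const baseSeconds = hinted !== null ? hinted : Math.min(maxSeconds, 2 ** attempt);
  const jitterSeconds = random() * 0.3; // up to 300 ms of random spread
  return Math.round((baseSeconds + jitterSeconds) * 1000);
}
```

Inside a retry loop this would be called as retryDelayMs(attempt, err.response.headers['retry-after']) and the result passed to setTimeout.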
Persistent job queue
In-memory rate limiting works for a single script. For real bulk operations, persist your work to a database:
```sql
CREATE TABLE provisioning_jobs (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  batch_id UUID NOT NULL,
  employee_id TEXT NOT NULL,
  plan_reference TEXT NOT NULL,
  status TEXT DEFAULT 'pending'
    CHECK (status IN ('pending', 'in_progress', 'completed', 'failed')),
  attempts INT DEFAULT 0,
  iccid TEXT,
  activation_code TEXT,
  error TEXT,
  created_at TIMESTAMPTZ DEFAULT NOW(),
  completed_at TIMESTAMPTZ,
  UNIQUE (batch_id, employee_id)
);

CREATE INDEX provisioning_jobs_status_idx ON provisioning_jobs(status);
```
Your worker pulls one job at a time:
```sql
UPDATE provisioning_jobs
SET status = 'in_progress', attempts = attempts + 1
WHERE id = (
  SELECT id
  FROM provisioning_jobs
  WHERE status = 'pending'
  ORDER BY created_at
  FOR UPDATE SKIP LOCKED
  LIMIT 1
)
RETURNING *;
```
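The FOR UPDATE SKIP LOCKED clause lets several workers claim jobs concurrently without blocking on, or double-claiming, the same row.
After each provisioning:
```sql
UPDATE provisioning_jobs
SET status = 'completed',
    iccid = $1,
    activation_code = $2,
    completed_at = NOW()
WHERE id = $3;
```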
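If your script crashes, you restart and unprocessed jobs are still pending. Jobs that were in_progress at the moment of the crash are not picked up by the claim query, so reset them to pending before restarting; the idempotency keys make re-running them safe.
Reporting progress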
For a 350-SIM batch, you want to tell the customer "247 of 350 done." A simple status query:
```sql
SELECT
  COUNT(*) FILTER (WHERE status = 'completed') AS completed,
  COUNT(*) FILTER (WHERE status = 'failed') AS failed,
  COUNT(*) FILTER (WHERE status IN ('pending', 'in_progress')) AS remaining
FROM provisioning_jobs
WHERE batch_id = $1;
```
Poll this every few seconds in the UI. Show a progress bar.
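Turning that row into something a progress bar can use takes only a few lines. This is a sketch under assumptions: formatProgress is a hypothetical helper, and it coerces the counts with Number() because some drivers (node-postgres, for example) return bigint counts as strings by default.

```javascript
// `counts` mirrors the columns of the status query: completed, failed, remaining.
function formatProgress(counts) {
  const completed = Number(counts.completed);
  const failed = Number(counts.failed);
  const remaining = Number(counts.remaining);
  const total = completed + failed + remaining;
  // A job counts toward the bar once it has finished, successfully or not.
  const percent = total === 0 ? 0 : Math.floor(((completed + failed) / total) * 100);
  return {
    total,
    percent,
    done: remaining === 0,
    label: `${completed} of ${total} done` + (failed > 0 ? `, ${failed} failed` : ''),
  };
}

const midBatch = formatProgress({ completed: '247', failed: '0', remaining: '103' });
```

Here midBatch.label reads "247 of 350 done", and the done flag tells the UI when to stop polling.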
Putting it all together
```javascript
const WORKERS = 5;
// Burst of 10 plus 990 refilled per 15 minutes keeps any window at or under 1000.
const bucket = new TokenBucket(10, 990 / 900);

async function worker() {
  while (true) {
    const job = await claimNextJob();
    if (!job) return; // queue drained
    try {
      // Call 1 of 2: create the eSIM.
      await bucket.acquire();
      const esim = await withRetry(() =>
        provisionEsim({
          idempotencyKey: `${job.batch_id}-${job.employee_id}-esim`,
        })
      );
      // Call 2 of 2: order the package for that eSIM.
      await bucket.acquire();
      await withRetry(() =>
        orderPackage(esim.profileReference, esim.esimReference, job.plan_reference, {
          idempotencyKey: `${job.batch_id}-${job.employee_id}-pkg`,
        })
      );
      await markJobCompleted(job.id, esim);
    } catch (err) {
      // Record the failure so the batch report shows which SIM failed and why.
      await markJobFailed(job.id, err.message);
    }
  }
}

await Promise.all(Array.from({ length: WORKERS }, worker));
```
Note that each SIM requires 2 API calls (create eSIM + order package), so 350 SIMs is 700 API calls total. At 1000 per 15 minutes, that's about 11 minutes of API work, plus carrier-side latency.
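The same arithmetic works for any batch size. A small sketch, assuming requests are paced evenly across the window and ignoring carrier-side latency (estimateBatchSeconds is an illustrative name):

```javascript
// Rough duration of a batch when calls are spread evenly under the limit.
function estimateBatchSeconds(simCount, callsPerSim, limit, windowSeconds) {
  const totalCalls = simCount * callsPerSim;
  // Multiply before dividing to keep the arithmetic exact for whole numbers.
  const seconds = (totalCalls * windowSeconds) / limit;
  return { totalCalls, seconds };
}

const conference = estimateBatchSeconds(350, 2, 1000, 900);
const thousand = estimateBatchSeconds(1000, 2, 1000, 900);
```

For the 350-SIM batch this gives 700 calls and 630 seconds (10.5 minutes); a 1,000-SIM batch needs 2,000 calls and 1,800 seconds, or two full rate-limit windows.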
With proper progress reporting, the customer can watch it complete. If something fails, you have a database record of exactly which SIM failed and why.
What this costs
The naive approach: 1 hour of code, 0 working SIMs, angry customer.
The correct approach: 1 day of code, working bulk provisioning forever, repeat business.
If you're a reseller doing volume, this is week-one infrastructure, not a nice-to-have.