When to serve a model with an API vs a batch job
Most model work lands in one of two shapes: something answers when a request arrives (an API), or something runs on a schedule or after a file lands (a batch job). Both are valid. The mistake is picking the wrong one because “real-time” sounds more serious.
What an API is good for
An API path fits when a human or another system is waiting. Someone loads a screen, a service needs a score before the next step, or latency on the order of milliseconds to a few seconds is part of the product. You care about consistent preprocessing on that request, timeouts, and what happens when the model or dependencies fail mid-call.
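Those three concerns can be sketched in one handler. This is a minimal sketch, not a real service: `preprocess`, the `tenure_months` field, and the `churn-v3` label are hypothetical stand-ins, and the timeout check is a simple post-hoc budget rather than true request cancellation.

```python
import time

MODEL_VERSION = "churn-v3"  # hypothetical version label


def preprocess(payload):
    # Hypothetical feature prep; in a real service this code is shared
    # with the training pipeline so request-time features match.
    return [float(payload["tenure_months"]) / 12.0]


def score_request(payload, model_fn, timeout_s=0.5):
    """Score one request with a latency budget and a safe fallback when
    the model or its dependencies fail mid-call."""
    started = time.monotonic()
    try:
        features = preprocess(payload)
        score = model_fn(features)
        if time.monotonic() - started > timeout_s:
            raise TimeoutError("model call exceeded latency budget")
        return {"score": score, "model_version": MODEL_VERSION, "fallback": False}
    except Exception:
        # Fail closed: the caller sees "no signal", not a stack trace.
        return {"score": None, "model_version": MODEL_VERSION, "fallback": True}
```

The point of the shape is that the caller always gets a well-formed answer in bounded time; whether the fallback is a neutral score or an explicit "unavailable" flag is a product decision.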
What batch scoring is good for
Batch fits when data shows up in chunks: nightly exports, hourly syncs, or “run this file when it is ready.” You write one row of outputs per input row (or per entity), stamp model version and run time, and keep the run repeatable. Nobody is staring at a browser; they need a ledger they can audit tomorrow.
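The stamping discipline is the whole trick, and it fits in a few lines. A minimal sketch, assuming rows arrive as dicts keyed by a hypothetical `entity_id` and the `churn-v3` label stands in for a real version string:

```python
import datetime
import uuid

MODEL_VERSION = "churn-v3"  # hypothetical version label


def score_batch(rows, model_fn):
    """One output row per input row, stamped with run id, model version,
    and run time so the run is repeatable and auditable later."""
    run_id = str(uuid.uuid4())
    run_at = datetime.datetime.now(datetime.timezone.utc).isoformat()
    return [
        {
            "entity_id": row["entity_id"],
            "score": model_fn(row),
            "model_version": MODEL_VERSION,
            "run_id": run_id,
            "run_at": run_at,
        }
        for row in rows
    ]
```

Every row in one run shares the same `run_id` and timestamp, which is exactly what lets you re-run the job against the same input snapshot and diff the outputs.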
Latency and when data arrives
If the business decision only happens after a file is complete, batch is usually simpler than forcing a long-running API. If the decision is per click or per session, an API (or a queue worker that feels like an API to the caller) is the natural shape.
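A “queue worker that feels like an API” is worth making concrete. This is a toy in-process sketch with stdlib primitives; real systems use a broker and a result store, and the submit/poll names here are hypothetical:

```python
import queue
import threading
import uuid

# The caller submits work, gets a job id back immediately, and polls
# for the result instead of holding a long-running request open.
jobs = queue.Queue()
results = {}


def submit(payload):
    job_id = str(uuid.uuid4())
    jobs.put((job_id, payload))
    return job_id  # caller polls results[job_id] later


def worker(model_fn):
    while True:
        job_id, payload = jobs.get()
        results[job_id] = model_fn(payload)
        jobs.task_done()


# Hypothetical model: doubles a single input field.
threading.Thread(target=worker, args=(lambda p: p["x"] * 2,), daemon=True).start()
```

To the caller this still feels per-request, but the serving side gets the retry and backpressure properties of a queue.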
Audit trail and repeatability
Batch runs often make audit easier: one job ID, one input snapshot, one output table. APIs can log too, but you have to design it. If regulators or internal finance ask “what did the model say on Tuesday?”, batch artefacts answer that question bluntly.
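If the batch job stamps its outputs as described above, the Tuesday question is a filter, not a forensic exercise. A sketch, assuming rows carry `entity_id`, `score`, and an ISO-8601 `run_at` stamp:

```python
def model_output_on(rows, entity_id, day):
    """Answer "what did the model say on day X for entity Y?" directly
    from the batch output table, using the run_at stamp each row carries."""
    return [
        r
        for r in rows
        if r["entity_id"] == entity_id and r["run_at"].startswith(day)
    ]
```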
Operational ownership
APIs need uptime, scaling, and safe rollbacks. Batch needs scheduling, retries, and clear failure alerts. Smaller teams sometimes ship batch first because the blast radius of a bad deploy is easier to contain—you fix the job and re-run.
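The batch side of that ownership can be as small as a retry wrapper with a loud final alert. A sketch under the assumption that your scheduler calls one job function and that `alert` is wired to whatever paging or chat channel the team actually watches:

```python
import time


def run_with_retries(job_fn, attempts=3, base_delay_s=1.0, alert=print):
    """Run a batch job with exponential backoff between attempts; fire a
    clear alert on final failure so the fix is "re-run the job", not a
    silent gap in the output table."""
    for attempt in range(1, attempts + 1):
        try:
            return job_fn()
        except Exception as exc:
            if attempt == attempts:
                alert(f"job failed after {attempts} attempts: {exc}")
                raise
            time.sleep(base_delay_s * 2 ** (attempt - 1))
```

Compare that blast radius with an API rollback: here the worst case is a missed run you can repeat once the fix lands.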
Why not everything should be real-time
Real-time adds moving parts. If no one needs an answer in seconds, you pay for complexity without buying a better decision. Many use cases—risk flags for the next morning’s review, lead scoring for a weekly campaign, inventory hints for a daily planning sheet—are perfectly honest batch problems.
A simple checklist
- Does a person or system block on the score in seconds? If yes, lean API (or async job with a poll). If no, batch is on the table.
- Does the input arrive as a stable file or batch export? Strong signal for batch.
- Do you need a row-level audit trail for a past run? Batch tables and run logs are a good fit.
- Can you tolerate a short delay between new data and a new score? If yes, batch is often enough.
- Who owns on-call for this path? If that team is thin, prefer the simpler shape.
I keep runnable examples of both patterns on the Portfolio page—one churn-style serving surface and one batch scoring pipeline—so you can compare how each is scoped, not just how slides describe them. For fixed-scope work that picks one path and ships it, Services spells out how I structure those engagements.
Both API and batch can be the right answer. The goal is to match the business rhythm, not to impress with real-time by default.