The Media Pipeline: What Happens After You Upload

Updated June 12, 2026

Drag a photo into the media library and it's "done" in a heartbeat. That heartbeat is a small lie of presentation: the real work (resizing, reformatting, poster-framing, everything that makes one file servable on every page and screen) happens after you've walked away. This post traces what actually happens between "upload complete" and "ready everywhere," and the design decisions in that gap we'd defend anywhere.

TL;DR: Uploads are accepted fast and processed later: the original is stored, a record is created in processing state, and background processors generate the variants — resized and reformatted images, a poster frame and web-friendly rendition for video — then report back via webhook. Image and video pipelines deliberately share one payload and webhook contract. The webhook can arrive before the record is fully visible, so the receiver retries patiently instead of dropping results. Pages reference assets from the library — never copies — so one upload serves every placement, and its metadata travels with it.

Accept fast, process later

The first decision shapes everything: the upload path does the minimum. Receive the file, store the original, create the media record with status: processing, return. Everything expensive — decoding, resizing, transcoding — is deferred to background processors, because the human at the keyboard is the scarcest resource in the pipeline and nothing they're waiting on should ever include "transcode a video."

The status field is the contract with the UI: processing means "yours, safe, being prepared," and the library can show the asset immediately while its variants materialize. The original is sacred and untouched — every derived rendition is exactly that, derived, which means reprocessing is always possible and a bad transcode never costs the source.
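
A minimal sketch of that accept-fast path, under stated assumptions: the class and field names (`UploadService`, `MediaRecord`, `original_key`) are hypothetical, and the in-memory dicts and list stand in for object storage, the media table, and a job queue. It shows the shape of the idea rather than the actual implementation.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class MediaRecord:
    id: str
    original_key: str                 # where the untouched original lives
    status: str = "processing"        # the contract with the UI
    variants: dict = field(default_factory=dict)


class UploadService:
    """Hypothetical upload path: store, record, enqueue, return."""

    def __init__(self):
        self.blobs = {}     # stand-in for object storage
        self.records = {}   # stand-in for the media table
        self.queue = []     # stand-in for the background job queue

    def accept(self, filename: str, data: bytes) -> MediaRecord:
        media_id = uuid.uuid4().hex
        key = f"originals/{media_id}/{filename}"
        self.blobs[key] = data                        # 1. store the original as-is
        record = MediaRecord(id=media_id, original_key=key)
        self.records[media_id] = record               # 2. create the record
        self.queue.append({"media_id": media_id})     # 3. defer the expensive work
        return record                                 # 4. return without decoding anything
```

Nothing in `accept` decodes, resizes, or transcodes; the caller gets a record in `processing` state immediately, and the stored original is only ever read by later steps.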

Variants: one original, many faces

Why variants at all? Because the file a camera produces and the file a webpage should serve are different artifacts with different jobs. A modern phone photo is several thousand pixels wide and multiple megabytes; the card it appears in may be only 400 pixels wide, on a phone screen, over a coffee-shop connection. Serving the original everywhere is how sites get slow, and slow is its own accessibility failure.

| Media | What gets generated | Why |
|---|---|---|
| Images | A ladder of resized, reformatted renditions | The right pixels for each placement, in modern web formats that compress dramatically better than the camera's |
| Video | A poster frame plus a web-friendly 720p rendition | Something to show before play is pressed, and something streamable that doesn't assume fiber |

And when a placement needs something the standard ladder doesn't cover, custom variants are generated on demand: an exact width, a format from a short menu (webp, jpg, png), resampled with a quality filter rather than a fast one. The menu is the point — the same sizes-from-a-menu discipline as type scales, applied to pixels: a constrained set of good options beats infinite mediocre ones, for humans and for the AI placing images into pages.
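
The menu discipline can be sketched as request validation. This is an illustration under assumptions: `VariantSpec`, `parse_variant_spec`, and the `MAX_WIDTH` bound are hypothetical names and values, not the product's API; only the three formats come from the text above.

```python
from dataclasses import dataclass

ALLOWED_FORMATS = ("webp", "jpg", "png")   # the short menu named in the post
MAX_WIDTH = 4096                            # assumed upper bound, not from the post


@dataclass(frozen=True)
class VariantSpec:
    width: int
    fmt: str

    def key(self) -> str:
        """Stable name for the variant, e.g. for caching or storage."""
        return f"w{self.width}.{self.fmt}"


def parse_variant_spec(width, fmt) -> VariantSpec:
    """Accept an exact width and a format from the menu; reject everything else."""
    fmt = str(fmt).lower()
    if fmt not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {ALLOWED_FORMATS}, got {fmt!r}")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(width, bool) or not isinstance(width, int):
        raise ValueError("width must be an integer number of pixels")
    if not 1 <= width <= MAX_WIDTH:
        raise ValueError(f"width must be between 1 and {MAX_WIDTH}")
    return VariantSpec(width=width, fmt=fmt)
```

Anything off the menu fails at the boundary, so neither a person nor an AI caller can request a format or size the pipeline does not intend to serve.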

One contract, two pipelines

Image processing and video processing are different beasts — different libraries, different costs, different failure modes — and the obvious architecture gives each its own integration. We deliberately didn't: both pipelines share the exact same payload shape and webhook contract. One invoker dispatches to either processor; one receiver handles either's results. The processor's name is the only thing that changes.

The reasoning is anti-drift: two integrations that start identical never stay identical — each accumulates its own retries, its own error shapes, its own folklore, until "fix the webhook handling" is two tickets. Sharing the contract means improvements land on both pipelines by construction, and adding a third media type someday is a new processor, not a new integration. It's the helpers argument again: centralize the part you'll need to fix.
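
One way to picture the shared contract, as a sketch only: the dataclass fields, the `ready` status value, and the variant names are assumptions, the two processors are stand-ins that do no real media work, and the result is handed to the receiver directly where the real system would post it to a webhook.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class ProcessJob:
    processor: str     # "image" or "video": the only thing that changes
    media_id: str
    source_key: str


@dataclass(frozen=True)
class ProcessResult:
    processor: str
    media_id: str
    status: str                 # assumed values, e.g. "ready" or "failed"
    variants: Dict[str, str]    # variant name -> storage key


def image_processor(job: ProcessJob) -> ProcessResult:
    # Stand-in: a real processor would resize and re-encode here.
    return ProcessResult(job.processor, job.media_id, "ready",
                         {"w400.webp": f"variants/{job.media_id}/w400.webp"})


def video_processor(job: ProcessJob) -> ProcessResult:
    # Stand-in: a real processor would extract a poster and transcode here.
    return ProcessResult(job.processor, job.media_id, "ready",
                         {"poster": f"variants/{job.media_id}/poster.jpg",
                          "720p": f"variants/{job.media_id}/720p.mp4"})


PROCESSORS: Dict[str, Callable[[ProcessJob], ProcessResult]] = {
    "image": image_processor,
    "video": video_processor,
}


def invoke(job: ProcessJob) -> ProcessResult:
    """One invoker for every media type."""
    try:
        processor = PROCESSORS[job.processor]
    except KeyError:
        raise ValueError(f"unknown processor: {job.processor!r}") from None
    return processor(job)


def receive(records: dict, result: ProcessResult) -> None:
    """One receiver for every media type; it never branches on the processor."""
    record = records[result.media_id]
    record["status"] = result.status
    record["variants"] = dict(result.variants)
```

In this sketch a third media type is one more entry in `PROCESSORS`; `invoke` and `receive` do not change.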

The race we had to respect

Here's the bug-shaped corner, and regular readers will recognize the family. The processor finishes and posts its results — variants, metadata, the new status — to a completion webhook. But the media record it's patching was created by an upload whose own write may still be propagating between instances. Occasionally the webhook arrives before the record is visible. The naive receiver looks up the record, finds nothing, drops the payload — and the asset is stuck in processing forever, because the processor reports exactly once and never re-posts.

The fix is the patient receiver: a result that finds no record isn't an error, it's an early arrival — so the patch is queued and re-applied with backoff until the record appears or attempts genuinely run out. The general lesson sits right beside our distributed-writes postmortem: when a callback fires at most once, the receiving side owns the persistence. You can't ask the one-shot caller to be patient, so the receiver must be — and "not found yet" must be a different code path than "not found."
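
The patient receiver might look roughly like this. It is a sketch under assumptions: `PatientReceiver`, the payload keys, the attempt limit, and the doubling backoff are all hypothetical, and time is passed in explicitly (`now`) so the retry schedule is easy to follow and test without sleeping.

```python
class PatientReceiver:
    """Applies webhook results, treating a missing record as an early arrival."""

    def __init__(self, records: dict, max_attempts: int = 5, base_delay: float = 0.5):
        self.records = records
        self.max_attempts = max_attempts
        self.base_delay = base_delay
        self.pending = []   # (due_time, next_attempt_number, payload)
        self.dead = []      # payloads whose record never appeared

    def handle(self, payload: dict, now: float) -> None:
        """Entry point for the completion webhook."""
        self._try_apply(payload, attempt=1, now=now)

    def tick(self, now: float) -> None:
        """Re-apply every queued payload whose backoff has elapsed."""
        due = [item for item in self.pending if item[0] <= now]
        self.pending = [item for item in self.pending if item[0] > now]
        for _, attempt, payload in due:
            self._try_apply(payload, attempt, now)

    def _try_apply(self, payload: dict, attempt: int, now: float) -> bool:
        record = self.records.get(payload["media_id"])
        if record is not None:
            record["status"] = payload["status"]
            record["variants"] = payload["variants"]
            return True
        if attempt >= self.max_attempts:
            # "Not found" for real: keep the payload visible, never drop it silently.
            self.dead.append(payload)
            return False
        # "Not found yet": queue a retry with exponential backoff.
        delay = self.base_delay * 2 ** (attempt - 1)
        self.pending.append((now + delay, attempt + 1, payload))
        return False
```

The two branches after the lookup are the point: one path for "not found yet" that re-queues, and a separate path for "not found" that surfaces the payload for investigation.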

Placement: reference, never copy

The last leg is where media meets pages, and the rule is one we've preached from the product side: pages and posts reference library assets, they don't embed copies. One upload, many placements — and the consequences compound quietly:

  • Metadata travels. The description written once in the library — the alt text, the tags — arrives at every placement, instead of being retyped or skipped per page.
  • Replacement is an edit, not an audit. The updated logo, the reshot product photo: change the asset, every placement follows. The five-times-uploaded image is the document fork wearing a JPEG extension, and references are how it never forks.
  • The right variant per placement, automatically. Because placements reference the asset rather than a file, the render path can choose the appropriate rendition for the context — the hero gets the big one, the card gets the small one, nobody chooses by hand.
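
A small sketch of what referencing buys at render time. The names (`Asset`, `Placement`, `render`) and the rule "smallest rendition that covers the slot, else the largest" are assumptions for illustration, not a description of the actual render path.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    id: str
    alt: str
    renditions: dict = field(default_factory=dict)   # width in px -> URL


@dataclass(frozen=True)
class Placement:
    asset_id: str   # a reference into the library, never a copied file
    width: int      # rendered width of the slot in pixels


def render(library: dict, placement: Placement) -> dict:
    """Resolve a placement to a concrete rendition plus the asset's metadata."""
    asset = library[placement.asset_id]
    if not asset.renditions:
        raise ValueError(f"asset {asset.id} has no renditions yet")
    widths = sorted(asset.renditions)
    # Smallest rendition that covers the slot; fall back to the largest.
    chosen = next((w for w in widths if w >= placement.width), widths[-1])
    return {"src": asset.renditions[chosen], "alt": asset.alt}
```

Because both the hero and the card hold only `asset_id`, editing `alt` or swapping `renditions` on the one `Asset` changes what every placement renders on the next call.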

What we'd carry to any media pipeline

  • Never make a human wait on a transcode. Accept, record, defer — the status field is the UI contract.
  • Keep originals sacred. Everything derived must be re-derivable; a processing bug should cost compute, never content.
  • Share contracts across media types. The payload and webhook shape is the part that drifts — make it singular.
  • Receivers own persistence for one-shot callbacks. "Not found yet" is a retry, not a drop.
  • Constrain the variant menu. A short list of good formats and quality resampling beats infinite knobs.
  • Reference, never copy. Identity for assets, like identity for customers — one record, many appearances.

Key takeaways

  • Upload is acceptance, not processing: the original lands, the record exists in processing state, and the expensive work happens where no human is waiting.
  • Variants are the serving strategy: a ladder of resized modern-format renditions for images, a poster and streamable rendition for video — the camera's file and the page's file are different artifacts.
  • One contract, two pipelines: image and video processors share payload and webhook shapes by design — anti-drift, and a third media type becomes a processor, not an integration.
  • Respect the early-arrival race: completion webhooks can beat record visibility — patient receivers re-apply with backoff, because one-shot callers never re-post.
  • Custom variants from a menu: exact width, three good formats, quality resampling — constrained options that humans and AI both place correctly.
  • Reference, never copy: alt text travels, replacement propagates, and each placement gets the right rendition — asset identity is what prevents the JPEG fork.

Frequently asked questions

What happens if processing fails — does the asset disappear?

No. Failure is a state, not a deletion. The original is already stored and the record already exists, so a failed transcode leaves an asset that's present but unprocessed, with retry as the recovery path. This is the original-is-sacred dividend: the worst processing outcome is "try again," never "re-upload." The status the user sees stays honest about which state the asset is in, which beats the alternative of variants that silently never arrived.
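
"Failure is a state" can be sketched as a tiny status machine. Only `processing` is named in this post; the `ready` and `failed` names, the transition table, and the helper functions are assumptions for illustration.

```python
# "processing" comes from the post; "ready" and "failed" are assumed names.
TRANSITIONS = {
    "processing": {"ready", "failed"},
    "failed": {"processing"},     # retry from the stored original
    "ready": {"processing"},      # reprocess, e.g. after a processor fix
}


def set_status(record: dict, new_status: str) -> dict:
    """Move a record to a new status, rejecting transitions not in the table."""
    current = record["status"]
    if new_status not in TRANSITIONS.get(current, set()):
        raise ValueError(f"illegal transition: {current} -> {new_status}")
    record["status"] = new_status
    return record


def retry(record: dict, queue: list) -> None:
    """Re-enqueue an asset for processing; the original is never touched."""
    set_status(record, "processing")
    queue.append({"media_id": record["id"], "source_key": record["original_key"]})
```

There is no transition that removes the record or its `original_key`, which is the property the answer above relies on.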

Why a 720p rendition for video rather than full adaptive streaming?

Fit for purpose: the videos small businesses embed — a walkthrough, a testimonial, a hero loop — are watched in a page context where one good web-friendly rendition plus a poster frame covers the real need at a fraction of the complexity. Full adaptive ladders earn their cost at long-form, high-volume scale. The architecture keeps the door open — renditions are just variants, and the shared contract means a richer ladder is a processor upgrade, not a redesign.

Why does the on-demand variant API resize with a quality filter instead of the fastest one?

Because custom variants are made once and served many times: the classic write-once-read-many trade. A slower, high-quality resample costs milliseconds at creation and pays off at every view; a fast-but-soft resize saves milliseconds once and serves slightly worse pixels forever. For pre-generated ladders the same logic applies even more strongly. The fast path is for previews; the persistent path gets the good filter.

How does this interact with AI building pages?

Cleanly, and on purpose: the AI places references from the library, chooses from the constrained variant menu, and inherits the asset's metadata — alt text included. It never uploads duplicates or invents sizes, because the system doesn't offer those moves. The same constraint-makes-AI-safe pattern as templates and design tokens: shrink the action space to good actions, and generated pages get media handling right by construction.

The media pipeline ships inside Faster — upload once, place everywhere, with the right pixels chosen for you. More engineering notes: the engineering blog.

Written by Sunny Arora