Architecture
| Piece | Role |
|---|---|
| Step Functions | State machines that orchestrate each pipeline, with waitForTaskToken pauses for external verification. |
| Lambda | The individual steps (mediainfo, ingestPrepare, gfxEncode, createFile, …). Mostly Node.js, some Python. |
| MediaConvert | AWS Elemental MediaConvert does the transcoding (GfxEncode, the encode workflow). |
| AWS Batch | Containerized subtitle/graphics rendering (GfxSubtitles, a batch:submitJob.sync task). |
| S3 buckets | IngestOutputBucket, EncodeOutputBucket, GfxEncodeOutputBucket — outputs land here; ObjectCreated events trigger the completion Lambdas. |
| EventBridge | Rules that fire completion handlers (e.g. EventGfxEncodeAudioComplete). |

The stack is defined in a SAM template (slurpee3/template.yaml, deployed via deploy.sh) alongside a CDK project (slurpee3/cdk/, including the autosub stack). Source lives under slurpee3/workflows/{ingest,encodes,localization}.
The pipelines
Ingest
Runs when a source file is uploaded (see Ingests for the product view):
- mediainfo: read media metadata (always).
- ingestScratchPreview: a quick scratch preview (video only).
- ingestPrepare: ffprobe for video; updates the ingest's status, mediainfo, selected, and ffprobe data.
- Wait for verification: a waitForTaskToken pause while the detected metadata is confirmed.
- ingestProcess: generate the preview and thumbnail (video only).
- copyFile / createFile: copy the download file when a download is requested.
- ingestVerify: finalize, updating the asset and ingest records.

ingestError propagates failures to the ingest status.
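
As a rough illustration of the ingest ordering, the sketch below lists which steps would run for a given source file. It is not the state machine itself: the function name, its parameters, and the waitForVerification label are hypothetical, and it assumes ingestPrepare runs for every ingest while only its ffprobe part is video-specific.

```python
from typing import List


def ingest_steps(is_video: bool, download_requested: bool) -> List[str]:
    """Return the ordered ingest steps for one source file (illustrative only)."""
    steps = ["mediainfo"]  # always: read media metadata
    if is_video:
        steps.append("ingestScratchPreview")  # quick scratch preview, video only
    steps.append("ingestPrepare")  # assumed to run always; ffprobe applies to video
    steps.append("waitForVerification")  # hypothetical label for the waitForTaskToken pause
    if is_video:
        steps.append("ingestProcess")  # preview and thumbnail, video only
    if download_requested:
        steps.append("copyFile/createFile")  # copy the download file
    steps.append("ingestVerify")  # finalize asset and ingest records
    return steps


if __name__ == "__main__":
    print(ingest_steps(is_video=True, download_requested=False))
```

In the real pipeline any of these steps can fail over to ingestError; that branch is left out here to keep the ordering readable.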
Encodes
Produces deliverable encodes through MediaConvert: createFile sets up the output, encode submits the MediaConvert job, and encodeComplete (triggered by an S3 ObjectCreated event on EncodeOutputBucket) registers the finished file back on the platform.
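
To make the S3-triggered completion step more concrete, here is a minimal sketch of how a completion Lambda might pull the finished objects out of an S3 event notification. The function name and return shape are assumptions for illustration; the actual encodeComplete handler is mostly likely Node.js and may look quite different. The event layout (Records, eventName, s3.bucket.name, s3.object.key) follows the standard S3 notification format, in which object keys arrive URL-encoded.

```python
from typing import Dict, List
from urllib.parse import unquote_plus


def finished_objects(event: Dict) -> List[Dict[str, str]]:
    """Extract bucket/key pairs from the ObjectCreated records of an S3 event."""
    found = []
    for record in event.get("Records", []):
        # S3 event names look like "ObjectCreated:Put"; skip anything else.
        if not record.get("eventName", "").startswith("ObjectCreated"):
            continue
        s3 = record["s3"]
        found.append(
            {
                "bucket": s3["bucket"]["name"],
                # Keys are URL-encoded in notifications ("+" stands for a space).
                "key": unquote_plus(s3["object"]["key"]),
            }
        )
    return found
```

A handler built on this would then register each returned bucket/key pair with the platform API.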
Localization (gfx)
The state machine in statemachine/gfx.asl.json ("Generates gfx render for a submitted order master file") powers Automated Localization (subtitles and graphics):
- GfxPrepare: set up the render from the order's master and translation.
- GfxEncode: MediaConvert transcode of the base video.
- GfxEncodeAudio: audio handling (waitForTaskToken).
- GfxSubtitles: AWS Batch container that renders/burns subtitles and graphics.
- GfxEncodeOffline: produce the reviewable offline.
- CopyFiles: a map state fanning out CopyFile to place outputs.
- GfxCreateOffline: register the offline back on the work request.

GfxError handles failures, and S3-triggered completion Lambdas (gfxEncodeComplete, gfxEncodeOfflineComplete, gfxEncodeAudioComplete) advance the flow as each render lands.
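
For readers unfamiliar with the two integration patterns named above, the sketch below builds a fragment of Amazon States Language as a Python dict. It is not copied from gfx.asl.json: the function name, Lambda name, job queue, job definition, and the `$.order` path are all hypothetical, and it assumes GfxEncodeAudio is a Lambda invocation. Only the two Resource ARNs and the `$$.Task.Token` context path are standard Step Functions syntax.

```python
import json
from typing import Dict


def build_gfx_fragment() -> Dict:
    """Return two illustrative Task states; every resource name is a placeholder."""
    catch_all = [{"ErrorEquals": ["States.ALL"], "Next": "GfxError"}]
    return {
        "GfxEncodeAudio": {
            "Type": "Task",
            # Pauses until something calls back with the task token.
            "Resource": "arn:aws:states:::lambda:invoke.waitForTaskToken",
            "Parameters": {
                "FunctionName": "gfxEncodeAudio",  # hypothetical function name
                "Payload": {
                    "order.$": "$.order",  # hypothetical input path
                    "taskToken.$": "$$.Task.Token",
                },
            },
            "Catch": catch_all,
            "Next": "GfxSubtitles",
        },
        "GfxSubtitles": {
            "Type": "Task",
            # .sync makes the state wait for the Batch job to finish.
            "Resource": "arn:aws:states:::batch:submitJob.sync",
            "Parameters": {
                "JobName": "gfx-subtitles",  # hypothetical
                "JobQueue": "gfx-queue",  # hypothetical
                "JobDefinition": "gfx-subtitles-def",  # hypothetical
            },
            "Catch": catch_all,
            "Next": "GfxEncodeOffline",
        },
    }


if __name__ == "__main__":
    print(json.dumps(build_gfx_fragment(), indent=2))
```

The difference worth noticing is who ends the wait: with `.sync` Step Functions tracks the Batch job itself, while with `.waitForTaskToken` an outside party (here, presumably the S3-triggered completion Lambda) has to report back.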
How it connects to platform2
The API starts the work: an ingest or a work request kicks off the relevant state machine. Slurpee3 processes the media and calls back to the API to update ingest, asset, and order status as steps complete (the waitForTaskToken pattern lets the API or a verification step resume a paused execution). Finished files land in S3 and are registered as platform files.
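
Resuming a paused execution comes down to one call with the task token that the paused state handed out. The sketch below shows the idea under some assumptions: the function name, the `verified` flag, and the "VerificationRejected" error code are invented for illustration, and the client is passed in so the snippet does not depend on AWS credentials. With boto3 the client would be `boto3.client("stepfunctions")`, whose `send_task_success` and `send_task_failure` methods take the keyword arguments used here.

```python
import json
from typing import Any, Dict


def resume_execution(
    sfn_client: Any, task_token: str, verified: bool, result: Dict
) -> None:
    """Resume (or fail) an execution paused on a waitForTaskToken state."""
    if verified:
        # The JSON string becomes the output of the paused state.
        sfn_client.send_task_success(
            taskToken=task_token, output=json.dumps(result)
        )
    else:
        # Hypothetical error code; a Catch on the state can route it to an error step.
        sfn_client.send_task_failure(
            taskToken=task_token,
            error="VerificationRejected",
            cause=json.dumps(result),
        )
```

Whoever holds the token can make this call, which is why the token has to be stored somewhere the API or the completion Lambda can find it.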
Supporting pieces
- mediainfo/: a Lambda wrapping the MediaInfo CLI (bundles libcurl) for metadata extraction.
- copyFile/: a Python Lambda for S3-to-S3 copies.
- collect-metrics/: gathers pipeline metrics.
- cdk/autosub/: the CDK stack for the autosub side of localization.
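
An S3-to-S3 copy of the kind copyFile/ performs can be sketched in a few lines. This is not the actual copyFile source: the function name, parameters, and return value are assumptions, and the S3 client is injected so the snippet runs without AWS access (with boto3 it would be `boto3.client("s3")`).

```python
from typing import Any, Dict


def copy_s3_object(
    s3_client: Any, src_bucket: str, src_key: str, dst_bucket: str, dst_key: str
) -> Dict[str, str]:
    """Copy one object between buckets server-side and report where it landed."""
    s3_client.copy_object(
        Bucket=dst_bucket,
        Key=dst_key,
        CopySource={"Bucket": src_bucket, "Key": src_key},
    )
    return {"bucket": dst_bucket, "key": dst_key}
```

A single `copy_object` request is limited to objects of up to 5 GB, so for large media files boto3's managed `copy` method, which switches to multipart copies, is probably the safer choice.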
Deploying
- SAM: cd slurpee3 && ./deploy.sh (uses template.yaml / packaged.yml).
- CDK: from slurpee3/cdk/, the standard cdk deploy flow.

Step Functions bills per state transition, so the workflows keep state machines lean: choices are pre-computed into simple string/boolean hinges, and a transition is incurred only when a step genuinely needs orchestration (e.g. spanning Lambda → Batch → MediaConvert).
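
One way to picture a pre-computed hinge: a Lambda that already runs early in the pipeline returns a few flat flags, and a single Choice state branches on one of them instead of spending extra states on the decision. The sketch below is an assumption-heavy illustration; the function name, the flag names (isVideo, copyDownload, mediaKind), and the track-type strings are hypothetical, and only the Choice state syntax is standard Amazon States Language.

```python
from typing import Dict, List, Union


def precompute_hinges(
    track_types: List[str], download_requested: bool
) -> Dict[str, Union[bool, str]]:
    """Reduce detected metadata to flat flags a Choice state can test directly."""
    is_video = "Video" in track_types
    if is_video:
        media_kind = "video"
    elif "Audio" in track_types:
        media_kind = "audio"
    else:
        media_kind = "other"
    return {
        "isVideo": is_video,
        "copyDownload": bool(download_requested),
        "mediaKind": media_kind,
    }


# A Choice state that reads one flag; the state names come from the ingest pipeline.
CHOICE_STATE = {
    "Type": "Choice",
    "Choices": [
        {"Variable": "$.isVideo", "BooleanEquals": True, "Next": "ingestScratchPreview"}
    ],
    "Default": "ingestPrepare",
}
```

Because the flags are computed inside a step that runs anyway, the decision itself costs one Choice transition rather than an additional Lambda state.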