phrases collection (Phrases model). Each phrase records a single source line (ov) and its translation (text), scoped by project and language.
Data Model
Phrases (api/models/Phrases.php)
| Field | Type | Notes |
|---|---|---|
| _id | id | |
| project | id | The project the phrase belongs to. Match scope. |
| asset | id | The asset it was indexed from. Recorded but not used when fetching. |
| translation | id | The translation document it was indexed from. Used to wipe a doc’s phrases on re-index. |
| language | string | Target language code. Match scope. |
| ov | string | The tag/newline-stripped source (OV) line. Match key. |
| text | string | The translated line returned to the editor. |
| print | boolean | true if generated from a print asset’s printLines. Written but never read. |

project and asset must exist; language, ov, and text must be non-empty. A blank cleaned-OV or blank translation silently fails to save.
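
The validation rule above can be pictured with a small JavaScript sketch. It is not the PHP model: `isValidPhrase` is a hypothetical name, and "must exist" is approximated here as "is not null", whereas the real model presumably checks the referenced documents.

```javascript
// Hypothetical stand-in for the Phrases validation rules (not the PHP model).
// Assumption: "must exist" is modelled as "is not null/undefined".
function isValidPhrase(phrase) {
  const nonEmpty = (value) => typeof value === "string" && value.length > 0;
  return (
    phrase.project != null &&
    phrase.asset != null &&
    nonEmpty(phrase.language) &&
    nonEmpty(phrase.ov) &&
    nonEmpty(phrase.text)
  );
}

// A phrase whose cleaned OV is blank is rejected, so nothing is saved for it.
const saved = isValidPhrase({ project: "p1", asset: "a1", language: "fr", ov: "", text: "Bonjour" });
console.log(saved);
```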
Indexing (writing phrases)
Trigger
Indexing runs through the GeneratePhrases save filter, wired into the Translations model save chain (api/models/Translations.php:41, implemented in api/models/translations/save/GeneratePhrases.php). It fires on every translation save, but indexes only if all of the following hold:

- status === 'submitted': a for-review save does not index. The later approval save (which flips status to submitted) is what indexes a reviewed translation.
- multiTranslation is empty: dual-language / multiTranslation submissions never index.
- translator !== "Pixwel": transcriptions (OV authoring) are excluded.
- The asset has an OV transcription of the matching type. The type is resolved from the work request:
  - graphics translation → autogfx (if auto) or graphics
  - dialogue translation → dialogue (graphics asset) or _id
  - otherwise → _id

indexByOV and indexByPrintOV.
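
The gate conditions listed under Trigger can be summarised in a short JavaScript sketch. This models the documented rules only; the real filter is PHP. `shouldIndex` and the boolean `hasMatchingOvTranscription` are hypothetical, and the shape of an "empty" multiTranslation (null, empty string, or empty array) is an assumption.

```javascript
// Hypothetical model of the GeneratePhrases gate; not the PHP implementation.
// Assumption: "empty" multiTranslation means null/undefined, "" or [].
function shouldIndex(translation, hasMatchingOvTranscription) {
  const multi = translation.multiTranslation;
  const multiIsEmpty =
    multi == null || multi === "" || (Array.isArray(multi) && multi.length === 0);
  return (
    translation.status === "submitted" && // for-review saves do not index
    multiIsEmpty && // dual-language submissions never index
    translation.translator !== "Pixwel" && // transcriptions are excluded
    hasMatchingOvTranscription === true // OV type resolution is done by the caller
  );
}

// A reviewed translation is skipped until the approval save flips its status.
console.log(shouldIndex({ status: "for-review", translator: "Vendor" }, true));
console.log(shouldIndex({ status: "submitted", translator: "Vendor" }, true));
```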
Per-line rules (indexByOV / indexByPrintOV)
api/models/Translations.php:394 and :456.
- Bail entirely if count(translation lines) !== count(OV lines), or the translation doesn’t exist. Matching is strictly by line position (k), not by content.
- Wipe all existing phrases for this translation first (Phrases::remove(['translation' => _id]), plus print: true for the print variant).
- Per line, skip if text is empty.
- Per line, skip if custom === false && machine === false. Only lines that were user-edited (custom) or machine-translated (machine) are indexed. Lines taken verbatim from a TM match (auto) or left as untranslated OV are skipped. (See Line source flags.)
- Compute the match key: cleanLine(OV_line[k].text), which strips HTML tags, \r, and \n.
- Remove any prior phrase with the same {ov, project, language} (+ print: true for print), so a {project, language, ov} triple holds at most one phrase.
- Create the new Phrases document.

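
As a rough illustration of the per-line rules, the following JavaScript sketch replays them against an in-memory array. It is a model, not the PHP code: the array stands in for the phrases collection, the `cleanLine` regex is an approximation of the real normalizer, only the non-print variant is shown, and the field names on `translation` are assumed.

```javascript
// Approximation of cleanLine: drop HTML tags, \r and \n.
function cleanLine(text) {
  return String(text).replace(/<[^>]*>/g, "").replace(/[\r\n]/g, "");
}

// phrases: plain array used as a stand-in for the phrases collection.
// Returns the new phrase list; the input array is not mutated.
function indexByOv(phrases, translation, ovLines) {
  if (!translation || translation.lines.length !== ovLines.length) {
    return phrases; // bail: matching is positional, so counts must agree
  }
  // Wipe everything previously indexed from this translation.
  let next = phrases.filter((p) => p.translation !== translation._id);
  translation.lines.forEach((line, k) => {
    if (!line.text) return; // empty translation
    if (!line.custom && !line.machine) return; // TM-filled or untranslated OV
    const ov = cleanLine(ovLines[k].text);
    if (ov === "") return; // a blank cleaned OV would fail validation anyway
    // Keep at most one phrase per {project, language, ov}.
    next = next.filter(
      (p) =>
        !(p.ov === ov && p.project === translation.project && p.language === translation.language)
    );
    next.push({
      project: translation.project,
      asset: translation.asset,
      translation: translation._id,
      language: translation.language,
      ov,
      text: line.text,
    });
  });
  return next;
}
```

Returning a new array rather than mutating the input keeps the sketch easy to test; the real code issues remove and create calls against the database instead.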
Line source flags
The subtitler maps the editor’s translationFrom value onto the stored flags when saving (ui/3x/modules/services/subtitle-service.js:307-309):

| translationFrom | auto | custom | machine | Indexed? |
|---|---|---|---|---|
| 'custom' (user-edited) | false | true | false | ✅ |
| 'mt' (machine translation) | false | false | true | ✅ |
| 'tm' (filled from a TM match) | true | false | false | ❌ skipped |
| 'ov' (untranslated / fell back to original) | false | false | false | ❌ skipped |

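
The flag table can be read as a pure mapping. The sketch below reproduces it in JavaScript; `toStoredFlags` and `isIndexed` are hypothetical names chosen for illustration and are not the functions in subtitle-service.js.

```javascript
// Hypothetical mapping that mirrors the table above.
function toStoredFlags(translationFrom) {
  return {
    auto: translationFrom === "tm",
    custom: translationFrom === "custom",
    machine: translationFrom === "mt",
  };
}

// A line is indexed only when it was user-edited or machine-translated.
function isIndexed(flags) {
  return flags.custom || flags.machine;
}

for (const source of ["custom", "mt", "tm", "ov"]) {
  console.log(source, toStoredFlags(source), isIndexed(toStoredFlags(source)));
}
```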
Bulk reindex
Assets::indexTranslations() (api/models/Assets.php:580) wipes all phrases for an asset and re-runs indexByOV for every submitted, non-OV translation on it. Only invoked by the RegeneratePhrases migration (api/migrations/RegeneratePhrases.php) — never from the UI.
Fetching (reading phrases)
Endpoint
GET /translations/translate?id=…&language=… (api/controllers/Translations.php:18).
Adding machine=true routes to AWS Translate instead of TM. Resolves to Translations::translate() → getMemoryTranslation() (api/models/Translations.php:239).
Frontend input
getTranslationId(workRequest) (ui/3x/utils/orders.js:175) supplies the id. It is the target translation’s _id (dialogue, graphics, or both, depending on order mode):
| Order mode | id returned |
|---|---|
| print, script | workRequest.translation._id |
| gfx, autogfx | workRequest.graphicsTranslation._id |
| script+gfx | [translation._id, graphicsTranslation._id] |

For a dual-language code (e.g. GER-PFR), the frontend splits the language, makes one call per language, and merges results positionally with a <span></span> separator to match the dual-language editor rendering (ui/3x/modules/services/translation-service.js getTranslationMemories).
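
A minimal sketch of the split-and-merge step, in JavaScript. This is not the code in translation-service.js: `mergePositionally`, `getDualLanguageMemories`, and the injected `fetchMemories(id, language)` callback are hypothetical, and the handling of a line matched in only one language (an empty slot next to the separator) is an assumption the source does not confirm.

```javascript
const SEPARATOR = "<span></span>";

// resultsPerLanguage: one array of (string | null) per language, same order as the code.
function mergePositionally(resultsPerLanguage) {
  const lineCount = Math.max(0, ...resultsPerLanguage.map((r) => r.length));
  const merged = [];
  for (let k = 0; k < lineCount; k++) {
    const parts = resultsPerLanguage.map((r) => r[k] ?? null);
    // Assumption: no match in any language stays null; partial matches keep an empty slot.
    merged.push(
      parts.every((p) => p === null) ? null : parts.map((p) => p ?? "").join(SEPARATOR)
    );
  }
  return merged;
}

// fetchMemories is a caller-supplied async function standing in for the TM request.
async function getDualLanguageMemories(fetchMemories, id, language) {
  const codes = language.split("-"); // e.g. "GER-PFR" -> ["GER", "PFR"]
  const results = await Promise.all(codes.map((code) => fetchMemories(id, code)));
  return mergePositionally(results);
}
```

Splitting on the hyphen assumes the language value is exactly a pair of project language codes joined by "-", as in GER-PFR.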
Server lookup rules (getMemoryTranslation)
- Load the document by the passed _id. Use printLines for document/image media types, otherwise lines.
- For each line, look up a Phrases match on exactly { project, ov: cleanLine(line.text), language }. Matching is by exact tag/newline-stripped source-text equality, scoped to project + language, not asset. This is what enables reuse of a translation across different assets in the same project.
- order: ['_id' => 'desc']: on multiple matches, the most recently created phrase wins.
- Return one entry per line: the phrase’s text, or null when there is no match.
- The lookup does not filter on print.

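
The lookup rules can be sketched in JavaScript against an in-memory array. This is an illustration of the documented behaviour, not the PHP method: the `mediaType` field name, the numeric `_id` values (standing in for ids that sort by creation order), and the `cleanLine` regex are assumptions.

```javascript
// Approximation of cleanLine: drop HTML tags, \r and \n.
function cleanLine(text) {
  return String(text).replace(/<[^>]*>/g, "").replace(/[\r\n]/g, "");
}

// phrases: array standing in for the phrases collection.
function getMemoryTranslation(phrases, doc, language) {
  const usePrint = doc.mediaType === "document" || doc.mediaType === "image";
  const lines = usePrint ? doc.printLines : doc.lines;
  return lines.map((line) => {
    const ov = cleanLine(line.text);
    // Scope is project + language + cleaned source text; asset and print are ignored.
    const matches = phrases.filter(
      (p) => p.project === doc.project && p.language === language && p.ov === ov
    );
    if (matches.length === 0) return null;
    // Equivalent of ordering by _id descending and taking the first row.
    const newest = matches.reduce((a, b) => (b._id > a._id ? b : a));
    return newest.text;
  });
}
```

Because `print` is not part of the filter, a print phrase and a subtitle phrase with the same cleaned source text compete, and the newer one is returned.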
Subtitler Editor — auto-translation & provenance
This section covers how the subtitler applies TM and MT to the editor and how each line’s source is shown. (Distinct from indexing/fetching above, which is the API side.)
Toggles and layering
The subtitler has two independent toggles, Translation Memories and Machine Translations, that can both be on at once. TranslationService.autoTranslate() resolves each line to the highest-priority source that has data:
Precedence: Custom > TM > MT > OV
- Custom: a manual edit. Always wins and is never overwritten by a toggle; autoTranslate returns custom lines untouched.
- TM: a translation-memory match (when the TM toggle is on).
- MT — a machine translation (when the MT toggle is on). MT is applied first, then TM overwrites per line where a match exists, so with both toggles on TM takes precedence and MT backfills the rest.
- OV — the fallback when no higher source applies (and the line isn’t a custom edit).
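
The layering order can be illustrated with a small JavaScript sketch. It is not TranslationService.autoTranslate() itself: the line shape ({ ov, text, translationFrom }) and the options object (tmOn, mtOn, tmResults, mtResults) are hypothetical, chosen only to show MT applied first and TM overwriting it per line.

```javascript
// Hypothetical model of the Custom > TM > MT > OV precedence.
function autoTranslate(lines, { tmOn, mtOn, tmResults = [], mtResults = [] }) {
  return lines.map((line, k) => {
    // Custom edits are returned untouched, whatever the toggles say.
    if (line.translationFrom === "custom") return line;
    // Start from the OV fallback, then layer MT, then TM on top.
    let resolved = { ...line, text: line.ov, translationFrom: "ov" };
    const mt = mtResults[k];
    const tm = tmResults[k];
    if (mtOn && mt != null) resolved = { ...resolved, text: mt, translationFrom: "mt" };
    if (tmOn && tm != null) resolved = { ...resolved, text: tm, translationFrom: "tm" };
    return resolved;
  });
}

const demo = autoTranslate(
  [
    { ov: "A", text: "mine", translationFrom: "custom" },
    { ov: "B", text: "B", translationFrom: "ov" },
    { ov: "C", text: "C", translationFrom: "ov" },
  ],
  { tmOn: true, mtOn: true, tmResults: ["tmA", "tmB", null], mtResults: ["mtA", "mtB", "mtC"] }
);
console.log(demo.map((line) => line.translationFrom)); // custom, tm, mt
```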
History
Layering and custom-edit preservation are the original behavior. PR #3187 (6ad8027b6, “allow user to enable only one of the two toggles…”, PLATFORM-3916, May 2026) made the toggles mutually exclusive and changed autoTranslate to overwrite custom edits (tracking a customText field to restore them on toggle-off). That PR has been reverted — the toggles are independent again and custom edits are preserved by precedence, so the customText machinery is gone. Every reverted behavior (mutual exclusion, custom overwrite, customText, the related tests) traces solely to #3187.
Provenance colors
Each line’s source is shown by a colored left accent bar on the translation field, using fixed semantic colors:

| Source | Color | Hex |
|---|---|---|
| Custom | green | #2f9e44 |
| TM | violet | #9d4edd |
| MT | orange | #f08c00 |

These colors are defined in ui/3x/constants/provenance.js as PROVENANCE_COLORS and are intentionally not part of the themeable palette (~/theme): provenance is a semantic status signal that must stay stable across themes/white-labeling.
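
One possible shape for such a constant, as a JavaScript sketch. Only the hex values come from the table above; the key names (custom, tm, mt) and the `accentColorFor` helper are assumptions, and the real provenance.js may be organised differently.

```javascript
// Hex values from the provenance table; key names are assumed for illustration.
const PROVENANCE_COLORS = Object.freeze({
  custom: "#2f9e44", // green
  tm: "#9d4edd", // violet
  mt: "#f08c00", // orange
});

// Hypothetical helper: sources without a listed color (e.g. OV) get no accent.
function accentColorFor(source) {
  return PROVENANCE_COLORS[source] ?? null;
}

console.log(accentColorFor("tm"));
console.log(accentColorFor("ov"));
```

Freezing the object reflects the intent that these values are fixed and never overridden by a theme.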
The same colors are reused for:
- A color key (ProvenanceKey, data-testid="sub-provenance-key") in the actions bar: Custom / Memories / Machine swatches + labels, the always-visible legend for the accent-bar colors.
- The TM / MT toggles: each toggle’s checked state takes its source color (accent prop on TranslationToggle), so a toggle visually matches the lines it produces.

:focus / .is-editing rules so the blue active-cell border still wins while editing.
Notes:
- There is no per-source icon. An earlier version colored a per-line icon (circleCheck / translationMemories / machineTranslations); it became redundant once the accent bar + color key carried the signal, and was removed. Split/merge icons remain (yellow).
- Non-color fallback: the field carries a title tooltip with the source label (Custom / Memories / Machine / OV), and the edit menu shows the same label as text for the active row.
- Icon resolves Theme[color] || color, so it accepts both theme keys and raw hex (kept for the split/merge icons and future use).
Key Rules
- Match scope is {project, language, ov-text}, never asset. TM is reused project-wide.
- Match key is the cleaned source line: HTML tags and line breaks are stripped on both write and read, so matching is exact on the visible source text only.
- Only user-edited (custom) or machine (machine) lines are indexed. Untouched OV lines and lines accepted verbatim from a TM suggestion are not written back.
- submitted status indexes; for-review does not. Reviewed translations index when they are later approved to submitted.
- multiTranslations never index, but the fetch path fully supports reading dual-language TM.
- Most recent phrase wins on duplicate matches (_id desc).
- Image assets index both subtitle and print phrases.
- Line count must match the OV or the whole translation is skipped during indexing.
Known Asymmetries
These are mismatches between the write and read paths, relevant to ongoing TM work:

- print is written but never read. indexByPrintOV tags phrases print: true, but getMemoryTranslation never filters on it. A print fetch can therefore return a non-print phrase (and vice versa), whichever is newest.
- multiTranslations are read but never written. Dual-language submissions contribute nothing to the memory, even though the fetch path does elaborate per-language merging to read them.
- Index records asset; fetch ignores it. Reuse is project-wide by design; confirm that is the intended boundary for any given workflow.
- Index keys on the OV transcription’s line text; fetch keys on the passed translation document’s line text. They align only because matching is positional on index and source-text-equality on read.
Code References
- GeneratePhrases::filter() (api/models/translations/save/GeneratePhrases.php): indexing trigger and gate conditions
- Translations::indexByOV() / indexByPrintOV() (api/models/Translations.php:394 / :456): per-line indexing
- Translations::translate() / getMemoryTranslation() (api/models/Translations.php:217 / :239): fetch logic
- Translations::cleanLine() (api/models/Translations.php:294): match-key normalization
- Assets::indexTranslations() (api/models/Assets.php:580): bulk reindex (migration only)
- Phrases (api/models/Phrases.php): phrase schema and validation
- translate route (api/controllers/Translations.php:18): endpoint binding
- TranslationService.getTranslationMemories() (ui/3x/modules/services/translation-service.js): frontend fetch + dual-language merge
- getTranslationId() (ui/3x/utils/orders.js:175): resolves which translation id to fetch
- fetchTranslationMemories() (ui/3x/modules/hooks/use-subtitler-queries.js:789): assembles TM into the editor
- SubtitleService.to2xTranslation() (ui/3x/modules/services/subtitle-service.js:298): maps editor source onto custom / machine / auto flags
- TranslationService.autoTranslate() (ui/3x/modules/services/translation-service.js): applies TM/MT to the editor (Custom > TM > MT > OV)
- PROVENANCE_COLORS (ui/3x/constants/provenance.js): fixed semantic source colors (not themed)
- Provenance rendering: ui/3x/modules/components/subtitler/segment/index.js (source-* classes + title on the field) and segment.css.js (left accent bar)
- Toggles & color key: ui/3x/pages/subtitler/index.js and subtitler.css.js (TranslationToggle accent prop, ProvenanceKey)