We align the model at three levels, each addressing a distinct
failure mode:
Level 1: Anti-Hallucination (The Safety Layer)
Objective: Eliminate "Critical Failures": instances
where the model introduces a concept that is not present in the source text.
The Challenge: Neural machine translation (NMT) models generate
translations token by token, choosing each token from a probability
distribution. When faced with ambiguous or rare terms, they "hallucinate"
by selecting high-frequency alternatives that fit grammatically but
distort the meaning.
Common Hallucination Patterns in Patent Translation
Pattern 1: Polysemy Misresolution
Source: "The chip performs a U-turn during signal routing..."
❌ Generic NMT:
"La puce effectue un demi-tour pendant le routage du signal..."
(Treats "U-turn" as a literal vehicular maneuver)
✓ Domain-Aligned Model:
"La puce effectue un retour en U pendant le routage du signal..."
(Recognizes "U-turn" as chip routing terminology)
Pattern 2: Context-Free Substitution
Source: "...the package comprises a leadframe..."
❌ Generic NMT:
"...l'emballage comprend un cadre de plomb..."
("leadframe" → "cadre de plomb", literally "frame made of lead",
suggesting a material composition rather than a semiconductor component;
"package" is likewise misread as "emballage", i.e. wrapping)
✓ Domain-Aligned Model:
"...le boîtier comprend un cadre de connexion..."
(Recognizes "leadframe" as standard semiconductor packaging term)
Why It Matters: Hallucinations can
broaden or narrow the scope of protection in ways that
contradict the inventor's intent. A "lead frame" made of lead is not the
same invention as a "leadframe" (connection structure).
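The two patterns above suggest a simple post-translation safeguard: flag any segment where a known source term appears alongside its literal misrendering. The following is a minimal sketch under stated assumptions, not a description of how the aligned model works. The `KnownMisrendering` type, the `WATCHLIST` contents (taken from the two examples above), and the `flag_misrenderings` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KnownMisrendering:
    source_term: str  # English term as it appears in the source text
    bad_target: str   # literal French rendering that should be flagged
    preferred: str    # domain-appropriate rendering


# Hypothetical watchlist, populated only with the two examples above.
WATCHLIST = [
    KnownMisrendering("U-turn", "demi-tour", "retour en U"),
    KnownMisrendering("leadframe", "cadre de plomb", "cadre de connexion"),
]


def flag_misrenderings(source, target, watchlist=WATCHLIST):
    """Return the watchlist entries whose source term occurs in `source`
    and whose literal rendering occurs in `target` (case-insensitive)."""
    src, tgt = source.casefold(), target.casefold()
    return [
        entry
        for entry in watchlist
        if entry.source_term.casefold() in src
        and entry.bad_target.casefold() in tgt
    ]


generic = flag_misrenderings(
    "...the package comprises a leadframe...",
    "...l'emballage comprend un cadre de plomb...",
)
aligned = flag_misrenderings(
    "...the package comprises a leadframe...",
    "...le boîtier comprend un cadre de connexion...",
)
```

Here `generic` holds the "leadframe" entry and `aligned` is empty. Plain substring matching is a deliberate simplification; a production check would need tokenization and inflection handling.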
Level 2: Terminological Precision (The Accuracy Layer)
Objective: Override generic synonyms with specific,
client-approved nomenclature.
The Challenge: NMT models treat "plastic," "resin," "polymer,"
and "thermoplastic" as near-interchangeable because their embeddings lie
close together in vector space. In patent prosecution, however,
these terms are legally distinct.
Source: "The housing is formed from a thermoplastic resin..."
❌ Generic NMT (Random Synonym Selection):
Translation 1: "Le boîtier est formé d'une résine thermoplastique..."
Translation 2: "Le boîtier est formé d'un plastique thermoplastique..."
Translation 3: "Le boîtier est formé d'un polymère thermoplastique..."
Problem: All three read as plausible French, but only one matches
the client's approved terminology and the wording used in the relevant
prior art.
Legal Impact: If a competitor's patent uses "polymer" and
your client's patent uses "thermoplastic resin," the terminological
distinction may be the basis for establishing product differentiation.
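The override described in this level can be checked mechanically once a client glossary exists. The sketch below assumes a glossary that maps each English source term to exactly one approved French rendering. The `APPROVED` dictionary and the `check_glossary` function are hypothetical, and the single glossary entry is taken from the example above purely for illustration.

```python
# Hypothetical client glossary: English source term -> approved French term.
APPROVED = {"thermoplastic resin": "résine thermoplastique"}


def check_glossary(source, target, glossary=APPROVED):
    """Return (source_term, approved_term) pairs for which the source term
    occurs in `source` but the approved rendering is missing from `target`."""
    src, tgt = source.casefold(), target.casefold()
    return [
        (term, approved)
        for term, approved in glossary.items()
        if term.casefold() in src and approved.casefold() not in tgt
    ]


source = "The housing is formed from a thermoplastic resin..."
candidates = [
    "Le boîtier est formé d'une résine thermoplastique...",
    "Le boîtier est formé d'un plastique thermoplastique...",
    "Le boîtier est formé d'un polymère thermoplastique...",
]
# One list of violations per candidate translation.
violations = [check_glossary(source, c) for c in candidates]
```

Only the first candidate yields an empty violation list; the other two are reported as missing "résine thermoplastique". A check like this detects deviations after the fact; it does not by itself steer the model toward the approved term.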
Level 3: In-Context Consistency (The Coherence Layer)
Objective: Ensure that terminology remains stable across
the entire document.
The Challenge: Generic NMT models keep no record of the term
choices they made earlier in a document. Each sentence is translated
largely independently, so the same source term can surface under
different names across the claims of a single patent.
The Consistency Failure Pattern
Claim 1: "A device comprising a guide member..."
→ "Dispositif comprenant un élément de guidage..." ✓
Claim 5: "The device of claim 1, wherein the guide member..."
→ ❌ "Le dispositif selon la revendication 1, le guide..."
(Switches from "élément de guidage" → "guide")
Claim 9: "The device of claim 1, wherein the guide member..."
→ ❌ "Le dispositif selon la revendication 1, le membre directeur..."
(Switches to an entirely different term, "membre directeur")
Why It Matters: Patent examiners and courts may interpret
term variation as intentional claim differentiation. If
"guide member" becomes three different French terms, the claims risk
being rejected for indefiniteness.
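The failure pattern above can be detected by collecting, for each source term, every rendering it receives across the document and reporting any term with more than one. This is a minimal sketch, assuming the candidate renderings for each term are already known (for example from a glossary plus previously observed variants). The `find_inconsistent_terms` function and its inputs are hypothetical and use only the claims quoted above.

```python
from collections import defaultdict


def find_inconsistent_terms(segments, renderings_by_term):
    """segments: list of (label, source_text, target_text).
    renderings_by_term: source term -> list of candidate target renderings.
    Returns {source_term: {rendering: [labels]}} for every term that is
    rendered in more than one way across the segments."""
    report = {}
    for term, renderings in renderings_by_term.items():
        seen = defaultdict(list)
        # Try longer renderings first so that a short variant is not
        # matched inside a longer one.
        ordered = sorted(renderings, key=len, reverse=True)
        for label, source, target in segments:
            if term.casefold() not in source.casefold():
                continue
            for rendering in ordered:
                if rendering.casefold() in target.casefold():
                    seen[rendering].append(label)
                    break
        if len(seen) > 1:
            report[term] = dict(seen)
    return report


claims = [
    ("Claim 1", "A device comprising a guide member...",
     "Dispositif comprenant un élément de guidage..."),
    ("Claim 5", "The device of claim 1, wherein the guide member...",
     "Le dispositif selon la revendication 1, le guide..."),
    ("Claim 9", "The device of claim 1, wherein the guide member...",
     "Le dispositif selon la revendication 1, le membre directeur..."),
]
known = {"guide member": ["élément de guidage", "guide", "membre directeur"]}
report = find_inconsistent_terms(claims, known)
```

For the three claims above, `report` maps "guide member" to three distinct renderings, one per claim, which is exactly the drift a reviewer would need to resolve before filing.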