Why Semantic Attributes?
Many languages have grammatical gender agreement. Without knowing the gender of a person, translation quality suffers.
Enable Semantic Enrichment
Semantic Attributes
Person Gender
| Attribute Value | Meaning | Example Names |
|---|---|---|
| gender="male" | Masculine name | John, Michael, Hans |
| gender="female" | Feminine name | Maria, Sarah, Anna |
| gender="neutral" | Ambiguous/unknown | Alex, Jordan, Sam |
| gender="unknown" | Not found in database | — |
Location Scope
| Attribute Value | Meaning | Examples |
|---|---|---|
| scope="city" | City/town | Berlin, Paris, Tokyo |
| scope="country" | Country | Germany, France, Japan |
| scope="region" | Region/state | Bavaria, California, Hokkaido |
| scope="unknown" | Not found in database | — |
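As a rough illustration of how a downstream consumer might read these attributes, the sketch below pulls `gender` and `scope` out of a tag string. The tag layout (`<PII type="PERSON" ... />`) and the function name `parseSemanticAttributes` are assumptions made for this example; only the attribute names and their allowed values come from the tables above.

```typescript
// Allowed attribute values, taken from the tables above.
type PersonGender = "male" | "female" | "neutral" | "unknown";
type LocationScope = "city" | "country" | "region" | "unknown";

interface SemanticAttributes {
  gender?: PersonGender;
  scope?: LocationScope;
}

const GENDERS: PersonGender[] = ["male", "female", "neutral", "unknown"];
const SCOPES: LocationScope[] = ["city", "country", "region", "unknown"];

// Hypothetical helper: reads key="value" pairs from a tag string.
// The tag layout is assumed; values outside the documented sets are ignored.
function parseSemanticAttributes(tag: string): SemanticAttributes {
  const result: SemanticAttributes = {};
  const pattern = /([A-Za-z]+)="([^"]*)"/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(tag)) !== null) {
    const key = match[1] ?? "";
    const value = match[2] ?? "";
    if (key === "gender" && GENDERS.indexOf(value as PersonGender) !== -1) {
      result.gender = value as PersonGender;
    } else if (key === "scope" && SCOPES.indexOf(value as LocationScope) !== -1) {
      result.scope = value as LocationScope;
    }
  }
  return result;
}
```

Attributes the sketch does not recognize (such as `type` or `id` in the assumed tag) are skipped, so a tag without semantic enrichment yields an empty object.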
Semantic Data
Semantic enrichment uses lookup databases (~4 MB total):
- Name database: 40K+ first names with gender associations (from gender-guesser)
- City database: 25K+ cities from GeoNames (population > 15K)
- Country database: Country names and ISO codes
- Region database: First-level administrative divisions
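The lookup behaviour these databases imply can be sketched with a pair of tiny in-memory tables. This is not the library's implementation: the sample tables, `genderOf`, `scopeOf`, and the case-insensitive matching are all assumptions for illustration. The one behaviour taken from the documentation is the `"unknown"` fallback for entries that are not found.

```typescript
type GenderValue = "male" | "female" | "neutral" | "unknown";
type ScopeValue = "city" | "country" | "region" | "unknown";

// Tiny illustrative samples; the real databases hold tens of thousands of entries.
const SAMPLE_NAMES: Record<string, GenderValue> = {
  john: "male",
  maria: "female",
  alex: "neutral",
};
const SAMPLE_PLACES: Record<string, ScopeValue> = {
  berlin: "city",
  germany: "country",
  bavaria: "region",
};

// Looks up a normalized key and returns the fallback when it is missing.
// hasOwnProperty guards against inherited keys such as "constructor".
function lookup<T>(table: Record<string, T>, key: string, fallback: T): T {
  const normalized = key.trim().toLowerCase();
  return Object.prototype.hasOwnProperty.call(table, normalized)
    ? (table[normalized] as T)
    : fallback;
}

function genderOf(firstName: string): GenderValue {
  return lookup(SAMPLE_NAMES, firstName, "unknown");
}

function scopeOf(place: string): ScopeValue {
  return lookup(SAMPLE_PLACES, place, "unknown");
}
```

For example, `genderOf("Maria")` yields `"female"`, while a name missing from the sample table yields `"unknown"`.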
First-Use Download
Manual Data Management
Title Extraction
When semantic enrichment is enabled, honorific titles are extracted and kept visible:
- Academic: Dr., Prof., PhD
- Honorific: Mr., Mrs., Ms., Miss
- Professional: Rev., Hon.
- German: Herr, Frau, Dr.
- French: M., Mme., Mlle.
- And many more…
Excluding Location Scopes
When semantic enrichment is enabled, you can selectively skip anonymization of locations by their scope. This is useful when you want to keep country and region names visible while still anonymizing cities and addresses.
| Scope | What it covers | Examples |
|---|---|---|
| city | Cities and towns | Berlin, Paris, Tokyo |
| country | Countries | Germany, France, Japan |
| region | States, provinces, regions | Bavaria, California |
| unknown | Locations not found in the database | — |
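The effect of excluding scopes can be pictured as a filter over detected locations. The sketch below is an assumption about the behaviour, not the library's code: `DetectedLocation` and `locationsToAnonymize` are hypothetical names, and the `semanticEnabled` flag stands in for the requirement that scope information is only available when semantic enrichment is on.

```typescript
type Scope = "city" | "country" | "region" | "unknown";

// Hypothetical shape of a detected location for this example.
interface DetectedLocation {
  text: string;
  scope: Scope;
}

// Returns the locations that would still be anonymized.
function locationsToAnonymize(
  detected: DetectedLocation[],
  excludedScopes: Scope[],
  semanticEnabled: boolean
): DetectedLocation[] {
  // Without semantic enrichment there is no scope information,
  // so the exclusion list has no effect.
  if (!semanticEnabled) {
    return detected;
  }
  return detected.filter((loc) => excludedScopes.indexOf(loc.scope) === -1);
}

const detected: DetectedLocation[] = [
  { text: "Berlin", scope: "city" },
  { text: "Germany", scope: "country" },
  { text: "Bavaria", scope: "region" },
];

// Keep countries and regions visible; cities are still anonymized.
const remaining = locationsToAnonymize(detected, ["country", "region"], true);
```

With the sample input, `remaining` contains only the Berlin entry.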
This option requires semantic: { enabled: true } because scope classification depends on the GeoNames lookup database. Without semantic enrichment, excludeLocationScopes is silently ignored.
Locale Hints
Locale hints improve detection accuracy for:
- Name gender inference (culture-specific names)
- Title recognition (Mr. vs Herr vs M.)
Configuration Options
Cache Locations
Semantic data is cached locally:
Node.js
| Platform | Location |
|---|---|
| macOS | ~/Library/Caches/rehydra/semantic-data/ |
| Linux | ~/.cache/rehydra/semantic-data/ |
| Windows | %LOCALAPPDATA%/rehydra/semantic-data/ |
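If you need to locate or clear the cache from a script, the table can be expressed as a small resolver. This is a sketch only: `semanticCacheDir` is a hypothetical helper, the platform strings follow Node's `process.platform` values, and the home and `%LOCALAPPDATA%` directories are passed in rather than read from the environment so the function stays free of side effects.

```typescript
type Platform = "darwin" | "linux" | "win32";

// Hypothetical helper mirroring the cache location table above.
// home: the user's home directory; localAppData: the value of %LOCALAPPDATA%.
function semanticCacheDir(platform: Platform, home: string, localAppData: string): string {
  if (platform === "darwin") {
    return `${home}/Library/Caches/rehydra/semantic-data/`;
  }
  if (platform === "linux") {
    return `${home}/.cache/rehydra/semantic-data/`;
  }
  // win32: the documented location lives under %LOCALAPPDATA%.
  return `${localAppData}/rehydra/semantic-data/`;
}
```

In a Node.js script you would typically supply `os.homedir()` for `home` and `process.env.LOCALAPPDATA` for `localAppData`.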
Browser
Uses IndexedDB for cross-session persistence.
Use Cases
Machine Translation
German, French, Spanish, and many other languages have grammatical gender. Semantic attributes help MT systems choose correctly gendered articles, adjectives, and pronouns.
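To make the point concrete, here is a minimal sketch of how a downstream system could use the gender attribute when producing a German salutation. `germanSalutation` is a hypothetical helper, not part of the library; the neutral fallback is one reasonable choice when the gender is ambiguous or unknown.

```typescript
type Gender = "male" | "female" | "neutral" | "unknown";

// Hypothetical helper: picks a German salutation that agrees with the gender.
function germanSalutation(lastName: string, gender: Gender): string {
  if (gender === "male") {
    return `Sehr geehrter Herr ${lastName}`;
  }
  if (gender === "female") {
    return `Sehr geehrte Frau ${lastName}`;
  }
  // neutral or unknown: avoid a gendered form entirely.
  return "Guten Tag";
}
```

Without the attribute, a system has to guess between "geehrter Herr" and "geehrte Frau", because the name itself has been replaced by a placeholder.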
Location Prepositions
Different location types use different prepositions. The scope attribute helps translation systems choose correctly.
Contextual Processing
Beyond translation, semantic attributes enable:
- Gender-aware text generation
- Location-based content filtering
- Name normalization
Next Steps
- Browser Usage: Using semantic enrichment in browsers
- Sessions & Storage: Persist enriched PII maps