
Installation

npm install rehydra

Basic Usage

Regex-Only Mode (No Downloads Required)

For structured PII such as email addresses, phone numbers, IBANs, and credit card numbers:
import { anonymizeRegexOnly } from 'rehydra';

const result = await anonymizeRegexOnly(
  'Contact [email protected] or call +49 30 123456. IBAN: DE89370400440532013000'
);

console.log(result.anonymizedText);
// "Contact <PII type="EMAIL" id="1"/> or call <PII type="PHONE" id="2"/>. IBAN: <PII type="IBAN" id="3"/>"
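
To make the idea behind pattern-based detection concrete, here is a minimal standalone sketch that tags matches with numbered placeholders. It does not use rehydra: the `PATTERNS` list and the `tagWithRegex` helper are hypothetical, and the regular expressions are deliberately simplified assumptions rather than the library's real recognizers.

```typescript
// Illustrative only: simplified patterns, NOT rehydra's actual recognizers.
interface Pattern { type: string; regex: RegExp }
interface Match { type: string; start: number; end: number }

const PATTERNS: Pattern[] = [
  { type: "EMAIL", regex: /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/ },
  { type: "IBAN", regex: /\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b/ },
  { type: "PHONE", regex: /\+\d{1,3}(?: ?\d{2,}){2,}/ },
];

function tagWithRegex(text: string): string {
  // Collect every match of every pattern with its position.
  const matches: Match[] = [];
  for (const { type, regex } of PATTERNS) {
    const re = new RegExp(regex.source, "g");
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      matches.push({ type, start: m.index, end: m.index + m[0].length });
    }
  }
  // Walk matches left to right so ids follow reading order.
  matches.sort((a, b) => a.start - b.start);

  let out = "";
  let cursor = 0;
  let nextId = 1;
  for (const m of matches) {
    if (m.start < cursor) continue; // skip a match that overlaps an earlier one
    out += text.slice(cursor, m.start) + `<PII type="${m.type}" id="${nextId++}"/>`;
    cursor = m.end;
  }
  return out + text.slice(cursor);
}
```

Because no model is involved, an approach like this only finds values with a recognizable shape; names and organizations need the NER mode described next.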

Full Mode with NER

For detecting names, organizations, and locations, enable the NER model:
import { createAnonymizer } from 'rehydra';

const anonymizer = createAnonymizer({
  ner: { 
    mode: 'quantized',  // ~280 MB, auto-downloaded on first use
    onStatus: (status) => console.log(status),
  }
});

await anonymizer.initialize();

const result = await anonymizer.anonymize(
  'Hello John Smith from Acme Corp in Berlin!'
);

console.log(result.anonymizedText);
// "Hello <PII type="PERSON" id="1"/> from <PII type="ORG" id="2"/> in <PII type="LOCATION" id="3"/>!"

// Clean up when done
await anonymizer.dispose();
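
If anonymization can throw, you may want cleanup to run regardless. One possible pattern is a small wrapper around `try`/`finally`. The `Lifecycle` interface and `withInitialized` helper below are hypothetical names, not part of rehydra; the sketch only assumes an object exposing the `initialize()` and `dispose()` methods shown above.

```typescript
// Hypothetical helper: any object with initialize()/dispose() fits this shape.
interface Lifecycle {
  initialize(): Promise<void>;
  dispose(): Promise<void>;
}

// Initializes the resource, runs the callback, and always disposes afterwards,
// even when the callback rejects.
async function withInitialized<T extends Lifecycle, R>(
  resource: T,
  fn: (resource: T) => Promise<R>,
): Promise<R> {
  await resource.initialize();
  try {
    return await fn(resource);
  } finally {
    await resource.dispose();
  }
}
```

With an anonymizer created as above, usage could look like `await withInitialized(anonymizer, (a) => a.anonymize(text))`.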

Complete Translation Workflow

Here’s the full workflow for privacy-preserving translation:
import { 
  createAnonymizer, 
  decryptPIIMap, 
  rehydrate,
  InMemoryKeyProvider 
} from 'rehydra';

// 1. Create a key provider (required to decrypt later)
const keyProvider = new InMemoryKeyProvider();

// 2. Create anonymizer with key provider
const anonymizer = createAnonymizer({
  ner: { mode: 'quantized' },
  keyProvider
});

await anonymizer.initialize();

// 3. Anonymize before translation
const original = 'Hello John Smith from Acme Corp in Berlin!';
const result = await anonymizer.anonymize(original);

console.log(result.anonymizedText);
// "Hello <PII type="PERSON" id="1"/> from <PII type="ORG" id="2"/> in <PII type="LOCATION" id="3"/>!"

// 4. Translate (placeholders are preserved by most translation APIs)
const translated = await yourTranslateAPI(result.anonymizedText);
// "Hallo <PII type="PERSON" id="1"/> von <PII type="ORG" id="2"/> in <PII type="LOCATION" id="3"/>!"

// 5. Decrypt the PII map using the same key
const encryptionKey = await keyProvider.getKey();
const piiMap = await decryptPIIMap(result.piiMap, encryptionKey);

// 6. Rehydrate - replace placeholders with original values
const rehydrated = rehydrate(translated, piiMap);

console.log(rehydrated);
// "Hallo John Smith von Acme Corp in Berlin!"

// 7. Clean up
await anonymizer.dispose();

Key Points:
  • Save the encryption key: decrypting the PII map requires the same key that was used to encrypt it
  • Placeholders are XML-like: most translation services pass them through unchanged
  • PII stays local: original values never leave your system during translation
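
Since placeholder preservation depends on the translation service, it can be worth checking that every placeholder survived before rehydrating. The following standalone sketch compares placeholder ids in the anonymized and translated text. The helper names `placeholderIds` and `missingPlaceholders` are hypothetical, and the regular expression assumes the self-closing `<PII ... id="N"/>` form shown in the examples above.

```typescript
// Collects the id attribute of every <PII .../> placeholder in the text.
function placeholderIds(text: string): string[] {
  const re = /<PII\s+[^>]*?\bid="(\d+)"[^>]*\/>/g;
  const ids: string[] = [];
  let m: RegExpExecArray | null;
  while ((m = re.exec(text)) !== null) {
    ids.push(m[1]);
  }
  return ids;
}

// Returns ids present in the anonymized text but absent after translation.
function missingPlaceholders(anonymized: string, translated: string): string[] {
  const survived = new Set(placeholderIds(translated));
  return placeholderIds(anonymized).filter((id) => !survived.has(id));
}
```

An empty result means all placeholders came back; a non-empty one suggests the translated text should be retried or flagged instead of rehydrated.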

Semantic Enrichment

Attach gender and location-scope attributes to placeholders to give machine translation more context:
const anonymizer = createAnonymizer({
  ner: { mode: 'quantized' },
  semantic: { 
    enabled: true,  // Downloads ~12 MB of semantic data on first use
  }
});

await anonymizer.initialize();

const result = await anonymizer.anonymize('Hello Maria Schmidt from Berlin!');

console.log(result.anonymizedText);
// "Hello <PII type="PERSON" gender="female" id="1"/> from <PII type="LOCATION" scope="city" id="2"/>!"
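
If you need to inspect the enriched attributes yourself, for example to log which placeholders carry a `gender`, a small parser is enough. This is a standalone sketch with a hypothetical `parsePlaceholders` helper; it assumes double-quoted attribute values, as in the output above.

```typescript
// Extracts the attributes of each <PII .../> placeholder as a plain object.
function parsePlaceholders(text: string): Record<string, string>[] {
  const tagRe = /<PII\s+([^>]*?)\s*\/>/g;
  const result: Record<string, string>[] = [];
  let tag: RegExpExecArray | null;
  while ((tag = tagRe.exec(text)) !== null) {
    const attrs: Record<string, string> = {};
    const attrRe = /([\w-]+)="([^"]*)"/g;
    let attr: RegExpExecArray | null;
    while ((attr = attrRe.exec(tag[1])) !== null) {
      attrs[attr[1]] = attr[2]; // attribute name -> attribute value
    }
    result.push(attrs);
  }
  return result;
}
```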

Understanding the Result

Every anonymization returns a structured result:
const result = await anonymizer.anonymize('Contact [email protected]');

// The anonymized text with placeholder tags
result.anonymizedText;  // "Contact <PII type="EMAIL" id="1"/>"

// Detected entities (without original values for safety)
result.entities;        // [{ type: 'EMAIL', id: 1, start: 8, end: 24, ... }]

// Encrypted PII map (for later rehydration)
result.piiMap;          // { ciphertext: '...', iv: '...', authTag: '...' }

// Processing statistics
result.stats;           // { totalEntities: 1, processingTimeMs: 5, ... }
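
The `ciphertext`, `iv`, and `authTag` fields resemble the output of an authenticated cipher such as AES-GCM. The sketch below shows how a map could be encrypted into that shape with Node's built-in `crypto` module. It is an assumption for illustration only: the source does not state which cipher or encoding rehydra uses, and `encryptMap`, `decryptMap`, the base64 encoding, and the id-to-value map layout are all hypothetical.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

interface EncryptedMap { ciphertext: string; iv: string; authTag: string }

// Encrypts a map with AES-256-GCM (assumed cipher, not confirmed for rehydra).
function encryptMap(map: Record<string, string>, key: Buffer): EncryptedMap {
  const iv = randomBytes(12); // fresh 96-bit nonce for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(map), "utf8"),
    cipher.final(),
  ]);
  return {
    ciphertext: ciphertext.toString("base64"),
    iv: iv.toString("base64"),
    authTag: cipher.getAuthTag().toString("base64"),
  };
}

// Decrypts and authenticates; throws if the key is wrong or the data was altered.
function decryptMap(enc: EncryptedMap, key: Buffer): Record<string, string> {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(enc.iv, "base64"));
  decipher.setAuthTag(Buffer.from(enc.authTag, "base64"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(enc.ciphertext, "base64")),
    decipher.final(),
  ]);
  return JSON.parse(plain.toString("utf8"));
}
```

With a scheme like this, losing the key makes the map unrecoverable, which is why the key must be kept for as long as rehydration may be needed.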

Next Steps