Rehydra configuration controls what gets detected, how text is protected, and whether the workflow is reversible.

Main Configuration Areas

Mode

Choose whether the workflow is reversible:
  • pseudonymize for encrypted mappings and later rehydration
  • anonymize for irreversible protection

Detection

Control which detectors are active:
  • Built-in regex recognizers
  • Optional NER model
  • Custom recognizers and custom ID patterns

Policy

A policy shapes detection behavior, either as the default or for a single call:
  • Enabled regex and NER types
  • Confidence thresholds
  • Type priority for overlaps
  • Allowlists and denylists
  • ID reuse behavior
  • Leak scanning and semantic masking
  • Location scope exclusions (skip countries/regions)
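To make the interaction between confidence thresholds, allowlists, and type priority concrete, here is a minimal, generic sketch. It is not Rehydra's implementation: the Detection and SketchPolicy shapes, the applyPolicy helper, and the reading of "allowlist" as "strings that are never treated as PII" are all assumptions made for illustration.

```typescript
// Generic illustration only. Every name here is hypothetical and
// none of it is part of the Rehydra SDK.
interface Detection {
  type: string;
  start: number; // inclusive character offset
  end: number; // exclusive character offset
  confidence: number;
}

interface SketchPolicy {
  thresholds: Partial<Record<string, number>>; // minimum confidence per type
  typePriority: string[]; // earlier entries win when spans overlap
  allowlist: Set<string>; // assumed meaning: exact strings never treated as PII
}

function applyPolicy(text: string, found: Detection[], policy: SketchPolicy): Detection[] {
  // Types missing from the priority list rank last.
  const rank = (d: Detection): number => {
    const i = policy.typePriority.indexOf(d.type);
    return i === -1 ? policy.typePriority.length : i;
  };

  const candidates = found
    .filter((d) => d.confidence >= (policy.thresholds[d.type] ?? 0))
    .filter((d) => !policy.allowlist.has(text.slice(d.start, d.end)))
    .sort((a, b) => rank(a) - rank(b) || b.confidence - a.confidence);

  // Greedy pass: keep a span only if it does not overlap a higher-ranked one.
  const kept: Detection[] = [];
  for (const d of candidates) {
    const overlaps = kept.some((k) => d.start < k.end && k.start < d.end);
    if (!overlaps) kept.push(d);
  }
  return kept.sort((a, b) => a.start - b.start);
}

const text = 'Mail jane@example.com or ask Support.';
const detections: Detection[] = [
  { type: 'PERSON', start: 0, end: 4, confidence: 0.3 }, // "Mail", below threshold
  { type: 'PERSON', start: 5, end: 9, confidence: 0.6 }, // "jane", inside the email
  { type: 'EMAIL', start: 5, end: 21, confidence: 0.99 },
  { type: 'PERSON', start: 29, end: 36, confidence: 0.9 }, // "Support", allowlisted
];

const result = applyPolicy(text, detections, {
  thresholds: { PERSON: 0.5 },
  typePriority: ['EMAIL', 'PERSON'],
  allowlist: new Set(['Support']),
});
```

In this sketch only the EMAIL span survives: one PERSON match falls below its threshold, one overlaps the higher-priority email, and one is allowlisted.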

Tag Format

By default, PII placeholders use XML-style tags: <PII type="EMAIL" id="1"/>. You can customize the delimiters and keyword:
  • tagFormat.open / tagFormat.close — delimiter characters (e.g., [[ / ]])
  • tagFormat.keyword — the keyword inside tags (default: PII)
This is useful when XML-style tags conflict with your processing pipeline (e.g., HTML sanitizers, XML parsers) or when you need a format that’s less likely to be mangled by translation services.
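As a rough illustration of how the three tag format settings combine, the sketch below renders a placeholder from a format object. The renderPlaceholder helper and the default open and close values are assumptions chosen to reproduce the documented default tag. The exact output Rehydra produces for custom delimiters may differ.

```typescript
// Hypothetical helper, not part of the Rehydra SDK. The field names mirror
// tagFormat.open, tagFormat.close, and tagFormat.keyword.
interface TagFormat {
  open: string;
  close: string;
  keyword: string;
}

// Assumed defaults, picked so the result matches the documented default tag.
const DEFAULT_TAG_FORMAT: TagFormat = { open: '<', close: '/>', keyword: 'PII' };

function renderPlaceholder(
  type: string,
  id: number,
  format: Partial<TagFormat> = {},
): string {
  // Settings the caller supplies replace the defaults; the rest fall through.
  const { open, close, keyword } = { ...DEFAULT_TAG_FORMAT, ...format };
  return `${open}${keyword} type="${type}" id="${id}"${close}`;
}

const xmlStyle = renderPlaceholder('EMAIL', 1);
const bracketStyle = renderPlaceholder('EMAIL', 1, { open: '[[', close: ']]' });
```

Under these assumptions, xmlStyle is the default XML-style tag, and bracketStyle swaps only the delimiters while keeping the PII keyword.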

Runtime Integrations

Configuration also covers runtime behavior such as:
  • Key providers for reversible workflows
  • Storage providers for sessions
  • NER model backend and download settings
  • Semantic enrichment options

Example

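import { createAnonymizer, PIIType } from 'rehydra';

const anonymizer = createAnonymizer({
  mode: 'pseudonymize', // reversible: keeps encrypted mappings for later rehydration
  ner: { mode: 'quantized' }, // optional NER model settings
  semantic: { enabled: false }, // semantic enrichment turned off
  tagFormat: { open: '[[', close: ']]' }, // optional: bracket-style placeholders
  // Baseline policy, used unless a call supplies its own overrides
  defaultPolicy: {
    regexEnabledTypes: new Set([PIIType.EMAIL, PIIType.PHONE]), // types detected by regex recognizers
    nerEnabledTypes: new Set([PIIType.PERSON]), // types detected by the NER model
    enableLeakScan: true, // leak scanning on
  },
});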

Configuration Levels

Rehydra settings typically live at two levels:
  • createAnonymizer() configuration for reusable defaults
  • Per-call policy overrides when a specific operation needs different rules
This lets you keep a stable baseline while still adapting behavior for individual workflows.
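The two-level idea can be sketched as a simple merge of a baseline policy with per-call overrides. The Policy interface and resolvePolicy helper below are local stand-ins written for this example. The field names come from the configuration example, but the merge rule (per-call fields replace the defaults wholesale) is an assumption, not documented Rehydra behavior.

```typescript
// Illustrative only: a local stand-in for a policy object, not an SDK type.
interface Policy {
  regexEnabledTypes: Set<string>;
  nerEnabledTypes: Set<string>;
  enableLeakScan: boolean;
}

// Hypothetical helper: fields supplied per call replace the matching
// defaults, and anything the call leaves out falls back to the baseline.
function resolvePolicy(defaults: Policy, override: Partial<Policy> = {}): Policy {
  return { ...defaults, ...override };
}

const defaultPolicy: Policy = {
  regexEnabledTypes: new Set(['EMAIL', 'PHONE']),
  nerEnabledTypes: new Set(['PERSON']),
  enableLeakScan: true,
};

// One workflow only needs email detection and skips NER types entirely.
const emailOnly = resolvePolicy(defaultPolicy, {
  regexEnabledTypes: new Set(['EMAIL']),
  nerEnabledTypes: new Set<string>(),
});
```

The baseline object is left untouched, so other calls keep using the full default policy.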

Keep Concepts Separate

It helps to think about configuration in layers:
  • Mode decides reversibility
  • Recognizers decide what can be found
  • Policy decides what is actually detected for a given call
  • Tag format decides how placeholders look in output
  • Key and storage settings decide how reversible data is managed

Next Steps

  • createAnonymizer: Review the full SDK configuration surface.
  • Recognizers: Understand the detection layer you are configuring.