Redaction Reference

Overview

All five strategies at a glance

Each strategy transforms a detected value into a safe representation. The right choice depends on who sees the output and whether the original ever needs to be recovered.

Strategy	Output example	Reversible	Use when
mask	`***-**-****`	No	Displaying to UI, logs
format-preserve	`7891 2345 6789 0123`	Yes (vault)	Format-sensitive systems, test data
category	`[SSN]`	No	LLM prompts, training data
partial	`***-**-1120`	No	Support agents need last-4
tokenize	`[SSN_a3f2]`	Yes (vault)	Reversible workflows, detokenization

Strategies

Each strategy in detail

Every strategy is applied after detection. slim.io supports mixed-strategy policies: mask credit cards, tokenize SSNs, and label entity types for LLMs within the same pipeline.
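A mixed-strategy policy can be pictured as a lookup from entity type to a strategy function, applied to each detected span. The sketch below is illustrative only: the `Detection` fields, the `POLICY` mapping, and the fallback to a category label are assumptions made for the example, not slim.io's actual schema or configuration format.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Detection:
    """A detected span. Field names are hypothetical."""
    entity: str
    start: int
    end: int


def mask_all(value: str, entity: str) -> str:
    # Hide every letter and digit, keep separators so the shape survives.
    return "".join("*" if ch.isalnum() else ch for ch in value)


def label(value: str, entity: str) -> str:
    # Category-style replacement: the type stays, the value goes.
    return f"[{entity}]"


# Hypothetical policy: one strategy per entity type.
POLICY: Dict[str, Callable[[str, str], str]] = {
    "SSN": mask_all,
    "EMAIL": label,
}


def redact(text: str, detections: List[Detection]) -> str:
    # Replace right to left so earlier offsets stay valid after each edit.
    for d in sorted(detections, key=lambda d: d.start, reverse=True):
        strategy = POLICY.get(d.entity, label)  # assumed fallback
        text = text[:d.start] + strategy(text[d.start:d.end], d.entity) + text[d.end:]
    return text


text = "SSN 078-05-1120, contact jane@acme.co"
found = [Detection("SSN", 4, 15), Detection("EMAIL", 25, 37)]
print(redact(text, found))  # SSN ***-**-****, contact [EMAIL]
```

Processing spans from the end of the string backwards avoids recomputing offsets when a replacement is shorter or longer than the original value.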

Mask (`mask`)

Replaces the detected value with a format-aware pattern of asterisks. Credit cards preserve their last four digits; SSNs, emails, and phone numbers each follow their own masking shape.

Input: 078-05-1120 → Output: ***-**-****

Input: 4111 1111 1111 1234 → Output: ****-****-****-1234

Input: jane@acme.co → Output: j***@acme.co

Use when: Displaying data in end-user interfaces or writing to application logs where the raw value is never needed downstream.

Entity types: All supported entities.
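The three masking shapes shown above can be reproduced with a few small functions. This is a minimal sketch that assumes detection has already isolated the value; the function names are hypothetical and the rules are inferred from the examples, not taken from slim.io's implementation.

```python
import re


def mask_ssn(value: str) -> str:
    # Hide every digit; hyphens keep the familiar SSN shape.
    return re.sub(r"\d", "*", value)


def mask_card(value: str) -> str:
    # Strip separators, hide all but the last four digits,
    # then regroup in blocks of four joined by hyphens.
    digits = re.sub(r"\D", "", value)
    hidden = "*" * (len(digits) - 4) + digits[-4:]
    return "-".join(hidden[i:i + 4] for i in range(0, len(hidden), 4))


def mask_email(value: str) -> str:
    # Keep the first character of the local part and the full domain.
    local, _, domain = value.partition("@")
    return f"{local[:1]}***@{domain}"


print(mask_ssn("078-05-1120"))           # ***-**-****
print(mask_card("4111 1111 1111 1234"))  # ****-****-****-1234
print(mask_email("jane@acme.co"))        # j***@acme.co
```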

Format-preserve (`format-preserve`)

Replaces the detected value with a different value in exactly the same format. A 16-digit credit card becomes a different, valid-looking 16-digit number. The mapping is stored in slim.io's vault and is recoverable by authorized services. The transformation uses FF3-1 format-preserving encryption.

Input: 4532 1488 0343 6467 → Output: 7891 2345 6789 0123

Use when: Downstream systems validate or display data in a specific format (payment processors, test environments, format-sensitive pipelines) and cannot accept masked or tokenized values.

Entity types: Credit cards, SSNs, IBANs, phone numbers, and any other entity whose format must be preserved after redaction.
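To illustrate the contract (same shape out, original recoverable through a vault), here is a toy stand-in. It is explicitly not FF3-1: it swaps each digit for a random digit and records the mapping in an in-memory dictionary. The class and method names are hypothetical, and nothing here should be read as slim.io's implementation or as production-grade cryptography.

```python
import secrets
from typing import Dict


class FormatPreservingVault:
    """Toy stand-in: random same-shape substitution plus a lookup table.
    NOT FF3-1, and not suitable for production use."""

    def __init__(self) -> None:
        self._forward: Dict[str, str] = {}
        self._reverse: Dict[str, str] = {}

    def protect(self, value: str) -> str:
        if value in self._forward:
            return self._forward[value]
        if not any(ch.isdigit() for ch in value):
            raise ValueError("this sketch only substitutes digits")
        while True:
            # Replace digits only; spaces and hyphens stay where they are.
            candidate = "".join(
                secrets.choice("0123456789") if ch.isdigit() else ch
                for ch in value
            )
            # Retry on the unlikely collision or an unchanged value.
            if candidate != value and candidate not in self._reverse:
                break
        self._forward[value] = candidate
        self._reverse[candidate] = value
        return candidate

    def recover(self, surrogate: str) -> str:
        return self._reverse[surrogate]


vault = FormatPreservingVault()
surrogate = vault.protect("4532 1488 0343 6467")
print(surrogate)                 # a different 16-digit value in the same grouping
print(vault.recover(surrogate))  # 4532 1488 0343 6467
```

A real format-preserving cipher derives the surrogate from a key, so the same input always encrypts to the same output without relying on a stored random mapping.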

Category (`category`)

Substitutes the detected value with its entity type label in bracket notation. The result is human-readable and safe for language models. The type is preserved; the value is gone.

Input: 078-05-1120 → Output: [SSN]

Example labels: [SSN] [EMAIL] [CREDIT_CARD] [MRN] [ICD10] [PERSON_NAME] [API_KEY] [PHONE]

Use when: Sending data to LLMs, building fine-tuning datasets, or injecting context into RAG pipelines. Models learn what type of data was present without seeing the value itself.

Entity types: All supported entities. The label matches the entity type identifier exactly.
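As a rough illustration of category replacement on a prompt, the sketch below substitutes bracketed labels for matched values. The two regular expressions are deliberately simplistic placeholders for a real detector, and `categorize` is a hypothetical helper name.

```python
import re

# Illustrative detectors only; real detection is far more involved.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}


def categorize(text: str) -> str:
    # Swap each match for its entity label; the value itself is discarded.
    for entity, pattern in PATTERNS.items():
        text = pattern.sub(f"[{entity}]", text)
    return text


prompt = "Customer 078-05-1120 wrote from jane@acme.co about a refund."
print(categorize(prompt))
# Customer [SSN] wrote from [EMAIL] about a refund.
```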

Partial (`partial`)

Masks most of the value while preserving a recognizable trailing segment. For most numeric identifiers this is the last four characters; for email addresses, the domain is retained.

Input: 078-05-1120 → Output: ***-**-1120

Input: support@acme.co → Output: ***@acme.co

Use when: Support agents or frontline staff need just enough context to identify a record (a last-four match against a customer's on-file value) without ever seeing the full sensitive string.

Entity types: SSN, SIN, credit card (last 4 digits), phone (last 4 digits), email (domain only), MRN.
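The trailing-segment rule can be sketched as follows. The helper names and the `keep` parameter are assumptions for illustration; separators are left in place so the output keeps the original shape.

```python
def partial_value(value: str, keep: int = 4) -> str:
    """Hide every letter or digit except the last `keep`, leaving separators."""
    total = sum(ch.isalnum() for ch in value)
    seen = 0
    out = []
    for ch in value:
        if ch.isalnum():
            seen += 1
            # Only the final `keep` alphanumeric characters stay visible.
            out.append(ch if seen > total - keep else "*")
        else:
            out.append(ch)
    return "".join(out)


def partial_email(value: str) -> str:
    # The local part is hidden entirely; the domain is retained.
    _, _, domain = value.partition("@")
    return f"***@{domain}"


print(partial_value("078-05-1120"))      # ***-**-1120
print(partial_email("support@acme.co"))  # ***@acme.co
```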

Tokenize (`tokenize`)

Generates a short, opaque token that maps back to the original value in slim.io's encrypted vault. Authorized services can recover the original through the detokenization API. Tokens are scoped by connector, resource, or field.

Input: 078-05-1120 → Token: [SSN_a3f2c8] → Detokenized: 078-05-1120 (authorized only)

Use when: Downstream systems must work with pseudonymous data, but authorized workflows (billing, clinical, identity resolution) need to recover the original value on demand.

Entity types: SSN, SIN, MRN, credit card, email, phone. Token scope is configurable per connector or field.

Recovery: Via the slim.io detokenization API. Requires an authorized service credential scoped to the token's namespace.
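The round trip can be pictured with a small in-memory vault. This is a sketch under stated assumptions: `TokenVault`, its methods, the credential check, and the reuse of one token per (scope, entity, value) are all hypothetical, the store is a plain dictionary with no encryption, and none of it reflects the actual slim.io detokenization API.

```python
import secrets
from typing import Dict, Set, Tuple


class TokenVault:
    """In-memory stand-in for an encrypted vault (no encryption here)."""

    def __init__(self, authorized: Set[str]) -> None:
        self._authorized = set(authorized)
        self._by_token: Dict[str, str] = {}
        self._by_key: Dict[Tuple[str, str, str], str] = {}

    def tokenize(self, entity: str, value: str, scope: str = "default") -> str:
        # Assumption: the same value in the same scope reuses its token,
        # which lets records be correlated without exposing the value.
        key = (scope, entity, value)
        if key in self._by_key:
            return self._by_key[key]
        token = f"[{entity}_{secrets.token_hex(3)}]"
        while token in self._by_token:  # regenerate on the rare collision
            token = f"[{entity}_{secrets.token_hex(3)}]"
        self._by_token[token] = value
        self._by_key[key] = token
        return token

    def detokenize(self, token: str, credential: str) -> str:
        if credential not in self._authorized:
            raise PermissionError("credential is not authorized for detokenization")
        return self._by_token[token]


vault = TokenVault(authorized={"billing-service"})
token = vault.tokenize("SSN", "078-05-1120")
print(token)                                       # e.g. [SSN_a3f2c8]
print(vault.detokenize(token, "billing-service"))  # 078-05-1120
```

Calling `detokenize` with any credential outside the authorized set raises `PermissionError`, which mirrors the "authorized only" step in the flow above.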

Decision guide

Choosing the right strategy

Work through these questions in order. The first "yes" match is your recommended strategy.

Does a downstream AI model see this data? Yes → Use category

Does a human UI or dashboard display this data? Yes → Use mask or partial

Do you need to track duplicates or correlate across records? Yes → Use tokenize (deterministic mode)

Does an authorized downstream service need to recover the original? Yes → Use tokenize

Is this data going only to audit logs? Yes → Use mask or category
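The ordered questions translate directly into a first-match function. The `Context` field names are hypothetical, and returning an empty list when no question matches is an assumption, since the guide does not cover that case.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Context:
    """Answers to the five questions. Field names are illustrative."""
    seen_by_ai_model: bool = False
    shown_in_ui: bool = False
    needs_correlation: bool = False
    needs_recovery: bool = False
    audit_logs_only: bool = False


def recommend(ctx: Context) -> List[str]:
    # Questions are checked in the guide's order; the first "yes" wins.
    if ctx.seen_by_ai_model:
        return ["category"]
    if ctx.shown_in_ui:
        return ["mask", "partial"]
    if ctx.needs_correlation:
        return ["tokenize (deterministic mode)"]
    if ctx.needs_recovery:
        return ["tokenize"]
    if ctx.audit_logs_only:
        return ["mask", "category"]
    return []  # assumption: no recommendation when nothing matches


print(recommend(Context(shown_in_ui=True, needs_recovery=True)))
# ['mask', 'partial']
```

Because evaluation stops at the first match, a value that is both displayed in a UI and recoverable downstream resolves to the display strategies, as in the example call.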

Per-entity defaults

Recommended defaults by entity type

slim.io ships with opinionated defaults tuned for common compliance and operational requirements. All defaults can be overridden per connector or per field in your policy configuration.

Entity	Default strategy	Rationale
SSN / SIN	tokenize	Reversible for billing and identity verification workflows
Credit Card	mask (last 4)	PCI DSS limits user-facing display of a card number to a truncated form; showing only the last four digits stays within that limit
Email	category (LLM) · partial (UI)	Models need only the label; support agents need the domain. Use category for LLM prompts and partial for display.
Phone	mask	Low need for recovery; masking satisfies most display and logging requirements
MRN	tokenize	Clinical workflows (scheduling, billing, EHR) require detokenization
ICD-10	category	Diagnosis codes are safe for LLMs as labels; low need for reversibility
Person Name	category	NER output; language models do not need real names in prompt context
API Key	format-preserve	Format-sensitive audit systems; the replacement looks like a real key but maps to the original in the vault
IP Address	mask	Rarely needs recovery; masking satisfies most logging and GDPR requirements
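The defaults table, together with per-connector overrides, can be modeled as a simple lookup. Everything about the shape below is an assumption for illustration: the `SIN` and `IP_ADDRESS` identifiers, the `destination` parameter, and the `overrides` dictionary are hypothetical and do not represent slim.io's policy configuration format.

```python
from typing import Dict, Optional

# Defaults transcribed from the table above.
DEFAULTS: Dict[str, str] = {
    "SSN": "tokenize",
    "SIN": "tokenize",          # identifier assumed
    "CREDIT_CARD": "mask",
    "PHONE": "mask",
    "MRN": "tokenize",
    "ICD10": "category",
    "PERSON_NAME": "category",
    "API_KEY": "format-preserve",
    "IP_ADDRESS": "mask",       # identifier assumed
}

# Email is the one entity whose default depends on where the output goes.
EMAIL_DEFAULTS: Dict[str, str] = {"llm": "category", "ui": "partial"}


def resolve(entity: str, destination: str = "ui",
            overrides: Optional[Dict[str, str]] = None) -> str:
    # An explicit override (per connector or field) takes precedence.
    if overrides and entity in overrides:
        return overrides[entity]
    if entity == "EMAIL":
        return EMAIL_DEFAULTS[destination]
    return DEFAULTS[entity]


print(resolve("EMAIL", destination="llm"))           # category
print(resolve("PHONE"))                              # mask
print(resolve("PHONE", overrides={"PHONE": "partial"}))  # partial
```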
