Five ways to handle sensitive data

slim.io gives you precise control over what happens to each entity after detection. Choose the right strategy per entity type, per policy.

Overview

All five strategies at a glance

Each strategy transforms a detected value into a safe representation. The right choice depends on who sees the output and whether the original ever needs to be recovered.

| Strategy | Output example | Reversible | Use when |
| --- | --- | --- | --- |
| mask | ***-**-**** | No | Displaying to UI, logs |
| format-preserve | 7891 2345 6789 0123 | Yes (vault) | Format-sensitive systems, test data |
| category | [SSN] | No | LLM prompts, training data |
| partial | ***-**-1120 | No | Support agents need last-4 |
| tokenize | [SSN_a3f2c8] | Yes (vault) | Reversible workflows, detokenization |

Strategies

Each strategy in detail

Every strategy is applied after detection. slim.io supports mixed-strategy policies: mask credit cards, tokenize SSNs, and label entity types for LLMs within the same pipeline.
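
To make the mixed-strategy idea concrete, here is a minimal Python sketch of a policy that maps each entity type to a strategy and dispatches to a handler. The names `HANDLERS`, `apply_policy`, and the policy dictionary shape are hypothetical and chosen for illustration; they are not slim.io's configuration format or API, and the two handlers are simplified stand-ins for the real strategies.

```python
from typing import Callable, Dict

def mask(value: str, entity: str) -> str:
    # Simplified stand-in: mask every letter and digit, keep separators.
    return "".join("*" if ch.isalnum() else ch for ch in value)

def category(value: str, entity: str) -> str:
    # Replace the value with its entity type label in brackets.
    return f"[{entity}]"

# Hypothetical registry of strategy name -> handler.
HANDLERS: Dict[str, Callable[[str, str], str]] = {
    "mask": mask,
    "category": category,
}

def apply_policy(policy: Dict[str, str], entity: str, value: str) -> str:
    # Look up the strategy configured for this entity type and apply it.
    # Assumption: entity types missing from the policy fall back to "mask".
    strategy = policy.get(entity, "mask")
    return HANDLERS[strategy](value, entity)

# One policy, different strategies per entity type.
policy = {"SSN": "mask", "EMAIL": "category"}
masked_ssn = apply_policy(policy, "SSN", "078-05-1120")
labeled_email = apply_policy(policy, "EMAIL", "jane@acme.co")
```

Keeping the policy as plain data separate from the handlers means a strategy can be changed for one entity type without touching the others.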

Mask (mask)

Replaces the detected value with a format-aware pattern of asterisks. Credit cards preserve their last four digits; SSNs, emails, and phone numbers each follow their own masking shape.

Input: 078-05-1120
Output: ***-**-****
Input: 4111 1111 1111 1234
Output: ****-****-****-1234
Input: jane@acme.co
Output: j***@acme.co
Use when: Displaying data in end-user interfaces or writing to application logs where the raw value is never needed downstream.
Entity types: All supported entities
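
The three masking shapes shown above can be approximated with a few small functions. This is a sketch that reproduces the examples on this page, not slim.io's implementation; the function names are hypothetical.

```python
import re

def mask_ssn(value: str) -> str:
    # Mask every digit and keep the separators in place.
    return re.sub(r"\d", "*", value)

def mask_credit_card(value: str) -> str:
    # Keep only the last four digits, then render dash-separated groups of four.
    digits = re.sub(r"\D", "", value)
    masked = "*" * max(len(digits) - 4, 0) + digits[-4:]
    return "-".join(masked[i:i + 4] for i in range(0, len(masked), 4))

def mask_email(value: str) -> str:
    # Keep the first character of the local part and the whole domain.
    local, _, domain = value.partition("@")
    return f"{local[:1]}***@{domain}"
```

Each entity type gets its own function because the masking shape depends on the format: what stays visible for a card number is different from what stays visible for an email address.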

Format-preserve (format-preserve)

Replaces the detected value with a different value in the exact same format. A 16-digit credit card becomes a different valid-looking 16-digit number. The mapping is stored in slim.io's vault and is recoverable by authorized services. Uses FF3-1 format-preserving encryption.

Input: 4532 1488 0343 6467
Output: 7891 2345 6789 0123
Use when: Downstream systems validate or display data in a specific format (payment processors, test environments, format-sensitive pipelines) and cannot accept masked or tokenized values.
Entity types: Credit cards, SSNs, IBANs, phone numbers, and any other entity whose format must be preserved after redaction.
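
To illustrate what "same format, different value, recoverable" means, here is a toy Python sketch. It is deliberately NOT FF3-1 and is not secure: it shifts each digit by a keyed pseudo-random amount derived with HMAC-SHA256 and leaves separators untouched. It is also key-based rather than vault-based, which is a simplification. The function names, the key, and the tweak parameter are all assumptions made for illustration.

```python
import hashlib
import hmac

def _shift_for(key: bytes, tweak: bytes, index: int) -> int:
    # Derive one pseudo-random digit per position (slightly biased; demo only).
    mac = hmac.new(key, tweak + index.to_bytes(4, "big"), hashlib.sha256).digest()
    return mac[0] % 10

def fp_transform(value: str, key: bytes, tweak: bytes = b"", decrypt: bool = False) -> str:
    # Toy digit-shift cipher: digits stay digits, everything else is copied as-is.
    out = []
    index = 0
    for ch in value:
        if "0" <= ch <= "9":
            shift = _shift_for(key, tweak, index)
            digit = (int(ch) - shift) % 10 if decrypt else (int(ch) + shift) % 10
            out.append(str(digit))
            index += 1
        else:
            out.append(ch)
    return "".join(out)

protected = fp_transform("4532 1488 0343 6467", b"demo-key")
recovered = fp_transform(protected, b"demo-key", decrypt=True)
```

The output has the same length and the same separator positions as the input, so a system that validates the shape of the field still accepts it. A production system would use a vetted FF3-1 implementation instead of this digit shift.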

Category (category)

Substitutes the detected value with its entity type label in bracket notation. The result is human-readable and safe for language models. The type is preserved; the value is gone.

Input: 078-05-1120
Output: [SSN]
Example labels: [SSN] [EMAIL] [CREDIT_CARD] [MRN] [ICD10] [PERSON_NAME] [API_KEY] [PHONE]
Use when: Sending data to LLMs, building fine-tuning datasets, or injecting context into RAG pipelines. Models receive a signal about what type of data was present without being exposed to the value itself.
Entity types: All supported entities. The label matches the entity type identifier exactly.
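
As a sketch of how category substitution might be applied to a whole prompt, the function below takes detected spans and swaps each one for its label. The span tuple shape and the name `apply_category` are assumptions for illustration; detection itself is out of scope here, and the spans are assumed not to overlap.

```python
from typing import List, Tuple

# Hypothetical span shape: (start, end, entity_type), with end exclusive.
Span = Tuple[int, int, str]

def apply_category(text: str, spans: List[Span]) -> str:
    # Replace from the end of the text so earlier offsets stay valid.
    for start, end, entity in sorted(spans, reverse=True):
        text = text[:start] + f"[{entity}]" + text[end:]
    return text

prompt = "SSN 078-05-1120, email jane@acme.co"
ssn_at = prompt.find("078-05-1120")
email_at = prompt.find("jane@acme.co")
spans = [
    (ssn_at, ssn_at + len("078-05-1120"), "SSN"),
    (email_at, email_at + len("jane@acme.co"), "EMAIL"),
]
safe_prompt = apply_category(prompt, spans)
```

Working right to left avoids recomputing offsets after each replacement, since a label is usually a different length from the value it replaces.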

Partial (partial)

Masks most of the value while preserving a recognizable trailing segment. For most numeric identifiers this is the last four characters; for email addresses, the domain is retained.

Input: 078-05-1120
Output: ***-**-1120
Input: support@acme.co
Output: ***@acme.co
Use when: Support agents or frontline staff need just enough context to identify a record (a last-four match against a customer's on-file value) without ever seeing the full sensitive string.
Entity types: SSN, SIN, credit card (last 4 digits), phone (last 4 digits), email (domain only), MRN.
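
A minimal sketch of partial masking, reproducing the two examples above. The function names are hypothetical, and the "last four" rule is applied to letters and digits only so that separators survive.

```python
def partial_last4(value: str) -> str:
    # Mask every letter and digit except the final four; keep separators.
    total = sum(ch.isalnum() for ch in value)
    seen = 0
    out = []
    for ch in value:
        if ch.isalnum():
            seen += 1
            out.append(ch if seen > total - 4 else "*")
        else:
            out.append(ch)
    return "".join(out)

def partial_email(value: str) -> str:
    # Hide the local part entirely and retain the domain.
    _, _, domain = value.partition("@")
    return f"***@{domain}"
```

An agent can then confirm a match by asking for the last four digits, while the remainder of the value never reaches the screen.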

Tokenize (tokenize)

Generates a short, opaque token that maps back to the original value in slim.io's encrypted vault. Authorized services can recover the original through the detokenization API. Tokens are scoped by connector, resource, or field.

Input: 078-05-1120
Token: [SSN_a3f2c8]
Detokenized: 078-05-1120 (authorized only)
Use when: Downstream systems must work with pseudonymous data, but authorized workflows (billing, clinical, identity resolution) need to recover the original value on demand.
Entity types: SSN, SIN, MRN, credit card, email, phone. Token scope is configurable per connector or field.
Recovery: Via the slim.io detokenization API. Requires an authorized service credential scoped to the token's namespace.
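
The tokenize and detokenize round trip can be sketched with an in-memory stand-in for the vault. Everything here is an assumption for illustration: `ToyVault`, its methods, the namespace check, and the optional HMAC key for deterministic tokens are not the slim.io detokenization API, and a real vault would be encrypted and persistent.

```python
import hashlib
import hmac
import secrets
from typing import Dict, Optional

class ToyVault:
    """In-memory stand-in for an encrypted token vault (illustration only)."""

    def __init__(self, namespace: str, hmac_key: Optional[bytes] = None):
        self.namespace = namespace
        self._hmac_key = hmac_key  # when set, the same value always yields the same token
        self._store: Dict[str, str] = {}

    def tokenize(self, entity: str, value: str) -> str:
        if self._hmac_key is not None:
            # Deterministic mode: derive the suffix from the value itself.
            msg = f"{self.namespace}:{entity}:{value}".encode()
            suffix = hmac.new(self._hmac_key, msg, hashlib.sha256).hexdigest()[:6]
            token = f"[{entity}_{suffix}]"
        else:
            # Random mode: draw again if the token is already taken.
            token = f"[{entity}_{secrets.token_hex(3)}]"
            while token in self._store:
                token = f"[{entity}_{secrets.token_hex(3)}]"
        # Six hex characters mirror the example above; a real system would
        # need longer tokens and collision handling in deterministic mode.
        self._store[token] = value
        return token

    def detokenize(self, token: str, caller_namespace: str) -> str:
        # Only callers scoped to this vault's namespace may recover the value.
        if caller_namespace != self.namespace:
            raise PermissionError("caller is not authorized for this namespace")
        return self._store[token]

vault = ToyVault("billing")
token = vault.tokenize("SSN", "078-05-1120")
original = vault.detokenize(token, caller_namespace="billing")
```

Deterministic mode trades some privacy for the ability to correlate: identical inputs produce identical tokens, so duplicates can be spotted without detokenizing.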

Decision guide

Choosing the right strategy

Work through these questions in order. The first "yes" match is your recommended strategy.

1. Does a downstream AI model see this data? Yes → Use category
2. Does a human UI or dashboard display this data? Yes → Use mask or partial
3. Do you need to track duplicates or correlate across records? Yes → Use tokenize (deterministic mode)
4. Does an authorized downstream service need to recover the original? Yes → Use tokenize
5. Is this data going only to audit logs? Yes → Use mask or category
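
The decision guide is a first-match rule list, which can be sketched as an ordered chain of checks. The `Usage` fields and the `recommend` function are hypothetical names; the guide does not say what to do when every answer is "no", so the sketch returns `None` in that case.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Usage:
    # One flag per question in the decision guide.
    seen_by_ai_model: bool = False
    shown_in_ui: bool = False
    needs_correlation: bool = False
    needs_recovery: bool = False
    audit_log_only: bool = False

def recommend(usage: Usage) -> Optional[str]:
    # Order matters: the first question answered "yes" decides the result.
    if usage.seen_by_ai_model:
        return "category"
    if usage.shown_in_ui:
        return "mask or partial"
    if usage.needs_correlation:
        return "tokenize (deterministic mode)"
    if usage.needs_recovery:
        return "tokenize"
    if usage.audit_log_only:
        return "mask or category"
    return None  # the guide gives no recommendation for this case
```

Because the checks run in order, data that is both sent to a model and needed for recovery resolves to category for that output path; a second policy can cover the path that needs recovery.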

Per-entity defaults

Recommended defaults by entity type

slim.io ships with opinionated defaults tuned for common compliance and operational requirements. All defaults can be overridden per connector or per field in your policy configuration.

| Entity | Default strategy | Rationale |
| --- | --- | --- |
| SSN / SIN | tokenize | Reversible for billing and identity verification workflows |
| Credit Card | mask (last 4) | PCI DSS requires the card number to be masked on display and allows the last four digits to remain visible |
| Email | category (LLM) · partial (UI) | Models need the label; support agents need the domain. Use category for LLM prompts, partial for display. |
| Phone | mask | Low need for recovery; masking satisfies most display and logging requirements |
| MRN | tokenize | Clinical workflows (scheduling, billing, EHR) require detokenization |
| ICD-10 | category | Diagnosis codes are safe for LLMs as labels; low need for reversibility |
| Person Name | category | NER output; language models do not need real names in prompt context |
| API Key | format-preserve | Format-sensitive audit systems; the replacement looks like a real key but maps to the original in the vault |
| IP Address | mask | Rarely needs recovery; masking satisfies most logging and GDPR requirements |
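
The defaults and their overrides can be pictured as layered lookups. The sketch below is an assumption about how such resolution might work, not slim.io's policy engine: the `DEFAULTS` keys reuse the label style shown earlier (`SIN` and `IP_ADDRESS` are guessed identifiers), email is given `category` as its single default, and the precedence of field over connector over default is assumed rather than documented.

```python
from typing import Dict, Optional

# Defaults from the table above, keyed by assumed entity identifiers.
DEFAULTS: Dict[str, str] = {
    "SSN": "tokenize",
    "SIN": "tokenize",
    "CREDIT_CARD": "mask",
    "EMAIL": "category",  # the table recommends partial for UI display
    "PHONE": "mask",
    "MRN": "tokenize",
    "ICD10": "category",
    "PERSON_NAME": "category",
    "API_KEY": "format-preserve",
    "IP_ADDRESS": "mask",
}

def resolve_strategy(
    entity: str,
    connector_override: Optional[Dict[str, str]] = None,
    field_override: Optional[Dict[str, str]] = None,
) -> str:
    # Assumed precedence: field override, then connector override, then default.
    for layer in (field_override, connector_override, DEFAULTS):
        if layer and entity in layer:
            return layer[entity]
    raise KeyError(f"no strategy configured for {entity}")

# A support dashboard connector shows emails to agents, so it overrides the default.
dashboard_strategy = resolve_strategy("EMAIL", connector_override={"EMAIL": "partial"})
```

Resolving from the most specific layer to the least specific keeps the shipped defaults intact while letting a single connector or field deviate from them.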