Guide

How SafeDoc anonymizes documents (visually).

SafeDoc secures external AI usage with a simple process: anonymize, analyze, de‑anonymize.
A controllable flow: import → detect → replace → review → export (with optional restore).


Overview (1 minute)

Input

Single document, pasted text, or multi-document processing with Data Room.

Protection

Choose a replacement type and a protection level, N1 or N2, based on risk (N3 is on the roadmap for Q3 2026).

Outputs

Copy-ready text + mapping (if reversible) + residual risk indicators.

1) Import: File, Paste, or Data Room

Import

Upload a file, paste text, or use Data Room to process multiple documents.

Restore

Re-inject values locally using mapping_xxx.json and a tokenized AI answer.

Data Room

Multi-document analysis with consistent pseudonyms across the whole batch.

Import options

Upload a file: drag & drop or select a document (PDF/DOCX/TXT).
Paste text: great for an email, an excerpt, a note.
Data Room (multi-doc): import several documents at once and analyze them as one batch.


Why Data Room helps

With pseudonymization, the same entity keeps the same pseudonym across documents.

Example: "Carrefour" in Doc 1 and in Doc 2 → [ORG_2] in both.

Great for due diligence, contract sets, and multi-document case files.
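The idea of one pseudonym per entity across a batch can be sketched in a few lines of Python. This is an illustration only, not SafeDoc's implementation: the `PseudonymRegistry` class, the `pseudonymize` function, and the "Acme SA" entity are hypothetical, and the (label, value) pairs stand in for whatever the detection step returns.

```python
import re


class PseudonymRegistry:
    """Hands out one numbered token per distinct entity, shared by all documents."""

    def __init__(self):
        self._tokens = {}    # (label, normalized value) -> token
        self._counters = {}  # label -> last number used

    def token_for(self, label, value):
        key = (label, value.casefold())
        if key not in self._tokens:
            self._counters[label] = self._counters.get(label, 0) + 1
            self._tokens[key] = f"[{label}_{self._counters[label]}]"
        return self._tokens[key]


def pseudonymize(text, entities, registry):
    """Replace each (label, value) pair found in text with its registry token."""
    for label, value in entities:
        token = registry.token_for(label, value)
        # A function replacement avoids any backslash handling in the token.
        text = re.sub(re.escape(value), lambda _m, t=token: t, text, flags=re.IGNORECASE)
    return text


# One registry for the whole batch (hypothetical example documents).
registry = PseudonymRegistry()
doc1 = pseudonymize(
    "Acme SA sold two sites to Carrefour.",
    [("ORG", "Acme SA"), ("ORG", "Carrefour")],
    registry,
)
doc2 = pseudonymize("Carrefour confirmed the price.", [("ORG", "Carrefour")], registry)
print(doc1)  # [ORG_1] sold two sites to [ORG_2].
print(doc2)  # [ORG_2] confirmed the price.
```

Because both documents share the same registry, "Carrefour" maps to [ORG_2] in each; a fresh registry per document would restart the numbering and break the link.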

2) Choose a mode: Anonymize, Pseudonymize, or Fake data

Anonymized

Generic tokens: [PERSON], [LOCATION]…

When you don’t need to keep links between occurrences.

Pseudonymized

Numbered, consistent tokens: [PERSON_1], [ORG_2].

Best to preserve narrative consistency across a document set.

Fake data

Readable replacements (invented names/addresses) for a “natural” text.

Useful for review and presentation while masking real values.
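To make the difference between the three modes concrete, here is a minimal Python sketch. It assumes the entities have already been detected and are distinct; the function name `replace_entities`, the mode strings, the sample names, and the `FAKE_VALUES` table are all hypothetical and do not describe SafeDoc's internals.

```python
# Hypothetical detector output: distinct (label, value) pairs found in the text.
ENTITIES = [("PERSON", "Marie Durand"), ("LOCATION", "Rouen"), ("PERSON", "Paul Martin")]

# Hypothetical stand-in values used by the "fake" mode.
FAKE_VALUES = {"PERSON": ["Alex Morel", "Sam Petit"], "LOCATION": ["Lille"]}


def replace_entities(text, entities, mode):
    """Replace every occurrence of each entity according to the chosen mode."""
    counters = {}
    for label, value in entities:
        counters[label] = counters.get(label, 0) + 1
        n = counters[label]
        if mode == "anonymize":
            replacement = f"[{label}]"        # generic token, links are lost
        elif mode == "pseudonymize":
            replacement = f"[{label}_{n}]"    # numbered token, links are kept
        elif mode == "fake":
            pool = FAKE_VALUES[label]
            replacement = pool[(n - 1) % len(pool)]  # readable invented value
        else:
            raise ValueError(f"unknown mode: {mode}")
        text = text.replace(value, replacement)
    return text


TEXT = "Marie Durand met Paul Martin in Rouen. Marie Durand signed."
for mode in ("anonymize", "pseudonymize", "fake"):
    print(mode, "->", replace_entities(TEXT, ENTITIES, mode))
```

In the anonymized output both people become [PERSON], so a reader can no longer tell who signed; the pseudonymized and fake outputs keep that link.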

3) Increase protection: N1, N2, N3 (roadmap)

Principle

Higher levels reduce the risk of re-identification from context (dates, amounts, locations, writing style).

N1 Standard

PII tokenization / basic cleanup.

N2 Advanced

Stronger generalization and contextual reduction.

N3 High security

Additional protections. Roadmap Q3 2026.

Visual example

“€17.3M – Feb 12, 2026 – Rouen”

↓

N2 (N3 roadmap): contextual reduction

↓

“mid-teen millions – Q1 2026 – North France”

Goal: keep useful meaning while reducing identifiability.
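The kind of generalization shown in the visual example can be sketched as three small Python functions. This is an assumption-laden illustration, not SafeDoc's actual rules: the function names, the band edges for amounts (chosen so the example above comes out the same), and the `REGION_OF` lookup table are all hypothetical.

```python
from datetime import date

# Hypothetical city -> region lookup; a real table would be much larger.
REGION_OF = {"Rouen": "North France"}


def generalize_date(d):
    """Reduce an exact date to its calendar quarter."""
    quarter = (d.month - 1) // 3 + 1
    return f"Q{quarter} {d.year}"


def generalize_amount(amount):
    """Reduce an exact amount to a coarse band (band edges are assumptions)."""
    millions = amount / 1_000_000
    if not 1 <= millions < 100:
        return "amount withheld"
    if millions < 10:
        return "single-digit millions"
    if millions < 13:
        return "low-teen millions"
    if millions < 18:
        return "mid-teen millions"
    if millions < 20:
        return "high-teen millions"
    return "tens of millions"


def generalize_location(city):
    """Replace a city with its region, or a generic token if unknown."""
    return REGION_OF.get(city, "[LOCATION]")


parts = [
    generalize_amount(17_300_000),
    generalize_date(date(2026, 2, 12)),
    generalize_location("Rouen"),
]
print(" – ".join(parts))  # mid-teen millions – Q1 2026 – North France
```

Each function discards precision on purpose: the wider the band, quarter, or region, the more real deals or events the reduced text could plausibly describe.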

4) Review: detected entities and residual scan

Detected entities

Review, filter and adjust what must be masked.


Copy-ready

Copy the anonymized result and export mapping/report if needed.

What you control

Detected entities: people, orgs, locations, emails, phones, IBAN, amounts…
User choice: uncheck what you don’t want to mask.
Comparisons: view detections by source (e.g., model vs. regex), where the UI offers it.

Why human review matters

Automated detection can produce false positives and false negatives. Indicators (residual scan / leakage score) help assess risk, but don’t replace a final review.
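A residual scan can be pictured as a second pass over the anonymized output that looks for anything still shaped like personal data. The Python sketch below is a simplified illustration under that assumption; the patterns and the `residual_scan` name are hypothetical, they cover only emails and phone numbers, and they do not reproduce SafeDoc's residual scan or its leakage score.

```python
import re

# Deliberately simple, illustrative patterns; real detectors are stricter.
RESIDUAL_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d .-]{7,}\d"),
}


def residual_scan(text):
    """Return (label, matched text) for every pattern hit left in the output."""
    hits = []
    for label, pattern in RESIDUAL_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(0)))
    return hits


clean = "[PERSON_1] signed in [LOCATION_1]."
leaky = "[PERSON_1] can be reached at marie@example.com or +33 6 12 34 56 78."

print(residual_scan(clean))  # []
print(residual_scan(leaky))  # [('EMAIL', 'marie@example.com'), ('PHONE', '+33 6 12 34 56 78')]
```

An empty result only means these patterns found nothing. A name or a project codename would pass straight through, which is why the final human review still matters.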

5) Export + Restore (optional)

Primary outputs

Anonymized text: ready to paste into an AI tool.
JSON mapping: original → token mapping (reversible mode).
Report: audit-friendly indicators (depending on level).

Restore an AI answer

Paste the AI answer containing tokens, then import your mapping to re-inject values locally.
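The restore step amounts to inverting the mapping and substituting tokens back. The Python sketch below assumes the mapping file is a flat JSON object of original → token, as described under Primary outputs; the real file layout may differ, and the `restore` function and sample values are hypothetical.

```python
import json
import re


def restore(answer, mapping_json):
    """Re-inject original values into a tokenized AI answer.

    Assumes mapping_json is a flat JSON object: {"original value": "[TOKEN_n]"}.
    """
    original_to_token = json.loads(mapping_json)
    token_to_original = {token: original for original, token in original_to_token.items()}
    if not token_to_original:
        return answer
    # One alternation over all tokens, so each token is replaced in a single pass.
    pattern = re.compile("|".join(re.escape(token) for token in token_to_original))
    return pattern.sub(lambda m: token_to_original[m.group(0)], answer)


mapping_json = '{"Marie Durand": "[PERSON_1]", "Carrefour": "[ORG_2]"}'
answer = "[PERSON_1] should countersign the [ORG_2] contract."
print(restore(answer, mapping_json))
# Marie Durand should countersign the Carrefour contract.
```

This only works when each token maps to exactly one original value, which is why restore depends on the reversible (pseudonymized) mode: a generic [PERSON] token cannot be mapped back unambiguously.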

Ready to try it on a real document?

Anonymize in seconds.
