Help center
Frequently asked questions
Detailed answers about Safe-Doc: shadow AI, pseudonymization, zero-storage architecture, GDPR compliance and limits. Updated with the product.
Because real usage already exists. In most organizations, employees use generative AI to move faster - often without a document security layer. That's Shadow AI: productive, but hard to audit.
Bans without alternatives create workarounds, lost productivity, and higher risk (people still send raw documents).
Safe-Doc doesn't replace internal policies. It enables controlled usage: pseudonymize before AI, stay in control, de-anonymize on the way back when needed.
Shadow AI is the use of external AI tools outside an official framework: no dedicated DPIA, no logging, no document protection before sending.
It can be small (pasting a paragraph) or massive (contracts, audits, HR files). Safe-Doc targets the latter - where document exposure is structural.
No. Safe-Doc is not an analysis model. It's a document security layer before and after the AI you already use.
You keep your usual tools on a pseudonymized or anonymized version - Safe-Doc doesn't lock you into a proprietary chat or a workspace that stores your case files.
A summary tool processes content. Safe-Doc processes exposure risk: what goes to AI, what stays with you, and what can be re-identified later.
"Workspace + built-in chat" platforms often centralize files, history and mapping tables at the vendor. Safe-Doc inverts that: ephemeral processing, not a cloud archive of sensitive case files.
Safe-Doc follows four steps:
- Import a document (file, pasted text, or Data Room batch)
- Pseudonymize or anonymize depending on mode and level N1–N2 (N3 on roadmap)
- Review detected entities (uncheck, adjust)
- Export AI-ready text + JSON mapping (if reversible) + report
On return: paste the tokenized AI answer and import the mapping to de-anonymize locally.
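The outbound half of these steps can be sketched in a few lines of Python. This is an illustration only, not Safe-Doc's engine: the `[KIND_n]` token format, the `pseudonymize` function and the shape of the exported JSON are all assumptions, and detection is reduced to an email regex plus a list of names confirmed during review.

```python
import json
import re

# Simplified email pattern for illustration; real detection is far broader.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def pseudonymize(text: str, names: list[str]) -> tuple[str, dict[str, str]]:
    """Replace emails and reviewed names with tokens.

    Returns the AI-ready text and a token -> real value mapping.
    The token format "[KIND_n]" is hypothetical.
    """
    mapping: dict[str, str] = {}   # token -> real value (what gets exported)
    by_value: dict[str, str] = {}  # real value -> token (repeats reuse a token)
    counters: dict[str, int] = {}

    def token_for(kind: str, value: str) -> str:
        if value not in by_value:
            counters[kind] = counters.get(kind, 0) + 1
            token = f"[{kind}_{counters[kind]}]"
            by_value[value] = token
            mapping[token] = value
        return by_value[value]

    text = EMAIL_RE.sub(lambda m: token_for("EMAIL", m.group(0)), text)
    # Longest names first so a full name is replaced before any shorter one.
    for name in sorted(names, key=len, reverse=True):
        if name in text:
            text = text.replace(name, token_for("PERSON", name))
    return text, mapping


safe_text, mapping = pseudonymize(
    "Alice Martin (alice.martin@example.com) signed. Contact Alice Martin.",
    names=["Alice Martin"],
)
# The mapping is what you would keep locally for the return trip.
mapping_json = json.dumps(mapping, ensure_ascii=False, indent=2)
```

Only `safe_text` would be pasted into the AI tool; `mapping_json` stays on your side.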
Single document - one contract, report or memo: fast, ideal for daily use.
Data Room - a homogeneous batch (contracts + annexes + exhibits): consistent pseudonymization - the same entity gets the same token across the batch. Essential for multi-document AI analysis.
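To make "the same entity gets the same token across the batch" concrete, here is a minimal sketch that shares one token table across several documents. The function name, the token format and the `entities` input (values confirmed during review, with their type) are hypothetical and do not describe Safe-Doc's internals.

```python
def pseudonymize_batch(
    docs: dict[str, str], entities: dict[str, str]
) -> tuple[dict[str, str], dict[str, str]]:
    """Pseudonymize a batch with one shared token table.

    docs: filename -> text. entities: real value -> kind (e.g. "ORG").
    Returns (filename -> pseudonymized text, token -> real value).
    """
    mapping: dict[str, str] = {}   # token -> real value, shared by the batch
    by_value: dict[str, str] = {}  # real value -> token, shared by the batch
    counters: dict[str, int] = {}
    out: dict[str, str] = {}
    # Longest values first so a long name is replaced before a shorter one it contains.
    ordered = sorted(entities, key=len, reverse=True)
    for filename, text in docs.items():
        for value in ordered:
            if value not in text:
                continue
            if value not in by_value:
                kind = entities[value]
                counters[kind] = counters.get(kind, 0) + 1
                by_value[value] = f"[{kind}_{counters[kind]}]"
                mapping[by_value[value]] = value
            text = text.replace(value, by_value[value])
        out[filename] = text
    return out, mapping


batch, batch_mapping = pseudonymize_batch(
    {
        "contract.txt": "Acme SAS sells to Bob Lee.",
        "annex.txt": "Annex signed by Bob Lee for Acme SAS.",
    },
    {"Acme SAS": "ORG", "Bob Lee": "PERSON"},
)
```

Because the table outlives each document, an AI model reading both files can still tell that the party in the contract and the party in the annex are the same one.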
Safe-Doc Clean
- Everyday team AI usage
- Pseudonymization or anonymization + metadata
- Standard context reduction
- Data Room included
Safe-Doc Advanced
- High-sensitivity case files
- Stronger multi-doc pseudonymization
- Context + style reduction
- Roadmap Q2 2026
- N1–N2: direct identifiers (names, emails, phones, IBAN…) and risky context (exact dates, amounts, locations, internal references)
- N3: stylistic fingerprint and weak signals (roadmap Q3 2026)
Higher levels make the document more generic: less contextual precision survives, a trade-off to validate with the business.
- Anonymized - masking oriented toward irreversibility
- Pseudonymized - consistent tokens + exportable mapping
- Fake data - plausible substitute values (depending on level)
No. Safe-Doc is a browser-accessible SaaS. No workstation installation, no IT integration required. An account is enough to get started in minutes.
For team deployments (Cabinet or Cabinet+ plan), users receive an email invitation. No IT department involvement required for standard plans.
- Personal data: names, emails, phones, addresses, national IDs…
- Confidential data: company names, internal references, contracts, amounts, IBAN…
- Identifying context: exact dates, fine locations, rare combinations (by level)
- Metadata: author, revision history, comments (DOCX), PDF layers
Exact scope depends on mode and level.
Pseudonymize if you need to reuse AI analysis with real values (mapping + de-anonymization).
Anonymize if you don't need re-identification and want fewer traces (no correspondence table to secure).
Safe-Doc offers both - not a single marketing choice.
Yes in pseudonymized mode: import mapping_xxx.json, paste the AI answer with tokens, Safe-Doc re-injects values locally.
You remain responsible for protecting the mapping - don't share it on unsecured channels.
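The return trip is essentially a token lookup. The sketch below assumes the mapping file is a flat JSON object of token to real value and that tokens look like `[KIND_n]`; both are assumptions for illustration, not the documented format of `mapping_xxx.json`.

```python
import json
import re

# Hypothetical token shape: "[PERSON_1]", "[EMAIL_12]", ...
TOKEN_RE = re.compile(r"\[[A-Z]+_\d+\]")


def deanonymize(answer: str, mapping: dict[str, str]) -> tuple[str, list[str]]:
    """Re-inject real values into an AI answer.

    Returns the restored text and the tokens that were not in the mapping,
    so they can be checked by hand instead of being silently left in place.
    """
    unknown: list[str] = []

    def restore(match: re.Match) -> str:
        token = match.group(0)
        if token in mapping:
            return mapping[token]
        if token not in unknown:
            unknown.append(token)
        return token  # leave unknown tokens untouched

    return TOKEN_RE.sub(restore, answer), unknown


# In practice the mapping would be read from the exported file on your machine.
mapping = json.loads(
    '{"[PERSON_1]": "Alice Martin", "[EMAIL_1]": "alice.martin@example.com"}'
)
restored, unknown = deanonymize(
    "[PERSON_1] must notify [ORG_2] via [EMAIL_1].", mapping
)
```

Reporting unknown tokens is a design choice of this sketch: an AI answer can invent or mangle a token, and that is easier to catch when it is listed than when it is buried in the text.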
Yes by design - that enables "analyze then re-identify". Reversibility relies on the mapping table; if it leaks, re-identification risk rises sharply.
Irreversible anonymization aims not to leave an exploitable link - but strict GDPR "anonymity" remains hard in practice on rich documents.
Pseudonymization (GDPR Art. 4(5)) replaces identifying data with tokens while keeping a mapping that allows re-identification. The data remains personal data under GDPR: processing stays subject to its obligations.
Anonymization aims for irreversibility: if no cross-matching can re-identify, the data falls outside GDPR scope. In practice on rich documents, that threshold is hard to reach.
Safe-Doc uses the correct term: it does not call a process "anonymization" when re-injection is possible.
With Safe-Doc, the process takes 4 steps:
- Import your PDF into Safe-Doc (drag-and-drop or upload)
- Choose the protection level N1–N2
- Review detected entities and export the pseudonymized text
- Paste the pseudonymized text into ChatGPT: no sensitive data goes to OpenAI
On return, import the JSON mapping to re-inject the real values into ChatGPT's response.
An NDA typically contains party names, amounts, validity dates, confidentiality clauses and internal references: all elements that must not transit in plain text to a third-party AI model.
Safe-Doc automatically pseudonymizes all these entities before sending. You get a "neutralized" contract you can analyze freely with ChatGPT, Claude or Gemini, then re-inject the real values into the result.
Safe-Doc is built on no-retention:
- No document content kept in the database or application logs
- Temporary technical processing (upload, pipeline), then purge
- Processed file and mapping (if exported) are downloaded to you
Job metadata (status, counters, technical keys) may be kept for tracking - never the full document text.
Many LegalTech solutions offer a persistent project space: file versions, chat history, hosted mapping tables, long retention to re-identify later.
Convenient - but each upload increases risk surface: data lifetime at the vendor, support/backup access, DPA and sub-processor complexity.
Safe-Doc doesn't turn client files into a cloud vault. It's a technical pass-through: in → process → out → purge.
No. Safe-Doc doesn't replace ChatGPT: it doesn't store your prompts or third-party model answers. You keep exchanges in the AI tool you already use - Safe-Doc provides the upstream/downstream document layer.
You are. Mapping is generated for local export. Its protection (encryption, retention, internal sharing) is your organization's duty. Safe-Doc doesn't keep it as a central re-identification archive.
Safe-Doc is built on privacy-by-design principles: minimization, pseudonymization (GDPR Art. 4(5)), operation traceability, EU hosting (Hetzner, Germany), 100% self-hosted AI, and an available DPA.
It's a compliance aid - not a substitute for the controller's obligations (legal basis, retention, transparency, etc.).
The European AI Act imposes transparency, documentation and risk-management obligations for AI system use, with a compliance deadline in August 2026 for high-risk systems.
Safe-Doc helps AI Act compliance by:
- Documenting each pseudonymization operation (audit report)
- Reducing personal data exposure to third-party AI models
- Enabling workflow traceability for AI on sensitive documents
Safe-Doc is not a standalone AI Act compliance tool. Consult your DPO for a full analysis.
Safe-Doc targets sensitive documents and controlled AI usage - including law firms. Pseudonymization before third-party AI reduces exposure, aligned with sector recommendations.
Lawyers and DPOs remain responsible for data choices, AI vendor selection and final review. Safe-Doc does not provide legal advice.
Data processing infrastructure: European Union (Hetzner, Germany). Scaleway is used for transactional email only - no document hosting. No third-party AI API receives your content; you then send the pseudonymized version to the AI tool of your choice from your workstation.
No. Safe-Doc doesn't use your files to train models. For external AI you use afterward, check your account terms (professional tiers with training opt-out).
Technical indicators: job id, duration, counters (pages, PII replaced, warnings) - not document bodies, no plaintext PII in application logs.
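As an illustration of "counters, not content", a log record can be limited by construction to technical fields. The `JobRecord` class, its field names and the `run_job` helper below are hypothetical; they mirror the indicators listed above and are not Safe-Doc's actual schema.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class JobRecord:
    """Technical indicators only: there is no field that can hold document text."""
    job_id: str
    duration_ms: int
    pages: int
    pii_replaced: int
    warnings: int


def run_job(pages: list[str], replace_pii) -> tuple[list[str], JobRecord]:
    """Process pages in memory; replace_pii(page) must return (new_page, count)."""
    start = time.perf_counter()
    processed: list[str] = []
    replaced = 0
    for page in pages:
        new_page, count = replace_pii(page)
        processed.append(new_page)
        replaced += count
    record = JobRecord(
        job_id=str(uuid.uuid4()),
        duration_ms=int((time.perf_counter() - start) * 1000),
        pages=len(pages),
        pii_replaced=replaced,
        warnings=0,  # not computed in this sketch
    )
    return processed, record


def log_line(record: JobRecord) -> str:
    """Serialize the record; only counters and identifiers reach the log."""
    return json.dumps(asdict(record), sort_keys=True)


# Stand-in for a real detection engine, for demonstration only.
def demo_replace(page: str) -> tuple[str, int]:
    return page.replace("Alice", "[PERSON_1]"), page.count("Alice")


processed_pages, record = run_job(["Alice met Alice.", "No names here."], demo_replace)
line = log_line(record)
```

Typing the record this way means a plaintext leak into logs would need a schema change, not just a careless log call.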
PDF and DOCX are fully supported, as is text pasted in the UI. Scanned PDFs: OCR extraction depending on scan quality.
No. Any automatic engine has false positives (mask too much) and false negatives (miss an entity). Hence human review and residual risk indicators in the report.
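Human review can be backed by a quick local check of the exported text before it leaves your workstation. The sketch below is an assumption-laden example, not a Safe-Doc feature: it looks for two obvious leftovers (email addresses and IBAN-like strings) with deliberately simple patterns, so an empty result does not prove the text is clean.

```python
import re

# Heuristic patterns for illustration; they will miss many formats.
RESIDUAL_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    # Two letters, two digits, then groups of four characters, optionally spaced.
    "iban": re.compile(
        r"\b[A-Z]{2}\d{2}(?: ?[A-Z0-9]{4}){2,7}(?: ?[A-Z0-9]{1,3})?\b"
    ),
}


def residual_findings(text: str) -> dict[str, list[str]]:
    """Return leftover matches by kind; kinds with no match are omitted."""
    findings: dict[str, list[str]] = {}
    for kind, pattern in RESIDUAL_PATTERNS.items():
        hits = pattern.findall(text)
        if hits:
            findings[kind] = hits
    return findings


leftovers = residual_findings(
    "Pay [PERSON_1] on FR76 3000 6000 0112 3456 7890 189 and cc bob@example.org."
)
```

A non-empty result is a prompt to go back to the entity review step; it is a second line of defence against false negatives, not a replacement for reading the document.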
Clean mode: common metadata removal (author, tool dates, Word comments, etc.). Stylistic fingerprint is Advanced territory (roadmap). No guarantee of removing all forensic traces.
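If you want to verify for yourself which common metadata a DOCX still carries, the file can be inspected with the Python standard library, since a DOCX is a ZIP archive. This sketch assumes the conventional part names `docProps/core.xml` and `word/comments.xml` that Word writes; `inspect_docx` is a hypothetical helper and it only covers a few core properties, not every forensic trace.

```python
import io
import xml.etree.ElementTree as ET
import zipfile

# Namespaces used by the OOXML core-properties part.
NS = {
    "cp": "http://schemas.openxmlformats.org/package/2006/metadata/core-properties",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
}

FIELDS = {
    "author": "dc:creator",
    "last_modified_by": "cp:lastModifiedBy",
    "created": "dcterms:created",
    "modified": "dcterms:modified",
}


def inspect_docx(data: bytes) -> dict[str, object]:
    """Report non-empty core properties and whether a comments part exists."""
    report: dict[str, object] = {}
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()
        report["has_comments"] = "word/comments.xml" in names
        if "docProps/core.xml" in names:
            root = ET.fromstring(zf.read("docProps/core.xml"))
            for label, path in FIELDS.items():
                node = root.find(path, NS)
                if node is not None and node.text:
                    report[label] = node.text
    return report
```

Typical usage would be `inspect_docx(Path("export.docx").read_bytes())` on the file you are about to share; a report containing only `has_comments: False` suggests the common fields are gone.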
Only if your organization has appropriate legal basis and contractual framework. Safe-Doc is not HDS-certified or cleared for national defense secrets. When in doubt, consult your DPO or compliance officer.
Yes. Safe-Doc is AI-tool agnostic: it works with ChatGPT, Claude, Gemini, Microsoft Copilot, Perplexity, Mistral and any other LLM accessible via a web interface or API.
The principle is the same: pseudonymize with Safe-Doc, send the result to the AI tool (copy-paste, API call, or an MCP connection, for example to Copilot), retrieve the response and de-anonymize locally.
All AI tools accessible via web interface or API:
- OpenAI: ChatGPT (all plans), GPT-4 API
- Anthropic: Claude (claude.ai, API)
- Google: Gemini, NotebookLM
- Microsoft: Copilot, Azure OpenAI
- Mistral, Perplexity, LLaMA via third-party interfaces
Safe-Doc does not integrate into these tools as a plugin: it operates upstream (pseudonymization) and downstream (de-anonymization). Three access modes: web UI, REST API or MCP server. No extension required.
Both tools pseudonymize documents before AI. The main differences:
- Market: Anonym-IA mainly targets lawyers; Safe-Doc targets all professionals handling sensitive documents (M&A, consulting, HR, finance, audit)
- Architecture: Safe-Doc does not store your data or see your AI queries, unlike solutions with built-in AI chat where every query transits their servers
- PII coverage: 55+ entity types detected across 6 countries
- Formats: PDF, DOCX, OCR - multi-document Data Room
Solo (€35/month · 1 seat · 1,200 pages), Cabinet (€149/month · 3 seats · 6,400 pages), Cabinet+ (€399/month · 10 seats · 20,000 pages). Enterprise on request. Add-ons: extra pages +€35/1,500 pages · extra seat +€35 (Cabinet and Cabinet+ only).
Safe-Doc offers 4 plans:
- Solo: €35/month · 1 seat · 1,200 pages/month
- Cabinet: €149/month · 3 seats · 6,400 pages/month (~€50/user)
- Cabinet+: €399/month · 10 seats · 20,000 pages/month (~€40/user)
- Enterprise: on request · unlimited seats and pages · SSO · dedicated DPA
Add-ons: +€35 / 1,500 extra pages (all plans) · +€35 / extra seat (Cabinet and Cabinet+ only).
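The per-user figures and add-on totals can be checked with a short calculation. The prices and quotas come from the list above; the assumptions are that add-ons are billed monthly and that extra pages are sold in whole blocks of 1,500. Enterprise is left out because it is priced on request.

```python
# (monthly price in EUR, seats, pages per month), taken from the plan list.
PLANS = {
    "Solo": (35, 1, 1200),
    "Cabinet": (149, 3, 6400),
    "Cabinet+": (399, 10, 20000),
}
ADDON_PRICE = 35         # EUR per add-on unit
ADDON_PAGE_BLOCK = 1500  # pages per add-on block (whole blocks assumed)


def per_seat(plan: str) -> float:
    """Monthly price divided by included seats."""
    price, seats, _ = PLANS[plan]
    return price / seats


def monthly_cost(plan: str, extra_pages: int = 0, extra_seats: int = 0) -> int:
    """Plan price plus add-ons; extra seats exist on Cabinet and Cabinet+ only."""
    if extra_seats and plan == "Solo":
        raise ValueError("Extra seats are not available on the Solo plan")
    price, _, _ = PLANS[plan]
    page_blocks = -(-extra_pages // ADDON_PAGE_BLOCK)  # ceiling division
    return price + page_blocks * ADDON_PRICE + extra_seats * ADDON_PRICE


cabinet_per_seat = round(per_seat("Cabinet"), 2)
cabinet_plus_per_seat = round(per_seat("Cabinet+"), 2)
example_total = monthly_cost("Cabinet", extra_pages=2000, extra_seats=1)
```

Under these assumptions, a Cabinet team needing 2,000 extra pages and one extra seat pays for two page blocks and one seat on top of the base price.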
- Legal, compliance, audit teams
- Law firms and regulated professions
- Finance, HR, M&A, operations
- SMEs adopting AI without a dedicated Shadow AI function
No. Many customers start with a pilot team (legal, compliance, innovation) then extend usage rules: pseudonymize before any external AI on sensitive documents.
No. Safe-Doc significantly reduces exposure and re-identification risk - it doesn't remove all cross-matching possibilities (rare amounts, unique context, style).
The service is provided on a best-efforts basis. Human review and settings adapted to the case remain essential.
The user: level choice, entity validation, decision to send to third-party AI. Safe-Doc provides indicators and an action report - not legal validation of the final document.
Technically yes - not recommended. Any automated output should be reviewed before external transmission, especially on high-stakes matters (litigation, M&A, disciplinary).
Have a question that isn't covered here?
Contact us or try Safe-Doc directly on one of your documents.