Tokumeika is an AI-powered document anonymization service. Upload your documents (PDF, DOCX, PPTX, XLSX, HTML, CSV, JSON, XML, EPUB, or TXT) and the AI will detect and redact personal information. Refine the results through a chat interface, download the clean file, and maintain a tamper-evident audit trail for compliance.
Click "Upload" in the navigation bar and select your document.
| Format | Extension | Notes |
|---|---|---|
| PDF | .pdf | Text-based PDFs. Scanned-image-only PDFs are not supported. |
| Word | .docx | Microsoft Word format. Legacy .doc files are not supported. |
| PowerPoint | .pptx | Microsoft PowerPoint format. Legacy .ppt files are not supported. |
| Excel | .xlsx | Microsoft Excel format. Legacy .xls files are not supported. |
| HTML | .html | HTML files. |
| CSV | .csv | Comma-separated values. |
| JSON | .json | JSON data files. |
| XML | .xml | XML data files. |
| EPUB | .epub | E-book format. Anonymized output is PDF. |
| Text | .txt | Plain text files (UTF-8 recommended). |
File size and character limits depend on your plan. Text beyond the per-file character limit is truncated from the output rather than passed through unredacted.
Uploaded files are stored in AWS S3 with KMS encryption and automatically scanned for viruses by ClamAV.
After uploading, the AI automatically scans your document and detects the following types of personal information:
Detected entities are replaced with placeholders like [EMAIL_REDACTED] and [PHONE_REDACTED]. Review the results and use the chat interface to request additional redactions or exceptions.
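To illustrate what placeholder replacement looks like, here is a minimal sketch. It uses simple regular expressions as a stand-in for detection; the service itself uses AI-based detection, so the patterns, the `PATTERNS` table, and the `redact` function are hypothetical and only show the shape of the output.

```python
import re

# Hypothetical, deliberately simple patterns. They stand in for the
# service's AI-based detection and will miss many real-world cases.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d\s().-]{7,}\d"),
}


def redact(text: str) -> str:
    """Replace each match with a placeholder such as [EMAIL_REDACTED]."""
    for label, pattern in PATTERNS.items():
        text = pattern.sub(f"[{label}_REDACTED]", text)
    return text


if __name__ == "__main__":
    print(redact("Contact jane@example.com or +1 415-555-0100."))
```

Because every entity of a given type maps to the same placeholder, the output shows what kind of information was removed without revealing anything about the original value.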
If the metadata-stripping module is enabled for your workflow, the pipeline can also remove author, revision history, and similar document metadata.
The chat interface lets you refine anonymization results using natural language. For example:
The AI understands context and updates results based on your instructions. Iterate as many times as needed until you are satisfied.
Chat interactions consume credits from your balance. Credit usage is calculated based on the AI model's token consumption.
The Audit page provides a complete record of all actions performed on your documents:
Pipeline processing audit records are secured with a SHA-256 hash chain, making tampering detectable. Use this for compliance audits and internal controls.
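The idea behind a SHA-256 hash chain can be sketched as follows. This is a generic illustration, not Tokumeika's actual record format: the field names (`payload`, `prev_hash`, `hash`), the JSON serialization, and the all-zero `GENESIS` value are assumptions made for the example.

```python
import hashlib
import json

# Assumed starting value for the first record in the chain.
GENESIS = "0" * 64


def record_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous record's hash together with this record's payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_record(chain: list, payload: dict) -> None:
    """Append a record that is linked to the hash of the one before it."""
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append(
        {"payload": payload, "prev_hash": prev, "hash": record_hash(prev, payload)}
    )


def verify_chain(chain: list) -> bool:
    """Recompute every hash; any edited, removed, or reordered record fails."""
    prev = GENESIS
    for rec in chain:
        if rec["prev_hash"] != prev:
            return False
        if rec["hash"] != record_hash(prev, rec["payload"]):
            return False
        prev = rec["hash"]
    return True
```

Each record's hash depends on all records before it, so changing one entry invalidates every later hash. That is what makes tampering detectable, although the chain alone does not prevent it.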
Tokumeika uses a two-part pricing model: a monthly plan (which sets your file and size limits) and AI credits (which pay for chat usage).
The monthly plan determines how many files you can process per month, the maximum file size, and the character limit per file. See the Pricing page for details.
AI credits are consumed when you use the AI chat to refine results. 1 credit = $0.01 USD, and usage is calculated from the AI model's token consumption. Purchase credits via credit packs or redeem a voucher code.
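As a quick worked example of the credit-to-dollar rate, the sketch below converts in both directions. The function names are hypothetical, and the mapping from model tokens to credits is not modeled because this page does not state it.

```python
from decimal import Decimal

# Stated rate: 1 credit = $0.01 USD.
USD_PER_CREDIT = Decimal("0.01")


def credits_to_usd(credits: int) -> Decimal:
    """Dollar value of a credit balance."""
    return credits * USD_PER_CREDIT


def credits_for_usd(amount_usd: str) -> int:
    """Whole credits that a dollar amount corresponds to (rounded down)."""
    return int(Decimal(amount_usd) / USD_PER_CREDIT)


if __name__ == "__main__":
    print(credits_to_usd(500))       # value of 500 credits in USD
    print(credits_for_usd("12.50"))  # credits equivalent to $12.50
```

`Decimal` is used instead of `float` so that cent values stay exact.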
| | Basic | Advanced | Max |
|---|---|---|---|
| Price | $10/mo | $50/mo | $200/mo |
| Files / month | 20 | 200 | 2,000 |
| Processing runs / month | 50 | 500 | 5,000 |
| Characters / file | 50,000 | 100,000 | 500,000 |
| Max file size | 2.0 MB | 5.0 MB | 10.0 MB |
| PDF page limit | 100 | 250 | 500 |
| Image size limit | 25 MP | 100 MP | 100 MP |
| Monthly credits | 500 | 2,000 | 10,000 |
New accounts start on the Free plan with 3 files per month, 2 MB max file size, and 50,000 characters per file. Upgrade to a paid plan for higher limits and monthly credits.
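The limits above can be expressed as a small pre-upload check. This is a sketch only: `PlanLimits`, `PLANS`, and `check_upload` are hypothetical names, the figures are copied from this page, and the real service enforces its limits server-side.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlanLimits:
    files_per_month: int
    max_chars_per_file: int
    max_file_size_mb: float


# Figures taken from the plan descriptions on this page.
PLANS = {
    "Free": PlanLimits(3, 50_000, 2.0),
    "Basic": PlanLimits(20, 50_000, 2.0),
    "Advanced": PlanLimits(200, 100_000, 5.0),
    "Max": PlanLimits(2_000, 500_000, 10.0),
}


def check_upload(plan: str, size_mb: float, char_count: int, files_used: int) -> list:
    """Return a list of human-readable issues; an empty list means no issues."""
    limits = PLANS[plan]
    issues = []
    if files_used >= limits.files_per_month:
        issues.append("monthly file quota reached")
    if size_mb > limits.max_file_size_mb:
        issues.append(f"file exceeds {limits.max_file_size_mb} MB")
    if char_count > limits.max_chars_per_file:
        # Over-limit text is truncated rather than rejected.
        issues.append(f"text will be truncated at {limits.max_chars_per_file} characters")
    return issues
```

For example, `check_upload("Basic", 3.0, 60_000, 20)` reports all three issues, while the same file on the Max plan reports none.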
The Max plan includes compliance-readiness features designed to support HIPAA, SOC 2, and GDPR workflows.