Audit-grade. Offline by design.
The Matrix files your documents on the machine they were scanned on. Your scan contents never traverse the internet, never reach an LLM, and never appear on someone else's server.
A$29/mo · 14-day money-back guarantee · Windows · macOS · Linux
Four claims. Each backed by a line of code.
If you can read Python, you can audit our claims. Source paths point to public files in the same repo we ship from.
Offline by design at runtime
The runtime pipeline reads and writes files on the user's machine only. No runtime upload step exists. The only outbound HTTP call the desktop app makes is a licence validation POST that contains the licence key, machine ID, optional machine label, and app version. Nothing else.
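A minimal sketch of what a validation request of that shape could look like, for reviewers who want something concrete to compare against a packet capture. This is not the code in apps/tray/app.py. The field names follow the egress table later on this page; the `https://` scheme, the JSON encoding, and the function name `build_validation_request` are assumptions made for illustration.

```python
import json
import urllib.request
from typing import Optional

# Assumed URL: host and path are named on this page, the scheme is a guess.
VALIDATE_URL = "https://license.chunkland.com/license/validate"


def build_validation_request(
    license_key: str,
    machine_id: str,
    app_version: str,
    machine_label: Optional[str] = None,
) -> urllib.request.Request:
    """Build (but do not send) a licence validation POST.

    The body carries only licence and machine metadata. There is no
    parameter through which scan contents or file names could be passed.
    """
    payload = {
        "license_key": license_key,
        "machine_id": machine_id,
        "app_version": app_version,
    }
    if machine_label is not None:
        payload["machine_label"] = machine_label  # optional field
    body = json.dumps(payload, sort_keys=True).encode("utf-8")
    return urllib.request.Request(
        VALIDATE_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Building the request separately from sending it makes the body easy to inspect in a review: `json.loads(req.data)` shows every field that would go on the wire.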
apps/tray/app.py · Validate_license()
services/license/main.py · /license/validate
Deterministic recognizers, not probabilistic
No LLM in the runtime path. No OCR-as-classifier. The recognizer set in your install kit was hand-built from your real samples; each entry has a tight match condition. Pages that don't match any recognizer park in UNKNOWN/. Never silently misfiled. A $10M contract cannot be misclassified as an invoice because we don't classify by content.
Bounded setup ingest
At setup, you knowingly email as many sample documents as you have (the more the better for accuracy) to setup@chunkland.com. Those samples are stored for up to 30 days then auto-deleted. The runtime app never sends a scan back to ChunkLand. The samples are a one-time configuration input, not a recurring data path.
Nothing we can see at runtime
ChunkLand operates no runtime document ingest service. The licence server stores licence keys, contact email addresses, plans, seat counts, and machine IDs/labels. That's it. After your 30-day setup samples are deleted, we literally cannot answer "how many documents does customer X have" because we don't have the data.
services/license/db.py · Licenses, machines tables only
What The Matrix does not do.
Security reviews live or die on the negatives. Here is everything that is absent. By design.
- No runtime cloud document ingest. Your day-to-day scans never leave your disk.
- No OCR-as-classifier. OCR is used locally to read text into the recognizer engine; it is never sent off the machine.
- No LLM at runtime. We don't summarise, categorise, or interpret content via a model on your scans.
- No training on your documents. The 30-day setup samples are configuration input, never training input.
- No third-party analytics inside the desktop app. No telemetry on file activity.
- No data-broker sharing. Ever.
- No ad networks, no tracking pixels, no marketing SDKs in the app.
- No password reset flow on the licence server. Licence keys are the only credential.
Recognizers in. Filing decisions out. No model in your data path.
Each install kit ships with a provision.json hand-built from the samples you emailed at setup. At runtime, the local recognizer engine reads the file with OCR, applies your recognizer set, and routes the page. Same input → same output. Anything that doesn't match parks in UNKNOWN/ for human review.
Where the input comes from
The Matrix watches one of two inputs, polled every 5 minutes:
- An Outlook inbox subfolder (typical office workflow). The app reads via the user's locally-installed Outlook client. Same security boundary as Outlook itself. Processed emails move to a paired "Complete" subfolder inside the same mailbox. ChunkLand does not connect to Exchange, Microsoft 365, or your mail server. Nothing about the messages leaves your computer.
- A watched folder on disk (direct-scanner workflow). The app watches the path you chose at setup, processes new PDFs as they appear, and writes the renamed output to your chosen destination.
Either input is read by the same local engine. No cloud queue, no upload step, no third-party intermediary.
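For the watched-folder input, the polling behaviour described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the shipped watcher: the function names, the in-memory `seen` set, and the choice to detect PDFs by file extension are all hypothetical. Only the 5-minute interval comes from this page.

```python
import time
from pathlib import Path
from typing import Callable

POLL_SECONDS = 5 * 60  # this page states inputs are polled every 5 minutes


def find_new_pdfs(watched: Path, seen: set[str]) -> list[Path]:
    """Return PDFs in `watched` that no earlier poll returned, sorted by name.

    `seen` is updated in place so the next poll skips these files.
    """
    new = [
        p
        for p in sorted(watched.iterdir())
        if p.is_file() and p.suffix.lower() == ".pdf" and p.name not in seen
    ]
    seen.update(p.name for p in new)
    return new


def poll_forever(watched: Path, handle: Callable[[Path], None]) -> None:
    """Poll the folder on a fixed interval and hand each new PDF to `handle`.

    Everything here is local file I/O; no network call is involved.
    """
    seen: set[str] = set()
    while True:
        for pdf in find_new_pdfs(watched, seen):
            handle(pdf)
        time.sleep(POLL_SECONDS)
```

A real watcher would also need to persist `seen` across restarts and wait for a scanner to finish writing a file before picking it up; both are left out to keep the sketch short.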
The runtime decision path
For every page in a watched-folder or inbox scan:
- OCR locally. The page text is extracted on-machine; the OCR output never leaves the device.
- Recognizer match. Each recognizer in your provision.json is tried in priority order. Match conditions are tight regex / barcode capture-group / layout anchor. All built off your real samples.
- Route or quarantine. First successful match decides the destination folder. No match across the whole set = page parks in UNKNOWN/ with a one-line note.
- Document boundaries. The recognizer set also encodes "this is the start of a new document" cues, so a 200-page mixed PDF is split correctly.
No probability score on the page. No "87% confident invoice." A recognizer either matches or it doesn't, and the conditions are auditable in your kit's provision.json.
If a new supplier or document type starts landing in UNKNOWN/, you reply to your setup email with a sample. We update the kit by hand and email back a new provision.json. The recognizer set grows in a way you can read.
Above is illustrative output. See src/runtime/recognizers.py for the engine.
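The match-or-quarantine rule in the decision path can be sketched in a few lines. This is a simplified stand-in, not the engine in src/runtime/recognizers.py: the `Recognizer` fields, the "lower number is tried first" priority rule, and the two sample patterns are all hypothetical, and only regex conditions are shown (barcode and layout-anchor conditions are omitted).

```python
import re
from dataclasses import dataclass
from typing import Optional

UNKNOWN = "UNKNOWN/"


@dataclass(frozen=True)
class Recognizer:
    name: str
    priority: int     # assumption: lower number is tried first
    pattern: str      # regex applied to the page's locally extracted OCR text
    destination: str  # folder the page is routed to on a match


def route_page(
    ocr_text: str, recognizers: list[Recognizer]
) -> tuple[str, Optional[str]]:
    """Return (destination, recognizer name). First match in priority order wins.

    There is no score: a recognizer either matches or it does not,
    and a page that matches nothing goes to UNKNOWN/.
    """
    for rec in sorted(recognizers, key=lambda r: r.priority):
        if re.search(rec.pattern, ocr_text):
            return rec.destination, rec.name
    return UNKNOWN, None


# Hypothetical recognizer set, standing in for entries in a provision.json.
RECOGNIZERS = [
    Recognizer("acme_invoice", 10,
               r"ACME Pty Ltd\s+Tax Invoice\s+INV-\d{6}", "Invoices/ACME/"),
    Recognizer("bank_statement", 20,
               r"Statement Period:\s+\d{2}/\d{2}/\d{4}", "Bank/Statements/"),
]
```

Because the function depends only on its two arguments, calling it twice with the same text and the same recognizer set returns the same result, which is the property a reviewer can check by reading the rules.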
Every outbound packet, accounted for.
If you put The Matrix behind a firewall and watched the traffic, this is everything you would see. Nothing else.
| Destination | When | Body sent | Body NOT sent |
|---|---|---|---|
| license.chunkland.com /license/validate | App start, then on-demand | license_key, machine_id, machine_label (optional), app_version | No scan contents. No file paths. No file names. No user names. No document IDs. No recognizer outputs. |
| hello@chunkland.com (mailto:) | Manual. You click "Resend my license" or "Email support" | Whatever you type in your email client | App never auto-sends mail. App never reads your inbox. |
That's the entire egress surface. The license server never receives document data because the desktop app never sends document data. The validate endpoint accepts the four fields above and rejects anything else (verified in services/license/main.py · ValidateRequest).
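To illustrate what "accepts the four fields and rejects anything else" means in practice, here is a standard-library sketch of a strict request parser. It borrows the name `ValidateRequest` from the source path cited above but is not that class. The helper `parse_validate_request`, the error messages, and the rule that every field is a string are assumptions.

```python
from dataclasses import dataclass
from typing import Optional

REQUIRED = {"license_key", "machine_id", "app_version"}
OPTIONAL = {"machine_label"}


@dataclass(frozen=True)
class ValidateRequest:
    license_key: str
    machine_id: str
    app_version: str
    machine_label: Optional[str] = None


def parse_validate_request(body: dict) -> ValidateRequest:
    """Accept exactly the documented fields and reject everything else."""
    keys = set(body)
    missing = REQUIRED - keys
    extra = keys - REQUIRED - OPTIONAL
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if extra:
        # A body carrying e.g. a file name or page text is refused outright.
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    if not all(isinstance(value, str) for value in body.values()):
        raise ValueError("all fields must be strings")
    return ValidateRequest(**body)
```

Rejecting unknown keys, rather than silently ignoring them, means a client that tried to attach document data would fail validation instead of having that data quietly accepted.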
How The Matrix maps to your framework.
We sit on the lawful side of these regimes by handling no personal information ourselves. You stay the data controller. Below: what each framework requires, how The Matrix meets it, and what stays your job.
| Framework | What it requires | How The Matrix meets it | What you still need to do |
|---|---|---|---|
| Privacy Act 1988 (AU) & the APPs | Lawful collection, use, disclosure, security, and storage of personal information; cross-border disclosure controls (APP 8). | The Matrix never collects, uses, or discloses your clients' personal information. Scans stay on your device. No cross-border transfer occurs because no transfer occurs. | Maintain your own APP 1 privacy policy. Secure the workstation. Manage retention/destruction of the local files. |
| GDPR (EU) | Lawful basis, data minimisation, purpose limitation, security of processing, data-subject rights, restricted international transfers. | ChunkLand processes no personal data inside your scans (Art. 4). The app performs no transfer of scan data into or out of the EU. The license server holds your own contact email + license key only. | You remain the controller. Document your lawful basis, run your DPIA, and handle subject access for the documents you hold. |
| HIPAA (US) | Covered entities and business associates must safeguard PHI; vendors who touch PHI sign a BAA. | We do not claim a BAA. And we don't need one, because The Matrix does not transmit, store, or process PHI on our infrastructure. You process PHI locally; we never see it. | You apply HIPAA Security Rule controls to the workstation, the disk, and your backup chain. |
| Attorney–client privilege | Privileged communications must not be disclosed to third parties; cloud transmission can risk waiver. | Privileged matter never leaves your control. There is no third party in the document path. Sorting and filing happen entirely on your workstation, and ChunkLand has no document ingest endpoint to subpoena. | Apply your firm's standard physical and digital safeguards to the workstation. |
Not legal advice. ChunkLand is a sole trader (ABN 53 628 676 390). You are responsible for your compliance posture. We're happy to provide written attestation (below) for your auditor.
Local + deterministic vs. Cloud + probabilistic.
Cloud AI doc sorters upload your scans to a server, run an OCR + LLM pass, and sort on the vendor's side. That's a different threat model from ours.
| | The Matrix (local + deterministic) | Cloud AI doc sorters |
|---|---|---|
| Where your data lives | Your disk. Optional external drive. That's it. | Vendor cloud + sub-processors (LLM provider, storage, CDN). |
| Who can subpoena it | Only you. We hold no copy. | Vendor + every sub-processor in their chain. |
| Filing decision auditability | Recognizer rule per page. Readable in your provision.json. | "The model decided." No human-readable rule. |
| Offline operation | Yes. 7-day grace mode for air-gapped runs. | No. Loss of internet = no sorting. |
| Cost scaling | Flat fee per seat. Unmetered pages. | Per-page or per-API-call. Costs grow with volume. |
| Misfile risk | Page either matches a recognizer in your kit or is flagged UNKNOWN. No silent misclassification. | Probabilistic. Model confidence varies, errors are silent. |
Signed attestation letter.
Customers can request a signed PDF attestation suitable for sharing with internal audit or an external assessor.
What the letter says
It's a one-page PDF on ChunkLand letterhead, signed and dated, stating:
ChunkLand attests that The Matrix does not transmit scanned document contents over any network. The desktop application's only outbound communication is a license-validation request to license.chunkland.com, the contents of which are limited to license key, machine ID, machine label, and app version. Valid from {issue_date} for 12 months.
Signed, ChunkLand sole trader
ABN 53 628 676 390
If your audit or review process needs it earlier, send us your customer name and we'll issue it within two business days.
Request attestation letter
Questions we get from compliance teams.
Does The Matrix send my documents to OpenAI, Google, or any LLM?
No. There is no LLM call, no AI vendor, and no inference service in the document path. See "Every outbound packet, accounted for" above. The only outbound traffic is a license-validation POST containing the license key, machine ID, machine label, and app version. No scan contents, file names, or document IDs ever leave your machine.
Where is my data stored?
On your machine's disk, in folders you choose during setup (default ~/Matrix/inbox, outbox, scans, filed). You can point those at an external drive, an encrypted volume, or a network share that you control. ChunkLand has zero copies. We have no document storage of any kind.
Can I run this air-gapped?
Yes. The desktop app validates your licence online at first launch, then enters a 7-day offline grace period. While in grace mode, sorting and filing both work without a network connection. Re-validate online once a week to keep grace rolling. Implementation: apps/tray/app.py · Validate_license(), grace_days = 7.
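The grace-period rule can be expressed as a small check. This is a sketch of the idea only, not the logic in apps/tray/app.py: the function name `may_run_offline`, the use of timezone-aware timestamps, and the decision to refuse when the clock appears to have gone backwards are assumptions. The 7-day value is the one cited above.

```python
from datetime import datetime, timedelta, timezone

GRACE_DAYS = 7  # mirrors the grace_days = 7 value cited on this page


def may_run_offline(last_validated: datetime, now: datetime) -> bool:
    """Return True while the last successful online validation is recent enough.

    Assumption: a `now` earlier than `last_validated` (clock set back)
    is treated as outside the grace window.
    """
    elapsed = now - last_validated
    return timedelta(0) <= elapsed <= timedelta(days=GRACE_DAYS)


# Example: validated on 1 March, checked six days later while offline.
validated_at = datetime(2025, 3, 1, 9, 0, tzinfo=timezone.utc)
still_ok = may_run_offline(validated_at, validated_at + timedelta(days=6))
```

Each successful online validation resets `last_validated`, which is what "keep grace rolling" amounts to under this model.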
What happens to my data if ChunkLand shuts down?
Nothing happens to your data. Your sorted PDFs and your already-filed folders sit on your disk forever. They're regular files we never touched. Your licence key sits in a local config file and your ~/Matrix folders keep working. Worst case, the desktop app stops being able to call the validation server; a future release would unlock that with an offline-permanent licence. There is no vendor lock-in because there is no vendor data path to lock you into.
Do you have SOC 2 or ISO 27001?
Not yet. We're a sole trader; large-scale third-party audits cost $50k+, and we'd rather say so plainly than bake that cost into prices for customers who don't need it. The Matrix is a Personal-tier product. If you need a SOC 2 / ISO 27001 audit before you can install personal software at work, your IT policy is the more relevant gate, not ours.
What's your data-breach notification policy?
We have no customer document data to breach. The only customer data we hold is on the license server: license keys, contact email addresses, plan, seat counts, and machine IDs/labels (the threat model and rotation procedures for these are documented in services/license/SECURITY.md). If that database is ever compromised, we will notify affected customers by email within 72 hours and follow the rotation playbook. There is no document-content breach path because there is no document-content storage.
Want to verify any of this?
Happy to walk through any of the following. No cost, no obligation.
What we can run with you
- Source-code review. Remote, ~30 minutes. We share screen, walk you through codec.py, reader.py, and the license-validate endpoint, and you ask whatever you want.
- Egress firewall test. Run The Matrix behind a restrictive firewall on your side. We'll specify the only outbound destination it needs (license.chunkland.com), and you can confirm with your own packet capture that nothing else moves.
- Custom attestation language. If you need specific wording for your records, send it over. We'll mark up what we can sign as-is, redline what needs adjusting, and explain why.
Email hello@chunkland.com with the subject line "Security review". A human responds within one business day.
Email · Subject "Security review"
Run it on a folder. See for yourself.
A$29/month, 14-day money-back guarantee. Email me as many sample documents as you have (the more the better for accuracy) and I'll configure the install around them; scan a real day's worth of work into the watched folder and watch every page land where it belongs. Cancel any month.