For IT Reviewers
The PHI firewall your CISO has been asking for.
Project Nightingale ships with the Twin Service — a pseudonymization gateway where your hospital's key never leaves your network. Our servers receive a surrogate FHIR bundle and a re-identification risk score. They never see the patient's name, MRN, address, or birthdate. The nurse charting assistant is the first application of it.
For each real identifier — name, MRN, DOB, dates, address, phone — the Twin Service derives a deterministic surrogate inside your network. The same real patient produces the same surrogate every time, so trends thread shift to shift, but the link back to the human stays on-prem. Clinical metrics (vitals, labs, dose spacing, scored scales) pass through untouched — they're the signal the AI needs; they're not what identifies a patient.
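A minimal sketch of how a deterministic surrogate could be derived, assuming a keyed hash (HMAC-SHA256) over the normalized identifier. The pool contents, the function name `derive_surrogate_name`, and the normalization step are illustrative assumptions, not the Twin Service's actual implementation. Python is used for readability only; the prototype itself runs in the browser.

```python
import hashlib
import hmac

# Hypothetical surrogate pool; a real pool would be far larger to limit collisions.
SURROGATE_NAMES = ["Avery Stone", "Jordan Reed", "Morgan Lake", "Riley Park"]


def derive_surrogate_name(hospital_secret: bytes, real_name: str) -> str:
    """Map a real name to a pool entry using a keyed hash (HMAC-SHA256)."""
    # Normalize case and whitespace so trivial formatting differences
    # still resolve to the same surrogate.
    normalized = " ".join(real_name.lower().split()).encode("utf-8")
    digest = hmac.new(hospital_secret, normalized, hashlib.sha256).digest()
    # Interpret the first 8 bytes as an integer and index into the pool.
    index = int.from_bytes(digest[:8], "big") % len(SURROGATE_NAMES)
    return SURROGATE_NAMES[index]
```

Because the mapping depends only on the secret and the input, the same patient yields the same surrogate on every shift, and nothing outside the network can recompute it without `hospital_secret`.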
Legal framing. This is GDPR Art. 4(5) pseudonymization, not HIPAA Safe Harbor de-identification. Keyed surrogates are derived from the source value, so the output remains PHI under HIPAA; that is why real patient data waits until the BAA stack is live. We cite Safe Harbor only for the ZIP-truncation rule (§164.514(b)(2)(i)(B); 3-digit ZIP prefixes covering a population of 20,000 or fewer are mapped to 000**). We will not market "de-identified" before the audit closes.
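The ZIP-truncation rule can be sketched as follows. The function name `truncate_zip` and the example prefix set are assumptions for illustration; the authoritative list of low-population prefixes has to come from current Census data and is not reproduced here.

```python
def truncate_zip(zip_code: str, low_population_prefixes: set[str]) -> str:
    """Keep only the first three ZIP digits, or "000" for low-population prefixes."""
    digits = "".join(ch for ch in zip_code if ch.isdigit())
    if len(digits) < 5:
        raise ValueError("expected at least a 5-digit ZIP code")
    prefix = digits[:3]
    # Prefixes covering 20,000 or fewer people are collapsed to "000".
    if prefix in low_population_prefixes:
        return "000"
    return prefix
```

For example, with a placeholder set `{"036", "059"}`, `truncate_zip("03601", ...)` returns `"000"` while `truncate_zip("94110", ...)` returns `"941"`.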
What we receive vs. what stays on-prem
Stays on-prem (never sent)
- Patient names, MRNs, account numbers
- Dates of birth and real calendar dates
- Street addresses, phone numbers, emails
- The hospital_secret and key_id for surrogate derivation
- The hashed-name local map (browser IndexedDB in the prototype; on-prem container in production)
Sent to Nightingale (surrogate only)
- Surrogate names from a deterministic pool
- Format-preserved encrypted MRNs / account numbers
- Dates shifted by one per-patient offset (intervals preserved)
- 3-digit ZIP truncation (Safe Harbor rule, §164.514(b)(2)(i)(B))
- Clinical signal: vitals, labs, meds, doses, scores — untouched
- Re-identification risk score and reasons
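The per-patient date shift listed above can be sketched like this, assuming the offset is derived from a keyed hash of a stable patient identifier. `MAX_SHIFT_DAYS`, the function names, and the choice to always shift into the past are hypothetical; the source only states that one offset per patient is applied and that intervals are preserved.

```python
import hashlib
import hmac
from datetime import date, timedelta

MAX_SHIFT_DAYS = 365  # hypothetical bound on the shift


def patient_offset_days(hospital_secret: bytes, patient_id: str) -> int:
    """Derive one stable, non-zero offset (in days) per patient."""
    digest = hmac.new(hospital_secret, patient_id.encode("utf-8"), hashlib.sha256).digest()
    # Result is in the range -MAX_SHIFT_DAYS .. -1, so every date moves.
    return -(1 + int.from_bytes(digest[:8], "big") % MAX_SHIFT_DAYS)


def shift_date(real: date, offset_days: int) -> date:
    """Apply the patient's offset; the same offset is used for all of their dates."""
    return real + timedelta(days=offset_days)
```

Since every date for a patient moves by the same number of days, the gap between admission and discharge, or between two doses, is unchanged in the surrogate bundle.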
Key rotation without continuity loss
Every surrogate carries a key_id. Rotating the hospital_secret issues a new key_id; existing surrogates remain reversible by the prior key until you retire it. Continuity across shifts survives rotation — your charts don't shatter because security policy required a quarterly key roll.
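One way to picture rotation is a small key ring in which each surrogate is tagged with the `key_id` that produced it. This is a sketch under assumptions: the `KeyRing` and `Surrogate` names are hypothetical, and "reversible" is modeled here as being able to re-derive and match a surrogate under its original key, not as decryption.

```python
import hashlib
import hmac
from typing import NamedTuple


class Surrogate(NamedTuple):
    key_id: str
    token: str


class KeyRing:
    """Holds hospital secrets by key_id; the newest key signs new surrogates."""

    def __init__(self) -> None:
        self._secrets = {}          # key_id -> secret bytes
        self.active_key_id = None

    def rotate(self, key_id: str, secret: bytes) -> None:
        # Prior keys stay in the ring until explicitly retired.
        self._secrets[key_id] = secret
        self.active_key_id = key_id

    def retire(self, key_id: str) -> None:
        if key_id == self.active_key_id:
            raise ValueError("cannot retire the active key")
        self._secrets.pop(key_id)

    def derive(self, real_value: str, key_id: str = "") -> Surrogate:
        key_id = key_id or self.active_key_id
        secret = self._secrets[key_id]  # raises KeyError once the key is retired
        token = hmac.new(secret, real_value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]
        return Surrogate(key_id, token)

    def matches(self, real_value: str, surrogate: Surrogate) -> bool:
        # Re-derive under the key named in the surrogate, not the active key.
        expected = self.derive(real_value, surrogate.key_id)
        return hmac.compare_digest(expected.token, surrogate.token)
```

After `rotate("k2", ...)`, surrogates issued under `"k1"` still match until `retire("k1")` is called, which is the window in which older chart entries can be linked forward to the new key.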
Twin Service roadmap
- Today (prototype). Twin Service runs as a Web Worker inside the browser tab. The hashed-name map sits in IndexedDB, session-scoped, wiped on close. Banner stays up — synthetic data only.
- Post-BAA (described, not built). Same engine packaged as a small container that runs inside the hospital VPC. Lovable Cloud receives only the surrogate bundle. No code change in the app — different transport.
- Hardening. Replace the demo-grade format-preserving substitution with a vetted FF3-1 library, expand quasi-identifier heuristics (occupation, public-figure markers), and structurally walk FHIR resources instead of string-serializing.
Nurses paste synthetic context. PHI is redacted client-side before any network call. A BAA-track LLM drafts. The nurse verifies and signs every output. We store the signed hash, not the chart text. The result is an audit trail your IT department can actually approve — and a clean liability boundary that keeps the nurse, not the model, on the byline.
Architecture
Single flow, no hidden services. Browser-side scrubbing precedes every server call.
What we store vs. what we don't
Stored
- Signed chart hash (SHA-256): 32-byte digest binding final text + license + timestamp.
- Nursys license verification result: state, license #, status (active/encumbered), checked-at.
- Audit log row: append-only; user id, action, hash, IP, user-agent, server ts.
- Account email + role: for sign-in and admin gating only.
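A signed chart hash of this kind could be computed roughly as below. The field names, the canonical-JSON serialization, and the function name `signed_chart_hash` are assumptions; the source specifies only that a SHA-256 digest binds the final text, the license, and the timestamp.

```python
import hashlib
import json


def signed_chart_hash(final_text: str, license_number: str, signed_at: str) -> str:
    """Return the hex SHA-256 digest over text, license, and timestamp."""
    # Canonical JSON (sorted keys, fixed separators) so identical fields
    # always serialize to identical bytes.
    payload = json.dumps(
        {"license": license_number, "signed_at": signed_at, "text": final_text},
        sort_keys=True,
        separators=(",", ":"),
        ensure_ascii=False,
    ).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()
```

The result is 64 hex characters (32 bytes). Changing any one of the three inputs changes the digest, so the stored hash can later confirm what was signed without retaining the chart text.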
Never stored
- Raw clinical input text: discarded server-side after the AI call returns.
- AI draft text: held in browser memory; persisted only as a hash on sign.
- Clipboard / dictation buffers: never transmitted; redaction runs before fetch.
- Patient identifiers: PHI patterns are scrubbed client-side before egress.
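To illustrate what pattern-based scrubbing before egress looks like, here is a minimal regex redactor. The patterns, placeholders, and the `redact` function are assumptions for illustration, not the shipped rule set, and a handful of regexes will miss identifiers that a clinical-grade tool would catch. Python is used for readability only; the real redaction runs in the browser.

```python
import re

# Hypothetical patterns; each pairs a placeholder with a compiled regex.
PHI_PATTERNS = [
    ("[MRN]", re.compile(r"\bMRN[:#\s]*\d{6,10}\b", re.IGNORECASE)),
    ("[PHONE]", re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b")),
    ("[DATE]", re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b")),
    ("[EMAIL]", re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b")),
]


def redact(text: str) -> str:
    """Replace each matched identifier with its placeholder before any network call."""
    for placeholder, pattern in PHI_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

Given `"Pt MRN: 00123456 called from 555-867-5309 on 03/14/2026, BP 128/82"`, this returns `"Pt [MRN] called from [PHONE] on [DATE], BP 128/82"`: the identifiers are replaced while the blood pressure reading passes through.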
Compliance roadmap (honest labels)
Status pills are deliberate. We will not market "SOC 2 compliant" before the audit closes.
| Control | Status | Notes |
|---|---|---|
| HIPAA technical safeguards (encryption, access control, audit) | Shipped | TLS 1.2+, AES-256 at rest, RLS, append-only audit. |
| MFA on all accounts | Shipped | TOTP enrollment available; required for admin role. |
| BAA with LLM provider | In progress | Provider interface abstracted; Azure OpenAI / Bedrock swap-in ready. |
| Real Nursys API license verification | In progress | Adapter shipped behind manual review queue until contract closes. |
| Clinical-grade PHI de-identification (AWS Comprehend Medical) | In progress | Regex redactor today; Comprehend Medical for production. |
| SOC 2 Type I | Not yet · target Q4 2026 | Vanta/Drata controls in design. |
| SOC 2 Type II | Not yet · pending pilot revenue | Requires 6+ months of evidence post Type I. |
| Third-party penetration test | Not yet · pending pilot revenue | Scheduled to coincide with first paid pilot. |
| SSO / SAML | Not yet · pending pilot revenue | Available on enterprise pilot tier. |
| HITRUST CSF | Not yet · pending pilot revenue | Post-SOC 2 Type II only — over-investing earlier is theater. |
What we ask of your team for a pilot
- Synthetic data only until the BAA-track LLM is in production.
- Allowlist *.lovable.app and the configured LLM gateway domain.
- Optional SSO via SAML on the enterprise tier.
- A named security contact on our side for incident response (assigned at pilot kickoff).
- 30-minute architecture call with your security team before go-live.
Last updated: June 2026 · Questions? security@projectnightingale.dev