Safe Content Monitoring - Phosra Developer Docs

The model in one line: your app classifies a child’s content on the device, deletes everything benign, and hands Phosra only a content-free signal or a sealed harmful excerpt — so the parent gets actionable safety, the child’s ordinary messages are never read by any adult or server, and Phosra itself never sees content at all. It is provably more private than screen-scraping, keylogging, or notification capture — and it’s built for Sammy’s-Law / COPPA parental-control compliance.

Phosra is not the classifier and not a content pipe. Getting the messages (ingestion) is your job — via the platform’s Sammy’s-Law API or your device agent — not through Phosra. Phosra carries the result: a classified signal, a sealed excerpt, and a minimization proof.

Three roles — and the lines between them

Role	Does	Sees plaintext?
Your app	Ingests the child’s messages, parent UX, sends signals/excerpts	Only transiently, on-device
Your classifier	Decides harmful-vs-benign on the plaintext	Yes — on-device only, then zeroizes
Phosra (OCSS)	Router-blind signal + sealed delivery + the minimization proof	Never. Forwards ciphertext verbatim

How to architect your classifier

Run it where the plaintext already is — on the device. Never POST raw content to a classification server (yours or anyone’s). Content must not egress to be classified.
Classify into the closed harm taxonomy + a severity level. The taxonomy is a fixed, governed enum (grooming, sexual_exploitation, self_harm, suicidal_ideation, bullying, companion_dependency, ai_mediated_grooming, and the rest of the §4.4 set); severity runs informational → imminent. Fail closed (unknown/unsure ⇒ do not silently pass). Phosra rejects any signal carrying a class outside the enum — the structural guarantee against scope-creep from “safety” into general surveillance, so don’t invent custom classes.
Benign content (the overwhelming majority) → zeroize immediately. Never transmit it, never persist it, never show it to an adult. An ordinary message is gone the instant it’s judged benign.
Harmful content → a minimal, length-bounded excerpt. Seal that excerpt to the recipient’s public key and send it through Phosra. Zeroize everything else.
Phosra never reads it. You seal() with the recipient’s key (the consenting parent — or, for self-harm / abuse-disclosed-at-home, an independent advocate); Phosra forwards the sealed bytes verbatim and signs only the ciphertext digest. Only the recipient’s private key decrypts.
Peers’ inbound messages (the child’s friends, who didn’t consent) are classified on-device and never transmitted — only monitored-child-authored content may be sealed. (Two-party-consent / wiretap hygiene.)
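
The steps above can be sketched as one on-device decision function. This is a minimal illustration under stated assumptions, not the SDK's API: `Classification`, `Message`, `Sealer`, `Outcome`, `handleMessage`, and the 280-byte cap are hypothetical names and values chosen for the example. The real seal call comes from `@openchildsafety/ocss`, whose signature is not shown on this page, so a placeholder stands in for it.

```typescript
// Hypothetical types for illustration; none of these names come from the Phosra SDKs.
type Classification =
  | { verdict: "benign" }
  | { verdict: "harmful"; harmClass: string };

interface Message {
  author: "monitored_child" | "peer";
  bytes: Uint8Array; // plaintext, held only in device memory
}

// Assumed shape of the seal step: encrypts an excerpt to the recipient's public key.
type Sealer = (excerpt: Uint8Array, recipientPubKey: string) => string;

type Outcome =
  | { kind: "dropped" }                                           // nothing leaves the device
  | { kind: "signal_only"; harmClass: string }                    // content-free signal
  | { kind: "sealed"; harmClass: string; sealedExcerpt: string }; // signal plus sealed excerpt

const MAX_EXCERPT_BYTES = 280; // assumed cap; the text only says "length-bounded"

function zeroize(buf: Uint8Array): void {
  buf.fill(0); // overwrite the plaintext buffer in place
}

function handleMessage(
  msg: Message,
  classify: (plaintext: Uint8Array) => Classification,
  seal: Sealer,
  recipientPubKey: string,
): Outcome {
  try {
    // If classify throws, the error propagates: the message is never treated as benign by default.
    const result = classify(msg.bytes);
    if (result.verdict === "benign") {
      return { kind: "dropped" };
    }
    if (msg.author === "peer") {
      // Assumption: a content-free signal is still allowed for peer-authored harm;
      // the text only states that peer content is never transmitted.
      return { kind: "signal_only", harmClass: result.harmClass };
    }
    const excerpt = msg.bytes.slice(0, MAX_EXCERPT_BYTES); // bounded copy
    try {
      return { kind: "sealed", harmClass: result.harmClass, sealedExcerpt: seal(excerpt, recipientPubKey) };
    } finally {
      zeroize(excerpt);
    }
  } finally {
    zeroize(msg.bytes); // runs on every path, including errors
  }
}

// Placeholder sealer for the example only; it performs no encryption.
const fakeSeal: Sealer = (excerpt) => "JWE:placeholder-" + excerpt.length;

const benign: Message = { author: "monitored_child", bytes: new Uint8Array([104, 105]) };
console.log(handleMessage(benign, () => ({ verdict: "benign" }), fakeSeal, "parent-key").kind); // "dropped"
```

Overwriting a `Uint8Array` clears that buffer only. Copies made elsewhere by the runtime (for example, if the plaintext was ever held as a string) are outside its reach, so keeping content in byte buffers from ingestion onward is the safer design.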

The data lifecycle

   child's messages (on device)
        │
        ▼
  YOUR CLASSIFIER (on-device, accredited build)
        ├── benign  ─────────────► zeroized instantly. never leaves. never seen.
        └── harmful ─► minimal excerpt ─► seal(excerpt, recipientPubKey) ─► "JWE:…"
                                                   │
                                                   ▼
                              PHOSRA  (router-blind: forwards JWE verbatim,
                                       signs SHA-256(ciphertext), never decrypts)
                                                   │
                                                   ▼
                       PARENT (or advocate) pulls + open(JWE, theirPrivKey) ─► judges
        │
        └── every epoch: minimization_attestation {N_classified, N_flagged, N_delivered, commitment}
                          ─► Phosra countersigns ─► independent transparency log
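
To make "router-blind" concrete, the sketch below shows what a forwarding hop can do with a sealed payload while holding no key material: hash the ciphertext and pass the string on unchanged. It uses Node's built-in `crypto` module. `forwardSealed` and `ForwardReceipt` are hypothetical names, and the signing of the digest is omitted because this page does not specify the signature scheme.

```typescript
import { createHash } from "node:crypto";

// Hypothetical receipt shape for illustration.
interface ForwardReceipt {
  forwarded: string;        // the exact sealed string received, unchanged
  ciphertextDigest: string; // hex SHA-256 over the sealed bytes
}

// The router never parses or decrypts the payload; it only hashes and relays it.
function forwardSealed(sealed: string): ForwardReceipt {
  const ciphertextDigest = createHash("sha256").update(sealed, "utf8").digest("hex");
  return { forwarded: sealed, ciphertextDigest };
}

const receipt = forwardSealed("JWE:example-opaque-ciphertext");
console.log(receipt.forwarded === "JWE:example-opaque-ciphertext"); // true: forwarded verbatim
console.log(receipt.ciphertextDigest.length); // 64 hex characters
```

Because the digest covers only ciphertext, a recipient can check that what arrived matches what the router attested to, without the router ever learning the content.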

Parental control & transparency

Control is the parent’s; secrecy is not allowed. The parent is the sole controller — consent, revocation, opt-out, and key authority all sit with the parent, and the child has no opt-out and no revocation key (matching Sammy’s Law and COPPA). But the child’s device must show a non-suppressable “monitoring is active” indicator the child can see but cannot turn off. A covert, child-undetectable monitor is stalkerware — banned by the Apple App Store and Google Play and flagged by the Coalition Against Stalkerware. A hidden design gets your app pulled.

(Minor-assent layers apply only in jurisdictions that require them — EU GDPR Art. 8, UK ICO Children’s Code — as a config, off by default in the US.)
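
One way to enforce these rules in code is a single gate that every outgoing event must pass. The sketch below is an assumption about how an app might structure this, not part of the Phosra SDKs: `MonitoringState`, `requiresMinorAssent`, and `mayEmit` are hypothetical names, and the jurisdiction table simply mirrors the parenthetical above.

```typescript
// Hypothetical state and config for illustration.
type Jurisdiction = "US" | "EU" | "UK";

interface MonitoringState {
  parentConsentActive: boolean; // false once the parent revokes
  indicatorVisible: boolean;    // the "monitoring is active" indicator is on screen
  jurisdiction: Jurisdiction;
  minorAssentRecorded: boolean;
}

// Minor assent is off by default in the US and enabled where the text says it applies.
const requiresMinorAssent: Record<Jurisdiction, boolean> = { US: false, EU: true, UK: true };

// Returns true only when every precondition for emitting a signal or excerpt holds.
function mayEmit(state: MonitoringState): boolean {
  if (!state.parentConsentActive) return false; // post-revocation: no events at all
  if (!state.indicatorVisible) return false;    // never operate covertly
  if (requiresMinorAssent[state.jurisdiction] && !state.minorAssentRecorded) return false;
  return true;
}

const usFamily: MonitoringState = {
  parentConsentActive: true,
  indicatorVisible: true,
  jurisdiction: "US",
  minorAssentRecorded: false,
};
console.log(mayEmit(usFamily)); // true
```

Note that the child's own preference never appears as an input: only parental consent, the visible indicator, and the jurisdictional assent setting are consulted.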

How Phosra carries the result

Lane	Use it for	Carries content?
`abuse_signal`	“Harm detected, class X, severity Y, ref Z”	No (classification only)
`harm_context`	The sealed harmful excerpt to the parent/advocate	Yes (sealed; Phosra is blind)
`minimization_attestation`	The per-epoch proof you minimized benign content	No (counts + a commitment)
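
Because out-of-enum classes are rejected, it helps to validate locally before sending anything on the `abuse_signal` lane. The sketch below assumes a payload shape (`AbuseSignal`) and a builder (`buildAbuseSignal`) that are hypothetical, and it lists only the harm classes named on this page; the full governed enum has more members and should be taken from the protocol docs, not from this list.

```typescript
// Partial list: only the classes named on this page. The governed enum is larger.
const KNOWN_HARM_CLASSES = [
  "grooming",
  "sexual_exploitation",
  "self_harm",
  "suicidal_ideation",
  "bullying",
  "companion_dependency",
  "ai_mediated_grooming",
] as const;

type HarmClass = (typeof KNOWN_HARM_CLASSES)[number];

// Hypothetical payload shape: classification only, no content field exists at all.
interface AbuseSignal {
  lane: "abuse_signal";
  harmClass: HarmClass;
  severity: string; // runs from "informational" to "imminent"; intermediate levels are not listed here
  ref: string;
}

function isHarmClass(value: string): value is HarmClass {
  return (KNOWN_HARM_CLASSES as readonly string[]).indexOf(value) !== -1;
}

// Fail closed: an unknown class raises an error; it is never mapped to a default.
function buildAbuseSignal(harmClass: string, severity: string, ref: string): AbuseSignal {
  if (!isHarmClass(harmClass)) {
    throw new Error("harm class outside the governed enum: " + harmClass);
  }
  return { lane: "abuse_signal", harmClass, severity, ref };
}

const signal = buildAbuseSignal("grooming", "imminent", "case-ref-001");
console.log(signal.lane); // "abuse_signal"
```

Keeping the payload type free of any content field means a stray excerpt cannot be attached to this lane by accident.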

Build against the published SDKs and the sandbox:

npm install @openchildsafety/ocss   # seal / open (E2E), verify-to-root
npm install @phosra/link            # write signed rules / signals
npm install @phosra/gatekeeper      # platform-side verification
# sandbox census: https://sandbox.phosra.com/api/v1   (test keys, seeded family)

The harm_context round-trip (seal → router-blind forward → recipient decrypts) is live and verifiable on the sandbox today.

The minimization proof — what you commit to (and the honest limits)

Each epoch, your classifier signs { classifier_build_hash, N_classified, N_flagged, N_delivered, salted_merkle_root }; Phosra countersigns it into an append-only log anchored to an independent transparency authority (not Phosra itself). Two-attestor reconciliation (your count vs. the platform’s ingestion count or your enclave’s attestation) makes silent over-retention detectable.
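
A minimal sketch of the commitment and the reconciliation check follows, using Node's built-in `crypto` module. Several details are assumptions made for the example because this page does not specify them: leaves are taken to be HMAC-SHA256 over opaque record IDs (never content), odd tree levels duplicate their last node, and the signature step is left out. The function names are hypothetical.

```typescript
import { createHash, createHmac } from "node:crypto";

interface MinimizationAttestation {
  classifier_build_hash: string;
  N_classified: number;
  N_flagged: number;
  N_delivered: number;
  salted_merkle_root: string; // hex
}

function sha256(data: Buffer): Buffer {
  return createHash("sha256").update(data).digest();
}

// Leaf = HMAC-SHA256(salt, recordId). Assumption: leaves commit to opaque IDs, not to content.
function saltedLeaf(salt: Buffer, recordId: string): Buffer {
  return createHmac("sha256", salt).update(recordId, "utf8").digest();
}

// Pairwise SHA-256 up to a single root; the last node is duplicated on odd levels (assumed convention).
function merkleRoot(leaves: Buffer[]): Buffer {
  if (leaves.length === 0) return sha256(Buffer.alloc(0));
  let level = leaves;
  while (level.length > 1) {
    const next: Buffer[] = [];
    for (let i = 0; i < level.length; i += 2) {
      const left = level[i];
      const right = i + 1 < level.length ? level[i + 1] : left;
      next.push(sha256(Buffer.concat([left, right])));
    }
    level = next;
  }
  return level[0];
}

function buildAttestation(
  buildHash: string,
  salt: Buffer,
  recordIds: string[],
  nFlagged: number,
  nDelivered: number,
): MinimizationAttestation {
  // Assumed sanity check: delivered <= flagged <= classified.
  if (nDelivered > nFlagged || nFlagged > recordIds.length) {
    throw new Error("inconsistent counts");
  }
  const leaves = recordIds.map((id) => saltedLeaf(salt, id));
  return {
    classifier_build_hash: buildHash,
    N_classified: recordIds.length,
    N_flagged: nFlagged,
    N_delivered: nDelivered,
    salted_merkle_root: merkleRoot(leaves).toString("hex"),
  };
}

// Two-attestor reconciliation: the classifier's count must match an independent count.
function reconcile(att: MinimizationAttestation, independentCount: number): boolean {
  return att.N_classified === independentCount;
}

const salt = Buffer.from("example-epoch-salt"); // illustrative; a real salt would be random per epoch
const att = buildAttestation("build-abc123", salt, ["m1", "m2", "m3"], 1, 1);
console.log(reconcile(att, 3)); // true
console.log(reconcile(att, 4)); // false: one message is unaccounted for
```

The salt keeps an observer of the log from testing guesses about individual leaves, while the count comparison is what surfaces a mismatch between what was ingested and what was accounted for.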

What this does and doesn’t prove — stated plainly:

✅ You didn’t quietly keep benign content off-ledger.
✅ Only sealed, consent-bound harmful excerpts ever egressed.
❌ It does not prove a specific benign message was bit-wiped (you can’t cryptographically prove a negative — so we never call it “verifiable deletion”).
❌ It does not prove your classifier caught all harm — that’s what accreditation is for.

That honesty is the point: it’s a real, checkable guarantee, which is more than any screen-scraping monitor offers today.

Accreditation criteria — what we check to approve you

To be Trust-Listed as an accredited safety classifier, your integration must demonstrate:

On-device classification — no raw content egresses to any classification service; ingestion_method is a declared, auditable field; screen-scrape / keylog / Accessibility-capture are not used.
Closed harm enum, fail-closed — you classify only into the governed taxonomy; out-of-enum signals are rejected by the census.
Benign zeroized — benign content is provably not retained or transmitted.
Sealed-to-consent-recipient only — harmful excerpts seal only to the consent-authorized parent/advocate key (no per-message key diversion).
Parent sole controller + visible indicator — parental consent/revocation; child cannot opt out; a non-suppressable monitoring indicator is present.
Minimization-attestation lane wired — your per-epoch attestation reconciles against an independent count and anchors to the transparency log.
Self-harm / abuse-at-home routing — those signals route to an independent advocate, not blindly to the parent.
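
The routing criterion can be sketched as a small, pure function that picks the recipient before sealing. The names `RoutingInput`, `chooseRecipient`, and `recipientKeyFor` are hypothetical, as is the `abuseDisclosedAtHome` flag. Only `self_harm` and abuse disclosed at home are routed to the advocate here because those are the cases this page names; whether other classes should follow the same path is not stated.

```typescript
type Recipient = "parent" | "independent_advocate";

// Hypothetical input: the harm class plus a flag the classifier sets for abuse disclosed at home.
interface RoutingInput {
  harmClass: string;
  abuseDisclosedAtHome: boolean;
}

// Hypothetical key registry: one consent-authorized public key per recipient role.
interface RecipientKeys {
  parent: string;
  independent_advocate: string;
}

function chooseRecipient(input: RoutingInput): Recipient {
  if (input.harmClass === "self_harm" || input.abuseDisclosedAtHome) {
    return "independent_advocate";
  }
  return "parent";
}

// The seal key is derived from the routing decision alone, so no per-message key can be substituted.
function recipientKeyFor(input: RoutingInput, keys: RecipientKeys): string {
  return keys[chooseRecipient(input)];
}

const keys: RecipientKeys = { parent: "parent-pub-key", independent_advocate: "advocate-pub-key" };
console.log(recipientKeyFor({ harmClass: "bullying", abuseDisclosedAtHome: false }, keys)); // "parent-pub-key"
console.log(recipientKeyFor({ harmClass: "self_harm", abuseDisclosedAtHome: false }, keys)); // "advocate-pub-key"
```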

We run a conformance suite against these (out-of-enum rejection, post-revocation no-event, HMAC-salted leaves, attestation-fail ⇒ lane suspended, advocate/parent delivery indistinguishability). Pass ⇒ accredited and on the production Trust List.

Phased path — you don’t need it all on day one

Phase 1 — ship now

Your classifier runs on-device inside the app and self-reports its minimization. We accredit this as “attested self-report” — still router-blind, still minimization-accounted; trust rests on your open, inspectable build.

Phase 3 — the strong tier

The classifier runs in a remote-attested enclave (App Attest / Play Integrity / TEE) on an accredited build — now “we only retained harmful content” is provable to a regulator, not asserted.

Why this beats what’s out there

Today’s monitors screen-scrape, keylog, capture notifications, retain everything server-side, give the child no rights, prove nothing, and leak. This architecture: benign-never-read is enforced, not promised; the parent controls everything but the child isn’t surveilled covertly; harmful-only egress is sealed and Phosra-blind; and the minimization is independently checkable. It’s the safe, compliant, auditable way to do what the law is about to require anyway.

Getting started

Install the SDKs and read the protocol docs on this site.
Build the seal → harm_context round-trip against the sandbox (sandbox.phosra.com/api/v1).
Wire the minimization_attestation epoch lane.
Apply for accreditation — see Production Accreditation (or email developers@phosra.com) with your classifier build details. We review against the criteria above and Trust-List you for production.

The published SDKs + the harm_context round-trip are live today. The minimization_attestation lane and the classifier-accreditation Trust-List role are specified and on the roadmap — contact developers@phosra.com to be an early accredited classifier partner.
