
On Family 1 and the 2026 privacy stack

Identity without disclosure.

Family 1 was the anti-surveillance patent pretending to be an ad-tech patent. In 2026 the cryptographic math finally matches the spirit.


The 2014 filing for Family 1 — Users on a Network, WO/2015/143407 — describes something odd. It solves the problem of relating a physical address to an IP address without ever observing them together. The mechanism is graph-chained intermediate identifiers: a cookie seen in one exchange, a device ID seen in another, an email hash in a third, each co-occurring with part of what you want to resolve but never the whole. You chain the edges, score the confidence, and the join closes.
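The chaining step can be sketched in a few lines. This is an illustration, not the filing's method: it assumes a chain's confidence is the product of its edge confidences (an independence assumption introduced here), and the identifiers and weights in the usage note are hypothetical.

```python
import heapq
import math
from collections import defaultdict


def best_chain(edges, source, target):
    """Return (confidence, path) for the strongest chain from source to target.

    edges: iterable of (node_a, node_b, confidence) with 0 < confidence <= 1.
    Chain confidence is modelled as the product of edge confidences, which is
    an assumption of this sketch rather than something the filing specifies.
    """
    graph = defaultdict(list)
    for a, b, conf in edges:
        cost = -math.log(conf)  # maximizing a product == minimizing summed -log
        graph[a].append((b, cost))
        graph[b].append((a, cost))

    # Dijkstra over the -log costs; the first time we pop the target is optimal.
    queue = [(0.0, source, [source])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == target:
            return math.exp(-cost), path
        if node in settled:
            continue
        settled.add(node)
        for nxt, step in graph[node]:
            if nxt not in settled:
                heapq.heappush(queue, (cost + step, nxt, path + [nxt]))
    return 0.0, []  # no chain connects the two endpoints
```

Called with edges such as `("ip:203.0.113.7", "cookie:abc", 0.9)`, `("cookie:abc", "device:42", 0.8)`, `("device:42", "emailhash:9f", 0.7)` and `("emailhash:9f", "addr:12-elm", 0.95)`, the function returns the four-hop chain from IP to address even though no single observation contains both endpoints.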

The industry framing was ad-tech — of course it was, that was where the budget lived — and so the patent is written in ad-tech vocabulary. Exchange, intermediate identifier, co-occurrence, confidence score. Read it without that framing and it becomes something else: a design document for identity resolution that refuses to concentrate anyone's data in any single place. The exchange sees an identifier and an IP. A different exchange sees the same identifier and an address. Neither has the whole picture. The resolver that chains them doesn't need raw personal data either — it needs edge weights. The design choice is the anti-surveillance choice.

At the time, the choice didn't matter much. Cookies were cleartext, device IDs were permissive, and the cryptographic primitives that would eventually make the patent's design discipline structurally enforceable instead of merely elegant didn't exist in production. In 2026 they do. This essay is about what the 2014 mechanism becomes when the math catches up.

The cryptographic primitives that arrived

Four things moved between 2018 and 2025 that make the Family 1 design enforceable instead of polite.

Trusted execution environments. AWS Nitro Enclaves, Azure Confidential Computing, Google's confidential VMs, and Apple's Private Cloud Compute all let you run code on data in a way that the infrastructure operator, the cloud provider included, cannot observe. Attestation proves to the other party that the code running is the code you say it is. The edge-resolution work the patent describes can now run inside a TEE, and the two parties participating in the join never see each other's raw identifiers.

Zero-knowledge proofs. zk-SNARKs and zk-STARKs let you prove that a computation was done correctly, or that a record satisfies a predicate, without revealing the record itself. In the patent's vocabulary: you can prove that a correlation exists between two identifiers without revealing which intermediate identifier carried the edge. This is the difference between "we know you and this other party share a user" and "we know you share a user and we can prove it to a regulator, with no knowledge of who."
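To make "prove it without revealing it" concrete, here is a minimal sketch of a much simpler relative of SNARKs and STARKs: a Schnorr proof of knowledge, made non-interactive with the Fiat-Shamir heuristic. It proves only "I know the secret behind this public value", not an arbitrary predicate, and the group parameters are toy-sized values chosen for readability, not security.

```python
import hashlib
import secrets

# Toy group: P = 2*Q + 1 with P and Q both prime; G = 4 generates the
# subgroup of order Q. Far too small for real use (illustration only).
P, Q, G = 2039, 1019, 4


def _challenge(y, t):
    # Fiat-Shamir: derive the verifier's challenge by hashing the transcript.
    data = f"{G}|{y}|{t}".encode()
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % Q


def prove(secret_x):
    """Return (y, proof) showing knowledge of x with y = G^x mod P."""
    y = pow(G, secret_x, P)
    r = secrets.randbelow(Q)      # one-time random nonce
    t = pow(G, r, P)              # commitment
    c = _challenge(y, t)          # challenge
    s = (r + c * secret_x) % Q    # response; r masks secret_x
    return y, (t, s)


def verify(y, proof):
    """Check G^s == t * y^c (mod P) without ever seeing the secret."""
    t, s = proof
    c = _challenge(y, t)
    return pow(G, s, P) == (t * pow(y, c, P)) % P
```

A verifier who holds only `y` and the proof can run `verify` and be convinced the prover knows the secret exponent; nothing in the transcript is the secret itself. Production systems would use a standard large group or curve and a vetted library.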

Private set intersection. PSI protocols let two parties discover the intersection of their sets (say, their customer lists) without either party learning anything else about the other's set beyond, in most protocols, its size. In the ad-tech era, "clean rooms" attempted this with legal agreements and data-use policies. PSI does it with mathematics. The join closes; nothing about non-overlapping members leaks.
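One classic way to build PSI is commutative blinding in the Diffie-Hellman style: each side raises hashed items to its own secret exponent, and because the order of exponentiation does not matter, doubly blinded values match exactly when the underlying items match. The sketch below simulates both parties in one process. The modulus, the hash-to-group step, and the function names are simplifications assumed for illustration, and this is not a vetted protocol.

```python
import hashlib
import math
import secrets

P = 2**127 - 1  # a Mersenne prime, used here as a toy modulus


def _hash_to_group(item):
    digest = hashlib.sha256(item.encode()).digest()
    return int.from_bytes(digest, "big") % (P - 1) + 1  # value in 1..P-1


def keygen():
    # Exponents coprime to P-1 keep blinding one-to-one (no accidental matches).
    while True:
        k = secrets.randbelow(P - 3) + 2
        if math.gcd(k, P - 1) == 1:
            return k


def blind(items, key):
    return [pow(_hash_to_group(i), key, P) for i in items]


def reblind(values, key):
    return [pow(v, key, P) for v in values]


def psi(alice_items, bob_items):
    """Simulate both parties; return the intersection as Alice learns it."""
    a, b = keygen(), keygen()
    alice_items = list(alice_items)
    alice_once = blind(alice_items, a)     # Alice -> Bob
    alice_twice = reblind(alice_once, b)   # Bob -> Alice, same order
    bob_once = blind(bob_items, b)         # Bob -> Alice (shuffled in practice)
    bob_twice = set(reblind(bob_once, a))  # computed locally by Alice
    return {x for x, v in zip(alice_items, alice_twice) if v in bob_twice}
```

Neither side ever sends a raw identifier: what crosses the wire is only blinded group elements, so items outside the intersection stay opaque, although each party still learns how many items the other holds.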

Federated learning. The edge-weight calculation itself can run on the edge devices, not on a central server. A model updates locally, sends a differentially private gradient to the central aggregator, and the global graph improves without the local observations ever leaving the device. The patent's "transitive resolver" becomes a federated process, and the central graph never holds the raw data at all.
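The device-side step can be sketched as clip, add noise, send. The version below assumes Gaussian noise added locally on each device; the function names are hypothetical, and choosing `noise_std` to meet a specific (epsilon, delta) guarantee is deliberately left out.

```python
import math
import random


def clip(gradient, max_norm):
    """Scale the gradient down so its L2 norm is at most max_norm."""
    norm = math.sqrt(sum(g * g for g in gradient))
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in gradient]


def privatize(gradient, max_norm, noise_std, rng=random):
    """Runs on the device: bound one user's influence, then add noise."""
    clipped = clip(gradient, max_norm)
    return [g + rng.gauss(0.0, noise_std) for g in clipped]


def aggregate(updates):
    """Runs on the server: it only ever sees noised, clipped updates."""
    n = len(updates)
    return [sum(column) / n for column in zip(*updates)]
```

Clipping bounds how much any single device can move the global model, which is what lets the added noise translate into a privacy guarantee; averaging across many devices then washes most of the noise back out.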

What the 2014 architecture becomes in 2026

Restate the mechanism with the new primitives. Two parties each hold intermediate identifiers observed in their respective contexts. Each wants to know if the identifiers they hold connect to identifiers the other holds, without either side seeing the other's raw data.

They run a PSI protocol on their identifier sets. They discover the shared intermediates. For each shared intermediate, the edge-resolution computation runs inside a TEE owned by neither party, attested and verifiable. The TEE emits a ZK proof of the correlation and a differentially-private confidence score. Downstream parties — a regulator, a fraud detector, a permissioned query API — can verify that the correlation exists without gaining access to the underlying identifiers.

That is the 2014 patent, unchanged in structure. What has changed is that every node in the pipeline that used to require trust now has a cryptographic replacement. The exchange doesn't need to trust the resolver with its data — the TEE attests. The querier doesn't need to trust the graph to be honest about confidence — the ZK proof carries the claim. The aggregator doesn't need to trust the contributors to not leak — the DP budget bounds the leakage.
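"The DP budget bounds the leakage" has a simple operational reading: every noisy release spends some epsilon, and when the budget is gone the system stops answering. A minimal sketch, assuming basic sequential composition (epsilons add), Laplace noise, and a worst-case sensitivity of 1.0 for a confidence score in [0, 1]; all names are hypothetical.

```python
import random


class PrivacyBudget:
    """Tracks remaining epsilon under basic sequential composition."""

    def __init__(self, total_epsilon):
        self.remaining = total_epsilon

    def spend(self, epsilon):
        if epsilon <= 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= epsilon


def laplace_noise(scale, rng=random):
    # The difference of two i.i.d. exponentials is Laplace-distributed.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def release_confidence(score, budget, epsilon, sensitivity=1.0, rng=random):
    """Release a noisy confidence score, charging epsilon to the budget."""
    budget.spend(epsilon)  # refuse before computing anything if over budget
    noisy = score + laplace_noise(sensitivity / epsilon, rng)
    return min(1.0, max(0.0, noisy))  # clamping is post-processing, still DP
```

The point of the accountant is that the bound is enforced by arithmetic instead of by policy: once `remaining` reaches zero, further queries fail no matter who is asking.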

Three marketplace applications

Healthcare record reconciliation. The same patient shows up across hospital systems, pharmacies, insurers, and wearables under different identifiers. HIPAA tightly restricts bulk PHI sharing. PSI + TEE-based reconciliation lets the composite patient graph exist without any party ever seeing another's raw identifiers, and lets a downstream clinician or payer query it with proofs rather than copies. This is the anti-surveillance choice applied to an adjacent domain where the current state of the art, shared EHR fields with broad access, is actively harmful.

Supply-chain counter-fraud. A shipment fraudulently "splits": the same serialized pallet appears at two warehouses on the same day. Treat shipper manifest, RFID scan, customs filing, and carrier telemetry as intermediates. Detect the contradiction by PSI between logistics parties who will never share their raw operational data. The fraud is visible because the intermediates intersect; the legitimate operations stay private because nothing else does.

Sybil defense. Voting integrity, airdrop allocation, and benefits-program de-duplication all face the same attack: one person running fifty wallets, fifty ballots, fifty accounts. Residential address, mobile carrier ID, device attestation, and geo-ping are intermediates. Transitive co-occurrence reveals the fifty-wallet puppeteer. With ZK-proof aggregation, the defense reveals that duplicates exist without unmasking legitimate participants. The attacker loses; the private citizen stays private.
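The transitive co-occurrence step is, at its core, connected components over accounts that share an intermediate. A plain (non-cryptographic) sketch using union-find follows; it returns only cluster sizes, never members, to echo the "duplicates exist" output, but it is not itself a zero-knowledge construction, and the account and intermediate labels are hypothetical.

```python
from collections import Counter


def _find(parent, x):
    # Union-find lookup with path halving.
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x


def cluster_sizes(observations):
    """Group accounts that transitively share any intermediate.

    observations: iterable of (account, intermediate) pairs.
    Returns the sizes of clusters containing more than one account,
    largest first, without exposing which accounts are in them.
    """
    parent = {}
    first_seen = {}  # intermediate -> first account observed with it
    for account, intermediate in observations:
        parent.setdefault(account, account)
        if intermediate in first_seen:
            root_a = _find(parent, account)
            root_b = _find(parent, first_seen[intermediate])
            parent[root_a] = root_b  # merge the two clusters
        else:
            first_seen[intermediate] = account
    sizes = Counter(_find(parent, a) for a in parent)
    return sorted((n for n in sizes.values() if n > 1), reverse=True)
```

Treating every shared intermediate as a hard link is the crude part: a shared residential address would merge a whole household. A real deployment would weight edges and require several independent intermediates before calling a cluster one person.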

What actually changed

The 2014 patent was the anti-surveillance patent pretending to be an ad-tech patent. For a decade the mechanism existed but the enforcement didn't — anyone who wanted to run the design honestly had to rely on operational discipline and legal contracts, not architecture. In 2026 the architecture is available. The mechanism that resolves identity through transitive co-occurrence of intermediates can now run on a substrate where no single party sees enough to misbehave.

This is a good outcome. It's also a useful way to read the rest of the portfolio. Every family was designed against the grain of what commercial pressures of the day were pushing toward. Family 2 pushed for per-user scoring that didn't rely on centralized behavior stores. Family 5 pushed for rule sets that didn't require a unified ledger. The framing at the time was ad-tech. The design ethic, read twelve years later, was anti-surveillance, anti-centralization, and anti-disclosure. The cryptography caught up. That's what changed.
