How it works — local-first, private, resilient
This is a technical companion to "Meet Vita". The product post talks about what the app does and why. This one talks about how it's built, and what that means for your data.
The short version: Vita is a habit tracker where your data lives on your device, every change that touches the network is end-to-end encrypted under a key only your devices share, and the app keeps working when the network doesn't. The rest of this post explains exactly what that means and what it doesn't.
What "local-first" actually means
The phrase comes from Ink & Switch's Local-first software paper. It boils down to seven properties: no spinners on user input, the app works offline, multi-device sync is possible, collaboration with others is possible, data lasts forever, security and privacy by default, and you retain ultimate ownership and control.
For Vita that translates to a hard architectural rule: every user action must complete without contacting a server. No login, no "creating account", no API call to read your own categories. The first mutation you make in Vita — typing your first name during onboarding — goes through the same path as the thousandth one: client-side WASM mutates an in-memory data structure, the change is sealed, and the sealed bytes are appended to a file on your device. That's the whole write path.
The architecture has three layers:
- The core (`vita-core`). A Rust crate compiled to WebAssembly. It owns five Automerge documents (`profile`, `categories`, `templates`, `observations`, `oneOffs`) plus the scheduler, the per-template cron evaluator, and the summary aggregations. The core is the only place that knows the domain. The frontend is a thin shell.
- The browser (Next.js + React). Hosts the UI, the WASM module, and two browser primitives: OPFS (Origin Private File System) for the per-document append-only log, and IndexedDB for the wrapping key. No business logic.
- The relay (`vita-relay`). An axum HTTP+WebSocket server backed by Postgres. The relay's only job is to store opaque sealed events and fan them out to a user's other devices. It has no read access to anything inside those events.
Cryptography, in plain language
The data path looks like this:
your action
↓
mutate Automerge doc in WASM
↓
serialize the change → seal under K_master with XChaCha20-Poly1305
↓
append [u32 len ‖ 24-byte nonce ‖ ciphertext+tag] to OPFS log
↓ (stays here always)
↘ (optionally) push to relay over WS
Every record on disk and every event payload on the wire is sealed under K_master, a 256-bit symmetric key. K_master is generated on first launch in your browser using crypto.subtle.generateKey, then wrapped under a non-extractable AES-GCM key. Both the wrapping key and the wrapped K_master are stored in IndexedDB, which the browser scopes per-origin and never exposes to JavaScript code on other sites.
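The on-disk record layout from the diagram can be sketched as follows. This is an illustrative Python sketch, not Vita's Rust code: the byte order of the length prefix, whether the length counts the nonce, and the handling of a torn trailing record are all assumptions. Sealing itself is out of scope here, so the ciphertext is treated as opaque bytes.

```python
import struct

NONCE_LEN = 24  # XChaCha20-Poly1305 nonce size
TAG_LEN = 16    # Poly1305 authentication tag size


def frame_record(nonce: bytes, ciphertext_and_tag: bytes) -> bytes:
    """Build one log record: u32 length prefix, nonce, then ciphertext+tag."""
    if len(nonce) != NONCE_LEN:
        raise ValueError("nonce must be 24 bytes")
    if len(ciphertext_and_tag) < TAG_LEN:
        raise ValueError("ciphertext must include the 16-byte tag")
    body = nonce + ciphertext_and_tag
    # Assumption: little-endian length covering nonce + ciphertext + tag.
    return struct.pack("<I", len(body)) + body


def read_records(log: bytes) -> list[tuple[bytes, bytes]]:
    """Parse a log into (nonce, ciphertext_and_tag) pairs.

    Assumption: a truncated trailing record (for example, a write cut off
    by a crash) is ignored rather than treated as a fatal error.
    """
    records = []
    offset = 0
    while offset + 4 <= len(log):
        (length,) = struct.unpack_from("<I", log, offset)
        start = offset + 4
        end = start + length
        if length < NONCE_LEN + TAG_LEN or end > len(log):
            break  # torn or malformed tail: stop at the last complete record
        records.append((log[start:start + NONCE_LEN], log[start + NONCE_LEN:end]))
        offset = end
    return records
```

Because each record is length-prefixed and self-contained, a reader can replay the log from the start and stop cleanly at the last complete append.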
The key derivation looks like this:
K_master (256-bit, generated once, never leaves your devices)
│
├── HKDF-SHA256(info="stream-v1") ──► K_stream
│ │
│ └── HMAC-SHA256(K_stream, doc_name)
│ ──► stream_id (32 bytes)
│
└── used directly for XChaCha20-Poly1305 sealing of every record
stream_id is the only address the relay sees. Two devices that share K_master compute the same stream_id for "observations"; to everyone else it is indistinguishable from 32 random bytes. The relay groups events by (account_id, stream_id) without knowing which document is which.
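The derivation tree above can be sketched with nothing but HMAC-SHA256. This is an illustrative Python version, not the code in vita-core: the HKDF salt is not specified in this post, so the sketch assumes the RFC 5869 default (a string of zero bytes), and the UTF-8 encoding of `doc_name` is likewise an assumption.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract, then expand."""
    if not salt:
        # RFC 5869 default salt: HashLen zero bytes (assumed here).
        salt = bytes(hashlib.sha256().digest_size)
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def stream_id(k_master: bytes, doc_name: str) -> bytes:
    """Derive the 32-byte relay address for one document."""
    k_stream = hkdf_sha256(k_master, info=b"stream-v1")
    return hmac.new(k_stream, doc_name.encode("utf-8"), hashlib.sha256).digest()
```

Two devices holding the same K_master get the same address for the same document name, while a different key or a different document yields an unrelated value.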
What the relay sees vs. what it doesn't
| Field on the wire | Cleartext to relay? | Why |
|---|---|---|
| `account_id` (UUID) | yes | needed for routing |
| `device_id` (UUID) | yes | needed for auth + echo suppression |
| `stream_id` (32 bytes) | yes (uniform random) | needed for fan-out grouping |
| ciphertext + 24-byte nonce | yes (opaque) | the actual change |
| `device_label` (e.g. "Pixel 8") | yes | shown in your own Devices list |
| habit names, categories, observations | no | sealed in the ciphertext |
| Automerge change structure | no | sealed in the ciphertext |
| your name, timezone, locale | no | sealed in the ciphertext |
If someone compromises the relay, what they get is your account ID, your device IDs, your device labels, the timing and length of your sealed events, and a stream of opaque ciphertext. They cannot read habits, categories, observations, or anything else inside Vita.
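To make the table concrete, here is a hypothetical Python model of one stored event and of what a relay compromise would reveal about it. The class name, field layout, and `relay_view` helper are illustrative assumptions, not the relay's actual schema; device labels are left out because this post does not say whether they travel with each event.

```python
from dataclasses import dataclass
from uuid import UUID


@dataclass(frozen=True)
class RelayEvent:
    """Hypothetical shape of one event as the relay stores it."""
    account_id: UUID   # cleartext: routing
    device_id: UUID    # cleartext: auth + echo suppression
    stream_id: bytes   # 32 pseudorandom bytes: fan-out grouping
    nonce: bytes       # 24 bytes
    ciphertext: bytes  # sealed Automerge change, tag included


def relay_view(event: RelayEvent, received_at: float) -> dict:
    """Everything the relay can learn from one event: ids, timing, size."""
    return {
        "account_id": str(event.account_id),
        "device_id": str(event.device_id),
        "stream_id": event.stream_id.hex(),
        "received_at": received_at,
        "sealed_len": len(event.nonce) + len(event.ciphertext),
    }
```

Nothing in the returned dictionary depends on the plaintext beyond its length, which is the traffic-analysis surface the relay is left with.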
How a second device joins without leaking the master key
A new device can't be useful without K_master. But K_master must never travel through the relay in the clear. Here's how Vita threads that needle:
- Initiator (device A) generates a one-shot ephemeral X25519 keypair, posts the public half plus a 12-character base32 code to the relay (the relay stores `SHA-256(code)` so the cleartext code never hits disk), and renders a QR containing `(relay_url, account_id, code, eph_pub_a)`.
- Joiner (device B) scans the QR (or types the code; in the typed case, B fetches `eph_pub_a` and `account_id` from the relay using a public lookup-by-code endpoint), generates its own ephemeral X25519 keypair, and runs ECDH against `eph_pub_a`.
- The 32-byte ECDH shared secret is fed through HKDF-SHA256 with info `"pair-transport-v1"` to produce a per-pairing transport key.
- B posts its identity (`eph_pub_b`, `device_pubkey_ed25519`, `device_pubkey_x25519`, label) to the relay. The relay forwards it to A's open WebSocket as a `pairing_request`. With `eph_pub_b` in hand, A computes the same shared secret and derives the same transport key.
- SAS verification. Both devices independently derive a 6-digit short authentication string (SAS) from the DH shared secret, using a separate HKDF expansion with `info="pair-sas-v1"` so the SAS is independent of the transport key. Each device displays its own SAS. The user reads the digits on both screens; A only releases K_master after confirming on its own device that the codes match. A relay-side MITM produces a different shared secret on each side, so the SAS values diverge visibly and the user catches it.
- Once the user confirms the match, A seals K_master under XChaCha20-Poly1305 with the transport key and sends the sealed payload through the relay back to B.
- B unseals, persists K_master locally (wrapped under a fresh AES-GCM wrapping key, just like A did on first launch), and registers its device pubkey with the relay.
The relay is in the middle of every step but never sees K_master in the clear, because the transport key it's sealed under is derived from two ephemeral X25519 private keys that never leave their respective devices. This is a single ephemeral Diffie-Hellman exchange, a simpler relative of Signal's X3DH (which also mixes in long-term identity keys and prekeys), with an added SAS step that hardens the typed-code path against a hostile relay.
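The key-separation part of pairing can be sketched in Python as below. X25519 is not in the Python standard library, so the shared secret is taken as an input; everything after the ECDH step is shown. The mapping from HKDF output to six digits (8 bytes reduced modulo 10^6), the zero HKDF salt, and the helper names are assumptions made for illustration.

```python
import hashlib
import hmac
import secrets

BASE32_ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ234567"


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256 and the default all-zero salt (assumed)."""
    prk = hmac.new(bytes(32), ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def new_pairing_code() -> str:
    """12 random base32 characters (60 bits)."""
    return "".join(secrets.choice(BASE32_ALPHABET) for _ in range(12))


def code_digest(code: str) -> str:
    """What the relay stores instead of the cleartext code."""
    return hashlib.sha256(code.encode("ascii")).hexdigest()


def pairing_keys(shared_secret: bytes) -> tuple[bytes, str]:
    """Derive the transport key and the 6-digit SAS from the ECDH output."""
    transport_key = hkdf_sha256(shared_secret, info=b"pair-transport-v1")
    sas_bytes = hkdf_sha256(shared_secret, info=b"pair-sas-v1", length=8)
    # Assumption: big-endian integer reduced modulo 10**6, zero-padded.
    sas = f"{int.from_bytes(sas_bytes, 'big') % 1_000_000:06d}"
    return transport_key, sas
```

In an honest run both devices call `pairing_keys` on the same secret and display the same digits. A relay that substitutes its own public key ends up with a different secret on each leg, so the two displays almost certainly disagree (a six-digit code leaves a one-in-a-million chance of an accidental match).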
Recovery phrase: the only way back when every device is gone
Pairing solves "I have one device and I want a second one." It does not solve "I have zero devices left." For that, Vita has an opt-in recovery-phrase escrow.
When the user enables recovery from Settings → Recovery, the device
generates a 12-word BIP-39 mnemonic. From the phrase the device
derives two values via HKDF-SHA256 (different info labels, so the
two are mathematically independent):
- a 32-byte `recovery_id`, used as a public lookup key on the relay.
- a 32-byte `K_wrap`, used to seal a small payload.
The payload is K_master ‖ account_id (32 + 16 = 48 bytes), sealed under K_wrap with XChaCha20-Poly1305. The 48 bytes of ciphertext, the 16-byte authentication tag, and the 24-byte random nonce add up to an 88-byte blob that the device PUTs to the relay under its recovery_id.
To restore, the user types the 12 words on a fresh install. The new
device re-derives recovery_id and K_wrap from the phrase, fetches
the blob from the relay, decrypts to recover K_master and
account_id, and registers itself as a new device on the account in
the same call. From there it bulk-pulls the encrypted event log and
rebuilds local state.
The relay only ever sees opaque blobs keyed by random-looking 32-byte ids. The phrase carries 128 bits of entropy and is stretched through BIP-39's PBKDF2-HMAC-SHA512 (2048 rounds) into a 64-byte seed before HKDF, so the keyspace is too large to brute-force a recovery_id. A phrase typo fails on either the BIP-39 checksum or the AEAD tag; both errors are loud, and neither leaks information.
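The phrase-to-keys derivation can be sketched in Python with the standard library alone. The PBKDF2 parameters follow the BIP-39 specification; the two HKDF info labels, the empty BIP-39 passphrase, and the zero HKDF salt are placeholders chosen for this sketch, since the post does not name them. Checksum validation needs the BIP-39 wordlist and is omitted.

```python
import hashlib
import hmac
import unicodedata


def hkdf_sha256(ikm: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256 and the default all-zero salt (assumed)."""
    prk = hmac.new(bytes(32), ikm, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def bip39_seed(mnemonic: str, passphrase: str = "") -> bytes:
    """BIP-39 seed: PBKDF2-HMAC-SHA512, 2048 rounds, salt 'mnemonic' + passphrase."""
    password = unicodedata.normalize("NFKD", mnemonic).encode("utf-8")
    salt = unicodedata.normalize("NFKD", "mnemonic" + passphrase).encode("utf-8")
    return hashlib.pbkdf2_hmac("sha512", password, salt, 2048, dklen=64)


def recovery_material(mnemonic: str) -> tuple[bytes, bytes]:
    """Return (recovery_id, k_wrap); the info labels are hypothetical."""
    seed = bip39_seed(mnemonic)
    recovery_id = hkdf_sha256(seed, info=b"recovery-id-v1")
    k_wrap = hkdf_sha256(seed, info=b"recovery-wrap-v1")
    return recovery_id, k_wrap


# Size of the escrow blob: plaintext + AEAD tag + nonce.
K_MASTER_LEN, ACCOUNT_ID_LEN, TAG_LEN, NONCE_LEN = 32, 16, 16, 24
BLOB_LEN = K_MASTER_LEN + ACCOUNT_ID_LEN + TAG_LEN + NONCE_LEN  # 88 bytes
```

Because `recovery_id` and `K_wrap` come from different info labels, publishing the lookup id on the relay reveals nothing useful about the wrapping key.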
The trade-off: the phrase is the only recovery vector. Lose every paired device and the phrase, and the relay's events stay permanent ciphertext.
Resilience: making "your network is sad" a non-event
Local-first only matters if it's actually offline-capable. Vita has four resilience mechanisms layered on top of each other:
1. Service worker app shell. When you load Vita on a network connection, a service worker (built with Serwist) precaches the JavaScript chunks, the WASM binary, the icons, and the static assets. A small /offline fallback page is also precached. After the first load, the app starts even when you have no network: the SW serves the shell from cache, and pages you've already visited, which are cached with a network-first strategy, stay available too.
2. OPFS as the source of truth. Every Vita document is an
append-only encrypted log on disk. The app reads from OPFS first, never
from the relay. A reload, a closed tab, a power outage: as long as the
last appendDoc call returned, that data is yours. The relay is for
backup and multi-device sync, not for serving the active app.
3. Persistent push-replay queue. If you make a change while the
relay is unreachable — bad wifi, server restart, plane mode — the
sealed record gets appended to a local IndexedDB-backed queue. The
queue survives reloads and crashes. The app shows a small
"Offline · N queued" indicator in the header so you know what's
pending.
4. Reconnect with exponential backoff. When the WebSocket to the relay closes unexpectedly, Vita schedules reconnect attempts after 1s, 2s, 4s, 8s, and 16s, then every 30s (the cap). The browser's online event also kicks a reconnect. On a successful connect, the queue drains in order and the backoff resets to its initial delay, so the next outage starts fresh instead of inheriting a long delay.
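Mechanisms 3 and 4 can be sketched together in Python. This is a minimal model, not Vita's browser code: the class names are hypothetical, and the queue here lives in memory where the real one is persisted in IndexedDB.

```python
from collections import deque
from typing import Callable


class ReconnectBackoff:
    """Delays of 1s, 2s, 4s, 8s, 16s, then the 30s cap on every later attempt."""

    def __init__(self, base: float = 1.0, cap: float = 30.0) -> None:
        self.base = base
        self.cap = cap
        self.attempt = 0

    def next_delay(self) -> float:
        delay = min(self.base * (2 ** self.attempt), self.cap)
        if delay < self.cap:
            self.attempt += 1  # stop growing once the cap is reached
        return delay

    def reset(self) -> None:
        """Call after a successful connect so the next outage starts at 1s."""
        self.attempt = 0


class PushQueue:
    """In-memory stand-in for the persistent push-replay queue."""

    def __init__(self) -> None:
        self.pending: deque[bytes] = deque()

    def enqueue(self, sealed_record: bytes) -> None:
        self.pending.append(sealed_record)

    def drain(self, send: Callable[[bytes], None]) -> int:
        """Send records oldest-first; a record leaves the queue only after send returns."""
        sent = 0
        while self.pending:
            send(self.pending[0])  # if this raises, the record stays queued
            self.pending.popleft()
            sent += 1
        return sent
```

Removing a record only after `send` returns means a connection drop mid-drain leaves the unsent tail in place for the next reconnect.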
What about concurrent edits across devices? That's where Automerge earns its keep. Each Vita document is an Automerge CRDT. If device A adds a category while device B is offline editing the same document, the changes commute when B comes back online: both devices end up at the union, neither one overwrites the other, and the merge is deterministic. There's no manual conflict resolution because there are no conflicts to resolve in the OT/CRDT sense — the data structure was designed for this.
There's a subtle invariant here too: scheduled-observation IDs are
deterministic. The scheduler computes
observation_id = derive(template_id ‖ scheduled_date). If two paired
devices both run their lazy scheduler past midnight, they produce
identical Automerge writes for the new observations, so the merge
yields one row, not two duplicates. The materializer is idempotent
under concurrent execution by design.
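The effect of deterministic IDs can be shown with a toy model in Python. The concrete `derive` function (SHA-256 over the template id and ISO date, truncated to 16 bytes) is an assumption, and the key-union `merge` below is a simplified stand-in for an Automerge map merge, not Automerge itself.

```python
import hashlib
from datetime import date


def observation_id(template_id: str, scheduled_date: date) -> str:
    """Assumed derive(): SHA-256 of 'template_id|YYYY-MM-DD', first 16 bytes as hex."""
    material = f"{template_id}|{scheduled_date.isoformat()}".encode("utf-8")
    return hashlib.sha256(material).hexdigest()[:32]


def materialize(template_ids: list[str], day: date) -> dict[str, dict]:
    """What one device's lazy scheduler writes for a given day."""
    return {
        observation_id(t, day): {"template_id": t, "date": day.isoformat()}
        for t in template_ids
    }


def merge(a: dict[str, dict], b: dict[str, dict]) -> dict[str, dict]:
    """Toy map merge: union of keys (stand-in for the CRDT merge)."""
    merged = dict(a)
    for key, row in b.items():
        merged.setdefault(key, row)
    return merged


templates = ["tmpl-run", "tmpl-read"]
device_a = materialize(templates, date(2025, 1, 2))
device_b = materialize(templates, date(2025, 1, 2))
combined = merge(device_a, device_b)  # two rows, not four
```

With random IDs the same merge would keep four rows; deriving the ID from the template and date is what makes the two devices' writes land on the same keys.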
What we don't protect against
Honesty is part of the threat model. Vita does not protect you from:
- A malicious browser or OS. If your device is compromised at the OS level — keylogger, hostile extension with broad access — there is no client-side cryptography that helps. We do use non-extractable WebCrypto keys to make exfiltration harder, but a determined attacker on your machine wins.
- Losing every device and your recovery phrase. Vita supports recovery-phrase escrow (see the section above), but it's opt-in. If you skipped setting one up and then lose every paired device, there is nothing left to seed a fresh install from, and the relay's events are permanent ciphertext. The recovery-phrase nag in the app shell exists exactly to keep this from being the silent default.
- Traffic analysis. The relay sees event timing and sizes. A patient observer can probably tell when you opened the app and how much you typed, even though they can't read what.
- The relay operator. If you don't run the relay yourself, you're trusting whoever does to keep the metadata they can see private. The contents are still ciphertext to them, but device labels and account IDs are not.
- A future bug in our code. We try; we test. Local-first systems fail in different ways than server-first ones do. If we ship a bug that corrupts an OPFS log, your data is at risk in a way the cloud model doesn't have. The flip side is that you can also `git diff` the entire system; nothing is happening on a server you can't see.
- A user who blindly approves an SAS mismatch. Pairing now goes through the SAS verification step described above: both devices derive a 6-digit code from their DH shared secret, the user reads the codes on both screens, and the initiator only releases `K_master` if the user confirms they match. A relay-side MITM produces different codes on the two sides, so the human is the detector. If the user clicks "match" without actually looking, a hostile relay can still get in. We can't take that last hop for the user.
Why this shape
A normal SaaS habit tracker has a much shorter spec: form posts go to an API, the API writes to a database, the UI renders what it reads back. The trade-off Vita makes is more code on your device for less trust in any company.
If the relay disappears tomorrow, your data is intact, on your device, in an encrypted log you can decrypt with the keys in your IndexedDB. That property — "the company can vanish and you still have your habits" — is the actual point. Everything else is implementation detail.