Building a Quantum-Safe Encryption Protocol
I had reached the stage of the product roadmap I wasn't looking forward to implementing—end-to-end encryption.
At the time of writing, I am the sole developer at Monomize. While I certainly don't consider myself a cryptographer, I am confident that by mastering basic cryptographic principles, I can leverage existing, well-vetted cryptographic libraries to build a quantum-safe E2E encryption protocol.
Public key cryptography principles—symmetric and asymmetric encryption, Diffie-Hellman key exchange, and digital signatures—alongside critical security properties like Perfect Forward Secrecy (PFS) and Post-Compromise Security (PCS), are all things I had to wrap my head around before I had the cojones to actually build a working prototype.
Building a custom end-to-end encryption (E2EE) protocol is no small feat. Even as a seasoned developer supported by SOTA AI models, the "gotchas" are everywhere. You aren't just writing code; you're making high-stakes product design choices that dictate the balance between ironclad security and a usable interface.
It would be fair to ask why I didn't just import something like Signal and move on. That would have been the simpler path. But simplicity isn't always the right trade-off. Signal is built around full deniability, which makes sense for deeply private, personal messaging—but in a business context, non-repudiation matters. That trade-off alone ruled it out.
From the very beginning of Monomize, I’ve deliberately built critical infrastructure in-house where correctness, control, and long-term ownership matter. For example, instead of relying on off-the-shelf recurrence and precision arithmetic libraries, we maintain these foundations ourselves. This gives us full flexibility over core systems as the product evolves. Encryption naturally falls into the same category.
This wasn’t about avoiding existing solutions because I’m a purist—it was about owning the parts of the system that define Monomize. Building it myself wasn’t easier, but it was the right choice.
Quantum Readiness
Most of today's internet still runs on RSA or elliptic curve cryptography—but the math propping it up has an expiration date. Quantum algorithms like Shor’s and Grover’s have very real implications for cryptography: Shor’s undermines RSA and ECC, and Grover’s means symmetric keys need to be twice as large to offer the same security. In other words, the "standard" cryptographic stack is a sitting duck.
We're building for a post-quantum reality now because "harvest now, decrypt later" isn't a hypothetical threat—it's already happening. Moving to lattice-based algorithms isn't just future-proofing; it's the baseline for any platform that claims to protect a business's most sensitive data.
That's why the protocol uses ML-KEM-768 (the NIST-standardized quantum-resistant key encapsulation mechanism) for initial handshakes and periodic re-keying.
ML-KEM-768 specifically was chosen over the larger ML-KEM-1024 parameter set because it hits the sweet spot: strong enough to resist quantum attacks (NIST security category 3, roughly equivalent to AES-192) while keeping key sizes and computation times reasonable for a web application. ML-KEM-1024 would be overkill for most threat models and noticeably slower on mobile devices.
The big headache with quantum-safe primitives? Those massive key sizes and longer computation times for keying operations.
For one-on-one chats, it's not a huge deal. You're usually just juggling a few devices per user, so key authentication, ratcheting, and rekeying all happen quickly on today's hardware. But throw in group chats, and the extra crunch from these quantum algorithms starts stacking up fast, especially with frequent member changes or rekeys, resulting in higher battery drain and perceptible latency.
The Specification
Just like with any software build, the goal is always to hit the highest security standards while keeping performance snappy—nothing changes there.
After studying Apple's PQ3 protocol and how Signal rolled out their post-quantum ratchets, I landed on a spec that fits Monomize like a glove—ultra-secure, but still blazing fast where it counts.
The Initial Handshake
Before diving into the ongoing ratchet mechanics, it's worth explaining how two devices establish their first shared secret. This is where the protocol diverges most from Signal's approach.
Signal's PQXDH (Post-Quantum Extended Diffie-Hellman) preserves the full X3DH structure: three to four classical Diffie-Hellman operations plus one quantum KEM. The multiple DHs serve specific purposes—mutual authentication through identity keys and deniability through ephemeral exchanges. It's elegant, but heavy.
For Monomize, I took a page from Apple's PQ3 playbook and simplified: one X25519 exchange plus one ML-KEM-768 encapsulation. That's it.
Here's why this works:
The sender generates an ephemeral X25519 key pair and performs a single DH operation with the recipient's pre-key (either a one-time key or signed pre-key). Simultaneously, they encapsulate a quantum secret using ML-KEM against the recipient's quantum public key. The two secrets are combined via HKDF to derive the initial root key.
Authentication doesn't come from multiple DH operations—it comes from Ed25519 signatures. Every message is signed, so there's no need to bake identity keys into the DH exchanges for mutual authentication. This keeps the handshake lean while still achieving the same security goals.
The trade-off? No deniability. In Signal's model, you can plausibly deny sending a message because there's no cryptographic signature proving authorship. For Monomize, that's not a bug—it's exactly what the business context demands.
Structurally, this puts the protocol closer to PQ3 than PQXDH. Both favor simplicity and non-repudiation over the complexity of triple-DH handshakes. The quantum resistance comes from ML-KEM, the forward secrecy comes from ephemeral keys, and the authentication comes from signatures.
Different Tools for Different Conversations
You're not just writing code here. Every decision affects how secure the protocol is and how usable the product feels. One-to-one chats are relatively straightforward, but group chats add real complexity around key sharing and membership changes.
That complexity is why I ended up with two different approaches: one for direct messages, one for groups.
Double Ratchet (Direct Messages)
For direct messages, the double ratchet gives us forward secrecy and post-compromise security by evolving keys with every message. Even if a key is compromised, both past and future messages remain protected.
I chose X25519 for the classical Diffie-Hellman ratchet on every message exchange—fast, battle-tested, and gives us instant healing from key compromise. Why X25519 over other curves? It's the de facto standard for modern cryptography: constant-time implementation (resistant to timing attacks), 128-bit security level, and native support in virtually every crypto library. NIST’s P-256 would also work, but X25519 is generally faster in practice and comes with a simpler, less controversial security story.
But here's the thing: X25519 won't hold up against a quantum computer. That's where the ML-KEM-768 rotation comes in. Every 50 messages (or 24 hours), the protocol triggers a fresh quantum-resistant handshake, rotating the root key. Classical ratchet for speed, quantum ratchet for long-term security.
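The rotation trigger itself reduces to two constants and a predicate. This is a rough sketch; the field names are illustrative, not the actual session schema:

```javascript
// Re-key thresholds from the spec: 50 messages or 24 hours,
// whichever comes first.
const MAX_MESSAGES = 50;
const MAX_AGE_MS = 24 * 60 * 60 * 1000;

// Returns true when a fresh ML-KEM-768 handshake is due.
function quantumRekeyDue(session, now = Date.now()) {
  return (
    session.messagesSinceRekey >= MAX_MESSAGES ||
    now - session.lastRekeyAt >= MAX_AGE_MS
  );
}

// e.g. only 12 messages in, but the last quantum handshake was 25 hours ago:
const due = quantumRekeyDue({
  messagesSinceRekey: 12,
  lastRekeyAt: Date.now() - 25 * 60 * 60 * 1000,
});
```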
I use a fairly typical server-side fan-out approach, where each message is encrypted separately for every participant in the conversation. As you'll see in the multi-device section, each device is treated as its own recipient, while still being logically tied back to the owning user.
For the actual encryption, the protocol uses XChaCha20-Poly1305—an authenticated encryption scheme with a 192-bit nonce that's basically impossible to accidentally reuse. Fast, secure, and doesn't choke on large messages. The extended nonce size is critical here: with standard ChaCha20's 96-bit nonce, you'd risk collisions after billions of messages. XChaCha20's 192-bit nonce makes that practically impossible, even with random generation.
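A back-of-the-envelope birthday bound makes the gap concrete: for n randomly generated b-bit nonces, the collision probability is roughly n² / 2^(b+1). The numbers below are estimates from that approximation, not exact figures:

```javascript
// Approximate birthday bound for random nonce collisions.
// Computed in log2 space so the 192-bit case doesn't underflow.
function nonceCollisionProbability(bits, messages) {
  const log2p = 2 * Math.log2(messages) - (bits + 1);
  return 2 ** log2p;
}

// Ten billion randomly generated nonces under each scheme:
const n = 1e10;
const chacha = nonceCollisionProbability(96, n);   // non-negligible at scale
const xchacha = nonceCollisionProbability(192, n); // astronomically small
```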
Sender Keys (Group Messages)
Group chats don't scale if you treat them like a collection of one-to-one conversations. Encrypting every message separately for every participant—and every device—quickly becomes expensive, both computationally and operationally.
Sender keys solve this by flipping the model. Instead of re-encrypting a message for each recipient, each participant maintains a shared sender key for the group. Messages are encrypted once with that key and can be decrypted by all current members.
The magic here is that the protocol uses a simple HMAC-SHA256 hash ratchet instead of expensive asymmetric operations. Each message derives a new key from the previous one—irreversible, efficient, and forward-secret. XChaCha20-Poly1305 is still used for the actual encryption, keeping it consistent with direct messages.
The tricky part is managing change. When someone joins or leaves a group, sender keys must be rotated to ensure new members can't read old messages and removed members can't read new ones. Done correctly, this keeps the group secure without sacrificing performance or battery life.
This approach becomes especially important once post-quantum cryptography enters the picture. Quantum-safe key exchanges are heavier, and sender keys let us limit how often those expensive ML-KEM operations need to happen—without weakening the security model. The protocol rotates every 50 messages or 24 hours, whichever comes first.
Cryptographic Signatures
Here's where the protocol diverges most sharply from Signal: every message is cryptographically signed with Ed25519—a fast, deterministic signature algorithm that proves authorship.
Signal deliberately avoids signatures to maintain deniability (the ability to say "I didn't send that" even if you did). For personal messaging, that makes sense. For business communication? That's a bug, not a feature.
Ed25519 specifically was chosen over alternatives like ECDSA for a few reasons: it's deterministic (no random number generation that could fail and leak keys), it's fast (faster than RSA and ECDSA for both signing and verification), and it's compact (64-byte signatures). The security level (128-bit) aligns perfectly with X25519, keeping the protocol consistent.
The signature verification happens before decryption. If the signature is invalid, the message is rejected outright—no oracle attacks, no funny business.
Non-repudiation
In a business, messages aren't just conversations—they're decisions. Someone approves a change, gives an instruction, or signs off on work, and that message carries real weight.
Without non-repudiation, things get messy fast:
- "I didn't send that."
- "That wasn't my approval."
- "The message must've been changed."
Non-repudiation gives the business certainty. It means you can prove who sent a message and trust that it hasn't been tampered with, without giving up end-to-end encryption or user privacy. That's why non-repudiation wasn't optional—it was a requirement.
Every Device Gets Its Own Identity
Here's something that might surprise you: your laptop and your phone don't share cryptographic keys in Monomize. Each device generates its own complete identity—its own Ed25519 signing key, its own X25519 agreement key, and its own ML-KEM-768 quantum key.
When someone sends you a message, they're actually encrypting it separately for each of your devices. A message to you might really be two messages: one your laptop can decrypt, one your phone can decrypt. The protocol uses a composite addressing scheme internally (userId:deviceId) to route the right ciphertext to the right device.
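As a toy illustration of that fan-out (the data shapes here are mine, not Monomize's actual schema):

```javascript
// Each of the recipient's devices is a separate encryption target,
// addressed by the composite userId:deviceId key.
function fanOut(userId, devices) {
  return devices.map((d) => ({
    address: `${userId}:${d.deviceId}`,
    // In the real protocol each entry would carry a ciphertext
    // produced under that device's own session keys.
  }));
}

const targets = fanOut('u_42', [{ deviceId: 'laptop' }, { deviceId: 'phone' }]);
// → addresses "u_42:laptop" and "u_42:phone"
```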
This sounds wasteful, but it's a deliberate trade-off. The alternative—syncing private keys between devices—opens up attack vectors I'm not willing to accept. If your laptop gets compromised, it shouldn't automatically compromise your phone too.
The protocol enforces a hard limit of 5 devices per user. More than that and the fan-out gets expensive, but more importantly, each device is an additional attack surface. Five is enough for most people (laptop, phone, tablet, plus two spares) without opening the floodgates.
Each device gets a UUIDv7 identifier—time-sortable, collision-proof, stored in IndexedDB so it survives cookie clearing. When a device registers, it generates 50 one-time keys (OTKs) that get consumed for forward secrecy, plus a signed pre-key (SPK) as a fallback when OTKs run out. The server atomically deletes OTKs as they're used, and the client auto-replenishes when it drops below 10.
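The replenishment rule boils down to a small helper. The numbers come from the spec above; the function name is illustrative:

```javascript
// OTK pool thresholds: keep 50 on the server, top up once fewer
// than 10 remain.
const OTK_TARGET = 50;
const OTK_LOW_WATER = 10;

// How many fresh one-time keys the client should generate and upload.
function otkReplenishCount(remainingOnServer) {
  if (remainingOnServer >= OTK_LOW_WATER) return 0; // pool still healthy
  return OTK_TARGET - remainingOnServer;            // top back up to 50
}

const toGenerate = otkReplenishCount(7); // 43 fresh OTKs to upload
```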
The Tricky Parts
The spec looks clean on paper. But actually implementing this stuff? That's where you discover all the edge cases that make encryption protocols so unforgiving. Here are the things that bit me.
Race Conditions and Mutex Locks
The double ratchet is a state machine. It has a counter that increments with every message. That counter is absolutely critical—if it gets out of sync, messages become undecryptable.
Here's the problem: what happens when a user clicks "Send" twice in rapid succession?
Without protection, both clicks read the same counter value from IndexedDB (say, N=5). Both derive the same message key for position 6, and both write N=6 back to the database. Two ciphertexts go out claiming the same counter. The recipient decrypts the first one, wipes that key for forward secrecy, and then can't decrypt the second, because the ratchet has already moved past it. Message lost.
This is a classic race condition, and in a cryptographic context, it's catastrophic.
The fix is a mutex lock (mutual exclusion). Every conversation gets its own lock. When you send a message, you acquire the lock first:
```javascript
const mutex = getLock(conversationId)

await mutex.runExclusive(async () => {
  // Read state from IndexedDB
  // Encrypt message
  // Update state
  // Save to IndexedDB
})
```

The second click waits in a queue until the first operation completes. No race condition, no lost messages, no edge cases where encryption silently breaks.
The same pattern is used for group chat sender keys and for decryption. The ratchet state machine must be serialized—there's no way around it.
Out-of-Order Messages
Messages don't always arrive in the order they're sent. Network conditions, server routing, WebSocket reconnections—all of it can shuffle messages around.
The ratchet expects message N, but message N+5 shows up instead. What do you do?
If you just skip ahead, you've permanently lost the ability to decrypt messages N through N+4 if they eventually arrive. That's unacceptable.
The solution is skipped key storage. When we receive message N+5, the protocol ratchets the chain key forward step-by-step, deriving and storing the keys for messages N, N+1, N+2, N+3, and N+4 in IndexedDB. Then it decrypts N+5 with the freshly derived key.
If message N+2 shows up later, we check IndexedDB first, find the stored key, decrypt the message, and delete the key. Forward secrecy is preserved—used keys are immediately wiped.
Message Key Caching
Here's a performance gotcha: decrypting a message requires deriving keys through the ratchet. If you refresh the page and need to re-render 50 messages in a conversation, that's 50 ratchet operations, 50 HMAC derivations, 50 decryption operations.
On desktop, it's fast. On a low-end mobile device, it's noticeable lag.
The optimization is message key caching. After we decrypt a message for the first time, we store the message key in IndexedDB, keyed by the message ID. On page reload, we check the cache first. If the key exists, we skip the entire ratchet derivation and jump straight to decryption.
This is safe because:
- The message key is unique per message (derived from the chain)
- We still verify the signature before decrypting (prevents tampering)
- The cache is local to the device (never sent to the server)
But caches can't grow forever. The implementation enforces a 7-day TTL on cached keys and an LRU eviction policy capped at 1,000 entries. After 7 days, if you revisit an old conversation, the first load might be slower—but that's a reasonable trade-off for keeping storage under control.
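The eviction policy is small enough to sketch in full. An insertion-ordered Map stands in for IndexedDB here; the TTL and cap match the numbers above:

```javascript
// Cache limits: 7-day TTL, LRU eviction capped at 1,000 entries.
const TTL_MS = 7 * 24 * 60 * 60 * 1000;
const MAX_ENTRIES = 1000;

const cache = new Map(); // messageId -> { key, storedAt }

function cachePut(messageId, key, now = Date.now()) {
  cache.delete(messageId); // re-insert so this entry becomes most recent
  cache.set(messageId, { key, storedAt: now });
  if (cache.size > MAX_ENTRIES) {
    // Maps iterate in insertion order, so the first entry is the LRU one.
    cache.delete(cache.keys().next().value);
  }
}

function cacheGet(messageId, now = Date.now()) {
  const entry = cache.get(messageId);
  if (!entry) return null;
  if (now - entry.storedAt > TTL_MS) {
    cache.delete(messageId); // expired: caller falls back to the ratchet
    return null;
  }
  cache.delete(messageId); // bump to most-recently-used
  cache.set(messageId, entry);
  return entry.key;
}
```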
The result? Conversations load instantly, even with hundreds of messages, without sacrificing security or exploding the database.