Jan 30, 02:08 PM
I have been blogging about solving the Protohackers problems, and it’s been working out pretty well for me. Forcing myself to explain what I’m doing has generally led me to better solutions than the ones I got when I just dove in and tried a few of them. My next “real” project is going to be a messaging service featuring end-to-end encryption, and since blogging is working out well, I figured I’d try blogging my way through this, too. I am hopeful that it’ll lead to a better solution.
First Thoughts
I’m aiming for something MVPish that can be expanded later if I end up liking it. Let’s get some features I’ve been kicking around in my head down on…metaphorical paper:
- End-to-end encryption. The world needs more of this; let’s practice getting it to people.
- No “registration” required. Users shouldn’t be associated with any identifying information other than their public keys (and their current connection, if they’re currently connected).
- Private keys will remain entirely within the possession of the user; they will not be stored, even behind a password, on the server. If a user wants to use the same identity on multiple devices, they will have to make provision for copying/transferring/re-entering their private key on each additional device. This may be inconvenient for the user, but it’s also secure. We’ll try to make it easy for them.
- Any “address book” or “contact management” will be handled entirely by the user’s frontend; initially there will probably be a provision to save some public keys in localStorage and associate names with them as a sort of “contact list”. We might also offer a provision for saving and loading address book data from a file.
Architecture
- The server will be written in Rust, leaning heavily on tokio_tungstenite.
- The backing store will be a PostgreSQL instance. (I will probably use tokio_postgres and deadpool_postgres.)
- Clients will communicate with the server over TLS-encrypted (rustls) websockets (hence tungstenite).
- There will initially only be a web client; the frontend protocol should be easy enough to write additional clients for.
- All messages will be encrypted/decrypted client-side. The server will only ever see the following:
  - global message ID#
  - sender key
  - recipient key
  - timestamp
  - nonce
  - whether it’s been seen
  - blob of encrypted data
- I’m toying with the idea of internally indexing public keys with integers, so each stored message will have the columns (sender_id, recipient_id, timestamp, nonce, blob).
- Blobs will be encrypted/decrypted using crypto_box::ChaChaBox in a wasm module.
- Each message sent from client to server or vice versa will be the same size. This means short plaintexts will be padded, and long plaintexts will be broken up into several messages (with the last one probably padded). So then each blob will need the following fields:
  - plaintext sequence number (this is to group broken-up messages)
  - number of messages this plaintext spans
  - index of this message in the current subsequence of messages for this particular plaintext
  - length of this particular chunk of plaintext
  - message text (possibly padded)
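To make the padding-and-splitting idea concrete, here is a minimal sketch of how a plaintext might be broken into same-size chunks carrying the five fields listed above. Everything specific here is an assumption for illustration: the names (Chunk, split_plaintext, reassemble), the field widths, the 256-byte text size, and zero-byte padding are all hypothetical, and encryption is left out entirely.

```rust
// Hypothetical sizes: the plan above does not fix any of these numbers.
const TEXT_LEN: usize = 256; // padded text bytes per chunk
const HEADER_LEN: usize = 4 + 2 + 2 + 2; // seq + total + index + len
const BLOB_LEN: usize = HEADER_LEN + TEXT_LEN; // every plaintext blob is this size

#[derive(Debug, Clone, PartialEq)]
struct Chunk {
    seq: u32,   // plaintext sequence number (groups broken-up messages)
    total: u16, // number of messages this plaintext spans
    index: u16, // position of this chunk within the plaintext
    len: u16,   // meaningful bytes in `text`; the rest is padding
    text: [u8; TEXT_LEN],
}

impl Chunk {
    /// Serialize to the fixed-size layout that would then be encrypted.
    fn to_bytes(&self) -> [u8; BLOB_LEN] {
        let mut out = [0u8; BLOB_LEN];
        out[0..4].copy_from_slice(&self.seq.to_be_bytes());
        out[4..6].copy_from_slice(&self.total.to_be_bytes());
        out[6..8].copy_from_slice(&self.index.to_be_bytes());
        out[8..10].copy_from_slice(&self.len.to_be_bytes());
        out[HEADER_LEN..].copy_from_slice(&self.text);
        out
    }
}

/// Split a plaintext into fixed-size chunks, zero-padding the last one.
/// Returns None if the plaintext would need more than u16::MAX chunks.
fn split_plaintext(seq: u32, plaintext: &[u8]) -> Option<Vec<Chunk>> {
    // An empty plaintext still yields one fully padded chunk.
    let count = std::cmp::max(1, (plaintext.len() + TEXT_LEN - 1) / TEXT_LEN);
    if count > u16::MAX as usize {
        return None;
    }
    let mut chunks = Vec::with_capacity(count);
    for index in 0..count {
        let start = index * TEXT_LEN;
        let end = std::cmp::min(start + TEXT_LEN, plaintext.len());
        let piece = &plaintext[start..end];
        let mut text = [0u8; TEXT_LEN];
        text[..piece.len()].copy_from_slice(piece);
        chunks.push(Chunk {
            seq,
            total: count as u16,
            index: index as u16,
            len: piece.len() as u16,
            text,
        });
    }
    Some(chunks)
}

/// Rebuild the plaintext from all of its chunks, in any order.
/// Returns None if chunks are missing, duplicated, or inconsistent.
fn reassemble(chunks: &[Chunk]) -> Option<Vec<u8>> {
    let first = chunks.first()?;
    if chunks.len() != first.total as usize {
        return None;
    }
    let mut sorted: Vec<&Chunk> = chunks.iter().collect();
    sorted.sort_by_key(|c| c.index);
    let mut out = Vec::new();
    for (i, c) in sorted.iter().enumerate() {
        let consistent = c.seq == first.seq
            && c.total == first.total
            && c.index as usize == i
            && c.len as usize <= TEXT_LEN;
        if !consistent {
            return None;
        }
        out.extend_from_slice(&c.text[..c.len as usize]);
    }
    Some(out)
}
```

With these assumed sizes, a 600-byte plaintext becomes three chunks (256, 256, and 88 meaningful bytes), and each one serializes to the same BLOB_LEN bytes before encryption, so an observer sees nothing of the original length beyond the chunk count.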
Operation/Behavior
- If the user agent in question has no stored private key, the user will be prompted to either generate or import one.
- Upon connection:
  - The client identifies itself with its public key.
  - The server sends a challenge phrase encrypted with that public key.
  - The client responds with the decrypted challenge phrase to verify its identity.
  - The server sends a list of public keys that have unseen messages waiting for the client.
- When a user selects a contact in their contact list for the first time during a session:
  - The frontend will query the server for all unseen messages from that contact.
  - The server will also include some extra history: already-seen messages, both to and from the given contact, to give the newer ones context. (The user might be able to specifically request more history.)
  - All those messages will be marked as seen. (That is, the client will send a message to the server to update those database records as “seen”.)
- When a user selects a contact in their contact list subsequently in a session:
  - Any delivered but unseen messages will be marked as seen.
- When a user sends a message:
  - The message will be stored in the database, regardless of whether it’s delivered immediately.
  - If the recipient is currently connected, the message will be delivered immediately.
- When a message is delivered to a client:
  - If that sender is not in the user’s contact list, it will be added, identified only by public key. The user may then store locally an associated name (or arbitrary text or whatever).
  - If the sender’s contact is selected, the message will be displayed and marked as seen. Otherwise something will notify the user that they have unseen messages from the given sender.
- All messages will be stored in the database for some arbitrary length of time (like six weeks) and then discarded forever.
- The server will maintain a block list. Messages from blockee to blocker will be stored in the database, but never delivered. (The only reason they’re saved is so the blockee will not know they have been blocked.) Messages from blocker to blockee will be dropped.
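The send and block-list rules above boil down to a small decision the server makes for every incoming message. Here is a rough sketch of that decision as a pure function. The names (KeyId, Disposition, Server) are hypothetical, keys are represented by integer IDs as an assumption, and the choice to let "drop" win when two users have blocked each other is mine, not something the plan specifies.

```rust
use std::collections::HashSet;

// Hypothetical integer index standing in for a public key.
type KeyId = u32;

#[derive(Debug, PartialEq)]
enum Disposition {
    /// The sender has blocked the recipient: discard the message.
    Drop,
    /// Keep the message in the database but do not push it to anyone
    /// (the recipient is offline, or has blocked the sender).
    StoreOnly,
    /// Keep the message and push it to the connected recipient.
    StoreAndDeliver,
}

struct Server {
    /// (blocker, blockee) pairs.
    blocks: HashSet<(KeyId, KeyId)>,
    /// Keys with a live connection right now.
    connected: HashSet<KeyId>,
}

impl Server {
    fn disposition(&self, sender: KeyId, recipient: KeyId) -> Disposition {
        // Blocker -> blockee: dropped outright.
        // Assumption: this also wins when both sides have blocked each other.
        if self.blocks.contains(&(sender, recipient)) {
            return Disposition::Drop;
        }
        // Blockee -> blocker: stored so the sender sees nothing unusual,
        // but never delivered.
        if self.blocks.contains(&(recipient, sender)) {
            return Disposition::StoreOnly;
        }
        // Normal case: always stored, delivered now only if the recipient is online.
        if self.connected.contains(&recipient) {
            Disposition::StoreAndDeliver
        } else {
            Disposition::StoreOnly
        }
    }
}
```

Keeping this as a function of plain sets, separate from the socket and database code, would make the blocking behavior easy to test before any networking exists.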
This seems like enough of a musculoskeletal plan to at least get going. There are certainly some important decisions to be made, chief among which, I think, are:
- What format/encoding will messages between the client and server take? Considering that messages will mostly be blobs of encrypted data, JSON seems inappropriate. Given that each type of message will probably have a fixed size, it may be okay to just use a raw binary format.
- How much work will the wasm module be doing? Will it just be doing the crypto? Or will it process all the incoming (and prepare all the outgoing) data, delegating to JavaScript only for updating the display and pushing bytes down the socket?
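If the raw binary route wins out, a fixed-size frame for a stored message could be as simple as the server-visible fields laid end to end. The sketch below is one possible layout, not a decision. The field order, the big-endian integers, the Frame name, and BLOB_LEN are all hypothetical; the 32-byte key and 24-byte nonce sizes are what I understand crypto_box::ChaChaBox to use, so treat those as assumptions too.

```rust
// Assumed sizes: 32-byte public keys and 24-byte nonces; BLOB_LEN is made up.
const KEY_LEN: usize = 32;
const NONCE_LEN: usize = 24;
const BLOB_LEN: usize = 282;
const FRAME_LEN: usize = 8 + KEY_LEN + KEY_LEN + 8 + NONCE_LEN + 1 + BLOB_LEN;

#[derive(Debug, PartialEq)]
struct Frame {
    id: u64,                  // global message ID
    sender: [u8; KEY_LEN],    // sender public key
    recipient: [u8; KEY_LEN], // recipient public key
    timestamp: i64,           // e.g. seconds since the Unix epoch (assumption)
    nonce: [u8; NONCE_LEN],
    seen: bool,
    blob: [u8; BLOB_LEN],     // opaque encrypted data
}

/// Copy the next N bytes out of `bytes`, advancing `pos`.
/// Callers must have checked that enough bytes remain.
fn take<const N: usize>(bytes: &[u8], pos: &mut usize) -> [u8; N] {
    let mut out = [0u8; N];
    out.copy_from_slice(&bytes[*pos..*pos + N]);
    *pos += N;
    out
}

impl Frame {
    /// Lay the fields out back to back in a fixed-size buffer.
    fn encode(&self) -> [u8; FRAME_LEN] {
        let mut out = [0u8; FRAME_LEN];
        let mut pos = 0;
        let mut put = |field: &[u8]| {
            out[pos..pos + field.len()].copy_from_slice(field);
            pos += field.len();
        };
        put(&self.id.to_be_bytes());
        put(&self.sender);
        put(&self.recipient);
        put(&self.timestamp.to_be_bytes());
        put(&self.nonce);
        put(&[self.seen as u8]);
        put(&self.blob);
        out
    }

    /// Parse a frame, rejecting wrong lengths and invalid `seen` bytes.
    fn decode(bytes: &[u8]) -> Option<Frame> {
        if bytes.len() != FRAME_LEN {
            return None;
        }
        let mut pos = 0;
        let id = u64::from_be_bytes(take(bytes, &mut pos));
        let sender: [u8; KEY_LEN] = take(bytes, &mut pos);
        let recipient: [u8; KEY_LEN] = take(bytes, &mut pos);
        let timestamp = i64::from_be_bytes(take(bytes, &mut pos));
        let nonce: [u8; NONCE_LEN] = take(bytes, &mut pos);
        let seen = match take::<1>(bytes, &mut pos)[0] {
            0 => false,
            1 => true,
            _ => return None,
        };
        let blob: [u8; BLOB_LEN] = take(bytes, &mut pos);
        Some(Frame { id, sender, recipient, timestamp, nonce, seen, blob })
    }
}
```

A layout like this would be easy to parse from either Rust or JavaScript, which matters for the second question: the less the format needs interpreting, the less it matters which side of the wasm boundary does the parsing.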
This document may evolve before more progress is made.