A Minimal Hash-Based Symmetric Encryption Protocol
I have long been drawn to cryptographic primitives for their elegance—how a handful of mathematical properties can guarantee something as consequential as secrecy. When exploring symmetric encryption, I found myself wondering: what is the simplest possible protocol that still provides meaningful confidentiality?
This article describes a minimalist approach I designed, relying solely on a cryptographic hash function. It focuses exclusively on message confidentiality, deliberately setting aside integrity and authentication. This isn't because those properties don't matter—they very much do—but because I wanted to isolate and understand confidentiality on its own terms.
The security of this protocol rests on three pillars: the strength of the chosen hash function, the entropy of the shared secret, and the crucial requirement that each secret be unique per message.
The basic idea
The protocol requires only two things: a cryptographic hash function H, and a shared secret S known to both sender and receiver. The key constraint is that S must be unique for each message—reusing a secret across messages would compromise the scheme entirely.
The approach is straightforward. I divide the message into blocks matching the hash output size, then XOR each block with a value derived from both the secret and the block's position. This creates a simple stream cipher where the hash function serves as a pseudorandom generator.
How encryption works
Let M be the message to encrypt, and n the output size of the hash function in bits.
First, I split M into blocks M₁, M₂, ..., Mₖ. Each block has n bits, except possibly the last one, which may be shorter. I deliberately avoid padding—if the final block has m bits where m < n, I simply use only the first m bits of the corresponding hash output.
For each block i, I compute:
Cᵢ = Mᵢ ⊕ H(S || i)
Here, || denotes concatenation and ⊕ is the bitwise XOR operation. The final ciphertext C is simply the concatenation of all encrypted blocks.
Decryption is symmetric: the receiver, knowing S, divides the ciphertext into blocks and applies the same XOR operation with the same derived values to recover the original message.
Why this works
The protocol's confidentiality depends on several properties working together.
The hash function must be secure against preimage attacks—given H(x), it should be computationally infeasible to recover x. More importantly for our purposes, the output of H should be indistinguishable from random data. If an attacker cannot predict H(S || i), they cannot recover the plaintext from the ciphertext.
The shared secret needs sufficient entropy to resist brute-force attacks. And critically, using a fresh secret for each message prevents an attacker from correlating patterns across multiple ciphertexts. If the same secret were reused, XORing two ciphertexts would yield the XOR of the two plaintexts—a serious leak.
Each block uses a unique derived key because I include the block index in the hash input. This prevents patterns from emerging when identical plaintext blocks appear at different positions in the message.
What this protocol does not do
I want to be clear about the boundaries of this design. The protocol provides confidentiality only. It does not verify message integrity—an attacker could flip bits in the ciphertext, and the receiver would decrypt garbage without knowing something went wrong. It includes no authentication mechanism—the receiver cannot confirm the message actually came from the expected sender. And it offers no protection against replay attacks or message reordering.
These omissions are intentional. I wanted to understand what confidentiality alone looks like, stripped of the mechanisms that typically accompany it in production systems.
Practical considerations
A few properties make this protocol pleasant to implement. The ciphertext length exactly matches the plaintext length—no padding overhead, no length leakage beyond what the original message already reveals. Each block can be encrypted or decrypted independently, enabling parallelization. And memory requirements are minimal: I can process a stream without buffering the entire message.
The computational cost is one hash operation per block. For a 256-bit hash function, that means one hash call per 32 bytes of message data—efficient enough for most purposes, though not competitive with dedicated stream ciphers for high-throughput applications.
Conclusion
This protocol is an exercise in minimalism rather than a recommendation for production use. Real-world applications need integrity checking, authentication, and often much more. But there is value in understanding the simplest form of a cryptographic property—it clarifies what each layer of a more complete system actually contributes.
For anyone building on this foundation, the natural next step would be to add a MAC (Message Authentication Code) for integrity, and to design a key exchange mechanism so that the unique-secret-per-message requirement becomes practical to satisfy.