diff --git a/ChaCha20_Poly1305_64/doc/notes.md b/ChaCha20_Poly1305_64/doc/notes.md
new file mode 100644
index 0000000..d4395de
--- /dev/null
+++ b/ChaCha20_Poly1305_64/doc/notes.md
@@ -0,0 +1,33 @@
+# Notes
+
+Since we are designing this for a 64-bit datapath, we need to be able to
+compute 64 bits every cycle. The ChaCha20 block function works on 16x32-bit
+words, or 512-bit blocks, at a time. Logically it might make more sense to
+have a datapath of 128 bits.
+
+On the other hand, each operation is a 32-bit operation. For timing reasons it
+might then make more sense to have each operation registered. But will this
+be able to match the throughput that we need?
+
+Each quarter round generates 4 words. Each cycle updates all 128 bits at once.
+We can do 4 of the quarter rounds at once, so at the end of each cycle we will
+generate 512 bits.
+
+At full speed, then, the core would generate 512 bits per cycle, but we would
+only need to generate 64 bits per cycle. We could do only 1 quarter round at
+once, which would generate only 128 bits per cycle, but we would need some sort
+of structure to reorder the state such that it is ready to XOR with the
+incoming data. We could even make this parameterizable, but that would be the
+next step if we actually need to support 100 Gbps encryption.
+
+So in summary, we will have a single QuarterRound module which generates 128
+bits of output. We will have a scheduling block which selects which 4 words
+of state go into the QuarterRound module, and a de-interleaver which takes the
+output from the QuarterRound module and re-orders it into the correct
+order to combine with the incoming data. There is also an addition in there
+somewhere (the original state is added back to the result of the rounds).
+
+To support AEAD, the first ChaCha20 block becomes the key for the Poly1305
+module. This can be computed in parallel with the second block, which starts
+the cipher keystream, at the expense of double the gates. Otherwise, there
+would be a delay between packets while the key is generated.
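
As a software reference for the structure described in `notes.md`, the following is a minimal Python sketch, not the RTL in this change. It models one quarter-round unit, a schedule that picks which 4 state words feed it at each step, the final addition of the original state, and a regrouping of the 16 output words into 64-bit beats. The quarter round, the column/diagonal ordering, and the use of the counter-0 block as the Poly1305 key follow RFC 8439. The names `schedule`, `to_64bit_beats`, and `poly1305_key_words`, and the choice to put the lower-indexed word in the low half of each beat, are assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF

# "expand 32-byte k" constants that fill state words 0..3 (RFC 8439).
CONSTANTS = [0x61707865, 0x3320646E, 0x79622D32, 0x6B206574]

# State indices consumed by each quarter round of a column / diagonal round.
COLUMN_ROUND = [(0, 4, 8, 12), (1, 5, 9, 13), (2, 6, 10, 14), (3, 7, 11, 15)]
DIAGONAL_ROUND = [(0, 5, 10, 15), (1, 6, 11, 12), (2, 7, 8, 13), (3, 4, 9, 14)]


def rotl32(x, n):
    """Rotate a 32-bit value left by n bits."""
    return ((x << n) & MASK32) | (x >> (32 - n))


def quarter_round(a, b, c, d):
    """One ChaCha quarter round: 4 words (128 bits) in, 4 words out."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


def schedule():
    """Yield the 4 state indices fed to the single quarter-round unit per step.

    Hypothetical helper: 10 double rounds of 4 column + 4 diagonal steps.
    """
    for _ in range(10):
        yield from COLUMN_ROUND
        yield from DIAGONAL_ROUND


def chacha20_block(key_words, counter, nonce_words):
    """Return the 16 output words for one 512-bit block."""
    initial = CONSTANTS + list(key_words) + [counter & MASK32] + list(nonce_words)
    state = list(initial)
    for indices in schedule():
        outputs = quarter_round(*(state[i] for i in indices))
        for i, word in zip(indices, outputs):
            state[i] = word
    # The "addition in there somewhere": add the original state back in.
    return [(s + i) & MASK32 for s, i in zip(state, initial)]


def to_64bit_beats(words):
    """Group 16 words into 8 beats of 64 bits (beat layout is an assumption)."""
    return [words[i] | (words[i + 1] << 32) for i in range(0, 16, 2)]


def poly1305_key_words(key_words, nonce_words):
    """First 256 bits of the counter-0 block, used as the Poly1305 key."""
    return chacha20_block(key_words, 0, nonce_words)[:8]
```

In this model one block takes 80 quarter-round steps, so whether a single quarter-round unit can sustain 64 bits per cycle depends on how the hardware pipelines or replicates it. The cipher keystream would come from `chacha20_block` with the counter starting at 1, XORed beat by beat with the incoming data.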