ramblings
@@ -105,4 +105,49 @@ quarter round module can have 2 different blocks going through it at once.

The new one multiplexes 4 quarter rounds onto 1 QR module, which reduces the
logic usage to only 46k LEs, the vast majority of which is flops (2k FF per
round, 0.5k LUT).

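As a rough software model of the multiplexing idea, the sketch below pushes the four column quarter rounds of one round through a single `quarter_round` function, one after another, the way a shared QR module would see them. It assumes the quarter round in question is the standard ChaCha20 one, which these notes do not spell out. The names `rotl32`, `quarter_round` and `column_round_multiplexed` are made up for illustration and say nothing about how the actual RTL is organised.

```python
MASK32 = 0xFFFFFFFF


def rotl32(v, n):
    """Rotate a 32-bit word left by n bits."""
    return ((v << n) | (v >> (32 - n))) & MASK32


def quarter_round(a, b, c, d):
    """Standard ChaCha20 quarter round (assumed to be what the QR module computes)."""
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 16)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 12)
    a = (a + b) & MASK32
    d = rotl32(d ^ a, 8)
    c = (c + d) & MASK32
    b = rotl32(b ^ c, 7)
    return a, b, c, d


def column_round_multiplexed(state):
    """Feed the 4 column quarter rounds through one QR unit, one per time slot."""
    state = list(state)
    for col in range(4):  # 4 time slots sharing the same QR logic
        idx = (col, col + 4, col + 8, col + 12)
        out = quarter_round(*(state[i] for i in idx))
        for i, v in zip(idx, out):
            state[i] = v
    return state


# Quarter-round test vector from RFC 8439, section 2.1.1
qr_out = quarter_round(0x11111111, 0x01020304, 0x9B8D6F43, 0x01234567)
print([hex(v) for v in qr_out])
```

In hardware the loop body would be one pass through the shared module, so the area saving comes at the cost of four passes per round instead of one.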
# Modulo 2^130-5

We can use the trick here to do modulo reduction much faster.

If we split the 259-bit product at 2^130, leaving 129 high bits and 130 low
bits, we now have a 129-bit value multiplied by 2^130, plus the 130-bit value.
We know that 2^130 mod 2^130-5 is 5, so we can replace that 2^130 with 5 and
add, then repeat that step once more on the result.

Ex.

x = x1*2^130 + x2
x mod 2^130-5 = (x1*5 + x2) mod 2^130-5, and we call x1*5 + x2 = x3
x3 = x4*2^130 + x5
x mod 2^130-5 = (x4*5 + x5) mod 2^130-5

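The two folding steps can be sketched in Python, whose arbitrary-precision integers stand in for the wide hardware datapath. `fold` and `reduce_two_rounds` are names invented here, not taken from the design. The result is congruent to x modulo 2^130-5 but not necessarily fully reduced, so a final compare-and-subtract may still be needed if a canonical value is required.

```python
P = (1 << 130) - 5           # the modulus 2^130-5
LOW_MASK = (1 << 130) - 1    # selects the low 130 bits


def fold(x):
    """One round: x1*2^130 + x2 becomes x1*5 + x2 (same value mod P)."""
    x1 = x >> 130
    x2 = x & LOW_MASK
    return x1 * 5 + x2


def reduce_two_rounds(x):
    """Apply the fold twice, as in the example."""
    return fold(fold(x))


x = (0x1234 << 200) + 0xABCDEF   # arbitrary sample input
print(reduce_two_rounds(x) % P == x % P)
```

Each fold only needs a shift, a multiply by 5 (a shift and an add) and one wide addition, which is why it is so much cheaper than a general division.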
Let's do the math to verify that we only need two rounds. The maximum value
that we could possibly get for the value being multiplied is 2^131-1 and the
maximum value for R is 0x0ffffffc0ffffffc0ffffffc0fffffff. Multiplying these
together gives us
0x7fffffe07fffffe07fffffe07ffffff7f0000003f0000003f0000003f0000001.

Applying the first round to this we get

0x1ffffff81ffffff81ffffff81ffffffd * 5 + 0x3f0000003f0000003f0000003f0000001
= 0x48fffffdc8fffffdc8fffffdc8ffffff2

Applying the second round to this we get

1 * 5 + 0x8fffffdc8fffffdc8fffffdc8ffffff2 = 0x8fffffdc8fffffdc8fffffdc8ffffff7

and this is indeed the correct answer. The bottom part can be up to 130 bits,
but the high part going into the first round is under 2^125, so 5 times it is
under 2^128 and the first round gives less than 2^130 + 2^128. If bit 130 is
set after that, the low part is under 2^128 and adding 5 can't overflow, so
two rounds are enough for these input sizes.

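The walk-through with the maximum values can be replayed with Python integers as a sanity check. The variable names are mine, and the limits are simply the two maximum values quoted in the notes. When run, it should print the same three hex values as the text, followed by `True`.

```python
P = (1 << 130) - 5
LOW_MASK = (1 << 130) - 1

acc_max = (1 << 131) - 1                      # largest multiplicand quoted in the notes
r_max = 0x0ffffffc0ffffffc0ffffffc0fffffff    # largest R quoted in the notes

product = acc_max * r_max
round1 = (product >> 130) * 5 + (product & LOW_MASK)   # first fold
round2 = (round1 >> 130) * 5 + (round1 & LOW_MASK)     # second fold

print(hex(product))
print(hex(round1))
print(hex(round2))
print(round2 == product % P)   # the second fold is already fully reduced here
```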
131+128 = 259 bits, only have to do this once

Example where a single round is enough (the result already fits in 130 bits):

0xb83fe991ca75d7ef2ab5cba9cccdfd938b73fff384ac90ed284034da565ecf
reduces to
0x19471c3e3e9c1bfded81da3736e96604a

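Folding the first value above once gives the second one, which already fits in 130 bits. The sketch below counts how many folds a value needs before that happens; `rounds_needed` is a hypothetical helper written for this check, not something from the design.

```python
LOW_MASK = (1 << 130) - 1


def rounds_needed(x):
    """Fold x until it fits in 130 bits; return (number of folds, folded value)."""
    rounds = 0
    while x >> 130:                      # anything left above bit 129?
        x = (x >> 130) * 5 + (x & LOW_MASK)
        rounds += 1
    return rounds, x


sample = 0xb83fe991ca75d7ef2ab5cba9cccdfd938b73fff384ac90ed284034da565ecf
worst = ((1 << 131) - 1) * 0x0ffffffc0ffffffc0ffffffc0fffffff

print(rounds_needed(sample)[0], hex(rounds_needed(sample)[1]))
print(rounds_needed(worst)[0])
```

For the sample this reports one fold, and for the maximum-value product it reports two, which matches the hand calculation.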
Kind of curious now: at what point does a ripple carry adder using dedicated
CI/CO ports become slower than a more complex adder like carry lookahead or
carry save (Wallace tree)?