Refactor readback mux implementation. Improves performance (#155) and eliminates illegal streaming operator usage (#165)
@@ -38,18 +38,15 @@ This section also assigns any hardware interface outputs.
|
|||||||
|
|
||||||
Readback
|
Readback
|
||||||
--------
|
--------
|
||||||
The readback layer aggregates and reduces all readable registers into a single
|
The readback layer aggregates and MUXes all readable registers into a single
|
||||||
read response. During a read operation, the same address decode strobes are used
|
read response.
|
||||||
to select the active register that is being accessed.
|
|
||||||
This allows for a simple OR-reduction operation to be used to compute the read
|
|
||||||
data response.
|
|
||||||
|
|
||||||
For designs with a large number of software-readable registers, an optional
|
For designs with a large number of software-readable registers, an optional
|
||||||
fanin re-timing stage can be enabled. This stage is automatically inserted at a
|
fanin re-timing stage can be enabled. This stage is automatically inserted at a
|
||||||
balanced point in the read-data reduction so that fanin and logic-levels are
|
balanced point in the read-data reduction so that fanin and logic-levels are
|
||||||
optimally reduced.
|
optimally reduced.
|
||||||
|
|
||||||
.. figure:: diagrams/readback.png
|
.. figure:: diagrams/rt-readback-fanin.png
|
||||||
:width: 65%
|
:width: 65%
|
||||||
:align: center
|
:align: center
|
||||||
|
|
||||||
|
|||||||
@@ -1,10 +0,0 @@
|
|||||||
Holy smokes this is complicated
|
|
||||||
|
|
||||||
Keep this exporter in Alpha/Beta for a while
|
|
||||||
Add some text in the readme or somewhere:
|
|
||||||
- No guarantees of correctness! This is always true with open source software,
|
|
||||||
but even more here!
|
|
||||||
Be sure to do your own validation before using this in production.
|
|
||||||
- Alpha means the implementation may change drastically!
|
|
||||||
Unlike official sem-ver, I am not making any guarantees on compatibility
|
|
||||||
- I need your help! Validating, finding edge cases, etc...
|
|
||||||
@@ -1,35 +1,84 @@
|
|||||||
--------------------------------------------------------------------------------
|
--------------------------------------------------------------------------------
|
||||||
Readback mux layer
|
Readback mux layer
|
||||||
--------------------------------------------------------------------------------
|
--------------------------------------------------------------------------------
|
||||||
|
Use a large always_comb block + many if statements that select the read data
|
||||||
|
based on the cpuif address.
|
||||||
|
Loops are handled the same way as address decode.
|
||||||
|
|
||||||
Implementation:
|
Other options that were considered:
|
||||||
- Big always_comb block
|
- Flat case statement
|
||||||
- Initialize default rd_data value
|
con: Difficult to represent arrays. Essentially requires unrolling
|
||||||
- Lotsa if statements that operate on reg strb to assign rd_data
|
con: complicates retiming strategies
|
||||||
- Merges all fields together into reg
|
con: Representing a range (required for externals) is cumbersome. Possible with stacked casez wildcards.
|
||||||
- pulls value from storage element struct, or input struct
|
- AND field data with strobe, then massive OR reduce
|
||||||
- Provision for optional flop stage?
|
This was the strategy prior to v1.3, but turned out to infer more overhead
|
||||||
|
than originally anticipated
|
||||||
|
- Assigning data to a flat register array, then directly indexing via address
|
||||||
|
con: Would work fine, but scales poorly for sparse regblocks.
|
||||||
|
Namely, simulators would likely allocate memory for the entire array
|
||||||
|
- Assign to a flat array that is packed sequentially, then directly index it using a derived packed index
|
||||||
|
Concern that for sparse regfiles, the translation of addr --> packed index
|
||||||
|
becomes a nontrivial logic function
|
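For comparison, the pre-v1.3 "AND with strobe, then OR reduce" option listed above can be modeled in a few lines of Python. This is only a behavioral sketch; the function name and the list-based register model are assumptions made for illustration, not part of the exporter.

```python
from functools import reduce


def or_reduce_readback(values, strobes):
    """Behavioral model: mask each register value with its one-hot decode
    strobe, then OR all of the masked values together."""
    masked = [value if strobe else 0 for value, strobe in zip(values, strobes)]
    return reduce(lambda a, b: a | b, masked, 0)


# Hypothetical 4-register block; only the third strobe is active.
values = [0x11, 0x22, 0x33, 0x44]
selected = or_reduce_readback(values, [0, 0, 1, 0])
missed = or_reduce_readback(values, [0, 0, 0, 0])  # no strobe active
```

A read that hits no register reduces to 0, which matches the behaviour described in the older notes.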
||||||
|
|
||||||
Mux Strategy:
|
Pros:
|
||||||
Flat case statement:
|
- Scales well for arrays since loops can be used
|
||||||
-- Can't parameterize
|
- Externals work well, as address ranges can be compared
|
||||||
+ better performance?
|
- Synthesis results show more efficient logic inference
|
||||||
|
|
||||||
Flat 1-hot array then OR reduce:
|
Example:
|
||||||
- Create a bus-wide flat array
|
logic [7:0] out;
|
||||||
eg: 32-bits x N readable registers
|
always_comb begin
|
||||||
- Assign each element:
|
out = '0;
|
||||||
the readback value of each register
|
for(int i=0; i<64; i++) begin
|
||||||
... masked by the register's access strobe
|
if(i == addr) out = data[i];
|
||||||
- I could also stuff an extra bit into the array that denotes the read is valid
|
end
|
||||||
A missed read will OR reduce down to a 0
|
end
|
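A rough Python model of the loop-based mux in the example may help when reasoning about its behaviour. It assumes the same toy dimensions as the snippet (64 entries, `out` defaulting to zero); the function name is hypothetical.

```python
def readback_mux(data, addr):
    """Behavioral model of the always_comb loop: default to 0, then let the
    single matching iteration override the output."""
    out = 0  # mirrors: out = '0;
    for i in range(64):
        if i == addr:
            out = data[i]
    return out
```

An address outside 0..63 matches no iteration, so the default value of 0 is returned.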
||||||
- Finally, OR reduce all the elements in the array down to a flat 32-bit bus
|
|
||||||
- Retiming the large OR fanin can be done by chopping up the array into stages
|
|
||||||
for 2 stages, sqrt(N) gives each stage's fanin size. Round to favor
|
How to implement retiming:
|
||||||
more fanin on 2nd stage
|
Ideally this would partition the design into several equal sub-regions, but
|
||||||
3 stages uses cube-root. etc...
|
with loop structures, this is pretty difficult.
|
||||||
- This has the benefit of re-using the address decode logic.
|
What if instead, it is partitioned into equal address ranges?
|
||||||
synth can choose to replicate logic if fanout is bad
|
|
||||||
|
First stage compares the lower-half of the address bits.
|
||||||
|
Values are assigned to the appropriate output "bin"
|
||||||
|
|
||||||
|
logic [7:0] out[8];
|
||||||
|
always_comb begin
|
||||||
|
for(int i=0; i<8; i++) out[i] = '0;
|
||||||
|
|
||||||
|
for(int i=0; i<64; i++) begin
|
||||||
|
automatic bit [5:0] this_addr = i;
|
||||||
|
|
||||||
|
if(this_addr[2:0] == addr[2:0]) out[this_addr[5:3]] = data[i];
|
||||||
|
end
|
||||||
|
end
|
||||||
|
|
||||||
|
(not showing retiming ff for `out` and `addr`)
|
||||||
|
The second stage muxes down the resulting bins using the high address bits.
|
||||||
|
If the user up-sizes the address bits, the upper bits must be checked to prevent aliasing.
|
||||||
|
Assuming min address bit range is [5:0], but it was padded up to [8:0], do the following:
|
||||||
|
|
||||||
|
logic [7:0] rd_data;
|
||||||
|
always_comb begin
|
||||||
|
if(addr[8:6] != '0) begin
|
||||||
|
// Invalid read range
|
||||||
|
rd_data = '0;
|
||||||
|
end else begin
|
||||||
|
rd_data = out[addr[5:3]];
|
||||||
|
end
|
||||||
|
end
|
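The two stages above can be checked against each other with a small Python model. It is a sketch under the same assumptions as the notes (64 registers, a minimum address range of [5:0] padded up to [8:0]); the function names are invented for illustration and the retiming flops between the stages are not modeled.

```python
def stage1_bins(data, addr):
    """First stage: compare the low address bits [2:0] and drop each matching
    register into the bin selected by its own high bits [5:3]."""
    out = [0] * 8
    for i in range(64):
        if (i & 0x7) == (addr & 0x7):
            out[i >> 3] = data[i]
    return out


def stage2_select(out, addr):
    """Second stage: reject aliased addresses, then pick a bin with [5:3]."""
    if (addr >> 6) & 0x7 != 0:
        return 0  # invalid read range
    return out[(addr >> 3) & 0x7]


def two_stage_readback(data, addr):
    return stage2_select(stage1_bins(data, addr), addr)
```

For every in-range address the composed result equals `data[addr]`, and any address with nonzero bits [8:6] reads back 0 instead of aliasing onto a real register.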
||||||
|
|
||||||
|
Retiming with external blocks
|
||||||
|
One minor downside is the above scheme does not work well for external blocks
|
||||||
|
that span a range of addresses. Depending on the range, it may span multiple
|
||||||
|
retiming bins which complicates how this would be assigned cleanly.
|
||||||
|
This would be complicated even further with arrays of externals since the
|
||||||
|
span of bins could change depending on the iteration.
|
||||||
|
|
||||||
|
Since externals can already be retimed, and large fanin of external blocks
|
||||||
|
is likely less of a concern, implement these as a separate readback mux on
|
||||||
|
the side that does not get retimed at all.
|
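The bin-spanning problem described above can be made concrete with a short Python sketch. It reuses the toy layout from the notes, where the bin index is taken from the address bits above [2:0]; the helper name and the example block sizes are hypothetical.

```python
def bins_spanned(base, size, low_bits=3):
    """Return the retiming bins touched by a block that covers the address
    range [base, base + size - 1]."""
    first = base >> low_bits
    last = (base + size - 1) >> low_bits
    return list(range(first, last + 1))


# An array of 4-entry external blocks with a stride of 6:
element0 = bins_spanned(0, 4)  # stays inside one bin
element1 = bins_spanned(6, 4)  # straddles a bin boundary
```

Two elements of the same array land in a different number of bins, which is why these blocks are easier to handle in a separate mux that is not retimed.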
||||||
|
|
||||||
|
|
||||||
WARNING:
|
WARNING:
|
||||||
@@ -42,8 +91,14 @@ WARNING:
|
|||||||
|
|
||||||
Forwards response strobe back up to cpu interface layer
|
Forwards response strobe back up to cpu interface layer
|
||||||
|
|
||||||
TODO:
|
|
||||||
Don't forget about alias registers here
|
|
||||||
|
|
||||||
TODO:
|
Variables:
|
||||||
Does the endianness the user sets matter anywhere?
|
From decode:
|
||||||
|
decoded_addr
|
||||||
|
decoded_req
|
||||||
|
decoded_req_is_wr
|
||||||
|
|
||||||
|
Response:
|
||||||
|
readback_done
|
||||||
|
readback_err
|
||||||
|
readback_data
|
||||||
|
|||||||
Binary file not shown.
Binary file not shown.
|
Before Width: | Height: | Size: 89 KiB |
242
docs/diagrams/rt-readback-fanin.drawio
Normal file
@@ -0,0 +1,242 @@
|
|||||||
|
<mxfile host="Electron" agent="Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) draw.io/29.0.3 Chrome/140.0.7339.249 Electron/38.7.0 Safari/537.36" version="29.0.3">
|
||||||
|
<diagram name="Page-1" id="2eHshj4V_V3cOmKikpwH">
|
||||||
|
<mxGraphModel dx="542" dy="940" grid="1" gridSize="10" guides="1" tooltips="1" connect="1" arrows="1" fold="1" page="1" pageScale="1" pageWidth="850" pageHeight="1100" math="0" shadow="0">
|
||||||
|
<root>
|
||||||
|
<mxCell id="0" />
|
||||||
|
<mxCell id="1" parent="0" />
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-1" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="570" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="570" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-2" value="" style="shape=trapezoid;perimeter=trapezoidPerimeter;whiteSpace=wrap;html=1;fixedSize=1;rotation=90;size=10.009999999999991;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="369.98" y="590.03" width="80.01" height="19.98" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-3" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="590" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="590" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-4" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="610.02" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="610.02" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-5" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="630" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="630" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-8" value="&lt;reg1&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="560" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-9" value="&lt;reg2&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="580" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-10" value="&lt;reg3&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="600" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-11" value="&lt;reg4&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="620" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-12" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="660" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="660" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-13" value="" style="shape=trapezoid;perimeter=trapezoidPerimeter;whiteSpace=wrap;html=1;fixedSize=1;rotation=90;size=10.009999999999991;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="369.98" y="680.03" width="80.01" height="19.98" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-14" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="680" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="680" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-15" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="700.02" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="700.02" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-16" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="720" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="720" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-17" value="&lt;reg5&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="650" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-18" value="&lt;reg6&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="670" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-19" value="&lt;reg7&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="690" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-20" value="&lt;reg8&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="710" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-21" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="750" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="750" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-22" value="" style="shape=trapezoid;perimeter=trapezoidPerimeter;whiteSpace=wrap;html=1;fixedSize=1;rotation=90;size=10.009999999999991;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="369.98" y="770.03" width="80.01" height="19.98" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-23" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="770" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="770" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-24" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="790.02" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="790.02" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-25" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="810" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="810" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-26" value="&lt;reg9&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="740" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-27" value="&lt;reg10&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="760" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-28" value="&lt;reg11&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="780" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-29" value="&lt;reg12&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="800" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-30" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="840" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="840" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-31" value="" style="shape=trapezoid;perimeter=trapezoidPerimeter;whiteSpace=wrap;html=1;fixedSize=1;rotation=90;size=10.009999999999991;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="369.98" y="860.03" width="80.01" height="19.98" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-32" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="860" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="860" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-33" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="880.02" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="880.02" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-34" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="900" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="900" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-35" value="&lt;reg13&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="830" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-36" value="&lt;reg14&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="850" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-37" value="&lt;reg15&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="870" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-38" value="&lt;reg16&gt;" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="890" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-42" value="" style="endArrow=classic;html=1;rounded=0;exitX=0.5;exitY=0;exitDx=0;exitDy=0;" edge="1" parent="1" source="eVcFmWNxBADlTckHnpvW-2">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="490" y="700" as="sourcePoint" />
|
||||||
|
<mxPoint x="520" y="700" as="targetPoint" />
|
||||||
|
<Array as="points">
|
||||||
|
<mxPoint x="490" y="600" />
|
||||||
|
<mxPoint x="490" y="700" />
|
||||||
|
</Array>
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-43" value="" style="shape=trapezoid;perimeter=trapezoidPerimeter;whiteSpace=wrap;html=1;fixedSize=1;rotation=90;size=10.009999999999991;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="490" y="720.02" width="80.01" height="19.98" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-44" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="420" y="690" as="sourcePoint" />
|
||||||
|
<mxPoint x="520" y="720" as="targetPoint" />
|
||||||
|
<Array as="points">
|
||||||
|
<mxPoint x="480" y="690" />
|
||||||
|
<mxPoint x="480" y="720" />
|
||||||
|
</Array>
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-45" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="420" y="780" as="sourcePoint" />
|
||||||
|
<mxPoint x="520" y="740" as="targetPoint" />
|
||||||
|
<Array as="points">
|
||||||
|
<mxPoint x="480" y="780" />
|
||||||
|
<mxPoint x="480" y="740" />
|
||||||
|
</Array>
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-46" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="420" y="870" as="sourcePoint" />
|
||||||
|
<mxPoint x="520" y="760" as="targetPoint" />
|
||||||
|
<Array as="points">
|
||||||
|
<mxPoint x="490" y="870" />
|
||||||
|
<mxPoint x="490" y="760" />
|
||||||
|
</Array>
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-47" value="" style="endArrow=classic;html=1;rounded=0;" edge="1" parent="1">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="540" y="729.84" as="sourcePoint" />
|
||||||
|
<mxPoint x="560" y="730" as="targetPoint" />
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-48" value="if retime_read_fanin = True" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="369.98" y="920" width="160" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-49" value="addr" style="text;html=1;whiteSpace=wrap;strokeColor=none;fillColor=none;align=center;verticalAlign=middle;rounded=0;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="290" y="510" width="50" height="20" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-50" value="" style="endArrow=classic;html=1;rounded=0;entryX=0;entryY=0.5;entryDx=0;entryDy=0;" edge="1" parent="1" target="eVcFmWNxBADlTckHnpvW-2">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="340" y="519.66" as="sourcePoint" />
|
||||||
|
<mxPoint x="400" y="519.66" as="targetPoint" />
|
||||||
|
<Array as="points">
|
||||||
|
<mxPoint x="410" y="520" />
|
||||||
|
</Array>
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-51" value="" style="endArrow=classic;html=1;rounded=0;entryX=0;entryY=0.5;entryDx=0;entryDy=0;" edge="1" parent="1" target="eVcFmWNxBADlTckHnpvW-43">
|
||||||
|
<mxGeometry width="50" height="50" relative="1" as="geometry">
|
||||||
|
<mxPoint x="410" y="520" as="sourcePoint" />
|
||||||
|
<mxPoint x="530" y="565" as="targetPoint" />
|
||||||
|
<Array as="points">
|
||||||
|
<mxPoint x="530" y="520" />
|
||||||
|
</Array>
|
||||||
|
</mxGeometry>
|
||||||
|
</mxCell>
|
||||||
|
<mxCell id="eVcFmWNxBADlTckHnpvW-40" value="" style="rounded=0;whiteSpace=wrap;html=1;fillColor=#f5f5f5;fontColor=#333333;strokeColor=#666666;" vertex="1" parent="1">
|
||||||
|
<mxGeometry x="440" y="500" width="20" height="420" as="geometry" />
|
||||||
|
</mxCell>
|
||||||
|
</root>
|
||||||
|
</mxGraphModel>
|
||||||
|
</diagram>
|
||||||
|
</mxfile>
|
||||||
BIN
docs/diagrams/rt-readback-fanin.png
Normal file
Binary file not shown.
|
After Width: | Height: | Size: 61 KiB |
@@ -7,7 +7,7 @@ name = "peakrdl-regblock"
|
|||||||
dynamic = ["version"]
|
dynamic = ["version"]
|
||||||
requires-python = ">=3.7"
|
requires-python = ">=3.7"
|
||||||
dependencies = [
|
dependencies = [
|
||||||
"systemrdl-compiler ~= 1.31",
|
"systemrdl-compiler ~= 1.32",
|
||||||
"Jinja2 >= 2.11",
|
"Jinja2 >= 2.11",
|
||||||
]
|
]
|
||||||
|
|
||||||
|
|||||||
@@ -133,8 +133,9 @@ class DecodeLogicGenerator(RDLForLoopGenerator):
|
|||||||
self._array_stride_stack = [] # type: List[int]
|
self._array_stride_stack = [] # type: List[int]
|
||||||
|
|
||||||
def _add_addressablenode_decoding_flags(self, node: 'AddressableNode') -> None:
|
def _add_addressablenode_decoding_flags(self, node: 'AddressableNode') -> None:
|
||||||
addr_str = self._get_address_str(node)
|
addr_lo = self._get_address_str(node)
|
||||||
addr_decoding_str = f"cpuif_req_masked & (cpuif_addr >= {addr_str}) & (cpuif_addr <= {addr_str} + {SVInt(node.size - 1, self.addr_decode.exp.ds.addr_width)})"
|
addr_hi = f"{addr_lo} + {SVInt(node.size - 1, self.addr_decode.exp.ds.addr_width)}"
|
||||||
|
addr_decoding_str = f"cpuif_req_masked & (cpuif_addr >= {addr_lo}) & (cpuif_addr <= {addr_hi})"
|
||||||
rhs = addr_decoding_str
|
rhs = addr_decoding_str
|
||||||
rhs_valid_addr = addr_decoding_str
|
rhs_valid_addr = addr_decoding_str
|
||||||
if isinstance(node, MemNode):
|
if isinstance(node, MemNode):
|
||||||
|
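The generated decode expression in this hunk is an inclusive range comparison between `addr_lo` and `addr_hi`. A minimal Python equivalent, assuming plain integers in place of the `SVInt` literals and a hypothetical function name, looks like this:

```python
def decode_hit(req, addr, addr_lo, size):
    """Model of: cpuif_req_masked & (addr >= addr_lo) & (addr <= addr_hi),
    where addr_hi = addr_lo + size - 1."""
    addr_hi = addr_lo + size - 1
    return bool(req) and addr_lo <= addr <= addr_hi
```

Both bounds are inclusive, so a node of size 0x10 at 0x10 matches 0x10 through 0x1F and nothing else.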
|||||||
@@ -165,12 +165,6 @@ class RegblockExporter:
|
|||||||
# Validate that there are no unsupported constructs
|
# Validate that there are no unsupported constructs
|
||||||
DesignValidator(self).do_validate()
|
DesignValidator(self).do_validate()
|
||||||
|
|
||||||
# Compute readback implementation early.
|
|
||||||
# Readback has the capability to disable retiming if the fanin is tiny.
|
|
||||||
# This affects the rest of the design's implementation, and must be known
|
|
||||||
# before any other templates are rendered
|
|
||||||
readback_implementation = self.readback.get_implementation()
|
|
||||||
|
|
||||||
# Build Jinja template context
|
# Build Jinja template context
|
||||||
context = {
|
context = {
|
||||||
"cpuif": self.cpuif,
|
"cpuif": self.cpuif,
|
||||||
@@ -184,7 +178,7 @@ class RegblockExporter:
|
|||||||
"default_resetsignal_name": self.dereferencer.default_resetsignal_name,
|
"default_resetsignal_name": self.dereferencer.default_resetsignal_name,
|
||||||
"address_decode": self.address_decode,
|
"address_decode": self.address_decode,
|
||||||
"field_logic": self.field_logic,
|
"field_logic": self.field_logic,
|
||||||
"readback_implementation": readback_implementation,
|
"readback_implementation": self.readback.get_implementation(),
|
||||||
"ext_write_acks": ext_write_acks,
|
"ext_write_acks": ext_write_acks,
|
||||||
"ext_read_acks": ext_read_acks,
|
"ext_read_acks": ext_read_acks,
|
||||||
"parity": parity,
|
"parity": parity,
|
||||||
@@ -319,6 +313,10 @@ class DesignState:
|
|||||||
)
|
)
|
||||||
self.cpuif_data_width = 32
|
self.cpuif_data_width = 32
|
||||||
|
|
||||||
|
# Also, to avoid silly edge cases, disable read fanin retiming since
|
||||||
|
# it has little benefit anyways
|
||||||
|
self.retime_read_fanin = False
|
||||||
|
|
||||||
#------------------------
|
#------------------------
|
||||||
# Min address width encloses the total size AND at least 1 useful address bit
|
# Min address width encloses the total size AND at least 1 useful address bit
|
||||||
self.addr_width = max(clog2(self.top_node.size), clog2(self.cpuif_data_width//8) + 1)
|
self.addr_width = max(clog2(self.top_node.size), clog2(self.cpuif_data_width//8) + 1)
|
||||||
@@ -328,6 +326,15 @@ class DesignState:
|
|||||||
msg.fatal(f"User-specified address width shall be greater than or equal to {self.addr_width}.")
|
msg.fatal(f"User-specified address width shall be greater than or equal to {self.addr_width}.")
|
||||||
self.addr_width = user_addr_width
|
self.addr_width = user_addr_width
|
||||||
|
|
||||||
|
if self.retime_read_fanin:
|
||||||
|
# Check if address width is sufficient to even bother with read fanin retiming
|
||||||
|
data_width_bytes = self.cpuif_data_width // 8
|
||||||
|
unused_low_addr_bits = clog2(data_width_bytes)
|
||||||
|
relevant_addr_width = self.addr_width - unused_low_addr_bits
|
||||||
|
if relevant_addr_width < 2:
|
||||||
|
# Unable to partition the address space. Disable retiming
|
||||||
|
self.retime_read_fanin = False
|
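The check added here can be exercised in isolation. The sketch below assumes `clog2` is a ceiling log2 (the local definition is a stand-in, not the exporter's own helper) and wraps the arithmetic in a hypothetical function.

```python
def clog2(n):
    """Ceiling of log2(n) for n >= 1; assumed stand-in for the real helper."""
    return (n - 1).bit_length()


def can_retime_read_fanin(addr_width, cpuif_data_width):
    """Retiming needs at least 2 address bits above the byte-lane bits so
    the address space can be split between the two mux stages."""
    data_width_bytes = cpuif_data_width // 8
    unused_low_addr_bits = clog2(data_width_bytes)
    relevant_addr_width = addr_width - unused_low_addr_bits
    return relevant_addr_width >= 2
```

With a 32-bit bus the two low address bits are unused, so a 3-bit address leaves only one relevant bit and retiming would be disabled, while a 4-bit address leaves two and keeps it enabled.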
||||||
|
|
||||||
@property
|
@property
|
||||||
def min_read_latency(self) -> int:
|
def min_read_latency(self) -> int:
|
||||||
n = 0
|
n = 0
|
||||||
|
|||||||
@@ -30,24 +30,7 @@ module {{ds.module_name}}
|
|||||||
|
|
||||||
logic cpuif_req_masked;
|
logic cpuif_req_masked;
|
||||||
{%- if ds.has_external_addressable %}
|
{%- if ds.has_external_addressable %}
|
||||||
logic external_req;
|
|
||||||
logic external_pending;
|
logic external_pending;
|
||||||
logic external_wr_ack;
|
|
||||||
logic external_rd_ack;
|
|
||||||
always_ff {{get_always_ff_event(cpuif.reset)}} begin
|
|
||||||
if({{get_resetsignal(cpuif.reset)}}) begin
|
|
||||||
external_pending <= '0;
|
|
||||||
end else begin
|
|
||||||
if(external_req & ~external_wr_ack & ~external_rd_ack) external_pending <= '1;
|
|
||||||
else if(external_wr_ack | external_rd_ack) external_pending <= '0;
|
|
||||||
`ifndef SYNTHESIS
|
|
||||||
assert_bad_ext_wr_ack: assert(!external_wr_ack || (external_pending | external_req))
|
|
||||||
else $error("An external wr_ack strobe was asserted when no external request was active");
|
|
||||||
assert_bad_ext_rd_ack: assert(!external_rd_ack || (external_pending | external_req))
|
|
||||||
else $error("An external rd_ack strobe was asserted when no external request was active");
|
|
||||||
`endif
|
|
||||||
end
|
|
||||||
end
|
|
||||||
{%- endif %}
|
{%- endif %}
|
||||||
{% if ds.min_read_latency == ds.min_write_latency %}
|
{% if ds.min_read_latency == ds.min_write_latency %}
|
||||||
// Read & write latencies are balanced. Stalls not required
|
// Read & write latencies are balanced. Stalls not required
|
||||||
@@ -109,11 +92,9 @@ module {{ds.module_name}}
|
|||||||
decoded_reg_strb_t decoded_reg_strb;
|
decoded_reg_strb_t decoded_reg_strb;
|
||||||
logic decoded_err;
|
logic decoded_err;
|
||||||
{%- if ds.has_external_addressable %}
|
{%- if ds.has_external_addressable %}
|
||||||
logic decoded_strb_is_external;
|
logic decoded_req_is_external;
|
||||||
{% endif %}
|
{% endif %}
|
||||||
{%- if ds.has_external_block %}
|
|
||||||
logic [{{cpuif.addr_width-1}}:0] decoded_addr;
|
logic [{{cpuif.addr_width-1}}:0] decoded_addr;
|
||||||
{% endif %}
|
|
||||||
logic decoded_req;
|
logic decoded_req;
|
||||||
logic decoded_req_is_wr;
|
logic decoded_req_is_wr;
|
||||||
logic [{{cpuif.data_width-1}}:0] decoded_wr_data;
|
logic [{{cpuif.data_width-1}}:0] decoded_wr_data;
|
||||||
@@ -147,15 +128,31 @@ module {{ds.module_name}}
|
|||||||
decoded_err = '0;
|
decoded_err = '0;
|
||||||
{%- endif %}
|
{%- endif %}
|
||||||
{%- if ds.has_external_addressable %}
|
{%- if ds.has_external_addressable %}
|
||||||
decoded_strb_is_external = is_external;
|
decoded_req_is_external = is_external;
|
||||||
external_req = is_external;
|
|
||||||
{%- endif %}
|
{%- endif %}
|
||||||
end
|
end
|
||||||
|
|
||||||
|
{%- if ds.has_external_addressable %}
|
||||||
|
logic external_wr_ack;
|
||||||
|
logic external_rd_ack;
|
||||||
|
always_ff {{get_always_ff_event(cpuif.reset)}} begin
|
||||||
|
if({{get_resetsignal(cpuif.reset)}}) begin
|
||||||
|
external_pending <= '0;
|
||||||
|
end else begin
|
||||||
|
if(decoded_req_is_external & ~external_wr_ack & ~external_rd_ack) external_pending <= '1;
|
||||||
|
else if(external_wr_ack | external_rd_ack) external_pending <= '0;
|
||||||
|
`ifndef SYNTHESIS
|
||||||
|
assert_bad_ext_wr_ack: assert(!external_wr_ack || (external_pending | decoded_req_is_external))
|
||||||
|
else $error("An external wr_ack strobe was asserted when no external request was active");
|
||||||
|
assert_bad_ext_rd_ack: assert(!external_rd_ack || (external_pending | decoded_req_is_external))
|
||||||
|
else $error("An external rd_ack strobe was asserted when no external request was active");
|
||||||
|
`endif
|
||||||
|
end
|
||||||
|
end
|
||||||
|
{%- endif %}
|
||||||
|
|
||||||
// Pass down signals to next stage
|
// Pass down signals to next stage
|
||||||
{%- if ds.has_external_block %}
|
|
||||||
assign decoded_addr = cpuif_addr;
|
assign decoded_addr = cpuif_addr;
|
||||||
{% endif %}
|
|
||||||
assign decoded_req = cpuif_req_masked;
|
assign decoded_req = cpuif_req_masked;
|
||||||
assign decoded_req_is_wr = cpuif_req_is_wr;
|
assign decoded_req_is_wr = cpuif_req_is_wr;
|
||||||
assign decoded_wr_data = cpuif_wr_data;
|
assign decoded_wr_data = cpuif_wr_data;
|
||||||
@@ -223,7 +220,7 @@ module {{ds.module_name}}
|
|||||||
{{ext_write_acks.get_implementation()|indent(8)}}
|
{{ext_write_acks.get_implementation()|indent(8)}}
|
||||||
external_wr_ack = wr_ack;
|
external_wr_ack = wr_ack;
|
||||||
end
|
end
|
||||||
assign cpuif_wr_ack = external_wr_ack | (decoded_req & decoded_req_is_wr & ~decoded_strb_is_external);
|
assign cpuif_wr_ack = external_wr_ack | (decoded_req & decoded_req_is_wr & ~decoded_req_is_external);
|
||||||
{%- else %}
|
{%- else %}
|
||||||
assign cpuif_wr_ack = decoded_req & decoded_req_is_wr;
|
assign cpuif_wr_ack = decoded_req & decoded_req_is_wr;
|
||||||
{%- endif %}
|
{%- endif %}
|
||||||
@@ -262,6 +259,22 @@ module {{ds.module_name}}
|
|||||||
{%- endif %}
|
{%- endif %}
|
||||||
{%- endif %}
|
{%- endif %}
|
||||||
|
|
||||||
|
logic [{{cpuif.addr_width-1}}:0] rd_mux_addr;
|
||||||
|
{%- if ds.has_external_addressable %}
|
||||||
|
logic [{{cpuif.addr_width-1}}:0] pending_rd_addr;
|
||||||
|
// Hold read mux address to guarantee it is stable throughout any external accesses
|
||||||
|
always_ff {{get_always_ff_event(cpuif.reset)}} begin
|
||||||
|
if({{get_resetsignal(cpuif.reset)}}) begin
|
||||||
|
pending_rd_addr <= '0;
|
||||||
|
end else begin
|
||||||
|
if(decoded_req) pending_rd_addr <= decoded_addr;
|
||||||
|
end
|
||||||
|
end
|
||||||
|
assign rd_mux_addr = decoded_req ? decoded_addr : pending_rd_addr;
|
||||||
|
{%- else %}
|
||||||
|
assign rd_mux_addr = decoded_addr;
|
||||||
|
{%- endif %}
|
||||||
|
|
||||||
logic readback_err;
|
logic readback_err;
|
||||||
logic readback_done;
|
logic readback_done;
|
||||||
logic [{{cpuif.data_width-1}}:0] readback_data;
|
logic [{{cpuif.data_width-1}}:0] readback_data;
|
||||||
|
|||||||
@@ -1,72 +1 @@
|
|||||||
from typing import TYPE_CHECKING
|
from .readback import Readback
|
||||||
import math
|
|
||||||
|
|
||||||
from .generators import ReadbackAssignmentGenerator
|
|
||||||
|
|
||||||
if TYPE_CHECKING:
|
|
||||||
from ..exporter import RegblockExporter, DesignState
|
|
||||||
from systemrdl.node import AddrmapNode
|
|
||||||
|
|
||||||
class Readback:
|
|
||||||
def __init__(self, exp:'RegblockExporter'):
|
|
||||||
self.exp = exp
|
|
||||||
|
|
||||||
@property
|
|
||||||
def ds(self) -> 'DesignState':
|
|
||||||
return self.exp.ds
|
|
||||||
|
|
||||||
@property
|
|
||||||
def top_node(self) -> 'AddrmapNode':
|
|
||||||
return self.exp.ds.top_node
|
|
||||||
|
|
||||||
def get_implementation(self) -> str:
|
|
||||||
gen = ReadbackAssignmentGenerator(self.exp)
|
|
||||||
array_assignments = gen.get_content(self.top_node)
|
|
||||||
array_size = gen.current_offset
|
|
||||||
|
|
||||||
# Enabling the fanin stage doesn't make sense if readback fanin is
|
|
||||||
# small. This also avoids pesky corner cases
|
|
||||||
if array_size < 4:
|
|
||||||
self.ds.retime_read_fanin = False
|
|
||||||
|
|
||||||
context = {
|
|
||||||
"array_assignments" : array_assignments,
|
|
||||||
"array_size" : array_size,
|
|
||||||
'get_always_ff_event': self.exp.dereferencer.get_always_ff_event,
|
|
||||||
'get_resetsignal': self.exp.dereferencer.get_resetsignal,
|
|
||||||
"cpuif": self.exp.cpuif,
|
|
||||||
"ds": self.ds,
|
|
||||||
}
|
|
||||||
|
|
||||||
if self.ds.retime_read_fanin:
|
|
||||||
# If adding a fanin pipeline stage, goal is to try to
|
|
||||||
# split the fanin path in the middle so that fanin into the stage
|
|
||||||
# and the following are roughly balanced.
|
|
||||||
fanin_target = math.sqrt(array_size)
|
|
||||||
|
|
||||||
# Size of fanin group to consume per fanin element
|
|
||||||
fanin_stride = math.floor(fanin_target)
|
|
||||||
|
|
||||||
# Number of array elements to reduce to.
|
|
||||||
# Round up to an extra element in case there is some residual
|
|
||||||
fanin_array_size = math.ceil(array_size / fanin_stride)
|
|
||||||
|
|
||||||
# leftovers are handled in an extra array element
|
|
||||||
fanin_residual_stride = array_size % fanin_stride
|
|
||||||
|
|
||||||
if fanin_residual_stride != 0:
|
|
||||||
# If there is a partial fanin element, reduce the number of
|
|
||||||
# loops performed in the bulk fanin stage
|
|
||||||
fanin_loop_iter = fanin_array_size - 1
|
|
||||||
else:
|
|
||||||
fanin_loop_iter = fanin_array_size
|
|
||||||
|
|
||||||
context['fanin_stride'] = fanin_stride
|
|
||||||
context['fanin_array_size'] = fanin_array_size
|
|
||||||
context['fanin_residual_stride'] = fanin_residual_stride
|
|
||||||
context['fanin_loop_iter'] = fanin_loop_iter
|
|
||||||
|
|
||||||
template = self.exp.jj_env.get_template(
|
|
||||||
"readback/templates/readback.sv"
|
|
||||||
)
|
|
||||||
return template.render(context)
|
|
||||||
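Review note: the removed `get_implementation` above sized its fanin stage from the square root of the readback array size. The following is a minimal standalone sketch of that arithmetic only, not the exporter's code. The function name `fanin_split` and the dictionary return shape are assumptions made for illustration; the variable names follow the removed lines.

```python
import math
from typing import Dict, Optional


def fanin_split(array_size: int) -> Optional[Dict[str, int]]:
    """Reproduce the fanin sizing arithmetic from the removed code.

    Returns None when the array is too small for retiming to be worthwhile
    (the removed code disabled retiming when array_size < 4).
    """
    if array_size < 4:
        return None

    # Number of array elements consumed per fanin group
    fanin_stride = math.floor(math.sqrt(array_size))

    # Number of groups, rounded up to hold any leftover elements
    fanin_array_size = math.ceil(array_size / fanin_stride)

    # Size of the final, partial group (0 if the split is exact)
    fanin_residual_stride = array_size % fanin_stride

    # The bulk loop skips the partial group when one exists
    if fanin_residual_stride != 0:
        fanin_loop_iter = fanin_array_size - 1
    else:
        fanin_loop_iter = fanin_array_size

    return {
        "fanin_stride": fanin_stride,
        "fanin_array_size": fanin_array_size,
        "fanin_residual_stride": fanin_residual_stride,
        "fanin_loop_iter": fanin_loop_iter,
    }


if __name__ == "__main__":
    print(fanin_split(10))
    print(fanin_split(16))
```

With 10 entries this gives three full groups of 3 plus one residual group of 1; with 16 entries the split is exact.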
|
|||||||
@@ -1,381 +0,0 @@
|
|||||||
from typing import TYPE_CHECKING, List
|
|
||||||
|
|
||||||
from systemrdl.node import RegNode, AddressableNode
|
|
||||||
from systemrdl.walker import WalkerAction
|
|
||||||
|
|
||||||
from ..forloop_generator import RDLForLoopGenerator, LoopBody
|
|
||||||
|
|
||||||
from ..utils import do_bitswap, do_slice
|
|
||||||
|
|
||||||
if TYPE_CHECKING:
|
|
||||||
from ..exporter import RegblockExporter
|
|
||||||
|
|
||||||
class ReadbackLoopBody(LoopBody):
|
|
||||||
def __init__(self, dim: int, iterator: str, i_type: str) -> None:
|
|
||||||
super().__init__(dim, iterator, i_type)
|
|
||||||
self.n_regs = 0
|
|
||||||
|
|
||||||
def __str__(self) -> str:
|
|
||||||
# replace $i#sz token when stringifying
|
|
||||||
s = super().__str__()
|
|
||||||
token = f"${self.iterator}sz"
|
|
||||||
s = s.replace(token, str(self.n_regs))
|
|
||||||
return s
|
|
||||||
|
|
||||||
class ReadbackAssignmentGenerator(RDLForLoopGenerator):
|
|
||||||
i_type = "genvar"
|
|
||||||
loop_body_cls = ReadbackLoopBody
|
|
||||||
|
|
||||||
def __init__(self, exp:'RegblockExporter') -> None:
|
|
||||||
super().__init__()
|
|
||||||
self.exp = exp
|
|
||||||
|
|
||||||
# The readback array collects all possible readback values into a flat
|
|
||||||
# array. The array width is equal to the CPUIF bus width. Each entry in
|
|
||||||
# the array represents an aligned read access.
|
|
||||||
self.current_offset = 0
|
|
||||||
self.start_offset_stack = [] # type: List[int]
|
|
||||||
self.dim_stack = [] # type: List[int]
|
|
||||||
|
|
||||||
@property
|
|
||||||
def current_offset_str(self) -> str:
|
|
||||||
"""
|
|
||||||
Derive a string that represents the current offset being assigned.
|
|
||||||
This consists of:
|
|
||||||
- The current integer offset
|
|
||||||
- multiplied index of any enclosing loop
|
|
||||||
|
|
||||||
The integer offset from "current_offset" is static and is monotonically
|
|
||||||
incremented as more register assignments are processed.
|
|
||||||
|
|
||||||
The component of the offset from loops is added by multiplying the current
|
|
||||||
loop index by the loop size.
|
|
||||||
Since the loop's size is not known at this time, it is emitted as a
|
|
||||||
placeholder token like: $i0sz, $i1sz, $i2sz, etc
|
|
||||||
These tokens can be replaced once the loop body has been completed and the
|
|
||||||
size of its contents is known.
|
|
||||||
"""
|
|
||||||
offset_parts = []
|
|
||||||
for i in range(self._loop_level):
|
|
||||||
offset_parts.append(f"i{i} * $i{i}sz")
|
|
||||||
offset_parts.append(str(self.current_offset))
|
|
||||||
return " + ".join(offset_parts)
|
|
||||||
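Review note: the docstring above describes emitting `$i0sz`-style placeholder tokens and substituting them once a loop's size is known. Below is a small standalone sketch of that two-step idea, assuming plain string replacement as in `ReadbackLoopBody.__str__`. The helper names `offset_expr` and `resolve_loop_size` are hypothetical and do not exist in the repository.

```python
def offset_expr(loop_level: int, current_offset: int) -> str:
    """Build an offset expression with one placeholder per enclosing loop."""
    parts = [f"i{i} * $i{i}sz" for i in range(loop_level)]
    parts.append(str(current_offset))
    return " + ".join(parts)


def resolve_loop_size(text: str, iterator: str, n_regs: int) -> str:
    """Replace one loop's size placeholder once its register count is known."""
    return text.replace(f"${iterator}sz", str(n_regs))


if __name__ == "__main__":
    expr = offset_expr(2, 5)
    print(expr)
    # Inner loops close first, so their sizes are resolved first
    expr = resolve_loop_size(expr, "i1", 3)
    expr = resolve_loop_size(expr, "i0", 12)
    print(expr)
```

Deferring the substitution lets the generator emit assignments before it knows how many registers a loop body will contain.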
|
|
||||||
def push_loop(self, dim: int) -> None:
|
|
||||||
super().push_loop(dim)
|
|
||||||
self.start_offset_stack.append(self.current_offset)
|
|
||||||
self.dim_stack.append(dim)
|
|
||||||
|
|
||||||
def pop_loop(self) -> None:
|
|
||||||
start_offset = self.start_offset_stack.pop()
|
|
||||||
dim = self.dim_stack.pop()
|
|
||||||
|
|
||||||
# Number of registers enclosed in this loop
|
|
||||||
n_regs = self.current_offset - start_offset
|
|
||||||
self.current_loop.n_regs = n_regs # type: ignore
|
|
||||||
|
|
||||||
super().pop_loop()
|
|
||||||
|
|
||||||
# Advance current scope's offset to account for loop's contents
|
|
||||||
self.current_offset = start_offset + n_regs * dim
|
|
||||||
|
|
||||||
|
|
||||||
def enter_AddressableComponent(self, node: 'AddressableNode') -> WalkerAction:
|
|
||||||
super().enter_AddressableComponent(node)
|
|
||||||
|
|
||||||
if node.external and not isinstance(node, RegNode):
|
|
||||||
# External block
|
|
||||||
strb = self.exp.hwif.get_external_rd_ack(node)
|
|
||||||
data = self.exp.hwif.get_external_rd_data(node)
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}] = {strb} ? {data} : '0;")
|
|
||||||
self.current_offset += 1
|
|
||||||
return WalkerAction.SkipDescendants
|
|
||||||
|
|
||||||
return WalkerAction.Continue
|
|
||||||
|
|
||||||
def enter_Reg(self, node: RegNode) -> WalkerAction:
|
|
||||||
if not node.has_sw_readable:
|
|
||||||
return WalkerAction.SkipDescendants
|
|
||||||
|
|
||||||
if node.external:
|
|
||||||
self.process_external_reg(node)
|
|
||||||
return WalkerAction.SkipDescendants
|
|
||||||
|
|
||||||
accesswidth = node.get_property('accesswidth')
|
|
||||||
regwidth = node.get_property('regwidth')
|
|
||||||
rbuf = node.get_property('buffer_reads')
|
|
||||||
if rbuf:
|
|
||||||
trigger = node.get_property('rbuffer_trigger')
|
|
||||||
is_own_trigger = (isinstance(trigger, RegNode) and trigger == node)
|
|
||||||
if is_own_trigger:
|
|
||||||
if accesswidth < regwidth:
|
|
||||||
self.process_buffered_reg_with_bypass(node, regwidth, accesswidth)
|
|
||||||
else:
|
|
||||||
# bypass cancels out. Behaves like a normal reg
|
|
||||||
self.process_reg(node)
|
|
||||||
else:
|
|
||||||
self.process_buffered_reg(node, regwidth, accesswidth)
|
|
||||||
elif accesswidth < regwidth:
|
|
||||||
self.process_wide_reg(node, accesswidth)
|
|
||||||
else:
|
|
||||||
self.process_reg(node)
|
|
||||||
|
|
||||||
return WalkerAction.SkipDescendants
|
|
||||||
|
|
||||||
def process_external_reg(self, node: RegNode) -> None:
|
|
||||||
strb = self.exp.hwif.get_external_rd_ack(node)
|
|
||||||
data = self.exp.hwif.get_external_rd_data(node)
|
|
||||||
regwidth = node.get_property('regwidth')
|
|
||||||
if regwidth < self.exp.cpuif.data_width:
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{self.exp.cpuif.data_width-1}:{regwidth}] = '0;")
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{regwidth-1}:0] = {strb} ? {data} : '0;")
|
|
||||||
else:
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}] = {strb} ? {data} : '0;")
|
|
||||||
|
|
||||||
self.current_offset += 1
|
|
||||||
|
|
||||||
def process_reg(self, node: RegNode) -> None:
|
|
||||||
current_bit = 0
|
|
||||||
rd_strb = f"({self.exp.dereferencer.get_access_strobe(node)} && !decoded_req_is_wr)"
|
|
||||||
# Fields are sorted by ascending low bit
|
|
||||||
for field in node.fields():
|
|
||||||
if not field.is_sw_readable:
|
|
||||||
continue
|
|
||||||
|
|
||||||
# insert reserved assignment before this field if needed
|
|
||||||
if field.low != current_bit:
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{field.low-1}:{current_bit}] = '0;")
|
|
||||||
|
|
||||||
value = self.exp.dereferencer.get_value(field)
|
|
||||||
if field.msb < field.lsb:
|
|
||||||
# Field gets bitswapped since it is in [low:high] orientation
|
|
||||||
value = do_bitswap(value)
|
|
||||||
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{field.high}:{field.low}] = {rd_strb} ? {value} : '0;")
|
|
||||||
|
|
||||||
current_bit = field.high + 1
|
|
||||||
|
|
||||||
# Insert final reserved assignment if needed
|
|
||||||
bus_width = self.exp.cpuif.data_width
|
|
||||||
if current_bit < bus_width:
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{bus_width-1}:{current_bit}] = '0;")
|
|
||||||
|
|
||||||
self.current_offset += 1
|
|
||||||
|
|
||||||
|
|
||||||
def process_buffered_reg(self, node: RegNode, regwidth: int, accesswidth: int) -> None:
|
|
||||||
rbuf = self.exp.read_buffering.get_rbuf_data(node)
|
|
||||||
|
|
||||||
if accesswidth < regwidth:
|
|
||||||
# Is wide reg
|
|
||||||
n_subwords = regwidth // accesswidth
|
|
||||||
astrb = self.exp.dereferencer.get_access_strobe(node, reduce_substrobes=False)
|
|
||||||
for i in range(n_subwords):
|
|
||||||
rd_strb = f"({astrb}[{i}] && !decoded_req_is_wr)"
|
|
||||||
bslice = f"[{(i + 1) * accesswidth - 1}:{i*accesswidth}]"
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}] = {rd_strb} ? {rbuf}{bslice} : '0;")
|
|
||||||
self.current_offset += 1
|
|
||||||
|
|
||||||
else:
|
|
||||||
# Is regular reg
|
|
||||||
rd_strb = f"({self.exp.dereferencer.get_access_strobe(node)} && !decoded_req_is_wr)"
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{regwidth-1}:0] = {rd_strb} ? {rbuf} : '0;")
|
|
||||||
|
|
||||||
bus_width = self.exp.cpuif.data_width
|
|
||||||
if regwidth < bus_width:
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{bus_width-1}:{regwidth}] = '0;")
|
|
||||||
|
|
||||||
self.current_offset += 1
|
|
||||||
|
|
||||||
|
|
||||||
def process_buffered_reg_with_bypass(self, node: RegNode, regwidth: int, accesswidth: int) -> None:
|
|
||||||
"""
|
|
||||||
Special case for a buffered register when the register is its own trigger.
|
|
||||||
First sub-word shall bypass the read buffer and assign directly.
|
|
||||||
Subsequent subwords assign from the buffer.
|
|
||||||
Caller guarantees this is a wide reg
|
|
||||||
"""
|
|
||||||
astrb = self.exp.dereferencer.get_access_strobe(node, reduce_substrobes=False)
|
|
||||||
|
|
||||||
# Generate assignments for first sub-word
|
|
||||||
bidx = 0
|
|
||||||
rd_strb = f"({astrb}[0] && !decoded_req_is_wr)"
|
|
||||||
for field in node.fields():
|
|
||||||
if not field.is_sw_readable:
|
|
||||||
continue
|
|
||||||
|
|
||||||
if field.low >= accesswidth:
|
|
||||||
# field is not in this subword.
|
|
||||||
break
|
|
||||||
|
|
||||||
if bidx < field.low:
|
|
||||||
# insert padding before
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{field.low - 1}:{bidx}] = '0;")
|
|
||||||
|
|
||||||
if field.high >= accesswidth:
|
|
||||||
# field gets truncated
|
|
||||||
r_low = field.low
|
|
||||||
r_high = accesswidth - 1
|
|
||||||
f_low = 0
|
|
||||||
f_high = accesswidth - 1 - field.low
|
|
||||||
|
|
||||||
if field.msb < field.lsb:
|
|
||||||
# Field gets bitswapped since it is in [low:high] orientation
|
|
||||||
# Mirror the low/high indexes
|
|
||||||
f_low = field.width - 1 - f_low
|
|
||||||
f_high = field.width - 1 - f_high
|
|
||||||
f_low, f_high = f_high, f_low
|
|
||||||
value = do_bitswap(do_slice(self.exp.dereferencer.get_value(field), f_high, f_low))
|
|
||||||
else:
|
|
||||||
value = do_slice(self.exp.dereferencer.get_value(field), f_high, f_low)
|
|
||||||
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{r_high}:{r_low}] = {rd_strb} ? {value} : '0;")
|
|
||||||
bidx = accesswidth
|
|
||||||
else:
|
|
||||||
# field fits in subword
|
|
||||||
value = self.exp.dereferencer.get_value(field)
|
|
||||||
if field.msb < field.lsb:
|
|
||||||
# Field gets bitswapped since it is in [low:high] orientation
|
|
||||||
value = do_bitswap(value)
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{field.high}:{field.low}] = {rd_strb} ? {value} : '0;")
|
|
||||||
bidx = field.high + 1
|
|
||||||
|
|
||||||
# pad up remainder of subword
|
|
||||||
if bidx < accesswidth:
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{accesswidth-1}:{bidx}] = '0;")
|
|
||||||
self.current_offset += 1
|
|
||||||
|
|
||||||
# Assign remainder of subwords from read buffer
|
|
||||||
n_subwords = regwidth // accesswidth
|
|
||||||
rbuf = self.exp.read_buffering.get_rbuf_data(node)
|
|
||||||
for i in range(1, n_subwords):
|
|
||||||
rd_strb = f"({astrb}[{i}] && !decoded_req_is_wr)"
|
|
||||||
bslice = f"[{(i + 1) * accesswidth - 1}:{i*accesswidth}]"
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}] = {rd_strb} ? {rbuf}{bslice} : '0;")
|
|
||||||
self.current_offset += 1
|
|
||||||
|
|
||||||
def process_wide_reg(self, node: RegNode, accesswidth: int) -> None:
|
|
||||||
bus_width = self.exp.cpuif.data_width
|
|
||||||
|
|
||||||
subword_idx = 0
|
|
||||||
current_bit = 0 # Bit-offset within the wide register
|
|
||||||
access_strb = self.exp.dereferencer.get_access_strobe(node, reduce_substrobes=False)
|
|
||||||
# Fields are sorted by ascending low bit
|
|
||||||
for field in node.fields():
|
|
||||||
if not field.is_sw_readable:
|
|
||||||
continue
|
|
||||||
|
|
||||||
# insert zero assignment before this field if needed
|
|
||||||
if field.low >= accesswidth*(subword_idx+1):
|
|
||||||
# field does not start in this subword
|
|
||||||
if current_bit > accesswidth * subword_idx:
|
|
||||||
# current subword had content. Assign remainder
|
|
||||||
low = current_bit % accesswidth
|
|
||||||
high = bus_width - 1
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{high}:{low}] = '0;")
|
|
||||||
self.current_offset += 1
|
|
||||||
|
|
||||||
# Advance to subword that contains the start of the field
|
|
||||||
subword_idx = field.low // accesswidth
|
|
||||||
current_bit = accesswidth * subword_idx
|
|
||||||
|
|
||||||
if current_bit != field.low:
|
|
||||||
# assign zero up to start of this field
|
|
||||||
low = current_bit % accesswidth
|
|
||||||
high = (field.low % accesswidth) - 1
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{high}:{low}] = '0;")
|
|
||||||
current_bit = field.low
|
|
||||||
|
|
||||||
|
|
||||||
# Assign field
|
|
||||||
# loop until the entire field's assignments have been generated
|
|
||||||
field_pos = field.low
|
|
||||||
while current_bit <= field.high:
|
|
||||||
# Assign the field
|
|
||||||
rd_strb = f"({access_strb}[{subword_idx}] && !decoded_req_is_wr)"
|
|
||||||
if (field_pos == field.low) and (field.high < accesswidth*(subword_idx+1)):
|
|
||||||
# entire field fits into this subword
|
|
||||||
low = field.low - accesswidth * subword_idx
|
|
||||||
high = field.high - accesswidth * subword_idx
|
|
||||||
|
|
||||||
value = self.exp.dereferencer.get_value(field)
|
|
||||||
if field.msb < field.lsb:
|
|
||||||
# Field gets bitswapped since it is in [low:high] orientation
|
|
||||||
value = do_bitswap(value)
|
|
||||||
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{high}:{low}] = {rd_strb} ? {value} : '0;")
|
|
||||||
|
|
||||||
current_bit = field.high + 1
|
|
||||||
|
|
||||||
if current_bit == accesswidth*(subword_idx+1):
|
|
||||||
# Field ends at the subword boundary
|
|
||||||
subword_idx += 1
|
|
||||||
self.current_offset += 1
|
|
||||||
elif field.high >= accesswidth*(subword_idx+1):
|
|
||||||
# only a subset of the field can fit into this subword
|
|
||||||
# high end gets truncated
|
|
||||||
|
|
||||||
# assignment slice
|
|
||||||
r_low = field_pos - accesswidth * subword_idx
|
|
||||||
r_high = accesswidth - 1
|
|
||||||
|
|
||||||
# field slice
|
|
||||||
f_low = field_pos - field.low
|
|
||||||
f_high = accesswidth * (subword_idx + 1) - 1 - field.low
|
|
||||||
|
|
||||||
if field.msb < field.lsb:
|
|
||||||
# Field gets bitswapped since it is in [low:high] orientation
|
|
||||||
# Mirror the low/high indexes
|
|
||||||
f_low = field.width - 1 - f_low
|
|
||||||
f_high = field.width - 1 - f_high
|
|
||||||
f_low, f_high = f_high, f_low
|
|
||||||
|
|
||||||
value = do_bitswap(do_slice(self.exp.dereferencer.get_value(field), f_high, f_low))
|
|
||||||
else:
|
|
||||||
value = do_slice(self.exp.dereferencer.get_value(field), f_high, f_low)
|
|
||||||
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{r_high}:{r_low}] = {rd_strb} ? {value} : '0;")
|
|
||||||
|
|
||||||
# advance to the next subword
|
|
||||||
subword_idx += 1
|
|
||||||
current_bit = accesswidth * subword_idx
|
|
||||||
field_pos = current_bit
|
|
||||||
self.current_offset += 1
|
|
||||||
else:
|
|
||||||
# only a subset of the field can fit into this subword
|
|
||||||
# finish field
|
|
||||||
|
|
||||||
# assignment slice
|
|
||||||
r_low = field_pos - accesswidth * subword_idx
|
|
||||||
r_high = field.high - accesswidth * subword_idx
|
|
||||||
|
|
||||||
# field slice
|
|
||||||
f_low = field_pos - field.low
|
|
||||||
f_high = field.high - field.low
|
|
||||||
|
|
||||||
if field.msb < field.lsb:
|
|
||||||
# Field gets bitswapped since it is in [low:high] orientation
|
|
||||||
# Mirror the low/high indexes
|
|
||||||
f_low = field.width - 1 - f_low
|
|
||||||
f_high = field.width - 1 - f_high
|
|
||||||
f_low, f_high = f_high, f_low
|
|
||||||
|
|
||||||
value = do_bitswap(do_slice(self.exp.dereferencer.get_value(field), f_high, f_low))
|
|
||||||
else:
|
|
||||||
value = do_slice(self.exp.dereferencer.get_value(field), f_high, f_low)
|
|
||||||
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{r_high}:{r_low}] = {rd_strb} ? {value} : '0;")
|
|
||||||
|
|
||||||
current_bit = field.high + 1
|
|
||||||
if current_bit == accesswidth*(subword_idx+1):
|
|
||||||
# Field ends at the subword boundary
|
|
||||||
subword_idx += 1
|
|
||||||
self.current_offset += 1
|
|
||||||
|
|
||||||
# insert zero assignment after the last field if needed
|
|
||||||
if current_bit > accesswidth * subword_idx:
|
|
||||||
# current subword had content. Assign remainder
|
|
||||||
low = current_bit % accesswidth
|
|
||||||
high = bus_width - 1
|
|
||||||
self.add_content(f"assign readback_array[{self.current_offset_str}][{high}:{low}] = '0;")
|
|
||||||
self.current_offset += 1
|
|
||||||
101
src/peakrdl_regblock/readback/readback.py
Normal file
@@ -0,0 +1,101 @@
|
|||||||
|
from typing import TYPE_CHECKING
|
||||||
|
|
||||||
|
from .readback_mux_generator import ReadbackMuxGenerator, RetimedReadbackMuxGenerator, RetimedExtBlockReadbackMuxGenerator
|
||||||
|
from ..utils import clog2
|
||||||
|
|
||||||
|
if TYPE_CHECKING:
|
||||||
|
from ..exporter import RegblockExporter, DesignState
|
||||||
|
|
||||||
|
class Readback:
|
||||||
|
def __init__(self, exp:'RegblockExporter'):
|
||||||
|
self.exp = exp
|
||||||
|
|
||||||
|
@property
|
||||||
|
def ds(self) -> 'DesignState':
|
||||||
|
return self.exp.ds
|
||||||
|
|
||||||
|
def get_implementation(self) -> str:
|
||||||
|
if self.ds.retime_read_fanin:
|
||||||
|
return self.get_2stage_implementation()
|
||||||
|
else:
|
||||||
|
# No retiming
|
||||||
|
return self.get_1stage_implementation()
|
||||||
|
|
||||||
|
|
||||||
|
def get_empty_implementation(self) -> str:
|
||||||
|
"""
|
||||||
|
Readback implementation when there are no readable registers
|
||||||
|
"""
|
||||||
|
context = {
|
||||||
|
"ds": self.ds,
|
||||||
|
}
|
||||||
|
template = self.exp.jj_env.get_template(
|
||||||
|
"readback/templates/empty_readback.sv"
|
||||||
|
)
|
||||||
|
return template.render(context)
|
||||||
|
|
||||||
|
|
||||||
|
def get_1stage_implementation(self) -> str:
|
||||||
|
"""
|
||||||
|
Implements readback without any retiming
|
||||||
|
"""
|
||||||
|
gen = ReadbackMuxGenerator(self.exp)
|
||||||
|
mux_impl = gen.get_content(self.ds.top_node)
|
||||||
|
|
||||||
|
if not mux_impl:
|
||||||
|
# Design has no readable registers.
|
||||||
|
return self.get_empty_implementation()
|
||||||
|
|
||||||
|
context = {
|
||||||
|
"readback_mux": mux_impl,
|
||||||
|
"cpuif": self.exp.cpuif,
|
||||||
|
"ds": self.ds,
|
||||||
|
}
|
||||||
|
template = self.exp.jj_env.get_template(
|
||||||
|
"readback/templates/readback_no_rt.sv"
|
||||||
|
)
|
||||||
|
|
||||||
|
return template.render(context)
|
||||||
|
|
||||||
|
|
||||||
|
def get_2stage_implementation(self) -> str:
|
||||||
|
"""
|
||||||
|
Implements readback that is retimed to 2 stages
|
||||||
|
"""
|
||||||
|
# Split the decode to happen in two stages, using low address bits first
|
||||||
|
# then high address bits.
|
||||||
|
# Split in the middle of the "relevant" address bits - the ones that
|
||||||
|
# actually contribute to addressing in the regblock
|
||||||
|
unused_low_addr_bits = clog2(self.exp.cpuif.data_width_bytes)
|
||||||
|
relevant_addr_width = self.ds.addr_width - unused_low_addr_bits
|
||||||
|
low_addr_width = (relevant_addr_width // 2) + unused_low_addr_bits
|
||||||
|
high_addr_width = self.ds.addr_width - low_addr_width
|
||||||
|
|
||||||
|
mux_gen = RetimedReadbackMuxGenerator(self.exp)
|
||||||
|
mux_impl = mux_gen.get_content(self.ds.top_node)
|
||||||
|
|
||||||
|
if not mux_impl:
|
||||||
|
# Design has no readable addresses.
|
||||||
|
return self.get_empty_implementation()
|
||||||
|
|
||||||
|
if self.ds.has_external_block:
|
||||||
|
ext_mux_gen = RetimedExtBlockReadbackMuxGenerator(self.exp)
|
||||||
|
ext_mux_impl = ext_mux_gen.get_content(self.ds.top_node)
|
||||||
|
else:
|
||||||
|
ext_mux_impl = None
|
||||||
|
|
||||||
|
context = {
|
||||||
|
"readback_mux": mux_impl,
|
||||||
|
"ext_block_readback_mux": ext_mux_impl,
|
||||||
|
"cpuif": self.exp.cpuif,
|
||||||
|
"ds": self.ds,
|
||||||
|
"low_addr_width": low_addr_width,
|
||||||
|
"high_addr_width": high_addr_width,
|
||||||
|
'get_always_ff_event': self.exp.dereferencer.get_always_ff_event,
|
||||||
|
'get_resetsignal': self.exp.dereferencer.get_resetsignal,
|
||||||
|
}
|
||||||
|
template = self.exp.jj_env.get_template(
|
||||||
|
"readback/templates/readback_with_rt.sv"
|
||||||
|
)
|
||||||
|
|
||||||
|
return template.render(context)
|
||||||
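Review note: `get_2stage_implementation` above splits the address decode across two stages at the midpoint of the "relevant" address bits. The sketch below reproduces only that arithmetic so the split can be checked by hand. The local `clog2` is a stand-in written for this example (the real helper is imported from `..utils` and is not shown in this diff), and `split_addr_width` is a hypothetical name.

```python
from typing import Tuple


def clog2(n: int) -> int:
    """Ceiling of log2(n) for n >= 1. Stand-in for the repository's helper."""
    return (n - 1).bit_length()


def split_addr_width(addr_width: int, data_width_bytes: int) -> Tuple[int, int]:
    """Return (low_addr_width, high_addr_width) following the diff's arithmetic."""
    # Low bits that only select a byte within one bus word
    unused_low_addr_bits = clog2(data_width_bytes)
    # Bits that actually distinguish one register address from another
    relevant_addr_width = addr_width - unused_low_addr_bits
    # First stage decodes the lower half of the relevant bits
    low_addr_width = (relevant_addr_width // 2) + unused_low_addr_bits
    # Second stage decodes whatever remains
    high_addr_width = addr_width - low_addr_width
    return low_addr_width, high_addr_width


if __name__ == "__main__":
    print(split_addr_width(12, 4))
    print(split_addr_width(9, 4))
```

For a 12-bit address on a 32-bit bus, 2 bits are unused, leaving 10 relevant bits that are divided 5 and 5 between the stages; the low width is reported as 7 because it still includes the 2 unused bits.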
361
src/peakrdl_regblock/readback/readback_mux_generator.py
Normal file
@@ -0,0 +1,361 @@
|
|||||||
|
from typing import TYPE_CHECKING, List, Sequence, Optional
|
||||||
|
|
||||||
|
from systemrdl.node import RegNode, AddressableNode, FieldNode
|
||||||
|
from systemrdl.walker import WalkerAction
|
||||||
|
|
||||||
|
from ..forloop_generator import RDLForLoopGenerator
|
||||||
|
from ..utils import SVInt, do_bitswap, do_slice
|
||||||
|
|
||||||
|
if TYPE_CHECKING:
|
||||||
|
from ..exporter import DesignState, RegblockExporter
|
||||||
|
|
||||||
|
class ReadbackMuxGenerator(RDLForLoopGenerator):
|
||||||
|
def __init__(self, exp: 'RegblockExporter') -> None:
|
||||||
|
super().__init__()
|
||||||
|
|
||||||
|
self.exp = exp
|
||||||
|
|
||||||
|
# List of address strides for each dimension
|
||||||
|
self._array_stride_stack: List[int] = []
|
||||||
|
|
||||||
|
@property
|
||||||
|
def ds(self) -> 'DesignState':
|
||||||
|
return self.exp.ds
|
||||||
|
|
||||||
|
|
||||||
|
def enter_AddressableComponent(self, node: AddressableNode) -> Optional[WalkerAction]:
|
||||||
|
super().enter_AddressableComponent(node)
|
||||||
|
|
||||||
|
if node.array_dimensions:
|
||||||
|
assert node.array_stride is not None
|
||||||
|
# Collect strides for each array dimension
|
||||||
|
current_stride = node.array_stride
|
||||||
|
strides = []
|
||||||
|
for dim in reversed(node.array_dimensions):
|
||||||
|
strides.append(current_stride)
|
||||||
|
current_stride *= dim
|
||||||
|
strides.reverse()
|
||||||
|
self._array_stride_stack.extend(strides)
|
||||||
|
|
||||||
|
if node.external and not isinstance(node, RegNode):
|
||||||
|
# Is an external block
|
||||||
|
self.process_external_block(node)
|
||||||
|
return WalkerAction.SkipDescendants
|
||||||
|
|
||||||
|
return WalkerAction.Continue
|
||||||
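Review note: `enter_AddressableComponent` above derives one address stride per array dimension, with the innermost dimension using the node's `array_stride`. The following standalone sketch mirrors that loop on plain integers so the ordering is easy to verify. The function names `dimension_strides` and `element_address` are hypothetical and not part of the exporter.

```python
from typing import List


def dimension_strides(array_stride: int, array_dimensions: List[int]) -> List[int]:
    """Return strides ordered outermost dimension first, as in the diff."""
    strides = []
    current_stride = array_stride
    # Walk from the innermost dimension outward, growing the stride each step
    for dim in reversed(array_dimensions):
        strides.append(current_stride)
        current_stride *= dim
    strides.reverse()
    return strides


def element_address(base: int, strides: List[int], indexes: List[int]) -> int:
    """Numeric equivalent of the 'base + i0 * stride0 + i1 * stride1' expression."""
    return base + sum(i * s for i, s in zip(indexes, strides))


if __name__ == "__main__":
    strides = dimension_strides(4, [2, 3])
    print(strides)
    print(hex(element_address(0x100, strides, [1, 2])))
```

For a `[2][3]` array with a 4-byte element stride, the outer index advances by 12 bytes and the inner index by 4 bytes.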
|
|
||||||
|
|
||||||
|
def process_external_block(self, node: AddressableNode) -> None:
|
||||||
|
addr_lo = self._get_address_str(node)
|
||||||
|
addr_hi = f"{addr_lo} + {SVInt(node.size - 1, self.exp.ds.addr_width)}"
|
||||||
|
self.add_content(f"if((rd_mux_addr >= {addr_lo}) && (rd_mux_addr <= {addr_hi})) begin")
|
||||||
|
data = self.exp.hwif.get_external_rd_data(node)
|
||||||
|
self.add_content(f" readback_data_var = {data};")
|
||||||
|
self.add_content("end")
|
||||||
|
|
||||||
|
|
||||||
|
def enter_Reg(self, node: RegNode) -> WalkerAction:
|
||||||
|
fields = node.fields(sw_readable_only=True)
|
||||||
|
if not fields:
|
||||||
|
# Reg has no readable fields
|
||||||
|
return WalkerAction.SkipDescendants
|
||||||
|
|
||||||
|
if node.external:
|
||||||
|
self.process_external_reg(node)
|
||||||
|
return WalkerAction.SkipDescendants
|
||||||
|
|
||||||
|
accesswidth = node.get_property('accesswidth')
|
||||||
|
regwidth = node.get_property('regwidth')
|
||||||
|
rbuf = node.get_property('buffer_reads')
|
||||||
|
|
||||||
|
if rbuf:
|
||||||
|
trigger = node.get_property('rbuffer_trigger')
|
||||||
|
is_own_trigger = (isinstance(trigger, RegNode) and trigger == node)
|
||||||
|
if is_own_trigger:
|
||||||
|
if accesswidth < regwidth:
|
||||||
|
self.process_wide_buffered_reg_with_bypass(node, fields, regwidth, accesswidth)
|
||||||
|
else:
|
||||||
|
# bypass cancels out. Behaves like a normal reg
|
||||||
|
self.process_reg(node, fields)
|
||||||
|
else:
|
||||||
|
self.process_buffered_reg(node, regwidth, accesswidth)
|
||||||
|
elif accesswidth < regwidth:
|
||||||
|
self.process_wide_reg(node, fields, regwidth, accesswidth)
|
||||||
|
else:
|
||||||
|
self.process_reg(node, fields)
|
||||||
|
|
||||||
|
return WalkerAction.SkipDescendants
|
||||||
|
|
||||||
|
|
||||||
|
def _get_address_str(self, node: AddressableNode, subword_offset: int=0) -> str:
|
||||||
|
expr_width = self.ds.addr_width
|
||||||
|
a = str(SVInt(
|
||||||
|
node.raw_absolute_address - self.ds.top_node.raw_absolute_address + subword_offset,
|
||||||
|
expr_width
|
||||||
|
))
|
||||||
|
for i, stride in enumerate(self._array_stride_stack):
|
||||||
|
a += f" + ({expr_width})'(i{i}) * {SVInt(stride, expr_width)}"
|
||||||
|
return a
|
||||||
|
|
||||||
|
|
||||||
|
def get_addr_compare_conditional(self, addr: str) -> str:
|
||||||
|
return f"rd_mux_addr == {addr}"
|
||||||
|
|
||||||
|
def get_readback_data_var(self, addr: str) -> str:
|
||||||
|
return "readback_data_var"
|
||||||
|
|
||||||
|
def process_external_reg(self, node: RegNode) -> None:
|
||||||
|
accesswidth = node.get_property('accesswidth')
|
||||||
|
regwidth = node.get_property('regwidth')
|
||||||
|
data = self.exp.hwif.get_external_rd_data(node)
|
||||||
|
|
||||||
|
if regwidth > accesswidth:
|
||||||
|
# Is wide reg.
|
||||||
|
# The retiming scheme requires singular address comparisons rather than
|
||||||
|
# ranges. To support this, unroll the subwords
|
||||||
|
n_subwords = regwidth // accesswidth
|
||||||
|
subword_stride = accesswidth // 8
|
||||||
|
for subword_idx in range(n_subwords):
|
||||||
|
addr = self._get_address_str(node, subword_offset=subword_idx*subword_stride)
|
||||||
|
conditional = self.get_addr_compare_conditional(addr)
|
||||||
|
var = self.get_readback_data_var(addr)
|
||||||
|
self.add_content(f"if({conditional}) begin")
|
||||||
|
self.add_content(f" {var} = {data};")
|
||||||
|
self.add_content("end")
|
||||||
|
else:
|
||||||
|
addr = self._get_address_str(node)
|
||||||
|
conditional = self.get_addr_compare_conditional(addr)
|
||||||
|
var = self.get_readback_data_var(addr)
|
||||||
|
self.add_content(f"if({conditional}) begin")
|
||||||
|
if regwidth < self.exp.cpuif.data_width:
|
||||||
|
self.add_content(f" {var}[{regwidth-1}:0] = {data};")
|
||||||
|
else:
|
||||||
|
self.add_content(f" {var} = {data};")
|
||||||
|
self.add_content("end")
|
||||||
|
|
||||||
|
|
||||||
|
def process_reg(self, node: RegNode, fields: Sequence[FieldNode]) -> None:
|
||||||
|
"""
|
||||||
|
Process a regular register
|
||||||
|
"""
|
||||||
|
addr = self._get_address_str(node)
|
||||||
|
conditional = self.get_addr_compare_conditional(addr)
|
||||||
|
var = self.get_readback_data_var(addr)
|
||||||
|
self.add_content(f"if({conditional}) begin")
|
||||||
|
|
||||||
|
for field in fields:
|
||||||
|
value = self.exp.dereferencer.get_value(field)
|
||||||
|
if field.msb < field.lsb:
|
||||||
|
# Field gets bitswapped since it is in [low:high] orientation
|
||||||
|
value = do_bitswap(value)
|
||||||
|
|
||||||
|
if field.width == 1:
|
||||||
|
self.add_content(f" {var}[{field.low}] = {value};")
|
||||||
|
else:
|
||||||
|
self.add_content(f" {var}[{field.high}:{field.low}] = {value};")
|
||||||
|
|
||||||
|
self.add_content("end")
|
||||||
|
|
||||||
|
|
||||||
|
def process_buffered_reg(self, node: RegNode, regwidth: int, accesswidth: int) -> None:
|
||||||
|
"""
|
||||||
|
Process a register which is fully buffered
|
||||||
|
"""
|
||||||
|
rbuf = self.exp.read_buffering.get_rbuf_data(node)
|
||||||
|
|
||||||
|
if accesswidth < regwidth:
|
||||||
|
# Is wide reg
|
||||||
|
n_subwords = regwidth // accesswidth
|
||||||
|
subword_stride = accesswidth // 8
|
||||||
|
for subword_idx in range(n_subwords):
|
||||||
|
addr = self._get_address_str(node, subword_offset=subword_idx*subword_stride)
|
||||||
|
conditional = self.get_addr_compare_conditional(addr)
|
||||||
|
var = self.get_readback_data_var(addr)
|
||||||
|
bslice = f"[{(subword_idx + 1) * accesswidth - 1}:{subword_idx*accesswidth}]"
|
||||||
|
self.add_content(f"if({conditional}) begin")
|
||||||
|
self.add_content(f" {var} = {rbuf}{bslice};")
|
||||||
|
self.add_content("end")
|
||||||
|
else:
|
||||||
|
# Is regular reg
|
||||||
|
addr = self._get_address_str(node)
|
||||||
|
conditional = self.get_addr_compare_conditional(addr)
|
||||||
|
var = self.get_readback_data_var(addr)
|
||||||
|
self.add_content(f"if({conditional}) begin")
|
||||||
|
self.add_content(f" {var}[{regwidth-1}:0] = {rbuf};")
|
||||||
|
self.add_content("end")
|
||||||
|
|
||||||
|
|
||||||
|
def process_wide_buffered_reg_with_bypass(self, node: RegNode, fields: Sequence[FieldNode], regwidth: int, accesswidth: int) -> None:
|
||||||
|
"""
|
||||||
|
Special case for a wide buffered register where the register is its own
|
||||||
|
trigger.
|
||||||
|
|
||||||
|
First sub-word shall bypass the read buffer and assign directly.
|
||||||
|
Subsequent subwords assign from the buffer.
|
||||||
|
"""
|
||||||
|
|
||||||
|
# Generate assignments for first sub-word
|
||||||
|
subword_assignments = self.get_wide_reg_subword_assignments(node, fields, regwidth, accesswidth)
|
||||||
|
if subword_assignments[0]:
|
||||||
|
addr = self._get_address_str(node, subword_offset=0)
|
||||||
|
conditional = self.get_addr_compare_conditional(addr)
|
||||||
|
self.add_content(f"if({conditional}) begin")
|
||||||
|
for assignment in subword_assignments[0]:
|
||||||
|
self.add_content(" " + assignment)
|
||||||
|
self.add_content("end")
|
||||||
|
|
||||||
|
# Assign remainder of subwords from read buffer
|
||||||
|
n_subwords = regwidth // accesswidth
|
||||||
|
subword_stride = accesswidth // 8
|
||||||
|
rbuf = self.exp.read_buffering.get_rbuf_data(node)
|
||||||
|
for subword_idx in range(1, n_subwords):
|
||||||
|
addr = self._get_address_str(node, subword_offset=subword_idx*subword_stride)
|
||||||
|
bslice = f"[{(subword_idx + 1) * accesswidth - 1}:{subword_idx*accesswidth}]"
|
||||||
|
conditional = self.get_addr_compare_conditional(addr)
|
||||||
|
var = self.get_readback_data_var(addr)
|
||||||
|
self.add_content(f"if({conditional}) begin")
|
||||||
|
self.add_content(f" {var} = {rbuf}{bslice};")
|
||||||
|
self.add_content("end")
|
||||||
|
|
||||||
|
|
||||||
|
def get_wide_reg_subword_assignments(self, node: RegNode, fields: Sequence[FieldNode], regwidth: int, accesswidth: int) -> List[List[str]]:
|
||||||
|
"""
|
||||||
|
Get a list of assignments for each subword
|
||||||
|
|
||||||
|
Returns a 2d array where the first dimension indicates the subword index.
|
||||||
|
The next dimension is the list of assignments
|
||||||
|
"""
|
||||||
|
n_subwords = regwidth // accesswidth
|
||||||
|
subword_stride = accesswidth // 8
|
||||||
|
subword_assignments: List[List[str]] = [[] for _ in range(n_subwords)]
|
||||||
|
|
||||||
|
# Fields are sorted by ascending low bit
|
||||||
|
for field in fields:
|
||||||
|
subword_idx = field.low // accesswidth
|
||||||
|
|
||||||
|
if field.high < accesswidth * (subword_idx + 1):
|
||||||
|
# entire field fits into this subword
|
||||||
|
low = field.low - accesswidth * subword_idx
|
||||||
|
high = field.high - accesswidth * subword_idx
|
||||||
|
|
||||||
|
value = self.exp.dereferencer.get_value(field)
|
||||||
|
if field.msb < field.lsb:
|
||||||
|
# Field gets bitswapped since it is in [low:high] orientation
|
||||||
|
value = do_bitswap(value)
|
||||||
|
|
||||||
|
addr = self._get_address_str(node, subword_offset=subword_idx*subword_stride)
|
||||||
|
var = self.get_readback_data_var(addr)
|
||||||
|
subword_assignments[subword_idx].append(f"{var}[{high}:{low}] = {value};")
|
||||||
|
|
||||||
|
else:
|
||||||
|
# Field spans multiple sub-words
|
||||||
|
# loop through subword indexes until the entire field has been assigned
|
||||||
|
while field.high >= accesswidth * subword_idx:
|
||||||
|
# Allowable field window for this subword
|
||||||
|
subword_low = accesswidth * subword_idx
|
||||||
|
subword_high = subword_low + accesswidth - 1
|
||||||
|
|
||||||
|
# field slice (relative to reg)
|
||||||
|
f_low = max(subword_low, field.low)
|
||||||
|
f_high = min(subword_high, field.high)
|
||||||
|
|
||||||
|
# assignment slice
|
||||||
|
r_low = f_low - accesswidth * subword_idx
|
||||||
|
r_high = f_high - accesswidth * subword_idx
|
||||||
|
|
||||||
|
# Adjust to be relative to field
|
||||||
|
f_low -= field.low
|
||||||
|
f_high -= field.low
|
||||||
|
|
||||||
|
if field.msb < field.lsb:
|
||||||
|
# Field gets bitswapped since it is in [low:high] orientation
|
||||||
|
# Mirror the low/high indexes
|
||||||
|
f_low = field.width - 1 - f_low
|
||||||
|
f_high = field.width - 1 - f_high
|
||||||
|
f_low, f_high = f_high, f_low
|
||||||
|
|
||||||
|
value = do_bitswap(do_slice(self.exp.dereferencer.get_value(field), f_high, f_low))
|
||||||
|
else:
|
||||||
|
value = do_slice(self.exp.dereferencer.get_value(field), f_high, f_low)
|
||||||
|
|
||||||
|
addr = self._get_address_str(node, subword_offset=subword_idx*subword_stride)
|
||||||
|
var = self.get_readback_data_var(addr)
|
||||||
|
subword_assignments[subword_idx].append(f"{var}[{r_high}:{r_low}] = {value};")
|
||||||
|
|
||||||
|
# advance to the next subword
|
||||||
|
subword_idx += 1
|
||||||
|
|
||||||
|
return subword_assignments
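
The multi-subword branch above can be easier to follow in isolation. The following is a minimal sketch, not the exporter's code: `split_field_into_subwords` is a hypothetical helper that takes plain integers instead of a `FieldNode`, and it ignores the bitswap case for `[low:high]` fields. It only reproduces the index arithmetic used to decide which bits of a field land in which subword.

```python
from typing import List, Tuple


def split_field_into_subwords(
    low: int, high: int, accesswidth: int
) -> List[Tuple[int, int, int, int, int]]:
    """
    Hypothetical helper (not part of the exporter).

    For a field occupying register bits [high:low], return one tuple per
    subword it touches:
        (subword_idx, r_high, r_low, f_high, f_low)
    where r_* is the bit range inside the subword's read data, and f_* is
    the bit range relative to the field itself.
    """
    pieces = []
    subword_idx = low // accesswidth
    while high >= accesswidth * subword_idx:
        # Allowable window of register bits for this subword
        subword_low = accesswidth * subword_idx
        subword_high = subword_low + accesswidth - 1

        # Portion of the field inside the window (relative to the register)
        f_low = max(subword_low, low)
        f_high = min(subword_high, high)

        # Same portion, relative to the subword's read data
        r_low = f_low - subword_low
        r_high = f_high - subword_low

        # Store the field slice relative to the field's own bit 0
        pieces.append((subword_idx, r_high, r_low, f_high - low, f_low - low))
        subword_idx += 1
    return pieces


# A 16-bit field at register bits [39:24], accessed 32 bits at a time,
# is split across subwords 0 and 1.
pieces = split_field_into_subwords(low=24, high=39, accesswidth=32)
```

Each tuple corresponds to one generated assignment of the form `var[r_high:r_low] = value[f_high:f_low]` in the subword's address conditional.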
|
||||||
|
|
||||||
|
|
||||||
|
def process_wide_reg(self, node: RegNode, fields: Sequence[FieldNode], regwidth: int, accesswidth: int) -> None:
|
||||||
|
"""
|
||||||
|
Process a register whose accesswidth < regwidth
|
||||||
|
"""
|
||||||
|
subword_assignments = self.get_wide_reg_subword_assignments(node, fields, regwidth, accesswidth)
|
||||||
|
|
||||||
|
# Add generated content, wrapped in the address conditional
|
||||||
|
subword_stride = accesswidth // 8
|
||||||
|
for subword_idx, assignments in enumerate(subword_assignments):
|
||||||
|
if not assignments:
|
||||||
|
continue
|
||||||
|
addr = self._get_address_str(node, subword_offset=subword_idx*subword_stride)
|
||||||
|
conditional = self.get_addr_compare_conditional(addr)
|
||||||
|
self.add_content(f"if({conditional}) begin")
|
||||||
|
for assignment in assignments:
|
||||||
|
self.add_content(" " + assignment)
|
||||||
|
self.add_content("end")
|
||||||
|
|
||||||
|
|
||||||
|
def exit_AddressableComponent(self, node: AddressableNode) -> None:
|
||||||
|
super().exit_AddressableComponent(node)
|
||||||
|
|
||||||
|
if not node.array_dimensions:
|
||||||
|
return
|
||||||
|
|
||||||
|
for _ in node.array_dimensions:
|
||||||
|
self._array_stride_stack.pop()
|
||||||
|
|
||||||
|
|
||||||
|
class RetimedReadbackMuxGenerator(ReadbackMuxGenerator):
|
||||||
|
"""
|
||||||
|
Alternate variant that is dedicated to building the 1st decode stage
|
||||||
|
"""
|
||||||
|
|
||||||
|
def process_external_block(self, node: AddressableNode) -> None:
|
||||||
|
# Do nothing. External blocks are handled in a completely separate readback mux
|
||||||
|
pass
|
||||||
|
|
||||||
|
def get_addr_compare_conditional(self, addr: str) -> str:
|
||||||
|
# In the pipelined variant, compare the low-bits of both sides
|
||||||
|
return f"ad_low(rd_mux_addr) == ad_low({addr})"
|
||||||
|
|
||||||
|
def get_readback_data_var(self, addr: str) -> str:
|
||||||
|
# In the pipelined variant, assign to the bin indexed by the high bits of addr
|
||||||
|
return f"readback_data_var[ad_hi({addr})]"
|
||||||
|
|
||||||
|
|
||||||
|
class RetimedExtBlockReadbackMuxGenerator(ReadbackMuxGenerator):
|
||||||
|
"""
|
||||||
|
When retiming is enabled, external blocks are implemented as a separate
|
||||||
|
readback mux that is not retimed using a partitioned address.
|
||||||
|
|
||||||
|
This is because the address partitioning scheme used for individual register
|
||||||
|
addresses does not work cleanly for address ranges. (not possible to cleanly
|
||||||
|
map readback of a range to high-address data bins)
|
||||||
|
|
||||||
|
Since the non-retimed mux generator already implements external ranges,
|
||||||
|
re-use it and suppress generation of register logic.
|
||||||
|
"""
|
||||||
|
|
||||||
|
def enter_Reg(self, node: RegNode) -> WalkerAction:
|
||||||
|
return WalkerAction.SkipDescendants
|
||||||
|
|
||||||
|
def process_external_block(self, node: AddressableNode) -> None:
|
||||||
|
addr_lo = self._get_address_str(node)
|
||||||
|
addr_hi = f"{addr_lo} + {SVInt(node.size - 1, self.exp.ds.addr_width)}"
|
||||||
|
self.add_content(f"if((rd_mux_addr >= {addr_lo}) && (rd_mux_addr <= {addr_hi})) begin")
|
||||||
|
data = self.exp.hwif.get_external_rd_data(node)
|
||||||
|
self.add_content(f" readback_data_var = {data};")
|
||||||
|
self.add_content(" is_external_block_var = 1'b1;")
|
||||||
|
self.add_content("end")
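
The range comparison emitted by `process_external_block` can be modelled in a few lines. This is a sketch under stated assumptions: `external_block_bounds` and `hits_external_block` are hypothetical names, addresses are plain integers relative to the top node, and the example values (base `0x200`, size `0x90`) are taken from the `mem2` instance in the test RDL, assuming its size in bytes is `0x90`.

```python
from typing import Tuple


def external_block_bounds(base: int, size: int) -> Tuple[int, int]:
    """Hypothetical helper: inclusive [addr_lo, addr_hi] range of a block."""
    return base, base + size - 1


def hits_external_block(rd_mux_addr: int, base: int, size: int) -> bool:
    """Model of: (rd_mux_addr >= addr_lo) && (rd_mux_addr <= addr_hi)."""
    addr_lo, addr_hi = external_block_bounds(base, size)
    return addr_lo <= rd_mux_addr <= addr_hi


# Assumed example: a block of 0x90 bytes at offset 0x200
lo, hi = external_block_bounds(0x200, 0x90)
```

Because the upper bound is computed as `size - 1`, the comparison is inclusive on both ends, which matches the `<=` used in the generated conditional.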
|
||||||
@@ -0,0 +1,7 @@
|
|||||||
|
assign readback_done = decoded_req & ~decoded_req_is_wr;
|
||||||
|
assign readback_data = '0;
|
||||||
|
{%- if ds.err_if_bad_addr or ds.err_if_bad_rw %}
|
||||||
|
assign readback_err = decoded_err;
|
||||||
|
{%- else %}
|
||||||
|
assign readback_err = '0;
|
||||||
|
{%- endif %}
|
||||||
@@ -1,94 +0,0 @@
|
|||||||
{% if array_assignments is not none %}
|
|
||||||
// Assign readback values to a flattened array
|
|
||||||
logic [{{cpuif.data_width-1}}:0] readback_array[{{array_size}}];
|
|
||||||
{{array_assignments}}
|
|
||||||
|
|
||||||
|
|
||||||
{%- if ds.retime_read_fanin %}
|
|
||||||
|
|
||||||
// fanin stage
|
|
||||||
logic [{{cpuif.data_width-1}}:0] readback_array_c[{{fanin_array_size}}];
|
|
||||||
for(genvar g=0; g<{{fanin_loop_iter}}; g++) begin
|
|
||||||
always_comb begin
|
|
||||||
automatic logic [{{cpuif.data_width-1}}:0] readback_data_var;
|
|
||||||
readback_data_var = '0;
|
|
||||||
for(int i=g*{{fanin_stride}}; i<((g+1)*{{fanin_stride}}); i++) readback_data_var |= readback_array[i];
|
|
||||||
readback_array_c[g] = readback_data_var;
|
|
||||||
end
|
|
||||||
end
|
|
||||||
{%- if fanin_residual_stride == 1 %}
|
|
||||||
assign readback_array_c[{{fanin_array_size-1}}] = readback_array[{{array_size-1}}];
|
|
||||||
{%- elif fanin_residual_stride > 1 %}
|
|
||||||
always_comb begin
|
|
||||||
automatic logic [{{cpuif.data_width-1}}:0] readback_data_var;
|
|
||||||
readback_data_var = '0;
|
|
||||||
for(int i={{(fanin_array_size-1) * fanin_stride}}; i<{{array_size}}; i++) readback_data_var |= readback_array[i];
|
|
||||||
readback_array_c[{{fanin_array_size-1}}] = readback_data_var;
|
|
||||||
end
|
|
||||||
{%- endif %}
|
|
||||||
|
|
||||||
logic [{{cpuif.data_width-1}}:0] readback_array_r[{{fanin_array_size}}];
|
|
||||||
logic readback_done_r;
|
|
||||||
logic readback_err_r;
|
|
||||||
always_ff {{get_always_ff_event(cpuif.reset)}} begin
|
|
||||||
if({{get_resetsignal(cpuif.reset)}}) begin
|
|
||||||
for(int i=0; i<{{fanin_array_size}}; i++) readback_array_r[i] <= '0;
|
|
||||||
readback_done_r <= '0;
|
|
||||||
readback_err_r <= '0;
|
|
||||||
end else begin
|
|
||||||
readback_array_r <= readback_array_c;
|
|
||||||
readback_err_r <= decoded_err;
|
|
||||||
{%- if ds.has_external_addressable %}
|
|
||||||
readback_done_r <= decoded_req & ~decoded_req_is_wr & ~decoded_strb_is_external;
|
|
||||||
{%- else %}
|
|
||||||
readback_done_r <= decoded_req & ~decoded_req_is_wr;
|
|
||||||
{%- endif %}
|
|
||||||
end
|
|
||||||
end
|
|
||||||
|
|
||||||
// Reduce the array
|
|
||||||
always_comb begin
|
|
||||||
automatic logic [{{cpuif.data_width-1}}:0] readback_data_var;
|
|
||||||
readback_done = readback_done_r;
|
|
||||||
{%- if ds.err_if_bad_addr or ds.err_if_bad_rw %}
|
|
||||||
readback_err = readback_err_r;
|
|
||||||
{%- else %}
|
|
||||||
readback_err = '0;
|
|
||||||
{%- endif %}
|
|
||||||
readback_data_var = '0;
|
|
||||||
for(int i=0; i<{{fanin_array_size}}; i++) readback_data_var |= readback_array_r[i];
|
|
||||||
readback_data = readback_data_var;
|
|
||||||
end
|
|
||||||
|
|
||||||
{%- else %}
|
|
||||||
|
|
||||||
// Reduce the array
|
|
||||||
always_comb begin
|
|
||||||
automatic logic [{{cpuif.data_width-1}}:0] readback_data_var;
|
|
||||||
{%- if ds.has_external_addressable %}
|
|
||||||
readback_done = decoded_req & ~decoded_req_is_wr & ~decoded_strb_is_external;
|
|
||||||
{%- else %}
|
|
||||||
readback_done = decoded_req & ~decoded_req_is_wr;
|
|
||||||
{%- endif %}
|
|
||||||
{%- if ds.err_if_bad_addr or ds.err_if_bad_rw %}
|
|
||||||
readback_err = decoded_err;
|
|
||||||
{%- else %}
|
|
||||||
readback_err = '0;
|
|
||||||
{%- endif %}
|
|
||||||
readback_data_var = '0;
|
|
||||||
for(int i=0; i<{{array_size}}; i++) readback_data_var |= readback_array[i];
|
|
||||||
readback_data = readback_data_var;
|
|
||||||
end
|
|
||||||
{%- endif %}
|
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
{%- else %}
|
|
||||||
assign readback_done = decoded_req & ~decoded_req_is_wr;
|
|
||||||
assign readback_data = '0;
|
|
||||||
{%- if ds.err_if_bad_addr or ds.err_if_bad_rw %}
|
|
||||||
assign readback_err = decoded_err;
|
|
||||||
{%- else %}
|
|
||||||
assign readback_err = '0;
|
|
||||||
{%- endif %}
|
|
||||||
{% endif %}
|
|
||||||
17
src/peakrdl_regblock/readback/templates/readback_no_rt.sv
Normal file
@@ -0,0 +1,17 @@
|
|||||||
|
always_comb begin
|
||||||
|
automatic logic [{{cpuif.data_width-1}}:0] readback_data_var;
|
||||||
|
readback_data_var = '0;
|
||||||
|
{{readback_mux|indent}}
|
||||||
|
readback_data = readback_data_var;
|
||||||
|
|
||||||
|
{%- if ds.has_external_addressable %}
|
||||||
|
readback_done = decoded_req & ~decoded_req_is_wr & ~decoded_req_is_external;
|
||||||
|
{%- else %}
|
||||||
|
readback_done = decoded_req & ~decoded_req_is_wr;
|
||||||
|
{%- endif %}
|
||||||
|
{%- if ds.err_if_bad_addr or ds.err_if_bad_rw %}
|
||||||
|
readback_err = decoded_err;
|
||||||
|
{%- else %}
|
||||||
|
readback_err = '0;
|
||||||
|
{%- endif %}
|
||||||
|
end
|
||||||
82
src/peakrdl_regblock/readback/templates/readback_with_rt.sv
Normal file
@@ -0,0 +1,82 @@
|
|||||||
|
function automatic bit [{{low_addr_width-1}}:0] ad_low(bit [{{ds.addr_width-1}}:0] addr);
|
||||||
|
return addr[{{low_addr_width-1}}:0];
|
||||||
|
endfunction
|
||||||
|
function automatic bit [{{high_addr_width-1}}:0] ad_hi(bit [{{ds.addr_width-1}}:0] addr);
|
||||||
|
return addr[{{ds.addr_width-1}}:{{low_addr_width}}];
|
||||||
|
endfunction
|
||||||
|
|
||||||
|
// readback stage 1
|
||||||
|
logic [{{cpuif.data_width-1}}:0] readback_data_rt_c[{{2 ** high_addr_width}}];
|
||||||
|
always_comb begin
|
||||||
|
automatic logic [{{cpuif.data_width-1}}:0] readback_data_var[{{2 ** high_addr_width}}];
|
||||||
|
for(int i=0; i<{{2 ** high_addr_width}}; i++) readback_data_var[i] = '0;
|
||||||
|
{{readback_mux|indent}}
|
||||||
|
readback_data_rt_c = readback_data_var;
|
||||||
|
end
|
||||||
|
|
||||||
|
logic [{{cpuif.data_width-1}}:0] readback_data_rt[{{2 ** high_addr_width}}];
|
||||||
|
logic readback_done_rt;
|
||||||
|
logic readback_err_rt;
|
||||||
|
logic [{{ds.addr_width-1}}:0] readback_addr_rt;
|
||||||
|
always_ff {{get_always_ff_event(cpuif.reset)}} begin
|
||||||
|
if({{get_resetsignal(cpuif.reset)}}) begin
|
||||||
|
for(int i=0; i<{{2 ** high_addr_width}}; i++) readback_data_rt[i] <= '0;
|
||||||
|
readback_done_rt <= '0;
|
||||||
|
readback_err_rt <= '0;
|
||||||
|
readback_addr_rt <= '0;
|
||||||
|
end else begin
|
||||||
|
readback_data_rt <= readback_data_rt_c;
|
||||||
|
readback_err_rt <= decoded_err;
|
||||||
|
{%- if ds.has_external_addressable %}
|
||||||
|
readback_done_rt <= decoded_req & ~decoded_req_is_wr & ~decoded_req_is_external;
|
||||||
|
{%- else %}
|
||||||
|
readback_done_rt <= decoded_req & ~decoded_req_is_wr;
|
||||||
|
{%- endif %}
|
||||||
|
readback_addr_rt <= rd_mux_addr;
|
||||||
|
end
|
||||||
|
end
|
||||||
|
|
||||||
|
{% if ds.has_external_block %}
|
||||||
|
logic [{{cpuif.data_width-1}}:0] readback_ext_block_data_rt_c;
|
||||||
|
logic readback_is_ext_block_c;
|
||||||
|
always_comb begin
|
||||||
|
automatic logic [{{cpuif.data_width-1}}:0] readback_data_var;
|
||||||
|
automatic logic is_external_block_var;
|
||||||
|
readback_data_var = '0;
|
||||||
|
is_external_block_var = '0;
|
||||||
|
{{ext_block_readback_mux|indent}}
|
||||||
|
readback_ext_block_data_rt_c = readback_data_var;
|
||||||
|
readback_is_ext_block_c = is_external_block_var;
|
||||||
|
end
|
||||||
|
|
||||||
|
logic [{{cpuif.data_width-1}}:0] readback_ext_block_data_rt;
|
||||||
|
logic readback_is_ext_block;
|
||||||
|
always_ff {{get_always_ff_event(cpuif.reset)}} begin
|
||||||
|
if({{get_resetsignal(cpuif.reset)}}) begin
|
||||||
|
readback_ext_block_data_rt <= '0;
|
||||||
|
readback_is_ext_block <= '0;
|
||||||
|
end else begin
|
||||||
|
readback_ext_block_data_rt <= readback_ext_block_data_rt_c;
|
||||||
|
readback_is_ext_block <= readback_is_ext_block_c;
|
||||||
|
end
|
||||||
|
end
|
||||||
|
{% endif %}
|
||||||
|
|
||||||
|
// readback stage 2
|
||||||
|
always_comb begin
|
||||||
|
{%- if ds.has_external_block %}
|
||||||
|
if(readback_is_ext_block) begin
|
||||||
|
readback_data = readback_ext_block_data_rt;
|
||||||
|
end else begin
|
||||||
|
readback_data = readback_data_rt[readback_addr_rt[{{ds.addr_width-1}}:{{low_addr_width}}]];
|
||||||
|
end
|
||||||
|
{%- else %}
|
||||||
|
readback_data = readback_data_rt[readback_addr_rt[{{ds.addr_width-1}}:{{low_addr_width}}]];
|
||||||
|
{%- endif %}
|
||||||
|
readback_done = readback_done_rt;
|
||||||
|
{%- if ds.err_if_bad_addr or ds.err_if_bad_rw %}
|
||||||
|
readback_err = readback_err_rt;
|
||||||
|
{%- else %}
|
||||||
|
readback_err = '0;
|
||||||
|
{%- endif %}
|
||||||
|
end
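
The two readback stages in this template can be modelled behaviourally to show why the partitioned address works. The sketch below is an illustration only, not generated output: `stage1` and `stage2` are hypothetical function names, registers are represented as a plain `addr -> value` dict, and the pipeline register between the stages is reduced to passing values from one function to the other.

```python
from typing import Dict, List


def ad_low(addr: int, low_addr_width: int) -> int:
    """Low address bits, as in the template's ad_low() function."""
    return addr & ((1 << low_addr_width) - 1)


def ad_hi(addr: int, low_addr_width: int) -> int:
    """High address bits, as in the template's ad_hi() function."""
    return addr >> low_addr_width


def stage1(rd_mux_addr: int, regs: Dict[int, int],
           addr_width: int, low_addr_width: int) -> List[int]:
    """
    First stage: compare only the low address bits. Every register whose
    low bits match is written into the bin indexed by its own high bits.
    """
    n_bins = 1 << (addr_width - low_addr_width)
    bins = [0] * n_bins
    for addr, value in regs.items():
        if ad_low(rd_mux_addr, low_addr_width) == ad_low(addr, low_addr_width):
            bins[ad_hi(addr, low_addr_width)] = value
    return bins


def stage2(readback_addr_rt: int, bins: List[int], low_addr_width: int) -> int:
    """Second stage: select one bin using the registered high address bits."""
    return bins[ad_hi(readback_addr_rt, low_addr_width)]


# Assumed example: four registers in a 5-bit address space, 4 low bits
regs = {0x00: 0xAA, 0x04: 0xBB, 0x10: 0xCC, 0x14: 0xDD}
bins = stage1(0x14, regs, addr_width=5, low_addr_width=4)
data = stage2(0x14, bins, low_addr_width=4)
```

In this model a read of `0x14` fills both bins (`0x04` and `0x14` share the same low bits), and the second stage picks the correct one using the high bits. An address with no matching register leaves every bin at zero.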
|
||||||
@@ -6,10 +6,10 @@
|
|||||||
Testcases require an installation of the Questa simulator, and for `vlog` & `vsim`
|
Testcases require an installation of the Questa simulator, and for `vlog` & `vsim`
|
||||||
commands to be visible via the PATH environment variable.
|
commands to be visible via the PATH environment variable.
|
||||||
|
|
||||||
*Questa - Intel FPGA Starter Edition* can be downloaded for free from Intel:
|
*Questa-Altera FPGA and Starter Edition* can be downloaded for free from Altera:
|
||||||
* Go to https://www.intel.com/content/www/us/en/collections/products/fpga/software/downloads.html?edition=pro&q=questa&s=Relevancy
|
* Go to https://www.altera.com/downloads
|
||||||
* Select latest version of Questa
|
* Select "Simulation Tools"
|
||||||
* Download Questa files.
|
* Download Questa
|
||||||
* Install
|
* Install
|
||||||
* Be sure to choose "Starter Edition" for the free version.
|
* Be sure to choose "Starter Edition" for the free version.
|
||||||
* Create an account on https://licensing.intel.com
|
* Create an account on https://licensing.intel.com
|
||||||
@@ -18,7 +18,7 @@ commands to be visible via the PATH environment variable.
|
|||||||
* Go to https://licensing.intel.com/psg/s/sales-signup-evaluationlicenses
|
* Go to https://licensing.intel.com/psg/s/sales-signup-evaluationlicenses
|
||||||
* Generate a free *Starter Edition* license file for Questa
|
* Generate a free *Starter Edition* license file for Questa
|
||||||
* Easiest to use a *fixed* license using your NIC ID (MAC address of your network card via `ifconfig`)
|
* Easiest to use a *fixed* license using your NIC ID (MAC address of your network card via `ifconfig`)
|
||||||
* Download the license file and point the `LM_LICENSE_FILE` environment variable to the folder which contains it.
|
* Download the license file and point the `LM_LICENSE_FILE` environment variable to the folder which contains it. In newer versions of Questa, use the `SALT_LICENSE_SERVER` environment variable instead.
|
||||||
* (optional) Delete Intel libraries to save some disk space
|
* (optional) Delete Intel libraries to save some disk space
|
||||||
* Delete `<install_dir>/questa_fse/intel`
|
* Delete `<install_dir>/questa_fse/intel`
|
||||||
* Edit `<install_dir>/questa_fse/modelsim.ini` and remove lines that reference the `intel` libraries
|
* Edit `<install_dir>/questa_fse/modelsim.ini` and remove lines that reference the `intel` libraries
|
||||||
|
|||||||
0
tests/test_only_external_blocks/__init__.py
Normal file
11
tests/test_only_external_blocks/regblock.rdl
Normal file
@@ -0,0 +1,11 @@
|
|||||||
|
addrmap top {
|
||||||
|
mem ext_mem #(
|
||||||
|
longint SIZE = 0x100
|
||||||
|
) {
|
||||||
|
memwidth = 32;
|
||||||
|
mementries = SIZE / 4;
|
||||||
|
};
|
||||||
|
|
||||||
|
external ext_mem #(.SIZE(0x10)) mem1 @ 0x0000;
|
||||||
|
external ext_mem #(.SIZE(0x90)) mem2 @ 0x0200;
|
||||||
|
};
|
||||||
115
tests/test_only_external_blocks/tb_template.sv
Normal file
@@ -0,0 +1,115 @@
|
|||||||
|
{% extends "lib/tb_base.sv" %}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
{%- block dut_support %}
|
||||||
|
{% sv_line_anchor %}
|
||||||
|
|
||||||
|
external_block #(
|
||||||
|
.ADDR_WIDTH($clog2('h10))
|
||||||
|
) mem1_inst (
|
||||||
|
.clk(clk),
|
||||||
|
.rst(rst),
|
||||||
|
|
||||||
|
.req(hwif_out.mem1.req),
|
||||||
|
.req_is_wr(hwif_out.mem1.req_is_wr),
|
||||||
|
.addr(hwif_out.mem1.addr),
|
||||||
|
.wr_data(hwif_out.mem1.wr_data),
|
||||||
|
.wr_biten(hwif_out.mem1.wr_biten),
|
||||||
|
.rd_ack(hwif_in.mem1.rd_ack),
|
||||||
|
.rd_data(hwif_in.mem1.rd_data),
|
||||||
|
.wr_ack(hwif_in.mem1.wr_ack)
|
||||||
|
);
|
||||||
|
|
||||||
|
external_block #(
|
||||||
|
.ADDR_WIDTH($clog2('h90))
|
||||||
|
) mem2_inst (
|
||||||
|
.clk(clk),
|
||||||
|
.rst(rst),
|
||||||
|
|
||||||
|
.req(hwif_out.mem2.req),
|
||||||
|
.req_is_wr(hwif_out.mem2.req_is_wr),
|
||||||
|
.addr(hwif_out.mem2.addr),
|
||||||
|
.wr_data(hwif_out.mem2.wr_data),
|
||||||
|
.wr_biten(hwif_out.mem2.wr_biten),
|
||||||
|
.rd_ack(hwif_in.mem2.rd_ack),
|
||||||
|
.rd_data(hwif_in.mem2.rd_data),
|
||||||
|
.wr_ack(hwif_in.mem2.wr_ack)
|
||||||
|
);
|
||||||
|
|
||||||
|
{%- endblock %}
|
||||||
|
|
||||||
|
|
||||||
|
|
||||||
|
{% block seq %}
|
||||||
|
{% sv_line_anchor %}
|
||||||
|
##1;
|
||||||
|
cb.rst <= '0;
|
||||||
|
##1;
|
||||||
|
|
||||||
|
//--------------------------------------------------------------------------
|
||||||
|
// Simple read/write tests
|
||||||
|
//--------------------------------------------------------------------------
|
||||||
|
// mem1
|
||||||
|
repeat(32) begin
|
||||||
|
logic [31:0] x;
|
||||||
|
int unsigned addr;
|
||||||
|
x = $urandom();
|
||||||
|
addr = 'h0;
|
||||||
|
addr += $urandom_range(('h10 / 4) - 1) * 4;
|
||||||
|
cpuif.write(addr, x);
|
||||||
|
cpuif.assert_read(addr, x);
|
||||||
|
end
|
||||||
|
|
||||||
|
// mem2
|
||||||
|
repeat(32) begin
|
||||||
|
logic [31:0] x;
|
||||||
|
int unsigned addr;
|
||||||
|
x = $urandom();
|
||||||
|
addr = 'h200;
|
||||||
|
addr += $urandom_range(('h90 / 4) - 1) * 4;
|
||||||
|
cpuif.write(addr, x);
|
||||||
|
cpuif.assert_read(addr, x);
|
||||||
|
end
|
||||||
|
|
||||||
|
//--------------------------------------------------------------------------
|
||||||
|
// Pipelined access
|
||||||
|
//--------------------------------------------------------------------------
|
||||||
|
// init array with unique known value
|
||||||
|
for(int i=0; i<('h10 / 4); i++) begin
|
||||||
|
cpuif.write('h0 + i*4, 'h1000 + i);
|
||||||
|
end
|
||||||
|
for(int i=0; i<('h90 / 4); i++) begin
|
||||||
|
cpuif.write('h200 + i*4, 'h3000 + i);
|
||||||
|
end
|
||||||
|
|
||||||
|
// random pipelined read/writes
|
||||||
|
repeat(256) begin
|
||||||
|
fork
|
||||||
|
begin
|
||||||
|
int i;
|
||||||
|
logic [31:0] x;
|
||||||
|
int unsigned addr;
|
||||||
|
case($urandom_range(1))
|
||||||
|
0: begin
|
||||||
|
i = $urandom_range(('h10 / 4) - 1);
|
||||||
|
x = 'h1000 + i;
|
||||||
|
addr = 'h0 + i*4;
|
||||||
|
end
|
||||||
|
1: begin
|
||||||
|
i = $urandom_range(('h90 / 4) - 1);
|
||||||
|
x = 'h3000 + i;
|
||||||
|
addr = 'h200 + i*4;
|
||||||
|
end
|
||||||
|
endcase
|
||||||
|
|
||||||
|
case($urandom_range(1))
|
||||||
|
0: cpuif.write(addr, x);
|
||||||
|
1: cpuif.assert_read(addr, x);
|
||||||
|
endcase
|
||||||
|
end
|
||||||
|
join_none
|
||||||
|
end
|
||||||
|
wait fork;
|
||||||
|
|
||||||
|
{% endblock %}
|
||||||
29
tests/test_only_external_blocks/testcase.py
Normal file
@@ -0,0 +1,29 @@
|
|||||||
|
from parameterized import parameterized_class
|
||||||
|
|
||||||
|
from ..lib.cpuifs.apb4 import APB4
|
||||||
|
from ..lib.cpuifs.axi4lite import AXI4Lite
|
||||||
|
from ..lib.cpuifs.passthrough import Passthrough
|
||||||
|
from ..lib.sim_testcase import SimTestCase
|
||||||
|
from ..lib.test_params import get_permutation_class_name, get_permutations
|
||||||
|
|
||||||
|
|
||||||
|
@parameterized_class(get_permutations({
|
||||||
|
"cpuif": [
|
||||||
|
APB4(),
|
||||||
|
Passthrough(),
|
||||||
|
],
|
||||||
|
"retime_read_fanin": [True, False],
|
||||||
|
"retime_read_response": [True, False],
|
||||||
|
"retime_external": [True, False],
|
||||||
|
}), class_name_func=get_permutation_class_name)
|
||||||
|
class Test(SimTestCase):
|
||||||
|
extra_tb_files = [
|
||||||
|
"../lib/external_reg.sv",
|
||||||
|
"../lib/external_block.sv",
|
||||||
|
]
|
||||||
|
init_hwif_in = False
|
||||||
|
clocking_hwif_in = False
|
||||||
|
timeout_clk_cycles = 30000
|
||||||
|
|
||||||
|
def test_dut(self):
|
||||||
|
self.run_test()
|
||||||