Initial Commit - Forked from PeakRDL-regblock @ a440cc19769069be831d267505da4f3789a26695
docs/dev_notes/template-layers/1-port-declaration
@@ -0,0 +1,51 @@
--------------------------------------------------------------------------------
Port Declaration
--------------------------------------------------------------------------------
Generates the port declaration of the module:
    - Parameters
        - rd/wr error response/data behavior
            Do missed accesses cause a SLVERR?
            Do reads respond with a magic value?
        - Pipeline enables
            Enable reg stages in various places

    - RDL-derived Parameters:
        Someday in the future if I ever get around to this: https://github.com/SystemRDL/systemrdl-compiler/issues/58

    - Clock/Reset
        Single clk
        One or more resets

    - CPU Bus Interface
        Given the bus interface object, emits the IO
        This can be flattened ports, or a SV Interface
        Regardless, it shall be malleable so that the user can use their favorite
        declaration style

    - Hardware interface
        Two options:
        - 2-port struct interface
            Everything is rolled into two unpacked structs - inputs and outputs
        - Flattened --> NOT DOING
            Flatten/Unroll everything
            No. Not doing. I hate this and don't want to waste time implementing this.
            This will NEVER be able to support parameterized regmaps, and just
            creates a ton of corner cases I don't care to deal with.

Other IO Signals I need to be aware of:
    any signals declared, and used in any references:
        field.resetsignal
        field.next
        ... etc ...
    any signals declared and marked as cpuif_reset, or field_reset
        These override the default rst
        If both are defined, be sure to not emit the default
        Pretty straightforward (see 17.1)
        Also have some notes on this in my general Logbook
            Will have to make a call on how these propagate if multiple defined
            in different hierarchies
    interrupt/halt outputs
        See "Interrupts" logbook for explanation
    addrmap.errextbus, regfile.errextbus, reg.errextbus
        ???
        Apparently these are inputs
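
The reset override rule noted above (cpuif_reset / field_reset replace the
default rst, and the default disappears once both are overridden) can be
sketched in Python. This is only an illustration under assumptions: the
function name reset_ports, the default port name "rst", and the argument
names are hypothetical and not taken from the exporter.

```python
from typing import List, Optional


def reset_ports(cpuif_reset: Optional[str],
                field_reset: Optional[str],
                default: str = "rst") -> List[str]:
    """Return the reset input ports to emit, in declaration order.

    cpuif_reset / field_reset are the instance names of user-declared RDL
    signals carrying those properties, or None when no such signal exists.
    """
    ports: List[str] = []
    # The default reset is still needed while at least one of the two
    # reset domains has no user-supplied override.
    if cpuif_reset is None or field_reset is None:
        ports.append(default)
    for name in (cpuif_reset, field_reset):
        # Skip missing overrides, and do not declare the same signal twice.
        if name is not None and name not in ports:
            ports.append(name)
    return ports
```

With no overrides this yields only the default reset; with both overrides the
default is dropped entirely, which is the case the notes warn about.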
docs/dev_notes/template-layers/1.1.hardware-interface
@@ -0,0 +1,103 @@
================================================================================
Summary
================================================================================

RTL interface that provides access to per-field context signals

Regarding signals:
    RDL-declared signals are part of the hwif input structure.
    Only include them if they are referenced by the design (need to scan the
    full design anyways, so may as well filter out unreferenced ones)

    It is possible to use signals declared in a parent scope.
    This means that not all signals will be discovered by a hierarchical listener alone
        Need to scan ALL assigned properties for signal references too.
            - get signal associated with top node's cpuif_reset helper property, if any
            - collect all field_resets
                X check all signal instances in the hier tree
                - search parents of top node for the first field_reset signal, if any
                    This is WAY less expensive than querying EACH field's resetsignal property
            X Check all explicitly assigned properties
                only need to do this for fields
        Collect all of these into the following:
            - If inside the hier, add to a list of paths
            - If outside the hier, add to a dict of path:SignalNode
        These are all the signals in-use by the design

    Pass list into the hwif generator
        If the hwif generator encounters a signal during traversal:
            check if it exists in the signal path list

    out-of-hier signals are inserted outside of the hwif_in as standalone signals.
    For now, just use their plain inst names. If I need to uniquify them I can add that later.
        I should at least check against a list of known "dirty words". Seems very likely someone will choose
        a signal called "rst".
        Prefix with usersig_ if needed
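
The bookkeeping described above (split referenced signals into in-hier paths
and an out-of-hier dict, then guard standalone port names against "dirty
words") might look roughly like this in Python. Everything here is a sketch
under assumptions: split_signals, port_name_for_signal, and the contents of
DIRTY_WORDS are hypothetical, and plain dotted path strings stand in for real
SignalNode objects.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical reserved names; a real list would come from the ports the
# generated module already declares.
DIRTY_WORDS = frozenset({"clk", "rst", "hwif_in", "hwif_out"})


def split_signals(referenced: Dict[str, object],
                  top_path: str) -> Tuple[List[str], Dict[str, object]]:
    """Split referenced signals into in-hierarchy paths and out-of-hierarchy nodes.

    `referenced` maps a dotted hierarchical path to its signal node.
    """
    in_hier: List[str] = []
    out_of_hier: Dict[str, object] = {}
    prefix = top_path + "."
    for path, node in referenced.items():
        if path.startswith(prefix):
            in_hier.append(path)        # becomes a member of hwif_in
        else:
            out_of_hier[path] = node    # becomes a standalone port
    return in_hier, out_of_hier


def port_name_for_signal(inst_name: str,
                         reserved: Iterable[str] = DIRTY_WORDS) -> str:
    """Use the plain instance name unless it collides with a reserved word."""
    if inst_name in set(reserved):
        return "usersig_" + inst_name
    return inst_name
```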

================================================================================
Naming Scheme
================================================================================

hwif_out
    .my_regblock
        .my_reg[X][Y]
            .my_field
                .value
                .anded

hwif_in
    .my_regblock
        .my_reg[X][Y]
            .my_field
                .value
                .we
        .my_signal
        .my_fieldreset_signal

================================================================================
Flattened mode? --> NO
================================================================================
If user wants a flattened list of ports,
still use the same hwif_in/out struct internally.
Rather than declaring hwif_in and hwif_out in the port list, declare it internally

Add a mapping layer in the body of the module that performs a ton of assign statements
to map flat signals <-> struct

Alternatively, don't do this at all.
If I want to add a flattened mode, generate a wrapper module instead.

Marking this as YAGNI for now.

================================================================================
IO Signals
================================================================================

Outputs:
    field value
        If hw readable
    bitwise reductions
        if anded, ored, xored == True, output a signal
    swmod/swacc
        event strobes

Inputs:
    field value
        If hw writable
    we/wel
        if either is boolean, and true
        not part of external hwif if reference
        mutually exclusive
    hwclr/hwset
        if either is boolean, and true
        not part of external hwif if reference
    incr/decr
        if counter=true, generate BOTH
    incrvalue/decrvalue
        if either incrwidth/decrwidth are set
    signals!
        any signal instances instantiated in the scope
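
As a rough Python sketch of the rules above: given the properties of one
field, decide which members appear in hwif_out and hwif_in. FieldProps is a
hypothetical stand-in for the real field node, property references are
modelled as plain strings, and treating incrvalue and decrvalue independently
is an assumption on my part.

```python
from dataclasses import dataclass
from typing import List, Optional, Union

BoolOrRef = Union[bool, str]  # a str models a reference to another signal/field


@dataclass
class FieldProps:
    """Hypothetical stand-in for the RDL properties of a single field."""
    hw_readable: bool = False
    hw_writable: bool = False
    anded: bool = False
    ored: bool = False
    xored: bool = False
    swmod: bool = False
    swacc: bool = False
    we: BoolOrRef = False
    wel: BoolOrRef = False
    hwclr: BoolOrRef = False
    hwset: BoolOrRef = False
    counter: bool = False
    incrwidth: Optional[int] = None
    decrwidth: Optional[int] = None


def hwif_outputs(f: FieldProps) -> List[str]:
    """Members of hwif_out for this field."""
    out = ["value"] if f.hw_readable else []
    for name in ("anded", "ored", "xored", "swmod", "swacc"):
        if getattr(f, name):
            out.append(name)
    return out


def hwif_inputs(f: FieldProps) -> List[str]:
    """Members of hwif_in for this field."""
    if f.we is True and f.wel is True:
        raise ValueError("we and wel are mutually exclusive")
    ins = ["value"] if f.hw_writable else []
    for name in ("we", "wel", "hwclr", "hwset"):
        # Only a literal True creates a port. A reference is driven by
        # whatever it points at, so it is not part of the external hwif.
        if getattr(f, name) is True:
            ins.append(name)
    if f.counter:
        ins += ["incr", "decr"]  # counters always get both
    if f.incrwidth is not None:
        ins.append("incrvalue")
    if f.decrwidth is not None:
        ins.append("decrvalue")
    return ins
```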
docs/dev_notes/template-layers/2-CPUIF
@@ -0,0 +1,72 @@
--------------------------------------------------------------------------------
CPU Bus interface layer
--------------------------------------------------------------------------------
Provides an abstraction layer between the outside SoC's bus interface, and the
internal register block's implementation.
Converts a user-selectable bus protocol to generic register file signals.

Upstream Signals:
    Signal names are defined in the bus interface class and shall be malleable
    to the user.
    User can choose a flat signal interface, or a SV interface.
    SV interface shall be easy to tweak since various orgs will use different
    naming conventions in their library of interface definitions

Downstream Signals:
    - cpuif_req
        - Single-cycle pulse
        - Qualifies the following child signals:
            - cpuif_req_is_wr
                1 denotes this is a write transfer
            - cpuif_addr
                Byte address
            - cpuif_wr_data
            - cpuif_wr_biten
                per-bit strobes
                some protocols may opt to tie this to all 1's
    - cpuif_rd_ack
        - Single-cycle pulse
        - Qualifies the following child signals:
            - cpuif_rd_data
            - cpuif_rd_err

    - cpuif_wr_ack
        - Single-cycle pulse
        - Qualifies the following child signals:
            - cpuif_wr_err


Misc thoughts
    - Internal cpuif_* signals use a strobe-based protocol:
        - Unknown, but fixed latency
        - Makes for easy pipelining if needed
    - Decided to keep cpuif_req signals common for read/write:
        This will allow address decode logic to be shared for read/write
        Downside is split protocols like axi-lite can't have totally separate rd/wr
        access lanes, but who cares?
    - separate response strobes
        Not necessary to use, but this lets me independently pipeline read/write paths.
        read path will need more time if readback mux is large
    - On multiple outstanding transactions
        Currently, cpuif doesn't really support this. Goal was to make it easily pipelineable
        without having to backfeed stall logic.
        Could still be possible to do a "fly-by" pipeline with a more intelligent cpuif layer
        Not worrying about this now.


Implementation:
    Implement this mainly as a Jinja template.
    Upstream bus intf signals are fetched via busif class properties. Ex:
        {{busif.signal('pready')}} <= '1;
    This allows the actual SV or flattened signal to be emitted
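
A minimal Python sketch of the busif.signal() idea, assuming a helper class
that knows whether the user asked for flattened ports or an SV interface.
The class name BusIf, the "s_apb" prefix, and the choice to upper-case
interface member names are all assumptions for illustration. An f-string
stands in for the Jinja template, which would call the same method.

```python
class BusIf:
    """Hypothetical helper that resolves a logical bus signal to emitted text."""

    def __init__(self, flattened: bool, prefix: str = "s_apb") -> None:
        self.flattened = flattened
        self.prefix = prefix

    def signal(self, name: str) -> str:
        if self.flattened:
            # Flattened ports: one port per signal, e.g. s_apb_pready
            return f"{self.prefix}_{name}"
        # SV interface: member access, e.g. s_apb.PREADY
        return f"{self.prefix}.{name.upper()}"


def render_pready(busif: BusIf) -> str:
    """Plain-Python equivalent of the template line shown in the notes."""
    return f"{busif.signal('pready')} <= '1;"
```

Because the template only ever goes through signal(), supporting a different
naming convention means overriding that one method rather than editing the
template.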

What protocols do I care about?
    - AXI4 Lite
        - Ignore AxPROT?
    - APB3
    - APB4
        - Ignore pprot?
    - AHB?
    - Wishbone
    - Generic
        breakout the above signals as-is (reassign with a prefix or something)
docs/dev_notes/template-layers/3-address-decode
@@ -0,0 +1,51 @@

--------------------------------------------------------------------------------
Address Decode layer
--------------------------------------------------------------------------------
A bunch of combinational address decodes that generate individual register
req strobes

Possible decode logic styles:
    - Big case statement
        + Probably more sim-efficient
        - Hard to do loop parameterization
        - More annoying to do multiple regs per address
    - Big always_comb + One if/else chain
        + Easy to nest loops & parameterize if needed
        - sim has a lot to evaluate each time
        - More annoying to do multiple regs per address
        - implies precedence? Synth tools should be smart enough?
    - Big always_comb + inline conditionals <---- DO THIS
        + Easy to nest loops & parameterize if needed
        - sim has a lot to evaluate each time
        + Multiple regs per address possible
        + implies address decode parallelism.
        ?? Should I try using generate loops + assigns?
            This would be more explicit parallelism, however some tools may
            get upset at multiple assignments to a common struct

Implementation:
    Jinja is inappropriate here
        Very logic-heavy. Jinja may end up being annoying
        Also, not much need for customization here
    This may even make sense as a visitor that dumps lines
        - visit each reg
        - upon entering an array, create for loops
        - upon exiting an array, emit 'end'
    Make the strobe struct declared locally
        No need for it to leave the block
    Error handling
        If no strobe generated, respond with an error?
            This is actually pretty expensive to do for writes.
            Hold off on this for now.
            Reads get this effectively for free in the readback mux.
        Implement write response strobes back upstream to cpuif
    Eventually allow for optional register stage for strobe struct
        Will need to also pipeline the other cpuif signals
        OK to discard the cpuif_addr. No longer needed
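
The "visitor that dumps lines" idea could be sketched in Python as below,
using the "always_comb + inline conditionals" style chosen earlier. This is
an assumption-laden illustration: Node is a hypothetical stand-in for the
compiler's register/regfile nodes, and the struct name decoded_reg_strb is
made up for the example.

```python
from dataclasses import dataclass, field
from typing import List, Sequence


@dataclass
class Node:
    """Hypothetical stand-in for a register (no children) or a regfile."""
    name: str
    offset: int                                   # byte offset within the parent
    dims: List[int] = field(default_factory=list)  # array dimensions, if any
    stride: int = 0                               # byte stride of the innermost dimension
    children: List["Node"] = field(default_factory=list)


class DecodeEmitter:
    def __init__(self) -> None:
        self.lines: List[str] = []
        self.indent = 1   # body of the always_comb block
        self.depth = 0    # loop nesting, used to name loop variables

    def emit(self, text: str) -> None:
        self.lines.append("    " * self.indent + text)

    def visit(self, node: Node, path: str = "decoded_reg_strb",
              addr_terms: Sequence[str] = ()) -> None:
        path = f"{path}.{node.name}"
        terms = list(addr_terms) + [f"'h{node.offset:x}"]
        # Byte stride of each dimension, outermost first.
        strides: List[int] = []
        dim_stride = node.stride
        for dim in reversed(node.dims):
            strides.insert(0, dim_stride)
            dim_stride *= dim
        # Entering an array: open one for-loop per dimension.
        for dim, stride in zip(node.dims, strides):
            var = f"i{self.depth}"
            self.emit(f"for(int {var}=0; {var}<{dim}; {var}++) begin")
            self.indent += 1
            self.depth += 1
            path += f"[{var}]"
            terms.append(f"{var}*'h{stride:x}")
        if node.children:
            for child in node.children:
                self.visit(child, path, terms)
        else:
            # Inline conditional: the strobe is just the address comparison.
            self.emit(f"{path} = cpuif_req & (cpuif_addr == {' + '.join(terms)});")
        # Exiting the array: close every loop that was opened.
        for _ in node.dims:
            self.indent -= 1
            self.depth -= 1
            self.emit("end")


def generate_decode(nodes: Sequence[Node]) -> str:
    emitter = DecodeEmitter()
    for node in nodes:
        emitter.visit(node)
    return "\n".join(["always_comb begin"] + emitter.lines + ["end"])
```

For a scalar register "ctrl" at 0x0 and a 4-entry array "chan" at 0x10 with
stride 4, generate_decode() produces one plain assignment followed by a
single for-loop wrapping the array assignment.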


Downstream Signals:
    - access strobes
        Encase these into a struct datatype
    - is_write + wr_data/wr_bitstrobe
docs/dev_notes/template-layers/4-fields
@@ -0,0 +1,163 @@
--------------------------------------------------------------------------------
Field storage / next value layer
--------------------------------------------------------------------------------

Where all the magic happens!!

Any field that implements storage is defined here.
Bigass struct that only contains storage elements

Each field consists of:
    - Entries in the storage element struct
        - if implements storage - field value
        - user extensible values?
    - Entries in the combo struct
        - if implements storage:
            - Field's "next" value
            - load-enable strobe
        - If counter
            various event strobes (overflow/underflow).
            These are convenient to generate alongside the field next state logic
        - user extensible values?
    - an always_comb block:
        - generates the "next value" combinational signal
        - May generate other intermediate strobes?
            incr/decr?
        - series of if/else statements that assign the next value in the storage element
            Think of this as a flat list of "next state" conditions, ranked by their precedence as follows:
                - reset
                    Actually, handle this in the always_ff
                - sw access (if sw precedence)
                    - onread/onwrite
                - hw access
                    - Counter
                        beware of clear events and incr/decr events happening simultaneously
                    - next
                    - etc
                - sw access (if hw precedence)
                    - onread/onwrite
        - always_comb block to also generate write-enable strobes for the actual
          storage element
            This is better for low-power design
    - an always_ff block
        Implements the actual storage element
        Also a tidy place to abstract the specifics of activehigh/activelow field reset
        selection.

TODO:
    Scour the RDL spec.
    Does this "next state" precedence model hold true in all situations?

TODO:
    Think about user-extensibility
    Provide a mechanism for users to extend/override field behavior

TODO:
    Does the endianness the user sets matter anywhere?

Implementation
    Makes sense to use a listener class

    Be sure to skip alias registers

--------------------------------------------------------------------------------

NextStateConditional Class
    Describes a single conditional action that determines the next state of a field
    Provides information to generate the following content:
        if(<conditional>) begin
            <assignments>
        end

    - is_match(self, field: FieldNode) -> bool:
        Returns True if this conditional is relevant to the field. If so,
        it instructs the FieldBuilder that code for this conditional shall be emitted
        TODO: better name than "is_match"? More like "is this relevant"

    - get_predicate(self, field: FieldNode) -> str:
        Returns the rendered conditional text

    - get_assignments(self, field: FieldNode) -> List[str]:
        Returns a list of rendered assignment strings
        This will basically always be two:
            <field>.next = <next value>;
            <field>.load_next = '1;

    - get_extra_combo_signals(self, field: FieldNode) -> List[TBD]:
        Some conditionals will need to set some extra signals (eg. counter underflow/overflow strobes)
        Compiler needs to know to:
            - declare these in the combo struct
            - initialize them in the beginning of always_comb

        Return something that denotes the following information: (namedtuple?)
            - signal name: str
            - width: int
            - default value assignment: str

        Multiple NextStateConditional can declare the same extra combo signal
        as long as their definitions agree
            --> Assert this
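
One possible shape for this interface in Python, following the method names
listed above. It is a sketch, not the final design: ExtraComboSignal, render,
merge_extra_combo_signals, and the example HwIncrement conditional (including
its signal names) are hypothetical, and a plain string stands in for
FieldNode.

```python
from typing import Dict, Iterable, List, NamedTuple


class ExtraComboSignal(NamedTuple):
    name: str
    width: int
    default_assignment: str


class NextStateConditional:
    """One conditional action that can determine a field's next state."""

    def is_match(self, field: str) -> bool:
        raise NotImplementedError

    def get_predicate(self, field: str) -> str:
        raise NotImplementedError

    def get_assignments(self, field: str) -> List[str]:
        raise NotImplementedError

    def get_extra_combo_signals(self, field: str) -> List[ExtraComboSignal]:
        return []  # most conditionals need none


class HwIncrement(NextStateConditional):
    """Illustrative only: a counter increment with an overflow strobe."""

    def is_match(self, field: str) -> bool:
        return True

    def get_predicate(self, field: str) -> str:
        return f"hwif_in.{field}.incr"

    def get_assignments(self, field: str) -> List[str]:
        return [f"{field}.next = {field}.value + 1;", f"{field}.load_next = '1;"]

    def get_extra_combo_signals(self, field: str) -> List[ExtraComboSignal]:
        return [ExtraComboSignal("overflow", 1, "'0")]


def render(cond: NextStateConditional, field: str) -> str:
    """Emit the if(<conditional>) begin ... end block for one conditional."""
    body = ["    " + line for line in cond.get_assignments(field)]
    return "\n".join([f"if({cond.get_predicate(field)}) begin"] + body + ["end"])


def merge_extra_combo_signals(conds: Iterable[NextStateConditional],
                              field: str) -> Dict[str, ExtraComboSignal]:
    """Collect extra combo signals, asserting that shared definitions agree."""
    merged: Dict[str, ExtraComboSignal] = {}
    for cond in conds:
        for sig in cond.get_extra_combo_signals(field):
            previous = merged.setdefault(sig.name, sig)
            assert previous == sig, f"conflicting definitions of {sig.name}"
    return merged
```

A NamedTuple compares by value, so the "definitions agree" check reduces to a
single equality test.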


FieldBuilder Class
    Describes how to build fields

    Contains NextStateConditional definitions
        Nested inside the class namespace, define all the NextStateConditional classes
        that apply
        User can override definitions or add own to extend behavior

    NextStateConditional objects are stored in a dictionary as follows:
        _conditionals {
            assignment_precedence: [
                conditional_option_1,
                conditional_option_2,
                conditional_option_3,
            ]
        }

    add_conditional(self, conditional, assignment_precedence):
        Inserts the NextStateConditional into the given assignment precedence bin
        The first one added to a precedence bin is first in that bin's search order

    init_conditionals(self) -> None:
        Called from __init__.
        loads all possible conditionals into self.conditionals list
        This function is to provide a hook for the user to add their own.

        Do not do fancy class introspection. Load them explicitly by name like so:
            self.add_conditional(MyNextState(), AssignmentPrecedence.SW_ACCESS)

        If user wants to extend this class, they can pile onto the bins of conditionals freely!

--------------------------------------------------------------------------------
Misc
--------------------------------------------------------------------------------
What about complex behaviors like a read-clear counter?
    if({{software read}})
        next = 0
    elif({{increment}})
        next = prev + 1

    --> Implement this by stacking multiple NextStateConditional in the same assignment precedence.
        In this case, there would be a special action on software read that would be specific to read-clear counters
        this would get inserted ahead of the search order.


Precedence & Search order
    There are two layers of priority I need to keep track of:
    - Assignment Precedence
        RTL precedence of the assignment conditional
    - Search order (sp?)
        Within an assignment precedence, order in which the NextStateConditional classes are
        searched for a match

    For assignment precedence, it makes sense to use an integer enumeration for this
    since there really aren't too many precedence levels that apply here.
    Space out the integer enumerations so that user can reliably insert their own actions, ie:
        my_precedence = AssignmentPrecedence.SW_ACCESS + 1

    For search order, provide a user API to load a NextStateConditional into
    a precedence 'bin'. Pushing into a bin always inserts into the front of the search order
    This makes sense since user overrides will always want to be highest priority - and
    rule themselves out before falling back to builtin behavior
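
The two priority layers can be sketched in Python as follows. Several things
are assumptions rather than decisions from these notes: higher enum values
are emitted earlier in the if/else chain, only the first match in each bin is
used, HW_ACCESS and the numeric values are invented, fields are plain dicts,
and the signal names (sw_rd, incr) and the read-clear counter's next-value
expression are purely illustrative. The sketch follows the front-insertion
rule described under "Precedence & Search order".

```python
from collections import defaultdict
from enum import IntEnum
from typing import Callable, Dict, List


class AssignmentPrecedence(IntEnum):
    # Spaced out so a user can slot in e.g. SW_ACCESS + 1.
    SW_ACCESS = 2000
    HW_ACCESS = 1000   # hypothetical member


class SimpleConditional:
    """Minimal stand-in for NextStateConditional."""

    def __init__(self, match: Callable[[dict], bool], predicate: str, next_value: str):
        self._match = match
        self._predicate = predicate
        self._next_value = next_value

    def is_match(self, field: dict) -> bool:
        return bool(self._match(field))

    def get_predicate(self, field: dict) -> str:
        return self._predicate

    def get_assignments(self, field: dict) -> List[str]:
        name = field["name"]
        return [f"{name}.next = {self._next_value};", f"{name}.load_next = '1;"]


class FieldBuilder:
    def __init__(self) -> None:
        self._conditionals: Dict[int, List[SimpleConditional]] = defaultdict(list)
        self.init_conditionals()

    def init_conditionals(self) -> None:
        """Hook: subclasses call super() and then add their own."""
        is_rclr = lambda f: f.get("onread") == "rclr"
        is_counter = lambda f: f.get("counter", False)
        # Generic read-clear first, then the counter-specific variant, which
        # lands ahead of it in the same bin because of front insertion.
        self.add_conditional(SimpleConditional(is_rclr, "sw_rd", "'0"),
                             AssignmentPrecedence.SW_ACCESS)
        self.add_conditional(
            SimpleConditional(lambda f: is_rclr(f) and is_counter(f),
                              "sw_rd", "incr ? 'd1 : '0"),
            AssignmentPrecedence.SW_ACCESS)
        self.add_conditional(SimpleConditional(is_counter, "incr", "value + 1"),
                             AssignmentPrecedence.HW_ACCESS)

    def add_conditional(self, conditional: SimpleConditional, precedence: int) -> None:
        self._conditionals[precedence].insert(0, conditional)

    def get_conditionals(self, field: dict) -> List[SimpleConditional]:
        result = []
        for precedence in sorted(self._conditionals, reverse=True):
            for cond in self._conditionals[precedence]:
                if cond.is_match(field):
                    result.append(cond)
                    break  # first match in a bin wins
        return result

    def render_next_state(self, field: dict) -> List[str]:
        lines: List[str] = []
        for i, cond in enumerate(self.get_conditionals(field)):
            keyword = "if" if i == 0 else "end else if"
            lines.append(f"{keyword}({cond.get_predicate(field)}) begin")
            lines.extend("    " + a for a in cond.get_assignments(field))
        if lines:
            lines.append("end")
        return lines
```

A plain read-clear field falls through to the generic conditional, while a
read-clear counter picks up the counter-specific one and then chains into the
increment conditional from the lower-precedence bin.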
docs/dev_notes/template-layers/5-readback-mux
@@ -0,0 +1,49 @@
--------------------------------------------------------------------------------
Readback mux layer
--------------------------------------------------------------------------------

Implementation:
    - Big always_comb block
    - Initialize default rd_data value
    - Lotsa if statements that operate on reg strb to assign rd_data
    - Merges all fields together into reg
    - pulls value from storage element struct, or input struct
    - Provision for optional flop stage?

Mux Strategy:
    Flat case statement:
        -- Can't parameterize
        + better performance?

    Flat 1-hot array then OR reduce:
        - Create a bus-wide flat array
            eg: 32-bits x N readable registers
        - Assign each element:
            the readback value of each register
            ... masked by the register's access strobe
        - I could also stuff an extra bit into the array that denotes the read is valid
            A missed read will OR reduce down to a 0
        - Finally, OR reduce all the elements in the array down to a flat 32-bit bus
        - Retiming the large OR fanin can be done by chopping up the array into stages
            for 2 stages, sqrt(N) gives each stage's fanin size. Round to favor
            more fanin on 2nd stage
            3 stages uses cube-root. etc...
        - This has the benefit of re-using the address decode logic.
            synth can choose to replicate logic if fanout is bad
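
A behavioural Python model of the "flat 1-hot array then OR reduce" strategy,
plus a helper for the sqrt(N) / cube-root stage sizing. Both function names
are hypothetical. The rounding rule (round each earlier stage's fan-in down
so the remainder lands on the later stage) is my reading of "favor more fanin
on 2nd stage".

```python
from typing import List, Sequence, Tuple


def readback_or_reduce(values: Sequence[int], strobes: Sequence[bool],
                       width: int = 32) -> Tuple[int, bool]:
    """Return (rd_data, rd_valid) for one read access."""
    mask = (1 << width) - 1
    rd_data = 0
    rd_valid = False
    for value, strobe in zip(values, strobes):
        # Each array element is the register value masked by its access strobe.
        element = (value & mask) if strobe else 0
        rd_data |= element
        # The "extra bit": a missed read OR-reduces to False.
        rd_valid = rd_valid or strobe
    return rd_data, rd_valid


def stage_fanins(n: int, stages: int = 2) -> List[int]:
    """Fan-in of each OR stage when reducing n (>= 1) elements in `stages` steps."""
    fanins: List[int] = []
    remaining = n
    for stages_left in range(stages, 1, -1):
        # Largest integer f with f**stages_left <= remaining (root, rounded down).
        f = 1
        while (f + 1) ** stages_left <= remaining:
            f += 1
        fanins.append(f)
        # Number of partial results the following stages still have to combine.
        remaining = -(-remaining // f)
    fanins.append(remaining)
    return fanins
```

For example, 10 readable registers in 2 stages gives a first-stage fan-in of
3 and a second-stage fan-in of 4.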


WARNING:
    Beware of read/write flop stage asymmetry & race conditions.
    Eg. If a field is rclr, don't want to sample it after it gets read:
        addr --> strb --> clear
        addr --> loooong...retime --> sample rd value
    Should guarantee that read-sampling happens at the same cycle as any read-modify


Forwards response strobe back up to cpu interface layer

TODO:
    Don't forget about alias registers here

TODO:
    Does the endianness the user sets matter anywhere?
docs/dev_notes/template-layers/6-output-port-mapping
@@ -0,0 +1,9 @@
--------------------------------------------------------------------------------
Output Port mapping layer
--------------------------------------------------------------------------------

Assign to output struct port

Still TBD if this will actually be a distinct layer.
Cosmetically, this might be nicer to interleave with the field section above
Assign storage element & other derived values as requested by properties