basic framework

Alex Mykyta
2021-06-01 21:51:24 -07:00
parent 292aec1c6e
commit 0d5b663f98
40 changed files with 1920 additions and 0 deletions


@@ -0,0 +1,29 @@
================================================================================
Overarching philosophy
================================================================================
Encourage users to be able to tweak the design
Templating:
Design templates to be small bite-size snippets, each with clear intent
Templates shall be extendable
Templates shall be easy/intuitive
Output:
Output shall be beautiful. Consistent and clean formatting/whitespace builds trust
Output has comments. Users will be looking at it.
================================================================================
On templating...
================================================================================
Everything should be written out to a single file
The only exception is if a struct port interface is used, then the appropriate
struct package is also written out (or lumped into the same file?)
Each layer should actually be its own separate template.
Use helper functions to abstract away details about identifier implementation.
Similarly, some fields will end up with additional port-level signals inferred.
For example, if the user set the field's "anded=true" property
the template would do something like:
{{output_signal(field, "anded")}} = &{{field_value(field)}};
Basically, i'd define a ton of helper functions that return the signal identifier.
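A minimal sketch of what such helper functions could look like, assuming a hypothetical Field stand-in and made-up identifier prefixes ("field_storage", "hwif_out"). None of these names are final; the point is only that the template never builds identifiers by hand.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """Hypothetical stand-in for a field node; only carries a hierarchical path."""
    path: str  # e.g. "my_reg.my_field"


def field_value(field: Field) -> str:
    """Identifier of the field's stored value (prefix is an assumption)."""
    return f"field_storage.{field.path}"


def output_signal(field: Field, prop: str) -> str:
    """Identifier of a port-level output inferred from a field property."""
    return f"hwif_out.{field.path}.{prop}"


def render_anded(field: Field) -> str:
    """Plain-Python equivalent of the template line for anded=true."""
    return f"{output_signal(field, 'anded')} = &{field_value(field)};"


print(render_anded(Field("my_reg.my_field")))
```

In a real template these functions would be registered with the template engine so the snippet stays as short as the one-liner above.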


@@ -0,0 +1,10 @@
Holy smokes this is complicated
Keep this exporter in Alpha/Beta for a while
Add some text in the readme or somewhere:
- No guarantees of correctness! This is always true with open source software,
but even more here!
Be sure to do your own validation before using this in production.
- Alpha means the implementation may change drastically!
Unlike official sem-ver, I am not making any guarantees on compatibility
- I need your help! Validating, finding edge cases, etc...

doc/logbooks/Interrupts

@@ -0,0 +1,18 @@
Interrupts seem to be pretty well-described.
Basically...
- If a register contains one or more fields that use the intr property,
then it is implied to be an interrupt register
--> Add RegNode.has_intr and RegNode.has_halt properties?
- This register implies that there is an output irq signal that is propagated to the top, and it is the OR of all interrupt field bits
- BUT in the multilevel interrupt example, perhaps this output gets suppressed?
Suppress the output signal if Reg->intr gets referenced, since this means
the user is doing a multi-level interrupt.
This means that the register's interrupt signal is "consumed" by a second-level interrupt register
- WTF about the "halt" concept?
I assume this does NOT auto-imply an output?
Maybe only imply a default halt output if:
- an interrupt register has fields that use haltenable/haltmask
- AND the interrupt register's reg->halt has not been referenced
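The output rules above could be captured as two small predicates. This is only a sketch of my current guess at the rules; IntrReg and its flags are hypothetical bookkeeping, not anything the compiler provides.

```python
from dataclasses import dataclass


@dataclass
class IntrReg:
    """Hypothetical summary of one interrupt register, filled in during the scan."""
    name: str
    intr_referenced: bool = False  # something references reg->intr
    uses_halt_props: bool = False  # a field uses haltenable/haltmask
    halt_referenced: bool = False  # something references reg->halt


def emits_irq_output(reg: IntrReg) -> bool:
    """irq output is suppressed once reg->intr is consumed by another register."""
    return not reg.intr_referenced


def emits_halt_output(reg: IntrReg) -> bool:
    """halt output only if halt properties are used and reg->halt is not consumed."""
    return reg.uses_halt_props and not reg.halt_referenced


leaf = IntrReg("block_irq", intr_referenced=True)
top = IntrReg("global_irq", uses_halt_props=True)
print(emits_irq_output(leaf), emits_irq_output(top), emits_halt_output(top))
```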

doc/logbooks/Program Flow

@@ -0,0 +1,23 @@
1. Scan design. Collect information
- Check for unsupported constructs. Throw errors as appropriate
- Uniform regwidth, accesswidth, etc.
- Collect reset signals
cpuif_reset, field_reset
explicitly assigned to field->resetsignal
- Collect any other misc user signals that are referenced in the design
- Top-level interrupts
Collect X & Y:
X = set of all registers that have an interrupt field
Y = set of all interrupt registers that are referenced by a field
Top level interrupt registers are the set in X, but not in Y
(and probably other caveats. See notes)
2. Create intermediate template objects
3. Render top-level IO struct package (if applicable)
4. Render top-level module template
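The X/Y bookkeeping in step 1 is just a set difference. A sketch, assuming registers are tracked by their hierarchical path strings (an assumption; any hashable key would do), and ignoring the "other caveats" for now:

```python
from typing import Set


def top_level_intr_regs(x: Set[str], y: Set[str]) -> Set[str]:
    """Registers in X (have an interrupt field) that are not in Y (referenced by a field)."""
    return x - y


# Hypothetical design: two leaf interrupt registers feed one summary register.
x = {"top.irq_summary", "top.blk_a.irq", "top.blk_b.irq"}
y = {"top.blk_a.irq", "top.blk_b.irq"}
print(top_level_intr_regs(x, y))
```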

doc/logbooks/Resets

@@ -0,0 +1,11 @@
================================================================================
Resets
================================================================================
use whatever is defined in RDL based on cpuif_reset and field_reset signals
Otherwise, provide configuration that defines what the default is:
a single reset that is active high/low, or sync/async
If cpuif_reset is specified, what do fields use?
I assume they still use the default reset separately?
YES. Agnisys appears to be wrong.
cpuif_reset has no influence on the fields' reset according to the spec


@@ -0,0 +1,19 @@
I need some sort of signal "dereferencer" that can be easily used to translate references
to stuff via a normalized interface.
For example, if RDL defines:
my_field->next = my_other_field
Then in Python (or a template) I could do:
x = my_field.get_property("next")
y = dereferencer.get(x)
and trust that I'll get a value/identifier/whatever that accurately represents
the value being referenced
Values:
If X is a field reference:
... that implements storage, return its DFF value reference
... no storage, but has a hw input, grab from the hwif input
... no storage, and no hw input, return its constant reset value?
If X is a property reference... do what's right...
my_field->anded === (&path.to.my_field)
if X is a static value, return the literal
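A rough sketch of the dereferencer rules listed above. FieldRef and PropRef are hypothetical stand-ins for whatever the compiler hands back from get_property(), and the "field_storage" / "hwif_in" prefixes are assumptions.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class FieldRef:
    """Hypothetical reference to a field."""
    path: str
    implements_storage: bool
    has_hw_input: bool
    reset_value: Optional[int] = None


@dataclass
class PropRef:
    """Hypothetical reference to a property of a field, e.g. my_field->anded."""
    field: FieldRef
    prop: str


REDUCTIONS = {"anded": "&", "ored": "|", "xored": "^"}


class Dereferencer:
    def get(self, x: Union[FieldRef, PropRef, int]) -> str:
        if isinstance(x, FieldRef):
            if x.implements_storage:
                return f"field_storage.{x.path}"  # DFF value
            if x.has_hw_input:
                return f"hwif_in.{x.path}.value"  # straight from the hwif input
            if x.reset_value is not None:
                return f"'h{x.reset_value:x}"  # constant reset value
            raise ValueError(f"{x.path} has no knowable value")
        if isinstance(x, PropRef):
            if x.prop in REDUCTIONS:
                return f"({REDUCTIONS[x.prop]}{self.get(x.field)})"
            raise NotImplementedError(x.prop)
        return f"'h{x:x}"  # static value: return the literal


d = Dereferencer()
print(d.get(PropRef(FieldRef("my_field", True, False), "anded")))
```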

doc/logbooks/Some Classes

@@ -0,0 +1,45 @@
================================================================================
Signal wrapper classes
================================================================================
Define a signal wrapper class that is easier to use in templates.
Provides the following properties:
.is_async
.is_activehigh
.identifier
Returns the Verilog identifier string for this signal
.activehigh_identifier
Normalizes identifier to active-high logic
same as .identifier, but prepends '~' if is_activehigh = False
.width
Default reset class instance:
Extends the base class
Hardcodes as follows:
.is_async = True
.is_activehigh = True
.identifier = "rst"
.width = 1
Wrapper classes
Wrap around a systemrdl.SignalNode
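A sketch of the base class and the default reset, using only the properties listed above. The class names are placeholders, and the wrapper around systemrdl.SignalNode is left out here since I have not decided how it maps node properties yet; ActiveLowExample exists only to show the '~' normalization.

```python
class SignalBase:
    """Template-friendly view of a signal."""
    is_async: bool = False
    is_activehigh: bool = True
    identifier: str = ""
    width: int = 1

    @property
    def activehigh_identifier(self) -> str:
        """Identifier normalized to active-high logic."""
        if self.is_activehigh:
            return self.identifier
        return "~" + self.identifier


class DefaultReset(SignalBase):
    """Default reset, hardcoded as described above."""
    is_async = True
    is_activehigh = True
    identifier = "rst"
    width = 1


class ActiveLowExample(SignalBase):
    """Illustration only: an active-low signal gets inverted."""
    is_activehigh = False
    identifier = "rst_n"


print(DefaultReset().activehigh_identifier, ActiveLowExample().activehigh_identifier)
```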
================================================================================
CPU Interface Class
================================================================================
Entry point class for a given CPU interface type (APB, AXI, etc..)
Does the following:
- Provide linkage to the logic implementation Jinja template
- Interface signal identifier properties
Aliases for signal identifiers to allow flat or sv-interface style
eg:
self.psel --> "s_apb_psel" or "s_apb.psel"
if sv interface, use the interface name class property
- Port declaration text property
declare as sv interface, or flat port list
If flattened, should use signal identifier properties
If sv interface, I should breakout the interface & modport name as
class properties for easy user-override
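Something along these lines, as a sketch. APBCpuif, the template file name, and the interface/modport names are all hypothetical; the only thing taken from the notes is the "s_apb_psel" vs "s_apb.psel" aliasing and the user-overridable class properties. Port widths are omitted to keep it short.

```python
class APBCpuif:
    """Hypothetical entry-point class for an APB CPU interface."""
    template_path = "apb_tmpl.sv"  # linkage to the Jinja template (assumed name)
    interface_name = "apb_intf"    # user can override in a subclass
    modport_name = "slave"         # user can override in a subclass
    instance_name = "s_apb"

    def __init__(self, flatten: bool):
        self.flatten = flatten

    def signal(self, name: str) -> str:
        """Signal identifier in flat or sv-interface style."""
        sep = "_" if self.flatten else "."
        return f"{self.instance_name}{sep}{name}"

    @property
    def psel(self) -> str:
        return self.signal("psel")

    @property
    def port_declaration(self) -> str:
        """Port declaration text: sv interface, or flat port list."""
        if not self.flatten:
            return f"{self.interface_name}.{self.modport_name} {self.instance_name}"
        ports = [("input", "psel"), ("input", "penable"), ("output", "pready")]
        return ",\n".join(f"{d} wire {self.signal(n)}" for d, n in ports)


print(APBCpuif(flatten=True).psel, APBCpuif(flatten=False).psel)
```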


@@ -0,0 +1,60 @@
================================================================================
Things that need validation by the compiler
================================================================================
Many of these are probably already covered, but being paranoid.
Make a list of things as I think of them.
Mark these as follows:
X = Yes, confirmed that the compiler covers this
! = No! Confirmed that the compiler does not check this
(blank) = TBD
--------------------------------------------------------------------------------
X resetsignal width
reset signals shall have width of 1
X Field has no knowable value
- does not implement storage
- hw is not writable
- sw is readable
- No reset value specified
--> emit a warning?
! multiple field_reset in the same hierarchy
there can only be one signal declared with field_reset
in a given hierarchy
! multiple cpuif_reset in the same hierarchy
there can only be one signal declared with cpuif_reset
in a given hierarchy
! incrwidth/incrvalue & decrvalue/decrwidth
these pairs are mutually exclusive.
Make sure they are not both set after elaboration
Compiler checks for mutex within the same scope, but
I don't think I check for mutexes post-elaborate
... or, make these properties clear each-other on assignment
================================================================================
Things that need validation by this exporter
================================================================================
List of stuff in case I forget.
! = No! exporter does not enforce this yet
x = Yes! I already implemented this.
--------------------------------------------------------------------------------
! "bridge" addrmap not supported
export shall refuse to process an addrmap marked as a "bridge"
Only need to check top-level. Compiler will enforce that child nodes aren't bridges
cpuif_resets
! Warn/error on any signal with cpuif_reset set, that is not in the top-level
addrmap. At the very least, warn that it will be ignored
! multiple cpuif_reset
there can be only one cpuif reset


@@ -0,0 +1,51 @@
--------------------------------------------------------------------------------
Port Declaration
--------------------------------------------------------------------------------
Generates the port declaration of the module:
- Parameters
- rd/wr error response/data behavior
Do missed accesses cause a SLVERR?
Do reads respond with a magic value?
- Pipeline enables
Enable reg stages in various places
- RDL-derived Parameters:
Someday in the future if i ever get around to this: https://github.com/SystemRDL/systemrdl-compiler/issues/58
- Clock/Reset
Single clk
One or more resets
- CPU Bus Interface
Given the bus interface object, emits the IO
This can be flattened ports, or a SV Interface
Regardless, it shall be malleable so that the user can use their favorite
declaration style
- Hardware interface
Two options:
- 2-port struct interface
Everything is rolled into two unpacked structs - inputs and outputs
- Flattened --> NOT DOING
Flatten/Unroll everything
No. Not doing. I hate this and don't want to waste time implementing this.
This will NEVER be able to support parameterized regmaps, and just
creates a ton of corner cases I don't care to deal with.
Other IO Signals I need to be aware of:
any signals declared, and used in any references:
field.resetsignal
field.next
... etc ...
any signals declared and marked as cpuif_reset, or field_reset
These override the default rst
If both are defined, be sure to not emit the default
Pretty straightforward (see 17.1)
Also have some notes on this in my general Logbook
Will have to make a call on how these propagate if multiple defined
in different hierarchies
interrupt/halt outputs
See "Interrupts" logbook for explanation
addrmap.errextbus, regfile.errextbus, reg.errextbus
???
Apparently these are inputs


@@ -0,0 +1,77 @@
================================================================================
Summary
================================================================================
RTL interface that provides access to per-field context signals
Regarding signals:
I think RDL-declared signals should actually be part of the hwif input
structure.
Exceptions:
- if the signal instance is at the top-level, it will get promoted to the
top level port list for convenience, and therefore omitted from the struct
================================================================================
Naming Scheme
================================================================================
hwif_out
.my_regblock
.my_reg[X][Y]
.my_field
.value
.anded
hwif_in
.my_regblock
.my_reg[X][Y]
.my_field
.value
.we
.my_signal
.my_fieldreset_signal
================================================================================
Flattened mode? --> NO
================================================================================
If user wants a flattened list of ports,
still use the same hwif_in/out struct internally.
Rather than declaring hwif_in and hwif_out in the port list, declare it internally
Add a mapping layer in the body of the module that performs a ton of assign statements
to map flat signals <-> struct
Alternatively, don't do this at all.
If I want to add a flattened mode, generate a wrapper module instead.
Marking this as YAGNI for now.
================================================================================
IO Signals
================================================================================
Outputs:
field value
If hw readable
bitwise reductions
if anded, ored, xored == True, output a signal
swmod/swacc
event strobes
Inputs:
field value
If hw writable
we/wel
if either is boolean, and true
not part of external hwif if reference
mutually exclusive
hwclr/hwset
if either is boolean, and true
not part of external hwif if reference
incr/decr
if counter=true, generate BOTH
incrvalue/decrvalue
if either incrwidth/decrwidth are set
signals!
any signal instances instantiated in the scope
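The per-field IO rules above can be sketched as a function that maps a property dictionary to struct member names. The dictionary keys "hw_readable" and "hw_writable" are assumptions made for this sketch; the rest are the RDL property names from the list. A property set to a reference (anything other than the boolean True) is deliberately skipped, since it is not part of the external hwif.

```python
from typing import Any, Dict, List, Tuple


def hwif_members(props: Dict[str, Any]) -> Tuple[List[str], List[str]]:
    """Return (hwif_out members, hwif_in members) for one field."""
    outputs: List[str] = []
    inputs: List[str] = []

    if props.get("hw_readable"):
        outputs.append("value")
    for name in ("anded", "ored", "xored", "swmod", "swacc"):
        if props.get(name) is True:
            outputs.append(name)

    if props.get("hw_writable"):
        inputs.append("value")
    for name in ("we", "wel", "hwclr", "hwset"):
        if props.get(name) is True:  # a reference is not True, so it is skipped
            inputs.append(name)
    if props.get("counter") is True:
        inputs += ["incr", "decr"]  # generate BOTH
    if props.get("incrwidth") is not None:
        inputs.append("incrvalue")
    if props.get("decrwidth") is not None:
        inputs.append("decrvalue")
    return outputs, inputs


print(hwif_members({"hw_readable": True, "anded": True, "hw_writable": True, "we": True}))
```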


@@ -0,0 +1,72 @@
--------------------------------------------------------------------------------
CPU Bus interface layer
--------------------------------------------------------------------------------
Provides an abstraction layer between the outside SoC's bus interface, and the
internal register block's implementation.
Converts a user-selectable bus protocol to generic register file signals.
Upstream Signals:
Signal names are defined in the bus interface class and shall be malleable
to the user.
User can choose a flat signal interface, or a SV interface.
SV interface shall be easy to tweak since various orgs will use different
naming conventions in their library of interface definitions
Downstream Signals:
- cpuif_req
- Single-cycle pulse
- Qualifies the following child signals:
- cpuif_req_is_wr
1 denotes this is a write transfer
- cpuif_addr
Byte address
- cpuif_wr_data
- cpuif_wr_bitstrb
per-bit strobes
some protocols may opt to tie this to all 1's
- cpuif_rd_ack
- Single-cycle pulse
- Qualifies the following child signals:
- cpuif_rd_data
- cpuif_rd_err
- cpuif_wr_ack
- Single-cycle pulse
- Qualifies the following child signals:
- cpuif_wr_err
Misc thoughts
- Internal cpuif_* signals use a strobe-based protocol:
- Unknown, but fixed latency
- Makes for easy pipelining if needed
- Decided to keep cpuif_req signals common for read write:
This will allow address decode logic to be shared for read/write
Downside is split protocols like axi-lite can't have totally separate rd/wr
access lanes, but who cares?
- separate response strobes
Not necessary to use, but this lets me independently pipeline read/write paths.
read path will need more time if readback mux is large
- On multiple outstanding transactions
Currently, cpuif doesn't really support this. Goal was to make it easily pipelineable
without having to backfeed stall logic.
Could still be possible to do a "fly-by" pipeline with a more intelligent cpuif layer
Not worrying about this now.
Implementation:
Implement this mainly as a Jinja template.
Upstream bus intf signals are fetched via busif class properties. Ex:
{{busif.signal('pready')}} <= '1;
This allows the actual SV or flattened signal to be emitted
What protocols do I care about?
- AXI4 Lite
- Ignore AxPROT?
- APB3
- APB4
- Ignore pprot?
- AHB?
- Wishbone
- Generic
breakout the above signals as-is (reassign with a prefix or something)


@@ -0,0 +1,51 @@
--------------------------------------------------------------------------------
Address Decode layer
--------------------------------------------------------------------------------
A bunch of combinational address decodes that generate individual register
req strobes
Possible decode logic styles:
- Big case statement
+ Probably more sim-efficient
- Hard to do loop parameterization
- More annoying to do multiple regs per address
- Big always_comb + One if/else chain
+ Easy to nest loops & parameterize if needed
- sim has a lot to evaluate each time
- More annoying to do multiple regs per address
- implies precedence? Synth tools should be smart enough?
- Big always_comb + inline conditionals <---- DO THIS
+ Easy to nest loops & parameterize if needed
- sim has a lot to evaluate each time
+ Multiple regs per address possible
+ implies address decode parallelism.
?? Should I try using generate loops + assigns?
This would be more explicit parallelism, however some tools may
get upset at multiple assignments to a common struct
Implementation:
Jinja is inappropriate here
Very logic-heavy. Jinja may end up being annoying
Also, not much need for customization here
This may even make sense as a visitor that dumps lines
- visit each reg
- upon entering an array, create for loops
- upon exiting an array, emit 'end'
Make the strobe struct declared locally
No need for it to leave the block
Error handling
If no strobe generated, respond w error?
This is actually pretty expensive to do for writes.
Hold off on this for now.
Reads get this effectively for free in the readback mux.
Implement write response strobes back upstream to cpuif
Eventually allow for optional register stage for strobe struct
Will need to also pipeline the other cpuif signals
ok to discard the cpuif_addr. no longer needed
Downstream Signals:
- access strobes
Encase these into a struct datatype
- is_write + wr_data/wr_bitstrobe
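A sketch of the "visitor that dumps lines" idea, using the always_comb + inline conditional style. Reg is a hypothetical flattened description of a register (the real thing would walk compiler nodes), and "decoded_reg_strb" is a made-up name for the local strobe struct.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Reg:
    """Hypothetical register description."""
    name: str
    addr: int
    dims: List[int] = field(default_factory=list)  # array dimensions, if any
    stride: int = 4  # byte stride of the innermost dimension


def emit_decode(regs: List[Reg]) -> List[str]:
    lines = ["always_comb begin"]
    for reg in regs:
        # Byte stride of each dimension, outermost first.
        strides: List[int] = []
        s = reg.stride
        for dim in reversed(reg.dims):
            strides.insert(0, s)
            s *= dim
        indent, index, addr = "    ", "", f"'h{reg.addr:x}"
        # Upon entering an array, create for loops.
        for i, (dim, st) in enumerate(zip(reg.dims, strides)):
            lines.append(f"{indent}for(int i{i}=0; i{i}<{dim}; i{i}++) begin")
            indent += "    "
            index += f"[i{i}]"
            addr += f" + i{i}*'h{st:x}"
        lines.append(
            f"{indent}decoded_reg_strb.{reg.name}{index} = "
            f"cpuif_req & (cpuif_addr == {addr});"
        )
        # Upon exiting an array, emit 'end'.
        for _ in reg.dims:
            indent = indent[:-4]
            lines.append(f"{indent}end")
    lines.append("end")
    return lines


print("\n".join(emit_decode([Reg("ctrl", 0x0), Reg("chan", 0x10, [2, 4])])))
```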


@@ -0,0 +1,35 @@
--------------------------------------------------------------------------------
Field storage / next value layer
--------------------------------------------------------------------------------
Where all the magic happens!!
Any field that implements storage is defined here.
Bigass struct that only contains storage elements
Each field consists of:
- an always_ff block
- series of if/else statements that assign the next value in the storage element
Think of this as a flat list of "next state" conditions, ranked by their precedence as follows:
- reset
- sw access (if sw precedence)
- onread/onwrite
- hw access
- Counter
- next
- etc
- sw access (if hw precedence)
- onread/onwrite
TODO:
What about stuff like read-clear counters that can't lose a count?
In a traditional if/else chain, I need to be aware of the fact that it's a counter
when handling the swaccess case
Is it possible to code this in a way where I can isolate the need to know every nuanced case here?
this may actually only apply to counters...
This is trivial in a 2-process implementation, but i'd rather avoid the overheads
Implementation
Makes sense to use a listener class
Be sure to skip alias registers
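A sketch of how the precedence ranking could turn into an if/else chain. The condition and value strings are placeholders, and the reset condition is hardcoded to "rst" purely for illustration. The counter problem from the TODO is not addressed here.

```python
from typing import List, Tuple

Cond = Tuple[str, str]  # (condition expression, next value expression)


def order_conditions(sw: List[Cond], hw: List[Cond], sw_precedence: bool) -> List[Cond]:
    """Rank sw and hw conditions according to the field's precedence."""
    return sw + hw if sw_precedence else hw + sw


def emit_chain(target: str, reset_value: str, conds: List[Cond]) -> List[str]:
    """Emit the body of an always_ff block: reset first, then ranked conditions."""
    lines = ["if(rst) begin", f"    {target} <= {reset_value};"]
    for cond, value in conds:
        lines.append(f"end else if({cond}) begin")
        lines.append(f"    {target} <= {value};")
    lines.append("end")
    return lines


sw = [("sw_wr", "wr_data")]
hw = [("hw_we", "hw_value")]
print("\n".join(emit_chain("storage.f", "'0", order_conditions(sw, hw, sw_precedence=True))))
```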


@@ -0,0 +1,65 @@
--------------------------------------------------------------------------------
Readback mux layer
--------------------------------------------------------------------------------
Implementation:
- Big always_comb block
- Initialize default rd_data value
- Lotsa if statements that operate on reg strb to assign rd_data
- Merges all fields together into reg
- pulls value from storage element struct, or input struct
- Provision for optional flop stage?
Mux Strategy:
Flat case statement:
-- Cant parameterize
+ better performance?
Flatten array then mux:
- First, flatten ALL readback values into an array
Round up the size of the array to the next power of 2
needs to be fully addressable anyways!
This can be in a combinational block
Initialize the array to the default readback value
then, assign all register values. Use loops where necessary.
Append an extra 'is-valid' bit if I need to slverr on bad reads
- Next, use the read address as an index into this array
- If needed, I can do a staged decode!
Compute the most balanced fanin staging in Python. eg:
64 regs --mux--> 8x8 --mux--> 1
128 regs --mux--> 8x16 --mux--> 1
Favor smaller fanin first. Latter stage should have more fanin since routing congestion will be easier
256 regs --mux--> 16x16 --mux--> 1
- Potential sparseness of this makes me uncomfortable,
but its synthesis SEEMS like it would be really efficient!
- TODO: Rethink this
I feel like people will complain about this
It will likely also be pretty sim-inefficient?
Flat 1-hot array then OR reduce: <-- DO THIS
- Create a bus-wide flat array
eg: 32-bits x N readable registers
- Assign each element:
the readback value of each register
... masked by the register's access strobe
- I could also stuff an extra bit into the array that denotes the read is valid
A missed read will OR reduce down to a 0
- Finally, OR reduce all the elements in the array down to a flat 32-bit bus
- Retiming the large OR fanin can be done by chopping up the array into stages
for 2 stages, sqrt(N) gives each stage's fanin size. Round to favor
more fanin on 2nd stage
3 stages uses cube-root. etc...
- This has the benefit of re-using the address decode logic.
synth can choose to replicate logic if fanout is bad
WARNING:
Beware of read/write flop stage asymmetry & race conditions.
Eg. If a field is rclr, don't want to sample it after it gets read:
addr --> strb --> clear
addr --> loooong...retime --> sample rd value
Should guarantee that read-sampling happens at the same cycle as any read-modify
Forwards response strobe back up to cpu interface layer
Dont forget about alias registers here
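A behavioral model of the "flat 1-hot array then OR reduce" strategy, plus the two-stage fanin split. This is a Python sketch of the idea, not generated RTL; the rounding rule (floor of sqrt(N) for the first stage, remainder on the second) is my reading of "round to favor more fanin on 2nd stage".

```python
import math
from functools import reduce
from typing import List, Tuple


def readback(values: List[int], strobes: List[bool]) -> int:
    """Mask each register's readback value by its access strobe, then OR-reduce."""
    masked = [v if s else 0 for v, s in zip(values, strobes)]
    return reduce(lambda a, b: a | b, masked, 0)


def two_stage_fanin(n: int) -> Tuple[int, int]:
    """Split an N-input OR into two stages, favoring more fanin on the 2nd stage."""
    first = max(1, math.isqrt(n))
    second = math.ceil(n / first)
    return first, second


# One strobe hot: the matching register value comes out.
print(hex(readback([0x11, 0x22, 0x33], [False, True, False])))
# Missed read: nothing is strobed, so the result reduces to 0.
print(readback([0x11, 0x22, 0x33], [False, False, False]))
print(two_stage_fanin(64), two_stage_fanin(128))
```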


@@ -0,0 +1,9 @@
--------------------------------------------------------------------------------
Output Port mapping layer
--------------------------------------------------------------------------------
Assign to output struct port
Still TBD if this will actually be a distinct layer.
Cosmetically, this might be nicer to interleave with the field section above
Assign storage element & other derived values as requested by properties