basic framework

2021-06-01 21:51:24 -07:00
parent 292aec1c6e
commit 0d5b663f98
40 changed files with 1920 additions and 0 deletions
--- a/doc/logbooks/000-Main-Logbook
+++ b/doc/logbooks/000-Main-Logbook
@@ -0,0 +1,29 @@
+
+================================================================================
+Overarching philosophy
+================================================================================
+Encourage users to be able to tweak the design
+Templating:
+    Design templates to be small bite-size snippets, each with clear intent
+    Templates shall be extendable
+    Templates shall be easy/intuitive
+Output:
+    Output shall be beautiful. Consistent and clean formatting/whitespace builds trust
+    Output has comments. Users will be looking at it.
+
+================================================================================
+On templating...
+================================================================================
+Everything should be written out to a single file
+The only exception is if a struct port interface is used, then the appropriate
+struct package is also written out (or lumped into the same file?)
+
+Each layer should actually be its own separate template.
+
+Use helper functions to abstract away details about identifier implementation.
+Similarly, some fields will end up with additional port-level signals inferred.
+For example, if the user set the field's "anded=true" property
+the template would do something like:
+    {{output_signal(field, "anded")}} = &{{field_value(field)}};
+
+Basically, I'd define a ton of helper functions that return the signal identifier.
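A minimal sketch of the identifier-helper idea from 000-Main-Logbook, written in plain Python rather than Jinja. Only the names `output_signal` and `field_value` come from the notes; the `Field` class and the `field_storage` / `hwif_out` prefixes are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Field:
    """Hypothetical stand-in for a field node; it only carries a hierarchical path."""
    path: str


def field_value(field: Field) -> str:
    # Assumption: stored field values live in a struct named field_storage.
    return f"field_storage.{field.path}"


def output_signal(field: Field, prop: str) -> str:
    # Assumption: inferred port-level signals live in the hwif_out struct.
    return f"hwif_out.{field.path}.{prop}"


def render_anded(field: Field) -> str:
    # Plain-Python equivalent of the template line:
    #   {{output_signal(field, "anded")}} = &{{field_value(field)}};
    return f"{output_signal(field, 'anded')} = &{field_value(field)};"


print(render_anded(Field("my_reg.my_field")))
```

Keeping the naming decisions inside the helpers means a template never spells out an identifier itself, so the naming scheme can change in one place.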
--- a/doc/logbooks/Alpha-Beta
+++ b/doc/logbooks/Alpha-Beta
@@ -0,0 +1,10 @@
+Holy smokes this is complicated
+
+Keep this exporter in Alpha/Beta for a while
+Add some text in the readme or somewhere:
+    - No guarantees of correctness! This is always true with open source software,
+      but even more here!
+      Be sure to do your own validation before using this in production.
+    - Alpha means the implementation may change drastically!
+      Unlike official sem-ver, I am not making any guarantees on compatibility
+    - I need your help! Validating, finding edge cases, etc...
--- a/doc/logbooks/Interrupts
+++ b/doc/logbooks/Interrupts
@@ -0,0 +1,18 @@
+
+Interrupts seem to be pretty well-described.
+Basically...
+
+    - If a register contains one or more fields that use the intr property,
+      then it is implied to be an interrupt register
+        --> Add RegNode.has_intr and RegNode.has_halt properties?
+    - This register implies that there is an output irq signal that is propagated to the top, and it is the OR of all interrupt field bits
+    - BUT in the multilevel interrupt example, perhaps this output gets suppressed?
+        Suppress the output signal if Reg->intr gets referenced, since this means
+        the user is doing a multi-level interrupt.
+        This means that the register's interrupt signal is "consumed" by a second-level interrupt register
+
+    - WTF about the "halt" concept?
+      I assume this does NOT auto-imply an output?
+      Maybe only imply a default halt output if:
+        - an interrupt register has fields that use haltenable/haltmask
+        - AND the interrupt register's reg->halt has not been referenced
--- a/doc/logbooks/Program
+++ b/doc/logbooks/Program
@@ -0,0 +1,23 @@
+
+1. Scan design. Collect information
+    - Check for unsupported constructs. Throw errors as appropriate
+        - Uniform regwidth, accesswidth, etc.
+
+    - Collect reset signals
+        cpuif_reset, field_reset
+        explicitly assigned to field->resetsignal
+
+    - Collect any other misc user signals that are referenced in the design
+
+    - Top-level interrupts
+        Collect X & Y:
+            X = set of all registers that have an interrupt field
+            Y = set of all interrupt registers that are referenced by a field
+        Top level interrupt registers are the set in X, but not in Y
+        (and probably other caveats. See notes)
+
+2. Create intermediate template objects
+
+3. Render top-level IO struct package (if applicable)
+
+4. Render top-level module template
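The top-level interrupt rule in step 1 of the Program notes is a set difference. A small sketch, assuming registers are identified by hierarchical path strings (the paths below are invented), and ignoring the "other caveats" the notes mention:

```python
def top_level_interrupt_regs(intr_regs, referenced_intr_regs):
    """Return the registers in X but not in Y, sorted for stable output.

    intr_regs            -- X: registers that have an interrupt field
    referenced_intr_regs -- Y: interrupt registers referenced by a field
    """
    x = set(intr_regs)
    y = set(referenced_intr_regs)
    return sorted(x - y)


# Hypothetical two-level example: both block registers feed a global register,
# so only the global register drives a top-level irq output.
x = ["top.block_a.irq_stat", "top.block_b.irq_stat", "top.global_irq"]
y = ["top.block_a.irq_stat", "top.block_b.irq_stat"]
print(top_level_interrupt_regs(x, y))
```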
--- a/doc/logbooks/Resets
+++ b/doc/logbooks/Resets
@@ -0,0 +1,11 @@
+================================================================================
+Resets
+================================================================================
+use whatever is defined in RDL based on cpuif_reset and field_reset signals
+Otherwise, provide configuration that defines what the default is:
+    a single reset that is active high/low, or sync/async
+
+If cpuif_reset is specified, what do fields use?
+    I assume they still use the default reset separately?
+    YES. Agnisys appears to be wrong.
+    cpuif_reset has no influence on the fields' reset according to the spec
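A sketch of the reset selection described in the Resets notes, under the assumption that each reset is a small record and that the default comes from exporter configuration. The `Reset` class and `resolve_resets` are hypothetical names. Per-field `resetsignal` overrides are not modeled here.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Reset:
    """Hypothetical reset description."""
    identifier: str
    is_activehigh: bool
    is_async: bool


def resolve_resets(default: Reset,
                   cpuif_reset: Optional[Reset] = None,
                   field_reset: Optional[Reset] = None) -> Tuple[Reset, Reset]:
    """Return (reset used by the CPU interface, reset used by fields)."""
    cpuif = cpuif_reset if cpuif_reset is not None else default
    # cpuif_reset has no influence on fields: they fall back to the default,
    # not to cpuif_reset, when no field_reset is declared.
    fields = field_reset if field_reset is not None else default
    return cpuif, fields


default = Reset("rst", is_activehigh=True, is_async=True)
bus_rst = Reset("bus_rst_n", is_activehigh=False, is_async=False)
cpuif, fields = resolve_resets(default, cpuif_reset=bus_rst)
print(cpuif.identifier, fields.identifier)
```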
--- a/doc/logbooks/Signal
+++ b/doc/logbooks/Signal
@@ -0,0 +1,19 @@
+I need some sort of signal "dereferencer" that can be easily used to translate references
+to stuff via a normalized interface.
+
+For example, if RDL defines:
+    my_field->next = my_other_field
+Then in Python (or a template) I could do:
+    x = my_field.get_property("next")
+    y = dereferencer.get(x)
+and trust that I'll get a value/identifier/whatever that accurately represents
+the value being referenced
+
+Values:
+    If X is a field reference:
+        ... that implements storage, return its DFF value reference
+        ... no storage, but has a hw input, grab from the hwif input
+        ... no storage, and no hw input, return its constant reset value?
+    If X is a property reference... do what's right...
+        my_field->anded === (&path.to.my_field)
+    if X is a static value, return the literal
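One way the dereferencer rules in the Signal notes could look, as a sketch only. `FieldRef`, `PropRef`, and the `field_storage` / `hwif_in` naming are assumptions; a real version would take systemrdl node objects instead of these stand-ins.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class FieldRef:
    """Hypothetical stand-in for a reference to a field."""
    path: str
    implements_storage: bool = False
    has_hw_input: bool = False
    reset_value: Optional[int] = None
    width: int = 1


@dataclass
class PropRef:
    """Hypothetical stand-in for a property reference such as my_field->anded."""
    field: FieldRef
    prop: str


class Dereferencer:
    REDUCTIONS = {"anded": "&", "ored": "|", "xored": "^"}

    def get(self, ref: Union[FieldRef, PropRef, int]) -> str:
        if isinstance(ref, FieldRef):
            if ref.implements_storage:
                return f"field_storage.{ref.path}"      # DFF value
            if ref.has_hw_input:
                return f"hwif_in.{ref.path}.value"      # hwif input
            if ref.reset_value is not None:
                return f"{ref.width}'h{ref.reset_value:x}"  # constant reset value
            raise ValueError(f"{ref.path} has no knowable value")
        if isinstance(ref, PropRef):
            op = self.REDUCTIONS.get(ref.prop)
            if op is None:
                raise NotImplementedError(ref.prop)
            return f"({op}{self.get(ref.field)})"
        return str(ref)  # static value: return the literal


deref = Dereferencer()
other = FieldRef("my_other_field", implements_storage=True)
print(deref.get(other))
print(deref.get(PropRef(other, "anded")))
```

Only the three bitwise reductions are handled for property references; every other property would need its own rule.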
--- a/doc/logbooks/Some
+++ b/doc/logbooks/Some
@@ -0,0 +1,45 @@
+
+================================================================================
+Signal wrapper classes
+================================================================================
+Define a signal wrapper class that is easier to use in templates.
+
+Provides the following properties:
+    .is_async
+    .is_activehigh
+    .identifier
+        Returns the Verilog identifier string for this signal
+    .activehigh_identifier
+        Normalizes identifier to active-high logic
+        same as .identifier, but prepends '~' if is_activehigh = False
+    .width
+
+Default reset class instance:
+    Extends the base class
+    Hardcodes as follows:
+        .is_async = True
+        .is_activehigh = True
+        .identifier = "rst"
+        .width = 1
+
+Wrapper classes
+    Wrap around a systemrdl.SignalNode
+
+
+================================================================================
+CPU Interface Class
+================================================================================
+Entry point class for a given CPU interface type (APB, AXI, etc..)
+
+Does the following:
+    - Provide linkage to the logic implementation Jinja template
+    - Interface signal identifier properties
+        Aliases for signal identifiers to allow flat or sv-interface style
+        eg:
+            self.psel --> "s_apb_psel" or "s_apb.psel"
+        if sv interface, use the interface name class property
+    - Port declaration text property
+        declare as sv interface, or flat port list
+        If flattened, should use signal identifier properties
+        If sv interface, I should breakout the interface & modport name as
+        class properties for easy user-override
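A sketch of the signal wrapper base class and the default reset instance from the "Signal wrapper classes" notes. The constructor signature is an assumption, and the variant that wraps a `systemrdl.SignalNode` is left out because its details are not described in the notes.

```python
class SignalBase:
    """Hypothetical base class exposing the properties listed in the notes."""

    def __init__(self, identifier, width=1, is_async=False, is_activehigh=True):
        self.identifier = identifier
        self.width = width
        self.is_async = is_async
        self.is_activehigh = is_activehigh

    @property
    def activehigh_identifier(self):
        # Normalize to active-high logic: invert active-low signals.
        if self.is_activehigh:
            return self.identifier
        return "~" + self.identifier


class DefaultReset(SignalBase):
    """Default reset with the hardcoded values from the notes."""

    def __init__(self):
        super().__init__("rst", width=1, is_async=True, is_activehigh=True)


print(DefaultReset().activehigh_identifier)
print(SignalBase("rst_n", is_activehigh=False).activehigh_identifier)
```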
--- a/doc/logbooks/Validation
+++ b/doc/logbooks/Validation
@@ -0,0 +1,60 @@
+
+================================================================================
+Things that need validation by the compiler
+================================================================================
+Many of these are probably already covered, but being paranoid.
+Make a list of things as I think of them.
+
+Mark these as follows:
+    X = Yes, confirmed that the compiler covers this
+    ! = No! Confirmed that the compiler does not check this
+    (blank) = TBD
+
+--------------------------------------------------------------------------------
+
+X resetsignal width
+    reset signals shall have width of 1
+
+X Field has no knowable value
+    - does not implement storage
+    - hw is not writable
+    - sw is readable
+    - No reset value specified
+
+    --> emit a warning?
+
+! multiple field_reset in the same hierarchy
+    there can only be one signal declared with field_reset
+    in a given hierarchy
+
+! multiple cpuif_reset in the same hierarchy
+    there can only be one signal declared with cpuif_reset
+    in a given hierarchy
+
+! incrwidth/incrvalue & decrvalue/decrwidth
+    these pairs are mutually exclusive.
+    Make sure they are not both set after elaboration
+    Compiler checks for mutex within the same scope, but
+    I don't think I check for mutexes post-elaborate
+
+    ... or, make these properties clear each-other on assignment
+
+================================================================================
+Things that need validation by this exporter
+================================================================================
+List of stuff in case I forget.
+
+    ! = No! exporter does not enforce this yet
+    x = Yes! I already implemented this.
+
+--------------------------------------------------------------------------------
+! "bridge" addrmap not supported
+    export shall refuse to process an addrmap marked as a "bridge"
+    Only need to check top-level. Compiler will enforce that child nodes aren't bridges
+
+cpuif_resets
+    ! Warn/error on any signal with cpuif_reset set, that is not in the top-level
+        addrmap. At the very least, warn that it will be ignored
+
+    ! multiple cpuif_reset
+        there can be only one cpuif reset
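The two exporter-side cpuif_reset checks listed above could be sketched as follows. The input format (pairs of signal path and parent path) and the message wording are assumptions; a real check would walk the compiled node tree.

```python
def check_cpuif_resets(signals, top_path):
    """Check signals that have cpuif_reset set.

    signals  -- iterable of (signal_path, parent_path) pairs (hypothetical format)
    top_path -- path of the top-level addrmap
    Returns (errors, warnings) as two lists of message strings.
    """
    errors, warnings = [], []
    top_level = []
    for signal_path, parent_path in signals:
        if parent_path == top_path:
            top_level.append(signal_path)
        else:
            warnings.append(
                f"{signal_path}: cpuif_reset outside the top-level addrmap is ignored"
            )
    if len(top_level) > 1:
        errors.append("multiple cpuif_reset signals: " + ", ".join(top_level))
    return errors, warnings


errors, warnings = check_cpuif_resets(
    [("top.rst_a", "top"), ("top.rst_b", "top"), ("top.blk.rst_c", "top.blk")],
    "top",
)
print(errors)
print(warnings)
```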
--- a/doc/logbooks/template-layers/1-port-declaration
+++ b/doc/logbooks/template-layers/1-port-declaration
@@ -0,0 +1,51 @@
+--------------------------------------------------------------------------------
+Port Declaration
+--------------------------------------------------------------------------------
+Generates the port declaration of the module:
+    - Parameters
+        - rd/wr error response/data behavior
+            Do missed accesses cause a SLVERR?
+            Do reads respond with a magic value?
+        - Pipeline enables
+            Enable reg stages in various places
+
+    - RDL-derived Parameters:
+        Someday in the future if I ever get around to this: https://github.com/SystemRDL/systemrdl-compiler/issues/58
+
+    - Clock/Reset
+        Single clk
+        One or more resets
+
+    - CPU Bus Interface
+        Given the bus interface object, emits the IO
+        This can be flattened ports, or a SV Interface
+        Regardless, it shall be malleable so that the user can use their favorite
+        declaration style
+
+    - Hardware interface
+        Two options:
+            - 2-port struct interface
+                Everything is rolled into two unpacked structs - inputs and outputs
+            - Flattened --> NOT DOING
+                Flatten/Unroll everything
+                No. Not doing. I hate this and don't want to waste time implementing this.
+                This will NEVER be able to support parameterized regmaps, and just
+                creates a ton of corner cases I don't care to deal with.
+
+Other IO Signals I need to be aware of:
+    any signals declared, and used in any references:
+        field.resetsignal
+        field.next
+        ... etc ...
+    any signals declared and marked as cpuif_reset, or field_reset
+        These override the default rst
+        If both are defined, be sure to not emit the default
+        Pretty straightforward (see 17.1)
+        Also have some notes on this in my general Logbook
+            Will have to make a call on how these propagate if multiple defined
+            in different hierarchies
+    interrupt/halt outputs
+        See "Interrupts" logbook for explanation
+    addrmap.errextbus, regfile.errextbus, reg.errextbus
+        ???
+        Apparently these are inputs
--- a/doc/logbooks/template-layers/1.1.hardware-interface
+++ b/doc/logbooks/template-layers/1.1.hardware-interface
@@ -0,0 +1,77 @@
+================================================================================
+Summary
+================================================================================
+
+RTL interface that provides access to per-field context signals
+
+Regarding signals:
+    I think RDL-declared signals should actually be part of the hwif input
+    structure.
+    Exceptions:
+        - if the signal instance is at the top-level, it will get promoted to the
+          top level port list for convenience, and therefore omitted from the struct
+
+================================================================================
+Naming Scheme
+================================================================================
+
+hwif_out
+    .my_regblock
+        .my_reg[X][Y]
+            .my_field
+                .value
+                .anded
+
+hwif_in
+    .my_regblock
+        .my_reg[X][Y]
+            .my_field
+                .value
+                .we
+                .my_signal
+        .my_fieldreset_signal
+
+================================================================================
+Flattened mode? --> NO
+================================================================================
+If user wants a flattened list of ports,
+still use the same hwif_in/out struct internally.
+Rather than declaring hwif_in and hwif_out in the port list, declare it internally
+
+Add a mapping layer in the body of the module that performs a ton of assign statements
+to map flat signals <-> struct
+
+Alternatively, don't do this at all.
+If I want to add a flattened mode, generate a wrapper module instead.
+
+Marking this as YAGNI for now.
+
+
+================================================================================
+IO Signals
+================================================================================
+
+Outputs:
+    field value
+        If hw readable
+    bitwise reductions
+        if anded, ored, xored == True, output a signal
+    swmod/swacc
+        event strobes
+
+Inputs:
+    field value
+        If hw writable
+    we/wel
+        if either is boolean, and true
+        not part of external hwif if reference
+        mutually exclusive
+    hwclr/hwset
+        if either is boolean, and true
+        not part of external hwif if reference
+    incr/decr
+        if counter=true, generate BOTH
+    incrvalue/decrvalue
+        if either incrwidth/decrwidth are set
+    signals!
+        any signal instances instantiated in the scope
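A sketch of how the "IO Signals" rules above might turn a field's properties into hwif struct members. Properties are modeled as a plain dict, where `True` means a boolean property and any other value stands for a reference; that encoding is an assumption. So are two readings of the notes: that swmod/swacc strobes are emitted when the property is true, and that incrvalue/decrvalue are decided independently. Signal instances are not modeled.

```python
def hwif_members(props, hw_readable, hw_writable):
    """Return (output member names, input member names) for one field."""
    outputs, inputs = [], []

    if hw_readable:
        outputs.append("value")
    for name in ("anded", "ored", "xored", "swmod", "swacc"):
        if props.get(name) is True:
            outputs.append(name)

    if hw_writable:
        inputs.append("value")
    for name in ("we", "wel", "hwclr", "hwset"):
        # Only a boolean True creates an external input. A reference is
        # resolved internally and stays out of the external hwif.
        if props.get(name) is True:
            inputs.append(name)
    if props.get("counter") is True:
        inputs += ["incr", "decr"]  # counters get BOTH strobes
    if props.get("incrwidth") is not None:
        inputs.append("incrvalue")
    if props.get("decrwidth") is not None:
        inputs.append("decrvalue")
    return outputs, inputs


props = {"anded": True, "we": True, "hwclr": "some.other.field",
         "counter": True, "incrwidth": 4}
print(hwif_members(props, hw_readable=True, hw_writable=True))
```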
--- a/doc/logbooks/template-layers/2-CPUIF
+++ b/doc/logbooks/template-layers/2-CPUIF
@@ -0,0 +1,72 @@
+--------------------------------------------------------------------------------
+CPU Bus interface layer
+--------------------------------------------------------------------------------
+Provides an abstraction layer between the outside SoC's bus interface, and the
+internal register block's implementation.
+Converts a user-selectable bus protocol to generic register file signals.
+
+Upstream Signals:
+    Signal names are defined in the bus interface class and shall be malleable
+    to the user.
+    User can choose a flat signal interface, or a SV interface.
+    SV interface shall be easy to tweak since various orgs will use different
+    naming conventions in their library of interface definitions
+
+Downstream Signals:
+    - cpuif_req
+        - Single-cycle pulse
+        - Qualifies the following child signals:
+            - cpuif_req_is_wr
+                1 denotes this is a write transfer
+            - cpuif_addr
+                Byte address
+            - cpuif_wr_data
+            - cpuif_wr_bitstrb
+                per-bit strobes
+                some protocols may opt to tie this to all 1's
+    - cpuif_rd_ack
+        - Single-cycle pulse
+        - Qualifies the following child signals:
+            - cpuif_rd_data
+            - cpuif_rd_err
+
+    - cpuif_wr_ack
+        - Single-cycle pulse
+        - Qualifies the following child signals:
+            - cpuif_wr_err
+
+
+Misc thoughts
+- Internal cpuif_* signals use a strobe-based protocol:
+    - Unknown, but fixed latency
+    - Makes for easy pipelining if needed
+- Decided to keep cpuif_req signals common for read/write:
+    This will allow address decode logic to be shared for read/write
+    Downside is split protocols like axi-lite can't have totally separate rd/wr
+    access lanes, but who cares?
+- separate response strobes
+    Not necessary to use, but this lets me independently pipeline read/write paths.
+    read path will need more time if readback mux is large
+- On multiple outstanding transactions
+    Currently, cpuif doesn't really support this. Goal was to make it easily pipelineable
+    without having to backfeed stall logic.
+    Could still be possible to do a "fly-by" pipeline with a more intelligent cpuif layer
+    Not worrying about this now.
+
+
+Implementation:
+    Implement this mainly as a Jinja template.
+    Upstream bus intf signals are fetched via busif class properties. Ex:
+        {{busif.signal('pready')}} <= '1;
+    This allows the actual SV or flattened signal to be emitted
+
+What protocols do I care about?
+    - AXI4 Lite
+        - Ignore AxPROT?
+    - APB3
+    - APB4
+        - Ignore pprot?
+    - AHB?
+    - Wishbone
+    - Generic
+        breakout the above signals as-is (reassign with a prefix or something)
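A sketch of the `busif.signal(...)` lookup used in the Implementation note above, showing how a single call can yield either a flat or an SV-interface identifier. The class name `ApbBusIf` and its constructor arguments are hypothetical; the `s_apb` naming follows the example in the "CPU Interface Class" notes.

```python
class ApbBusIf:
    """Hypothetical APB bus interface class with overridable naming."""

    def __init__(self, use_sv_interface, interface_name="s_apb"):
        self.use_sv_interface = use_sv_interface
        # Kept as an attribute so users can override the interface name.
        self.interface_name = interface_name

    def signal(self, name):
        if self.use_sv_interface:
            return f"{self.interface_name}.{name}"   # SV interface member
        return f"{self.interface_name}_{name}"       # flattened port


print(ApbBusIf(use_sv_interface=False).signal("psel"))
print(ApbBusIf(use_sv_interface=True).signal("psel"))
```

A template line such as `{{busif.signal('pready')}} <= '1;` then renders correctly in both styles without branching inside the template.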
--- a/doc/logbooks/template-layers/3-address-decode
+++ b/doc/logbooks/template-layers/3-address-decode
@@ -0,0 +1,51 @@
+
+--------------------------------------------------------------------------------
+Address Decode layer
+--------------------------------------------------------------------------------
+A bunch of combinational address decodes that generate individual register
+req strobes
+
+Possible decode logic styles:
+    - Big case statement
+        + Probably more sim-efficient
+        - Hard to do loop parameterization
+        - More annoying to do multiple regs per address
+    - Big always_comb + One if/else chain
+        + Easy to nest loops & parameterize if needed
+        - sim has a lot to evaluate each time
+        - More annoying to do multiple regs per address
+        - implies precedence? Synth tools should be smart enough?
+    - Big always_comb + inline conditionals <---- DO THIS
+        + Easy to nest loops & parameterize if needed
+        - sim has a lot to evaluate each time
+        + Multiple regs per address possible
+        + implies address decode parallelism.
+        ?? Should I try using generate loops + assigns?
+            This would be more explicit parallelism, however some tools may
+            get upset at multiple assignments to a common struct
+
+Implementation:
+    Jinja is inappropriate here
+        Very logic-heavy. Jinja may end up being annoying
+        Also, not much need for customization here
+        This may even make sense as a visitor that dumps lines
+            - visit each reg
+            - upon entering an array, create for loops
+            - upon exiting an array, emit 'end'
+    Make the strobe struct declared locally
+        No need for it to leave the block
+    Error handling
+        If no strobe generated, respond w error?
+        This is actually pretty expensive to do for writes.
+            Hold off on this for now.
+            Reads get this effectively for free in the readback mux.
+    Implement write response strobes back upstream to cpuif
+    Eventually allow for optional register stage for strobe struct
+        Will need to also pipeline the other cpuif signals
+        ok to discard the cpuif_addr. no longer needed
+
+
+Downstream Signals:
+    - access strobes
+        Encase these into a struct datatype
+    - is_write + wr_data/wr_bitstrobe
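A sketch of the "visitor that dumps lines" idea for the chosen decode style (one always_comb with an independent assignment per strobe). It is simplified to a flat list of registers with at most one array dimension; the `Reg` class, the `decoded_reg_strb` struct name, and the emitted SystemVerilog text are all illustrative assumptions.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Reg:
    """Hypothetical register description."""
    name: str
    offset: int
    array_len: Optional[int] = None  # None means the register is not an array
    stride: int = 0


def emit_decode(regs: List[Reg]) -> List[str]:
    lines = ["always_comb begin"]
    for reg in regs:
        if reg.array_len is None:
            lines.append(
                f"    decoded_reg_strb.{reg.name} = "
                f"cpuif_req & (cpuif_addr == 'h{reg.offset:x});"
            )
        else:
            # Entering an array: open a for loop.
            lines.append(f"    for(int i = 0; i < {reg.array_len}; i++) begin")
            lines.append(
                f"        decoded_reg_strb.{reg.name}[i] = "
                f"cpuif_req & (cpuif_addr == 'h{reg.offset:x} + i*'h{reg.stride:x});"
            )
            # Exiting the array: emit 'end'.
            lines.append("    end")
    lines.append("end")
    return lines


print("\n".join(emit_decode([Reg("ctrl", 0x0), Reg("chan", 0x10, array_len=4, stride=4)])))
```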
--- a/doc/logbooks/template-layers/4-fields
+++ b/doc/logbooks/template-layers/4-fields
@@ -0,0 +1,35 @@
+--------------------------------------------------------------------------------
+Field storage / next value layer
+--------------------------------------------------------------------------------
+Where all the magic happens!!
+
+Any field that implements storage is defined here.
+Bigass struct that only contains storage elements
+
+Each field consists of:
+    - an always_ff block
+    - series of if/else statements that assign the next value in the storage element
+        Think of this as a flat list of "next state" conditions, ranked by their precedence as follows:
+            - reset
+            - sw access (if sw precedence)
+                - onread/onwrite
+            - hw access
+                - Counter
+                - next
+                - etc
+            - sw access (if hw precedence)
+                - onread/onwrite
+
+TODO:
+    What about stuff like read-clear counters that can't lose a count?
+    In a traditional if/else chain, I need to be aware of the fact that it's a counter
+    when handling the swaccess case
+    Is it possible to code this in a way where I can isolate the need to know every nuanced case here?
+        this may actually only apply to counters...
+    This is trivial in a 2-process implementation, but I'd rather avoid the overheads
+
+
+Implementation
+    Makes sense to use a listener class
+
+Be sure to skip alias registers
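A sketch of the ranked "next state" list from the field layer notes, plus a tiny emitter that turns it into one always_ff block. Conditions are plain (condition, next value) string pairs. All signal names are invented, a synchronous reset on `clk` is assumed, and the read-clear counter problem from the TODO is not addressed.

```python
from typing import List, Tuple

Condition = Tuple[str, str]  # (condition expression, next-value expression)


def rank_conditions(reset: Condition, sw: List[Condition], hw: List[Condition],
                    sw_precedence: bool) -> List[Condition]:
    """Order conditions: reset first, then sw/hw according to precedence."""
    if sw_precedence:
        return [reset] + sw + hw
    return [reset] + hw + sw


def emit_field_logic(storage: str, conditions: List[Condition]) -> List[str]:
    """Emit one always_ff block containing a flat if/else chain."""
    lines = ["always_ff @(posedge clk) begin"]
    for i, (cond, value) in enumerate(conditions):
        keyword = "if" if i == 0 else "end else if"
        lines.append(f"    {keyword}({cond}) begin")
        lines.append(f"        {storage} <= {value};")
    lines.append("    end")
    lines.append("end")
    return lines


conds = rank_conditions(
    reset=("rst", "'0"),
    sw=[("sw_wr", "wr_data")],
    hw=[("hwif_in.f.we", "hwif_in.f.value")],
    sw_precedence=False,
)
print("\n".join(emit_field_logic("field_storage.f", conds)))
```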
--- a/doc/logbooks/template-layers/5-readback-mux
+++ b/doc/logbooks/template-layers/5-readback-mux
@@ -0,0 +1,65 @@
+--------------------------------------------------------------------------------
+Readback mux layer
+--------------------------------------------------------------------------------
+
+Implementation:
+    - Big always_comb block
+    - Initialize default rd_data value
+    - Lotsa if statements that operate on reg strb to assign rd_data
+    - Merges all fields together into reg
+    - pulls value from storage element struct, or input struct
+    - Provision for optional flop stage?
+
+Mux Strategy:
+    Flat case statement:
+        -- Can't parameterize
+        + better performance?
+
+    Flatten array then mux:
+        - First, flatten ALL readback values into an array
+            Round up the size of the array to the next power of 2
+                needs to be fully addressable anyways!
+            This can be in a combinational block
+            Initialize the array to the default readback value
+            then, assign all register values. Use loops where necessary.
+            Append an extra 'is-valid' bit if I need to slverr on bad reads
+        - Next, use the read address as an index into this array
+        - If needed, I can do a staged decode!
+            Compute the most balanced fanin staging in Python. eg:
+                64 regs --mux--> 8x8 --mux--> 1
+                128 regs --mux--> 8x16 --mux--> 1
+                    Favor smaller fanin first. Latter stage should have more fanin since routing congestion will be easier
+                256 regs --mux--> 16x16 --mux--> 1
+        - Potential sparseness of this makes me uncomfortable,
+          but its synthesis SEEMS like it would be really efficient!
+        - TODO: Rethink this
+            I feel like people will complain about this
+            It will likely also be pretty sim-inefficient?
+    Flat 1-hot array then OR reduce: <-- DO THIS
+        - Create a bus-wide flat array
+            eg: 32-bits x N readable registers
+        - Assign each element:
+            the readback value of each register
+            ... masked by the register's access strobe
+        - I could also stuff an extra bit into the array that denotes the read is valid
+            A missed read will OR reduce down to a 0
+        - Finally, OR reduce all the elements in the array down to a flat 32-bit bus
+        - Retiming the large OR fanin can be done by chopping up the array into stages
+            for 2 stages, sqrt(N) gives each stage's fanin size. Round to favor
+            more fanin on 2nd stage
+            3 stages uses cube-root. etc...
+        - This has the benefit of re-using the address decode logic.
+          synth can choose to replicate logic if fanout is bad
+
+
+WARNING:
+    Beware of read/write flop stage asymmetry & race conditions.
+    Eg. If a field is rclr, don't want to sample it after it gets read:
+        addr --> strb --> clear
+        addr --> loooong...retime --> sample rd value
+    Should guarantee that read-sampling happens at the same cycle as any read-modify
+
+
+Forwards response strobe back up to cpu interface layer
+
+Don't forget about alias registers here
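A behavioral model, in Python, of the "flat 1-hot array then OR reduce" strategy chosen above, plus the sqrt-based two-stage fanin split. It only models the arithmetic, not the RTL. The function names and the placement of the valid bit just above the data bits are assumptions. `math.isqrt` needs Python 3.8 or newer.

```python
import math
from functools import reduce


def readback(reg_values, strobes, width=32):
    """OR-reduce strobe-masked register values.

    Returns (rd_data, rd_valid). A read that hits no register reduces to
    (0, False), which is how a missed read can be detected.
    """
    mask = (1 << width) - 1
    valid_bit = 1 << width  # extra bit appended above the data bits
    entries = [
        ((value & mask) | valid_bit) if strobe else 0
        for value, strobe in zip(reg_values, strobes)
    ]
    merged = reduce(lambda a, b: a | b, entries, 0)
    return merged & mask, bool(merged & valid_bit)


def two_stage_fanin(n):
    """Split an n-input OR into two stages, rounding so that the second
    stage gets the larger fanin."""
    first = max(1, math.isqrt(n))
    second = -(-n // first)  # ceiling division
    return first, second


print(readback([0x11, 0x22, 0x33], [False, True, False]))
print(two_stage_fanin(64), two_stage_fanin(128))
```

With plain sqrt rounding, 128 registers split into 11 and 12, not the power-of-two 8x16 shown in the flattened-array notes; restricting fanins to powers of two would be a separate design choice.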
--- a/doc/logbooks/template-layers/6-output-port-mapping
+++ b/doc/logbooks/template-layers/6-output-port-mapping
@@ -0,0 +1,9 @@
+--------------------------------------------------------------------------------
+Output Port mapping layer
+--------------------------------------------------------------------------------
+
+Assign to output struct port
+
+Still TBD if this will actually be a distinct layer.
+Cosmetically, this might be nicer to interleave with the field section above
+Assign storage element & other derived values as requested by properties