documentation

Alex Mykyta
2022-02-21 22:16:56 -08:00
parent 0fa26f2030
commit c3bfc2d416
14 changed files with 251 additions and 37 deletions

5
docs/api.rst Normal file

@@ -0,0 +1,5 @@
Exporter API
============
.. autoclass:: peakrdl.regblock.RegblockExporter
:members:


@@ -49,6 +49,10 @@ fanin re-timing stage can be enabled. This stage is automatically inserted at a
balanced point in the read-data reduction so that fanin and logic-levels are
optimally reduced.
.. figure:: diagrams/readback.png
:width: 65%
:align: center
A second optional read response retiming register can be enabled in-line with the
path back to the CPU interface layer. This can be useful if the CPU interface protocol
used has a fully combinational response path, and the design's complexity requires


@@ -29,6 +29,8 @@ author = 'Alex Mykyta'
# extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
# ones.
extensions = [
'sphinx.ext.autodoc',
'sphinx.ext.napoleon',
"sphinxcontrib.wavedrom",
]
render_using_wavedrompy = True


@@ -1,9 +0,0 @@
CPU Interface Addressing
========================
TODO: write about the following:
* cpuif addressing is always 0-based (aka relative to the block's root)
* It is up to the decoder to handle the offset
* Address bus width is pruned down
* recommend that the decoder/interconnect reserve a full ^2 block of addresses to simplify decoding


@@ -1,11 +1,31 @@
AMBA APB3
=========
AMBA 3 APB
==========
TODO: Describe the following
Implements the register block using an
`AMBA 3 APB <https://developer.arm.com/documentation/ihi0024/b/Introduction/About-the-AMBA-3-APB>`_
CPU interface.
* List of interface signals
The APB3 CPU interface comes in two i/o port flavors:
* interface name & modports (link to advanced topics in case user wants to override)
* flattened equivalents
SystemVerilog Interface
Class: :class:`peakrdl.regblock.cpuif.apb3.APB3_Cpuif`
* Download link to SV interface definition
Interface Definition: :download:`apb3_intf.sv <../../test/lib/cpuifs/apb3/apb3_intf.sv>`
Flattened inputs/outputs
Flattens the interface into discrete input and output ports.
Class: :class:`peakrdl.regblock.cpuif.apb3.APB3_Cpuif_flattened`
.. warning::
Some IP vendors incorrectly implement the address signalling by assuming
word addresses (each increment of ``PADDR`` selects the next word).
For this exporter, values on the interface's ``PADDR`` input are interpreted
as byte addresses (a 32-bit APB bus increments ``PADDR`` in steps of 4).
Although the APB protocol does not allow unaligned transfers, this is in
accordance with the official AMBA bus specification.
Be sure to double-check the interpretation of your interconnect IP. A simple
bit-shift operation can be used to correct this if necessary.
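A minimal sketch of the bit-shift correction mentioned in the warning, assuming a 32-bit data bus (4 bytes per word). The function and constant names are hypothetical and only illustrate the arithmetic; in a real system the shift would live in the interconnect glue logic.

```python
BYTES_PER_WORD = 4  # assumption: 32-bit APB data bus
SHIFT = BYTES_PER_WORD.bit_length() - 1  # log2(4) == 2


def word_to_byte_addr(word_addr: int) -> int:
    """Convert a word-indexed address into the byte address expected on PADDR."""
    return word_addr << SHIFT


def byte_to_word_addr(byte_addr: int) -> int:
    """Convert a byte address back into a word index, rejecting unaligned values."""
    if byte_addr % BYTES_PER_WORD:
        raise ValueError("APB does not allow unaligned transfers")
    return byte_addr >> SHIFT


# Word 3 of a 32-bit register map sits at byte offset 0xC.
print(hex(word_to_byte_addr(0x3)))
```

If the interconnect emits word addresses, applying the equivalent of ``word_to_byte_addr`` before driving ``PADDR`` should restore the byte-address interpretation described above.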


@@ -1,11 +1,29 @@
AMBA AXI4-Lite
==============
TODO: Describe the following
Implements the register block using an
`AMBA AXI4-Lite <https://developer.arm.com/documentation/ihi0022/e/AMBA-AXI4-Lite-Interface-Specification>`_
CPU interface.
* List of interface signals
The AXI4-Lite CPU interface comes in two i/o port flavors:
* interface name & modports (link to advanced topics in case user wants to override)
* flattened equivalents
SystemVerilog Interface
Class: :class:`peakrdl.regblock.cpuif.axi4lite.AXI4Lite_Cpuif`
* Download link to SV interface definition
Interface Definition: :download:`axi4lite_intf.sv <../../test/lib/cpuifs/axi4lite/axi4lite_intf.sv>`
Flattened inputs/outputs
Flattens the interface into discrete input and output ports.
Class: :class:`peakrdl.regblock.cpuif.axi4lite.AXI4Lite_Cpuif_flattened`
Pipelined Performance
---------------------
This implementation of the AXI4-Lite interface supports transaction pipelining
which can significantly improve performance of back-to-back transfers.
In order to support transaction pipelining, the CPU interface will accept multiple
concurrent transactions. The number of outstanding transactions allowed is automatically
determined based on the register file pipeline depth (affected by retiming options),
and influences the depth of the internal transaction response skid buffer.
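The AXI4-Lite template touched by this commit sizes its in-flight transaction counter as ``clog2(max_outstanding + 1)`` bits. A minimal Python sketch of that arithmetic follows. The helper names are hypothetical, and how ``max_outstanding`` itself is derived from the pipeline depth is not shown here.

```python
def clog2(n: int) -> int:
    """Ceiling of log2(n) for n >= 1."""
    if n < 1:
        raise ValueError("clog2 is only defined here for n >= 1")
    return (n - 1).bit_length()


def in_flight_counter_width(max_outstanding: int) -> int:
    """Bits needed to count from 0 up to max_outstanding inclusive."""
    return clog2(max_outstanding + 1)


# Print the counter width for a few illustrative outstanding-transaction limits.
for limit in (1, 2, 3, 4):
    print(limit, in_flight_counter_width(limit))
```

The ``+ 1`` accounts for the counter having to represent both zero and the maximum number of transactions in flight.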


@@ -0,0 +1,26 @@
Introduction
============
The CPU interface logic layer provides an abstraction between the
application-specific bus protocol and the internal register file logic.
When exporting a design, you can select from a variety of popular CPU interface
protocols. These are described in more detail in the pages that follow.
Addressing
^^^^^^^^^^
The regblock exporter will always generate its address decoding logic using local
address offsets. Your system interconnect shall handle the absolute address offset
of your device and present addresses to the regblock that include only the
local offset.
For example, consider a fictional AXI4-Lite device that:
- Consumes 4 kB of address space (``0x000``-``0xFFF``).
- The device is instantiated in your system at global address ``0x80_0000``-``0x80_0FFF``.
- After decoding transactions destined to the device, the system interconnect shall
ensure that AxADDR values are presented to the device as relative addresses - within
the range of ``0x000``-``0xFFF``.
- If care is taken to align the global address offset to the size of the device,
creating a relative address is as simple as pruning down address bits.
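The address translation in the example above can be sketched as follows, using the same base address and size. The function names are hypothetical, and the code only models what an interconnect might do; it is not part of the exporter.

```python
DEVICE_BASE = 0x80_0000  # global base address from the example
DEVICE_SIZE = 0x1000     # 4 kB of address space


def is_size_aligned(base: int, size: int) -> bool:
    """True if size is a power of two and base is a multiple of it."""
    return size > 0 and (size & (size - 1)) == 0 and base % size == 0


def to_relative(global_addr: int, base: int = DEVICE_BASE, size: int = DEVICE_SIZE) -> int:
    """Translate a global address into the local offset presented to the regblock."""
    if not base <= global_addr < base + size:
        raise ValueError("address is not destined to this device")
    if is_size_aligned(base, size):
        # Aligned case: simply prune the upper address bits.
        return global_addr & (size - 1)
    # General case: subtract the base offset.
    return global_addr - base


print(hex(to_relative(0x80_0ABC)))
```

When the base is aligned to the device size, masking and subtraction give the same result, which is why aligning the global offset keeps the decoder simple.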


@@ -0,0 +1,9 @@
CPUIF Passthrough
=================
This CPUIF mode bypasses the protocol converter stage and directly exposes the
internal CPUIF handshake signals to the user.
Class: :class:`peakrdl.regblock.cpuif.passthrough.PassthroughCpuif`
For more details on the protocol itself, see: :ref:`cpuif_protocol`.

Binary file not shown.

BIN
docs/diagrams/readback.png Normal file

Binary file not shown.



@@ -1,8 +1,51 @@
Hardware Interface
------------------
TODO: Describe the following
The generated register block will present the entire hardware interface to the user
using two struct ports:
* hwif_in / hwif_out structs and their contents
* shorthand notation used in this reference: ``hwif_in..xyz``
* Example of how to peel back a sub-hierarchy struct
* ``hwif_in``
* ``hwif_out``
All field inputs and outputs as well as signals are consolidated into these
struct ports. The presence of each depends on the specific contents of the design
being exported.
Using structs for the hardware interface has the following benefits:
* Preserves register map component grouping, arrays, and hierarchy.
* Avoids naming collisions and cumbersome signal name flattening.
* Allows for more natural mapping and distribution of register block signals to a design's hardware components.
* Use of unpacked arrays/structs prevents common assignment mistakes as they are enforced by the compiler.
Structs are organized as follows: ``hwif_out.<hier_path>.<feature>``
For example, a simple design such as:
.. code-block:: systemrdl
addrmap my_design {
reg {
field {
sw = rw;
hw = rw;
we;
} my_field;
} my_reg[2];
};
... results in the following struct members:
.. code-block:: text
hwif_out.my_reg[0].my_field.value
hwif_in.my_reg[0].my_field.next
hwif_in.my_reg[0].my_field.we
hwif_out.my_reg[1].my_field.value
hwif_in.my_reg[1].my_field.next
hwif_in.my_reg[1].my_field.we
For brevity in this documentation, hwif features will be described using shorthand
notation that omits the hierarchical path: ``hwif_out..<feature>``
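To make the naming pattern concrete, the sketch below enumerates the struct members listed for ``my_design``. The helper is hypothetical and only covers the case shown in the example (a field with ``sw = rw``, ``hw = rw`` and ``we``); other field properties produce a different set of features.

```python
from typing import List


def hwif_members(reg_name: str, array_len: int, field_name: str) -> List[str]:
    """List hwif struct members for an arrayed register with one rw field that uses `we`."""
    members = []
    for i in range(array_len):
        path = f"{reg_name}[{i}].{field_name}"
        members.append(f"hwif_out.{path}.value")  # current field value, driven to hardware
        members.append(f"hwif_in.{path}.next")    # next value supplied by hardware
        members.append(f"hwif_in.{path}.we")      # hardware write-enable
    return members


for member in hwif_members("my_reg", 2, "my_field"):
    print(member)
```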


@@ -1,17 +1,25 @@
PeakRDL-regblock
================
Introduction
============
.. important::
PeakRDL-regblock is a free and open-source control & status register (CSR) compiler.
This code generator will translate your SystemRDL register description into
a synthesizable SystemVerilog RTL module that can be easily instantiated into
your hardware design.
This project has no official releases yet and is still under active development!
* Generates fully synthesizable SystemVerilog RTL (IEEE 1800-2012)
* Options for many popular CPU interface protocols (AMBA APB, AXI4-Lite, and more)
* Configurable pipelining options for designs with fast clock rates.
* Broad support for SystemRDL 2.0 features
TODO: Intro text
Installing
----------
.. important::
This project has no official releases yet and is still under active development!
Install from `PyPi`_ using pip
.. code-block:: bash
@@ -22,6 +30,45 @@ Install from `PyPi`_ using pip
.. _PyPi: https://pypi.org/project/peakrdl-regblock
Quick Start
-----------
Below is a simple example that demonstrates how to generate a SystemVerilog
implementation from SystemRDL source.
.. code-block:: python
    :emphasize-lines: 4-5, 25-29

    import sys

    from systemrdl import RDLCompiler, RDLCompileError
    from peakrdl.regblock import RegblockExporter
    from peakrdl.regblock.cpuif.apb3 import APB3_Cpuif

    input_files = [
        "PATH/TO/my_register_block.rdl"
    ]

    # Create an instance of the compiler
    rdlc = RDLCompiler()

    try:
        # Compile your RDL files
        for input_file in input_files:
            rdlc.compile_file(input_file)

        # Elaborate the design
        root = rdlc.elaborate()
    except RDLCompileError:
        # A compilation error occurred. Exit with error code
        sys.exit(1)

    # Export a SystemVerilog implementation
    exporter = RegblockExporter()
    exporter.export(
        root, "path/to/output_dir",
        cpuif_cls=APB3_Cpuif
    )
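The ``export()`` docstring in this commit also documents optional keyword arguments such as ``module_name``, ``package_name``, ``retime_read_fanin`` and ``retime_read_response``. The sketch below shows one way to assemble them before calling the exporter. The helper functions and the example names are hypothetical; only the keyword names and the one-cycle latency cost per retiming option come from the docstring.

```python
from typing import Any, Dict


def build_export_options(demanding_timing: bool) -> Dict[str, Any]:
    """Collect optional keyword arguments for RegblockExporter.export()."""
    options: Dict[str, Any] = {
        "module_name": "my_regblock",       # hypothetical module name
        "package_name": "my_regblock_pkg",  # hypothetical package name
    }
    if demanding_timing:
        # Both retiming stages are opt-in; each costs one cycle of read latency.
        options["retime_read_fanin"] = True
        options["retime_read_response"] = True
    return options


def added_read_latency(options: Dict[str, Any]) -> int:
    """Extra read latency in clock cycles implied by the enabled retiming options."""
    return sum(
        1
        for key in ("retime_read_fanin", "retime_read_response")
        if options.get(key)
    )


options = build_export_options(demanding_timing=True)
print(added_read_latency(options))
```

The resulting dictionary would then be passed along as ``exporter.export(root, "path/to/output_dir", cpuif_cls=APB3_Cpuif, **options)``.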
Links
-----
@@ -39,17 +86,19 @@ Links
self
architecture
hwif
api
limitations
.. toctree::
:hidden:
:caption: CPU Interfaces
cpuif/addressing
cpuif/introduction
cpuif/apb3
cpuif/axi4lite
cpuif/advanced
cpuif/passthrough
cpuif/internal_protocol
cpuif/advanced
.. toctree::
:hidden:


@@ -1,5 +1,4 @@
// LATENCY = {{cpuif.regblock_latency}}
// MAX OUTSTANDING = {{cpuif.max_outstanding}}
// Max Outstanding Transactions: {{cpuif.max_outstanding}}
logic [{{clog2(cpuif.max_outstanding+1)-1}}:0] axil_n_in_flight;
logic axil_prev_was_rd;
logic axil_arvalid;
@@ -11,6 +10,8 @@ logic axil_wvalid;
logic [{{cpuif.data_width-1}}:0] axil_wdata;
logic axil_aw_accept;
logic axil_resp_acked;
// Transaction request acceptance
always_ff {{get_always_ff_event(cpuif.reset)}} begin
if({{get_resetsignal(cpuif.reset)}}) begin
axil_prev_was_rd <= '0;


@@ -16,7 +16,7 @@ from .utils import get_always_ff_event
from .scan_design import DesignScanner
class RegblockExporter:
def __init__(self, **kwargs):
def __init__(self, **kwargs) -> None:
user_template_dir = kwargs.pop("user_template_dir", None)
# Check for stray kwargs
@@ -57,7 +57,53 @@ class RegblockExporter:
)
def export(self, node: Union[RootNode, AddrmapNode], output_dir:str, **kwargs):
def export(self, node: Union[RootNode, AddrmapNode], output_dir:str, **kwargs) -> None:
"""
Parameters
----------
node: AddrmapNode
Top-level SystemRDL node to export.
output_dir: str
Path to the output directory where generated SystemVerilog will be written.
Output includes two files: a module definition and package definition.
cpuif_cls: :class:`peakrdl.regblock.cpuif.CpuifBase`
Specify the class type that implements the CPU interface of your choice.
Defaults to AMBA APB3.
module_name: str
Override the SystemVerilog module name. By default, the module name
is the top-level node's name.
package_name: str
Override the SystemVerilog package name. By default, the package name
is the top-level node's name with a "_pkg" suffix.
reuse_hwif_typedefs: bool
By default, the exporter will attempt to re-use hwif struct definitions for
nodes that are equivalent. This allows for better modularity and type reuse.
Struct type names are derived using the SystemRDL component's type
name and declared lexical scope path.
If this is not desirable, override this parameter to ``False`` and structs
will be generated more naively using their hierarchical paths.
retime_read_fanin: bool
Set this to ``True`` to enable additional read path retiming.
For large register blocks that operate at demanding clock rates, this
may be necessary in order to manage large readback fan-in.
The retiming flop stage is automatically placed at the optimal point in the
readback path so that logic-levels and fanin are minimized.
Enabling this option will increase read transfer latency by 1 clock cycle.
retime_read_response: bool
Set this to ``True`` to enable an additional retiming flop stage between
the readback mux and the CPU interface response logic.
This option may be beneficial for some CPU interfaces that implement the
response logic fully combinationally. Enabling this stage can better
isolate timing paths in the register file from the rest of your system.
Enabling this when using CPU interfaces that already implement the
response path sequentially may not result in any meaningful timing improvement.
Enabling this option will increase read transfer latency by 1 clock cycle.
"""
# If it is the root node, skip to top addrmap
if isinstance(node, RootNode):
self.top_node = node.top