Initial Commit - Forked from PeakRDL-regblock @ a440cc19769069be831d267505da4f3789a26695

This commit is contained in:
Arnav Sacheti
2025-10-10 22:28:36 -07:00
commit 9bf5cd1e68
308 changed files with 19414 additions and 0 deletions

59
docs/cpuif/apb.rst Normal file
View File

@@ -0,0 +1,59 @@
AMBA APB
========
Both APB3 and APB4 standards are supported.
.. warning::
Some IP vendors will incorrectly implement the address signalling
assuming word-addresses. (that each increment of ``PADDR`` is the next word)
For this exporter, values on the interface's ``PADDR`` input are interpreted
as byte-addresses. (an APB interface with 32-bit wide data increments
``PADDR`` in steps of 4 for every word). Even though APB protocol does not
allow for unaligned transfers, this is in accordance to the official AMBA
specification.
Be sure to double-check the interpretation of your interconnect IP. A simple
bit-shift operation can be used to correct this if necessary.
APB3
----
Implements the register block using an
`AMBA 3 APB <https://developer.arm.com/documentation/ihi0024/b/Introduction/About-the-AMBA-3-APB>`_
CPU interface.
The APB3 CPU interface comes in two i/o port flavors:
SystemVerilog Interface
* Command line: ``--cpuif apb3``
* Interface Definition: :download:`apb3_intf.sv <../../hdl-src/apb3_intf.sv>`
* Class: :class:`peakrdl_regblock.cpuif.apb3.APB3_Cpuif`
Flattened inputs/outputs
Flattens the interface into discrete input and output ports.
* Command line: ``--cpuif apb3-flat``
* Class: :class:`peakrdl_regblock.cpuif.apb3.APB3_Cpuif_flattened`
APB4
----
Implements the register block using an
`AMBA 4 APB <https://developer.arm.com/documentation/ihi0024/d/?lang=en>`_
CPU interface.
The APB4 CPU interface comes in two i/o port flavors:
SystemVerilog Interface
* Command line: ``--cpuif apb4``
* Interface Definition: :download:`apb4_intf.sv <../../hdl-src/apb4_intf.sv>`
* Class: :class:`peakrdl_regblock.cpuif.apb4.APB4_Cpuif`
Flattened inputs/outputs
Flattens the interface into discrete input and output ports.
* Command line: ``--cpuif apb4-flat``
* Class: :class:`peakrdl_regblock.cpuif.apb4.APB4_Cpuif_flattened`

33
docs/cpuif/avalon.rst Normal file
View File

@@ -0,0 +1,33 @@
Intel Avalon
============
Implements the register block using an
`Intel Avalon MM <https://www.intel.com/content/www/us/en/docs/programmable/683091/22-3/memory-mapped-interfaces.html>`_
CPU interface.
The Avalon interface comes in two i/o port flavors:
SystemVerilog Interface
* Command line: ``--cpuif avalon-mm``
* Interface Definition: :download:`avalon_mm_intf.sv <../../hdl-src/avalon_mm_intf.sv>`
* Class: :class:`peakrdl_regblock.cpuif.avalon.Avalon_Cpuif`
Flattened inputs/outputs
Flattens the interface into discrete input and output ports.
* Command line: ``--cpuif avalon-mm-flat``
* Class: :class:`peakrdl_regblock.cpuif.avalon.Avalon_Cpuif_flattened`
Implementation Details
----------------------
This implementation of the Avalon protocol has the following features:
* Interface uses word addressing.
* Supports `pipelined transfers <https://www.intel.com/content/www/us/en/docs/programmable/683091/22-3/pipelined-transfers.html>`_
* Responses may have variable latency
In most cases, latency is fixed and is determined by how many retiming
stages are enabled in your design.
However if your design contains external components, access latency is
not guaranteed to be uniform.

32
docs/cpuif/axi4lite.rst Normal file
View File

@@ -0,0 +1,32 @@
.. _cpuif_axi4lite:
AMBA AXI4-Lite
==============
Implements the register block using an
`AMBA AXI4-Lite <https://developer.arm.com/documentation/ihi0022/e/AMBA-AXI4-Lite-Interface-Specification>`_
CPU interface.
The AXI4-Lite CPU interface comes in two i/o port flavors:
SystemVerilog Interface
* Command line: ``--cpuif axi4-lite``
* Interface Definition: :download:`axi4lite_intf.sv <../../hdl-src/axi4lite_intf.sv>`
* Class: :class:`peakrdl_regblock.cpuif.axi4lite.AXI4Lite_Cpuif`
Flattened inputs/outputs
Flattens the interface into discrete input and output ports.
* Command line: ``--cpuif axi4-lite-flat``
* Class: :class:`peakrdl_regblock.cpuif.axi4lite.AXI4Lite_Cpuif_flattened`
Pipelined Performance
---------------------
This implementation of the AXI4-Lite interface supports transaction pipelining
which can significantly improve performance of back-to-back transfers.
In order to support transaction pipelining, the CPU interface will accept multiple
concurrent transactions. The number of outstanding transactions allowed is automatically
determined based on the register file pipeline depth (affected by retiming options),
and influences the depth of the internal transaction response skid buffer.

114
docs/cpuif/customizing.rst Normal file
View File

@@ -0,0 +1,114 @@
Customizing the CPU interface
=============================
Use your own existing SystemVerilog interface definition
--------------------------------------------------------
This exporter comes pre-bundled with its own SystemVerilog interface declarations.
What if you already have your own SystemVerilog interface declaration that you prefer?
Not a problem! As long as your interface definition is similar enough, it is easy
to customize and existing CPUIF definition.
As an example, let's use the SystemVerilog interface definition for
:ref:`cpuif_axi4lite` that is bundled with this project. This interface uses
the following style and naming conventions:
* SystemVerilog interface type name is ``axi4lite_intf``
* Defines modports named ``master`` and ``slave``
* Interface signals are all upper-case: ``AWREADY``, ``AWVALID``, etc...
Lets assume your preferred SV interface definition uses a slightly different naming convention:
* SystemVerilog interface type name is ``axi4_lite_interface``
* Modports are capitalized and use suffixes ``Master_mp`` and ``Slave_mp``
* Interface signals are all lower-case: ``awready``, ``awvalid``, etc...
Rather than rewriting a new CPU interface definition, you can extend and adjust the existing one:
.. code-block:: python
from peakrdl_regblock.cpuif.axi4lite import AXI4Lite_Cpuif
class My_AXI4Lite(AXI4Lite_Cpuif):
@property
def port_declaration(self) -> str:
# Override the port declaration text to use the alternate interface name and modport style
return "axi4_lite_interface.Slave_mp s_axil"
def signal(self, name:str) -> str:
# Override the signal names to be lowercase instead
return "s_axil." + name.lower()
Then use your custom CPUIF during export:
.. code-block:: python
exporter = RegblockExporter()
exporter.export(
root, "path/to/output_dir",
cpuif_cls=My_AXI4Lite
)
Custom CPU Interface Protocol
-----------------------------
If you require a CPU interface protocol that is not included in this project,
you can define your own.
1. Create a SystemVerilog CPUIF implementation template file.
This contains the SystemVerilog implementation of the bus protocol. The logic
in this shall implement a translation between your custom protocol and the
:ref:`cpuif_protocol`.
Reminder that this template will be preprocessed using
`Jinja <https://jinja.palletsprojects.com>`_, so you can use
some templating tags to dynamically render content. See the implementations of
existing CPU interfaces as an example.
2. Create a Python class that defines your CPUIF
Extend your class from :class:`peakrdl_regblock.cpuif.CpuifBase`.
Define the port declaration string, and provide a reference to your template file.
3. Use your new CPUIF definition when exporting.
4. If you think the CPUIF protocol is something others might find useful, let me
know and I can add it to PeakRDL!
Loading into the PeakRDL command line tool
------------------------------------------
There are two ways to make your custom CPUIF class visible to the
`PeakRDL command-line tool <https://peakrdl.readthedocs.io>`_.
Via the PeakRDL TOML
^^^^^^^^^^^^^^^^^^^^
The easiest way to add your cpuif is via the TOML config file. See the
:ref:`peakrdl_cfg` section for more details.
Via a package's entry point definition
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If you are publishing a collection of PeakRDL plugins as an installable Python
package, you can advertise them to PeakRDL using an entry point.
This advertises your custom CPUIF class to the PeakRDL-regblock tool as a plugin
that should be loaded, and made available as a command-line option in PeakRDL.
.. code-block:: toml
[project.entry-points."peakrdl_regblock.cpuif"]
my-cpuif = "my_package.my_module:MyCPUIF"
* ``my_package``: The name of your installable Python module
* ``peakrdl-regblock.cpuif``: This is the namespace that PeakRDL-regblock will
search. Any cpuif plugins you create must be enclosed in this namespace in
order to be discovered.
* ``my_package.my_module:MyCPUIF``: This is the import path that
points to your CPUIF class definition.
* ``my-cpuif``: The lefthand side of the assignment is your cpuif's name. This
text is what the end-user uses in the command line interface to select your
CPUIF implementation.

View File

@@ -0,0 +1,232 @@
.. _cpuif_protocol:
Internal CPUIF Protocol
=======================
Internally, the regblock generator uses a common CPU interface handshake
protocol. This strobe-based protocol is designed to add minimal overhead to the
regblock implementation, while also being flexible enough to support advanced
features of a variety of bus interface standards.
Signal Descriptions
-------------------
Request
^^^^^^^
cpuif_req
When asserted, a read or write transfer will be initiated.
Denotes that the following signals are valid: ``cpuif_addr``,
``cpuif_req_is_wr``, and ``cpuif_wr_data``.
A transfer will only initiate if the relevant stall signal is not asserted.
If stalled, the request shall be held until accepted. A request's parameters
(type, address, etc) shall remain static throughout the stall.
cpuif_addr
Byte-address of the transfer.
cpuif_req_is_wr
If ``1``, denotes that the current transfer is a write. Otherwise transfer is
a read.
cpuif_wr_data
Data to be written for the write transfer. This signal is ignored for read
transfers.
cpuif_wr_biten
Active-high bit-level write-enable strobes.
Only asserted bit positions will change the register value during a write
transfer.
cpuif_req_stall_rd
If asserted, and the next pending request is a read operation, then the
transfer will not be accepted until this signal is deasserted.
cpuif_req_stall_wr
If asserted, and the next pending request is a write operation, then the
transfer will not be accepted until this signal is deasserted.
Read Response
^^^^^^^^^^^^^
cpuif_rd_ack
Single-cycle strobe indicating a read transfer has completed.
Qualifies that the following signals are valid: ``cpuif_rd_err`` and
``cpuif_rd_data``
cpuif_rd_err
If set, indicates that the read transaction failed and the CPUIF logic
should return an error response if possible.
cpuif_rd_data
Read data. Is sampled on the same cycle that ``cpuif_rd_ack`` is asserted.
Write Response
^^^^^^^^^^^^^^
cpuif_wr_ack
Single-cycle strobe indicating a write transfer has completed.
Qualifies that the ``cpuif_wr_err`` signal is valid.
cpuif_wr_err
If set, indicates that the write transaction failed and the CPUIF logic
should return an error response if possible.
Transfers
---------
Transfers have the following characteristics:
* Only one transfer can be initiated per clock-cycle. This is implicit as there
is only one set of request signals.
* The register block implementation shall guarantee that only one response can be
asserted in a given clock cycle. Only one ``cpuif_*_ack`` signal can be
asserted at a time.
* Responses shall arrive in the same order as their corresponding request was
dispatched.
Basic Transfer
^^^^^^^^^^^^^^
Depending on the configuration of the exported register block, transfers can be
fully combinational or they may require one or more clock cycles to complete.
Both are valid and CPU interface logic shall be designed to anticipate either.
.. wavedrom::
{
"signal": [
{"name": "clk", "wave": "p...."},
{"name": "cpuif_req", "wave": "010.."},
{"name": "cpuif_req_is_wr", "wave": "x2x.."},
{"name": "cpuif_addr", "wave": "x2x..", "data": ["A"]},
{},
{"name": "cpuif_*_ack", "wave": "010.."},
{"name": "cpuif_*_err", "wave": "x2x.."}
],
"foot": {
"text": "Zero-latency transfer"
}
}
.. wavedrom::
{
"signal": [
{"name": "clk", "wave": "p..|..."},
{"name": "cpuif_req", "wave": "010|..."},
{"name": "cpuif_req_is_wr", "wave": "x2x|..."},
{"name": "cpuif_addr", "wave": "x2x|...", "data": ["A"]},
{},
{"name": "cpuif_*_ack", "wave": "0..|10."},
{"name": "cpuif_*_err", "wave": "x..|2x."}
],
"foot": {
"text": "Transfer with non-zero latency"
}
}
Read & Write Transactions
-------------------------
Waveforms below show the timing relationship of simple read/write transactions.
For brevity, only showing non-zero latency transfers.
.. wavedrom::
{
"signal": [
{"name": "clk", "wave": "p..|..."},
{"name": "cpuif_req", "wave": "010|..."},
{"name": "cpuif_req_is_wr", "wave": "x0x|..."},
{"name": "cpuif_addr", "wave": "x3x|...", "data": ["A"]},
{},
{"name": "cpuif_rd_ack", "wave": "0..|10."},
{"name": "cpuif_rd_err", "wave": "x..|0x."},
{"name": "cpuif_rd_data", "wave": "x..|5x.", "data": ["D"]}
],
"foot": {
"text": "Read Transaction"
}
}
.. wavedrom::
{
"signal": [
{"name": "clk", "wave": "p..|..."},
{"name": "cpuif_req", "wave": "010|..."},
{"name": "cpuif_req_is_wr", "wave": "x1x|..."},
{"name": "cpuif_addr", "wave": "x3x|...", "data": ["A"]},
{"name": "cpuif_wr_data", "wave": "x5x|...", "data": ["D"]},
{},
{"name": "cpuif_wr_ack", "wave": "0..|10."},
{"name": "cpuif_wr_err", "wave": "x..|0x."}
],
"foot": {
"text": "Write Transaction"
}
}
Transaction Pipelining & Stalls
-------------------------------
If the CPU interface supports it, read and write operations can be pipelined.
.. wavedrom::
{
"signal": [
{"name": "clk", "wave": "p......"},
{"name": "cpuif_req", "wave": "01..0.."},
{"name": "cpuif_req_is_wr", "wave": "x0..x.."},
{"name": "cpuif_addr", "wave": "x333x..", "data": ["A1", "A2", "A3"]},
{},
{"name": "cpuif_rd_ack", "wave": "0.1..0."},
{"name": "cpuif_rd_err", "wave": "x.0..x."},
{"name": "cpuif_rd_data", "wave": "x.555x.", "data": ["D1", "D2", "D3"]}
]
}
It is very likely that the transfer latency of a read transaction will not
be the same as a write for a given register block configuration. Typically read
operations will be more deeply pipelined. This latency asymmetry would create a
hazard for response collisions.
In order to eliminate this hazard, additional stall signals (``cpuif_req_stall_rd``
and ``cpuif_req_stall_wr``) are provided to delay the next incoming transfer
request if necessary. When asserted, the CPU interface shall hold the next pending
request until the stall is cleared.
For non-pipelined CPU interfaces that only allow one outstanding transaction at a time,
these stall signals can be safely ignored.
In the following example, the regblock is configured such that:
* A read transaction takes 1 clock cycle to complete
* A write transaction takes 0 clock cycles to complete
.. wavedrom::
{
"signal": [
{"name": "clk", "wave": "p......."},
{"name": "cpuif_req", "wave": "01.....0"},
{"name": "cpuif_req_is_wr", "wave": "x1.0.1.x"},
{"name": "cpuif_addr", "wave": "x33443.x", "data": ["W1", "W2", "R1", "R2", "W3"]},
{"name": "cpuif_req_stall_wr", "wave": "0...1.0."},
{},
{"name": "cpuif_rd_ack", "wave": "0...220.", "data": ["R1", "R2"]},
{"name": "cpuif_wr_ack", "wave": "0220..20", "data": ["W1", "W2", "W3"]}
]
}
In the above waveform, observe that:
* The ``R2`` read request is not affected by the assertion of the write stall,
since the write stall only applies to write requests.
* The ``W3`` write request is stalled for one cycle, and is accepted once the stall is cleared.

View File

@@ -0,0 +1,36 @@
Introduction
============
The CPU interface logic layer provides an abstraction between the
application-specific bus protocol and the internal register file logic.
When exporting a design, you can select from a variety of popular CPU interface
protocols. These are described in more detail in the pages that follow.
Bus Width
^^^^^^^^^
The CPU interface bus width is automatically determined from the contents of the
design being exported. The bus width is equal to the widest ``accesswidth``
encountered in the design.
Addressing
^^^^^^^^^^
The regblock exporter will always generate its address decoding logic using local
address offsets. The absolute address offset of your device shall be
handled by your system interconnect, and present addresses to the regblock that
only include the local offset.
For example, consider a fictional AXI4-Lite device that:
- Consumes 4 kB of address space (``0x000``-``0xFFF``).
- The device is instantiated in your system at global address range ``0x30_0000 - 0x50_0FFF``.
- After decoding transactions destined to the device, the system interconnect shall
ensure that AxADDR values are presented to the device as relative addresses - within
the range of ``0x000``-``0xFFF``.
- If care is taken to align the global address offset to the size of the device,
creating a relative address is as simple as pruning down address bits.
By default, the bit-width of the address bus will be the minimum size to span the contents
of the register block. If needed, the address width can be overridden to a larger range.

View File

@@ -0,0 +1,10 @@
CPUIF Passthrough
=================
This CPUIF mode bypasses the protocol converter stage and directly exposes the
internal CPUIF handshake signals to the user.
* Command line: ``--cpuif passthrough``
* Class: :class:`peakrdl_regblock.cpuif.passthrough.PassthroughCpuif`
For more details on the protocol itself, see: :ref:`cpuif_protocol`.