Initial Commit - Forked from PeakRDL-regblock @ a440cc19769069be831d267505da4f3789a26695
This commit is contained in:
59
docs/cpuif/apb.rst
Normal file
59
docs/cpuif/apb.rst
Normal file
@@ -0,0 +1,59 @@
|
||||
AMBA APB
|
||||
========
|
||||
|
||||
Both APB3 and APB4 standards are supported.
|
||||
|
||||
.. warning::
|
||||
Some IP vendors will incorrectly implement the address signalling
|
||||
assuming word-addresses. (that each increment of ``PADDR`` is the next word)
|
||||
|
||||
For this exporter, values on the interface's ``PADDR`` input are interpreted
|
||||
as byte-addresses. (an APB interface with 32-bit wide data increments
|
||||
``PADDR`` in steps of 4 for every word). Even though APB protocol does not
|
||||
allow for unaligned transfers, this is in accordance to the official AMBA
|
||||
specification.
|
||||
|
||||
Be sure to double-check the interpretation of your interconnect IP. A simple
|
||||
bit-shift operation can be used to correct this if necessary.
|
||||
|
||||
|
||||
APB3
|
||||
----
|
||||
|
||||
Implements the register block using an
|
||||
`AMBA 3 APB <https://developer.arm.com/documentation/ihi0024/b/Introduction/About-the-AMBA-3-APB>`_
|
||||
CPU interface.
|
||||
|
||||
The APB3 CPU interface comes in two i/o port flavors:
|
||||
|
||||
SystemVerilog Interface
|
||||
* Command line: ``--cpuif apb3``
|
||||
* Interface Definition: :download:`apb3_intf.sv <../../hdl-src/apb3_intf.sv>`
|
||||
* Class: :class:`peakrdl_regblock.cpuif.apb3.APB3_Cpuif`
|
||||
|
||||
Flattened inputs/outputs
|
||||
Flattens the interface into discrete input and output ports.
|
||||
|
||||
* Command line: ``--cpuif apb3-flat``
|
||||
* Class: :class:`peakrdl_regblock.cpuif.apb3.APB3_Cpuif_flattened`
|
||||
|
||||
|
||||
APB4
|
||||
----
|
||||
|
||||
Implements the register block using an
|
||||
`AMBA 4 APB <https://developer.arm.com/documentation/ihi0024/d/?lang=en>`_
|
||||
CPU interface.
|
||||
|
||||
The APB4 CPU interface comes in two i/o port flavors:
|
||||
|
||||
SystemVerilog Interface
|
||||
* Command line: ``--cpuif apb4``
|
||||
* Interface Definition: :download:`apb4_intf.sv <../../hdl-src/apb4_intf.sv>`
|
||||
* Class: :class:`peakrdl_regblock.cpuif.apb4.APB4_Cpuif`
|
||||
|
||||
Flattened inputs/outputs
|
||||
Flattens the interface into discrete input and output ports.
|
||||
|
||||
* Command line: ``--cpuif apb4-flat``
|
||||
* Class: :class:`peakrdl_regblock.cpuif.apb4.APB4_Cpuif_flattened`
|
||||
33
docs/cpuif/avalon.rst
Normal file
33
docs/cpuif/avalon.rst
Normal file
@@ -0,0 +1,33 @@
|
||||
Intel Avalon
|
||||
============
|
||||
|
||||
Implements the register block using an
|
||||
`Intel Avalon MM <https://www.intel.com/content/www/us/en/docs/programmable/683091/22-3/memory-mapped-interfaces.html>`_
|
||||
CPU interface.
|
||||
|
||||
The Avalon interface comes in two i/o port flavors:
|
||||
|
||||
SystemVerilog Interface
|
||||
* Command line: ``--cpuif avalon-mm``
|
||||
* Interface Definition: :download:`avalon_mm_intf.sv <../../hdl-src/avalon_mm_intf.sv>`
|
||||
* Class: :class:`peakrdl_regblock.cpuif.avalon.Avalon_Cpuif`
|
||||
|
||||
Flattened inputs/outputs
|
||||
Flattens the interface into discrete input and output ports.
|
||||
|
||||
* Command line: ``--cpuif avalon-mm-flat``
|
||||
* Class: :class:`peakrdl_regblock.cpuif.avalon.Avalon_Cpuif_flattened`
|
||||
|
||||
|
||||
Implementation Details
|
||||
----------------------
|
||||
This implementation of the Avalon protocol has the following features:
|
||||
|
||||
* Interface uses word addressing.
|
||||
* Supports `pipelined transfers <https://www.intel.com/content/www/us/en/docs/programmable/683091/22-3/pipelined-transfers.html>`_
|
||||
* Responses may have variable latency
|
||||
|
||||
In most cases, latency is fixed and is determined by how many retiming
|
||||
stages are enabled in your design.
|
||||
However if your design contains external components, access latency is
|
||||
not guaranteed to be uniform.
|
||||
32
docs/cpuif/axi4lite.rst
Normal file
32
docs/cpuif/axi4lite.rst
Normal file
@@ -0,0 +1,32 @@
|
||||
.. _cpuif_axi4lite:
|
||||
|
||||
AMBA AXI4-Lite
|
||||
==============
|
||||
|
||||
Implements the register block using an
|
||||
`AMBA AXI4-Lite <https://developer.arm.com/documentation/ihi0022/e/AMBA-AXI4-Lite-Interface-Specification>`_
|
||||
CPU interface.
|
||||
|
||||
The AXI4-Lite CPU interface comes in two i/o port flavors:
|
||||
|
||||
SystemVerilog Interface
|
||||
* Command line: ``--cpuif axi4-lite``
|
||||
* Interface Definition: :download:`axi4lite_intf.sv <../../hdl-src/axi4lite_intf.sv>`
|
||||
* Class: :class:`peakrdl_regblock.cpuif.axi4lite.AXI4Lite_Cpuif`
|
||||
|
||||
Flattened inputs/outputs
|
||||
Flattens the interface into discrete input and output ports.
|
||||
|
||||
* Command line: ``--cpuif axi4-lite-flat``
|
||||
* Class: :class:`peakrdl_regblock.cpuif.axi4lite.AXI4Lite_Cpuif_flattened`
|
||||
|
||||
|
||||
Pipelined Performance
|
||||
---------------------
|
||||
This implementation of the AXI4-Lite interface supports transaction pipelining
|
||||
which can significantly improve performance of back-to-back transfers.
|
||||
|
||||
In order to support transaction pipelining, the CPU interface will accept multiple
|
||||
concurrent transactions. The number of outstanding transactions allowed is automatically
|
||||
determined based on the register file pipeline depth (affected by retiming options),
|
||||
and influences the depth of the internal transaction response skid buffer.
|
||||
114
docs/cpuif/customizing.rst
Normal file
114
docs/cpuif/customizing.rst
Normal file
@@ -0,0 +1,114 @@
|
||||
Customizing the CPU interface
|
||||
=============================
|
||||
|
||||
Use your own existing SystemVerilog interface definition
|
||||
--------------------------------------------------------
|
||||
|
||||
This exporter comes pre-bundled with its own SystemVerilog interface declarations.
|
||||
What if you already have your own SystemVerilog interface declaration that you prefer?
|
||||
|
||||
Not a problem! As long as your interface definition is similar enough, it is easy
|
||||
to customize and existing CPUIF definition.
|
||||
|
||||
|
||||
As an example, let's use the SystemVerilog interface definition for
|
||||
:ref:`cpuif_axi4lite` that is bundled with this project. This interface uses
|
||||
the following style and naming conventions:
|
||||
|
||||
* SystemVerilog interface type name is ``axi4lite_intf``
|
||||
* Defines modports named ``master`` and ``slave``
|
||||
* Interface signals are all upper-case: ``AWREADY``, ``AWVALID``, etc...
|
||||
|
||||
Lets assume your preferred SV interface definition uses a slightly different naming convention:
|
||||
|
||||
* SystemVerilog interface type name is ``axi4_lite_interface``
|
||||
* Modports are capitalized and use suffixes ``Master_mp`` and ``Slave_mp``
|
||||
* Interface signals are all lower-case: ``awready``, ``awvalid``, etc...
|
||||
|
||||
Rather than rewriting a new CPU interface definition, you can extend and adjust the existing one:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from peakrdl_regblock.cpuif.axi4lite import AXI4Lite_Cpuif
|
||||
|
||||
class My_AXI4Lite(AXI4Lite_Cpuif):
|
||||
@property
|
||||
def port_declaration(self) -> str:
|
||||
# Override the port declaration text to use the alternate interface name and modport style
|
||||
return "axi4_lite_interface.Slave_mp s_axil"
|
||||
|
||||
def signal(self, name:str) -> str:
|
||||
# Override the signal names to be lowercase instead
|
||||
return "s_axil." + name.lower()
|
||||
|
||||
Then use your custom CPUIF during export:
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
exporter = RegblockExporter()
|
||||
exporter.export(
|
||||
root, "path/to/output_dir",
|
||||
cpuif_cls=My_AXI4Lite
|
||||
)
|
||||
|
||||
|
||||
|
||||
Custom CPU Interface Protocol
|
||||
-----------------------------
|
||||
|
||||
If you require a CPU interface protocol that is not included in this project,
|
||||
you can define your own.
|
||||
|
||||
1. Create a SystemVerilog CPUIF implementation template file.
|
||||
|
||||
This contains the SystemVerilog implementation of the bus protocol. The logic
|
||||
in this shall implement a translation between your custom protocol and the
|
||||
:ref:`cpuif_protocol`.
|
||||
|
||||
Reminder that this template will be preprocessed using
|
||||
`Jinja <https://jinja.palletsprojects.com>`_, so you can use
|
||||
some templating tags to dynamically render content. See the implementations of
|
||||
existing CPU interfaces as an example.
|
||||
|
||||
2. Create a Python class that defines your CPUIF
|
||||
|
||||
Extend your class from :class:`peakrdl_regblock.cpuif.CpuifBase`.
|
||||
Define the port declaration string, and provide a reference to your template file.
|
||||
|
||||
3. Use your new CPUIF definition when exporting.
|
||||
4. If you think the CPUIF protocol is something others might find useful, let me
|
||||
know and I can add it to PeakRDL!
|
||||
|
||||
|
||||
Loading into the PeakRDL command line tool
|
||||
------------------------------------------
|
||||
There are two ways to make your custom CPUIF class visible to the
|
||||
`PeakRDL command-line tool <https://peakrdl.readthedocs.io>`_.
|
||||
|
||||
Via the PeakRDL TOML
|
||||
^^^^^^^^^^^^^^^^^^^^
|
||||
The easiest way to add your cpuif is via the TOML config file. See the
|
||||
:ref:`peakrdl_cfg` section for more details.
|
||||
|
||||
Via a package's entry point definition
|
||||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
|
||||
If you are publishing a collection of PeakRDL plugins as an installable Python
|
||||
package, you can advertise them to PeakRDL using an entry point.
|
||||
This advertises your custom CPUIF class to the PeakRDL-regblock tool as a plugin
|
||||
that should be loaded, and made available as a command-line option in PeakRDL.
|
||||
|
||||
.. code-block:: toml
|
||||
|
||||
[project.entry-points."peakrdl_regblock.cpuif"]
|
||||
my-cpuif = "my_package.my_module:MyCPUIF"
|
||||
|
||||
|
||||
* ``my_package``: The name of your installable Python module
|
||||
* ``peakrdl-regblock.cpuif``: This is the namespace that PeakRDL-regblock will
|
||||
search. Any cpuif plugins you create must be enclosed in this namespace in
|
||||
order to be discovered.
|
||||
* ``my_package.my_module:MyCPUIF``: This is the import path that
|
||||
points to your CPUIF class definition.
|
||||
* ``my-cpuif``: The lefthand side of the assignment is your cpuif's name. This
|
||||
text is what the end-user uses in the command line interface to select your
|
||||
CPUIF implementation.
|
||||
232
docs/cpuif/internal_protocol.rst
Normal file
232
docs/cpuif/internal_protocol.rst
Normal file
@@ -0,0 +1,232 @@
|
||||
.. _cpuif_protocol:
|
||||
|
||||
Internal CPUIF Protocol
|
||||
=======================
|
||||
|
||||
Internally, the regblock generator uses a common CPU interface handshake
|
||||
protocol. This strobe-based protocol is designed to add minimal overhead to the
|
||||
regblock implementation, while also being flexible enough to support advanced
|
||||
features of a variety of bus interface standards.
|
||||
|
||||
|
||||
Signal Descriptions
|
||||
-------------------
|
||||
|
||||
Request
|
||||
^^^^^^^
|
||||
cpuif_req
|
||||
When asserted, a read or write transfer will be initiated.
|
||||
Denotes that the following signals are valid: ``cpuif_addr``,
|
||||
``cpuif_req_is_wr``, and ``cpuif_wr_data``.
|
||||
|
||||
A transfer will only initiate if the relevant stall signal is not asserted.
|
||||
If stalled, the request shall be held until accepted. A request's parameters
|
||||
(type, address, etc) shall remain static throughout the stall.
|
||||
|
||||
cpuif_addr
|
||||
Byte-address of the transfer.
|
||||
|
||||
cpuif_req_is_wr
|
||||
If ``1``, denotes that the current transfer is a write. Otherwise transfer is
|
||||
a read.
|
||||
|
||||
cpuif_wr_data
|
||||
Data to be written for the write transfer. This signal is ignored for read
|
||||
transfers.
|
||||
|
||||
cpuif_wr_biten
|
||||
Active-high bit-level write-enable strobes.
|
||||
Only asserted bit positions will change the register value during a write
|
||||
transfer.
|
||||
|
||||
cpuif_req_stall_rd
|
||||
If asserted, and the next pending request is a read operation, then the
|
||||
transfer will not be accepted until this signal is deasserted.
|
||||
|
||||
cpuif_req_stall_wr
|
||||
If asserted, and the next pending request is a write operation, then the
|
||||
transfer will not be accepted until this signal is deasserted.
|
||||
|
||||
|
||||
Read Response
|
||||
^^^^^^^^^^^^^
|
||||
cpuif_rd_ack
|
||||
Single-cycle strobe indicating a read transfer has completed.
|
||||
Qualifies that the following signals are valid: ``cpuif_rd_err`` and
|
||||
``cpuif_rd_data``
|
||||
|
||||
cpuif_rd_err
|
||||
If set, indicates that the read transaction failed and the CPUIF logic
|
||||
should return an error response if possible.
|
||||
|
||||
cpuif_rd_data
|
||||
Read data. Is sampled on the same cycle that ``cpuif_rd_ack`` is asserted.
|
||||
|
||||
Write Response
|
||||
^^^^^^^^^^^^^^
|
||||
cpuif_wr_ack
|
||||
Single-cycle strobe indicating a write transfer has completed.
|
||||
Qualifies that the ``cpuif_wr_err`` signal is valid.
|
||||
|
||||
cpuif_wr_err
|
||||
If set, indicates that the write transaction failed and the CPUIF logic
|
||||
should return an error response if possible.
|
||||
|
||||
|
||||
Transfers
|
||||
---------
|
||||
|
||||
Transfers have the following characteristics:
|
||||
|
||||
* Only one transfer can be initiated per clock-cycle. This is implicit as there
|
||||
is only one set of request signals.
|
||||
* The register block implementation shall guarantee that only one response can be
|
||||
asserted in a given clock cycle. Only one ``cpuif_*_ack`` signal can be
|
||||
asserted at a time.
|
||||
* Responses shall arrive in the same order as their corresponding request was
|
||||
dispatched.
|
||||
|
||||
|
||||
Basic Transfer
|
||||
^^^^^^^^^^^^^^
|
||||
|
||||
Depending on the configuration of the exported register block, transfers can be
|
||||
fully combinational or they may require one or more clock cycles to complete.
|
||||
Both are valid and CPU interface logic shall be designed to anticipate either.
|
||||
|
||||
.. wavedrom::
|
||||
|
||||
{
|
||||
"signal": [
|
||||
{"name": "clk", "wave": "p...."},
|
||||
{"name": "cpuif_req", "wave": "010.."},
|
||||
{"name": "cpuif_req_is_wr", "wave": "x2x.."},
|
||||
{"name": "cpuif_addr", "wave": "x2x..", "data": ["A"]},
|
||||
{},
|
||||
{"name": "cpuif_*_ack", "wave": "010.."},
|
||||
{"name": "cpuif_*_err", "wave": "x2x.."}
|
||||
],
|
||||
"foot": {
|
||||
"text": "Zero-latency transfer"
|
||||
}
|
||||
}
|
||||
|
||||
.. wavedrom::
|
||||
|
||||
{
|
||||
"signal": [
|
||||
{"name": "clk", "wave": "p..|..."},
|
||||
{"name": "cpuif_req", "wave": "010|..."},
|
||||
{"name": "cpuif_req_is_wr", "wave": "x2x|..."},
|
||||
{"name": "cpuif_addr", "wave": "x2x|...", "data": ["A"]},
|
||||
{},
|
||||
{"name": "cpuif_*_ack", "wave": "0..|10."},
|
||||
{"name": "cpuif_*_err", "wave": "x..|2x."}
|
||||
],
|
||||
"foot": {
|
||||
"text": "Transfer with non-zero latency"
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
Read & Write Transactions
|
||||
-------------------------
|
||||
|
||||
Waveforms below show the timing relationship of simple read/write transactions.
|
||||
For brevity, only showing non-zero latency transfers.
|
||||
|
||||
.. wavedrom::
|
||||
|
||||
{
|
||||
"signal": [
|
||||
{"name": "clk", "wave": "p..|..."},
|
||||
{"name": "cpuif_req", "wave": "010|..."},
|
||||
{"name": "cpuif_req_is_wr", "wave": "x0x|..."},
|
||||
{"name": "cpuif_addr", "wave": "x3x|...", "data": ["A"]},
|
||||
{},
|
||||
{"name": "cpuif_rd_ack", "wave": "0..|10."},
|
||||
{"name": "cpuif_rd_err", "wave": "x..|0x."},
|
||||
{"name": "cpuif_rd_data", "wave": "x..|5x.", "data": ["D"]}
|
||||
],
|
||||
"foot": {
|
||||
"text": "Read Transaction"
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
.. wavedrom::
|
||||
|
||||
{
|
||||
"signal": [
|
||||
{"name": "clk", "wave": "p..|..."},
|
||||
{"name": "cpuif_req", "wave": "010|..."},
|
||||
{"name": "cpuif_req_is_wr", "wave": "x1x|..."},
|
||||
{"name": "cpuif_addr", "wave": "x3x|...", "data": ["A"]},
|
||||
{"name": "cpuif_wr_data", "wave": "x5x|...", "data": ["D"]},
|
||||
{},
|
||||
{"name": "cpuif_wr_ack", "wave": "0..|10."},
|
||||
{"name": "cpuif_wr_err", "wave": "x..|0x."}
|
||||
],
|
||||
"foot": {
|
||||
"text": "Write Transaction"
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
Transaction Pipelining & Stalls
|
||||
-------------------------------
|
||||
If the CPU interface supports it, read and write operations can be pipelined.
|
||||
|
||||
.. wavedrom::
|
||||
|
||||
{
|
||||
"signal": [
|
||||
{"name": "clk", "wave": "p......"},
|
||||
{"name": "cpuif_req", "wave": "01..0.."},
|
||||
{"name": "cpuif_req_is_wr", "wave": "x0..x.."},
|
||||
{"name": "cpuif_addr", "wave": "x333x..", "data": ["A1", "A2", "A3"]},
|
||||
{},
|
||||
{"name": "cpuif_rd_ack", "wave": "0.1..0."},
|
||||
{"name": "cpuif_rd_err", "wave": "x.0..x."},
|
||||
{"name": "cpuif_rd_data", "wave": "x.555x.", "data": ["D1", "D2", "D3"]}
|
||||
]
|
||||
}
|
||||
|
||||
It is very likely that the transfer latency of a read transaction will not
|
||||
be the same as a write for a given register block configuration. Typically read
|
||||
operations will be more deeply pipelined. This latency asymmetry would create a
|
||||
hazard for response collisions.
|
||||
|
||||
In order to eliminate this hazard, additional stall signals (``cpuif_req_stall_rd``
|
||||
and ``cpuif_req_stall_wr``) are provided to delay the next incoming transfer
|
||||
request if necessary. When asserted, the CPU interface shall hold the next pending
|
||||
request until the stall is cleared.
|
||||
|
||||
For non-pipelined CPU interfaces that only allow one outstanding transaction at a time,
|
||||
these stall signals can be safely ignored.
|
||||
|
||||
In the following example, the regblock is configured such that:
|
||||
|
||||
* A read transaction takes 1 clock cycle to complete
|
||||
* A write transaction takes 0 clock cycles to complete
|
||||
|
||||
.. wavedrom::
|
||||
|
||||
{
|
||||
"signal": [
|
||||
{"name": "clk", "wave": "p......."},
|
||||
{"name": "cpuif_req", "wave": "01.....0"},
|
||||
{"name": "cpuif_req_is_wr", "wave": "x1.0.1.x"},
|
||||
{"name": "cpuif_addr", "wave": "x33443.x", "data": ["W1", "W2", "R1", "R2", "W3"]},
|
||||
{"name": "cpuif_req_stall_wr", "wave": "0...1.0."},
|
||||
{},
|
||||
{"name": "cpuif_rd_ack", "wave": "0...220.", "data": ["R1", "R2"]},
|
||||
{"name": "cpuif_wr_ack", "wave": "0220..20", "data": ["W1", "W2", "W3"]}
|
||||
]
|
||||
}
|
||||
|
||||
In the above waveform, observe that:
|
||||
|
||||
* The ``R2`` read request is not affected by the assertion of the write stall,
|
||||
since the write stall only applies to write requests.
|
||||
* The ``W3`` write request is stalled for one cycle, and is accepted once the stall is cleared.
|
||||
36
docs/cpuif/introduction.rst
Normal file
36
docs/cpuif/introduction.rst
Normal file
@@ -0,0 +1,36 @@
|
||||
Introduction
|
||||
============
|
||||
|
||||
The CPU interface logic layer provides an abstraction between the
|
||||
application-specific bus protocol and the internal register file logic.
|
||||
When exporting a design, you can select from a variety of popular CPU interface
|
||||
protocols. These are described in more detail in the pages that follow.
|
||||
|
||||
|
||||
Bus Width
|
||||
^^^^^^^^^
|
||||
The CPU interface bus width is automatically determined from the contents of the
|
||||
design being exported. The bus width is equal to the widest ``accesswidth``
|
||||
encountered in the design.
|
||||
|
||||
|
||||
Addressing
|
||||
^^^^^^^^^^
|
||||
|
||||
The regblock exporter will always generate its address decoding logic using local
|
||||
address offsets. The absolute address offset of your device shall be
|
||||
handled by your system interconnect, and present addresses to the regblock that
|
||||
only include the local offset.
|
||||
|
||||
For example, consider a fictional AXI4-Lite device that:
|
||||
|
||||
- Consumes 4 kB of address space (``0x000``-``0xFFF``).
|
||||
- The device is instantiated in your system at global address range ``0x30_0000 - 0x50_0FFF``.
|
||||
- After decoding transactions destined to the device, the system interconnect shall
|
||||
ensure that AxADDR values are presented to the device as relative addresses - within
|
||||
the range of ``0x000``-``0xFFF``.
|
||||
- If care is taken to align the global address offset to the size of the device,
|
||||
creating a relative address is as simple as pruning down address bits.
|
||||
|
||||
By default, the bit-width of the address bus will be the minimum size to span the contents
|
||||
of the register block. If needed, the address width can be overridden to a larger range.
|
||||
10
docs/cpuif/passthrough.rst
Normal file
10
docs/cpuif/passthrough.rst
Normal file
@@ -0,0 +1,10 @@
|
||||
CPUIF Passthrough
|
||||
=================
|
||||
|
||||
This CPUIF mode bypasses the protocol converter stage and directly exposes the
|
||||
internal CPUIF handshake signals to the user.
|
||||
|
||||
* Command line: ``--cpuif passthrough``
|
||||
* Class: :class:`peakrdl_regblock.cpuif.passthrough.PassthroughCpuif`
|
||||
|
||||
For more details on the protocol itself, see: :ref:`cpuif_protocol`.
|
||||
Reference in New Issue
Block a user