revamp docs
This commit is contained in:
@@ -20,7 +20,7 @@ Both APB3 and APB4 standards are supported.
|
||||
APB3
|
||||
----
|
||||
|
||||
Implements the register block using an
|
||||
Implements the bus decoder using an
|
||||
`AMBA 3 APB <https://developer.arm.com/documentation/ihi0024/b/Introduction/About-the-AMBA-3-APB>`_
|
||||
CPU interface.
|
||||
|
||||
@@ -29,19 +29,19 @@ The APB3 CPU interface comes in two i/o port flavors:
|
||||
SystemVerilog Interface
|
||||
* Command line: ``--cpuif apb3``
|
||||
* Interface Definition: :download:`apb3_intf.sv <../../hdl-src/apb3_intf.sv>`
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.apb3.APB3_Cpuif`
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.apb3.APB3Cpuif`
|
||||
|
||||
Flattened inputs/outputs
|
||||
Flattens the interface into discrete input and output ports.
|
||||
|
||||
* Command line: ``--cpuif apb3-flat``
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.apb3.APB3_Cpuif_flattened`
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.apb3.APB3CpuifFlat`
|
||||
|
||||
|
||||
APB4
|
||||
----
|
||||
|
||||
Implements the register block using an
|
||||
Implements the bus decoder using an
|
||||
`AMBA 4 APB <https://developer.arm.com/documentation/ihi0024/d/?lang=en>`_
|
||||
CPU interface.
|
||||
|
||||
@@ -50,10 +50,10 @@ The APB4 CPU interface comes in two i/o port flavors:
|
||||
SystemVerilog Interface
|
||||
* Command line: ``--cpuif apb4``
|
||||
* Interface Definition: :download:`apb4_intf.sv <../../hdl-src/apb4_intf.sv>`
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.apb4.APB4_Cpuif`
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.apb4.APB4Cpuif`
|
||||
|
||||
Flattened inputs/outputs
|
||||
Flattens the interface into discrete input and output ports.
|
||||
|
||||
* Command line: ``--cpuif apb4-flat``
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.apb4.APB4_Cpuif_flattened`
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.apb4.APB4CpuifFlat`
|
||||
|
||||
@@ -1,33 +0,0 @@
|
||||
Intel Avalon
|
||||
============
|
||||
|
||||
Implements the register block using an
|
||||
`Intel Avalon MM <https://www.intel.com/content/www/us/en/docs/programmable/683091/22-3/memory-mapped-interfaces.html>`_
|
||||
CPU interface.
|
||||
|
||||
The Avalon interface comes in two i/o port flavors:
|
||||
|
||||
SystemVerilog Interface
|
||||
* Command line: ``--cpuif avalon-mm``
|
||||
* Interface Definition: :download:`avalon_mm_intf.sv <../../hdl-src/avalon_mm_intf.sv>`
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.avalon.Avalon_Cpuif`
|
||||
|
||||
Flattened inputs/outputs
|
||||
Flattens the interface into discrete input and output ports.
|
||||
|
||||
* Command line: ``--cpuif avalon-mm-flat``
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.avalon.Avalon_Cpuif_flattened`
|
||||
|
||||
|
||||
Implementation Details
|
||||
----------------------
|
||||
This implementation of the Avalon protocol has the following features:
|
||||
|
||||
* Interface uses word addressing.
|
||||
* Supports `pipelined transfers <https://www.intel.com/content/www/us/en/docs/programmable/683091/22-3/pipelined-transfers.html>`_
|
||||
* Responses may have variable latency
|
||||
|
||||
In most cases, latency is fixed and is determined by how many retiming
|
||||
stages are enabled in your design.
|
||||
However if your design contains external components, access latency is
|
||||
not guaranteed to be uniform.
|
||||
@@ -3,7 +3,7 @@
|
||||
AMBA AXI4-Lite
|
||||
==============
|
||||
|
||||
Implements the register block using an
|
||||
Implements the bus decoder using an
|
||||
`AMBA AXI4-Lite <https://developer.arm.com/documentation/ihi0022/e/AMBA-AXI4-Lite-Interface-Specification>`_
|
||||
CPU interface.
|
||||
|
||||
@@ -12,21 +12,22 @@ The AXI4-Lite CPU interface comes in two i/o port flavors:
|
||||
SystemVerilog Interface
|
||||
* Command line: ``--cpuif axi4-lite``
|
||||
* Interface Definition: :download:`axi4lite_intf.sv <../../hdl-src/axi4lite_intf.sv>`
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.axi4lite.AXI4Lite_Cpuif`
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.axi4lite.AXI4LiteCpuif`
|
||||
|
||||
Flattened inputs/outputs
|
||||
Flattens the interface into discrete input and output ports.
|
||||
|
||||
* Command line: ``--cpuif axi4-lite-flat``
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.axi4lite.AXI4Lite_Cpuif_flattened`
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.axi4lite.AXI4LiteCpuifFlat`
|
||||
|
||||
|
||||
Pipelined Performance
|
||||
---------------------
|
||||
This implementation of the AXI4-Lite interface supports transaction pipelining
|
||||
which can significantly improve performance of back-to-back transfers.
|
||||
Protocol Notes
|
||||
--------------
|
||||
The AXI4-Lite adapter is intentionally simplified:
|
||||
|
||||
In order to support transaction pipelining, the CPU interface will accept multiple
|
||||
concurrent transactions. The number of outstanding transactions allowed is automatically
|
||||
determined based on the register file pipeline depth (affected by retiming options),
|
||||
and influences the depth of the internal transaction response skid buffer.
|
||||
* AW and W channels must be asserted together for writes. The adapter does not
|
||||
support decoupled address/data for writes.
|
||||
* Only a single outstanding transaction is supported. Masters should wait for
|
||||
the corresponding response before issuing the next request.
|
||||
* Burst transfers are not supported (single-beat transfers only), consistent
|
||||
with AXI4-Lite.
|
||||
|
||||
@@ -29,15 +29,15 @@ Rather than rewriting a new CPU interface definition, you can extend and adjust
|
||||
|
||||
.. code-block:: python
|
||||
|
||||
from peakrdl_busdecoder.cpuif.axi4lite import AXI4Lite_Cpuif
|
||||
from peakrdl_busdecoder.cpuif.axi4lite import AXI4LiteCpuif
|
||||
|
||||
class My_AXI4Lite(AXI4Lite_Cpuif):
|
||||
class My_AXI4Lite(AXI4LiteCpuif):
|
||||
@property
|
||||
def port_declaration(self) -> str:
|
||||
# Override the port declaration text to use the alternate interface name and modport style
|
||||
return "axi4_lite_interface.Slave_mp s_axil"
|
||||
|
||||
def signal(self, name:str) -> str:
|
||||
def signal(self, name: str, node=None, indexer=None) -> str:
|
||||
# Override the signal names to be lowercase instead
|
||||
return "s_axil." + name.lower()
|
||||
|
||||
@@ -72,7 +72,7 @@ you can define your own.
|
||||
|
||||
2. Create a Python class that defines your CPUIF
|
||||
|
||||
Extend your class from :class:`peakrdl_busdecoder.cpuif.CpuifBase`.
|
||||
Extend your class from :class:`peakrdl_busdecoder.cpuif.BaseCpuif`.
|
||||
Define the port declaration string, and provide a reference to your template file.
|
||||
|
||||
3. Use your new CPUIF definition when exporting.
|
||||
|
||||
@@ -3,10 +3,10 @@
|
||||
Internal CPUIF Protocol
|
||||
=======================
|
||||
|
||||
Internally, the busdecoder generator uses a common CPU interface handshake
|
||||
protocol. This strobe-based protocol is designed to add minimal overhead to the
|
||||
busdecoder implementation, while also being flexible enough to support advanced
|
||||
features of a variety of bus interface standards.
|
||||
Internally, the bus decoder uses a small set of common request/response signals
|
||||
that each CPU interface adapter must drive. This protocol is intentionally simple
|
||||
and supports a single outstanding transaction at a time. The CPU interface logic
|
||||
is responsible for holding request signals stable until the transaction completes.
|
||||
|
||||
|
||||
Signal Descriptions
|
||||
@@ -15,62 +15,49 @@ Signal Descriptions
|
||||
Request
|
||||
^^^^^^^
|
||||
cpuif_req
|
||||
When asserted, a read or write transfer will be initiated.
|
||||
Denotes that the following signals are valid: ``cpuif_addr``,
|
||||
``cpuif_req_is_wr``, and ``cpuif_wr_data``.
|
||||
When asserted, a read or write transfer is in progress. Request signals must
|
||||
remain stable until the transfer completes.
|
||||
|
||||
A transfer will only initiate if the relevant stall signal is not asserted.
|
||||
If stalled, the request shall be held until accepted. A request's parameters
|
||||
(type, address, etc) shall remain static throughout the stall.
|
||||
cpuif_wr_en
|
||||
When asserted alongside ``cpuif_req``, denotes a write transfer.
|
||||
|
||||
cpuif_addr
|
||||
Byte-address of the transfer.
|
||||
cpuif_rd_en
|
||||
When asserted alongside ``cpuif_req``, denotes a read transfer.
|
||||
|
||||
cpuif_req_is_wr
|
||||
If ``1``, denotes that the current transfer is a write. Otherwise transfer is
|
||||
a read.
|
||||
cpuif_wr_addr / cpuif_rd_addr
|
||||
Byte address of the write or read transfer, respectively.
|
||||
|
||||
cpuif_wr_data
|
||||
Data to be written for the write transfer. This signal is ignored for read
|
||||
transfers.
|
||||
Data to be written for the write transfer.
|
||||
|
||||
cpuif_wr_biten
|
||||
Active-high bit-level write-enable strobes.
|
||||
Only asserted bit positions will change the register value during a write
|
||||
transfer.
|
||||
|
||||
cpuif_req_stall_rd
|
||||
If asserted, and the next pending request is a read operation, then the
|
||||
transfer will not be accepted until this signal is deasserted.
|
||||
|
||||
cpuif_req_stall_wr
|
||||
If asserted, and the next pending request is a write operation, then the
|
||||
transfer will not be accepted until this signal is deasserted.
|
||||
cpuif_wr_byte_en
|
||||
Active-high byte-enable strobes for writes. Some CPU interfaces do not
|
||||
provide byte enables and may drive this as all-ones.
|
||||
|
||||
|
||||
Read Response
|
||||
^^^^^^^^^^^^^
|
||||
cpuif_rd_ack
|
||||
Single-cycle strobe indicating a read transfer has completed.
|
||||
Qualifies that the following signals are valid: ``cpuif_rd_err`` and
|
||||
``cpuif_rd_data``
|
||||
Qualifies ``cpuif_rd_err`` and ``cpuif_rd_data``.
|
||||
|
||||
cpuif_rd_err
|
||||
If set, indicates that the read transaction failed and the CPUIF logic
|
||||
should return an error response if possible.
|
||||
Indicates that the read transaction failed. The CPU interface should return
|
||||
an error response if possible.
|
||||
|
||||
cpuif_rd_data
|
||||
Read data. Is sampled on the same cycle that ``cpuif_rd_ack`` is asserted.
|
||||
Read data. Sampled on the same cycle that ``cpuif_rd_ack`` is asserted.
|
||||
|
||||
|
||||
Write Response
|
||||
^^^^^^^^^^^^^^
|
||||
cpuif_wr_ack
|
||||
Single-cycle strobe indicating a write transfer has completed.
|
||||
Qualifies that the ``cpuif_wr_err`` signal is valid.
|
||||
Qualifies ``cpuif_wr_err``.
|
||||
|
||||
cpuif_wr_err
|
||||
If set, indicates that the write transaction failed and the CPUIF logic
|
||||
should return an error response if possible.
|
||||
Indicates that the write transaction failed. The CPU interface should return
|
||||
an error response if possible.
|
||||
|
||||
|
||||
Transfers
|
||||
@@ -78,155 +65,7 @@ Transfers
|
||||
|
||||
Transfers have the following characteristics:
|
||||
|
||||
* Only one transfer can be initiated per clock-cycle. This is implicit as there
|
||||
is only one set of request signals.
|
||||
* The register block implementation shall guarantee that only one response can be
|
||||
asserted in a given clock cycle. Only one ``cpuif_*_ack`` signal can be
|
||||
asserted at a time.
|
||||
* Responses shall arrive in the same order as their corresponding request was
|
||||
dispatched.
|
||||
|
||||
|
||||
Basic Transfer
|
||||
^^^^^^^^^^^^^^
|
||||
|
||||
Depending on the configuration of the exported register block, transfers can be
|
||||
fully combinational or they may require one or more clock cycles to complete.
|
||||
Both are valid and CPU interface logic shall be designed to anticipate either.
|
||||
|
||||
.. wavedrom::
|
||||
|
||||
{
|
||||
"signal": [
|
||||
{"name": "clk", "wave": "p...."},
|
||||
{"name": "cpuif_req", "wave": "010.."},
|
||||
{"name": "cpuif_req_is_wr", "wave": "x2x.."},
|
||||
{"name": "cpuif_addr", "wave": "x2x..", "data": ["A"]},
|
||||
{},
|
||||
{"name": "cpuif_*_ack", "wave": "010.."},
|
||||
{"name": "cpuif_*_err", "wave": "x2x.."}
|
||||
],
|
||||
"foot": {
|
||||
"text": "Zero-latency transfer"
|
||||
}
|
||||
}
|
||||
|
||||
.. wavedrom::
|
||||
|
||||
{
|
||||
"signal": [
|
||||
{"name": "clk", "wave": "p..|..."},
|
||||
{"name": "cpuif_req", "wave": "010|..."},
|
||||
{"name": "cpuif_req_is_wr", "wave": "x2x|..."},
|
||||
{"name": "cpuif_addr", "wave": "x2x|...", "data": ["A"]},
|
||||
{},
|
||||
{"name": "cpuif_*_ack", "wave": "0..|10."},
|
||||
{"name": "cpuif_*_err", "wave": "x..|2x."}
|
||||
],
|
||||
"foot": {
|
||||
"text": "Transfer with non-zero latency"
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
Read & Write Transactions
|
||||
-------------------------
|
||||
|
||||
Waveforms below show the timing relationship of simple read/write transactions.
|
||||
For brevity, only showing non-zero latency transfers.
|
||||
|
||||
.. wavedrom::
|
||||
|
||||
{
|
||||
"signal": [
|
||||
{"name": "clk", "wave": "p..|..."},
|
||||
{"name": "cpuif_req", "wave": "010|..."},
|
||||
{"name": "cpuif_req_is_wr", "wave": "x0x|..."},
|
||||
{"name": "cpuif_addr", "wave": "x3x|...", "data": ["A"]},
|
||||
{},
|
||||
{"name": "cpuif_rd_ack", "wave": "0..|10."},
|
||||
{"name": "cpuif_rd_err", "wave": "x..|0x."},
|
||||
{"name": "cpuif_rd_data", "wave": "x..|5x.", "data": ["D"]}
|
||||
],
|
||||
"foot": {
|
||||
"text": "Read Transaction"
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
.. wavedrom::
|
||||
|
||||
{
|
||||
"signal": [
|
||||
{"name": "clk", "wave": "p..|..."},
|
||||
{"name": "cpuif_req", "wave": "010|..."},
|
||||
{"name": "cpuif_req_is_wr", "wave": "x1x|..."},
|
||||
{"name": "cpuif_addr", "wave": "x3x|...", "data": ["A"]},
|
||||
{"name": "cpuif_wr_data", "wave": "x5x|...", "data": ["D"]},
|
||||
{},
|
||||
{"name": "cpuif_wr_ack", "wave": "0..|10."},
|
||||
{"name": "cpuif_wr_err", "wave": "x..|0x."}
|
||||
],
|
||||
"foot": {
|
||||
"text": "Write Transaction"
|
||||
}
|
||||
}
|
||||
|
||||
|
||||
Transaction Pipelining & Stalls
|
||||
-------------------------------
|
||||
If the CPU interface supports it, read and write operations can be pipelined.
|
||||
|
||||
.. wavedrom::
|
||||
|
||||
{
|
||||
"signal": [
|
||||
{"name": "clk", "wave": "p......"},
|
||||
{"name": "cpuif_req", "wave": "01..0.."},
|
||||
{"name": "cpuif_req_is_wr", "wave": "x0..x.."},
|
||||
{"name": "cpuif_addr", "wave": "x333x..", "data": ["A1", "A2", "A3"]},
|
||||
{},
|
||||
{"name": "cpuif_rd_ack", "wave": "0.1..0."},
|
||||
{"name": "cpuif_rd_err", "wave": "x.0..x."},
|
||||
{"name": "cpuif_rd_data", "wave": "x.555x.", "data": ["D1", "D2", "D3"]}
|
||||
]
|
||||
}
|
||||
|
||||
It is very likely that the transfer latency of a read transaction will not
|
||||
be the same as a write for a given register block configuration. Typically read
|
||||
operations will be more deeply pipelined. This latency asymmetry would create a
|
||||
hazard for response collisions.
|
||||
|
||||
In order to eliminate this hazard, additional stall signals (``cpuif_req_stall_rd``
|
||||
and ``cpuif_req_stall_wr``) are provided to delay the next incoming transfer
|
||||
request if necessary. When asserted, the CPU interface shall hold the next pending
|
||||
request until the stall is cleared.
|
||||
|
||||
For non-pipelined CPU interfaces that only allow one outstanding transaction at a time,
|
||||
these stall signals can be safely ignored.
|
||||
|
||||
In the following example, the busdecoder is configured such that:
|
||||
|
||||
* A read transaction takes 1 clock cycle to complete
|
||||
* A write transaction takes 0 clock cycles to complete
|
||||
|
||||
.. wavedrom::
|
||||
|
||||
{
|
||||
"signal": [
|
||||
{"name": "clk", "wave": "p......."},
|
||||
{"name": "cpuif_req", "wave": "01.....0"},
|
||||
{"name": "cpuif_req_is_wr", "wave": "x1.0.1.x"},
|
||||
{"name": "cpuif_addr", "wave": "x33443.x", "data": ["W1", "W2", "R1", "R2", "W3"]},
|
||||
{"name": "cpuif_req_stall_wr", "wave": "0...1.0."},
|
||||
{},
|
||||
{"name": "cpuif_rd_ack", "wave": "0...220.", "data": ["R1", "R2"]},
|
||||
{"name": "cpuif_wr_ack", "wave": "0220..20", "data": ["W1", "W2", "W3"]}
|
||||
]
|
||||
}
|
||||
|
||||
In the above waveform, observe that:
|
||||
|
||||
* The ``R2`` read request is not affected by the assertion of the write stall,
|
||||
since the write stall only applies to write requests.
|
||||
* The ``W3`` write request is stalled for one cycle, and is accepted once the stall is cleared.
|
||||
* Only one outstanding transaction is supported.
|
||||
* The CPU interface must hold ``cpuif_req`` and request parameters stable until
|
||||
the corresponding ``cpuif_*_ack`` is asserted.
|
||||
* Responses shall arrive in the same order as requests.
|
||||
|
||||
@@ -2,16 +2,17 @@ Introduction
|
||||
============
|
||||
|
||||
The CPU interface logic layer provides an abstraction between the
|
||||
application-specific bus protocol and the internal register file logic.
|
||||
When exporting a design, you can select from a variety of popular CPU interface
|
||||
protocols. These are described in more detail in the pages that follow.
|
||||
application-specific bus protocol and the internal bus decoder logic.
|
||||
When exporting a design, you can select from supported CPU interface protocols.
|
||||
These are described in more detail in the pages that follow.
|
||||
|
||||
|
||||
Bus Width
|
||||
^^^^^^^^^
|
||||
The CPU interface bus width is automatically determined from the contents of the
|
||||
design being exported. The bus width is equal to the widest ``accesswidth``
|
||||
encountered in the design.
|
||||
The CPU interface bus width is inferred from the contents of the design.
|
||||
It is intended to be equal to the widest ``accesswidth`` encountered in the
|
||||
design. If the exported addrmap contains only external components, the width
|
||||
cannot be inferred and will default to 32 bits.
|
||||
|
||||
|
||||
Addressing
|
||||
@@ -32,5 +33,6 @@ For example, consider a fictional AXI4-Lite device that:
|
||||
- If care is taken to align the global address offset to the size of the device,
|
||||
creating a relative address is as simple as pruning down address bits.
|
||||
|
||||
By default, the bit-width of the address bus will be the minimum size to span the contents
|
||||
of the register block. If needed, the address width can be overridden to a larger range.
|
||||
By default, the bit-width of the address bus will be the minimum size to span the
|
||||
contents of the decoded address space. If needed, the address width can be
|
||||
overridden to a larger range using ``--addr-width``.
|
||||
|
||||
@@ -1,10 +0,0 @@
|
||||
CPUIF Passthrough
|
||||
=================
|
||||
|
||||
This CPUIF mode bypasses the protocol converter stage and directly exposes the
|
||||
internal CPUIF handshake signals to the user.
|
||||
|
||||
* Command line: ``--cpuif passthrough``
|
||||
* Class: :class:`peakrdl_busdecoder.cpuif.passthrough.PassthroughCpuif`
|
||||
|
||||
For more details on the protocol itself, see: :ref:`cpuif_protocol`.
|
||||
Reference in New Issue
Block a user