Rework cpuif to support transaction pipelining. Add more docs. update simulator

Alex Mykyta
2022-02-13 17:25:31 -08:00
parent de5eecf0e7
commit d0ba488904
21 changed files with 493 additions and 68 deletions

@@ -1,26 +1,54 @@
Register Block Architecture
===========================
TODO: Add full block diagram
The generated register block RTL is organized into several sections.
Each section is automatically generated based on the source register model and
is rendered into the output register block SystemVerilog RTL.
.. figure:: diagrams/arch.png
Although it is not completely necessary to know the inner workings of the
generated RTL, it can be helpful to understand the implications of various
exporter configuration options.
CPU Interface
-------------
The CPU interface logic layer provides an abstraction between the
application-specific bus protocol and the internal register file logic.
This logic layer normalizes external CPU read & write transactions into a common
:ref:`cpuif_protocol` that is used to interact with the register file.
TODO: describe boundary signals. Timing diagrams
Address Decode
--------------
A common address decode operation is generated which computes individual access
strobes for each software-accessible register in the design.
This operation is performed completely combinationally.
TODO: describe boundary signals. Timing diagrams
Field Logic
-----------
This layer of the register block implements the storage elements and state-change
logic for every field in the design. Field state is updated based on address
decode strobes from software read/write actions, as well as events from the
hardware interface input struct.
This section also assigns any hardware interface outputs.
TODO: describe boundary signals. Timing diagrams
Readback
--------
The readback layer aggregates and reduces all readable registers into a single
read response. During a read operation, the same address decode strobes are used
to select the active register that is being accessed.
This allows for a simple OR-reduction operation to be used to compute the read
data response.
TODO: describe boundary signals. Timing diagrams
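The OR-reduction described above can be illustrated with a small behavioral model. This is only a sketch, not the generated RTL: the function name ``readback`` and the example register values are hypothetical, and it assumes the decode strobes are one-hot (at most one register selected per read).

```python
def readback(strobes, reg_values):
    """OR-reduce register values gated by their address decode strobes.

    Assumption: `strobes` is one-hot (or all False), so at most one
    register contributes a non-zero term to the reduction.
    """
    rd_data = 0
    for strobe, value in zip(strobes, reg_values):
        # Gate each register's value with its strobe, then OR it in.
        rd_data |= value if strobe else 0
    return rd_data


# Hypothetical register contents; the third register is being read.
regs = [0x11, 0x22, 0x33, 0x44]
print(hex(readback([False, False, True, False], regs)))  # 0x33
```

Because unselected registers contribute zero, no priority mux is needed; the reduction tree can also be split at a balanced point, which is where an optional retiming stage would fit.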
Retiming options
^^^^^^^^^^^^^^^^
For designs with a large number of software-readable registers, an optional
fanin re-timing stage can be enabled. This stage is automatically inserted at a
balanced point in the read-data reduction so that fanin and logic-levels are
optimally reduced.
A second optional read response retiming register can be enabled in-line with the
path back to the CPU interface layer. This can be useful if the CPU interface protocol
used has a fully combinational response path, and needs to be retimed further.

@@ -0,0 +1,228 @@
.. _cpuif_protocol:
Internal CPUIF Protocol
=======================
Internally, the regblock generator uses a common CPU interface handshake
protocol. This strobe-based protocol is designed to add minimal overhead to the
regblock implementation, while also being flexible enough to support advanced
features of a variety of bus interface standards.
Signal Descriptions
-------------------
Request
^^^^^^^
cpuif_req
When asserted, a read or write transfer will be initiated.
Denotes that the following signals are valid: ``cpuif_addr``,
``cpuif_req_is_wr``, and ``cpuif_wr_data``.
A transfer will only initiate if the relevant stall signal is not asserted.
If stalled, the request shall be held until accepted. A request's parameters
(type, address, etc.) shall remain static throughout the stall.
cpuif_addr
Byte-address of the transfer.
cpuif_req_is_wr
If ``1``, denotes that the current transfer is a write. Otherwise, the transfer
is a read.
cpuif_wr_data
Data to be written for the write transfer. This signal is ignored for read
transfers.
cpuif_req_stall_rd
If asserted, and the next pending request is a read operation, then the
transfer will not be accepted until this signal is deasserted.
cpuif_req_stall_wr
If asserted, and the next pending request is a write operation, then the
transfer will not be accepted until this signal is deasserted.
Read Response
^^^^^^^^^^^^^
cpuif_rd_ack
Single-cycle strobe indicating a read transfer has completed.
Qualifies that the following signals are valid: ``cpuif_rd_err`` and
``cpuif_rd_data``.
cpuif_rd_err
If set, indicates that the read transaction failed and the CPUIF logic
should return an error response if possible.
cpuif_rd_data
Read data. The width of this bus is determined by the size of the largest
register in the design.
Write Response
^^^^^^^^^^^^^^
cpuif_wr_ack
Single-cycle strobe indicating a write transfer has completed.
Qualifies that the ``cpuif_wr_err`` signal is valid.
cpuif_wr_err
If set, indicates that the write transaction failed and the CPUIF logic
should return an error response if possible.
Transfers
---------
Transfers have the following characteristics:
* Only one transfer can be initiated per clock-cycle. This is implicit as there
is only one set of request signals.
* The register block implementation shall guarantee that only one response can be
asserted in a given clock cycle. Only one ``cpuif_*_ack`` signal can be
asserted at a time.
* Responses shall arrive in the same order as their corresponding request was
dispatched.
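The response rules above can be expressed as a small trace checker. This is a hedged sketch for illustration only: ``check_responses`` and the trace format (one ``(rd_ack, wr_ack)`` tuple per clock cycle, holding a transfer label or ``None``) are assumptions, not part of the regblock test library.

```python
def check_responses(dispatch_order, ack_trace):
    """Check a per-cycle ack trace against the transfer rules.

    dispatch_order: labels of accepted requests, in the order dispatched.
    ack_trace: one (rd_ack, wr_ack) tuple per cycle; each entry is the
               label of the completing transfer, or None if not asserted.
    Returns None if the trace is legal, else a description of the violation.
    """
    pending = list(dispatch_order)
    for cycle, (rd_ack, wr_ack) in enumerate(ack_trace):
        # Rule: only one cpuif_*_ack may be asserted in a given cycle.
        if rd_ack is not None and wr_ack is not None:
            return f"cycle {cycle}: rd_ack and wr_ack asserted together"
        acked = rd_ack if rd_ack is not None else wr_ack
        if acked is None:
            continue
        # Rule: responses arrive in the same order as their requests.
        if not pending or pending[0] != acked:
            return f"cycle {cycle}: response {acked} is out of order"
        pending.pop(0)
    return None


# A read with 2 cycles of latency followed by a write with 1 cycle of
# latency would respond in the same cycle, which violates the rules.
print(check_responses(["R1", "W1"], [(None, None), (None, None), ("R1", "W1")]))
# cycle 2: rd_ack and wr_ack asserted together
```

A checker like this only observes responses; it does not model how the register block avoids the collision in the first place.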
Basic Transfer
^^^^^^^^^^^^^^
Depending on the configuration of the exported register block, transfers can be
fully combinational or they may require one or more clock cycles to complete.
Both are valid and CPU interface logic shall be designed to anticipate either.
.. wavedrom::
{
signal: [
{name: 'clk', wave: 'p....'},
{name: 'cpuif_req', wave: '010..'},
{name: 'cpuif_req_is_wr', wave: 'x2x..'},
{name: 'cpuif_addr', wave: 'x2x..', data: ['A']},
{},
{name: 'cpuif_*_ack', wave: '010..'},
{name: 'cpuif_*_err', wave: 'x2x..'},
],
foot: {
text: "Zero-latency transfer"
}
}
.. wavedrom::
{
signal: [
{name: 'clk', wave: 'p..|...'},
{name: 'cpuif_req', wave: '010|...'},
{name: 'cpuif_req_is_wr', wave: 'x2x|...'},
{name: 'cpuif_addr', wave: 'x2x|...', data: ['A']},
{},
{name: 'cpuif_*_ack', wave: '0..|10.'},
{name: 'cpuif_*_err', wave: 'x..|2x.'},
],
foot: {
text: "Transfer with non-zero latency"
}
}
Read & Write Transactions
-------------------------
Waveforms below show the timing relationship of simple read/write transactions.
For brevity, only non-zero latency transfers are shown.
.. wavedrom::
{
signal: [
{name: 'clk', wave: 'p..|...'},
{name: 'cpuif_req', wave: '010|...'},
{name: 'cpuif_req_is_wr', wave: 'x0x|...'},
{name: 'cpuif_addr', wave: 'x3x|...', data: ['A']},
{},
{name: 'cpuif_rd_ack', wave: '0..|10.'},
{name: 'cpuif_rd_err', wave: 'x..|0x.'},
{name: 'cpuif_rd_data', wave: 'x..|5x.', data: ['D']},
],
foot: {
text: "Read Transaction"
}
}
.. wavedrom::
{
signal: [
{name: 'clk', wave: 'p..|...'},
{name: 'cpuif_req', wave: '010|...'},
{name: 'cpuif_req_is_wr', wave: 'x1x|...'},
{name: 'cpuif_addr', wave: 'x3x|...', data: ['A']},
{name: 'cpuif_wr_data', wave: 'x5x|...', data: ['D']},
{},
{name: 'cpuif_wr_ack', wave: '0..|10.'},
{name: 'cpuif_wr_err', wave: 'x..|0x.'},
],
foot: {
text: "Write Transaction"
}
}
Transaction Pipelining & Stalls
-------------------------------
If the CPU interface supports it, read and write operations can be pipelined.
.. wavedrom::
{
signal: [
{name: 'clk', wave: 'p......'},
{name: 'cpuif_req', wave: '01..0..'},
{name: 'cpuif_req_is_wr', wave: 'x0..x..'},
{name: 'cpuif_addr', wave: 'x333x..', data: ['A1', 'A2', 'A3']},
{},
{name: 'cpuif_rd_ack', wave: '0.1..0.'},
{name: 'cpuif_rd_err', wave: 'x.0..x.'},
{name: 'cpuif_rd_data', wave: 'x.555x.', data: ['D1', 'D2', 'D3']},
]
}
It is very likely that the transfer latency of a read transaction will not
be the same as a write for a given register block configuration. Typically, read
operations will be more deeply pipelined. This latency asymmetry creates a
hazard: the responses to a read and a later write could collide in the same clock cycle.
In order to eliminate this hazard, additional stall signals are provided to delay
an incoming transfer request if necessary. When asserted, the CPU interface shall
hold the next pending request until the stall is cleared.
For non-pipelined CPU interfaces that only allow one outstanding transaction at a time,
these stall signals can be safely ignored.
In the following example, the regblock is configured such that:
* A read transaction takes 1 clock cycle to complete
* A write transaction takes 0 clock cycles to complete
.. wavedrom::
{
signal: [
{name: 'clk', wave: 'p.......'},
{name: 'cpuif_req', wave: '01.....0'},
{name: 'cpuif_req_is_wr', wave: 'x1.0.1.x'},
{name: 'cpuif_addr', wave: 'x33443.x', data: ['W1', 'W2', 'R1', 'R2', 'W3']},
{name: 'cpuif_req_stall_wr', wave: '0...1.0.'},
{},
{name: 'cpuif_rd_ack', wave: '0...220.', data: ['R1', 'R2']},
{name: 'cpuif_wr_ack', wave: '0220..20', data: ['W1', 'W2', 'W3']},
]
}
In the above waveform, observe that:
* The ``R2`` read request is not affected by the assertion of the write stall,
since the write stall only applies to write requests.
* The ``W3`` write request is stalled for one cycle, and is accepted once the stall is cleared.
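The write stall in this example can be reproduced with a cycle-by-cycle model. The sketch below assumes the shift-register scheme that this commit adds to the module template for the "read latency > write latency" case; the function name ``write_stall_trace`` and the trace encoding (``'R'``, ``'W'``, or ``None`` per cycle) are hypothetical.

```python
def write_stall_trace(requests, latency_gap):
    """Model cpuif_req_stall_wr for each cycle of a request trace.

    requests: what is on the request bus each cycle: 'R', 'W', or None.
              A stalled write simply appears again in the next cycle.
    latency_gap: read latency minus write latency (assumed >= 1); this
                 is the width of the stall shift register.
    """
    all_ones = (1 << latency_gap) - 1
    stall_sr = 0  # reset value
    stalls = []
    for req in requests:
        # The stall output is bit 0 of the shift register.
        stalls.append(stall_sr & 1)
        if req == "R":
            # Any read request reloads the shift register with all ones.
            stall_sr = all_ones
        else:
            # Otherwise the stall window drains by one bit per cycle.
            stall_sr >>= 1
    return stalls


# Request sequence from the waveform: W1, W2, R1, R2, W3 (held one cycle).
trace = [None, "W", "W", "R", "R", "W", "W", None]
print(write_stall_trace(trace, latency_gap=1))
# [0, 0, 0, 0, 1, 1, 0, 0]
```

The stall is high in the cycle after each read request, so ``R2`` is accepted while the stall is asserted and ``W3`` waits one cycle, matching the waveform.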

BIN
docs/diagrams/arch.png Normal file

Binary file not shown.

BIN
docs/diagrams/diagrams.odg Normal file

Binary file not shown.

@@ -16,7 +16,8 @@ Install from `PyPi`_ using pip
.. code-block:: bash
python3 -m pip install peakrdl-regblock
# NOT RELEASED YET
# python3 -m pip install peakrdl-regblock
.. _PyPi: https://pypi.org/project/peakrdl-regblock
@@ -47,6 +48,7 @@ Links
cpuif/addressing
cpuif/apb3
cpuif/advanced
cpuif/internal_protocol
.. toctree::
:hidden:

@@ -10,6 +10,8 @@ class PassthroughCpuif(CpuifBase):
"input wire s_cpuif_req_is_wr",
f"input wire [{self.addr_width-1}:0] s_cpuif_addr",
f"input wire [{self.data_width-1}:0] s_cpuif_wr_data",
"output wire s_cpuif_req_stall_wr",
"output wire s_cpuif_req_stall_rd",
"output wire s_cpuif_rd_ack",
"output wire s_cpuif_rd_err",
f"output wire [{self.data_width-1}:0] s_cpuif_rd_data",

@@ -2,6 +2,8 @@ assign cpuif_req = s_cpuif_req;
assign cpuif_req_is_wr = s_cpuif_req_is_wr;
assign cpuif_addr = s_cpuif_addr;
assign cpuif_wr_data = s_cpuif_wr_data;
assign s_cpuif_req_stall_wr = cpuif_req_stall_wr;
assign s_cpuif_req_stall_rd = cpuif_req_stall_rd;
assign s_cpuif_rd_ack = cpuif_rd_ack;
assign s_cpuif_rd_err = cpuif_rd_err;
assign s_cpuif_rd_data = cpuif_rd_data;

@@ -76,6 +76,12 @@ class RegblockExporter:
if kwargs:
raise TypeError("got an unexpected keyword argument '%s'" % list(kwargs.keys())[0])
min_read_latency = 0
min_write_latency = 0
if retime_read_fanin:
min_read_latency += 1
if retime_read_response:
min_read_latency += 1
# Scan the design for any unsupported features
# Also collect pre-export information
@@ -114,6 +120,8 @@ class RegblockExporter:
"readback": self.readback,
"get_always_ff_event": lambda resetsignal : get_always_ff_event(self.dereferencer, resetsignal),
"retime_read_response": retime_read_response,
"min_read_latency": min_read_latency,
"min_write_latency": min_write_latency,
}
# Write out design

@@ -24,6 +24,8 @@ module {{module_name}} (
logic cpuif_req_is_wr;
logic [{{cpuif.addr_width-1}}:0] cpuif_addr;
logic [{{cpuif.data_width-1}}:0] cpuif_wr_data;
logic cpuif_req_stall_wr;
logic cpuif_req_stall_rd;
logic cpuif_rd_ack;
logic cpuif_rd_err;
@@ -34,6 +36,40 @@ module {{module_name}} (
{{cpuif.get_implementation()|indent}}
{% if min_read_latency == min_write_latency %}
// Read & write latencies are balanced. Stalls not required
assign cpuif_req_stall_rd = '0;
assign cpuif_req_stall_wr = '0;
{%- elif min_read_latency > min_write_latency %}
// Read latency > write latency. May need to delay next write that follows a read
logic [{{min_read_latency - min_write_latency - 1}}:0] cpuif_req_stall_sr;
always_ff {{get_always_ff_event(cpuif.reset)}} begin
if({{get_resetsignal(cpuif.reset)}}) begin
cpuif_req_stall_sr <= '0;
end else if(cpuif_req && !cpuif_req_is_wr) begin
cpuif_req_stall_sr <= '1;
end else begin
cpuif_req_stall_sr <= (cpuif_req_stall_sr >> 'd1);
end
end
assign cpuif_req_stall_rd = '0;
assign cpuif_req_stall_wr = cpuif_req_stall_sr[0];
{%- else %}
// Write latency > read latency. May need to delay next read that follows a write
logic [{{min_write_latency - min_read_latency - 1}}:0] cpuif_req_stall_sr;
always_ff {{get_always_ff_event(cpuif.reset)}} begin
if({{get_resetsignal(cpuif.reset)}}) begin
cpuif_req_stall_sr <= '0;
end else if(cpuif_req && cpuif_req_is_wr) begin
cpuif_req_stall_sr <= '1;
end else begin
cpuif_req_stall_sr <= (cpuif_req_stall_sr >> 'd1);
end
end
assign cpuif_req_stall_rd = cpuif_req_stall_sr[0];
assign cpuif_req_stall_wr = '0;
{%- endif %}
//--------------------------------------------------------------------------
// Address Decode
//--------------------------------------------------------------------------
@@ -47,15 +83,15 @@ module {{module_name}} (
{{address_decode.get_implementation()|indent(8)}}
end
// Writes are always granted with no error response
assign cpuif_wr_ack = cpuif_req & cpuif_req_is_wr;
assign cpuif_wr_err = '0;
// Pass down signals to next stage
assign decoded_req = cpuif_req;
assign decoded_req_is_wr = cpuif_req_is_wr;
assign decoded_wr_data = cpuif_wr_data;
// Writes are always granted with no error response
assign cpuif_wr_ack = decoded_req & decoded_req_is_wr;
assign cpuif_wr_err = '0;
//--------------------------------------------------------------------------
// Field logic
//--------------------------------------------------------------------------

@@ -1,12 +1,14 @@
# Test Dependencies
## ModelSim
## Questa
Testcases require an installation of ModelSim/QuestaSim, and for `vlog` & `vsim`
Testcases require an installation of the Questa simulator, and for `vlog` & `vsim`
commands to be visible via the PATH environment variable.
ModelSim - Intel FPGA Edition can be downloaded for free from https://fpgasoftware.intel.com/ and is sufficient to run unit tests.
*Questa - Intel FPGA Starter Edition* can be downloaded for free from
https://fpgasoftware.intel.com/ and is sufficient to run unit tests. You will need
to generate a free license file to unlock the software: https://licensing.intel.com/psg/s/sales-signup-evaluationlicenses
## Python Packages
@@ -19,7 +21,7 @@ python3 -m pip install test/requirements.txt
# Running tests
Tests can be launched from the test directory using `pytest`.
Use `pytest -n auto` to run tests in parallel.
Use `pytest --workers auto` to run tests in parallel.
To run all tests:
```bash

@@ -40,7 +40,7 @@ interface apb3_intf_driver #(
input PSLVERR;
endclocking
task reset();
task automatic reset();
cb.PSEL <= '0;
cb.PENABLE <= '0;
cb.PWRITE <= '0;
@@ -48,7 +48,10 @@ interface apb3_intf_driver #(
cb.PWDATA <= '0;
endtask
task write(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data);
semaphore txn_mutex = new(1);
task automatic write(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data);
txn_mutex.get();
##0;
// Initiate transfer
@@ -66,9 +69,11 @@ interface apb3_intf_driver #(
// Wait for response
while(cb.PREADY !== 1'b1) @(cb);
reset();
txn_mutex.put();
endtask
task read(logic [ADDR_WIDTH-1:0] addr, output logic [DATA_WIDTH-1:0] data);
task automatic read(logic [ADDR_WIDTH-1:0] addr, output logic [DATA_WIDTH-1:0] data);
txn_mutex.get();
##0;
// Initiate transfer
@@ -89,9 +94,10 @@ interface apb3_intf_driver #(
assert(!$isunknown(cb.PSLVERR)) else $error("Read from 0x%0x returned X's on PSLVERR", addr);
data = cb.PRDATA;
reset();
txn_mutex.put();
endtask
task assert_read(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] expected_data, logic [DATA_WIDTH-1:0] mask = '1);
task automatic assert_read(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] expected_data, logic [DATA_WIDTH-1:0] mask = '1);
logic [DATA_WIDTH-1:0] data;
read(addr, data);
data &= mask;

@@ -77,7 +77,7 @@ interface axi4lite_intf_driver #(
input RRESP;
endclocking
task reset();
task automatic reset();
cb.AWVALID <= '0;
cb.AWADDR <= '0;
cb.AWPROT <= '0;
@@ -95,13 +95,20 @@ interface axi4lite_intf_driver #(
@cb;
end
task write(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data);
semaphore txn_aw_mutex = new(1);
semaphore txn_w_mutex = new(1);
semaphore txn_b_mutex = new(1);
semaphore txn_ar_mutex = new(1);
semaphore txn_r_mutex = new(1);
task automatic write(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data);
bit w_before_aw;
w_before_aw = $urandom_range(1,0);
##0;
fork
begin
txn_aw_mutex.get();
##0;
if(w_before_aw) repeat($urandom_range(2,0)) @cb;
cb.AWVALID <= '1;
cb.AWADDR <= addr;
@@ -109,9 +116,12 @@ interface axi4lite_intf_driver #(
@(cb);
while(cb.AWREADY !== 1'b1) @(cb);
cb.AWVALID <= '0;
txn_aw_mutex.put();
end
begin
txn_w_mutex.get();
##0;
if(!w_before_aw) repeat($urandom_range(2,0)) @cb;
cb.WVALID <= '1;
cb.WDATA <= data;
@@ -120,39 +130,47 @@ interface axi4lite_intf_driver #(
while(cb.WREADY !== 1'b1) @(cb);
cb.WVALID <= '0;
cb.WSTRB <= '0;
txn_w_mutex.put();
end
begin
txn_b_mutex.get();
@cb;
while(!(cb.BREADY === 1'b1 && cb.BVALID === 1'b1)) @(cb);
assert(!$isunknown(cb.BRESP)) else $error("Write to 0x%0x returned X's on BRESP", addr);
txn_b_mutex.put();
end
join
endtask
task read(logic [ADDR_WIDTH-1:0] addr, output logic [DATA_WIDTH-1:0] data);
##0;
task automatic read(logic [ADDR_WIDTH-1:0] addr, output logic [DATA_WIDTH-1:0] data);
fork
begin
txn_ar_mutex.get();
##0;
cb.ARVALID <= '1;
cb.ARADDR <= addr;
cb.ARPROT <= '0;
@(cb);
while(cb.ARREADY !== 1'b1) @(cb);
cb.ARVALID <= '0;
txn_ar_mutex.put();
end
begin
txn_r_mutex.get();
@cb;
while(!(cb.RREADY === 1'b1 && cb.RVALID === 1'b1)) @(cb);
assert(!$isunknown(cb.RDATA)) else $error("Read from 0x%0x returned X's on RDATA", addr);
assert(!$isunknown(cb.RRESP)) else $error("Read from 0x%0x returned X's on RRESP", addr);
data = cb.RDATA;
txn_r_mutex.put();
end
join
endtask
task assert_read(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] expected_data, logic [DATA_WIDTH-1:0] mask = '1);
task automatic assert_read(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] expected_data, logic [DATA_WIDTH-1:0] mask = '1);
logic [DATA_WIDTH-1:0] data;
read(addr, data);
data &= mask;

@@ -9,6 +9,8 @@ interface passthrough_driver #(
output logic m_cpuif_req_is_wr,
output logic [ADDR_WIDTH-1:0] m_cpuif_addr,
output logic [DATA_WIDTH-1:0] m_cpuif_wr_data,
input wire m_cpuif_req_stall_wr,
input wire m_cpuif_req_stall_rd,
input wire m_cpuif_rd_ack,
input wire m_cpuif_rd_err,
input wire [DATA_WIDTH-1:0] m_cpuif_rd_data,
@@ -25,6 +27,8 @@ interface passthrough_driver #(
output m_cpuif_req_is_wr;
output m_cpuif_addr;
output m_cpuif_wr_data;
input m_cpuif_req_stall_wr;
input m_cpuif_req_stall_rd;
input m_cpuif_rd_ack;
input m_cpuif_rd_err;
input m_cpuif_rd_data;
@@ -32,47 +36,70 @@ interface passthrough_driver #(
input m_cpuif_wr_err;
endclocking
task reset();
task automatic reset();
cb.m_cpuif_req <= '0;
cb.m_cpuif_req_is_wr <= '0;
cb.m_cpuif_addr <= '0;
cb.m_cpuif_wr_data <= '0;
endtask
task write(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data);
##0;
semaphore txn_req_mutex = new(1);
semaphore txn_resp_mutex = new(1);
task automatic write(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data);
fork
begin
// Initiate transfer
txn_req_mutex.get();
##0;
cb.m_cpuif_req <= '1;
cb.m_cpuif_req_is_wr <= '1;
cb.m_cpuif_addr <= addr;
cb.m_cpuif_wr_data <= data;
@(cb);
while(cb.m_cpuif_req_stall_wr !== 1'b0) @(cb);
reset();
txn_req_mutex.put();
end
begin
// Wait for response
txn_resp_mutex.get();
@cb;
while(cb.m_cpuif_wr_ack !== 1'b1) @(cb);
reset();
txn_resp_mutex.put();
end
join
endtask
task read(logic [ADDR_WIDTH-1:0] addr, output logic [DATA_WIDTH-1:0] data);
##0;
task automatic read(logic [ADDR_WIDTH-1:0] addr, output logic [DATA_WIDTH-1:0] data);
fork
begin
// Initiate transfer
txn_req_mutex.get();
##0;
cb.m_cpuif_req <= '1;
cb.m_cpuif_req_is_wr <= '0;
cb.m_cpuif_addr <= addr;
@(cb);
while(cb.m_cpuif_req_stall_rd !== 1'b0) @(cb);
reset();
txn_req_mutex.put();
end
begin
// Wait for response
txn_resp_mutex.get();
@cb;
while(cb.m_cpuif_rd_ack !== 1'b1) @(cb);
assert(!$isunknown(cb.m_cpuif_rd_data)) else $error("Read from 0x%0x returned X's on m_cpuif_rd_data", addr);
data = cb.m_cpuif_rd_data;
reset();
txn_resp_mutex.put();
end
join
endtask
task assert_read(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] expected_data, logic [DATA_WIDTH-1:0] mask = '1);
task automatic assert_read(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] expected_data, logic [DATA_WIDTH-1:0] mask = '1);
logic [DATA_WIDTH-1:0] data;
read(addr, data);
data &= mask;

@@ -3,6 +3,8 @@ wire s_cpuif_req;
wire s_cpuif_req_is_wr;
wire [{{exporter.cpuif.addr_width-1}}:0] s_cpuif_addr;
wire [{{exporter.cpuif.data_width-1}}:0] s_cpuif_wr_data;
wire s_cpuif_req_stall_wr;
wire s_cpuif_req_stall_rd;
wire s_cpuif_rd_ack;
wire s_cpuif_rd_err;
wire [{{exporter.cpuif.data_width-1}}:0] s_cpuif_rd_data;
@@ -18,6 +20,8 @@ passthrough_driver #(
.m_cpuif_req_is_wr(s_cpuif_req_is_wr),
.m_cpuif_addr(s_cpuif_addr),
.m_cpuif_wr_data(s_cpuif_wr_data),
.m_cpuif_req_stall_wr(s_cpuif_req_stall_wr),
.m_cpuif_req_stall_rd(s_cpuif_req_stall_rd),
.m_cpuif_rd_ack(s_cpuif_rd_ack),
.m_cpuif_rd_err(s_cpuif_rd_err),
.m_cpuif_rd_data(s_cpuif_rd_data),

@@ -16,7 +16,7 @@ from peakrdl.regblock import RegblockExporter
from .cpuifs.base import CpuifTestMode
from .cpuifs.apb3 import APB3
from .simulators.modelsim import ModelSim
from .simulators.questa import Questa
class RegblockTestCase(unittest.TestCase):
@@ -40,9 +40,9 @@ class RegblockTestCase(unittest.TestCase):
retime_read_response = False
#: Abort test if it exceeds this number of clock cycles
timeout_clk_cycles = 1000
timeout_clk_cycles = 5000
simulator_cls = ModelSim
simulator_cls = Questa
#: this gets auto-loaded via the _load_request autouse fixture
request = None # type: pytest.FixtureRequest

@@ -4,27 +4,21 @@ import os
from . import Simulator
class ModelSim(Simulator):
class Questa(Simulator):
def compile(self) -> None:
cmd = [
"vlog", "-sv", "-quiet", "-l", "build.log",
"+incdir+%s" % os.path.join(os.path.dirname(__file__), ".."),
# Free version of ModelSim throws errors if generate/endgenerate
# blocks are not used.
# These have been made optional long ago. Modern versions of SystemVerilog do
# not require them and I prefer not to add them.
"-suppress", "2720",
# Ignore noisy warning about vopt-time checking of always_comb/always_latch
"-suppress", "2583",
# Use strict LRM conformance
"-svinputport=net",
# all warnings are errors
"-warning", "error",
# except this one.. TODO: figure out if I can avoid this
"-suppress", "13314",
# Ignore noisy warning about vopt-time checking of always_comb/always_latch
"-suppress", "2583",
]
# Add source files
@@ -42,6 +36,7 @@ class ModelSim(Simulator):
# call vsim
cmd = [
"vsim", "-quiet",
"-voptargs=+acc",
"-msgmode", "both",
"-do", "set WildcardFilter [lsearch -not -all -inline $WildcardFilter Memory]",
"-do", "log -r /*;",

@@ -1,4 +1,4 @@
pytest
parameterized
pytest-xdist
pytest-parallel
jinja2-simple-tags

@@ -0,0 +1,8 @@
addrmap regblock {
default sw=rw;
default hw=r;
reg {
field {} x[31:0] = 0;
} x[64] @ 0 += 4;
};

@@ -0,0 +1,50 @@
{% extends "lib/tb_base.sv" %}
{% block seq %}
{% sv_line_anchor %}
##1;
cb.rst <= '0;
##1;
// Write all regs in parallel burst
for(int i=0; i<64; i++) begin
fork
automatic int i_fk = i;
begin
cpuif.write(i_fk*4, i_fk + 32'h12340000);
end
join_none
end
wait fork;
// Verify HW value
@cb;
for(int i=0; i<64; i++) begin
assert(cb.hwif_out.x[i].x.value == i + 32'h12340000)
else $error("hwif_out.x[%0d] == 0x%0x. Expected 0x%0x", i, cb.hwif_out.x[i].x.value, i + 32'h12340000);
end
// Read all regs in parallel burst
for(int i=0; i<64; i++) begin
fork
automatic int i_fk = i;
begin
cpuif.assert_read(i_fk*4, i_fk + 32'h12340000);
end
join_none
end
wait fork;
// Mix read/writes
for(int i=0; i<64; i++) begin
fork
automatic int i_fk = i;
begin
cpuif.write(i_fk*4, i_fk + 32'h56780000);
cpuif.assert_read(i_fk*4, i_fk + 32'h56780000);
end
join_none
end
wait fork;
{% endblock %}

@@ -0,0 +1,9 @@
from parameterized import parameterized_class
from ..lib.regblock_testcase import RegblockTestCase
from ..lib.test_params import TEST_PARAMS
@parameterized_class(TEST_PARAMS)
class Test(RegblockTestCase):
def test_dut(self):
self.run_test()