diff --git a/docs/architecture.rst b/docs/architecture.rst index 2ea9ad7..7ccc45d 100644 --- a/docs/architecture.rst +++ b/docs/architecture.rst @@ -1,26 +1,54 @@ Register Block Architecture =========================== -TODO: Add full block diagram +The generated register block RTL is organized into several sections. +Each section is automatically generated based on the source register model and +is rendered into the output register block SystemVerilog RTL. + +.. figure:: diagrams/arch.png + +Although it is not completely necessary to know the inner workings of the +generated RTL, it can be helpful to understand the implications of various +exporter configuration options. CPU Interface ------------- +The CPU interface logic layer provides an abstraction between the +application-specific bus protocol and the internal register file logic. +This logic layer normalizes external CPU read & write transactions into a common +:ref:`cpuif_protocol` that is used to interact with the register file. -TODO: describe boundary signals. Timing diagrams Address Decode -------------- +A common address decode operation is generated which computes individual access +strobes for each software-accessible register in the design. +This operation is performed completely combinationally. -TODO: describe boundary signals. Timing diagrams Field Logic ----------- +This layer of the register block implements the storage elements and state-change +logic for every field in the design. Field state is updated based on address +decode strobes from software read/write actions, as well as events from the +hardware interface input struct. +This section also assigns any hardware interface outputs. -TODO: describe boundary signals. Timing diagrams Readback -------- +The readback layer aggregates and reduces all readable registers into a single +read response. During a read operation, the same address decode strobes are used +to select the active register that is being accessed.
+This allows for a simple OR-reduction operation to be used to compute the read +data response. -TODO: describe boundary signals. Timing diagrams -Retiming options +For designs with a large number of software-readable registers, an optional +fanin re-timing stage can be enabled. This stage is automatically inserted at a +balanced point in the read-data reduction so that fanin and logic-levels are +optimally reduced. + +A second optional read response retiming register can be enabled in-line with the +path back to the CPU interface layer. This can be useful if the CPU interface protocol +used has a fully combinational response path, and needs to be retimed further. diff --git a/docs/cpuif/internal_protocol.rst b/docs/cpuif/internal_protocol.rst new file mode 100644 index 0000000..64c8448 --- /dev/null +++ b/docs/cpuif/internal_protocol.rst @@ -0,0 +1,228 @@ +.. _cpuif_protocol: + +Internal CPUIF Protocol +======================= + +Internally, the regblock generator uses a common CPU interface handshake +protocol. This strobe-based protocol is designed to add minimal overhead to the +regblock implementation, while also being flexible enough to support advanced +features of a variety of bus interface standards. + + +Signal Descriptions +------------------- + +Request +^^^^^^^ +cpuif_req + When asserted, a read or write transfer will be initiated. + Denotes that the following signals are valid: ``cpuif_addr``, + ``cpuif_req_is_wr``, and ``cpuif_wr_data``. + + A transfer will only initiate if the relevant stall signal is not asserted. + If stalled, the request shall be held until accepted. A request's parameters + (type, address, etc) shall remain static throughout the stall. + +cpuif_addr + Byte-address of the transfer. + +cpuif_req_is_wr + If ``1``, denotes that the current transfer is a write. Otherwise transfer is + a read. + +cpuif_wr_data + Data to be written for the write transfer. This signal is ignored for read + transfers. 
+ +cpuif_req_stall_rd + If asserted, and the next pending request is a read operation, then the + transfer will not be accepted until this signal is deasserted. + +cpuif_req_stall_wr + If asserted, and the next pending request is a write operation, then the + transfer will not be accepted until this signal is deasserted. + + +Read Response +^^^^^^^^^^^^^ +cpuif_rd_ack + Single-cycle strobe indicating a read transfer has completed. + Qualifies that the following signals are valid: ``cpuif_rd_err`` and + ``cpuif_rd_data``. + +cpuif_rd_err + If set, indicates that the read transaction failed and the CPUIF logic + should return an error response if possible. + +cpuif_rd_data + Read data. The width of this bus is determined by the size of the largest + register in the design. + +Write Response +^^^^^^^^^^^^^^ +cpuif_wr_ack + Single-cycle strobe indicating a write transfer has completed. + Qualifies that the ``cpuif_wr_err`` signal is valid. + +cpuif_wr_err + If set, indicates that the write transaction failed and the CPUIF logic + should return an error response if possible. + + +Transfers +--------- + +Transfers have the following characteristics: + +* Only one transfer can be initiated per clock-cycle. This is implicit as there + is only one set of request signals. +* The register block implementation shall guarantee that only one response can be + asserted in a given clock cycle. Only one ``cpuif_*_ack`` signal can be + asserted at a time. +* Responses shall arrive in the same order as their corresponding requests were + dispatched. + + +Basic Transfer +^^^^^^^^^^^^^^ + +Depending on the configuration of the exported register block, transfers can be +fully combinational or they may require one or more clock cycles to complete. +Both are valid and CPU interface logic shall be designed to anticipate either. + +..
wavedrom:: + + { + signal: [ + {name: 'clk', wave: 'p....'}, + {name: 'cpuif_req', wave: '010..'}, + {name: 'cpuif_req_is_wr', wave: 'x2x..'}, + {name: 'cpuif_addr', wave: 'x2x..', data: ['A']}, + {}, + {name: 'cpuif_*_ack', wave: '010..'}, + {name: 'cpuif_*_err', wave: 'x2x..'}, + ], + foot: { + text: "Zero-latency transfer" + } + } + +.. wavedrom:: + + { + signal: [ + {name: 'clk', wave: 'p..|...'}, + {name: 'cpuif_req', wave: '010|...'}, + {name: 'cpuif_req_is_wr', wave: 'x2x|...'}, + {name: 'cpuif_addr', wave: 'x2x|...', data: ['A']}, + {}, + {name: 'cpuif_*_ack', wave: '0..|10.'}, + {name: 'cpuif_*_err', wave: 'x..|2x.'}, + ], + foot: { + text: "Transfer with non-zero latency" + } + } + + +Read & Write Transactions +------------------------- + +Waveforms below show the timing relationship of simple read/write transactions. +For brevity, only showing non-zero latency transfers. + +.. wavedrom:: + + { + signal: [ + {name: 'clk', wave: 'p..|...'}, + {name: 'cpuif_req', wave: '010|...'}, + {name: 'cpuif_req_is_wr', wave: 'x0x|...'}, + {name: 'cpuif_addr', wave: 'x3x|...', data: ['A']}, + {}, + {name: 'cpuif_rd_ack', wave: '0..|10.'}, + {name: 'cpuif_rd_err', wave: 'x..|0x.'}, + {name: 'cpuif_rd_data', wave: 'x..|5x.', data: ['D']}, + ], + foot: { + text: "Read Transaction" + } + } + + +.. wavedrom:: + + { + signal: [ + {name: 'clk', wave: 'p..|...'}, + {name: 'cpuif_req', wave: '010|...'}, + {name: 'cpuif_req_is_wr', wave: 'x1x|...'}, + {name: 'cpuif_addr', wave: 'x3x|...', data: ['A']}, + {name: 'cpuif_wr_data', wave: 'x5x|...', data: ['D']}, + {}, + {name: 'cpuif_wr_ack', wave: '0..|10.'}, + {name: 'cpuif_wr_err', wave: 'x..|0x.'}, + ], + foot: { + text: "Write Transaction" + } + } + + +Transaction Pipelining & Stalls +------------------------------- +If the CPU interface supports it, read and write operations can be pipelined. + +.. 
wavedrom:: + + { + signal: [ + {name: 'clk', wave: 'p......'}, + {name: 'cpuif_req', wave: '01..0..'}, + {name: 'cpuif_req_is_wr', wave: 'x0..x..'}, + {name: 'cpuif_addr', wave: 'x333x..', data: ['A1', 'A2', 'A3']}, + {}, + {name: 'cpuif_rd_ack', wave: '0.1..0.'}, + {name: 'cpuif_rd_err', wave: 'x.0..x.'}, + {name: 'cpuif_rd_data', wave: 'x.555x.', data: ['D1', 'D2', 'D3']}, + ] + } + +It is very likely that the transfer latency of a read transaction will not +be the same as a write for a given register block configuration. Typically read +operations will be more deeply pipelined. This latency asymmetry would create a +hazard for response collisions. + +In order to eliminate this hazard, additional stall signals are provided to delay +an incoming transfer request if necessary. When asserted, the CPU interface shall +hold the next pending request until the stall is cleared. + +For non-pipelined CPU interfaces that only allow one outstanding transaction at a time, +these can be safely ignored. + +In the following example, the regblock is configured such that: + +* A read transaction takes 1 clock cycle to complete +* A write transaction takes 0 clock cycles to complete + +.. wavedrom:: + + { + signal: [ + {name: 'clk', wave: 'p.......'}, + {name: 'cpuif_req', wave: '01.....0'}, + {name: 'cpuif_req_is_wr', wave: 'x1.0.1.x'}, + {name: 'cpuif_addr', wave: 'x33443.x', data: ['W1', 'W2', 'R1', 'R2', 'W3']}, + {name: 'cpuif_req_stall_wr', wave: '0...1.0.'}, + {}, + {name: 'cpuif_rd_ack', wave: '0...220.', data: ['R1', 'R2']}, + {name: 'cpuif_wr_ack', wave: '0220..20', data: ['W1', 'W2', 'W3']}, + + ] + } + +In the above waveform, observe that: + +* The ``R2`` read request is not affected by the assertion of the write stall, + since the write stall only applies to write requests. +* The ``W3`` write request is stalled for one cycle, and is accepted once the stall is cleared. 
diff --git a/docs/diagrams/arch.png b/docs/diagrams/arch.png new file mode 100644 index 0000000..0aad955 Binary files /dev/null and b/docs/diagrams/arch.png differ diff --git a/docs/diagrams/diagrams.odg b/docs/diagrams/diagrams.odg new file mode 100644 index 0000000..5d500fe Binary files /dev/null and b/docs/diagrams/diagrams.odg differ diff --git a/docs/index.rst b/docs/index.rst index 135eb23..ae967ac 100644 --- a/docs/index.rst +++ b/docs/index.rst @@ -16,7 +16,8 @@ Install from `PyPi`_ using pip .. code-block:: bash - python3 -m pip install peakrdl-regblock + # NOT RELEASED YET + # python3 -m pip install peakrdl-regblock .. _PyPi: https://pypi.org/project/peakrdl-regblock @@ -47,6 +48,7 @@ Links cpuif/addressing cpuif/apb3 cpuif/advanced + cpuif/internal_protocol .. toctree:: :hidden: diff --git a/peakrdl/regblock/cpuif/passthrough/__init__.py b/peakrdl/regblock/cpuif/passthrough/__init__.py index 5cb9a0d..975a324 100644 --- a/peakrdl/regblock/cpuif/passthrough/__init__.py +++ b/peakrdl/regblock/cpuif/passthrough/__init__.py @@ -10,6 +10,8 @@ class PassthroughCpuif(CpuifBase): "input wire s_cpuif_req_is_wr", f"input wire [{self.addr_width-1}:0] s_cpuif_addr", f"input wire [{self.data_width-1}:0] s_cpuif_wr_data", + "output wire s_cpuif_req_stall_wr", + "output wire s_cpuif_req_stall_rd", "output wire s_cpuif_rd_ack", "output wire s_cpuif_rd_err", f"output wire [{self.data_width-1}:0] s_cpuif_rd_data", diff --git a/peakrdl/regblock/cpuif/passthrough/passthrough_tmpl.sv b/peakrdl/regblock/cpuif/passthrough/passthrough_tmpl.sv index d291496..b0897fc 100644 --- a/peakrdl/regblock/cpuif/passthrough/passthrough_tmpl.sv +++ b/peakrdl/regblock/cpuif/passthrough/passthrough_tmpl.sv @@ -2,6 +2,8 @@ assign cpuif_req = s_cpuif_req; assign cpuif_req_is_wr = s_cpuif_req_is_wr; assign cpuif_addr = s_cpuif_addr; assign cpuif_wr_data = s_cpuif_wr_data; +assign s_cpuif_req_stall_wr = cpuif_req_stall_wr; +assign s_cpuif_req_stall_rd = cpuif_req_stall_rd; assign s_cpuif_rd_ack = 
cpuif_rd_ack; assign s_cpuif_rd_err = cpuif_rd_err; assign s_cpuif_rd_data = cpuif_rd_data; diff --git a/peakrdl/regblock/exporter.py b/peakrdl/regblock/exporter.py index 2f5dc20..9a0d225 100644 --- a/peakrdl/regblock/exporter.py +++ b/peakrdl/regblock/exporter.py @@ -76,6 +76,12 @@ class RegblockExporter: if kwargs: raise TypeError("got an unexpected keyword argument '%s'" % list(kwargs.keys())[0]) + min_read_latency = 0 + min_write_latency = 0 + if retime_read_fanin: + min_read_latency += 1 + if retime_read_response: + min_read_latency += 1 # Scan the design for any unsupported features # Also collect pre-export information @@ -114,6 +120,8 @@ class RegblockExporter: "readback": self.readback, "get_always_ff_event": lambda resetsignal : get_always_ff_event(self.dereferencer, resetsignal), "retime_read_response": retime_read_response, + "min_read_latency": min_read_latency, + "min_write_latency": min_write_latency, } # Write out design diff --git a/peakrdl/regblock/module_tmpl.sv b/peakrdl/regblock/module_tmpl.sv index 32f2f4d..b72cd31 100644 --- a/peakrdl/regblock/module_tmpl.sv +++ b/peakrdl/regblock/module_tmpl.sv @@ -24,6 +24,8 @@ module {{module_name}} ( logic cpuif_req_is_wr; logic [{{cpuif.addr_width-1}}:0] cpuif_addr; logic [{{cpuif.data_width-1}}:0] cpuif_wr_data; + logic cpuif_req_stall_wr; + logic cpuif_req_stall_rd; logic cpuif_rd_ack; logic cpuif_rd_err; @@ -34,6 +36,40 @@ module {{module_name}} ( {{cpuif.get_implementation()|indent}} +{% if min_read_latency == min_write_latency %} + // Read & write latencies are balanced. Stalls not required + assign cpuif_req_stall_rd = '0; + assign cpuif_req_stall_wr = '0; +{%- elif min_read_latency > min_write_latency %} + // Read latency > write latency. 
May need to delay next write that follows a read + logic [{{min_read_latency - min_write_latency - 1}}:0] cpuif_req_stall_sr; + always_ff {{get_always_ff_event(cpuif.reset)}} begin + if({{get_resetsignal(cpuif.reset)}}) begin + cpuif_req_stall_sr <= '0; + end else if(cpuif_req && !cpuif_req_is_wr) begin + cpuif_req_stall_sr <= '1; + end else begin + cpuif_req_stall_sr <= (cpuif_req_stall_sr >> 'd1); + end + end + assign cpuif_req_stall_rd = '0; + assign cpuif_req_stall_wr = cpuif_req_stall_sr[0]; +{%- else %} + // Write latency > read latency. May need to delay next read that follows a write + logic [{{min_write_latency - min_read_latency - 1}}:0] cpuif_req_stall_sr; + always_ff {{get_always_ff_event(cpuif.reset)}} begin + if({{get_resetsignal(cpuif.reset)}}) begin + cpuif_req_stall_sr <= '0; + end else if(cpuif_req && cpuif_req_is_wr) begin + cpuif_req_stall_sr <= '1; + end else begin + cpuif_req_stall_sr <= (cpuif_req_stall_sr >> 'd1); + end + end + assign cpuif_req_stall_rd = cpuif_req_stall_sr[0]; + assign cpuif_req_stall_wr = '0; +{%- endif %} + //-------------------------------------------------------------------------- // Address Decode //-------------------------------------------------------------------------- @@ -47,15 +83,15 @@ module {{module_name}} ( {{address_decode.get_implementation()|indent(8)}} end - // Writes are always granted with no error response - assign cpuif_wr_ack = cpuif_req & cpuif_req_is_wr; - assign cpuif_wr_err = '0; - // Pass down signals to next stage assign decoded_req = cpuif_req; assign decoded_req_is_wr = cpuif_req_is_wr; assign decoded_wr_data = cpuif_wr_data; + // Writes are always granted with no error response + assign cpuif_wr_ack = decoded_req & decoded_req_is_wr; + assign cpuif_wr_err = '0; + //-------------------------------------------------------------------------- // Field logic //-------------------------------------------------------------------------- diff --git a/test/README.md b/test/README.md index 
02b8c07..6f01098 100644 --- a/test/README.md +++ b/test/README.md @@ -1,12 +1,14 @@ # Test Dependencies -## ModelSim +## Questa -Testcases require an installation of ModelSim/QuestaSim, and for `vlog` & `vsim` +Testcases require an installation of the Questa simulator, and for `vlog` & `vsim` commands to be visible via the PATH environment variable. -ModelSim - Intel FPGA Edition can be downloaded for free from https://fpgasoftware.intel.com/ and is sufficient to run unit tests. +*Questa - Intel FPGA Starter Edition* can be downloaded for free from +https://fpgasoftware.intel.com/ and is sufficient to run unit tests. You will need +to generate a free license file to unlock the software: https://licensing.intel.com/psg/s/sales-signup-evaluationlicenses ## Python Packages @@ -19,7 +21,7 @@ python3 -m pip install test/requirements.txt # Running tests Tests can be launched from the test directory using `pytest`. -Use `pytest -n auto` to run tests in parallel. +Use `pytest --workers auto` to run tests in parallel. 
To run all tests: ```bash diff --git a/test/lib/cpuifs/apb3/apb3_intf_driver.sv b/test/lib/cpuifs/apb3/apb3_intf_driver.sv index 97b449c..5533f27 100644 --- a/test/lib/cpuifs/apb3/apb3_intf_driver.sv +++ b/test/lib/cpuifs/apb3/apb3_intf_driver.sv @@ -40,7 +40,7 @@ interface apb3_intf_driver #( input PSLVERR; endclocking - task reset(); + task automatic reset(); cb.PSEL <= '0; cb.PENABLE <= '0; cb.PWRITE <= '0; @@ -48,7 +48,10 @@ interface apb3_intf_driver #( cb.PWDATA <= '0; endtask - task write(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data); + semaphore txn_mutex = new(1); + + task automatic write(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data); + txn_mutex.get(); ##0; // Initiate transfer @@ -66,9 +69,11 @@ interface apb3_intf_driver #( // Wait for response while(cb.PREADY !== 1'b1) @(cb); reset(); + txn_mutex.put(); endtask - task read(logic [ADDR_WIDTH-1:0] addr, output logic [DATA_WIDTH-1:0] data); + task automatic read(logic [ADDR_WIDTH-1:0] addr, output logic [DATA_WIDTH-1:0] data); + txn_mutex.get(); ##0; // Initiate transfer @@ -89,9 +94,10 @@ interface apb3_intf_driver #( assert(!$isunknown(cb.PSLVERR)) else $error("Read from 0x%0x returned X's on PSLVERR", addr); data = cb.PRDATA; reset(); + txn_mutex.put(); endtask - task assert_read(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] expected_data, logic [DATA_WIDTH-1:0] mask = '1); + task automatic assert_read(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] expected_data, logic [DATA_WIDTH-1:0] mask = '1); logic [DATA_WIDTH-1:0] data; read(addr, data); data &= mask; diff --git a/test/lib/cpuifs/axi4lite/axi4lite_intf_driver.sv b/test/lib/cpuifs/axi4lite/axi4lite_intf_driver.sv index 522b515..837f595 100644 --- a/test/lib/cpuifs/axi4lite/axi4lite_intf_driver.sv +++ b/test/lib/cpuifs/axi4lite/axi4lite_intf_driver.sv @@ -77,7 +77,7 @@ interface axi4lite_intf_driver #( input RRESP; endclocking - task reset(); + task automatic reset(); cb.AWVALID <= '0; cb.AWADDR <= '0; cb.AWPROT 
<= '0; @@ -95,13 +95,20 @@ interface axi4lite_intf_driver #( @cb; end - task write(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data); + semaphore txn_aw_mutex = new(1); + semaphore txn_w_mutex = new(1); + semaphore txn_b_mutex = new(1); + semaphore txn_ar_mutex = new(1); + semaphore txn_r_mutex = new(1); + + task automatic write(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data); bit w_before_aw; w_before_aw = $urandom_range(1,0); - ##0; fork begin + txn_aw_mutex.get(); + ##0; if(w_before_aw) repeat($urandom_range(2,0)) @cb; cb.AWVALID <= '1; cb.AWADDR <= addr; @@ -109,9 +116,12 @@ interface axi4lite_intf_driver #( @(cb); while(cb.AWREADY !== 1'b1) @(cb); cb.AWVALID <= '0; + txn_aw_mutex.put(); end begin + txn_w_mutex.get(); + ##0; if(!w_before_aw) repeat($urandom_range(2,0)) @cb; cb.WVALID <= '1; cb.WDATA <= data; @@ -120,39 +130,47 @@ interface axi4lite_intf_driver #( while(cb.WREADY !== 1'b1) @(cb); cb.WVALID <= '0; cb.WSTRB <= '0; + txn_w_mutex.put(); end begin + txn_b_mutex.get(); + @cb; while(cb.BREADY !== 1'b1 && cb.BVALID !== 1'b1) @(cb); assert(!$isunknown(cb.BRESP)) else $error("Read from 0x%0x returned X's on BRESP", addr); + txn_b_mutex.put(); end join endtask - task read(logic [ADDR_WIDTH-1:0] addr, output logic [DATA_WIDTH-1:0] data); - ##0; + task automatic read(logic [ADDR_WIDTH-1:0] addr, output logic [DATA_WIDTH-1:0] data); fork begin + txn_ar_mutex.get(); + ##0; cb.ARVALID <= '1; cb.ARADDR <= addr; cb.ARPROT <= '0; @(cb); while(cb.ARREADY !== 1'b1) @(cb); cb.ARVALID <= '0; + txn_ar_mutex.put(); end begin + txn_r_mutex.get(); @cb; while(!(cb.RREADY === 1'b1 && cb.RVALID === 1'b1)) @(cb); assert(!$isunknown(cb.RDATA)) else $error("Read from 0x%0x returned X's on RDATA", addr); assert(!$isunknown(cb.RRESP)) else $error("Read from 0x%0x returned X's on RRESP", addr); data = cb.RDATA; + txn_r_mutex.put(); end join endtask - task assert_read(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] expected_data, logic [DATA_WIDTH-1:0] mask = 
'1); + task automatic assert_read(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] expected_data, logic [DATA_WIDTH-1:0] mask = '1); logic [DATA_WIDTH-1:0] data; read(addr, data); data &= mask; diff --git a/test/lib/cpuifs/passthrough/passthrough_driver.sv b/test/lib/cpuifs/passthrough/passthrough_driver.sv index 8be5444..e0ee464 100644 --- a/test/lib/cpuifs/passthrough/passthrough_driver.sv +++ b/test/lib/cpuifs/passthrough/passthrough_driver.sv @@ -9,6 +9,8 @@ interface passthrough_driver #( output logic m_cpuif_req_is_wr, output logic [ADDR_WIDTH-1:0] m_cpuif_addr, output logic [DATA_WIDTH-1:0] m_cpuif_wr_data, + input wire m_cpuif_req_stall_wr, + input wire m_cpuif_req_stall_rd, input wire m_cpuif_rd_ack, input wire m_cpuif_rd_err, input wire [DATA_WIDTH-1:0] m_cpuif_rd_data, @@ -25,6 +27,8 @@ interface passthrough_driver #( output m_cpuif_req_is_wr; output m_cpuif_addr; output m_cpuif_wr_data; + input m_cpuif_req_stall_wr; + input m_cpuif_req_stall_rd; input m_cpuif_rd_ack; input m_cpuif_rd_err; input m_cpuif_rd_data; @@ -32,47 +36,70 @@ interface passthrough_driver #( input m_cpuif_wr_err; endclocking - task reset(); + task automatic reset(); cb.m_cpuif_req <= '0; cb.m_cpuif_req_is_wr <= '0; cb.m_cpuif_addr <= '0; cb.m_cpuif_wr_data <= '0; endtask - task write(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data); - ##0; + semaphore txn_req_mutex = new(1); + semaphore txn_resp_mutex = new(1); - // Initiate transfer - cb.m_cpuif_req <= '1; - cb.m_cpuif_req_is_wr <= '1; - cb.m_cpuif_addr <= addr; - cb.m_cpuif_wr_data <= data; - @(cb); - reset(); + task automatic write(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] data); + fork + begin + // Initiate transfer + txn_req_mutex.get(); + ##0; + cb.m_cpuif_req <= '1; + cb.m_cpuif_req_is_wr <= '1; + cb.m_cpuif_addr <= addr; + cb.m_cpuif_wr_data <= data; + @(cb); + while(cb.m_cpuif_req_stall_wr !== 1'b0) @(cb); + reset(); + txn_req_mutex.put(); + end - // Wait for response - while(cb.m_cpuif_wr_ack !== 
1'b1) @(cb); - reset(); + begin + // Wait for response + txn_resp_mutex.get(); + @cb; + while(cb.m_cpuif_wr_ack !== 1'b1) @(cb); + txn_resp_mutex.put(); + end + join endtask - task read(logic [ADDR_WIDTH-1:0] addr, output logic [DATA_WIDTH-1:0] data); - ##0; + task automatic read(logic [ADDR_WIDTH-1:0] addr, output logic [DATA_WIDTH-1:0] data); + fork + begin + // Initiate transfer + txn_req_mutex.get(); + ##0; + cb.m_cpuif_req <= '1; + cb.m_cpuif_req_is_wr <= '0; + cb.m_cpuif_addr <= addr; + @(cb); + while(cb.m_cpuif_req_stall_rd !== 1'b0) @(cb); + reset(); + txn_req_mutex.put(); + end - // Initiate transfer - cb.m_cpuif_req <= '1; - cb.m_cpuif_req_is_wr <= '0; - cb.m_cpuif_addr <= addr; - @(cb); - reset(); - - // Wait for response - while(cb.m_cpuif_rd_ack !== 1'b1) @(cb); - assert(!$isunknown(cb.m_cpuif_rd_data)) else $error("Read from 0x%0x returned X's on m_cpuif_rd_data", addr); - data = cb.m_cpuif_rd_data; - reset(); + begin + // Wait for response + txn_resp_mutex.get(); + @cb; + while(cb.m_cpuif_rd_ack !== 1'b1) @(cb); + assert(!$isunknown(cb.m_cpuif_rd_data)) else $error("Read from 0x%0x returned X's on m_cpuif_rd_data", addr); + data = cb.m_cpuif_rd_data; + txn_resp_mutex.put(); + end + join endtask - task assert_read(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] expected_data, logic [DATA_WIDTH-1:0] mask = '1); + task automatic assert_read(logic [ADDR_WIDTH-1:0] addr, logic [DATA_WIDTH-1:0] expected_data, logic [DATA_WIDTH-1:0] mask = '1); logic [DATA_WIDTH-1:0] data; read(addr, data); data &= mask; diff --git a/test/lib/cpuifs/passthrough/tb_inst.sv b/test/lib/cpuifs/passthrough/tb_inst.sv index a1fdc57..680b794 100644 --- a/test/lib/cpuifs/passthrough/tb_inst.sv +++ b/test/lib/cpuifs/passthrough/tb_inst.sv @@ -3,6 +3,8 @@ wire s_cpuif_req; wire s_cpuif_req_is_wr; wire [{{exporter.cpuif.addr_width-1}}:0] s_cpuif_addr; wire [{{exporter.cpuif.data_width-1}}:0] s_cpuif_wr_data; +wire s_cpuif_req_stall_wr; +wire s_cpuif_req_stall_rd; wire 
s_cpuif_rd_ack; wire s_cpuif_rd_err; wire [{{exporter.cpuif.data_width-1}}:0] s_cpuif_rd_data; @@ -18,6 +20,8 @@ passthrough_driver #( .m_cpuif_req_is_wr(s_cpuif_req_is_wr), .m_cpuif_addr(s_cpuif_addr), .m_cpuif_wr_data(s_cpuif_wr_data), + .m_cpuif_req_stall_wr(s_cpuif_req_stall_wr), + .m_cpuif_req_stall_rd(s_cpuif_req_stall_rd), .m_cpuif_rd_ack(s_cpuif_rd_ack), .m_cpuif_rd_err(s_cpuif_rd_err), .m_cpuif_rd_data(s_cpuif_rd_data), diff --git a/test/lib/regblock_testcase.py b/test/lib/regblock_testcase.py index ce9a7a5..6466b4b 100644 --- a/test/lib/regblock_testcase.py +++ b/test/lib/regblock_testcase.py @@ -16,7 +16,7 @@ from peakrdl.regblock import RegblockExporter from .cpuifs.base import CpuifTestMode from .cpuifs.apb3 import APB3 -from .simulators.modelsim import ModelSim +from .simulators.questa import Questa class RegblockTestCase(unittest.TestCase): @@ -40,9 +40,9 @@ class RegblockTestCase(unittest.TestCase): retime_read_response = False #: Abort test if it exceeds this number of clock cycles - timeout_clk_cycles = 1000 + timeout_clk_cycles = 5000 - simulator_cls = ModelSim + simulator_cls = Questa #: this gets auto-loaded via the _load_request autouse fixture request = None # type: pytest.FixtureRequest diff --git a/test/lib/simulators/modelsim.py b/test/lib/simulators/questa.py similarity index 79% rename from test/lib/simulators/modelsim.py rename to test/lib/simulators/questa.py index 0a6f4d2..ed896f5 100644 --- a/test/lib/simulators/modelsim.py +++ b/test/lib/simulators/questa.py @@ -4,27 +4,21 @@ import os from . import Simulator -class ModelSim(Simulator): +class Questa(Simulator): def compile(self) -> None: cmd = [ "vlog", "-sv", "-quiet", "-l", "build.log", "+incdir+%s" % os.path.join(os.path.dirname(__file__), ".."), - # Free version of ModelSim throws errors if generate/endgenerate - # blocks are not used. - # These have been made optional long ago. Modern versions of SystemVerilog do - # not require them and I prefer not to add them. 
- "-suppress", "2720", - - # Ignore noisy warning about vopt-time checking of always_comb/always_latch - "-suppress", "2583", + # Use strict LRM conformance + "-svinputport=net", # all warnings are errors "-warning", "error", - # except this one.. TODO: figure out if I can avoid this - "-suppress", "13314", + # Ignore noisy warning about vopt-time checking of always_comb/always_latch + "-suppress", "2583", ] # Add source files @@ -42,6 +36,7 @@ class ModelSim(Simulator): # call vsim cmd = [ "vsim", "-quiet", + "-voptargs=+acc", "-msgmode", "both", "-do", "set WildcardFilter [lsearch -not -all -inline $WildcardFilter Memory]", "-do", "log -r /*;", diff --git a/test/requirements.txt b/test/requirements.txt index 04be52b..b13cbff 100644 --- a/test/requirements.txt +++ b/test/requirements.txt @@ -1,4 +1,4 @@ pytest parameterized -pytest-xdist +pytest-parallel jinja2-simple-tags diff --git a/test/test_pipelined_cpuif/__init__.py b/test/test_pipelined_cpuif/__init__.py new file mode 100644 index 0000000..e69de29 diff --git a/test/test_pipelined_cpuif/regblock.rdl b/test/test_pipelined_cpuif/regblock.rdl new file mode 100644 index 0000000..a208e43 --- /dev/null +++ b/test/test_pipelined_cpuif/regblock.rdl @@ -0,0 +1,8 @@ +addrmap regblock { + default sw=rw; + default hw=r; + + reg { + field {} x[31:0] = 0; + } x[64] @ 0 += 4; +}; diff --git a/test/test_pipelined_cpuif/tb_template.sv b/test/test_pipelined_cpuif/tb_template.sv new file mode 100644 index 0000000..b9c349f --- /dev/null +++ b/test/test_pipelined_cpuif/tb_template.sv @@ -0,0 +1,50 @@ +{% extends "lib/tb_base.sv" %} + +{% block seq %} + {% sv_line_anchor %} + ##1; + cb.rst <= '0; + ##1; + + // Write all regs in parallel burst + for(int i=0; i<64; i++) begin + fork + automatic int i_fk = i; + begin + cpuif.write(i_fk*4, i_fk + 32'h12340000); + end + join_none + end + wait fork; + + // Verify HW value + @cb; + for(int i=0; i<64; i++) begin + assert(cb.hwif_out.x[i].x.value == i + 32'h12340000) + else 
$error("hwif_out.x[i] == 0x%0x. Expected 0x%0x", cb.hwif_out.x[i].x.value, i + 32'h12340000); + end + + // Read all regs in parallel burst + for(int i=0; i<64; i++) begin + fork + automatic int i_fk = i; + begin + cpuif.assert_read(i_fk*4, i_fk + 32'h12340000); + end + join_none + end + wait fork; + + // Mix read/writes + for(int i=0; i<64; i++) begin + fork + automatic int i_fk = i; + begin + cpuif.write(i_fk*4, i_fk + 32'h56780000); + cpuif.assert_read(i_fk*4, i_fk + 32'h56780000); + end + join_none + end + wait fork; + +{% endblock %} diff --git a/test/test_pipelined_cpuif/testcase.py b/test/test_pipelined_cpuif/testcase.py new file mode 100644 index 0000000..a0e39d0 --- /dev/null +++ b/test/test_pipelined_cpuif/testcase.py @@ -0,0 +1,9 @@ +from parameterized import parameterized_class + +from ..lib.regblock_testcase import RegblockTestCase +from ..lib.test_params import TEST_PARAMS + +@parameterized_class(TEST_PARAMS) +class Test(RegblockTestCase): + def test_dut(self): + self.run_test()
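
The write-stall shift register added to module_tmpl.sv in this patch can be checked against the stall waveform in docs/cpuif/internal_protocol.rst with a small behavioral model. The following is a rough sketch under stated assumptions, not code from the exporter: the function name `write_stall_trace` and the per-cycle request encoding (`"R"`, `"W"`, or `None` for idle) are hypothetical, and only the "read latency > write latency" branch of the template is modeled.

```python
from typing import List, Optional


def write_stall_trace(
    requests: List[Optional[str]],
    read_latency: int,
    write_latency: int,
) -> List[int]:
    """Return the value of cpuif_req_stall_wr observed in each clock cycle.

    `requests` holds what is presented on the request signals in each cycle:
    "R" for a read, "W" for a write, None when cpuif_req is low.
    (Hypothetical encoding, chosen for illustration only.)
    """
    depth = read_latency - write_latency
    if depth <= 0:
        raise ValueError("model only covers read_latency > write_latency")

    all_ones = (1 << depth) - 1
    sr = 0  # cpuif_req_stall_sr after reset
    trace = []
    for req in requests:
        # cpuif_req_stall_wr is bit 0 of the registered shift register
        trace.append(sr & 1)
        if req == "R":
            # A read request reloads the shift register with all ones
            sr = all_ones
        else:
            # Otherwise the register drains by one bit per cycle
            sr >>= 1
    return trace


# Request stream from the documented example: W1, W2, R1, R2, then W3,
# which is held for one extra cycle while the write stall is asserted.
example = [None, "W", "W", "R", "R", "W", "W", None]
print(write_stall_trace(example, read_latency=1, write_latency=0))
# -> [0, 0, 0, 0, 1, 1, 0, 0]
```

With a read latency of 1 and a write latency of 0, the trace is high in the cycle after each read, which is where the documented waveform shows `W3` being held.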