334 lines
15 KiB
ReStructuredText
334 lines
15 KiB
ReStructuredText
.. _metasimulation:
|
|
|
|
Debugging & Testing with Metasimulation
|
|
=======================================
|
|
|
|
When discussing RTL simulation in FireSim, we are generally referring to
|
|
`metasimulation`: simulating the FireSim simulator's RTL, typically using VCS or
|
|
Verilator. In contrast, we'll refer to simulation of the target's unmodified (by
|
|
GoldenGate decoupling, host and target transforms) RTL as `target-level` simulation.
|
|
Target-level simulation in Chipyard is described at length `here
|
|
<https://chipyard.readthedocs.io/en/latest/Simulation/Software-RTL-Simulation.html>`_.
|
|
|
|
Metasimulation is the most productive way to catch bugs before generating an AGFI, and a
|
|
means for reproducing bugs seen on the FPGA. By default, metasimulation uses an abstract
|
|
but fast model of the host: the FPGA's DRAM controllers are modeled with a single-cycle
|
|
memory system, the PCI-E subsystem is not simulated, instead the driver presents DMA and
|
|
MMIO traffic directly on the FPGATop interfaces. Since FireSim simulations are robust
|
|
against timing differences across hosts, target behavior observed in an FPGA-hosted
|
|
simulation should be exactly reproducible in a metasimulation.
|
|
|
|
As a final note, metasimulations are generally only slightly slower than target-level
|
|
simulations. Example performance numbers can be found at
|
|
:ref:`metasimulation-performance`.
|
|
|
|
.. _metasimulation-supported-host-sims:
|
|
|
|
Supported Host Simulators
|
|
-------------------------
|
|
|
|
Currently, the following host simulators are supported for metasimulation:
|
|
|
|
- `Verilator <https://www.veripool.org/verilator/>`_
|
|
|
|
- FOSS, automatically installed during FireSim setup.
|
|
- Referred to throughout the codebase as ``verilator``.
|
|
|
|
- `Synopsys VCS <https://www.synopsys.com/verification/simulation/vcs.html>`_
|
|
|
|
- License required.
|
|
- Referred to throughout the codebase as ``vcs``.
|
|
|
|
Pull requests to add support for other simulators are welcome.
|
|
|
|
Running Metasimulations using the FireSim Manager
|
|
-------------------------------------------------
|
|
|
|
The FireSim manager supports running metasimulations using the standard ``firesim
|
|
{launchrunfarm, infrasetup, runworkload, terminaterunfarm}`` flow that is also used for
|
|
FPGA-accelerated simulations. Rather than using FPGAs, these metasimulations run within
|
|
one of the aforementioned software simulators
|
|
(:ref:`metasimulation-supported-host-sims`) on standard compute hosts (i.e. those
|
|
without FPGAs). This allows users to write a single definition of a target (configured
|
|
design and software workload), while seamlessly moving between software-only
|
|
metasimulations and FPGA-accelerated simulations.
|
|
|
|
As an example, if you have the default ``config_runtime.yaml`` that is setup for
|
|
FPGA-accelerated simulations (e.g. the one used for the 8-node networked simulation from
|
|
the :ref:``cluster-sim`` section), a few modifications to the configuration files can
|
|
convert it to running a distributed metasimulation.
|
|
|
|
First, modify the existing ``metasimulation`` mapping in ``config_runtime.yaml`` to the
|
|
following:
|
|
|
|
.. code-block:: yaml
|
|
|
|
metasimulation:
|
|
metasimulation_enabled: true
|
|
# vcs or verilator. use vcs-debug or verilator-debug for waveform generation
|
|
metasimulation_host_simulator: verilator
|
|
# plusargs passed to the simulator for all metasimulations
|
|
metasimulation_only_plusargs: "+fesvr-step-size=128 +max-cycles=100000000"
|
|
# plusargs passed to the simulator ONLY FOR vcs metasimulations
|
|
metasimulation_only_vcs_plusargs: "+vcs+initreg+0 +vcs+initmem+0"
|
|
|
|
This configures the manager to run Verilator-hosted metasimulations (without waveform
|
|
generation) for the target specified in ``config_runtime.yaml``. When in metasimulation
|
|
mode, the ``default_hw_config`` that you specify in ``target_config`` references an
|
|
entry in ``config_build_recipes.yaml`` instead of an entry in ``config_hwdb.yaml``.
|
|
|
|
As is the case when the manager runs FPGA-accelerated simulations, the number of
|
|
metasimulations that are run is determined by the parameters in the ``target_config``
|
|
section, e.g. ``topology`` and ``no_net_num_nodes``. Many parallel metasimulations can
|
|
then be run by writing a FireMarshal workload with a corresponding number of jobs.
|
|
|
|
In metasimulation mode, the run farm configuration must be able to support the required
|
|
number of metasimulations (see :ref:`run-farm-config-in-config-runtime` for details).
|
|
The ``num_metasims`` parameter on a run farm host specification defines how many
|
|
metasimulations are allowed to run on a particular host. This corresponds with the
|
|
``num_fpgas`` parameter used in FPGA-accelerated simulation mode. However
|
|
``num_metasims`` does not correspond as tightly with any physical property of the host;
|
|
it can be tuned depending on the complexity of your design and the compute/memory
|
|
resources on a host.
|
|
|
|
For example, in the case of the AWS EC2 run farm (``aws_ec2.yaml``), we define three
|
|
instance types (``z1d.{3, 6, 12}xlarge``) by default that loosely correspond with
|
|
``f1.{2, 4, 16}xlarge`` instances, but instead have no FPGAs and run only metasims (of
|
|
course, the ``f1.*`` instances could run metasims, but this would be wasteful):
|
|
|
|
.. code-block:: yaml
|
|
|
|
run_farm_hosts_to_use:
|
|
- z1d.3xlarge: 0
|
|
- z1d.6xlarge: 0
|
|
- z1d.12xlarge: 1
|
|
|
|
run_farm_host_specs:
|
|
- z1d.3xlarge:
|
|
num_fpgas: 0
|
|
num_metasims: 1
|
|
use_for_switch_only: false
|
|
- z1d.6xlarge:
|
|
num_fpgas: 0
|
|
num_metasims: 2
|
|
use_for_switch_only: false
|
|
- z1d.12xlarge:
|
|
num_fpgas: 0
|
|
num_metasims: 8
|
|
use_for_switch_only: false
|
|
|
|
In this case, the run farm will use a ``z1d.12xlarge`` instance to host 8
|
|
metasimulations.
|
|
|
|
To generate waveforms in a metasimulation, change ``metasimulation_host_simulator`` to a
|
|
simulator ending in ``-debug`` (e.g. ``verilator-debug``). When running with a simulator
|
|
with waveform generation, make sure to add ``waveform.vpd`` to the
|
|
``common_simulation_outputs`` area of your workload JSON file, so that the waveform is
|
|
copied back to your manager host when the simulation completes.
|
|
|
|
A last notable point is that unlike the normal FPGA simulation case, there are two
|
|
output logs in metasimulations. There is the expected ``uartlog`` file that holds the
|
|
``stdout`` from the metasimulation (as in FPGA-based simulations). However, there will
|
|
also be a ``metasim_stderr.out`` file that holds ``stderr`` coming out of the
|
|
metasimulation, commonly populated by ``printf`` calls in the RTL, including those that
|
|
are not marked for ``printf`` synthesis. If you want to copy ``metasim_stderr.out`` to
|
|
your manager when a simulation completes, you must add it to the
|
|
``common_simulation_outputs`` of the workload JSON.
|
|
|
|
Other than the changes discussed in this section, manager behavior is identical between
|
|
FPGA-based simulations and metasimulations. For example, simulation outputs are stored
|
|
in ``deploy/results-workload/`` on your manager host, FireMarshal workload definitions
|
|
are used to supply target software, etc. All standard manager functionality is supported
|
|
in metasimulations, including running networked simulations and using existing FireSim
|
|
debugging tools (i.e. AutoCounter, TracerV, etc).
|
|
|
|
Once the configuration changes discussed thus far in this section are made, the standard
|
|
``firesim {launchrunfarm, infrasetup, runworkload, terminaterunfarm}`` set of commands
|
|
will run metasimulations.
|
|
|
|
If you are planning to use FireSim metasimulations as your primary simulation tool while
|
|
developing a new target design, see the (optional) ``firesim builddriver`` command,
|
|
which can build metasimulations through the manager without requiring run farm hosts to
|
|
be launched or accessible. More about this command is found in the
|
|
:ref:`firesim-builddriver` section.
|
|
|
|
Understanding a Metasimulation Waveform
|
|
---------------------------------------
|
|
|
|
Module Hierarchy
|
|
~~~~~~~~~~~~~~~~
|
|
|
|
To build out a simulator, Golden Gate adds multiple layers of module hierarchy to the
|
|
target design and performs additional hierarchy mutations to implement bridges and
|
|
resource optimizations. Metasimulation uses the ``FPGATop`` module as the top-level
|
|
module, which excludes the platform shim layer (``F1Shim``, for EC2 F1). The original
|
|
top-level of the input design is nested three levels below FPGATop:
|
|
|
|
.. figure:: /img/metasim-module-hierarchy.png
|
|
|
|
The module hierarchy visible in a typical metasimulation.
|
|
|
|
Note that many other bridges (under ``FPGATop``), channel implementations (under
|
|
``SimWrapper``), and optimized models (under ``FAMETop``) may be present, and vary from
|
|
target to target. Under the ``FAMETop`` module instance you will find the original
|
|
top-level module (``FireSimPDES_``, in this case), however it has now been
|
|
host-decoupled using the default LI-BDN FAME transformation and is referred to as the
|
|
`hub model`. It will have ready-valid I/O interfaces for all of the channels bound to
|
|
it, and internally containing additional channel enqueue and clock firing logic to
|
|
control the advance of simulated time. Additionally, modules for bridges and optimized
|
|
models will no longer be found contained in this submodule hierarchy. Instead, I/O for
|
|
those extracted modules will now be as channel interfaces.
|
|
|
|
Clock Edges and Event Timing
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
Since FireSim derives target clocks by clock gating a single host clock, and since
|
|
bridges and optimized models may introduce stalls of their own, timing of target clock
|
|
edges in a metasimulation will appear contorted relative to a conventional
|
|
target-simulation. Specifically, the host-time between clock edges will not be
|
|
proportional to target-time elapsed over that interval, and will vary in the presence of
|
|
simulator stalls.
|
|
|
|
Finding The Source Of Simulation Stalls
|
|
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
|
|
|
In the best case, FireSim simulators will be able to launch new target clock pulses on
|
|
every host clock cycle. In other words, for single-clock targets the simulation can run
|
|
at FMR = 1. In the single clock case delays are introduced by bridges (like FASED memory
|
|
timing models) and optimized models (like a multi-cycle Register File model). You can
|
|
identify which bridges are responsible for additional delays between target clocks by
|
|
filtering for ``*sink_valid`` and ``*source_ready`` on the hub model. When
|
|
``<channel>_sink_valid`` is deasserted, a bridge or model has not yet produced a token
|
|
for the current timestep, stalling the hub. When ``<channel>_source_ready`` is
|
|
deasserted, a bridge or model is back-pressuring the channel.
|
|
|
|
Scala Tests
|
|
-----------
|
|
|
|
To make it easier to do metasimulation-based regression testing, the ScalaTests wrap
|
|
calls to Makefiles, and run a limited set of tests on a set of selected designs,
|
|
including all of the MIDAS examples and a handful of Chipyard-based designs. This is
|
|
described in greater detail in the :ref:`Developer documentation <Scala Integration
|
|
Tests>`.
|
|
|
|
Running Metasimulations through Make
|
|
------------------------------------
|
|
|
|
.. warning::
|
|
|
|
This section is for advanced developers; most metasimulation users should use the
|
|
manager-based metasimulation flow described above.
|
|
|
|
Metasimulations are run out of the ``firesim/sim`` directory. If you are running a
|
|
metasim for Chipyard, ensure you properly add the ``TARGET_PROJECT_MAKEFRAG`` variable
|
|
to point to Chipyard's makefrag. Generally this is set to
|
|
``TARGET_PROJECT_MAKEFRAG=${CY_DIR}/generators/firechip/src/main/makefrag/firesim``.
|
|
|
|
.. code-block:: bash
|
|
|
|
[in firesim/sim]
|
|
make <verilator|vcs>
|
|
|
|
To compile a simulator with full-visibility waveforms, type:
|
|
|
|
.. code-block:: bash
|
|
|
|
make <verilator|vcs>-debug
|
|
|
|
As part of target-generation, Rocket Chip emits a make fragment with recipes for running
|
|
suites of assembly tests. MIDAS puts this in
|
|
``firesim/sim/generated-src/f1/<DESIGN>-<TARGET_CONFIG>-<PLATFORM_CONFIG>/firesim.d``.
|
|
Make sure your ``$RISCV`` environment variable is set by sourcing
|
|
``firesim/sourceme-manager.sh`` or ``firesim/env.sh``, and type:
|
|
|
|
.. code-block:: bash
|
|
|
|
make run-<asm|bmark>-tests EMUL=<vcs|verilator>
|
|
|
|
To run only a single test, the make target is the full path to the output. Specifically:
|
|
|
|
.. code-block:: bash
|
|
|
|
make EMUL=<vcs|verilator> $PWD/output/f1/<DESIGN>-<TARGET_CONFIG>-<PLATFORM_CONFIG>/<RISCV-TEST-NAME>.<vpd|out>
|
|
|
|
A ``.vpd`` target will use (and, if required, build) a simulator with waveform dumping
|
|
enabled, whereas a ``.out`` target will use the faster waveform-less simulator.
|
|
|
|
Additionally, you can run a unique binary in the following way:
|
|
|
|
.. code-block:: bash
|
|
|
|
make SIM_BINARY=<PATH_TO_BINARY> run-<vcs|verilator>
|
|
make SIM_BINARY=<PATH_TO_BINARY> run-<vcs|verilator>-debug
|
|
|
|
Examples
|
|
~~~~~~~~
|
|
|
|
Run all RISCV-tools assembly and benchmark tests on a Verilated simulator.
|
|
|
|
.. code-block:: bash
|
|
|
|
[in firesim/sim]
|
|
make
|
|
make -j run-asm-tests
|
|
make -j run-bmark-tests
|
|
|
|
Run all RISCV-tools assembly and benchmark tests on a Verilated simulator with waveform
|
|
dumping.
|
|
|
|
.. code-block:: bash
|
|
|
|
make verilator-debug
|
|
make -j run-asm-tests-debug
|
|
make -j run-bmark-tests-debug
|
|
|
|
Run ``rv64ui-p-simple`` (a single assembly test) on a Verilated simulator.
|
|
|
|
.. code-block:: bash
|
|
|
|
make
|
|
make $(pwd)/output/f1/FireSim-FireSimRocketConfig-BaseF1Config/rv64ui-p-simple.out
|
|
|
|
Run ``rv64ui-p-simple`` (a single assembly test) on a VCS simulator with waveform
|
|
dumping.
|
|
|
|
.. code-block:: bash
|
|
|
|
make vcs-debug
|
|
make EMUL=vcs $(pwd)/output/f1/FireSim-FireSimRocketConfig-BaseF1Config/rv64ui-p-simple.vpd
|
|
|
|
.. _metasimulation-performance:
|
|
|
|
Metasimulation vs. Target simulation performance
|
|
------------------------------------------------
|
|
|
|
Generally, metasimulations are only slightly slower than target-level simulations. This
|
|
is illustrated in the chart below.
|
|
|
|
====== ===== ======= ============= ============= =========
|
|
Type Waves VCS Verilator -O1 Verilator -02 Verilator
|
|
====== ===== ======= ============= ============= =========
|
|
Target Off 4.8 kHz 3.9 kHz 6.6 kHz N/A
|
|
Target On 0.8 kHz 3.0 kHz 5.1 kHz N/A
|
|
Meta Off 3.8 kHz 2.4 kHz 4.5 kHz 5.3 KHz
|
|
Meta On 2.9 kHz 1.5 kHz 2.7 kHz 3.4 KHz
|
|
====== ===== ======= ============= ============= =========
|
|
|
|
Note that using more aggressive optimization levels when compiling the Verilated-design
|
|
dramatically lengthens compile time:
|
|
|
|
==== ===== === ============= ============= =========
|
|
Type Waves VCS Verilator -O1 Verilator -02 Verilator
|
|
==== ===== === ============= ============= =========
|
|
Meta Off 35s 48s 3m32s 4m35s
|
|
Meta On 35s 49s 5m27s 6m33s
|
|
==== ===== === ============= ============= =========
|
|
|
|
Notes: Default configurations of a single-core, Rocket-based instance running
|
|
``rv64ui-v-add``. Frequencies are given in target-Hz. Presently, the default compiler
|
|
flags passed to Verilator and VCS differ from level to level. Hence, these numbers are
|
|
only intended to give ball park simulation speeds, not provide a scientific comparison
|
|
between simulators. VCS numbers collected on a local Berkeley machine, Verilator numbers
|
|
collected on a ``c4.4xlarge``. (metasimulation Verilator version: 4.002, target-level
|
|
Verilator version: 3.904)
|