[docs] Update AutoCounter

David Biancolin 2020-03-27 13:30:15 -07:00
parent eae7cb877f
commit 44aef9c8ee
1 changed file with 25 additions and 21 deletions


@@ -88,10 +88,11 @@ AutoCounter Runtime Parameters
 AutoCounter currently takes a single runtime configurable parameter, defined
 under the ``[autocounter]`` section in the ``config_runtime.ini`` file. The
 ``readrate`` parameter defines the rate at which the counters should be read,
-and is measured in target-cycles. Hence, if the read-rate is defined to be 100,
-the simulator will read and print the values of the counters every 100 cycles.
-By default, the read-rate is set to 0 cycles, which is equivalent to disabling
-AutoCounter.
+and is measured in target-cycles of the base target-clock (clock 0 produced by the ClockBridge).
+Hence, if the read-rate is defined to be 100 and the tile frequency is 2x the base clock (ex., which may drive the uncore),
+the simulator will read and print the values of the counters every 200 core-clock cycles.
+If the core-domain clock is the base clock, it would do so every 100 cycles.
+By default, the read-rate is set to 0 cycles, which disables AutoCounter.

 .. code-block:: ini
@@ -102,13 +103,16 @@ AutoCounter.
 Upon setting this value, when you run a workload, an AutoCounter output file
 will be placed in the ``sim_slot_<slot #>`` directory on the F1 instance under
-the name ``AUTOCOUNTERFILE0``.
+the name ``AUTOCOUNTERFILE<N>``, with one file generated per clock domain
+containing an AutoCounter event. The header of each output file indicates the
+associated clock domain and its frequency relative to the base clock.

-.. Note:: AutoCounter is designed as a coarse-grained observability mechanism.
-          It assumes the counters will be read at intervals greater than O(10000) cycles.
+.. Note:: AutoCounter is designed as a coarse-grained observability mechanism, as sampling
+          each counter requires two (blocking) MMIO reads (each read takes O(100) ns on EC2 F1).
+          As a result, sampling at intervals less than O(10000) cycles may adversely affect
+          simulation performance for large numbers of counters.
           If you intend on reading counters at a finer granularity, please consider using
-          synthesizable printfs (otherwise, simulation performance may degrade more than
-          necessary)
+          synthesizable printfs.

 Using TracerV Trigger with AutoCounter
 -----------------------------------------
@@ -119,17 +123,17 @@ triggers. See the :ref:`tracerv-trigger` section for more information.
 AutoCounter using Synthesizable Printfs
 ------------------------------------------------
-The AutoCounter transformation in the Golden Gate compiler includes a legacy
-mode that uses Synthesizable Printfs (learn more about these on the
-:ref:`printf-synthesis` page) to export counter results rather than
-a dedicated Bridge. This mode can be enabled by prepending the
+The AutoCounter transformation in the Golden Gate compiler includes an event-driven
+mode that uses Synthesizable Printfs (see
+:ref:`printf-synthesis`) to export counter results `as they are updated` rather than sampling them
+periodically with a dedicated Bridge. This mode can be enabled by prepending the
 ``WithAutoCounterCoverPrintf`` config to your ``PLATFORM_CONFIG`` instead of
-``WithAutoCounterCover``. In this mode, the counter values will be printed
-using a synthesizable printf every time the counter is incremented (hence, you
-will observe a series of printfs incrementing by 1). Nevertheless, the printf
-statements include the exact cycle of the printf, and therefore this mode may
-be useful for fine grained observation of counters. The counter values will be
+``WithAutoCounterCover``. In this mode, the counter values and the local cycle count will be printed
+every time the counter is incremented using a synthesized printf (hence, you
+will observe a series of printfs incrementing by 1). This mode may
+be useful for fine-grained observation of counters. The counter values will be
 printed to the same output stream as other synthesizable printfs. This mode
-may export a large amount of data (since it prints every cycle a counter
-increments), and therefore it is not recommended unless you require such high
-fidelity.
+uses considerably more FPGA resources per counter, and may consume considerable
+amounts of DMA bandwidth (since it prints every cycle a counter
+increments), which may adversely affect simulation performance (increased FMR).
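
The updated ``readrate`` wording and the sampling-cost note in this change can be illustrated with a small Python sketch. It is not FireSim code: the function names, the use of ``Fraction`` for a domain's frequency relative to the base clock, and the 100 ns default (taken from the "O(100) ns" order-of-magnitude figure in the note) are assumptions made purely for illustration.

```python
from fractions import Fraction


def sample_interval_in_domain_cycles(readrate, relative_frequency):
    """Cycles of one clock domain that elapse between two counter samples.

    readrate: sampling period in base-clock cycles (0 disables AutoCounter).
    relative_frequency: domain frequency divided by the base-clock frequency
        (hypothetical parameter; the docs say each output file header reports
        this ratio).
    """
    if readrate < 0:
        raise ValueError("readrate must be non-negative")
    if readrate == 0:
        return None  # a read-rate of 0 means AutoCounter is disabled
    return readrate * Fraction(relative_frequency)


def mmio_time_per_sample_ns(num_counters, read_latency_ns=100):
    """Rough host time spent per sample: two blocking MMIO reads per counter.

    read_latency_ns is an assumed order-of-magnitude value, not a measurement.
    """
    return 2 * num_counters * read_latency_ns


# Tile clock at 2x the base clock, readrate = 100 -> sampled every 200 tile cycles.
print(sample_interval_in_domain_cycles(100, 2))  # 200
# Domain running at the base clock -> sampled every 100 cycles.
print(sample_interval_in_domain_cycles(100, 1))  # 100
# 50 counters at roughly 100 ns per read -> about 10000 ns of MMIO per sample.
print(mmio_time_per_sample_ns(50))  # 10000
```

Under these assumptions, the second function shows why short read intervals hurt with many counters: the per-sample MMIO time grows linearly with the counter count, while the simulated work between samples shrinks with the read-rate.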