Merge remote-tracking branch 'origin/dev' into chisel-3.5-published

David Biancolin 2022-02-08 00:29:35 +00:00
commit de2ea34bea
19 changed files with 870 additions and 218 deletions


@ -105,7 +105,7 @@ class RuntimeHWConfig:
# TODO: supernode support
tracefile = "+tracefile=TRACEFILE" if trace_enable else ""
autocounterfile = "+autocounter-filename=AUTOCOUNTERFILE"
autocounterfile = "+autocounter-filename-base=AUTOCOUNTERFILE"
# this monstrosity boots the simulator, inside screen, inside script
# the sed is in there to get rid of newlines in runtime confs


@ -16,28 +16,45 @@ simulation (unlike target-level performance counters), these counters do not
affect the behavior of the simulated machine, no matter how often they are
sampled.
Ad-hoc Performance Counters
------------------------------
Chisel Interface
----------------
AutoCounter enables the addition of ad-hoc counters using the ``PerfCounter``
function. The ``PerfCounter`` function takes 3 arguments: a UInt signal encoding an event increment, a counter label, and the counter description. The increment indicates the number of times the event occurs in a single cycle and will be added on every simulated cycle to the generated performance counter. An increment can be boolean, with true implying the event occurred (once) in the cycle, or multibit, to capture events that occur more than once in a cycle. We show two examples
of using the ``PerfCounter`` function below.
object in the `midas.targetutils` package. PerfCounters can be added in one of two modes:
#. `Accumulate`, using the standard ``PerfCounter.apply`` method. Here the annotated UInt (1 or
more bits) is added to a 64b accumulation register: the target is treated as
representing an N-bit UInt and will increment the counter by a value in [0, 2^N - 1] per cycle.
#. `Identity`, using the ``PerfCounter.identity`` method. Here the annotated UInt is sampled directly. This can be used
to annotate a sample with values that are not accumulator-like (e.g., a PC),
and permits the user to define more complex instrumentation logic in the target itself.
We give examples of using PerfCounter below:
.. code-block:: scala
// A standard boolean event. Increments by 1 or 0 every local clock cycle.
midas.targetutils.PerfCounter(en_clock, "gate_clock", "Core clock gated")
// A multibit example
// A multibit example. If the core can retire three instructions per cycle,
// encode this as a two-bit UInt. Extra width is OK, but the encoding to the UInt
// (e.g., doing a pop count) must be done by the user.
midas.targetutils.PerfCounter(insns_ret, "iret", "Instructions retired")
// An identity value. Note: the pc here must be <= 64b wide.
midas.targetutils.PerfCounter.identity(pc, "pc", "The value of the program counter at the time of a sample")
Building a Design with AutoCounter
See the `PerfCounter Scala API docs
<https://fires.im/firesim/latest/api/midas/targetutils/PerfCounter$.html>`_ for more detail about the Chisel-side interface.
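As a rough behavioral model (not the generated hardware), the difference between the two modes can be sketched in Python. The function name ``simulate_counter`` and the list-based trace are illustrative assumptions; only the accumulate/identity semantics, the 64-bit accumulator width, and the zeroing of values under reset are taken from this documentation.

```python
MASK64 = (1 << 64) - 1  # the accumulation register is 64 bits wide


def simulate_counter(values, mode, under_reset=None):
    """Return what a sample taken after each cycle would observe.

    values:      per-cycle value of the annotated UInt (hypothetical trace)
    mode:        "Accumulate" or "Identity"
    under_reset: optional per-cycle flags; values under reset are zeroed
    """
    if under_reset is None:
        under_reset = [False] * len(values)
    observed = []
    accumulator = 0
    for value, in_reset in zip(values, under_reset):
        gated = 0 if in_reset else value
        if mode == "Accumulate":
            # Add the increment, wrapping at the accumulator width.
            accumulator = (accumulator + gated) & MASK64
            observed.append(accumulator)
        elif mode == "Identity":
            # Expose the (reset-gated) value directly.
            observed.append(gated)
        else:
            raise ValueError(f"unknown mode: {mode}")
    return observed


trace = [1, 0, 3, 2]
print(simulate_counter(trace, "Accumulate"))
print(simulate_counter(trace, "Identity"))
```

An accumulated event yields a running total, so consecutive samples must be differenced to recover per-interval counts, whereas an identity value is only a snapshot at the sample time.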
Enabling AutoCounter in Golden Gate
-------------------------------------
To enable AutoCounter when building a design, prepend the ``WithAutoCounter``
config to your ``PLATFORM_CONFIG``. During compilation, FireSim will print the
signals it is generating counters for. If AutoCounter has been enabled, the
``autocounter_t`` bridge driver will also be automatically instantiated.
By default, annotated events are not synthesized into AutoCounters. To enable
AutoCounter when compiling a design, prepend the ``WithAutoCounter`` config to
your ``PLATFORM_CONFIG``. During compilation, Golden Gate will print the
signals it is generating counters for.
Rocket Chip Cover Functions
@ -95,19 +112,42 @@ By default, the read-rate is set to 0 cycles, which disables AutoCounter.
readrate=100
Upon setting this value, when you run a workload, an AutoCounter output file
will be placed in the ``sim_slot_<slot #>`` directory on the F1 instance under
the name ``AUTOCOUNTERFILE<N>``, with one file generated per clock domain
containing an AutoCounter event. The header of each output file indicates the
associated clock domain and its frequency relative to the base clock.
.. Note:: AutoCounter is designed as a coarse-grained observability mechanism, as sampling
each counter requires two (blocking) MMIO reads (each read takes O(100) ns on EC2 F1).
As a result sampling at intervals less than O(10000) cycles may adversely affect
simulation performance for large numbers of counters.
If you intend on reading counters at a finer granularity, please consider using
If you intend on reading counters at a finer granularity, consider using
synthesizable printfs.
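A back-of-the-envelope estimate of this cost can be sketched as follows. It assumes each MMIO read blocks the simulation for about 100 ns (the order of magnitude quoted in the note) and that the simulator would otherwise advance at a fixed rate; the function name and the 100 MHz rate used in the example are hypothetical, not measured values.

```python
def sampling_overhead(num_counters, period_cycles, sim_rate_hz,
                      read_latency_s=100e-9):
    """Estimate the fraction of wall-clock time spent draining samples.

    Assumes two blocking MMIO reads per counter per sample and an
    otherwise constant simulation rate (both simplifications).
    """
    sample_time = 2 * num_counters * read_latency_s
    run_time = period_cycles / sim_rate_hz
    return sample_time / (sample_time + run_time)


# Hypothetical numbers: 50 counters sampled every 10,000 cycles at 100 MHz.
overhead = sampling_overhead(num_counters=50, period_cycles=10_000,
                             sim_rate_hz=100e6)
print(f"{overhead:.1%} of wall-clock time spent sampling")
```

Under these assumptions the overhead shrinks roughly in proportion to the sampling period, which is why longer periods are preferable when many counters are present.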
AutoCounter CSV Output Format
---------------------------------
AutoCounter output files are CSVs generated in the working directory where the
simulator was invoked (this applies to metasimulators too), with the default
names ``AUTOCOUNTERFILE<i>.csv``, one per clock domain. The CSV output format is
depicted below, assuming a sampling period of ``N`` base clock cycles.
.. csv-table:: AutoCounter CSV Format
:file: autocounter-csv-format.csv
Column Notes:
#. Each column beyond the first two corresponds to a PerfCounter instance in the clock domain.
#. Column 0 past the header corresponds to the base clock cycle of the sample.
#. The local_cycle counter (column 1) is implemented as an always enabled
single-bit event, and increments even when the target is under reset.
Row Notes:
#. Header row 0: autocounter csv format version, an integer.
#. Header row 1: clock domain information.
#. Header row 2: the label parameter provided to PerfCounter suffixed with the instance path.
#. Header row 3: the description parameter provided to PerfCounter. Quoted.
#. Header row 4: the width of the field annotated in the target.
#. Header row 5: the width of the accumulation register. Not configurable, but makes it clear when to expect rollover.
#. Header row 6: indicates the accumulation scheme. Can be "Identity" or "Accumulate".
#. Sample row 0: sampled values at the bitwidth of the accumulation register.
#. Sample row k: ditto above, k * N base cycles later
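A minimal sketch of reading such a file in Python, assuming header rows begin with a non-numeric name in column 0 and sample rows begin with the base clock cycle count. The function names and the example contents (domain name, labels, values) are made up for illustration and do not come from a real simulation.

```python
import csv
import io


def parse_autocounter_csv(text):
    """Split AutoCounter CSV text into header metadata and sample rows."""
    header = {}
    samples = []
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue
        key = row[0].strip()
        if key.isdigit():
            # Sample row: base cycle, local cycle, then one value per counter.
            samples.append([int(field) for field in row])
        else:
            # Header row: keyed by its first column.
            header[key] = [field.strip() for field in row[1:]]
    return header, samples


def interval_deltas(samples, column):
    """Per-interval increments of an accumulated column."""
    values = [row[column] for row in samples]
    return [later - earlier for earlier, later in zip(values, values[1:])]


EXAMPLE = '''version,1
Clock Domain Name, example_domain, Base Multiplier, 1, Base Divisor, 1
label,local_cycle,iret_Top
"description","Clock cycles elapsed in the local domain.","Instructions retired, per cycle"
type,Accumulate,Accumulate
event width,1,2
accumulator width,64,64
1000,1000,1500
2000,2000,3100
'''

header, samples = parse_autocounter_csv(EXAMPLE)
print(header["label"])
print(interval_deltas(samples, column=2))
```

Using a real CSV parser rather than splitting on commas matters here because descriptions are quoted and may themselves contain commas. Keying header rows by their first column also avoids depending on the exact row order.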
Using TracerV Trigger with AutoCounter
-----------------------------------------
In order to collect AutoCounter results only from a particular region of
@ -117,16 +157,35 @@ triggers. See the :ref:`tracerv-trigger` section for more information.
AutoCounter using Synthesizable Printfs
------------------------------------------------
The AutoCounter transformation in the Golden Gate compiler includes an event-driven
The AutoCounter transformation in Golden Gate includes an event-driven
mode that uses Synthesizable Printfs (see
:ref:`printf-synthesis`) to export counter results `as they are updated` rather than sampling them
periodically with a dedicated Bridge. This mode can be enabled by prepending the
``WithAutoCounterCoverPrintf`` config to your ``PLATFORM_CONFIG`` instead of
``WithAutoCounterCover``. In this mode, the counter values and the local cycle count will be printed
every time the counter is incremented using a synthesized printf (hence, you
will observe a series of printfs incrementing by 1). This mode may
be useful for fine-grained observation of counters. The counter values will be
printed to the same output stream as other synthesizable printfs. This mode
uses considerably more FPGA resources per counter, and may consume considerable
amounts of DMA bandwidth (since it prints every cycle a counter
increments), which may adversely affect simulation performance (increased FMR).
``WithAutoCounterCover``. Based on the selected event mode the printfs will have the following runtime behavior:
* `Accumulate`: On a non-zero increment, the local cycle count and the new
counter value are printed. This produces a series of prints with
monotonically increasing values.
* `Identity`: On a transition of the annotated target, the local cycle count and
the new value are printed. Thus a target that transitions every cycle will
produce printf traffic every cycle.
This mode may be useful for temporally fine-grained observation of counters.
The counter values will be printed to the same output stream as other
synthesizable printfs. This mode uses considerably more FPGA resources per
counter, and may consume considerable amounts of DMA bandwidth (since it prints
every cycle a counter increments), which may adversely affect simulation
performance (increased FMR).
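The print conditions for the two event modes can be illustrated with a small Python model. This is a sketch of the documented behavior only: the function name, the trace format, and the assumption that the comparison register for identity values starts at zero are illustrative, and the exact cycle on which the hardware emits a print may differ.

```python
def printf_events(values, mode):
    """Return (local_cycle, printed_value) pairs for a per-cycle trace."""
    events = []
    count = 0
    previous = 0  # assumption: the shadow register resets to zero
    for cycle, value in enumerate(values):
        if mode == "Accumulate":
            # Print only when the increment is non-zero.
            if value != 0:
                count += value
                events.append((cycle, count))
        elif mode == "Identity":
            # Print only when the annotated value changes.
            if value != previous:
                events.append((cycle, value))
            previous = value
        else:
            raise ValueError(f"unknown mode: {mode}")
    return events


print(printf_events([1, 0, 2, 0], "Accumulate"))
print(printf_events([5, 5, 7, 7, 5], "Identity"))
```

The model makes the bandwidth concern concrete: a signal that is non-zero (or changes) on every cycle produces one print per cycle.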
Reset & Timing Considerations
------------------------------
* Events and identity values provided while under local reset, or while the
``GlobalResetCondition`` is asserted, are zeroed out. Similarly, printfs that
might otherwise be active under a reset are masked out.
* The sampling period in slower clock domains is currently calculated using a truncating
division of the period in the base clock domain. Thus, when the base clock
period cannot be cleanly divided, samples in the slower clock domain will
gradually fall out of phase with samples in the base clock domain. In all
cases, the "local_cycle" column is the most accurate measure of sample time.
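The drift caused by truncating division can be worked through numerically. The sketch below assumes the conversions are plain integer multiply-then-divide operations on the clock's multiplier and divisor; the helper names mirror the driver's ``to_local_cycles`` and ``to_base_cycles`` but their bodies here are an assumption, not the driver's implementation.

```python
def to_local_cycles(base_cycles, multiplier, divisor):
    """Base-clock cycles to local-domain cycles, truncating."""
    return (base_cycles * multiplier) // divisor


def to_base_cycles(local_cycles, multiplier, divisor):
    """Local-domain cycles back to base-clock cycles, truncating."""
    return (local_cycles * divisor) // multiplier


def sample_drift(base_period, multiplier, divisor, num_samples):
    """Base cycles by which the k-th local sample leads the k-th base sample."""
    local_period = to_local_cycles(base_period, multiplier, divisor)
    actual_period = to_base_cycles(local_period, multiplier, divisor)
    return [k * (base_period - actual_period) for k in range(1, num_samples + 1)]


# A domain at 1/3 of the base clock, sampled every 1000 base cycles:
# the local period truncates to 333 cycles, i.e. 999 base cycles.
print(sample_drift(1000, 1, 3, 3))
# A period divisible by the divisor does not drift.
print(sample_drift(999, 1, 3, 3))
```

Choosing a sampling period that is divisible by every clock divisor in the design avoids the drift entirely under this model.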


@ -0,0 +1,10 @@
version, `version number`
clock info,domain name,multiplier,M,divisor,N
labels,local_cycle,label0,label1,...,labelN
"description","local clock cycle","desc0","desc1",...,"descN"
event width,1,width0,width1,...,widthN
accumulator width,64,64,64,...,64
type,Accumulate,type0,type1,...,typeN
N ,cycle @ time N,value0 @ tN,value1 @ tN,...,valueN @ tN
...,...,...,...,...,...
kN,cycle @ time kN,value0 @ tkN,value1 @ tkN,...,valueN @ tkN


@ -3,6 +3,7 @@
#include "autocounter.h"
#include <iostream>
#include <regex>
#include <stdio.h>
#include <string.h>
#include <limits.h>
@ -19,6 +20,14 @@ autocounter_t::autocounter_t(
std::vector<std::string> &args,
AUTOCOUNTERBRIDGEMODULE_struct * mmio_addrs,
AddressMap addr_map,
const uint32_t event_count,
const char* const* event_types,
const uint32_t* event_widths,
const uint32_t* accumulator_widths,
const uint32_t* event_addr_hi,
const uint32_t* event_addr_lo,
const char* const* event_msgs,
const char* const* event_labels,
const char* const clock_domain_name,
const unsigned int clock_multiplier,
const unsigned int clock_divisor,
@ -26,33 +35,44 @@ autocounter_t::autocounter_t(
bridge_driver_t(sim),
mmio_addrs(mmio_addrs),
addr_map(addr_map),
event_count(event_count),
event_types (event_types, event_types + event_count),
event_widths (event_widths, event_widths + event_count),
accumulator_widths(accumulator_widths, accumulator_widths + event_count),
event_addr_hi (event_addr_hi, event_addr_hi + event_count),
event_addr_lo (event_addr_lo, event_addr_lo + event_count),
event_msgs (event_msgs, event_msgs + event_count),
event_labels (event_labels, event_labels + event_count),
clock_info(clock_domain_name, clock_multiplier, clock_divisor) {
this->readrate = 0;
this->autocounter_filename = "AUTOCOUNTER";
const char *autocounter_filename_in = NULL;
std::string readrate_arg = std::string("+autocounter-readrate=");
std::string filename_arg = std::string("+autocounter-filename=");
std::string filename_arg = std::string("+autocounter-filename-base=");
for (auto &arg: args) {
if (arg.find(readrate_arg) == 0) {
char *str = const_cast<char*>(arg.c_str()) + readrate_arg.length();
uint64_t base_cycles = atol(str);
this->readrate = this->clock_info.to_local_cycles(base_cycles);
// TODO: Just fix this in the bridge by not sampling with a fixed frequency
if (this->clock_info.to_base_cycles(this->readrate) != base_cycles) {
this->readrate_base_clock = base_cycles;
this->readrate = this->clock_info.to_local_cycles(this->readrate_base_clock);
// TODO: Fix this in the bridge by not sampling with a fixed frequency
if (this->clock_info.to_base_cycles(this->readrate) != this->readrate_base_clock) {
fprintf(stderr,
"[AutoCounter] Warning: requested sample rate of %llu [base] cycles does not map to a whole number\n\
of cycles in clock domain: %s, (%d/%d) of base clock.\n",
base_cycles, this->clock_info.domain_name,
this->clock_info.multiplier, this->clock_info.divisor);
"[AutoCounter] Warning: requested sample rate of %" PRIu64 " [base] cycles does not map to a whole number\n\
of cycles in clock domain: %s, (%d/%d) of base clock.\n\
See the AutoCounter documentation on Reset And Timing Considerations for discussion.\n",
this->readrate_base_clock,
this->clock_info.domain_name,
this->clock_info.multiplier,
this->clock_info.divisor);
fprintf(stderr, "[AutoCounter] Workaround: Pick a sample rate that is divisible by all clock divisors.\n");
}
}
if (arg.find(filename_arg) == 0) {
autocounter_filename_in = const_cast<char*>(arg.c_str()) + filename_arg.length();
this->autocounter_filename = std::string(autocounter_filename_in) + std::to_string(autocounterno);
this->autocounter_filename = std::string(autocounter_filename_in) + std::to_string(autocounterno) + ".csv";
}
}
@ -60,7 +80,7 @@ autocounter_t::autocounter_t(
if(!autocounter_file.is_open()) {
throw std::runtime_error("Could not open output file: " + this->autocounter_filename);
}
this->clock_info.emit_file_header(autocounter_file);
emit_autocounter_header();
}
autocounter_t::~autocounter_t() {
@ -68,37 +88,77 @@ autocounter_t::~autocounter_t() {
}
void autocounter_t::init() {
cur_cycle = 0;
// Decrement the readrate by one to simplify the HW a little bit
write(addr_map.w_registers.at("readrate_low"), (readrate - 1) & ((1ULL << 32) - 1));
write(addr_map.w_registers.at("readrate_high"), this->readrate >> 32);
write(mmio_addrs->init_done, 1);
}
std::string replace_all(std::string str, const std::string& from, const std::string& to) {
size_t start_pos = 0;
while((start_pos = str.find(from, start_pos)) != std::string::npos) {
str.replace(start_pos, from.length(), to);
start_pos += to.length();
}
return str;
}
// Since description fields may have commas, quote them to prevent introducing extra delimiters.
// Note, the standard way to escape double-quotes is to double them (" -> "")
// https://stackoverflow.com/questions/17808511/properly-escape-a-double-quote-in-csv
std::string quote_csv_element(std::string str) {
std::string quoted = replace_all(str, "\"", "\"\"");
return '"' + quoted + '"';
}
template<typename T>
void write_header_array_to_csv(std::ofstream& f, std::vector<T>& row, std::string first_column) {
f << first_column << ",";
assert(!row.empty());
for (auto it = row.begin(); it != row.end(); it++) {
f << *it;
if ((it + 1) != row.end()) {
f << ",";
} else {
f << std::endl;
}
}
}
void autocounter_t::emit_autocounter_header() {
autocounter_file << "version," << autocounter_csv_format_version << std::endl;
autocounter_file << clock_info.as_csv_row();
auto quoted_descriptions = std::vector<std::string>();
for (auto &desc: event_msgs) {
quoted_descriptions.push_back(quote_csv_element(desc));
}
write_header_array_to_csv(autocounter_file, event_labels, "label");
write_header_array_to_csv(autocounter_file, quoted_descriptions,"\"description\"");
write_header_array_to_csv(autocounter_file, event_types, "type");
write_header_array_to_csv(autocounter_file, event_widths, "event width");
write_header_array_to_csv(autocounter_file, accumulator_widths, "accumulator width");
}
bool autocounter_t::drain_sample() {
bool bridge_has_sample = read(addr_map.r_registers.at("countersready"));
if (bridge_has_sample) {
cur_cycle = read(this->mmio_addrs->cycles_low);
cur_cycle |= ((uint64_t)read(this->mmio_addrs->cycles_high)) << 32;
autocounter_file << "Cycle " << cur_cycle << std::endl;
autocounter_file << "============================" << std::endl;
for (auto pair: addr_map.r_registers) {
cur_cycle_base_clock += readrate_base_clock;
autocounter_file << cur_cycle_base_clock << ",";
for (size_t idx = 0; idx < event_count; idx++) {
uint64_t counter_val = ((uint64_t) (read(event_addr_hi[idx]))) << 32;
counter_val |= read(event_addr_lo[idx]);
autocounter_file << counter_val;
std::string low_prefix = std::string("autocounter_low_");
std::string high_prefix = std::string("autocounter_high_");
if (pair.first.find("autocounter_low_") == 0) {
char *str = const_cast<char*>(pair.first.c_str()) + low_prefix.length();
std::string countername(str);
uint64_t counter_val = ((uint64_t) (read(addr_map.r_registers.at(high_prefix + countername)))) << 32;
counter_val |= read(pair.second);
autocounter_file << "PerfCounter " << str << ": " << counter_val << std::endl;
if (idx < (event_count - 1)) {
autocounter_file << ",";
} else {
autocounter_file << std::endl;
}
}
write(addr_map.w_registers.at("readdone"), 1);
autocounter_file << "" << std::endl;
}
return bridge_has_sample;
}


@ -7,6 +7,9 @@
#include <vector>
#include <fstream>
// This will need to be manually incremented at the developer's discretion.
constexpr int autocounter_csv_format_version = 1;
// Bridge Driver Instantiation Template
#define INSTANTIATE_AUTOCOUNTER(FUNC,IDX) \
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _substruct_create; \
@ -20,6 +23,14 @@
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _W_num_registers, \
(const unsigned int*) AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _W_addrs, \
(const char* const*) AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _W_names), \
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _event_count, \
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _event_types, \
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _event_widths, \
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _accumulator_widths, \
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _event_addr_hi, \
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _event_addr_lo, \
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _event_descriptions, \
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _event_labels, \
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _clock_domain_name, \
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _clock_multiplier, \
AUTOCOUNTERBRIDGEMODULE_ ## IDX ## _clock_divisor, \
@ -33,6 +44,14 @@ class autocounter_t: public bridge_driver_t {
std::vector<std::string> &args,
AUTOCOUNTERBRIDGEMODULE_struct * mmio_addrs,
AddressMap addr_map,
const uint32_t event_count,
const char* const* event_types,
const uint32_t* event_widths,
const uint32_t* accumulator_widths,
const uint32_t* event_addr_hi,
const uint32_t* event_addr_lo,
const char* const* event_msgs,
const char* const* event_labels,
const char* const clock_domain_name,
const unsigned int clock_multiplier,
const unsigned int clock_divisor,
@ -49,15 +68,28 @@ class autocounter_t: public bridge_driver_t {
simif_t* sim;
AUTOCOUNTERBRIDGEMODULE_struct * mmio_addrs;
AddressMap addr_map;
const uint32_t event_count;
std::vector<std::string> event_types;
std::vector<uint32_t> event_widths;
std::vector<uint32_t> accumulator_widths;
std::vector<uint32_t> event_addr_hi;
std::vector<uint32_t> event_addr_lo;
std::vector<std::string> event_msgs;
std::vector<std::string> event_labels;
ClockInfo clock_info;
uint64_t cur_cycle;
uint64_t cur_cycle_base_clock = 0;
uint64_t readrate;
uint64_t readrate_base_clock;
std::string autocounter_filename;
std::ofstream autocounter_file;
// Pulls a single sample from the Bridge, if available.
// Returns true if a sample was read
bool drain_sample();
// Writes event autocounter metadata to the first lines of the output csv.
void emit_autocounter_header();
};
#endif // AUTOCOUNTERWIDGET_struct_guard


@ -44,7 +44,14 @@ public:
void emit_file_header(std::ostream& os) {
os << file_header();
}
};
std::string as_csv_row() {
char buf[200];
snprintf(buf, sizeof(buf), "Clock Domain Name, %s, Base Multiplier, %d, Base Divisor, %d\n",
domain_name, multiplier, divisor);
return std::string(buf);
};
};


@ -86,13 +86,7 @@ class AutoCounterTransform extends Transform with AutoCounterConsts {
val countType = UIntType(IntWidth(64))
val zeroLit = UIntLiteral(0, IntWidth(64))
addedStmts ++= coverAnnos.flatMap({ case AutoCounterFirrtlAnnotation(target, clock, reset, label, _, _) =>
val countName = moduleNS.newName(label + "_counter")
val count = DefRegister(NoInfo, countName, countType, WRef(clock.ref), WRef(reset.ref), zeroLit)
val nextName = moduleNS.newName(label + "_next")
val next = DefNode(NoInfo, nextName, DoPrim(PrimOps.Add, Seq(WRef(count), WRef(target.ref)), Seq.empty, countType))
val countUpdate = Connect(NoInfo, WRef(count), WRef(next))
def generatePrintf(label: String, clock: ReferenceTarget, valueToPrint: WRef, printEnable: Expression, suggestedPrintName: String): Unit = {
// Generate a trigger sink and annotate it
val triggerName = moduleNS.newName("trigger")
val trigger = DefWire(NoInfo, triggerName, BoolType)
@ -101,12 +95,35 @@ class AutoCounterTransform extends Transform with AutoCounterConsts {
// Now emit a printf using all the generated hardware
val printFormat = StringLit(s"""[AutoCounter] $label: %d\n""")
val printName = moduleNS.newName(label + "_print")
val printStmt = Print(NoInfo, printFormat, Seq(WRef(count)),
WRef(clock.ref), And(WRef(trigger), Neq(WRef(target.ref), zero)), printName)
addedAnnos += SynthPrintfAnnotation(Seq(Seq(mT.ref(countName))), mT, printFormat.string, Some(printName))
Seq(count, next, printStmt, countUpdate)
})
val printName = moduleNS.newName(suggestedPrintName)
val printStmt = Print(NoInfo, printFormat, Seq(valueToPrint),
WRef(clock.ref), And(WRef(trigger), printEnable), printName)
addedAnnos += SynthPrintfAnnotation(Seq(Seq(mT.ref(valueToPrint.name))), mT, printFormat.string, Some(printName))
addedStmts += printStmt
}
coverAnnos.foreach {
case AutoCounterFirrtlAnnotation(target, clock, reset, label, _, PerfCounterOps.Accumulate, _) =>
val countName = moduleNS.newName(label + "_counter")
val count = DefRegister(NoInfo, countName, countType, WRef(clock.ref), WRef(reset.ref), zeroLit)
val nextName = moduleNS.newName(label + "_next")
val next = DefNode(NoInfo, nextName, DoPrim(PrimOps.Add, Seq(WRef(count), WRef(target.ref)), Seq.empty, countType))
val countUpdate = Connect(NoInfo, WRef(count), WRef(next))
addedStmts ++= Seq(count, next, countUpdate)
def printEnable = Neq(WRef(target.ref), zero)
generatePrintf(label, clock, WRef(count), printEnable, target.ref + "_print")
// Under the Identity mode, print whenever the value changes.
case AutoCounterFirrtlAnnotation(target, clock, reset, label, _, PerfCounterOps.Identity, _) =>
val regName = moduleNS.newName(label + "_reg")
val reg = DefRegister(NoInfo, regName, UIntType(UnknownWidth), WRef(clock.ref), WRef(reset.ref), zeroLit)
val regUpdate = Connect(NoInfo, WRef(reg), WRef(target.ref))
addedStmts ++= Seq(reg, regUpdate)
def printEnable = Neq(WRef(target.ref), WRef(reg))
generatePrintf(label, clock, WRef(target.ref), printEnable, target.ref + "_identity_print")
}
m.copy(body = Block(m.body, addedStmts:_*))
case o => o
}
@ -126,7 +143,7 @@ class AutoCounterTransform extends Transform with AutoCounterConsts {
state: CircuitState,
eventModuleMap: Map[String, Seq[AutoCounterFirrtlAnnotation]]): CircuitState = {
val labelMap = eventModuleMap.values.flatten.map(anno => anno.target -> anno.label).toMap
val sourceToAutoCounterAnnoMap = eventModuleMap.values.flatten.map(anno => anno.target -> anno).toMap
val bridgeTopWiringAnnos = eventModuleMap.values.flatten.map(
anno => BridgeTopWiringAnnotation(anno.target, anno.clock))
@ -160,10 +177,16 @@ class AutoCounterTransform extends Transform with AutoCounterConsts {
})
val eventMetadata = oAnnos.map({ anno =>
val pathlessLabel = labelMap(anno.pathlessSource)
val autoCounterAnno = sourceToAutoCounterAnnoMap(anno.pathlessSource)
val pathlessLabel = autoCounterAnno.label
val instPath = anno.absoluteSource.circuit +: anno.absoluteSource.asPath.map(_._1.value)
val eventWidth = portWidthMap(anno.topSink.ref)
EventMetadata(anno.topSink.ref, (pathlessLabel +: instPath).mkString("_"), eventWidth)
EventMetadata(
anno.topSink.ref,
(pathlessLabel +: instPath).mkString("_"),
autoCounterAnno.description,
eventWidth,
autoCounterAnno.opType)
})
// Step 2b. Manually add boolean channels, to carry the trigger and
@ -254,7 +277,7 @@ class AutoCounterTransform extends Transform with AutoCounterConsts {
println(s"[AutoCounter] signals are:")
selectedsignals.foreach({ case (modName, localEvents) =>
println(s" Module ${modName}")
localEvents.foreach({ anno => println(s" ${anno.label}: ${anno.message}") })
localEvents.foreach({ anno => println(s" ${anno.label}: ${anno.description}") })
})
// Common preprocessing: gate all annotated events with their associated reset
@ -284,7 +307,7 @@ class AutoCounterTransform extends Transform with AutoCounterConsts {
updatedState.copy(
annotations = updatedState.annotations.filter {
case AutoCounterCoverModuleFirrtlAnnotation(_) => false
case AutoCounterFirrtlAnnotation(_,_,_,_,_,_) => false
case AutoCounterFirrtlAnnotation(_,_,_,_,_,_,_) => false
case o => true
})
}


@ -8,8 +8,15 @@ import freechips.rocketchip.config.{Parameters, Field}
import freechips.rocketchip.diplomacy.AddressSet
import freechips.rocketchip.util._
import midas.targetutils.{PerfCounterOpType, PerfCounterOps}
import midas.widgets.CppGenerationUtils.{genConstStatic, genArray}
trait AutoCounterConsts {
val counterWidth = 64
/* Quotes the description and escapes potentially troublesome characters */
def sanitizeDescriptionForCSV(description: String): String =
'"' + description.replaceAll("\"", "\"\"") + '"'
}
/**
@ -17,11 +24,29 @@ trait AutoCounterConsts {
*
* @param portName the name of the IF exposed to the bridge by the autocounter transform
*
* @param label provides more detail about the nature of the event
* @param label The user provided [[AutoCounterFirrtlAnnotation]].label prepended with an instance path.
*
* @param width the bitwidth of the event
* @param description A passthrough of [[AutoCounterFirrtlAnnotation]].description
*
* @param width The bitwidth of the event
*
* @param opType The type of accumulation operation to apply to the event
*/
case class EventMetadata(portName: String, label: String, width: Int)
case class EventMetadata(
portName: String,
label: String,
description: String,
width: Int,
opType: PerfCounterOpType) extends AutoCounterConsts
object EventMetadata {
val localCycleCount = EventMetadata(
"N/A",
"local_cycle",
"Clock cycles elapsed in the local domain.",
1,
PerfCounterOps.Accumulate)
}
class AutoCounterBundle(
eventMetadata: Seq[EventMetadata],
@ -37,20 +62,12 @@ class AutoCounterBundle(
override def cloneType = new AutoCounterBundle(eventMetadata, triggerName, resetPortName).asInstanceOf[this.type]
}
class AutoCounterToHostToken(val numCounters: Int) extends Bundle with AutoCounterConsts {
val data_out = Vec(numCounters, UInt(counterWidth.W))
val cycle = UInt(counterWidth.W)
}
class AutoCounterBridgeModule(
eventMetadata: Seq[EventMetadata],
triggerName: String,
resetPortName: String)(implicit p: Parameters)
extends BridgeModule[HostPortIO[AutoCounterBundle]]()(p) with AutoCounterConsts {
lazy val module = new BridgeModuleImp(this) {
val numCounters = eventMetadata.size
val labels = eventMetadata.map(_.label)
val io = IO(new WidgetIO())
val hPort = IO(HostPort(new AutoCounterBundle(eventMetadata, triggerName, resetPortName)))
val trigger = hPort.hBits.triggerEnable
@ -83,38 +100,58 @@ class AutoCounterBridgeModule(
cycles := cycles + 1.U
}
val counters = hPort.hBits.events.unzip._2.map({ increment =>
val count = RegInit(0.U(counterWidth.W))
when (targetFire && !hPort.hBits.underGlobalReset) {
count := count + increment
val counters = for (((_, field), metadata) <- hPort.hBits.events.zip(eventMetadata)) yield {
metadata.opType match {
case PerfCounterOps.Accumulate =>
val count = RegInit(0.U(counterWidth.W))
when (targetFire && !hPort.hBits.underGlobalReset) {
count := count + field
}
count
// Under local reset identity fields are zeroed out. This matches that behavior.
case PerfCounterOps.Identity =>
Mux(hPort.hBits.underGlobalReset, 0.U, field).pad(counterWidth)
}
count
}).toSeq
}
val periodcycles = RegInit(0.U(64.W))
val isSampleCycle = periodcycles === readrate
// Pipeline sample by one cycle, so that events on the final clock cycle of
// the interval can be captured. This has the effect of making a signal
// that is always high read a multiple of N, where N is the sampling rate.
val doSample = RegInit(false.B)
when (targetFire && isSampleCycle) {
periodcycles := 0.U
doSample := true.B
} .elsewhen (targetFire) {
periodcycles := periodcycles + 1.U
doSample := false.B
}
val btht_queue = Module(new Queue(new AutoCounterToHostToken(numCounters), 2))
val allEventMetadata = EventMetadata.localCycleCount +: eventMetadata
val allCounters = cycles +: counters
btht_queue.io.enq.valid := isSampleCycle && targetFire && trigger
btht_queue.io.enq.bits.data_out := VecInit(counters)
btht_queue.io.enq.bits.cycle := cycles
assert(allCounters.size == allEventMetadata.size)
val numCounters = allCounters.size
val labels = allEventMetadata.map(_.label)
val btht_queue = Module(new Queue(Vec(numCounters, UInt(counterWidth.W)), 2))
btht_queue.io.enq.valid := doSample && targetFire && trigger
btht_queue.io.enq.bits := VecInit(allCounters)
hPort.toHost.hReady := targetFire
val (lowCountAddrs, highCountAddrs) = (for ((counter, label) <- btht_queue.io.deq.bits.data_out.zip(labels)) yield {
val (lowCountAddrs, highCountAddrs) = (for ((counter, label) <- btht_queue.io.deq.bits.zip(labels)) yield {
val lowAddr = attach(counter(hostCounterLowWidth-1, 0), s"autocounter_low_${label}", ReadOnly)
val highAddr = attach(counter >> hostCounterLowWidth, s"autocounter_high_${label}", ReadOnly)
(lowAddr, highAddr)
}).unzip
//communication with the driver
attach(btht_queue.io.deq.bits.cycle(hostCyclesLowWidth-1, 0), "cycles_low", ReadOnly)
attach(btht_queue.io.deq.bits.cycle >> hostCyclesLowWidth, "cycles_high", ReadOnly)
// These are not currently used, but are convenient to poke at from the driver
attach(btht_queue.io.deq.bits(0)(hostCyclesLowWidth-1, 0), "cycles_low", ReadOnly)
attach(btht_queue.io.deq.bits(0) >> hostCyclesLowWidth, "cycles_high", ReadOnly)
attach(readrate_low, "readrate_low", WriteOnly)
attach(readrate_high, "readrate_high", WriteOnly)
attach(initDone, "init_done", WriteOnly)
@ -129,6 +166,14 @@ class AutoCounterBridgeModule(
crRegistry.genHeader(headerWidgetName, base, sb, lowCountAddrs ++ highCountAddrs)
crRegistry.genArrayHeader(headerWidgetName, base, sb)
emitClockDomainInfo(headerWidgetName, sb)
sb.append(genConstStatic(s"${headerWidgetName}_event_count", UInt32(allEventMetadata.size)))
sb.append(genArray(s"${headerWidgetName}_event_types", allEventMetadata.map { m => CStrLit(m.opType.toString) }))
sb.append(genArray(s"${headerWidgetName}_event_labels", allEventMetadata.map { m => CStrLit(m.label) } ))
sb.append(genArray(s"${headerWidgetName}_event_descriptions", allEventMetadata.map { m => CStrLit(m.description) } ))
sb.append(genArray(s"${headerWidgetName}_event_addr_hi", highCountAddrs.map { offset => UInt32(base + offset) } ))
sb.append(genArray(s"${headerWidgetName}_event_addr_lo", lowCountAddrs.map { offset => UInt32(base + offset) } ))
sb.append(genArray(s"${headerWidgetName}_event_widths", allEventMetadata.map { m => UInt32(m.width) } ))
sb.append(genArray(s"${headerWidgetName}_accumulator_widths", allEventMetadata.map { m => UInt32(m.counterWidth) } ))
}
genCRFile()
}


@ -162,6 +162,19 @@ object ExcludeInstanceAsserts {
}
}
sealed trait PerfCounterOpType
object PerfCounterOps {
/**
* Takes the annotated UInt and adds it to an accumulation register generated in the bridge
*/
case object Accumulate extends PerfCounterOpType
/** Takes the annotated UInt and exposes it directly to the driver
* NB: Fields longer than 64b are not supported, and must be divided into
* smaller segments that are separately annotated
*/
case object Identity extends PerfCounterOpType
}
/**
* AutoCounter annotations. Do not emit the FIRRTL annotations unless you are
@ -174,9 +187,10 @@ case class AutoCounterFirrtlAnnotation(
clock: ReferenceTarget,
reset: ReferenceTarget,
label: String,
message: String,
description: String,
opType: PerfCounterOpType = PerfCounterOps.Accumulate,
coverGenerated: Boolean = false)
extends firrtl.annotations.Annotation with DontTouchAllTargets {
extends firrtl.annotations.Annotation with DontTouchAllTargets with HasSerializationHints {
def update(renames: RenameMap): Seq[firrtl.annotations.Annotation] = {
val renamer = new ReferenceTargetRenamer(renames)
val renamedTarget = renamer.exactRename(target)
@ -188,6 +202,7 @@ case class AutoCounterFirrtlAnnotation(
def shouldBeIncluded(modList: Seq[String]): Boolean = !coverGenerated || modList.contains(target.module)
def enclosingModule(): String = target.module
def enclosingModuleTarget(): ModuleTarget = ModuleTarget(target.circuit, enclosingModule)
def typeHints(): Seq[Class[_]] = Seq(opType.getClass)
}
case class AutoCounterCoverModuleFirrtlAnnotation(target: ModuleTarget) extends
@ -201,15 +216,36 @@ case class AutoCounterCoverModuleAnnotation(target: String) extends ChiselAnnota
def toFirrtl = AutoCounterCoverModuleFirrtlAnnotation(ModuleTarget("",target))
}
object PerfCounter {
private def emitAnnotation(
target: chisel3.UInt,
clock: chisel3.Clock,
reset: Reset,
label: String,
description: String,
opType: PerfCounterOpType): Unit = {
requireIsHardware(target, "Target passed to PerfCounter:")
requireIsHardware(clock, "Clock passed to PerfCounter:")
requireIsHardware(reset, "Reset passed to PerfCounter:")
annotate(new ChiselAnnotation {
def toFirrtl = AutoCounterFirrtlAnnotation(
target.toTarget,
clock.toTarget,
reset.toTarget,
label,
description,
opType)
})
}
/**
* Labels a signal as an event for which a host-side counter (an
* "AutoCounter") should be generated. Events can be multi-bit to encode
* multiple occurrences in a cycle (e.g., the number of instructions retired
* in a superscalar processor). NB: Golden Gate will not generate the
* counter unless AutoCounter is enabled in your platform config. See
* the docs for more info.
*
* docs.fires.im for end-to-end usage information.
*
* @param target The number of occurrences of the event (in the current cycle)
*
@ -220,31 +256,44 @@ object PerfCounter {
*
* @param label A verilog-friendly identifier for the event signal
*
* @param message A description of the event.
* @param description A human-friendly description of the event.
*
* @param opType Defines how the bridge should aggregate the event into a performance counter.
*
*/
def apply(target: chisel3.UInt,
clock: chisel3.Clock,
reset: Reset,
label: String,
message: String): Unit = {
requireIsHardware(target, "Target passed to PerfCounter:")
requireIsHardware(clock, "Clock passed to PerfCounter:")
requireIsHardware(reset, "Reset passed to PerfCounter:")
annotate(new ChiselAnnotation {
def toFirrtl = AutoCounterFirrtlAnnotation(
target.toTarget,
clock.toTarget,
reset.toTarget, label, message)
})
}
def apply(
target: chisel3.UInt,
clock: chisel3.Clock,
reset: Reset,
label: String,
description: String,
opType: PerfCounterOpType = PerfCounterOps.Accumulate): Unit =
emitAnnotation(target, clock, reset, label, description, opType)
/**
* A simplified variation of the full apply method above that uses the
* implicit clock and reset.
*/
def apply(target: chisel3.UInt, label: String, message: String): Unit =
apply(target, Module.clock, Module.reset, label, message)
def apply(target: chisel3.UInt, label: String, description: String): Unit =
emitAnnotation(target, Module.clock, Module.reset, label, description, PerfCounterOps.Accumulate)
/**
* Passes the annotated UInt through to the driver without accumulation.
* Use cases:
* - Custom accumulation / counting logic not supported by the driver
* - Providing runtime metadata along side standard accumulation registers
*
* Note: Under reset, the passthrough value is set to 0. This keeps event
* handling uniform in the transform.
*
*/
def identity(target: chisel3.UInt, label: String, description: String): Unit = {
require(target.getWidth <= 64,
s"""|PerfCounter.identity can only accept fields <= 64b wide. Provided target for label:
| $label
|was ${target.getWidth}b.""".stripMargin)
emitAnnotation(target, Module.clock, Module.reset, label, description, opType = PerfCounterOps.Identity)
}
}
// Need serialization utils to be upstreamed to FIRRTL before i can use these.
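
The difference between the two operation types can be illustrated with a small software model. This is a minimal sketch in plain Scala, not the hardware the bridge generates: `CounterModel`, `sample`, and the local `OpType` hierarchy are hypothetical names that only mirror the semantics described in the comments above, and the 64-bit width is taken from those comments.

```scala
// Software model only. The real accumulation logic is generated in the bridge;
// every name in this block is hypothetical and exists for illustration.
sealed trait OpType
case object Accumulate extends OpType
case object Identity extends OpType

object CounterModel {
  // Assumed accumulator width, per the 64b limit mentioned in the comments.
  val counterWidth: Int = 64
  private val mask: BigInt = (BigInt(1) << counterWidth) - 1

  /** Value a reader would observe after the given per-cycle event values. */
  def sample(op: OpType, events: Seq[BigInt]): BigInt = op match {
    // Accumulate: add each cycle's event value, wrapping at the counter width.
    case Accumulate => events.foldLeft(BigInt(0))((acc, e) => (acc + e) & mask)
    // Identity: expose the most recent value unchanged (0 if nothing was seen).
    case Identity => events.lastOption.getOrElse(BigInt(0))
  }
}
```

With the same event stream, `Accumulate` reports a running sum while `Identity` reports the latest value, which is why the latter suits non-accumulator-like signals such as a PC.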

View File

@ -49,6 +49,26 @@ public:
};
#endif //DESIGNNAME_AutoCounterModule
#ifdef DESIGNNAME_AutoCounter32bRollover
class AutoCounter32bRollover_t: public autocounter_module_t
{
public:
AutoCounter32bRollover_t(int argc, char** argv): autocounter_module_t(argc, argv) {};
virtual void run() {
for (auto &autocounter_endpoint: autocounter_endpoints) {
autocounter_endpoint->init();
}
poke(reset, 1);
poke(io_a, 0);
step(1);
poke(reset, 0);
step(1);
poke(io_a, 1);
run_and_collect(3000);
};
};
#endif //DESIGNNAME_AutoCounter32bRollover
#ifdef DESIGNNAME_AutoCounterGlobalResetCondition
class AutoCounterGlobalResetCondition_t: public autocounter_module_t
{

View File

@ -43,6 +43,8 @@
#include "PrintfModule.h"
#elif defined DESIGNNAME_AutoCounterModule
#include "AutoCounterModule.h"
#elif defined DESIGNNAME_AutoCounter32bRollover
#include "AutoCounterModule.h"
#elif defined DESIGNNAME_AutoCounterGlobalResetCondition
#include "AutoCounterModule.h"
#elif defined DESIGNNAME_AutoCounterCoverModule

View File

@ -152,7 +152,7 @@ nic_args = +shmemportname0=$(NET_SHMEMPORTNAME) +macaddr0=$(NET_MACADDR) \
+netbw0=$(NET_BW) +netburst0=8 $(NET_LOOPBACK)
tracer_args = +tracefile=TRACEFILE
blkdev_args = +blkdev-in-mem0=128 +blkdev-log0=blkdev-log$(NET_SLOT)
autocounter_args = +autocounter-readrate=1000 +autocounter-filename=AUTOCOUNTERFILE
autocounter_args = +autocounter-readrate=1000 +autocounter-filename-base=AUTOCOUNTERFILE
# Neglecting this +arg will make the simulator use the same step size as on the
# FPGA. This will make ML simulation more closely match results seen on the
# FPGA at the expense of dramatically increased target runtime

View File

@ -0,0 +1,28 @@
//See LICENSE for license details.
package firesim.midasexamples
import chisel3._
import freechips.rocketchip.config.Parameters
import firesim.midasexamples.AutoCounterWrappers.{PerfCounter}
/**
* Check that auto-counter output that would roll over a 32b boundary is
* captured correctly. This is an important boundary since, under the current
* implementation, accumulation registers are divided into 32b segments for
* MMIO.
*/
class AutoCounter32bRollover(implicit p: Parameters) extends PeekPokeMidasExampleHarness(() =>
new AutoCounter32bRolloverDUT()(new AutoCounterValidator))
class AutoCounter32bRolloverDUT(val instName: String = "dut")(implicit val v: AutoCounterValidator) extends Module
with AutoCounterTestContext {
val io = IO(new Bundle{val a = Input(Bool())})
val largeIncrement = WireDefault((BigInt(1) << 31).U(32.W))
PerfCounter(largeIncrement, "two_to_the_31", "Should not rollover the default accumulation width")
v.generateValidationPrintfs
}
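
The arithmetic behind this test can be sketched in plain Scala. This is an illustrative model, assuming only what the comment above states (accumulators are read as 32b segments); `Mmio32`, `split`, and `combine` are hypothetical helpers and not part of the driver.

```scala
// Hypothetical helpers modelling a 64b count that is read as two 32b words.
object Mmio32 {
  private val wordMask: BigInt = (BigInt(1) << 32) - 1

  /** Splits a count into its (high, low) 32b words. */
  def split(count: BigInt): (BigInt, BigInt) =
    ((count >> 32) & wordMask, count & wordMask)

  /** Reassembles the full count from the two words. */
  def combine(hi: BigInt, lo: BigInt): BigInt = (hi << 32) | lo
}

object RolloverExample {
  // The DUT increments by 2^31 every cycle, so the low word overflows
  // on the second increment.
  val increment: BigInt = BigInt(1) << 31
  val afterThreeCycles: BigInt = increment * 3
  val (hi, lo) = Mmio32.split(afterThreeCycles)
}
```

A reader that only looked at the low word would see 2^31 after three increments instead of 3 * 2^31, which is the failure mode the test is meant to catch.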

View File

@ -3,38 +3,32 @@
package firesim.midasexamples
import chisel3._
import chisel3.util._
import freechips.rocketchip.config.Parameters
import midas.targetutils.{PerfCounter, AutoCounterCoverModuleAnnotation}
import freechips.rocketchip.util.property
import midas.targetutils.{AutoCounterCoverModuleAnnotation}
import firesim.midasexamples.AutoCounterWrappers.{PerfCounter, cover}
import scala.collection.mutable
/**
* Demonstrates how to instantiate autocounters, and validates those
* autocounters by comparing their output against a printf that should emit the
* same strings
*
* @param printfPrefix Used to filter simulation output for validation lines
* @param instName The suggested name for this instance. Used in validation printf
* @param clockDivision Used to scale validation output, since autocounters in
* slower domains will appear to be sampled less frequently (in terms of local
* cycle count).
*/
class AutoCounterModuleDUT(
printfPrefix: String = "AUTOCOUNTER_PRINT ",
instName: String = "dut",
clockDivision: Int = 1) extends Module {
class AutoCounterModuleDUT(val instName: String = "dut")(implicit val v: AutoCounterValidator) extends Module
with AutoCounterTestContext {
val io = IO(new Bundle {
val a = Input(Bool())
})
val instPath = s"${parentPathName}_${instName}"
suggestName(instName)
val enabled_cycles = RegInit(0.U(16.W))
when(io.a) { enabled_cycles := enabled_cycles + 1.U }
PerfCounter(io.a, "ENABLED", "Enabled cycles. Should be identical to cycle count minus reset cycles")
PerfCounter(io.a, "ENABLED", "Enabled cycles, should be identical to cycle count minus reset cycles")
val enabled4 = ~enabled_cycles(1) & ~enabled_cycles(0) & io.a
@ -43,36 +37,23 @@ class AutoCounterModuleDUT(
// Multibit event
val count = RegInit(0.U(4.W))
count := count + 1.U
PerfCounter(count, "MULTIBIT_EVENT", "A multibit event")
PerfCounter(count, "MULTIBIT_EVENT", "A multibit event. Here's a quote: \" and a comma: , to check description serialization.")
val childInst = Module(new AutoCounterModuleChild)
childInst.io.c := io.a
//--------VALIDATION---------------
// Check that identity operation preserves the annotated target
val identityCounter = RegInit(0.U(64.W))
identityCounter := 0x01234567DEADBEEFL.U
PerfCounter.identity(identityCounter, "PASSTHROUGH", "A multibit that is preserved and not accumulated")
val samplePeriod = 1000 / clockDivision
val enabled_printcount = freechips.rocketchip.util.WideCounter(64, io.a)
val enabled4_printcount = freechips.rocketchip.util.WideCounter(64, enabled4)
val oddlfsr_printcount = freechips.rocketchip.util.WideCounter(64, childInst.io.oddlfsr)
val multibit_printcount = freechips.rocketchip.util.WideCounter(64, count)
val cycle_print = Reg(UInt(64.W))
cycle_print := cycle_print + 1.U
when ((cycle_print >= (samplePeriod - 1).U) & (cycle_print % samplePeriod.U === (samplePeriod - 1).U)) {
printf(s"${printfPrefix}Cycle %d\n", cycle_print)
printf(s"${printfPrefix}============================\n")
printf(s"${printfPrefix}PerfCounter ENABLED_${instPath}: %d\n", enabled_printcount)
printf(s"${printfPrefix}PerfCounter ENABLED_DIV_4_${instPath}: %d\n", enabled4_printcount)
printf(s"${printfPrefix}PerfCounter ODD_LFSR_${instPath}_childInst: %d\n", oddlfsr_printcount)
printf(s"${printfPrefix}PerfCounter MULTIBIT_EVENT_${instPath}: %d\n", multibit_printcount)
printf(s"${printfPrefix}\n")
}
v.generateValidationPrintfs()
}
class AutoCounterModuleChild extends Module {
class AutoCounterModuleChild(val instName: String = "child")(implicit val v: AutoCounterValidator) extends Module
with AutoCounterTestContext {
val io = IO(new Bundle {
val c = Input(Bool())
val oddlfsr = Output(Bool())
})
val lfsr = chisel3.util.random.LFSR(16, io.c)
@ -80,8 +61,6 @@ class AutoCounterModuleChild extends Module {
val odd_lfsr = lfsr(0)
PerfCounter(odd_lfsr, "ODD_LFSR", "Number of cycles the LFSR is has an odd value")
io.oddlfsr := odd_lfsr
}
/** Demonstrate explicit instrumentation of AutoCounters via PerfCounter
@ -93,42 +72,26 @@ class AutoCounterModuleChild extends Module {
* @see AutoCounterF1Test
* @see PerfCounter
*/
class AutoCounterModule(implicit p: Parameters) extends PeekPokeMidasExampleHarness(() => new AutoCounterModuleDUT)
class AutoCounterModule(implicit p: Parameters) extends PeekPokeMidasExampleHarness(() =>
new AutoCounterModuleDUT()(new AutoCounterValidator))
class AutoCounterCoverModuleDUT extends Module {
property.cover.setPropLib(new midas.passes.FireSimPropertyLibrary())
class AutoCounterCoverModuleDUT(val instName: String = "dut")(implicit val v: AutoCounterValidator = new AutoCounterValidator) extends Module with AutoCounterTestContext {
freechips.rocketchip.util.property.cover.setPropLib(new midas.passes.FireSimPropertyLibrary())
val io = IO(new Bundle {
val a = Input(Bool())
})
val instName = "dut"
val instPath = s"${parentPathName}_${instName}"
suggestName(instName)
val cycle = RegInit(0.U(12.W))
cycle := cycle + 1.U
val cycle8 = ~cycle(2) & ~cycle(1) & ~cycle(0)
property.cover(cycle8 , "CYCLES_DIV_8", "Count the number of times the cycle count is divisible by 8. Should be equal to number of cycles divided by 8")
cover(cycle8 , "CYCLES_DIV_8", "Count the number of times the cycle count is divisible by 8. Should be equal to number of cycles divided by 8")
chisel3.experimental.annotate(AutoCounterCoverModuleAnnotation("AutoCounterCoverModuleDUT"))
//--------VALIDATION---------------
val cycle8_printcount = RegInit(0.U(64.W))
when (cycle8) {
cycle8_printcount := cycle8_printcount + 1.U
}
val samplePeriod = 1000
val cycle_print = Reg(UInt(64.W))
cycle_print := cycle_print + 1.U
when ((cycle_print >= (samplePeriod - 1).U) & (cycle_print % 1000.U === (samplePeriod - 1).U)) {
printf("AUTOCOUNTER_PRINT Cycle %d\n", cycle_print)
printf("AUTOCOUNTER_PRINT ============================\n")
printf(s"AUTOCOUNTER_PRINT PerfCounter CYCLES_DIV_8_${instPath}: %d\n", cycle8_printcount)
printf("AUTOCOUNTER_PRINT \n")
}
v.generateValidationPrintfs()
}
/** Demonstrate implicit instrumentation of AutoCounters via RocketChip 'cover' functions
@ -147,17 +110,20 @@ class AutoCounterPrintfDUT extends Module {
val a = Input(Bool())
})
val childInst = Module(new AutoCounterModuleChild)
implicit val v = new AutoCounterValidator(autoCounterPrintfMode = true)
val childInst = Module(new AutoCounterModuleChild())
childInst.io.c := io.a
//--------VALIDATION---------------
val incrementer = RegInit(0.U(32.W))
incrementer := incrementer + 1.U
PerfCounter.identity(incrementer, "incrementer", "Should print on every cycle after reset is disabled.")
val oddlfsr_printcount = freechips.rocketchip.util.WideCounter(64, childInst.io.oddlfsr)
val cycle_print = Reg(UInt(39.W))
cycle_print := cycle_print + 1.U
when (childInst.io.oddlfsr) {
printf("SYNTHESIZED_PRINT CYCLE: %d [AutoCounter] ODD_LFSR: %d\n", cycle_print, oddlfsr_printcount)
}
// Randomly update an identity value
val lfsr64 = chisel3.util.random.LFSR(64)
val lfsr64Reg = RegEnable(lfsr64, (lfsr64(2,0) === 0.U))
PerfCounter.identity(lfsr64Reg, "lfsr64_three_lsbs_zero", "Should print when the LFSR transitions to a value with three LSBs unset.")
v.generateValidationPrintfs()
}
/** Demonstrate alternative implementation of AutoCounters using event-driven SynthesizedPrintf's

View File

@ -0,0 +1,245 @@
//See LICENSE for license details.
package firesim.midasexamples
import midas.targetutils.{PerfCounterOps, PerfCounterOpType}
import midas.widgets.{AutoCounterConsts, EventMetadata}
import chisel3._
import chisel3.util.experimental.BoringUtils
import scala.collection.mutable
/**
* This file contains utilities for doing integration testing of AutoCounter features.
* These tests consist of wrapping calls to PerfCounter to collect metadata about each counter,
* and synthesizing printfs to produce expected output.
*
* This is achieved by passing an instance of [[AutoCounterValidator]] between modules
* that mixin [[AutoCounterTestContext]]. See [[AutoCounterModule]] for an example.
*/
object AutoCounterVerificationConstants {
// This is repeated from the C++ since it is not used elsewhere in the Scala.
val expectedCSVVersion = 1
val headerLines = 7
}
/**
* Tracks all the required metadata to generate reference hardware for a single AutoCounter event.
*/
case class PerfCounterInstance(
target: chisel3.UInt,
label: String,
description: String,
opType: PerfCounterOpType,
eventWidth: Int,
wiringId: String,
instPath: Seq[String]) extends AutoCounterConsts {
/** Implements a target-RTL equivalent counter to what would be implemented in the bridge.
* Returns the generated Counter (_1), and the event that was annotated (_2)
*/
def generateReferenceHardware(): (UInt, UInt) = {
val sink = WireDefault(0.U(eventWidth.W))
BoringUtils.addSink(sink, wiringId)
dontTouch(sink)
opType match {
case PerfCounterOps.Accumulate =>
(freechips.rocketchip.util.WideCounter(counterWidth, sink).value, sink)
case PerfCounterOps.Identity =>
(sink.pad(counterWidth), sink)
}
}
/* Adds a formatted path to the label to match the behavior of the transform */
def pathPrefixedLabel: String = (label +: (instPath.reverse)).mkString("_")
/* Quotes the description and escapes potentially troublesome characters */
def quoteDescriptionForCSV: String =
'"' + description.replaceAll("\"", "\"\"") + '"'
}
/**
* Used to collect information about PerfCounters as they are instantiated
* throughout a DUT's module hierarchy. Emits validation printfs at the
* top-level, using the WiringTransform to bring out the events and drive
* reference counters.
*
* @param domainName Name of the clock domain managed by validator
* @param printfPrefix Used to filter simulation output for validation lines
* @param autoCounterPrintfMode Set when the autoCounter transform emits
* synthesizable printfs instead of the AutoCounterBridge
* @param clockDivision Used to scale validation output, since autocounters in
* slower domains will appear to be sampled less frequently (in terms of local
* cycle count).
* @param samplePeriodBase The expected period between AC samples in the base clock domain.
*/
class AutoCounterValidator(
domainName: String = "BaseClock",
printfPrefix: String = "AUTOCOUNTER_PRINT ",
autoCounterPrintfMode: Boolean = false,
clockDivision: Int = 1,
clockMultiplication: Int = 1,
samplePeriodBase: Int = 1000
) extends InstanceNameHelper with AutoCounterConsts {
// Auto Counter currently uses the top module name as the final substring in
// the label path. Set this top-module name upon instantiation, since we must
// be instantiated at the top of the module hierarchy to work correctly.
setModuleName(Module.currentModule.get.getClass.getSimpleName)
private val samplePeriod = samplePeriodBase / clockDivision
private val _instances = new mutable.ArrayBuffer[PerfCounterInstance]()
// Using our current AutoCounter instance count should suffice to produce a unique wiring ID.
private def nextWiringId(): String = s"${domainName}_${_instances.size}"
/**
* Registers a new PerfCounter. Parameters to this function mirror PerfCounter parameters.
*
*/
def registerEvent (
target: chisel3.UInt,
label: String,
description: String,
opType: PerfCounterOpType,
addPathToLabel: Boolean = true): Unit = {
val wiringId = nextWiringId
BoringUtils.addSource(target, wiringId)
_instances += PerfCounterInstance(target, label, description, opType, target.getWidth, wiringId, currentInstPath)
}
// For validation, spoof an event that behaves like the implicit cycle count
// (there is no associated AutoCounterAnnotation for it)
private def addCycleCount(): Unit = {
val dummy = WireDefault(true.B)
val EventMetadata(_,label,description,width,opType) = EventMetadata.localCycleCount
PerfCounterInstance(dummy, label, description, opType, width, "UNUSED", Seq()) +=: _instances
}
// Creates a printf with the validation prefix so it can be extracted from
// the simulator's stdout.
private def prefixed_printf(fmtString: String, args: Bits*) =
printf(s"${printfPrefix}${fmtString}\n", args:_*)
// Emits a csv header row to match the expected output of the driver. Note:
// columns may be swizzled.
private def print_header_row(name: String, extractor: PerfCounterInstance => String): Unit =
prefixed_printf((name +: _instances.map(extractor)).mkString(","))
/**
* Generates printfs that mirror the expected output from the bridge driver.
*
* Note: columns in this output will likely be swizzled vs the
* driver-generated form. The ScalaTest code de-swizzles by matching
* labels.
*/
def standardModeValidation(): Unit = {
val sampleCount, localCycles = Reg(UInt(64.W))
localCycles := localCycles + 1.U
val baseCycles = (sampleCount + 1.U) * samplePeriodBase.U
// Generate the validation hardware first, then spoof the cycle counter,
// since its behavior under reset does not match a conventional auto
// counter (it still increments).
val counters = baseCycles +: localCycles +: _instances.map(_.generateReferenceHardware._1)
addCycleCount()
// Wait to print the header until the cycle before the first data row
// to avoid getting masked off
when(localCycles === (samplePeriod - 1).U) {
prefixed_printf(s"version,${AutoCounterVerificationConstants.expectedCSVVersion}")
prefixed_printf(s"Clock Domain Name, ${domainName}, Base Multiplier, ${clockMultiplication}, Base Divisor, ${clockDivision}")
print_header_row("label", { _.pathPrefixedLabel })
// First column is quoted to be consistent across the whole row
print_header_row("\"description\"", { _.quoteDescriptionForCSV });
print_header_row("type", { _.opType.toString });
print_header_row("event width", {_.eventWidth.toString });
print_header_row("accumulator width", {_ => counterWidth.toString } );
}
when ((localCycles >= (samplePeriod - 1).U) && (localCycles % samplePeriod.U === 0.U)) {
prefixed_printf(Seq.fill(counters.size)("%d").mkString(","), counters:_*)
sampleCount := sampleCount + 1.U
}
}
def printfModeValidation(): Unit = {
val localCycles = Reg(UInt(64.W))
localCycles := localCycles + 1.U
val eventTuples = _instances.map(_.generateReferenceHardware)
for (((counter, input), metadata) <- eventTuples.zip(_instances)) {
val commonFormat = s"CYCLE: %d [AutoCounter] ${metadata.label}: %d"
metadata.opType match {
case PerfCounterOps.Accumulate =>
when (input =/= 0.U) {
prefixed_printf(commonFormat, localCycles, counter)
}
case PerfCounterOps.Identity =>
when (input =/= RegNext(input)) {
prefixed_printf(commonFormat, localCycles, input)
}
}
}
}
def generateValidationPrintfs(): Unit = {
if (autoCounterPrintfMode) printfModeValidation else standardModeValidation
}
}
object AutoCounterWrappers {
/**
* Wraps the standard companion object to capture metadata about each
* PerfCounter invocation in a sideband (AutoCounterValidator). This will be
* used to generate validation printfs at the end of module elaboration.
*/
object PerfCounter {
def apply(
target: chisel3.UInt,
label: String,
description: String,
opType: PerfCounterOpType = PerfCounterOps.Accumulate)(
implicit v: AutoCounterValidator): Unit = {
midas.targetutils.PerfCounter(target, label, description)
v.registerEvent(target, label, description, opType)
}
def identity(
target: chisel3.UInt,
label: String,
description: String)(implicit v: AutoCounterValidator): Unit = {
midas.targetutils.PerfCounter.identity(target, label, description)
v.registerEvent(target, label, description, PerfCounterOps.Identity)
}
}
/**
* As above, wraps calls to freechips.rocketchip.util.property.cover, to capture
* the required metadata to generate a reference counter.
*/
object cover {
def apply(target: Bool, label: String, description: String)(implicit v: AutoCounterValidator): Unit = {
freechips.rocketchip.util.property.cover(target, label, description)
v.registerEvent(target, label, description, PerfCounterOps.Accumulate)
}
}
}
/**
* Mix into any module that has AutoCounters contained within its module hierarchy.
*/
trait AutoCounterTestContext { this: Module =>
def instName: String
implicit val v: AutoCounterValidator
v.setModuleName(instName)
}
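
The two string helpers on `PerfCounterInstance` (`pathPrefixedLabel` and `quoteDescriptionForCSV`) can be tried in isolation. The following is a minimal sketch that reproduces their logic in plain Scala without Chisel; the `CsvHelpers` object and its method names are hypothetical, and the instance path is assumed to be ordered innermost-first, as `currentInstPath` appears to return it.

```scala
// Stand-alone copies of the label and quoting logic, for illustration only.
object CsvHelpers {
  /** Appends the instance path (outermost module first) to the label. */
  def pathPrefixedLabel(label: String, instPath: Seq[String]): String =
    (label +: instPath.reverse).mkString("_")

  /** Wraps the description in double quotes and doubles embedded quotes,
    * so commas and quotes survive a round trip through a CSV parser. */
  def quoteForCSV(description: String): String =
    "\"" + description.replace("\"", "\"\"") + "\""
}
```

For example, a counter labelled `ENABLED` in an instance `dut` under `AutoCounterModule` yields `ENABLED_AutoCounterModule_dut`, and a description containing a quote and a comma stays within a single CSV field.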

View File

@ -8,7 +8,7 @@ import freechips.rocketchip.config._
import junctions._
import firesim.util.DesiredHostFrequency
import firesim.configs.WithDefaultMemModel
import firesim.configs.{WithDefaultMemModel, WithWiringTransform}
class NoConfig extends Config(Parameters.empty)
// This is incomplete and must be mixed into a complete platform config
@ -22,6 +22,7 @@ class DefaultF1Config extends Config(new Config((site, here, up) => {
}) ++ new Config(
new firesim.configs.WithEC2F1Artefacts ++
new WithDefaultMemModel ++
new WithWiringTransform ++
new firesim.configs.WithILATopWiringTransform ++
new midas.F1Config))

View File

@ -0,0 +1,40 @@
//See LICENSE for license details.
package firesim.midasexamples
import chisel3._
import chisel3.experimental.BaseModule
/**
* This enables using the full path to a module before its name is reflexively
* defined. It works by suggesting names that should be stable; these strings
* can then be used in verification collateral that is created during
* elaboration.
*
* Sharp edge: the top module technically does not have an "instance name". Might
* need to permit returning an empty Seq at the top of the module hierarchy instead
* of throwing an exception if generalizing beyond AutoCounter.
*
*/
trait InstanceNameHelper {
private var _instPathStack = List[(BaseModule, String)]()
// This grows the stack, and should be called on module entry
def setModuleName(suggestedName: String): Unit = {
val currentModule = Module.currentModule.get
currentModule.suggestName(suggestedName)
_instPathStack = (currentModule -> suggestedName) +: _instPathStack
}
// Returns the current instPath, shrinking the stack until it reaches the
// current context.
def currentInstPath(): Seq[String] = _instPathStack match {
case Nil => throw new RuntimeException(
s"Could not resolve instance path for Module ${Module.currentModule.get}. Did you forget to call setModuleName?")
case (mod, _) :: pairs if mod != Module.currentModule.get =>
_instPathStack = pairs
currentInstPath()
case pairs =>
pairs.map(_._2)
}
}
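
The stack discipline used by `InstanceNameHelper` can be modelled without Chisel. This is a rough sketch under stated assumptions: `Mod` stands in for `BaseModule`, the current module is passed explicitly instead of being read from `Module.currentModule`, and `PathTracker`, `enter`, and `currentPath` are hypothetical names.

```scala
// Plain-Scala model of the path stack; all names here are illustrative.
final case class Mod(id: String)

class PathTracker {
  // Head of the list is the most recently entered module.
  private var stack = List.empty[(Mod, String)]

  /** Called on module entry; grows the stack. */
  def enter(m: Mod, suggestedName: String): Unit =
    stack = (m -> suggestedName) +: stack

  /** Pops entries for modules that have finished elaborating until the
    * head matches `current`, then returns the names innermost-first. */
  def currentPath(current: Mod): Seq[String] = stack match {
    case Nil =>
      throw new RuntimeException(s"Could not resolve instance path for $current")
    case (m, _) :: rest if m != current =>
      stack = rest
      currentPath(current)
    case pairs => pairs.map(_._2)
  }
}
```

Querying the path from a parent after a child has been registered discards the child's entry, which is how the stack shrinks back to the current context.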

View File

@ -13,23 +13,26 @@ import midas.widgets.{RationalClockBridge, PeekPokeBridge, RationalClock}
// from verilator/vcs and compare against the two files produced by
// the bridges
class MulticlockAutoCounterModule(implicit p: Parameters) extends RawModule {
val clockBridge = RationalClockBridge(RationalClock("ThirdRate", 1, 3))
val List(refClock, div2Clock) = clockBridge.io.clocks.toList
val clockBridge = RationalClockBridge(RationalClock("SlowClock", 1, 3))
val List(refClock, slowClock) = clockBridge.io.clocks.toList
val reset = WireInit(false.B)
val resetHalfRate = ResetCatchAndSync(div2Clock, reset.asBool)
val slowReset = ResetCatchAndSync(slowClock, reset.asBool)
// Used to let printfs emit the correct validation output
val instPath = "MulticlockAutoCounterModule_AutoCounterModuleDUT"
withClockAndReset(refClock, reset) {
val lfsr = chisel3.util.random.LFSR(16)
val fullRateMod = Module(new AutoCounterModuleDUT(instName = "secondRate"))
val fullRateMod = Module(new AutoCounterModuleDUT(instName = "secondRate")(new AutoCounterValidator))
fullRateMod.io.a := lfsr(0)
val peekPokeBridge = PeekPokeBridge(refClock, reset)
}
withClockAndReset(div2Clock, resetHalfRate) {
withClockAndReset(slowClock, slowReset) {
val lfsr = chisel3.util.random.LFSR(16)
val fullRateMod = Module(new AutoCounterModuleDUT("AUTOCOUNTER_PRINT_THIRDRATE ",
instName = "thirdRate",
clockDivision = 3))
val fullRateMod = Module(new AutoCounterModuleDUT("slowClock")(
new AutoCounterValidator(
domainName = "SlowClock",
printfPrefix = "AUTOCOUNTER_PRINT_SLOWCLOCK ",
clockDivision = 3)))
fullRateMod.io.a := lfsr(0)
}
}

View File

@ -5,6 +5,7 @@ import java.io.File
import scala.util.matching.Regex
import scala.io.Source
import org.scalatest.Suites
import org.scalatest.matchers.should._
abstract class TutorialSuite(
val targetName: String, // See GeneratorUtils
@ -12,7 +13,7 @@ abstract class TutorialSuite(
platformConfigs: String = "HostDebugFeatures_DefaultF1Config",
tracelen: Int = 8,
simulationArgs: Seq[String] = Seq()
) extends firesim.TestSuiteCommon {
) extends firesim.TestSuiteCommon with Matchers {
val backendSimulator = "verilator"
@ -122,6 +123,55 @@ abstract class TutorialSuite(
}
}
/**
* Compares an AutoCounter output CSV against a reference generated using in-circuit printfs.
*/
def checkAutoCounterCSV(filename: String, stdoutPrefix: String) {
it should s"produce a csv file (${filename}) that matches in-circuit printf output" in {
val scrubWhitespace = raw"\s*(.*)\s*".r
def splitAtCommas(s: String) = {
s.split(",")
.map(scrubWhitespace.findFirstMatchIn(_).get.group(1))
}
def quotedSplitAtCommas(s: String) = {
s.split("\",\"")
.map(scrubWhitespace.findFirstMatchIn(_).get.group(1))
}
val refLogFile = new File(outDir, s"/${targetName}.${backendSimulator}.out")
val acFile = new File(genDir, s"/${filename}")
val refVersion ::refClockInfo :: refLabelLine :: refDescLine :: refOutput =
extractLines(refLogFile, stdoutPrefix, headerLines = 0).toList
val acVersion ::acClockInfo :: acLabelLine :: acDescLine :: acOutput =
extractLines(acFile, prefix = "" , headerLines = 0).toList
assert(acVersion == refVersion)
val refLabels = splitAtCommas(refLabelLine)
val acLabels = splitAtCommas(acLabelLine)
acLabels should contain theSameElementsAs refLabels
val swizzle: Seq[Int] = refLabels.map { acLabels.indexOf(_) }
def checkLine(acLine: String, refLine: String, tokenizer: String => Seq[String] = splitAtCommas) {
val Seq(acFields, refFields) = Seq(acLine, refLine).map(tokenizer)
val assertMessagePrefix = s"Row commencing with ${refFields.head}:"
assert(acFields.size == refFields.size, s"${assertMessagePrefix} lengths do not match")
for ((field, columnIdx) <- refFields.zipWithIndex) {
assert(field == acFields(swizzle(columnIdx)),
s"${assertMessagePrefix} value for label ${refLabels(columnIdx)} does not match."
)
}
}
for ((acLine, refLine) <- acOutput.zip(refOutput)) {
checkLine(acLine, refLine)
}
}
}
mkdirs()
behavior of s"$targetName"
elaborateAndCompile()
@ -147,32 +197,43 @@ class AccumulatorF1Test extends TutorialSuite("Accumulator")
class VerilogAccumulatorF1Test extends TutorialSuite("VerilogAccumulator")
class AssertModuleF1Test extends TutorialSuite("AssertModule")
class AutoCounterModuleF1Test extends TutorialSuite("AutoCounterModule",
simulationArgs = Seq("+autocounter-readrate=1000", "+autocounter-filename=AUTOCOUNTERFILE")) {
diffSynthesizedLog("AUTOCOUNTERFILE0", stdoutPrefix = "AUTOCOUNTER_PRINT ", synthPrefix = "")
simulationArgs = Seq("+autocounter-readrate=1000", "+autocounter-filename-base=autocounter")) {
checkAutoCounterCSV("autocounter0.csv", "AUTOCOUNTER_PRINT ")
}
class AutoCounter32bRolloverTest extends TutorialSuite("AutoCounter32bRollover",
simulationArgs = Seq("+autocounter-readrate=1000", "+autocounter-filename-base=autocounter")) {
checkAutoCounterCSV("autocounter0.csv", "AUTOCOUNTER_PRINT ")
}
class AutoCounterCoverModuleF1Test extends TutorialSuite("AutoCounterCoverModule",
simulationArgs = Seq("+autocounter-readrate=1000", "+autocounter-filename=AUTOCOUNTERFILE")) {
diffSynthesizedLog("AUTOCOUNTERFILE0", stdoutPrefix = "AUTOCOUNTER_PRINT ", synthPrefix = "")
simulationArgs = Seq("+autocounter-readrate=1000", "+autocounter-filename-base=autocounter")) {
checkAutoCounterCSV("autocounter0.csv", "AUTOCOUNTER_PRINT ")
}
class AutoCounterPrintfF1Test extends TutorialSuite("AutoCounterPrintfModule",
simulationArgs = Seq("+print-file=synthprinttest.out"),
platformConfigs = "AutoCounterPrintf_HostDebugFeatures_DefaultF1Config") {
diffSynthesizedLog("synthprinttest.out0", stdoutPrefix = "SYNTHESIZED_PRINT CYCLE", synthPrefix = "CYCLE")
diffSynthesizedLog("synthprinttest.out0", stdoutPrefix = "AUTOCOUNTER_PRINT CYCLE", synthPrefix = "CYCLE")
}
class AutoCounterGlobalResetConditionF1Test extends TutorialSuite("AutoCounterGlobalResetCondition",
simulationArgs = Seq("+autocounter-readrate=1000", "+autocounter-filename=AUTOCOUNTERFILE")) {
def assertCountsAreZero(filename: String) {
simulationArgs = Seq("+autocounter-readrate=1000", "+autocounter-filename-base=autocounter")) {
def assertCountsAreZero(filename: String, clockDivision: Int) {
s"Counts reported in ${filename}" should "always be zero" in {
val log = new File(genDir, s"/${filename}")
val lines = extractLines(log, "PerfCounter ")
val perfCounterRegex = raw".*: (\d*)$$".r
lines.foreach {
case perfCounterRegex(value) => assert(value.toInt == 0)
val versionLine :: lines = extractLines(log, "", headerLines = 0).toList
val sampleLines = lines.drop(AutoCounterVerificationConstants.headerLines - 1)
assert(versionLine.split(",")(1).toInt == AutoCounterVerificationConstants.expectedCSVVersion)
val perfCounterRegex = raw"(\d*),(\d*),(\d*)".r
sampleLines.zipWithIndex foreach {
case (perfCounterRegex(baseCycle,localCycle,value), idx) =>
assert(baseCycle.toInt == 1000 * (idx + 1))
assert(localCycle.toInt == (1000 / clockDivision) * (idx + 1))
assert(value.toInt == 0)
}
}
}
assertCountsAreZero("AUTOCOUNTERFILE0")
assertCountsAreZero("AUTOCOUNTERFILE1")
assertCountsAreZero("autocounter0.csv", clockDivision = 1)
assertCountsAreZero("autocounter1.csv", clockDivision = 2)
}
class PrintfModuleF1Test extends TutorialSuite("PrintfModule",
@ -249,9 +310,9 @@ class MulticlockPrintF1Test extends TutorialSuite("MulticlockPrintfModule",
}
class MulticlockAutoCounterF1Test extends TutorialSuite("MulticlockAutoCounterModule",
simulationArgs = Seq("+autocounter-readrate=1000", "+autocounter-filename=AUTOCOUNTERFILE")) {
diffSynthesizedLog("AUTOCOUNTERFILE0", "AUTOCOUNTER_PRINT ", "")
diffSynthesizedLog("AUTOCOUNTERFILE1", "AUTOCOUNTER_PRINT_THIRDRATE ", "")
simulationArgs = Seq("+autocounter-readrate=1000", "+autocounter-filename-base=autocounter")) {
checkAutoCounterCSV("autocounter0.csv", "AUTOCOUNTER_PRINT ")
checkAutoCounterCSV("autocounter1.csv", "AUTOCOUNTER_PRINT_SLOWCLOCK ")
}
// Basic test for deduplicated extracted models
class TwoAddersF1Test extends TutorialSuite("TwoAdders")
@ -342,6 +403,7 @@ class AutoCounterCITests extends Suites(
new AutoCounterPrintfF1Test,
new MulticlockAutoCounterF1Test,
new AutoCounterGlobalResetConditionF1Test,
new AutoCounter32bRolloverTest,
)
class GoldenGateMiscCITests extends Suites(