From 7b0c832aacf26428753c2ccb18acf15ec0c19979 Mon Sep 17 00:00:00 2001
From: David Biancolin <david.biancolin@gmail.com>
Date: Mon, 4 Nov 2019 10:29:51 -0800
Subject: [PATCH 1/4] Update Bridge Docs (#408)

* Update Bridge Docs

* Address some of Alon's comments
---
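Reviewer note on the runtime-configuration text in the Bridges.rst hunks below: that section says plus args such as +latency=1 are passed to bridge drivers. The snippet here is only a minimal, self-contained sketch of how a driver might look up such an argument. The helper name `plusarg_or` and its fallback behaviour are assumptions made for illustration; they are not FireSim or `bridge_driver_t` APIs.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not a FireSim API): scan the simulator's arguments for
// "+<key>=<integer>" and return the integer. If the key is absent, or its
// first occurrence is not a well-formed decimal integer, return `fallback`.
long plusarg_or(const std::vector<std::string>& args,
                const std::string& key,
                long fallback) {
  const std::string prefix = "+" + key + "=";
  for (const std::string& arg : args) {
    // Match only arguments that begin with the exact "+key=" prefix.
    if (arg.compare(0, prefix.size(), prefix) != 0) continue;
    const std::string value = arg.substr(prefix.size());
    char* end = nullptr;
    const long parsed = std::strtol(value.c_str(), &end, 10);
    // Accept the value only if it is non-empty and fully consumed.
    return (!value.empty() && *end == '\0') ? parsed : fallback;
  }
  return fallback;
}
```

Under these assumptions, a driver constructed with the simulator's argument vector could call `plusarg_or(args, "latency", 1)` once at start-up and program the result into its bridge module over MMIO. The same plus arg could then be handed unchanged to a software RTL simulator.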
 docs/Golden-Gate/Bridges.rst | 187 +++++++++++++++++------------------
 1 file changed, 89 insertions(+), 98 deletions(-)

diff --git a/docs/Golden-Gate/Bridges.rst b/docs/Golden-Gate/Bridges.rst
index 94e536ee..a565bf7d 100644
--- a/docs/Golden-Gate/Bridges.rst
+++ b/docs/Golden-Gate/Bridges.rst
@@ -2,25 +2,11 @@ Target-to-Host Bridges
 ======================
 
 A custom model in a FireSim Simulation, either CPU-hosted or FPGA-hosted, is
-deployed by a *Target-to-Host Bridge*, or Bridge for short. Bridges provide the
+deployed by using a *Target-to-Host Bridge*, or Bridge for short. Bridges provide the
 means to inject hardware and software models that produce and consume token streams. 
 
 Bridges enable:
 
-#. Software co-simulation. Ex. Before writing RTL for your accelerator, you can instantiate a custom bridge that
-   calls out to a software model running on the CPU.
-
-#. Resource savings by replacing components of the target with models that use
-   fewer FPGA resources or run entirely software.
-
-The use of Bridges in a FireSim simulation has many analogs to doing
-mixed-language (Verilog-C++) simulation of the same system in software. Where
-possible, we'll draw analogies.
-
-
-Use Cases
----------
-
 #. **Deterministic, host-agnostic I/O models.** This is the most common use case.
    Here you instantiate bridges at the I/O boundary of your chip, to provide
    a simulation models of the environment your design is executing in.  For an
@@ -38,67 +24,74 @@ Use Cases
    SoC boundary. Then write software models and bridge drivers that move
    tokens between each FPGA. See the SimpleNICBridge.
 
+#. **Resource optimizations.** Resource-intensive components of the target can
+   be replaced with models that use fewer FPGA resources or run entirely in
+   software.
 
-Defining A Bridge
+
+The use of Bridges in a FireSim simulation has many analogs to doing
+mixed-language (Verilog-C++) simulation of the same system in software. Where
+possible, we'll draw analogies.
+
+
+Terminology
 --------------------------
 
-Bridges have a target side, consisting of a specially annotated Module, and host side,
-which consist of an FPGA-hosted BridgeModule and an optional CPU-hosted BridgeDriver.
+Bridges have a `target side`, consisting of a specially annotated Module, and a `host side`,
+which consists of an FPGA-hosted `bridge module` (deriving from ``BridgeModule``)
+and an optional CPU-hosted `bridge driver` (deriving from ``bridge_driver_t``).
 
-In a mixed-language software simulation, a Verilog VPI interface, (i.e, a tick
-fucntion) is analogous to the target side of a bridge, with the C++ backing
+In a mixed-language software simulation, a Verilog Procedural Interface (VPI) is analogous to the target side of a bridge, with the C++ backing
 that interface being the host side.
 
-
 Target Side
 ----------------------
 
-In your target-side implementation, you will define a Scala trait that extends
-Bridge. This trait indicates that the module will declared and connected to in
-the target design, but that its implementation will be provided by a simulation
-Bridge. Once the trait is mixed into a Chisel BlackBox or a Module, that module
-will be extracted by Golden Gate, and its interface with the rest of the target
-design will be driven by your host-side implementation.
+On the target side, you will mix ``midas.widgets.Bridge`` into a Chisel
+``BaseModule`` (this can be a black or white-box Chisel module) and implement
+its abstract members. This trait indicates that the associated module will be
+replaced with a connection to the host-side of the bridge that sources and
+sinks token streams. During compilation, the target-side module will be extracted by Golden Gate and
+its interface will be driven by your bridge's host-side implementation.
 
 This trait has two type parameters and two abstract members you'll need define
-for your Bridge. Note that since you must mix Bridge into either a Chisel
-BlackBox or a Module, you'll of course need to define the IO for that module.
-That's the interface you'll use to connect to your target RTL.
+for your Bridge. Since you must mix ``Bridge`` into a Chisel ``BaseModule``, the IO you
+define for that module constitutes the target-side interface of your bridge.
 
 Type Parameters:
+++++++++++++++++
 
-#. Host Interface Type [HPType]: The Chisel type of your Bridge's target-land interface. This describes how the target interface
-has been divided into seperate token channels. One example, HostPort[T], divides a Chisel Bundle into a single bi-directional token stream.
-#. Host Module Type: The type of the Chisel Module you want Golden Gate to connect in-place of your black box.
+#. **Host Interface Type** ``HPType <: TokenizedRecord``: The Chisel type of your Bridge's
+   host-land interface. This describes how the target interface has been
+   divided into separate token channels. One example, ``HostPortIO[T]``, divides a
+   Chisel Bundle into a single bi-directional token stream and is sufficient
+   for defining bridges that do not model combinational paths between token
+   streams. We suggest starting with ``HostPortIO[T]`` when defining a Bridge for modeling IO devices, as it is the simplest
+   to reason about and can run at FMR = 1. For other port types, see Bridge Host Interfaces.
+
+#. **BridgeModule Type** ``WidgetType <: BridgeModule``: The type of the
+   host-land BridgeModule you want Golden Gate to connect in place of your target-side module.
+   Golden Gate will use its class name to invoke its constructor.
 
 Abstract Members:
++++++++++++++++++
 
-#. Host Interface Mock: In your bridge trait you'll create an instance of
-   your Host Interface of type HPType, which you'll use to communicate to
-   Golden Gate how the target-land IO of this black box is being divided into
-   channels.  The constructor of thisr must accept the target-land IO
-   interface, a hardware type, that it may correctly divide it into channels,
-   and annotate the right fields of your Bridge instance.
+#. **Host Interface Mock** ``bridgeIO: HPType``: Here you'll instantiate a mock instance of
+   your host-side interface. **This does not add IO to your target-side module**. Instead, it is used
+   to emit annotations that tell Golden Gate how the target-land IO of the target-side module is being divided into
+   channels.
 
-#. Constructor Arg: A Scala case class you'd like to pass to your host-land
+#. **Bridge Module Constructor Arg** ``constructorArg: Option[AnyRef]``: An optional Scala case class you'd like to pass to your host-land
    BridgeModule's constructor. This will be serialized into an annotation and
-   consumed later by Golden Gate. In this case class you should capture all
+   consumed later by Golden Gate. If provided, your case class should capture all
    target-land configuration information you'll need in your Module's
    generator.
 
 
-Finally at the bottom of your Bridge's class definition **you'll need to call generateAnnotations()**.
-This will emit an "BridgeAnnotation" attached to module that indicates:
-
-#. This module is an Bridge.
-#. The class name of the BridgeModule's generator (e.g., firesim.bridges.UARTModule)
-#. The serialized constructor argument for that generator (e.g. firesim.bridges.UARTKey)
-#. A list of channel names; string references to Channel annotations
-
-And a series of FAMEChannelConnectionAnnotations, which target the module's I/O to group them into token channels.
+Finally, at the bottom of your Bridge's class definition, **you'll need to call generateAnnotations()**. This is necessary for Golden Gate to properly detect your bridge.
 
 You can freely instantiate your Bridge anywhere in your Target RTL: at the I/O
-boundary of your chi or deep in its module hierarchy.  Since all of the Golden
+boundary of your chip or deep in its module hierarchy.  Since all of the Golden
 Gate-specific metadata is captured in FIRRTL annotations, you can generate your
 target design and simulate it a target-level RTL simulation or even pass it off
 to ASIC CAD tools -- Golden Gate's annotations will simply be unused.
@@ -106,42 +99,43 @@ to ASIC CAD tools -- Golden Gate's annotations will simply be unused.
 What Happens Next?
 ------------------------
 
-If you do pass your FIRRTL & Annotations to Golden Gate. It will find your
-module, remove it,  and wire its dangling target-interface to the top-level of
-the design. During host-decoupling transforms, Golden Gate aggregates fields of
-your bridge's target IO based on ChannelAnnotations, and wraps them up into
-new Decoupled interfaces that match your Host Interface definition. Finally,
-once Golden Gate is done performing compiler transformations, it iterates
-through each Bridge annotation, generates your Module, passing it the
-serialized constructor argument, and connects it to the tokenized interface
-presented by the now host-decoupled target.
+If you pass your design to Golden Gate, it will find your target-side module, remove it,
+and wire its dangling target-interface to the top-level of the design. During
+host-decoupling transforms, Golden Gate aggregates fields of your bridge's
+target interface based on channel annotations emitted by the target-side of
+your bridge, and wraps them up into decoupled interfaces that match your host
+interface definition. Finally, once Golden Gate is done performing compiler
+transformations, it generates the bridge modules (by looking up their
+constructors and passing them their serialized constructor argument) and
+connects them to the tokenized interfaces presented by the now host-decoupled simulator.
 
-Host-side Implementation
-------------------------
+Host Side
+---------
 
-Host-side implementations have two components.
-#. A FPGA-hosted BridgeModule.
-#. An optional, CPU-hosted, bridge driver.
+The host side of a bridge has two components:
 
-In general, bridges have both a module and a driver: in FASED memory timing
-models, the BridgeDriver configures timing parameters at the start of
+#. An FPGA-hosted bridge module (``BridgeModule``).
+#. An optional, CPU-hosted, bridge driver (``bridge_driver_t``).
+
+In general, bridges have both: in FASED memory timing
+models, the driver configures timing parameters at the start of
 simulation, and periodically reads instrumentation during execution.  In the
-Block Device model, a Driver periodically polls hardware queues checking for
-new functional requests to be served. In the NIC model, the BridgeDriver moves
-tokens in bulk between the software switch model and the BridgeModule, which
+Block Device model, the driver periodically polls queues in the module checking for
+new functional requests to be served. In the NIC model, the driver moves
+tokens in bulk between the software switch model and the bridge module, which
 simply queues up tokens as they arrive.
 
-Communication between a BridgeModule and BridgeDriver is implemented with two types of transport:
+Communication between a bridge module and driver is implemented with two types of transport:
 
-#. MMIO: On the hardware-side this is implemented over a 32-bit AXI4-lite bus.
-   Reads and writes to this bus are made by BridgeDrivers using simif_t::read()
-   and simif_t::write(). BridgeModules register memory mapped registers using
-   methods defined in Widget, addresses for these registers are passed to the
+#. **MMIO**: In the module, this is implemented over a 32-bit AXI4-lite bus.
+   Reads and writes to this bus are made by drivers using ``simif_t::read()``
+   and ``simif_t::write()``. Bridge modules register memory mapped registers using
+   methods defined in ``midas.widgets.Widget``; addresses for these registers are passed to the
    drivers in a generated C++ header.
 
-#. DMA: On the hardware-side this is implemented with a wide (e.g., 512-bit) AXI4
-   bus, that is mastered by the CPU. BridgeDrivers initiate bulk transactions
-   by passing buffers to simif_t::push() and simif_t::pull() (DMA from the
+#. **DMA**: In the module, this is implemented with a wide (e.g., 512-bit) AXI4
+   bus, that is mastered by the CPU. Bridge drivers initiate bulk transactions
+   by passing buffers to ``simif_t::push()`` and ``simif_t::pull()`` (DMA from the
    FPGA). DMA is typically used to stream tokens into and out of
    out of large FIFOs in the BridgeModule.
 
@@ -149,45 +143,42 @@ Communication between a BridgeModule and BridgeDriver is implemented with two ty
 Compile-Time (Parameterization) vs Runtime Configuration
 --------------------------------------------------------
 
-As when compiling a software-RTL simulator, the simulated design
+As when compiling a software RTL simulator, the simulated design
 is configured over two phases:
 
-#. Compile Time. By parameterization the target RTL and BridgeModule
+#. **Compile Time**, by parameterizing the target RTL and BridgeModule
    generators, and by enabling Golden Gate optimization and debug
    transformations. This changes the simulator's RTL and thus requires a
    FPGA-recompilation. This is equivalent to, but considerably slower than,
    invoking VCS to compile a new simulator.
 
-
-#. Runtime. By specifying plus args (e.g., +mm_latency=1) that are passed to
-   the BridgeDrivers.  This is isomorphic to passing plus args to a VCS
-   simulator, in fact, in many cases the plus args passed to a VCS simulator
+#. **Runtime**, by specifying plus args (e.g., +latency=1) that are passed to
+   the BridgeDrivers.  This is equivalent to passing plus args to a software
+   RTL simulator, and in many cases the plus args passed to an RTL simulator
    and a FireSim simulator can be the same.
 
 Target-Side vs Host-Side Parameterization
 -----------------------------------------
 
-Unlike in a VCS simulation, FireSim simulations have an additional phase of RTL
-elaboration, during which BridgeModules are generated (they are implemented as
-Chisel generators).
+Unlike in a software RTL simulation, FireSim simulations have an additional phase of RTL
+elaboration, during which bridge modules are generated (they are themselves Chisel generators).
 
 The parameterization of your bridge module can be captured in two places.
 
-#. Target-side: Here parameterization information is provided both as free
+#. **Target side.** Here parameterization information is provided both as free
    parameters to the target's generator, and extracted from the context in
-   which the Bridge is instantiated. The latter might include things like width
+   which the bridge is instantiated. The latter might include things like widths
    of specific interfaces or bounds on the behavior the target might expose to
-   the Bridge (e.g., a maximum number of inflight requests). All of this
-   information must be captured in a single serializable constructor argument,
-   generally a case class (see Endpoint.constructorArg).
+   the bridge (e.g., a maximum number of inflight requests). All of this
+   information must be captured in a *single* serializable constructor argument,
+   generally a case class (see ``Bridge.constructorArg``).
 
-#. Host-side: This is parameterization information captured in Golden Gate's
-   Parameters object.  This should be used to provide host-land implementation
-   hints (that don't change the simulated behavior of the system), or to
+#. **Host side.** This is parameterization information captured in Golden Gate's
+   ``Parameters`` object.  This should be used to provide host-land implementation
+   hints (that ideally don't change the simulated behavior of the system), or to
    provide arguments that cannot be serialized to the annotation file.
 
-
 In general, if you can capture target-behavior-changing parameterization information from
 the target-side you should. This makes it easier to prevent divergence between
-a RTL simulation and FireSim simulation of the same FIRRTL. It's also easier to
-configure multiple instances of the same type of bridge from the target-side.
+a software RTL simulation and FireSim simulation of the same FIRRTL. It's also easier to
+configure multiple instances of the same type of bridge from the target side.

From 0f007704207ccdecdb4b3ccd1be4b0beb1a1fc93 Mon Sep 17 00:00:00 2001
From: David Biancolin <david.biancolin@gmail.com>
Date: Mon, 11 Nov 2019 16:06:32 -0800
Subject: [PATCH 2/4] Update paper links in README.md

---
 README.md | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 75b5f28c..0cfb9e7e 100644
--- a/README.md
+++ b/README.md
@@ -98,13 +98,15 @@ Our paper from FPGA 2019 details the DRAM model used in FireSim:
 
 > David Biancolin, Sagar Karandikar, Donggyu Kim, Jack Koenig, Andrew Waterman, Jonathan Bachrach, Krste Asanović, **FASED: FPGA-Accelerated Simulation and Evaluation of DRAM**, In proceedings of the 27th ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, Seaside, CA, February 2018.
 
-[Paper PDF](https://people.eecs.berkeley.edu/~biancolin/papers/fased-fpga19.pdf)
+[Paper PDF](https://people.eecs.berkeley.edu/~biancolin/papers/fased-fpga19.pdf) |
+[ACM DL](https://dl.acm.org/citation.cfm?id=3293894) |
+[BibTeX](https://people.eecs.berkeley.edu/~biancolin/bib/fased-fpga19.bib)
 
 ### ICCAD 2019: Golden Gate: Bridging The Resource-Efficiency Gap Between ASICs and FPGA Prototypes
 
 Our paper describing FireSim's Compiler, _Golden Gate_:
 
-> Albert Magyar, David T. Biancolin, Jack Koenig, Sanjit Seshia, Jonathan Bachrach, Krste Asanović, **Golden Gate: Bridging The Resource-Efficiency Gap Between ASICs and FPGA Prototypes**, To appear at ICCAD '19.
+> Albert Magyar, David T. Biancolin, Jack Koenig, Sanjit Seshia, Jonathan Bachrach, Krste Asanović, **Golden Gate: Bridging The Resource-Efficiency Gap Between ASICs and FPGA Prototypes**, *In proceedings of the 39th International Conference on Computer-Aided Design (ICCAD '19)*, Westminster, CO, November 2019.
 
 [Paper PDF](https://davidbiancolin.github.io/papers/goldengate-iccad19.pdf)
 

From ae4328c2e61ec1dc9d1736691f8de55fce077c3b Mon Sep 17 00:00:00 2001
From: Sagar Karandikar <sagark@eecs.berkeley.edu>
Date: Mon, 23 Dec 2019 19:33:46 +0530
Subject: [PATCH 3/4] Update First-time-AWS-User-Setup.rst

---
 .../First-time-AWS-User-Setup.rst             | 23 ++++++-------------
 1 file changed, 7 insertions(+), 16 deletions(-)

diff --git a/docs/Initial-Setup/First-time-AWS-User-Setup.rst b/docs/Initial-Setup/First-time-AWS-User-Setup.rst
index ae553774..ccb5a371 100644
--- a/docs/Initial-Setup/First-time-AWS-User-Setup.rst
+++ b/docs/Initial-Setup/First-time-AWS-User-Setup.rst
@@ -35,27 +35,18 @@ Follow these steps to do so:
 
 https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-resource-limits.html
 
-You'll probably want to start out with the following requests, depending on your existing limits:
-
-Request 1:
+You'll probably want to start out with the following request, depending on your existing limits:
 
 ::
 
-    Region:                US East (Northern Virginia)
-    Primary Instance Type: f1.2xlarge
-    Limit:                 Instance Limit
-    New limit value:       1
+    Limit Type:                EC2 Instances
+    Region:                    US East (Northern Virginia)
+    Primary Instance Type:     All F instances
+    Limit:                     Instance Limit
+    New limit value:           64
 
-Request 2:
 
-::
-
-    Region:                US East (Northern Virginia)
-    Primary Instance Type: f1.16xlarge
-    Limit:                 Instance Limit
-    New limit value:       1
-
-This allows you to run one node on the ``f1.2xlarge`` or eight nodes on the
+This limit of 64 vCPUs for F instances allows you to run one node on the ``f1.2xlarge`` or eight nodes on the
 ``f1.16xlarge``.
 
 For the "Use Case Description", you should describe your project and write

From 2d441af957b63a22f4c5019442b1c0289e5e7372 Mon Sep 17 00:00:00 2001
From: Sagar Karandikar <sagark@eecs.berkeley.edu>
Date: Mon, 23 Dec 2019 19:55:39 +0530
Subject: [PATCH 4/4] Update First-time-AWS-User-Setup.rst

---
 docs/Initial-Setup/First-time-AWS-User-Setup.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/docs/Initial-Setup/First-time-AWS-User-Setup.rst b/docs/Initial-Setup/First-time-AWS-User-Setup.rst
index ccb5a371..1be3b8fe 100644
--- a/docs/Initial-Setup/First-time-AWS-User-Setup.rst
+++ b/docs/Initial-Setup/First-time-AWS-User-Setup.rst
@@ -29,9 +29,9 @@ Requesting Limit Increases
 
 In our experience, new AWS accounts do not have access to EC2 F1 instances by
 default. In order to get access, you should file a limit increase
-request.
+request. You can learn more about EC2 instance limits here: https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-on-demand-instances.html#ec2-on-demand-instances-limits
 
-Follow these steps to do so:
+To request a limit increase, follow these steps:
 
 https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-resource-limits.html