From 039324b8300499ab1fb15684f7b8fe2774ce4bcc Mon Sep 17 00:00:00 2001
From: Chris Lattner <clattner@nondot.org>
Date: Sat, 25 Sep 2021 18:24:44 -0700
Subject: [PATCH] [HW Docs] Write rationale for HW modules, parameters and
 other things.

This resolves Issue 1857
---
 docs/RationaleHW.md | 274 ++++++++++++++++++++++++++++++++++++++------
 llvm                |   2 +-
 2 files changed, 241 insertions(+), 35 deletions(-)
diff --git a/docs/RationaleHW.md b/docs/RationaleHW.md
index 6da17e7e3e..ee55b6f76d 100644
--- a/docs/RationaleHW.md
+++ b/docs/RationaleHW.md
@@ -13,8 +13,13 @@ Rationale](RationaleSV.md).
   - [Introduction to the `hw` Dialect](#introduction-to-the-hw-dialect)
   - [`hw` Type System](#hw-type-system)
   - [`hw.module` and `hw.instance`](#hwmodule-and-hwinstance)
-    - [Parameterized Modules](#parameterized-modules)
     - [Instance paths](#instance-paths)
+  - [Parameterized Modules](#parameterized-modules)
+    - [Valid Parameter Expression Attributes](#valid-parameter-expression-attributes)
+    - [Parameter Expression Canonicalization](#parameter-expression-canonicalization)
+    - [Using parameters in the body of a module](#using-parameters-in-the-body-of-a-module)
+    - [Parameterized Types](#parameterized-types)
+    - [Answers to other common questions](#answers-to-other-common-questions)
   - [Type declarations](#type-declarations)
   - [Symbols and Visibility](#symbols-and-visibility)
   - [Future Directions](#future-directions)
@@ -76,58 +81,259 @@ Verilog.
 The basic structure of a hardware designed is made up an "instance tree" of
 "modules" and "instances" that refer to them.  There are loose analogies to
 software programs which have corresponding "functions" and "calls" (but there
-are also major differences, see "Instance paths" below).  A module in the `hw`
-dialect has several major components:
+are also major differences, see "[Instance paths](#instance-paths)" below).
+A module can be a definition with a body (`hw.module`), a declaration of an
+external module whose signature is known but whose body is provided separately
+(`hw.module.extern`), or a declaration of an external module with a known
+signature that can/will be generated on demand in the future
+(`hw.module.generated`).
 
-1) A symbol `name` which specifies the MLIR name for the module.
-2) input ports, as bb arguments.
-3) result ports, with `hw.output`
-4) parameters (described below)
-5) other attributes.
+A simple example module looks like this (many more can be found in the
+testsuite):
 
-`hw` is generally designed to be permissive in what it allows for the types of ports.
+```mlir
+hw.module @two_and_three(%in: i4) -> (twoX: i4, threeX: i4) {
+  %0 = comb.add %in, %in : i4
+  %1 = comb.add %in, %0 : i4
+  hw.output %0, %1 : i4, i4
+}
+```
 
-TODO: Spotlight on module.  Allows arbitrary types for ports.
+The signature of a module has these major components:
 
-**Zero Bit Integer Ports**
+1) A symbol `name` which specifies the MLIR name for the module
+   (`@two_and_three` in the example above).  This is what connects instances to
+   modules in a stable way.
+2) A list of input ports, each of which has a type (`%in: i4` in the example
+   above).  Each input port is available as an SSA value through a block
+   argument in the entry block of an `hw.module`, allowing them to be used
+   within its body.  "inout" ports are modeled as inputs with an `!hw.inout<T>`
+   type.  Input port names are prefixed with a `%` because they are available
+   as SSA values in the body.
+3) A list of result port names and types (`twoX: i4` and `threeX: i4` in the
+   example above).  In a `hw.module` definition, the values for the
+   results are provided by the operands to the `hw.output` terminator in the
+   body block.  The names of result ports are not prefixed with `%` because
+   they are not MLIR SSA values.
+4) A list of module "parameters", which provide parametric polymorphism
+   capabilities (somewhat similar to C++ templates) for modules.  These are
+   described in more detail in the "[Parameterized
+   Modules](#parameterized-modules)" section below.
+5) The `verilogName` attribute can be used to override the name for an external
+   module.  We hope to eliminate this in the future and just use the symbol.
+6) Other ad-hoc attributes.  The `hw` dialect is intended to allow open
+   extensibility by other dialects.  Ad-hoc attributes put on `hw` dialect 
+   modules should be namespace qualified according to the dialect they come
+   from to avoid conflicts.
 
-Certain operations support zero bit declarations:
+This definition is fairly close to the Verilog family, but there are some
+notable differences.  For example:
 
- - The `hw.module` operation allows zero bit ports, since it supports an open
-   type system.  They are dropped from Verilog emission.
- - Interface signals are allowed to be zero bits wide.  They are dropped from
-   Verilog emission.
-
-
-
-### Parameterized Modules
-
-TODO: describe this.
-
- - Why not use SSA graph.
- -  Why not default arguments.
- - Parameterized types.
- - Concat types.
- - Canonicalization of affine expressions.
- - how to go from parameter space to value space with localparam etc.
+ - We split output ports from input ports, and use `hw.output` instead of
+   connects to specify the results.  This allows better SSA dataflow
+   analysis from the `hw.output`, which is useful for inter-module analyses.
+ - We allow arbitrary types for module ports.  The `hw` dialect is generally
+   designed to be extensible by other dialects, and thus being permissive here
+   is useful.  That said, the [Verilog exporter](VerilogGeneration.md) does not
+   support arbitrary user-defined types.
+ - The `comb` dialect in particular does not use signed integer types, and its
+   operators do not support zero-width integer types.  Modules, on the other
+   hand, do support both of these.  Zero width ports are omitted (printed as
+   comments) when generating Verilog.
 
 ### Instance paths
 
 An IR for Hardware is different than an IR for Software in a very important way:
 while each function in a software program usually compiles into one blob of
 binary code no matter how many times it is called, each instance in a hardware
-design is typically fully instantiated, because different instances turn into
+design is typically fully instantiated, because different instances turn into
 different gates.  The consequence of this is that the instance tree is really a
 compression mechanism that is eventually elaborated away.
 
 This compression approach has major advantages: it is much better for memory
 and compile time to represent a single definition of a hardware block than the 
 (possibly thousands or millions) of concrete instances that will eventually be
-required.  However, hardware engineers do need to reason about and control the
-different instances in some cases (e.g. providing physical layout constraints
-for one instance but not the rest).
+required.  However, hardware engineers often do need to reason about and control
+the different instances (e.g. providing physical layout constraints for one
+instance but not the rest).
 
-TODO: Bake out a design for this.
+TODO: Bake out a design for instance path references, an equivalent to the
+FIRRTL dialect `InstanceGraph` type, etc.
+
+## Parameterized Modules
+
+The `hw` dialect supports parametric "compile-time" polymorphism for modules.
+This allows for metaprogramming along the instance tree, guaranteed
+"instantiation time" optimizations and code generation, further enables
+the "IR compression" benefits of using instances in the first place, and enables
+the generation of parameters in generated Verilog (which can increase the
+perceived readability of the generated code).
+
+Parameters are declared on modules (including generated and external ones)
+with angle brackets: each parameter has a name and type, and can optionally
+have a default value.  Instances of a parameterized module provide a value for
+each parameter (even defaulted ones) in the same order:
+
+```mlir
+// This module has two parameters "p1" and "p2".
+hw.module.extern @parameterized<p1: i42 = 17, p2: i1>(%in: i8) -> (out: i8)
+
+hw.module @UseParameterized(%a: i8) -> (ww: i8) {
+  %r0 = hw.instance "inst" @parameterized<p1: i42 = 17, p2: i1 = 1>(in: %a: i8) -> (out: i8)
+  hw.output %r0 : i8
+}
+```
+
+This approach makes analysis and transformation of the IR simple, predictable,
+and efficient: because the parameter lists on instances and on modules always
+line up, they are indexable by integers (instead of strings), intermodule
+analysis is straightforward (no filling in of default values etc), and
+Verilog generation is always predictable: a parameter is omitted at the
+instance site when the instance value and the module default are the same (e.g.
+in the example above, `p1` is not printed at the instance site because it is
+the same as the default).
+
+### Valid Parameter Expression Attributes
+
+The following attributes may be used as expressions involving parameters at
+an instance site or in the default value for a parameter declaration on a
+module:
+
+- IntegerAttr/FloatAttr/StringAttr constants may be used as simple leaf values.
+- The `#hw.param.decl.ref` attribute is used to refer to the value of a
+  parameter in the current module.  This is valid in most positions where a
+  parameter attribute is used, except in a module parameter's default value.
+- The `#hw.param.binary` operator allows combining other parameter expressions
+  into an expression tree.  Expression trees have important canonicalization
+  rules to ensure important cases are canonicalized to uniquable
+  representations.
+- `#hw.param.verbatim<"some string">` may be used to provide an opaque blob of
+  textual Verilog that is uniqued by its string contents.  This is intended
+  as a general "escape hatch" for frontend authors.  CIRCT does not provide any
+  checking to ensure that this is correct or safe, and assumes it is a single
+  expression; if it is not, parenthesize the string contents to be safe.
+  This [should eventually support
+  substitutions](https://github.com/llvm/circt/issues/1881) like
+  `sv.verbatim.expr`.
+
+Because parameter expressions are MLIR attributes, they are immortal values
+that are uniqued based on their structure.  This has several important
+implications, including:
+
+- A parameter reference (`#hw.param.decl.ref`) to a parameter `x` doesn't know
+  what module it is in.  The verifier checks that parameter expressions are
+  valid within the body of a module, and that the types line up between the
+  parameter reference and the declaration (after all, two different modules can
+  have two different parameters named `x` with different types).
+- We want to depend on MLIR canonicalizing and uniquing the pointer address of
+  attributes in a predictable way to ensure that further derived uniqued objects
+  (e.g. a parameterized integer type) are also uniqued correctly.  For example,
+  we do not want the types `hw.int<x+1>` and `hw.int<1+x>` to turn into
+  different types.  See the [Parameter Expression
+  Canonicalization](#parameter-expression-canonicalization) section below for
+  more details.
+- Whereas the rest of the `hw` dialect is generally open for extension, the
+  current grammar of attribute expressions is closed: you have to hack the
+  HW dialect verifier and VerilogEmitter to add new kinds of valid expressions.
+  This is considered a limitation; we'd like to move to an attribute interface
+  at some point that would allow dialect-defined attributes.  For example, this
+  would allow moving the `hw.param.verbatim` attribute down to the `sv` dialect.
+
+### Parameter Expression Canonicalization
+
+As mentioned above, it is important to canonicalize parameter expressions.  This
+slightly reduces memory usage, but more importantly ensures that equivalent
+parameter expressions are pointer equivalent: we don't want `x+1` and `1+x` to
+be different, because that would cause everything derived from them to be as
+well.
+
+On the other hand, we expect to support a lot of weird expressions over time (at
+least the full complement that Verilog supports) and canonicalizing arbitrary
+expressions in a predictable way is untenable.  As such, we support
+canonicalizing a fixed set of expressions predictably: more may be added in
+the future.
+
+This set includes:
+
+ - TODO: None yet. :-)
+
+### Using parameters in the body of a module
+
+Parameters are not SSA values, so they cannot directly be used within the body
+of the module.  To project them with a specific name, you can use the
+`sv.localparam` declaration like so:
+
+```mlir
+hw.module @M1<param1: i1>(%clock : i1, ...) {
+  ...
+  %param1 = sv.localparam : i1 { value = #hw.param.decl.ref<"param1">: i1 }
+  ...
+    sv.if %param1 {  // Compile-time conditional on parameter.
+      sv.fwrite "Only happens when the parameter is set\n"
+    }
+  ...
+}
+```
+
+Alternatively, if you don't want to introduce a local name, you can use a 
+**TODO**: yet-to-be-implemented new op.
+
+### Parameterized Types
+
+TODO: Not done yet.
+
+### Answers to other common questions
+
+During the design work on parameterized modules, we had several proposals for
+alternative designs and a lot of discussion about them.  See, in particular,
+these discussions at the open design meetings:
+
+ - [September 15, 2021](https://docs.google.com/document/d/1fOSRdyZR2w75D87yU2Ma9h2-_lEPL4NxvhJGJd-s5pk/edit#heading=h.gdy95njn5105):
+   discussion about using SSA values vs attributes for expressions, whether
+   parameters should just be a "special kind of port" etc.
+ - [September 22, 2021](https://docs.google.com/document/d/1fOSRdyZR2w75D87yU2Ma9h2-_lEPL4NxvhJGJd-s5pk/edit#heading=h.tcwfqa9fi7u2):
+   discussions on expression canonicalization, parameterized type casting and
+   other topics.
+
+This section tries to condense some of those discussions into key points:
+
+**Why do instances repeat default parameters from modules?**
+
+As described above, the full set of module parameters is specified on an
+instance, even if some have default values.  The reason for this is that we want
+the IR to be simple and efficient to analyze by the compiler: keeping (and
+verifying) instance parameters in canonical form means that we can
+index them with integers instead of names (just like module input and result
+ports), and intermodule analysis/optimization doesn't have to handle default
+values as a special case.  Instead they are just a matter for frontends and
+the Verilog exporter to care about.
+
+**Why model parameters with attributes instead of SSA values?**
+
+It seems unfortunate to replicate some parts of the `comb` dialect (e.g.
+`comb.add`) as attributes rather than just reusing the existing operations.
+Such a design has historical familiarity (e.g. LLVM's `ConstantExpr` class)
+which led to a bunch of complexity in LLVM that would have been better avoided
+(and yes - there are much better designs for LLVM's purposes than what it has
+now).
+
+All that said, using attributes is the right thing for a number of reasons:
+
+1) This arithmetic happens at metaprogramming time; these ops do not turn into
+   hardware.  It is important and useful to be able to know that structurally.
+2) We need to verify parameter expressions are valid for the module they are
+   defined in - it isn't generally ok for the verifier of the `hw.instance` op
+   to walk an arbitrary amount of IR to check that an SSA value is valid as a
+   parameter.
+3) We need to support parameterized types like `!hw.int<n>`: because MLIR types
+   are immortal and uniqued, they can refer to attributes but cannot refer to
+   SSA values (which may be destroyed).
+4) Operations need to be able to compute their own type without creating other
+   operations.  For example, we need to compute that the result type of
+   `comb.concat %a, %b : (i1, !hw.int<n>)` is `!hw.int<n+1>` without introducing
+   a new `comb.add` node to "add one to n".
+5) In practice, comb ops and the canonicalizations that apply to them have very
+   different goals than the canonicalizations we apply to parameter expressions.
 
 ## Type declarations
 
diff --git a/llvm b/llvm
index 47cc166bc0..7acd1807dd 160000
--- a/llvm
+++ b/llvm
@@ -1 +1 @@
-Subproject commit 47cc166bc023b497bdffe0964d80f15eaee8b7da
+Subproject commit 7acd1807dd6899441cff9e1246155379971352fb