FIRRTL Dialect Rationale
This document describes various design points of the FIRRTL dialect, why it is the way it is, and current status and progress. This follows in the spirit of other MLIR Rationale docs.
Introduction
The FIRRTL project is an existing open source compiler infrastructure used by the Chisel framework to lower ".fir" files to Verilog. It provides a number of useful compiler passes and infrastructure that allows the development of domain specific passes. The FIRRTL project includes a well documented IR specification that explains the semantics of its IR, an ANTLR grammar that includes some extensions beyond it, and a compiler implemented in Scala which we refer to as the Scala FIRRTL Compiler (SFC).
The FIRRTL dialect in CIRCT is designed to provide a drop-in replacement for the SFC for the subset of FIRRTL IR that is produced by Chisel and in common use. The FIRRTL dialect also provides robust support for SFC Annotations.
To achieve these goals, the FIRRTL dialect follows the FIRRTL IR specification and the SFC implementation almost exactly. Where the FIRRTL specification allows for undefined behavior, the FIRRTL dialect and its passes choose the SFC interpretation of that undefined behavior. The small deviations we do make are discussed below. Early versions of the FIRRTL dialect deviated heavily from FIRRTL IR and the SFC (see the Not Canonicalizing Flip Types section below). These deviations, while elegant, led to difficult-to-resolve mismatches with the SFC and the inability to verify FIRRTL IR. The remaining small deviations introduced in the FIRRTL dialect simplify the CIRCT implementation of a FIRRTL compiler and take advantage of MLIR's various features.
This document generally assumes that you've read and have a basic grasp of the FIRRTL IR spec, and it can be occasionally helpful to refer to the ANTLR grammar.
Status
The FIRRTL dialect and FIR parser are a generally complete implementation of the FIRRTL specification and are actively maintained, tracking new enhancements. The FIRRTL dialect supports some undocumented features and the "CHIRRTL" flavor of FIRRTL IR that is produced from Chisel. The FIRRTL dialect has support for parsing an SFC Annotation file consisting of only local annotations and converting this to operation or argument attributes. Non-local annotation support is planned, but not implemented.
There are some exceptions to the above:
- We don't support the Fixed types for fixed point numbers, and some primitives associated with them.
- We don't support Interval types.
Some of these may be research efforts that didn't gain broad adoption, in which case we don't want to support them. However, if there is a good reason and a community that would benefit from adding support for these, we can do so.
Naming
One of the goals of the FIRRTL compiler is to produce human-readable Verilog which can be easily mapped back to the input FIRRTL. Part of this effort means that we want the names of Verilog objects to match the names used in the original FIRRTL, and to be transformed predictably during lowering. For example, after bundles are replaced with scalars in the lower-types pass, each field should be prefixed with the bundle name:
circuit Example
  module Example
    reg myreg: { a :UInt<1>, b: UInt<1> }, clock
; firrtl-lower-types =>
circuit Example
  module Example
    reg myreg_a: UInt<1>, clock
    reg myreg_b: UInt<1>, clock
The name transformations applied by the SFC have become part of the documented API, and people rely on the final names to take a certain form.
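The prefixing rule can be sketched in a few lines of Python. This is a minimal illustration, not the lower-types pass itself: the `flatten` helper and the dict-based model of a bundle are hypothetical, and vectors and flips are ignored.

```python
# Hypothetical model: a bundle type is a dict of field name -> type,
# and a ground type is a string such as "UInt<1>".
def flatten(name, ty):
    """Yield (flattened_name, ground_type) pairs for one declaration."""
    if isinstance(ty, dict):
        for field, field_ty in ty.items():
            # Each field is prefixed with the enclosing name and "_".
            yield from flatten(f"{name}_{field}", field_ty)
    else:
        yield name, ty


lowered = list(flatten("myreg", {"a": "UInt<1>", "b": "UInt<1>"}))
print(lowered)  # [('myreg_a', 'UInt<1>'), ('myreg_b', 'UInt<1>')]
```

Nested bundles follow the same rule recursively, so a field `c` inside field `b` of `myreg` would become `myreg_b_c` in this sketch.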
There are names for temporaries generated by the Chisel and FIRRTL tooling which are not important to maintain. These names are discarded when parsing, which saves memory during compilation. New names are generated at Verilog export time, which has the effect of renumbering intermediate value names. Names generated by Chisel typically look like _T_12, and names generated by the SFC look like _GEN_12. The FIRRTL compiler will not discard these names if the object has an array attribute annotations containing the attribute {class = "firrtl.transforms.DontTouchAnnotation"}.
It is common for EDA tools to hide or optimize away entities which have a name beginning with an _. FIRRTL considers these names precious (excluding FIRRTL temporary names) and will maintain them.
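As a rough sketch of the name-retention rule, the following Python models the decision the parser makes for each name. The regular expression is an assumption based only on the two examples `_T_12` and `_GEN_12`, and `keep_name` is a hypothetical helper, not a CIRCT function.

```python
import re

# Assumed patterns, inferred only from the examples _T_12 and _GEN_12.
TEMPORARY_NAME = re.compile(r"_(T|GEN)_\d+")
DONT_TOUCH = "firrtl.transforms.DontTouchAnnotation"


def keep_name(name, annotations=()):
    """Return True if the name should survive parsing."""
    # A DontTouchAnnotation pins the name even if it looks temporary.
    if any(anno.get("class") == DONT_TOUCH for anno in annotations):
        return True
    # Other leading-underscore names are precious; only temporaries are dropped.
    return TEMPORARY_NAME.fullmatch(name) is None


print(keep_name("_T_12"))                           # False
print(keep_name("_T_12", [{"class": DONT_TOUCH}]))  # True
print(keep_name("_precious"))                       # True
```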
Type system
Not using standard types
At one point we tried to use the integer types in the standard dialect, like si42 instead of !firrtl.sint<42>, but we backed away from this. While it originally seemed appealing to use those types, FIRRTL operations generally need to work with "unknown width" integer types (i.e. !firrtl.sint).
Having the known width and unknown width types implemented with two different C++ classes was awkward, led to casting bugs, and prevented having a FIRRTLType class that unified all the FIRRTL dialect types.
Not Canonicalizing Flip Types
An initial version of the FIRRTL dialect relied on canonicalization of flip types according to the following rules:
1. flip(flip(x)) == x.
2. flip(analog(x)) == analog(x) since analog types are implicitly bidirectional.
3. flip(bundle(a,b,c,d)) == bundle(flip(a), flip(b), flip(c), flip(d)) when the bundle has non-passive type or contains an analog type. This forces the flip into the subelements, where it recursively merges with the non-passive subelements and analogs.
4. flip(vector(a, n)) == vector(flip(a), n) when the vector has non-passive type or analogs. This forces the flip into the element type, generally canceling it out.
5. bundle(flip(a), flip(b), flip(c), flip(d)) == flip(bundle(a, b, c, d)). Due to the other rules, the operand to a flip must be a passive type, so the entire bundle will be passive, and rule #3 won't be recursively reinvoked.
While elegant in a number of ways (e.g., FIRRTL types are guaranteed to have a canonical representation and can be compared using pointer equality, flips partially subsume port directionality and "flow", and analog inputs and outputs are canonicalized to the same representation), this resulted in information loss during canonicalization because the number of flip types can change. Namely, three problems were identified:
- Type canonicalization may make illegal operations legal.
- The flow of connections could not be verified because flow is a function of the number of flip types.
- The directionality of leaves in an aggregate could not be determined.
As an example of the first problem, consider the following circuit:
module Foo:
  output a: { flip a: UInt<1> }
  output b: { a: UInt<1> }
  b <= a
The connection b <= a is illegal FIRRTL due to a type mismatch where { flip a: UInt<1> } is not equal to { a: UInt<1> }. However, type canonicalization would transform this circuit into the following circuit:
module Foo:
  input a: { a: UInt<1> }
  output b: { a: UInt<1> }
  b <= a
Here, the connection b <= a is legal FIRRTL. This then makes it impossible for a type canonical form to be type checked.
As an example of the second problem, consider the following circuit:
module Bar:
  output a: { flip a: UInt<1> }
  input b: { flip b: UInt<1> }
  b <= a
Here, the connection b <= a is illegal FIRRTL because b is a source and a is a sink. However, type canonicalization converts this to the following circuit:
module Bar:
  input a: { a: UInt<1> }
  output b: { b: UInt<1> }
  b <= a
Here, the connect b <= a is legal FIRRTL because b is now a sink and a is now a source. This then makes it impossible for a type canonical form to be flow checked.
As an example of the third problem, consider the following circuit:
module Baz:
  wire a: {flip a: {flip a: UInt<1>}}
  wire b: {flip a: {flip a: UInt<1>}}
  b.a <= a.a
The connection b.a <= a.a, when lowered, results in the reverse connect a.a.a <= b.a.a. However, type canonicalization will remove the flips from the circuit to produce:
module Baz:
  wire a: {a: {a: UInt<1>}}
  wire b: {a: {a: UInt<1>}}
  b.a <= a.a
Here, the connect b.a <= a.a, when lowered, results in the normal connect b.a.a <= a.a.a. Type canonicalization has thereby changed the semantics of connect.
Due to the elegance of type canonicalization, we initially decided that we would use type canonicalization and CIRCT would accept more circuits than the SFC. The third problem (identified much later than the first two) convinced us to remove type canonicalization.
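The third problem can be reproduced with a small Python model of connect lowering. This is a sketch under simplifying assumptions: a bundle is modeled as a list of hypothetical `(field, flipped, type)` tuples, ground types are strings, and `leaf_connects` is an illustrative helper rather than anything in CIRCT.

```python
def leaf_connects(lhs, rhs, ty):
    """Expand `lhs <= rhs` of type `ty` into (sink, source) leaf pairs."""
    if isinstance(ty, str):  # ground type: a single leaf connect
        return [(lhs, rhs)]
    pairs = []
    for field, flipped, field_ty in ty:
        sink, source = f"{lhs}.{field}", f"{rhs}.{field}"
        if flipped:
            # A flipped field reverses the direction of the connect.
            sink, source = source, sink
        pairs += leaf_connects(sink, source, field_ty)
    return pairs


# Type of b.a and a.a in module Baz, before and after canonicalization.
with_flip = [("a", True, "UInt<1>")]
without_flip = [("a", False, "UInt<1>")]

print(leaf_connects("b.a", "a.a", with_flip))     # [('a.a.a', 'b.a.a')]
print(leaf_connects("b.a", "a.a", without_flip))  # [('b.a.a', 'a.a.a')]
```

The same source-level connect produces opposite leaf connects depending on whether the flip is still present in the type, which is the information that canonicalization discarded.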
For a historical discussion of type canonicalization see:
Flow
The FIRRTL specification describes the concept of "flow". Flow encodes additional information that determines the legality of operations. FIRRTL defines three different flows: sink, source, and duplex. Module inputs, instance outputs, and nodes are source, module outputs and instance inputs are sink, and wires and registers are duplex. A value with sink flow may only be written to, but not read from (with the exception of module outputs and instance inputs, which may also be read from). A value with source flow may be read from, but not written to. A value with duplex flow may be read from or written to.
For FIRRTL connect or partial connect statements, it follows that the left-hand-side must be sink or duplex and the right-hand-side must be source, duplex, or a port/instance sink.
Flow is not represented as a first-class type in CIRCT. We instead provide utilities for computing flow when needed, e.g., for connect statement verification.
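A flow utility of this kind might look like the following Python sketch. It only covers whole declarations (not subfields, where flips would also need to be counted), and the `DECL_FLOW` table and function names are hypothetical, chosen to mirror the rules stated above.

```python
from enum import Enum


class Flow(Enum):
    SOURCE = "source"
    SINK = "sink"
    DUPLEX = "duplex"


# Flow of each kind of declaration, as listed in the text above.
DECL_FLOW = {
    "module_input": Flow.SOURCE,
    "instance_output": Flow.SOURCE,
    "node": Flow.SOURCE,
    "module_output": Flow.SINK,
    "instance_input": Flow.SINK,
    "wire": Flow.DUPLEX,
    "register": Flow.DUPLEX,
}


def can_write(kind):
    return DECL_FLOW[kind] in (Flow.SINK, Flow.DUPLEX)


def can_read(kind):
    # Module outputs and instance inputs are sinks that may also be read.
    if kind in ("module_output", "instance_input"):
        return True
    return DECL_FLOW[kind] in (Flow.SOURCE, Flow.DUPLEX)


def connect_is_legal(lhs_kind, rhs_kind):
    """Check `lhs <= rhs` using flow alone (types are not checked here)."""
    return can_write(lhs_kind) and can_read(rhs_kind)


print(connect_is_legal("module_output", "module_input"))  # True
print(connect_is_legal("module_input", "wire"))           # False
```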
Operations
Multiple result firrtl.instance operation
The FIRRTL spec describes instances as returning a bundle type, where each element of the bundle corresponds to one of the ports of the module being instanced. This makes sense in the Scala FIRRTL implementation, given that it does not support multiple ports.
The MLIR FIRRTL dialect takes a different approach, having each element of the bundle result turn into its own distinct result on the firrtl.instance operation. This is made possible by MLIR's robust support for operations with multiple result values, and makes the IR much easier to analyze and work with.
Module bodies require def-before-use dominance instead of allowing graphs
MLIR allows regions with arbitrary graphs in their bodies, and this is used by the HW dialect to allow direct expression of cyclic graphs etc. While this makes sense for hardware in general, the FIRRTL dialect is intended to be a pragmatic infrastructure focused on lowering Chisel code to the HW dialect; it isn't intended to be a "generally useful IR for hardware".
We recommend that non-Chisel frontends target the HW dialect, or a higher level dialect of their own creation that lowers to HW as appropriate.
input and output Module Ports
The FIRRTL specification describes two kinds of ports: input and output. In the firrtl.module declaration we track this via an arbitrary precision integer attribute (IntegerAttr) where each bit encodes the directionality of the port at that index.
Originally, we encoded direction as the absence of an outer flip type (input) or presence of an outer flip type (output). This was done as part of the original type canonicalization effort which combined input/output with the type system. However, once type canonicalization was removed, the flip type was only used in three places: on the types of bundle fields, on the variadic return types of instances or memories, and on ports. The first is the same as the FIRRTL specification. The second is a deviation from the FIRRTL specification, but allowable as it takes advantage of MLIR's variadic capabilities to simplify the IR. The third was an inelegant abuse of an unrelated concept that added bloat to the type system. Many operations would have to check for an outer flip on ports and immediately discard it.
For this reason, the IntegerAttr encoding implementation was chosen.
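The one-bit-per-port encoding can be illustrated with Python's arbitrary precision integers. This is a sketch only: the helper names are hypothetical, and the choice that a set bit means "output" is an assumption made for the example rather than something stated in this document.

```python
def pack_directions(directions):
    """Pack a list of 'input'/'output' strings into one integer."""
    bits = 0
    for index, direction in enumerate(directions):
        if direction == "output":  # assumption: set bit means output
            bits |= 1 << index
    return bits


def direction_at(bits, index):
    """Recover the direction of the port at `index`."""
    return "output" if (bits >> index) & 1 else "input"


ports = ["input", "output", "output"]
bits = pack_directions(ports)
print(bin(bits))              # 0b110
print(direction_at(bits, 0))  # input
```

Because the direction lives in an attribute on the module, operations that use a port value see its plain type and never need to strip an outer flip.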
For a historical discussion of this issue and its development see:
firrtl.mem
Unlike the SFC, the FIRRTL dialect represents each memory port as a distinct result value of the firrtl.mem operation. Also, the firrtl.mem node does not allow zero port memories for simplicity. Zero port memories are dropped by the .fir file parser.
CHIRRTL Memories
FIRRTL has two different representations of memories: Chisel cmemory operations, smem and cmem, and the standard FIRRTL mem operation. Chisel memory operations exist to make it easy to produce FIRRTL code from Chisel, and closely match the Chisel API for memories. Chisel memories are intended to be replaced with standard FIRRTL memories early in the pipeline. The set of operations related to Chisel memories are often referred to as CHIRRTL.
The main difference between Chisel and FIRRTL memories is that Chisel memories have an operation to add a memory port to a memory, while FIRRTL memories require all ports to be defined up front. Another difference is that Chisel memories have "enable inference", and are usually inferred to be enabled where they are declared. The following example shows a CHIRRTL memory declaration, and the standard FIRRTL memory equivalent.
smem mymemory : UInt<4>[8]
when p:
  read mport port0 = mymemory[address], clock
mem mymemory:
  data-type => UInt<4>
  depth => 8
  read-latency => 0
  write-latency => 1
  reader => port0
  read-under-write => undefined
mymemory.port0.en <= p
mymemory.port0.clk <= clock
mymemory.port0.addr <= address
FIRRTL memory operations were created because it was thought that a concrete memory primitive, that looks like an instance, is a better design for a compiler IR. It was originally intended that Chisel would be modified to emit FIRRTL memory operations directly, and the CHIRRTL operations would be retired. The lowering from Chisel memories to FIRRTL memories proved far more complicated than originally envisioned, specifically surrounding the type of ports, inference of enable signals, and inference of clocks.
CHIRRTL operations have since stuck around, but their strange behavior has led to discussions about removing, improving, or totally redesigning them. For some current discussion about this see 1, 2. Since CIRCT is attempting to be a drop-in replacement FIRRTL compiler, we are not attempting to implement these new ideas for Chisel memories. Instead, we are trying to implement what exists today.
There is, however, a major compatibility issue with the existing implementation of Chisel memories which made them difficult to support in CIRCT. The FIRRTL specification disallows using any declaration outside of the scope where it is created. This means that a Chisel memory port declared inside of a when block can only be used inside the scope of the when block. Unfortunately, this invariant is not enforced for memory ports, and this leniency has been abused by the Chisel standard library. Due to the way clock and enable inference works, we couldn't just hoist the declaration into the outer scope.
To support escaping memory port definitions, we decided to split the memory port operation into two operations. We created a firrtl.memoryport operation to declare the memory port, and a firrtl.memoryport.access operation to enable the memory port. The following is an example of how FIRRTL translates into the CIRCT dialect:
smem mymem : UInt<1>[8]
when cond:
  infer mport myport = mymem[addr], clock
out <= myport
%mymem = firrtl.seqmem Undefined : !firrtl.cmemory<uint<1>, 8>
%myport_data, %myport_port = firrtl.memoryport Infer %mymem {name = "myport"} : (!firrtl.cmemory<uint<1>, 8>) -> (!firrtl.uint<1>, !firrtl.cmemoryport)
firrtl.when %cond {
  firrtl.memoryport.access %myport_port[%addr], %clock : !firrtl.cmemoryport, !firrtl.uint<3>, !firrtl.clock
}
firrtl.connect %out, %myport_data : !firrtl.uint<1>, !firrtl.uint<1>
For a historical discussion of this issue and its development see llvm/circt#1561.
More things are represented as primitives
We describe the mux expression as "primitive", whereas the IR spec and grammar implement it as a special kind of expression.
We do this to simplify the implementation: These expressions have the same structure as primitives, and modeling them as such allows reuse of the parsing logic instead of duplication of grammar rules.
invalid Invalidate Operation is an expression
The FIRRTL spec describes an x is invalid statement that logically computes an invalid value and connects it to x according to flow semantics. This behavior makes analysis and transformation a bit more complicated, because there are now two things that perform connections: firrtl.connect and the x is invalid operation.
To make things easier to reason about, we split the x is invalid operation into two different ops: a firrtl.invalidvalue op that takes no operands and returns an invalid value, and a standard firrtl.connect operation that connects the invalid value to the destination (or a firrtl.attach for analog values). This has the same expressive power as the standard FIRRTL representation but is easier to work with.
During parsing, we break up an x is invalid statement into leaf connections. As an example, consider the following FIRRTL module where a bi-directional aggregate, a, is invalidated:
module Foo:
  output a: { a: UInt<1>, flip b: UInt<1> }
  a is invalid
This is parsed into the following MLIR. Here, only a.a is invalidated:
firrtl.module @Foo(out %a: !firrtl.bundle<a: uint<1>, b: flip<uint<1>>>) {
  %0 = firrtl.subfield %a("a") : (!firrtl.bundle<a: uint<1>, b: flip<uint<1>>>) -> !firrtl.uint<1>
  %invalid_ui1 = firrtl.invalidvalue : !firrtl.uint<1>
  firrtl.connect %0, %invalid_ui1 : !firrtl.uint<1>, !firrtl.uint<1>
}
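The selection of leaves in this example can be sketched in Python. The sketch assumes the invalidated value is a module output (sink flow), so only leaves under an even number of flips are written; other flows would need different handling. The `(field, flipped, type)` tuple model and the `invalidated_leaves` helper are hypothetical.

```python
def invalidated_leaves(name, ty, flipped=False):
    """List the leaves that `name is invalid` connects an invalid value to.

    Assumes `name` is a module output, so a leaf is written only when it
    sits under an even number of flips.
    """
    if isinstance(ty, str):  # ground type
        return [] if flipped else [name]
    leaves = []
    for field, field_flipped, field_ty in ty:
        leaves += invalidated_leaves(
            f"{name}.{field}", field_ty, flipped ^ field_flipped
        )
    return leaves


# output a: { a: UInt<1>, flip b: UInt<1> }
port_type = [("a", False, "UInt<1>"), ("b", True, "UInt<1>")]
print(invalidated_leaves("a", port_type))  # ['a.a']
```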
validif represented as a multiplexer
The FIRRTL spec describes a validif(en, x) operation that is used during lowering from high to low FIRRTL. Consider the following example:
c <= invalid
when a:
  c <= b
Lowering will introduce the following intermediate representation in low FIRRTL:
c <= validif(a, b)
Since there is no precedent for this validif being used anywhere in the Chisel/FIRRTL ecosystem thus far, and it is instead always replaced by its right-hand operand b, the FIRRTL MLIR dialect does not provide such an operation at all. Rather, it directly replaces any validif in FIRRTL input with the following equivalent operations:
%0 = firrtl.invalidvalue : !firrtl.uint<42>
%c = firrtl.mux(%a, %b, %0) : (!firrtl.uint<1>, !firrtl.uint<42>, !firrtl.uint<42>) -> !firrtl.uint<42>
A canonicalization then folds this combination of firrtl.invalidvalue and firrtl.mux to the "high" operand of the multiplexer to facilitate downstream transformation passes.
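The two steps, replacing validif with a mux and then folding the mux, can be modeled on a toy expression tree. This Python sketch uses tuples for expressions and strings for value names; `lower_validif` and `fold_mux` are hypothetical names, not the CIRCT parser or canonicalizer.

```python
INVALID = ("invalidvalue",)  # stands in for firrtl.invalidvalue


def lower_validif(expr):
    """Rewrite validif(en, x) as mux(en, x, invalid)."""
    if isinstance(expr, tuple) and expr[0] == "validif":
        _, enable, value = expr
        return ("mux", enable, value, INVALID)
    return expr


def fold_mux(expr):
    """Fold mux(sel, high, invalid) to its high operand."""
    if isinstance(expr, tuple) and expr[0] == "mux":
        _, _select, high, low = expr
        if low == INVALID:
            return high
    return expr


lowered = lower_validif(("validif", "a", "b"))
print(lowered)            # ('mux', 'a', 'b', ('invalidvalue',))
print(fold_mux(lowered))  # b
```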
Inline SystemVerilog through verbatim.expr operation
The FIRRTL dialect offers a firrtl.verbatim.expr operation that allows for SystemVerilog expressions to be embedded verbatim in the IR. It is lowered to the corresponding sv.verbatim.expr operation of the underlying SystemVerilog dialect, which embeds it in the emitted output. The operation has a FIRRTL result type, and a variadic number of operands can be accessed from within the inline SystemVerilog source text through string interpolation of {{0}}-style placeholders.
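To make the placeholder mechanism concrete, here is a minimal Python sketch of {{N}}-style interpolation. It is only an illustration of the idea: the `interpolate` helper is hypothetical, operands are modeled as already-emitted expression strings, and the real emitter may treat edge cases differently.

```python
import re

PLACEHOLDER = re.compile(r"\{\{(\d+)\}\}")


def interpolate(text, operands):
    """Replace each {{N}} in `text` with the N-th operand's emitted text."""
    return PLACEHOLDER.sub(lambda match: operands[int(match.group(1))], text)


print(interpolate("{{0}} + {{1}}", ["a", "b"]))  # a + b
print(interpolate("$bits({{0}})", ["some_signal"]))  # $bits(some_signal)
```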
The rationale behind this verbatim operation is to offer an escape hatch analogous to asm ("...") in C/C++ and other languages, giving the user or compiler passes full control of what exactly gets embedded in the output. Usually, though, you would rather add a new operation to the IR to properly represent additional constructs.
As an example, a verbatim expression could be used to interact with yet-unsupported SystemVerilog constructs such as parametrized class typedef members:
firrtl.module @Magic (out %n : !firrtl.uint<32>) {
  %0 = firrtl.verbatim.expr "$bits(SomeClass #(.Param(1))::SomeTypedef)" : !firrtl.uint<32>
  firrtl.connect %n, %0 : !firrtl.uint<32>, !firrtl.uint<32>
}
This would lower through the other dialects to SystemVerilog as you would expect:
module Magic (output [31:0] n);
  assign n = $bits(SomeClass #(.Param(1))::SomeTypedef);
endmodule
Interpretation of Undefined Behavior
The FIRRTL Specification has undefined behavior for certain features. When in doubt, FIRRTL dialect typically chooses to implement undefined behavior in the same manner as the SFC.
Invalid
The SFC has a context-sensitive interpretation of invalid.
When an is invalid statement is used, the SFC will optimize this as a connect to a constant zero if the invalidated component is not assigned to inside a conditional block (when/else). This is an interpretation of invalid as a value that the compiler chooses to connect to a single component.
When an is invalid statement is used to specify the default of a component that is connected to in a conditional block and the conditional block is not complete, then a conditionally valid (validif) statement is generated. The conditionally valid statement connects a value when a condition is true and invalidates the component otherwise. (This is modeled as a multiplexer in the FIRRTL dialect.) When lowered, the SFC treats this invalidation as undefined behavior and will choose the valid path unconditionally. This is an interpretation of invalid as undefined behavior. (See above for more information on validif and the modeling of this as a multiplexer.)
Instead of choosing to aggressively optimize undefined behavior, FIRRTL dialect and its passes use this context-sensitive interpretation of invalid. Folds of primitive operations treat an invalid operand as a zero-valued constant. Folds of multiplexers treat invalid operands as undefined behavior and will optimize away the invalid path.
Propagation of invalid values is handled with extreme caution. Any propagation can cause a later conflation of these two interpretations of invalid and produce subtle bugs.
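The contrast between the two interpretations can be sketched with two toy folders in Python. Both functions and the `INVALID` sentinel are hypothetical, bit widths are ignored, and bitwise AND is used only as a representative primitive operation.

```python
INVALID = object()  # hypothetical marker for an invalid value


def fold_and(lhs, rhs):
    """Fold a primitive op: an invalid operand is read as constant zero."""
    lhs = 0 if lhs is INVALID else lhs
    rhs = 0 if rhs is INVALID else rhs
    if isinstance(lhs, int) and isinstance(rhs, int):
        return lhs & rhs
    return None  # not foldable


def fold_mux(high, low):
    """Fold a mux: an invalid operand is undefined, so drop that path."""
    if low is INVALID:
        return high
    if high is INVALID:
        return low
    return None  # not foldable


print(fold_and(INVALID, 5))    # 0
print(fold_mux("b", INVALID))  # b
```

In this sketch the same invalid value yields zero in one context and disappears entirely in the other, which is why propagating it across contexts is risky.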