We were having problems with the universal macOS wheels, so we
reverted to building just the x86 wheels. But the build still fails,
because these options tell cibuildwheel to build for the universal
architecture on macOS, which is no longer enabled.
Change `parseKeyword` to `parseKeywordOrString` in `StructType`, so
that struct field names that aren't valid MLIR keywords can still be
used. They now print as quoted strings instead of crashing any MLIR tool
trying to read the type back in.
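A minimal sketch of the round-trip this enables; the `hw.struct` spelling here is an assumption about which dialect's `StructType` is meant:

```mlir
// "a-b" is not a valid MLIR keyword, so it now round-trips as a
// quoted string instead of failing to parse.
!hw.struct<"a-b": i1, valid: i1>
```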
Double-precision floating-point type.
Add attribute, constant op, and parse + emit support.
For now, the lexer only supports what we understand as floating point, which includes exponent notation.
Add support to LowerClasses.
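A hypothetical sketch of what a double-precision constant with exponent notation might look like in IR; the op name and type spelling here are assumptions, not the actual syntax:

```mlir
// Hypothetical syntax: a double-precision constant using exponent
// notation, which the lexer now accepts as a float literal.
%c = om.constant 1.25e-3 : f64
```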
`ModuleType::getInputTypes` returns inout types for inout ports, but `getInputType` doesn't wrap the type in inout. This is inconsistent, and the Python API `module_type.input_types` returns the wrong type for such ports.
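To illustrate the inconsistency, a minimal sketch (the extern-module port syntax is an assumption):

```mlir
// A module with one inout port.
hw.module.extern @M(inout %a : i8)
// Sketch of the mismatch: for this port, getInputTypes() yields the
// wrapped !hw.inout<i8>, while getInputType(0) yields the bare i8,
// so the two accessors disagree about the port's type.
```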
Ibis dataflow methods share the same interface as an `ibis.method` but without imperative CFG-based control flow. Instead, this method implements a graph region, and control flow is expected to be defined by dataflow operations.
A couple of minor refactors - more will come - to allow using `CFToHandshake` on things other than just `func.func` as the source operation and `handshake.func` as the target operation.
Previously, `instance_like_impl::verifyReferencedModule` and `instance_like_impl::getReferencedModule` had different semantics for where the referenced module is located:
`getReferencedModule` uses the nearest `ModuleOp` as the target symbol table, whereas `verifyReferencedModule` uses the nearest symbol table. As a result, `verifyReferencedModule` fails if the `hw.instance` op is nested within another symbol-table operation.
e.g. the following would fail:
```mlir
module {
  hw.module.extern @Foo()
  ibis.container @C { // could be any SymbolTable-defining op. Use builtin.module for simplicity.
    %this = ibis.this @C
    hw.instance "foo" @Foo() -> ()
  }
}
```
Discovering AppIDs used to be a pass and only worked on `msft.module`s. Since we're removing those, make AppID discovery an API which creates dynamic instances from AppIDs. Since it'll primarily be used from PyCDE, expose it to Python and test through Python.
Will remove the discover appid pass in a subsequent PR. Progress on #6109.
The `NameCollector` is only used to compute declaration word and type
string length such that we can format declarations nicely. Currently it
also asserts a weird invariant that `PrepareForEmission` must uphold,
but then it never actually relies on that invariant. And for some reason
the `NameCollector` also collects names in else branches of `IfOp`s, but
not in then branches.
Push `NameCollector` more into the best-effort cosmetics direction it's
already going by removing that assert and not visiting else branches.
Either we visit both branches of an if, or none at all. And since if
blocks introduce a separate scope with a separate place for decls,
visiting them is not necessary. (It would just potentially add
whitespace to the decl stack of the if's surrounding scope, to account
for decls that are actually emitted within the if scope.)
This also uncovered one instance of `maxDeclNameWidth - x` that was left
unguarded by `if (x < maxDeclNameWidth)`. Fix this.
This change doesn't affect the output, but makes ExportVerilog less
brittle by not enforcing invariants that it doesn't actually care about.
Use the PrettyPrinter CallbackToken API to record print events and store the
Verilog output locations for each operation, recording the print begin and end
location of each op. This has a side effect of updating the IR: the location
attribute of the op is updated to include the Verilog location range. `FusedLoc`
is used to record the location ranges, and each op can correspond to multiple
Verilog locations.
This feature is disabled by default behind the emission flag `emitVerilogLocations`.
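A sketch of the resulting location attribute, under the assumption that the original location and the emitted Verilog begin/end locations are merged into one `FusedLoc` (file names here are made up):

```mlir
// Before emission, an op carries only its source location:
//   %0 = ... loc("design.fir":10:4)
// After emission with emitVerilogLocations, the original location is
// fused with the emitted Verilog range:
//   %0 = ... loc(fused["design.fir":10:4, "out.sv":42:3, "out.sv":42:27])
```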
Adds support for `CallSiteLoc` and `NameLoc` locations in ExportVerilog.
I tried to do this without refactoring the whole location printing infra too much, although some refactoring was due.
Location printing has been modified such that:
1. There is a top-level dispatch function which dispatches `Location` printing to location type-specific print functions
2. No more passing around of `std::string`s. Instead, location emitter functions take a `llvm::raw_string_ostream`.
3. `FileLineColLoc` locations are still uniqued and printed in sorted order, as before. However, any adjacent locations that are not `FileLineColLoc`s will be printed without any custom uniquing (apart from `Attribute` uniquing).
Location printing is still inline, i.e. without newlines. I can easily imagine that this can become a bit unwieldy for complex stack traces. However, deciding how to split locations to multiple lines seems like a separate issue.
Too strong an assumption - changed to `dyn_cast` because we cannot assume that _all_ nodes are `HWModuleLike`; there may be cases where `hw.module`s are mixed with other ops that implement `igraph::ModuleOpInterface`.
The prior motivation for a single return value was to enforce... well, exactly that. However, this is nothing but a pain, requiring the user to tuple/struct-ify return values at a high level. This is a trivial transformation that can be done down the line. Allowing multiple return values, however, lets this op play much more nicely with other passes and code that expects function-like operations (and the terminators of said ops) to, generally speaking, accept multiple return values.
Implement the python bindings for the `om.integer` attribute. Add support to
create `om::IntegerAttr` from `mlir::IntegerAttr`. Also ensure all tests use
`om::IntegerAttr` instead of `mlir::IntegerAttr`.
`argNames` and `resultNames` attributes were removed from HWModule (https://github.com/llvm/circt/pull/6095) in favor of ModuleType, but downstream Python tools rely on these attributes for name lookup. However, there is currently no CAPI/Python API on ModuleType to query port names, so this PR adds a CAPI and bindings respectively, in the same way as for port types.
Inlines `ibis.sblock` operations, by creating MLIR blocks and `cf` operations, while adding annotations to the parent operation about `sblock`-specific attributes.
This is already supported for propassign (object -> anyref);
allow it in List/Map expressions as well.
These types continue to be invariant w.r.t. their type parameters
once created; add a test that propassign rejects this.
This PR restricts the clock gate op to solely use the clock type.
Uses in other dialects, especially Pipeline, were adjusted.
To maintain canonicalization behaviour, a clock constant op, along with a corresponding attribute, is also introduced to represent constant clocks and to fold ops to them.
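A sketch of how this might look in the `seq` dialect; the constant-clock op name and the gate's operand order are assumptions:

```mlir
// Hypothetical constant clock, represented by the new op + attribute.
%low = seq.const_clock low
// The clock gate now only accepts !seq.clock operands, not i1.
%gated = seq.clock_gate %clk, %enable
// Canonicalization can fold a gate of a constant-low clock back to
// the constant clock itself.
```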
... by requiring that handshake ops are nested within an operation that inherits the `FineGrainedDataflowRegionOpInterface`.
This is somewhat of a half-way solution, seeing as there isn't support for `HasParent` with an interface upstream. I've raised the issue and suggested a fix here https://github.com/llvm/llvm-project/pull/66196 but we'll see how long that takes to resolve.
Until then, I've added a `HasParentInterface` which does the same thing, albeit with a cryptic error message about which interface the parent op lacks (note: the whole issue here is that no name literal is generated for op interfaces).
I'll be monitoring the upstream work and will switch this over once it resolves. For now, the motivation for adding this to CIRCT is to unblock me in using handshake outside of a `handshake.func` while still restricting where handshake ops can be used - i.e. I don't want to completely lift the `HasParent` restriction - users should still explicitly opt into the fact that "using handshake => handshake ops are in a fine-grained dataflow region".
Make the `LowerState` pass allow operations to remain in the top-level
`arc.model` op after state lowering. This is necessary for lowering the
model op into an `eval` function in the future. Make use of this new
flexibility by inserting logic into the model that detects edges on the
clocks of the `arc.clock_tree` ops. The clock trees no longer trigger on
the clock itself, but are given an "enable" signal that indicates
whether a clock edge has been observed.
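The inserted edge detection can be sketched in terms of `comb` ops; how the previously observed clock value is stored in the model is an assumption here:

```mlir
// %prev holds the clock value observed at the end of the last eval.
%changed = comb.xor %clk, %prev : i1
// A rising edge is a change whose new value is 1.
%posedge = comb.and %changed, %clk : i1
// %posedge drives the clock tree's "enable" instead of the raw clock.
```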
In the future, we'll want to schedule the ops in the `arc.model` and
lower it to a separate `eval` function, instead of throwing it away. In
doing so the user will no longer have to manually call clock functions,
but can call a singular `eval` function instead. A centralized function
that performs model execution will also allow us to properly simulate
clock edges occurring at the same time -- something which is impossible
today.
Together with the `arc.clock_domain` op, this `eval` function will make
the entire clock detection and grouping a performance optimization
instead of a required transformation. Theoretically, even if we did not
separate state with the same clock into dedicated clock functions, we'll
still be able to generate an `eval` function, with all logic inlined.
This will ultimately make the Arc dialect more robust and the transforms
more composable.
Adds a pass to lower these operations. Currently, only lowering the boundary and inlining the op body are supported. Lowering of the body will be added in a separate commit, since that is a bit more complex.
Rewrite the access analysis used in the `LegalizeStateUpdate` pass.
This pass is a known performance bottleneck in arcilator and causes poor
scaling to larger state spaces: on the `boom-small` benchmark in
`circt/arc-tests`, the analysis runs for >130s, which accounts for >95%
of the overall arcilator runtime.
This PR replaces the MLIR dataflow framework implementation of the
analysis with a custom one. The analysis requires only very sparse
traversal of the operations, which is a lot easier to implement with a
custom graph of nodes. On `boom-small` the new implementation takes
<200ms to complete.
Currently, all the pass base classes are included in every file
implementing a single specific pass. This change aims to reduce the
amount of unnecessarily included code.