# ScaleHLS Project (scalehls)
ScaleHLS is a next-generation HLS compilation flow built on top of MLIR, a multi-level compiler infrastructure. ScaleHLS can represent and optimize HLS designs at multiple levels of abstraction, and it provides an HLS-dedicated transform and analysis library to solve optimization problems at the most suitable representation levels. On top of this library, we build an automated design space exploration (DSE) engine to efficiently explore the multi-dimensional design space. In addition, we develop an HLS C front-end and a C/C++ emission back-end to translate HLS designs into and out of MLIR, enabling an end-to-end ScaleHLS flow. Experimental results show that, compared to baseline designs optimized only by Xilinx Vivado HLS, ScaleHLS improves the quality of results by up to 768.1× on computation-kernel-level programs and up to 3825.0× on neural network models.
Please check out our [arXiv paper](https://arxiv.org/abs/2107.11673) for more details.
## Quick Start
### 0. Download ScaleHLS
```sh
$ git clone --recursive git@github.com:hanchenye/scalehls.git
```
### 1. Install ScaleHLS
To enable the Python binding feature, make sure `pybind11` is installed. To build MLIR and ScaleHLS, run (note that the `-DLLVM_PARALLEL_LINK_JOBS` option can be tuned to reduce memory usage):
```sh
$ mkdir scalehls/build
$ cd scalehls/build
$ cmake -G Ninja ../polygeist/llvm-project/llvm \
-DLLVM_ENABLE_PROJECTS="mlir;clang" \
-DLLVM_EXTERNAL_PROJECTS="scalehls" \
-DLLVM_EXTERNAL_SCALEHLS_SOURCE_DIR=$PWD/.. \
-DLLVM_TARGETS_TO_BUILD="host" \
-DLLVM_ENABLE_ASSERTIONS=ON \
-DCMAKE_BUILD_TYPE=DEBUG \
-DMLIR_ENABLE_BINDINGS_PYTHON=ON \
-DSCALEHLS_ENABLE_BINDINGS_PYTHON=ON \
-DLLVM_PARALLEL_LINK_JOBS=4 \
-DLLVM_USE_LINKER=lld \
-DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++
$ ninja
$ ninja check-scalehls
$ export PATH=$PATH:$PWD/bin
$ export PYTHONPATH=$PYTHONPATH:$PWD/tools/scalehls/python_packages/scalehls_core
```
ScaleHLS uses the `mlir-clang` tool of Polygeist as its C front-end. To build Polygeist, run:
```sh
$ mkdir scalehls/polygeist/build
$ cd scalehls/polygeist/build
$ cmake -G Ninja .. \
-DMLIR_DIR=$PWD/../../build/lib/cmake/mlir \
-DCLANG_DIR=$PWD/../../build/lib/cmake/clang \
-DLLVM_ENABLE_ASSERTIONS=ON \
-DCMAKE_BUILD_TYPE=DEBUG \
-DLLVM_USE_LINKER=lld \
-DCMAKE_C_COMPILER=clang \
-DCMAKE_CXX_COMPILER=clang++
$ ninja check-mlir-clang
$ export PATH=$PATH:$PWD/mlir-clang
```
### 2. Try ScaleHLS
Once the installation and regression tests have completed successfully, you should be able to play with:
```sh
$ cd scalehls
$ # HLS C programs parsing and automatic kernel-level design space exploration.
$ mlir-clang samples/polybench/gemm/gemm_32.c -function=gemm_32 -memref-fullrank -raise-scf-to-affine -S | \
scalehls-opt -dse="top-func=gemm_32 output-path=./ target-spec=samples/polybench/target-spec.ini" \
-debug-only=scalehls > /dev/null
$ scalehls-translate -emit-hlscpp gemm_32_pareto_0.mlir > gemm_32_pareto_0.cpp
$ # Loop and directive-level optimizations, QoR estimation, and C++ code generation.
$ scalehls-opt samples/polybench/syrk/syrk_32.mlir \
-affine-loop-perfection -affine-loop-order-opt -remove-variable-bound \
-partial-affine-loop-tile="tile-size=2" -legalize-to-hlscpp="top-func=syrk_32" \
-loop-pipelining="pipeline-level=3 target-ii=2" -canonicalize -simplify-affine-if \
-affine-store-forward -simplify-memref-access -array-partition -cse -canonicalize \
-qor-estimation="target-spec=samples/polybench/target-spec.ini" \
| scalehls-translate -emit-hlscpp
```
## Integration with ONNX-MLIR
If you have installed ONNX-MLIR, or set up the ONNX-MLIR Docker image, at `$ONNXMLIR_DIR` following the instructions at https://github.com/onnx/onnx-mlir, you should be able to run the following integration test:
```sh
$ cd scalehls/samples/onnx-mlir/resnet18
$ # Export PyTorch model to ONNX.
$ python export_resnet18.py
$ # Parse ONNX model to MLIR.
$ $ONNXMLIR_DIR/build/bin/onnx-mlir -EmitONNXIR resnet18.onnx
$ # Lower from ONNX dialect to Affine dialect.
$ $ONNXMLIR_DIR/build/bin/onnx-mlir-opt resnet18.onnx.mlir \
-shape-inference -convert-onnx-to-krnl -pack-krnl-constants \
-convert-krnl-to-affine > resnet18.mlir
$ # (Optional) Print model graph.
$ scalehls-opt resnet18.mlir -print-op-graph 2> resnet18.gv
$ dot -Tpng resnet18.gv > resnet18.png
$ # Legalize the output of ONNX-MLIR, optimize and emit C++ code.
$ scalehls-opt resnet18.mlir -allow-unregistered-dialect -legalize-onnx \
-affine-loop-normalize -canonicalize -legalize-dataflow="insert-copy=true min-gran=3" \
-split-function -convert-linalg-to-affine-loops -legalize-to-hlscpp="top-func=main_graph" \
-affine-loop-perfection -affine-loop-order-opt -loop-pipelining -simplify-affine-if \
-affine-store-forward -simplify-memref-access -array-partition -cse -canonicalize \
| scalehls-translate -emit-hlscpp > resnet18.cpp
```
Please refer to the `samples/onnx-mlir` folder for more test cases, and to `samples/onnx-mlir/ablation_int_test.sh` for how to conduct the graph-, loop-, and directive-level optimizations.
## References
1. [MLIR](https://mlir.llvm.org): Multi-Level Intermediate Representation
2. [NPComp](https://github.com/llvm/mlir-npcomp): MLIR based compiler toolkit for numerical python programs
3. [ONNX-MLIR](https://github.com/onnx/onnx-mlir): The Open Neural Network Exchange implementation in MLIR
4. [CIRCT](https://github.com/llvm/circt): Circuit IR Compilers and Tools
5. [COMBA](https://github.com/zjru/COMBA): A Model-Based Analysis Framework for High Level Synthesis on FPGAs