# ScaleHLS Project (scalehls)

This project aims to create a framework that ultimately converts an algorithm written in a high-level language into an efficient hardware implementation. With multiple levels of intermediate representation (IR), MLIR appears to be the ideal tool for exploring ways to optimize the eventual design at various levels of abstraction (e.g. various levels of parallelism). Our framework will be based on MLIR and will incorporate a backend for high-level synthesis (HLS) C/C++ code. The key contribution, however, will be our parameterization and optimization of a tremendously large design space.

## Quick Start

### 0. Download ScaleHLS and LLVM
```sh
$ git clone git@github.com:hanchenye/scalehls.git
$ cd scalehls
$ git submodule init
$ git submodule update
```
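
The following steps refer to the clone location as `$SCALEHLS_DIR`. A minimal sketch of one way to set it, assuming the shell is still inside the freshly cloned `scalehls` directory (adjust the path if it was cloned elsewhere):

```sh
# Assumption: the current directory is the root of the cloned repository.
export SCALEHLS_DIR="$PWD"
echo "SCALEHLS_DIR=$SCALEHLS_DIR"
```

The variable only lasts for the current shell session, so it needs to be exported again in any new terminal.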

### 1. Install LLVM and MLIR
This step assumes this repository is cloned to `$SCALEHLS_DIR`. To build LLVM and MLIR, run:
```sh
$ mkdir $SCALEHLS_DIR/llvm/build
$ cd $SCALEHLS_DIR/llvm/build
$ cmake -G Ninja ../llvm \
    -DLLVM_ENABLE_PROJECTS="mlir;llvm;clang;clang-tools-extra" \
    -DLLVM_TARGETS_TO_BUILD="X86;RISCV" \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DCMAKE_BUILD_TYPE=DEBUG
$ ninja
$ ninja check-mlir
```

### 2. Install ScaleHLS
To build and launch the tests, run:
```sh
$ mkdir $SCALEHLS_DIR/build
$ cd $SCALEHLS_DIR/build
$ cmake -G Ninja .. \
    -DMLIR_DIR=$PWD/../llvm/build/lib/cmake/mlir \
    -DLLVM_DIR=$PWD/../llvm/build/lib/cmake/llvm \
    -DCLANG_DIR=$PWD/../llvm/build/lib/cmake/clang \
    -DCMAKE_C_COMPILER=$PWD/../llvm/build/bin/clang \
    -DCMAKE_CXX_COMPILER=$PWD/../llvm/build/bin/clang++ \
    -DLLVM_ENABLE_ASSERTIONS=ON \
    -DCMAKE_BUILD_TYPE=DEBUG
$ ninja check-scalehls
```
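
Before moving on, it can help to confirm that the build produced the tools used in the rest of this README. The sketch below assumes the binaries land in `$SCALEHLS_DIR/build/bin` (the directory that step 3 adds to `PATH`). `check_built_tools` is a hypothetical helper written for this illustration and is not part of ScaleHLS.

```sh
# Report which of the named tools exist and are executable in a directory.
# check_built_tools is a hypothetical helper, not shipped with ScaleHLS.
check_built_tools() {
    bin_dir="$1"
    shift
    missing=0
    for tool in "$@"; do
        if [ -x "$bin_dir/$tool" ]; then
            echo "found:   $bin_dir/$tool"
        else
            echo "missing: $bin_dir/$tool"
            missing=1
        fi
    done
    # Non-zero exit status if at least one tool was not found.
    return "$missing"
}

# Assumption: tools are built into $SCALEHLS_DIR/build/bin.
check_built_tools "${SCALEHLS_DIR:-.}/build/bin" \
    scalehls-opt scalehls-translate benchmark-gen \
    || echo "Some tools are missing; re-check the build output."
```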

### 3. Try ScaleHLS
Once the build and tests have completed successfully, you should be able to try the following:
```sh
$ export PATH=$SCALEHLS_DIR/build/bin:$PATH
$ cd $SCALEHLS_DIR

$ # Loop and directive-level optimizations, QoR estimation, and C++ code generation.
$ scalehls-opt samples/polybench/syrk/syrk_32.mlir \
    -affine-loop-perfection -affine-loop-order-opt -remove-variable-bound \
    -partial-affine-loop-tile="tile-size=2" -legalize-to-hlscpp="top-func=syrk_32" \
    -loop-pipelining="pipeline-level=3 target-ii=2" -canonicalize -simplify-affine-if \
    -affine-store-forward -simplify-memref-access -array-partition -cse -canonicalize \
    -qor-estimation="target-spec=config/target-spec.ini" \
    | scalehls-translate -emit-hlscpp

$ # Automatic kernel-level design space exploration.
$ scalehls-opt samples/polybench/gemm/gemm_32.mlir \
    -multiple-level-dse="top-func=gemm_32 output-path=./ target-spec=config/target-spec.ini" \
    -debug-only=scalehls > /dev/null
$ scalehls-translate -emit-hlscpp gemm_32_pareto_0.mlir > gemm_32_pareto_0.cpp

$ # Benchmark generation, dataflow-level optimization, HLSKernel lowering and bufferization.
$ benchmark-gen -type "cnn" -config "config/cnn-config.ini" -number 1 \
    | scalehls-opt -legalize-dataflow="insert-copy=true min-gran=2" -split-function \
    -hlskernel-bufferize -hlskernel-to-affine -func-bufferize -canonicalize
```

Please refer to the `samples/polybench` folder for more test cases.
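
The design space exploration example above translates a single result file, `gemm_32_pareto_0.mlir`. If a run leaves several such files in the output directory, they could be translated in one loop. This is a rough sketch that assumes the outputs follow the `<name>_pareto_<n>.mlir` naming seen above; `translate_pareto_points` is a hypothetical helper, not part of ScaleHLS.

```sh
# Translate every *_pareto_*.mlir file in a directory to a .cpp file next to it.
# Usage: translate_pareto_points DIR [TRANSLATOR_COMMAND...]
# Assumption: DSE outputs are named <name>_pareto_<n>.mlir.
translate_pareto_points() {
    dir="$1"
    shift
    # Remaining arguments form the translator command; default to the one
    # used in the example above.
    [ "$#" -gt 0 ] || set -- scalehls-translate -emit-hlscpp
    for mlir in "$dir"/*_pareto_*.mlir; do
        [ -e "$mlir" ] || continue   # skip when the glob matched nothing
        "$@" "$mlir" > "${mlir%.mlir}.cpp" || return 1
        echo "wrote ${mlir%.mlir}.cpp"
    done
}

# Example call after the DSE command above (output-path=./):
# translate_pareto_points .
```

Passing the translator command as trailing arguments keeps the loop easy to dry-run, for instance with `cat` in place of `scalehls-translate`.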

## Integration with ONNX-MLIR
If you have installed ONNX-MLIR, or set up the ONNX-MLIR Docker image, at `$ONNXMLIR_DIR` following the instructions at https://github.com/onnx/onnx-mlir, you should be able to run the following integration test:
```sh
$ cd $SCALEHLS_DIR/samples/onnx-mlir/resnet18

$ # Export PyTorch model to ONNX.
$ python export_resnet18.py

$ # Parse ONNX model to MLIR.
$ $ONNXMLIR_DIR/build/bin/onnx-mlir -EmitONNXIR resnet18.onnx

$ # Lower from ONNX dialect to Affine dialect.
$ $ONNXMLIR_DIR/build/bin/onnx-mlir-opt resnet18.onnx.mlir \
    -shape-inference -convert-onnx-to-krnl -pack-krnl-constants \
    -convert-krnl-to-affine > resnet18.mlir

$ # (Optional) Print model graph.
$ scalehls-opt resnet18.tmp -print-op-graph 2> resnet18.gv
$ dot -Tpng resnet18.gv > resnet18.png

$ # Legalize the output of ONNX-MLIR, optimize and emit C++ code.
$ scalehls-opt resnet18.mlir -allow-unregistered-dialect -legalize-onnx \
    -affine-loop-normalize -canonicalize -legalize-dataflow="min-gran=3 insert-copy=true" \
    -split-function -convert-linalg-to-affine-loops -legalize-to-hlscpp="top-func=main_graph" \
    -affine-loop-perfection -affine-loop-order-opt -loop-pipelining -simplify-affine-if \
    -affine-store-forward -simplify-memref-access -array-partition -cse -canonicalize \
    | scalehls-translate -emit-hlscpp > resnet18.cpp
```

Please refer to the `samples/onnx-mlir` folder for more test cases, and to `samples/onnx-mlir/ablation_int_test.sh` for how to run the graph, loop, and directive optimizations.

## References
1. [MLIR](https://mlir.llvm.org): Multi-Level Intermediate Representation
2. [NPComp](https://github.com/llvm/mlir-npcomp): MLIR-based compiler toolkit for numerical Python programs
3. [ONNX-MLIR](https://github.com/onnx/onnx-mlir): The Open Neural Network Exchange implementation in MLIR
4. [CIRCT](https://github.com/llvm/circt): Circuit IR Compilers and Tools
5. [COMBA](https://github.com/zjru/COMBA): A Model-Based Analysis Framework for High Level Synthesis on FPGAs