hanchenye-llvm-project/polly/test
Roman Gareev be5299af0b Change the determination of parameters of macro-kernel
Typically, processor architectures do not include an L3 cache, which means that
Nc, the parameter of the macro-kernel, is, for all practical purposes,
redundant ([1]). However, small values of Nc can cause redundant packing of
the same elements of the matrix A, the first operand of the matrix
multiplication. At the same time, large values of Nc can cause
segmentation faults if the available stack is exceeded.

This patch adds an option to specify the parameter Nc as a multiple of
Nr, the parameter of the micro-kernel.
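
The relationship between Nc and Nr can be sketched as follows. This is only an
illustration, not code from the patch: the names MicroKernelParams, computeNc,
packedBufferBytes, and NcQuotient are hypothetical, and the buffer-size helper
assumes a packed Kc x Nc panel as in the BLIS scheme described in [1].

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical type for illustration; not taken from Polly's sources.
struct MicroKernelParams {
  std::int64_t Mr; // rows of the register tile
  std::int64_t Nr; // columns of the register tile
};

// Nc is chosen as an integer multiple of Nr, so a macro-kernel block always
// contains a whole number of micro-kernel column panels.
inline std::int64_t computeNc(const MicroKernelParams &Micro,
                              std::int64_t NcQuotient) {
  assert(NcQuotient > 0 && Micro.Nr > 0 && "both factors must be positive");
  return NcQuotient * Micro.Nr;
}

// Size in bytes of a packed Kc x Nc panel (assumption: one such panel is
// allocated per macro-kernel invocation). A large Nc makes this buffer
// large, which is how an oversized Nc can exhaust the available stack.
inline std::int64_t packedBufferBytes(std::int64_t Kc, std::int64_t Nc,
                                      std::int64_t ElemSize) {
  return Kc * Nc * ElemSize;
}
```

For example, with an assumed Nr of 8 and a multiplier of 256, computeNc
yields Nc = 2048; with an assumed Kc of 256 and 8-byte elements, the packed
panel then occupies 4 MiB.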

On an Intel Core i7-3820 (Sandy Bridge) with the following options,

clang -O3 gemm.c -I utilities/ utilities/polybench.c -DPOLYBENCH_TIME
-march=native -mllvm -polly -mllvm -polly-pattern-matching-based-opts=true
-DPOLYBENCH_USE_SCALAR_LB -mllvm -polly-target-cache-level-associativity=8,8
-mllvm -polly-target-cache-level-sizes=32768,262144 -mllvm
-polly-target-latency-vector-fma=8

it improves the performance from 11.303 GFlops/sec (39.247% of
theoretical peak) to 17.896 GFlops/sec (62.14% of theoretical peak).
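
As a consistency check on the figures above, the theoretical peak implied by
each measurement and the resulting speedup can be recomputed. The helper names
impliedPeak and speedup are hypothetical; only the four input numbers come
from the measurements quoted here.

```cpp
#include <cmath>

// Peak throughput implied by a measured rate and its share of the peak.
// Percent is given in percent units, e.g. 39.247 for 39.247%.
inline double impliedPeak(double MeasuredGFlops, double Percent) {
  return MeasuredGFlops / (Percent / 100.0);
}

// Ratio of the new throughput to the old one.
inline double speedup(double BeforeGFlops, double AfterGFlops) {
  return AfterGFlops / BeforeGFlops;
}
```

Both measurements imply a peak of roughly 28.8 GFlops/sec, and the speedup is
roughly 1.58x.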

Refs.:

[1] - http://www.cs.utexas.edu/users/flame/pubs/TOMS-BLIS-Analytical.pdf

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D28019

llvm-svn: 290256
2016-12-21 12:51:12 +00:00
CodeGen/OpenMP Update to ISL 0.17. 2016-05-04 14:41:36 +00:00
DeLICM [DeLICM] Add pass boilerplate code. 2016-11-29 16:41:21 +00:00
DeadCodeElimination [Polly] Remove unwanted --check-prefix=CHECK from unit tests. NFC. 2016-04-15 06:12:29 +00:00
DependenceInfo [GSoC] Do not process SCoPs with infeasible runtime context 2016-07-25 12:40:59 +00:00
FlattenSchedule Add -polly-flatten-schedule pass. 2016-09-08 15:02:36 +00:00
GPGPU [tests] Adjust test output to recent changed SCEV canonocalization [NFC] 2016-11-13 19:27:17 +00:00
Isl Fix debug info metadata for upstream change in LLVM. 2016-12-20 02:09:59 +00:00
ScheduleOptimizer Change the determination of parameters of macro-kernel 2016-12-21 12:51:12 +00:00
ScopDetect test: add more details to non-affine test case 2016-11-22 06:28:08 +00:00
ScopDetectionDiagnostics [ScopDetection] Remove redundant checks for endless loops 2016-09-20 17:05:22 +00:00
ScopInfo [ScopInfo] Fold constant coefficients in array dimensions to the right 2016-12-02 08:10:56 +00:00
Unit Introduce unittests. 2016-08-25 12:36:15 +00:00
UnitIsl Build and run isl_test as part of check-polly 2016-10-04 19:48:40 +00:00
CMakeLists.txt [cmake] Add polly-isl-test dependency to lit tests. 2016-10-16 21:35:57 +00:00
README
create_ll.sh
lit.cfg
lit.site.cfg.in GPGPU: create default initialized PPCG scop and gpu program 2016-07-14 10:22:19 +00:00
polly.ll tests: Drop -polly-detect-unprofitable and -polly-no-early-exit 2015-10-06 15:36:44 +00:00
update_check.py Add -polly-flatten-schedule pass. 2016-09-08 15:02:36 +00:00

README

place tests here