We now support regions with multiple entries and multiple exits natively.
Regions no longer need to be simplified to single-entry, single-exit form.
We need to XFAIL two test cases as this change increases the scop coverage
and uncovers two failures in the independent blocks pass. The first failure
will be fixed in a subsequent commit, the second one is in the non-default
-polly-codegen-scev mode and still needs to be fixed.
Contributed-by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179673
Regions that have multiple entry edges are very common. A simple if condition
yields such a region, for example:
        if
       /  \
   then    else
       \  /
    for_region
This for_region contains two entry edges 'then' -> 'for_region' and 'else' -> 'for_region'.
Previously we scheduled the RegionSimplify pass to translate such regions into
simple regions. With this patch, we now support them natively when the region is
in -loop-simplify form, which means the entry block should not be a loop header.
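To make the notion of an entry edge concrete, here is a minimal sketch that counts the edges entering a region from outside. The CFG type and the helpers entryEdges and exampleCFG are hypothetical names made up for this illustration; they are not Polly or LLVM APIs.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical CFG model: each block name maps to its successor blocks.
using CFG = std::map<std::string, std::vector<std::string>>;
using Edge = std::pair<std::string, std::string>;

// Collect every edge that starts outside `Region` and ends at `Entry`.
std::vector<Edge> entryEdges(const CFG &Graph,
                             const std::set<std::string> &Region,
                             const std::string &Entry) {
  assert(Region.count(Entry) && "entry block must belong to the region");
  std::vector<Edge> Edges;
  for (const auto &BlockAndSuccs : Graph) {
    const std::string &From = BlockAndSuccs.first;
    if (Region.count(From))
      continue; // edges from inside the region are not entry edges
    for (const std::string &To : BlockAndSuccs.second)
      if (To == Entry)
        Edges.emplace_back(From, To);
  }
  return Edges;
}

// The CFG from the diagram: 'if' branches to 'then' and 'else', and both
// of them continue to 'for_region'.
CFG exampleCFG() {
  return {{"if", {"then", "else"}},
          {"then", {"for_region"}},
          {"else", {"for_region"}},
          {"for_region", {}}};
}
```

For the example CFG with a region that consists only of 'for_region', the helper finds two entry edges, one from 'then' and one from 'else', so the region is not a simple region.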
Contributed by: Star Tan <tanmx_star@yeah.net>
llvm-svn: 179586
After this commit, polly is clang-format clean. This can be tested with
'ninja polly-check-format'. Updates to clang-format may change this, but the
differences will hopefully be both small and general improvements to the
formatting.
We currently have some less than ideal formatting for a couple of items, DEBUG()
stmts for example. I believe the benefit of being clang-format clean outweighs
the imperfect layout of this code.
llvm-svn: 177796
Given the following code:
  for (i = 0; i < 10; i++) {
    ;
  }
  S: A[i] = 0
When code generating S using scev based code generation, we need to retrieve
the scev of 'i' at the location of 'S'. If we do not do this, the scev that
we obtain will be expressed as {0,+,1}_for and will reference loop iterators
that do not surround 'S' and that we consequently do not know how to code
generate. What we really want is the scev to be instantiated to the value of 'i'
after the loop. This value is {10} and it can be code generated without
trouble.
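The difference between the recurrence inside the loop and its value after the loop can be sketched as follows. AddRec, valueInIteration and valueAfterLoop are hypothetical names for this illustration only and do not correspond to the ScalarEvolution classes; the sketch also assumes an affine recurrence and a known number of executed increments.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an affine add recurrence {Start,+,Step}_loop.
struct AddRec {
  int64_t Start;
  int64_t Step;
};

// Value of the recurrence in iteration `Iter` (0-based) of its loop. This
// form only makes sense for code that is surrounded by the loop.
int64_t valueInIteration(const AddRec &AR, int64_t Iter) {
  assert(Iter >= 0 && "iteration numbers start at zero");
  return AR.Start + AR.Step * Iter;
}

// Value of the recurrence after the loop has finished, given how often the
// increment was executed. The result is a plain constant that no longer
// refers to the loop.
int64_t valueAfterLoop(const AddRec &AR, int64_t Increments) {
  assert(Increments >= 0 && "the increment cannot run a negative number of times");
  return AR.Start + AR.Step * Increments;
}
```

For the loop in the example, 'i' is {0,+,1} and the increment runs ten times, so valueAfterLoop(AddRec{0, 1}, 10) yields 10, the value that 'S' should use.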
llvm-svn: 177777
When using the scev based code generation, we no longer rely on the presence
of a canonical induction variable. This commit prepares the path to
(conditionally) disable the induction variable canonicalization pass.
llvm-svn: 177548
When doing SCEV based code generation, we ignore instructions calculating values
that are fully defined by a SCEV expression. The values that are calculated by
these instructions are recalculated on demand.
This commit improves the check to verify if certain instructions can be ignored
and recalculated on demand.
llvm-svn: 177313
We need to remove one dimension. Any dimension is correct as long as it exists.
We had chosen, for whatever reason, the dimension #dims - 2. This is incorrect if
there is just one dimension. For CLooG this case never happened. For isl,
however, the case can happen and causes undefined behavior including crashes.
We now always choose the last dimension #dims - 1. We could have chosen
dimension '0', but the last dimension is what we remove conceptually in the
algorithm, so it seems better to actually program it that way.
While at it remove another piece of undefined behavior.
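A small sketch of why #dims - 2 breaks for a single dimension. It assumes the dimension index is an unsigned integer, which is an assumption made for this illustration; the helper names are hypothetical.

```cpp
#include <cassert>

// Index of the dimension to remove when always picking the last one.
unsigned lastDimension(unsigned NumDims) {
  assert(NumDims >= 1 && "need at least one dimension to remove");
  return NumDims - 1;
}

// The previous choice, #dims - 2. With unsigned arithmetic this wraps
// around to a huge value when NumDims == 1.
unsigned secondToLastDimension(unsigned NumDims) {
  assert(NumDims >= 1 && "need at least one dimension to remove");
  return NumDims - 2;
}

// A dimension index is only usable if it names an existing dimension.
bool isValidDimension(unsigned Index, unsigned NumDims) {
  return Index < NumDims;
}
```

With one dimension, lastDimension(1) is 0 and valid, while secondToLastDimension(1) does not name an existing dimension. Passing such an index on to a library is what can end in a crash.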
llvm-svn: 174894
When polly was configured with cmake without cloog, compilation stopped with:
../tools/polly/lib/CodeGen/BlockGenerators.cpp:662: error: 'PollyVectorizerChoice' was not declared in this scope
../tools/polly/lib/CodeGen/BlockGenerators.cpp:662: error: 'VECTORIZER_FIRST_NEED_GROUPED_UNROLL' was not declared in this scope
llvm-svn: 168623
If the flags '-polly-report -g' are given, we print file name and line numbers
for the beginning and end of all detected scops.
linear-algebra/kernels/gemm/gemm.c:23: Scop start
linear-algebra/kernels/gemm/gemm.c:42: Scop end
linear-algebra/kernels/gemm/gemm.c:77: Scop start
linear-algebra/kernels/gemm/gemm.c:82: Scop end
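The report format shown above can be reproduced with a few lines. This is only a sketch of the output layout; formatScopReport is a hypothetical helper and not the function Polly uses.

```cpp
#include <cassert>
#include <string>

// Build the two report lines for one scop: "<file>:<line>: Scop start"
// followed by "<file>:<line>: Scop end".
std::string formatScopReport(const std::string &File, unsigned StartLine,
                             unsigned EndLine) {
  assert(StartLine <= EndLine && "a scop cannot end before it starts");
  return File + ":" + std::to_string(StartLine) + ": Scop start\n" +
         File + ":" + std::to_string(EndLine) + ": Scop end\n";
}
```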
llvm-svn: 167235
In addition to the arrays and clast variables a SCoP statement may also refer to
values defined before the SCoP or to function arguments. Detect these values and
add them to the set of values passed to the function generated for OpenMP
parallel execution of a clast.
Committed with additional test cases and some refactoring.
Contributed by: Armin Groesslinger <armin.groesslinger@uni-passau.de>
llvm-svn: 167214
This change ensures that isl is only detected if it includes code generation
support. This allows us to remove a lot of conditional compilation and also
avoids missing test cases in case the feature is not available.
llvm-svn: 166403
Previously isl always generated '<=' or '>='. However, in many cases '<' or '>'
leads to simpler code. This commit updates isl and adds the relevant code
generation support to Polly.
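As an illustration of why the strict comparison gives simpler code, the two loops below run the same iterations, but the second one does not need the 'N - 1' term in its bound. The function names are hypothetical and the loops are hand-written stand-ins, not isl output.

```cpp
#include <cassert>

// Loop bound written with '<=', as generated before this change.
long countWithNonStrictBound(long N) {
  assert(N >= 0 && "sketch only considers non-negative bounds");
  long Count = 0;
  for (long I = 0; I <= N - 1; ++I)
    ++Count;
  return Count;
}

// The same loop written with '<'. The bound is just 'N'.
long countWithStrictBound(long N) {
  assert(N >= 0 && "sketch only considers non-negative bounds");
  long Count = 0;
  for (long I = 0; I < N; ++I)
    ++Count;
  return Count;
}
```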
llvm-svn: 166020
This pass implements a new code generator that uses the code generation
algorithm included in isl.
For the moment the new code generation is limited to sequential code.
llvm-svn: 165037
This includes:
- The isl_id of the domain of the scattering must be copied from the original
domain
- Remove outdated references to a 'FinalRead' statement
- Print the Pocc output if -debug is provided.
- Add line breaks to some error messages.
Reported and Debugged by: Dustin Feld <d3.feld@gmail.com>
llvm-svn: 162901
Translate the selected parallel loop body into a ptx string and run it with the
cuda driver API. We limit this preliminary implementation to target the
following special test cases:
- Support only 2-dimensional parallel loops with or without only one innermost
non-parallel loop.
- Support write memory access to only one array in a SCoP.
The patch was committed with smaller changes to the build system:
There is now a flag to enable gpu code generation explicitly. This was required
as we need the llvm.codegen() patch applied to the llvm sources to compile this
feature correctly. Also, enabling gpu code generation does not require cuda.
This requirement was removed to allow 'make polly-test' runs, even without an
installed cuda runtime.
Contributed by: Yabin Hu <yabin.hwu@gmail.com>
llvm-svn: 161239
This fixes a conflict between polly::createIndVarSimplifyPass() and
llvm::createIndVarSimplifyPass(), which causes problems on windows.
Reported by: Michael Kruse <MichaelKruse@meinersbur.de>
llvm-svn: 161235
I did not take into account that this patch fails to compile without the
llvm.codegen patch applied. This breaks buildbots.
I revert this until we find a way to commit this without the buildbots
complaining.
This reverts commit cb43ab80e94434e780a66be3b9a6ad466822fe33.
llvm-svn: 160165
Translate the selected parallel loop body into a ptx string and run it
with cuda driver API. We limit this preliminary implementation to
target the following special test cases:
- Support only 2-dimensional parallel loops with or without only one
innermost non-parallel loop.
- Support write memory access to only one array in a SCoP.
Contributed by: Yabin Hu <yabin.hwu@gmail.com>
llvm-svn: 160164
Derive the maximal and minimal values of a parameter from the type it has. Add
this information to the scop context. This information is needed to derive
optimal types during code generation.
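A minimal sketch of deriving such bounds from a bit width. It assumes that parameters are interpreted as signed two's complement integers of at most 64 bits; Range and signedRange are hypothetical names for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Closed interval [Min, Max] of values a parameter can take.
struct Range {
  int64_t Min;
  int64_t Max;
};

// Range of a signed two's complement integer with `Bits` bits.
Range signedRange(unsigned Bits) {
  assert(Bits >= 1 && Bits <= 64 && "sketch supports 1 to 64 bit integers");
  if (Bits == 64)
    return {INT64_MIN, INT64_MAX}; // avoid shifting into the sign bit
  int64_t Half = int64_t(1) << (Bits - 1);
  return {-Half, Half - 1};
}
```

For a parameter 'n' of an 8 bit type this gives -128 <= n <= 127, which is the kind of constraint that can then be added to the context.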
llvm-svn: 157245
This is an incomplete implementation of the SCEV based code generation.
When finished it will remove the need for -indvars -enable-iv-rewrite.
For the moment it is still disabled. Even though it passes 'make polly-test',
there are still loose ends, especially with respect to OpenMP code generation.
llvm-svn: 155717
We create a new file LoopGenerators that provides utility classes for the
generation of OpenMP parallel and scalar loops. This means we move a lot
of the OpenMP generation out of the Polly specific code generator.
llvm-svn: 153325
The FinalRead statement represented a virtual read that is executed after the
SCoP. It was used when we verified the correctness of a schedule by checking if
it yields the same FLOW dependences as the original code. This only works if
we have a final read that reads all memory at the end of the SCoP.
We now switched to just checking that a schedule does not introduce negative
dependences and also consider WAW and WAR dependences. This restricts the schedules
a little bit more, but we do not have any optimizer that would calculate a more
complex schedule. Hence, for now final reads are obsolete.
llvm-svn: 152319
We now just check if the new scattering would create non-positive dependences.
This is a lot faster than recalculating dependences (which is especially slow
on tiled code).
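The check can be pictured on explicitly enumerated dependences. Polly performs it symbolically on isl maps; the sketch below only illustrates the idea on concrete points, and all names in it are hypothetical.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

using Point = std::vector<long>;            // iteration or time vector
using Dependence = std::pair<Point, Point>; // (source, sink)
using Schedule = std::function<Point(const Point &)>;

// True if A is scheduled strictly before B, i.e. B - A is
// lexicographically positive.
bool executesBefore(const Point &A, const Point &B) {
  assert(A.size() == B.size() && "time vectors must have equal length");
  return A < B; // std::vector compares lexicographically
}

// A schedule is accepted only if every dependence source still executes
// strictly before its sink.
bool preservesDependences(const std::vector<Dependence> &Deps,
                          const Schedule &S) {
  for (const Dependence &D : Deps)
    if (!executesBefore(S(D.first), S(D.second)))
      return false;
  return true;
}

// Dependences of 'A[i] = A[i - 1]' for i in [1, N]: iteration i - 1 must
// run before iteration i.
std::vector<Dependence> chainDependences(long N) {
  std::vector<Dependence> Deps;
  for (long I = 1; I <= N; ++I)
    Deps.emplace_back(Point{I - 1}, Point{I});
  return Deps;
}

Point identitySchedule(const Point &P) { return P; }
Point reversedSchedule(const Point &P) { return Point{-P[0]}; }
```

The identity schedule keeps all dependences positive, while reversing the loop turns them negative and is rejected. No dependences have to be recomputed for this test, only the existing ones are mapped through the new schedule.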
llvm-svn: 152230
This allows us to enable -enable-iv-rewrite by default and releases LLVM from
the burden to keep that feature. This is an intermediate step. We plan to soon
remove the need for rewritten induction variables entirely.
llvm-svn: 150481
Such a dead code elimination can remove redundant stores to arrays. It can also
eliminate calculations where the results are stored to memory but where they are
overwritten before ever being read. It may also fix bugs like:
http://llvm.org/bugs/show_bug.cgi?id=5117
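To illustrate which stores are meant, here is a sketch that marks overwritten stores in a straight-line sequence of memory accesses. It is a strong simplification of a polyhedral dead code elimination, which would reason about whole iteration spaces instead of a concrete trace. All names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

enum class Kind { Store, Load };

struct Access {
  Kind K;
  int Address; // memory cells are modelled as non-negative indices
};

// Mark every store whose value is overwritten before it is ever read.
// Stores that are still live at the end of the trace are kept, because
// memory may be read after the sequence.
std::vector<bool> findDeadStores(const std::vector<Access> &Trace) {
  std::vector<bool> Dead(Trace.size(), false);
  std::set<int> OverwrittenLater; // cells stored to again before any load
  for (std::size_t I = Trace.size(); I-- > 0;) {
    const Access &A = Trace[I];
    assert(A.Address >= 0 && "addresses are non-negative in this sketch");
    if (A.K == Kind::Load)
      OverwrittenLater.erase(A.Address); // earlier stores are read here
    else if (OverwrittenLater.count(A.Address))
      Dead[I] = true; // a later store hides this one
    else
      OverwrittenLater.insert(A.Address);
  }
  return Dead;
}
```

For the trace 'store 0, store 0, load 0, store 1' only the first store is marked dead, as the second store overwrites its value before the load reads cell 0.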
This commit just adds a skeleton without any functionality.
If anybody is interested in learning about polyhedral optimizations, this would be
a good task: well defined, self-contained and pretty simple. Ping me if you
want to start and you need some pointers to get going.
llvm-svn: 149386
In case we cannot analyze an access function, we do not discard the SCoP, but
assume conservatively that all memory accesses that can be derived from our base
pointer may be accessed.
Patch provided by: Marcello Maggioni <hayarms@gmail.com>
llvm-svn: 146972
address is part of the access function. Also remove unused special cases that
were necessary when the base address was still contained in the access function.
llvm-svn: 144280
This check was necessary because of the use of AffineSCEVIterator in TempScopInfo.
As we removed this use recently, it is no longer necessary.
llvm-svn: 144228
Instead of using TempScop to find parameters, we detect them directly
on the SCEV. This allows us to remove the TempScop parameter detection
in a subsequent commit.
This fixes a bug reported by Marcello Maggioni <hayarms@gmail.com>
llvm-svn: 144087
Previously we built a context that already contained all parameter dimensions
from the start. We now build a context without any parameter dimensions and
extend the context as needed. All parameter dimensions are added during final
realignment.
llvm-svn: 144085