Commit Graph

435 Commits

Author SHA1 Message Date
Tobias Grosser 2fd89da90d Remove non-debug printing of domain set
Contributed-by: Chris Jenneisch <chrisj@codeaurora.org>

Differential Revision: http://reviews.llvm.org/D15094

llvm-svn: 254343
2015-11-30 22:59:41 +00:00
Michael Kruse 4c86a1d97b IR cleanup after CodeGeneration
Re-run canonicalization passes after Polly's code generation.

The set of passes currently added here are nearly all the passes between
--polly-position=early and --polly-position=before-vectorizer, i.e. all
passes that would usually run after Polly.

In order to run these only if Polly actually modified the code, we add a
function attribute "polly-optimzed" to a function that contains
generated code. The cleanup pass is skipped if the function does not
have this attribute.

There is no support by the (legacy) PassManager to run passes only under
some conditions. One could have wrapped all transformation passes to run
only when CodeGeneration changed the code, but the analyses would run
anyway. This patch creates an independent pass manager. The
disadvantages are that all analyses have to re-run even if preserved and
it does not honor compiler switches like the PassManagerBuilder does.

Differential Revision: http://reviews.llvm.org/D14333

llvm-svn: 254150
2015-11-26 12:36:25 +00:00
Michael Kruse cba170e4d0 Introduce origin/kind for exit PHI node accesses
Previously, accesses that originate from PHI nodes in the exit block
were registered as SCALAR. In some context they are treated as scalars,
but it makes a difference in others. We used to check whether the
AccessInstruction is a terminator to differentiate the cases.

This patch introduces an MemoryAccess origin EXIT_PHI and a
ScopArrayInfo kind KIND_EXIT_PHI to make this case more explicit. No
behavioural change intended.

Differential Revision: http://reviews.llvm.org/D14688

llvm-svn: 254149
2015-11-26 12:26:06 +00:00
Tobias Grosser bc29e0b27c RegionGenerator: Only introduce subregion.ivs for loops fully within a subregion
IVs of loops for which the loop header is in the subregion, but not the entire
loop may be incremented outside of the subregion and can consequently not be
kept private to the subregion. Instead, they need to and are modeled as virtual
loops in the iteration domains. As this is the case, generating new subregion
induction variables for such loops is not needed and indeed wrong as they would
hide the virtual induction variables modeled in the scop.

This fixes a miscompile in MultiSource/Benchmarks/Ptrdist/bc and
MultiSource/Benchmarks/nbench/. Thanks Michael and Johannes for their
investiagations and helpful observations regarding this bug.

llvm-svn: 252860
2015-11-12 07:34:09 +00:00
Johannes Doerfert fdbf201fc9 [FIX] Do not generate code for parameters referencing dead values
Check if a value that is referenced by a parameter is dead and do not
generate code for the parameter in such a case.

llvm-svn: 252813
2015-11-11 22:40:51 +00:00
Johannes Doerfert dcfedf3505 [FIX] Cast pre-loaded values correctly or reload them with adjusted type.
Especially for structs, the SAI object of a base pointer does not
describe all the types that the user might expect when he loads from
that base pointer. While we will still cast integers and pointers we
will now reload the value with the correct type if floating point and
non-floating point values are involved. However, there are now TODOs
where we use bitcasts instead of a proper conversion or reloading.

This fixes bug 25479.

llvm-svn: 252706
2015-11-11 06:20:25 +00:00
Johannes Doerfert fc4bfc465a [FIX] Create empty invariant equivalence classes
We now create all invariant equivalence classes for required invariant loads
  instead of creating them on-demand. This way we can check if a parameter
  references an invariant load that is actually not executed and was therefor
  not materialized. If that happens the parameter is not materialized either.

This fixes bug 25469.

llvm-svn: 252701
2015-11-11 04:30:07 +00:00
Tobias Grosser 6abc75af4c ScopInfo: Introduce ArrayKind
Since 252422 we do not only distinguish two ScopArrayInfo kinds, PHI nodes
and others, but work with three kind of ScopArrayInfo objects. SCALAR, PHI and
ARRAY objects. Instead of keeping two boolean flags isPHI and isScalar and
wonder what an ScopArrayInfo object of kind (!isScalar && isPHI) is, we
list now explicitly the three different possible types of memory objects.

This change also allows us to remove the confusing nested pairs that have
been used in ArrayInfoMapTy.

llvm-svn: 252620
2015-11-10 17:31:31 +00:00
Tobias Grosser 262538435f ScopInfo: Make getDimensionSize better reflect which dimensions carry sizes
In polly the first dimensions of an array as well as all scalars do not carry
any size information. This commit makes this explicit in the interface of
getDimensionSize. Before this commit getDimensionSize(0) returned the size of
the first dimension that carried a size. After this commit getDimensionSize(i)
will either return the size of dimension 'i' or assert in case 'i' does not
carry a size or does not exist at all.

This very same behaviour was already present in getDimensionSizePw(). This
commit also adds assertions that ensure getDimensionSizePw() is called
appropriately.

llvm-svn: 252607
2015-11-10 14:24:21 +00:00
Michael Kruse c993739e0d Fix non-affine generated entering node not being recognized as dominating
Scalar reloads in the generated entering block were not recognized as
dominating the subregions locks when there were multiple entering
nodes. This resulted in values defined in there not being copied.

As a fix, we unconditionally add the BBMap of the generated entering
node to the generated entry. This fixes part of llvm.org/PR25439.

This reverts 252449 and reapplies r252445. Its test was failing
indeterministically due to r252375 which was reverted in r252522.

llvm-svn: 252540
2015-11-09 23:33:40 +00:00
Michael Kruse d6fb6f1b0c Fix dominance when subregion exit is outside scop
The dominance of the generated non-affine subregion block was based on
the scop's merge block, therefore resulted in an invalid DominanceTree.
It resulted in some values as assumed to be unusable in the actual
generated exit block.

We detect the case that the exit block has been moved and decide
dominance using the BB at the original exit. If we create another exit
node, that exit nodes is dominated by the one generated from where the
original exit resides. This fixes llvm.org/PR25438 and part of
llvm.org/PR25439.

llvm-svn: 252526
2015-11-09 23:07:38 +00:00
Michael Kruse ebffcbeefa Revert r252375 "Fix non-affine region dominance of implicitely stored values"
It introduced indeterminism as it was iterating over an address-indexed
hashtable. The corresponding bug PR25438 will be fixed in a successive
commit.

llvm-svn: 252522
2015-11-09 22:37:29 +00:00
Johannes Doerfert 7a6e292d86 [FIX] Use same alloca for invariant loads and the scalar users
llvm-svn: 252451
2015-11-09 06:28:45 +00:00
Johannes Doerfert 544b23a1ef Revert "Fix non-affine generated entering node not being recognized as dominating"
This reverts commit 9775824b265e574fc541e975d64d3e270243b59d due to a
failing unit test.

Please check and correct the unit test and commit again.

llvm-svn: 252449
2015-11-09 06:04:05 +00:00
Michael Kruse fd9c89e84b Fix non-affine generated entering node not being recognized as dominating
Scalar reloads in the generated entering block were not recognized as
dominating the subregions locks when there were multiple entering
nodes. This resulted in values defined in there not being copied.

As a fix, we unconditionally add the BBMap of the generated entering
node to the generated entry. This fixes part of llvm.org/PR25439.

llvm-svn: 252445
2015-11-09 05:00:30 +00:00
Johannes Doerfert 188542fda9 [FIX] Initialize incoming scalar memory locations for PHIs
llvm-svn: 252437
2015-11-09 00:21:21 +00:00
Johannes Doerfert 80c716df2b [NFC] Remove unused variable.
llvm-svn: 252436
2015-11-09 00:21:04 +00:00
Johannes Doerfert a768624f14 [FIX] Introduce different SAI objects for scalar and memory accesses
Even if a scalar and memory access have the same base pointer, we cannot use
  one SAI object as the type but also the number of dimensions are wrong. For
  the attached test case this caused a crash in the invariant load hoisting,
  though it could cause various other problems too.

This fixes bug 25428 and a execution time bug in MallocBench/cfrac.

Reported-by: Jeremy Huddleston Sequoia <jeremyhu@apple.com>
llvm-svn: 252422
2015-11-08 19:12:05 +00:00
Johannes Doerfert 3797707695 [FIX] Use unreachable to indicate dead code and repair dominance
When we bail out early we make the partially build new code path
  practically dead, though it was not unreachable. To remove dominance
  problems we now make it not only dead but also prevent the control
  flow to join with the original code path, thus allow to use original
  values after the SCoP without any PHI nodes.

This fixes bug 25447.

llvm-svn: 252420
2015-11-08 17:57:41 +00:00
Johannes Doerfert 1dd6e37af4 Verify IR after we bail out
The bail out in r252412 left the code generation without verifying the (so
  far) generated IR. This will change now and ensure we always run the
  verifier.

Suggested-by: Tobias Grosser <tobias@grosser.es>
llvm-svn: 252419
2015-11-08 16:34:17 +00:00
Johannes Doerfert c4898504ea [FIX] Bail out if there is a dependence cycle between invariant loads
While the program cannot cause a dependence cycle between invariant
  loads, additional constraints (e.g., to ensure finite loops) can
  introduce them. It is hard to detect them in the SCoP description,
  thus we will only check for them at code generation time. If such a
  recursion is detected we will bail out the code generation and place a
  "false" runtime check to guarantee the original code is used.

  This fixes bug 25443.

llvm-svn: 252412
2015-11-07 19:46:04 +00:00
Michael Kruse 0651480b97 Fix non-affine region dominance of implicitely stored values
After loop versioning, a dominance check of a non-affine subregion's
exit node causes the dominance check to always fail on any block in the
subregion if it shares the same exit block with the scop. The
subregion's exit block has become polly_merge_new_and_old, which also
receives the control flow of the generated code. This would cause that
any value for implicit stores is assumed to be not from the scop.

We check dominance with the generated exit node instead.

This fixes llvm.org/PR25438

llvm-svn: 252375
2015-11-07 00:36:50 +00:00
Duncan P. N. Exon Smith b8f58b53dd polly/ADT: Remove implicit ilist iterator conversions, NFC
Remove all the implicit ilist iterator conversions from polly, in
preparation for making them illegal in ADT.  There was one oddity I came
across: at line 95 of lib/CodeGen/LoopGenerators.cpp, there was a
post-increment `Builder.GetInsertPoint()++`.

Since it was a no-op, I removed it, but I admit I wonder if it might be
a bug (both before and after this change)?  Perhaps it should be a
pre-increment?

llvm-svn: 252357
2015-11-06 22:56:54 +00:00
Michael Kruse ddb6528ba6 Fix reuse of non-dominating synthesized value in subregion exit
We were adding all generated values in non-affine subregions to be used
for the subregions generated exit block. The thought was that only
values that are dominating the original exit block can be used there.
But it is possible for synthesizable values to be expanded in any
block. If the same values is also used for implicit writes, it would
try to reuse already synthesized values even if not dominating the exit
block.

The fix is to only add values to the list of values usable in the exit
block only if it is dominating the exit block. This fixes
llvm.org/PR25412.

llvm-svn: 252301
2015-11-06 13:51:24 +00:00
Tobias Grosser 4bf262a0f2 RunTimeDebugBuilder: Allocate memory _after_ knowing how much is needed
This fixes a memory corruption issue, where we accessed more memory than
actually allocated.

llvm-svn: 252197
2015-11-05 19:43:34 +00:00
Michael Kruse 27149cf32d Use per-BB value maps for non-exit BBs
For generating scalar writes of non-affine subregions, all except phi
writes are generated in the exit block. The phi writes are generated in
the incoming block for which we errornously used the same BBMap. This
can conflict if a value for one block is synthesized, and then reused
for another block which is not dominated by the first block. This is
fixed by using block-specific BBMaps for phi writes.

llvm-svn: 252172
2015-11-05 16:17:17 +00:00
Johannes Doerfert 22892687f7 [FIX] Simplify and correct preloading of base pointer origin
To simplify and correct the preloading of a base pointer origin, e.g.,
  the base pointer for the current indirect invariant load, we now just
  check if there is an invariant access class that involves the base
  pointer of the current class.

llvm-svn: 251962
2015-11-03 19:15:33 +00:00
Johannes Doerfert 475d8e3f42 [FIX] Ensure base pointer origin was preloaded already
If a base pointer of a preloaded value has a base pointer origin, thus it is
  an indirect invariant load, we have to make sure the base pointer origin is
  preloaded first.

llvm-svn: 251946
2015-11-03 16:49:02 +00:00
Johannes Doerfert 3181c2ef72 [FIX] Correctly update SAI base pointer
If a base pointer load is preloaded, we have change the base pointer of
  the derived SAI. However, as the derived SAI relationship is is
  coarse grained, we need to check if we actually preloaded the base
  pointer or a different element of the base pointer SAI array.

llvm-svn: 251881
2015-11-03 01:42:59 +00:00
Johannes Doerfert 907456fe04 [FIX] Use appropriately sized types for big constants
llvm-svn: 251869
2015-11-03 00:26:22 +00:00
Tobias Grosser f423c1200f Remove old and redundant options
We remove -polly-detect-unprofitable and -polly-no-early-exit. Both have been
superseeded by -polly-process-unprofitable and were only kept as aliases for
our buildbots to continue to work. As all buildbots have been moved to the new
options, we can now remove the old ones for good.

llvm-svn: 251787
2015-11-02 10:13:32 +00:00
Tobias Grosser 3e9560200f RegionGenerator: Clear local maps after statement construction
These maps are only needed during the construction of a single region statement.
Clearing them is important, as we otherwise get an assert in case some of the
referenced values are erased before the RegionGenerator is deleted.

llvm-svn: 251341
2015-10-26 20:41:53 +00:00
Tobias Grosser a3f6edaee1 BlockGenerator: Do not assert when finding model PHI nodes defined outside the scop
Such PHI nodes can not only appear in the ExitBlock of the Scop, but indeed
any scalar PHI node above the scop and used in the scop is modeled as scalar
read access.

llvm-svn: 251198
2015-10-24 19:01:09 +00:00
Tobias Grosser 27d742da59 BlockGenerator: Directly handle multi-exit PHI nodes
This change adds code to directly code-generate multi-exit PHI nodes, instead
of trying to reuse the EscapeMap infrastructure for this. Using escape maps
adds a level of indirection that is hard to understand and - more importantly -
breaks in certain cases.

Specifically, the original code relied on simplifyRegion() to split the original
PHI node in two PHI nodes, one merging the values coming from within the scop
and a second that merges the first PHI node with the values that come from
outside the scop. To generate code the first PHI node is then just handled like
any other in-scop value that is used somewhere outside the scop. This fails for
the case where all values from inside the scop are identical, as the first PHI
node is in such cases automatically simplified and eliminated by LLVM right at
construction. As a result, there is no instruction that can be pass to the
EscapeMap handling, which means the references in the second PHI node are not
updated and may still reference values from within the original scop that do not
dominate it.

Our new code iterates directly over all modeled ScopArrayInfo objects that
represent multi-exit PHI nodes and generates code for them without relying on
the EscapeMap infrastructure. Hence, it works also for the case where the first
PHI node is eliminated.

llvm-svn: 251191
2015-10-24 17:41:29 +00:00
Michael Kruse dc12222287 Synthesize phi arguments in incoming block
New values were always synthesized in the block of the instruction
that needed them. This is incorrect for PHI node whose' value must be
defined in the respective incoming block. This patch temporarily moves
the builder's insert point to the incoming block while synthesizing phi
node arguments. 

This fixes PR25241 (http://llvm.org/bugs/show_bug.cgi?id=25241)

llvm-svn: 250693
2015-10-19 09:19:25 +00:00
Johannes Doerfert af3e301a67 [FIX] Restructure invariant load equivalence classes
Sorting is replaced by a demand driven code generation that will pre-load a
  value when it is needed or, if it was not needed before, at some point
  determined by the order of invariant accesses in the program. Only in very
  little cases this demand driven pre-loading will kick in, though it will
  prevent us from generating faulty code. An example where it is needed is
  shown in:
    test/ScopInfo/invariant_loads_complicated_dependences.ll

  Invariant loads that appear in parameters but are not on the top-level (e.g.,
  the parameter is not a SCEVUnknown) will now be treated correctly.

Differential Revision: http://reviews.llvm.org/D13831

llvm-svn: 250655
2015-10-18 12:39:19 +00:00
Johannes Doerfert d8b6ad255f [FIX] Cast preloaded values
Preloaded values have to match the type of their counterpart in the
  original code and not the type of the base array.

llvm-svn: 250654
2015-10-18 12:36:42 +00:00
Johannes Doerfert 01978cfa0c Remove independent blocks pass
Polly can now be used as a analysis only tool as long as the code
  generation is disabled. However, we do not have an alternative to the
  independent blocks pass in place yet, though in the relevant cases
  this does not seem to impact the performance much. Nevertheless, a
  virtual alternative that allows the same transformations without
  changing the input region will follow shortly.

llvm-svn: 250652
2015-10-18 12:28:00 +00:00
Tobias Grosser b8d27aab7d Revert to original BlockGenerator::getOrCreateAlloca(MemoryAccess &Access)
Expressing this in terms of BlockGenerator::getOrCreateAlloca(const
ScopArrayInfo *Array) does not work as the MemoryAccess BasePtr is in case of
invariant load hoisting different to the ScopArrayInfo BasePtr. Until this is
investigated and fixed, we move back to code that just uses the baseptr of
MemoryAccess.

llvm-svn: 250637
2015-10-18 00:51:13 +00:00
Tobias Grosser a4f0988df5 BlockGenerator: Add getOrCreateAlloca(const ScopArrayInfo *Array)
This allows the caller to get the alloca locations of an array without the
need to thank if Array is a PHI or a non-PHI Array. We directly make use of this
in BlockGenerator::getOrCreateAlloca(MemoryAccess &Access).

llvm-svn: 250628
2015-10-17 22:16:00 +00:00
Michael Kruse 225f0d1ee2 Load/Store scalar accesses before/after the statement itself
Instead of generating implicit loads within basic blocks, put them 
before the instructions of the statment itself, including non-affine 
subregions. The region's entry node is dominating all blocks in the 
region and therefore the loaded value will be available there.

Implicit writes in block-stmts were already stored back at the end of 
the block. Now, also generate the stores of non-affine subregions when 
leaving the statement, i.e. in the exiting block.

This change is required for array-mapped implicits ("De-LICM") to 
ensure that there are no dependencies of demoted scalars within 
statments. Statement load all required values, operator on copied in 
registers, and then write back the changed value to the demoted memory. 
Lifetimes analysis within statements becomes unecessary.

Differential Revision: http://reviews.llvm.org/D13487

llvm-svn: 250625
2015-10-17 21:36:00 +00:00
Tobias Grosser ffa2446f28 BlockGenerator: Register outside users of scalars directly
Instead of checking at code generation time for each ScopStmt if a scalar has
external uses, we just iterate over the ScopArrayInfo descriptions we have and
check each of these for possible external uses.

Besides being somehow clearer, this approach has the benefit that we will always
create valid LLVM-IR even in case we disable the code generation of ScopStmt
bodies e.g. for testing purposes.

llvm-svn: 250608
2015-10-17 08:54:13 +00:00
Tobias Grosser 26e59ee746 Drop unused parameter from handleOutsideUsers
llvm-svn: 250606
2015-10-17 08:25:54 +00:00
Tobias Grosser 1c55e2b7f3 Also add -polly-no-early-exit back until LNT is restarted
llvm-svn: 249975
2015-10-11 13:49:27 +00:00
Johannes Doerfert f363ed9804 [NFC] Move helper functions to ScopHelper
Helper functions in the BlockGenerators.h/cpp introduce dependences
  from the frontend to the backend of Polly. As they are used in
  ScopDetection, ScopInfo, etc. we move them to the ScopHelper file.

llvm-svn: 249919
2015-10-09 23:40:24 +00:00
Johannes Doerfert 697fdf891c Consolidate invariant loads
If a (assumed) invariant location is loaded multiple times we
  generated a parameter for each location. However, this caused compile
  time problems for several benchmarks (e.g., 445_gobmk in SPEC2006 and
  BT in the NAS benchmarks). Additionally, the code we generate is
  suboptimal as we preload the same location multiple times and perform
  the same checks on all the parameters that refere to the same value.

  With this patch we consolidate the invariant loads in three steps:
    1) During SCoP initialization required invariant loads are put in
       equivalence classes based on their pointer operand. One
       representing load is used to generate a parameter for the whole
       class, thus we never generate multiple parameters for the same
       location.
    2) During the SCoP simplification we remove invariant memory
       accesses that are in the same equivalence class. While doing so
       we build the union of all execution domains as it is only
       important that the location is at least accessed once.
    3) During code generation we only preload one element of each
       equivalence class with the unified execution domain. All others
       are mapped to that preloaded value.

Differential Revision: http://reviews.llvm.org/D13338

llvm-svn: 249853
2015-10-09 17:12:26 +00:00
Johannes Doerfert 09e3697f44 Allow invariant loads in the SCoP description
This patch allows invariant loads to be used in the SCoP description,
  e.g., as loop bounds, conditions or in memory access functions.

  First we collect "required invariant loads" during SCoP detection that
  would otherwise make an expression we care about non-affine. To this
  end a new level of abstraction was introduced before
  SCEVValidator::isAffineExpr() namely ScopDetection::isAffine() and
  ScopDetection::onlyValidRequiredInvariantLoads(). Here we can decide
  if we want a load inside the region to be optimistically assumed
  invariant or not. If we do, it will be marked as required and in the
  SCoP generation we bail if it is actually not invariant. If we don't
  it will be a non-affine expression as before. At the moment we
  optimistically assume all "hoistable" (namely non-loop-carried) loads
  to be invariant. This causes us to expand some SCoPs and dismiss them
  later but it also allows us to detect a lot we would dismiss directly
  if we would ask e.g., AliasAnalysis::canBasicBlockModify(). We also
  allow potential aliases between optimistically assumed invariant loads
  and other pointers as our runtime alias checks are sound in case the
  loads are actually invariant. Together with the invariant checks this
  combination allows to handle a lot more than LICM can.

  The code generation of the invariant loads had to be extended as we
  can now have dependences between parameters and invariant (hoisted)
  loads as well as the other way around, e.g.,
    test/Isl/CodeGen/invariant_load_parameters_cyclic_dependence.ll
  First, it is important to note that we cannot have real cycles but
  only dependences from a hoisted load to a parameter and from another
  parameter to that hoisted load (and so on). To handle such cases we
  materialize llvm::Values for parameters that are referred by a hoisted
  load on demand and then materialize the remaining parameters. Second,
  there are new kinds of dependences between hoisted loads caused by the
  constraints on their execution. If a hoisted load is conditionally
  executed it might depend on the value of another hoisted load. To deal
  with such situations we sort them already in the ScopInfo such that
  they can be generated in the order they are listed in the
  Scop::InvariantAccesses list (see compareInvariantAccesses). The
  dependences between hoisted loads caused by indirect accesses are
  handled the same way as before.

llvm-svn: 249607
2015-10-07 20:17:36 +00:00
Johannes Doerfert 521dd5842f Move the ValueMapT declaration out of BlockGenerator
Value maps are created and used in many places and it is not always
  possible to include CodeGen/Blockgenerators.h. To this end, ValueMapT
  now lives in the ScopHelper.h which does not have any dependences itself.

  This patch also replaces uses of different other value map types with
  the ValueMapT.

llvm-svn: 249606
2015-10-07 20:15:56 +00:00
Tobias Grosser 6bab9f800b IRBuilder: Use Map.lookup instead of Map.find [NFC]
This simplifies the code.

Suggested-by: Johannes Doerfert
llvm-svn: 249545
2015-10-07 13:35:20 +00:00
Tobias Grosser d79cb0378b IRBuilder: Ensure we do not add empty map elements
Do not use "Map[Key] == nullptr" to check if a Key is in the map, but use
"Map.find(Key) == Map.end()". Map[Key] always adds Key into the map, a
side-effect we do not want.

Found by inspection. This is hard to test outside of a targetted unit test,
which seems too much overhead for this individual issue.

llvm-svn: 249544
2015-10-07 13:19:06 +00:00