3028 Commits

Author SHA1 Message Date
Siddharth Bhat db5dd14cbb [DependenceInfo] Replace use of deprecated isl_dim_n_out [NFC]
Change isl_dim_n_out to isl_map_dim(*, isl_dim_out)

llvm-svn: 298075
2017-03-17 12:59:01 +00:00
Siddharth Bhat 65f3d5201e [DependenceInfo] Track may-writes and build flow information in
Dependences::calculateDependences.

This ensures that we handle may-writes correctly when building
dependence information. Also add a test case checking correctness of
may-write information. Not handling it before was an oversight.

Differential Revision: https://reviews.llvm.org/D31075

llvm-svn: 298074
2017-03-17 12:31:28 +00:00
Tobias Grosser 8a6e605e96 [ScopInfo] Do not take inbounds assumptions [NFC]
For experiments it is sometimes helpful to not take any inbounds assumptions.
Add a new option "-polly-ignore-inbounds" which does precisely this.

llvm-svn: 298073
2017-03-17 12:26:58 +00:00
Tobias Grosser b58ed8d3cd [ScopInfo] Do not try to eliminate parameter dimensions that do not exist
In subsequent changes we will make Polly a little bit more lazy in adding
parameter dimensions to different sets. As a result, not all parameters will
always be part of the parameter space. This change ensures that we do not use
the '-1' returned when a parameter dimension cannot be found, but instead
simply skip the elimination of the non-existent dimension.

llvm-svn: 298054
2017-03-17 09:02:53 +00:00
Tobias Grosser 941cb7d979 [ScopInfo] Do not expand getDomains() to full parameter space.
For several years, isl has been able to perform most operations on sets with
differing parameter spaces by expanding the parameter space on demand, relying
on named isl ids to distinguish different parameter dimensions.

By not always expanding to full dimensionality, the sets remain smaller and can
likely be operated on faster. This change by itself did not yet result in
measurable performance benefits, but it is a step in the right direction,
needed to ensure that subsequent changes can indeed work with lower-dimensional
sets and that these sets do not get blown up by accident when later intersected
with the domain context.

llvm-svn: 298053
2017-03-17 09:02:50 +00:00
Tobias Grosser f4fe34bfb8 Update to isl-0.18-387-g3fa6191
This is a normal / regular maintenance update.

llvm-svn: 297999
2017-03-16 21:33:20 +00:00
Siddharth Bhat 65c4026992 Set Dependences::RED to be non-null once Dependences::calculateDependences()
occurs, even if there is no actual reduction. This ensures correctness
with isl operations.

llvm-svn: 297981
2017-03-16 20:06:49 +00:00
Michael Kruse 5545407fa4 [ScopInfo] Introduce ScopStmt::getSurroundingLoop(). NFC.
Introduce ScopStmt::getSurroundingLoop() to replace getFirstNonBoxedLoopFor.

getSurroundingLoop() returns the precomputed surrounding/first non-boxed
loop. Except in ScopDetection, the list of boxed loops is only used to
get the surrounding loop. getFirstNonBoxedLoopFor also requires LoopInfo
at every use which is not necessarily available everywhere where we may
want to use it.

Differential Revision: https://reviews.llvm.org/D30985

llvm-svn: 297899
2017-03-15 22:16:43 +00:00
Tobias Grosser d614b3e6bd Preserve the isl-noexceptions.h C++ bindings when updating isl
The bindings currently need to be generated manually, as they are not yet
part of the official isl distribution. Hence, we keep them across updates
assuming they only need to be updated when new functions or functionality
should be exposed.

llvm-svn: 297710
2017-03-14 07:46:28 +00:00
Tobias Grosser 9c19a0e16a Add back header file that was accidentally dropped in previous update
llvm-svn: 297709
2017-03-14 07:39:05 +00:00
Tobias Grosser 593ebdfbd1 Update to isl-0.18-369-g5e613c6
This is a regular maintenance update.

llvm-svn: 297708
2017-03-14 07:33:26 +00:00
Tobias Grosser c9d4cb2f42 [ScheduleOptimizer] Allow tiling after fusion
In ScheduleOptimizer::isTileableBand(), allow the case in which
the band node's child is an isl_schedule_sequence_node and its
grandchildren isl_schedule_leaf_nodes. This case can arise when
two or more statements are fused by the isl scheduler.

The tile_after_fusion.ll test has two statements in separate
loop nests and checks whether they are tiled after being fused
when polly-opt-fusion equals "max".

Reviewers: grosser

Subscribers: gareevroman, pollydev

Tags: #polly

Contributed-by: Theodoros Theodoridis <theodort@student.ethz.ch>

Differential Revision: https://reviews.llvm.org/D30815

llvm-svn: 297587
2017-03-12 19:02:31 +00:00
Tobias Grosser de244eb450 Possible error in doc comment
If a SCoP is most probably sequential, then it's better to run it on a CPU.
Hence, there's no point in running it on a GPU.

Reviewers: grosser

Subscribers: nemanjai

Tags: #polly

Contributed-by: Singapuram Sanjay <singapuram.sanjay@gmail.com>

Differential Revision: https://reviews.llvm.org/D30864

llvm-svn: 297578
2017-03-12 08:19:01 +00:00
Tobias Grosser b2347dc241 [isl++] Add missing /* implicit */ marker
llvm-svn: 297577
2017-03-12 08:17:50 +00:00
Tobias Grosser 5ac963743f [isl++] Add last set of missing isl:: prefixes to increase consistency [NFC]
llvm-svn: 297558
2017-03-11 07:58:12 +00:00
Tobias Grosser 9cc7e3561d [unittest] Do not convert large unsigned long to isl::val
Currently the isl::val constructor only takes a signed long as parameter, which
on Windows is only 32 bits wide and can consequently not be used to obtain the
same result when loading from the expression '(1ull << 32) - 1' that we get
when loading this value via isl_val_int_from_ui or when loading the value
on Linux systems with 64-bit long data types. We avoid this issue by performing
the shift and subtraction within the isl::val.
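
The truncation described above can be reproduced without isl. The following is
a minimal sketch, assuming fixed-width integer types stand in for the
platform-dependent 'long' (32 bits on Windows, 64 bits on typical Linux
systems). The helper names are hypothetical and not part of isl or Polly.

```cpp
#include <cstdint>

// Fixed-width stand-ins for 'long' (assumption for illustration only).
using long_llp64 = std::int32_t; // 'long' on 64-bit Windows
using long_lp64 = std::int64_t;  // 'long' on typical 64-bit Linux

// Hypothetical helper: squeeze a constant through a signed 'long' parameter
// of the given width, as a constructor taking 'long' would.
template <typename Long>
std::int64_t viaSignedLong(std::uint64_t Value) {
  return static_cast<std::int64_t>(static_cast<Long>(Value));
}

// Hypothetical helper: pass only small operands through 'long' and perform
// the shift and subtraction on the wide value itself.
template <typename Long>
std::int64_t shiftThenSubtract() {
  std::int64_t Wide = static_cast<Long>(1); // stands in for the wide value
  return (Wide << 32) - static_cast<Long>(1);
}
```

With the 32-bit stand-in, viaSignedLong cannot represent (1ull << 32) - 1,
while shiftThenSubtract yields the same result for both widths because only
the small operand 1 ever passes through the narrow type.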

It would be nice to teach the isl++ bindings to construct isl::val from other
integer types, but the current interface is not sufficient to do so. If
constructors from both signed long and unsigned long are exposed, then integer
literals that are of type 'int' and which must be converted to 'long' to match
the constructor have two ambiguous constructors to choose from, which results
in a compiler error. The right solution is likely to additionally expose
constructors from signed and unsigned int, but these are not yet available in
the isl C interface and adding those ad hoc to our bindings is something I would
like to avoid for now. We should address this issue with a proper discussion
on the isl side.
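
The overload ambiguity can be illustrated with a small stand-alone type. This
is only a sketch: 'Val' is a hypothetical class, not the real isl::val, and the
int/unsigned int constructors show the direction suggested above rather than
anything that exists in the bindings.

```cpp
#include <string>

// Hypothetical value type used only to show overload resolution.
struct Val {
  std::string Chosen;

  // With only these two constructors, 'Val V(1);' does not compile: the
  // literal 1 has type 'int', and the conversions int -> long and
  // int -> unsigned long are equally good, so the call is ambiguous.
  explicit Val(long) : Chosen("long") {}
  explicit Val(unsigned long) : Chosen("unsigned long") {}

  // Exact-match overloads for int and unsigned int remove the ambiguity.
  explicit Val(int) : Chosen("int") {}
  explicit Val(unsigned int) : Chosen("unsigned int") {}
};
```

With all four constructors present, Val(1) selects the int overload and
Val(1ul) the unsigned long overload.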

llvm-svn: 297522
2017-03-10 22:25:39 +00:00
Tobias Grosser d67d368e12 [isl++] Add namespace prefixes to isl::ctx and isl::stat
These were missed in r297478. We add them for consistency.

llvm-svn: 297520
2017-03-10 22:10:19 +00:00
Tobias Grosser 30a06dce68 [isl++] Drop warning about experimental status
As most discussions about these bindings have concluded and only the final
patch review on the isl mailing list is missing, we drop the experimental
warning tag to match the patchset we will submit to isl, which is expected to
not change notably any more.

llvm-svn: 297519
2017-03-10 22:10:15 +00:00
Tobias Grosser 9839774e5d [isl++] Do not use enum prefix
Instead of declaring a function as:

  inline val plain_get_val_if_fixed(enum dim type, unsigned int pos) const;

we use:

  inline isl::val plain_get_val_if_fixed(isl::dim type, unsigned int pos) const;

The first argument caused the following compile time error on windows:

  "error C3431: 'dim': a scoped enumeration cannot be redeclared as an
  unscoped enumeration"

In some cases it is sufficient to just drop the 'enum' prefix, but for example
for isl::set the 'enum class dim' type collides with the function name
isl::set::dim and can consequently not be referenced. To avoid such kind of
ambiguities in the future we add the isl:: prefix consistently to all types
used.
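
The name collision can be sketched outside of isl. In the following minimal
example the namespace 'demo' and all members are hypothetical; it only shows
why a class with a member function named 'dim' has to refer to the scoped
enumeration by its qualified name.

```cpp
namespace demo {

// Scoped enumeration, analogous in spirit to the 'enum class dim' above.
enum class dim { in, out, param };

class set {
  unsigned NumIn = 2, NumOut = 1, NumParam = 3;

public:
  // Member function sharing its name with the enumeration. Inside the class,
  // the unqualified name 'dim' refers to this function, so the enumeration
  // is always written as 'demo::dim'.
  unsigned dim(demo::dim Type) const {
    switch (Type) {
    case demo::dim::in:
      return NumIn;
    case demo::dim::out:
      return NumOut;
    case demo::dim::param:
      return NumParam;
    }
    return 0;
  }

  // Uses both names: the qualified enumeration and the member function.
  bool has(demo::dim Type) const { return dim(Type) > 0; }
};

} // namespace demo
```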

Reported-by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 297478
2017-03-10 17:01:30 +00:00
Michael Kruse 0446d81e2d [Simplify] Add -polly-simplify pass.
This new pass removes unnecessary accesses and writes. It currently
supports 2 simplifications, but more are planned.

It removes write accesses that write a loaded value back to the location
it was loaded from. It is a typical artifact from DeLICM. Removing it
will get rid of bogus dependencies later in dependency analysis.

It also removes statements without side-effects. ScopInfo already
removes these, but the removal of unnecessary writes can result in
more side-effect free statements.
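
The first simplification can be sketched on a toy model. The code below is an
illustration under strong assumptions, not the pass itself: 'Access' and
'dropRedundantWrites' are hypothetical names, locations are plain strings, and
aliasing is ignored entirely.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Toy access record: a read or write of value 'ValueId' at 'Location'.
struct Access {
  bool IsWrite;
  std::string Location;
  int ValueId;
};

// Drop every write that stores a value back to the location it was loaded
// from, provided no write to that location lies between the load and the
// store. The scan is conservative: any intervening write keeps the store.
inline std::vector<Access>
dropRedundantWrites(const std::vector<Access> &Accs) {
  std::vector<Access> Result;
  for (std::size_t I = 0; I < Accs.size(); ++I) {
    const Access &A = Accs[I];
    bool Redundant = false;
    if (A.IsWrite) {
      // Walk backwards over earlier accesses to the same location.
      for (std::size_t J = I; J-- > 0;) {
        const Access &Prev = Accs[J];
        if (Prev.Location != A.Location)
          continue;
        if (Prev.IsWrite)
          break; // location was overwritten since any earlier load
        if (Prev.ValueId == A.ValueId) {
          Redundant = true; // same value loaded from the same place
          break;
        }
      }
    }
    if (!Redundant)
      Result.push_back(A);
  }
  return Result;
}
```

For the sequence "load v1 from A[i]; store v1 to A[i]" only the load remains.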

Differential Revision: https://reviews.llvm.org/D30820

llvm-svn: 297473
2017-03-10 16:05:24 +00:00
Tobias Grosser 3e618c33fe [DeadCodeElimination] Translate to C++ bindings
This pass is a small and self-contained example of a piece of code that was
written with the isl C interface. The diff of this change nicely shows how the
C++ bindings can improve the readability of the code by avoiding the long C
function names and by avoiding any need for memory management.

As you will see, no calls to isl_*_copy or isl_*_free are needed anymore.
Instead the C++ interface takes care of automatically managing the objects.
This may introduce internally additional copies, but due to the isl reference
counting, such copies are expected to be cheap. For performance critical
operations, we will later exploit move semantics to eliminate unnecessary
copies that have shown to be costly.

Below we give a set of examples that shows the benefit of the C++ interface vs.
the pure C interface.

Check properties
----------------

Before:

  if (isl_aff_is_zero(aff) ||  isl_aff_is_one(aff))
    return true;

After:

  if (Aff.is_zero() || Aff.is_one())
    return true;

Type conversion
---------------

Before:

  isl_union_pw_multi_aff *UPMA = isl_union_pw_multi_aff_from_union_map(umap);

After:

  isl::union_pw_multi_aff UPMA = UMap;

Type construction
-----------------

Before:

  auto *Empty = isl_union_map_empty(space);

After:

  auto Empty = isl::union_map::empty(Space);

Operations
----------

Before:

  set = isl_union_set_intersect(set, set2);

After:

  Set = Set.intersect(Set2);

The use of isl::boolean in return types also increases the robustness
of Polly, as on conversion to true or false, we verify that no isl_bool_error
has been returned and assert in case an error was returned. Before this change
we would have just ignored the error and proceeded with (some) execution path.

Tags: #polly

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D30619

llvm-svn: 297466
2017-03-10 15:05:38 +00:00
Tobias Grosser 3cc57fa1e7 [unittest] Translate isl tests to C++ bindings
For this translation we introduce two functions, valFromAPInt and APIntFromVal,
to convert between isl::val and APInt. For now these are just proxies, but in
the future they will replace the current isl_val* based conversion functions.

The isl unit test cases benefit most from the new isl::boolean (from Michael
Kruse), which can be explicitly cast to bool and which -- as part of this cast
-- emits a check that no error condition has been triggered so far. This allows
us to simplify

  EXPECT_EQ(isl_bool_true, isl_val_is_zero(IslZero));

to

  EXPECT_TRUE(IslZero.is_zero());

This simplification also becomes very clear in operator==, which changes from

  auto IsEqual = isl_set_is_equal(LHS.keep(), RHS.keep());
  EXPECT_NE(isl_bool_error, IsEqual);
  return IsEqual;

to just

  return bool(LHS.is_equal(RHS));

Some background for non-isl users. The isl C interface has an isl_bool type,
which can be either true, false, or error. Hence, whenever a function returns
a value of type isl_bool, an explicit error check should be considered. By
using isl::boolean, we can just cast the isl::boolean to 'bool' or simply use
the isl::boolean in a context where it will automatically be cast to bool
(e.g., in an if-condition). When doing so, the C++ bindings automatically add
code that verifies that the return value is not an error code. If it is, the
program will warn about this and abort. For cases where errors are expected,
isl::boolean provides checks such as boolean::is_true_or_error() or
boolean::is_true_no_error() to explicitly control program behavior in case of
error conditions.
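
The behavior described in this paragraph can be sketched with a small
tri-state wrapper. This is a hypothetical stand-in written for illustration
(namespace 'sketch', enum 'tristate'), not the generated isl::boolean class;
only the member names is_true_or_error and is_true_no_error are taken from the
text above.

```cpp
#include <cstdio>
#include <cstdlib>

namespace sketch {

// Stand-in for the three states of isl_bool.
enum class tristate { error = -1, no = 0, yes = 1 };

class boolean {
  tristate Val;

public:
  explicit boolean(tristate V) : Val(V) {}

  bool is_error() const { return Val == tristate::error; }
  // Explicit queries for code paths where errors are expected.
  bool is_true_or_error() const { return Val != tristate::no; }
  bool is_true_no_error() const { return Val == tristate::yes; }

  // Conversion to bool checks for the error state first and aborts if an
  // error slipped through unnoticed.
  explicit operator bool() const {
    if (is_error()) {
      std::fprintf(stderr, "error state converted to bool\n");
      std::abort();
    }
    return Val == tristate::yes;
  }
};

} // namespace sketch
```

Because the conversion operator is explicit, the wrapper still works directly
in an if-condition, while accidental conversions to integers are rejected.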

Thanks to the new automatic memory management, we also can avoid many calls to
isl_*_free. For code that had previously been managed by IslPtr<>, many calls
to give/take/copy are eliminated.

Tags: #polly

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D30618

llvm-svn: 297464
2017-03-10 14:58:50 +00:00
Tobias Grosser 51ebda8c9d [FlattenAlgo] Translate to C++ bindings
Translate the full algorithm to use the new isl C++ bindings

This is a large piece of code that has been written with the Polly IslPtr<>
memory management tool, which only performed memory management, but did not
provide a method interface. As such the code was littered with calls to
give(), copy(), keep(), and take(). The diff of this change should give a
good example of how the new method interface simplifies the code by removing the
need for switching between managed types and C functions all the time
and consequently also the need to use the long C function names.

These are a couple of examples comparing the old IslPtr memory management
interface with the complete method interface.

Check properties
----------------

Before:

  if (isl_aff_is_zero(Aff.get()) ||  isl_aff_is_one(Aff.get()))
    return true;

After:

  if (Aff.is_zero() || Aff.is_one())
    return true;

Type conversion
---------------

Before:

  isl_union_pw_multi_aff *UPMA =
      give(isl_union_pw_multi_aff_from_union_map(UMap.copy()));

After:

  isl::union_pw_multi_aff UPMA = UMap;

Type construction
-----------------

Before:

  auto Empty = give(isl_union_map_empty(Space.copy()));

After:

  auto Empty = isl::union_map::empty(Space);

Operations
----------

Before:

  Set = give(isl_union_set_intersect(Set.copy(), Set2.copy()));

After:

  Set = Set.intersect(Set2);

Tags: #polly

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D30617

llvm-svn: 297463
2017-03-10 14:55:58 +00:00
Tobias Grosser 4c24e57965 Add method interface to isl C++ bindings
The isl C++ binding method interface introduces a thin C++ layer that allows
calling isl methods directly on the memory managed C++ objects. This makes the
relevant methods directly available via code-completion interfaces, allows for
the use of overloading, conversion constructors, and many other nice C++
features that make using isl a lot easier.

The individual features will be highlighted in the subsequent commits.

Tags: #polly

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D30616

llvm-svn: 297462
2017-03-10 14:53:00 +00:00
Tobias Grosser deaef15f52 Introduce isl C++ bindings, Part 1: value_ptr style interface
Over the last couple of months several authors of independent isl C++ bindings
worked together to jointly design an official set of isl C++ bindings which
combines their experience in developing isl C++ bindings. The new bindings have
been designed around a value pointer style interface and remove the need for
explicit pointer management and instead use C++ language features to manage isl
objects.

This commit introduces the smart-pointer part of the isl C++ bindings and
replaces the current IslPtr<T> classes, which served the very same purpose, but
had to be manually maintained. Instead, we now rely on automatically generated
classes for each isl object, which provide value_ptr semantics.

An isl object has the following smart pointer interface:

    inline set manage(__isl_take isl_set *ptr);

    class set {
      friend inline set manage(__isl_take isl_set *ptr);
      isl_set *ptr = nullptr;
      inline explicit set(__isl_take isl_set *ptr);

    public:
      inline set();
      inline set(const set &obj);
      inline set &operator=(set obj);
      inline ~set();
      inline __isl_give isl_set *copy() const &;
      inline __isl_give isl_set *copy() && = delete;
      inline __isl_keep isl_set *get() const;
      inline __isl_give isl_set *release();
      inline bool is_null() const;
    };

The interface and behavior of the new value pointer style classes is inspired
by http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3339.pdf, which
proposes a std::value_ptr, a smart pointer that applies value semantics to its
pointee.

We currently only provide a limited set of public constructors and instead
provide a global overloaded type constructor method "isl::obj
isl::manage(isl_obj *)", which allows converting an isl_set* to an isl::set by
calling 'S = isl::manage(s)'. This pattern models the make_unique() constructor
for unique pointers.

The next two functions isl::obj::get() and isl::obj::release() are taken
directly from the std::value_ptr proposal:

S.get() extracts the raw pointer of the object managed by S.
S.release() extracts the raw pointer of the object managed by S and sets the
object in S to null.

We additionally add isl::obj::copy(). S.copy() returns a raw pointer referring
to a copy of S, which is a shortcut for "isl::obj(oldobj).release()", a
functionality commonly needed when interacting directly with the isl C
interface where all methods marked with __isl_take require consumable raw
pointers.

S.is_null() checks if S manages a pointer or if the managed object is currently
null. We add this function to provide a more explicit way to check if the
pointer is empty compared to a direct conversion to bool.
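
The semantics of manage(), copy(), get(), release() and is_null() can be
sketched against a mock reference-counted C object. Everything below is an
assumption made for illustration: 'mock_set' and its three C-style functions
are hypothetical stand-ins for isl_set, and 'sketch::set' is hand-written, not
the generated class.

```cpp
#include <utility>

// Mock C API standing in for a reference-counted isl object (hypothetical).
struct mock_set {
  int Ref;
  int Payload;
};
inline mock_set *mock_set_alloc(int Payload) { return new mock_set{1, Payload}; }
inline mock_set *mock_set_copy(mock_set *S) {
  ++S->Ref; // a "copy" only bumps the reference count
  return S;
}
inline void mock_set_free(mock_set *S) {
  if (S && --S->Ref == 0)
    delete S;
}

namespace sketch {
class set;
inline set manage(mock_set *Ptr);

class set {
  friend set manage(mock_set *Ptr);
  mock_set *Ptr = nullptr;
  explicit set(mock_set *P) : Ptr(P) {} // takes ownership of P

public:
  set() = default;
  set(const set &Obj) : Ptr(Obj.copy()) {}
  set &operator=(set Obj) {
    std::swap(Ptr, Obj.Ptr);
    return *this;
  }
  ~set() { mock_set_free(Ptr); }

  // Returns a new reference that the caller owns and may hand to a
  // consuming C function.
  mock_set *copy() const & { return Ptr ? mock_set_copy(Ptr) : nullptr; }
  mock_set *copy() && = delete;
  // Returns the raw pointer without transferring ownership.
  mock_set *get() const { return Ptr; }
  // Transfers ownership to the caller and leaves this object null.
  mock_set *release() {
    mock_set *Tmp = Ptr;
    Ptr = nullptr;
    return Tmp;
  }
  bool is_null() const { return Ptr == nullptr; }
};

inline set manage(mock_set *Ptr) { return set(Ptr); }
} // namespace sketch
```

Copying a sketch::set only increments the mock reference count, which mirrors
why copies made by the bindings are expected to be cheap.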

This commit also introduces a couple of polly-specific extensions that cover
features currently not handled by the official isl C++ bindings draft, but
which have been provided by IslPtr<T> and are consequently added to avoid code
churn. These extensions include:

	- operator bool() : Conversion from objects to bool
	- construction from nullptr_t
	- get_ctx() method
	- take/keep/give methods, which match the currently used naming
	  convention of IslPtr<T> in Polly. They just forward to
	  (release/get/manage).
	- raw_ostream printers

We expect that these extensions are over time either removed or upstreamed to
the official isl bindings.

We also export a couple of classes that have not yet been exported in isl (e.g.,
isl::space)

As part of the code review, the following two questions were asked:

- Why do we not use a standard smart pointer?

std::value_ptr was a proposal that has not been accepted. It is consequently
not available in the standard library. Even if it were available, we want
to expand this interface with a complete method interface that is conveniently
available from each managed pointer. The most direct way to achieve this is to
generate a specialized value style pointer class for each isl object type and
add any additional methods to this class. The relevant changes follow in
subsequent commits.

- Why do we not use templates or macros to avoid code duplication?

It is certainly possible to use templates or macros, but as this code is
auto-generated there is no need to make writing this code more efficient. Also,
most of these classes will be specialized with individual member functions in
subsequent commits, such that there will be little code reuse to exploit. Hence,
we decided not to do so at the moment.

These bindings are not yet officially part of isl, but the draft is already very
stable. The smart pointer interface itself has not changed for several months.
Adding this code to Polly is against our normal policy of only importing
official isl code. In this case however, we make an exception to showcase a
non-trivial use case of these bindings which should increase confidence in these
bindings and will help upstreaming them to isl.

Tags: #polly

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D30325

llvm-svn: 297452
2017-03-10 11:41:03 +00:00
Tobias Grosser e5671e54c0 Update to isl-0.18-356-g0b05d01
This is a regular maintenance update.

llvm-svn: 297449
2017-03-10 09:17:55 +00:00
Michael Kruse 0666a76aac [Support] Correct filename in file head comment. NFC.
llvm-svn: 297430
2017-03-10 00:36:54 +00:00
Michael Kruse e4292bf086 [Support] Add -polly-dump-module pass.
This pass allows writing the LLVM-IR just before and after the Polly
passes to a file.

Dumping the IR before Polly helps reproducing bugs that occur in code
generated by clang. It is the only reliable way to get the IR that
triggers a bug. The alternative is to emit the IR with

    clang -c -emit-llvm -S -o dump.ll

then pass it through all optimization passes

    opt dump.ll -basicaa -sroa ... -S -o optdump.ll

to then reproduce the error with

    opt optdump.ll -polly-opt-isl -polly-codegen -analyze

However, the IR is not the same. -O3 uses a PassBuilder that creates passes
with different parameters than the default.

Dumping the IR after Polly is useful to compare a miscompilation with
a known-good configuration.

Differential Revision: https://reviews.llvm.org/D30788

llvm-svn: 297415
2017-03-09 22:29:58 +00:00
Michael Kruse a9520b94d5 [Cmake] Generate a PollyConfig.cmake.
Generate a PollyConfig.cmake for use with Cmake's find_package in
out-of-tree projects.

Contributed-by: Philip Pfaffe <philip.pfaffe@gmail.com>

Differential Revision: https://reviews.llvm.org/D30495

llvm-svn: 297395
2017-03-09 17:58:20 +00:00
Tobias Grosser 8bd7f3c0a5 [ScopDetect/Info] Allow unconditional hoisting of loads from dereferenceable ptrs
In case LLVM pointers are annotated with !dereferenceable attributes/metadata
or LLVM can look at the allocation from which a pointer is derived, we can know
that dereferencing pointers is safe and can be done unconditionally. We use this
information to prove certain pointers safe to hoist and then hoist them
unconditionally.

llvm-svn: 297375
2017-03-09 11:36:00 +00:00
Michael Kruse 9fb3ab1b19 [DeLICM] Add -polly-delicm-overapproximate-writes option.
One of the current limitations of DeLICM is that it only creates
PHI WRITEs that it knows are read by some PHI. Such writes may not span
all instances of a statement. Polly's code generator currently does not
support MemoryAccesses that are not executed in all instances
('partial accesses') and so has to give up on a possible mapping.

This workaround has once been suggested by Tobias Grosser: Try to
interpolate an arbitrary expansion to all instances. It will be checked
for possible conflicts with the existing Knowledge and can be applied if
the conflict checking result is that no semantics are changed.

Expansion is done by simplifying the mapping by coalescing with the hope
that coalescing will find a polyhedral 'rule' of the relevant map. It is
then 'gist'-ed using the domain of the relevant instances such that the
rule is expanded to the universe and finally intersected with the domain
of all statement instances.

The expansion makes conflicts more likely; the rule found may
still not encompass all statement instances; and the rule exposes
internals of isl's implementation of coalesce and gist. The latter means
that the result depends on how much effort the implementation invests
into finding a rule which may change between versions of isl. Trivial
implementations of gist and coalesce just return the input arguments.

A patch that makes codegen support partial accesses is in preparation
as well.

Differential Revision: https://reviews.llvm.org/D30763

llvm-svn: 297373
2017-03-09 11:23:22 +00:00
Michael Kruse 935b2a3654 [DeadCodeElim] Put -polly-dce-precise-steps into the Polly category.
llvm-svn: 297318
2017-03-08 23:25:35 +00:00
Michael Kruse 6744efa8d8 [ScopDetection] Only allow SCoP-wide available base pointers.
Simplify ScopDetection::isInvariant(). Essentially deny everything that
is defined within the SCoP and is not load-hoisted.

The previous understanding of "invariant" has a few holes:

- Expressions without side-effects that have only invariant arguments but
  are defined within the SCoP's region, with the exception of selects
  and PHIs. These should be part of the index expression derived by
  ScalarEvolution and not of the base pointer.

- Function calls that are !mayHaveSideEffects() (typically
  functions with "readnone nounwind" attributes). An example is given
  below.

      @C = external global i32
      declare float* @getNextBasePtr(float*, float*) readnone nounwind
      ...
      %ptr = call float* @getNextBasePtr(float* %A, float* %B)

  The call might return:

  * %A, so %ptr aliases with it in the SCoP
  * %B, so %ptr aliases with it in the SCoP
  * @C, so %ptr aliases with it in the SCoP
  * a new pointer every time it is called, such as malloc()
  * a pointer into the allocated block of one of the aforementioned
  * any of the above, at random at each call

  Hence, and in contrast to a comment in the base_pointer.ll regression
  test, %ptr is not necessarily the same all the time. It might also
  alias with anything and no AliasAnalysis can tell otherwise if the
  definition is external. It is hence not suitable in the role of a
  base pointer.

The practical problem with base pointers defined in SCoP statements is
that they are not available globally in the SCoP. The statement instance
must be executed first before the base pointer can be used. This is no
problem if the base pointer is transferred as a scalar value between
statements. Uses of MemoryAccess::setNewAccessRelation may add a use of
the base pointer anywhere in the array. setNewAccessRelation is used by
JSONImporter, DeLICM and D28518. Indeed, BlockGenerator currently
assumes that base pointers are available globally and generates invalid
code for new access relation (referring to the base pointer of the
original code) if not, even if the base pointer would be available in
the statement.

This could be fixed with some added complexity and restrictions. The
ExprBuilder must look up the local BBMap and code that calls
setNewAccessRelation must check whether the base pointer is available
first.

The code would still be incorrect in the presence of aliasing. There
is the switch -polly-ignore-aliasing to explicitly allow this, but
it is hardly a justification for the additional complexity. It would
still be mostly useless because in most cases either getNextBasePtr()
has external linkage in which case the readnone nounwind attributes
cannot be derived in the translation unit itself, or is defined in the
same translation unit and gets inlined.

Reviewed By: grosser

Differential Revision: https://reviews.llvm.org/D30695

llvm-svn: 297281
2017-03-08 15:14:46 +00:00
Michael Kruse 5a4ec5c42b [ScopDetection] Require LoadInst base pointers to be hoisted.
Only when load-hoisted can we be sure the base pointer is invariant
during the SCoP's execution. Most of the time it would be added to
the required hoists for the alias checks anyway, except with
-polly-ignore-aliasing, -polly-use-runtime-alias-checks=0 or if
AliasAnalysis is already sure it doesn't alias with anything
(for instance if there is no other pointer to alias with).

Two more parts in Polly assume that this load-hoisting took place:
- setNewAccessRelation() which contains an assert which tests this.
- BlockGenerator which would use the base ptr from the original
  code if not load-hoisted (if the access expression is regenerated)

Differential Revision: https://reviews.llvm.org/D30694

llvm-svn: 297195
2017-03-07 20:28:43 +00:00
Tobias Grosser a0b85963ba Update isl to isl-0.18-336-g1e193d9
This is a regular maintenance update

llvm-svn: 297169
2017-03-07 17:53:34 +00:00
Tobias Grosser 6c9958e0b3 [tests] Make sure tests do not end in 'unreachable' - Part III
There is no point in optimizing unreachable code, hence our test cases should
always return.

This commit is part of a series that makes Polly more robust in the presence of
unreachables.

llvm-svn: 297158
2017-03-07 16:28:53 +00:00
Tobias Grosser 2d233fb35d [tests] Update bounds-check elimination test cases
These test cases should work in combination with
https://reviews.llvm.org/D12676, but became outdated over time. Update them
in preparation of discussions with Daniel Berlin on how to represent unreachable
in the post-dominator tree.

llvm-svn: 297157
2017-03-07 16:17:58 +00:00
Tobias Grosser ce69e7b593 [ScopInfo] Avoid infinite loop during schedule construction
Our current scop modeling enters an infinite loop when trying to model code
that has unreachable instructions (e.g.,
test/ScopInfo/BoundChecks/single-loop.ll), as the number of basic blocks
returned by the LLVM Loop* does not include unreachable basic blocks that
branch off from the core loop body. This arises for example in the following
piece of code:

  for (i = 0; i < N; i++) {
    if (i > 1024)
      abort();            <- this abort might be translated to an
                             unreachable

    A[i] = ...
  }

This patch adds these unreachable basic blocks in our per loop basic block
count to ensure that the schedule construction does not assume a loop has been
processed completely, despite certain unreachable basic blocks still remaining.

The infinite loop is only observable in combination with
https://reviews.llvm.org/D12676 or a similar patch.

llvm-svn: 297156
2017-03-07 16:17:55 +00:00
Tobias Grosser 134a572951 [ScopDetection] Do not detect scops that exit to an unreachable
Scops that exit with an unreachable are today still permitted, but make little
sense to optimize. We therefore can already skip them during scop detection.
This speeds up scop detection in certain cases and also ensures that bugpoint
does not introduce unreachables when reducing test cases.

In practice this change should have little impact, as the performance of
unreachable code is unlikely to matter.

This commit is part of a series that makes Polly more robust in the presence
of unreachables.

llvm-svn: 297151
2017-03-07 15:50:43 +00:00
Tobias Grosser 87dcd46aa7 [tests] Make sure tests do not end in 'unreachable' - Part II
There is no point in optimizing unreachable code, hence our test cases should
always return.

This commit is part of a series that makes Polly more robust in the presence of
unreachables.

llvm-svn: 297150
2017-03-07 15:23:30 +00:00
Tobias Grosser 2dc1f547ae [tests] Make sure tests do not end in 'unreachable'
There is no point in optimizing unreachable code, hence our test cases should
always return.

This commit is part of a series that makes Polly more robust in the presence of
unreachables.

llvm-svn: 297147
2017-03-07 15:17:23 +00:00
Sanjoy Das b641a90529 Adapt to llvm change r296992 to unbreak the bots
r296992 made ScalarEvolution's CompareValueComplexity less aggressive,
and that broke the polly test being fixed in this change.  This change
explicitly bumps CompareValueComplexity in said test case to make it
pass.

Can someone from the Polly team please give me an idea of whether this
case is important enough to have
scalar-evolution-max-value-compare-depth be 3 by default?

llvm-svn: 296994
2017-03-06 01:12:16 +00:00
Tobias Grosser 7d136d952e [tests] Specify the dependence to NVPTX backend for Polly ACC test cases
Some Polly ACC test cases fail without a working NVPTX backend. We explicitly
specify this dependence in REQUIRES. Alternatively, we could have only marked
polly-acc as supported in case the NVPTX backend is available, but as we might
use other backends in the future, this does not seem to be the best choice.

For this to work, we also need to make the 'targets_to_build' information
available.

Suggested-by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 296853
2017-03-03 03:38:50 +00:00
Tobias Grosser 9d551da5c1 [test] Do not emit binary data to output
Suggested-by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 296852
2017-03-03 03:24:34 +00:00
Tobias Grosser 7a93d94a8f Revert "Currently broken by recent LLVM upstream changes"
This reverts commit r296579, which is not needed anymore as the relevant changes
in trunk have been reverted.

llvm-svn: 296817
2017-03-02 21:43:50 +00:00
Tobias Grosser 1c787e0b49 [ScopDetection] Do not allow required-invariant loads in non-affine region
These loads cannot be safely hoisted as the condition guarding the
non-affine region cannot be duplicated to also protect the hoisted load
later on. Today they are dropped in ScopInfo. By checking for this early, we
do not even try to model them and possibly can still optimize smaller regions
not containing this specific required-invariant load.

llvm-svn: 296744
2017-03-02 12:15:37 +00:00
Tobias Grosser c2f151084d [ScopInfo] Disable memory folding in case it results in multi-disjunct relations
Multi-disjunct access maps can easily result in inbound assumptions which
explode in case of many memory accesses and many parameters. This change reduces
compilation time of some larger kernel from over 15 minutes to less than 16
seconds.

Interesting is the test case test/ScopInfo/multidim_param_in_subscript.ll
which has a memory access

  [n] -> { Stmt_for_body3[i0, i1] -> MemRef_A[i0, -1 + n - i1] }

which requires folding, but where only a single disjunct remains. We can still
model this test case even when only using limited memory folding.

For people only reading commit messages, here is the comment that explains what
memory folding is:

To recover memory accesses with array size parameters in the subscript
expression we post-process the delinearization results.

We would normally recover from an access A[exp0(i) * N + exp1(i)] into an
array A[][N] the 2D access A[exp0(i)][exp1(i)]. However, another valid
delinearization is A[exp0(i) - 1][exp1(i) + N] which - depending on the
range of exp1(i) - may be preferable. Specifically, for cases where we
know exp1(i) is negative, we want to choose the latter expression.

As we commonly do not have any information about the range of exp1(i),
we do not choose one of the two options, but instead create a piecewise
access function that adds the (-1, N) offsets as soon as exp1(i) becomes
negative. For a 2D array such an access function is created by applying
the piecewise map:

[i,j] -> [i, j] :      j >= 0
[i,j] -> [i-1, j+N] :  j <  0

After this patch we generate only the first case, except for situations where
we can prove the first case to be invalid and can consequently select the
second without introducing disjuncts.
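
The piecewise map above can be checked with a minimal sketch. It assumes a row-major 2D array whose inner dimension has size n; the helper names linear_offset and fold are hypothetical and are not taken from Polly. The sketch only confirms that both branches address the same linear element and that the folded inner subscript is non-negative whenever -n <= j.

```python
def linear_offset(i, j, n):
    """Flat offset of element [i][j] in a 2D array with inner dimension size n."""
    return i * n + j


def fold(i, j, n):
    """Piecewise map: add the (-1, n) offsets once the inner subscript is negative."""
    if j >= 0:
        return (i, j)
    return (i - 1, j + n)


n = 10
for i, j in [(3, 4), (3, -1), (1, -10)]:
    fi, fj = fold(i, j, n)
    # Folding never changes which linear element is accessed.
    assert linear_offset(fi, fj, n) == linear_offset(i, j, n)
    # For -n <= j the folded inner subscript is non-negative.
    assert fj >= 0
```

Keeping only the first branch, as this patch does by default, avoids the second disjunct in the access relation at the cost of not modeling negative inner subscripts.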

llvm-svn: 296679
2017-03-01 21:11:27 +00:00
Tobias Grosser 24222c7357 Fix namespaces after clang-format update
llvm-svn: 296635
2017-03-01 15:54:27 +00:00
Tobias Grosser 6f9b60cf38 Currently broken by recent LLVM upstream changes
We mark it as XFAIL to get buildbots back to green, until the upstream changes
have been addressed.

llvm-svn: 296579
2017-03-01 04:34:44 +00:00
Tobias Grosser d7c4975349 [ScopInfo] Simplify inbounds assumptions under domain constraints
Without this simplification for a loop nest:

  void foo(long n1_a, long n1_b, long n1_c, long n1_d,
           long p1_b, long p1_c, long p1_d,
           float A_1[][p1_b][p1_c][p1_d]) {
    for (long i = 0; i < n1_a; i++)
      for (long j = 0; j < n1_b; j++)
        for (long k = 0; k < n1_c; k++)
          for (long l = 0; l < n1_d; l++)
            A_1[i][j][k][l] += i + j + k + l;
 }

the assumption:

  n1_a <= 0 or (n1_a > 0 and n1_b <= 0) or
  (n1_a > 0 and n1_b > 0 and n1_c <= 0) or
  (n1_a > 0 and n1_b > 0 and n1_c > 0 and n1_d <= 0) or
  (n1_a > 0 and n1_b > 0 and n1_c > 0 and n1_d > 0 and
   p1_b >= n1_b and p1_c >= n1_c and p1_d >= n1_d)

is taken rather than the simpler assumption:

  p1_b >= n1_b and p1_c >= n1_c and p1_d >= n1_d.

The former is less strict, as it allows arbitrary values of p1_* in case the
loop is not executed at all. However, in practice these precise constraints
explode when combined across different accesses and loops. For now it seems
to make more sense to take less precise, but more scalable constraints by
default. In case we find a practical example where more precise constraints
are needed, we can think about allowing such precise constraints in specific
situations where they help.
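
The relation between the two assumptions can be illustrated with a small brute-force sketch. It assumes shortened parameter names (n_a instead of n1_a, and so on) and a tiny value grid chosen only for illustration; the function names are hypothetical. It checks that the simple assumption implies the precise one, and that the precise one additionally admits points where no loop iteration executes.

```python
from itertools import product


def simple_assumption(n_a, n_b, n_c, n_d, p_b, p_c, p_d):
    """Require the array bounds unconditionally, even if no iteration runs."""
    return p_b >= n_b and p_c >= n_c and p_d >= n_d


def precise_assumption(n_a, n_b, n_c, n_d, p_b, p_c, p_d):
    """Require the array bounds only when the innermost statement executes."""
    innermost_runs = n_a > 0 and n_b > 0 and n_c > 0 and n_d > 0
    return (not innermost_runs) or simple_assumption(
        n_a, n_b, n_c, n_d, p_b, p_c, p_d
    )


# Enumerate a small grid of parameter values (4^7 points).
points = list(product(range(-1, 3), repeat=7))

# Every point accepted by the simple assumption is accepted by the precise one.
assert all(precise_assumption(*v) for v in points if simple_assumption(*v))

# The outer loop never runs here, so the precise form accepts arbitrary p values.
witness = (0, 2, 2, 2, 1, 1, 1)
assert precise_assumption(*witness) and not simple_assumption(*witness)
```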

This change speeds up the new test case from taking very long (waited at least
a minute, but it probably takes a lot more) to below a second.

llvm-svn: 296456
2017-02-28 09:45:54 +00:00
Tobias Grosser cf66ea3845 Update isl to isl-0.18-304-g1efe43d
This is a normal maintenance update.

llvm-svn: 296441
2017-02-28 07:06:06 +00:00
Michael Kruse 6469380daa [Cmake] Optionally use a system isl version.
This patch adds an option to build against a version of libisl already
installed on the system. The installation is autodetected using the
pkg-config file shipped with isl.

The detection of the library is in the FindISL.cmake module that creates
an imported target.

Contributed-by: Philip Pfaffe <philip.pfaffe@gmail.com>

Differential Revision: https://reviews.llvm.org/D30043

llvm-svn: 296361
2017-02-27 17:54:25 +00:00
Michael Kruse c4f61d2346 [DeLICM] Add nomap regressions tests. NFC.
These verify that some scalars are not mapped because it would be
incorrect to do so.

For these checks, we verify from the output of the pass's '-analyze' that no
transformation has been executed. Adding optimization remarks is not useful
as it would result in too many messages, even repeated ones. I avoided
checking the '-debug-only=polly-delicm' output which is an antipattern.

llvm-svn: 296348
2017-02-27 15:53:18 +00:00
Michael Kruse b295c37a15 [DeLICM] Statistics for use in regression tests.
Print some measurements of the DeLICM transformation at -analyze to be
used in regression tests.

llvm-svn: 296347
2017-02-27 15:53:13 +00:00
Roman Gareev bc3fbe49c5 Disable the parallel code generation in case of extension nodes
We can not perform the dependence analysis and, consequently, the parallel
code generation in case the schedule tree contains extension nodes.

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D30394

llvm-svn: 296325
2017-02-27 08:03:11 +00:00
Michael Kruse e199f285b0 [DeLICM] Fortify against exceeding isl's max operations counter.
Control flow would fall through after the check whether the operations
quota was exceeded, with the intention that it would later be caught by
Knowledge::isUsable(). However, the Knowledge constructor has its own
assertions to check consistency which would fail if its fields have only
been initialized partially because some sets have been computed correctly
before the operations quota takes effect.

Fix by erroring out early instead of falling through into the code that
might expect that everything has been computed correctly. For robustness,
also bail-out if any of the fields contain nullptr values instead of
relying on isl always setting exactly this error code if something went
wrong.

This should fix the
perf-x86_64-penryn-O3-polly-before-vectorizer-unprofitable
(-polly-process-unprofitable -polly-position=before-vectorizer
-polly-enable-delicm) buildbot.

llvm-svn: 296022
2017-02-23 21:58:20 +00:00
Michael Kruse f4e201e09f [Support] Remove NonowningIslPtr. NFC.
NonowningIslPtr<isl_X> was used as the type of function parameters when the
function does not consume the isl object, i.e. an __isl_keep parameter.

The alternatives are:

1. IslPtr<isl_X>
   This has additional calls to isl_X_copy and isl_X_free to
   increase/decrease the reference counter even though not needed. The
   caller already owns a reference to the isl object.

2. const IslPtr<isl_X>&
   This does not change the reference counter, but requires an
   additional load to get the pointer to the isl object (instead of just
   passing the pointer itself).
   Moreover, the compiler cannot rely on the constness of the pointer
   and has to reload the pointer every time it writes to memory (unless
   alias analysis such as TBAA says it is not possible).

The isl C++ bindings currently in development do not have an equivalent
to NonowningIslPtr and adding one would make the binding more
complicated and its advantage in performance is small. In order to
simplify the transition to these C++ bindings, remove NonowningIslPtr.
Change every former use of it to alternative 2 mentioned above
(const IslPtr<isl_X>&).

llvm-svn: 295998
2017-02-23 17:57:27 +00:00
Michael Kruse 2c7169d00c [DependenceInfo] Remove unused variable. NFC.
llvm-svn: 295987
2017-02-23 15:41:01 +00:00
Michael Kruse dd6f29375b [DependenceInfo] Use references instead of double pointers. NFC.
Non-const references are the more C++-ish way to modify a variable
passed by the caller.

llvm-svn: 295986
2017-02-23 15:40:56 +00:00
Michael Kruse ec8fc32160 [DependenceInfo] Rename StmtScheduleDomain -> TaggedStmtDomain. NFC.
llvm-svn: 295985
2017-02-23 15:40:52 +00:00
Michael Kruse 00c38e0df2 [DependenceInfo] Simplify use of StmtSchedule's domain [NFC]
Once a StmtSchedule is created, only its domain is used anywhere within
DependenceInfo::calculateDependences. So, we choose to return the
wrapped domain of the union_map rather than the entire union_map.

However, we still build the union_map first within collectInfo(). It is
cleaner to first build the entire union_map and then pull the domain out in
one shot, rather than repeatedly extracting the domain in bits and pieces
from accdom.

Contributed-by: Siddharth Bhat <siddu.druid@gmail.com>

Differential Revision: https://reviews.llvm.org/D30208

llvm-svn: 295984
2017-02-23 15:40:46 +00:00
Michael Kruse 52ab4943b4 Remove all references to PostDominators. NFC.
Marking a pass as preserved is necessary if any Polly pass uses it, even
if it is not preserved within the generated code. Not marking it would
cause the Polly pass chain to be interrupted. It is not used by any
Polly pass anymore, hence we can remove all references to it.

llvm-svn: 295983
2017-02-23 15:16:22 +00:00
Michael Kruse 9f519714b3 [DeLICM] Add missing Doxygen comment. NFC.
llvm-svn: 295978
2017-02-23 14:51:50 +00:00
Michael Kruse 311ecb00dc [DeLICM] Capitalize parameter name. NFC.
llvm-svn: 295977
2017-02-23 14:51:45 +00:00
Tobias Grosser 59d23bbdc6 Update isl to isl-0.18-282-g12465a5
Besides a variety of smaller cleanups, this update also contains a correctness
fix to isl coalesce which resolves a crash in Polly.

llvm-svn: 295966
2017-02-23 12:48:42 +00:00
Roman Gareev 96e1119a96 Make optimizations based on pattern matching be enabled by default
Currently, pattern based optimizations of Polly can identify matrix
multiplication and optimize it according to BLIS matmul optimization pattern
(see ScheduleTreeOptimizer for details). This patch makes optimizations
based on pattern matching be enabled by default.

Reviewed-by: Tobias Grosser <tobias@grosser.es>

Differential Revision: https://reviews.llvm.org/D30293

llvm-svn: 295958
2017-02-23 11:44:12 +00:00
Michael Kruse d8d32bb3d1 [DeLICM] Regression test for skipping map targets.
Add optimization-remarks-missed for when mapping targets have been
skipped and add regression tests for them.

llvm-svn: 295953
2017-02-23 10:25:20 +00:00
Michael Kruse deb30e8278 [DeLICM] Add regression tests for DeLICM reject cases.
These tests were not included in the main DeLICM commit. These check the
cases where zone analysis cannot be successful because of assumption
violations.

We use the LLVM optimization remark infrastructure as it seems to be the
best fit for this kind of message. I tried to make use of the
OptimizationRemarkEmitter. However, it would insert additional function
passes into the pass manager to get the hotness information. The pass
manager would insert them between the flatten pass and delicm, causing
the ScopInfo with the flattened schedule being thrown away.

Differential Revision: https://reviews.llvm.org/D30253

llvm-svn: 295846
2017-02-22 15:14:08 +00:00
Michael Kruse 8474470500 [DeLICM] Fix wrong comment. NFC.
Correct a comment that claimed that a store after load was detected
when the code checks a load after a store.

llvm-svn: 295835
2017-02-22 14:14:40 +00:00
Michael Kruse 43ed25f1d9 [DeLICM] Print message when zone analysis is not available on -analysis.
This is to distinguish the case where the analysis has failed from the case
where no transformation was performed.

llvm-svn: 295833
2017-02-22 13:48:35 +00:00
Michael Kruse 91cdafb86f [DeLICM] Use opt<int>.
There is no template specialization for cl::parser<unsigned long>, so
parsing a cl::opt<unsigned long> command line argument will fail.
Use opt<int> instead which has an associated parser.

llvm-svn: 295832
2017-02-22 13:48:18 +00:00
Tobias Grosser cc43087afc [DependenceInfo] Simplify creation and subsequent use of AccessSchedule [NFC]
We only ever use the wrapped domain of AccessSchedule, so stop
creating an entire union_map and then pulling the domain out.

Reviewers: grosser
Tags: #polly

Contributed-by: Siddharth Bhat <siddu.druid@gmail.com>

Differential Revision: https://reviews.llvm.org/D30179

llvm-svn: 295726
2017-02-21 15:38:31 +00:00
Michael Kruse 9e52c39f0a [DeLICM] Map values hoisted by LICM back to the array.
Implement the -polly-delicm pass. The pass intends to undo the
effects of LoopInvariantCodeMotion (LICM) which adds additional scalar
dependencies into SCoPs. DeLICM will try to map those scalars back to
the array elements they were promoted from, as long as the array
element is unused.

This is the main patch from the DeLICM/DePRE patch series. It does not
yet undo GVN PRE for which additional information about known values
is needed and does not handle PHI write accesses that have no
target. As such its usefulness is limited. Patches for these issues
including regression tests for error situations will follow.

Reviewers: grosser

Differential Revision: https://reviews.llvm.org/D24716

llvm-svn: 295713
2017-02-21 10:20:54 +00:00
Michael Kruse d9cdeb453d [Cmake] Bump required cmake version to 3.4.3.
This is currently the minimum required version by LLVM. Since LLVM is
needed to build Polly, we also require at least that version.

Suggested-by: Philip Pfaffe <philip.pfaffe@gmail.com>
llvm-svn: 295672
2017-02-20 17:06:31 +00:00
Michael Kruse 5ab24fdb73 [Cmake] Install the isl headers into the install tree.
isl headers are currently missing in a Polly installation. Because the
Polly headers depend on those, code can't be compiled against an
installed Polly.

This patch installs the isl headers. I left a TODO, as optionally it
should be possible to use a system version of isl instead of the one
shipped with Polly.

When compiling, clients of the installation need to add
-I${PREFIX}/include/polly/ to their include path right now, because
there currently is no way to export this path automatically.

Contributed-by: Philip Pfaffe <philip.pfaffe@gmail.com>

Differential Revision: https://reviews.llvm.org/D29931

llvm-svn: 295671
2017-02-20 16:57:14 +00:00
Tobias Grosser 079d511891 [ScopInfo] Count read-only arrays when computing complexity of alias check
Instead of counting the number of read-only accesses, we now count the number of
distinct read-only array references when checking if a run-time alias check
may be too complex. The run-time alias check is quadratic in the number of
base pointers, not the number of accesses.
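
As a rough illustration of why the count of distinct arrays is the relevant measure, the following sketch counts pairwise checks among distinct base pointers. It is an assumption-laden simplification: the function name and the sample access list are hypothetical, and it does not reproduce how Polly actually forms alias groups or run-time checks.

```python
from itertools import combinations


def pairwise_checks(accessed_arrays):
    """Number of unordered pairs among the distinct base pointers accessed."""
    distinct = sorted(set(accessed_arrays))
    return len(list(combinations(distinct, 2)))


# Six read-only accesses, but only three distinct arrays.
accesses = ["A", "A", "A", "B", "B", "C"]
assert len(accesses) == 6
assert pairwise_checks(accesses) == 3  # pairs: (A, B), (A, C), (B, C)
```

Counting accesses instead of arrays would overestimate the cost for code that touches a few arrays many times.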

Before this change we accidentally skipped SPEC's lbm test case.

llvm-svn: 295567
2017-02-18 20:51:29 +00:00
Tobias Grosser 28492b85e2 [DependenceInfo] Pull out statement [NFC]
This simplifies the code slightly.

llvm-svn: 295551
2017-02-18 16:41:28 +00:00
Tobias Grosser 8ee46985d2 [Dependences] Compute reduction dependences on schedule tree [NFC]
This change gets rid of the need for zero padding, makes the reduction
computation code more similar to the normal dependence computation, and also
better documents what we do at the moment.

Making the dependence computation for reductions a little bit easier to
understand will hopefully help us to further reduce code duplication.

This reduces the time spent only in the reduction dependence pass from 260ms to
150ms for test/DependenceInfo/reduction_sequence.ll. This is a reduction of over
40% in dependence computation time.

This change was inspired by discussions with Michael Kruse, Utpal Bora,
Siddharth Bhat, and Johannes Doerfert. It can hopefully lay the base for further
cleanups of the reduction code.

llvm-svn: 295550
2017-02-18 16:39:04 +00:00
Tobias Grosser 41f0d81b31 [test] Add reduction sequence test case [NFC]
This test case is a mini performance test case that shows the time needed for a
couple of simple reductions. It takes today about 325ms on my machine to run
this test case through 'opt' with scop construction and reduction detection. It
can be used as mini-proxy for further tuning of the reduction code.

Generally we do not commit performance test cases, but as this is very
small and also very fast it seems OK to keep it in the lit test suite.

This test case will also help to verify that future changes to the reduction
code will not affect the ordering of the reduction sets and will consequently
not cause spurious performance changes that only result from reordering of
dependences in the reduction set.

llvm-svn: 295549
2017-02-18 16:38:58 +00:00
Tobias Grosser 2461021150 Drop leftover debug statement
llvm-svn: 295444
2017-02-17 13:39:45 +00:00
Tobias Grosser cd01a363d6 [ScopInfo] Add statistics to count loops after scop modeling
llvm-svn: 295431
2017-02-17 08:12:36 +00:00
Tobias Grosser 65ce9362b8 [ScopDetection] Compute the maximal loop depth correctly
Before this change, we obtained loop depth numbers that were deeper than the
actual loop depth.

llvm-svn: 295430
2017-02-17 08:08:54 +00:00
Tobias Grosser 72745c2ef5 Updated isl to isl-0.18-254-g6bc184d
This update includes a couple more coalescing changes as well as a large
number of isl-internal code cleanups (dead assignments, ...).

llvm-svn: 295419
2017-02-17 05:11:16 +00:00
Tobias Grosser ca2cfd0bd8 [ScopInfo] Do not try to fold array dimensions of size zero
Trying to fold such kind of dimensions will result in a division by zero,
which crashes the compiler. As such arrays are likely to invalidate the
scop anyhow (but are not illegal in LLVM-IR), there is no point in trying
to optimize the array layout. Hence, we just avoid the folding of
constant dimensions of size zero.

llvm-svn: 295415
2017-02-17 04:48:52 +00:00
Tobias Grosser 90411a967b [ScopInfo] Rename MaxDisjunctions -> MaxDisjuncts [NFC]
There is only a single disjunction. However, we bound the number of 'disjuncts'
in this disjunction. Name the variable accordingly.

llvm-svn: 295362
2017-02-16 19:11:33 +00:00
Tobias Grosser 76ec194951 [tests] Fix some misspellings [NFC]
llvm-svn: 295361
2017-02-16 19:11:29 +00:00
Tobias Grosser c8a8276710 [ScopInfo] Bound the number of disjuncts in context
Before this change wrapping range metadata resulted in exponential growth of
the context, which made context construction of large scops very slow. Instead,
we now just do not model the range information precisely, in case the number
of disjuncts in the context has already reached a certain limit.

llvm-svn: 295360
2017-02-16 19:11:25 +00:00
Tobias Grosser 98a3aa4f19 [ScopInfo] Use uppercase variable name [NFC]
llvm-svn: 295350
2017-02-16 18:39:18 +00:00
Tobias Grosser 3281f601bb [ScopInfo] Always derive upper and lower bounds for parameters
Commit r230230 introduced the use of range metadata to derive bounds for
parameters, instead of just looking at the type of the parameter. As part of
this commit support for wrapping ranges was added, where the lower bound of a
parameter is larger than the upper bound:

  { 255 < p || p < 0 }

However, at the same time, for wrapping ranges support for adding bounds given
by the size of the containing type has accidentally been dropped. As a result,
the range of the parameters was not guaranteed to be bounded any more. This
change makes sure we always add the bounds given by the size of the type and
then additionally add bounds based on signed wrapping, if available. For a
parameter p with a type size of 32 bit, the valid range is then:

  { -2147483648 <= p <= 2147483647 and (255 < p or p < 0) }
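
The combined constraint can be written as a small membership predicate. This is a sketch under the assumption of a signed two's-complement parameter type; the helper names type_bounds and in_valid_range are hypothetical, and the wrapping condition is hard-coded to the example above.

```python
def type_bounds(bits):
    """Smallest and largest value of a signed two's-complement type."""
    return -(1 << (bits - 1)), (1 << (bits - 1)) - 1


def in_valid_range(p, bits=32):
    """Type bounds always apply; the wrapping range from the example is added on top."""
    lower, upper = type_bounds(bits)
    within_type = lower <= p <= upper
    within_wrapping_range = 255 < p or p < 0
    return within_type and within_wrapping_range


assert type_bounds(32) == (-2147483648, 2147483647)
assert in_valid_range(-1) and in_valid_range(256)
assert not in_valid_range(0) and not in_valid_range(255)
# Without the type bounds this value would satisfy the wrapping condition alone.
assert not in_valid_range(1 << 31)
```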

llvm-svn: 295349
2017-02-16 18:39:14 +00:00
Roman Gareev 4eb07e481e [FIX] Fix the typo in ScheduleOptimizer.cpp.
llvm-svn: 295292
2017-02-16 07:04:41 +00:00
Michael Kruse c28c584604 [DeLICM] Add forgotten unittests in previous commit. NFC.
llvm-svn: 295204
2017-02-15 17:19:22 +00:00
Michael Kruse e23e94a08d [DeLICM] Add Knowledge class. NFC.
The Knowledge class remembers the state of data at any timepoint of a SCoP's
execution. Currently, it tracks whether an array element is unused or is
occupied by some value, and the writes to it. A future addition will be to also
remember which value it contains.

Objects are used to determine whether two Knowledge objects contain conflicting
information, i.e. two states that cannot be true at the same time.

This commit was extracted from the DeLICM algorithm at
https://reviews.llvm.org/D24716.

llvm-svn: 295197
2017-02-15 16:59:10 +00:00
Tobias Grosser 288c450cf6 [ScopDetectDiagnostics] Do not format unnamed array names
Formatting unnamed array names is expensive in LLVM as this requires
deriving the numbered virtual instruction name (e.g., %12) for an llvm::Value,
which is currently not implemented efficiently. As instruction numbers anyhow
do not really carry a lot of information for the user, we just print 'unknown'
instead.

This change reduces the scop detection time from 24 to 19 seconds, for one of
our large-scale inputs. This is a reduction by 21%.

llvm-svn: 294894
2017-02-12 10:53:02 +00:00
Tobias Grosser 9fe37df27c [ScopDetection] Add statistics to count the maximal number of scops in loop
llvm-svn: 294893
2017-02-12 10:52:57 +00:00
Tobias Grosser b3a85884f7 Do not use wrapping ranges to bound non-affine accesses
The range of valid values derived for a scalar evolution expression might
be a range [12, 8), where the upper bound is smaller than the lower bound and
where the range is expected to possibly wrap around. We theoretically could
model such a range as a union of two non-wrapping ranges, but do not do this
as of yet. Instead, we just do not derive any bounds. Before this change,
we could have obtained bounds where the maximal possible value is strictly
smaller than the minimal possible value, which is incorrect and also caused
assertions during scop modeling.
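
The union of two non-wrapping ranges mentioned above could look like the following sketch. It is not what this change implements (the change simply derives no bounds). It assumes half-open ranges over unsigned values of a given bit width with lower != upper; the function name is hypothetical.

```python
def split_wrapping_range(lower, upper, bits):
    """Split half-open [lower, upper) over unsigned `bits`-wide values.

    Assumes lower != upper. A wrapping range (lower > upper) becomes the
    union of [lower, 2^bits) and [0, upper); empty pieces are dropped.
    """
    if lower < upper:
        return [(lower, upper)]
    pieces = [(lower, 1 << bits), (0, upper)]
    return [(a, b) for a, b in pieces if a < b]


# The wrapping range [12, 8) over 8-bit values.
assert split_wrapping_range(12, 8, 8) == [(12, 256), (0, 8)]
# A non-wrapping range is returned unchanged.
assert split_wrapping_range(3, 8, 8) == [(3, 8)]
```

Taking the minimum and maximum directly from the wrapping pair (12, 8) would give a maximum below the minimum, which is the incorrect bound this change avoids.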

llvm-svn: 294891
2017-02-12 08:11:12 +00:00
Roman Gareev b196055c0c Check reduction dependencies in case of the matrix multiplication optimization
To determine parameters of the matrix multiplication, we check RAW dependencies
that can be expressed using only reduction dependencies. Consequently, we
should check the reduction dependencies, if this is the case.

Reviewed-by: Tobias Grosser <tobias@grosser.es>,
             Sven Verdoolaege <skimo-polly@kotnet.org>
             Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D29814

llvm-svn: 294836
2017-02-11 09:59:09 +00:00
Roman Gareev de69293b01 [FIX] Fix the potential issue of containsOnlyMatMulDep.
llvm-svn: 294835
2017-02-11 09:48:09 +00:00
Roman Gareev 5ef7e210c0 [NFC] Fix the style issue of lib/Transform/ScheduleOptimizer.cpp.
llvm-svn: 294834
2017-02-11 08:43:41 +00:00
Roman Gareev afcf026d81 [NFC] Fix style issues of lib/Transform/ScheduleOptimizer.cpp.
llvm-svn: 294831
2017-02-11 07:14:37 +00:00
Roman Gareev 3d4eae31ea Use the size of the widest type of the matrix multiplication operands
The size of the operands' type is one of the parameters required
to determine the BLIS micro-kernel. We get the size of the widest type
of the matrix multiplication operands in case there are several
different types.

Reviewed-by: Michael Kruse <llvm@meinersbur.de>

Differential Revision: https://reviews.llvm.org/D29269

llvm-svn: 294828
2017-02-11 07:00:05 +00:00