v2: continue iterating through the rest of the bb
use for loop
v3: initialize FlattenCFG pass in ScalarOps
add test
v4: split off initializing flattencfg to a separate patch
add comment
Signed-off-by: Jan Vesely <jan.vesely@rutgers.edu>
llvm-svn: 215574
This for some reason fixes v1i64 kernel arguments on pre-SI. This
currently breaks some other cases in the kernel-args.ll test for R600,
but I'm not particularly confident in the new output. VTX_READ_* are not
used for some of the scalarized cases, and the code reading from the
constant buffer doesn't make much sense to me.
llvm-svn: 215564
Remove the PoCC and ScopLib support from Polly as we do not have a
user/maintainer for it.
Differential Revision: http://reviews.llvm.org/D4871
llvm-svn: 215563
Many of the test executables use pthreads directly. This isn't
portable on Windows, so this patch converts these test to use
C++11 threads and mutexes. Since Windows' implementation of
std::thread classes throw and catch from header files, this patch
also disables exceptions when compiling with clang on Windows.
Reviewed by: Todd Fiala, Ed Maste
Differential Revision: http://reviews.llvm.org/D4816
llvm-svn: 215562
Add header guards to files that were missing guards. Remove #endif comments
as they don't seem common in LLVM (we can easily add them back if we decide
they're useful)
Changes made by clang-tidy with minor tweaks.
llvm-svn: 215558
This patch improves the existing algorithm in DAGCombiner that
attempts to fold shuffles according to rule:
shuffle(shuffle(x, y, M1), undef, M2) -> shuffle(y, undef, M3)
Before this change, there were cases where the DAGCombiner conservatively
avoided folding shuffles even if the resulting mask would have been legal.
That is because the algorithm wrongly assumed that commuting
an illegal shuffle mask would always produce an illegal mask.
With this change, we now correctly compute the commuted shuffle mask before
calling method 'isShuffleMaskLegal' on it.
On X86, this improves for example the codegen for the following function:
define <4 x i32> @test(<4 x i32> %A, <4 x i32> %B) {
%1 = shufflevector <4 x i32> %B, <4 x i32> %A, <4 x i32> <i32 1, i32 2, i32 6, i32 7>
%2 = shufflevector <4 x i32> %1, <4 x i32> undef, <4 x i32> <i32 2, i32 3, i32 2, i32 3>
ret <4 x i32> %2
}
Before this change the X86 backend (-mcpu=corei7) generated
the following assembly code for function @test:
shufps $-23, %xmm0, %xmm1 # xmm1 = xmm1[1,2],xmm0[2,3]
movhlps %xmm1, %xmm1 # xmm1 = xmm1[1,1]
movaps %xmm1, %xmm0
Now we produce:
movhlps %xmm0, %xmm0 # xmm0 = xmm0[1,1]
Added extra test cases in combine-vec-shuffle-2.ll to verify that we correctly
fold according to the above-mentioned rule.
llvm-svn: 215555
Yet more problems due to the missing CXXBindTemporaryExpr in the CFG for
default arguments.
Unfortunately we cannot just switch off inserting temporaries for the
corresponding default arguments, as that breaks existing tests
(test/SemaCXX/return-noreturn.cpp:245).
llvm-svn: 215554
of MIPS toolchains.
The uCLibc implemented for multiple architectures. A couple of MIPS toolchains
contains both uCLibc and glibc implementation so these options allow to select
used C library.
Initially -muclibc / -mglibc (as well as -mbionic) have been implemented in gcc
for various architectures so they are not MIPS specific.
llvm-svn: 215552
The implementation is split into a generic part and a LLVM-specific part.
Other codebases can implement it with their own style. The specific features
supported are:
- Verification (and fixing) of header guards against a style based on the file path
- Automatic insertion of header guards for headers that are missing them
- A warning when the header guard doesn't enable our fancy header guard optimization
(e.g. when there's an include preceeding the guard)
- Automatic insertion of a comment with the guard name after #endif.
For the LLVM style we disable #endif comments for now, they're not very common
in the codebase. We also only flag headers in the include directories, there
doesn't seem to be a common style outside.
Differential Revision: http://reviews.llvm.org/D4867
llvm-svn: 215548
This patch adds the initial ELF/AArch64 support to lld. Only a basic "Hello
World" app has been successfully tested for both dynamic and static compiling.
Differential Revision: http://reviews.llvm.org/D4778
Patch by Daniel Stewart <stewartd@codeaurora.org>!
llvm-svn: 215544
Summary:
Moved some calls to setCanHaveModuleDir to the MipsTargetStreamer base class and removed the resulting empty functions from the MipsTargetELFStreamer class.
Also fixed a missing call to setCanHaveModuleDir in MipsTargetELFStreamer::emitDirectiveSetMicroMips.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: tomatabacu
Differential Revision: http://reviews.llvm.org/D4781
llvm-svn: 215542
With modules we start accessing headers for the first time while reading
the module map, which often has very different paths from the include
scanning logic.
Using the name by which the file was accessed gets us one step closer to
the right solution, which is using a FileName abstraction that decouples
the name by which a file was accessed from the FileEntry.
llvm-svn: 215541
Especially with blends and large tree heights there was a problem with
the fuzzer where it would end up with enough undef shuffle elements in
enough parts of the tree that in a birthday-attack kind of way we ended
up regularly having large numbers of undef elements in the result. I was
seeing reasonably frequent cases of *all* results being undef which
prevents us from doing any correctness checking at all. While having
undef lanes is important, this was too much.
So I've tried to apply some math to the probabilities of having an undef
lane and balance them against the tree height. Please be gentle, I'm
really terrible at math. I probably made a bunch of amateur mistakes
here. Fixes, etc. are quite welcome. =D At least in running it some, it
seems to be producing more interesting (for correctness testing)
results.
llvm-svn: 215540
attribute and function argument attribute synthesizing and propagating.
As with the other uses of this attribute, the goal remains a best-effort
(no guarantees) attempt to not optimize the function or assume things
about the function when optimizing. This is particularly useful for
compiler testing, bisecting miscompiles, triaging things, etc. I was
hitting specific issues using optnone to isolate test code from a test
driver for my fuzz testing, and this is one step of fixing that.
llvm-svn: 215538
Added avx512_movnt_vl multiclass for handling 256/128-bit forms of instruction.
Added encoding and lowering tests.
Reviewed by Elena Demikhovsky <elena.demikhovsky@intel.com>
llvm-svn: 215536
Patch by Matheus Almeida and Toma Tabacu
The lld test failure on the previous attempt to commit was caused by the
addition of the .pdr section causing the offsets it was checking to change.
This has been fixed by removing the .ent/.end directives from that test since
they weren't really needed.
llvm-svn: 215535
The commit of the .ent/.end implementation will change the result of the
relocation evaluations (because of a new section and additional relocations)
which will lead to a failure if the .ent/.end directives are present in this
test.
We don't really need the .ent/.end directives in this test so let's just remove
them to preserve the current output.
llvm-svn: 215534
Rather than silently disabling unaligned accesses for v6m targets as
in the previous patch to llvm, instead produce a warning saying that
this architecture doesn't support unaligned accesses.
Patch by Ben Foster
llvm-svn: 215531
a tree of inputs to blend iteratively together.
This required a pretty substantial rewrite of the innards. The number of
shuffle instructions is now bounded in terms of tree-height. There is
a flag to disable blends so that its still possible to test single input
shuffles. I've also improved various aspects of how the test program is
generated, primarily to simplify the test harness and allow some
optimizations to clean up how we actually check the results and build up
the inputs.
Again, apologies for my likely horrible use of Python... But hey, it
works! (Ish?)
llvm-svn: 215530
Correctness proof of the transform using CVC3-
$ cat t.cvc
A, B : BITVECTOR(32);
QUERY BVXOR(A | B, BVXOR(A,B) ) = A & B;
$ cvc3 t.cvc
Valid.
llvm-svn: 215524