Commit Graph

252382 Commits

Author SHA1 Message Date
Simon Pilgrim 5f2f53b106 [X86][SSE] Attempt to pre-truncate arithmetic operations that have already been extended
As discussed on D28219 - it is profitable to combine trunc(binop (s/zext(x), s/zext(y)) to binop(trunc(s/zext(x)), trunc(s/zext(y))) assuming the trunc(ext()) will simplify further

llvm-svn: 292493
2017-01-19 16:25:02 +00:00
Sanjay Patel 291c3d8ff2 [InstCombine] icmp Pred (shl nsw X, C1), C0 --> icmp Pred X, C0 >> C1
Try harder to fold icmp with shl nsw as discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2017-January/108749.html

This is similar to the 'shl nuw' transforms that were added with D25913.

This may eventually help solve:
https://llvm.org/bugs/show_bug.cgi?id=30773

Differential Revision: https://reviews.llvm.org/D28406

llvm-svn: 292492
2017-01-19 16:12:10 +00:00
Felix Berger 08df246407 [clang-tidy] Do not trigger move fix for non-copy assignment operators in performance-unnecessary-value-param check
Reviewers: alexfh, sbenza, malcolm.parsons

Subscribers: JDevlieghere, cfe-commits

Differential Revision: https://reviews.llvm.org/D28899

llvm-svn: 292491
2017-01-19 15:51:10 +00:00
Marshall Clow c3cb054e0c Mark two of the TS implementations as 'in progress'
llvm-svn: 292490
2017-01-19 15:30:36 +00:00
Pavel Labath a6321a8e95 Refactor logging in NativeProcessLinux
Use the LLDB_LOG macro instead of the more verbose if(log) ... syntax.

I have also consolidated the log channels (everything now goes to the posix
channel, instead of a mixture of posix and lldb), and cleaned up some of the
more convoluted log statements.

llvm-svn: 292489
2017-01-19 15:26:04 +00:00
Hafiz Abid Qadeer 05008cac15 Avoid unused variable warning when assert is disabled.
llvm-svn: 292488
2017-01-19 15:11:01 +00:00
Simon Pilgrim 3b23eac71f [X86][SSE] Added tests for pre-truncating arithmetic operations that have already been extended
As discussed on D28219 - it is profitable to combine trunc(binop (s/zext(x), s/zext(y)) to binop(trunc(s/zext(x)), trunc(s/zext(y))) assuming the trunc(ext()) will simplify further

llvm-svn: 292487
2017-01-19 15:03:00 +00:00
Tobias Grosser 75dfaa1dbe BlockGenerator: Do not redundantly reload from PHI-allocas in non-affine stmts
Before this change we created an additional reload in the copy of the incoming
block of a PHI node to reload the incoming value, even though the necessary
value has already been made available by the normally generated scalar loads.
In this change, we drop the code that generates this redundant reload and
instead just reuse the scalar value already available.

Besides making the generated code slightly cleaner, this change also makes sure
that scalar loads go through the normal logic, which means they can be remapped
(e.g. to array slots) and corresponding code is generated to load from the
remapped location. Without this change, the original scalar load at the
beginning of the non-affine region would have been remapped, but the redundant
scalar load would continue to load from the old PHI slot location.

It might be possible to further simplify the code in addOperandToPHI,
but this would not only mean to pull out getNewValue, but to also change the
insertion point update logic. As this did not work when trying it the first
time, this change is likely not trivial. To not introduce bugs last minute, we
postpone further simplications to a subsequent commit.

We also document the current behavior a little bit better.

Reviewed By: Meinersbur

Differential Revision: https://reviews.llvm.org/D28892

llvm-svn: 292486
2017-01-19 14:12:45 +00:00
Mikael Holmen 2074e7497b [DAG] Don't increase SDNodeOrder for dbg.value/declare.
Summary:
The SDNodeOrder is saved in the IROrder field in the SDNode, and this
field may affects scheduling. Thus, letting dbg.value/declare increase
the order numbers may in turn affect scheduling.

Because of this change we also need to update the code deciding when
dbg values should be output, in ScheduleDAGSDNodes.cpp/ProcessSDDbgValues.

Dbg values now have the same order as the SDNode they are connected to,
not the following orders.

Test cases provided by Florian Hahn.

Reviewers: bogner, aprantl, sunfish, atrick

Reviewed By: atrick

Subscribers: fhahn, probinson, andreadb, llvm-commits, MatzeB

Differential Revision: https://reviews.llvm.org/D25318

llvm-svn: 292485
2017-01-19 13:55:55 +00:00
Malcolm Parsons 9cfb973ab4 [docs] Tell Doxygen to expand LLVM_ALIGNAS to nothing
Summary:
Docs for clang::Decl and clang::TemplateSpecializationType have
not been generated since LLVM_ALIGNAS was added to them.

Tell Doxygen to expand LLVM_ALIGNAS to nothing as described at
https://www.stack.nl/~dimitri/doxygen/manual/preprocessing.html

Reviewers: aaron.ballman, klimek, alexfh

Subscribers: ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D28850

llvm-svn: 292484
2017-01-19 13:38:19 +00:00
Malcolm Parsons 386171ba55 [docs] Tell Doxygen to expand LLVM_ALIGNAS to nothing
Summary:
Docs for clang::Decl and clang::TemplateSpecializationType have
not been generated since LLVM_ALIGNAS was added to them.

Tell Doxygen to expand LLVM_ALIGNAS to nothing as described at
https://www.stack.nl/~dimitri/doxygen/manual/preprocessing.html

Reviewers: aaron.ballman, klimek, alexfh

Subscribers: ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D28850

llvm-svn: 292483
2017-01-19 13:37:42 +00:00
Mikael Holmen 8bf15614fb Test commit access, remove trailing whitespace
llvm-svn: 292482
2017-01-19 13:35:13 +00:00
Kristof Beyls e9412b4d47 [GlobalISel] Pointers are legal operands for G_SELECT on AArch64
Differential Revision: https://reviews.llvm.org/D28805

llvm-svn: 292481
2017-01-19 13:32:14 +00:00
Tobias Grosser 943c369c60 BlockGenerator: remove obfuscating const and const casts
Making certain values 'const' to just cast it away a little later mainly
obfuscates the code. Hence, we just drop the 'const' parts.

Suggested-by: Michael Kruse <llvm@meinersbur.de>
llvm-svn: 292480
2017-01-19 13:25:52 +00:00
Elena Demikhovsky e01512cecf Recommiting unsigned saturation with a bugfix.
A test case that crached is added to avx512-trunc.ll.
(PR31589)

llvm-svn: 292479
2017-01-19 12:08:21 +00:00
Daniel Sanders d64d5024a4 Re-commit: [globalisel] Tablegen-erate current Register Bank Information
Summary:
Adds a RegisterBank tablegen class that can be used to declare the register
banks and an associated tablegen pass to generate the necessary code.

Changes since first commit attempt:
* Added missing guards
* Added more missing guards
* Found and fixed a use-after-free bug involving Twine locals

Reviewers: t.p.northover, ab, rovka, qcolombet

Reviewed By: qcolombet

Subscribers: aditya_nandakumar, rengolin, kristof.beyls, vkalintiris, mgorny, dberris, llvm-commits, rovka

Differential Revision: https://reviews.llvm.org/D27338

llvm-svn: 292478
2017-01-19 11:15:55 +00:00
Malcolm Parsons 207a68985b [docs] Tell Doxygen to expand LLVM_ALIGNAS to nothing
Summary:
Docs for clang::Decl and clang::TemplateSpecializationType have
not been generated since LLVM_ALIGNAS was added to them.

Tell Doxygen to expand LLVM_ALIGNAS to nothing as described at
https://www.stack.nl/~dimitri/doxygen/manual/preprocessing.html

Reviewers: aaron.ballman, klimek, alexfh

Subscribers: ioeric, cfe-commits

Differential Revision: https://reviews.llvm.org/D28850

llvm-svn: 292477
2017-01-19 09:27:45 +00:00
Justin Bogner ddb80aee7e GlobalISel: Implement widening for shifts
llvm-svn: 292476
2017-01-19 07:51:17 +00:00
Craig Topper b8e92f775d [AVX-512] Add test cases that show where we are using two subvector inserts to broadcast a 128-bit subvector into a 512-bit vector. We'd be better off using something like SHUFF32X4.
If the subvector comes from a load, we convert to SUBV_BROADCAST and use a broadcast instruction. But if there is no load we keep the inserts. I think we should create the SUBV_BROADCAST even without the load and let isel use the fallback patterns that are used if the load can't be folded. This will use the SHUFF32X4 or similar instruction for the 128-bit into 512-bit case and a single insert for 128 into 256 or 256 into 512.

This should be fixed so subvector broadcast intrinsics can be replaced with native IR since some of those currently lower directly to SHUFF32X4.

llvm-svn: 292475
2017-01-19 07:37:45 +00:00
Craig Topper 200ea31684 [AVX-512] Support ADD/SUB/MUL of mask vectors
Summary:
Currently we expand and scalarize these operations, but I think we should be able to implement ADD/SUB with KXOR and MUL with KAND.

We already do this for scalar i1 operations so I just extended it to vectors of i1.

Reviewers: zvi, delena

Reviewed By: delena

Subscribers: guyblank, llvm-commits

Differential Revision: https://reviews.llvm.org/D28888

llvm-svn: 292474
2017-01-19 07:12:35 +00:00
Matt Arsenault 3e6f9b5773 AMDGPU: Disable some fneg combines unless nsz
For -(x + y) -> (-x) + (-y), if x == -y, this would
change the result from -0.0 to 0.0. Since the fma/fmad
combine is an extension of this problem it also
applies there.

fmul should be fine, and I don't think any of the unary
operators or conversions should be a problem either.

llvm-svn: 292473
2017-01-19 06:35:27 +00:00
Matt Arsenault 3b99f12a4e AMDGPU: Remove modifiers from v_div_scale_*
They seem to produce nonsense results when used.

This should be applied to the release branch.

llvm-svn: 292472
2017-01-19 06:04:12 +00:00
Tobias Grosser 97b8490982 Use range-based for loop [NFC]
llvm-svn: 292471
2017-01-19 05:09:23 +00:00
Tobias Grosser a989a8b84c Improve test coverage in test/Isl/CodeGen/loop_partially_in_scop.ll [NFC]
We rename the test case with -metarenamer to make the variable names easier to
read and add additional check lines that verify the code we currently generate
for PHI nodes. This code is interesting as it contains a PHI node in a
non-affine sub-region, where some incoming blocks are within the non-affine
sub-region and others are outside of the non-affine subregion.

As can be seen in the check lines we currently load the PHI-node value twice.
This commit documents this behavior. In a subsequent patch we will try to
improve this.

llvm-svn: 292470
2017-01-19 04:54:45 +00:00
Craig Topper c227529105 [X86] Merge LowerADD and LowerSUB into a single LowerADD_SUB since they are identical.
llvm-svn: 292469
2017-01-19 03:49:29 +00:00
Mike Aizatsky 7da919b8b0 [sancov] applying blacklist to covered points too
Differential Revision: https://reviews.llvm.org/D28872

llvm-svn: 292468
2017-01-19 03:49:18 +00:00
Saleem Abdulrasool c8bcda2b56 llvm-cxxfilt: filter out invalid manglings
c++filt does not attempt to demangle symbols which do not match its
expected format.  This means that the symbol must start with _Z or ___Z
(block invocation function extension).  Any other symbols are returned
as is.  Note that this is different from the behaviour of __cxa_demangle
which will demangle fragments.

llvm-svn: 292467
2017-01-19 02:58:46 +00:00
Craig Topper b561e66384 [AVX-512] Use VSHUF instructions instead of two inserts as fallback for subvector broadcasts that can't fold the load.
llvm-svn: 292466
2017-01-19 02:34:29 +00:00
Craig Topper 044662d14b [AVX-512] Add additional test cases for broadcast intrinsics that demonstates that we don't fold the loads to use a broadcast instruction.
llvm-svn: 292465
2017-01-19 02:34:25 +00:00
Michael Kuperstein 8ecc38ef85 [PM] Add LoopVectorize to the default module pipeline
LV no longer "requires" LCSSA and LoopSimplify, and instead forms
them internally as required. So, there's nothing preventing it from
being enabled.

llvm-svn: 292464
2017-01-19 02:21:54 +00:00
Peter Collingbourne 22d9d3cdce LowerTypeTests: Implement exporting of type identifiers.
Type identifiers are exported by:
- Adding coarse-grained information about how to test the type
  identifier to the summary.
- Creating symbols in the object file (aliases and absolute symbols)
  containing fine-grained information about the type identifier.

Differential Revision: https://reviews.llvm.org/D28424

llvm-svn: 292462
2017-01-19 01:20:11 +00:00
Justin Bogner d09c3ce6c0 GlobalISel: Implement narrowing for G_LOAD
llvm-svn: 292461
2017-01-19 01:05:48 +00:00
Justin Bogner 1f5c505437 GlobalISel: Fix text wrapping in a comment. NFC
llvm-svn: 292460
2017-01-19 01:04:46 +00:00
Matthias Braun 58f99615d6 Use an actual valid register in test
llvm-svn: 292459
2017-01-19 01:04:08 +00:00
Dehao Chen b3a70de753 Add -fdebug-info-for-profiling to emit more debug info for sample pgo profile collection
Summary:
SamplePGO uses profile with debug info to collect profile. Unlike the traditional debugging purpose, sample pgo needs more accurate debug info to represent the profile. We add -femit-accurate-debug-info for this purpose. It can be combined with all debugging modes (-g, -gmlt, etc). It makes sure that the following pieces of info is always emitted:

* start line of all subprograms
* linkage name of all subprograms
* standalone subprograms (functions that has neither inlined nor been inlined)

The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch):

               -gmlt(orig) -gmlt(patched) -g
433.milc       4.68%       5.40%          19.73%
444.namd       8.45%       8.93%          45.99%
447.dealII     97.43%      115.21%        374.89%
450.soplex     27.75%      31.88%         126.04%
453.povray     21.81%      26.16%         92.03%
470.lbm        0.60%       0.67%          1.96%
482.sphinx3    5.77%       6.47%          26.17%
400.perlbench  17.81%      19.43%         73.08%
401.bzip2      3.73%       3.92%          12.18%
403.gcc        31.75%      34.48%         122.75%
429.mcf        0.78%       0.88%          3.89%
445.gobmk      6.08%       7.92%          42.27%
456.hmmer      10.36%      11.25%         35.23%
458.sjeng      5.08%       5.42%          14.36%
462.libquantum 1.71%       1.96%          6.36%
464.h264ref    15.61%      16.56%         43.92%
471.omnetpp    11.93%      15.84%         60.09%
473.astar      3.11%       3.69%          14.18%
483.xalancbmk  56.29%      81.63%         353.22%
geomean        15.60%      18.30%         57.81%

Debug info size change for -gmlt binary with this patch:

433.milc       13.46%
444.namd       5.35%
447.dealII     18.21%
450.soplex     14.68%
453.povray     19.65%
470.lbm        6.03%
482.sphinx3    11.21%
400.perlbench  8.91%
401.bzip2      4.41%
403.gcc        8.56%
429.mcf        8.24%
445.gobmk      29.47%
456.hmmer      8.19%
458.sjeng      6.05%
462.libquantum 11.23%
464.h264ref    5.93%
471.omnetpp    31.89%
473.astar      16.20%
483.xalancbmk  44.62%
geomean        16.83%

Reviewers: davidxl, andreadb, rob.lougher, dblaikie, echristo

Reviewed By: dblaikie, echristo

Subscribers: hfinkel, rob.lougher, andreadb, gbedwell, cfe-commits, probinson, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D25435

llvm-svn: 292458
2017-01-19 00:44:21 +00:00
Dehao Chen 1ce8d6ca59 Add -debug-info-for-profiling to emit more debug info for sample pgo profile collection
Summary:
SamplePGO binaries built with -gmlt to collect profile. The current -gmlt debug info is limited, and we need some additional info:

* start line of all subprograms
* linkage name of all subprograms
* standalone subprograms (functions that has neither inlined nor been inlined)

This patch adds these information to the -gmlt binary. The impact on speccpu2006 binary size (size increase comparing with -g0 binary, also includes data for -g binary, which does not change with this patch):

               -gmlt(orig) -gmlt(patched) -g
433.milc       4.68%       5.40%          19.73%
444.namd       8.45%       8.93%          45.99%
447.dealII     97.43%      115.21%        374.89%
450.soplex     27.75%      31.88%         126.04%
453.povray     21.81%      26.16%         92.03%
470.lbm        0.60%       0.67%          1.96%
482.sphinx3    5.77%       6.47%          26.17%
400.perlbench  17.81%      19.43%         73.08%
401.bzip2      3.73%       3.92%          12.18%
403.gcc        31.75%      34.48%         122.75%
429.mcf        0.78%       0.88%          3.89%
445.gobmk      6.08%       7.92%          42.27%
456.hmmer      10.36%      11.25%         35.23%
458.sjeng      5.08%       5.42%          14.36%
462.libquantum 1.71%       1.96%          6.36%
464.h264ref    15.61%      16.56%         43.92%
471.omnetpp    11.93%      15.84%         60.09%
473.astar      3.11%       3.69%          14.18%
483.xalancbmk  56.29%      81.63%         353.22%
geomean        15.60%      18.30%         57.81%

Debug info size change for -gmlt binary with this patch:

433.milc       13.46%
444.namd       5.35%
447.dealII     18.21%
450.soplex     14.68%
453.povray     19.65%
470.lbm        6.03%
482.sphinx3    11.21%
400.perlbench  8.91%
401.bzip2      4.41%
403.gcc        8.56%
429.mcf        8.24%
445.gobmk      29.47%
456.hmmer      8.19%
458.sjeng      6.05%
462.libquantum 11.23%
464.h264ref    5.93%
471.omnetpp    31.89%
473.astar      16.20%
483.xalancbmk  44.62%
geomean        16.83%

Reviewers: davidxl, echristo, dblaikie

Reviewed By: echristo, dblaikie

Subscribers: aprantl, probinson, llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D25434

llvm-svn: 292457
2017-01-19 00:44:11 +00:00
Michael Kuperstein 230867e583 [LV] Run loop-simplify and LCSSA explicitly instead of "requiring" them
This changes the vectorizer to explicitly use the loopsimplify and lcssa utils,
instead of "requiring" the transformations as if they were analyses.

This is not NFC, since it changes the LCSSA behavior - we no longer run LCSSA
for all loops, but rather only for the loops we expect to modify.

Differential Revision: https://reviews.llvm.org/D28868

llvm-svn: 292456
2017-01-19 00:42:28 +00:00
Matthias Braun 9f21a8d787 LiveIntervalAnalysis: Cleanup; NFC
- Fix doxygen comments: Do not repeat name, remove duplicated doxygen
  comment (on declaration + implementation), etc.
- Use more range based for

llvm-svn: 292455
2017-01-19 00:32:13 +00:00
Jason Molenda 848c7be02a Fix a problem with the new dyld interface code -- when a new process
starts up, we need to clear the target's image list and only add
the binaries into the target that are actually present in this
process run.

<rdar://problem/29857613> 

llvm-svn: 292454
2017-01-19 00:20:29 +00:00
Artem Belevich 3d3f6190ab [NVPTX] Fix lowering of fp16 ISD::FNEG.
There's no neg.f16 instruction, so negation has to
be done via subtraction from zero.

Differential Revision: https://reviews.llvm.org/D28876

llvm-svn: 292452
2017-01-19 00:14:45 +00:00
Peter Collingbourne 87cdfa7635 Add llvm-dis dependency to check-clang.
llvm-svn: 292450
2017-01-19 00:04:44 +00:00
Eli Friedman f1f49c8265 [SCEV] Make getUDivExactExpr handle non-nuw multiplies correctly.
To avoid regressions, make ScalarEvolution::createSCEV a bit more
clever.

Also get rid of some useless code in ScalarEvolution::howFarToZero
which was hiding this bug.

No new testcase because it's impossible to actually expose this bug:
we don't have any in-tree users of getUDivExactExpr besides the two
functions I just mentioned, and they both dodged the problem. I'll
try to add some interesting users in a followup.

Differential Revision: https://reviews.llvm.org/D28587

llvm-svn: 292449
2017-01-18 23:56:42 +00:00
Peter Collingbourne 1e1475ace5 Move vtable type metadata emission behind a cc1-level flag.
In ThinLTO mode, type metadata will require the module to be written as a
multi-module bitcode file, which is currently incompatible with the Darwin
linker. It is also useful to be able to enable or disable multi-module bitcode
for testing purposes. This introduces a cc1-level flag, -f{,no-}lto-unit,
which is used by the driver to enable multi-module bitcode on all but
Darwin+ThinLTO, and can also be used to enable/disable the feature manually.

Differential Revision: https://reviews.llvm.org/D28877

llvm-svn: 292448
2017-01-18 23:55:27 +00:00
Eli Friedman 0a2174533e Preserve domtree and loop-simplify for runtime unrolling.
Mostly straightforward changes; we just didn't do the computation before.
One sort of interesting change in LoopUnroll.cpp: we weren't handling
dominance for children of the loop latch correctly, but
foldBlockIntoPredecessor hid the problem for complete unrolling.

Currently punting on loop peeling; made some minor changes to isolate
that problem to LoopUnrollPeel.cpp.

Adds a flag -unroll-verify-domtree; it verifies the domtree immediately
after we finish updating it. This is on by default for +Asserts builds.

Differential Revision: https://reviews.llvm.org/D28073

llvm-svn: 292447
2017-01-18 23:26:37 +00:00
Krzysztof Parzyszek de44c9d857 Treat segment [B, E) as not overlapping block with boundaries [A, B)
llvm-svn: 292446
2017-01-18 23:12:19 +00:00
Krzysztof Parzyszek 954dd8d9ba [Hexagon] Remove dead defs from the live set when expanding wstores
llvm-svn: 292445
2017-01-18 23:11:40 +00:00
Michael Kuperstein d3d2925933 Revert r291670 because it introduces a crash.
r291670 doesn't crash on the original testcase from PR31589,
but it crashes on a slightly more complex one.

PR31589 has the new reproducer.

llvm-svn: 292444
2017-01-18 23:05:58 +00:00
Stephan T. Lavavej d6c0b35c11 [libcxx] [test] Add msvc_stdlib_force_include.hpp.
No functional change; nothing includes this, instead our test harness
injects it via the /FI compiler option.

No code review; blessed in advance by EricWF.

llvm-svn: 292443
2017-01-18 22:19:14 +00:00
Mehdi Amini 062b3fed4c Improve the `-filter-print-funcs` option to skip the banner for CGSCC pass when nothing is to be printed
Before, it would print a sequence of:

  *** IR Dump After Function Integration/Inlining ******
  *** IR Dump After Function Integration/Inlining ******
  *** IR Dump After Function Integration/Inlining ******
  ...

for every single function in the module.

llvm-svn: 292442
2017-01-18 21:37:11 +00:00
Sanjay Patel cfb8a45942 [InstCombine] add tests for shl nsw with icmp eq/ne; NFCI
These should be fixed with D28406.

llvm-svn: 292441
2017-01-18 21:31:21 +00:00