Commit Graph

327139 Commits

Author SHA1 Message Date
David Blaikie f27367cd32 llvm-reduce: Clean out previous test temp/output dir, since it was a dir and now it's used as just a single file
llvm-svn: 372054
2019-09-16 23:56:26 +00:00
David Blaikie a458acb5ba llvm-reduce: Remove some string copies
llvm-svn: 372053
2019-09-16 23:54:57 +00:00
Jonas Devlieghere 71b32e4175 [test] Fail gracefully if the regex doesn't match
This test is failing on the Fedora bot (staging). Rather than failing
with an IndexError, we should trigger an assert and dump the log when
the regex doesn't match.

llvm-svn: 372052
2019-09-16 23:49:42 +00:00
Joel E. Denny 0a0ea7ec99 Revert r372035: "[lit] Make internal diff work in pipelines"
This breaks a Windows bot.

llvm-svn: 372051
2019-09-16 23:47:46 +00:00
Amara Emerson 9d64721ca5 [GlobalISel] Partially revert r371901.
r371901 was overeager and widenScalarDst() and the like in the legalizer
attempt to increment the insert point given in order to add new instructions
after the currently legalizing inst. In cases where the insertion point is not
exactly the current instruction, then callers need to de-compensate for the
behaviour by decrementing the insertion iterator before calling them. It's not
a nice state of affairs; for now just undo the problematic parts of the change.

llvm-svn: 372050
2019-09-16 23:46:03 +00:00
David Blaikie cb4aee7318 llvm-reduce: Make tests shell-independent by passing the interpreter on the command line rather than using #! in the test file
llvm-svn: 372049
2019-09-16 23:41:19 +00:00
David L. Jones 4a249553fe Add libc to path mappings in git-llvm.
llvm-svn: 372048
2019-09-16 23:36:35 +00:00
Haibo Huang 5a115e81cd Fix swig python package path
Summary:
The path defined in CMakeLists.txt doesn't match the path generated in
our python script. This change fixes that.

LLVM_LIBRARY_OUTPUT_INTDIR is defined as:

${CMAKE_BINARY_DIR}/${CMAKE_CFG_INTDIR}/lib${LLVM_LIBDIR_SUFFIX}

On the other hand, the path of site-packages is generated in
get_framework_python_dir_windows() in finishSwigPythonLLDB.py as:
(Despite its name, the function is used for everything other than Xcode)

prefix/cmakeBuildConfiguration/distutils.sysconfig.get_python_lib()

From lldb/CMakeLists.txt, we can see that:
prefix=${CMAKE_BINARY_DIR},
cmakeBuildConfiguration=${CMAKE_CFG_INTDIR}

And from python source code, we can see get_python_lib() always returns
lib/pythonx.y/site-packages for posix, or Lib/site-packages for windows:
https://github.com/python/cpython/blob/3.8/Lib/distutils/sysconfig.py#L128

We should make them match each other.

Subscribers: mgorny, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D67583

llvm-svn: 372047
2019-09-16 23:31:16 +00:00
Jonas Devlieghere 8fc8d3fe01 [Reproducer] Implement dumping packets.
This patch completes the dump functionality by adding support for
dumping a reproducer's GDB remote packets.

Differential revision: https://reviews.llvm.org/D67636

llvm-svn: 372046
2019-09-16 23:31:06 +00:00
Jonas Devlieghere 3cabfb344b Fix warning: lambda capture 'temp_file_path' is not used
llvm-svn: 372044
2019-09-16 22:55:49 +00:00
Nemanja Ivanovic e63c676825 [PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32
Add the missing piece of r372029.
Somehow when the patch for review D61961 was committed, only the test case
went in and the code didn't. This of course caused all kinds of build bot
breaks.
This patch just adds the code for that patch.

Author: Lei Huang
Differential revision: https://reviews.llvm.org/D61961

llvm-svn: 372043
2019-09-16 22:54:52 +00:00
Francis Visoiu Mistrih 77383d83eb [Remarks] Allow remarks::Format::YAML to take a string table
It should be allowed to take a string table in case all the strings in
the remarks point there, but it shouldn't use it during serialization.

llvm-svn: 372042
2019-09-16 22:45:17 +00:00
Vedant Kumar c693aa3def [test] Clean up previous raw profile before merging into it
This fixes a test failure in instrprof-set-file-object-merging.c which
seems to have been caused by reuse of stale data in old raw profiles.

llvm-svn: 372041
2019-09-16 22:32:18 +00:00
Alexey Bataev 87afb22707 [OPENMP]Fix the test, NFC.
llvm-svn: 372040
2019-09-16 22:17:10 +00:00
Bruno Cardoso Lopes 919fc50034 [Modules][Objective-C] Use complete decl from module when diagnosing missing import
Summary:
Otherwise the definition (first found) for ObjCInterfaceDecl's might
precede the module one, which will eventually lead to a crash, since
diagnoseMissingImport needs one coming from a module.

This behavior changed after Richard's r342018, which started to look
into the definition of ObjCInterfaceDecls.

rdar://problem/49237144

Reviewers: rsmith, arphaman

Subscribers: jkorous, dexonsmith, ributzka, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D66982

llvm-svn: 372039
2019-09-16 22:00:29 +00:00
Jian Cai 155a43edb0 [compiler-rt][crt] make test case nontrivial in check_cxx_section_exists
Summary:
.init_array gets optimized away when building with -O2 and as a result,
check_cxx_section_exists failed to pass -DCOMPILER_RT_HAS_INITFINI_ARRAY
when building crtbegin.o and crtend.o, which causes binaries linked with
them to encounter segmentation faults. See https://crbug.com/855759 for
details. This change prevents the .init_array section from being optimized
away even with -O2 or a higher optimization level.

Subscribers: dberris, mgorny, #sanitizers, llvm-commits

Tags: #sanitizers, #llvm

Differential Revision: https://reviews.llvm.org/D67628

llvm-svn: 372038
2019-09-16 21:47:47 +00:00
Jian Cai 9d2066af8d [clang-tidy] add checks to bugprone-posix-return
This check now also checks if any calls to pthread_* functions expect negative return values. These functions return either 0 on success or an errno on failure, which is positive only.

llvm-svn: 372037
2019-09-16 21:43:56 +00:00
David L. Jones ec80f531ca Add a directory, along with README.txt and LICENSE.txt, for libc.
llvm-svn: 372036
2019-09-16 21:39:08 +00:00
Joel E. Denny 2152ae985c [lit] Make internal diff work in pipelines
When using lit's internal shell, RUN lines like the following
accidentally execute an external `diff` instead of lit's internal
`diff`:

```
 # RUN: program | diff file -
 # RUN: not diff file1 file2 | FileCheck %s
```

Such cases exist now, in `clang/test/Analysis` for example.  We are
preparing patches to ensure lit's internal `diff` is called in such
cases, which will then fail because lit's internal `diff` cannot
currently be used in pipelines and doesn't recognize `-` as a
command-line option.

To enable pipelines, this patch moves lit's `diff` implementation into
an out-of-process script, similar to lit's `cat` implementation.  A
follow-up patch will implement `-` to mean stdin.

Reviewed By: probinson, stella.stamenova

Differential Revision: https://reviews.llvm.org/D66574

llvm-svn: 372035
2019-09-16 21:22:29 +00:00
Dan Albert c1c519d2f1 Revert "Implement std::condition_variable via pthread_cond_clockwait() where available"
This reverts commit 5e37d7f9ff.

llvm-svn: 372034
2019-09-16 21:20:32 +00:00
Bardia Mahjour 474c713fc7 [NFC] Test commit access
llvm-svn: 372033
2019-09-16 20:44:15 +00:00
DeForest Richards 3b27f4c088 [Docs] Bug fix for docs homepage
Removes reference to non-existent Reference Documentation page.

llvm-svn: 372032
2019-09-16 20:29:56 +00:00
DeForest Richards e151cb7c63 [Docs] Adds Getting Started/Tutorials, Reference to LLVM docs homepage
Adds a section for Getting Started/Tutorials and Reference topics to the LLVM docs homepage.

llvm-svn: 372031
2019-09-16 20:19:32 +00:00
Lei Huang bfb197d7a3 [PowerPC] Cust lower fpext v2f32 to v2f64 from extract_subvector v4f32
This is a follow up patch from https://reviews.llvm.org/D57857 to handle
extract_subvector v4f32.  For cases where we fpext v2f32 to v2f64 from
extract_subvector, we currently generate the following on P9:

  lxv 0, 0(3)
  xxsldwi 1, 0, 0, 1
  xscvspdpn 2, 0
  xxsldwi 3, 0, 0, 3
  xxswapd 0, 0
  xscvspdpn 1, 1
  xscvspdpn 3, 3
  xscvspdpn 0, 0
  xxmrghd 0, 0, 3
  xxmrghd 1, 2, 1
  stxv 0, 0(4)
  stxv 1, 0(5)

This patch custom lowers it to the following sequence:

  lxv 0, 0(3)       # load the v4f32 <w0, w1, w2, w3>
  xxmrghw 2, 0, 0   # Produce the following vector <w0, w0, w1, w1>
  xxmrglw 3, 0, 0   # Produce the following vector <w2, w2, w3, w3>
  xvcvspdp 2, 2     # FP-extend to <d0, d1>
  xvcvspdp 3, 3     # FP-extend to <d2, d3>
  stxv 2, 0(5)      # Store <d0, d1> (%vecinit11)
  stxv 3, 0(4)      # Store <d2, d3> (%vecinit4)

Differential Revision: https://reviews.llvm.org/D61961

llvm-svn: 372029
2019-09-16 20:04:15 +00:00
Jonas Devlieghere 4e053ff1d1 [NFC] Move dumping into GDBRemotePacket
This moves the dumping logic from the GDBRemoteCommunicationHistory
class into the GDBRemotePacket so that it can be reused from the
reproducer command object.

llvm-svn: 372028
2019-09-16 20:02:57 +00:00
Dan Albert a7e9059967 Open fstream files in O_CLOEXEC mode when possible.
Reviewers: EricWF, mclow.lists, ldionne

Reviewed By: ldionne

Subscribers: smeenai, dexonsmith, christof, ldionne, libcxx-commits

Tags: #libc

Differential Revision: https://reviews.llvm.org/D59839

llvm-svn: 372027
2019-09-16 19:26:41 +00:00
Lubos Lunak a507a5ec8f do not emit -Wunused-macros warnings in -frewrite-includes mode (PR15614)
-frewrite-includes calls PP.SetMacroExpansionOnlyInDirectives() to avoid
macro expansions that are useless in that mode, but this can lead
to -Wunused-macros false positives. As -frewrite-includes does not emit
normal warnings, block -Wunused-macros too.

Differential Revision: https://reviews.llvm.org/D65371

llvm-svn: 372026
2019-09-16 19:18:37 +00:00
Vedant Kumar 413647d730 [Coverage] Speed up file-based queries for coverage info, NFC
Speed up queries for coverage info in a file by reducing the amount of
time spent determining whether a function record corresponds to a file.

This gives a 36% speedup when generating a coverage report for `llc`.
The reduction is entirely in user time.

rdar://54758110

Differential Revision: https://reviews.llvm.org/D67575

llvm-svn: 372025
2019-09-16 19:08:44 +00:00
Vedant Kumar 95de24978e [Coverage] Assert that filenames in a TU are unique, NFC
llvm-svn: 372024
2019-09-16 19:08:41 +00:00
Steven Wu dd63b9f570 [lld] Update lld driver to use new LTO APIs to handle libcall symbols
NFC. Remove duplicated code in ELF/COFF driver and libLTO legacy
interfaces.

llvm-svn: 372022
2019-09-16 18:49:57 +00:00
Steven Wu 34d80461ff [LTO][Legacy] Add new C interface to query libcall functions
Summary:
This is needed to implement the same approach as lld (implemented in r338434)
for handling symbols that can be generated by the LTO code generator
but are not present in the symbol table, for linkers that use the legacy C APIs.

libLTO is in charge of providing the list of symbols. Linker is in
charge of implementing the eager loading from static libraries using
the list of symbols.

rdar://problem/52853974

Reviewers: tejohnson, bd1976llvm, deadalnix, espindola

Reviewed By: tejohnson

Subscribers: emaste, arichardson, hiraditya, MaskRay, dang, kledzik, mehdi_amini, inglorion, jkorous, dexonsmith, ributzka, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67568

llvm-svn: 372021
2019-09-16 18:49:54 +00:00
Reid Kleckner 32837a0c93 [PGO] Use linkonce_odr linkage for __profd_ variables in comdat groups
This fixes relocations against __profd_ symbols in discarded sections,
which is PR41380.

In general, instrumentation happens very early, and optimization and
inlining happen afterwards. The counters for a function are calculated
early, and after inlining, counters for an inlined function may be
widely referenced by other functions.

For C++ inline functions of all kinds (linkonce_odr &
available_externally mainly), instr profiling wants to deduplicate these
__profc_ and __profd_ globals. Otherwise the binary would be quite
large.

I made __profd_ and __profc_ comdat in r355044, but I chose to make
__profd_ internal. At the time, I was only dealing with coverage, and in
that case, none of the instrumentation needs to reference __profd_.
However, if you use PGO, then instrumentation passes add calls to
__llvm_profile_instrument_range which reference __profd_ globals. The
solution is to make these globals externally visible by using
linkonce_odr linkage for data as was done for counters.

This is safe because PGO adds a CFG hash to the names of the data and
counter globals, so if different TUs have different globals, they will
get different data and counter arrays.

Reviewers: xur, hans

Differential Revision: https://reviews.llvm.org/D67579

llvm-svn: 372020
2019-09-16 18:49:09 +00:00
Roman Lebedev 69911b8d01 [ARM][Codegen] Autogenerate arm-cgp-casts.ll test.
Apparently it got broken by r372009, while I thought it was r372012.

llvm-svn: 372019
2019-09-16 18:28:22 +00:00
Raphael Isemann 0d8a008611 [lldb] Remove SetCount/ClearCount from Flags
Summary:
These functions are only used in tests where we should test the actual flag values instead of counting all bits for an approximate check.
Also, these popcount implementations aren't very efficient and don't seem to be optimised to anything fast.

Reviewers: davide, JDevlieghere

Reviewed By: davide, JDevlieghere

Subscribers: abidh, JDevlieghere, lldb-commits

Tags: #lldb

Differential Revision: https://reviews.llvm.org/D67540

llvm-svn: 372018
2019-09-16 18:02:49 +00:00
Raphael Isemann 21641a2f6d [lldb][NFC] Make ApplyObjcCastHack less scary
llvm-svn: 372017
2019-09-16 18:02:21 +00:00
Dan Albert 5e37d7f9ff Implement std::condition_variable via pthread_cond_clockwait() where available
std::condition_variable is currently implemented via
pthread_cond_timedwait() on systems that use pthread. This is
problematic, since that function waits by default on CLOCK_REALTIME
and libc++ does not provide any mechanism to change from this
default.

Due to this, regardless of whether condition_variable::wait_until() is
called with a chrono::system_clock or chrono::steady_clock parameter,
condition_variable::wait_until() will wait using CLOCK_REALTIME. This
is not accurate to the C++ standard as calling
condition_variable::wait_until() with a chrono::steady_clock parameter
should use CLOCK_MONOTONIC.

This is particularly problematic because CLOCK_REALTIME is a bad
choice: it is subject to discontinuous time adjustments, which may
cause condition_variable::wait_until() to immediately time out or wait
indefinitely.

This change fixes this issue with a new POSIX function,
pthread_cond_clockwait() proposed on
http://austingroupbugs.net/view.php?id=1216. The new function is
similar to pthread_cond_timedwait() with the addition of a clock
parameter that allows it to wait using either CLOCK_REALTIME or
CLOCK_MONOTONIC, thus allowing condition_variable::wait_until() to
wait using CLOCK_REALTIME for chrono::system_clock and CLOCK_MONOTONIC
for chrono::steady_clock.

pthread_cond_clockwait() is implemented in glibc (2.30 and later) and
Android's bionic (Android API version 30 and later).

This change additionally makes wait_for() and wait_until() with clocks
other than chrono::system_clock use CLOCK_MONOTONIC.

llvm-svn: 372016
2019-09-16 17:57:48 +00:00
Roman Lebedev 6fcd4e080f [Clang][Codegen] Disable arm_acle.c test.
This test is broken by design. Clang codegen tests should not depend
on llvm middle-end behaviour, they should *only* test clang codegen.
Yet this test runs whole optimization pipeline.
I've really tried to fix it, but there isn't just a few things
that depend on passes, but everything there does.

llvm-svn: 372015
2019-09-16 17:46:08 +00:00
Roman Lebedev b9909ffed8 [Clang][Codegen] Relax available-externally-suppress.c test
That test is broken by design.
It depends on llvm middle-end behavior.
No clang codegen test should be doing that.
This one is salvageable by relaxing check lines.

llvm-svn: 372014
2019-09-16 17:46:01 +00:00
Simon Pilgrim 3df0daddfd [X86][AVX] matchShuffleWithSHUFPD - add support for zeroable operands
Determine if all of the uses of LHS/RHS operands can be replaced with a zero vector.

llvm-svn: 372013
2019-09-16 17:30:33 +00:00
David Green 8d21460dc5 [ARM] A predicate cast of a predicate cast is a predicate cast
This adds some very basic folding of PREDICATE_CASTS, removing cases when they
are chained together. These would already be removed eventually, as these are
lowered to copies. This just allows it to happen earlier, which can help other
simplifications.

Differential Revision: https://reviews.llvm.org/D67591

llvm-svn: 372012
2019-09-16 17:29:07 +00:00
Alexey Bataev a00630785f [OPENMP]Fix parsing/sema for function templates with declare simd.
Need to return original declaration group with FunctionTemplateDecl, not
the inner FunctionDecl, to correctly handle parsing of directives with
the templates parameters.

llvm-svn: 372011
2019-09-16 17:06:31 +00:00
Roman Lebedev 10151f6618 [SimplifyCFG] FoldTwoEntryPHINode(): consider *total* speculation cost, not per-BB cost
Summary:
Previously, if the threshold was 2, we were willing to speculatively
execute 2 cheap instructions in both basic blocks (thus we were willing
to speculatively execute cost = 4), but weren't willing to speculate
when one BB had 3 instructions and the other one had none,
even though that would have a total cost of 3.

This looks inconsistent to me.
I don't think `cmov`-like instructions will start executing
until both of their inputs are available: https://godbolt.org/z/zgHePf
So I don't see why the existing behavior is the correct one.

Also, let's add its own `cl::opt` for this threshold,
with default=4, so it is not stricter than the previous threshold:
it will still allow folding when there are 2 BBs each with cost=2.
And since the logic has changed, it will also allow folding when
one BB has cost=3 and the other cost=1, or when there is only one BB with cost=4.
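
As a rough illustration of the difference between the two checks, here is a
small Python sketch. It is not the SimplifyCFG code: the function names
`old_can_fold` and `new_can_fold` are hypothetical, each list holds the
speculation cost of one basic block, and the inclusive comparisons are an
assumption inferred from the examples above (cost=2 was accepted with
threshold 2, and total cost=4 is accepted with default=4).

```python
def old_can_fold(bb_costs, per_bb_threshold=2):
    """Per-BB check (assumed): every block must individually fit the threshold."""
    return all(cost <= per_bb_threshold for cost in bb_costs)


def new_can_fold(bb_costs, total_threshold=4):
    """Total check (assumed): the summed cost of all blocks must fit the threshold."""
    return sum(bb_costs) <= total_threshold


if __name__ == "__main__":
    # Each entry lists the speculation cost of the blocks feeding the PHI node.
    for costs in ([2, 2], [3, 0], [3, 1], [4], [3, 2]):
        print(costs, "old:", old_can_fold(costs), "new:", new_can_fold(costs))
```

Under these assumptions, `[2, 2]` folds in both schemes, `[3, 0]`, `[3, 1]`
and `[4]` fold only with the total-cost check, and `[3, 2]` folds in neither.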

This is an alternative solution to D65148:
This fix is mainly motivated by `signbit-like-value-extension.ll` test.
That pattern comes up in JPEG decoding, see e.g.
`Figure F.12 – Extending the sign bit of a decoded value in V`
of `ITU T.81` (JPEG specification).
That branch is not predictable, and it is within the innermost loop,
so the fact that that pattern ends up being stuck with a branch
instead of `select` (i.e. `CMOV` for x86) is unlikely to be beneficial.

This has great results on the final assembly (vanilla test-suite + RawSpeed): (metric pass - D67240)
| metric                                 |     old |     new | delta |      % |
| x86-mi-counting.NumMachineFunctions    |   37720 |   37721 |     1 |  0.00% |
| x86-mi-counting.NumMachineBasicBlocks  |  773545 |  771181 | -2364 | -0.31% |
| x86-mi-counting.NumMachineInstructions | 7488843 | 7486442 | -2401 | -0.03% |
| x86-mi-counting.NumUncondBR            |  135770 |  135543 |  -227 | -0.17% |
| x86-mi-counting.NumCondBR              |  423753 |  422187 | -1566 | -0.37% |
| x86-mi-counting.NumCMOV                |   24815 |   25731 |   916 |  3.69% |
| x86-mi-counting.NumVecBlend            |      17 |      17 |     0 |  0.00% |

We significantly decrease basic block count, notably decrease instruction count,
significantly decrease branch count and very significantly increase `cmov` count.

Performance-wise, unsurprisingly, this has great effect on
target RawSpeed benchmark. I'm seeing 5 **major** improvements:
```
Benchmark                                                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_mean                                  -0.3064         -0.3064      226.9913      157.4452      226.9800      157.4384
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_median                                -0.3057         -0.3057      226.8407      157.4926      226.8282      157.4828
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_stddev                                -0.4985         -0.4954        0.3051        0.1530        0.3040        0.1534
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_mean                                   -0.1747         -0.1747       80.4787       66.4227       80.4771       66.4146
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_median                                 -0.1742         -0.1743       80.4686       66.4542       80.4690       66.4436
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_stddev                                 +0.6089         +0.5797        0.0670        0.1078        0.0673        0.1062
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_mean                                  -0.1598         -0.1598      171.6996      144.2575      171.6915      144.2538
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_median                                -0.1598         -0.1597      171.7109      144.2755      171.7018      144.2766
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_stddev                                +0.4024         +0.3850        0.0847        0.1187        0.0848        0.1175
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_mean                                   -0.0550         -0.0551      280.3046      264.8800      280.3017      264.8559
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_median                                 -0.0554         -0.0554      280.2628      264.7360      280.2574      264.7297
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_stddev                                 +0.7005         +0.7041        0.2779        0.4725        0.2775        0.4729
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_mean                                   -0.0354         -0.0355      316.7396      305.5208      316.7342      305.4890
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_median                                 -0.0354         -0.0356      316.6969      305.4798      316.6917      305.4324
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_stddev                                 +0.0493         +0.0330        0.3562        0.3737        0.3563        0.3681
```

That being said, it's always best-effort, so there will likely
be cases where this worsens things.

Reviewers: efriedma, craig.topper, dmgreen, jmolloy, fhahn, Carrot, hfinkel, chandlerc

Reviewed By: jmolloy

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67318

llvm-svn: 372009
2019-09-16 16:18:24 +00:00
Ilya Biryukov 685d8a95c5 [clangd] Simplify semantic highlighting visitor
Summary:
- Functions to compute highlighting kinds for things are separated from
  the ones that add highlighting tokens.
  This keeps each of them more focused on what they're doing: getting
  locations and figuring out the kind of the entity, correspondingly.

- Fewer special cases in the visitor for various nodes.

This change is an NFC.

Reviewers: hokein

Reviewed By: hokein

Subscribers: MaskRay, jkorous, arphaman, kadircet, cfe-commits

Tags: #clang

Differential Revision: https://reviews.llvm.org/D67341

llvm-svn: 372008
2019-09-16 16:16:03 +00:00
Sanjay Patel 3961a143e1 [InstCombine] remove unneeded one-use checks for icmp fold
Related folds were added in:
rL125734
...the code comment about register pressure is discussed in
more detail in:
https://bugs.llvm.org/show_bug.cgi?id=2698

But 10 years later, perf testing bzip2 with this change now
shows a slight (0.2% average) improvement on Haswell although
that's probably within test noise.

Given that this is IR canonicalization, we shouldn't be worried
about register pressure though; the backend should be able to
adjust for that as needed.

This is part of solving PR43310 the theoretically right way:
https://bugs.llvm.org/show_bug.cgi?id=43310
...ie, if we don't cripple basic transforms, then we won't
need to add special-case code to detect larger patterns.

rL371940 and rL371981 are related patches in this series.

llvm-svn: 372007
2019-09-16 16:15:25 +00:00
Sanjay Patel 4d9d0f9cf5 [InstCombine] move tests for icmp+add; NFC
llvm-svn: 372004
2019-09-16 15:33:40 +00:00
Oliver Cruickshank ee6fbebbaf [ARM] Add patterns for BSWAP intrinsic on MVE
BSWAP can use the VREV instruction on MVE to produce better results than
expanding.

llvm-svn: 372002
2019-09-16 15:20:10 +00:00
Oliver Cruickshank e9510a6cad [ARM] Add patterns for bitreverse intrinsic on MVE
BITREVERSE can use the VBRSR which will reverse and right shift.
Shifting right by 0 will just reverse the bits.

llvm-svn: 372001
2019-09-16 15:20:03 +00:00
Oliver Cruickshank 5f799ef162 [ARM] Lower CTTZ on MVE
Lower CTTZ on MVE using VBRSR and VCLS which will reverse the bits and
count the leading zeros, equivalent to a count trailing zeros (CTTZ).
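
The identity behind this lowering can be checked with a scalar Python model.
This is only a sketch of the arithmetic for a single lane, not the MVE
patterns themselves: the helper names are hypothetical, and a 32-bit lane
width with an unsigned input in range is assumed.

```python
def bit_reverse(x, width=32):
    """Reverse the low `width` bits of x (assumes 0 <= x < 2**width)."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (x & 1)
        x >>= 1
    return result


def count_leading_zeros(x, width=32):
    """Number of zero bits above the highest set bit; `width` when x is 0."""
    return width - x.bit_length()


def count_trailing_zeros(x, width=32):
    """CTTZ expressed as CLZ of the bit-reversed value."""
    return count_leading_zeros(bit_reverse(x, width), width)
```

Reversing the bits moves the lowest set bit to the mirrored high position, so
the zeros that trailed it become the zeros that lead it.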

llvm-svn: 372000
2019-09-16 15:19:56 +00:00
Oliver Cruickshank cd1a0b9271 [ARM] Add patterns for CTLZ on MVE
CTLZ intrinsic can use the VCLS instruction on MVE, which produces
better results than expanding.

llvm-svn: 371999
2019-09-16 15:19:49 +00:00
Simon Pilgrim a48b6e98ab [ExecutionEngine] Don't dereference a dyn_cast result. NFCI.
The static analyzer is warning about potential null dereferences of dyn_cast<> results. In these cases we can safely use cast<> directly, as we know that they should all be the correct type (which is why it's working at the moment), and cast<> will assert if they aren't.

llvm-svn: 371998
2019-09-16 15:19:11 +00:00