Commit Graph

291549 Commits

Author SHA1 Message Date
Justin Lebar 7b4656c1d3 Fix indentation in ScalarEvolution.cpp.
Whitespace-only change.  (clang-formatted the whole block.)

llvm-svn: 334427
2018-06-11 18:57:27 +00:00
Krzysztof Parzyszek dd9415d550 [Hexagon] Late predicate producers cannot be used as dot-new sources
llvm-svn: 334426
2018-06-11 18:45:52 +00:00
Tim Shen cc63761720 [SCEV] Canonicalize "A /u C1 /u C2" to "A /u (C1*C2)".
Summary: FWIW InstCombine already folds this. Also avoid the case where C1*C2 overflows.

Reviewers: sunfish, sanjoy

Subscribers: hiraditya, bixia, llvm-commits

Differential Revision: https://reviews.llvm.org/D47965

llvm-svn: 334425
2018-06-11 18:44:58 +00:00
Alex Shlyapnikov 789494026e [Sanitizers] Move pvalloc overflow tests to common.
Summary:
Now all sanitizers with improved allocator error reporting are covered
by these common tests.

Also, add pvalloc-specific checks to LSan.

HWASan is not covered by sanitizer_common, hence its own pvalloc
and other allocator tests.

Reviewers: vitalybuka

Subscribers: srhines, kubamracek, delcypher, #sanitizers, llvm-commits

Differential Revision: https://reviews.llvm.org/D47970

llvm-svn: 334424
2018-06-11 17:33:53 +00:00
Simon Pilgrim 14ee66ef37 [X86][AVX512] Tag AVX5124FMAPS/AVX5124VNNIW with missing scheduler classes
Necessary for D46276 as even though btver2 doesn't use these instructions, its now flagged as complete so complains if ANY instruction isn't tagged.....

UnsupportedFeatures wouldn't help here as these instructions don't appear to have a feature predicate (like a lot of AVX512).

llvm-svn: 334423
2018-06-11 17:28:00 +00:00
Craig Topper 201b9dd334 [X86] Fix operand order in the shuffle created for blend builtins.
This was broken when the builtin was added in r334249.

llvm-svn: 334422
2018-06-11 17:06:01 +00:00
Matt Morehouse 3416773cb1 [clang-fuzzer] Modified protobuf and converter to add new signature, remove conditionals.
Changed the function signature and removed conditionals from loop body.

Patch By:  emmettneyman

Differential Revision: https://reviews.llvm.org/D47964

llvm-svn: 334421
2018-06-11 17:05:45 +00:00
Stanislav Mekhanoshin 7ba3fc730c [AMDGPU] Do not consider indirect acces through phi for wave limiter
Rational: if there is indirect access that is usually an issue
because load is not ready by the use. However, if use is inside a
loop and load is outside that is potentially an issue for a first
iteration only.

Differential Revision: https://reviews.llvm.org/D47740

llvm-svn: 334420
2018-06-11 16:50:49 +00:00
Aleksandar Beserminji 62cf9d21ab [mips] Fix spill slot for mips3, n64 abi
When program is compiled for mips3 with n64 abi, wrong register class
is used for creating an emergency spill slot. This patch fixes the
correct register class to be chosen.

This patch resolves PR35859.

Thanks to John Baldwin for reporting the issue!

Differential Revision: https://reviews.llvm.org/D47938

llvm-svn: 334419
2018-06-11 16:50:28 +00:00
Reid Kleckner 03ab0fccc1 Enable crash recovery tests on Windows, globs work in the lit internal shell now
llvm-svn: 334418
2018-06-11 16:50:07 +00:00
Reid Kleckner 3513fdcc0f [MS] Use mangled names and comdats for string merging with ASan
This should reduce the binary size penalty of ASan on Windows. After
r334313, ASan will add red zones to globals in comdats, so we will still
find OOB accesses to string literals.

llvm-svn: 334417
2018-06-11 16:49:43 +00:00
Craig Topper ae7179414d [X86] Properly account for the immediate being multiplied by 8 in the immediate range checking for BI__builtin_ia32_psrldqi128 and friends.
The limit was set to 1023 which only up to 127*8. It needs to be 2047 to allow 255*8.

llvm-svn: 334416
2018-06-11 16:34:10 +00:00
Martin Probst c8b7a41a00 clang-format: [JS] strict prop init annotation.
Summary:
TypeScript uses the `!` token for strict property initialization
assertions, as in:

    class X {
      strictPropAsserted!: string;
    }

Previously, clang-format would wrap between the `!` and the `:` for
overly long lines. This patch fixes that by generally preventing the
wrap in that location.

Reviewers: krasimir

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D48030

llvm-svn: 334415
2018-06-11 16:20:13 +00:00
Mikhail Maltsev 5434047862 [Driver] Add aliases for -Qn/-Qy
This patch adds aliases for -Qn (-fno-ident) and -Qy (-fident) which
look less cryptic than -Qn/-Qy. The aliases are compatible with GCC.

Differential Revision: https://reviews.llvm.org/D48021

llvm-svn: 334414
2018-06-11 16:10:06 +00:00
Tobias Grosser 6538f40e31 Drop unnecessary whitespace [NFCI]
llvm-svn: 334413
2018-06-11 15:11:57 +00:00
Tobias Grosser 80677bce11 [ScopBuilder] Slightly improve code structure [NFCI]
First build the surrounding loops and then build up the polyhedral
structures. Before r326664 we had to mix these updates, clean this
up to improve readability (slightly).

llvm-svn: 334412
2018-06-11 14:59:28 +00:00
Pavel Labath cb512a3072 Fix tuple getter in std unique pointer pretty-printer
Summary: Check case when _M_t child member is not present.

Reviewers: labath, tberghammer

Reviewed By: labath, tberghammer

Differential Revision: https://reviews.llvm.org/D47932
Patch by Aleksandr Urakov <aleksandr.urakov@jetbrains.com>.

llvm-svn: 334411
2018-06-11 14:52:52 +00:00
Kostya Kortchinsky 4410e2c43f [scudo] Improve the scalability of the shared TSD model
Summary:
The shared TSD model in its current form doesn't scale. Here is an example of
rpc2-benchmark (with default parameters, which is threading heavy) on a 72-core
machines (defaulting to a `CompactSizeClassMap` and no Quarantine):
- with tcmalloc: 337K reqs/sec, peak RSS of 338MB;
- with scudo (exclusive): 321K reqs/sec, peak RSS of 637MB;
- with scudo (shared): 241K reqs/sec, peak RSS of 324MB.

This isn't great, since the exclusive model uses a lot of memory, while the
shared model doesn't even come close to be competitive.

This is mostly due to the fact that we are consistently scanning the TSD pool
starting at index 0 for an available TSD, which can result in a lot of failed
lock attempts, and touching some memory that needs not be touched.

This CL attempts to make things better in most situations:
- first, use a thread local variable on Linux (intead of pthread APIs) to store
  the current TSD in the shared model;
- move the locking boolean out of the TSD: this allows the compiler to use a
  register and potentially optimize out a branch instead of reading it from the
  TSD everytime (we also save a tiny bit of memory per TSD);
- 64-bit atomic operations on 32-bit ARM platforms happen to be expensive: so
  store the `Precedence` in a `uptr` instead of a `u64`. We lose some
  nanoseconds of precision and we'll wrap around at some point, but the benefit
  is worth it;
- change a `CHECK` to a `DCHECK`: this should never happen, but if something is
  ever terribly wrong, we'll crash on a near null AV if the TSD happens to be
  null;
- based on an idea by dvyukov@, we are implementing a bound random scan for
  an available TSD. This requires computing the coprimes for the number of TSDs,
  and attempting to lock up to 4 TSDs in an random order before falling back to
  the current one. This is obviously slightly more expansive when we have just
  2 TSDs (barely noticeable) but is otherwise beneficial. The `Precedence` still
  basically corresponds to the moment of the first contention on a TSD. To seed
  on random choice, we use the precedence of the current TSD since it is very
  likely to be non-zero (since we are in the slow path after a failed `tryLock`)

With those modifications, the benchmark yields to:
- with scudo (shared): 330K reqs/sec, peak RSS of 327MB.

So the shared model for this specific situation not only becomes competitive but
outperforms the exclusive model. I experimented with some values greater than 4
for the number of TSDs to attempt to lock and it yielded a decrease in QPS. Just
sticking with the current TSD is also a tad slower. Numbers on platforms with
less cores (eg: Android) remain similar.

Reviewers: alekseyshl, dvyukov, javed.absar

Reviewed By: alekseyshl, dvyukov

Subscribers: srhines, kristof.beyls, delcypher, llvm-commits, #sanitizers

Differential Revision: https://reviews.llvm.org/D47289

llvm-svn: 334410
2018-06-11 14:50:31 +00:00
Dylan McKay d011869c82 [AVR] Set trackLivenessAfterRegAlloc
This sets trackLivenessAfterRegAlloc on AVRRegisterInfo.

Most existing targets set this flag. Without it, specific IR inputs
cause LLVM to fail with:

Assertion failed: (getParent()->getProperties().hasProperty( MachineFunctionProperties::Property::TracksLiveness) &&
                   "Liveness information is accurate"), function livein_begin
file MachineBasicBlock.cpp, line 1354.

With this commit, this no longer happens.

Patch by Peter Nimmervoll.

llvm-svn: 334409
2018-06-11 14:46:48 +00:00
Francois Ferrand 6bb103f9fa clang-format: Introduce BreakInheritanceList option
Summary:
This option replaces the BreakBeforeInheritanceComma option with an
enum, thus introducing a mode where the colon stays on the same line as
constructor declaration:

  // When it fits on line:
  class A : public B, public C {
    ...
  };

  // When it does not fit:
  class A :
      public B,
      public C {
    ...
  };

This matches the behavior of the `BreakConstructorInitializers` option,
introduced in https://reviews.llvm.org/D32479.

Reviewers: djasper, klimek

Reviewed By: djasper

Subscribers: mzeren-vmw, cfe-commits

Differential Revision: https://reviews.llvm.org/D43015

llvm-svn: 334408
2018-06-11 14:41:26 +00:00
Clement Courbet 7db69cc08a [X86] Fix skylake server scheduling info.
Summary:
This fixes most of the scheduling info for SKX vector operations.
I had to split a lot of the YMM/ZMM classes into separate classes for YMM and ZMM.

The before/after llvm-exegesis analysis are in the phabricator diff.

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47721

llvm-svn: 334407
2018-06-11 14:37:53 +00:00
Tobias Grosser 2c543e775f Update isl to isl-0.19-185-g8e9f55ce
This is mainly a maintenance update.

llvm-svn: 334406
2018-06-11 14:25:42 +00:00
Guillaume Chatelet 015b3e5be4 [llvm-exegesis] Fix unhandled error.
Summary: Fixing an unhandled error when calling writeYaml.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48022

llvm-svn: 334405
2018-06-11 14:10:10 +00:00
Sanjay Patel a1791be455 [x86] add scalar cvtt intrinsic tests; NFC
More coverage for the problem noted in D47993 (although these shouldn't be affected by that patch).

llvm-svn: 334404
2018-06-11 13:51:34 +00:00
Pavel Labath b2c73c4cc0 Fix build errors on some configurations
It's been reported
<http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20180611/559616.html>
that template argument deduction for RetryAfterSignal fails if open is
not prefixed with "::".

This should help us build correctly on those platforms and explicitly
specifying the namespace is more correct anyway.

llvm-svn: 334403
2018-06-11 13:30:47 +00:00
Pavel Labath 08dae4bb3c DWARFDebugNames: Fix lookup in dwo files
The getDIESectionOffset function is not correct for split dwarf files
(and will probably be removed in D48009).

This patch implements correct section offset computation for split and
non-split compile units -- we first need to check if the referenced unit
is a skeleton unit, and if it is, we add the die offset to the full unit
base offset (as the full unit is the one which contains the die).

llvm-svn: 334402
2018-06-11 13:22:31 +00:00
Krasimir Georgiev 8b98f5517f [clang-format] text protos: put entries on separate lines if there is a submessage
Summary:
This patch updates clang-format text protos to put entries of a submessage into separate lines if the submessage contains at least two entries and contains at least one submessage entry.

For example, the entries here are kept on separate lines even if putting them on a single line would be under the column limit:
```
message: {
  entry: 1
  submessage: { key: value }
}
```

Messages containing a single submessage or several scalar entries can still be put on one line if they fit:
```
message { submessage { key: value } }
message { x: 1 y: 2 z: 3 }
```

Reviewers: sammccall

Reviewed By: sammccall

Subscribers: klimek, cfe-commits

Differential Revision: https://reviews.llvm.org/D46757

llvm-svn: 334401
2018-06-11 12:53:25 +00:00
Alexander Kornienko 5f8ede4f0d Add support for arrays in performance-implicit-conversion-in-loop
Summary:
Add support for arrays (and structure that use naked pointers for their iterator, like std::array) in performance-implicit-conversion-in-loop

Reviewers: alexfh

Reviewed By: alexfh

Subscribers: cfe-commits

Patch by Alex Pilkiewicz.

Differential Revision: https://reviews.llvm.org/D47945

llvm-svn: 334400
2018-06-11 12:46:48 +00:00
Pavel Labath d8c6290ba4 Move VersionTuple from clang/Basic to llvm/Support
Summary:
This kind of functionality is useful to other project apart from clang.
LLDB works with version numbers a lot, but it does not have a convenient
abstraction for this. Moving this class to a lower level library allows
it to be freely used within LLDB.

Since this class is used in a lot of places in clang, and it used to be
in the clang namespace, it seemed appropriate to add it to the list of
adopted classes in LLVM.h to avoid prefixing all uses with "llvm::".

Also, I didn't find any tests specific for this class, so I wrote a
couple of quick ones for the more interesting bits of functionality.

Reviewers: zturner, erik.pilkington

Subscribers: mgorny, cfe-commits, llvm-commits

Differential Revision: https://reviews.llvm.org/D47887

llvm-svn: 334399
2018-06-11 10:28:04 +00:00
Roman Lebedev b896c4e860 [NFC][AMDGPU] Add tests for all the various IR patterns equivalent to extracting low bits.
Summary:
The idiom recognition seems rather poor.
Only the `@bzhi32_d0` produces `v_bfe_u32`.
But they all should.

This needs to be fixed before D47980 can be re-landed.

Reviewers: mareko, bogner, rampitec, arsenm, tstellar, nhaehnle

Reviewed By: nhaehnle

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Tags: #amdgpu

Differential Revision: https://reviews.llvm.org/D48005

llvm-svn: 334398
2018-06-11 10:21:10 +00:00
Pavel Labath dee4930110 Editline: make #include <codecvt> conditional
My previous patch made this include unconditional. However, it seems it
is not available everywhere. This patch makes us include it only in
configurations we really need it, which should be enough to unblock the
bots.

llvm-svn: 334397
2018-06-11 09:32:58 +00:00
Roman Lebedev dbd98b3a09 [Utils] update_llc_test_checks.py: support AMDGPU backend: AMDGCN, r600 triples
Summary:
Lack of that support has taken me by surprise.
I need to add (or at least look at) some tests for https://reviews.llvm.org/D47980#1127615,
and i don't really fancy doing that by hand.

The asm pattern is quite similar to that of x86:
https://godbolt.org/g/hfgeds
just with `#` replaced with `;`

Reviewers: spatel, RKSimon, MaskRay, tstellar, arsenm

Reviewed By: arsenm

Subscribers: arsenm, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, rampitec, bogner, mareko, llvm-commits

Tags: #amdgpu

Differential Revision: https://reviews.llvm.org/D48001

llvm-svn: 334396
2018-06-11 09:20:21 +00:00
Guillaume Chatelet 6416592909 [llvm-exegesis] Program should succeed if benchmark returns StringError.
Summary: Fix for https://bugs.llvm.org/show_bug.cgi?id=37759.

Reviewers: courbet

Subscribers: tschuett, llvm-commits

Differential Revision: https://reviews.llvm.org/D48004

llvm-svn: 334395
2018-06-11 09:18:01 +00:00
Mikhail Maltsev fa5a728657 [Unittests] Change linker flags of dynamic library tests
A recent change https://reviews.llvm.org/D46898 which had no intended
behavior change, actually modified the linker flags used when linking
the dynamic libraries used by the DynamicLibraryTests unit test. This
made the test fail in our testing environment which runs the tests
from an NFS share. Prior to D46898 the two libraries used by the test
were different (because the library name used to be embedded into the
binary), and after the change they became bit-to-bit identical. This
causes dlopen to return the same handle when these two libraries are
loaded from an NFS share, and the test expects two different handles.

This patch reverts the part of D46898 that is responsible for
changing the linker flags.

Differential Revision: https://reviews.llvm.org/D47469

llvm-svn: 334394
2018-06-11 09:15:37 +00:00
Pavel Labath 3351381165 [cmake] Detect presence of wide-char libedit at build time
Summary:
Instead of hardcoding a list of platforms where libedit is known to have
wide char support we detect this in cmake. The main motivation for this
is attempting to improve compatibility with different versions of
libedit, as the interface of non-wide-char functions varies slightly
between versions.

Reviewers: krytarowski, uweigand, jankratochvil, timshen, beanz

Subscribers: mgorny, lldb-commits

Differential Revision: https://reviews.llvm.org/D47625

llvm-svn: 334393
2018-06-11 09:14:26 +00:00
Simon Atanasyan 00d8843fa3 [ELF] Pass a pointer to InputFile to the getRelocTargetVA to escape dereferencing of nullptr. NFC
llvm-svn: 334392
2018-06-11 08:37:19 +00:00
Clement Courbet f4f6899cdf [ExynosM1][Sched] Fix resource usage in scheduling model.
This is part of https://reviews.llvm.org/D46356.

llvm-svn: 334391
2018-06-11 07:33:08 +00:00
Simon Atanasyan ed9ee69ccf [ELF][MIPS] Multi-GOT implementation
Almost all entries inside MIPS GOT are referenced by signed 16-bit
index. Zero entry lies approximately in the middle of the GOT. So the
total number of GOT entries cannot exceed ~16384 for 32-bit architecture
and ~8192 for 64-bit architecture. This limitation makes impossible to
link rather large application like for example LLVM+Clang. There are two
workaround for this problem. The first one is using the -mxgot
compiler's flag. It enables using a 32-bit index to access GOT entries.
But each access requires two assembly instructions two load GOT entry
index to a register. Another workaround is multi-GOT. This patch
implements it.

Here is a brief description of multi-GOT for detailed one see the
following link https://dmz-portal.mips.com/wiki/MIPS_Multi_GOT.

If the sum of local, global and tls entries is less than 64K only single
got is enough. Otherwise, multi-got is created. Series of primary and
multiple secondary GOTs have the following layout:
```
- Primary GOT
    Header
    Local entries
    Global entries
    Relocation only entries
    TLS entries

- Secondary GOT
    Local entries
    Global entries
    TLS entries
...
```

All GOT entries required by relocations from a single input file
entirely belong to either primary or one of secondary GOTs. To reference
GOT entries each GOT has its own _gp value points to the "middle" of the
GOT. In the code this value loaded to the register which is used for GOT
access.

MIPS 32 function's prologue:
```
lui     v0,0x0
0: R_MIPS_HI16  _gp_disp
addiu   v0,v0,0
4: R_MIPS_LO16  _gp_disp
```

MIPS 64 function's prologue:
```
lui     at,0x0
14: R_MIPS_GPREL16  main
```

Dynamic linker does not know anything about secondary GOTs and cannot
use a regular MIPS mechanism for GOT entries initialization. So we have
to use an approach accepted by other architectures and create dynamic
relocations R_MIPS_REL32 to initialize global entries (and local in case
of PIC code) in secondary GOTs. But ironically MIPS dynamic linker
requires GOT entries and correspondingly ordered dynamic symbol table
entries to deal with dynamic relocations. To handle this problem
relocation-only section in the primary GOT contains entries for all
symbols referenced in global parts of secondary GOTs. Although the sum
of local and normal global entries of the primary got should be less
than 64K, the size of the primary got (including relocation-only entries
can be greater than 64K, because parts of the primary got that overflow
the 64K limit are used only by the dynamic linker at dynamic link-time
and not by 16-bit gp-relative addressing at run-time.

The patch affects common LLD code in the following places:

- Added new hidden -mips-got-size flag. This flag required to set low
maximum size of a single GOT to be able to test the implementation using
small test cases.

- Added InputFile argument to the getRelocTargetVA function. The same
symbol referenced by GOT relocation from different input file might be
allocated in different GOT. So result of relocation depends on the file.

- Added new ctor to the DynamicReloc class. This constructor records
settings of dynamic relocation which used to adjust address of 64kb page
lies inside a specific output section.

With the patch LLD is able to link all LLVM+Clang+LLD applications and
libraries for MIPS 32/64 targets.

Differential revision: https://reviews.llvm.org/D31528

llvm-svn: 334390
2018-06-11 07:24:31 +00:00
Clement Courbet c48435bfe5 [X86] Explicitly mark unsupported classes in scheduling models.
Summary: In preparation for D47721. HSW and SNB still define unsupported
classes as they are used by KNL and generic models respectively.

Reviewers: RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D47763

llvm-svn: 334389
2018-06-11 07:00:08 +00:00
Hans Wennborg 4cae35f6e4 [MS ABI] Mangle unnamed empty enums (PR37723)
Differential Revision: https://reviews.llvm.org/D47875

llvm-svn: 334388
2018-06-11 06:54:23 +00:00
Craig Topper 5e403b2981 [X86] Add encoding tests for avx5124fmaps and avx5124vnni instructions.
I forgot to git add these in r333812

llvm-svn: 334387
2018-06-11 06:22:41 +00:00
Craig Topper c12ac0c984 [X86] Add test files for upgrade of vbmi2 expand load and compress store intrinsics that was done in r334381.
llvm-svn: 334386
2018-06-11 06:20:24 +00:00
Craig Topper 91bbe98757 [X86] Remove masking from dbpsadbw builtins, use select builtin instead.
llvm-svn: 334385
2018-06-11 06:18:29 +00:00
Craig Topper 0e25c8239a [X86] Remove masking from dbpsadbw intrinsics, use select in IR instead.
llvm-svn: 334384
2018-06-11 06:18:22 +00:00
Daniel Cederman 33f67a256b [Sparc] Add support for 13-bit PIC
Summary: When compiling with -fpic, in contrast to -fPIC, use only the
immediate field to index into the GOT. This saves space if the GOT is
known to be small. The linker will warn if the GOT is too large for
this method.

Reviewers: jyknight, venkatra

Reviewed By: jyknight

Subscribers: brad, fedor.sergeev, jrtc27, llvm-commits

Differential Revision: https://reviews.llvm.org/D47136

llvm-svn: 334383
2018-06-11 05:50:08 +00:00
Brock Wyma b60532f89a [CodeView] Omit forward references for unnamed structs and unions
Codeview references to unnamed structs and unions are expected to refer to the
complete type definition instead of a forward reference so Visual Studio can
resolve the type properly.

Differential Revision: https://reviews.llvm.org/D32498

llvm-svn: 334382
2018-06-11 01:39:34 +00:00
Craig Topper e71ad1f6d0 [X86] Remove and autoupgrade the expandload and compressstore intrinsics.
We use the target independent intrinsics now.

llvm-svn: 334381
2018-06-11 01:25:22 +00:00
Craig Topper 08f5c7b8c3 [TableGen] Make better use of std::map::emplace and emplace construct the object in the map rather than moving it into it. Remove a use std::map::find by remembering the return from emplace.
llvm-svn: 334380
2018-06-10 23:15:49 +00:00
Craig Topper d78567f16f [TableGen] Combine two constructors by taking vectors by value instead of trying to support combininations for rvalue and lvalue references.
llvm-svn: 334379
2018-06-10 23:15:48 +00:00
Sanjay Patel 3e5c70cc1d [DAGCombiner] match vector compare and select sizes with extload operand (PR37427)
This patch started off much more general and ambitious, but it's been a nightmare 
seeing all the ways x86 vector codegen can go wrong.

So the code is still structured to allow extending easily, but it's currently 
limited in several ways:

1. Only handle cases with an extending load.
2. Only handle cases with a zero constant compare.
3. Ignore setcc with vector bitmask (SetCCWidth != 1) - so AVX512 should be unaffected.

The motivating case from PR37427:
https://bugs.llvm.org/show_bug.cgi?id=37427
...is the 1st test, and that shows the expected win - we eliminated the unnecessary 
intermediate cast.

There's a clear regression in the last test (sgt_zero_fp_select) because we longer 
recognize a 'SHRUNKBLEND' opportunity. I think that general problem is also present 
in sgt_zero, so I'll try to fix that in a follow-up. We need to match a sign-bit 
setcc from a sign-extended operand and remove it.

Differential Revision: https://reviews.llvm.org/D47330

llvm-svn: 334378
2018-06-10 23:09:50 +00:00