Commit Graph

393019 Commits

Author SHA1 Message Date
LLVM GN Syncbot 87e41cc4b6 [gn build] Port 321c2ea91c 2021-07-08 15:35:54 +00:00
Mark de Wever 321c2ea91c [libc++][NFC] Move monostate to its own header.
The format library uses `std::monostate`, but not a `std::variant`.
Moving `std::monostate` to its own header allows the format library to
reduce the amount of included code.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D105582
2021-07-08 17:35:26 +02:00
Mark de Wever 4947ecf4e9 [libc++] Guard testing implementation details.
The unit tests test some implementation details. As @Quuxplusone pointed
out in D96664 this should only be tested when the tests use libc++. This
addresses the issue for code already in main.

Reviewed By: #libc, ldionne

Differential Revision: https://reviews.llvm.org/D105568
2021-07-08 17:34:58 +02:00
Tim Northover 48c68a630e Recommit: Support: add llvm::thread class that supports specifying stack size.
This adds a new llvm::thread class with the same interface as std::thread
except there is an extra constructor that allows us to set the new thread's
stack size. On Darwin even the default size is boosted to 8MB to match the main
thread.

It also switches all users of the older C-style `llvm_execute_on_thread` API
family over to `llvm::thread` followed by either a `detach` or `join` call and
removes the old API.

Moved definition of DefaultStackSize into the .cpp file to hopefully
fix the build on some (GCC-6?) machines.
2021-07-08 16:22:26 +01:00
Alexey Bataev b5113bff46 [Instcombine]Transform reduction+(sext/zext(<n x i1>) to <n x im>) to [-]zext/trunc(ctpop(bitcast <n x i1> to in)) to im.
Some of the SPEC tests end up with reduction+(sext/zext(<n x i1>) to <n x im>) pattern, which can be transformed to [-]zext/trunc(ctpop(bitcast <n x i1> to in)) to im.
Also, reduction+(<n x i1>) can be transformed to ctpop(bitcast <n x i1> to in) & 1 != 0.

Differential Revision: https://reviews.llvm.org/D105587
2021-07-08 07:56:41 -07:00
Michael Liao 4e5d9c8803 [Internalize] Preserve variables externally initialized.
- ``externally_initialized`` variables would be initialized or modified
  elsewhere. Particularly, CUDA or HIP may have host code to initialize
  or modify ``externally_initialized`` device variables, which may not
  be explicitly referenced on the device side but may still be used
  through the host side interfaces. Not preserving them triggers the
  elimination of them in the GlobalDCE and breaks the user code.

Reviewed By: yaxunl

Differential Revision: https://reviews.llvm.org/D105135
2021-07-08 10:48:19 -04:00
Markus Böck 3e6d2cbf26 [mlir] Fully qualify types and expressions in Interfaces
This patch adds full qualification to the types and function calls used inside of (Static)InterfaceMethod in OpInterfaces. Without this patch using many of these interfaces in a downstream project yields compiler errors as the types and default implementations are mostly copied verbatim. Without then putting using namespace mlir; in the header file of the implementations of those interfaces, compilation is impossible.

Using fully qualified lookup fixes this issue.

Differential Revision: https://reviews.llvm.org/D105619
2021-07-08 16:44:16 +02:00
Alexey Bataev 9320d4b695 [Instcombine][NFC]Add a test for reduce+([sext/zext](<n x i1)) case, NFC. 2021-07-08 07:38:11 -07:00
Michael Liao cc92833f8a [amdgpu] Remove the GlobalDCE pass prior to the internalization pass.
- In [D98783](https://reviews.llvm.org/D98783), an extra GlobalDCE pass
  is inserted before the internalization pass to ensure a global
  variable without users could be internalized even if there are dead
  users. Instead of inserting a dedicated optimization pass, the
  dead user checking, i.e. 'use_empty()', should be preceeded with
  constant dead user removal to ensure an accurate result.

Differential Revision: https://reviews.llvm.org/D105590
2021-07-08 10:25:58 -04:00
Tim Northover 2bf5e8d953 Revert "Support: add llvm::thread class that supports specifying stack size."
It's causing build failures because DefaultStackSize isn't defined everywhere
it should be and I need time to investigate.
2021-07-08 14:59:47 +01:00
Tim Northover 727e1c9be3 Support: add llvm::thread class that supports specifying stack size.
This adds a new llvm::thread class with the same interface as std::thread
except there is an extra constructor that allows us to set the new thread's
stack size. On Darwin even the default size is boosted to 8MB to match the main
thread.

It also switches all users of the older C-style `llvm_execute_on_thread` API
family over to `llvm::thread` followed by either a `detach` or `join` call and
removes the old API.
2021-07-08 14:51:53 +01:00
xndcn 7445f1e4dc [NFC] Mark Expected<T>::assertIsChecked() as const
Some const methods of Expected<T> invoke assertIsChecked(),
so we should mark it as const too.

Differential Revision: https://reviews.llvm.org/D105292
2021-07-08 21:30:23 +08:00
Chia-hung Duan ba913b8da5 [mlir-reduce] Fix the memory leak and recycle unused modules.
Reviewed By: jpienaar

Differential Revision: https://reviews.llvm.org/D105416
2021-07-08 20:03:47 +08:00
Bradley Smith 026bb84bcd [AArch64][SVE] Add ISel patterns for floating point compare with zero instructions
Additionally, lower the floating point compare SVE intrinsics to
SETCC_MERGE_ZERO ISD nodes to avoid duplicating ISel patterns.

Differential Revision: https://reviews.llvm.org/D105486
2021-07-08 10:46:12 +00:00
Max Kazantsev 767eb9f9d5 [Test] Add loop deletion switch tests
Patch by Dmitry Makogon!

Differential Revision: https://reviews.llvm.org/D105543
2021-07-08 17:28:08 +07:00
Nicolas Vasilache 31f80393bc Revert "[mlir][MemRef] Fix DimOp folding of OffsetSizeAndStrideInterface."
This reverts commit 6c0fd4db79.

This simple implementation is unfortunately not extensible and needs to be reverted.
The extensible way should be to extend https://reviews.llvm.org/D104321.
2021-07-08 10:09:00 +00:00
Moritz Sichert d58c7a9238 [IR] Added operator delete to subclasses of User to avoid UB
Several subclasses of User override operator new without also overriding
operator delete. This means that delete expressions fall back to using
operator delete of the base class, which would be User. However, this is
only allowed if the base class has a virtual destructor which is not the
case for User, so this is UB.

See also [expr.delete] (3) for the exact wording.

This is actually detected in some cases by GCC 11's
-Wmismatched-new-delete now which is how I found this error.

Differential Revision: https://reviews.llvm.org/D103143
2021-07-08 11:59:22 +02:00
Martin Storsjö 715ca752ac [libcxx] [test] Fix spurious failures in the thread detach test on Windows
Make sure that the detached thread has started up before exiting
the process.

If the detached thread hasn't started up at all, and the main thread
exits, global data structures in the process are torn down, which
then can cause crashes when the thread starts up late after required
mutexes have been destroyed. (In particular, the mutex used internally
in _Init_thread_header, which is used in the initialization of
__thread_local_data()::__p, can cause crashes if the main thread already
has finished and progressed far with destruction.)

Differential Revision: https://reviews.llvm.org/D105592
2021-07-08 12:36:03 +03:00
Tobias Gysi abfa950d86 [mlir][linalg][python] Add exp and log to the OpDSL.
Introduce the exp and log function in OpDSL. Add the soft plus operator to test the emitted IR in Python and C++.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D105420
2021-07-08 08:48:23 +00:00
Tobias Gysi 84354b2ab2 [mlir][linalg] Remove GenericOpBase.
Remove the GenericOpBase class formerly used to factor out common logic shared be GenericOp and IndexedGenericOp. After removing IndexedGenericOp, the base class is not used anymore.

Differential Revision: https://reviews.llvm.org/D105307
2021-07-08 08:35:28 +00:00
Nicolas Vasilache 6c0fd4db79 [mlir][MemRef] Fix DimOp folding of OffsetSizeAndStrideInterface.
This addresses the issue reported in

https://llvm.discourse.group/t/rank-reducing-memref-subview-offsetsizeandstrideopinterface-interface-issues/3805

Differential Revision: https://reviews.llvm.org/D105558
2021-07-08 08:30:24 +00:00
Alex Zinenko 684dfe8adb [mlir] factor out ConvertToLLVMPattern
This class and classes that extend it are general utilities for any dialect
that is being converted into the LLVM dialect. They are in no way specific to
Standard-to-LLVM conversion and should not make their users depend on it.

Reviewed By: nicolasvasilache

Differential Revision: https://reviews.llvm.org/D105542
2021-07-08 10:27:05 +02:00
Sebastian Neubauer 9ced1e44ad [AMDGPU] Fix typo 2021-07-08 10:07:33 +02:00
Mikael Holmen 21fd875952 [lld/mac] Fix warning about unused variable [NFC]
Change "dyn_cast" to "isa" to get rid of the unused
variable "bitcodeFile".

gcc warned with

lld/MachO/Driver.cpp:531:17: warning: unused variable 'bitcodeFile' [-Wunused-variable]
531 |       if (auto *bitcodeFile = dyn_cast<BitcodeFile>(file)) {
    |                 ^~~~~~~~~~~
2021-07-08 09:46:30 +02:00
Tobias Gysi 511af1b1ad [mlir][linalg] Tighter StructuredOp Verification.
Verify the number of results matches exactly the number of output tensors. Simplify the FillOp verification since part of it got redundant.

Differential Revision: https://reviews.llvm.org/D105427
2021-07-08 06:53:36 +00:00
Lang Hames d7afd11e3d [ORC] Introduce ExecutorAddress type, fix broken LLDB bot.
ExecutorAddressRange depended on JITTargetAddress, but JITTargetAddress is
defined in ExecutionEngine, which OrcShared should not depend on.

This seems like as good a time as any to introduce a new ExecutorAddress type
to eventually replace JITTargetAddress. For now it's just another uint64_t
alias, but it will soon be changed to a class type to provide greater type
safety.
2021-07-08 16:31:59 +10:00
Lang Hames 963378bd82 [ORC] Improve computeLocalDeps / computeNamedSymbolDependencies performance.
The computeNamedSymbolDependencies and computeLocalDeps methods on
ObjectLinkingLayerJITLinkContext are responsible for computing, for each symbol
in the current MaterializationResponsibility, the set of non-locally-scoped
symbols that are depended on. To calculate this we have to consider the effect
of chains of dependence through locally scoped symbols in the LinkGraph. E.g.

        .text
        .globl  foo
foo:
        callq   bar                    ## foo depneds on external 'bar'
        movq    Ltmp1(%rip), %rcx      ## foo depends on locally scoped 'Ltmp1'
        addl    (%rcx), %eax
        retq

        .data
Ltmp1:
        .quad   x                      ## Ltmp1 depends on external 'x'

In this example symbol 'foo' depends directly on 'bar', and indirectly on 'x'
via 'Ltmp1', which is locally scoped.

Performance of the existing implementations appears to have been mediocre:
Based on flame graphs posted by @drmeister (in #jit on the LLVM discord server)
the computeLocalDeps function was taking up a substantial amount of time when
starting up Clasp (https://github.com/clasp-developers/clasp).

This commit attempts to address the performance problems in three ways:

1. Using jitlink::Blocks instead of jitlink::Symbols as the nodes of the
dependencies-introduced-by-locally-scoped-symbols graph.

Using either Blocks or Symbols as nodes provides the same information, but since
there may be more than one locally scoped symbol per block the block-based
version of the dependence graph should always be a subgraph of the Symbol-based
version, and so faster to operate on.

2. Improved worklist management.

The older version of computeLocalDeps used a fixed worklist containing all
nodes, and iterated over this list propagating dependencies until no further
changes were required. The worklist was not sorted into a useful order before
the loop started.

The new version uses a variable work-stack, visiting nodes in DFS order and
only adding nodes when there is meaningful work to do on them.

Compared to the old version the new version avoids revisiting nodes which
haven't changed, and I suspect it converges more quickly (due to the DFS
ordering).

3. Laziness and caching.

Mappings of...

jitlink::Symbol* -> Interned Name (as SymbolStringPtr)
jitlink::Block* -> Immediate dependencies (as SymbolNameSet)
jitlink::Block* -> Transitive dependencies (as SymbolNameSet)

are all built lazily and cached while running computeNamedSymbolDependencies.

According to @drmeister these changes reduced Clasp startup time in his test
setup (averaged over a handful of starts) from 4.8 to 2.8 seconds (with
ORC/JITLink linking ~11,000 object files in that time), which seems like
enough to justify switching to the new algorithm in the absence of any other
perf numbers.
2021-07-08 16:31:59 +10:00
Thomas Lively 0fd5e7b2d8 [WebAssembly][lld] Fix segfault on .bss sections in mapfile
When memory is declared in the Wasm module, we rely on the implicit zero
initialization behavior and do not explicitly output .bss sections. The means
that they do not have associated `outputSec` entries, which was causing
segfaults in the mapfile support. Fix the issue by guarding against null
`outputSec` and falling back to using a zero offset.

Differential Revision: https://reviews.llvm.org/D102951
2021-07-07 23:31:48 -07:00
Thomas Lively f8c5a4c670 [WebAssembly] Optimize out shift masks
WebAssembly's shift instructions implicitly masks the shift count, so optimize
out redundant explicit masks of the shift count. For vector shifts, this
currently only works if the mask is applied before splatting the shift count,
but this should be addressed in a future commit. Resolves PR49655.

Differential Revision: https://reviews.llvm.org/D105600
2021-07-07 23:14:31 -07:00
Lang Hames 5471766f9d [ORC] Replace MachOJITDylibInitializers::SectionExtent with ExecutorAddressRange
MachOJITDylibInitializers::SectionExtent represented the address range of a
section as an (address, size) pair. The new ExecutorAddressRange type
generalizes this to an address range (for any object, not necessarily a section)
represented as a (start-address, end-address) pair.

The aim is to express more of ORC (and the ORC runtime) in terms of simple types
that can be serialized/deserialized via SPS. This will simplify SPS-based RPC
involving arguments/return-values of these types.
2021-07-08 14:15:44 +10:00
Lang Hames 88efb59b78 [ORC] Fix file comments. 2021-07-08 14:15:44 +10:00
Patrick Holland d38b9f1f31 Revert "[MCA] [AMDGPU] Adding an implementation to AMDGPUCustomBehaviour for handling s_waitcnt instructions."
Build failures when building with shared libraries. Reverting until I can fix.

Differential Revision: https://reviews.llvm.org/D104730
2021-07-07 20:48:42 -07:00
Qiu Chaofan a22ecb4508 [PowerPC] Fix i64 to vector lowering on big endian
Lowering for scalar to vector would skip if current subtarget is big
endian and the scalar is larger or equal than 64 bits. However there's
some issue in implementation that SToVRHS may refer to SToVLHS's scalar
size if SToVLHS is present, which leads to some crash.o

Reviewed By: nemanjai, shchenz

Differential Revision: https://reviews.llvm.org/D105094
2021-07-08 11:05:09 +08:00
Jinsong Ji 31d10ea10e [AIX] Don't pass no-integrated-as by default
D105314 added the abibility choose to use AsmParser for parsing inline
asm. -no-intergrated-as will override this default if specified
explicitly.

If toolchain choose to use MCAsmParser for inline asm, don't pass
the option to disable integrated-as explictly unless set by user.

Reviewed By: #powerpc, shchenz

Differential Revision: https://reviews.llvm.org/D105512
2021-07-08 02:50:17 +00:00
Nico Weber e37dbc6e57 [gn build] (manually) port ef16c8eaa5 (MCACustomBehaviorAMDGPU) 2021-07-07 21:59:07 -04:00
Jordan Rupprecht 74c308c56a [Bazel] Fixes for b5d847b1b9 and 6412a13539 2021-07-07 16:50:23 -07:00
Nico Weber 877e835add [gn build] (semi-manually) port 966386514b 2021-07-07 19:27:19 -04:00
Stanislav Mekhanoshin 0fdb25cd95 [AMDGPU] Disable garbage collection passes
Differential Revision: https://reviews.llvm.org/D105593
2021-07-07 15:47:57 -07:00
Leonard Chan 398bfa2ead [compiler-rt][Fuchsia] Disable interceptors while enabling new/delete replacements
This disables use of hwasan interceptors which we do not use on Fuchsia. This
explicitly sets the macro for defining the hwasan versions of new/delete.

Differential Revision: https://reviews.llvm.org/D103544
2021-07-07 15:14:34 -07:00
Matheus Izvekov 2c60d22610 [clang] disable P2266 simpler implicit moves under -fms-compatibility
The Microsoft STL currently has some issues with P2266.
We disable it for now in that mode, but we might come back later with a
more targetted approach.

Signed-off-by: Matheus Izvekov <mizvekov@gmail.com>

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D105518
2021-07-08 00:13:11 +02:00
Leonard Chan 966386514b [compiler-rt][hwasan] Setup hwasan thread handling on Fuchsia
This patch splits up hwasan thread creation between `__sanitizer_before_thread_create_hook`,
`__sanitizer_thread_create_hook`, and `__sanitizer_thread_start_hook`.
The linux implementation creates the hwasan thread object inside the
new thread. On Fuchsia, we know the stack bounds before thread creation,
so we can initialize part of the thread object in `__sanitizer_before_thread_create_hook`,
then initialize the stack ring buffer in `__sanitizer_thread_start_hook`
once we enter the thread.

Differential Revision: https://reviews.llvm.org/D104085
2021-07-07 15:05:28 -07:00
Fangrui Song b81aa458af [llvm-nm][test] Fix just-symbols.test 2021-07-07 15:04:18 -07:00
Arthur Eubanks aad41e2299 [OpaquePtr] Use ArgListEntry::IndirectType for lowering ABI attributes
Consolidate PreallocatedType and ByValType into IndirectType, and use that for inalloca.
2021-07-07 14:58:38 -07:00
Jinsong Ji 89f2d98b98 [PowerPC] Add P7 RUN line for load and splat test 2021-07-07 21:43:46 +00:00
Arthur Eubanks d338e79a4c [OpaquePtr] Remove checking pointee type for byval/preallocated type
These currently always require a type parameter. The bitcode reader
already upgrades old bitcode without the type parameter to use the
pointee type.

In cases where the caller does not have byval but the callee does, we
need to follow CallBase::paramHasAttr() and also look at the callee for
the byval type so that CallBase::isByValArgument() and
CallBase::getParamByValType() are in sync. Do the same for preallocated.

While we're here add a corresponding version for inalloca since we'll
need it soon.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D104663
2021-07-07 14:28:55 -07:00
George Burgess IV 81ee4952f1 utils: add a revert checker
Chrome OS and Android have found it useful to have an automated revert
checker. It was requested to upstream it, since other folks in the LLVM
community may also find value in it.

The tests depend on having a full (non-shallow) checkout of LLVM. This
seems reasonable to me, since:

- the tests should only be run if the user is developing on this script
- it's kind of hard to develop on this script without local git history
  :)

If people really want, the tests' dependency on LLVM's history can be
removed. It's mostly just effort/complexity that doesn't seem necessary.

Differential Revision: https://reviews.llvm.org/D105578
2021-07-07 14:20:01 -07:00
Patrick Holland af3baf1761 [MCA] [AMDGPU] Adding an implementation to AMDGPUCustomBehaviour for handling s_waitcnt instructions.
This commit also makes some slight changes to the scheduling model for AMDGPU to set the RetireOOO flag for all scheduling classes.

This flag is only used by llvm-mca and allows instructions to retire out of order.

See the differential link below for a deeper explanation of everything.

Differential Revision: https://reviews.llvm.org/D104730
2021-07-07 14:17:54 -07:00
David Green ab0096de05 [ARM] Add some opaque pointer gather/scatter tests. NFC
They seem to work OK. Some other test cleanups at the same time.
2021-07-07 22:03:53 +01:00
Nikita Popov f42bc8424e [AsmWriter] Simplify type attribute printing (NFC)
Avoid enumerating all supported type attributes, instead fetch
their name from the attribute kind.
2021-07-07 22:47:33 +02:00
Nikita Popov e000b848e6 [IR] Simplify Attribute::getAsString() (NFC)
Avoid enumerating all attributes here and instead use
getNameFromAttrKind(), which is based on the tablegen data.

This only leaves us with custom handling for int attributes,
which don't have uniform printing.
2021-07-07 22:43:17 +02:00