Author: Samuel Pitoiset
Without this patch, it appears to me that we are selecting
the wrong operand when inverting conditions. In the attached
test, it will select %tmp3 instead of %tmp4. To fix it, just
use 'A' as everywhere.
This fixes a regression introduced by
"[PatternMatch] define m_Not using m_Xor and cst_pred_ty"
https://reviews.llvm.org/D46351
llvm-svn: 332403
This option just keeps being a problem and really needs to be implemented
in some fashion. Implementing it properly requires some kind of
"replaceSectionReference" method because all the existing links need to be
maintained. The desired behavior is just for allocated sections to become
NOBITS but actually implementing that is rather tricky due to the current
design of llvm-objcopy. However converting allocated sections to NOBITS is
just an optimization and not something debuggers need. Debuggers can debug
a stripped executable and take an unstripped executable for that stripped
executable as input. Additionally allocated sections account for a very
small part of debug binaries so this optimization is quite small. I propose
that for the time being we implement this as a NOP so that people can use
llvm-objcopy where they need to, just in a sub-optimal way.
This option has already blocked a lot of people and its currently blocking me.
llvm-svn: 332396
When storing the 0th lane of a vector, use a simpler and usually more
efficient scalar store instead. In this case, also using the unscaled
offset.
Differential revision: https://reviews.llvm.org/D46762
llvm-svn: 332394
search. NFCI.
Migrate single-use and non-volatility, non-indexed requirements on
stores of immediate store values to candidate collection pass from
later stage.
llvm-svn: 332392
Summary:
This is just an idea, really two ideas. I expect some push-back,
but I realize that posting a diff is the most comprehensive way to express
these concepts.
This patch introduces a Stage class which represents the
various stages of an instruction pipeline. As a start, I have created a simple
FetchStage that is based on existing logic for how MCA produces
instructions, but now encapsulated in a Stage. The idea should become more concrete
once we introduce additional stages. The idea being, that when a stage completes,
the next stage in the pipeline will be executed. Stages are chained together
as a singly linked list to closely model a real pipeline. For now there is only one stage,
so the stage-to-stage flow of instructions isn't immediately obvious.
Eventually, Stage will also handle event notifications, but that functionality
is not complete, and not destined for this patch. Ideally, an interested party
can register for notifications from a particular stage. Callbacks will be issued to
these listeners at various points in the execution of the stage.
For now, eventing functionality remains similar to what it has been in mca::Backend.
We will be building-up the Stage class as we move on, such as adding debug output.
This patch also removes the unique_ptr<Instruction> return value from
InstrBuilder::createInstruction. An Instruction pointer is still produced,
but now it's up to the caller to decide how that item should be managed post-allocation
(e.g., smart pointer). This allows the Fetch stage to create instructions and
manage the lifetime of those instructions as it wishes, and not have to be bound to any
specific managed pointer type. Other callers of createInstruction might have different
requirements, and thus can manage the pointer to fit their needs. Another idea would be to push the
ownership to the RCU.
Currently, the FetchStage will wrap the Instruction
pointer in a shared_ptr. This allows us to remove the Instruction container in
Backend, which was probably going to disappear, or move, at some point anyways.
Note that I did run these changes through valgrind, to make sure we are not leaking
memory. While the shared_ptr comes with some additional overhead it relieves us
from having to manage a list of generated instructions, and/or make lookup calls
to remove the instructions.
I realize that both the Stage class and the Instruction pointer management
(mentioned directly above) are separate but related ideas, and probably should
land as separate patches; I am happy to do that if either idea is decent.
The main reason these two ideas are together is that
Stage::execute() can mutate an InstRef. For the fetch stage, the InstRef is populated
as the primary action of that stage (execute()). I didn't want to change the Stage interface
to support the idea of generating an instruction. Ideally, instructions are to
be pushed through the pipeline. I didn't want to draw too much of a
specialization just for the fetch stage. Excuse the word-salad.
Reviewers: andreadb, courbet, RKSimon
Reviewed By: andreadb
Subscribers: llvm-commits, mgorny, javed.absar, tschuett, gbedwell
Differential Revision: https://reviews.llvm.org/D46741
llvm-svn: 332390
specially handle SETB_C* pseudo instructions.
Summary:
While the logic here is somewhat similar to the arithmetic lowering, it
is different enough that it made sense to have its own function.
I actually tried a bunch of different optimizations here and none worked
well so I gave up and just always do the arithmetic based lowering.
Looking at code from the PR test case, we actually pessimize a bunch of
code when generating these. Because SETB_C* pseudo instructions clobber
EFLAGS, we end up creating a bunch of copies of EFLAGS to feed multiple
SETB_C* pseudos from a single set of EFLAGS. This in turn causes the
lowering code to ruin all the clever code generation that SETB_C* was
hoping to achieve. None of this is needed. Whenever we're generating
multiple SETB_C* instructions from a single set of EFLAGS we should
instead generate a single maximally wide one and extract subregs for all
the different desired widths. That would result in substantially better
code generation. But this patch doesn't attempt to address that.
The test case from the PR is included as well as more directed testing
of the specific lowering pattern used for these pseudos.
Reviewers: craig.topper
Subscribers: sanjoy, mcrosier, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D46799
llvm-svn: 332389
Summary:
After r332167 we started to sort the IDF blocks inside IDF calculation, so
there is no need to re-sort them on the user site. The test changes are due to
a slightly different order we're using now (originally we used DFSInNumber and
now the blocks are sorted by a pair (LevelFromRoot, DFSInNumber)).
Reviewers: dberlin, mgrang
Subscribers: Prazek, hiraditya, george.burgess.iv, llvm-commits
Differential Revision: https://reviews.llvm.org/D46899
llvm-svn: 332385
Certain tests in rtti-options.cpp are not really testing anything because they are testing for the absence of -frtti option to the cc1 process. Since the cc1 process does not take -frtti option, these tests are passing tautologically.
The RTTI mode is enabled by default in cc1, and -fno-rtti disables it. Therefore the correct way to check for enabling of RTTI is to check for the absence of -fno-rtti to cc1, and the correct way to check for disabling of RTTI is to check for the presence of -fno-rtti to cc1.
This patch fixes those tests.
Differential Revision: https://reviews.llvm.org/D46836
llvm-svn: 332384
In generic data-sharing mode we do not need to globalize
variables/parameters of reference/pointer types. They already are placed
in the global memory.
llvm-svn: 332380
Summary:
Code completion scoring was embedded in CodeComplete.cpp, which is bad:
- awkward to test. The mechanisms (extracting info from index/sema) can be
unit-tested well, the policy (scoring) should be quantitatively measured.
Neither was easily possible, and debugging was hard.
The intermediate signal struct makes this easier.
- hard to reuse. This is a bug in workspaceSymbols: it just presents the
results in the index order, which is not sorted in practice, it needs to rank
them!
Also, index implementations care about scoring (both query-dependent and
independent) in order to truncate result lists appropriately.
The main yak shaved here is the build() function that had 3 variants across
unit tests is unified in TestTU.h (rather than adding a 4th variant).
Reviewers: ilya-biryukov
Subscribers: klimek, mgorny, ioeric, MaskRay, jkorous, mgrang, cfe-commits
Differential Revision: https://reviews.llvm.org/D46524
llvm-svn: 332378
BtVer2 - Fixes schedules for (V)CVTPS2PD instructions
A lot of the Intel models still have too many InstRW overrides for these new classes - this needs cleaning up but I wanted to get the classes in first
llvm-svn: 332376
This CL places .dynsym and .dynstr at the beginning of SHF_ALLOC
sections. We do this to mitigate the possibility that huge .dynsym and
.dynstr sections placed between ro-data and text sections cause
relocation overflow.
Differential Revision: https://reviews.llvm.org/D45788
llvm-svn: 332374
Summary: This is similar to D46290 D46320.
Reviewers: ruiu, grimar
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D46861
llvm-svn: 332372
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
Differential Revision: https://reviews.llvm.org/D44976
llvm-svn: 332371
LLVM uses cpp as its C++ file extension, these are the only three cxx file in
the monorepo. These files apparently were called to escape a CMake check -- use
the LLVM_OPTIONAL_SOURCES mechanism that's meant as an escape for this case
instead.
No intended behavior change.
https://reviews.llvm.org/D46843
llvm-svn: 332368
The test case added in r332265 had incomplete livein information which
was caught by the EXPENSIVE_CHECKS bot. Fix the livein information and
add -verify-machineinstrs to the test case.
llvm-svn: 332367
Summary:
o Remove IncludeInsertion LSP command.
o Populate include insertion edits synchromously in completion items.
o Share the code completion compiler instance and precompiled preamble to get existing inclusions in main file.
o Include insertion logic lives only in CodeComplete now.
o Use tooling::HeaderIncludes for inserting new includes.
o Refactored tests.
Reviewers: sammccall
Reviewed By: sammccall
Subscribers: klimek, ilya-biryukov, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D46497
llvm-svn: 332363
Summary:
clangd will populate #include insertions as addtionalEdits in completion items.
The code completion tests in ClangdServerTest will be added back in D46497.
Reviewers: ilya-biryukov, sammccall
Reviewed By: ilya-biryukov
Subscribers: klimek, MaskRay, jkorous, cfe-commits
Differential Revision: https://reviews.llvm.org/D46676
llvm-svn: 332362
Summary: Add function to the NVPTX libomptarget library that will return true if the current target region is being executed in SPMD mode.
Reviewers: ABataev, grokos, carlo.bertolli, caomhin
Reviewed By: grokos
Subscribers: guansong, openmp-commits
Differential Revision: https://reviews.llvm.org/D46840
llvm-svn: 332360
Summary: Creating the Scop name is expensive, because creating the
Region name it's derived from is expensive. So create the name lazily,
because getName() is actually called rarely.
This is a reiteration of r328666, which introduced a use-after-free and
got reverted in r331363.
Differential Revision: https://reviews.llvm.org/D46868
llvm-svn: 332359
This is a simple hack based on what's proposed in D37686, but we can extend it if needed in follow-ups.
It gets us most of the FMF functionality that we want without adding any state bits to the flags. It
also intentionally leaves out non-FMF flags (nsw, etc) to minimize the patch.
It should provide a superset of the functionality from D46563 - the extra tests show propagation and
codegen diffs for fcmp, vecreduce, and FP libcalls.
The PPC log2() test shows the limits of this most basic approach - we only applied 'afn' to the last
node created for the call. AFAIK, there aren't any libcall optimizations based on the flags currently,
so that shouldn't make any difference.
Differential Revision: https://reviews.llvm.org/D46854
llvm-svn: 332358
Btver2 - VCVTPH2PSYrm needs to double pump the AGU
Broadwell - missing VCVTPS2PH*mr stores extra latency
Allows us to remove the WriteCvtF2FSt conversion store class
llvm-svn: 332357
This option permits to explicitly keep the specified
symbol so that it doesn't get removed.
Differential Revision: https://reviews.llvm.org/D46819
llvm-svn: 332356
Summary:
Unittests aren't working, and I don't think they ever were. Just remove
them, so that we don't have to write `arc --nounit` all the time.
Reviewers: grosser, Meinersbur, bollu
Subscribers: bollu, pollydev, llvm-commits
Differential Revision: https://reviews.llvm.org/D46869
llvm-svn: 332355