This is meant to reduce the typing that one needs to do to get from the test subdirectory to actual test cases. Now one can just do
$ ./dotest.py ./testcases/<yaddayaddayadda>
llvm-svn: 255741
Summary:
This patch adds support for using LIT to drive generating PGO profile data for clang.
This first pass implementation should work on Linux and Unix based platforms. If you build clang using CMake with LLVM_BUILD_INSTRUMENTED=On the CMake build generates a generate-profdata target that will use the just-built clang to build any test files (see hello_world.cpp as an example). Each test compile will generate profraw files for each clang process. After all tests have run CMake will merge the profraw files using llvm-profdata.
Future opportunities for extension:
* Support for Build->Profile->Build bootstrapping
* Support for linker order file generation using a similar mechanism and the same training data
* Support for Windows
Reviewers: dexonsmith, friss, bogner, cmatthews, vsk, silvas
Subscribers: cfe-commits
Differential Revision: http://reviews.llvm.org/D15462
llvm-svn: 255740
Extend EarlyCSE with an additional style of dead store elimination. If we write back a value just read from that memory location, we can eliminate the store under the assumption that the value hasn't changed.
I'm implementing this mostly because I noticed the omission when looking at the code. It seemed strange to have InstCombine have a peephole which was more powerful than EarlyCSE. :)
Differential Revision: http://reviews.llvm.org/D15397
llvm-svn: 255739
This patch allows atomic loads and stores of floating point to be specified in the IR and adds an adapter to allow them to be lowered via existing backend support for bitcast-to-equivalent-integer idiom.
Previously, the only way to specify a atomic float operation was to bitcast the pointer to a i32, load the value as an i32, then bitcast to a float. At it's most basic, this patch simply moves this expansion step to the point we start lowering to the backend.
This patch does not add canonicalization rules to convert the bitcast idioms to the appropriate atomic loads. I plan to do that in the future, but for now, let's simply add the support. I'd like to get instruction selection working through at least one backend (x86-64) without the bitcast conversion before canonicalizing into this form.
Similarly, I haven't yet added the target hooks to opt out of the lowering step I added to AtomicExpand. I figured it would more sense to add those once at least one backend (x86) was ready to actually opt out.
As you can see from the included tests, the generated code quality is not great. I plan on submitting some patches to fix this, but help from others along that line would be very welcome. I'm not super familiar with the backend and my ramp up time may be material.
Differential Revision: http://reviews.llvm.org/D15471
llvm-svn: 255737
Summary:
Using the blacklist the user can filter own unwanted functions
from all outputs. By default blacklist contains "fun:__sancov*" line.
Differential Revision: http://reviews.llvm.org/D15364
llvm-svn: 255732
Summary:
DWARF 5 proposes a reinvented .debug_macro section. This change follows
that spec.
Currently, only GCC produces the .debug_macro section and hence
the added test is annottated with expectedFailureClang.
Reviewers: spyffe, clayborg, tberghammer
Subscribers: lldb-commits
Differential Revision: http://reviews.llvm.org/D15437
llvm-svn: 255729
When a pass removes a loop it currently has to reach up into the
LPPassManager's internals to update the state of the iteration over
loops. This reverse dependency results in a pretty awkward interplay
of the LPPassManager and its Passes.
Here, we change this to instead keep track of when a loop has become
"unlooped" in the Loop objects themselves, then the LPPassManager can
check this and manipulate its own state directly. This opens the door
to allow most of the loop passes to work without a backreference to
the LPPassManager.
I've kept passes calling the LPPassManager::deleteLoopFromQueue API
now so I could put an assert in to prove that this is NFC, but a later
pass will update passes just to preserve the LoopInfo directly and
stop referencing the LPPassManager completely.
llvm-svn: 255720
Load instructions may possibly be related to multiple memory accesses, but we
are only interested in the array read access that describes the memory location
the load instructions loads from. By using getArrayAccessfor we ensure to always
obtain the right memory access.
This issue was found by inspection without having a failing test case.
llvm-svn: 255716
getAccessFor does not guarantee a certain access to be returned in case an
instruction is related to multiple accesses. However, in the vector code
generation we want to know the stride of the array access of a store
instruction. By using getArrayAccessFor we ensure we always get the correct
memory access.
This patch fixes a potential bug, but I was unable to produce a failing test
case. Several existing test cases cover this code, but all of them already
passed out of luck (or the specific but not-guaranteed order in which we build
memory accesses).
llvm-svn: 255715
When generating scalar loads/stores separately the vector code has not been
updated. This commit adds code to generate scalar loads for vector code as well
as code to assert in case scalar stores are encountered within a vector loop.
llvm-svn: 255714
When rewriting the access functions of load/store statements, we are only
interested in the actual array memory location. The current code just took
the very first memory access, which could be a scalar or an array access. As
a result, we failed to update access functions even though this was requested
via .jscop.
llvm-svn: 255713
debugserver. thread-pcs has a comma separated list of base 16
addresses - the current pc value for every thread in the process.
It is a partner of the "threads:" key where a list of thread IDs
is given. The pc values in thread-pcs correspond one-to-one with
the thread IDs in the threads list.
This is a part of performance work. When lldb is instruction
stepping / fast stepping over a range of addresses for e.g. a "next"
command, and it steps in to another function, lldb will put a
breakpoint on the return address and continue the process. Before
it calls continue, it calls Thread::SetupForResume on all the
threads, and SetupForResume needs to get the current pc value for
every thread to see if any are at a breakpoint site.
The result is that issuing a "c" continue requires that we send
"read pc register" packets for every thread.
We may do this sequence of step-into-function / continue-to-get-out
many times for a single user-visible "next" or "step" command, and
with highly multithreaded programs, we are sending many extra
packets to get all the thread values.
I looked at including this data in the "jstopinfo" JSON that
we already have in the T packet. But there are three problems that
would make this increase the size of the T packet significantly.
First, numbers in JSON are base 10. Second, a proper JSON would
have something like "thread_pcs": { "34224331112":383772734222, ...}
for thread-id 34224331112 and pc 383772734222 - so we're including
a whole extra copy of the thread id in addition to the pc. Third,
the JSON text is hex-ascii'fied so the size of it is doubled.
In one example,
threads:585db8,585dc7,585dc8,585dc9,585dca,585dce;thread-pcs:100001400,7fff8badc6de,7fff8badcff6,7fff8badc6de,7fff8badc6de,7fff8badc6de;
The "thread-pcs" adds 86 characters - 136 characters for both
threads and thread-pcs. Doing this in JSON would look like
threads={"5791160":4294972416,"5791175":140735536809694,"5791176":140735536812022,"5791177":140735536809694,"5791178":140735536809694,"5791182":140735536809694}
or 160 characters -- or 320 characters once it is hex-asciified.
Given that it's 86 characters vrs 320, I went with the old style
approach. I've seen real world programs that have up to 60 threads
in them, so this could result in vastly larger packets if it
was all done in the JSON with hex-ascii expansion.
If we had an all-JSON T packet, where we didn't need to hex-ascii
encode anything, that would have been the better approach. But
we'd already have a list of threads in JSON at that point so
the additional text wouldn't be too bad.
I'm working on finishing the patches to lldb to use this data;
will commit those once I've had a chance to test them more. But
I wanted to commit the debugserver bits which are more
straightforward.
<rdar://problem/21963031>
llvm-svn: 255711
These tests started passing after libcxxabi's r255559, which fixed a problem
relating to how libcxxabi links its EH library. The test failures were
caused by an issue with libc++, not the sanitizers (confirmed by building a
pre-r255559 revision with libc++/libc++abi and without sanitizers), so they
should never have been XFAILed under the sanitizers.
llvm-svn: 255708
It adjusts from RSP-after-prologue to RBP, which is what SEH filters
need to do before they can use llvm.localrecover.
Fixes SEH filter captures, which were broken in r250088.
Issue reported by Alex Crichton.
llvm-svn: 255707
This patch improves on the suggested codegen from PR24475:
https://llvm.org/bugs/show_bug.cgi?id=24475
but only for the fmaxf() case to start, so we can sort out any bugs before
extending to fmin, f64, and vectors.
The fmax / maxnum definitions provide us flexibility for signed zeros, so the
only thing we have to worry about in this replacement sequence is NaN handling.
Note 1: It may be better to implement this as lowerFMAXNUM(), but that exposes
a problem: SelectionDAGBuilder::visitSelect() transforms compare/select
instructions into FMAXNUM nodes if we declare FMAXNUM legal or custom. Perhaps
that should be checking for NaN inputs or global unsafe-math before transforming?
As it stands, that bypasses a big set of optimizations that the x86 backend
already has in PerformSELECTCombine().
Note 2: The v2f32 test reveals another bug; the vector is extended to v4f32, so
we have completely unnecessary operations happening on undef elements of the
vector.
Differential Revision: http://reviews.llvm.org/D15294
llvm-svn: 255700
This is an initial version of the runtime cross-DSO CFI support
library.
It contains a number of FIXMEs, ex. it does not support the
diagnostic mode nor dlopen/dlclose, but it works and can be tested.
Diagnostic mode, in particular, would require some refactoring (we'd
like to gather all CFI hooks in the UBSan library into one function
so that we could easier pass the diagnostic information down to
__cfi_check). It will be implemented later.
Once the diagnostic mode is in, I plan to create a second test
configuration to run all existing tests in both modes. For now, this
patch includes only a few new cross-DSO tests.
llvm-svn: 255695
Clang-side cross-DSO CFI.
* Adds a command line flag -f[no-]sanitize-cfi-cross-dso.
* Links a runtime library when enabled.
* Emits __cfi_slowpath calls is bitset test fails.
* Emits extra hash-based bitsets for external CFI checks.
* Sets a module flag to enable __cfi_check generation during LTO.
This mode does not yet support diagnostics.
llvm-svn: 255694
An LTO pass that generates a __cfi_check() function that validates a
call based on a hash of the call-site-known type and the target
pointer.
llvm-svn: 255693
Summary: I'm not sure how things worked before without this.
Reviewers: arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D15492
llvm-svn: 255692