Commit Graph

15603 Commits

Author SHA1 Message Date
Rafael Espindola ba0a6cabb8 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Benjamin Kramer a1355d17ca Move yaml::Stream's dtor out of line so it can see Scanner's dtor.
llvm-svn: 154004
2012-04-04 08:53:34 +00:00
Benjamin Kramer e43bde73aa Implement DwarfLLVMRegPair::operator< without violating asymmetry.
MSVC8 verifies this.

llvm-svn: 154002
2012-04-04 08:24:08 +00:00
Michael J. Spencer afc0d6a36f Sorry about that. MSVC seems to accept just about any random string you give it ;/
llvm-svn: 153979
2012-04-03 23:36:44 +00:00
Michael J. Spencer 22120c47a7 Add YAML parser to Support.
llvm-svn: 153977
2012-04-03 23:09:22 +00:00
Lang Hames ffa52d2ae2 Matrix simplification in PBQP may push infinite costs onto register options.
The colorability heuristic should count these as denied registers.

No test case - this exposed a bug on an out-of-tree target.

llvm-svn: 153958
2012-04-03 16:27:16 +00:00
Eric Christopher 34164196af Add a line number for the scope of the function (starting at the first
brace) so that we get more accurate line number information about the
declaration of a given function and the line where the function
first starts.

Part of rdar://11026482

llvm-svn: 153916
2012-04-03 00:43:49 +00:00
Pete Cooper 4f0dbb27d9 Fixes to r153903. Added missing explanation of behaviour when the VirtRegMap is NULL. Also changed it in this case to just avoid updating the map, but live ranges or intervals will still get updated and created
llvm-svn: 153914
2012-04-03 00:28:46 +00:00
Pete Cooper 3ca96f9950 Moved LiveRangeEdit.h so that it can be called from other parts of the backend, not just libCodeGen
llvm-svn: 153906
2012-04-02 22:44:18 +00:00
Rafael Espindola f76bff0504 Make dominatedBySlowTreeWalk private and assert cases handled by the caller.
llvm-svn: 153905
2012-04-02 22:37:54 +00:00
Bill Wendling 932b992888 Add an option to turn off the expensive GVN load PRE part of GVN.
llvm-svn: 153902
2012-04-02 22:16:50 +00:00
Owen Anderson 98f2c0c384 Add predicates for checking whether targets have free FNEG and FABS operations, and prevent the DAGCombiner from turning them into bitwise operations if they do.
llvm-svn: 153901
2012-04-02 22:10:29 +00:00
Akira Hatanaka b1f68f9696 Initial 64 bit direct object support.
This patch allows llvm to recognize that a 64 bit object file is being produced
and that the subsequently generated ELF header has the correct information.

The test case checks for both big and little endian flavors.

Patch by Jack Carter.

llvm-svn: 153889
2012-04-02 19:25:22 +00:00
Hal Finkel f208af02a4 Add triple support for the IBM BG/P and BG/Q supercomputers.
llvm-svn: 153882
2012-04-02 18:31:33 +00:00
Rafael Espindola ebe09ec137 Add missing 'd'.
llvm-svn: 153872
2012-04-02 13:02:57 +00:00
Benjamin Kramer 1c0541b031 Move getOpcodeName from the various target InstPrinters into the superclass MCInstPrinter.
All implementations used the same code.

llvm-svn: 153866
2012-04-02 08:32:38 +00:00
Craig Topper 54bfde79db Make MCInstrInfo available to the MCInstPrinter. This will be used to remove getInstructionName and the static data it contains since the same tables are already in MCInstrInfo.
llvm-svn: 153860
2012-04-02 06:09:36 +00:00
Chandler Carruth 219173a1be Start cleaning up the InlineCost class. This switches to sentinel values
rather than a bitfield, a great suggestion by Chris during code review.

There is still quite a bit of cruft in the interface, but that requires
sorting out some awkward uses of the cost inside the actual inliner.

No functionality changed intended here.

llvm-svn: 153853
2012-04-01 22:44:09 +00:00
Benjamin Kramer 12af4285d1 Emit the LLVM<->DWARF register mapping as a sorted table and use binary search to do the lookup.
This also avoids emitting the information twice, which led to code bloat. On i386-linux-Release+Asserts
with all targets built this change shaves a whopping 1.3 MB off clang. The number is probably exaggerated
by recent inliner changes but the methods were already enormous with the old inline cost computation.

The DWARF reg -> LLVM reg mapping doesn't seem to have holes in it, so it could be a simple lookup table.
I didn't implement that optimization yet to avoid potentially changing functionality.

There is still some duplication both in tablegen and the generated code that should be cleaned up eventually.

llvm-svn: 153837
2012-04-01 14:23:58 +00:00
Andrew Trick 779b32a44e misched: Add finalizeScheduler to complete the target interface.
llvm-svn: 153827
2012-04-01 07:24:23 +00:00
Rafael Espindola 1eaae50734 Add a workaround for building with old versions of clang.
llvm-svn: 153820
2012-03-31 21:54:20 +00:00
Rafael Espindola 80c540e656 Teach CodeGen's version of computeMaskedBits to understand the range metadata.
This is the CodeGen equivalent of r153747. I tested that there is not noticeable
performance difference with any combination of -O0/-O2 /-g when compiling
gcc as a single compilation unit.

llvm-svn: 153817
2012-03-31 18:14:00 +00:00
Chandler Carruth edd2826f3e Remove a bunch of empty, dead, and no-op methods from all of these
interfaces. These methods were used in the old inline cost system where
there was a persistent cache that had to be updated, invalidated, and
cleared. We're now doing more direct computations that don't require
this intricate dance. Even if we resume some level of caching, it would
almost certainly have a simpler and more narrow interface than this.

llvm-svn: 153813
2012-03-31 12:48:08 +00:00
Chandler Carruth 0539c071ea Initial commit for the rewrite of the inline cost analysis to operate
on a per-callsite walk of the called function's instructions, in
breadth-first order over the potentially reachable set of basic blocks.

This is a major shift in how inline cost analysis works to improve the
accuracy and rationality of inlining decisions. A brief outline of the
algorithm this moves to:

- Build a simplification mapping based on the callsite arguments to the
  function arguments.
- Push the entry block onto a worklist of potentially-live basic blocks.
- Pop the first block off of the *front* of the worklist (for
  breadth-first ordering) and walk its instructions using a custom
  InstVisitor.
- For each instruction's operands, re-map them based on the
  simplification mappings available for the given callsite.
- Compute any simplification possible of the instruction after
  re-mapping, and store that back int othe simplification mapping.
- Compute any bonuses, costs, or other impacts of the instruction on the
  cost metric.
- When the terminator is reached, replace any conditional value in the
  terminator with any simplifications from the mapping we have, and add
  any successors which are not proven to be dead from these
  simplifications to the worklist.
- Pop the next block off of the front of the worklist, and repeat.
- As soon as the cost of inlining exceeds the threshold for the
  callsite, stop analyzing the function in order to bound cost.

The primary goal of this algorithm is to perfectly handle dead code
paths. We do not want any code in trivially dead code paths to impact
inlining decisions. The previous metric was *extremely* flawed here, and
would always subtract the average cost of two successors of
a conditional branch when it was proven to become an unconditional
branch at the callsite. There was no handling of wildly different costs
between the two successors, which would cause inlining when the path
actually taken was too large, and no inlining when the path actually
taken was trivially simple. There was also no handling of the code
*path*, only the immediate successors. These problems vanish completely
now. See the added regression tests for the shiny new features -- we
skip recursive function calls, SROA-killing instructions, and high cost
complex CFG structures when dead at the callsite being analyzed.

Switching to this algorithm required refactoring the inline cost
interface to accept the actual threshold rather than simply returning
a single cost. The resulting interface is pretty bad, and I'm planning
to do lots of interface cleanup after this patch.

Several other refactorings fell out of this, but I've tried to minimize
them for this patch. =/ There is still more cleanup that can be done
here. Please point out anything that you see in review.

I've worked really hard to try to mirror at least the spirit of all of
the previous heuristics in the new model. It's not clear that they are
all correct any more, but I wanted to minimize the change in this single
patch, it's already a bit ridiculous. One heuristic that is *not* yet
mirrored is to allow inlining of functions with a dynamic alloca *if*
the caller has a dynamic alloca. I will add this back, but I think the
most reasonable way requires changes to the inliner itself rather than
just the cost metric, and so I've deferred this for a subsequent patch.
The test case is XFAIL-ed until then.

As mentioned in the review mail, this seems to make Clang run about 1%
to 2% faster in -O0, but makes its binary size grow by just under 4%.
I've looked into the 4% growth, and it can be fixed, but requires
changes to other parts of the inliner.

llvm-svn: 153812
2012-03-31 12:42:41 +00:00
Chandler Carruth 056b460917 Add support to the InstVisitor for visiting a generic callsite. The
visitor will now visit a CallInst and an InvokeInst with
instruction-specific visitors, then visit a generic CallSite visitor,
then delegate back to the Instruction visitor and the TerminatorInst
visitors depending on whether a call or an invoke originally. This will
be used in the soon-to-land inline cost rewrite.

llvm-svn: 153811
2012-03-31 11:31:24 +00:00
Bill Wendling 152e4739a2 Cleanup whitespace and remove unneeded 'extern' keyword on function definitions.
llvm-svn: 153802
2012-03-31 10:44:20 +00:00
Jakob Stoklund Olesen 066aba5fe9 Reapply 153764 and 153761 with a fix.
Use an explicit comparator instead of the default.

The sets are sorted, but not using the default comparator. Hopefully,
this will unbreak the Linux builders.

llvm-svn: 153772
2012-03-30 20:24:14 +00:00
Rafael Espindola fc06055173 Revert 153764 and 153761. They broke a --enable-optimized --enable-assertions
--enable-expensive-checks build.

llvm-svn: 153771
2012-03-30 20:09:06 +00:00
Jakob Stoklund Olesen 569e116d35 Compress register lists by sharing suffixes.
TableGen emits lists of sub-registers, super-registers, and overlaps. Put
them all in a single table and use a SequenceToOffsetTable to share
suffixes.

llvm-svn: 153761
2012-03-30 17:25:43 +00:00
Rafael Espindola a53c46aaa3 Handle unreachable code in the dominates functions. This changes users when
needed for correctness, but still doesn't clean up code that now unnecessary
checks for reachability.

llvm-svn: 153755
2012-03-30 16:46:21 +00:00
Danil Malyshev 70d22ccb22 Re-factored RuntimeDyLd:
1. The main works will made in the RuntimeDyLdImpl with uses the ObjectFile class. RuntimeDyLdMachO and RuntimeDyLdELF now only parses relocations and resolve it. This is allows to make improvements of the RuntimeDyLd more easily. In addition the support for COFF can be easily added.

2. Added ARM relocations to RuntimeDyLdELF.

3. Added support for stub functions for the ARM, allowing to do a long branch.

4. Added support for external functions that are not loaded from the object files, but can be loaded from external libraries. Now MCJIT can correctly execute the code containing the printf, putc, and etc.

5. The sections emitted instead functions, thanks Jim Grosbach. MemoryManager.startFunctionBody() and MemoryManager.endFunctionBody() have been removed.
6. MCJITMemoryManager.allocateDataSection() and MCJITMemoryManager. allocateCodeSection() used JMM->allocateSpace() instead of JMM->allocateCodeSection() and JMM->allocateDataSection(), because I got an error: "Cannot allocate an allocated block!" with object file contains more than one code or data sections.

llvm-svn: 153754
2012-03-30 16:45:19 +00:00
Bill Wendling 76fdc4b885 Revert r153694. It was causing failures in the buildbots.
llvm-svn: 153701
2012-03-29 23:23:59 +00:00
Danil Malyshev 3548eaf896 Re-factored RuntimeDyld.
Added ExecutionEngine/MCJIT tests.

llvm-svn: 153694
2012-03-29 21:46:18 +00:00
Eric Christopher c13fd6d1e1 Lowercase the tag name to match the rest of dwarf.
llvm-svn: 153691
2012-03-29 21:35:05 +00:00
Eric Christopher 70e1bd8872 Add support for objc property decls according to the page at:
http://llvm.org/docs/SourceLevelDebugging.html#objcproperty

including type and DECL. Expand the metadata needed accordingly.

rdar://11144023

llvm-svn: 153639
2012-03-29 08:42:56 +00:00
Jakob Stoklund Olesen c3e80cc885 Enable machine code verification in the entire code generator.
Some targets still mess up the liveness information, but that isn't
verified after MRI->invalidateLiveness().

The verifier can still check other useful things like register classes
and CFG, so it should be enabled after all passes.

llvm-svn: 153615
2012-03-28 23:54:28 +00:00
Jim Grosbach 4970c304e1 Tidy up. Whitespace.
llvm-svn: 153609
2012-03-28 22:34:41 +00:00
Danil Malyshev bfee542cce Move getPointerToNamedFunction() from JIT/MCJIT to JITMemoryManager.
llvm-svn: 153607
2012-03-28 21:46:36 +00:00
Chandler Carruth 772c88b887 Switch to WeakVHs in the value mapper, and aggressively prune dead basic
blocks in the function cloner. This removes the last case of trivially
dead code that I've been seeing in the wild getting inlined, analyzed,
re-inlined, optimized, only to be deleted. Nukes a FIXME from the
cleanup tests.

llvm-svn: 153572
2012-03-28 08:38:27 +00:00
Jakob Stoklund Olesen 9c1ad5cb7d Add an MRI::tracksLiveness() flag.
Late optimization passes like branch folding and tail duplication can
transform the machine code in a way that makes it expensive to keep the
register liveness information up to date. There is a fuzzy line between
register allocation and late scheduling where the liveness information
degrades.

The MRI::tracksLiveness() flag makes the line clear: While true,
liveness information is accurate, and can be used for register
scavenging. Once the flag is false, liveness information is not
accurate, and can only be used as a hint.

Late passes generally don't need the liveness information, but they will
sometimes use the register scavenger to help update it. The scavenger
enforces strict correctness, and we have to spend a lot of code to
update register liveness that may never be used.

llvm-svn: 153511
2012-03-27 15:13:58 +00:00
Lang Hames 95e021faf5 Add a debug option to dump PBQP graphs during register allocation.
llvm-svn: 153483
2012-03-26 23:07:23 +00:00
Bill Wendling 12a98c9f07 Add 'undef's to make SWIG happier. Patch by Baozeng Ding.
llvm-svn: 153479
2012-03-26 22:15:12 +00:00
Eric Christopher 56079c1e72 Add InitializeNativeTargetDisassembler function.
Patch by Ojab.

llvm-svn: 153476
2012-03-26 21:56:56 +00:00
Craig Topper 6e80c28017 Prune some includes and forward declarations.
llvm-svn: 153429
2012-03-26 06:58:25 +00:00
Craig Topper 7a901d98f6 Prune some includes and forward declarations.
llvm-svn: 153414
2012-03-25 18:09:44 +00:00
Rafael Espindola 074f815148 Use the isReachableFromEntry method.
llvm-svn: 153400
2012-03-24 23:29:27 +00:00
Rafael Espindola c9dccb1179 Avoid using dominatedBySlowTreeWalk.
llvm-svn: 153398
2012-03-24 22:52:25 +00:00
Chandler Carruth cf1b585f60 Refactor the interface to recursively simplifying instructions to be tad
bit simpler by handling a common case explicitly.

Also, refactor the implementation to use a worklist based walk of the
recursive users, rather than trying to use value handles to detect and
recover from RAUWs during the recursive descent. This fixes a very
subtle bug in the previous implementation where degenerate control flow
structures could cause mutually recursive instructions (PHI nodes) to
collapse in just such a way that From became equal to To after some
amount of recursion. At that point, we hit the inf-loop that the assert
at the top attempted to guard against. This problem is defined away when
not using value handles in this manner. There are lots of comments
claiming that the WeakVH will protect against just this sort of error,
but they're not accurate about the actual implementation of WeakVHs,
which do still track RAUWs.

I don't have any test case for the bug this fixes because it requires
running the recursive simplification on unreachable phi nodes. I've no
way to either run this or easily write an input that triggers it. It was
found when using instruction simplification inside the inliner when
running over the nightly test-suite.

llvm-svn: 153393
2012-03-24 21:11:24 +00:00
Rafael Espindola ef9f5504ea First part of PR12251. Add documentation and verifier support for the range
metadata.

llvm-svn: 153359
2012-03-24 00:14:51 +00:00
Kostya Serebryany e505a5abe9 add EP_OptimizerLast extension point
llvm-svn: 153353
2012-03-23 23:22:59 +00:00