Commit Graph

12756 Commits

Author SHA1 Message Date
Eli Friedman f9081a8afe Zap unnecessary isIntDivCheap() check. PR11485. No testcase because this doesn't affect any in-tree target.
llvm-svn: 146015
2011-12-07 03:55:52 +00:00
Jakob Stoklund Olesen 6ad6848522 Add missing check.
llvm-svn: 146004
2011-12-07 01:08:22 +00:00
Eli Friedman ed8b3e38ec Support vector bitcasts in the AsmPrinter. PR11495.
llvm-svn: 146001
2011-12-07 00:50:54 +00:00
Jakob Stoklund Olesen b0d91abec0 Add MachineOperand IsInternalRead flag.
This flag is used when bundling machine instructions.  It indicates
whether the operand reads a value defined inside or outside its bundle.

llvm-svn: 145997
2011-12-07 00:22:07 +00:00
Eli Friedman 0e58cba286 Fix an optimization involving EXTRACT_SUBVECTOR in DAGCombine so it behaves correctly. PR11494.
llvm-svn: 145996
2011-12-07 00:11:56 +00:00
Jakub Staszak c007ab8551 Remove unneeded type.
llvm-svn: 145995
2011-12-07 00:08:00 +00:00
Jakub Staszak d4d2b05eba - Remove unneeded #includes.
- Remove unused types/fields.
- Add some constantness.

llvm-svn: 145993
2011-12-06 23:59:33 +00:00
Evan Cheng 2a81dd4a3c First chunk of MachineInstr bundle support.
1. Added opcode BUNDLE
2. Taught MachineInstr class to deal with bundled MIs
3. Changed MachineBasicBlock iterator to skip over bundled MIs; added an iterator to walk all the MIs
4. Taught MachineBasicBlock methods about bundled MIs

llvm-svn: 145975
2011-12-06 22:12:01 +00:00
Jakob Stoklund Olesen 2a2b37ea4a Pretty-print basic block alignment.
llvm-svn: 145965
2011-12-06 21:08:39 +00:00
Sebastian Pop ac35a4d0f7 use space star instead of star space
llvm-svn: 145944
2011-12-06 17:34:16 +00:00
Sebastian Pop 9aa6137d97 add missing point at the end of sentences
llvm-svn: 145943
2011-12-06 17:34:11 +00:00
Evan Cheng c1610bede1 Mix some minor misuse of MachineBasicBlock iterator.
llvm-svn: 145903
2011-12-06 02:49:06 +00:00
Pete Cooper d2971264c6 Removed isWinToJoinCrossClass from the register coalescer.
The new register allocator is much more able to split back up ranges too constrained by register classes.

Fixes <rdar://problem/10466609>

llvm-svn: 145899
2011-12-06 02:06:50 +00:00
Lang Hames 52f24d7a32 Kill off the LoopSplitter. It's not being used or maintained.
llvm-svn: 145897
2011-12-06 01:57:59 +00:00
Lang Hames b13b6a04d0 Update PBQP's analysis usage to reflect the requirements of the inline spiller.
llvm-svn: 145893
2011-12-06 01:45:57 +00:00
Jakob Stoklund Olesen 10e1252269 Use logarithmic units for basic block alignment.
This was actually a bit of a mess. TLI.setPrefLoopAlignment was clearly
documented as taking log2(bytes) units, but the x86 target would still
set a preferred loop alignment of '16'.

CodePlacementOpt passed this number on to the basic block, and
AsmPrinter interpreted it as bytes.

Now both MachineFunction and MachineBasicBlock use logarithmic
alignments.

Obviously, MachineConstantPool still measures alignments in bytes, so we
can emulate the thrill of using as.

llvm-svn: 145889
2011-12-06 01:26:19 +00:00
Nadav Rotem 3924cb0267 Add support for vectors of pointers.
llvm-svn: 145801
2011-12-05 06:29:09 +00:00
Eric Christopher 8dda5d0f06 Add inline subprogram names to the name lookup table since they may
not get there any other way.

llvm-svn: 145789
2011-12-04 06:02:38 +00:00
Anton Korobeynikov 965e0c6de2 Emit the ctors in the proper order on ARM/EABI.
Maybe some targets should use this as well.

Patch by Evgeniy Stepanov!

llvm-svn: 145781
2011-12-03 23:49:37 +00:00
Benjamin Kramer 71ba18c1e0 Simplify code. No functionality change.
-3% on ARMDissasembler.cpp.

llvm-svn: 145773
2011-12-03 16:18:22 +00:00
Nick Lewycky 50f02cb21b Move global variables in TargetMachine into new TargetOptions class. As an API
change, now you need a TargetOptions object to create a TargetMachine. Clang
patch to follow.

One small functionality change in PTX. PTX had commented out the machine
verifier parts in their copy of printAndVerify. That now calls the version in
LLVMTargetMachine. Users of PTX who need verification disabled should rely on
not passing the command-line flag to enable it.

llvm-svn: 145714
2011-12-02 22:16:29 +00:00
Hal Finkel 4201820275 make sure ScheduleDAGInstrs::EmitSchedule does not crash when the first instruction in Sequence is a Noop
llvm-svn: 145677
2011-12-02 04:58:07 +00:00
Dylan Noblesmith c19f0b7357 CodeGen: fix CMake build
Missing file from r145629.

llvm-svn: 145634
2011-12-01 21:49:23 +00:00
Anshuman Dasgupta 08ebdc1e71 Add a deterministic finite automaton based packetizer for VLIW architectures
llvm-svn: 145629
2011-12-01 21:10:21 +00:00
Chad Rosier 46addb9e07 If fast-isel fails, remove dead instructions generated during the failed
attempt.  

llvm-svn: 145425
2011-11-29 19:40:47 +00:00
Daniel Dunbar 539d0a8a09 build/CMake: Finish removal of add_llvm_library_dependencies.
llvm-svn: 145420
2011-11-29 19:25:30 +00:00
Bill Wendling e4cc332729 On MachO, the pointer to the personality function should always be in the
non_lazy_symbol_pointers section (__IMPORT,__pointers). Ignore the 'hidden' part
since that will place it in the wrong section.
<rdar://problem/10443720>

llvm-svn: 145356
2011-11-29 01:43:20 +00:00
Eli Friedman e7ab1a2f0f Make SelectionDAG::InferPtrAlignment use llvm::ComputeMaskedBits instead of duplicating the logic for globals. Make llvm::ComputeMaskedBits handle GlobalVariables slightly more aggressively, to match what InferPtrAlignment knew how to do.
llvm-svn: 145304
2011-11-28 22:48:22 +00:00
Evan Cheng 4a5b2040e2 Revert r145273 and fix in SelectionDAG::InferPtrAlignment() instead.
Conservatively returns zero when the GV does not specify an alignment nor is it
initialized. Previously it returns ABI alignment for type of the GV. However, if
the type is a "packed" type, then the under-specified alignments is attached to
the load / store instructions. In that case, the alignment of the type cannot be
trusted.
rdar://10464621

llvm-svn: 145300
2011-11-28 22:37:34 +00:00
Evan Cheng a4b6404cf0 DAG combine should not increase alignment of loads / stores with alignment less
than ABI alignment. These are loads / stores from / to "packed" data structures.
Their alignments are intentionally under-specified.

rdar://10301431

llvm-svn: 145273
2011-11-28 20:42:56 +00:00
Chad Rosier 61e8d1026f 80-column.
llvm-svn: 145267
2011-11-28 19:59:09 +00:00
Bill Wendling 5ebc95ff4c Remove dead llvm.eh.sjlj.dispatchsetup intrinsic.
llvm-svn: 145263
2011-11-28 19:23:13 +00:00
Chandler Carruth 4f56720754 Prevent rotating the blocks of a loop (and thus getting a backedge to be
fallthrough) in cases where we might fail to rotate an exit to an outer
loop onto the end of the loop chain.

Having *some* rotation, but not performing this rotation, is the primary
fix of thep performance regression with -enable-block-placement for
Olden/em3d (a whopping 30% regression). Still working on reducing the
test case that actually exercises this and the new rotation strategy out
of this code, but I want to check if this regresses other test cases
first as that may indicate it isn't the correct fix.

llvm-svn: 145195
2011-11-27 20:18:00 +00:00
Chandler Carruth 03adbd46ca Take two on rotating the block ordering of loops. My previous attempt
was centered around the premise of laying out a loop in a chain, and
then rotating that chain. This is good for preserving contiguous layout,
but bad for actually making sane rotations. In order to keep it safe,
I had to essentially make it impossible to rotate deeply nested loops.
The information needed to correctly reason about a deeply nested loop is
actually available -- *before* we layout the loop. We know the inner
loops are already fused into chains, etc. We lose information the moment
we actually lay out the loop.

The solution was the other alternative for this algorithm I discussed
with Benjamin and some others: rather than rotating the loop
after-the-fact, try to pick a profitable starting block for the loop's
layout, and then use our existing layout logic. I was worried about the
complexity of this "pick" step, but it turns out such complexity is
needed to handle all the important cases I keep teasing out of benchmarks.

This is, I'm afraid, a bit of a work-in-progress. It is still
misbehaving on some likely important cases I'm investigating in Olden.
It also isn't really tested. I'm going to try to craft some interesting
nested-loop test cases, but it's likely to be extremely time consuming
and I don't want to go there until I'm sure I'm testing the correct
behavior. Sadly I can't come up with a way of getting simple, fine
grained test cases for this logic. We need complex loop structures to
even trigger much of it.

llvm-svn: 145183
2011-11-27 13:34:33 +00:00
Chandler Carruth 9e46684154 Fix an impressive type-o / spell-o Duncan noticed.
llvm-svn: 145181
2011-11-27 10:32:16 +00:00
Chandler Carruth a054580993 Rework a bit of the implementation of loop block rotation to not rely so
heavily on AnalyzeBranch. That routine doesn't behave as we want given
that rotation occurs mid-way through re-ordering the function. Instead
merely check that there are not unanalyzable branching constructs
present, and then reason about the CFG via successor lists. This
actually simplifies my mental model for all of this as well.

The concrete result is that we now will rotate more loop chains. I've
added a test case from Olden highlighting the effect. There is still
a bit more to do here though in order to regain all of the performance
in Olden.

llvm-svn: 145179
2011-11-27 09:22:53 +00:00
Chandler Carruth 9ffb97e631 Introduce a loop block rotation optimization to the new block placement
pass. This is designed to achieve one of the important optimizations
that the old code placement pass did, but more simply.

This is a somewhat rough and *very* conservative version of the
transform. We could get a lot fancier here if there are profitable cases
to do so. In particular, this only looks for a single pattern, it
insists that the loop backedge being rotated away is the last backedge
in the chain, and it doesn't provide any means of doing better in-loop
placement due to the rotation. However, it appears that it will handle
the important loops I am finding in the LLVM test suite.

llvm-svn: 145158
2011-11-27 00:38:03 +00:00
Benjamin Kramer 7ba71be392 Move code into anonymous namespaces.
llvm-svn: 145154
2011-11-26 23:01:57 +00:00
Chandler Carruth 7adee1a01a Fix a silly use-after-free issue. A much earlier version of this code
need lots of fanciness around retaining a reference to a Chain's slot in
the BlockToChain map, but that's all gone now. We can just go directly
to allocating the new chain (which will update the mapping for us) and
using it.

Somewhat gross mechanically generated test case replicates the issue
Duncan spotted when actually testing this out.

llvm-svn: 145120
2011-11-24 11:23:15 +00:00
Chandler Carruth d394bafd2d When adding blocks to the list of those which no longer have any CFG
conflicts, we should only be adding the first block of the chain to the
list, lest we try to merge into the middle of that chain. Most of the
places we were doing this we already happened to be looking at the first
block, but there is no reason to assume that, and in some cases it was
clearly wrong.

I've added a couple of tests here. One already worked, but I like having
an explicit test for it. The other is reduced from a test case Duncan
reduced for me and used to crash. Now it is handled correctly.

llvm-svn: 145119
2011-11-24 08:46:04 +00:00
Chandler Carruth 99fe42fbd9 Relax an invariant that block placement was trying to assert a bit
further. This invariant just wasn't going to work in the face of
unanalyzable branches; we need to be resillient to the phenomenon of
chains poking into a loop and poking out of a loop. In fact, we already
were, we just needed to not assert on it.

This was found during a bootstrap with block placement turned on.

llvm-svn: 145100
2011-11-23 10:35:36 +00:00
Chandler Carruth 8c68f1f3c8 Handle the case of a no-return invoke correctly. It actually still has
successors, they just are all landing pad successors. We handle this the
same way as no successors. Comments attached for the next person to wade
through here and another lovely test case courtesy of Benjamin Kramer's
bugpoint reduction.

llvm-svn: 145098
2011-11-23 08:23:54 +00:00
Bob Wilson ebb44646c4 Enable stack protectors for all arrays, not just char arrays. rdar://5875909
Patch by Bill Wendling.

llvm-svn: 145097
2011-11-23 07:13:56 +00:00
Jakob Stoklund Olesen 02845410f9 Fix PR11422.
This was a bug in keeping track of the available domains when merging
domain values.

The wrong domain mask caused ExecutionDepsFix to try to move VANDPSYrr
to the integer domain which is only available in AVX2.

Also add an assertion to catch future attempts at emitting AVX2
instructions.

llvm-svn: 145096
2011-11-23 04:03:08 +00:00
Chandler Carruth 4a87aa0c31 Fix a crash in block placement due to an inner loop that happened to be
reversed in the function's original ordering, and we happened to
encounter it while handling an outer unnatural CFG structure.

Thanks to the test case reduced from GCC's source by Benjamin Kramer.
This may also fix a crasher in gzip that Duncan reduced for me, but
I haven't yet gotten to testing that one.

llvm-svn: 145094
2011-11-23 03:03:21 +00:00
Chandler Carruth ee54feb6f6 Fix a devilish miscompile exposed by block placement. The
updateTerminator code didn't correctly handle EH terminators in one very
specific case. AnalyzeBranch would find no terminator instruction, and
so the fallback in updateTerminator is to assume fallthrough. This is
correct, but the destination of the fallthrough was assumed to be the
first successor.

This is *almost always* true, but in certain cases the loop
transformations will cause the landing pad to be the first successor!
Instead of this brittle logic, actually look through the successors for
a non-landing-pad accessor, and to assert if more than one is found.

This will hopefully fix some (if not all) of the self host miscompiles
with block placement. Thanks to Benjamin Kramer for reporting, Nick
Lewycky for an initial stab at a reduction, and Duncan for endless
advice on EH (which I know nothing about) as well as reviewing the
actual fix.

llvm-svn: 145062
2011-11-22 13:13:16 +00:00
Chandler Carruth e2530dc889 Fix an obvious omission in the SelectionDAGBuilder where we were
dropping weights on the floor for invokes. This was impeding my writing
further test cases for invoke when interacting with probabilities and
block placement.

No test case as there doesn't appear to be a way to test this stuff. =/
Suggestions for a test case of course welcome. I hope to be able to add
test cases that indirectly cover this eventually by adding probabilities
to the exceptional edge and reordering blocks as a result.

llvm-svn: 145060
2011-11-22 11:37:46 +00:00
Rafael Espindola 2021f38281 If a register is both an early clobber and part of a tied use, handle the use
before the clobber so that we copy the value if needed.

Fixes pr11415.

llvm-svn: 145056
2011-11-22 06:27:18 +00:00
Chandler Carruth 18dfac385b The logic for breaking the CFG in the presence of hot successors didn't
properly account for the *global* probability of the edge being taken.
This manifested as a very large number of unconditional branches to
blocks being merged against the CFG even though they weren't
particularly hot within the CFG.

The fix is to check whether the edge being merged is both locally hot
relative to other successors for the source block, and globally hot
compared to other (unmerged) predecessors of the destination block.

This introduces a new crasher on GCC single-source, but it's currently
behind a flag, and Ben has offered to work on the reduction. =]

llvm-svn: 145010
2011-11-20 11:22:06 +00:00
Chandler Carruth f3dc9eff16 Move the handling of unanalyzable branches out of the loop-driven chain
formation phase and into the initial walk of the basic blocks. We
essentially pre-merge all blocks where unanalyzable fallthrough exists,
as we won't be able to update the terminators effectively after any
reorderings. This is quite a bit more principled as there may be CFGs
where the second half of the unanalyzable pair has some analyzable
predecessor that gets placed first. Then it may get placed next,
implicitly breaking the unanalyzable branch even though we never even
looked at the part that isn't analyzable. I've included a test case that
triggers this (thanks Benjamin yet again!), and I'm hoping to synthesize
some more general ones as I dig into related issues.

Also, to make this new scheme work we have to be able to handle branches
into the middle of a chain, so add this check. We always fallback on the
incoming ordering.

Finally, this starts to really underscore a known limitation of the
current implementation -- we don't consider broken predecessors when
merging successors. This can caused major missed opportunities, and is
something I'm planning on looking at next (modulo more bug reports).

llvm-svn: 144994
2011-11-19 10:26:02 +00:00