Commit Graph

81648 Commits

Author SHA1 Message Date
Chandler Carruth 68062617a6 Make a somewhat subtle change in the logic of block placement. Sometimes
the loop header has a non-loop predecessor which has been pre-fused into
its chain due to unanalyzable branches. In this case, rotating the
header into the body of the loop in order to place a loop exit at the
bottom of the loop is a Very Bad Idea as it makes the loop
non-contiguous.

I'm working on a good test case for this, but it's a bit annoynig to
craft. I should get one shortly, but I'm submitting this now so I can
begin the (lengthy) performance analysis process. An initial run of LNT
looks really, really good, but there is too much noise there for me to
trust it much.

llvm-svn: 154395
2012-04-10 13:35:57 +00:00
Anton Korobeynikov 4d1220de34 Transform div to mul with reciprocal only when fp imm is legal.
This fixes PR12516 and uncovers one weird problem in legalize (workarounded)

llvm-svn: 154394
2012-04-10 13:22:49 +00:00
David Chisnall bbec87205d Use the correct section types on Solaris for unwind data on both x86 and x86-64.
Patch by Dmitri Shubin!

llvm-svn: 154391
2012-04-10 11:44:33 +00:00
Duncan Sands af06b26c8e Express the number of ULPs in fpaccuracy metadata as a real rather than a
rational number, eg as 2.5 rather than 5, 2.  OK'd by Peter Collingbourne.

llvm-svn: 154387
2012-04-10 08:22:43 +00:00
Andrew Trick 4442bfe559 Fix 12513: Loop unrolling breaks with indirect branches.
Take this opportunity to generalize the indirectbr bailout logic for
loop transformations. CFG transformations will never get indirectbr
right, and there's no point trying.

llvm-svn: 154386
2012-04-10 05:14:42 +00:00
Andrew Trick 4104ed9c76 whitespace
llvm-svn: 154385
2012-04-10 05:14:37 +00:00
Andrew Trick 7d52db9864 Fix for register pressure tables.
Recent refactoring introduced a bug. Fix: added buildRegUnitSets.

llvm-svn: 154382
2012-04-10 03:36:49 +00:00
Evan Cheng 0752624970 Add proper checks.
llvm-svn: 154379
2012-04-10 03:15:42 +00:00
Evan Cheng 136861d994 Make the code slightly more palatable.
llvm-svn: 154378
2012-04-10 03:15:18 +00:00
Andrew Trick 9002c3157f Use std::includes instead of my own implementation.
Jakob's review.

llvm-svn: 154377
2012-04-10 03:12:29 +00:00
Andrew Trick 31f6487532 Added a TargetRegisterInfo interface for accessing register pressure sets.
llvm-svn: 154375
2012-04-10 02:25:26 +00:00
Andrew Trick 739a00386e Added register unit sets to the target description.
This is a new algorithm that finds sets of register units that can be
used to model registers pressure. This handles arbitrary, overlapping
register classes. Each register class is associated with a (small)
list of pressure sets. These are the dimensions of pressure affected
by the register class's liveness.

llvm-svn: 154374
2012-04-10 02:25:24 +00:00
Andrew Trick 1d7a2c572c Added register unit weights to the target description.
This is a new algorithm that associates registers with weighted
register units to accuretely model their effect on register
pressure. This handles registers with multiple overlapping
subregisters. It is possible, but almost inconceivable that the
algorithm fails to find an exact solution for a target description. If
an exact solution cannot be found, an inexact, but reasonable solution
will be chosen.

llvm-svn: 154373
2012-04-10 02:25:21 +00:00
Andrew Trick 3a6e88dcc9 Fix header comment
llvm-svn: 154372
2012-04-10 02:25:18 +00:00
Danil Malyshev 549515e128 Add a constructor for DataRefImpl and remove excess initialization.
llvm-svn: 154371
2012-04-10 01:54:44 +00:00
Evan Cheng f8bad08001 Fix a long standing tail call optimization bug. When a libcall is emitted
legalizer always use the DAG entry node. This is wrong when the libcall is
emitted as a tail call since it effectively folds the return node. If
the return node's input chain is not the entry (i.e. call, load, or store)
use that as the tail call input chain.

PR12419
rdar://9770785
rdar://11195178

llvm-svn: 154370
2012-04-10 01:51:00 +00:00
Rafael Espindola 1d9672bdce Don't try to zExt just to check if an integer constant is zero, it might
not fit in a i64.

llvm-svn: 154364
2012-04-10 00:16:22 +00:00
Jim Grosbach 8f99bc3aed ARM LDR/LDRT has the same encoding collision as STR/STRT.
Generalized logic of r154141.

llvm-svn: 154362
2012-04-10 00:13:07 +00:00
Lang Hames ec96cd0690 Test case for PR12495.
llvm-svn: 154359
2012-04-09 23:58:59 +00:00
Bill Wendling b5cedde66d Revert the 'EnableInitializing' flag. There is debate on whether we should run that pass by default in LTO.
llvm-svn: 154356
2012-04-09 23:16:51 +00:00
Bill Wendling 383fda29be Apply the scope restrictions after parsing the command line options. There may be some which are used in that function.
llvm-svn: 154348
2012-04-09 22:18:01 +00:00
Akira Hatanaka 8483a6c47d Have TargetLowering::getPICJumpTableRelocBase return a node that points to the
GOT if jump table uses 64-bit gp-relative relocation.

llvm-svn: 154341
2012-04-09 20:32:12 +00:00
Chad Rosier e0e38f61a5 When performing a truncating store, it's possible to rearrange the data
in-register, such that we can use a single vector store rather then a 
series of scalar stores.

For func_4_8 the generated code

	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vmov.u16	r0, d16[3]
	strb	r0, [r2, #3]
	vmov.u16	r0, d16[2]
	strb	r0, [r2, #2]
	vmov.u16	r0, d16[1]
	strb	r0, [r2, #1]
	vmov.u16	r0, d16[0]
	strb	r0, [r2]
	bx	lr

becomes

	vldr	d16, LCPI0_0
	vmov	d17, r0, r1
	vadd.i16	d16, d17, d16
	vuzp.8	d16, d17
	vst1.32	{d16[0]}, [r2, :32]
	bx	lr

I'm not fond of how this combine pessimizes 2012-03-13-DAGCombineBug.ll,
but I couldn't think of a way to judiciously apply this combine.

This

	ldrh	r0, [r0, #4]
	strh	r0, [r1]

becomes

	vldr	d16, [r0]
	vmov.u16	r0, d16[2]
	vmov.32	d16[0], r0
	vuzp.16	d16, d17
	vst1.32	{d16[0]}, [r1, :32]

PR11158
rdar://10703339

llvm-svn: 154340
2012-04-09 20:32:02 +00:00
Lang Hames 3ad11ff90f Patch r153892 for PR11861 apparently broke an external project (see PR12493).
This patch restores TwoAddressInstructionPass's pre-r153892 behaviour when
rescheduling instructions in TryInstructionTransform. Hopefully this will fix
PR12493. To refix PR11861, lowering of INSERT_SUBREGS is deferred until after
the copy that unties the operands is emitted (this seems to be a more
appropriate fix for that issue anyway).

llvm-svn: 154338
2012-04-09 20:17:30 +00:00
Chad Rosier 99cbde9e82 Update comments and remove unnecessary isVolatile() check.
llvm-svn: 154336
2012-04-09 19:38:15 +00:00
Eric Christopher 132a998331 Typo.
llvm-svn: 154329
2012-04-09 17:54:34 +00:00
David Blaikie e6b6fae8ff Fix accidentally constant conditions found by uncommitted improvements to -Wconstant-conversion.
A couple of cases where we were accidentally creating constant conditions by
something like "x == a || b" instead of "x == a || x == b". In one case a
conditional & then unreachable was used - I transformed this into a direct
assert instead.

llvm-svn: 154324
2012-04-09 16:29:35 +00:00
Rafael Espindola 8f62b3248e Pattern match a setcc of boolean value with 0 as a truncate.
llvm-svn: 154322
2012-04-09 16:06:03 +00:00
Preston Gurd 2eec367227 This patch adds X86 instruction itineraries, which were missed by the
original patch to add itineraries, to X86InstrArithmetc.td.  

llvm-svn: 154320
2012-04-09 15:32:22 +00:00
Duncan Sands f1e1bb213f Clarify that fpaccuracy metadata is giving the compiler permission to use a
less accurate method.

llvm-svn: 154319
2012-04-09 14:08:00 +00:00
Nadav Rotem fb7e2ae53c Lower some x86 shuffle sequences to the vblend family of instructions.
llvm-svn: 154313
2012-04-09 08:33:21 +00:00
Bill Wendling deffc42d63 s/lto_codegen_whole_program_optimization/lto_codegen_set_whole_program_optimization/
llvm-svn: 154312
2012-04-09 08:32:21 +00:00
Nadav Rotem b801ca3976 Fix a bug in the lowering of broadcasts: ConstantPools need to use the target pointer type.
Move NormalizeVectorShuffle and LowerVectorBroadcast into X86TargetLowering.

llvm-svn: 154310
2012-04-09 07:45:58 +00:00
Craig Topper 9c3da316ec Remove unnecessary type check when combining and/or/xor of swizzles. Move some checks to allow better early out.
llvm-svn: 154309
2012-04-09 07:19:09 +00:00
Craig Topper e5893f64e8 Remove unnecessary 'else' on an 'if' that always returns
llvm-svn: 154308
2012-04-09 05:59:53 +00:00
Craig Topper e3ad4834ae Optimize code slightly. No functionality change.
llvm-svn: 154307
2012-04-09 05:55:33 +00:00
Bill Wendling 8a49d049e1 Add a hook to turn on the internalize pass through the LTO interface.
llvm-svn: 154306
2012-04-09 05:26:48 +00:00
Craig Topper 5894fe430a Replace some explicit checks with asserts for conditions that should never happen.
llvm-svn: 154305
2012-04-09 05:16:56 +00:00
Chandler Carruth 3779ac10b4 Cleanup and relax a restriction on the matching of global offsets into
x86 addressing modes. This allows PIE-based TLS offsets to fit directly
into an addressing mode immediate offset, which is the last remaining
code quality issue from PR12380. With this patch, that PR is completely
fixed.

To understand why this patch is correct to match these offsets into
addressing mode immediates, break it down by cases:
1) 32-bit is trivially correct, and unmodified here.
2) 64-bit non-small mode is unchanged and never matches.
3) 64-bit small PIC code which is RIP-relative is handled specially in
   the match to try to fit RIP into the base register. If it fails, it
   now early exits. This behavior is unchanged by the patch.
4) 64-bit small non-PIC code which is not RIP-relative continues to work
   as it did before. The reason these immediates are safe is because the
   ABI ensures they fit in small mode. This behavior is unchanged.
5) 64-bit small PIC code which is *not* using RIP-relative addressing.
   This is the only case changed by the patch, and the primary place you
   see it is in TLS, either the win64 section offset TLS or Linux
   local-exec TLS model in a PIC compilation. Here the ABI again ensures
   that the immediates fit because we are in small mode, and any other
   operations required due to the PIC relocation model have been handled
   externally to the Wrapper node (extra loads etc are made around the
   wrapper node in ISelLowering).

I've tested this as much as I can comparing it with GCC's output, and
everything appears safe. I discussed this with Anton and it made sense
to him at least at face value. That said, if there are issues with PIC
code after this patch, yell and we can revert it.

llvm-svn: 154304
2012-04-09 02:13:06 +00:00
Chandler Carruth 84b834267e Fold 15 tiny test cases into a single file that implements the
comprehensive testing of TLS codegen for x86. Convert all of the ones
that were still using grep to use FileCheck. Remove some redundancies
between them.

Perhaps most interestingly expand the test cases so that they actually
fully list the instruction snippet being tested. TLS operations are
*very* narrowly defined, and so these seem reasonably stable. More
importantly, the existing test cases already were crazy fine grained,
expecting specific registers to be allocated. This just clarifies that
no *other* instructions are expected, and fills in some crucial gaps
that weren't being tested at all.

This will make any subsequent changes to TLS much more clear during
review.

llvm-svn: 154303
2012-04-09 01:43:17 +00:00
Craig Topper 6148fe65e8 Optimize code a bit. No functional change intended.
llvm-svn: 154299
2012-04-08 23:15:04 +00:00
Benjamin Kramer bb6ff08766 Silence sign-compare warning.
llvm-svn: 154297
2012-04-08 19:04:45 +00:00
Duncan Sands 2f1dc3814b Only have codegen turn fdiv by a constant into fmul by the reciprocal
when -ffast-math, i.e. don't just always do it if the reciprocal can
be formed exactly.  There is already an IR level transform that does
that, and it does it more carefully.

llvm-svn: 154296
2012-04-08 18:08:12 +00:00
Craig Topper c8e2d91a58 Simplify code that tries to do vector extracts for shuffles when the mask width and the input vector widths don't match. No need to check the min and max are in range before calculating the start index. The range check after having the start index is sufficient. Also no need to check for an extract from the beginning differently.
llvm-svn: 154295
2012-04-08 17:53:33 +00:00
Chandler Carruth ede4a8aa2b Teach LLVM about a PIE option which, when enabled on top of PIC, makes
optimizations which are valid for position independent code being linked
into a single executable, but not for such code being linked into
a shared library.

I discussed the design of this with Eric Christopher, and the decision
was to support an optional bit rather than a completely separate
relocation model. Fundamentally, this is still PIC relocation, its just
that certain optimizations are only valid under a PIC relocation model
when the resulting code won't be in a shared library. The simplest path
to here is to expose a single bit option in the TargetOptions. If folks
have different/better designs, I'm all ears. =]

I've included the first optimization based upon this: changing TLS
models to the *Exec models when PIE is enabled. This is the LLVM
component of PR12380 and is all of the hard work.

llvm-svn: 154294
2012-04-08 17:51:45 +00:00
Chandler Carruth 16f0ebcbb5 Move the TLSModel information into the TargetMachine rather than hiding
in TargetLowering. There was already a FIXME about this location being
odd. The interface is simplified as a consequence. This will also make
it easier to change TLS models when compiling with PIE.

llvm-svn: 154292
2012-04-08 17:20:55 +00:00
Benjamin Kramer 25a3d816a6 EngineBuilder::create is expected to take ownership of the TargetMachine passed to it. Delete it on error or when we create an interpreter that doesn't need it.
llvm-svn: 154288
2012-04-08 14:53:14 +00:00
Chandler Carruth bed1abf9ca Remove an over zealous assert. The assert was trying to catch places
where a chain outside of the loop block-set ended up in the worklist for
scheduling as part of the contiguous loop. However, asserting the first
block in the chain is in the loop-set isn't a valid check -- we may be
forced to drag a chain into the worklist due to one block in the chain
being part of the loop even though the first block is *not* in the loop.
This occurs when we have been forced to form a chain early due to
un-analyzable branches.

No test case here as I have no idea how to even begin reducing one, and
it will be hopelessly fragile. We have to somehow end up with a loop
header of an inner loop which is a successor of a basic block with an
unanalyzable pair of branch instructions. Ow. Self-host triggers it so
it is unlikely it will regress.

This at least gets block placement back to passing selfhost and the test
suite. There are still a lot of slowdown that I don't like coming out of
block placement, although there are now also a lot of speedups. =[ I'm
seeing swings in both directions up to 10%. I'm going to try to find
time to dig into this and see if we can turn this on for 3.1 as it does
a really good job of cleaning up after some loops that degraded with the
inliner changes.

llvm-svn: 154287
2012-04-08 14:37:02 +00:00
Chandler Carruth 49158908dc Add a debug-only 'dump' method to the BlockChain structure to ease
debugging.

llvm-svn: 154286
2012-04-08 14:37:01 +00:00
Chandler Carruth f82b0e2d29 Teach InstCombine to nuke a common alloca pattern -- an alloca which has
GEPs, bit casts, and stores reaching it but no other instructions. These
often show up during the iterative processing of the inliner, SROA, and
DCE. Once we hit this point, we can completely remove the alloca. These
were actually showing up in the final, fully optimized code in a bunch
of inliner tests I've been working on, and notably they show up after
LLVM finishes optimizing away all function calls involved in
hash_combine(a, b).

llvm-svn: 154285
2012-04-08 14:36:56 +00:00
Nadav Rotem 82609df647 AVX2: Build splat vectors by broadcasting a scalar from the constant pool.
Previously we used three instructions to broadcast an immediate value into a
vector register.
On Sandybridge we continue to load the broadcasted value from the constant pool.

llvm-svn: 154284
2012-04-08 12:54:54 +00:00
Bill Wendling 8c783d4122 Remove old 'grep' lines.
llvm-svn: 154283
2012-04-08 11:53:54 +00:00
Bill Wendling ccf1109040 Formatting changes. Don't put spaces in front of some code, which only makes it look 'off'.
llvm-svn: 154282
2012-04-08 11:52:52 +00:00
Bill Wendling 57f8e5ebe4 FileCheckize these testcases.
llvm-svn: 154281
2012-04-08 11:00:38 +00:00
Bill Wendling 5c0068f807 Remove the 'Parent' pointer from the MDNodeOperand class.
An MDNode has a list of MDNodeOperands allocated directly after it as part of
its allocation. Therefore, the Parent of the MDNodeOperands can be found by
walking back through the operands to the beginning of that list. Mark the first
operand's value pointer as being the 'first' operand so that we know where the
beginning of said list is.

This saves a *lot* of space during LTO with -O0 -g flags.

llvm-svn: 154280
2012-04-08 10:20:49 +00:00
Bill Wendling 9b2503a006 Allow subclasses of the ValueHandleBase to store information as part of the
value pointer by making the value pointer into a pointer-int pair with 2 bits
available for flags.

llvm-svn: 154279
2012-04-08 10:16:43 +00:00
Craig Topper d024cef233 Turn avx2 vinserti128 intrinsic calls into INSERT_SUBVECTOR DAG nodes and remove patterns for selecting the intrinsic. Similar was already done for avx1.
llvm-svn: 154272
2012-04-07 22:32:29 +00:00
Craig Topper aa9aab5ad2 Move vinsertf128 patterns near the instruction definitions. Add AddedComplexity to AVX2 vextracti128 patterns to give them priority over the integer versions of vextractf128 patterns.
llvm-svn: 154268
2012-04-07 21:57:43 +00:00
Craig Topper e09d1c5c48 Remove 'else' after 'if' that ends in return.
llvm-svn: 154267
2012-04-07 21:23:41 +00:00
Nadav Rotem 71d07ae5cb 1. Remove the part of r153848 which optimizes shuffle-of-shuffle into a new
shuffle node because it could introduce new shuffle nodes that were not
   supported efficiently by the target.

2. Add a more restrictive shuffle-of-shuffle optimization for cases where the
   second shuffle reverses the transformation of the first shuffle.

llvm-svn: 154266
2012-04-07 21:19:08 +00:00
Duncan Sands 5f8397a934 Convert floating point division by a constant into multiplication by the
reciprocal if converting to the reciprocal is exact.  Do it even if inexact
if -ffast-math.  This substantially speeds up ac.f90 from the polyhedron
benchmarks.

llvm-svn: 154265
2012-04-07 20:04:00 +00:00
Chandler Carruth 75a1cf327a Perform partial SROA on the helper hashing structure. I really wish the
optimizers could do this for us, but expecting partial SROA of classes
with template methods through cloning is probably expecting too much
heroics. With this change, the begin/end pointer pairs which indicate
the status of each loop iteration are actually passed directly into each
layer of the combine_data calls, and the inliner has a chance to see
when most of the combine_data function could be deleted by inlining.
Similarly for 'length'.

We have to be careful to limit the places where in/out reference
parameters are used as those will also defeat the inliner / optimizers
from properly propagating constants.

With this change, LLVM is able to fully inline and unroll the hash
computation of small sets of values, such as two or three pointers.
These now decompose into essentially straight-line code with no loops or
function calls.

There is still one code quality problem to be solved with the hashing --
LLVM is failing to nuke the alloca. It removes all loads from the
alloca, leaving only lifetime intrinsics and dead(!!) stores to the
alloca. =/ Very unfortunate.

llvm-svn: 154264
2012-04-07 20:01:31 +00:00
Chandler Carruth 28192c9398 Fix ValueTracking to conclude that debug intrinsics are safe to
speculate. Without this, loop rotate (among many other places) would
suddenly stop working in the presence of debug info. I found this
looking at loop rotate, and have augmented its tests with a reduction
out of a very hot loop in yacr2 where failing to do this rotation costs
sometimes more than 10% in runtime performance, perturbing numerous
downstream optimizations.

This should have no impact on performance without debug info, but the
change in performance when debug info is enabled can be extreme. As
a consequence (and this how I got to this yak) any profiling of
performance problems should be treated with deep suspicion -- they may
have been wildly innacurate of debug info was enabled for profiling. =/
Just a heads up.

llvm-svn: 154263
2012-04-07 19:22:18 +00:00
Benjamin Kramer e1f4ca1b0f SCEV: When expanding a GEP the final addition to the base pointer has NUW but not NSW.
Found by inspection.

llvm-svn: 154262
2012-04-07 17:19:26 +00:00
Bob Wilson 6f9be7e2c6 Fix Thumb __builtin_longjmp with integrated assembler. <rdar://problem/11203543>
The tLDRr instruction with the last register operand set to the zero register
prints in assembly as if no register was specified, and the assembler encodes
it as a tLDRi instruction with a zero immediate.  With the integrated assembler,
that zero register gets emitted as "r0", so we get "ldr rx, [ry, r0]" which
is broken.  Emit the instruction as tLDRi with a zero immediate.  I don't
know if there's a good way to write a testcase for this.  Suggestions welcome.

Opportunities for follow-up work:
1) The asm printer should complain if a non-optional register operand is set
   to the zero register, instead of silently dropping it.
2) The integrated assembler should complain in the same situation, instead of
   silently emitting the operand as "r0".

llvm-svn: 154261
2012-04-07 16:51:59 +00:00
Hongbin Zheng 5758f495da Refactor: Use positive field names in VectorizeConfig.
llvm-svn: 154249
2012-04-07 03:56:23 +00:00
NAKAMURA Takumi b95f64134e Target/X86/MCTargetDesc/X86MCAsmInfo.cpp: Enable DwarfCFI (aka DW2) on Cygming.
Cygwin-1.7 supports dw2. Some recent mingw distros support one, too.
I have confirmed test-suite/SingleSource/Benchmarks/Shootout-C++/except.cpp can pass on Cygwin.

llvm-svn: 154247
2012-04-07 02:24:20 +00:00
Alexis Hunt 78fce432b7 Make the test for r154235 more platform-independent with a shorter
string.

llvm-svn: 154243
2012-04-07 01:33:14 +00:00
Alexis Hunt 0235f684f0 Output UTF-8-encoded characters as identifier characters into assembly
by default.

This is a behaviour configurable in the MCAsmInfo. I've decided to turn
it on by default in (possibly optimistic) hopes that most assemblers are
reasonably sane. If this proves a problem, switching to default seems
reasonable.

I'm not sure if this is the opportune place to test, but it seemed good
to make sure it was tested somewhere.

llvm-svn: 154235
2012-04-07 00:37:53 +00:00
Jim Grosbach 0c509fa6bf Tidy up. 80 columns.
llvm-svn: 154226
2012-04-06 23:43:50 +00:00
Jakob Stoklund Olesen baa3566091 ARMPat is equivalent to Requires<[IsARM]>.
llvm-svn: 154210
2012-04-06 21:21:59 +00:00
Jakob Stoklund Olesen b4bd3880ba Eliminate iOS-specific tail call instructions.
After register masks were introdruced to represent the call clobbers, it
is no longer necessary to have duplicate instruction for iOS.

llvm-svn: 154209
2012-04-06 21:17:42 +00:00
Akira Hatanaka 487e56763d Add lines in global-address.ll to test N32 and N64 code generation.
llvm-svn: 154202
2012-04-06 20:23:36 +00:00
Chandler Carruth 8a102c21e3 There is no portable std::abs overload for int64_t, use the llvm::abs64
which exists for this purpose.

llvm-svn: 154199
2012-04-06 20:10:52 +00:00
Sean Callanan e804b5b762 Fixed two leaks in the MC disassembler. The MC
disassembler requires a MCSubtargetInfo and a
MCInstrInfo to exist in order to initialize the
instruction printer and disassembler; however,
although the printer and disassembler keep
references to these objects they do not own them.
Previously, the MCSubtargetInfo and MCInstrInfo
objects were just leaked.

I have extended LLVMDisasmContext to own these
objects and delete them when it is destroyed.

llvm-svn: 154192
2012-04-06 18:21:09 +00:00
Jakob Stoklund Olesen 967b86a0a2 Allow negative immediates in ARM and Thumb2 compares.
ARM and Thumb2 mode can use cmn instructions to compare against negative
immediates. Thumb1 mode can't.

llvm-svn: 154183
2012-04-06 17:45:04 +00:00
David Chisnall c1c9cdab23 Reintroduce InlineCostAnalyzer::getInlineCost() variant with explicit callee
parameter until we have a more sensible API for doing the same thing.

Reviewed by Chandler.

llvm-svn: 154180
2012-04-06 17:27:41 +00:00
Chandler Carruth 49da93396e Sink the collection of return instructions until after *all*
simplification has been performed. This is a bit less efficient
(requires another ilist walk of the basic blocks) but shouldn't matter
in practice. More importantly, it's just too much work to keep track of
all the various ways the return instructions can be mutated while
simplifying them. This fixes yet another crasher, reported by Daniel
Dunbar.

llvm-svn: 154179
2012-04-06 17:21:31 +00:00
Chandler Carruth e547fefcb7 Tweak this test to ensure the inliner did indeed fire. Thanks to Richard
Smith for pointing this out in review.

llvm-svn: 154178
2012-04-06 17:21:28 +00:00
Duncan Sands d12b18f820 Make GVN's propagateEquality non-recursive. No intended functionality change.
The modifications are a lot more trivial than they appear to be in the diff!

llvm-svn: 154174
2012-04-06 15:31:09 +00:00
Craig Topper bdc9f071a4 Test case for PR12413
llvm-svn: 154172
2012-04-06 14:38:25 +00:00
Benjamin Kramer 3cacabfb04 Fix narrowing conversion.
llvm-svn: 154171
2012-04-06 13:33:52 +00:00
Benjamin Kramer 15e21a159e DenseMap: Perform the pod-like object optimization when the value type is POD-like, not the DenseMapInfo for it.
Purge now unused template arguments. This has been broken since r91421. Patch by Lubos Lunak!

llvm-svn: 154170
2012-04-06 10:43:44 +00:00
Craig Topper 447417c932 Allow 256-bit shuffles to be split if a 128-bit lane contains elements from a single source. This is a rewrite of the 256-bit shuffle splitting code based on similar code from legalize types. Fixes PR12413.
llvm-svn: 154166
2012-04-06 07:45:23 +00:00
Craig Topper 4eb9616b24 Add the tests that were supposed to go with r153935 that I forgot svn add
llvm-svn: 154165
2012-04-06 07:09:59 +00:00
Chandler Carruth 17e335888c Actually finish this sentence in the comment the way I intended. Thanks
Matt for pointing this out.

llvm-svn: 154158
2012-04-06 01:19:38 +00:00
Chandler Carruth e41f6f4189 Sink the return instruction collection until after we're done deleting
dead code, including dead return instructions in some cases. Otherwise,
we end up having a bogus poniter to a return instruction that blows up
much further down the road.

It turns out that this pattern is both simpler to code, easier to update
in the face of enhancements to the inliner cleanup, and likely cheaper
given that it won't add dead instructions to the list.

Thanks to John Regehr's numerous test cases for teasing this out.

llvm-svn: 154157
2012-04-06 01:11:52 +00:00
Jakob Stoklund Olesen 6a2e99a46a Deduplicate ARM call-related instructions.
We had special instructions for iOS because r9 is call-clobbered, but
that is represented dynamically by the register mask operands now, so
there is no need for the pseudo-instructions.

llvm-svn: 154144
2012-04-06 00:04:58 +00:00
Jim Grosbach d6a1a1dc2f ARM: Don't form a t2LDRi8 or t2STRi8 with an offset of zero.
The load/store optimizer splits LDRD/STRD into two instructions when the
register pairing doesn't work out. For negative offsets in Thumb2, it uses
t2STRi8 to do that. That's fine, except for the case when the offset is in
the range [-4,-1]. In that case, we'll also form a second t2STRi8 with
the original offset plus 4, resulting in a t2STRi8 with a non-negative
offset, which ends up as if it were an STRT, which is completely bogus.
Similarly for loads.

No testcase, unfortunately, as any I've been able to construct is both large
and extremely fragile.

rdar://11193937

llvm-svn: 154141
2012-04-05 23:51:24 +00:00
Kaelyn Uhrain cb5b585cca Fix the build breakage introduced by r154131.
The empty 1-argument operator delete is for the benefit of the
destructor. A couple of spot checks of running yaml-bench under
valgrind against a few of the files under test/YAMLParser did
not reveal any leaks introduced by this change.

llvm-svn: 154137
2012-04-05 23:06:17 +00:00
Kaelyn Uhrain 64aa24e13f Really fix -Wnon-virtual-dtor warnings; gcc needs the dtors to be
explicitly marked as virtual.

llvm-svn: 154131
2012-04-05 22:11:12 +00:00
Bill Wendling 4f60125dd8 The internalize pass can be dangerous for LTO.
Consider the following program:

$ cat main.c
void foo(void) { }

int main(int argc, char *argv[]) {
    foo();
    return 0;
}
$ cat bundle.c 
extern void foo(void);

void bar(void) {
     foo();
}
$ clang -o main main.c
$ clang -o bundle.so bundle.c -bundle -bundle_loader ./main
$ nm -m bundle.so
0000000000000f40 (__TEXT,__text) external _bar
                 (undefined) external _foo (from executable)
                 (undefined) external dyld_stub_binder (from libSystem)
$ clang -o main main.c -O4
$ clang -o bundle.so bundle.c -bundle -bundle_loader ./main
Undefined symbols for architecture x86_64:
  "_foo", referenced from:
      _bar in bundle-elQN6d.o
ld: symbol(s) not found for architecture x86_64
clang: error: linker command failed with exit code 1 (use -v to see invocation)

The linker was told that the 'foo' in 'main' was 'internal' and had no uses, so
it was dead stripped.

Another situation is something like:

define void @foo() {
  ret void
}

define void @bar() {
  call asm volatile "call _foo" ...
  ret void
}

The only use of 'foo' is inside of an inline ASM call. Since we don't look
inside those for uses of functions, we don't specify this as a "use."

Get around this by not invoking the 'internalize' pass by default. This is an
admitted hack for LTO correctness.
<rdar://problem/11185386>

llvm-svn: 154124
2012-04-05 21:26:44 +00:00
Jim Grosbach 930f2f66e7 ARM assembly aliases for add negative immediates using sub.
'add r2, #-1024' should just use 'sub r2, #1024' rather than erroring out.
Thumb1 aliases for adding a negative immediate to the stack pointer,
also.

rdar://11192734

llvm-svn: 154123
2012-04-05 20:57:13 +00:00
Akira Hatanaka 43fb2b2cea Reapply test case in 154038, this time with triple to prevent the backend
from emitting gp_rel relocation.

llvm-svn: 154122
2012-04-05 20:44:35 +00:00
Eric Christopher aec8a82694 Patch to set is_stmt a little better for prologue lines in a function.
This enables debuggers to see what are interesting lines for a
breakpoint rather than any line that starts a function.

rdar://9852092

llvm-svn: 154120
2012-04-05 20:39:05 +00:00
Jakob Stoklund Olesen 37492eac8c Don't break the IV update in TLI::SimplifySetCC().
LSR always tries to make the ICmp in the loop latch use the incremented
induction variable. This allows the induction variable to be kept in a
single register.

When the induction variable limit is equal to the stride,
SimplifySetCC() would break LSR's hard work by transforming:

   (icmp (add iv, stride), stride) --> (cmp iv, 0)

This forced us to use lea for the IC update, preventing the simpler
incl+cmp.

<rdar://problem/7643606>
<rdar://problem/11184260>

llvm-svn: 154119
2012-04-05 20:30:20 +00:00
Dan Gohman cc64bbca81 Fix accidentally inverted logic from r152803, and make the
testcase slightly less trivial. This fixes rdar://11171718.

llvm-svn: 154118
2012-04-05 20:27:21 +00:00
Sylvestre Ledru e8235fef31 Fix a problem in the target detection for Debian GNU/HURD
llvm-svn: 154117
2012-04-05 19:34:15 +00:00
Sylvestre Ledru 4cf7dae516 Fix a problem in the target detection for Debian GNU/kFreeBSD
llvm-svn: 154114
2012-04-05 18:53:09 +00:00
Owen Anderson a6eebf6013 Treat f16 the same as f80/f128 for the purposes of generating constants during instruction selection.
llvm-svn: 154113
2012-04-05 18:50:32 +00:00
Silviu Baranga af3c79f0ac Added support for unpredictable ADC/SBC instructions on ARM, and also fixed some corner cases involving the PC register as an operand for these instructions.
llvm-svn: 154101
2012-04-05 16:19:29 +00:00
Silviu Baranga d365397daa Added support for handling unpredictable arithmetic instructions on ARM.
llvm-svn: 154100
2012-04-05 16:13:15 +00:00
Hongbin Zheng 31d33b8318 BBVectorize: Add the const modifier to the VectorizeConfig because we won't
modify it.

llvm-svn: 154098
2012-04-05 16:07:49 +00:00
Hongbin Zheng d6825173d3 Introduce the VectorizeConfig class, with which we can control the behavior
of the BBVectorizePass without using command line option. As pointed out
  by Hal, we can ask the TargetLoweringInfo for the architecture specific
  VectorizeConfig to perform vectorizing with architecture specific
  information.

llvm-svn: 154096
2012-04-05 15:46:55 +00:00
James Molloy 1ea6473688 An oversight when applying the patches for r150956 and r150957 to a vanilla tree meant I forgot to svn add these testcases.
Noticed while investigating PR12274!

llvm-svn: 154090
2012-04-05 10:01:12 +00:00
Hongbin Zheng 6edbc39bd7 Add the function "vectorizeBasicBlock" which allow users vectorize a
BasicBlock in other passes, e.g. we can call vectorizeBasicBlock in the
 loop unroll pass right after the loop is unrolled.

llvm-svn: 154089
2012-04-05 08:05:16 +00:00
Jim Grosbach 15c6884a4b ARM assembly aliases for two-operand V[R]SHR instructions.
rdar://11189467

llvm-svn: 154087
2012-04-05 07:23:53 +00:00
Argyrios Kyrtzidis ef909265e8 In MemoryBuffer::getOpenFile() make sure that the buffer is null-terminated if
the caller requested a null-terminated one.

When mapping the file there could be a racing issue that resulted in the file being larger
than the FileSize passed by the caller. We already have an assertion
for this in MemoryBuffer::init() but have a runtime guarantee that
the buffer will be null-terminated, so do a copy that adds a null-terminator.

Protects against crash of rdar://11161822.

llvm-svn: 154082
2012-04-05 04:23:56 +00:00
Jim Grosbach 3d00eecc53 ARM assembly parsing for 'msr' plain 'cpsr' operand.
Plain 'cpsr' is an alias for 'cpsr_fc'.

rdar://11153753

llvm-svn: 154080
2012-04-05 03:17:53 +00:00
Jakob Stoklund Olesen f2390e8303 Pass the right sign to TLI->isLegalICmpImmediate.
LSR can fold three addressing modes into its ICmpZero node:

  ICmpZero BaseReg + Offset      => ICmp BaseReg, -Offset
  ICmpZero -1*ScaleReg + Offset  => ICmp ScaleReg, Offset
  ICmpZero BaseReg + -1*ScaleReg => ICmp BaseReg, ScaleReg

The first two cases are only used if TLI->isLegalICmpImmediate() likes
the offset.

Make sure the right Offset sign is passed to this method in the second
case. The ARM version is not symmetric.

<rdar://problem/11184260>

llvm-svn: 154079
2012-04-05 03:10:56 +00:00
Bob Wilson 1864146ab7 Do not include multiple -arch options in CPPFLAGS.
llvm-svn: 154070
2012-04-05 00:35:55 +00:00
Michael J. Spencer b2d30b8699 Fix -Wnon-virtual-dtor warnings.
llvm-svn: 154063
2012-04-04 22:34:55 +00:00
Akira Hatanaka 121342fcc2 Reapply 154038 without the failing test.
llvm-svn: 154062
2012-04-04 22:16:36 +00:00
Owen Anderson 4743c6e159 Revert r154038. It was causing make check failures.
llvm-svn: 154054
2012-04-04 21:18:58 +00:00
Pete Cooper d7290700e6 REG_SEQUENCE expansion to COPY instructions wasn't taking account of sub register indices on the source registers. No simple test case
llvm-svn: 154051
2012-04-04 21:03:25 +00:00
Benjamin Kramer 379018b2da Fix a C++11 UDL conflict.
Still not fixed in the standard ;)

llvm-svn: 154044
2012-04-04 20:33:56 +00:00
Pete Cooper 8a3dc0ed8c f16 FREM can now be legalized by promoting to f32
llvm-svn: 154039
2012-04-04 19:36:31 +00:00
Akira Hatanaka 9705c865d9 Fix LowerGlobalAddress to produce instructions with the correct relocation
types for N32 ABI. Add new test case and update existing ones.

llvm-svn: 154038
2012-04-04 19:02:38 +00:00
Akira Hatanaka 591ecdd7c1 Fix LowerJumpTable to produce instructions with the correct relocation
types for N32 ABI. Test case will be updated after the patch that fixes
TargetLowering::getPICJumpTableRelocBase is checked in.

llvm-svn: 154036
2012-04-04 18:31:32 +00:00
Akira Hatanaka b3a2b8c199 Fix LowerConstantPool to produce instructions with the correct relocation
types for N32 ABI and update test case.

llvm-svn: 154034
2012-04-04 18:26:12 +00:00
Jakob Stoklund Olesen 0a5b72f0e4 Implement ARMBaseInstrInfo::commuteInstruction() for MOVCCr.
A MOVCCr instruction can be commuted by inverting the condition. This
can help reduce register pressure and remove unnecessary copies in some
cases.

<rdar://problem/11182914>

llvm-svn: 154033
2012-04-04 18:23:42 +00:00
Jakob Stoklund Olesen 92fd79a639 Remove spurious debug output.
llvm-svn: 154032
2012-04-04 18:23:38 +00:00
Akira Hatanaka aeff24e424 Fix LowerBlockAddress to produce instructions with the correct relocation
types for N32 ABI and update test case.

llvm-svn: 154031
2012-04-04 18:22:53 +00:00
Hongbin Zheng e1fd20172b Add testcase for r154007, when a function has the optsize attribute,
the loop should be unrolled according the value of OptSizeUnrollThreshold.

llvm-svn: 154014
2012-04-04 13:24:40 +00:00
Rafael Espindola ba0a6cabb8 Always compute all the bits in ComputeMaskedBits.
This allows us to keep passing reduced masks to SimplifyDemandedBits, but
know about all the bits if SimplifyDemandedBits fails. This allows instcombine
to simplify cases like the one in the included testcase.

llvm-svn: 154011
2012-04-04 12:51:34 +00:00
Hongbin Zheng b21b865fe8 LoopUnrollPass: Use variable "Threshold" instead of "CurrentThreshold" when
reducing unroll count, otherwise the reduced unroll count is not taking
  the "OptimizeForSize" attribute into account.

llvm-svn: 154007
2012-04-04 11:44:08 +00:00
Benjamin Kramer a1355d17ca Move yaml::Stream's dtor out of line so it can see Scanner's dtor.
llvm-svn: 154004
2012-04-04 08:53:34 +00:00
Benjamin Kramer e43bde73aa Implement DwarfLLVMRegPair::operator< without violating asymmetry.
MSVC8 verifies this.

llvm-svn: 154002
2012-04-04 08:24:08 +00:00
Craig Topper 34487838bf Convert assert(false) followed by a return to llvm_unreachable
llvm-svn: 153997
2012-04-04 04:55:46 +00:00
Craig Topper 4c7d995029 Remove default case from switch that was already covering all cases.
llvm-svn: 153996
2012-04-04 04:42:42 +00:00
Pete Cooper e7bff68a5e Removed useless switch for default case when switch was covering all the enum values
llvm-svn: 153984
2012-04-04 00:53:04 +00:00
Bob Wilson 3e66d73259 Fix the install location for the Embedded makefile target.
svn r145378 inadvertently changed the destination for the Embedded target
in the makefile.  Add a "/Developer" suffix to DSTROOT to compensate.

llvm-svn: 153980
2012-04-03 23:44:39 +00:00
Michael J. Spencer afc0d6a36f Sorry about that. MSVC seems to accept just about any random string you give it ;/
llvm-svn: 153979
2012-04-03 23:36:44 +00:00
Bob Wilson 9d12ffcd71 Remove dead code for installing libLTO when building llvmCore.
llvm-svn: 153978
2012-04-03 23:13:26 +00:00
Michael J. Spencer 22120c47a7 Add YAML parser to Support.
llvm-svn: 153977
2012-04-03 23:09:22 +00:00
Pete Cooper 9511ec86f9 Add VSELECT to LegalizeVectorTypes::ScalariseVectorResult. Previously it would crash if it encountered a 1 element VSELECT. Solution is slightly more complicated than just creating a SELET as we have to mask or sign extend the vector condition if it had different boolean contents from the scalar condition. Fixes <rdar://problem/11178095>
llvm-svn: 153976
2012-04-03 22:57:55 +00:00
Pete Cooper b98934cf72 Removed one last bad continue statement meant to be removed in r153914.
llvm-svn: 153975
2012-04-03 22:18:49 +00:00
Bob Wilson 8bbd98df00 When building llvmCore, pass the SDKROOT and -arch setting to configure.
So far all of configure tests have been run against the default SDK and
architecture, regardless of what is actually being built.  We've gotten
lucky until now.  <rdar://problem/11112479>

llvm-svn: 153972
2012-04-03 21:50:26 +00:00
Bob Wilson 5512ec8bae Remove a reference to the C backend.
llvm-svn: 153971
2012-04-03 21:50:24 +00:00
Chad Rosier 2a02fe1bb2 Fix an issue in SimplifySetCC() specific to vector comparisons.
When folding X == X we need to check getBooleanContents() to determine if the
result is a vector of ones or a vector of negative ones. 

I tried creating a test case, but the problem seems to only be exposed on a
much older version of clang (around r144500).
rdar://10923049

llvm-svn: 153966
2012-04-03 20:11:24 +00:00
Anton Korobeynikov d0b458d694 Set soname for FreeBSD as well.
Patch by Bernard Cafarelli!

llvm-svn: 153965
2012-04-03 19:48:31 +00:00
Eric Christopher b81e2b403c Fix thinko check for number of operands to be the one that actually
might have more than 19 operands. Add a testcase to make sure I
never screw that up again.

Part of rdar://11026482

llvm-svn: 153961
2012-04-03 17:55:42 +00:00
Lang Hames ffa52d2ae2 Matrix simplification in PBQP may push infinite costs onto register options.
The colorability heuristic should count these as denied registers.

No test case - this exposed a bug on an out-of-tree target.

llvm-svn: 153958
2012-04-03 16:27:16 +00:00
Dylan Noblesmith 7a3973d3e0 ARMDisassembler: drop bogus dependency on ARMCodeGen
And indirectly, a dependency on most of the core LLVM optimization
libraries.

llvm-svn: 153957
2012-04-03 15:48:14 +00:00
Dylan Noblesmith 6338485d59 Object: drop bogus VMCore dependency
llvm-svn: 153956
2012-04-03 15:48:10 +00:00
Bill Wendling e2cf674310 The speedup doesn't appear to have been from this, but was an anomaly of my testing machine.
llvm-svn: 153951
2012-04-03 11:19:21 +00:00
Bill Wendling dd91e73409 Reserve space for the eventual filling of the vector. This gives a small speedup.
llvm-svn: 153949
2012-04-03 10:50:09 +00:00
Nadav Rotem 269703f983 Add an additional testcase which checks ops with multiple users.
llvm-svn: 153939
2012-04-03 07:39:36 +00:00
Anton Korobeynikov 325e92668b Make PPCCompilationCallbackC function to be static, so there will be no need to issue call via
PLT when LLVM is built as shared library. This mimics the X86 backend towards the approach.

llvm-svn: 153938
2012-04-03 06:59:28 +00:00
Craig Topper 9c252ebe4c Tidy up spacing in some tablegen outputs.
llvm-svn: 153937
2012-04-03 06:52:47 +00:00
Craig Topper 7629d63bc4 Add support for AVX enhanced comparison predicates. Patch from Kay Tiong Khoo.
llvm-svn: 153935
2012-04-03 05:20:24 +00:00
Bill Wendling 32867652c9 Reformatting. No functionality change.
llvm-svn: 153928
2012-04-03 03:56:52 +00:00
Bill Wendling 7d350efddc As Eric pointed out, even a Debug build should be equal. Leave the flag that can turn off comparisons though.
llvm-svn: 153927
2012-04-03 03:27:43 +00:00
Akira Hatanaka d19f025374 Revert r153924. Delete test/MC/Disassembler/Mips and lib/Target/Mips/Disassembler.
llvm-svn: 153926
2012-04-03 03:01:13 +00:00
Akira Hatanaka 55059262aa Revert r153924. There were buildbot failures.
llvm-svn: 153925
2012-04-03 02:51:09 +00:00
Akira Hatanaka e2498d014b MIPS disassembler support.
Patch by Vladimir Medic.

llvm-svn: 153924
2012-04-03 02:20:58 +00:00
Andrew Trick a890e3c69a Cleanup set_union usage. The same thing but a bit cleaner now.
llvm-svn: 153922
2012-04-03 01:35:52 +00:00
Andrew Trick c544e7c0a7 Use std::set_union instead of nasty custom code.
I just noticed Jakob's examples of the proper application of
std::set... routines.

llvm-svn: 153918
2012-04-03 00:47:23 +00:00
Eric Christopher 34164196af Add a line number for the scope of the function (starting at the first
brace) so that we get more accurate line number information about the
declaration of a given function and the line where the function
first starts.

Part of rdar://11026482

llvm-svn: 153916
2012-04-03 00:43:49 +00:00
Pete Cooper 4f0dbb27d9 Fixes to r153903. Added missing explanation of behaviour when the VirtRegMap is NULL. Also changed it in this case to just avoid updating the map, but live ranges or intervals will still get updated and created
llvm-svn: 153914
2012-04-03 00:28:46 +00:00
Bill Wendling d70cde134b Compare the .o files only for release builds. Add an option to bypass the comparison altogether.
llvm-svn: 153909
2012-04-02 23:27:43 +00:00
Pete Cooper 3ca96f9950 Moved LiveRangeEdit.h so that it can be called from other parts of the backend, not just libCodeGen
llvm-svn: 153906
2012-04-02 22:44:18 +00:00
Rafael Espindola f76bff0504 Make dominatedBySlowTreeWalk private and assert cases handled by the caller.
llvm-svn: 153905
2012-04-02 22:37:54 +00:00
Jakob Stoklund Olesen 291007b055 Allocate virtual registers in ascending order.
This is just the fallback tie-breaker ordering, the main allocation
order is still descending size.

Patch by Shamil Kurmangaleev!

llvm-svn: 153904
2012-04-02 22:30:39 +00:00
Pete Cooper 2bde2f42b1 Refactored the LiveRangeEdit interface so that MachineFunction, TargetInstrInfo, MachineRegisterInfo, LiveIntervals, and VirtRegMap are all passed into the constructor and stored as members instead of passed in to each method.
llvm-svn: 153903
2012-04-02 22:22:53 +00:00
Bill Wendling 932b992888 Add an option to turn off the expensive GVN load PRE part of GVN.
llvm-svn: 153902
2012-04-02 22:16:50 +00:00
Owen Anderson 98f2c0c384 Add predicates for checking whether targets have free FNEG and FABS operations, and prevent the DAGCombiner from turning them into bitwise operations if they do.
llvm-svn: 153901
2012-04-02 22:10:29 +00:00
Lang Hames aaafacd07e During two-address lowering, rescheduling an instruction does not untie
operands. Make TryInstructionTransform return false to reflect this.
Fixes PR11861.

llvm-svn: 153892
2012-04-02 19:58:43 +00:00
Rafael Espindola 2e5c58e77b No need to run llvm-as.
llvm-svn: 153890
2012-04-02 19:44:20 +00:00
Akira Hatanaka b1f68f9696 Initial 64 bit direct object support.
This patch allows llvm to recognize that a 64 bit object file is being produced
and that the subsequently generated ELF header has the correct information.

The test case checks for both big and little endian flavors.

Patch by Jack Carter.

llvm-svn: 153889
2012-04-02 19:25:22 +00:00
Hal Finkel 7591afa235 The binutils for the IBM BG/P are too old to support CFI.
llvm-svn: 153886
2012-04-02 19:09:04 +00:00
Hal Finkel f208af02a4 Add triple support for the IBM BG/P and BG/Q supercomputers.
llvm-svn: 153882
2012-04-02 18:31:33 +00:00
Eric Christopher ad9fe8955a Turn on the accelerator tables for Darwin.
llvm-svn: 153880
2012-04-02 17:58:52 +00:00
Stepan Dyatkovskiy f62ffeca88 Fast fix for PR12343:
http://llvm.org/bugs/show_bug.cgi?id=12343

We have not trivial way for splitting edges that are goes from indirect branch. We can do it with some tricks, but it should be additionally discussed. And it is still dangerous due to difficulty of indirect branches controlling.

Fix forbids this case for unswitching.

llvm-svn: 153879
2012-04-02 17:16:45 +00:00
Roman Divacky b9663ccd6b Implement the SVR4 byval alignment for aggregates. Fixing a FIXME.
llvm-svn: 153876
2012-04-02 15:49:30 +00:00
Silviu Baranga 98144e9e1a Second part for the 153874 one
llvm-svn: 153875
2012-04-02 15:46:46 +00:00
Silviu Baranga ac37acd31b Added fix in TableGen instruction decoder generation. The decoder now breaks for every leaf node.
llvm-svn: 153874
2012-04-02 15:20:39 +00:00
Rafael Espindola ebe09ec137 Add missing 'd'.
llvm-svn: 153872
2012-04-02 13:02:57 +00:00
Bill Wendling 71b19bbdc8 Hack the hack. If we have a situation where an ASM object is defined but isn't
reflected in the LLVM IR (as a declare or something), then treat it like a data
object.

N.B. This isn't 100% correct. The ASM parser should supply more information so
that we know what type of object it is, and what attributes it should have.

llvm-svn: 153870
2012-04-02 10:01:21 +00:00
Benjamin Kramer 22d093e4f1 Emit the asm writer's mnemonic table with SequenceToOffsetTable.
This way we can get AVX v-prefixed instructions tail merged with the normal insns.

llvm-svn: 153869
2012-04-02 09:13:46 +00:00
Benjamin Kramer 1c0541b031 Move getOpcodeName from the various target InstPrinters into the superclass MCInstPrinter.
All implementations used the same code.

llvm-svn: 153866
2012-04-02 08:32:38 +00:00
Craig Topper 4de7373862 Reorder fields in MatchEntry and OperandMatchEntry to reduce padding. A bit tricky due to the target specific sizes for some of the fields so the ordering is only optimal for the targets in the tree.
llvm-svn: 153865
2012-04-02 07:48:39 +00:00
Nadav Rotem 702f080767 Optimizing swizzles of complex shuffles may generate additional complex shuffles.
Do not try to optimize swizzles of shuffles if the source shuffle has more than
a single user, except when the source shuffle is also a swizzle.

llvm-svn: 153864
2012-04-02 07:11:12 +00:00
Craig Topper dab9e35ad0 Remove getInstructionName from MCInstPrinter implementations in favor of using the instruction name table from MCInstrInfo. Reduces static data in the InstPrinter implementations.
llvm-svn: 153863
2012-04-02 07:01:04 +00:00
Eric Christopher 8e52bdce7b Fix CXXFLAGS for huge_val.m4.
Patch by Jeremy Huddleston!

llvm-svn: 153862
2012-04-02 06:54:01 +00:00
Craig Topper 54bfde79db Make MCInstrInfo available to the MCInstPrinter. This will be used to remove getInstructionName and the static data it contains since the same tables are already in MCInstrInfo.
llvm-svn: 153860
2012-04-02 06:09:36 +00:00
Bill Wendling 3a0bcf06ef It could come about that we parse the inline ASM before we get a potential
definition for it. In that case, we want to wait for the potential definition
before we create a symbol for it.

llvm-svn: 153859
2012-04-02 03:33:31 +00:00
Craig Topper 7a2cea1814 Use SequenceToOffsetTable to generate instruction name table for AsmWriter.
llvm-svn: 153857
2012-04-02 00:47:39 +00:00
Chandler Carruth 219173a1be Start cleaning up the InlineCost class. This switches to sentinel values
rather than a bitfield, a great suggestion by Chris during code review.

There is still quite a bit of cruft in the interface, but that requires
sorting out some awkward uses of the cost inside the actual inliner.

No functionality changed intended here.

llvm-svn: 153853
2012-04-01 22:44:09 +00:00
Hal Finkel 3ecfa7b277 Fix some 80-col. violations I introduced with the A2 PPC64 core.
llvm-svn: 153852
2012-04-01 21:20:14 +00:00
Hal Finkel 322e41a914 Enable prefetch generation on PPC64.
llvm-svn: 153851
2012-04-01 20:08:17 +00:00
Hal Finkel 9032344c15 Add LdStSTD* itin. for the PPC64 A2 core.
llvm-svn: 153850
2012-04-01 20:08:08 +00:00
Nadav Rotem b078350872 This commit contains a few changes that had to go in together.
1. Simplify xor/and/or (bitcast(A), bitcast(B)) -> bitcast(op (A,B))
   (and also scalar_to_vector).

2. Xor/and/or are indifferent to the swizzle operation (shuffle of one src).
   Simplify xor/and/or (shuff(A), shuff(B)) -> shuff(op (A, B))

3. Optimize swizzles of shuffles:  shuff(shuff(x, y), undef) -> shuff(x, y).

4. Fix an X86ISelLowering optimization which was very bitcast-sensitive.

Code which was previously compiled to this:

movd    (%rsi), %xmm0
movdqa  .LCPI0_0(%rip), %xmm2
pshufb  %xmm2, %xmm0
movd    (%rdi), %xmm1
pshufb  %xmm2, %xmm1
pxor    %xmm0, %xmm1
pshufb  .LCPI0_1(%rip), %xmm1
movd    %xmm1, (%rdi)
ret

Now compiles to this:

movl    (%rsi), %eax
xorl    %eax, (%rdi)
ret

llvm-svn: 153848
2012-04-01 19:31:22 +00:00
Lang Hames 652f21274f Fix typo.
llvm-svn: 153846
2012-04-01 19:27:25 +00:00
Hal Finkel 88ed4e3b15 Set the default PPC node scheduling preference to ILP (for the embedded cores).
The 440 and A2 cores have detailed itineraries, and this allows them to be
fully used to maximize throughput.

llvm-svn: 153845
2012-04-01 19:23:08 +00:00
Hal Finkel b9845f5758 Add ppc440 itin. entries for LdStSTD*
llvm-svn: 153844
2012-04-01 19:23:04 +00:00
Hal Finkel ec5a1e3669 Use full anti-dep. breaking with post-ra sched. on the embedded ppc cores.
Post-RA scheduling gives a significant performance improvement on
the embedded cores, so turn it on. Using full anti-dep. breaking is
important for FP-intensive blocks, so turn it on (just on the
embedded cores for now; this should also be good on the 970s because
post-ra scheduling is all that we have for now, but that should have
more testing first).

llvm-svn: 153843
2012-04-01 19:22:57 +00:00
Hal Finkel 9f9f8929ee Add instruction itinerary for the PPC64 A2 core.
This adds a full itinerary for IBM's PPC64 A2 embedded core. These
cores form the basis for the CPUs in the new IBM BG/Q supercomputer.

llvm-svn: 153842
2012-04-01 19:22:40 +00:00
Craig Topper 91773ab2ca Use SequenceToOffsetTable to create instruction name table. Saves space particularly on X86 where AVX instructions just add a 'v' to the front of other instructions.
llvm-svn: 153841
2012-04-01 18:14:14 +00:00
Benjamin Kramer 12af4285d1 Emit the LLVM<->DWARF register mapping as a sorted table and use binary search to do the lookup.
This also avoids emitting the information twice, which led to code bloat. On i386-linux-Release+Asserts
with all targets built this change shaves a whopping 1.3 MB off clang. The number is probably exaggerated
by recent inliner changes but the methods were already enormous with the old inline cost computation.

The DWARF reg -> LLVM reg mapping doesn't seem to have holes in it, so it could be a simple lookup table.
I didn't implement that optimization yet to avoid potentially changing functionality.

There is still some duplication both in tablegen and the generated code that should be cleaned up eventually.

llvm-svn: 153837
2012-04-01 14:23:58 +00:00