Commit Graph

7946 Commits

Author SHA1 Message Date
Chris Lattner e0f0c4aa02 enable sinking and licm of loads from the argument area. I'd like to enable this
for remat, but can't due to an RA bug.

llvm-svn: 45623
2008-01-05 06:10:42 +00:00
Chris Lattner d4738ee8e1 simplify some code by using shorter accessors.
llvm-svn: 45622
2008-01-05 05:28:30 +00:00
Chris Lattner 86f4a2e4f2 revert my previous patch.
llvm-svn: 45621
2008-01-05 05:26:26 +00:00
Chris Lattner 69d7902cf1 factor some code better to avoid redundancy between
isReallySideEffectFree and isReallyTriviallyReMaterializable.  Why is a load from
a global considered side-effect-free but not rematable?

llvm-svn: 45620
2008-01-05 05:19:56 +00:00
Chris Lattner 9fa8ae6c6b getting the pic base has no side effects.
llvm-svn: 45618
2008-01-05 03:54:32 +00:00
Evan Cheng 880b080887 X86 JIT PIC jumptable support.
llvm-svn: 45616
2008-01-05 02:26:58 +00:00
Evan Cheng f55b7381af Combine MovePCtoStack + POP32r into one instruction MOVPC32r so it can be moved if needed.
llvm-svn: 45605
2008-01-05 00:41:47 +00:00
Owen Anderson 6bb0c52628 Move some more functionality from MRegisterInfo to TargetInstrInfo.
llvm-svn: 45603
2008-01-04 23:57:37 +00:00
Evan Cheng c1d1e54fc4 Unbreak tailcall opt in JIT.
llvm-svn: 45576
2008-01-04 10:50:28 +00:00
Evan Cheng 49ff8ecd03 X86 PIC JIT support fixes: encoding bugs, add lazy pointer stubs support.
llvm-svn: 45575
2008-01-04 10:46:51 +00:00
Evan Cheng 2e1ba07f16 Correct order of parameters.
llvm-svn: 45562
2008-01-04 02:22:21 +00:00
Gordon Henriksen f066fc477c First steps in in X86 calling convention cleanup.
llvm-svn: 45536
2008-01-03 16:47:34 +00:00
Evan Cheng 563fcc3428 Change MachineRelocation::DoesntNeedFnStub to NeedStub. This fields will be used
for non-function GV relocations that require function address stubs (e.g. Mac OS X in non-static mode).

llvm-svn: 45527
2008-01-03 02:56:28 +00:00
Evan Cheng 96334b4e3b X86 PIC JIT bug fix: relocations for constantpool and jumptable.
llvm-svn: 45515
2008-01-02 23:38:59 +00:00
Bill Wendling e1f28e7871 Machine LICM will check that operands are defined outside of the loop. Also
check that register isn't 0 before going further.

llvm-svn: 45498
2008-01-02 21:10:40 +00:00
Chris Lattner cce79c67ca darwin9 and above support aligned common symbols.
llvm-svn: 45494
2008-01-02 19:44:55 +00:00
Chris Lattner dcbc0f3029 leopard and above support alignment for common symbols.
llvm-svn: 45493
2008-01-02 19:35:16 +00:00
Owen Anderson eee14601b1 Move some more instruction creation methods from RegisterInfo into InstrInfo.
llvm-svn: 45484
2008-01-01 21:11:32 +00:00
Chris Lattner b0fb17fd34 Fix a bug in my previous patch: refer to the impl not the pure virtual version. It's unclear why gcc would ever compile this...
llvm-svn: 45476
2008-01-01 01:05:34 +00:00
Chris Lattner 25568e4cef Fix a problem where lib/Target/TargetInstrInfo.h would include and use
a header file from libcodegen.  This violates a layering order: codegen
depends on target, not the other way around.  The fix to this is to 
split TII into two classes, TII and TargetInstrInfoImpl, which defines
stuff that depends on libcodegen.  It is defined in libcodegen, where 
the base is not.

llvm-svn: 45475
2008-01-01 01:03:04 +00:00
Owen Anderson 7a73ae9a86 Move copyRegToReg from MRegisterInfo to TargetInstrInfo. This is part of the
Machine-level API cleanup instigated by Chris.

llvm-svn: 45470
2007-12-31 06:32:00 +00:00
Chris Lattner a10fff51d9 Rename SSARegMap -> MachineRegisterInfo in keeping with the idea
that "machine" classes are used to represent the current state of
the code being compiled.  Given this expanded name, we can start 
moving other stuff into it.  For now, move the UsedPhysRegs and
LiveIn/LoveOuts vectors from MachineFunction into it.

Update all the clients to match.

This also reduces some needless #includes, such as MachineModuleInfo
from MachineFunction.

llvm-svn: 45467
2007-12-31 04:13:23 +00:00
Chris Lattner a5bb370aa4 Add new shorter predicates for testing machine operands for various types:
e.g. MO.isMBB() instead of MO.isMachineBasicBlock().  I don't plan on 
switching everything over, so new clients should just start using the 
shorter names.

Remove old long accessors, switching everything over to use the short
accessor: getMachineBasicBlock() -> getMBB(), 
getConstantPoolIndex() -> getIndex(), setMachineBasicBlock -> setMBB(), etc.

llvm-svn: 45464
2007-12-30 23:10:15 +00:00
Chris Lattner 6005589faf More cleanups for MachineOperand:
- Eliminate the static "print" method for operands, moving it
    into MachineOperand::print.
  - Change various set* methods for register flags to take a bool
    for the value to set it to.  Remove unset* methods.
  - Group methods more logically by operand flavor in MachineOperand.h

llvm-svn: 45461
2007-12-30 21:56:09 +00:00
Chris Lattner 5c4637816e Use MachineOperand::getImm instead of MachineOperand::getImmedValue. Likewise setImmedValue -> setImm
llvm-svn: 45453
2007-12-30 20:49:49 +00:00
Bill Wendling 7749a9014b If we have a load of a global address that's not modified during the
function, then go ahead and hoist it out of the loop. This is the result:

$ cat a.c
volatile int G;

int A(int N) {
  for (; N > 0; --N)
    G++;
}
$ llc -o - -relocation-model=pic
_A:
...
LBB1_2: # bb
        movl    L_G$non_lazy_ptr-"L1$pb"(%eax), %esi
        incl    (%esi)
        incl    %edx
        cmpl    %ecx, %edx
        jne     LBB1_2  # bb
...
$ llc -o - -relocation-model=pic -machine-licm
_A:
...
        movl    L_G$non_lazy_ptr-"L1$pb"(%eax), %eax
LBB1_2: # bb
        incl    (%eax)
        incl    %edx
        cmpl    %ecx, %edx
        jne     LBB1_2  # bb
...

I'm limiting this to the MOV32rm x86 instruction for now.

llvm-svn: 45444
2007-12-30 03:18:58 +00:00
Chris Lattner b3fd2d7b63 use simplified operand addition methods.
llvm-svn: 45437
2007-12-30 01:01:54 +00:00
Chris Lattner 4b762496a9 Shrinkify the machine operand creation method names.
llvm-svn: 45433
2007-12-30 00:45:46 +00:00
Chris Lattner f3ebc3f3d2 Remove attribution from file headers, per discussion on llvmdev.
llvm-svn: 45418
2007-12-29 20:36:04 +00:00
Chris Lattner a087a8d2ce remove attribution from lib Makefiles.
llvm-svn: 45415
2007-12-29 20:09:26 +00:00
Chris Lattner 4e6cbb2fe2 this is done.
llvm-svn: 45408
2007-12-29 19:38:02 +00:00
Chris Lattner d2b8a36f0e One readme entry is done, one is really easy (Evan, want to investigate
eliminating the llvm.x86.sse2.loadl.pd intrinsic?), one shuffle optzn
may be done (if shufps is better than pinsw, Evan, please review), and
we already know about LICM of simple instructions.

llvm-svn: 45407
2007-12-29 19:31:47 +00:00
Chris Lattner 3b6a82118b Fold comparisons against a constant nan, and optimize ORD/UNORD
comparisons with a constant.  This allows us to compile isnan to:

_foo:
	fcmpu cr7, f1, f1
	mfcr r2
	rlwinm r3, r2, 0, 31, 31
	blr 

instead of:

LCPI1_0:					;  float
	.space	4
_foo:
	lis r2, ha16(LCPI1_0)
	lfs f0, lo16(LCPI1_0)(r2)
	fcmpu cr7, f1, f0
	mfcr r2
	rlwinm r3, r2, 0, 31, 31
	blr 

llvm-svn: 45405
2007-12-29 08:37:08 +00:00
Chris Lattner 33de0c6e92 this xform is implemented.
llvm-svn: 45404
2007-12-29 08:19:39 +00:00
Chris Lattner 07ccbfa64a Codegen:
as:

_bar:
	pushl	%esi
	subl	$8, %esp
	movl	16(%esp), %esi
	call	L_foo$stub
	fstps	(%esi)
	addl	$8, %esp
	popl	%esi
	#FP_REG_KILL
	ret

instead of:

_bar:
	pushl	%esi
	subl	$8, %esp
	movl	16(%esp), %esi
	call	L_foo$stub
	fstpl	(%esi)
	cvtsd2ss	(%esi), %xmm0
	movss	%xmm0, (%esi)
	addl	$8, %esp
	popl	%esi
	#FP_REG_KILL
	ret

llvm-svn: 45401
2007-12-29 06:57:38 +00:00
Chris Lattner 8013bd339b avoid going through a stack slot to convert from fpstack to xmm reg
if we are just going to store it back anyway.  This improves things 
like:
double foo();
void bar(double *P) { *P = foo(); }

llvm-svn: 45399
2007-12-29 06:41:28 +00:00
Chris Lattner 62ba67c0f7 add a note
llvm-svn: 45397
2007-12-29 05:51:58 +00:00
Chris Lattner 7cafd92aa9 expand note.
llvm-svn: 45393
2007-12-29 01:05:01 +00:00
Chris Lattner 9d53b611d1 add a note.
llvm-svn: 45388
2007-12-28 22:30:05 +00:00
Chris Lattner d798002401 add a note.
llvm-svn: 45387
2007-12-28 21:50:40 +00:00
Chris Lattner 180f0e9044 add a note
llvm-svn: 45377
2007-12-28 04:42:05 +00:00
Chris Lattner 6c234bf58f add a simple hack
llvm-svn: 45343
2007-12-24 19:27:46 +00:00
Gordon Henriksen 84c7325ca1 Setting GlobalDirective in TargetAsmInfo by default rather than
providing a misleading facility. It's used once in the MIPS backend
and hardcoded as "\t.globl\t" everywhere else.

llvm-svn: 45338
2007-12-23 20:58:16 +00:00
Chris Lattner 00602f6105 fix some warnings. This code needs to be de-tabified :(
llvm-svn: 45325
2007-12-22 22:47:03 +00:00
Chris Lattner 91f3379660 fix strict-aliasing violation
llvm-svn: 45324
2007-12-22 22:45:38 +00:00
Anton Korobeynikov 1b1250770c Erm, really disable :)
llvm-svn: 45319
2007-12-22 20:46:24 +00:00
Anton Korobeynikov d34112a4ee Disable, until we'll really need it
llvm-svn: 45318
2007-12-22 20:41:12 +00:00
Evan Cheng 345a00ba05 Preliminary PIC JIT support for X86 (32-bit) / Darwin.
llvm-svn: 45313
2007-12-22 09:40:20 +00:00
Evan Cheng db33a0211b Oops.
llvm-svn: 45312
2007-12-22 09:14:34 +00:00
Evan Cheng f4f52dbc8c Fix JIT code emission of X86::MovePCtoStack.
llvm-svn: 45307
2007-12-22 02:26:46 +00:00
Evan Cheng ac134551c6 Allow JIT with non-static relocation model.
llvm-svn: 45304
2007-12-22 01:12:14 +00:00
Anton Korobeynikov 1e8c1308bb Fix silly typo in the FP CEP handling.
llvm-svn: 45300
2007-12-21 23:33:44 +00:00
Duncan Sands 85f26f28b9 Fix a brain fart by our beloved leader (the content
of this patch is the last line).

llvm-svn: 45289
2007-12-21 20:18:41 +00:00
Nicolas Geoffray 31a2c3948e Fix unintented change from last commit
llvm-svn: 45282
2007-12-21 12:22:29 +00:00
Nicolas Geoffray 80c741e160 Enable EH for linux/ppc32 targets
llvm-svn: 45281
2007-12-21 12:19:44 +00:00
Evan Cheng 78c460c8c4 New entry.
llvm-svn: 45280
2007-12-21 01:31:58 +00:00
Evan Cheng 01c7c198ee Fix JIT encoding for CMPSD as well.
llvm-svn: 45268
2007-12-20 19:57:09 +00:00
Scott Michel 5f1470f03a More working CellSPU tests:
- vec_const.ll: Vector constant loads
- immed64.ll: i64, f64 constant loads

llvm-svn: 45242
2007-12-20 00:44:13 +00:00
Dale Johannesen eadbf4b91c Enable EH on PPC Darwin. This basically works; there
are a couple of issues that show up with the optimizer,
but I don't think they're really EH problems.
(llvm-gcc testsuite users note:  By default the testsuite
uses the unwinding code that's built as part of your local
llvm-gcc, which does not work.  You need to trick it into
using the installed system unwinding code to get useful
results.)

llvm-svn: 45221
2007-12-19 21:54:36 +00:00
Scott Michel 5ecac82f71 CellSPU testcase, extract_elt.ll: extract vector element.
llvm-svn: 45219
2007-12-19 21:17:42 +00:00
Scott Michel 098c113bc8 Two more test cases: or_ops.ll (arithmetic or operations) and vecinsert.ll
(vector insertions)

llvm-svn: 45216
2007-12-19 20:15:47 +00:00
Scott Michel 9b834469e0 Add new immed16.ll test case, fix CellSPU errata to make test case work.
llvm-svn: 45196
2007-12-19 07:35:06 +00:00
Bill Wendling ca77ecb40a Mark the "isRemat" instruction as never having side effects.
llvm-svn: 45190
2007-12-19 06:07:48 +00:00
Chris Lattner 2583a66295 add an obvious load folding missed optzn.
llvm-svn: 45161
2007-12-18 16:48:14 +00:00
Christopher Lamb 8b09a464b4 Fold certain additions through selects (and their compares) so as to eliminate subtractions. This code is often produced by the SMAX expansion in SCEV.
This implements test/Transforms/InstCombine/2007-12-18-AddSelCmpSub.ll

llvm-svn: 45158
2007-12-18 09:34:41 +00:00
Chris Lattner 52a9e40789 add a missed case.
llvm-svn: 45141
2007-12-18 01:19:18 +00:00
Bill Wendling b3d85a5d4b Add "mayHaveSideEffects" and "neverHasSideEffects" flags to some instructions. I
based what flag to set on whether it was already marked as
"isRematerializable". If there was a further check to determine if it's "really"
rematerializable, then I marked it as "mayHaveSideEffects" and created a check
in the X86 back-end similar to the remat one.

llvm-svn: 45132
2007-12-17 23:07:56 +00:00
Scott Michel c5cccb9e60 - Restore some i8 functionality in CellSPU
- New test case: nand.ll

llvm-svn: 45130
2007-12-17 22:32:34 +00:00
Bill Wendling a2401be121 LD_Fp64m should have "isRematerializable" set.
llvm-svn: 45128
2007-12-17 22:17:14 +00:00
Bill Wendling 939526a930 As per feedback, revised comments to (hopefully) make the different side effect
flags clearer.

llvm-svn: 45120
2007-12-17 21:02:07 +00:00
Christopher Lamb edf0788758 Change the PointerType api for creating pointer types. The old functionality of PointerType::get() has become PointerType::getUnqual(), which returns a pointer in the generic address space. The new prototype of PointerType::get() requires both a type and an address space.
llvm-svn: 45082
2007-12-17 01:12:55 +00:00
Chris Lattner 2af27c202c don't violate C TBAA rules, use FloatToBits instead.
llvm-svn: 45076
2007-12-16 20:41:33 +00:00
Chris Lattner e3b05fe31b fix a questionable cast, thanks to Mike Stump for pointing this out.
llvm-svn: 45075
2007-12-16 20:26:54 +00:00
Chris Lattner dab6bd902e Fix the JIT encoding of cmp*ss, which aborts with this assertion currently:
X86CodeEmitter.cpp:378: failed assertion `0 && "Immediate size not set!"'

I *think* this is right, but Evan, please verify.  It also looks like
CMPSDrr and maybe others are missing this info.  Evan, plz investigate.

llvm-svn: 45074
2007-12-16 20:12:41 +00:00
Evan Cheng 23d2d4dc6c Make better use of instructions that clear high bits; fix various 2-wide shuffle bugs.
llvm-svn: 45058
2007-12-15 03:00:47 +00:00
Scott Michel 0aa7133f82 Start committing working test cases for CellSPU.
llvm-svn: 45050
2007-12-15 00:38:50 +00:00
Evan Cheng 9556729128 Actually, MOVPQIto64mr is a dup of MOVPQI2QImr, MOV64toPQIrm is a dup of MOVQI2PQIrm.
llvm-svn: 45041
2007-12-14 20:08:14 +00:00
Evan Cheng e28372c0d6 Fix (mem) <-> low 64-bits of xmm bugs pointed out by David Greene. Mac OS X Leopard assembler recognizes movq.
llvm-svn: 45040
2007-12-14 19:54:07 +00:00
Dale Johannesen f7cefdd5f0 x86-32 long doubles are 4-byte aligned on the stack
for parameter passing (only for that, on Darwin).

llvm-svn: 45038
2007-12-14 19:25:34 +00:00
Evan Cheng a56e6ff9a7 Fix bsf / bsr jit encoding.
llvm-svn: 45037
2007-12-14 18:49:43 +00:00
Evan Cheng f28c810036 Oops. Forgot these.
llvm-svn: 45036
2007-12-14 18:25:34 +00:00
Dan Gohman 9d2e9e376f Fix Intel asm syntax for the bsr and bsf instructions.
llvm-svn: 45030
2007-12-14 15:10:00 +00:00
Evan Cheng 0e6408124e Fix ctlz and cttz. llvm definition requires them to return number of bits in of the src type when value is zero.
llvm-svn: 45029
2007-12-14 08:30:15 +00:00
Evan Cheng e9fbc3f014 Implement ctlz and cttz with bsr and bsf.
llvm-svn: 45024
2007-12-14 02:13:44 +00:00
Bill Wendling cb77f04e1f Add flags to indicate that there are "never" side effects or that there "may be"
side effects for machine instructions.

llvm-svn: 45022
2007-12-14 01:48:59 +00:00
Evan Cheng 827d30db19 Fold some and + shift in x86 addressing mode.
llvm-svn: 44970
2007-12-13 00:43:27 +00:00
Evan Cheng 6e68381e02 Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled.
llvm-svn: 44960
2007-12-12 23:12:09 +00:00
Duncan Sands fde556745b Remove host endianness info from TargetData and
put it in a new header System/Host.h instead.
Instead of getting the endianness from configure,
calculate it directly.

llvm-svn: 44959
2007-12-12 23:03:45 +00:00
Dan Gohman 7a7742c2fe Allow vector integer constants to be created with
SelectionDAG::getConstant, in the same way as vector floating-point
constants. This allows the legalize expansion code for @llvm.ctpop and
friends to be usable with vector types.

llvm-svn: 44954
2007-12-12 22:21:26 +00:00
Evan Cheng 0f42730722 Use shuffles to implement insert_vector_elt for i32, i64, f32, and f64.
llvm-svn: 44929
2007-12-12 07:55:34 +00:00
Evan Cheng 2a98956796 Lower a build_vector with all constants into a constpool load unless it can be done with a move to low part.
llvm-svn: 44921
2007-12-12 06:45:40 +00:00
Scott Michel 4a8bc7e105 Correct typo for Linux: s/esp/%rsp/
llvm-svn: 44904
2007-12-12 02:38:28 +00:00
Nate Begeman 6dc8b4ed13 Allow the JIT to encode MMX instructions
llvm-svn: 44869
2007-12-11 18:06:14 +00:00
Evan Cheng 4fbf459549 - Improved v8i16 shuffle lowering. It now uses pshuflw and pshufhw as much as
possible before resorting to pextrw and pinsrw.
- Better codegen for v4i32 shuffles masquerading as v8i16 or v16i8 shuffles.
- Improves (i16 extract_vector_element 0) codegen by recognizing
  (i32 extract_vector_element 0) does not require a pextrw.

llvm-svn: 44836
2007-12-11 01:46:18 +00:00
Nate Begeman a55a67ae91 x86 doesn't actually want to custom lower v3i32
llvm-svn: 44835
2007-12-11 01:41:33 +00:00
Chris Lattner 2945c46e93 Move TargetData::hostIsLittleEndian out of line, which means we
don't have to #include config.h in it.  #including config.h breaks
other projects that have their own autoconf stuff and try to #include
the llvm headers.  One obscure example is llvm-gcc.

llvm-svn: 44825
2007-12-11 00:28:59 +00:00
Anton Korobeynikov 21ade5880b Hey, English is not my native language :)
llvm-svn: 44820
2007-12-10 23:10:20 +00:00
Anton Korobeynikov 77eb5e649d Clarify the need of CFI() stuff
llvm-svn: 44819
2007-12-10 23:08:35 +00:00
Anton Korobeynikov a6b0f7e244 Provide convenient way to disable CFI stuff for old/broken assemblers.
Use it for Darwin.

llvm-svn: 44818
2007-12-10 23:04:38 +00:00
Chris Lattner 8a72a7d586 Disable cfi directives for now, darwin does't support them.
These should probably be something like:

  CFI(".cfi_def_cfa_offset 16\n")

where CFI is defined to a noop on darwin and other platforms
that don't support those directives.

llvm-svn: 44803
2007-12-10 19:10:18 +00:00
Anton Korobeynikov 657be86229 And finally annotate X86-64 version of callback.
All bad stuff from SSE version is implicitely inherited :)

llvm-svn: 44794
2007-12-10 15:27:07 +00:00
Anton Korobeynikov 88e9d082d8 Provide annotation for SSE version of callback. It's even more
broken, because doesn't mark xmm regs properly

llvm-svn: 44793
2007-12-10 15:13:55 +00:00
Anton Korobeynikov 81e9dc4af7 Annotate JIT callback function with call frame infromation.
This will allow us (theoretically) to unwind through JITer.
The code wasn't verified, so I'm pretty sure offsets are wrong :)

llvm-svn: 44792
2007-12-10 14:54:42 +00:00
Bill Wendling 3f19dfe794 Reverting 44702. It wasn't correct to rename them.
llvm-svn: 44727
2007-12-08 23:58:46 +00:00
Chris Lattner ff87f05e43 aesthetic changes, no functionality change. Evan, it's not clear
what 'Available' is, please add a comment near it and rename it
if appropriate.

llvm-svn: 44703
2007-12-08 07:22:58 +00:00
Bill Wendling 2b07d8c5a0 Renaming:
isTriviallyReMaterializable -> hasNoSideEffects
  isReallyTriviallyReMaterializable -> isTriviallyReMaterializable

llvm-svn: 44702
2007-12-08 07:17:56 +00:00
Chris Lattner f47015bc74 Fix a significant code quality regression I introduced on PPC64 quite
a while ago.  We now produce:

_foo:
	mflr r0
	std r0, 16(r1)
	ld r2, 16(r1)
	std r2, 0(r3)
	ld r0, 16(r1)
	mtlr r0
	blr 

instead of:

_foo:
	mflr r0
	std r0, 16(r1)
	lis r0, 0
	ori r0, r0, 16
	ldx r2, r1, r0
	std r2, 0(r3)
	ld r0, 16(r1)
	mtlr r0
	blr 

for:

void foo(void **X) {
  *X = __builtin_return_address(0);
}

on ppc64.

llvm-svn: 44701
2007-12-08 07:04:58 +00:00
Chris Lattner f6a8156e4f implement __builtin_return_addr(0) on ppc.
llvm-svn: 44700
2007-12-08 06:59:59 +00:00
Chris Lattner a6c8297e84 refactor some code to avoid overloading the name 'usesLR' in
different places to mean different things.  Document what the
one in PPCFunctionInfo means and when it is valid.

llvm-svn: 44699
2007-12-08 06:39:11 +00:00
Evan Cheng 92105ac0cd Doh
llvm-svn: 44694
2007-12-08 01:01:07 +00:00
Evan Cheng cd6913627e Fix a compilation warning.
llvm-svn: 44692
2007-12-08 01:00:31 +00:00
Evan Cheng ab87e735cd Fix a compilation warning.
llvm-svn: 44691
2007-12-08 01:00:21 +00:00
Bill Wendling fb706bc52b Initial commit of the machine code LICM pass. It successfully hoists this:
_foo:
        li r2, 0
LBB1_1: ; bb
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmplw cr0, r2, r4
        bne cr0, LBB1_1 ; bb
LBB1_2: ; return
        blr 

to:

_foo:
        li r2, 0
        li r5, 0
LBB1_1: ; bb
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmplw cr0, r2, r4
        bne cr0, LBB1_1 ; bb
LBB1_2: ; return
        blr

ZOMG!! :-)

Moar to come...

llvm-svn: 44687
2007-12-07 21:42:31 +00:00
Evan Cheng b41d838d28 Add comment.
llvm-svn: 44686
2007-12-07 21:30:01 +00:00
Evan Cheng bfd373a53e Much improved v8i16 shuffles. (Step 1).
llvm-svn: 44676
2007-12-07 08:07:39 +00:00
Evan Cheng c829e5cdf0 Remove a bogus optimization. It's not possible to do a move to low element to a <8 x i16> or <16 x i8> vector.
llvm-svn: 44669
2007-12-06 22:14:22 +00:00
Chris Lattner 7c709a5d08 implement a readme entry, compiling the code into:
_foo:
	movl	$12, %eax
	andl	4(%esp), %eax
	movl	_array(%eax), %eax
	ret

instead of:

_foo:
	movl	4(%esp), %eax
	shrl	$2, %eax
	andl	$3, %eax
	movl	_array(,%eax,4), %eax
	ret

As it turns out, this triggers all the time, in a wide variety of
situations, for example, I see diffs like this in various programs:

-       movl    8(%eax), %eax
-       shll    $2, %eax
-       andl    $1020, %eax
-       movl    (%esi,%eax), %eax
+       movzbl  8(%eax), %eax
+       movl    (%esi,%eax,4), %eax


-       shll    $2, %edx
-       andl    $1020, %edx
-       movl    (%edi,%edx), %edx
+       andl    $255, %edx
+       movl    (%edi,%edx,4), %edx

Unfortunately, I also see stuff like this, which can be fixed in the
X86 backend:

-       andl    $85, %ebx
-       addl    _bit_count(,%ebx,4), %ebp
+       shll    $2, %ebx
+       andl    $340, %ebx
+       addl    _bit_count(%ebx), %ebp

llvm-svn: 44656
2007-12-06 07:33:36 +00:00
Chris Lattner 5e224c32f4 add a note
llvm-svn: 44638
2007-12-05 23:05:06 +00:00
Chris Lattner ad05e17491 add a note
llvm-svn: 44637
2007-12-05 22:58:19 +00:00
Scott Michel 83d54c9ee0 Minor updates:
- Fix typo in SPUCallingConv.td
- Credit myself for CellSPU work
- Add CellSPU to 'all' host target list

llvm-svn: 44627
2007-12-05 21:23:16 +00:00
Evan Cheng 3b8a674469 Added canFoldMemoryOperand for PPC.
llvm-svn: 44623
2007-12-05 18:41:29 +00:00
Evan Cheng 8492bdeaa4 Update foldMemoryOperand.
llvm-svn: 44621
2007-12-05 18:36:37 +00:00
Chris Lattner de9bfcf67a fix warnings
llvm-svn: 44620
2007-12-05 18:32:18 +00:00
Chris Lattner 8292519705 allow this to build
llvm-svn: 44619
2007-12-05 18:30:11 +00:00
Evan Cheng bb26301864 Add a argument to storeRegToStackSlot and storeRegToAddr to specify whether
the stored register is killed.

llvm-svn: 44600
2007-12-05 03:14:33 +00:00
Scott Michel 4834955fdf More stuff for CellSPU -- this should be enough to get an error-free
compilation (no files missing). Test cases remain to be checked in.

llvm-svn: 44598
2007-12-05 02:01:41 +00:00
Scott Michel d1b5b9f68c Updated source file headers to llvm coding standard.
llvm-svn: 44597
2007-12-05 01:40:25 +00:00
Scott Michel 8d35728607 Two missing files.
llvm-svn: 44596
2007-12-05 01:31:18 +00:00
Scott Michel eff980208a Main CellSPU backend files checked in. Intrinsics and autoconf files
remain.

llvm-svn: 44595
2007-12-05 01:24:05 +00:00
Scott Michel dfe09ed085 More files in the CellSPU drop...
llvm-svn: 44584
2007-12-04 22:35:58 +00:00
Scott Michel 6e22c651d1 More of the Cell SPU code drop from "Team Aerospace".
llvm-svn: 44582
2007-12-04 22:23:35 +00:00
Scott Michel d821fe741e More CellSPU files... more to follow.
llvm-svn: 44559
2007-12-03 23:14:43 +00:00
Scott Michel c7bd8d9cb0 Makefile fragment for CellSPU.
llvm-svn: 44558
2007-12-03 23:12:49 +00:00
Scott Michel 256e9abbb9 First commit to CellSPU. More to follow
llvm-svn: 44557
2007-12-03 23:09:49 +00:00
Duncan Sands 38ef3a8ec7 Rather than having special rules like "intrinsics cannot
throw exceptions", just mark intrinsics with the nounwind
attribute.  Likewise, mark intrinsics as readnone/readonly
and get rid of special aliasing logic (which didn't use
anything more than this anyway).

llvm-svn: 44544
2007-12-03 20:06:50 +00:00
Evan Cheng f45a1d623c Remove redundant foldMemoryOperand variants and other code clean up.
llvm-svn: 44517
2007-12-02 08:30:39 +00:00
Evan Cheng 69fda0a716 Allow some reloads to be folded in multi-use cases. Specifically testl r, r -> cmpl [mem], 0.
llvm-svn: 44479
2007-12-01 02:07:52 +00:00
Chris Lattner 246b7b2dbf Work around a GCC bug, producing this code:
unsigned char *llvm_cbe_X;
...
  llvm_cbe_X = 0; *((void**)&llvm_cbe_X) = __builtin_stack_save();

instead of:

  llvm_cbe_X = __builtin_stack_save();

See PR1809 for details.

llvm-svn: 44415
2007-11-28 21:26:17 +00:00
Chris Lattner 57ee7c6630 Implement ExpandOperationResult for ppc i64 fp->int, which fixes
CodeGen/Generic/fp_to_int.ll among others.  Its unclear why this 
just started failing...

llvm-svn: 44407
2007-11-28 18:44:47 +00:00
Duncan Sands 5208d1ab4a Add some convenience methods for querying attributes, and
use them.

llvm-svn: 44403
2007-11-28 17:07:01 +00:00
Chris Lattner 57662f3882 several entries got significantly better, though they still aren't done.
llvm-svn: 44382
2007-11-27 22:41:52 +00:00
Chris Lattner f3f4ad9dd6 implement a trivial readme entry.
llvm-svn: 44380
2007-11-27 22:36:16 +00:00
Chris Lattner 79ae9895f6 Fix a crash on invalid code due to memcpy lowering.
llvm-svn: 44378
2007-11-27 22:14:42 +00:00
Nate Begeman 6f026a654c Support returning non-power-of-2 vectors to unblock some work
llvm-svn: 44371
2007-11-27 19:28:48 +00:00
Andrew Lenharth b960acebde something wrong with this opt
llvm-svn: 44370
2007-11-27 18:31:30 +00:00
Duncan Sands ad0ea2d430 Fix PR1146: parameter attributes are longer part of
the function type, instead they belong to functions
and function calls.  This is an updated and slightly
corrected version of Reid Spencer's original patch.
The only known problem is that auto-upgrading of
bitcode files doesn't seem to work properly (see
test/Bitcode/AutoUpgradeIntrinsics.ll).  Hopefully
a bitcode guru (who might that be? :) ) will fix it.

llvm-svn: 44359
2007-11-27 13:23:08 +00:00
Chris Lattner 5728bdd4db Fix a long standing deficiency in the X86 backend: we would
sometimes emit "zero" and "all one" vectors multiple times,
for example:

_test2:
	pcmpeqd	%mm0, %mm0
	movq	%mm0, _M1
	pcmpeqd	%mm0, %mm0
	movq	%mm0, _M2
	ret

instead of:

_test2:
	pcmpeqd	%mm0, %mm0
	movq	%mm0, _M1
	movq	%mm0, _M2
	ret

This patch fixes this by always arranging for zero/one vectors
to be defined as v4i32 or v2i32 (SSE/MMX) instead of letting them be
any random type.  This ensures they get trivially CSE'd on the dag.
This fix is also important for LegalizeDAGTypes, as it gets unhappy
when the x86 backend wants BUILD_VECTOR(i64 0) to be legal even when
'i64' isn't legal.

This patch makes the following changes:

1) X86TargetLowering::LowerBUILD_VECTOR now lowers 0/1 vectors into
   their canonical types.
2) The now-dead patterns are removed from the SSE/MMX .td files.
3) All the patterns in the .td file that referred to immAllOnesV or
   immAllZerosV in the wrong form now use *_bc to match them with a
   bitcast wrapped around them.
4) X86DAGToDAGISel::SelectScalarSSELoad is generalized to handle 
   bitcast'd zero vectors, which simplifies the code actually.
5) getShuffleVectorZeroOrUndef is updated to generate a shuffle that
   is legal, instead of generating one that is illegal and expecting
   a later legalize pass to clean it up.
6) isZeroShuffle is generalized to handle bitcast of zeros.
7) several other minor tweaks.

This patch is definite goodness, but has the potential to cause random
code quality regressions.  Please be on the lookout for these and let 
me know if they happen.

llvm-svn: 44310
2007-11-25 00:24:49 +00:00
Chris Lattner 6a49593aa5 add a immAllZerosV_bc pattern fragment for consistency with others.
llvm-svn: 44303
2007-11-24 19:02:07 +00:00
Chris Lattner f72ad16263 remove bogus assertion that broke CodeGen/Generic/cast-fp.ll on x86
among others.

llvm-svn: 44302
2007-11-24 18:37:20 +00:00
Chris Lattner f81d5886c6 Several changes:
1) Change the interface to TargetLowering::ExpandOperationResult to 
   take and return entire NODES that need a result expanded, not just
   the value.  This allows us to handle things like READCYCLECOUNTER,
   which returns two values.
2) Implement (extremely limited) support in LegalizeDAG::ExpandOp for MERGE_VALUES.
3) Reimplement custom lowering in LegalizeDAGTypes in terms of the new
   ExpandOperationResult.  This makes the result simpler and fully 
   general.
4) Implement (fully general) expand support for MERGE_VALUES in LegalizeDAGTypes.
5) Implement ExpandOperationResult support for ARM f64->i64 bitconvert and ARM
   i64 shifts, allowing them to work with LegalizeDAGTypes.
6) Implement ExpandOperationResult support for X86 READCYCLECOUNTER and FP_TO_SINT,
   allowing them to work with LegalizeDAGTypes.

LegalizeDAGTypes now passes several more X86 codegen tests when enabled and when
type legalization in LegalizeDAG is ifdef'd out.

llvm-svn: 44300
2007-11-24 07:07:01 +00:00
Chris Lattner ab98c41337 add a note
llvm-svn: 44299
2007-11-24 06:13:33 +00:00
Dale Johannesen e70850cf7b Fix compiler warning.
llvm-svn: 44261
2007-11-21 00:45:00 +00:00
Dale Johannesen 763e110a9f Fix .eh table linkage issues on Darwin. Some EH support
for Darwin PPC, but it's not fully working yet.

llvm-svn: 44258
2007-11-20 23:24:42 +00:00
Dan Gohman aad83c8ee5 Remove meaningless qualifiers from return types, avoiding compiler warnings.
llvm-svn: 44240
2007-11-19 20:46:23 +00:00
Nate Begeman d4d45c268c Add support for vectors to int <-> float casts.
llvm-svn: 44204
2007-11-17 03:58:34 +00:00
Anton Korobeynikov 91460e43f1 Implement codegen for flt_rounds on x86
llvm-svn: 44183
2007-11-16 01:31:51 +00:00
Evan Cheng 0cbe920d7c Oops. Debugging code shouldn't have been checked in.
llvm-svn: 44128
2007-11-14 19:08:32 +00:00
Anton Korobeynikov 2c6387803e Fix PIC jump table codegen on x86-32/linux. In fact, such thing should be applied
to all targets uses GOT-relative offsets for PIC (Alpha?)

llvm-svn: 44108
2007-11-14 09:18:41 +00:00
Duncan Sands e2287ed552 Eliminate the recently introduced CCAssignToStackABISizeAlign
in favour of teaching CCAssignToStack that size 0 and/or align
0 means to use the ABI values.  This seems a neater solution.
It is safe since no legal value type has size 0.

llvm-svn: 44107
2007-11-14 08:29:13 +00:00
Evan Cheng 7f02cfa599 Clean up sub-register implementation by moving subReg information back to
MachineOperand auxInfo. Previous clunky implementation uses an external map
to track sub-register uses. That works because register allocator uses
a new virtual register for each spilled use. With interval splitting (coming
soon), we may have multiple uses of the same register some of which are
of using different sub-registers from others. It's too fragile to constantly
update the information.

llvm-svn: 44104
2007-11-14 07:59:08 +00:00
Dale Johannesen 7904708369 Revert previous; these files aren't ready to go in yet.
llvm-svn: 44057
2007-11-13 19:16:02 +00:00
Dale Johannesen 7a7085f6d3 Add parameter to getDwarfRegNum to permit targets
to use different mappings for EH and debug info;
no functional change yet.
Fix warning in X86CodeEmitter.

llvm-svn: 44056
2007-11-13 19:13:01 +00:00
Evan Cheng c891ae92dc Fix x86-64 jit: remove reliance on Dwarf numbers.
llvm-svn: 44048
2007-11-13 17:54:34 +00:00
Bill Wendling 77b13af9a6 Unifacalize the CALLSEQ{START,END} stuff.
llvm-svn: 44045
2007-11-13 09:19:02 +00:00
Bill Wendling f359fed9f9 Unify CALLSEQ_{START,END}. They take 4 parameters: the chain, two stack
adjustment fields, and an optional flag. If there is a "dynamic_stackalloc" in
the code, make sure that it's bracketed by CALLSEQ_START and CALLSEQ_END. If
not, then there is the potential for the stack to be changed while the stack's
being used by another instruction (like a call).

This can only result in tears...

llvm-svn: 44037
2007-11-13 00:44:25 +00:00
Anton Korobeynikov bfb139ec93 Completely forgot, that we have some debug information emission on PPC. This should fix
some regressions on ppc nightly tests.

llvm-svn: 44029
2007-11-12 23:36:13 +00:00
Bruno Cardoso Lopes b439132d16 Added JumpTable support
Fixed some AsmPrinter issues
Added GLOBAL_OFFSET_TABLE Node handle.

llvm-svn: 44024
2007-11-12 19:49:57 +00:00
Owen Anderson 933b5b7e62 Add a flag for indirect branch instructions.
Target maintainers: please check that the instructions for your target are correctly marked.

llvm-svn: 44012
2007-11-12 07:39:39 +00:00
Anton Korobeynikov 0644bb865e Clarify the meaning of '-2' register number
llvm-svn: 43998
2007-11-11 19:53:50 +00:00
Anton Korobeynikov 4edfea438a Use TableGen to emit information for dwarf register numbers.
This makes DwarfRegNum to accept list of numbers instead.
Added three different "flavours", but only slightly tested on x86-32/linux.
Please check another subtargets if possible,

llvm-svn: 43997
2007-11-11 19:50:10 +00:00
Dale Johannesen b988e7e8cd Add CCAssignToStackABISizeAlign for convenience in
dealing with types whose size & alignment are
different on different subtargets.  Use it for x86 f80.

llvm-svn: 43988
2007-11-10 22:07:15 +00:00
Arnold Schwaighofer d2c16ff905 Update tailcall code to include inline attribute operand for memcpy.
llvm-svn: 43978
2007-11-10 10:48:01 +00:00
Evan Cheng fb13fd6f93 Unbreak x86-64 jumptable.
llvm-svn: 43955
2007-11-09 19:11:23 +00:00
Anton Korobeynikov 5db5e352b9 Silence a warning
llvm-svn: 43954
2007-11-09 19:06:14 +00:00
Dale Johannesen dfb85c7831 Revert previous rewrite per chris's comments.
llvm-svn: 43950
2007-11-09 18:07:11 +00:00
Evan Cheng 797d56ff17 Much improved pic jumptable codegen:
Then:
        call    "L1$pb"
"L1$pb":
        popl    %eax
		...
LBB1_1: # entry
        imull   $4, %ecx, %ecx
        leal    LJTI1_0-"L1$pb"(%eax), %edx
        addl    LJTI1_0-"L1$pb"(%ecx,%eax), %edx
        jmpl    *%edx

        .align  2
        .set L1_0_set_3,LBB1_3-LJTI1_0
        .set L1_0_set_2,LBB1_2-LJTI1_0
        .set L1_0_set_5,LBB1_5-LJTI1_0
        .set L1_0_set_4,LBB1_4-LJTI1_0
LJTI1_0:
        .long    L1_0_set_3
        .long    L1_0_set_2

Now:
        call    "L1$pb"
"L1$pb":
        popl    %eax
		...
LBB1_1: # entry
        addl    LJTI1_0-"L1$pb"(%eax,%ecx,4), %eax
        jmpl    *%eax

		.align  2
		.set L1_0_set_3,LBB1_3-"L1$pb"
		.set L1_0_set_2,LBB1_2-"L1$pb"
		.set L1_0_set_5,LBB1_5-"L1$pb"
		.set L1_0_set_4,LBB1_4-"L1$pb"
LJTI1_0:
        .long    L1_0_set_3
        .long    L1_0_set_2

llvm-svn: 43924
2007-11-09 01:32:10 +00:00
Dale Johannesen 04fd82088e Rewrite Dwarf number handling per review comments.
llvm-svn: 43918
2007-11-09 00:47:10 +00:00
Lauro Ramos Venancio f6a67bf700 [ARM] Implement __builtin_thread_pointer.
llvm-svn: 43892
2007-11-08 17:20:05 +00:00
Dale Johannesen 1b9de4dd6f Complete conditionalization of Dwarf reg numbers.
Would somebody not on Darwin please make sure this
doesn't break anything.  Exception handling failures
would be the most likely symptom.

llvm-svn: 43844
2007-11-07 21:48:35 +00:00
Dale Johannesen fbe69d2cd6 Interchange Dwarf numbers of ESP and EBP on x86 Darwin.
Much improvement in exception handling.

llvm-svn: 43794
2007-11-07 00:25:05 +00:00
Bruno Cardoso Lopes 87bb032c05 Better processor definition
llvm-svn: 43749
2007-11-06 03:15:20 +00:00
Rafael Espindola fa0df55bdd Move the LowerMEMCPY and LowerMEMCPYCall to a common place.
Thanks for the suggestions Bill :-)

llvm-svn: 43742
2007-11-05 23:12:20 +00:00
Lauro Ramos Venancio 1a30c18e88 [ARM] Fix code generation for:
static __thread struct {
    int a;
    int b;
} teste = {0, 0};

llvm-svn: 43722
2007-11-05 18:33:37 +00:00
Evan Cheng 9337929aae Use movups to spill / restore SSE registers on targets where stacks alignment is
less than 16. This is a temporary solution until dynamic stack alignment is
implemented.

llvm-svn: 43703
2007-11-05 07:30:01 +00:00
Bruno Cardoso Lopes 3e0d030dad Added support for PIC code with "explicit relocations" *only*.
Removed all macro code for PIC (goodbye "la").
Support tested with shootout bench.

llvm-svn: 43697
2007-11-05 03:02:32 +00:00
Duncan Sands 283207a71c Eliminate the remaining uses of getTypeSize. This
should only effect x86 when using long double.  Now
12/16 bytes are output for long double globals (the
exact amount depends on the alignment).  This brings
globals in line with the rest of LLVM: the space
reserved for an object is now always the ABI size.
One tricky point is that only 10 bytes should be
output for long double if it is a field in a packed
struct, which is the reason for the additional
argument to EmitGlobalConstant.

llvm-svn: 43688
2007-11-05 00:04:43 +00:00
Chris Lattner 9329e780cd Fix PR1761 by not printing (rip) suffix when in -static mode.
Evan, please review this.

llvm-svn: 43680
2007-11-04 19:23:28 +00:00
Nick Lewycky d954dcd138 Fix crash before main on ppc/linux with static constructors. PR1771
llvm-svn: 43676
2007-11-04 17:32:10 +00:00
Chris Lattner 296160d443 Fix PR1763 by allowing the 'q' constraint to work with 64-bit
regs on x86-64.

llvm-svn: 43669
2007-11-04 06:51:12 +00:00
Evan Cheng 2b93a20b09 Unbreak tailcall opt.
llvm-svn: 43646
2007-11-02 17:45:40 +00:00
Chris Lattner 389d430c49 add a note
llvm-svn: 43642
2007-11-02 17:04:20 +00:00
Evan Cheng e453ff4913 Missing a getNumOperands check.
llvm-svn: 43630
2007-11-02 01:26:22 +00:00
Duncan Sands 44b8721de8 Executive summary: getTypeSize -> getTypeStoreSize / getABITypeSize.
The meaning of getTypeSize was not clear - clarifying it is important
now that we have x86 long double and arbitrary precision integers.
The issue with long double is that it requires 80 bits, and this is
not a multiple of its alignment.  This gives a primitive type for
which getTypeSize differed from getABITypeSize.  For arbitrary precision
integers it is even worse: there is the minimum number of bits needed to
hold the type (eg: 36 for an i36), the maximum number of bits that will
be overwriten when storing the type (40 bits for i36) and the ABI size
(i.e. the storage size rounded up to a multiple of the alignment; 64 bits
for i36).

This patch removes getTypeSize (not really - it is still there but
deprecated to allow for a gradual transition).  Instead there is:

(1) getTypeSizeInBits - a number of bits that suffices to hold all
values of the type.  For a primitive type, this is the minimum number
of bits.  For an i36 this is 36 bits.  For x86 long double it is 80.
This corresponds to gcc's TYPE_PRECISION.

(2) getTypeStoreSizeInBits - the maximum number of bits that is
written when storing the type (or read when reading it).  For an
i36 this is 40 bits, for an x86 long double it is 80 bits.  This
is the size alias analysis is interested in (getTypeStoreSize
returns the number of bytes).  There doesn't seem to be anything
corresponding to this in gcc.

(3) getABITypeSizeInBits - this is getTypeStoreSizeInBits rounded
up to a multiple of the alignment.  For an i36 this is 64, for an
x86 long double this is 96 or 128 depending on the OS.  This is the
spacing between consecutive elements when you form an array out of
this type (getABITypeSize returns the number of bytes).  This is
TYPE_SIZE in gcc.

Since successive elements in a SequentialType (arrays, pointers
and vectors) need to be aligned, the spacing between them will be
given by getABITypeSize.  This means that the size of an array
is the length times the getABITypeSize.  It also means that GEP
computations need to use getABITypeSize when computing offsets.
Furthermore, if an alloca allocates several elements at once then
these too need to be aligned, so the size of the alloca has to be
the number of elements multiplied by getABITypeSize.  Logically
speaking this doesn't have to be the case when allocating just
one element, but it is simpler to also use getABITypeSize in this
case.  So alloca's and mallocs should use getABITypeSize.  Finally,
since gcc's only notion of size is that given by getABITypeSize, if
you want to output assembler etc the same as gcc then getABITypeSize
is the size you want.

Since a store will overwrite no more than getTypeStoreSize bytes,
and a read will read no more than that many bytes, this is the
notion of size appropriate for alias analysis calculations.

In this patch I have corrected all type size uses except some of
those in ScalarReplAggregates, lib/Codegen, lib/Target (the hard
cases).  I will get around to auditing these too at some point,
but I could do with some help.

Finally, I made one change which I think wise but others might
consider pointless and suboptimal: in an unpacked struct the
amount of space allocated for a field is now given by the ABI
size rather than getTypeStoreSize.  I did this because every
other place that reserves memory for a type (eg: alloca) now
uses getABITypeSize, and I didn't want to make an exception
for unpacked structs, i.e. I did it to make things more uniform.
This only effects structs containing long doubles and arbitrary
precision integers.  If someone wants to pack these types more
tightly they can always use a packed struct.

llvm-svn: 43620
2007-11-01 20:53:16 +00:00
Bill Wendling b7cabbe295 Silence, accersed warning
llvm-svn: 43609
2007-11-01 08:51:44 +00:00
Rafael Espindola 419b6d7ce4 Make ARM and X86 LowerMEMCPY identical by moving the isThumb check into getMaxInlineSizeThreshold
and by restructuring the X86 version.

New I just have to move this to a common place :-)

llvm-svn: 43554
2007-10-31 14:39:58 +00:00
Rafael Espindola 063f177300 Make ARM an X86 memcpy expansion more similar to each other.
Now both subtarget define getMaxInlineSizeThreshold and the expansion uses it.

This should not change generated code.

llvm-svn: 43552
2007-10-31 11:52:06 +00:00
Dale Johannesen b066c1f216 Make i64=expand_vector_elt(v2i64) work in 32-bit mode.
llvm-svn: 43535
2007-10-31 00:32:36 +00:00
Dale Johannesen d50c8bcef6 Add missing SSE builtins: CVTPD2PI, CVTPS2PI,
CVTTPD2PI, CVTTPS2PI, CVTPI2PD, CVTPI2PS.

llvm-svn: 43523
2007-10-30 22:15:38 +00:00
Duncan Sands b508c53c63 Fix for visibility warnings generated by gcc-4.2.
llvm-svn: 43500
2007-10-30 13:14:37 +00:00
Dale Johannesen 6aa304e529 Add missing MMX PSUBQ.
llvm-svn: 43488
2007-10-30 01:18:38 +00:00
Evan Cheng e106e2f142 Enable more fold (sext (load x)) -> (sext (truncate (sextload x)))
transformation. Previously, it's restricted by ensuring the number of load uses
is one. Now the restriction is loosened up by allowing setcc uses to be
"extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq).

llvm-svn: 43465
2007-10-29 19:58:20 +00:00
Evan Cheng 7b3f7feaea Avoid doing something dumb like rewriting using a 64-bit iv in 32-bit mode.
llvm-svn: 43446
2007-10-29 07:57:50 +00:00
Chris Lattner 909a54ccd4 add a note.
llvm-svn: 43444
2007-10-29 06:19:48 +00:00
Chris Lattner 5e99fd8c0d Add support for the x86-64 'q' regigster modifier, and add support for the
b/h/w/k/q inline asm memory modifiers, which are just ignored.  This fixes
PR1748 and CodeGen/X86/2007-10-28-inlineasm-q-modifier.ll

llvm-svn: 43430
2007-10-29 03:09:07 +00:00
Chris Lattner 9a641510bd Fix PR1749 and InstCombine/2007-10-28-EmptyField.ll by handling
zero-length fields better.

llvm-svn: 43427
2007-10-29 02:40:02 +00:00
Evan Cheng c826ac533b New entry.
llvm-svn: 43420
2007-10-28 04:01:09 +00:00
Anton Korobeynikov d07d6a411c Fix off-by-one stack offset computations (dwarf information) for callee-saved
registers in case, when FP pointer was eliminated. This should fixes misc. random
EH-related crahses, when stuff is compiled with -fomit-frame-pointer.
Thanks Duncan for nailing this bug!

llvm-svn: 43381
2007-10-26 09:13:24 +00:00
Eric Christopher 18063916b5 clo/clz aren't supported on mips I. Keep them around for when we'll
want them later (mips32/64).

llvm-svn: 43380
2007-10-26 04:00:13 +00:00
Evan Cheng 7f3d02471d Loosen up iv reuse to allow reuse of the same stride but a larger type when truncating from the larger type to smaller type is free.
e.g.
Turns this loop:
LBB1_1: # entry.bb_crit_edge
        xorl    %ecx, %ecx
        xorw    %dx, %dx
        movw    %dx, %si
LBB1_2: # bb
        movl    L_X$non_lazy_ptr, %edi
        movw    %si, (%edi)
        movl    L_Y$non_lazy_ptr, %edi
        movw    %dx, (%edi)
		addw    $4, %dx
		incw    %si
		incl    %ecx
		cmpl    %eax, %ecx
		jne     LBB1_2  # bb
	
into

LBB1_1: # entry.bb_crit_edge
        xorl    %ecx, %ecx
        xorw    %dx, %dx
LBB1_2: # bb
        movl    L_X$non_lazy_ptr, %esi
        movw    %cx, (%esi)
        movl    L_Y$non_lazy_ptr, %esi
        movw    %dx, (%esi)
        addw    $4, %dx
		incl    %ecx
        cmpl    %eax, %ecx
        jne     LBB1_2  # bb

llvm-svn: 43375
2007-10-26 01:56:11 +00:00
Dale Johannesen 65e804a9c3 Support non-POSIX hosts by removing use of strncasecmp.
llvm-svn: 43364
2007-10-25 21:54:43 +00:00
Dale Johannesen 10f4152471 Disable a couple more things for ppcf128.
llvm-svn: 43267
2007-10-23 23:20:14 +00:00
Evan Cheng ec271b104c Temporary solution: added a different set of BCTRL_Macho / BCTRL_ELF with right callee-saved defs set for ppc64.
llvm-svn: 43248
2007-10-23 06:42:42 +00:00
Evan Cheng 1f2dd35898 Fix memcpy lowering when addresses are 4-byte aligned but size is not multiple of 4.
llvm-svn: 43234
2007-10-22 22:11:27 +00:00
Dan Gohman bf474959a3 Fix the folding of multiplication into addresses on x86, which was broken
by the recent {U,S}MUL_LOHI changes.

llvm-svn: 43230
2007-10-22 20:22:24 +00:00
Evan Cheng bdbed66333 Use ptr type in the immediate field of a BxA instruction so we don't end up selecting 32-bit call instruction for ppc64.
llvm-svn: 43228
2007-10-22 19:46:19 +00:00
Evan Cheng c92446af1f Fix an unfolding bug.
llvm-svn: 43212
2007-10-22 03:03:20 +00:00
Dale Johannesen 8ee70112ea Allow for copysign having f80 second argument.
Fixes 5550319.

llvm-svn: 43205
2007-10-21 01:07:44 +00:00
Evan Cheng 45e096c77e Resolve unfold tables ambiguity.
llvm-svn: 43194
2007-10-19 23:50:58 +00:00
Evan Cheng 35ff79370b Local spiller optimization:
Turn a store folding instruction into a load folding instruction. e.g.
     xorl  %edi, %eax
     movl  %eax, -32(%ebp)
     movl  -36(%ebp), %eax
     orl   %eax, -32(%ebp)
=>
     xorl  %edi, %eax
     orl   -36(%ebp), %eax
     mov   %eax, -32(%ebp)
This enables the unfolding optimization for a subsequent instruction which will
also eliminate the newly introduced store instruction.

llvm-svn: 43192
2007-10-19 21:23:22 +00:00
Rafael Espindola 18a831d783 split LowerMEMCPY into LowerMEMCPYCall and LowerMEMCPYInline in the ARM backend.
llvm-svn: 43176
2007-10-19 14:35:17 +00:00
Rafael Espindola 846c19dd70 Add support for byval function whose argument is not 32 bit aligned.
To do this it is necessary to add a "always inline" argument to the
memcpy node. For completeness I have also added this node to memmove
and memset.  I have also added getMem* functions, because the extra
argument makes it cumbersome to use getNode and because I get confused
by it :-)

llvm-svn: 43172
2007-10-19 10:41:11 +00:00
Chris Lattner b193576bc6 comment fixes
llvm-svn: 43168
2007-10-19 04:08:28 +00:00
Chris Lattner 5d979d57ae Add an easy microoptimization I noticed.
llvm-svn: 43164
2007-10-19 03:29:26 +00:00
Dale Johannesen 10432e5a67 More ppcf128 issues (maybe the last)?
llvm-svn: 43160
2007-10-19 00:59:18 +00:00
Evan Cheng 463e2ab0ac - Added getOpcodeAfterMemoryUnfold(). It doesn't unfold an instruction, but only returns the opcode of the instruction post unfolding.
- Fix some copy+paste bugs.

llvm-svn: 43153
2007-10-18 22:40:57 +00:00
Evan Cheng aa9a225699 Use SmallVectorImpl instead of SmallVector with hardcoded size in MRegister public interface.
llvm-svn: 43150
2007-10-18 21:29:24 +00:00
Christopher Lamb 79dfbed6f6 Fix a misnamed parameter.
llvm-svn: 43145
2007-10-18 19:29:45 +00:00
Christopher Lamb 7f68cf0d57 Fix a typo
llvm-svn: 43144
2007-10-18 19:28:55 +00:00
Gordon Henriksen ea31de8dc1 Work around downrev gccs which do not inherit visibility of the
Registry<>::iterator member class.

llvm-svn: 43122
2007-10-18 11:53:05 +00:00
Chris Lattner 84f3461c49 legalizing the ret operation on f64 shouldn't introduce a new
i64 bit convert needlessly.

llvm-svn: 43116
2007-10-18 06:17:07 +00:00
Gordon Henriksen ef5d08f4ea Switching TargetMachineRegistry to use the new generic Registry.
llvm-svn: 43094
2007-10-17 21:28:48 +00:00
Chris Lattner 12d5da49d3 Change fp to sint legalization on x86-32 to do 2 x i32
loads instead of 1 x i64 loads.  This doesn't change any functionality yet.

llvm-svn: 43068
2007-10-17 06:17:29 +00:00
Chris Lattner 693cbeadff fix some funny indentation, add comments.
llvm-svn: 43066
2007-10-17 06:02:13 +00:00
Dale Johannesen e5530a35d4 Check for invalid cc's in f80 select.
llvm-svn: 43033
2007-10-16 18:09:08 +00:00
Chris Lattner 1366653e2f Fix a bug handling frame references in ppc inline asm when the frame offset
doesn't fit into 16 bits.

llvm-svn: 43032
2007-10-16 18:00:18 +00:00
Arnold Schwaighofer b3d58b98d0 Correction to tail call optimization code. The new return address
was stored to the acutal stack slot before the parameters were
lowered to their stack slot. This could cause arguments to be
overwritten by the return address if the called function had less
parameters than the caller function. The update should remove the
last failing test case of llc-beta: SPASS.

llvm-svn: 43027
2007-10-16 09:05:00 +00:00
Chris Lattner 06a4954e6e Change LowerFP_TO_SINT to create the specific code it needs instead of
unconditionally creating an i64 bitcast.  With the future legalizer
design, operation legalization can't introduce new nodes with illegal
types.

This fixes the rest of olden on ppc32.

llvm-svn: 43005
2007-10-15 20:14:52 +00:00
Evan Cheng 7bcfd8f880 LowerFP_TO_SINT must not create a stack object if it's not needed.
llvm-svn: 43004
2007-10-15 20:11:21 +00:00
Dale Johannesen 207bd4d90e Handle PPC long double in CBackend.
llvm-svn: 42972
2007-10-15 01:05:37 +00:00
Evan Cheng 4099f4f91a Unbreak x86-64.
llvm-svn: 42962
2007-10-14 10:09:39 +00:00
Evan Cheng cdf3609130 Revert 42908 for now.
llvm-svn: 42960
2007-10-14 05:57:21 +00:00
Dale Johannesen 2f6b6d6fb0 Fix type mismatch error in PPC Altivec (only causes
a problem when asserts are on).  From vecLib.

llvm-svn: 42959
2007-10-14 01:58:32 +00:00
Duncan Sands 29af26f147 Clarify that fastcc has a problem with nested function
trampolines, rather than with nested functions themselves.

llvm-svn: 42955
2007-10-13 07:38:37 +00:00
Evan Cheng 7082dcf605 Change unfoldMemoryOperand(). User is now responsible for passing in the
register used by the unfolded instructions. User can also specify whether to
unfold the load, the store, or both.

llvm-svn: 42946
2007-10-13 02:35:06 +00:00
Arnold Schwaighofer e8d0bf2669 Correcting the corrections. Bad bad baaad emacs!
llvm-svn: 42935
2007-10-12 21:53:12 +00:00
Arnold Schwaighofer 1f0da1fefb Corrected many typing errors. And removed 'nest' parameter handling
for fastcc from X86CallingConv.td.  This means that nested functions
are not supported for calling convention 'fastcc'.

llvm-svn: 42934
2007-10-12 21:30:57 +00:00
Duncan Sands a6286bd502 Due to the new tail call optimization, trampolines can no
longer be created for fastcc functions.

llvm-svn: 42925
2007-10-12 19:37:31 +00:00
Evan Cheng 409fa443fc Update.
llvm-svn: 42922
2007-10-12 18:22:55 +00:00
Dan Gohman dc35bd79ca Change the names used for internal labels to use the current
function symbol name instead of a codegen-assigned function
number.

Thanks Evan! :-)

llvm-svn: 42908
2007-10-12 14:53:36 +00:00
Dan Gohman 8d978da3b0 Mark vector ctpop, cttz, and ctlz as Expand on x86.
llvm-svn: 42905
2007-10-12 14:09:42 +00:00