4656 Commits

Author SHA1 Message Date
Gordon Henriksen 2d684b1fbf Amending r45669 with a missing file.
llvm-svn: 45671
2008-01-07 01:33:09 +00:00
Gordon Henriksen 6047b6e140 With this patch, the LowerGC transformation becomes the
ShadowStackCollector, which additionally has reduced overhead with
no sacrifice in portability.

Considering a function @fun with 8 loop-local roots,
ShadowStackCollector introduces the following overhead
(x86):

; shadowstack prologue
        movl    L_llvm_gc_root_chain$non_lazy_ptr, %eax
        movl    (%eax), %ecx
        movl    $___gc_fun, 20(%esp)
        movl    $0, 24(%esp)
        movl    $0, 28(%esp)
        movl    $0, 32(%esp)
        movl    $0, 36(%esp)
        movl    $0, 40(%esp)
        movl    $0, 44(%esp)
        movl    $0, 48(%esp)
        movl    $0, 52(%esp)
        movl    %ecx, 16(%esp)
        leal    16(%esp), %ecx
        movl    %ecx, (%eax)

; shadowstack loop overhead
        (none)

; shadowstack epilogue
        movl    48(%esp), %edx
        movl    %edx, (%ecx)

; shadowstack metadata
        .align  3
___gc_fun:                              # __gc_fun
        .long   8
        .space  4

In comparison to LowerGC:

; lowergc prologue
        movl    L_llvm_gc_root_chain$non_lazy_ptr, %eax
        movl    (%eax), %ecx
        movl    %ecx, 48(%esp)
        movl    $8, 52(%esp)
        movl    $0, 60(%esp)
        movl    $0, 56(%esp)
        movl    $0, 68(%esp)
        movl    $0, 64(%esp)
        movl    $0, 76(%esp)
        movl    $0, 72(%esp)
        movl    $0, 84(%esp)
        movl    $0, 80(%esp)
        movl    $0, 92(%esp)
        movl    $0, 88(%esp)
        movl    $0, 100(%esp)
        movl    $0, 96(%esp)
        movl    $0, 108(%esp)
        movl    $0, 104(%esp)
        movl    $0, 116(%esp)
        movl    $0, 112(%esp)

; lowergc loop overhead
        leal    44(%esp), %eax
        movl    %eax, 56(%esp)
        leal    40(%esp), %eax
        movl    %eax, 64(%esp)
        leal    36(%esp), %eax
        movl    %eax, 72(%esp)
        leal    32(%esp), %eax
        movl    %eax, 80(%esp)
        leal    28(%esp), %eax
        movl    %eax, 88(%esp)
        leal    24(%esp), %eax
        movl    %eax, 96(%esp)
        leal    20(%esp), %eax
        movl    %eax, 104(%esp)
        leal    16(%esp), %eax
        movl    %eax, 112(%esp)

; lowergc epilogue
        movl    48(%esp), %edx
        movl    %edx, (%ecx)

; lowergc metadata
        (none)

llvm-svn: 45670
2008-01-07 01:30:53 +00:00
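
The prologue and metadata in commit 6047b6e140 above suggest a frame made of a link to the previous frame, a pointer to per-function metadata, and the root slots. A minimal C++ sketch of that layout follows. The names `FrameMap`, `StackEntry`, `ShadowFrame`, `gcRootChain`, and `funMap` are assumptions made for illustration, not taken from the patch, and the collector's real data structures may differ.

```cpp
#include <cassert>
#include <cstdint>

// Per-function metadata, mirroring the ".long 8" / ".space 4" pair above:
// a root count followed by one zeroed 32-bit word (hypothetical field names).
struct FrameMap {
  int32_t NumRoots;
  int32_t NumMeta;
};

// One shadow-stack frame: link, metadata pointer, then the root slots.
struct StackEntry {
  StackEntry *Next;
  const FrameMap *Map;
  void *Roots[8];
};

// Stand-in for the global chain head (llvm_gc_root_chain in the assembly).
StackEntry *gcRootChain = nullptr;

// Metadata for a function with 8 roots and no extra metadata words.
const FrameMap funMap = {8, 0};

// RAII guard: the constructor plays the role of the prologue (zero the
// roots, link the frame in), the destructor plays the role of the epilogue.
class ShadowFrame {
  StackEntry Entry;

public:
  ShadowFrame() : Entry{gcRootChain, &funMap, {}} { gcRootChain = &Entry; }
  ~ShadowFrame() { gcRootChain = Entry.Next; }
  ShadowFrame(const ShadowFrame &) = delete;
  ShadowFrame &operator=(const ShadowFrame &) = delete;

  void *&root(int i) {
    assert(i >= 0 && i < 8);
    return Entry.Roots[i];
  }
};

// What a collector might do: walk the chain and count non-null roots.
int countLiveRoots() {
  int n = 0;
  for (StackEntry *e = gcRootChain; e; e = e->Next)
    for (int32_t i = 0; i < e->Map->NumRoots; ++i)
      if (e->Roots[i])
        ++n;
  return n;
}

// Registers one root inside a frame and reports what the walk sees.
int liveRootsInDemoFrame() {
  ShadowFrame frame;
  int x = 0;
  frame.root(3) = &x;
  return countLiveRoots();
}
```

Because the root slots live directly in the frame, nothing has to be re-registered inside a loop, which matches the "(none)" loop overhead reported for the shadow stack.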
Gordon Henriksen 5180e85675 Enabling the target-independent garbage collection infrastructure by hooking it
up to the various compiler pipelines.

This doesn't actually add support for any GC algorithms, which means it 
temporarily breaks a few tests. To be fixed shortly.

llvm-svn: 45669
2008-01-07 01:30:38 +00:00
Chris Lattner a4ce4f6987 rename isLoad -> isSimpleLoad due to evan's desire to have such a predicate.
llvm-svn: 45667
2008-01-06 23:38:27 +00:00
Chris Lattner 10324d0175 rename isStore -> mayStore to more accurately reflect what it captures.
llvm-svn: 45656
2008-01-06 08:36:04 +00:00
Duncan Sands 1694a53c5d Remove an unused variable.
llvm-svn: 45655
2008-01-06 07:43:13 +00:00
Chris Lattner 7eac714b41 make this build with newer gcc's
llvm-svn: 45637
2008-01-05 23:29:51 +00:00
Nate Begeman 5743da502e If custom lowering of insert element fails, the result Val will be 0.
Don't overwrite a variable used by the fallthrough code path in this
case.

llvm-svn: 45630
2008-01-05 20:47:37 +00:00
Chris Lattner 647e61a42b Fix build issue on certain compilers.
llvm-svn: 45629
2008-01-05 20:15:42 +00:00
Chris Lattner ee61d14bf6 The current impl is really trivial, add some comments about how it can be made better.
llvm-svn: 45625
2008-01-05 06:47:58 +00:00
Chris Lattner 276178e49f allow sinking to be enabled for the jit
llvm-svn: 45624
2008-01-05 06:14:16 +00:00
Chris Lattner d11ca169e7 don't sink anything with side effects, this makes lots of stuff work, but sinks almost nothing.
llvm-svn: 45617
2008-01-05 02:33:22 +00:00
Chris Lattner 6ec78274df fix a common crash.
llvm-svn: 45614
2008-01-05 01:39:17 +00:00
Owen Anderson 3592b2352d I should not be allowed to commit when sleepy.
llvm-svn: 45608
2008-01-05 00:48:55 +00:00
Bill Wendling 0c209430b4 Don't recalculate the loop info and loop dominators analyses if they're
preserved.

llvm-svn: 45596
2008-01-04 20:54:55 +00:00
Bill Wendling 118ae4cd61 80-column violations.
llvm-svn: 45574
2008-01-04 08:59:18 +00:00
Bill Wendling 3bf5603ce4 Add that this preserves some analyses.
llvm-svn: 45573
2008-01-04 08:48:49 +00:00
Bill Wendling 66470d02c3 Move option to enable machine LICM into LLVMTargetMachine.cpp.
llvm-svn: 45572
2008-01-04 08:11:03 +00:00
Bill Wendling d865697016 Call the parent's getAnalysisUsage.
llvm-svn: 45571
2008-01-04 07:50:05 +00:00
Chris Lattner f3edc09f9b Add a really quick hack at a machine code sinking pass, enabled with --enable-sinking.
It is missing validity checks, so it is known broken.  However, it is powerful enough
to compile this contrived code:

void test1(int C, double A, double B, double *P) {
  double Tmp = A*A+B*B;
  *P = C ? Tmp : A;
}

into:

_test1:
	movsd	8(%esp), %xmm0
	cmpl	$0, 4(%esp)
	je	LBB1_2	# entry
LBB1_1:	# entry
	movsd	16(%esp), %xmm1
	mulsd	%xmm1, %xmm1
	mulsd	%xmm0, %xmm0
	addsd	%xmm1, %xmm0
LBB1_2:	# entry
	movl	24(%esp), %eax
	movsd	%xmm0, (%eax)
	ret

instead of:

_test1:
	movsd	16(%esp), %xmm0
	mulsd	%xmm0, %xmm0
	movsd	8(%esp), %xmm1
	movapd	%xmm1, %xmm2
	mulsd	%xmm2, %xmm2
	addsd	%xmm0, %xmm2
	cmpl	$0, 4(%esp)
	je	LBB1_2	# entry
LBB1_1:	# entry
	movapd	%xmm2, %xmm1
LBB1_2:	# entry
	movl	24(%esp), %eax
	movsd	%xmm1, (%eax)
	ret

woo.

llvm-svn: 45570
2008-01-04 07:36:53 +00:00
Chris Lattner b5c1d9b7da remove dead #includes and reorder the rest.
llvm-svn: 45569
2008-01-04 06:41:45 +00:00
Bill Wendling 0ba4184404 Use the correct MachineRegisterInfo object.
llvm-svn: 45499
2008-01-02 21:10:54 +00:00
Bill Wendling f0b37780ca Remove dead code.
llvm-svn: 45496
2008-01-02 20:47:37 +00:00
Bill Wendling 5da1945cdd Use the new architecture to get the containing machine basic block for a machine
instruction. Also, use "splice" to move the new instruction instead of
remove/insert (where it was leaking memory anyway).

llvm-svn: 45492
2008-01-02 19:32:43 +00:00
Owen Anderson eee14601b1 Move some more instruction creation methods from RegisterInfo into InstrInfo.
llvm-svn: 45484
2008-01-01 21:11:32 +00:00
Chris Lattner caaf8aae4d Make MachineRegisterInfo::getVRegDef more efficient by aiming to keep the def of the vreg at the start of the list, so the list doesn't need to be traversed.
llvm-svn: 45483
2008-01-01 21:08:22 +00:00
Chris Lattner 0cb9dd7aa2 switch the register iterator to act more like the LLVM value iterator: dereferencing
it now returns the machineinstr of the use.  To get the operand, use I.getOperand().

Add a new MachineRegisterInfo::replaceRegWith, which is basically like
Value::replaceAllUsesWith.

llvm-svn: 45482
2008-01-01 20:36:19 +00:00
Chris Lattner 39204d76c5 Add a trivial but handy function to efficiently return the machine
instruction that defines the specified vreg.  Crazy.

llvm-svn: 45480
2008-01-01 03:07:29 +00:00
Chris Lattner 961e7427ea Implement automatically updated def/use lists for all MachineInstr register
operands.  The lists are currently kept in MachineRegisterInfo, but it does
not yet provide an iterator interface to them.

llvm-svn: 45477
2008-01-01 01:12:31 +00:00
Chris Lattner 25568e4cef Fix a problem where lib/Target/TargetInstrInfo.h would include and use
a header file from libcodegen.  This violates a layering order: codegen
depends on target, not the other way around.  The fix to this is to 
split TII into two classes, TII and TargetInstrInfoImpl, which defines
stuff that depends on libcodegen.  It is defined in libcodegen, where 
the base is not.

llvm-svn: 45475
2008-01-01 01:03:04 +00:00
Duncan Sands 57a60f0466 Fix PR1833 - eh.exception and eh.selector return two
values, which means doing extra legalization work.
It would be easier to get this kind of thing right if
there was some documentation...

llvm-svn: 45472
2007-12-31 18:35:50 +00:00
Owen Anderson 7a73ae9a86 Move copyRegToReg from MRegisterInfo to TargetInstrInfo. This is part of the
Machine-level API cleanup instigated by Chris.

llvm-svn: 45470
2007-12-31 06:32:00 +00:00
Chris Lattner 574e7166e0 properly encapsulate the parent field of MBB and MI with get/set accessors.
llvm-svn: 45469
2007-12-31 04:56:33 +00:00
Chris Lattner 21ec2b4769 update a couple of references to SSARegMap.
llvm-svn: 45468
2007-12-31 04:16:08 +00:00
Chris Lattner a10fff51d9 Rename SSARegMap -> MachineRegisterInfo in keeping with the idea
that "machine" classes are used to represent the current state of
the code being compiled.  Given this expanded name, we can start 
moving other stuff into it.  For now, move the UsedPhysRegs and
LiveIn/LiveOuts vectors from MachineFunction into it.

Update all the clients to match.

This also reduces some needless #includes, such as MachineModuleInfo
from MachineFunction.

llvm-svn: 45467
2007-12-31 04:13:23 +00:00
Chris Lattner a5bb370aa4 Add new shorter predicates for testing machine operands for various types:
e.g. MO.isMBB() instead of MO.isMachineBasicBlock().  I don't plan on 
switching everything over, so new clients should just start using the 
shorter names.

Remove old long accessors, switching everything over to use the short
accessor: getMachineBasicBlock() -> getMBB(), 
getConstantPoolIndex() -> getIndex(), setMachineBasicBlock -> setMBB(), etc.

llvm-svn: 45464
2007-12-30 23:10:15 +00:00
Chris Lattner 6005589faf More cleanups for MachineOperand:
  - Eliminate the static "print" method for operands, moving it
    into MachineOperand::print.
  - Change various set* methods for register flags to take a bool
    for the value to set it to.  Remove unset* methods.
  - Group methods more logically by operand flavor in MachineOperand.h

llvm-svn: 45461
2007-12-30 21:56:09 +00:00
Chris Lattner c98c0e57eb MachineOperand:
  - Add getParent() accessors.
  - Move SubReg out of the AuxInfo union, to make way for future changes.
  - Remove the getImmedValue/setImmedValue methods.
  - in some MachineOperand::Create* methods, stop initializing fields that are dead.

MachineInstr:
  - Delete one copy of the MachineInstr printing code, now there is only one dump
    format and one copy of the code.
  - Make MachineOperand use the parent field to get info about preg register names if
    no target info is otherwise available.
  - Move def/use/kill/dead flag printing to the machineoperand printer, so they are
    always printed for an operand.

llvm-svn: 45460
2007-12-30 21:31:53 +00:00
Chris Lattner 96317d2412 fix typo duncan noticed!
llvm-svn: 45459
2007-12-30 21:21:10 +00:00
Chris Lattner 35fececec9 simplify some register printing code.
llvm-svn: 45458
2007-12-30 21:08:36 +00:00
Chris Lattner 383a873a9a eliminate a copy of the machineoperand printing stuff. Keep the copy that
knows how to print offsets.

llvm-svn: 45457
2007-12-30 21:03:30 +00:00
Chris Lattner 49bd29daa0 Simplify and clean up some machine operand/instr printing/dumping stuff.
llvm-svn: 45456
2007-12-30 21:01:27 +00:00
Chris Lattner 0dad74d252 two register machineoperands are not identical unless their subregs match.
llvm-svn: 45455
2007-12-30 20:55:08 +00:00
Chris Lattner 81798417dc MachineOperand::getImmedValue -> MachineOperand::getImm
llvm-svn: 45454
2007-12-30 20:50:28 +00:00
Chris Lattner 3c6ce5b43c make machine operands fatter: give each one an up-pointer to the
machineinstr that owns it.

llvm-svn: 45449
2007-12-30 06:11:04 +00:00
Chris Lattner 20421fe936 use simplified operand addition methods.
llvm-svn: 45436
2007-12-30 00:57:42 +00:00
Chris Lattner bbbae8e1ce use simplified operand addition methods.
llvm-svn: 45435
2007-12-30 00:51:11 +00:00
Chris Lattner e35dfb827f Start using the simplified methods for adding operands.
llvm-svn: 45432
2007-12-30 00:41:17 +00:00
Chris Lattner c288ff1d78 simplify some code by factoring operand construction better.
llvm-svn: 45428
2007-12-30 00:12:25 +00:00
Chris Lattner f3ebc3f3d2 Remove attribution from file headers, per discussion on llvmdev.
llvm-svn: 45418
2007-12-29 20:36:04 +00:00
Chris Lattner a087a8d2ce remove attribution from lib Makefiles.
llvm-svn: 45415
2007-12-29 20:09:26 +00:00
Chris Lattner 3b6a82118b Fold comparisons against a constant nan, and optimize ORD/UNORD
comparisons with a constant.  This allows us to compile isnan to:

_foo:
	fcmpu cr7, f1, f1
	mfcr r2
	rlwinm r3, r2, 0, 31, 31
	blr 

instead of:

LCPI1_0:					;  float
	.space	4
_foo:
	lis r2, ha16(LCPI1_0)
	lfs f0, lo16(LCPI1_0)(r2)
	fcmpu cr7, f1, f0
	mfcr r2
	rlwinm r3, r2, 0, 31, 31
	blr 

llvm-svn: 45405
2007-12-29 08:37:08 +00:00
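
The isnan change in commit 3b6a82118b above rests on the fact that "x is unordered with respect to a non-NaN constant" and "x is unordered with respect to itself" are the same test, so the constant-pool load of zero can go away. A small C++ sketch of that equivalence follows; the function names are hypothetical, and the sketch only illustrates the identity at the source level, not the DAG combine itself.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// isnan phrased as an unordered comparison against a constant zero,
// corresponding to the "instead of" sequence that loads LCPI1_0.
bool isNanViaConstant(float x) { return std::isunordered(x, 0.0f); }

// isnan phrased as an unordered comparison of x with itself,
// corresponding to "fcmpu cr7, f1, f1".
bool isNanViaSelf(float x) { return std::isunordered(x, x); }

// True if both phrasings give the same answer for x.
bool formsAgree(float x) { return isNanViaConstant(x) == isNanViaSelf(x); }

// Checks a handful of representative inputs, including NaN and infinities.
void checkSamples() {
  const float samples[] = {0.0f,
                           -0.0f,
                           1.5f,
                           -3.25f,
                           std::numeric_limits<float>::infinity(),
                           -std::numeric_limits<float>::infinity(),
                           std::numeric_limits<float>::quiet_NaN()};
  for (float s : samples)
    assert(formsAgree(s));
}
```

Note that builds using aggressive floating-point flags may assume NaNs never occur, in which case neither form is reliable.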
Chris Lattner 2de9b85297 make sure not to zap volatile stores, thanks a lot to Dale for noticing this!
llvm-svn: 45402
2007-12-29 07:15:45 +00:00
Chris Lattner 5919b48fe9 don't fold fp_round(fp_extend(load)) -> fp_round(extload)
llvm-svn: 45400
2007-12-29 06:55:23 +00:00
Chris Lattner 3f9c6a7260 Delete a store whose input is a load from the same pointer:
  x = load p
  store x -> p

llvm-svn: 45398
2007-12-29 06:26:16 +00:00
Owen Anderson bccb8c432d Flesh out the Briggs implementation a little bit more, fix a few FIXMEs.
llvm-svn: 45347
2007-12-24 22:12:23 +00:00
Owen Anderson e110199916 Sketch out an implementation of Briggs' copy placement algorithm.
llvm-svn: 45334
2007-12-23 15:37:26 +00:00
Chris Lattner de272b1b63 initial code for forming an FGETSIGN node. This is disabled until
legalizer support goes in.

llvm-svn: 45323
2007-12-22 21:35:38 +00:00
Chris Lattner afc8f13bf5 improve support for fgetsign
llvm-svn: 45322
2007-12-22 21:26:52 +00:00
Chris Lattner efd1cddb5a Tell TargetLoweringOpt whether it is running before
or after legalize.

llvm-svn: 45321
2007-12-22 20:56:36 +00:00
Chris Lattner 843cad4df2 Add a new FGETSIGN operation, which defaults to expand on all
targets.

llvm-svn: 45320
2007-12-22 20:47:56 +00:00
Gordon Henriksen 41689b52ab Use getIntrinsicID instead of looking up intrinsic prototypes. Also
fixes a bug with indirect calls. (Test case will be included with
ocaml collector patch.)

llvm-svn: 45316
2007-12-22 17:27:01 +00:00
Owen Anderson 5a4c05d047 Note what still needs doing.
llvm-svn: 45310
2007-12-22 04:59:10 +00:00
Owen Anderson 4534100765 Remove critical edge breaking. It won't be necessary as long as we are very careful when inserting copies.
llvm-svn: 45309
2007-12-22 04:50:11 +00:00
Evan Cheng f989141d30 More accurate checks for two-address constraints.
llvm-svn: 45259
2007-12-20 09:25:31 +00:00
Evan Cheng a509537e25 The physical register + virtual register joining requirement was much too strict.
llvm-svn: 45253
2007-12-20 02:23:25 +00:00
Evan Cheng 61bc51ee97 Bring back a burr scheduling heuristic that's still needed.
llvm-svn: 45252
2007-12-20 02:22:36 +00:00
Bill Wendling 65c001e6bc Updated comments to reflect what "side effects" means in this situation.
llvm-svn: 45245
2007-12-20 01:08:10 +00:00
Duncan Sands e9d8861cdf Simplify LowerCallTo by using a callsite.
llvm-svn: 45198
2007-12-19 09:48:52 +00:00
Duncan Sands 030bce7b83 The C++ exception handling personality function wants
to know about calls that cannot throw ('nounwind'):
if such a call does throw for some reason then the
personality will terminate the program.  The distinction
between an ordinary call and a nounwind call is that
an ordinary call gets an entry in the exception table
but a nounwind call does not.  This patch sets up the
exception table appropriately.  One oddity is that
I've chosen to bracket nounwind calls with labels (like
invokes) - the other choice would have been to bracket
ordinary calls with labels.  While bracketing
ordinary calls is more natural (because bracketing
by labels would then correspond exactly to getting an
entry in the exception table), I didn't do it because
introducing labels impedes some optimizations and I'm
guessing that ordinary calls occur more often than
nounwind calls.  This fixes the gcc filter2 eh test,
at least at -O0 (the inliner needs some tweaking at
higher optimization levels).

llvm-svn: 45197
2007-12-19 07:36:31 +00:00
Evan Cheng 9f06e5e2df Don't leave newly created nodes around if it turns out they are not needed.
llvm-svn: 45186
2007-12-19 01:34:38 +00:00
Bill Wendling 166f746246 Add debugging info. Use the newly created "hasUnmodelledSideEffects" method.
llvm-svn: 45178
2007-12-18 21:38:04 +00:00
Anton Korobeynikov 95cc3e0e66 Support more insane CEP's in AsmPrinter (Yes, PyPy folks do really use them).
llvm-svn: 45172
2007-12-18 20:53:41 +00:00
Evan Cheng 483a969ece Fix PR1872: SrcValue and SrcValueOffset should not be used to compute load / store node id.
llvm-svn: 45167
2007-12-18 19:38:14 +00:00
Evan Cheng 78ced47a2f Also print alignment and volatileness.
llvm-svn: 45164
2007-12-18 19:06:30 +00:00
Evan Cheng 91e0fc9cb4 FIX for PR1799: When a load is unfolded from an instruction, check if it is a new node. If not, do not create a new SUnit.
llvm-svn: 45157
2007-12-18 08:42:10 +00:00
Evan Cheng e2dbba5828 SelectionDAG::dump() should print SrcValue of LoadSDNode and StoreSDNode.
llvm-svn: 45151
2007-12-18 07:02:08 +00:00
Duncan Sands b5a79d0eaa Make invokes of inline asm legal. Teach codegen
how to lower them (with no attempt made to be
efficient, since they should only occur for
unoptimized code).

llvm-svn: 45108
2007-12-17 18:08:19 +00:00
Christopher Lamb edf0788758 Change the PointerType api for creating pointer types. The old functionality of PointerType::get() has become PointerType::getUnqual(), which returns a pointer in the generic address space. The new prototype of PointerType::get() requires both a type and an address space.
llvm-svn: 45082
2007-12-17 01:12:55 +00:00
Owen Anderson 7b8a741189 Break local interferences in StrongPHIElimination. One step closer...
llvm-svn: 45070
2007-12-16 05:44:27 +00:00
Owen Anderson ccb3981256 A few more comments.
llvm-svn: 45069
2007-12-16 04:07:23 +00:00
Dan Gohman 8a332b235d Add explicit keywords, and fix a minor typo that they uncovered.
llvm-svn: 45034
2007-12-14 15:41:34 +00:00
Evan Cheng 0fcf56f8f5 Bug fix. Must also match ResNo when matching an operand with a user.
llvm-svn: 45028
2007-12-14 08:25:15 +00:00
Owen Anderson 53b677e4e8 Add register pairs to the list to check for local interferences.
llvm-svn: 44987
2007-12-13 05:53:03 +00:00
Owen Anderson 1f93edd08a Remove ugly and horrible code. It's not necessary for correctness, and can be added back later if it causes code quality issues.
llvm-svn: 44986
2007-12-13 05:43:37 +00:00
Evan Cheng 6e68381e02 Implicit def instructions, e.g. X86::IMPLICIT_DEF_GR32, are always re-materializable and they should not be spilled.
llvm-svn: 44960
2007-12-12 23:12:09 +00:00
Dan Gohman 7a7742c2fe Allow vector integer constants to be created with
SelectionDAG::getConstant, in the same way as vector floating-point
constants. This allows the legalize expansion code for @llvm.ctpop and
friends to be usable with vector types.

llvm-svn: 44954
2007-12-12 22:21:26 +00:00
Owen Anderson 499e5bffcf Forgot to remove a register from the PHI-union after I'd determined that it
interfered with other registers.  Seems like that might be a good thing to do. :-)

llvm-svn: 44902
2007-12-12 01:25:08 +00:00
Evan Cheng 6766d2fa4f If deleting a reload instruction due to reuse (value is available in register R and reload is targeting R), make sure to invalidate the kill information of the last kill.
llvm-svn: 44894
2007-12-11 23:36:57 +00:00
Bill Wendling 38236ef6cb Need to grow the indexed map. Added debug statements.
llvm-svn: 44892
2007-12-11 23:27:51 +00:00
Bill Wendling 642e15a7cb Simplify slightly.
llvm-svn: 44881
2007-12-11 22:22:22 +00:00
Owen Anderson f24dd1c1eb More progress on StrongPHIElimination. Now we actually USE the DomForest!
llvm-svn: 44877
2007-12-11 20:12:11 +00:00
Bill Wendling b678ae7c38 Blark! How in the world did this work without this?!
llvm-svn: 44874
2007-12-11 19:40:06 +00:00
Bill Wendling 7717a8a37d - Update the virtual reg to machine instruction map when hoisting.
- Fix subtle bug when initially creating this map.

llvm-svn: 44873
2007-12-11 19:17:04 +00:00
Bill Wendling 5143d898c8 Checking for "zero operands" during the "CanHoistInst()" method isn't necessary
because those with side effects will be caught by other checks in here.

Also, simplify the check for a BB in a sub loop.

llvm-svn: 44871
2007-12-11 18:45:11 +00:00
Evan Cheng 303417d242 Switch over to MachineLoopInfo.
llvm-svn: 44838
2007-12-11 02:09:15 +00:00
Evan Cheng f54030231e Pretty print shuffle mask operand.
llvm-svn: 44837
2007-12-11 02:08:35 +00:00
Gordon Henriksen 7843c16f31 CollectorMetadata and Collector are rejiggered to get along with
per-function collector model. Collector is now the factory for
CollectorMetadata, so the latter may be subclassed.

llvm-svn: 44827
2007-12-11 00:30:17 +00:00
Owen Anderson ba61806ef1 A little more progress on StrongPHIElimination, now that I have a better sense of
how the CodeGen machinery works.

llvm-svn: 44786
2007-12-10 08:07:09 +00:00
Christopher Lamb d202e03fe5 Improve branch folding by recognizing that explicit successor relationships impact the value of fall-through choices.
llvm-svn: 44785
2007-12-10 07:24:06 +00:00
Chris Lattner 64443973c0 Duncan points out that the subtraction is unneeded since the code
knows the vector is not pow2

llvm-svn: 44740
2007-12-09 17:56:34 +00:00
Chris Lattner 69d3298777 Add support for splitting the operand of a return instruction.
llvm-svn: 44728
2007-12-09 00:06:19 +00:00
Bill Wendling 3f19dfe794 Reverting 44702. It wasn't correct to rename them.
llvm-svn: 44727
2007-12-08 23:58:46 +00:00
Chris Lattner e48fc80446 add many new cases to SplitResult. SplitResult now handles all the cases that LegalizeDAG does.
llvm-svn: 44726
2007-12-08 23:58:27 +00:00
Chris Lattner de9046af54 Implement splitting support for store, allowing us to compile:
%f8 = type <8 x float>

define void @test_f8(%f8* %P, %f8* %Q, %f8* %S) {
	%p = load %f8* %P		; <%f8> [#uses=1]
	%q = load %f8* %Q		; <%f8> [#uses=1]
	%R = add %f8 %p, %q		; <%f8> [#uses=1]
	store %f8 %R, %f8* %S
	ret void
}

into:

_test_f8:
	movaps	16(%rdi), %xmm0
	addps	16(%rsi), %xmm0
	movaps	(%rdi), %xmm1
	addps	(%rsi), %xmm1
	movaps	%xmm0, 16(%rdx)
	movaps	%xmm1, (%rdx)
	ret

llvm-svn: 44725
2007-12-08 23:24:26 +00:00
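
The splitting in commit de9046af54 above turns one 8-wide operation into two 4-wide ones, which is why the generated code contains two `addps` and two `movaps` stores. A rough C++ sketch of the same idea on plain arrays follows; the types `F4`/`F8` and the helper names are assumptions for illustration and do not correspond to anything in the legalizer.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

using F4 = std::array<float, 4>; // stands in for a legal 4 x float vector
using F8 = std::array<float, 8>; // stands in for the illegal <8 x float>

// The operation the target supports directly: a 4-wide add.
F4 add4(const F4 &a, const F4 &b) {
  F4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = a[i] + b[i];
  return r;
}

// Extracts the low (h == 0) or high (h == 1) half of an 8-wide value.
F4 half(const F8 &v, std::size_t h) {
  assert(h < 2);
  F4 r{};
  for (std::size_t i = 0; i < 4; ++i)
    r[i] = v[4 * h + i];
  return r;
}

// The 8-wide add expressed as two independent 4-wide adds whose results
// are written back to the low and high halves of the destination.
F8 add8Split(const F8 &p, const F8 &q) {
  const F4 lo = add4(half(p, 0), half(q, 0));
  const F4 hi = add4(half(p, 1), half(q, 1));
  F8 r{};
  for (std::size_t i = 0; i < 4; ++i) {
    r[i] = lo[i];
    r[4 + i] = hi[i];
  }
  return r;
}
```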
Chris Lattner de87224cd9 implement vector splitting of load, undef, and binops.
llvm-svn: 44724
2007-12-08 23:08:49 +00:00
Chris Lattner 1ef437d4e1 implement some methods.
llvm-svn: 44723
2007-12-08 22:40:18 +00:00
Chris Lattner a5e7db115e add scaffolding for splitting of vectors.
llvm-svn: 44722
2007-12-08 22:37:41 +00:00
Chris Lattner 8c8eaf6b92 reorganize header to separate into functional blocks.
llvm-svn: 44719
2007-12-08 21:59:32 +00:00
Chris Lattner 4063bd6eae split scalarization out to its own file.
llvm-svn: 44718
2007-12-08 20:30:28 +00:00
Chris Lattner 5c7c46baaf Split expansion out into its own file.
llvm-svn: 44717
2007-12-08 20:27:32 +00:00
Chris Lattner 029c816460 Split promotion support out to its own file.
llvm-svn: 44716
2007-12-08 20:24:38 +00:00
Chris Lattner 757d4beba9 Rename LegalizeDAGTypes.cpp -> LegalizeTypes.cpp
llvm-svn: 44715
2007-12-08 20:17:13 +00:00
Chris Lattner 92288147b6 Split the class definition of DAGTypeLegalizer out into a header.
Leave it visibility hidden, but not in an anon namespace.

llvm-svn: 44714
2007-12-08 20:16:06 +00:00
Bill Wendling 2b07d8c5a0 Renaming:
  isTriviallyReMaterializable -> hasNoSideEffects
  isReallyTriviallyReMaterializable -> isTriviallyReMaterializable

llvm-svn: 44702
2007-12-08 07:17:56 +00:00
Bill Wendling 4375173ba0 Incorporated comments from Evan and Chris:
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20071203/056043.html
http://lists.cs.uiuc.edu/pipermail/llvm-commits/Week-of-Mon-20071203/056048.html

llvm-svn: 44696
2007-12-08 01:47:01 +00:00
Bill Wendling fb706bc52b Initial commit of the machine code LICM pass. It successfully hoists this:
_foo:
        li r2, 0
LBB1_1: ; bb
        li r5, 0
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmplw cr0, r2, r4
        bne cr0, LBB1_1 ; bb
LBB1_2: ; return
        blr 

to:

_foo:
        li r2, 0
        li r5, 0
LBB1_1: ; bb
        stw r5, 0(r3)
        addi r2, r2, 1
        addi r3, r3, 4
        cmplw cr0, r2, r4
        bne cr0, LBB1_1 ; bb
LBB1_2: ; return
        blr

ZOMG!! :-)

Moar to come...

llvm-svn: 44687
2007-12-07 21:42:31 +00:00
Evan Cheng 85cdba29b0 Add an option to control this heuristic tweak so I can test it.
llvm-svn: 44671
2007-12-07 00:28:32 +00:00
Dale Johannesen 5eff4de9c8 Redo previous patch so optimization only done for i1.
Simpler and safer.

llvm-svn: 44663
2007-12-06 17:53:31 +00:00
Evan Cheng 8393dc7378 Turning simple splitting on. Start testing new coalescer heuristics as new llcbeta.
llvm-svn: 44660
2007-12-06 08:54:31 +00:00
Chris Lattner eedaf92fcf third time around: instead of disabling this completely,
only disable it if we don't know it will be obviously profitable.
Still fixme, but less so. :)

llvm-svn: 44658
2007-12-06 07:47:55 +00:00
Chris Lattner b5fdfb9612 Actually, disable this code for now. More analysis and improvements to
the X86 backend are needed before this should be enabled by default.

llvm-svn: 44657
2007-12-06 07:44:31 +00:00
Chris Lattner 7c709a5d08 implement a readme entry, compiling the code into:
_foo:
	movl	$12, %eax
	andl	4(%esp), %eax
	movl	_array(%eax), %eax
	ret

instead of:

_foo:
	movl	4(%esp), %eax
	shrl	$2, %eax
	andl	$3, %eax
	movl	_array(,%eax,4), %eax
	ret

As it turns out, this triggers all the time, in a wide variety of
situations, for example, I see diffs like this in various programs:

-       movl    8(%eax), %eax
-       shll    $2, %eax
-       andl    $1020, %eax
-       movl    (%esi,%eax), %eax
+       movzbl  8(%eax), %eax
+       movl    (%esi,%eax,4), %eax


-       shll    $2, %edx
-       andl    $1020, %edx
-       movl    (%edi,%edx), %edx
+       andl    $255, %edx
+       movl    (%edi,%edx,4), %edx

Unfortunately, I also see stuff like this, which can be fixed in the
X86 backend:

-       andl    $85, %ebx
-       addl    _bit_count(,%ebx,4), %ebp
+       shll    $2, %ebx
+       andl    $340, %ebx
+       addl    _bit_count(%ebx), %ebp

llvm-svn: 44656
2007-12-06 07:33:36 +00:00
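
The first before/after pair in commit 7c709a5d08 above computes the same byte offset two ways: shift right by 2, mask with 3 and scale by 4, or simply mask with 12. The C++ sketch below checks that identity; the function names are hypothetical, and the source expression is only presumed to be something like `array[(x >> 2) & 3]` with 4-byte elements, since the readme entry itself is not quoted in the message.

```cpp
#include <cassert>
#include <cstdint>

// Byte offset as in the "instead of" sequence: shift, mask to an index,
// then scale by the 4-byte element size (done by the addressing mode).
uint32_t offsetShiftMaskScale(uint32_t x) { return ((x >> 2) & 3u) * 4u; }

// Byte offset as in the new sequence: a single mask with 12.
uint32_t offsetSingleMask(uint32_t x) { return x & 12u; }

// Returns true if both computations agree for every x in [0, limit).
bool offsetsAgreeBelow(uint32_t limit) {
  for (uint32_t x = 0; x < limit; ++x) {
    // The masked offset always stays 4-byte aligned.
    assert(offsetSingleMask(x) % 4u == 0);
    if (offsetShiftMaskScale(x) != offsetSingleMask(x))
      return false;
  }
  return true;
}
```

The last diff in the message shows the opposite direction losing: when the scaled addressing mode was already free, folding the scale into the mask adds an instruction.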
Chris Lattner 42558bf664 implement the rest of the functionality from SelectionDAGLegalize::ScalarizeVectorOp
llvm-svn: 44654
2007-12-06 05:53:43 +00:00
Dale Johannesen 05bbbda78a Fix PR1842.
llvm-svn: 44649
2007-12-06 01:43:46 +00:00
Evan Cheng 7fc1d98353 Fix for PR1831: if all defs of an interval are re-materializable, then it's a preferred spill candidate.
llvm-svn: 44644
2007-12-06 00:01:56 +00:00
Evan Cheng 678b86d6ce MachineInstr can change. Store indexes instead.
llvm-svn: 44612
2007-12-05 10:24:35 +00:00
Evan Cheng 06353b48b5 If a split live interval is spilled again, remove the kill marker on its last use.
llvm-svn: 44611
2007-12-05 09:51:10 +00:00
Evan Cheng 64b3baaaea Clobber more bugs.
llvm-svn: 44610
2007-12-05 09:05:34 +00:00
Evan Cheng d7de56ac93 Fix kill info for split intervals.
llvm-svn: 44609
2007-12-05 08:16:32 +00:00
Chris Lattner c9693c60a5 more scalarization
llvm-svn: 44608
2007-12-05 07:45:02 +00:00
Chris Lattner 1a0d49a63c scalarize vector binops
llvm-svn: 44607
2007-12-05 07:36:58 +00:00
Evan Cheng 269dbd31d0 - Mark last use of a split interval as kill instead of letting spiller track it.
  This allows an important optimization to be re-enabled.
- If all uses / defs of a split interval can be folded, give the interval a
  low spill weight so it would not be picked in case spilling is needed (avoid
  pushing other intervals in the same BB to be spilled).

llvm-svn: 44601
2007-12-05 03:22:34 +00:00
Evan Cheng bb26301864 Add an argument to storeRegToStackSlot and storeRegToAddr to specify whether
the stored register is killed.

llvm-svn: 44600
2007-12-05 03:14:33 +00:00
Evan Cheng e412a4427b Remove an unsafe optimization. This fixes 401.bzip2.
llvm-svn: 44587
2007-12-04 23:57:55 +00:00
Evan Cheng cd8a89b3cd Spiller unfold optimization bug: do not clobber a reusable stack slot value unless it can be modified.
llvm-svn: 44575
2007-12-04 19:19:45 +00:00
Chris Lattner b892225fb9 Implement framework for scalarizing node results. This is sufficient
to codegen this:

define float @test_extract_elt(<1 x float> * %P) {
	%p = load <1 x float>* %P
	%R = extractelement <1 x float> %p, i32 0
	ret float %R
}

llvm-svn: 44570
2007-12-04 07:48:46 +00:00
Chris Lattner 681c9d6697 start providing framework for scalarizing vectors.
llvm-svn: 44569
2007-12-04 07:29:51 +00:00
Evan Cheng d1badb960e Discard split intervals made empty due to folding.
llvm-svn: 44565
2007-12-04 00:32:23 +00:00
Evan Cheng 40965448ff Bug fixes.
llvm-svn: 44549
2007-12-03 21:31:55 +00:00
Duncan Sands 38ef3a8ec7 Rather than having special rules like "intrinsics cannot
throw exceptions", just mark intrinsics with the nounwind
attribute.  Likewise, mark intrinsics as readnone/readonly
and get rid of special aliasing logic (which didn't use
anything more than this anyway).

llvm-svn: 44544
2007-12-03 20:06:50 +00:00
Evan Cheng 196faa9dc5 Typo
llvm-svn: 44532
2007-12-03 10:00:00 +00:00
Evan Cheng 85ef9834a6 Update kill info for uses of split intervals.
llvm-svn: 44531
2007-12-03 09:58:48 +00:00
Evan Cheng f45a1d623c Remove redundant foldMemoryOperand variants and other code clean up.
llvm-svn: 44517
2007-12-02 08:30:39 +00:00
Evan Cheng 388f6f51a0 Fix a bug where splitting causes some unnecessary spilling.
llvm-svn: 44482
2007-12-01 04:42:39 +00:00
Evan Cheng 69fda0a716 Allow some reloads to be folded in multi-use cases. Specifically testl r, r -> cmpl [mem], 0.
llvm-svn: 44479
2007-12-01 02:07:52 +00:00
Evan Cheng b10dc27b20 Do not fold reload into an instruction with multiple uses. It issues one extra load.
llvm-svn: 44467
2007-11-30 21:23:43 +00:00
Devang Patel cc45c338d1 Provide a way to update DescGlobals cache directly.
llvm-svn: 44446
2007-11-30 00:51:33 +00:00
Evan Cheng d35b5acae4 Do not lose rematerialization info when spilling already split live intervals.
llvm-svn: 44443
2007-11-29 23:02:50 +00:00
Evan Cheng 8494ee175c Fix a major performance issue with splitting. If there is a def (not def/use)
in the middle of a split basic block, create a new live interval starting at
the def. This avoids artificially extending the live interval over a number of
cycles where it is dead. e.g.

bb1:
       = vr1204   (use / kill) <= new interval starts and ends here.
...
...
vr1204 =          (new def)   <= start a new interval here.
       = vr1204   (use)

llvm-svn: 44436
2007-11-29 10:12:14 +00:00
Evan Cheng f85c063ec0 Replace the odd kill# hack with something less fragile.
llvm-svn: 44434
2007-11-29 09:49:23 +00:00
Evan Cheng be255b0650 Fixed various live interval splitting bugs / compile time issues.
llvm-svn: 44428
2007-11-29 01:06:25 +00:00
Evan Cheng 147f7799c5 Kill info update bug.
llvm-svn: 44427
2007-11-29 01:05:47 +00:00
Duncan Sands 5208d1ab4a Add some convenience methods for querying attributes, and
use them.

llvm-svn: 44403
2007-11-28 17:07:01 +00:00
Duncan Sands 45a0c3265f Add missing newlines at EOF.
llvm-svn: 44399
2007-11-28 10:13:38 +00:00
Evan Cheng c1648b6a0d Recover compile time regression.
llvm-svn: 44386
2007-11-28 01:28:46 +00:00
Owen Anderson 30767b15e9 Add MachineLoopInfo. This is not yet tested.
llvm-svn: 44384
2007-11-27 22:47:08 +00:00
Nate Begeman 6f026a654c Support returning non-power-of-2 vectors to unblock some work
llvm-svn: 44371
2007-11-27 19:28:48 +00:00
Duncan Sands ad0ea2d430 Fix PR1146: parameter attributes are no longer part of
the function type, instead they belong to functions
and function calls.  This is an updated and slightly
corrected version of Reid Spencer's original patch.
The only known problem is that auto-upgrading of
bitcode files doesn't seem to work properly (see
test/Bitcode/AutoUpgradeIntrinsics.ll).  Hopefully
a bitcode guru (who might that be? :) ) will fix it.

llvm-svn: 44359
2007-11-27 13:23:08 +00:00
Chris Lattner 698b1cb28d err, no really.
llvm-svn: 44352
2007-11-27 06:14:32 +00:00
Chris Lattner 28caf2717a don't depend on ADL.
llvm-svn: 44351
2007-11-27 06:14:12 +00:00
Dan Gohman 9a69341725 Don't lower srem/urem X%C to X-X/C*C unless the division is actually
optimized. This avoids creating illegal divisions when the combiner is
running after legalize; this fixes PR1815. Also, it produces better
code in the included testcase by avoiding the subtract and multiply
when the division isn't optimized.

llvm-svn: 44341
2007-11-26 23:46:11 +00:00
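
The identity behind the srem/urem lowering in r44341 can be sketched in Python. This only illustrates the arithmetic X%C == X - (X/C)*C under truncating division; the helper names `sdiv` and `srem_via_div` are hypothetical and are not LLVM APIs.

```python
def sdiv(x: int, c: int) -> int:
    """Signed division that truncates toward zero (Python's // floors instead)."""
    q = abs(x) // abs(c)
    return q if (x < 0) == (c < 0) else -q


def srem_via_div(x: int, c: int) -> int:
    """Remainder expressed as X - (X / C) * C."""
    return x - sdiv(x, c) * c
```

The rewritten form needs a divide, a multiply, and a subtract, so it is only a win when the division by the constant is itself replaced by something cheaper, which is the condition the commit adds.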
Chris Lattner cab915f9cf Implement expand support for MERGE_VALUEs that only produces one result.
llvm-svn: 44304
2007-11-24 19:12:15 +00:00
Chris Lattner 6e3641897b Implement support for custom legalization in DAGTypeLegalizer::ExpandOperand.
Improve a comment.
Unbreak Duncan's carefully written path compression where I didn't realize
what was happening!

llvm-svn: 44301
2007-11-24 18:11:42 +00:00
Chris Lattner f81d5886c6 Several changes:
1) Change the interface to TargetLowering::ExpandOperationResult to 
   take and return entire NODES that need a result expanded, not just
   the value.  This allows us to handle things like READCYCLECOUNTER,
   which returns two values.
2) Implement (extremely limited) support in LegalizeDAG::ExpandOp for MERGE_VALUES.
3) Reimplement custom lowering in LegalizeDAGTypes in terms of the new
   ExpandOperationResult.  This makes the result simpler and fully 
   general.
4) Implement (fully general) expand support for MERGE_VALUES in LegalizeDAGTypes.
5) Implement ExpandOperationResult support for ARM f64->i64 bitconvert and ARM
   i64 shifts, allowing them to work with LegalizeDAGTypes.
6) Implement ExpandOperationResult support for X86 READCYCLECOUNTER and FP_TO_SINT,
   allowing them to work with LegalizeDAGTypes.

LegalizeDAGTypes now passes several more X86 codegen tests when enabled and when
type legalization in LegalizeDAG is ifdef'd out.

llvm-svn: 44300
2007-11-24 07:07:01 +00:00
Duncan Sands b87dde7e8e Fix a bug in which node A is replaced by node B, but later
node A gets back into the DAG again because it was hiding in
one of the node maps: make sure that node replacement happens
in those maps too.

llvm-svn: 44263
2007-11-21 16:43:19 +00:00
Dale Johannesen 763e110a9f Fix .eh table linkage issues on Darwin. Some EH support
for Darwin PPC, but it's not fully working yet.

llvm-svn: 44258
2007-11-20 23:24:42 +00:00
Chris Lattner 09c0393d5e ExpandUnalignedLoad doesn't handle vectors right at all apparently.
Fix a couple of problems:
1. Don't assume the VT-1 is a VT that is half the size.
2. Treat vectors of FP in the vector path, not the FP path.

This has a couple of remaining problems before it will work with
the code in PR1811: the code below this change assumes that it can
use extload/shift/or to construct the result, which isn't right for
vectors.

This also doesn't handle vectors of 1 or vectors that aren't pow-2.

llvm-svn: 44243
2007-11-19 21:38:03 +00:00
Chris Lattner 6fa95ec19d Implement vector expand support for shuffle_vector. This fixes PR1811.
llvm-svn: 44242
2007-11-19 21:16:54 +00:00
Chris Lattner 67d77945e7 Implement splitting of UNDEF nodes. This is the first step towards fixing PR1811
llvm-svn: 44239
2007-11-19 20:21:32 +00:00
Dan Gohman 36347a26f9 Add support in SplitVectorOp for remainder operators.
llvm-svn: 44233
2007-11-19 15:15:03 +00:00
Nate Begeman d4d45c268c Add support for vectors to int <-> float casts.
llvm-svn: 44204
2007-11-17 03:58:34 +00:00
Evan Cheng 8e22379303 Live interval splitting:
When a live interval is being spilled, rather than creating short, non-spillable
intervals for every def / use, split the interval at BB boundaries. That is, for
every BB where the live interval is defined or used, create a new interval that
covers all the defs and uses in the BB.

This is designed to eliminate one common problem: multiple reloads of the same
value in a single basic block. Note, it does *not* decrease the number of spills
since no copies are inserted so the split intervals are *connected* through
spill and reloads (or rematerialization). The newly created intervals can be
spilled again. In that case, since such an interval does not span multiple basic
blocks, it's spilled in the usual manner. However, it can reuse the same stack
slot as the previously split interval.

This is currently controlled by -split-intervals-at-bb.

llvm-svn: 44198
2007-11-17 00:40:40 +00:00
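
A toy model of the per-basic-block splitting described in r44198, in Python. It assumes a spilled interval is given as a flat list of (block, instruction index) references; the function name and data layout are hypothetical and much simpler than the real live interval structures.

```python
from collections import defaultdict


def split_at_bb_boundaries(refs):
    """Group the defs/uses of one spilled interval by basic block.

    refs: list of (bb_name, instr_index) pairs.
    Returns {bb_name: (first_index, last_index)}: one new interval per block,
    covering every def and use inside that block.
    """
    per_bb = defaultdict(list)
    for bb, idx in refs:
        per_bb[bb].append(idx)
    return {bb: (min(ixs), max(ixs)) for bb, ixs in per_bb.items()}
```

With one interval per block, several uses in the same block can share a single reload instead of each getting its own tiny interval.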
Anton Korobeynikov 66b91e66ec Implement necessary bits for flt_rounds gcc builtin.
Codegen bits and llvm-gcc support will follow.

llvm-svn: 44182
2007-11-15 23:25:33 +00:00
Nate Begeman bd117f06ba Basic non-power-of-2 vector support
llvm-svn: 44181
2007-11-15 21:15:26 +00:00
Duncan Sands d4494352f8 This assertion was bogus.
llvm-svn: 44167
2007-11-15 09:54:37 +00:00
Evan Cheng 2c1a50455c Fix a thinko in post-allocation coalescer.
llvm-svn: 44166
2007-11-15 08:13:29 +00:00
Bill Wendling b3712f8146 Adding debug output during coalescing.
llvm-svn: 44154
2007-11-15 02:06:30 +00:00
Bill Wendling 8269925b1e Need to increment the iterator.
llvm-svn: 44153
2007-11-15 00:40:48 +00:00
Anton Korobeynikov 2c6387803e Fix PIC jump table codegen on x86-32/linux. In fact, the same fix should be applied
to all targets that use GOT-relative offsets for PIC (Alpha?)

llvm-svn: 44108
2007-11-14 09:18:41 +00:00
Evan Cheng 7f02cfa599 Clean up sub-register implementation by moving subReg information back to
MachineOperand auxInfo. The previous clunky implementation used an external map
to track sub-register uses. That worked because the register allocator uses
a new virtual register for each spilled use. With interval splitting (coming
soon), we may have multiple uses of the same register, some of which use
different sub-registers from others. It's too fragile to constantly
update the information.

llvm-svn: 44104
2007-11-14 07:59:08 +00:00
Owen Anderson d8167ab332 Run computeDomForest() on the set of registers that need to be tested for
interference.

llvm-svn: 44064
2007-11-13 20:13:24 +00:00
Owen Anderson 569ef71e44 Preserve LiveVariables when doing critical edge splitting.
llvm-svn: 44063
2007-11-13 20:04:45 +00:00
Dale Johannesen 7a7085f6d3 Add parameter to getDwarfRegNum to permit targets
to use different mappings for EH and debug info;
no functional change yet.
Fix warning in X86CodeEmitter.

llvm-svn: 44056
2007-11-13 19:13:01 +00:00
Bill Wendling f359fed9f9 Unify CALLSEQ_{START,END}. They take 4 parameters: the chain, two stack
adjustment fields, and an optional flag. If there is a "dynamic_stackalloc" in
the code, make sure that it's bracketed by CALLSEQ_START and CALLSEQ_END. If
not, then there is the potential for the stack to be changed while the stack's
being used by another instruction (like a call).

This can only result in tears...

llvm-svn: 44037
2007-11-13 00:44:25 +00:00
Owen Anderson c520c4b325 Break critical edges coming into blocks with PHI nodes.
llvm-svn: 44019
2007-11-12 17:27:27 +00:00
Evan Cheng be51f28e2b Refactor some code.
llvm-svn: 44010
2007-11-12 06:35:08 +00:00
Owen Anderson a1cd45213d As Chris and Evan pointed out, BreakCriticalMachineEdges doesn't really need
to be a pass of its own.  Instead, move it out into a helper method.

llvm-svn: 44002
2007-11-12 01:05:09 +00:00
Hartmut Kaiser 67297144ab Fixed a strange construct. Please review.
llvm-svn: 43960
2007-11-09 19:59:00 +00:00
Duncan Sands e795efea5b Move MinAlign to MathExtras.h.
llvm-svn: 43944
2007-11-09 13:41:39 +00:00
Duncan Sands e7a9ac929f Fix some load/store logic that would be wrong for
apints on big-endian machines if the bitwidth is
not a multiple of 8.  Introduce a new helper,
MVT::getStoreSizeInBits, and use it.

llvm-svn: 43934
2007-11-09 08:57:19 +00:00
Duncan Sands bab9dc9433 Add terminating newline.
llvm-svn: 43933
2007-11-09 08:30:21 +00:00
Evan Cheng 797d56ff17 Much improved pic jumptable codegen:
Then:
        call    "L1$pb"
"L1$pb":
        popl    %eax
		...
LBB1_1: # entry
        imull   $4, %ecx, %ecx
        leal    LJTI1_0-"L1$pb"(%eax), %edx
        addl    LJTI1_0-"L1$pb"(%ecx,%eax), %edx
        jmpl    *%edx

        .align  2
        .set L1_0_set_3,LBB1_3-LJTI1_0
        .set L1_0_set_2,LBB1_2-LJTI1_0
        .set L1_0_set_5,LBB1_5-LJTI1_0
        .set L1_0_set_4,LBB1_4-LJTI1_0
LJTI1_0:
        .long    L1_0_set_3
        .long    L1_0_set_2

Now:
        call    "L1$pb"
"L1$pb":
        popl    %eax
		...
LBB1_1: # entry
        addl    LJTI1_0-"L1$pb"(%eax,%ecx,4), %eax
        jmpl    *%eax

		.align  2
		.set L1_0_set_3,LBB1_3-"L1$pb"
		.set L1_0_set_2,LBB1_2-"L1$pb"
		.set L1_0_set_5,LBB1_5-"L1$pb"
		.set L1_0_set_4,LBB1_4-"L1$pb"
LJTI1_0:
        .long    L1_0_set_3
        .long    L1_0_set_2

llvm-svn: 43924
2007-11-09 01:32:10 +00:00
Evan Cheng f14006f4d6 Didn't mean to check these in.
llvm-svn: 43923
2007-11-09 01:28:33 +00:00
Evan Cheng 1bf166312b Bug fix. Passive nodes are not in SUnitMap.
llvm-svn: 43922
2007-11-09 01:27:11 +00:00
Owen Anderson 65d2fcdd2a This preserves critical edge breaking.
llvm-svn: 43911
2007-11-08 22:23:57 +00:00
Owen Anderson 3bc8124a66 Make BreakCriticalMachineEdges available as a pass that can be depended on.
llvm-svn: 43910
2007-11-08 22:20:23 +00:00
Evan Cheng ece4c68b82 If both parts of smul_lohi, etc. are used, don't simplify. If only one part is used, try to simplify it.
llvm-svn: 43888
2007-11-08 09:25:29 +00:00
Owen Anderson 0be8c1dafe Add the majority of machine-level critical edge breaking pass. Most of this was written by Fernando, cleanup and updating to TOT by me.
This still needs a bit of work, particularly to handle jump tables properly.

llvm-svn: 43885
2007-11-08 07:55:43 +00:00
Owen Anderson bfbc12973d Take another stab at getting isLiveIn() and isLiveOut() right.
llvm-svn: 43869
2007-11-08 01:32:45 +00:00
Owen Anderson 9d86ef12c8 Bring UsedBlocks back. StrongPHIElimination needs this information.
llvm-svn: 43866
2007-11-08 01:20:48 +00:00
Evan Cheng e742ee1dbe Simplify my (il)logic.
llvm-svn: 43819
2007-11-07 08:08:25 +00:00
Owen Anderson c6a5387d09 Add some more of StrongPHIElim.
llvm-svn: 43805
2007-11-07 05:17:15 +00:00
Dan Gohman ccfc028283 Remainder operations must be either integer or floating-point.
llvm-svn: 43781
2007-11-06 22:11:54 +00:00
Evan Cheng dd71a5c37b When the allocator rewrites a spill register with a new virtual register, it replaces other operands of the same register. Watch out for situations where
only some of the operands are sub-register uses.

llvm-svn: 43776
2007-11-06 21:12:10 +00:00
Evan Cheng d5d59ad634 First step towards moving the coalescer to priority_queue based machinery.
llvm-svn: 43764
2007-11-06 08:52:21 +00:00
Evan Cheng 92d23e5204 Fix a bug where a def use operand isn't being detected as a sub-register use.
llvm-svn: 43763
2007-11-06 08:50:44 +00:00
Evan Cheng 2dbffa4e76 Add pseudo dependency to force two-address instruction to be scheduled after
other uses. There was an overly restrictive check that prevented some obvious
cases.

llvm-svn: 43762
2007-11-06 08:44:59 +00:00
Owen Anderson d378cea030 Add a few comments.
llvm-svn: 43755
2007-11-06 05:26:02 +00:00
Owen Anderson eb964eb2c8 DomForest is a forest of registers, not instructions.
llvm-svn: 43754
2007-11-06 05:22:43 +00:00
Owen Anderson a9057f0b97 StrongPHIElimination requires LiveVariables.
llvm-svn: 43751
2007-11-06 04:49:43 +00:00
Dan Gohman 08143e397d Add support for vector remainder operations.
llvm-svn: 43744
2007-11-05 23:35:22 +00:00
Rafael Espindola fa0df55bdd Move the LowerMEMCPY and LowerMEMCPYCall to a common place.
Thanks for the suggestions Bill :-)

llvm-svn: 43742
2007-11-05 23:12:20 +00:00
Dale Johannesen 4646aa3e33 Make labels work in asm blocks; allow labels as
parameters.  Rename ValueRefList to ParamList
in AsmParser, since its only use is for parameters.

llvm-svn: 43734
2007-11-05 21:20:28 +00:00
Duncan Sands f7ae8bd090 Don't output ABI size padding twice. By using the store
size for the field we get ABI padding automatically, so
no need to put it in again when we emit the field.

llvm-svn: 43720
2007-11-05 18:03:02 +00:00
Evan Cheng 8bb30184a8 Move SimpleRegisterCoalescing.h to lib/CodeGen since there is now a common
register coalescer interface: RegisterCoalescing.

llvm-svn: 43714
2007-11-05 17:41:38 +00:00
Evan Cheng 17b0e3e1ae Skip over deleted val#'s.
llvm-svn: 43700
2007-11-05 06:46:45 +00:00
Evan Cheng a406b47f14 Handle cases where a register and one of its super-registers are both marked as
defined on the same instruction. This fixes PR1767.

llvm-svn: 43699
2007-11-05 03:11:55 +00:00
Evan Cheng a8044084ac Fix PR1187.
llvm-svn: 43692
2007-11-05 00:59:10 +00:00
Duncan Sands 283207a71c Eliminate the remaining uses of getTypeSize. This
should only affect x86 when using long double.  Now
12/16 bytes are output for long double globals (the
exact amount depends on the alignment).  This brings
globals in line with the rest of LLVM: the space
reserved for an object is now always the ABI size.
One tricky point is that only 10 bytes should be
output for long double if it is a field in a packed
struct, which is the reason for the additional
argument to EmitGlobalConstant.

llvm-svn: 43688
2007-11-05 00:04:43 +00:00
Owen Anderson eea82746b3 Another step of stronger PHI elimination down.
llvm-svn: 43684
2007-11-04 22:33:26 +00:00
Evan Cheng 5c1b044899 If an interval is being undone, clear its preference as well, since the source interval may have been undone too.
llvm-svn: 43670
2007-11-04 08:32:21 +00:00
Evan Cheng 66298e226f There are times when the coalescer would not coalesce away a copy but the copy
can be eliminated by the allocator if the destination and source target the
same register. The most common case is when the source and destination registers
are in different classes. For example, on x86 mov32to32_ targets GR32_ which
contains a subset of the registers in GR32.

The allocator can do 2 things:
1. Set the preferred allocation for the destination of a copy to that of its source.
2. After allocation is done, change the allocation of a copy destination (if
   legal) so the copy can be eliminated.

This eliminates 443 extra moves from 403.gcc.

llvm-svn: 43662
2007-11-03 07:20:12 +00:00
Dan Gohman d7917b6248 Add std:: to sort calls.
llvm-svn: 43652
2007-11-02 22:24:01 +00:00
Dan Gohman c981d72d1a Change illegal uses of ++ to uses of STLExtra.h's next function.
llvm-svn: 43651
2007-11-02 22:22:02 +00:00
Evan Cheng f851163c53 One more extract_subreg coalescing bug.
llvm-svn: 43644
2007-11-02 17:35:08 +00:00
Duncan Sands 04059dd351 Fix a thinko.
llvm-svn: 43639
2007-11-02 15:18:06 +00:00
Duncan Sands 44b8721de8 Executive summary: getTypeSize -> getTypeStoreSize / getABITypeSize.
The meaning of getTypeSize was not clear - clarifying it is important
now that we have x86 long double and arbitrary precision integers.
The issue with long double is that it requires 80 bits, and this is
not a multiple of its alignment.  This gives a primitive type for
which getTypeSize differed from getABITypeSize.  For arbitrary precision
integers it is even worse: there is the minimum number of bits needed to
hold the type (eg: 36 for an i36), the maximum number of bits that will
be overwritten when storing the type (40 bits for i36) and the ABI size
(i.e. the storage size rounded up to a multiple of the alignment; 64 bits
for i36).

This patch removes getTypeSize (not really - it is still there but
deprecated to allow for a gradual transition).  Instead there is:

(1) getTypeSizeInBits - a number of bits that suffices to hold all
values of the type.  For a primitive type, this is the minimum number
of bits.  For an i36 this is 36 bits.  For x86 long double it is 80.
This corresponds to gcc's TYPE_PRECISION.

(2) getTypeStoreSizeInBits - the maximum number of bits that is
written when storing the type (or read when reading it).  For an
i36 this is 40 bits, for an x86 long double it is 80 bits.  This
is the size alias analysis is interested in (getTypeStoreSize
returns the number of bytes).  There doesn't seem to be anything
corresponding to this in gcc.

(3) getABITypeSizeInBits - this is getTypeStoreSizeInBits rounded
up to a multiple of the alignment.  For an i36 this is 64, for an
x86 long double this is 96 or 128 depending on the OS.  This is the
spacing between consecutive elements when you form an array out of
this type (getABITypeSize returns the number of bytes).  This is
TYPE_SIZE in gcc.

Since successive elements in a SequentialType (arrays, pointers
and vectors) need to be aligned, the spacing between them will be
given by getABITypeSize.  This means that the size of an array
is the length times the getABITypeSize.  It also means that GEP
computations need to use getABITypeSize when computing offsets.
Furthermore, if an alloca allocates several elements at once then
these too need to be aligned, so the size of the alloca has to be
the number of elements multiplied by getABITypeSize.  Logically
speaking this doesn't have to be the case when allocating just
one element, but it is simpler to also use getABITypeSize in this
case.  So alloca's and mallocs should use getABITypeSize.  Finally,
since gcc's only notion of size is that given by getABITypeSize, if
you want to output assembler etc the same as gcc then getABITypeSize
is the size you want.

Since a store will overwrite no more than getTypeStoreSize bytes,
and a read will read no more than that many bytes, this is the
notion of size appropriate for alias analysis calculations.

In this patch I have corrected all type size uses except some of
those in ScalarReplAggregates, lib/Codegen, lib/Target (the hard
cases).  I will get around to auditing these too at some point,
but I could do with some help.

Finally, I made one change which I think wise but others might
consider pointless and suboptimal: in an unpacked struct the
amount of space allocated for a field is now given by the ABI
size rather than getTypeStoreSize.  I did this because every
other place that reserves memory for a type (eg: alloca) now
uses getABITypeSize, and I didn't want to make an exception
for unpacked structs, i.e. I did it to make things more uniform.
This only affects structs containing long doubles and arbitrary
precision integers.  If someone wants to pack these types more
tightly they can always use a packed struct.

llvm-svn: 43620
2007-11-01 20:53:16 +00:00
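
The three notions of size in r43620 can be reproduced with a little rounding arithmetic. The Python sketch below is illustrative only: `type_sizes_in_bits` is a hypothetical helper, and the alignment values passed to it (64 bits for i36, 32 or 128 bits for x86 long double) are assumptions chosen to match the numbers quoted in the commit message.

```python
def round_up(value: int, multiple: int) -> int:
    """Round value up to the next multiple of `multiple`."""
    return (value + multiple - 1) // multiple * multiple


def type_sizes_in_bits(precision_bits: int, abi_align_bits: int):
    """Return (size, store size, ABI size) in bits for a primitive type."""
    size = precision_bits                    # bits needed to hold every value
    store = round_up(size, 8)                # whole bytes touched by a store
    abi = round_up(store, abi_align_bits)    # spacing between array elements
    return size, store, abi
```

For example, `type_sizes_in_bits(36, 64)` gives the 36 / 40 / 64 triple quoted for i36, and an 80-bit long double gives an ABI size of 96 or 128 bits depending on the assumed alignment.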
Evan Cheng fe1ac52836 - Coalesce extract_subreg when both intervals are relatively small.
- Some code clean up.

llvm-svn: 43606
2007-11-01 06:22:48 +00:00
Duncan Sands 3b4668a5d8 Promotion of sdiv/srem/udiv/urem.
llvm-svn: 43551
2007-10-31 08:57:43 +00:00
Duncan Sands 21ca939683 Add a newline at the end of the file.
llvm-svn: 43550
2007-10-31 08:49:24 +00:00
Owen Anderson 0b59fa0605 Add the skeleton of a better PHI elimination pass.
llvm-svn: 43542
2007-10-31 03:37:57 +00:00
Owen Anderson 9b8f34f2ac Some fixes to get MachineDomTree working better.
llvm-svn: 43541
2007-10-31 03:30:14 +00:00
Dale Johannesen b066c1f216 Make i64=expand_vector_elt(v2i64) work in 32-bit mode.
llvm-svn: 43535
2007-10-31 00:32:36 +00:00
Evan Cheng 0747bc1df6 Typo.
llvm-svn: 43511
2007-10-30 20:11:21 +00:00
Duncan Sands 9ad5465005 Add support for expanding trunc stores. Consider
storing an i170 on a 32 bit machine.  This is first
promoted to a trunc-i170 store of an i256.  On a
little-endian machine this expands to a store of
an i128 and a trunc-i42 store of an i128.  The
trunc-i42 store is further expanded to a trunc-i42
store of an i64, then to a store of an i32 and a
trunc-i10 store of an i32.  At this point the operand
type is legal (i32) and expansion stops (legalization
of the trunc-i10 needs to be handled in LegalizeDAG.cpp).
On big-endian machines the high bits are stored first,
and some bit-fiddling is needed in order to generate
aligned stores.

llvm-svn: 43499
2007-10-30 12:50:39 +00:00
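
The little-endian expansion walked through in r43499 can be traced with a short Python sketch. It is a simplified model, not the legalizer code: `expand_trunc_store` is a hypothetical name, the function only records the steps named in the commit message, and it does not further expand the wide full stores (such as the i128 store) that the legalizer would also break up.

```python
def expand_trunc_store(store_bits: int, operand_bits: int, legal_bits: int = 32):
    """List the stores produced when a trunc store is expanded (little-endian).

    store_bits:   number of bits actually written (e.g. 170 for trunc-i170)
    operand_bits: width of the promoted operand (e.g. 256)
    legal_bits:   widest legal integer type on the target
    """
    steps = []
    while operand_bits > legal_bits:
        half = operand_bits // 2
        if store_bits > half:
            # The low half is stored whole; what remains is a trunc store
            # of the high half.
            steps.append(f"store i{half}")
            store_bits -= half
        # Otherwise only the low half of the operand is needed.
        operand_bits = half
    if store_bits == operand_bits:
        steps.append(f"store i{operand_bits}")
    else:
        steps.append(f"trunc-i{store_bits} store of i{operand_bits}")
    return steps
```

Calling `expand_trunc_store(170, 256)` yields a store of i128, a store of i32, and a trunc-i10 store of i32, matching the sequence in the message.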
Duncan Sands 341f093bb1 If a call to getTruncStore is for a normal store,
offload to getStore rather than trying to handle
both cases at once (the assertions for example
assume the store really is truncating).

llvm-svn: 43498
2007-10-30 12:40:58 +00:00
Dan Gohman ae95d72a52 Fix a DAGCombiner abort on a bitcast from a scalar to a vector.
llvm-svn: 43470
2007-10-29 20:44:42 +00:00
Evan Cheng e106e2f142 Enable more fold (sext (load x)) -> (sext (truncate (sextload x)))
transformation. Previously, it was restricted by ensuring the number of load uses
is one. Now the restriction is loosened up by allowing setcc uses to be
"extended" (e.g. setcc x, c, eq -> setcc sext(x), sext(c), eq).

llvm-svn: 43465
2007-10-29 19:58:20 +00:00
Dan Gohman 1961c28d46 Add explicit keywords.
llvm-svn: 43464
2007-10-29 19:52:04 +00:00
Duncan Sands 1826deda68 The guaranteed alignment of ptr+offset is only the minimum
of offset and the alignment of ptr if these are both powers of
2.  While the ptr alignment is guaranteed to be a power of 2,
there is no reason to think that offset is.  For example, if
offset is 12 (the size of a long double on x86-32 linux) and
the alignment of ptr is 8, then the alignment of ptr+offset
will in general be 4, not 8.  Introduce a function MinAlign,
lifted from gcc, for computing the minimum guaranteed alignment.
I've tried to fix up everywhere under lib/CodeGen/SelectionDAG/.
I also changed some places that weren't wrong (because both values
were a power of 2), as a defensive change against people copying
and pasting the code.
Hopefully someone who cares about alignment will review the rest
of LLVM and fix up the remaining places.  Since I'm on x86 I'm
not very motivated to do this myself...

llvm-svn: 43421
2007-10-28 12:59:45 +00:00
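
The alignment argument in r43421 can be checked with a few lines of Python. This is a sketch of the idea (the largest power of two dividing both values), written under the assumption that both inputs are positive; `min_align` is a hypothetical name and the body is not copied from the LLVM sources.

```python
def min_align(a: int, b: int) -> int:
    """Largest power of two that divides both a and b.

    This is the lowest set bit of (a | b), which is the alignment that
    ptr + offset is guaranteed to have.
    """
    combined = a | b
    return combined & -combined
```

With a pointer alignment of 8 and an offset of 12, `min(8, 12)` would claim 8, while `min_align(8, 12)` gives the correct guaranteed alignment of 4. When both inputs are powers of two the two functions agree.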
Bill Wendling 6d15b32c15 - Remove the hacky code that forces a memcpy. Alignment is taken care of in the
FE.
- Explicitly pass in the alignment of the load & store.
- XFAIL 2007-10-23-UnalignedMemcpy.ll because llc has a bug that crashes on
  unaligned pointers.

llvm-svn: 43398
2007-10-26 20:24:42 +00:00
Bill Wendling f73340efb9 Changed XXX to FIXME, and added comment to the README file
llvm-svn: 43359
2007-10-25 19:49:32 +00:00
Bill Wendling 5f7ed00d44 Added comment explaining why we are doing this check.
llvm-svn: 43353
2007-10-25 18:23:45 +00:00
Duncan Sands d385f0759c Small formatting changes. Add a sanity check.
Use NVT rather than looking it up, since we have
it to hand.

llvm-svn: 43341
2007-10-25 12:35:51 +00:00
Duncan Sands a8f4ba6eb9 Promote SETCC operands.
llvm-svn: 43340
2007-10-25 12:32:31 +00:00
Duncan Sands cf0da03312 Correctly extract the ValueType from a VTSDNode.
llvm-svn: 43339
2007-10-25 12:30:51 +00:00
Dale Johannesen a4a972e32d Another expansion for i64 multiply, suitable for PPC.
llvm-svn: 43314
2007-10-24 22:26:08 +00:00
Bill Wendling 38ccabcae9 Fix comment and use the "Size" variable that's already provided.
llvm-svn: 43271
2007-10-23 23:36:57 +00:00
Bill Wendling e3b859298a If there's an unaligned memcpy to/from the stack, don't lower it. Just call the
memcpy library function instead.

llvm-svn: 43270
2007-10-23 23:32:40 +00:00
Bill Wendling 6f149c0571 This broke lots. Reverting.
llvm-svn: 43264
2007-10-23 22:04:26 +00:00
Bill Wendling 8971440e56 Lowering a memcpy to the stack is killing PPC. The ARM and X86 backends already
have their own custom memcpy lowering code. This code needs to be factored out
into a target-independent lowering method with hooks to the backend. In the
meantime, just call memcpy if we're trying to copy onto a stack.

llvm-svn: 43262
2007-10-23 21:30:25 +00:00
Evan Cheng 5d7032bb08 It's possible to commute instructions with more than 3 operands.
llvm-svn: 43256
2007-10-23 20:14:40 +00:00
Evan Cheng 847d42a85c isSubRegOf() is a dup of isSubRegister.
llvm-svn: 43249
2007-10-23 06:51:50 +00:00
Evan Cheng 5163a8f53e Add missing parentheses.
llvm-svn: 43227
2007-10-22 19:42:28 +00:00
Duncan Sands 941db4da0a Support for expanding extending loads of integers with
funky bit-widths.

llvm-svn: 43225
2007-10-22 19:00:05 +00:00
Duncan Sands 8fc995069b Fix up the logic for result expanding the various extension
operations so they work right for integers with funky
bit-widths.  For example, consider extending i48 to i64
on a 32 bit machine.  The i64 result is expanded to 2 x i32.
We know that the i48 operand will be promoted to i64, then
also expanded to 2 x i32.  If we had the expanded promoted
operand to hand, then expanding the result would be trivial.
Unfortunately at this stage we can only get hold of the
promoted operand.  So instead we kind of hand-expand, doing
explicit shifting and truncating to get the top and bottom
halves of the i64 operand into 2 x i32, which are then used
to expand the result.  This is harmless, because when the
promoted operand is finally expanded all this bit fiddling
turns into trivial operations which are eliminated either
by the expansion code itself or the DAG combiner.

llvm-svn: 43223
2007-10-22 18:26:21 +00:00
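
The hand-expansion described in r43223 (sign-extending i48 to i64 on a 32-bit machine) can be modeled in Python. The sketch assumes the promoted i64 operand may carry arbitrary bits above bit 47, so the high half is sign-extended in place from its low 16 bits; the function names are hypothetical and the exact node sequence used by the legalizer may differ.

```python
MASK32 = (1 << 32) - 1


def sext(value: int, from_bits: int, to_bits: int) -> int:
    """Sign-extend the low from_bits of value to a to_bits-wide bit pattern."""
    value &= (1 << from_bits) - 1
    if value >> (from_bits - 1):
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)


def expand_sext_i48_result(promoted: int):
    """Split sext(i48 -> i64) into (lo, hi) i32 halves from the promoted operand."""
    lo = promoted & MASK32              # truncate: bottom 32 bits
    hi = (promoted >> 32) & MASK32      # shift right, then truncate
    hi = sext(hi, 48 - 32, 32)          # only 16 bits of hi are meaningful
    return lo, hi
```

Here the bits above bit 47 of the promoted value are ignored, so a negative i48 value produces a high half filled with ones regardless of what the promotion left there.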
Evan Cheng 8557603781 - Only perform the unfolding optimization when the folding in question is modref.
- Remove a bogus assertion.

llvm-svn: 43211
2007-10-22 03:01:44 +00:00
Chris Lattner 36f06c80e6 Add promote operand support for [su]int_to_fp.
llvm-svn: 43204
2007-10-20 22:57:56 +00:00
Chris Lattner 2ba4b148f3 Add result promotion of FP_TO_*INT, fixing CodeGen/X86/trunc-to-bool.ll
with the new legalizer.

llvm-svn: 43199
2007-10-20 04:32:38 +00:00
Chris Lattner 1c87f0c620 simplify some code.
llvm-svn: 43198
2007-10-20 04:09:48 +00:00
Chris Lattner 2bcac640b7 Implement promote and expand for operands of memcpy and friends.
This fixes CodeGen/X86/mem*.ll.

llvm-svn: 43197
2007-10-20 04:07:07 +00:00
Evan Cheng f12967124c Added missing curly braces; without them the if clause was useless in debug builds.
llvm-svn: 43196
2007-10-20 04:01:47 +00:00
Dale Johannesen 771188cf60 Fix a few places vector operations were not getting
the operand's type from the right place.

llvm-svn: 43195
2007-10-20 00:07:52 +00:00
Evan Cheng 35ff79370b Local spiller optimization:
Turn a store folding instruction into a load folding instruction. e.g.
     xorl  %edi, %eax
     movl  %eax, -32(%ebp)
     movl  -36(%ebp), %eax
     orl   %eax, -32(%ebp)
=>
     xorl  %edi, %eax
     orl   -36(%ebp), %eax
     mov   %eax, -32(%ebp)
This enables the unfolding optimization for a subsequent instruction which will
also eliminate the newly introduced store instruction.

llvm-svn: 43192
2007-10-19 21:23:22 +00:00
Bill Wendling ac5c93040f Don't branch fold inline asm statements.
llvm-svn: 43191
2007-10-19 21:09:55 +00:00
Duncan Sands a87c9e4b75 Add support for a few more nodes.
llvm-svn: 43190
2007-10-19 20:29:48 +00:00
Dale Johannesen 6802d0c96f Redo "last ppc long double fix" as Chris wants.
llvm-svn: 43189
2007-10-19 20:29:00 +00:00
Chris Lattner 064c31ebac Fix a really nasty vector miscompilation Bill recently introduced.
llvm-svn: 43181
2007-10-19 16:47:35 +00:00
Chris Lattner 3ea519e56d rename ExpandOperation to ExpandOperationResult, as suggested
by Duncan

llvm-svn: 43177
2007-10-19 15:28:47 +00:00
Duncan Sands a9953e4d0a Support for expanding ADDE and SUBE.
llvm-svn: 43175
2007-10-19 13:06:17 +00:00
Duncan Sands d9834b29dd If the value types are equal then this routine
asserts in later checks rather than producing
the ordinary load it is supposed to.  Avoid all
such hassles by directly returning an ordinary
load in this case.

llvm-svn: 43174
2007-10-19 13:05:40 +00:00
Rafael Espindola 846c19dd70 Add support for byval function whose argument is not 32 bit aligned.
To do this it is necessary to add a "always inline" argument to the
memcpy node. For completeness I have also added this node to memmove
and memset.  I have also added getMem* functions, because the extra
argument makes it cumbersome to use getNode and because I get confused
by it :-)

llvm-svn: 43172
2007-10-19 10:41:11 +00:00
Chris Lattner e5a6448533 Implement a few new operations.
llvm-svn: 43171
2007-10-19 04:46:45 +00:00
Chris Lattner e31365eecc Implement expansion of SINT_TO_FP and UINT_TO_FP operands.
llvm-svn: 43170
2007-10-19 04:32:47 +00:00
Chris Lattner 9081d08083 implement support for custom expansion of any node type, in one place.
llvm-svn: 43169
2007-10-19 04:14:36 +00:00
Chris Lattner d01b8ea4a5 Make use of TLI.ExpandOperation, remove softfloat stuff.
llvm-svn: 43167
2007-10-19 03:58:25 +00:00
Chris Lattner 3c7ee41c78 add expand support for bit_convert result, even allowing custom expansion.
llvm-svn: 43166
2007-10-19 03:33:14 +00:00
Chris Lattner 579db81f1c add a new target hook.
llvm-svn: 43165
2007-10-19 03:31:45 +00:00
Bill Wendling de16ad1446 Negative indices aren't allowed here.
llvm-svn: 43161
2007-10-19 01:10:49 +00:00
Dale Johannesen 10432e5a67 More ppcf128 issues (maybe the last)?
llvm-svn: 43160
2007-10-19 00:59:18 +00:00
Bill Wendling 070aca5d25 Pointer arithmetic should be done with the index the same size as the pointer.
llvm-svn: 43120
2007-10-18 08:32:37 +00:00
Duncan Sands cb7aca0dcb Support for ADDC/SUBC.
llvm-svn: 43119
2007-10-18 08:22:16 +00:00
Evan Cheng e6a41c066a Really fix PR1734. Carefully track which register uses are sub-register uses by
traversing inverse register coalescing map.

llvm-svn: 43118
2007-10-18 07:49:59 +00:00
Dan Gohman 8f518b9875 Add support for ISD::SELECT in SplitVectorOp.
llvm-svn: 43072
2007-10-17 14:48:28 +00:00
Duncan Sands d42c812f4a Return Expand from getOperationAction for all extended
types.  This is needed for SIGN_EXTEND_INREG at least.
It is not clear if this is correct for other operations.
On the other hand, for the various load/store actions
it seems correct to return the type action, as is
currently done.
Also, it seems that SelectionDAG::getValueType can be
called for extended value types; introduce a map for
holding these, since we don't really want to extend
the vector to be 2^32 pointers long!
Generalize DAGTypeLegalizer::PromoteResult_TRUNCATE
and DAGTypeLegalizer::PromoteResult_INT_EXTEND to handle
the various funky possibilities that apints introduce,
for example that you can promote to a type that needs
to be expanded.

llvm-svn: 43071
2007-10-17 13:49:58 +00:00
Evan Cheng 0dde6e5761 Apply Chris' suggestions.
llvm-svn: 43069
2007-10-17 06:53:44 +00:00
Evan Cheng c8b5397000 One more extract_subreg coalescing bug fix.
llvm-svn: 43065
2007-10-17 05:29:37 +00:00
Evan Cheng 9b0a44a2ce Fix MergeValueInAsValue(). It allows overlapping live ranges but should replace
their value numbers with the specified value number.

llvm-svn: 43062
2007-10-17 02:13:29 +00:00
Evan Cheng a6fd8bc97e Clean up code that calculates MBB live-ins.
llvm-svn: 43061
2007-10-17 02:12:22 +00:00
Evan Cheng 8b8c7c9927 Clean up code that calculates MBB live-ins.
llvm-svn: 43060
2007-10-17 02:10:22 +00:00
Dale Johannesen e5facd51cb Disable attempts to constant fold PPC f128.
Remove the assumption that this will happen from
various places.

llvm-svn: 43053
2007-10-16 23:38:29 +00:00
Evan Cheng 8f644cef0f Some clean up.
llvm-svn: 43043
2007-10-16 21:09:14 +00:00
Evan Cheng fab7ca89d5 Fix PR1734.
llvm-svn: 43035
2007-10-16 19:29:47 +00:00
Duncan Sands bbbfbe95f7 Initial infrastructure for arbitrary precision integer
codegen support.  This should have no effect on codegen
for other types.  Debatable bits: (1) the use (abuse?)
of a set in SDNode::getValueTypeList; (2) the length of
getTypeToTransformTo, which maybe should be refactored
with a non-inline part for extended value types.

llvm-svn: 43030
2007-10-16 09:56:48 +00:00
Duncan Sands 052c843559 Fixes due to lack of type-safety for ValueType: (1) ValueType
being passed instead of an opcode; (2) ValueType being passed
for isVolatile (!) in getLoad.

llvm-svn: 43028
2007-10-16 09:07:20 +00:00
Evan Cheng ecf62cb763 Code clean up.
llvm-svn: 43026
2007-10-16 08:04:24 +00:00
Chris Lattner cece03dd89 implement promotion of select and select_cc, allowing MallocBench/gs to
work with type promotion on x86.

llvm-svn: 43025
2007-10-16 03:00:22 +00:00
Dan Gohman 9aa4fc5cd6 Teach IntrinsicLowering.cpp about the sin, cos, and pow intrinsics.
llvm-svn: 43020
2007-10-15 22:07:31 +00:00
Evan Cheng 04c44712d3 Make CalcLatency() non-recursive.
llvm-svn: 43017
2007-10-15 21:33:22 +00:00