Commit Graph

2302 Commits

Author SHA1 Message Date
Nate Begeman 7d831fa5b9 Add BSWAP stuff to intrinsic lowering for CBE & friends.
llvm-svn: 25355
2006-01-16 07:57:00 +00:00
Chris Lattner fcdb420baf Disable two transformations that contribute to bus errors on SparcV8.
llvm-svn: 25339
2006-01-15 18:58:59 +00:00
Chris Lattner 59b82f9848 Allow the target to specify 'expand' if they just require the amount to
be subtracted from the stack pointer.

llvm-svn: 25331
2006-01-15 08:54:32 +00:00
Chris Lattner 2d59142613 Fix custom lowering of dynamic_stackalloc
llvm-svn: 25329
2006-01-15 08:43:08 +00:00
Chris Lattner 9597b33d58 add a missing node name
llvm-svn: 25327
2006-01-15 08:39:35 +00:00
Chris Lattner 02011c9a4f Token chain results are not always the first or last result. Consider copyfromreg nodes, where they are the middle result (the flag result is last)
llvm-svn: 25325
2006-01-14 22:41:46 +00:00
Nate Begeman 542c3c17a9 Remove some duplicated code
llvm-svn: 25313
2006-01-14 03:18:27 +00:00
Nate Begeman 2fba8a3aaa bswap implementation
llvm-svn: 25312
2006-01-14 03:14:10 +00:00
Chris Lattner ed9b3e1c0a If a target specified a stack pointer with setStackPointerRegisterToSaveRestore,
lower STACKSAVE/STACKRESTORE into a copy from/to that register.

llvm-svn: 25276
2006-01-13 17:48:44 +00:00
Chris Lattner b32664583b Compile llvm.stacksave/restore into STACKSAVE/STACKRESTORE nodes, and allow
targets to custom expand them as they desire.

llvm-svn: 25273
2006-01-13 02:50:02 +00:00
Chris Lattner a5110e854d add stacksave/stackrestore nodes
llvm-svn: 25270
2006-01-13 02:39:42 +00:00
Chris Lattner 6c9c250dcd Add "support" for stacksave/stackrestore to the dag isel
llvm-svn: 25268
2006-01-13 02:24:42 +00:00
Chris Lattner 3b2b0aff0c Add "support" for the llvm.stacksave/stackrestore intrinsics, this is
used by the C backend.

llvm-svn: 25267
2006-01-13 02:22:08 +00:00
Chris Lattner 3470b5dee6 Add a simple missing fold to produce this:
subfic r3, r2, 33

instead of this:

        subfic r2, r2, 32
        addi r3, r2, 1

llvm-svn: 25255
2006-01-12 20:22:43 +00:00
Chris Lattner 3760e901cf If using __main, emit global ctor/dtor list like any other global
llvm-svn: 25251
2006-01-12 19:17:23 +00:00
Chris Lattner b1ee616de9 Don't create rotate instructions in unsupported types, because we don't have
promote/expand code yet.  This fixes the 177.mesa failure on PPC.

llvm-svn: 25250
2006-01-12 18:57:33 +00:00
Evan Cheng 7f4ec8274f Allow custom lowering of DYNAMIC_STACKALLOC.
llvm-svn: 25224
2006-01-11 22:14:47 +00:00
Evan Cheng 982493300e ignore register #0
llvm-svn: 25223
2006-01-11 22:13:48 +00:00
Nate Begeman 1b8121b227 Add bswap, rotl, and rotr nodes
Add dag combiner code to recognize rotl, rotr
Add ppc code to match rotl

Targets should add rotl/rotr patterns if they have them

llvm-svn: 25222
2006-01-11 21:21:00 +00:00
Chris Lattner fb5f46541c silence a warning
llvm-svn: 25184
2006-01-10 19:43:26 +00:00
Robert Bocchino 2c966e7617 Added selection DAG support for the extractelement operation.
llvm-svn: 25179
2006-01-10 19:04:57 +00:00
Chris Lattner b05fce676f Minor cleanup, no functionality change for current targets
llvm-svn: 25173
2006-01-10 05:41:59 +00:00
Chris Lattner 90ba544826 Fix an exponential function in libcall insertion to not be exponential. :)
llvm-svn: 25165
2006-01-09 23:21:49 +00:00
Evan Cheng 870e4f8e38 * Allow custom lowering of ADD_PARTS, SUB_PARTS, SHL_PARTS, SRA_PARTS,
and SRL_PARTS.
* Fix a bug that caused *_PARTS to be custom lowered twice.

llvm-svn: 25157
2006-01-09 18:31:59 +00:00
Evan Cheng 53a1f57fc5 New getNode() variants.
llvm-svn: 25156
2006-01-09 18:29:18 +00:00
Chris Lattner fae8afb77f Unbreak the build :(
llvm-svn: 25124
2006-01-06 05:47:48 +00:00
Evan Cheng 85c973cda9 Revert the previous check-in. Leave shl x, 1 along for target to deal with.
llvm-svn: 25121
2006-01-06 01:56:02 +00:00
Evan Cheng b03f9b32d2 fold (shl x, 1) -> (add x, x)
llvm-svn: 25120
2006-01-06 01:06:31 +00:00
Evan Cheng f35b1c837f Support for custom lowering of ISD::RET.
llvm-svn: 25116
2006-01-06 00:41:43 +00:00
Jim Laskey 762e9ec06c Added initial support for DEBUG_LABEL allowing debug specific labels to be
inserted in the code.

llvm-svn: 25104
2006-01-05 01:25:28 +00:00
Jim Laskey 219d559824 Applied some recommend changes from sabre. The dominate one beginning "let the
pass manager do it's thing."  Fixes crash when compiling -g files and suppresses
dwarf statements if no debug info is present.

llvm-svn: 25100
2006-01-04 22:28:25 +00:00
Jim Laskey 0da76a676a Add unique id to debug location for debug label use (work in progress.)
llvm-svn: 25096
2006-01-04 15:04:11 +00:00
Jim Laskey 2741e8304c Add check for debug presence.
llvm-svn: 25095
2006-01-04 14:30:12 +00:00
Jim Laskey b0609d91c3 Tie dwarf generation to darwin assembler.
llvm-svn: 25093
2006-01-04 13:52:30 +00:00
Jim Laskey 57a5e0b45a Moving MachineDebugInfo to module level location.
llvm-svn: 25090
2006-01-04 13:43:56 +00:00
Jim Laskey 6f9ff633a6 Change how MachineDebugInfo is fetched.
llvm-svn: 25089
2006-01-04 13:42:59 +00:00
Jim Laskey 44317393f8 Extending MachineDebugInfo.
llvm-svn: 25086
2006-01-04 13:36:38 +00:00
Chris Lattner 227e936650 Add support for targets (like Alpha) that have terminator instructions which
use virtual registers.  We now allow the first instruction in a block of
terminators to use virtual registers, and update phi elimination to correctly
update livevar when eliminating phi's.  This fixes a problem on a testcase
Andrew sent me.

llvm-svn: 25083
2006-01-04 07:12:21 +00:00
Chris Lattner 0511055276 Add an assertion, update DefInst even though no one uses it (dangling pointers
don't help anyone)

llvm-svn: 25081
2006-01-04 06:47:48 +00:00
Chris Lattner be45b5e948 Add a LiveVariables::VarInfo::dump method
llvm-svn: 25080
2006-01-04 05:40:30 +00:00
Chris Lattner b723c33614 Change a variable from being an iterator to a raw MachineInstr*, to make
GDB use tolerable

llvm-svn: 25064
2006-01-03 07:41:37 +00:00
Nate Begeman 164db3a7eb Make sure to pass the offset into the new node, so that we don't silently
drop it on the floor.

llvm-svn: 25044
2005-12-30 00:10:38 +00:00
Duraid Madina fb6a914ca7 purity++
llvm-svn: 25041
2005-12-29 05:59:19 +00:00
Duraid Madina 26b037e762 add these so I can be less naughty
llvm-svn: 25034
2005-12-28 06:29:02 +00:00
Duraid Madina e47d9d0e92 HB is *the* code janitor.
llvm-svn: 25031
2005-12-28 04:55:42 +00:00
Duraid Madina 7c3dcb6892 mixed-STL programs are big and nasty :(
llvm-svn: 25030
2005-12-28 02:44:35 +00:00
Andrew Lenharth 30db2ec59f allow custom lowering to return null for legal results
llvm-svn: 25007
2005-12-25 01:07:37 +00:00
Andrew Lenharth 7259426d88 Support Custom lowering of a few more operations.
Alpha needs to custom lower *DIV and *REM

llvm-svn: 25006
2005-12-24 23:42:32 +00:00
Jim Laskey bdba3e2a46 Remove redundant debug locations.
llvm-svn: 24995
2005-12-23 20:08:28 +00:00
Chris Lattner c7037abc5b unbreak the build :-/
llvm-svn: 24992
2005-12-23 16:12:20 +00:00
Evan Cheng 31d15fa093 Allow custom lowering of LOAD, EXTLOAD, ZEXTLOAD, STORE, and TRUNCSTORE. Not
currently used.

llvm-svn: 24988
2005-12-23 07:29:34 +00:00
Chris Lattner 26943b9691 Simplify store(bitconv(x)) to store(x). This allows us to compile this:
void bar(double Y, double *X) {
  *X = Y;
}

to this:

bar:
        save -96, %o6, %o6
        st %i1, [%i2+4]
        st %i0, [%i2]
        restore %g0, %g0, %g0
        retl
        nop

instead of this:

bar:
        save -104, %o6, %o6
        st %i1, [%i6+-4]
        st %i0, [%i6+-8]
        ldd [%i6+-8], %f0
        std  %f0, [%i2]
        restore %g0, %g0, %g0
        retl
        nop

on sparcv8.

llvm-svn: 24983
2005-12-23 05:48:07 +00:00
Chris Lattner 54560f6887 fold (conv (load x)) -> (load (conv*)x).
This allows us to compile this:
void foo(double);
void bar(double *X) { foo(*X); }

To this:

bar:
        save -96, %o6, %o6
        ld [%i0+4], %o1
        ld [%i0], %o0
        call foo
        nop
        restore %g0, %g0, %g0
        retl
        nop

instead of this:

bar:
        save -104, %o6, %o6
        ldd [%i0], %f0
        std %f0, [%i6+-8]
        ld [%i6+-4], %o1
        ld [%i6+-8], %o0
        call foo
        nop
        restore %g0, %g0, %g0
        retl
        nop

on SparcV8.

llvm-svn: 24982
2005-12-23 05:44:41 +00:00
Chris Lattner efbbedbf4a Fold bitconv(bitconv(x)) -> x. We now compile this:
void foo(double);
void bar(double X) { foo(X); }

to this:

bar:
        save -96, %o6, %o6
        or %g0, %i0, %o0
        or %g0, %i1, %o1
        call foo
        nop
        restore %g0, %g0, %g0
        retl
        nop

instead of this:

bar:
        save -112, %o6, %o6
        st %i1, [%i6+-4]
        st %i0, [%i6+-8]
        ldd [%i6+-8], %f0
        std %f0, [%i6+-16]
        ld [%i6+-12], %o1
        ld [%i6+-16], %o0
        call foo
        nop
        restore %g0, %g0, %g0
        retl
        nop

on V8.

llvm-svn: 24981
2005-12-23 05:37:50 +00:00
Chris Lattner a187460552 constant fold bits_convert in getNode and in the dag combiner for fp<->int
conversions.  This allows V8 to compiles this:

void %test() {
        call float %test2( float 1.000000e+00, float 2.000000e+00, double 3.000000e+00, double* null )
        ret void
}

into:

test:
        save -96, %o6, %o6
        sethi 0, %o3
        sethi 1049088, %o2
        sethi 1048576, %o1
        sethi 1040384, %o0
        or %g0, %o3, %o4
        call test2
        nop
        restore %g0, %g0, %g0
        retl
        nop

instead of:

test:
        save -112, %o6, %o6
        sethi 0, %o4
        sethi 1049088, %l0
        st %o4, [%i6+-12]
        st %l0, [%i6+-16]
        ld [%i6+-12], %o3
        ld [%i6+-16], %o2
        sethi 1048576, %o1
        sethi 1040384, %o0
        call test2
        nop
        restore %g0, %g0, %g0
        retl
        nop

llvm-svn: 24980
2005-12-23 05:30:37 +00:00
Chris Lattner 884eb3adc3 Fix a pasto
llvm-svn: 24973
2005-12-23 00:52:30 +00:00
Chris Lattner 9eae8d5d03 fix a thinko in the bit_convert handling code
llvm-svn: 24972
2005-12-23 00:50:25 +00:00
Chris Lattner 36e663d6e1 add very simple support for the BIT_CONVERT node
llvm-svn: 24970
2005-12-23 00:16:34 +00:00
Chris Lattner 177d7af5d5 remove dead code
llvm-svn: 24965
2005-12-22 21:16:08 +00:00
Chris Lattner 1408c05a8b The 81st column doesn't like code in it.
llvm-svn: 24943
2005-12-22 05:23:45 +00:00
Reid Spencer 2335fc2f44 Add an eol at the end to shut gcc sup.
llvm-svn: 24926
2005-12-22 01:41:00 +00:00
Evan Cheng 9cdc16c6d3 * Fix a GlobalAddress lowering bug.
* Teach DAG combiner about X86ISD::SETCC by adding a TargetLowering hook.

llvm-svn: 24921
2005-12-21 23:05:39 +00:00
Jim Laskey 9e296bee9a Disengage DEBUG_LOC from non-PPC targets.
llvm-svn: 24919
2005-12-21 20:51:37 +00:00
Evan Cheng c1583dbd63 * Added support for X86 RET with an additional operand to specify number of
bytes to pop off stack.
* Added support for X86 SETCC.

llvm-svn: 24917
2005-12-21 20:21:51 +00:00
Jim Laskey 7b52a923b8 Start of Dwarf framework.
llvm-svn: 24914
2005-12-21 19:48:16 +00:00
Chris Lattner 0fab459362 make sure to relegalize all cases
llvm-svn: 24911
2005-12-21 19:40:42 +00:00
Chris Lattner 44c07ed61a enable the gep isel opt
llvm-svn: 24910
2005-12-21 19:36:36 +00:00
Chris Lattner ac12f68424 fix a bug I introduced that broke recursive expansion of nodes (e.g. scalarizing vectors)
llvm-svn: 24905
2005-12-21 18:02:52 +00:00
Chris Lattner 803a575616 Lower ConstantAggregateZero into zeros
llvm-svn: 24890
2005-12-21 02:43:26 +00:00
Chris Lattner 434ffe49a9 Don't emit a null terminator, nor anything after it, to the ctor/dtor list
llvm-svn: 24887
2005-12-21 01:17:37 +00:00
Evan Cheng 6af02635a7 Added a hook to print out names of target specific DAG nodes.
llvm-svn: 24877
2005-12-20 06:22:03 +00:00
Chris Lattner 2af3ee4bdd Fix a nasty latent bug in the legalizer that was triggered by my patch
last night, breaking crafty and twolf.  Make sure that the newly found
legal nodes are themselves not re-legalized until the next iteration.

Also, since this functionality exists now, we can reduce number of legalizer
iterations by depending on this behavior instead of having to misuse 'do
another iteration' to get the same effect.

llvm-svn: 24875
2005-12-20 00:53:54 +00:00
Evan Cheng 6fc31046aa X86 conditional branch support.
llvm-svn: 24870
2005-12-19 23:12:38 +00:00
Evan Cheng 9fd9541367 Print out opcode number if it's an unknown target node.
llvm-svn: 24869
2005-12-19 23:11:49 +00:00
Chris Lattner 50b2d302d5 Fix a case where the DAG Combiner would accidentally CSE flag-producing nodes,
creating graphs that cannot be scheduled.

llvm-svn: 24866
2005-12-19 22:21:21 +00:00
Jim Laskey 9b9688aeb8 Amend comment.
llvm-svn: 24861
2005-12-19 16:32:26 +00:00
Jim Laskey ce23987e6b Create a strong dependency for loads following stores. This will leave a
latency period between the two.

llvm-svn: 24860
2005-12-19 16:30:13 +00:00
Chris Lattner c06da626b4 Make sure to relegalize new nodes
llvm-svn: 24843
2005-12-18 23:54:29 +00:00
Jeff Cohen c7cb351aac Keep VC++ happy.
llvm-svn: 24835
2005-12-18 22:20:05 +00:00
Chris Lattner ebcfa0c210 More corrections for flagged copyto/from reg
llvm-svn: 24828
2005-12-18 15:36:21 +00:00
Chris Lattner e3c67e97c7 legalize copytoreg and copyfromreg nodes that have flag operands correctly.
llvm-svn: 24826
2005-12-18 15:27:43 +00:00
Jim Laskey c97b7d0be9 Fix a bug Sabre was having where the DAG root was a group. The group dominator
needed to be added to the ordering list, not the first member of the group.

llvm-svn: 24816
2005-12-18 04:40:52 +00:00
Jim Laskey e220821deb Groups were not emitted if the dominator node and the node in the ordering list
were not the same node.  Ultimately the test was bogus.

llvm-svn: 24815
2005-12-18 03:59:21 +00:00
Chris Lattner cf12118965 Simplify code
llvm-svn: 24806
2005-12-18 01:03:46 +00:00
Chris Lattner bf0bd99e03 allow custom expansion of BR_CC
llvm-svn: 24804
2005-12-17 23:46:46 +00:00
Evan Cheng 225a4d0d6d X86 lowers SELECT to a cmp / test followed by a conditional move.
llvm-svn: 24754
2005-12-17 01:21:05 +00:00
Jim Laskey 7c462768ed Added source file/line correspondence for dwarf (PowerPC only at this point.)
llvm-svn: 24748
2005-12-16 22:45:29 +00:00
Chris Lattner 83e4407379 Don't create SEXTLOAD/ZEXTLOAD instructions that the target doesn't support
if after legalize.  This fixes IA64 failures.

llvm-svn: 24725
2005-12-15 19:02:38 +00:00
Chris Lattner d39c60fcc8 When folding loads into ops, immediately replace uses of the op with the
load.  This reduces number of worklist iterations and avoid missing optimizations
depending on folding of things into sext_inreg nodes (which aren't supported by
all targets).
Tested by Regression/CodeGen/X86/extend.ll:test2

llvm-svn: 24712
2005-12-14 19:25:30 +00:00
Chris Lattner 7dac1083da Fix the (zext (zextload)) case to trigger, similarly for sign extends.
Allow (zext (truncate)) to apply after legalize if the target supports
AND (which all do).

This compiles
short %foo() {
        %tmp.0 = load ubyte* %X         ; <ubyte> [#uses=1]
        %tmp.3 = cast ubyte %tmp.0 to short             ; <short> [#uses=1]
        ret short %tmp.3
}

to:
_foo:
        movzbl _X, %eax
        ret

instead of:

_foo:
        movzbl _X, %eax
        movzbl %al, %eax
        ret

thanks to Evan for pointing this out.

llvm-svn: 24709
2005-12-14 19:05:06 +00:00
Chris Lattner f753d1a574 Fix a miscompilation in crafty due to a recent patch
llvm-svn: 24706
2005-12-14 07:58:38 +00:00
Evan Cheng bce7c47306 Fold (zext (load x) to (zextload x).
llvm-svn: 24702
2005-12-14 02:19:23 +00:00
Chris Lattner 5d4e61dd87 Don't lump the filename and working dir together
llvm-svn: 24697
2005-12-13 17:40:33 +00:00
Chris Lattner f0e9aef954 Add a couple more fields, move ctor init list to .cpp file, add support
for emitting the ctor/dtor list for common targets.

llvm-svn: 24694
2005-12-13 06:32:10 +00:00
Nate Begeman 956aef45c9 Lowering constant pool entries on ppc exposed a bug in the recently added
ConstantVec legalizing code, which would return constantpool nodes that
were not of the target's pointer type.

llvm-svn: 24691
2005-12-13 03:03:23 +00:00
Chris Lattner 9e8b633ec1 Accept and ignore prefetches for now
llvm-svn: 24678
2005-12-12 22:51:16 +00:00
Chris Lattner b42ce7ca63 Fix CodeGen/Generic/2005-12-12-ExpandSextInreg.ll
llvm-svn: 24677
2005-12-12 22:27:43 +00:00
Chris Lattner f1a54c0d14 Minor tweak to get isel opt
llvm-svn: 24663
2005-12-11 09:05:13 +00:00
Nate Begeman 4e56db674c Add support for TargetConstantPool nodes to the dag isel emitter, and use
them in the PPC backend, to simplify some logic out of Select and
SelectAddr.

llvm-svn: 24657
2005-12-10 02:36:00 +00:00
Evan Cheng dadc1057ac Added new getNode and getTargetNode variants for X86 stores.
llvm-svn: 24653
2005-12-10 00:37:58 +00:00
Chris Lattner a6f835f5a0 Avoid emitting two tabs when switching to a named section
llvm-svn: 24646
2005-12-09 19:28:49 +00:00
Chris Lattner 268d457b69 Teach legalize how to promote sext_inreg to fix a problem Andrew pointed
out to me.

llvm-svn: 24644
2005-12-09 17:32:47 +00:00
Chris Lattner be73d6eece improve code insertion in two ways:
1. Only forward subst offsets into loads and stores, not into arbitrary
   things, where it will likely become a load.
2. If the source is a cast from pointer, forward subst the cast as well,
   allowing us to fold the cast away (improving cases when the cast is
   from an alloca or global).

This hasn't been fully tested, but does appear to further reduce register
pressure and improve code.  Lets let the testers grind on it a bit. :)

llvm-svn: 24640
2005-12-08 08:00:12 +00:00
Nate Begeman ae89d862f5 Fix a crash where ConstantVec nodes were being generated with the wrong
type when the target did not support them.  Also teach Legalize how to
expand ConstantVecs.

This allows us to generate

_test:
        lwz r2, 12(r3)
        lwz r4, 8(r3)
        lwz r5, 4(r3)
        lwz r6, 0(r3)
        addi r2, r2, 4
        addi r4, r4, 3
        addi r5, r5, 2
        addi r6, r6, 1
        stw r2, 12(r3)
        stw r4, 8(r3)
        stw r5, 4(r3)
        stw r6, 0(r3)
        blr

For:

void %test(%v4i *%P) {
        %T = load %v4i* %P
        %S = add %v4i %T, <int 1, int 2, int 3, int 4>
        store %v4i %S, %v4i * %P
        ret void
}

On PowerPC.

llvm-svn: 24633
2005-12-07 19:48:11 +00:00
Chris Lattner 57c882edf8 Only transform (sext (truncate x)) -> (sextinreg x) if before legalize or
if the target supports the resultant sextinreg

llvm-svn: 24632
2005-12-07 18:02:05 +00:00
Chris Lattner cbd3d01a43 Teach the dag combiner to turn a truncate/sign_extend pair into a sextinreg
when the types match up.  This allows the X86 backend to compile:

sbyte %toggle_value(sbyte* %tmp.1) {
        %tmp.2 = load sbyte* %tmp.1
        ret sbyte %tmp.2
}

to this:

_toggle_value:
        mov %EAX, DWORD PTR [%ESP + 4]
        movsx %EAX, BYTE PTR [%EAX]
        ret

instead of this:

_toggle_value:
        mov %EAX, DWORD PTR [%ESP + 4]
        movsx %EAX, BYTE PTR [%EAX]
        movsx %EAX, %AL
        ret

noticed in Shootout/objinst.

-Chris

llvm-svn: 24630
2005-12-07 07:11:03 +00:00
Nate Begeman 41b1cdc771 Teach the SelectionDAG ISel how to turn ConstantPacked values into
constant nodes with vector types.  Also teach the asm printer how to print
ConstantPacked constant pool entries.  This allows us to generate altivec
code such as the following, which adds a vector constantto a packed float.

LCPI1_0:  <4 x float> < float 0.0e+0, float 0.0e+0, float 0.0e+0, float 1.0e+0 >
        .space  4
        .space  4
        .space  4
        .long   1065353216      ; float 1
        .text
        .align  4
        .globl  _foo
_foo:
        lis r2, ha16(LCPI1_0)
        la r2, lo16(LCPI1_0)(r2)
        li r4, 0
        lvx v0, r4, r2
        lvx v1, r4, r3
        vaddfp v0, v1, v0
        stvx v0, r4, r3
        blr

For the llvm code:

void %foo(<4 x float> * %a) {
entry:
  %tmp1 = load <4 x float> * %a;
  %tmp2 = add <4 x float> %tmp1, < float 0.0, float 0.0, float 0.0, float 1.0 >
  store <4 x float> %tmp2, <4 x float> *%a
  ret void
}

llvm-svn: 24616
2005-12-06 06:18:55 +00:00
Chris Lattner 3539778883 Fix the #1 code quality problem that I have seen on X86 (and it also affects
PPC and other targets).  In a particular, consider code like this:

struct Vector3 { double x, y, z; };
struct Matrix3 { Vector3 a, b, c; };
double dot(Vector3 &a, Vector3 &b) {
   return a.x * b.x  +  a.y * b.y  +  a.z * b.z;
}
Vector3 mul(Vector3 &a, Matrix3 &b) {
   Vector3 r;
   r.x = dot( a, b.a );
   r.y = dot( a, b.b );
   r.z = dot( a, b.c );
   return r;
}
void transform(Matrix3 &m, Vector3 *x, int n) {
   for (int i = 0; i < n; i++)
      x[i] = mul( x[i], m );
}

we compile transform to a loop with all of the GEP instructions for indexing
into 'm' pulled out of the loop (9 of them).  Because isel occurs a bb at a time
we are unable to fold the constant index into the loads in the loop, leading to
PPC code that looks like this:

LBB3_1: ; no_exit.preheader
        li r2, 0
        addi r6, r3, 64        ;; 9 values live across the loop body!
        addi r7, r3, 56
        addi r8, r3, 48
        addi r9, r3, 40
        addi r10, r3, 32
        addi r11, r3, 24
        addi r12, r3, 16
        addi r30, r3, 8
LBB3_2: ; no_exit
        lfd f0, 0(r30)
        lfd f1, 8(r4)
        fmul f0, f1, f0
        lfd f2, 0(r3)        ;; no constant indices folded into the loads!
        lfd f3, 0(r4)
        lfd f4, 0(r10)
        lfd f5, 0(r6)
        lfd f6, 0(r7)
        lfd f7, 0(r8)
        lfd f8, 0(r9)
        lfd f9, 0(r11)
        lfd f10, 0(r12)
        lfd f11, 16(r4)
        fmadd f0, f3, f2, f0
        fmul f2, f1, f4
        fmadd f0, f11, f10, f0
        fmadd f2, f3, f9, f2
        fmul f1, f1, f6
        stfd f0, 0(r4)
        fmadd f0, f11, f8, f2
        fmadd f1, f3, f7, f1
        stfd f0, 8(r4)
        fmadd f0, f11, f5, f1
        addi r29, r4, 24
        stfd f0, 16(r4)
        addi r2, r2, 1
        cmpw cr0, r2, r5
        or r4, r29, r29
        bne cr0, LBB3_2 ; no_exit

uh, yuck.  With this patch, we now sink the constant offsets into the loop, producing
this code:

LBB3_1: ; no_exit.preheader
        li r2, 0
LBB3_2: ; no_exit
        lfd f0, 8(r3)
        lfd f1, 8(r4)
        fmul f0, f1, f0
        lfd f2, 0(r3)
        lfd f3, 0(r4)
        lfd f4, 32(r3)       ;; much nicer.
        lfd f5, 64(r3)
        lfd f6, 56(r3)
        lfd f7, 48(r3)
        lfd f8, 40(r3)
        lfd f9, 24(r3)
        lfd f10, 16(r3)
        lfd f11, 16(r4)
        fmadd f0, f3, f2, f0
        fmul f2, f1, f4
        fmadd f0, f11, f10, f0
        fmadd f2, f3, f9, f2
        fmul f1, f1, f6
        stfd f0, 0(r4)
        fmadd f0, f11, f8, f2
        fmadd f1, f3, f7, f1
        stfd f0, 8(r4)
        fmadd f0, f11, f5, f1
        addi r6, r4, 24
        stfd f0, 16(r4)
        addi r2, r2, 1
        cmpw cr0, r2, r5
        or r4, r6, r6
        bne cr0, LBB3_2 ; no_exit

This is much nicer as it reduces register pressure in the loop a lot.  On X86,
this takes the function from having 9 spilled registers to 2.  This should help
some spec programs on X86 (gzip?)

This is currently only enabled with -enable-gep-isel-opt to allow perf testing
tonight.

llvm-svn: 24606
2005-12-05 07:10:48 +00:00
Chris Lattner 8782b782cd dbg.stoppoint returns a value, don't forget to init it
llvm-svn: 24583
2005-12-03 18:50:48 +00:00
Andrew Lenharth f9b27d7011 bah, must generate all results
llvm-svn: 24574
2005-12-02 06:08:08 +00:00
Andrew Lenharth 73420b3795 cycle counter fix
llvm-svn: 24573
2005-12-02 04:56:24 +00:00
Chris Lattner 0142afd6c1 Don't remove two operand, two result nodes from the binary ops map. These
should come from the arbitrary ops map.

This fixes Regression/CodeGen/PowerPC/2005-12-01-Crash.ll

llvm-svn: 24571
2005-12-01 23:14:50 +00:00
Chris Lattner 05b0b4575b Promote line and column number information for our friendly 64-bit targets.
llvm-svn: 24568
2005-12-01 18:21:35 +00:00
Chris Lattner 9d0d715e83 This is a bugfix for SelectNodeTo. In certain situations, we could be
selecting a node and use a mix of getTargetNode() and SelectNodeTo.  Because
SelectNodeTo didn't check the CSE maps for a preexisting node and didn't insert
its result into the CSE maps, we would sometimes miss a CSE opportunity.

This is extremely rare, but worth fixing for completeness.

llvm-svn: 24565
2005-12-01 18:00:57 +00:00
Nate Begeman 006bb04f3a Support multiple ValueTypes per RegisterClass, needed for upcoming vector
work.  This change has no effect on generated code.

llvm-svn: 24563
2005-12-01 04:51:06 +00:00
Chris Lattner be5dd5da19 Make SelectNodeTo return N
llvm-svn: 24548
2005-11-30 22:45:14 +00:00
Chris Lattner c174048430 CALLSEQ_START/END nodes don't get memoized, do not add them in when
replaceAllUses'ing.

llvm-svn: 24539
2005-11-30 18:20:52 +00:00
Andrew Lenharth 6ee8566cae At long last, you can say that f32 isn't supported for setcc
llvm-svn: 24537
2005-11-30 17:12:26 +00:00
Nate Begeman 1064d6ec43 First chunk of actually generating vector code for packed types. These
changes allow us to generate the following code:

_foo:
        li r2, 0
        lvx v0, r2, r3
        vaddfp v0, v0, v0
        stvx v0, r2, r3
        blr

for this llvm:

void %foo(<4 x float>* %a) {
entry:
        %tmp1 = load <4 x float>* %a
        %tmp2 = add <4 x float> %tmp1, %tmp1
        store <4 x float> %tmp2, <4 x float>* %a
        ret void
}

llvm-svn: 24534
2005-11-30 08:22:07 +00:00
Andrew Lenharth 8d17c70171 add support for custom lowering SINT_TO_FP
llvm-svn: 24531
2005-11-30 06:43:03 +00:00
Reid Spencer 3fd1b4c9bf Fix a problem with llvm-ranlib that (on some platforms) caused the archive
file to become corrupted due to interactions between mmap'd memory segments
and file descriptors closing. The problem is completely avoiding by using
a third temporary file.

Patch provided by Evan Jones

llvm-svn: 24527
2005-11-30 05:21:10 +00:00
Evan Cheng 11d61613af Fixed a bug introduced by my last commit: TargetGlobalValues should key on
GlobalValue * and index pair. Update getGlobalAddress() for symmetry.

llvm-svn: 24524
2005-11-30 02:49:21 +00:00
Evan Cheng 0e0de2f3f0 Added an index field to GlobalAddressSDNode so it can represent X+12, etc.
llvm-svn: 24523
2005-11-30 02:04:11 +00:00
Chris Lattner 435b402e1f Add support for a new STRING and LOCATION node for line number support, patch
contributed by Daniel Berlin, with a few cleanups here and there by me.

llvm-svn: 24515
2005-11-29 06:21:05 +00:00
Nate Begeman 89b049af90 Add the majority of the vector machien value types we expect to support,
and make a few changes to the legalization machinery to support more than
16 types.

llvm-svn: 24511
2005-11-29 05:45:29 +00:00
Nate Begeman d37c13154a Check in code to scalarize arbitrarily wide packed types for some simple
vector operations (load, add, sub, mul).

This allows us to codegen:
void %foo(<4 x float> * %a) {
entry:
  %tmp1 = load <4 x float> * %a;
  %tmp2 = add <4 x float> %tmp1, %tmp1
  store <4 x float> %tmp2, <4 x float> *%a
  ret void
}

on ppc as:
_foo:
        lfs f0, 12(r3)
        lfs f1, 8(r3)
        lfs f2, 4(r3)
        lfs f3, 0(r3)
        fadds f0, f0, f0
        fadds f1, f1, f1
        fadds f2, f2, f2
        fadds f3, f3, f3
        stfs f0, 12(r3)
        stfs f1, 8(r3)
        stfs f2, 4(r3)
        stfs f3, 0(r3)
        blr

llvm-svn: 24484
2005-11-22 18:16:00 +00:00
Nate Begeman 07890bbec4 Rather than attempting to legalize 1 x float, make sure the SD ISel never
generates it.  Make MVT::Vector expand-only, and remove the code in
Legalize that attempts to legalize it.

The plan for supporting N x Type is to continually epxand it in ExpandOp
until it gets down to 2 x Type, where it will be scalarized into a pair of
scalars.

llvm-svn: 24482
2005-11-22 01:29:36 +00:00
Duraid Madina f28b3bd8b4 I think I know what you meant here, but just to be safe I'll let you
do it. :)

    <_sabre_> excuses excuses

llvm-svn: 24471
2005-11-21 14:09:40 +00:00
Chris Lattner f2991cee1f Allow target to customize directive used to switch to arbitrary section in SwitchSection,
add generic constant pool emitter

llvm-svn: 24464
2005-11-21 08:25:09 +00:00
Chris Lattner 08adbd13ff increment the function number in SetupMachineFunction
llvm-svn: 24461
2005-11-21 08:13:27 +00:00
Chris Lattner bb644e39c0 Adjust to capitalized asmprinter method names
llvm-svn: 24457
2005-11-21 07:51:36 +00:00
Chris Lattner 2ea5c99eca Add section switching to common code generator code. Add a couple of
asserts.

llvm-svn: 24445
2005-11-21 07:06:27 +00:00
Chris Lattner 44c28c22b7 Legalize MERGE_VALUES, expand READCYCLECOUNTER correctly, so it doesn't
break control dependence.

llvm-svn: 24437
2005-11-20 22:56:56 +00:00
Andrew Lenharth 627cbd49b1 The first patch of X86 support for read cycle counter
llvm-svn: 24429
2005-11-20 21:32:07 +00:00
Chris Lattner a8d37d748f more progress towards bug 291 being finished. Patch by Owen Anderson,
HAVE_GV case fixed up by me.

llvm-svn: 24428
2005-11-20 03:45:52 +00:00
Chris Lattner 19baba67b5 Unbreak codegen of bools. This should fix the llc/jit/llc-beta failures
from last night.

llvm-svn: 24427
2005-11-19 18:40:42 +00:00
Chris Lattner 377bdbff91 Improve Selection DAG printer portability. Patch by Owen Anderson!
llvm-svn: 24425
2005-11-19 07:44:09 +00:00
Chris Lattner a22eae0163 Teach the graph viewer to handle register operands that are zero.
llvm-svn: 24421
2005-11-19 06:58:46 +00:00
Chris Lattner 301015a703 Silence a bogus warning
llvm-svn: 24420
2005-11-19 05:51:46 +00:00
Chris Lattner f090f7eb0e Add some method variants, patch by Evan Cheng
llvm-svn: 24418
2005-11-19 01:44:53 +00:00
Nate Begeman b2e089c31b Teach LLVM how to scalarize packed types. Currently, this only works on
packed types with an element count of 1, although more generic support is
coming.  This allows LLVM to turn the following code:

void %foo(<1 x float> * %a) {
entry:
  %tmp1 = load <1 x float> * %a;
  %tmp2 = add <1 x float> %tmp1, %tmp1
  store <1 x float> %tmp2, <1 x float> *%a
  ret void
}

Into:

_foo:
        lfs f0, 0(r3)
        fadds f0, f0, f0
        stfs f0, 0(r3)
        blr

llvm-svn: 24416
2005-11-19 00:36:38 +00:00
Nate Begeman 127321b14c Split out the shift code from visitBinary.
llvm-svn: 24412
2005-11-18 07:42:56 +00:00
Chris Lattner 45ca1c0194 Allow targets to custom legalize leaf nodes like GlobalAddress.
llvm-svn: 24387
2005-11-17 06:41:44 +00:00
Chris Lattner 4ff65ec745 Teach legalize about targetglobaladdress
llvm-svn: 24385
2005-11-17 05:52:24 +00:00
Chris Lattner f2b62f317c when debugging lower dbg intrinsics to calls
llvm-svn: 24377
2005-11-16 07:22:30 +00:00
Chris Lattner bba9c372c1 Remove extraneous parents around constants when using a constant expr cast.
llvm-svn: 24357
2005-11-15 00:03:16 +00:00
Chris Lattner dd8eeed096 Teach emitAlignment to handle explicit alignment requests by globals.
llvm-svn: 24354
2005-11-14 19:00:06 +00:00
Jeff Cohen cf1f782a2f Fix operator precedence bug caught by VC++.
llvm-svn: 24318
2005-11-12 00:59:01 +00:00
Andrew Lenharth de1b5d6baa added a chain output
llvm-svn: 24306
2005-11-11 22:48:54 +00:00
Andrew Lenharth 01aa56397d continued readcyclecounter support
llvm-svn: 24300
2005-11-11 16:47:30 +00:00
Chris Lattner 4f827446da nuke blank line
llvm-svn: 24278
2005-11-10 18:49:46 +00:00
Chris Lattner c0a1eba0ab Get rid of casts by #including the right header
llvm-svn: 24275
2005-11-10 18:36:17 +00:00
Chris Lattner 747960d21e Compile C strings to:
l1__2E_str_1:                           ; '.str_1'
        .asciz  "foo"

not:

        .align  0
l1__2E_str_1:                           ; '.str_1'
        .asciz  "foo"

llvm-svn: 24273
2005-11-10 18:09:27 +00:00
Chris Lattner 55a6d9067b add support for .asciz, and enable it by default. If your target assemblerdoesn't support .asciz, just set AscizDirective to null in your asmprinter.
This compiles C strings to:

l1__2E_str_1:                           ; '.str_1'
        .asciz  "foo"

instead of:

l1__2E_str_1:                           ; '.str_1'
        .ascii  "foo\000"

llvm-svn: 24272
2005-11-10 18:06:33 +00:00
Chris Lattner bf4f233214 Switch the allnodes list from a vector of pointers to an ilist of nodes.This eliminates the vector, allows constant time removal of a node froma graph, and makes iteration over the all nodes list stable when adding
nodes to the graph.

llvm-svn: 24263
2005-11-09 23:47:37 +00:00
Chris Lattner cd6f0f47f2 Refactor intrinsic lowering stuff out of visitCall
llvm-svn: 24261
2005-11-09 19:44:01 +00:00
Chris Lattner af3aefa10e Handle the trivial (but common) two-op case more efficiently
llvm-svn: 24259
2005-11-09 18:48:57 +00:00
Chris Lattner 619dfaa42b Nuke noop copies.
llvm-svn: 24258
2005-11-09 18:22:42 +00:00
Chris Lattner 41fd6d5d27 Fix CodeGen/X86/shift-folding.ll:test3 on X86
llvm-svn: 24256
2005-11-09 16:50:40 +00:00
Chris Lattner 35ecaa76fa Disable some overly-aggressive checking code. This speeds up the local
allocator from 23s to 11s on kc++ in debug mode.

llvm-svn: 24255
2005-11-09 05:28:45 +00:00
Chris Lattner b7cad90e55 Avoid creating a token factor node in trivially redundant cases. This
eliminates almost one node per block in common cases.

llvm-svn: 24254
2005-11-09 05:03:03 +00:00
Chris Lattner 43535a19b1 Handle GEP's a bit more intelligently. Fold constant indices early and
turn power-of-two multiplies into shifts early to improve compile time.

llvm-svn: 24253
2005-11-09 04:45:33 +00:00
Chris Lattner c4d6050db6 Allocate the right amount of memory for this vector up front.
llvm-svn: 24252
2005-11-08 23:32:44 +00:00
Chris Lattner 88fa11c3d5 Change the ValueList array for each node to be shared instead of individuallyallocated. Further, in the common case where a node has a single value, justreference an element from a small array. This is a small compile-time win.
llvm-svn: 24251
2005-11-08 23:30:28 +00:00
Chris Lattner 7e4b5d33cb Switch the operandlist/valuelist from being vectors to being just an array.This saves 12 bytes from SDNode, but doesn't speed things up substantially
(our graphs apparently already fit within the cache on my g5).  In any case
this reduces memory usage.

llvm-svn: 24249
2005-11-08 22:07:03 +00:00
Chris Lattner 3ba38cba64 Explicitly initialize some instance vars
llvm-svn: 24247
2005-11-08 21:54:57 +00:00
Chris Lattner aba48dd34c Clean up RemoveDeadNodes significantly, by eliminating the need for a temporary
set and eliminating the need to iterate whenever something is removed (which
can be really slow in some cases).  Thx to Jim for pointing out something silly
I was getting stuck on. :)

llvm-svn: 24241
2005-11-08 18:52:27 +00:00
Jim Laskey 1d2f26adcc Let's try ignoring resource utilization on the backward pass.
llvm-svn: 24231
2005-11-07 19:08:53 +00:00
Chris Lattner 629ba44e50 Always compute max align.
llvm-svn: 24227
2005-11-06 17:43:20 +00:00
Nate Begeman 3ee3e69556 Add the necessary support to the ISel to allow targets to codegen the new
alignment information appropriately.  Includes code for PowerPC to support
fixed-size allocas with alignment larger than the stack.  Support for
arbitrarily aligned dynamic allocas coming soon.

llvm-svn: 24224
2005-11-06 09:00:38 +00:00
Jim Laskey 904dbb4a27 Fix logic bug in finding retry slot in tally.
llvm-svn: 24188
2005-11-05 00:01:25 +00:00
Jim Laskey ded4759d81 Fix a warning
llvm-svn: 24187
2005-11-04 18:26:02 +00:00
Jim Laskey e682b677c1 Scheduling now uses itinerary data.
llvm-svn: 24180
2005-11-04 04:05:35 +00:00
Nate Begeman ee065281e8 Fix a crash that Andrew noticed, and add a pair of braces to unfconfuse
XCode's indenting.

llvm-svn: 24159
2005-11-02 18:42:59 +00:00
Chris Lattner 17df608719 Fix a source of undefined behavior when dealing with 64-bit types. This
may fix PR652.  Thanks to Andrew for tracking down the problem.

llvm-svn: 24145
2005-11-02 01:47:04 +00:00
Jim Laskey 5ce0538253 1. Embed and not inherit vector for NodeGroup.
2. Iterate operands and not uses (performance.)

3. Some long pending comment changes.

llvm-svn: 24119
2005-10-31 12:49:09 +00:00
Chris Lattner 6871b23d02 Significantly simplify this code and make it more aggressive. Instead of having
a special case hack for X86, make the hack more general: if an incoming argument
register is not used in any block other than the entry block, don't copy it to
a vreg.  This helps us compile code like this:

%struct.foo = type { int, int, [0 x ubyte] }
int %test(%struct.foo* %X) {
        %tmp1 = getelementptr %struct.foo* %X, int 0, uint 2, int 100
        %tmp = load ubyte* %tmp1                ; <ubyte> [#uses=1]
        %tmp2 = cast ubyte %tmp to int          ; <int> [#uses=1]
        ret int %tmp2
}

to:

_test:
        lbz r3, 108(r3)
        blr

instead of:

_test:
        lbz r2, 108(r3)
        or r3, r2, r2
        blr

The (dead) copy emitted to copy r3 into a vreg for extra-block uses was
increasing the live range of r3 past the load, preventing the coallescing.

This implements CodeGen/PowerPC/reg-coallesce-simple.ll

llvm-svn: 24115
2005-10-30 19:42:35 +00:00
Chris Lattner dd5663dfa0 Reduce the number of copies emitted as machine instructions by
generating results in vregs that will need them.  In the case of something
like this:  CopyToReg((add X, Y), reg1024), we no longer emit code like
this:

   reg1025 = add X, Y
   reg1024 = reg 1025

Instead, we emit:

   reg1024 = add X, Y

Whoa! :)

llvm-svn: 24111
2005-10-30 18:54:27 +00:00
Chris Lattner a70878d4fb Codegen mul by negative power of two with a shift and negate.
This implements test/Regression/CodeGen/PowerPC/mul-neg-power-2.ll,
producing:

_foo:
        slwi r2, r3, 1
        subfic r3, r2, 63
        blr

instead of:

_foo:
        mulli r2, r3, -2
        addi r3, r2, 63
        blr

llvm-svn: 24106
2005-10-30 06:41:49 +00:00
Chris Lattner 4b6d583d7a Fix DSE to not nuke dead stores unless they redundant store is the same
VT as the killing one.  Fix fixes PR491

llvm-svn: 24034
2005-10-27 07:10:34 +00:00
Chris Lattner d8c5c066a1 Add a simple xform that is useful for bitfield operations.
llvm-svn: 24029
2005-10-27 05:06:38 +00:00
Chris Lattner 3c7974aade Fix some spello's pointed out by Gabor Greif
llvm-svn: 24019
2005-10-26 18:41:41 +00:00
Nate Begeman d8f2a1a0f3 Allow custom lowered FP_TO_SINT ops in the check for whether a larger
FP_TO_SINT is preferred to a larger FP_TO_UINT.  This seems to be begging
for a TLI.isOperationCustom() helper function.

llvm-svn: 23992
2005-10-25 23:47:25 +00:00
Chris Lattner 3b409a85eb Clear a bit in this file that was causing a miscompilation of 178.galgel.
llvm-svn: 23980
2005-10-25 18:57:30 +00:00
Chris Lattner 476b8ddd55 Alkis agrees that that iterative scan allocator isn't going to be worked on
in the future, remove it.

llvm-svn: 23952
2005-10-24 04:14:30 +00:00
Jeff Cohen 11e26b52b2 When a function takes a variable number of pointer arguments, with a zero
pointer marking the end of the list, the zero *must* be cast to the pointer
type.  An un-cast zero is a 32-bit int, and at least on x86_64, gcc will
not extend the zero to 64 bits, thus allowing the upper 32 bits to be
random junk.

The new END_WITH_NULL macro may be used to annotate a such a function
so that GCC (version 4 or newer) will detect the use of un-casted zero
at compile time.

llvm-svn: 23888
2005-10-23 04:37:20 +00:00
Andrew Lenharth 4b3932aa89 add TargetExternalSymbol
llvm-svn: 23886
2005-10-23 03:40:17 +00:00
Chris Lattner 9faa5b7a9a BuildSDIV and BuildUDIV only work for i32/i64, but they don't check that
the input is that type, this caused a failure on gs on X86 last night.
Move the hard checks into Build[US]Div since that is where decisions like
this should be made.

llvm-svn: 23881
2005-10-22 18:50:15 +00:00
Chris Lattner 75ea5b10bf add a case missing from the dag combiner that exposed the failure on
2005-10-21-longlonggtu.ll.

llvm-svn: 23875
2005-10-21 21:23:25 +00:00
Chris Lattner e95b5745c0 Make the coallescer a bit smarter, allowing it to join more live ranges.
For example, we can now join things like [0-30:0)[31-40:1)[52-59:2)
with [40:60:0) if the 52-59 range is defined by a copy from the 40-60 range.
The resultant range ends up being [0-30:0)[31-60:1).

This fires a lot through-out the test suite (e.g. shrinking bc from
19492 -> 18509 machineinstrs) though most gains are smaller (e.g. about
50 copies eliminated from crafty).

llvm-svn: 23866
2005-10-21 06:49:50 +00:00
Chris Lattner 76c97afbbc Fix LiveInterval::getOverlapingRanges to take things in the right order
(an unused method).

Fix the merger so that it can merge ranges like this  [10:12)[16:40) with
[12:38) into [10:40) instead of bogus ranges.  This sort of input will be
possible for the merger coming shortly

llvm-svn: 23865
2005-10-21 06:41:30 +00:00
Nate Begeman 8f62cd32ad Fix a typo in the dag combiner, so that this can work on i64 targets
llvm-svn: 23856
2005-10-21 01:51:45 +00:00
Nate Begeman 4dd383120f Invert the TargetLowering flag that controls divide by consant expansion.
Add a new flag to TargetLowering indicating if the target has really cheap
  signed division by powers of two, make ppc use it.  This will probably go
  away in the future.
Implement some more ISD::SDIV folds in the dag combiner
Remove now dead code in the x86 backend.

llvm-svn: 23853
2005-10-21 00:02:42 +00:00
Chris Lattner b7b75e1b68 Fix a conditional so we don't access past the end of the range. Thanks to
Andrew for bringing this to my attn.

llvm-svn: 23850
2005-10-20 22:50:10 +00:00
Nate Begeman 7efe53d90b Fix a couple bugs in the const div stuff where we'd generate MULHS/MULHU
for types that aren't legal, and fail a divisor is less than zero
comparison, which would cause us to drop a subtract.

llvm-svn: 23846
2005-10-20 17:45:03 +00:00
Chris Lattner a6efeb01f9 don't use llabs with apparently VC++ doesn't have
llvm-svn: 23845
2005-10-20 17:01:00 +00:00
Chris Lattner 35852fc391 Fix order of eval problem from when I refactored this into a function.
llvm-svn: 23844
2005-10-20 16:56:40 +00:00
Chris Lattner 3cf40798ab add a new method, play around with some code.
Fix a *bug* in the extendIntervalEndTo method.  In particular, if adding
[2:10) to an interval containing [0:2),[10:30), we produced [0:10),[10,30).
Which is not the most smart thing to do.  Now produce [0:30).

llvm-svn: 23841
2005-10-20 07:39:25 +00:00
Chris Lattner 8816353040 Refactor some code, pulling it out into a function. No functionality change.
llvm-svn: 23839
2005-10-20 06:06:30 +00:00
Nate Begeman c6f067a8c4 Move the target constant divide optimization up into the dag combiner, so
that the nodes can be folded with other nodes, and we can not duplicate
code in every backend.  Alpha will probably want this too.

llvm-svn: 23835
2005-10-20 02:15:44 +00:00
Nate Begeman 5172ce641e Teach Legalize how to do something with EXTRACT_ELEMENT when the type of
the pair of elements is a legal type.

llvm-svn: 23804
2005-10-19 00:06:56 +00:00
Nate Begeman 78afac2ddd Add the ability to lower return instructions to TargetLowering. This
allows us to lower legal return types to something else, to meet ABI
requirements (such as that i64 be returned in two i32 regs on Darwin/ppc).

llvm-svn: 23802
2005-10-18 23:23:37 +00:00
Chris Lattner 0a71a9ac86 Fix Generic/2005-10-18-ZeroSizeStackObject.ll by not requesting a zero
sized stack object if either the array size or the type size is zero.

llvm-svn: 23801
2005-10-18 22:14:06 +00:00
Chris Lattner 8396a308a7 remove hack
llvm-svn: 23797
2005-10-18 22:11:42 +00:00
Chris Lattner 6c14c35bd7 Fold (select C, load A, load B) -> load (select C, A, B). This happens quite
a lot throughout many programs.  In particular, specfp triggers it a bunch for
constant FP nodes when you have code like  cond ? 1.0 : -1.0.

If the PPC ISel exposed the loads implicit in pic references to external globals,
we would be able to eliminate a load in cases like this as well:

%X = external global int
%Y = external global int
int* %test4(bool %C) {
        %G = select bool %C, int* %X, int* %Y
        ret int* %G
}

Note that this breaks things that use SrcValue's (see the fixme), but since nothing
uses them yet, this is ok.

Also, simplify some code to use hasOneUse() on an SDOperand instead of hasNUsesOfValue directly.

llvm-svn: 23781
2005-10-18 06:04:22 +00:00
Nate Begeman 418c6e4045 Implement some feedback from Chris re: constant canonicalization
llvm-svn: 23777
2005-10-18 00:28:13 +00:00
Nate Begeman bd5f41a6a6 Legalize BUILD_PAIR appropriately for upcoming 64 bit PowerPC work.
llvm-svn: 23776
2005-10-18 00:27:41 +00:00
Nate Begeman ec48a1bfbd fold fmul X, +2.0 -> fadd X, X;
llvm-svn: 23774
2005-10-17 20:40:11 +00:00
Chris Lattner eeb2bda2fa add a trivial fold
llvm-svn: 23764
2005-10-17 01:07:11 +00:00
Chris Lattner e540800d5a Fix this logic.
llvm-svn: 23756
2005-10-15 22:35:40 +00:00
Chris Lattner 17cc9edd33 Add a case we were missing that was causing us to fail CodeGen/PowerPC/rlwinm.ll:test3
llvm-svn: 23755
2005-10-15 22:18:08 +00:00
Chris Lattner b986f471be Use getExtLoad here instead of getNode, as extloads produce two values. This
fixes a legalize failure on SPASS for itanium.

llvm-svn: 23747
2005-10-15 20:24:07 +00:00
Nate Begeman 6e673b24d3 fold sext_in_reg, sext_in_reg where both have the same VT. This was
popping up in Fourinarow.

llvm-svn: 23722
2005-10-14 01:29:07 +00:00
Nate Begeman d59e5a7abb Relax the checking on zextload generation a bit, since as sabre pointed out
you could be AND'ing with the result of a shift that shifts out all the
bits you care about, in addition to a constant.

Also, move over an add/sub_parts fold from legalize to the dag combiner,
where it works for things other than constants.  Woot!

llvm-svn: 23720
2005-10-14 01:12:21 +00:00
Chris Lattner b8282987f4 Fix the trunc(load) case, finally allowing crafty and povray to pass
llvm-svn: 23718
2005-10-13 22:10:05 +00:00
Chris Lattner dbc5ae3109 Fix some bugs in (sext (load x))
llvm-svn: 23717
2005-10-13 21:52:31 +00:00
Chris Lattner 258521d7ea When ExpandOp'ing a [SZ]EXTLOAD, make sure to remember that the chain
is also legal.  Add support for ExpandOp'ing raw EXTLOADs too.

llvm-svn: 23716
2005-10-13 21:44:47 +00:00
Chris Lattner d23f4b7411 Implement PromoteOp for *EXTLOAD, allowing MallocBench/gs to Legalize
llvm-svn: 23715
2005-10-13 20:07:41 +00:00
Nate Begeman 8e022b3d89 Fix the remaining DAGCombiner issues pointed out by sabre. This should fix
the remainder of the failures introduced by my patch last night.

llvm-svn: 23714
2005-10-13 18:34:58 +00:00
Chris Lattner a80f1f6e72 Fix a minor bug in the dag combiner that broke pcompress2 and some other
tests.

llvm-svn: 23713
2005-10-13 18:16:34 +00:00
Nate Begeman c3a89c5259 Add support to Legalize for expanding i64 sextload/zextload into hi and lo
parts. This should fix the crafty and signed long long unit test failure
on x86 last night.

llvm-svn: 23711
2005-10-13 17:15:37 +00:00
Jim Laskey 5d7a50ac44 Inhibit instructions from being pushed before function calls. This will
minimize unnecessary spilling.

llvm-svn: 23710
2005-10-13 16:44:00 +00:00
Nate Begeman 02b23c6065 Move some Legalize functionality over to the DAGCombiner where it belongs.
Kill some dead code.

llvm-svn: 23706
2005-10-13 03:11:28 +00:00
Nate Begeman 70d28c5e32 Fix a potential bug with two combine-to's back to back that chris pointed
out, where after the first CombineTo() call, the node the second CombineTo
wishes to replace may no longer exist.

Fix a very real bug with the truncated load optimization on little endian
targets, which do not need a byte offset added to the load.

llvm-svn: 23704
2005-10-12 23:18:53 +00:00
Nate Begeman 8caf81d617 More cool stuff for the dag combiner. We can now finally handle things
like turning:

_foo:
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        rlwinm r3, r2, 0, 16, 31
        blr

into
_foo:
        fctiwz f0,f1
        stfd f0,-8(r1)
        lhz r3,-2(r1)
        blr

Also removed an unncessary constraint from sra -> srl conversion, which
should take care of hte only reason we would ever need to handle sra in
MaskedValueIsZero, AFAIK.

llvm-svn: 23703
2005-10-12 20:40:40 +00:00
Jim Laskey 63b1419b74 Finally committing to the new scheduler. Still -sched=none by default.
llvm-svn: 23702
2005-10-12 18:29:35 +00:00
Jim Laskey d00db257c7 Added graphviz/gv support for MF.
llvm-svn: 23700
2005-10-12 12:09:05 +00:00
Chris Lattner 514f058be1 Fix a powerpc crash on CodeGen/Generic/llvm-ct-intrinsics.ll
llvm-svn: 23694
2005-10-11 17:56:34 +00:00
Chris Lattner c38fb8e2a1 Add a canonicalization that got lost, fixing PowerPC/fold-li.ll:SUB
llvm-svn: 23693
2005-10-11 06:07:15 +00:00
Chris Lattner cc6e53e6ee clean up some corner cases
llvm-svn: 23692
2005-10-10 23:00:08 +00:00
Chris Lattner 04c737091f Implement trivial DSE. If two stores are neighbors and store to the same
location, replace them with a new store of the last value.  This occurs
in the same neighborhood in 197.parser, speeding it up about 1.5%

llvm-svn: 23691
2005-10-10 22:31:19 +00:00
Chris Lattner e260ed8628 Add support for CombineTo, allowing the dag combiner to replace nodes with
multiple results.

Use this support to implement trivial store->load forwarding, implementing
CodeGen/PowerPC/store-load-fwd.ll.  Though this is the most simple case and
can be extended in the future, it is still useful.  For example, it speeds
up 197.parser by 6.2% by avoiding an LSU reject in xalloc:

        stw r6, lo16(l5_end_of_array)(r2)
        addi r2, r5, -4
        stwx r5, r4, r2
-       lwzx r5, r4, r2
-       rlwinm r5, r5, 0, 0, 30
        stwx r5, r4, r2
        lwz r2, -4(r4)
        ori r2, r2, 1

llvm-svn: 23690
2005-10-10 22:04:48 +00:00
Nate Begeman 6828ed9bfd Teach the DAGCombiner several new tricks, teaching it how to turn
sext_inreg into zext_inreg based on the signbit (fires a lot), srem into
urem, etc.

llvm-svn: 23688
2005-10-10 21:26:48 +00:00
Chris Lattner 7730924067 Fix comment
llvm-svn: 23686
2005-10-10 16:52:03 +00:00
Chris Lattner 3d1d4a3d12 Add ISD::ADD to MaskedValueIsZero
llvm-svn: 23685
2005-10-10 16:51:40 +00:00
Chris Lattner 56e44a6da5 This function is now dead
llvm-svn: 23684
2005-10-10 16:49:22 +00:00
Chris Lattner bcfebebf22 Enable Nate's excellent DAG combiner work by default. This allows the
removal of a bunch of ad-hoc and crufty code from SelectionDAG.cpp.

llvm-svn: 23682
2005-10-10 16:47:10 +00:00
Chris Lattner 6a49b7cabb add a todo for something I noticed
llvm-svn: 23679
2005-10-09 22:59:08 +00:00
Chris Lattner 1d3dc00674 (X & Y) & C == 0 if either X&C or Y&C are zero
llvm-svn: 23678
2005-10-09 22:12:36 +00:00
Chris Lattner 0832f2635a When emiting a CopyFromReg and the source is already a vreg, do not bother
creating a new vreg and inserting a copy: just use the input vreg directly.

This speeds up the compile (e.g. about 5% on mesa with a debug build of llc)
by not adding a bunch of copies and vregs to be coallesced away.  On mesa,
for example, this reduces the number of intervals from 168601 to 129040
going into the coallescer.

llvm-svn: 23671
2005-10-09 05:58:56 +00:00
Nate Begeman 2042aa5b92 Lo and behold, the last bits of SelectionDAG.cpp have been moved over.
llvm-svn: 23665
2005-10-08 00:29:44 +00:00
Chris Lattner be4bbca0ba remove debugging code
llvm-svn: 23663
2005-10-07 15:31:26 +00:00
Chris Lattner fb12624a3f implement CodeGen/PowerPC/div-2.ll:test2-4 by propagating zero bits through
C-X's

llvm-svn: 23662
2005-10-07 15:30:32 +00:00
Chris Lattner b27a4147d3 fix indentation
llvm-svn: 23660
2005-10-07 06:37:02 +00:00
Chris Lattner 5bcd0dd811 Turn sdivs into udivs when we can prove the sign bits are clear. This
implements CodeGen/PowerPC/div-2.ll

llvm-svn: 23659
2005-10-07 06:10:46 +00:00
Chris Lattner 7bf8d06f02 silence a bogus GCC warning
llvm-svn: 23646
2005-10-06 17:39:10 +00:00
Chris Lattner fabe55f155 Fix the LLC regressions on X86 last night. In particular, when undoing
previous copy elisions and we discover we need to reload a register, make
sure to use the regclass of the original register for the reload, not the
class of the current register.  This avoid using 16-bit loads to reload 32-bit
values.

llvm-svn: 23645
2005-10-06 17:19:06 +00:00
Chris Lattner 4bbbb9eed7 Make the legalizer completely non-recursive
llvm-svn: 23642
2005-10-06 01:20:27 +00:00
Nate Begeman 558beb3729 Let the combiner handle more cases
llvm-svn: 23641
2005-10-05 21:44:43 +00:00
Nate Begeman f8221c5e2c Remove some bad code from Legalize
llvm-svn: 23640
2005-10-05 21:44:10 +00:00
Nate Begeman bd7df030d2 Check in some more DAGCombiner pieces
llvm-svn: 23639
2005-10-05 21:43:42 +00:00
Chris Lattner 55149d7835 Fix a bug in the local spiller, where we could take code like this:
store r12 -> [ss#2]
  R3 = load [ss#1]
  use R3
  R3 = load [ss#2]
  R4 = load [ss#1]

and turn it into this code:

  store R12 -> [ss#2]
  R3 = load [ss#1]
  use R3
  R3 = R12
  R4 = R3    <- oops!

The problem was that promoting R3 = load[ss#2] to a copy missed the fact that
the instruction invalidated R3 at that point.

llvm-svn: 23638
2005-10-05 18:30:19 +00:00
Chris Lattner a49e16fefa implement visitBR_CC so that PowerPC/inverted-bool-compares.ll passes
with the dag combiner.  This speeds up espresso by 8%, reaching performance
parity with the dag-combiner-disabled llc.

llvm-svn: 23636
2005-10-05 06:47:48 +00:00
Chris Lattner b11d15637a fix some pastos
llvm-svn: 23635
2005-10-05 06:37:22 +00:00
Chris Lattner 06f1d0f73a Add a new HandleNode class, which is used to handle (haha) cases in the
dead node elim and dag combiner passes where the root is potentially updated.
This fixes a fixme in the dag combiner.

llvm-svn: 23634
2005-10-05 06:35:28 +00:00
Chris Lattner a6895d180e Implement the code for PowerPC/inverted-bool-compares.ll, even though it
that testcase still does not pass with the dag combiner.  This is because
not all forms of br* are folded yet.

Also, when we combine a node into another one, delete the node immediately
instead of waiting for the node to potentially come up in the future.

llvm-svn: 23632
2005-10-05 06:11:08 +00:00
Chris Lattner 6bd8fd09b6 make sure that -view-isel-dags is the input to the isel, not the input to
the second phase of dag combining

llvm-svn: 23631
2005-10-05 06:09:10 +00:00
Chris Lattner 746d50a01a Fix a crash compiling Olden/tsp
llvm-svn: 23630
2005-10-05 04:45:43 +00:00
Jim Laskey 327d4298e1 Reverting to version - until problem isolated.
llvm-svn: 23622
2005-10-04 16:41:51 +00:00
Nate Begeman 5da6908d65 Fix some faulty logic in the libcall inserter.
Since calls return more than one value, don't bail if one of their uses
happens to be a node that's not an MVT::Other when following the chain
from CALLSEQ_START to CALLSEQ_END.

Once we've found a CALLSEQ_START, we can just return; there's no need to
tail-recurse further up the graph.

Most importantly, just because something only has one use doesn't mean we
should use it's one use to follow from start to end.  This faulty logic
caused us to follow a chain of one-use FP operations back to a much earlier
call, putting a cycle in the graph from a later start to an earlier end.

This is a better fix that reverting to the workaround committed earlier
today.

llvm-svn: 23620
2005-10-04 02:10:55 +00:00
Nate Begeman 54fb5002e5 Add back a workaround that fixes some breakages from chris's last change.
Neither of us have yet figured out why this code is necessary, but stuff
breaks if its not there.  Still tracking this down...

llvm-svn: 23617
2005-10-04 00:37:37 +00:00
Jim Laskey 409a6b204e Refactor gathering node info and emission.
llvm-svn: 23610
2005-10-03 12:30:32 +00:00
Chris Lattner 57b21f9f10 clean up this code a bit, no functionality change
llvm-svn: 23609
2005-10-03 07:22:07 +00:00
Chris Lattner 5f096e2847 Break the body of the loop out into a new method
llvm-svn: 23606
2005-10-03 04:47:08 +00:00
Chris Lattner 9cfccfb517 Fix a problem where the legalizer would run out of stack space on extremely
large basic blocks because it was purely recursive.  This switches it to an
iterative/recursive hybrid.

llvm-svn: 23596
2005-10-02 17:49:46 +00:00
Chris Lattner 7f718e61e8 silence a bogus warning
llvm-svn: 23595
2005-10-02 16:30:51 +00:00
Chris Lattner 704d97f8b2 Add assertions to the trivial scheduler to check that the value types match
up between defs and uses.

llvm-svn: 23590
2005-10-02 07:10:55 +00:00
Chris Lattner a038d901fb Codegen CopyFromReg using the regclass that matches the valuetype of the
destination vreg.

llvm-svn: 23586
2005-10-02 06:34:16 +00:00
Chris Lattner 5a7bfe0b72 Add some very paranoid checking for operand/result reg class matchup
For instructions that define multiple results, use the right regclass
to define the result, not always the rc of result #0

llvm-svn: 23580
2005-10-01 07:45:09 +00:00
Jeff Cohen f8a5e5ae6e Fix VC++ warnings.
llvm-svn: 23579
2005-10-01 03:57:14 +00:00
Chris Lattner fda6944c5b add a method
llvm-svn: 23575
2005-10-01 00:17:07 +00:00
Jim Laskey d3850457a1 typo
llvm-svn: 23574
2005-10-01 00:08:23 +00:00
Jim Laskey 9d96932879 1. Simplify the gathering of node groups.
2. Printing node groups when displaying nodes.

llvm-svn: 23573
2005-10-01 00:03:07 +00:00
Jim Laskey 3fe3841c2a 1. Made things node-centric (from operand).
2. Added node groups to handle flagged nodes.

3. Started weaning simple scheduling off existing emitter.

llvm-svn: 23566
2005-09-30 19:15:27 +00:00
Chris Lattner 2e794c9198 now that we have a reg class to spill with, get this info from the regclass
llvm-svn: 23559
2005-09-30 17:19:22 +00:00
Chris Lattner 51878189c5 Now that we have getCalleeSaveRegClasses() info, use it to pass the register
class into the spill/reload methods.  Targets can now rely on that argument.

llvm-svn: 23556
2005-09-30 16:59:07 +00:00
Chris Lattner 5a6199f387 Change this code ot pass register classes into the stack slot spiller/reloader
code.  PrologEpilogInserter hasn't been updated yet though, so targets cannot
use this info.

llvm-svn: 23536
2005-09-30 01:29:00 +00:00
Chris Lattner 5b2be1f890 Fix two bugs in my patch earlier today that broke int->fp conversion on X86.
llvm-svn: 23522
2005-09-29 06:44:39 +00:00
Jeff Cohen b01a41a06d Silence VC++ redeclaration warnings.
llvm-svn: 23516
2005-09-29 01:59:49 +00:00
Chris Lattner 6f3b577ee6 Add FP versions of the binary operators, keeping the int and fp worlds seperate.
Though I have done extensive testing, it is possible that this will break
things in configs I can't test.  Please let me know if this causes a problem
and I'll fix it ASAP.

llvm-svn: 23504
2005-09-28 22:28:18 +00:00
Chris Lattner 0fd8f9fbc9 If the target prefers it, use _setjmp/_longjmp should be used instead of setjmp/longjmp for llvm.setjmp/llvm.longjmp.
llvm-svn: 23481
2005-09-27 22:15:53 +00:00
Jim Laskey 63523f98d5 Remove some redundancies.
llvm-svn: 23469
2005-09-27 17:32:45 +00:00
Jim Laskey 5f2443c8a3 Addition of a simple two pass scheduler. This version is currently hacked up
for testing and will require target machine info to do a proper scheduling.
The simple scheduler can be turned on using -sched=simple (defaults
to -sched=none)

llvm-svn: 23455
2005-09-26 21:57:04 +00:00
Chris Lattner 59a05bdde6 Turn (X^C1) == C2 into X == C1^C2 iff X&~C1 = 0 (and move a function)
This happens all the time on PPC for bool values, e.g. eliminating a xori
in inverted-bool-compares.ll.

This should be added to the dag combiner as well.

llvm-svn: 23403
2005-09-23 00:55:52 +00:00
Chris Lattner b1f8982ff0 Expose the LiveInterval interfaces as public headers.
llvm-svn: 23400
2005-09-21 04:19:09 +00:00
Nate Begeman c760f80fed Stub out the rest of the DAG Combiner. Just need to fill in the
select_cc bits and then wrap it in a convenience function for  use with
regular select.

llvm-svn: 23389
2005-09-19 22:34:01 +00:00
Chris Lattner 2f838f2192 Teach the local spiller to turn stack slot loads into register-register copies
when possible, avoiding the load (and avoiding the copy if the value is already
in the right register).

This patch came about when I noticed code like the following being generated:

  store R17 -> [SS1]
  ...blah...
  R4 = load [SS1]

This was causing an LSU reject on the G5.  This problem was due to the register
allocator folding spill code into a reg-reg copy (producing the load), which
prevented the spiller from being able to rewrite the load into a copy, despite
the fact that the value was already available in a register.  In the case
above, we now rip out the R4 load and replace it with a R4 = R17 copy.

This speeds up several programs on X86 (which spills a lot :) ), e.g.
smg2k from 22.39->20.60s, povray from 12.93->12.66s, 168.wupwise from
68.54->53.83s (!), 197.parser from 7.33->6.62s (!), etc.  This may have a larger
impact in some cases on the G5 (by avoiding LSU rejects), though it probably
won't trigger as often (less spilling in general).

Targets that implement folding of loads/stores into copies should implement
the isLoadFromStackSlot hook to get this.

llvm-svn: 23388
2005-09-19 06:56:21 +00:00
Nate Begeman 24a7eca282 More DAG combining. Still need the branch instructions, and select_cc
llvm-svn: 23371
2005-09-16 00:54:12 +00:00
Chris Lattner d4382f0afa If a function has liveins, and if the target requested that they be plopped
into particular vregs, emit copies into the entry MBB.

llvm-svn: 23331
2005-09-13 19:30:54 +00:00
Chris Lattner 2d454bf5be Allow targets to say they don't support truncstore i1 (which includes a mask
when storing to an 8-bit memory location), as most don't.

llvm-svn: 23303
2005-09-10 00:20:18 +00:00
Chris Lattner bd39c1a4c6 Add a missing #include, patch courtesy of Baptiste Lepilleur.
llvm-svn: 23302
2005-09-09 23:53:39 +00:00
Chris Lattner 331b311f7b Fix a problem duraid encountered on itanium where this folding:
select (x < y), 1, 0 -> (x < y) incorrectly: the setcc returns i1 but the
select returned i32.  Add the zero extend as needed.

llvm-svn: 23301
2005-09-09 23:00:07 +00:00
Chris Lattner 16e5cb87ba Fix a crash viewing dags that have target nodes in them
llvm-svn: 23300
2005-09-09 22:35:03 +00:00
Chris Lattner 1410003751 Use continue in the use-processing loop to make it clear what the early exits
are, simplify logic, and cause things to not be nested as deeply.  This also
uses MRI->areAliases instead of an explicit loop.

No functionality change, just code cleanup.

llvm-svn: 23296
2005-09-09 20:29:51 +00:00
Nate Begeman 049b748c76 Last round of 2-node folds from SD.cpp. Will move on to 3 node ops such
as setcc and select next.

llvm-svn: 23295
2005-09-09 19:49:52 +00:00
Chris Lattner ce3662f2a2 remove debugging code *slaps head*
llvm-svn: 23294
2005-09-09 19:19:20 +00:00
Chris Lattner c9053083eb When spilling a live range that is used multiple times by one instruction,
only add a reload live range once for the instruction.  This is one step
towards fixing a regalloc pessimization that Nate notice, but is later undone
by the spiller (so no code is changed).

llvm-svn: 23293
2005-09-09 19:17:47 +00:00
Nate Begeman 85c1cc4523 Move yet more folds over to the dag combiner from sd.cpp
llvm-svn: 23278
2005-09-08 20:18:10 +00:00
Nate Begeman 2cc2c9a79c Another round of dag combiner changes. This fixes some missing XOR folds
as well as fixing how we replace old values with new values.

llvm-svn: 23260
2005-09-07 23:25:52 +00:00
Chris Lattner 5d16dbd5bb Fix a bug that Tzu-Chien Chiu noticed: live interval analysis does NOT
preserve livevar

llvm-svn: 23259
2005-09-07 17:34:39 +00:00