Commit Graph

18332 Commits

Author SHA1 Message Date
Chris Lattner e0efd1fa72 remove one more occurance of this that snuck in
llvm-svn: 21271
2005-04-13 02:46:17 +00:00
Chris Lattner 857624f47a Remove support for ZERO_EXTEND_INREG. This pessimizes code, genering stuff
like this:

        ldah $1,1($31)
        lda $1,-1($1)
        and $0,$1,$24

instead of this:

        zap $0,252,$24

To get this back, the selector should recognize the ISD::AND case where this
happens and emit the appropriate ZAP instruction.

llvm-svn: 21270
2005-04-13 02:43:40 +00:00
Chris Lattner 7f4c4179a6 Remove special handling of ZERO_EXTEND_INREG. This pessimizes code, causing
things like this:

       mov r9 = 65535;;
       and r8 = r8, r9;;

To be emitted instead of:

        zxt2 r8 = r8;;

To get this back, the selector for ISD::AND should recognize this case.

llvm-svn: 21269
2005-04-13 02:41:52 +00:00
Chris Lattner 83075510ee Elimate handling of ZERO_EXTEND_INREG. This causes the PPC backend to emit
andi instructions instead of rlwinm instructions for zero extend, but they
seem like they would take the same time.

llvm-svn: 21268
2005-04-13 02:40:26 +00:00
Chris Lattner 248fe6bda2 Z_E_I is gone
llvm-svn: 21267
2005-04-13 02:39:05 +00:00
Chris Lattner 0e852afb4c Instead of making ZERO_EXTEND_INREG nodes, use the helper method in
SelectionDAG to do the job with AND.  Don't legalize Z_E_I anymore as
it is gone

llvm-svn: 21266
2005-04-13 02:38:47 +00:00
Chris Lattner 2b4e3fca38 Remove all foldings of ZERO_EXTEND_INREG, moving them to work for AND nodes
instead.  OVerall, this increases the amount of folding we can do.

llvm-svn: 21265
2005-04-13 02:38:18 +00:00
Chris Lattner 50b63f7015 Add a new helper method which returns the and that is equivalent to what
ZERO_EXTEND_INREG was.

llvm-svn: 21264
2005-04-13 02:37:19 +00:00
Chris Lattner 71886d95d5 Remove the ZERO_EXTEND_INREG node which is redundant with AND
llvm-svn: 21263
2005-04-13 02:36:41 +00:00
Nate Begeman ca916ba4a0 Fold shift x, [sz]ext(y) -> shift x, y
llvm-svn: 21262
2005-04-12 23:32:28 +00:00
Nate Begeman af1c0f7a00 Fold shift by size larger than type size to undef
Make llvm undef values generate ISD::UNDEF nodes

llvm-svn: 21261
2005-04-12 23:12:17 +00:00
Nate Begeman 818eb6ddd2 Implement setcc op, -1 sequences
Remove dead setcc op, 0 sequences
Coming later: generalization of op, imm

llvm-svn: 21260
2005-04-12 21:22:28 +00:00
Chris Lattner 0b73a6d8bc promote extload i1 -> extload i8
llvm-svn: 21258
2005-04-12 20:30:10 +00:00
Chris Lattner 9daef352e9 add an argument to allow avoiding deleting phi nodes.
llvm-svn: 21255
2005-04-12 18:52:14 +00:00
Chris Lattner eb958b0e45 add an argument.
llvm-svn: 21254
2005-04-12 18:51:53 +00:00
Chris Lattner 95f16a3ac4 Get rid of this for_each loop
llvm-svn: 21253
2005-04-12 18:51:33 +00:00
Duraid Madina fd469bddac * OK, after changing to use liveIn/liveOut instead of IDEFs,
to avoid redundant mov out3=r44 type instructions, we need to
tell the register allocator the truth about out? registers.

FIXME: unfortunately, since the list of allocatable registers is immutable,
we can't simply 'delete r127' from the allocation order, say, if 'out0' is
used. The only correct thing we can do is have a linear order of regs:

out7, out6 ... out2, out1, out0, r32, r33, r34 ... r126, r127

and slide a 'window' of 96 registers along this line, depending on how many
of the out? regs a function actually uses. The only downside of this is
that the out? registers will be allocated _first_, which makes the
resulting assembly ugly. :( Note this in the README. Hope this gets fixed
soon. :) (note the 3rd person speech there)

llvm-svn: 21252
2005-04-12 18:42:59 +00:00
Andrew Lenharth 740f93ca10 Get rid of idefs for arguments (oops)
llvm-svn: 21251
2005-04-12 17:47:57 +00:00
Andrew Lenharth 10c6eb4be2 Get rid of idefs for arguments
llvm-svn: 21250
2005-04-12 17:35:16 +00:00
Chris Lattner 14f72885dd Put out* into the allocation order, allowing the register allocator to
coallesce moves into outgoing args.

llvm-svn: 21249
2005-04-12 15:12:51 +00:00
Chris Lattner 6b91767b77 Make sure to realize that calls use their argument regs
llvm-svn: 21248
2005-04-12 15:12:19 +00:00
Duraid Madina b6dfb227b7 stop emitting IDEFs for args - change to using liveIn/liveOut
llvm-svn: 21247
2005-04-12 14:54:44 +00:00
Nate Begeman f67f3bf627 Initial support for allocation condition registers
llvm-svn: 21246
2005-04-12 07:04:16 +00:00
Chris Lattner 6febe5ef40 Fix a crash analyzing MultiSource/Benchmarks/MallocBench/gs
llvm-svn: 21245
2005-04-12 03:59:27 +00:00
Chris Lattner af5b25f139 Remove some redundant checks, add a couple of new ones. This allows us to
compile this:

int foo (unsigned long a, unsigned long long g) {
  return a >= g;
}

To:

foo:
        movl 8(%esp), %eax
        cmpl %eax, 4(%esp)
        setae %al
        cmpl $0, 12(%esp)
        sete %cl
        andb %al, %cl
        movzbl %cl, %eax
        ret

instead of:

foo:
        movl 8(%esp), %eax
        cmpl %eax, 4(%esp)
        setae %al
        movzbw %al, %cx
        movl 12(%esp), %edx
        cmpl $0, %edx
        sete %al
        movzbw %al, %ax
        cmpl $0, %edx
        cmove %cx, %ax
        movzbl %al, %eax
        ret

llvm-svn: 21244
2005-04-12 02:54:39 +00:00
Chris Lattner aedcabe8db Emit comparisons against the sign bit better. Codegen this:
bool %test1(long %X) {
        %A = setlt long %X, 0
        ret bool %A
}

like this:

test1:
        cmpl $0, 8(%esp)
        setl %al
        movzbl %al, %eax
        ret

instead of:

test1:
        movl 8(%esp), %ecx
        cmpl $0, %ecx
        setl %al
        movzbw %al, %ax
        cmpl $0, 4(%esp)
        setb %dl
        movzbw %dl, %dx
        cmpl $0, %ecx
        cmove %dx, %ax
        movzbl %al, %eax
        ret

llvm-svn: 21243
2005-04-12 02:19:10 +00:00
Chris Lattner 71ff44e46c Emit long comparison against -1 better. Instead of this (x86):
test2:
        movl 8(%esp), %eax
        notl %eax
        movl 4(%esp), %ecx
        notl %ecx
        orl %eax, %ecx
        cmpl $0, %ecx
        sete %al
        movzbl %al, %eax
        ret

or this (PPC):

_test2:
        nor r2, r4, r4
        nor r3, r3, r3
        or r2, r2, r3
        cntlzw r2, r2
        srwi r3, r2, 5
        blr

Emit this:

test2:
        movl 8(%esp), %eax
        andl 4(%esp), %eax
        cmpl $-1, %eax
        sete %al
        movzbl %al, %eax
        ret

or this:

_test2:
.LBB_test2_0:   ;
        and r2, r4, r3
        cmpwi cr0, r2, -1
        li r3, 1
        li r2, 0
        beq .LBB_test2_2        ;
.LBB_test2_1:   ;
        or r3, r2, r2
.LBB_test2_2:   ;
        blr

it seems like the PPC isel could do better for R32 == -1 case.

llvm-svn: 21242
2005-04-12 01:46:05 +00:00
Chris Lattner 87bd69884a canonicalize x <u 1 -> x == 0. On this testcase:
unsigned long long g;
unsigned long foo (unsigned long a) {
  return (a >= g) ? 1 : 0;
}

It changes the ppc code from:

_foo:
.LBB_foo_0:     ; entry
        mflr r11
        stw r11, 8(r1)
        bl "L00000$pb"
"L00000$pb":
        mflr r2
        addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
        lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
        lwz r4, 0(r2)
        lwz r2, 4(r2)
        cmplw cr0, r3, r2
        li r2, 1
        li r3, 0
        bge .LBB_foo_2  ; entry
.LBB_foo_1:     ; entry
        or r2, r3, r3
.LBB_foo_2:     ; entry
        cmplwi cr0, r4, 1
        li r3, 1
        li r5, 0
        blt .LBB_foo_4  ; entry
.LBB_foo_3:     ; entry
        or r3, r5, r5
.LBB_foo_4:     ; entry
        cmpwi cr0, r4, 0
        beq .LBB_foo_6  ; entry
.LBB_foo_5:     ; entry
        or r2, r3, r3
.LBB_foo_6:     ; entry
        rlwinm r3, r2, 0, 31, 31
        lwz r11, 8(r1)
        mtlr r11
        blr


to:

_foo:
.LBB_foo_0:     ; entry
        mflr r11
        stw r11, 8(r1)
        bl "L00000$pb"
"L00000$pb":
        mflr r2
        addis r2, r2, ha16(L_g$non_lazy_ptr-"L00000$pb")
        lwz r2, lo16(L_g$non_lazy_ptr-"L00000$pb")(r2)
        lwz r4, 0(r2)
        lwz r2, 4(r2)
        cmplw cr0, r3, r2
        li r2, 1
        li r3, 0
        bge .LBB_foo_2  ; entry
.LBB_foo_1:     ; entry
        or r2, r3, r3
.LBB_foo_2:     ; entry
        cntlzw r3, r4
        srwi r3, r3, 5
        cmpwi cr0, r4, 0
        beq .LBB_foo_4  ; entry
.LBB_foo_3:     ; entry
        or r2, r3, r3
.LBB_foo_4:     ; entry
        rlwinm r3, r2, 0, 31, 31
        lwz r11, 8(r1)
        mtlr r11
        blr

llvm-svn: 21241
2005-04-12 00:28:49 +00:00
Nate Begeman 79a3bea4ca Implement bitfield clears
Implement divide by negative power of two

llvm-svn: 21240
2005-04-12 00:10:02 +00:00
Nate Begeman 08698cf644 Update PPC readme. Remove things that are done or aren't ppc specific
llvm-svn: 21232
2005-04-11 20:48:57 +00:00
Chris Lattner 8ffd004920 Teach the dag mechanism that this:
long long test2(unsigned A, unsigned B) {
        return ((unsigned long long)A << 32) + B;
}

is equivalent to this:

long long test1(unsigned A, unsigned B) {
        return ((unsigned long long)A << 32) | B;
}

Now they are both codegen'd to this on ppc:

_test2:
        blr

or this on x86:

test2:
        movl 4(%esp), %edx
        movl 8(%esp), %eax
        ret

llvm-svn: 21231
2005-04-11 20:29:59 +00:00
Chris Lattner edd197062f Fix expansion of shifts by exactly NVT bits on arch's (like X86) that have
masking shifts.

This fixes the miscompilation of this:

long long test1(unsigned A, unsigned B) {
        return ((unsigned long long)A << 32) | B;
}

into this:

test1:
        movl 4(%esp), %edx
        movl %edx, %eax
        orl 8(%esp), %eax
        ret

allowing us to generate this instead:

test1:
        movl 4(%esp), %edx
        movl 8(%esp), %eax
        ret

llvm-svn: 21230
2005-04-11 20:08:52 +00:00
Chris Lattner 607bd26b38 IA64 supports this operation.
llvm-svn: 21228
2005-04-11 18:55:36 +00:00
Chris Lattner 67291ea580 ORo sets CR0
llvm-svn: 21227
2005-04-11 15:03:48 +00:00
Chris Lattner f29cc88210 Revert the previous patch, which I didn't mean to check in.
llvm-svn: 21226
2005-04-11 15:03:41 +00:00
Chris Lattner d3dc31009f Fix a minor bug (ORo didn't mark that it set CR0).
Refactor how . instructions are handled.  In particular, instead of passing
the RC flag all the way up the inheritance hierarchy, just make a new tblgen
class 'DOT' which can be added to an instruction definition.

For example, instead of this:

-def AND  : XForm_6<31,  28, 0, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB),
-let Defs = [CR0] in
-def ANDo : XForm_6<31,  28, 1, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB),
-                   "and. $rA, $rS, $rB">;

We now have this:

+def AND  : XForm_6<31,  28, 0, 0, (ops GPRC:$rA, GPRC:$rS, GPRC:$rB),
                    "and $rA, $rS, $rB">;

llvm-svn: 21225
2005-04-11 15:01:39 +00:00
Duraid Madina 8de7ac092d hmm, should probably change addImm() to take 64-bit arguments one day anyway.
llvm-svn: 21224
2005-04-11 07:16:39 +00:00
Duraid Madina 247def9c2b rename addU64Imm() to addImm64()
llvm-svn: 21223
2005-04-11 07:14:41 +00:00
Nate Begeman bebefac791 Add recording variants of ISD::AND and ISD::OR. This kills almost 1000
(1.5%) instructions in 186.crafty

llvm-svn: 21222
2005-04-11 06:34:10 +00:00
Duraid Madina fb43ef78c5 assorted fixes:
* clean up immediates (we use 14, 22 and 64 bit immediates now. sane.)
  * fold r0/f0/f1 registers into comparisons against 0/0.0/1.0
  * fix nasty thinko - didn't use two-address form of conditional add
    for extending bools to integers, so occasionally there would be
    garbage in the result. it's amazing how often zeros are just
    sitting around in registers ;) - this should fix a bunch of tests.

llvm-svn: 21221
2005-04-11 05:55:56 +00:00
Reid Spencer 7a763bfbc5 Ensure that the arguments passed to sys::Program::ExecuteAndWait include
the program name as the first argument. Thanks go to Markus Oberhumer for
noticing this problem.

llvm-svn: 21220
2005-04-11 05:48:04 +00:00
Jeff Cohen a3b1458175 Eliminate tabs
llvm-svn: 21216
2005-04-11 03:44:22 +00:00
Jeff Cohen ecbfa98ce7 Eliminate major source of VC++ "possible loss of data" warnings.
llvm-svn: 21215
2005-04-11 03:38:28 +00:00
Nate Begeman add0c63ad2 Fix libcall code to not pass a NULL Chain to LowerCallTo
Fix libcall code to not crash or assert looking for an ADJCALLSTACKUP node
  when it is known that there is no ADJCALLSTACKDOWN to match.
Expand i64 multiply when ISD::MULHU is legal for the target.

llvm-svn: 21214
2005-04-11 03:01:51 +00:00
Chris Lattner e2427c9afc Don't bother sign/zext_inreg'ing the result of an and operation if we know
the result does change as a result of the extend.

This improves codegen for Alpha on this testcase:

int %a(ushort* %i) {
        %tmp.1 = load ushort* %i
        %tmp.2 = cast ushort %tmp.1 to int
        %tmp.4 = and int %tmp.2, 1
        ret int %tmp.4
}

Generating:

a:
        ldgp $29, 0($27)
        ldwu $0,0($16)
        and $0,1,$0
        ret $31,($26),1

instead of:

a:
        ldgp $29, 0($27)
        ldwu $0,0($16)
        and $0,1,$0
        addl $0,0,$0
        ret $31,($26),1

btw, alpha really should switch to livein/outs for args :)

llvm-svn: 21213
2005-04-10 23:37:16 +00:00
Chris Lattner a3b7ef05f4 Teach legalize to deal with targets that don't support some SEXTLOAD/ZEXTLOADs
llvm-svn: 21212
2005-04-10 22:54:25 +00:00
Chris Lattner 672fe7267b The first argument to ExecuteAndWait should be the program name, but pointed
out by Markus F.X.J. Oberhumer.

llvm-svn: 21211
2005-04-10 20:59:38 +00:00
Chris Lattner 751cc5f49f fix this testcase so the regex doesn't match the function name
llvm-svn: 21210
2005-04-10 20:45:35 +00:00
Chris Lattner 391a351ede don't zextload fp values!
llvm-svn: 21209
2005-04-10 17:40:35 +00:00
Duraid Madina 7b0287b78d * store immediate values as int64_t, not int. come on, we should be happy
when there are immediates, let's not worry about the memory overhead of
this :)

* add addU64Imm(uint64_t val) to machineinstrbuilder

(seriously: this seems required to support 64-bit immediates cleanly. if it
_really_ gets on your nerves, feel free to pull it out ;) )

coming up next week: "all your floating point constants are belong to us"

llvm-svn: 21208
2005-04-10 09:18:55 +00:00
Nate Begeman 492370311d Fix another fixme: factor out the constant fp generation code.
llvm-svn: 21207
2005-04-10 06:06:10 +00:00
Nate Begeman 941a01802f Fix 64 bit argument loading that straddles the args in regs / args on stack
boundary.

llvm-svn: 21206
2005-04-10 05:53:14 +00:00
Chris Lattner c53cd501b5 Until we have a dag combiner, promote using zextload's instead of extloads.
This gives the optimizer a bit of information about the top-part of the
value.

llvm-svn: 21205
2005-04-10 04:33:47 +00:00
Chris Lattner f74c794ccf Fold zext_inreg(zextload), likewise for sext's
llvm-svn: 21204
2005-04-10 04:33:08 +00:00
Chris Lattner f2bff92411 add a simple xform
llvm-svn: 21203
2005-04-10 04:04:49 +00:00
Nate Begeman b076731713 Remove unnecessary Implicit Defs. Since r0 is not in allocation, we do not
have to inform the register allocator it might be stepped on.

llvm-svn: 21202
2005-04-10 03:59:42 +00:00
Chris Lattner 2de306ba83 make this harder
llvm-svn: 21201
2005-04-10 03:18:18 +00:00
Chris Lattner d65632a9ca oops add ~
llvm-svn: 21200
2005-04-10 03:07:25 +00:00
Chris Lattner 38b1ae75fc new testcase for previously unsupported unary complex operators
llvm-svn: 21199
2005-04-10 03:06:27 +00:00
Nate Begeman 6566e8ac06 Make sure that BRCOND branches can be converted into long branches too.
llvm-svn: 21198
2005-04-10 01:48:29 +00:00
Nate Begeman 3345eadc37 Don't hand ISD::CALL nodes off to SelectExprFP. This fixes siod.
llvm-svn: 21197
2005-04-10 01:14:13 +00:00
Chris Lattner d8cbfe82ba Fix a thinko. If the operand is promoted, pass the promoted value into
the new zero extend, not the original operand.  This fixes cast bool -> long
on ppc.

Add an unrelated fixme

llvm-svn: 21196
2005-04-10 01:13:15 +00:00
Chris Lattner 9ff4b4190f rename getPPCOpcodeForSetCCNumber -> getPPCOpcodeForSetCCOpode to be more
correct.  Remove the EmitComparison retvalue, as it is always the first arg.

Fix a place where we incorrectly passed in the setcc opcode instead of the
setcc number, causing us to miscompile crafty.  Crafty now works!

llvm-svn: 21195
2005-04-10 01:03:31 +00:00
Nate Begeman 2121a54868 fix ISD::BRCONDTWOWAY codegen to not deference the end() iterator
llvm-svn: 21193
2005-04-09 23:35:05 +00:00
Chris Lattner 228fed92e6 Fix CodeGen/Generic/2005-05-09-GlobalInPHI.ll, which was reduced from 254.gap.
This caused the "use before a def" assertion on some programs.

With this patch, 254.gap now passes with the PPC backend.

llvm-svn: 21191
2005-04-09 22:05:17 +00:00
Chris Lattner db32a632c9 new testcase that used to crash the ppc fe. It could effect any simpleisel
that is not careful, so I'm checking it into the generic tests.

llvm-svn: 21190
2005-04-09 22:03:10 +00:00
Chris Lattner da504741da add a little peephole optimization. This allows us to codegen:
int a(short i) {
        return i & 1;
}

as

_a:
        andi. r3, r3, 1
        blr

instead of:

_a:
        rlwinm r2, r3, 0, 16, 31
        andi. r3, r2, 1
        blr

on ppc.  It should also help the other risc targets.

llvm-svn: 21189
2005-04-09 21:43:54 +00:00
Chris Lattner e8e070dbfb do not set the root to null if an argument is dead
llvm-svn: 21188
2005-04-09 21:23:24 +00:00
Nate Begeman 8309a333dd Add rlwnm instruction for variable rotate
Generate rotate left/right immediate
Generate code for brcondtwoway
Use new livein/liveout functionality

llvm-svn: 21187
2005-04-09 20:09:12 +00:00
Chris Lattner 3a7f5768c5 Fix a crash on 173.applu by asking for a constant bigger than 32-bits.
llvm-svn: 21185
2005-04-09 19:47:21 +00:00
Chris Lattner a55a5f2580 Switch this instruction selector over to using liveins and liveouts, eliminating
implicit defs on entry to the function.  yaay :)

llvm-svn: 21184
2005-04-09 16:32:30 +00:00
Chris Lattner 1a44855f8f there is no need to remove this instruction, linscan does it already as it
removes noop moves.

llvm-svn: 21183
2005-04-09 16:24:20 +00:00
Chris Lattner 0b1681bce1 Adjust live intervals to support a livein set
llvm-svn: 21182
2005-04-09 16:17:50 +00:00
Chris Lattner b59006c4a1 Use live out sets for return values instead of imp_defs, which is cleaner and faster.
llvm-svn: 21181
2005-04-09 15:23:56 +00:00
Chris Lattner 4c6ab01a20 Consider the livein/out set for a function, allowing targets to not have to
use ugly imp_def/imp_uses for arguments and return values.

llvm-svn: 21180
2005-04-09 15:23:25 +00:00
Chris Lattner 576db37185 add routines to track the livein/out set for a function
llvm-svn: 21179
2005-04-09 15:22:53 +00:00
Duraid Madina 46aa06cfed ok, the "ia64 has a boatload of registers" joke stopped being funny today ;)
* fix overallocation of integer (stacked) registers: we can't allocate
  registers for local use if they are required as output registers

this fixes 'toast' in the test suite, and all sorts of larger programs
like bzip2 etc.

llvm-svn: 21178
2005-04-09 11:53:00 +00:00
Nate Begeman 2f64122319 Optimize FSEL a bit for fneg arguments. This fixes the recently added test
case so that we emit

_test_fneg_sel:
.LBB_test_fneg_sel_0:   ;
        fsel f1, f1, f3, f2
        blr

instead of:

_test_fneg_sel:
.LBB_test_fneg_sel_0:   ;
        fneg f0, f1
        fneg f0, f0
        fsel f1, f0, f3, f2
        blr

llvm-svn: 21177
2005-04-09 09:33:07 +00:00
Nate Begeman 7d3e44fb12 Add a testcase to make sure that we don't emit two fneg instructions back
to back for certain fsel instructions.

llvm-svn: 21176
2005-04-09 09:30:09 +00:00
Nate Begeman 968e44a900 Add cases to cover the rest of the patterns we should be matching
llvm-svn: 21175
2005-04-09 08:29:59 +00:00
Chris Lattner 888c5fdcc2 Fix CodeGen/SparcV9/2005-05-09-GEP-Crash.ll a crash on some specfp program
lets hope this doesn't break other programs with induced entropy

llvm-svn: 21174
2005-04-09 06:27:14 +00:00
Chris Lattner 3aa6ec0dda New testcase that the sparc backend crashes on
llvm-svn: 21173
2005-04-09 06:26:27 +00:00
Chris Lattner 6a31b878f8 recognize some patterns as fabs operations, so that fabs at the source level
is deconstructed then reconstructed here.  This catches 19 fabs's in 177.mesa
9 in 168.wupwise, 5 in 171.swim, 3 in 172.mgrid, and 14 in 173.applu out of
specfp2000.

This allows the X86 code generator to make MUCH better code than before for
each of these and saves one instr on ppc.

This depends on the previous CFE patch to expose these correctly.

llvm-svn: 21171
2005-04-09 05:15:53 +00:00
Chris Lattner d9748bcae5 make this test more interesting
llvm-svn: 21170
2005-04-09 04:55:14 +00:00
Chris Lattner ec90861662 add a test for fnabs
llvm-svn: 21169
2005-04-09 04:03:16 +00:00
Chris Lattner b9a11b8b7f add a partial test for the fma operations that ppc supports. I'm sure I'm
missing some and not all of these match yet, but I'm sure that Nate will
clean up my mess :)

llvm-svn: 21168
2005-04-09 04:01:32 +00:00
Chris Lattner 8a98c7f337 Emit BRCONDTWOWAY when possible.
llvm-svn: 21167
2005-04-09 03:30:29 +00:00
Chris Lattner fd98678a8a Legalize BRCONDTWOWAY into a BRCOND/BR pair if a target doesn't support it.
llvm-svn: 21166
2005-04-09 03:30:19 +00:00
Chris Lattner b0713c74a2 print and fold BRCONDTWOWAY correctly
llvm-svn: 21165
2005-04-09 03:27:28 +00:00
Chris Lattner a3a135a9f7 This target does not support/want ISD::BRCONDTWOWAY
llvm-svn: 21164
2005-04-09 03:22:37 +00:00
Chris Lattner 4f77badaa3 This target does not yet support ISD::BRCONDTWOWAY
llvm-svn: 21163
2005-04-09 03:22:30 +00:00
Chris Lattner 4b1323e846 Add a new node
llvm-svn: 21162
2005-04-09 03:21:50 +00:00
Nate Begeman e8ce0cda40 64b: Expand S/UREM
32b: No longer pattern match fneg(fsub(fmul)) as fnmsub
     Pattern match fsub a, mul(b, c) as fnmsub
     Pattern match fadd a, mul(b, c) as fmadd
Those changes speed up hydro2d by 2.5%, distray by 6%, and scimark by 8%

llvm-svn: 21161
2005-04-09 03:05:51 +00:00
Chris Lattner 0ea81f9db4 canonicalize a bunch of operations involving fneg
llvm-svn: 21160
2005-04-09 03:02:46 +00:00
Nate Begeman f50b597f67 Fix 64b shifts
llvm-svn: 21159
2005-04-08 23:45:01 +00:00
Chris Lattner 61b6f04ae9 fix this method for 64-bit constants
llvm-svn: 21158
2005-04-08 21:31:29 +00:00
Nate Begeman 705d3c18e8 Match Mac OS X 64 bit calling conventions
llvm-svn: 21157
2005-04-08 21:26:05 +00:00
Andrew Lenharth de5aed3f12 collect a few statistics, factor constants (constant loading and mult), fix logic operation pattern matchs, supress FP div when int dividing by a constant
llvm-svn: 21156
2005-04-08 17:28:49 +00:00
Andrew Lenharth ce9e043c78 oops
llvm-svn: 21155
2005-04-08 16:55:15 +00:00
Andrew Lenharth 2e184e2522 added some tests to check stupid pattern matching mistakes
llvm-svn: 21154
2005-04-08 16:46:44 +00:00
Duraid Madina 41ff502549 fix bogus division-by-power-of-2 (was wrong for negative input, adds extr insn)
fix hack in division (clean up frcpa instruction)

llvm-svn: 21153
2005-04-08 10:01:48 +00:00
Chris Lattner 4236261930 Fix bug: InstCombine/2005-05-07-UDivSelectCrash.ll
llvm-svn: 21152
2005-04-08 04:03:26 +00:00
Chris Lattner 9e2b5fc65a new testcase that crashes the instcombiner.
llvm-svn: 21151
2005-04-08 03:58:21 +00:00
Nate Begeman b1f66d1af2 Optimized code sequences for setcc reg, 0
Optimized code sequence for (a < 0) ? b : 0

llvm-svn: 21150
2005-04-07 20:30:01 +00:00
Andrew Lenharth 534eebb317 Alpha zero extends setcc results
llvm-svn: 21149
2005-04-07 20:11:32 +00:00
Chris Lattner b32d9318d2 If a target zero or sign extends the result of its setcc, allow folding of
this into sign/zero extension instructions later.

On PPC, for example, this testcase:

%G = external global sbyte
implementation
void %test(int %X, int %Y) {
  %C = setlt int %X, %Y
  %D = cast bool %C to sbyte
  store sbyte %D, sbyte* %G
  ret void
}

Now codegens to:

        cmpw cr0, r3, r4
        li r3, 1
        li r4, 0
        blt .LBB_test_2 ;
.LBB_test_1:    ;
        or r3, r4, r4
.LBB_test_2:    ;
        addis r2, r2, ha16(L_G$non_lazy_ptr-"L00000$pb")
        lwz r2, lo16(L_G$non_lazy_ptr-"L00000$pb")(r2)
        stb r3, 0(r2)

instead of:

        cmpw cr0, r3, r4
        li r3, 1
        li r4, 0
        blt .LBB_test_2 ;
.LBB_test_1:    ;
        or r3, r4, r4
.LBB_test_2:    ;
***     rlwinm r3, r3, 0, 31, 31
        addis r2, r2, ha16(L_G$non_lazy_ptr-"L00000$pb")
        lwz r2, lo16(L_G$non_lazy_ptr-"L00000$pb")(r2)
        stb r3, 0(r2)

llvm-svn: 21148
2005-04-07 19:43:53 +00:00
Chris Lattner 532ac79122 PowerPC zero extends setcc results
llvm-svn: 21147
2005-04-07 19:41:49 +00:00
Chris Lattner 38fd97084b X86 zero extends setcc results
llvm-svn: 21146
2005-04-07 19:41:46 +00:00
Chris Lattner 693e797be9 Allow targets which produce setcc results in non-MVT::i1 registers to describe
what the contents of the top bits of these registers are, in the common cases
of targets that sign and zero extend the results.

llvm-svn: 21145
2005-04-07 19:41:18 +00:00
Chris Lattner dfed7355c9 Remove somethign I had for testing
llvm-svn: 21144
2005-04-07 18:58:54 +00:00
Andrew Lenharth 9807ac5d3c fix a small optimization opertunity and make gcc happy
llvm-svn: 21143
2005-04-07 18:15:28 +00:00
Chris Lattner 6b03a0cba1 This patch does two things. First, it canonicalizes 'X >= C' -> 'X > C-1'
(likewise for <= >=u >=u).

Second, it implements a special case hack to turn 'X gtu SINTMAX' -> 'X lt 0'

On powerpc, for example, this changes this:

        lis r2, 32767
        ori r2, r2, 65535
        cmplw cr0, r3, r2
        bgt .LBB_test_2

into:

        cmpwi cr0, r3, 0
        blt .LBB_test_2

llvm-svn: 21142
2005-04-07 18:14:58 +00:00
Andrew Lenharth 31f5e2f73f match case change in codegen
llvm-svn: 21141
2005-04-07 17:47:00 +00:00
Andrew Lenharth 6b492bec30 fixup magic constant making code. tested by thousands of random divisions.... by 10000. ok, so random divisors would be good too, but this at least fixes some things
llvm-svn: 21140
2005-04-07 17:19:16 +00:00
Andrew Lenharth d2da7177f2 lowercase instructions, makes diff happier
llvm-svn: 21139
2005-04-07 17:17:48 +00:00
Chris Lattner 4706046e68 Implement the following xforms:
(X-Y)-X --> -Y
A + (B - A) --> B
(B - A) + A --> B

llvm-svn: 21138
2005-04-07 17:14:51 +00:00
Chris Lattner 679c1119e8 new test
llvm-svn: 21137
2005-04-07 16:41:45 +00:00
Chris Lattner c7f3c1a00e Implement InstCombine/add.ll:test28, transforming C1-(X+C2) --> (C1-C2)-X.
This occurs several dozen times in specint2k, particularly in crafty and gcc
apparently.

llvm-svn: 21136
2005-04-07 16:28:01 +00:00
Chris Lattner dd83183c1e new testcase
llvm-svn: 21135
2005-04-07 16:24:59 +00:00
Chris Lattner a9be4490d8 Transform X-(X+Y) == -Y and X-(Y+X) == -Y
llvm-svn: 21134
2005-04-07 16:15:25 +00:00
Andrew Lenharth 05e51d92e0 It wasn't happy about this either
llvm-svn: 21133
2005-04-07 14:18:13 +00:00
Andrew Lenharth 85f34a5682 Yea, it wasn't happy
llvm-svn: 21132
2005-04-07 13:55:53 +00:00
Duraid Madina a7abda3989 teach asmprinter to print s8/s14 operands
llvm-svn: 21131
2005-04-07 12:34:36 +00:00
Duraid Madina 8419da8acf codegen immediate forms of add/sub/shift
llvm-svn: 21130
2005-04-07 12:33:38 +00:00
Duraid Madina b484f7c55e add immediate forms of add, sub, shift
llvm-svn: 21129
2005-04-07 12:32:24 +00:00
Chris Lattner 7d13eae254 Fix a really scary bug that Nate found where we weren't deleting the right
elements auto of the autoCSE maps.

llvm-svn: 21128
2005-04-07 00:30:13 +00:00
Nate Begeman d20628ff7d Pattern match bitfield insert, which helps shift long by immediate, among
other things.

llvm-svn: 21127
2005-04-06 23:51:40 +00:00
Nate Begeman 505f6b760d Fix some shift bugs
llvm-svn: 21126
2005-04-06 22:42:08 +00:00
Alkis Evlogimenos 7ca0947274 Make these 64 bit constants so that this compiles on x86-32 as well.
llvm-svn: 21125
2005-04-06 22:09:40 +00:00
Andrew Lenharth 3ea17c6cfe added sdiv by 2^k and works for neg divisors also
llvm-svn: 21124
2005-04-06 22:03:13 +00:00
Chris Lattner 747eee2707 Don't make this require loopsimplify. It works BETTER with loop simplify
but should not require it.

llvm-svn: 21123
2005-04-06 21:45:00 +00:00
Nate Begeman b067492535 Teach ExpandShift how to handle shifts by a constant. This allows targets
like PowerPC to codegen long shifts in many fewer instructions.

llvm-svn: 21122
2005-04-06 21:13:14 +00:00
Andrew Lenharth df3256aa7f fix copy/paste errors, and add imm support to SxADDQ and SxSUBQ
llvm-svn: 21121
2005-04-06 20:59:59 +00:00
Chris Lattner bd32728a98 Fix SingleSource/Regression/C/2005-05-06-LongLongSignedShift.c, we were not
properly sign extending the top of the result of a 64-bit shift right by
a constant > 32.

llvm-svn: 21120
2005-04-06 20:59:35 +00:00
Andrew Lenharth 249bf7bee4 simplified
llvm-svn: 21119
2005-04-06 20:59:03 +00:00
Andrew Lenharth 99edcfe3f8 added first alpha codegen regression test
llvm-svn: 21117
2005-04-06 20:39:17 +00:00
Andrew Lenharth 1d4747c302 Added Nate's div by constant stuff, also scaled operations!
llvm-svn: 21116
2005-04-06 20:25:34 +00:00
Chris Lattner 70b8b96d6c Fix a namespace issue, reported by Vladimir Merzliakov!
llvm-svn: 21115
2005-04-06 19:45:39 +00:00
Duraid Madina c36b6c3b1a steal sampo's div-by-constant-power-of-2 stuff
thanks sampo!!

llvm-svn: 21113
2005-04-06 09:55:17 +00:00
Duraid Madina 03c530786c add fms instruction
llvm-svn: 21112
2005-04-06 09:54:09 +00:00
Nate Begeman 39ef2f1d43 Fixed version of optimized integer divide is now fixed. Calculate the
quotient, not the remainder.  Also, make sure to remove the old div operand
from the ExprMap and let SelectExpr insert the new one.

llvm-svn: 21111
2005-04-06 06:44:57 +00:00
Duraid Madina c0e9adf3cc lie a bit and say that r1/r12 (GP/SP) _aren't_ callee-save, as we take
care of this ourselves

llvm-svn: 21110
2005-04-06 06:18:36 +00:00
Duraid Madina df0ecbd4cc make sure 'special' registers don't get allocated
llvm-svn: 21109
2005-04-06 06:17:54 +00:00
Chris Lattner 9953d17a44 document these nodes, as they are nonobvious
llvm-svn: 21108
2005-04-06 04:21:29 +00:00
Chris Lattner 4fbb4af5d1 Add (untested) support for MULHS and MULHU.
llvm-svn: 21107
2005-04-06 04:21:07 +00:00
Chris Lattner c21db6b15c add signed versions of the extra precision multiplies
llvm-svn: 21106
2005-04-06 04:19:22 +00:00
Nate Begeman dd397119b0 Turn off the div -> mul optimization until it works correctly 100% of the
time.

llvm-svn: 21105
2005-04-06 03:36:33 +00:00
Nate Begeman 4164c4baac Add support for MULHS and MULHU nodes
Have LegalizeDAG handle SREM and UREM for us
Codegen SDIV and UDIV by constant as a multiply by magic constant instead
of integer divide, which is very slow.

llvm-svn: 21104
2005-04-06 00:25:27 +00:00
Nate Begeman 20b7d2a36f Expand SREM and UREM for targets that claim not to have them, like PowerPC
llvm-svn: 21103
2005-04-06 00:23:54 +00:00
Nate Begeman 55e8625c69 Add MULHU and MULHS nodes for the high part of an (un)signed 32x32=64b
multiply.

llvm-svn: 21102
2005-04-05 22:36:56 +00:00
Andrew Lenharth 43f78bc2da added lowerargs support for varargs
llvm-svn: 21101
2005-04-05 20:51:46 +00:00
Nate Begeman 524417357c Behold, rlwinm with certain immediate arguments is printed as the much more
readable slwi or srwi (shift left/right word immediate).

llvm-svn: 21099
2005-04-05 18:19:50 +00:00
Nate Begeman a188b698a2 Fix cut & paste errors (32->64), and codegen float->int more optimally.
llvm-svn: 21098
2005-04-05 17:32:30 +00:00
Tanya Lattner 8d64e9a90d Updated to use dep analyzer.
llvm-svn: 21097
2005-04-05 16:36:44 +00:00
Nate Begeman 9203e169a7 Remove 64 bit simple ISel, it never worked correctly
Add initial (buggy) implementation of 64 bit pattern ISel

llvm-svn: 21096
2005-04-05 08:51:15 +00:00
Nate Begeman 4bde071216 Back out the previous change to SelectBranchCC, since there are cases it
could miscompile.  A correct solution will be found in the near future.

llvm-svn: 21095
2005-04-05 04:32:16 +00:00
Nate Begeman 9049e4beec Rename canUseAsImmediateForOpcode to getImmediateForOpcode to better
indicate that it is not a boolean function.
Properly emit the pseudo instruction for conditional branch, so that we
  can fix up conditional branches whose displacements are too large.
Reserve the right amount of opcode space for said pseudo instructions.

llvm-svn: 21094
2005-04-05 04:22:58 +00:00
Chris Lattner 7e0a534cba do not crash when using -debug
llvm-svn: 21092
2005-04-05 01:12:03 +00:00
Nate Begeman d6933f5078 Implement SDIV by power of 2 as srawi/addze rather than load imm, divw
llvm-svn: 21091
2005-04-05 00:15:08 +00:00
Nate Begeman 1d5d767a09 Pattern match fp mul-add, mul-sub, neg-mul-add, and neg-mul-sub
llvm-svn: 21090
2005-04-04 23:40:36 +00:00
Nate Begeman d96350095c Add support for multiply-add, multiply-sub, and their negated versions
llvm-svn: 21089
2005-04-04 23:01:51 +00:00
Chris Lattner b919b21777 do not dereference an extra layer of pointers to determine if an external
call can modify a memory location.  This fixes
test/Regression/Analysis/Andersens/modreftest.ll

llvm-svn: 21088
2005-04-04 22:23:21 +00:00
Chris Lattner 0933766e2b new testcase
llvm-svn: 21087
2005-04-04 22:22:30 +00:00
Nate Begeman 1194531057 Make sure that arg regs used by the call instruction are marked as such, so
that regalloc doesn't cleverly reuse early arg regs loading later arg regs.
This fixes almost all outstanding failures in the pattern isel.

llvm-svn: 21086
2005-04-04 22:17:48 +00:00
Nate Begeman c7186025de Remove unnecessary register copy now that regalloc is fixed
llvm-svn: 21085
2005-04-04 21:48:13 +00:00
Chris Lattner 6a6056e93d Make sure to notice that explicit physregs are used in the function
llvm-svn: 21084
2005-04-04 21:35:34 +00:00
Nate Begeman d753765460 i1 loads should also be from the low byte of the argument word.
llvm-svn: 21077
2005-04-04 09:09:00 +00:00
Nate Begeman 1ce4839890 Fix i64 return, fix CopyFromReg
llvm-svn: 21076
2005-04-04 06:52:38 +00:00
Duraid Madina 9935f44fb8 fix SREM/UREM, which gave incorrect results for x%y if x was zero. This is
an ugly hack, but it seems to work. I should fix this properly and add a test
as well.

fixes multisource/obsequi (maybe others)

llvm-svn: 21075
2005-04-04 05:05:52 +00:00
Duraid Madina dbc810022b add implicit use op
llvm-svn: 21074
2005-04-04 04:50:57 +00:00
Nate Begeman cc00a7c42d Handle expanding arguments to ISD::TRUNCATE. This happens on PowerPC when
you have something like i16 = truncate i64.  This fixes Regression/C/casts

llvm-svn: 21073
2005-04-04 00:57:08 +00:00
Chris Lattner 4784489de2 Fix sign_extend and zero_extend of promoted value types to expanded value
types.  This occurs when casting short to long on PPC for example.

llvm-svn: 21072
2005-04-03 23:41:52 +00:00
Nate Begeman 629cdaea39 Full varargs support. All of UnitTests now passes
llvm-svn: 21070
2005-04-03 23:11:17 +00:00
Nate Begeman 7a3e929efc Pass the correct value for the chain to the store
llvm-svn: 21066
2005-04-03 22:22:56 +00:00
Nate Begeman f6dc43bd46 Fix SHL_PARTS
Start implementation of integer varargs

llvm-svn: 21065
2005-04-03 22:13:27 +00:00
Andrew Lenharth 79e727e8a7 is this simpler? I think it is simpler.
llvm-svn: 21064
2005-04-03 20:35:21 +00:00
Andrew Lenharth 7ce5740de9 fix 101 regressions
llvm-svn: 21063
2005-04-03 18:24:50 +00:00
Duraid Madina 2f472ecb11 a wise man once said:
"!!!!!!!! IF YOU CHANGE SPACES TO TABS, YOU WILL BE KILLED!!!!!!"

llvm-svn: 21062
2005-04-03 14:57:35 +00:00
Duraid Madina 6c9afaead4 .bss is no problem here.
llvm-svn: 21061
2005-04-03 14:52:01 +00:00
Nate Begeman 34cc5b329f Keeping up with the Joneses.
Implement not, nor, nand, and eqv

llvm-svn: 21060
2005-04-03 11:20:20 +00:00
Andrew Lenharth 46897ab49e Select optimization
llvm-svn: 21051
2005-04-02 22:32:39 +00:00
Andrew Lenharth f029d795f0 Try several things. 1) drop /i from FP ops 2) factor out FP to Int moves and provide 21264 support for those 3) match not 4) match ornot andnot xornot
llvm-svn: 21046
2005-04-02 21:06:51 +00:00
Chris Lattner d2df8ca403 fix some VC compilation problems, thanks to Jeff C for pointing this out!
llvm-svn: 21044
2005-04-02 20:17:09 +00:00
Chris Lattner a7913e66e1 EquivClassGraphs is now in DataStructure.h
llvm-svn: 21042
2005-04-02 20:08:17 +00:00
Chris Lattner 745c960672 merge EquivClassGraphs.h into DataStructure.h with the other DSA pass definitions.
llvm-svn: 21041
2005-04-02 20:08:06 +00:00
Chris Lattner 526cc17b55 use a callee_iterator typedef.
llvm-svn: 21038
2005-04-02 20:02:41 +00:00
Chris Lattner 63e3a262d8 add and use a callee_iterator typedef
llvm-svn: 21037
2005-04-02 20:02:32 +00:00
Chris Lattner 990ed1d201 Change the ActualCallees callgraph from hash_multimap<Instruction,Function>
to std::set<std::pair<Inst,Func>> to avoid duplicate entries.

This speeds up the CompleteBU pass from 1.99s to .15s on povray and the
eqgraph passes from 1.5s to .16s on the same.

llvm-svn: 21031
2005-04-02 19:17:18 +00:00
Chris Lattner 637e42022f Change the ActualCallees callgraph from hash_multimap<Instruction,Function>
to std::set<std::pair<Inst,Func>> to avoid duplicate entries.

llvm-svn: 21030
2005-04-02 19:15:15 +00:00
Andrew Lenharth e7ae400bc8 FNEG/FABS/UNDEF
llvm-svn: 21029
2005-04-02 19:11:07 +00:00
Andrew Lenharth 7ad3697e1e FNEG/FABS
llvm-svn: 21028
2005-04-02 19:04:58 +00:00
Chris Lattner 04e0515937 this has now been fixed
llvm-svn: 21026
2005-04-02 16:17:57 +00:00
Duraid Madina 50b339b7fa ia64 asmprinter fixes:
- turn off assembler's autoalignment
  - set FunctionAddrPrefix/Suffix so that .data8 entries pointing to
    functions have their value wrapped in @fptr(), so that a function
    descriptor will be materialized for that function.

llvm-svn: 21025
2005-04-02 12:30:47 +00:00
Duraid Madina 73a316d712 add support for prefix/suffix strings to go around GlobalValue(s)
(which may or be function pointers) in the asmprinter. For the moment,
this changes nothing, except the IA64 backend which can use this to write:

  data8.ua  @fptr(blah__blah__mangled_function_name)

  (by setting FunctionAddrPrefix/Suffix to "@fptr(" / ")")

llvm-svn: 21024
2005-04-02 12:21:51 +00:00
Duraid Madina c935c833d1 support IDEF, fnegabs (thanks sampo)
llvm-svn: 21023
2005-04-02 10:33:53 +00:00
Duraid Madina 0ccac38ed3 add fnegabs op
llvm-svn: 21022
2005-04-02 10:06:27 +00:00
Nate Begeman 165cf4844e Set shift amount to Extend
Implement ISD::FABS and ISD::FNEG nodes
Implement SHL_PARTS, SRL_PARTS, and SRA_PARTS
Generate PowerPC 'fneg', 'fabs', and 'fnabs' instructions

llvm-svn: 21018
2005-04-02 05:59:34 +00:00
Chris Lattner 1d7052fad5 don't forget to use the right code generator :)
llvm-svn: 21017
2005-04-02 05:40:03 +00:00
Chris Lattner b4e122c59f new testcase
llvm-svn: 21016
2005-04-02 05:35:00 +00:00
Chris Lattner 0e0b599d29 add support for FABS and FNEG
llvm-svn: 21015
2005-04-02 05:30:17 +00:00