Commit Graph

1434 Commits

Author SHA1 Message Date
Chris Lattner fb5614506e Simplify/speedup the PEI by not having to scan for uses of the callee saved
registers.  This information is computed directly by the register allocator
now.

llvm-svn: 19795
2005-01-23 23:13:12 +00:00
Chris Lattner 3d527f7b61 Update physregsused info.
llvm-svn: 19793
2005-01-23 22:55:45 +00:00
Chris Lattner 24f0f0e28f Update this pass to set PhysRegsUsed info in MachineFunction.
llvm-svn: 19792
2005-01-23 22:51:56 +00:00
Chris Lattner ae09d93b35 Update these register allocators to set the PhysRegUsed info in MachineFunction.
llvm-svn: 19791
2005-01-23 22:45:13 +00:00
Chris Lattner 304053c6ec Add support for the PhysRegsUsed array.
llvm-svn: 19789
2005-01-23 22:13:58 +00:00
Chris Lattner ef2de322c6 Speed this up a bit by making ModifiedRegs a vector<char> not vector<bool>
llvm-svn: 19787
2005-01-23 21:45:01 +00:00
Chris Lattner 4add7e356f Adjust to changes in SelectionDAG interfaces
The first half of correct chain insertion for libcalls. This is not enough
to fix Fhourstones yet though.

llvm-svn: 19781
2005-01-23 04:42:50 +00:00
Chris Lattner 90b7c13f3a Remove the 3 HACK HACK HACKs I put in before, fixing them properly with
the new TLI that is available.

Implement support for handling out of range shifts.  This allows us to
compile this code (a 64-bit rotate):

unsigned long long f3(unsigned long long x) {
  return (x << 32) | (x >> (64-32));
}

into this:

f3:
        mov %EDX, DWORD PTR [%ESP + 4]
        mov %EAX, DWORD PTR [%ESP + 8]
        ret

GCC produces this:

$ gcc t.c -masm=intel -O3 -S -o - -fomit-frame-pointer
..
f3:
        push    %ebx
        mov     %ebx, DWORD PTR [%esp+12]
        mov     %ecx, DWORD PTR [%esp+8]
        mov     %eax, %ebx
        mov     %edx, %ecx
        pop     %ebx
        ret

The Simple ISEL produces (eww gross):

f3:
        sub %ESP, 4
        mov DWORD PTR [%ESP], %ESI
        mov %EDX, DWORD PTR [%ESP + 8]
        mov %ECX, DWORD PTR [%ESP + 12]
        mov %EAX, 0
        mov %ESI, 0
        or %EAX, %ECX
        or %EDX, %ESI
        mov %ESI, DWORD PTR [%ESP]
        add %ESP, 4
        ret

llvm-svn: 19780
2005-01-23 04:39:44 +00:00
Chris Lattner ffcb0ae329 Adjust to changes in SelectionDAG interface.
llvm-svn: 19779
2005-01-23 04:36:26 +00:00
Chris Lattner eccb73d57f Get this to work for 64-bit systems.
llvm-svn: 19763
2005-01-22 23:04:37 +00:00
Chris Lattner 52c97fbea9 Implicitly defined registers can clobber callee saved registers too!
This fixes the return-address-not-being-saved problem in the Alpha backend.

llvm-svn: 19741
2005-01-22 00:49:16 +00:00
Chris Lattner 3bc78b2e0b More bugfixes for IA64 shifts.
llvm-svn: 19739
2005-01-22 00:33:03 +00:00
Chris Lattner ec2183713c Fix problems with non-x86 targets.
llvm-svn: 19738
2005-01-22 00:31:52 +00:00
Chris Lattner d637c96fac Add a nasty hack to fix Alpha/IA64 multiplies by a power of two.
llvm-svn: 19737
2005-01-22 00:20:42 +00:00
Chris Lattner d53e763f18 Remove unneeded line.
llvm-svn: 19736
2005-01-21 23:43:12 +00:00
Chris Lattner 4f987bf16d test commit
llvm-svn: 19735
2005-01-21 23:38:56 +00:00
Chris Lattner 96e809c47d Unary token factor nodes are unneeded.
llvm-svn: 19727
2005-01-21 18:01:22 +00:00
Chris Lattner aac464e6c0 Refactor libcall code a bit. Initial implementation of expanding int -> FP
operations for 64-bit integers.

llvm-svn: 19724
2005-01-21 06:05:23 +00:00
Chris Lattner 4d25c04f94 Simplify the shift-expansion code.
llvm-svn: 19721
2005-01-20 20:29:23 +00:00
Chris Lattner b3f83b28a5 Expand add/sub into ADD_PARTS/SUB_PARTS instead of a non-existant libcall.
llvm-svn: 19715
2005-01-20 18:52:28 +00:00
Chris Lattner 1fe9b40981 implement add_parts/sub_parts.
llvm-svn: 19714
2005-01-20 18:50:55 +00:00
Chris Lattner 28d15860bd Add missing entry.
llvm-svn: 19712
2005-01-20 17:32:28 +00:00
Chris Lattner 96c26751ec Support targets that do not use i8 shift amounts.
llvm-svn: 19707
2005-01-19 22:31:21 +00:00
Chris Lattner f840289291 Add an assertion that would have made more sense to duraid
llvm-svn: 19704
2005-01-19 21:32:07 +00:00
Chris Lattner 3d95c14d94 Add support for targets that pass args in registers to calls.
llvm-svn: 19703
2005-01-19 20:24:35 +00:00
Chris Lattner 55562fa99a Fold single use token factor nodes into other token factor nodes.
llvm-svn: 19701
2005-01-19 19:10:54 +00:00
Chris Lattner 0d03eb45a8 Realize the individual pieces of an expanded copytoreg/store/load are
independent of each other.

llvm-svn: 19700
2005-01-19 18:02:17 +00:00
Chris Lattner 9b75e148fd Know some identities about tokenfactor nodes.
llvm-svn: 19699
2005-01-19 18:01:40 +00:00
Chris Lattner 32a5f02598 Know some simple identities. This improves codegen for (1LL << N).
llvm-svn: 19698
2005-01-19 17:29:49 +00:00
Chris Lattner 1cffa73f2a Just in case, handle something that is both a use and a def.
llvm-svn: 19696
2005-01-19 17:11:51 +00:00
Chris Lattner 00c436824f When an instruction moves, make sure to update the VarInfo::Kills list as
well as all of teh other stuff in livevar. This fixes the compiler crash
on fourinarow last night.

llvm-svn: 19695
2005-01-19 17:09:15 +00:00
Chris Lattner ea42c15da9 Use the TargetInstrInfo::commuteInstruction method to commute instructions
instead of doing it manually.

llvm-svn: 19685
2005-01-19 07:08:42 +00:00
Chris Lattner 2a7f8a94f4 Implement a way of expanding shifts. This applies to targets that offer
select operations or to shifts that are by a constant.  This automatically
implements (with no special code) all of the special cases for shift by 32,
shift by < 32 and shift by > 32.

llvm-svn: 19679
2005-01-19 04:19:40 +00:00
Chris Lattner 42993e45b6 Zero is cheaper than sign extend.
llvm-svn: 19675
2005-01-18 21:57:59 +00:00
Chris Lattner d65c3f3118 Fix some fixmes (promoting bools for select and brcond), fix promotion
of zero and sign extends.

llvm-svn: 19671
2005-01-18 19:27:06 +00:00
Chris Lattner a9d53f9fb9 Keep track of the retval type as well.
llvm-svn: 19670
2005-01-18 19:26:36 +00:00
Chris Lattner 9f2c4a5200 Teach legalize to promote copy(from|to)reg, instead of making the isel pass
do it.  This results in better code on X86 for floats (because if strict
precision is not required, we can elide some more expensive double -> float
conversions like the old isel did), and allows other targets to emit
CopyFromRegs that are not legal for arguments.

llvm-svn: 19668
2005-01-18 17:54:55 +00:00
Chris Lattner 2cb338d7b5 Teach legalize to promote SetCC results.
llvm-svn: 19657
2005-01-18 02:59:52 +00:00
Chris Lattner b07e2d2084 Allow setcc operations to have nonbool types.
llvm-svn: 19656
2005-01-18 02:52:03 +00:00
Chris Lattner 2b4b79581d Fix the completely broken FP constant folds for setcc's.
llvm-svn: 19651
2005-01-18 02:11:55 +00:00
Chris Lattner 4d9651c760 Non-volatile loads can be freely reordered against each other. This fixes
X86/reg-pressure.ll again, and allows us to do nice things in other cases.
For example, we now codegen this sort of thing:

int %loadload(int *%X, int* %Y) {
  %Z = load int* %Y
  %Y = load int* %X      ;; load between %Z and store
  %Q = add int %Z, 1
  store int %Q, int* %Y
  ret int %Y
}

Into this:

loadload:
        mov %EAX, DWORD PTR [%ESP + 4]
        mov %EAX, DWORD PTR [%EAX]
        mov %ECX, DWORD PTR [%ESP + 8]
        inc DWORD PTR [%ECX]
        ret

where we weren't able to form the 'inc [mem]' before.  This also lets the
instruction selector emit loads in any order it wants to, which can be good
for register pressure as well.

llvm-svn: 19644
2005-01-17 22:19:26 +00:00
Chris Lattner 4108bb01cf Don't call SelectionDAG.getRoot() directly, go through a forwarding method.
llvm-svn: 19642
2005-01-17 19:43:36 +00:00
Chris Lattner e3c2cf4854 Implement a target independent optimization to codegen arguments only into
the basic block that uses them if possible.  This is a big win on X86, as it
lets us fold the argument loads into instructions and reduce register pressure
(by not loading all of the arguments in the entry block).

For this (contrived to show the optimization) testcase:

int %argtest(int %A, int %B) {
        %X = sub int 12345, %A
        br label %L
L:
        %Y = add int %X, %B
        ret int %Y
}

we used to produce:

argtest:
        mov %ECX, DWORD PTR [%ESP + 4]
        mov %EAX, 12345
        sub %EAX, %ECX
        mov %EDX, DWORD PTR [%ESP + 8]
.LBBargtest_1:  # L
        add %EAX, %EDX
        ret


now we produce:

argtest:
        mov %EAX, 12345
        sub %EAX, DWORD PTR [%ESP + 4]
.LBBargtest_1:  # L
        add %EAX, DWORD PTR [%ESP + 8]
        ret

This also fixes the FIXME in the code.

BTW, this occurs in real code.  164.gzip shrinks from 8623 to 8608 lines of
.s file.  The stack frame in huft_build shrinks from 1644->1628 bytes,
inflate_codes shrinks from 116->108 bytes, and inflate_block from 2620->2612,
due to fewer spills.

Take that alkis. :-)

llvm-svn: 19639
2005-01-17 17:55:19 +00:00
Chris Lattner 16f64df93a Refactor code into a new method.
llvm-svn: 19635
2005-01-17 17:15:02 +00:00
Chris Lattner 5c8a85e2d8 Implement legalize of call nodes.
llvm-svn: 19617
2005-01-16 19:46:48 +00:00
Chris Lattner 3c0dd46a3b Revamp supported ops. Instead of just being supported or not, we now keep
track of how to deal with it, and provide the target with a hook that they
can use to legalize arbitrary operations in arbitrary ways.

Implement custom lowering for a couple of ops, implement promotion for select
operations (which x86 needs).

llvm-svn: 19613
2005-01-16 07:29:19 +00:00
Chris Lattner 897cd7dc0a add method stub
llvm-svn: 19612
2005-01-16 07:28:41 +00:00
Chris Lattner 2c331fbc8f Don't mash stuff together.
llvm-svn: 19611
2005-01-16 07:28:31 +00:00
Chris Lattner 3ba56b3fe7 Implement some more missing promotions.
llvm-svn: 19606
2005-01-16 05:06:12 +00:00
Chris Lattner 73b6977700 Clarify assertion.
llvm-svn: 19597
2005-01-16 02:23:34 +00:00
Chris Lattner 4e550ebb38 Add assertions.
llvm-svn: 19596
2005-01-16 02:23:22 +00:00
Chris Lattner 209f585033 Add support for promoted registers being live across blocks.
llvm-svn: 19595
2005-01-16 02:23:07 +00:00
Chris Lattner 87a769cbd4 Move some information into the TargetLowering object.
llvm-svn: 19583
2005-01-16 01:11:45 +00:00
Chris Lattner d58384fca6 Use the new TLI method to get this.
llvm-svn: 19582
2005-01-16 01:11:19 +00:00
Chris Lattner 71d7f6e86f legalize a bunch of operations that I missed.
llvm-svn: 19580
2005-01-16 00:38:00 +00:00
Chris Lattner a8d34fb8c6 Add support for targets that require promotions.
llvm-svn: 19579
2005-01-16 00:37:38 +00:00
Chris Lattner 207a962c2c Fix some serious bugs in promotion.
llvm-svn: 19578
2005-01-16 00:17:42 +00:00
Chris Lattner 0fe7776da5 Eliminate unneeded extensions.
llvm-svn: 19577
2005-01-16 00:17:20 +00:00
Chris Lattner 4d97864e92 Implement promotion of a whole bunch more operators. I think that this is
basically everything.

llvm-svn: 19576
2005-01-15 22:16:26 +00:00
Chris Lattner 630d1937bf Print extra type for nodes with extra type info.
llvm-svn: 19575
2005-01-15 21:11:37 +00:00
Chris Lattner 99222f706c Add support for legalizing FP_ROUND_INREG, SIGN_EXTEND_INREG, and
ZERO_EXTEND_INREG for targets that don't support them.

llvm-svn: 19573
2005-01-15 07:15:18 +00:00
Chris Lattner 09d1b3d01d Common code factored out.
llvm-svn: 19572
2005-01-15 07:14:32 +00:00
Chris Lattner 5d24d61ae8 implement these methods.
llvm-svn: 19571
2005-01-15 06:52:40 +00:00
Chris Lattner c6c9a5b07d Add support for promoting ADD/MUL.
Add support for new SIGN_EXTEND_INREG, ZERO_EXTEND_INREG, and FP_ROUND_INREG operators.
Realize that if we do any promotions, we need to iterate SelectionDAG
construction.

llvm-svn: 19569
2005-01-15 06:18:18 +00:00
Chris Lattner 1001c6e2cd Add new SIGN_EXTEND_INREG, ZERO_EXTEND_INREG, and FP_ROUND_INREG operators.
llvm-svn: 19568
2005-01-15 06:17:04 +00:00
Chris Lattner 1f2c9d82fa Add intitial support for promoting some operators.
llvm-svn: 19565
2005-01-15 05:21:40 +00:00
Chris Lattner 3b8e719d1d Adjust to CopyFromReg changes, implement deletion of truncating/extending
stores/loads.

llvm-svn: 19562
2005-01-14 22:38:01 +00:00
Chris Lattner 39c6744c9f Start implementing truncating stores and extending loads.
llvm-svn: 19559
2005-01-14 22:08:15 +00:00
Chris Lattner 0ad02bdd3d Improve compatibility with acc
llvm-svn: 19549
2005-01-14 15:54:24 +00:00
Chris Lattner e727af06c8 Add new ImplicitDef node, rename CopyRegSDNode class to RegSDNode.
llvm-svn: 19535
2005-01-13 20:50:02 +00:00
Chris Lattner 2451684678 Don't forget the existing root.
llvm-svn: 19531
2005-01-13 19:53:14 +00:00
Chris Lattner 718b5c2f82 Codegen independent ops as being independent.
llvm-svn: 19528
2005-01-13 17:59:43 +00:00
Chris Lattner 05b4e37e85 Legalize new node, add assertion.
llvm-svn: 19527
2005-01-13 17:59:25 +00:00
Chris Lattner 4b1be0dfeb Print new node.
llvm-svn: 19526
2005-01-13 17:59:10 +00:00
Chris Lattner 4dfd2cfc0c Do not fold (zero_ext (sign_ext V)) -> (sign_ext V), they are not the same.
This fixes llvm-test/SingleSource/Regression/C/casts.c

llvm-svn: 19519
2005-01-12 18:51:15 +00:00
Chris Lattner 40e7982c2c New method
llvm-svn: 19517
2005-01-12 18:37:47 +00:00
Chris Lattner 9864b08f94 Fix sign extend to long. When coming from sbyte, we used to generate:
movsbl 4(%esp), %eax
        movl %eax, %edx
        sarl $7, %edx

Now we generate:

        movsbl 4(%esp), %eax
        movl %eax, %edx
        sarl $31, %edx

Which is right.

llvm-svn: 19515
2005-01-12 18:19:52 +00:00
Reid Spencer 6dced92447 Shut up warnings with GCC 3.4.3 about uninitialized variables.
llvm-svn: 19512
2005-01-12 14:53:45 +00:00
Chris Lattner e05a461f1d Add an option to view the selection dags as they are generated.
llvm-svn: 19498
2005-01-12 03:41:21 +00:00
Chris Lattner c2785562f1 Print the value types in the nodes of the graph
llvm-svn: 19485
2005-01-11 22:21:04 +00:00
Chris Lattner 613f79fcbb add an assertion, avoid creating copyfromreg/copytoreg pairs that are the
same for PHI nodes.

llvm-svn: 19484
2005-01-11 22:03:46 +00:00
Chris Lattner f49c27c65c Squelch optimized warning.
llvm-svn: 19475
2005-01-11 17:46:49 +00:00
Chris Lattner 85d70c6fd5 Teach legalize to lower MEMSET/MEMCPY/MEMMOVE operations if the target
does not support them.

llvm-svn: 19465
2005-01-11 05:57:22 +00:00
Chris Lattner 844277fb1e Print new operations.
llvm-svn: 19464
2005-01-11 05:57:01 +00:00
Chris Lattner 875def9b71 Turn memset/memcpy/memmove into the corresponding operations.
llvm-svn: 19463
2005-01-11 05:56:49 +00:00
Chris Lattner a86fa4455b shift X, 0 -> X
llvm-svn: 19453
2005-01-11 04:25:13 +00:00
Chris Lattner 1308b488ea Print SelectionDAGs bottom up, include extra info in the node labels
llvm-svn: 19447
2005-01-11 00:34:33 +00:00
Chris Lattner b241b443b6 Add a marker for the graph root.
llvm-svn: 19445
2005-01-10 23:52:04 +00:00
Chris Lattner 12be02722f Put the operation name in each node, put the function name on the graph.
llvm-svn: 19444
2005-01-10 23:26:00 +00:00
Chris Lattner 9e4c76123c Split out SDNode::getOperationName into its own method.
llvm-svn: 19443
2005-01-10 23:25:25 +00:00
Chris Lattner 7f65075be3 Implement initial selectiondag printing support. This gets us a nice
graph with no labels! :)

llvm-svn: 19441
2005-01-10 23:08:40 +00:00
Chris Lattner be02d430a9 Lower to the correct functions. This fixes FreeBench/fourinarow
llvm-svn: 19436
2005-01-10 21:02:37 +00:00
Chris Lattner 41b764144d Implement a couple of more simplifications. This lets us codegen:
int test2(int * P, int* Q, int A, int B) {
        return P+A == P;
}

into:

test2:
        movl 4(%esp), %eax
        movl 12(%esp), %eax
        shll $2, %eax
        cmpl $0, %eax
        sete %al
        movzbl %al, %eax
        ret

instead of:

test2:
        movl 4(%esp), %eax
        movl 12(%esp), %ecx
        leal (%eax,%ecx,4), %ecx
        cmpl %eax, %ecx
        sete %al
        movzbl %al, %eax
        ret

ICC is producing worse code:

test2:
        movl      4(%esp), %eax                                 #8.5
        movl      12(%esp), %edx                                #8.5
        lea       (%edx,%edx), %ecx                             #9.9
        addl      %ecx, %ecx                                    #9.9
        addl      %eax, %ecx                                    #9.9
        cmpl      %eax, %ecx                                    #9.16
        movl      $0, %eax                                      #9.16
        sete      %al                                           #9.16
        ret                                                     #9.16

as is GCC (looks like our old code):

test2:
        movl    4(%esp), %edx
        movl    12(%esp), %eax
        leal    (%edx,%eax,4), %ecx
        cmpl    %edx, %ecx
        sete    %al
        movzbl  %al, %eax
        ret

llvm-svn: 19430
2005-01-10 02:03:02 +00:00
Chris Lattner 00c231baa7 Fix incorrect constant folds, fixing Stepanov after the SHR patch.
llvm-svn: 19429
2005-01-10 01:16:03 +00:00
Chris Lattner 0966a75e76 Constant fold shifts, turning this loop:
.LBB_Z5test0PdS__3:     # no_exit.1
        fldl data(,%eax,8)
        fldl 24(%esp)
        faddp %st(1)
        fstl 24(%esp)
        incl %eax
        movl $16000, %ecx
        sarl $3, %ecx
        cmpl %eax, %ecx
        fstpl 16(%esp)
        #FP_REG_KILL
        jg .LBB_Z5test0PdS__3   # no_exit.1

into:

.LBB_Z5test0PdS__3:     # no_exit.1
        fldl data(,%eax,8)
        fldl 24(%esp)
        faddp %st(1)
        fstl 24(%esp)
        incl %eax
        cmpl $2000, %eax
        fstpl 16(%esp)
        #FP_REG_KILL
        jl .LBB_Z5test0PdS__3   # no_exit.1

llvm-svn: 19427
2005-01-10 00:07:15 +00:00
Chris Lattner fde3a212e2 Add some folds for == and != comparisons. This allows us to
codegen this loop in stepanov:

no_exit.i:              ; preds = %entry, %no_exit.i, %then.i, %_Z5checkd.exit
        %i.0.0 = phi int [ 0, %entry ], [ %i.0.0, %no_exit.i ], [ %inc.0, %_Z5checkd.exit ], [ %inc.012, %then.i ]              ; <int> [#uses=3]
        %indvar = phi uint [ %indvar.next, %no_exit.i ], [ 0, %entry ], [ 0, %then.i ], [ 0, %_Z5checkd.exit ]          ; <uint> [#uses=3]
        %result_addr.i.0 = phi double [ %tmp.4.i.i, %no_exit.i ], [ 0.000000e+00, %entry ], [ 0.000000e+00, %then.i ], [ 0.000000e+00, %_Z5checkd.exit ]          ; <double> [#uses=1]
        %first_addr.0.i.2.rec = cast uint %indvar to int                ; <int> [#uses=1]
        %first_addr.0.i.2 = getelementptr [2000 x double]* %data, int 0, uint %indvar           ; <double*> [#uses=1]
        %inc.i.rec = add int %first_addr.0.i.2.rec, 1           ; <int> [#uses=1]
        %inc.i = getelementptr [2000 x double]* %data, int 0, int %inc.i.rec            ; <double*> [#uses=1]
        %tmp.3.i.i = load double* %first_addr.0.i.2             ; <double> [#uses=1]
        %tmp.4.i.i = add double %result_addr.i.0, %tmp.3.i.i            ; <double> [#uses=2]
        %tmp.2.i = seteq double* %inc.i, getelementptr ([2000 x double]* %data, int 0, int 2000)                ; <bool> [#uses=1]
        %indvar.next = add uint %indvar, 1              ; <uint> [#uses=1]
        br bool %tmp.2.i, label %_Z10accumulateIPddET0_T_S2_S1_.exit, label %no_exit.i

To this:

.LBB_Z4testIPddEvT_S1_T0__1:    # no_exit.i
        fldl data(,%eax,8)
        fldl 16(%esp)
        faddp %st(1)
        fstpl 16(%esp)
        incl %eax
        movl %eax, %ecx
        shll $3, %ecx
        cmpl $16000, %ecx
        #FP_REG_KILL
        jne .LBB_Z4testIPddEvT_S1_T0__1 # no_exit.i

instead of this:

.LBB_Z4testIPddEvT_S1_T0__1:    # no_exit.i
        fldl data(,%eax,8)
        fldl 16(%esp)
        faddp %st(1)
        fstpl 16(%esp)
        incl %eax
        leal data(,%eax,8), %ecx
        leal data+16000, %edx
        cmpl %edx, %ecx
        #FP_REG_KILL
        jne .LBB_Z4testIPddEvT_S1_T0__1 # no_exit.i

llvm-svn: 19425
2005-01-09 20:52:51 +00:00
Jeff Cohen 7d1670da3f Fix VC++ compilation error
llvm-svn: 19423
2005-01-09 20:41:56 +00:00
Chris Lattner e6f7882c27 Print the DAG out more like a DAG in nested format.
llvm-svn: 19422
2005-01-09 20:38:33 +00:00
Chris Lattner 1270acc1ce Print out nodes sorted by their address to make it easier to find them in a list.
llvm-svn: 19421
2005-01-09 20:26:36 +00:00
Chris Lattner 3d5d5022d5 Add a simple transformation. This allows us to compile one of the inner
loops in stepanov to this:

.LBB_Z5test0PdS__2:     # no_exit.1
        fldl data(,%eax,8)
        fldl 24(%esp)
        faddp %st(1)
        fstl 24(%esp)
        incl %eax
        cmpl $2000, %eax
        fstpl 16(%esp)
        #FP_REG_KILL
        jl .LBB_Z5test0PdS__2

instead of this:

.LBB_Z5test0PdS__2:     # no_exit.1
        fldl data(,%eax,8)
        fldl 24(%esp)
        faddp %st(1)
        fstl 24(%esp)
        incl %eax
        movl $data, %ecx
        movl %ecx, %edx
        addl $16000, %edx
        subl %ecx, %edx
        movl %edx, %ecx
        sarl $2, %ecx
        shrl $29, %ecx
        addl %ecx, %edx
        sarl $3, %edx
        cmpl %edx, %eax
        fstpl 16(%esp)
        #FP_REG_KILL
        jl .LBB_Z5test0PdS__2

The old instruction selector produced:

.LBB_Z5test0PdS__2:     # no_exit.1
        fldl 24(%esp)
        faddl data(,%eax,8)
        fstl 24(%esp)
        movl %eax, %ecx
        incl %ecx
        incl %eax
        leal data+16000, %edx
        movl $data, %edi
        subl %edi, %edx
        movl %edx, %edi
        sarl $2, %edi
        shrl $29, %edi
        addl %edi, %edx
        sarl $3, %edx
        cmpl %edx, %ecx
        fstpl 16(%esp)
        #FP_REG_KILL
        jl .LBB_Z5test0PdS__2   # no_exit.1

Which is even worse!

llvm-svn: 19419
2005-01-09 20:09:57 +00:00