Commit Graph

1726 Commits

Author SHA1 Message Date
Owen Anderson f35a1dbc7a Remove ImmediateDominator analysis. The same information can be obtained from DomTree. A lot of code for
constructing ImmediateDominator is now folded into DomTree construction.

This is part of the ongoing work for PR217.

llvm-svn: 36063
2007-04-15 08:47:27 +00:00
Chris Lattner 4a6e0cbd41 Extend store merging to support the 'if/then' version in addition to if/then/else.
This sinks the two stores in this example into a single store in cond_next.  In this
case, it allows elimination of the load as well:

        store double 0.000000e+00, double* @s.3060
        %tmp3 = fcmp ogt double %tmp1, 5.000000e-01             ; <i1> [#uses=1]
        br i1 %tmp3, label %cond_true, label %cond_next
cond_true:              ; preds = %entry
        store double 1.000000e+00, double* @s.3060
        br label %cond_next
cond_next:              ; preds = %entry, %cond_true
        %tmp6 = load double* @s.3060            ; <double> [#uses=1]

This implements Transforms/InstCombine/store-merge.ll:test2

llvm-svn: 36040
2007-04-15 01:02:18 +00:00
Chris Lattner 14a251b937 refactor some code, no functionality change.
llvm-svn: 36037
2007-04-15 00:07:55 +00:00
Chris Lattner 28d921d04f fix long lines
llvm-svn: 36031
2007-04-14 23:32:02 +00:00
Chris Lattner 7bfdd0abe1 Implement Transforms/InstCombine/vec_extract_elt.ll, transforming:
define i32 @test(float %f) {
        %tmp7 = insertelement <4 x float> undef, float %f, i32 0
        %tmp17 = bitcast <4 x float> %tmp7 to <4 x i32>
        %tmp19 = extractelement <4 x i32> %tmp17, i32 0
        ret i32 %tmp19
}

into:

define i32 @test(float %f) {
        %tmp19 = bitcast float %f to i32                ; <i32> [#uses=1]
        ret i32 %tmp19
}

On PPC, this is the difference between:

_test:
        mfspr r2, 256
        oris r3, r2, 8192
        mtspr 256, r3
        stfs f1, -16(r1)
        addi r3, r1, -16
        addi r4, r1, -32
        lvx v2, 0, r3
        stvx v2, 0, r4
        lwz r3, -32(r1)
        mtspr 256, r2
        blr

and:

_test:
        stfs f1, -4(r1)
        nop
        nop
        nop
        lwz r3, -4(r1)
        blr

llvm-svn: 36025
2007-04-14 23:02:14 +00:00
Chris Lattner b37fb6a0da Implement InstCombine/vec_demanded_elts.ll:test2. This allows us to turn
unsigned test(float f) {
 return _mm_cvtsi128_si32( (__m128i) _mm_set_ss( f*f ));
}

into:

_test:
        movss 4(%esp), %xmm0
        mulss %xmm0, %xmm0
        movd %xmm0, %eax
        ret

instead of:

_test:
        movss 4(%esp), %xmm0
        mulss %xmm0, %xmm0
        xorps %xmm1, %xmm1
        movss %xmm0, %xmm1
        movd %xmm1, %eax
        ret

GCC gets:

_test:
        subl    $28, %esp
        movss   32(%esp), %xmm0
        mulss   %xmm0, %xmm0
        xorps   %xmm1, %xmm1
        movss   %xmm0, %xmm1
        movaps  %xmm1, %xmm0
        movd    %xmm0, 12(%esp)
        movl    12(%esp), %eax
        addl    $28, %esp
        ret

llvm-svn: 36020
2007-04-14 22:29:23 +00:00
Chris Lattner efb33d28c6 Implement PR1201 and test/Transforms/InstCombine/malloc-free-delete.ll
llvm-svn: 35981
2007-04-14 00:20:02 +00:00
Chris Lattner 164b76565b use an accessor to simplify code.
llvm-svn: 35979
2007-04-14 00:17:39 +00:00
Chris Lattner efd3051d60 Now that codegen prepare isn't defeating me, I can finally fix what I set
out to do! :)

This fixes a problem where LSR would insert a bunch of code into each MBB
that uses a particular subexpression (e.g. IV+base+C).  The problem is that
this code cannot be CSE'd back together if inserted into different blocks.

This patch changes LSR to attempt to insert a single copy of this code and
share it, allowing codegenprepare to duplicate the code if it can be sunk
into various addressing modes.  On CodeGen/ARM/lsr-code-insertion.ll,
for example, this gives us code like:

        add r8, r0, r5
        str r6, [r8, #+4]
..
        ble LBB1_4      @cond_next
LBB1_3: @cond_true
        str r10, [r8, #+4]
LBB1_4: @cond_next
...
LBB1_5: @cond_true55
        ldr r6, LCPI1_1
        str r6, [r8, #+4]

instead of:

        add r10, r0, r6
        str r8, [r10, #+4]
...
        ble LBB1_4      @cond_next
LBB1_3: @cond_true
        add r8, r0, r6
        str r10, [r8, #+4]
LBB1_4: @cond_next
...
LBB1_5: @cond_true55
        add r8, r0, r6
        ldr r10, LCPI1_1
        str r10, [r8, #+4]

Besides being smaller and more efficient, this makes it immediately
obvious that it is profitable to predicate LBB1_3 now :)

llvm-svn: 35972
2007-04-13 20:42:26 +00:00
Chris Lattner feee64e997 Completely rewrite addressing-mode related sinking of code. In particular,
this fixes problems where codegenprepare would sink expressions into load/stores
that are not valid, and fixes cases where it would miss important valid ones.

This fixes several serious codesize and perf issues, particularly on targets
with complex addressing modes like arm and x86.  For example, now we compile
CodeGen/X86/isel-sink.ll to:

_test:
        movl 8(%esp), %eax
        movl 4(%esp), %ecx
        cmpl $1233, %eax
        ja LBB1_2       #F
LBB1_1: #T
        movl $4, (%ecx,%eax,4)
        movl $141, %eax
        ret
LBB1_2: #F
        movl (%ecx,%eax,4), %eax
        ret

instead of:

_test:
        movl 8(%esp), %eax
        leal (,%eax,4), %ecx
        addl 4(%esp), %ecx
        cmpl $1233, %eax
        ja LBB1_2       #F
LBB1_1: #T
        movl $4, (%ecx)
        movl $141, %eax
        ret
LBB1_2: #F
        movl (%ecx), %eax
        ret

llvm-svn: 35970
2007-04-13 20:30:56 +00:00
Chris Lattner 5ee4d0726a Fix Transforms/ScalarRepl/union-pointer.ll
llvm-svn: 35906
2007-04-11 15:45:25 +00:00
Chris Lattner 74ff60ff84 Turn stuff like:
icmp slt i32 %X, 0              ; <i1>:0 [#uses=1]
        sext i1 %0 to i32               ; <i32>:1 [#uses=1]

into:

        %X.lobit = ashr i32 %X, 31              ; <i32> [#uses=1]

This implements InstCombine/icmp.ll:test[34]

llvm-svn: 35891
2007-04-11 06:57:46 +00:00
Chris Lattner d0f7942e23 Simplify some comparisons to arithmetic, this implements:
Transforms/InstCombine/icmp.ll

llvm-svn: 35890
2007-04-11 06:53:04 +00:00
Chris Lattner 20f2372a7c canonicalize (x <u 2147483648) -> (x >s -1) and (x >u 2147483647) -> (x <s 0)
llvm-svn: 35886
2007-04-11 06:12:58 +00:00
Chris Lattner 7ddbff090a fix a miscompilation of:
define i32 @test(i32 %X) {
entry:
        %Y = and i32 %X, 4              ; <i32> [#uses=1]
        icmp eq i32 %Y, 0               ; <i1>:0 [#uses=1]
        sext i1 %0 to i32               ; <i32>:1 [#uses=1]
        ret i32 %1
}

by moving code out of commonIntCastTransforms into visitZExt.  Simplify the
APInt gymnastics in it etc.

llvm-svn: 35885
2007-04-11 05:45:39 +00:00
Chris Lattner 32104034f8 fix a regression introduced by my last patch.
llvm-svn: 35879
2007-04-11 03:27:24 +00:00
Chris Lattner daa012d1fb Simplify SROA conversion to integer in some ways, make it more general in others.
We now tolerate small amounts of undefined behavior, better emulating what
would happen if the transaction actually occurred in memory.  This fixes
SingleSource/UnitTests/2007-04-10-BitfieldTest.c on PPC, at least until
Devang gets a chance to fix the CFE from doing undefined things with bitfields :)

llvm-svn: 35875
2007-04-11 00:57:54 +00:00
Chris Lattner 467b69cabb Strengthen the boundary conditions of this fold, implementing
InstCombine/set.ll:test25

llvm-svn: 35852
2007-04-09 23:52:13 +00:00
Chris Lattner 3e9690f987 eliminate the last uses of some TLI methods.
llvm-svn: 35844
2007-04-09 23:29:07 +00:00
Chris Lattner 780c009756 switch LSR to use isLegalAddressingMode instead of other simpler hooks
llvm-svn: 35837
2007-04-09 22:20:14 +00:00
Devang Patel bca0d57179 Check _all_ PHINodes.
llvm-svn: 35836
2007-04-09 22:20:10 +00:00
Devang Patel 8eb8eeada9 Insert new pre-header before new header. Original pre-header may
happen to be an entry, in such case, it is not a good idea to
insert new block before entry.

Also fix typo in assertion check.

llvm-svn: 35833
2007-04-09 21:40:43 +00:00
Devang Patel 854197884b Preserve canonical loop form.
llvm-svn: 35829
2007-04-09 20:19:46 +00:00
Devang Patel b9af5747a5 Do not create new pre-header. Reuse original pre-header.
llvm-svn: 35825
2007-04-09 19:04:21 +00:00
Devang Patel 03d7ae3a74 Simpler for() loops.
llvm-svn: 35822
2007-04-09 17:09:13 +00:00
Devang Patel d6ba41e02d Fix future bug. Of course, Chris spotted this.
Handle Argument or Undef as an incoming PHI value.

llvm-svn: 35821
2007-04-09 16:41:46 +00:00
Devang Patel b28a391a8d More cosmetic changes.
llvm-svn: 35820
2007-04-09 16:21:29 +00:00
Devang Patel 88bc2c6f82 Only cosmetic changes. Zero functionality Change.
llvm-svn: 35819
2007-04-09 16:11:48 +00:00
Chris Lattner a87c9f6114 Fix PR1304 and Transforms/InstCombine/2007-04-08-SingleEltVectorCrash.ll
llvm-svn: 35792
2007-04-09 01:37:55 +00:00
Chris Lattner 4ca9cbb170 Eliminate useless insertelement instructions. This implements
Transforms/InstCombine/vec_insertelt.ll and fixes PR1286.

We now compile the code from that bug into:

_foo:
        movl 4(%esp), %eax
        movdqa (%eax), %xmm0
        movl 8(%esp), %ecx
        psllw (%ecx), %xmm0
        movdqa %xmm0, (%eax)
        ret

instead of:

_foo:
        subl $4, %esp
        movl %ebp, (%esp)
        movl %esp, %ebp
        movl 12(%ebp), %eax
        movdqa (%eax), %xmm0
        #IMPLICIT_DEF %eax
        pinsrw $2, %eax, %xmm0
        xorl %ecx, %ecx
        pinsrw $3, %ecx, %xmm0
        pinsrw $4, %eax, %xmm0
        pinsrw $5, %ecx, %xmm0
        pinsrw $6, %eax, %xmm0
        pinsrw $7, %ecx, %xmm0
        movl 8(%ebp), %eax
        movdqa (%eax), %xmm1
        psllw %xmm0, %xmm1
        movdqa %xmm1, (%eax)
        movl %ebp, %esp
        popl %ebp
        ret

woo :)

llvm-svn: 35788
2007-04-09 01:11:16 +00:00
Chris Lattner c8d3788f71 reenable this xform, whoops :)
llvm-svn: 35765
2007-04-08 08:01:49 +00:00
Chris Lattner 7621a031d8 Fix regression on Instcombine/apint-or2.ll
llvm-svn: 35763
2007-04-08 07:55:22 +00:00
Chris Lattner 1150df9cc4 Generalize the code that handles (A&B)|(A&C) to work where B/C are not constants.
Add a new xform to simplify (A&B)|(~A&C).  THis implements InstCombine/or2.ll:test1

llvm-svn: 35760
2007-04-08 07:47:01 +00:00
Nick Lewycky d4f51a8ae3 Add support for cast instructions.
llvm-svn: 35734
2007-04-07 15:48:32 +00:00
Owen Anderson 8763ba1b88 Completely purge DomSet. This is the (hopefully) final patch for PR1171.
llvm-svn: 35731
2007-04-07 07:17:27 +00:00
Nick Lewycky 93f541057b Support NE inequality in ValueRanges.
llvm-svn: 35724
2007-04-07 04:49:12 +00:00
Nick Lewycky 3bb6de85d1 Cleanup. Refactor out the applying of value ranges to its own method.
llvm-svn: 35719
2007-04-07 03:36:51 +00:00
Nick Lewycky 12d44abe0f Use TargetData to find the size of a type.
llvm-svn: 35718
2007-04-07 03:16:12 +00:00
Nick Lewycky eeb01b41ef Strengthen icmp snuggling by doing 'compare-or-equal-to' to 'compare'
first and then range testing second.

llvm-svn: 35715
2007-04-07 02:30:14 +00:00
Devang Patel f42389ffe5 Add loop rotation pass.
llvm-svn: 35714
2007-04-07 01:25:15 +00:00
Chris Lattner 3dbe65f80a implement Transforms/InstCombine/malloc2.ll and PR1313
llvm-svn: 35700
2007-04-06 18:57:34 +00:00
Chris Lattner 108083edff Use a worklist-driven algorithm instead of a recursive one.
llvm-svn: 35680
2007-04-05 01:27:02 +00:00
Dale Johannesen 7c2001d014 Prevent transformConstExprCastCall from generating conversions that assert
elsewhere.

llvm-svn: 35668
2007-04-04 19:16:42 +00:00
Jeff Cohen 5a1c750f31 Fix 2007-04-04-BadFoldBitcastIntoMalloc.ll
llvm-svn: 35665
2007-04-04 16:58:57 +00:00
Duncan Sands f01a47c93c Fix comment.
llvm-svn: 35655
2007-04-04 06:42:45 +00:00
Chris Lattner e5bbb3cb1a Fix a bug I introduced with my patch yesterday which broke Qt (I converted
some constant exprs to apints).

Thanks to Anton for tracking down a small testcase that triggered this!

llvm-svn: 35633
2007-04-03 23:29:39 +00:00
Chris Lattner a74deafb13 reinstate the previous two patches, with a bugfix :)
ldecod now passes.

llvm-svn: 35626
2007-04-03 17:43:25 +00:00
Evan Cheng 7511fa280d Reverting back to 1.723. The last two commits broke JM (and possibily others) on ARM.
llvm-svn: 35620
2007-04-03 08:11:50 +00:00
Chris Lattner 81e0707552 split some code out into a helper function
llvm-svn: 35615
2007-04-03 05:11:24 +00:00
Chris Lattner 64c764cebc Split a whole ton of code out of visitICmpInst into visitICmpInstWithInstAndIntCst.
llvm-svn: 35614
2007-04-03 04:46:52 +00:00