Commit Graph

11329 Commits

Author SHA1 Message Date
Nate Begeman bd5f41a6a6 Legalize BUILD_PAIR appropriately for upcoming 64 bit PowerPC work.
llvm-svn: 23776
2005-10-18 00:27:41 +00:00
Nate Begeman ec48a1bfbd fold fmul X, +2.0 -> fadd X, X;
llvm-svn: 23774
2005-10-17 20:40:11 +00:00
Chris Lattner da1b152c43 Make this work for FP constantexprs
llvm-svn: 23773
2005-10-17 20:18:38 +00:00
Chris Lattner 7fde91e365 Oops, X+0.0 isn't foldable, but X+-0.0 is.
llvm-svn: 23772
2005-10-17 17:56:38 +00:00
Chris Lattner 32979336a7 relax this a bit, as we only support the default rounding mode
llvm-svn: 23771
2005-10-17 17:49:32 +00:00
Chris Lattner eeb2bda2fa add a trivial fold
llvm-svn: 23764
2005-10-17 01:07:11 +00:00
Nate Begeman 6cca84e43c More PPC32 -> PPC changes, as well as merging some classes that were
redundant after the change.

llvm-svn: 23759
2005-10-16 05:39:50 +00:00
Chris Lattner e540800d5a Fix this logic.
llvm-svn: 23756
2005-10-15 22:35:40 +00:00
Chris Lattner 17cc9edd33 Add a case we were missing that was causing us to fail CodeGen/PowerPC/rlwinm.ll:test3
llvm-svn: 23755
2005-10-15 22:18:08 +00:00
Nate Begeman c0896117d3 Remove some dead code now that the dag combiner exists.
llvm-svn: 23754
2005-10-15 22:08:02 +00:00
Chris Lattner d869bec4fe Remove some dead code: the ORI/ORIS cases are autogen'd. This makes
SelectIntImmediateExpr dead.

llvm-svn: 23753
2005-10-15 22:06:18 +00:00
Chris Lattner 03354280eb prune #includes
llvm-svn: 23752
2005-10-15 21:58:54 +00:00
Chris Lattner a52969c8d6 These instructions are now autogenerated
llvm-svn: 23751
2005-10-15 21:44:56 +00:00
Chris Lattner 286c1d7cfa Add a pattern for FSQRTS
llvm-svn: 23750
2005-10-15 21:44:15 +00:00
Chris Lattner efa382616b remove dead code
llvm-svn: 23749
2005-10-15 21:40:12 +00:00
Chris Lattner b986f471be Use getExtLoad here instead of getNode, as extloads produce two values. This
fixes a legalize failure on SPASS for itanium.

llvm-svn: 23747
2005-10-15 20:24:07 +00:00
Chris Lattner e33870d154 remove broken SRA/rlwimi case
llvm-svn: 23746
2005-10-15 19:04:48 +00:00
Chris Lattner 6f3b954662 Rename PPC32*.h to PPC*.h
This completes the grand PPC file renaming

llvm-svn: 23745
2005-10-14 23:59:06 +00:00
Chris Lattner 0aa794ba5b Merge PPCJITInfo.h and PPC32JITInfo.h. Note that the PowerPCJITInfo
and PPC32JITInfo classes should be merged.

llvm-svn: 23744
2005-10-14 23:53:41 +00:00
Chris Lattner bfca1ab79d Rename PowerPC*.h to PPC*.h
llvm-svn: 23743
2005-10-14 23:51:18 +00:00
Chris Lattner e80bf1b33a Rename PowerPCInstrBuilder.h -> PPC*
llvm-svn: 23742
2005-10-14 23:45:43 +00:00
Chris Lattner 2ed745a905 Nuke the PowerPCTargetMachine.h header. Note that the PowerPCTargetMachine
still should be merged into the PPC32TargetMachine class

llvm-svn: 23741
2005-10-14 23:44:05 +00:00
Chris Lattner 7503d46feb Rename PowerPC*.td -> PPC*.td
llvm-svn: 23740
2005-10-14 23:40:39 +00:00
Chris Lattner f3b97f53b9 These are dead
llvm-svn: 23739
2005-10-14 23:38:51 +00:00
Chris Lattner 0921e3bfc1 Eliminate PowerPC.td and PPC32.td, consolidating them into PPC.td
llvm-svn: 23738
2005-10-14 23:37:35 +00:00
Chris Lattner 09cd9e7661 Like the comment says...
llvm-svn: 23737
2005-10-14 22:48:24 +00:00
Chris Lattner 2121f3ca50 Nuke PowerPCInstrFormats.h, its contents are dead. Remove the definitions
from the .td file that correspond to it

llvm-svn: 23736
2005-10-14 22:44:13 +00:00
Nate Begeman 9d7008b08d Properly split f32 and f64 into separate register classes for scalar sse fp
fixing a bunch of nasty hackery

llvm-svn: 23735
2005-10-14 22:06:00 +00:00
Nate Begeman c41e1be2e8 Remove an unnecsesary file. PPC32 and PPC64 share architected registers.
We will decide with subtarget support whether we ever use an i64 register
class.

llvm-svn: 23734
2005-10-14 18:58:46 +00:00
Chris Lattner 56f31f5408 add the integer truncate/extension operations
llvm-svn: 23733
2005-10-14 06:40:20 +00:00
Chris Lattner 7d9f719d42 These are now autogenerated
llvm-svn: 23731
2005-10-14 06:26:29 +00:00
Chris Lattner 9c0d3c5932 Add patterns for FP round/extend
llvm-svn: 23727
2005-10-14 04:55:50 +00:00
Chris Lattner 6e83cbf7f3 add a new SDTCisOpSmallerThanOp type constraint, and implement fround/fextend in terms of it
llvm-svn: 23726
2005-10-14 04:55:10 +00:00
Nate Begeman 6e673b24d3 fold sext_in_reg, sext_in_reg where both have the same VT. This was
popping up in Fourinarow.

llvm-svn: 23722
2005-10-14 01:29:07 +00:00
Chris Lattner e3870fbe4a Allow $
llvm-svn: 23721
2005-10-14 01:28:34 +00:00
Nate Begeman d59e5a7abb Relax the checking on zextload generation a bit, since as sabre pointed out
you could be AND'ing with the result of a shift that shifts out all the
bits you care about, in addition to a constant.

Also, move over an add/sub_parts fold from legalize to the dag combiner,
where it works for things other than constants.  Woot!

llvm-svn: 23720
2005-10-14 01:12:21 +00:00
Chris Lattner b8282987f4 Fix the trunc(load) case, finally allowing crafty and povray to pass
llvm-svn: 23718
2005-10-13 22:10:05 +00:00
Chris Lattner dbc5ae3109 Fix some bugs in (sext (load x))
llvm-svn: 23717
2005-10-13 21:52:31 +00:00
Chris Lattner 258521d7ea When ExpandOp'ing a [SZ]EXTLOAD, make sure to remember that the chain
is also legal.  Add support for ExpandOp'ing raw EXTLOADs too.

llvm-svn: 23716
2005-10-13 21:44:47 +00:00
Chris Lattner d23f4b7411 Implement PromoteOp for *EXTLOAD, allowing MallocBench/gs to Legalize
llvm-svn: 23715
2005-10-13 20:07:41 +00:00
Nate Begeman 8e022b3d89 Fix the remaining DAGCombiner issues pointed out by sabre. This should fix
the remainder of the failures introduced by my patch last night.

llvm-svn: 23714
2005-10-13 18:34:58 +00:00
Chris Lattner a80f1f6e72 Fix a minor bug in the dag combiner that broke pcompress2 and some other
tests.

llvm-svn: 23713
2005-10-13 18:16:34 +00:00
Nate Begeman c3a89c5259 Add support to Legalize for expanding i64 sextload/zextload into hi and lo
parts. This should fix the crafty and signed long long unit test failure
on x86 last night.

llvm-svn: 23711
2005-10-13 17:15:37 +00:00
Jim Laskey 5d7a50ac44 Inhibit instructions from being pushed before function calls. This will
minimize unnecessary spilling.

llvm-svn: 23710
2005-10-13 16:44:00 +00:00
Nate Begeman 02b23c6065 Move some Legalize functionality over to the DAGCombiner where it belongs.
Kill some dead code.

llvm-svn: 23706
2005-10-13 03:11:28 +00:00
Nate Begeman 70d28c5e32 Fix a potential bug with two combine-to's back to back that chris pointed
out, where after the first CombineTo() call, the node the second CombineTo
wishes to replace may no longer exist.

Fix a very real bug with the truncated load optimization on little endian
targets, which do not need a byte offset added to the load.

llvm-svn: 23704
2005-10-12 23:18:53 +00:00
Nate Begeman 8caf81d617 More cool stuff for the dag combiner. We can now finally handle things
like turning:

_foo:
        fctiwz f0, f1
        stfd f0, -8(r1)
        lwz r2, -4(r1)
        rlwinm r3, r2, 0, 16, 31
        blr

into
_foo:
        fctiwz f0,f1
        stfd f0,-8(r1)
        lhz r3,-2(r1)
        blr

Also removed an unncessary constraint from sra -> srl conversion, which
should take care of hte only reason we would ever need to handle sra in
MaskedValueIsZero, AFAIK.

llvm-svn: 23703
2005-10-12 20:40:40 +00:00
Jim Laskey 63b1419b74 Finally committing to the new scheduler. Still -sched=none by default.
llvm-svn: 23702
2005-10-12 18:29:35 +00:00
Jim Laskey d00db257c7 Added graphviz/gv support for MF.
llvm-svn: 23700
2005-10-12 12:09:05 +00:00
Chris Lattner 192cd18f53 Fix (hopefully the last) issue where LSR is nondeterminstic. When pulling
out CSE's of base expressions it could build a result whose order was
nondet.

llvm-svn: 23698
2005-10-11 18:41:04 +00:00
Chris Lattner 5c9d63da31 Fix another problem where LSR was being nondeterminstic. Also remove elements
from the end of a vector instead of the beginning

llvm-svn: 23697
2005-10-11 18:30:57 +00:00
Chris Lattner b7a3894e7c Fix another lsr-is-nondeterministic case
llvm-svn: 23695
2005-10-11 18:17:57 +00:00
Chris Lattner 514f058be1 Fix a powerpc crash on CodeGen/Generic/llvm-ct-intrinsics.ll
llvm-svn: 23694
2005-10-11 17:56:34 +00:00
Chris Lattner c38fb8e2a1 Add a canonicalization that got lost, fixing PowerPC/fold-li.ll:SUB
llvm-svn: 23693
2005-10-11 06:07:15 +00:00
Chris Lattner cc6e53e6ee clean up some corner cases
llvm-svn: 23692
2005-10-10 23:00:08 +00:00
Chris Lattner 04c737091f Implement trivial DSE. If two stores are neighbors and store to the same
location, replace them with a new store of the last value.  This occurs
in the same neighborhood in 197.parser, speeding it up about 1.5%

llvm-svn: 23691
2005-10-10 22:31:19 +00:00
Chris Lattner e260ed8628 Add support for CombineTo, allowing the dag combiner to replace nodes with
multiple results.

Use this support to implement trivial store->load forwarding, implementing
CodeGen/PowerPC/store-load-fwd.ll.  Though this is the most simple case and
can be extended in the future, it is still useful.  For example, it speeds
up 197.parser by 6.2% by avoiding an LSU reject in xalloc:

        stw r6, lo16(l5_end_of_array)(r2)
        addi r2, r5, -4
        stwx r5, r4, r2
-       lwzx r5, r4, r2
-       rlwinm r5, r5, 0, 0, 30
        stwx r5, r4, r2
        lwz r2, -4(r4)
        ori r2, r2, 1

llvm-svn: 23690
2005-10-10 22:04:48 +00:00
Nate Begeman 6828ed9bfd Teach the DAGCombiner several new tricks, teaching it how to turn
sext_inreg into zext_inreg based on the signbit (fires a lot), srem into
urem, etc.

llvm-svn: 23688
2005-10-10 21:26:48 +00:00
Chris Lattner 7730924067 Fix comment
llvm-svn: 23686
2005-10-10 16:52:03 +00:00
Chris Lattner 3d1d4a3d12 Add ISD::ADD to MaskedValueIsZero
llvm-svn: 23685
2005-10-10 16:51:40 +00:00
Chris Lattner 56e44a6da5 This function is now dead
llvm-svn: 23684
2005-10-10 16:49:22 +00:00
Chris Lattner bcfebebf22 Enable Nate's excellent DAG combiner work by default. This allows the
removal of a bunch of ad-hoc and crufty code from SelectionDAG.cpp.

llvm-svn: 23682
2005-10-10 16:47:10 +00:00
Chris Lattner d59a57a8d5 These definitions have been moved to common code.
llvm-svn: 23681
2005-10-10 06:01:00 +00:00
Chris Lattner d83571bbf2 Pull DAG ISel generation nodes out of the PowerPC backend to where they
can be used by other targets.  For those targets that want to use it,
have at.  :)

llvm-svn: 23680
2005-10-10 06:00:30 +00:00
Chris Lattner 6a49b7cabb add a todo for something I noticed
llvm-svn: 23679
2005-10-09 22:59:08 +00:00
Chris Lattner 1d3dc00674 (X & Y) & C == 0 if either X&C or Y&C are zero
llvm-svn: 23678
2005-10-09 22:12:36 +00:00
Chris Lattner 03b9eb506c Make MaskedValueIsZero a bit more aggressive
llvm-svn: 23677
2005-10-09 22:08:50 +00:00
Andrew Lenharth 1dfb85c7af This seems useful from the original patch that added the function. If there is a reason it is not useful on a RISC type target, let me know and I will pull it out
llvm-svn: 23676
2005-10-09 20:11:35 +00:00
Chris Lattner 62010c450f Fix funky xcode indentation
llvm-svn: 23674
2005-10-09 06:36:35 +00:00
Chris Lattner eb4be8b942 Hrm, you didn't see this.
llvm-svn: 23673
2005-10-09 06:24:02 +00:00
Chris Lattner 4ea0a3eaac Fix a source of non-determinism in the backend: the order of processing
IV strides dependend on the pointer order of the strides in memory.
Non-determinism is bad.

llvm-svn: 23672
2005-10-09 06:20:55 +00:00
Chris Lattner 0832f2635a When emiting a CopyFromReg and the source is already a vreg, do not bother
creating a new vreg and inserting a copy: just use the input vreg directly.

This speeds up the compile (e.g. about 5% on mesa with a debug build of llc)
by not adding a bunch of copies and vregs to be coallesced away.  On mesa,
for example, this reduces the number of intervals from 168601 to 129040
going into the coallescer.

llvm-svn: 23671
2005-10-09 05:58:56 +00:00
Chris Lattner 89c7fa22b1 Disable formation of rlwinm instructions from SRA bases. This fixes
the 177.mesa failure from last night, and fixes the
CodeGen/PowerPC/2005-10-08-ArithmeticRotate.ll regression test I added.
If this code cannot be fixed, it should be removed for good, but I'll leave
it to Nate to decide its fate.

llvm-svn: 23670
2005-10-09 05:36:17 +00:00
Nate Begeman 967ce74980 Remove another unused file. Preparing for the great "enable i64 on ppc32"
merge, and using subtarget info for ptr size.

llvm-svn: 23668
2005-10-08 01:32:34 +00:00
Nate Begeman af72457fc4 Remove a file that is no longer used
llvm-svn: 23666
2005-10-08 01:21:27 +00:00
Nate Begeman 2042aa5b92 Lo and behold, the last bits of SelectionDAG.cpp have been moved over.
llvm-svn: 23665
2005-10-08 00:29:44 +00:00
Chris Lattner dae96f8881 When preselecting, favor things that have low depth to select first. This
is faster and uses less stack space.  This reduces our stack requirement
enough to compile sixtrack, and though it's a hack, should be enough until
we switch to iterative isel

llvm-svn: 23664
2005-10-07 22:10:27 +00:00
Chris Lattner be4bbca0ba remove debugging code
llvm-svn: 23663
2005-10-07 15:31:26 +00:00
Chris Lattner fb12624a3f implement CodeGen/PowerPC/div-2.ll:test2-4 by propagating zero bits through
C-X's

llvm-svn: 23662
2005-10-07 15:30:32 +00:00
Chris Lattner b27a4147d3 fix indentation
llvm-svn: 23660
2005-10-07 06:37:02 +00:00
Chris Lattner 5bcd0dd811 Turn sdivs into udivs when we can prove the sign bits are clear. This
implements CodeGen/PowerPC/div-2.ll

llvm-svn: 23659
2005-10-07 06:10:46 +00:00
Jeff Cohen 572910c9a2 Remove useless variable.
llvm-svn: 23656
2005-10-07 05:28:29 +00:00
Chris Lattner 20a244577d add a hack to work around broken VC++ scoping rules. Thx to JeffC for pointing
this out to me

llvm-svn: 23655
2005-10-07 05:23:36 +00:00
Chris Lattner e373592258 Fix a CQ regression from my patch to split F32/F64 into seperate register
classes on PPC.  We were emitting fmr instructions to do fp extensions, which
weren't getting coallesced.  This fixes Regression/CodeGen/PowerPC/fpcopy.ll

llvm-svn: 23654
2005-10-07 05:00:52 +00:00
Chris Lattner cd8b421799 Fix CodeGen/Generic/bool-to-double.ll
llvm-svn: 23652
2005-10-07 04:50:48 +00:00
Chris Lattner 318622fb9f Pull out Call, reducing stack frame size from 6032 bytes to 5184 bytes.
llvm-svn: 23650
2005-10-06 19:07:45 +00:00
Chris Lattner 491b8294f4 Pull out setcc, this reduces stack frame size from 7520 to 6032 bytes
llvm-svn: 23649
2005-10-06 19:03:35 +00:00
Chris Lattner 502a36935e Pull two more methods out, reducing stack frame size from 8224 -> 7520 bytes
llvm-svn: 23648
2005-10-06 18:56:10 +00:00
Chris Lattner 259e6c76f2 Add a recursive-iterative hybrid stage to attempt to reduce stack space, this
helps but not enough.

Start pulling cases out of PPC32DAGToDAGISel::Select.  With GCC 4, this function
required 8512 bytes of stack space for each invocation (GCC 3 required less
than 700 bytes).  Pulling this first function out gets us down to 8224.  More
to come :(

llvm-svn: 23647
2005-10-06 18:45:51 +00:00
Chris Lattner 7bf8d06f02 silence a bogus GCC warning
llvm-svn: 23646
2005-10-06 17:39:10 +00:00
Chris Lattner fabe55f155 Fix the LLC regressions on X86 last night. In particular, when undoing
previous copy elisions and we discover we need to reload a register, make
sure to use the regclass of the original register for the reload, not the
class of the current register.  This avoid using 16-bit loads to reload 32-bit
values.

llvm-svn: 23645
2005-10-06 17:19:06 +00:00
Andrew Lenharth e4c91fc9e8 This is suppose to work now
llvm-svn: 23644
2005-10-06 16:54:29 +00:00
Andrew Lenharth 332df13b9e remove VAX compatibility instruction, we will never use this
llvm-svn: 23643
2005-10-06 16:53:32 +00:00
Chris Lattner 4bbbb9eed7 Make the legalizer completely non-recursive
llvm-svn: 23642
2005-10-06 01:20:27 +00:00
Nate Begeman 558beb3729 Let the combiner handle more cases
llvm-svn: 23641
2005-10-05 21:44:43 +00:00
Nate Begeman f8221c5e2c Remove some bad code from Legalize
llvm-svn: 23640
2005-10-05 21:44:10 +00:00
Nate Begeman bd7df030d2 Check in some more DAGCombiner pieces
llvm-svn: 23639
2005-10-05 21:43:42 +00:00
Chris Lattner 55149d7835 Fix a bug in the local spiller, where we could take code like this:
store r12 -> [ss#2]
  R3 = load [ss#1]
  use R3
  R3 = load [ss#2]
  R4 = load [ss#1]

and turn it into this code:

  store R12 -> [ss#2]
  R3 = load [ss#1]
  use R3
  R3 = R12
  R4 = R3    <- oops!

The problem was that promoting R3 = load[ss#2] to a copy missed the fact that
the instruction invalidated R3 at that point.

llvm-svn: 23638
2005-10-05 18:30:19 +00:00
Chris Lattner 05da0d966e silence some warnings
llvm-svn: 23637
2005-10-05 17:15:09 +00:00
Chris Lattner a49e16fefa implement visitBR_CC so that PowerPC/inverted-bool-compares.ll passes
with the dag combiner.  This speeds up espresso by 8%, reaching performance
parity with the dag-combiner-disabled llc.

llvm-svn: 23636
2005-10-05 06:47:48 +00:00
Chris Lattner b11d15637a fix some pastos
llvm-svn: 23635
2005-10-05 06:37:22 +00:00
Chris Lattner 06f1d0f73a Add a new HandleNode class, which is used to handle (haha) cases in the
dead node elim and dag combiner passes where the root is potentially updated.
This fixes a fixme in the dag combiner.

llvm-svn: 23634
2005-10-05 06:35:28 +00:00
Chris Lattner a6895d180e Implement the code for PowerPC/inverted-bool-compares.ll, even though it
that testcase still does not pass with the dag combiner.  This is because
not all forms of br* are folded yet.

Also, when we combine a node into another one, delete the node immediately
instead of waiting for the node to potentially come up in the future.

llvm-svn: 23632
2005-10-05 06:11:08 +00:00
Chris Lattner 6bd8fd09b6 make sure that -view-isel-dags is the input to the isel, not the input to
the second phase of dag combining

llvm-svn: 23631
2005-10-05 06:09:10 +00:00
Chris Lattner 746d50a01a Fix a crash compiling Olden/tsp
llvm-svn: 23630
2005-10-05 04:45:43 +00:00
Chris Lattner 3b793c6521 refactor a bit of code.
When moving constant entries in 'Map' if the entry is the representative
constant for the abstractypemap, make sure to update it as well.  This
fixes the bcreader failures from last night on several C++ apps.

llvm-svn: 23628
2005-10-04 21:35:50 +00:00
Chris Lattner dff59118c6 Minor speedup to avoid array searches given a Use*. This speeds up bc reading
of the python test from 1:00 to 54s.

llvm-svn: 23627
2005-10-04 18:47:09 +00:00
Chris Lattner 7a1450dbc6 Change the signature of replaceUsesOfWithOnConstant. The bool was always
true dynamically.  Finally, pass the Use* that replaceAllUsesWith has into
the method for future use.

llvm-svn: 23626
2005-10-04 18:13:04 +00:00
Chris Lattner 935aa922e3 For large constants (e.g. arrays and structs with many elements) just
creating the keys and doing comparisons to index into 'Map' takes a lot
of time.  For these large constants, keep an inverse map so that 'remove'
and move operations are much faster.

This speeds up a release build of the bc reader on Eric's nasty python
bytecode file from 1:39 to 1:00s.

llvm-svn: 23624
2005-10-04 17:48:46 +00:00
Chris Lattner 5bbf60a5b6 minor cleanup/fastpath for the bcreader. This speeds up the bcreader
from 1:41 -> 1:39 on the large python .bc file in a release build.

llvm-svn: 23623
2005-10-04 16:52:46 +00:00
Jim Laskey 327d4298e1 Reverting to version - until problem isolated.
llvm-svn: 23622
2005-10-04 16:41:51 +00:00
Chris Lattner d1a5bc8dbd Add a forward def
llvm-svn: 23621
2005-10-04 05:09:20 +00:00
Nate Begeman 5da6908d65 Fix some faulty logic in the libcall inserter.
Since calls return more than one value, don't bail if one of their uses
happens to be a node that's not an MVT::Other when following the chain
from CALLSEQ_START to CALLSEQ_END.

Once we've found a CALLSEQ_START, we can just return; there's no need to
tail-recurse further up the graph.

Most importantly, just because something only has one use doesn't mean we
should use it's one use to follow from start to end.  This faulty logic
caused us to follow a chain of one-use FP operations back to a much earlier
call, putting a cycle in the graph from a later start to an earlier end.

This is a better fix that reverting to the workaround committed earlier
today.

llvm-svn: 23620
2005-10-04 02:10:55 +00:00
Chris Lattner 8760ec73d8 implement the struct version of the array speedup, speeding up the
testcase a bit more from 1:48 -> 1.40.

llvm-svn: 23619
2005-10-04 01:17:50 +00:00
Chris Lattner 20b0754c41 Fix DemoteRegToStack on an invoke. This fixes PR634.
llvm-svn: 23618
2005-10-04 00:44:01 +00:00
Nate Begeman 54fb5002e5 Add back a workaround that fixes some breakages from chris's last change.
Neither of us have yet figured out why this code is necessary, but stuff
breaks if its not there.  Still tracking this down...

llvm-svn: 23617
2005-10-04 00:37:37 +00:00
Chris Lattner 4c3b2b536c Clean up the code a bit. Use isInstructionTriviallyDead to be more aggressive
and more correct than use_empty().  This fixes PR635 and
SimplifyCFG/2005-10-02-InvokeSimplify.ll

llvm-svn: 23616
2005-10-03 23:43:43 +00:00
Chris Lattner b64419ac40 Change ConstantArray::replaceUsesOfWithOnConstant to attempt to update
constant arrays in place instead of reallocating them and replaceAllUsesOf'ing
the result.  This speeds up a release build of the bcreader from:

136.987u 120.866s 4:24.38
to
49.790u 49.890s 1:40.14

... a 2.6x speedup parsing a large python bc file.

llvm-svn: 23614
2005-10-03 22:51:37 +00:00
Chris Lattner c4062ba65f move some methods, no other changes
llvm-svn: 23613
2005-10-03 21:58:36 +00:00
Chris Lattner 0144fadc17 minor microoptimizations
llvm-svn: 23612
2005-10-03 21:56:24 +00:00
Chris Lattner bad09e71d0 Use a map to cache the ModuleType information, so we can do logarithmic
lookups instead of linear time lookups.  This speeds up bc parsing of a
large file from

137.834u 118.256s 4:27.96
to
132.611u 114.436s 4:08.53

with a release build.

llvm-svn: 23611
2005-10-03 21:26:53 +00:00
Jim Laskey 409a6b204e Refactor gathering node info and emission.
llvm-svn: 23610
2005-10-03 12:30:32 +00:00
Chris Lattner 57b21f9f10 clean up this code a bit, no functionality change
llvm-svn: 23609
2005-10-03 07:22:07 +00:00
Chris Lattner afef68baff Speed up the asm printer a lot by not printing formatted LLVM asm output
for globals

llvm-svn: 23608
2005-10-03 07:08:36 +00:00
Chris Lattner 5f096e2847 Break the body of the loop out into a new method
llvm-svn: 23606
2005-10-03 04:47:08 +00:00
Chris Lattner f07a587c79 Make IVUseShouldUsePostIncValue more aggressive when the use is a PHI. In
particular, it should realize that phi's use their values in the pred block
not the phi block itself.  This change turns our em3d loop from this:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r2, 0
        b LBB_test_6    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        or r2, r6, r6
        lwz r6, 0(r3)
        cmpw cr0, r6, r5
        beq cr0, LBB_test_6     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r2, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; endif.loopexit.loopexit_crit_edge
        addi r3, r2, 1
        blr
LBB_test_6:     ; loopexit
        or r3, r2, r2
        blr

into:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r2, 0
        b LBB_test_5    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        or r2, r6, r6
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        or r2, r6, r6
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; loopexit
        or r3, r2, r2
        blr


Unfortunately, this is actually worse code, because the register coallescer
is getting confused somehow.  If it were doing its job right, it could turn the
code into this:

_test:
        cmpwi cr0, r4, 0
        bgt cr0, LBB_test_2     ; entry.no_exit_crit_edge
LBB_test_1:     ; entry.loopexit_crit_edge
        li r6, 0
        b LBB_test_5    ; loopexit
LBB_test_2:     ; entry.no_exit_crit_edge
        li r6, 0
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit
LBB_test_5:     ; loopexit
        or r3, r6, r6
        blr

... which I'll work on next. :)

llvm-svn: 23604
2005-10-03 02:50:05 +00:00
Chris Lattner e4ed42a426 Refactor some code into a function
llvm-svn: 23603
2005-10-03 01:04:44 +00:00
Chris Lattner 360928dbed This break is bogus and I have no idea why it was there. Basically it prevents
memoizing code when IV's are used by phinodes outside of loops.  In a simple
example, we were getting this code before (note that r6 and r7 are isomorphic
IV's):

        li r6, 0
        or r7, r6, r6
LBB_test_3:     ; no_exit
        lwz r2, 0(r3)
        cmpw cr0, r2, r5
        or r2, r7, r7
        beq cr0, LBB_test_5     ; loopexit
LBB_test_4:     ; endif
        addi r2, r7, 1
        addi r7, r7, 1
        addi r3, r3, 4
        addi r6, r6, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit

Now we get:

        li r6, 0
LBB_test_3:     ; no_exit
        or r2, r6, r6
        lwz r6, 0(r3)
        cmpw cr0, r6, r5
        beq cr0, LBB_test_6     ; loopexit
LBB_test_4:     ; endif
        addi r3, r3, 4
        addi r6, r2, 1
        cmpw cr0, r6, r4
        blt cr0, LBB_test_3     ; no_exit

this was noticed in em3d.

llvm-svn: 23602
2005-10-03 00:37:33 +00:00
Chris Lattner 8fcce170cf when checking if we should move a split edge block outside of a loop,
check the presplit pred, not the post-split pred.  This was causing us
to make the wrong decision in some cases, leaving the critical edge block
in the loop.

llvm-svn: 23601
2005-10-03 00:31:52 +00:00
Chris Lattner 9cfccfb517 Fix a problem where the legalizer would run out of stack space on extremely
large basic blocks because it was purely recursive.  This switches it to an
iterative/recursive hybrid.

llvm-svn: 23596
2005-10-02 17:49:46 +00:00
Chris Lattner 7f718e61e8 silence a bogus warning
llvm-svn: 23595
2005-10-02 16:30:51 +00:00
Chris Lattner 9982da2703 silence some warnings
llvm-svn: 23594
2005-10-02 16:29:36 +00:00
Chris Lattner c0e655b65d silence a warning
llvm-svn: 23593
2005-10-02 16:27:59 +00:00
Chris Lattner 68303a78ff add patterns for float binops and fma ops
llvm-svn: 23592
2005-10-02 07:46:28 +00:00
Chris Lattner 98da1d9910 Sort the cpu and features table, so that the alpha backend doesn't fail EVERY
compile with an assertion that the tables are not sorted!

llvm-svn: 23591
2005-10-02 07:13:52 +00:00
Chris Lattner 704d97f8b2 Add assertions to the trivial scheduler to check that the value types match
up between defs and uses.

llvm-svn: 23590
2005-10-02 07:10:55 +00:00
Chris Lattner 3734d204b8 another solution to the fsel issue. Instead of having 4 variants, just force
the comparison to be 64-bits.  This is fine because extensions from float
to double are free.

llvm-svn: 23589
2005-10-02 07:07:49 +00:00
Chris Lattner 9e98672962 fsel can take a different FP type for the comparison and for the result. As such
split the FSEL family into 4 things instead of just two.

llvm-svn: 23588
2005-10-02 06:58:23 +00:00
Chris Lattner a17e6c486c fix an f32/f64 type mismatch
llvm-svn: 23587
2005-10-02 06:37:13 +00:00
Chris Lattner a038d901fb Codegen CopyFromReg using the regclass that matches the valuetype of the
destination vreg.

llvm-svn: 23586
2005-10-02 06:34:16 +00:00
Chris Lattner 4155ae0f74 Adjust to change in ctor
llvm-svn: 23585
2005-10-02 06:23:51 +00:00
Chris Lattner 5ab9d42bb4 Minor tweak to the branch selector. When emitting a two-way branch, and if
we're in a single-mbb loop, make sure to emit the backwards branch as the
conditional branch instead of the uncond branch.  For example, emit this:

LBBl29_z__44:
        stw r9, 0(r15)
        stw r9, 4(r15)
        stw r9, 8(r15)
        stw r9, 12(r15)
        addi r15, r15, 16
        addi r8, r8, 1
        cmpw cr0, r8, r28
        ble cr0, LBBl29_z__44
        b LBBl29_z__48                   *** NOT PART OF LOOP

Instead of:

LBBl29_z__44:
        stw r9, 0(r15)
        stw r9, 4(r15)
        stw r9, 8(r15)
        stw r9, 12(r15)
        addi r15, r15, 16
        addi r8, r8, 1
        cmpw cr0, r8, r28
        bgt cr0, LBBl29_z__48            *** PART OF LOOP!
        b LBBl29_z__44

The former sequence has one fewer dispatch group for the loop body.

llvm-svn: 23582
2005-10-01 23:06:26 +00:00
Chris Lattner 6f4dc51d6f like the comment says, enable this
llvm-svn: 23581
2005-10-01 23:02:40 +00:00
Chris Lattner 5a7bfe0b72 Add some very paranoid checking for operand/result reg class matchup
For instructions that define multiple results, use the right regclass
to define the result, not always the rc of result #0

llvm-svn: 23580
2005-10-01 07:45:09 +00:00
Jeff Cohen f8a5e5ae6e Fix VC++ warnings.
llvm-svn: 23579
2005-10-01 03:57:14 +00:00
Chris Lattner 8713ebf37c fix typo
llvm-svn: 23578
2005-10-01 02:51:36 +00:00
Chris Lattner d3eee1a09b Modify the ppc backend to use two register classes for FP: F8RC and F4RC.
These are used to represent float and double values, and the two regclasses
contain the same physical registers.

llvm-svn: 23577
2005-10-01 01:35:02 +00:00
Chris Lattner fda6944c5b add a method
llvm-svn: 23575
2005-10-01 00:17:07 +00:00
Jim Laskey d3850457a1 typo
llvm-svn: 23574
2005-10-01 00:08:23 +00:00
Jim Laskey 9d96932879 1. Simplify the gathering of node groups.
2. Printing node groups when displaying nodes.

llvm-svn: 23573
2005-10-01 00:03:07 +00:00