Commit Graph

165 Commits

Author SHA1 Message Date
Chris Lattner 21a84f3054 teach SROA to handle promoting vector allocas with a memset into them into
a vector type instead of into an integer type.

llvm-svn: 66368
2009-03-08 04:17:04 +00:00
Chris Lattner c009757761 Enhance SROA to "promote to scalar" allocas which are
memcpy/memmove'd into or out of.  This fixes a serious
perf issue that Nate ran into.

llvm-svn: 66366
2009-03-08 04:04:21 +00:00
Chris Lattner dc35e5b43a change the MemIntrinsic get/setAlignment method to take an unsigned
instead of a Constant*, which is what the clients of it really want.

llvm-svn: 66364
2009-03-08 03:59:00 +00:00
Chris Lattner 334268a211 Introduce a new MemTransferInst pseudo class, which is a common
parent between MemCpyInst and MemMoveInst, simplify some code to
use it.

llvm-svn: 66361
2009-03-08 03:37:16 +00:00
Devang Patel 25b625165f While converting an aggregate to scalare, ignore and remove aggregate's debug info.
llvm-svn: 66262
2009-03-06 07:03:54 +00:00
Evan Cheng 5fd4fc76bf SRThreshold is meant to be inclusive.
llvm-svn: 66227
2009-03-06 00:56:43 +00:00
Chris Lattner a41bb40458 complete comment.
llvm-svn: 66055
2009-03-04 19:23:25 +00:00
Chris Lattner b5b0c87be6 this wasn't intended to be committed.
llvm-svn: 66054
2009-03-04 19:22:30 +00:00
Chris Lattner 5c204c92a4 Fix PR3720 by properly propagating alignment information from memcpy/memmove
onto element accesses.

llvm-svn: 66053
2009-03-04 19:20:50 +00:00
Bill Wendling a68fc7af63 Use > instead of >=. We want to promote aggregates of 128-bytes.
llvm-svn: 65960
2009-03-03 19:18:49 +00:00
Bill Wendling 3e44bf3c4b Reapply r65755, but reversing "<" to ">=".
llvm-svn: 65945
2009-03-03 12:12:58 +00:00
Bill Wendling 38eae046cf Temporarily revert r65755. It was causing failures in the self-hosting
testsuite:

Running /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/CodeGen/X86/dg.exp ...
FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/CodeGen/X86/nancvt.ll
Failed with exit(1) at line 2
while running: grep 2147027116 nancvt.ll.tmp | count 3
count: expected 3 lines and got        0.
child process exited abnormally
FAIL: /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/CodeGen/X86/vec_ins_extract.ll
Failed with exit(1) at line 1
while running:  llvm-as < /Volumes/Sandbox/Buildbot/llvm/full-llvm/build/llvmCore/test/CodeGen/X86/vec_ins_extract.ll |  opt -scalarrepl -instcombine |   llc -march=x86 -mcpu=yonah | not /usr/bin/grep sub.*esp
      subl      $28, %esp
      subl      $28, %esp
child process exited abnormally

And more.

llvm-svn: 65758
2009-03-01 03:55:12 +00:00
Chris Lattner e2bb5e31c8 hoist the check for alloca size up so that it controls CanConvertToScalar
as well as isSafeAllocaToScalarRepl.

llvm-svn: 65755
2009-03-01 02:26:47 +00:00
Devang Patel da1a632a87 Use early exits. Reduce indentation.
llvm-svn: 64226
2009-02-10 19:28:07 +00:00
Devang Patel caf4485781 Enable scalar replacement of AllocaInst whose one of the user is dbg info.
llvm-svn: 64207
2009-02-10 07:00:59 +00:00
Chris Lattner bbbb74372b fix PR3489, use bits instead of bytes.
llvm-svn: 63916
2009-02-06 04:34:07 +00:00
Chris Lattner ef37dc8511 teach "convert from scalar" to handle loads of fca's.
llvm-svn: 63659
2009-02-03 21:08:45 +00:00
Chris Lattner f5df53cb46 refactor the interface to ConvertUsesOfLoadToScalar,
renaming it to ConvertScalar_ExtractValue

llvm-svn: 63658
2009-02-03 21:01:03 +00:00
Chris Lattner 576baa4adf convert ConvertUsesOfLoadToScalar to use IRBuilder,
no functionality change.

llvm-svn: 63652
2009-02-03 19:45:44 +00:00
Chris Lattner c1fb96d347 switch ConvertScalar_InsertValue to use an IRBuilder, no
functionality change.

llvm-svn: 63651
2009-02-03 19:41:50 +00:00
Chris Lattner 18f56c295c make scalar conversion handle stores of first class
aggregate values.  loads are not yet handled (coming
soon to an sroa near you).

llvm-svn: 63649
2009-02-03 19:30:11 +00:00
Chris Lattner 73eff2e6e8 Make SROA produce a vector only when the alloca is actually
accessed at least once as a vector.  This prevents it from
compiling the example in not-a-vector into:

define double @test(double %A, double %B) {
	%tmp4 = insertelement <7 x double> undef, double %A, i32 0
	%tmp = insertelement <7 x double> %tmp4, double %B, i32 4
	%tmp2 = extractelement <7 x double> %tmp, i32 4
	ret double %tmp2
}

instead, producing the integer code.  Producing vectors when they
aren't otherwise in the program is dangerous because a lot of other
code treats them carefully and doesn't want to break them down.
OTOH, many things want to break down tasty i448's.

llvm-svn: 63638
2009-02-03 18:15:05 +00:00
Chris Lattner 80810b4c2d add another case of undefined behavior without crashing, PR3466.
llvm-svn: 63620
2009-02-03 07:08:57 +00:00
Chris Lattner 6aa6b1f263 Teach ConvertUsesToScalar to handle memset, allowing it to handle
crazy cases like:

struct f {  int A, B, C, D, E, F; };
short test4() {
  struct f A;
  A.A = 1;
  memset(&A.B, 2, 12);
  return A.C;
}

llvm-svn: 63596
2009-02-03 02:01:43 +00:00
Chris Lattner 09b65ab288 rearrange how SRoA handles promotion of allocas to vectors.
With the new world order, it can handle cases where the first
store into the alloca is an element of the vector, instead of
requiring the first analyzed store to have the vector type 
itself.  This allows us to un-xfail 
test/CodeGen/X86/vec_ins_extract.ll.

llvm-svn: 63590
2009-02-03 01:30:09 +00:00
Chris Lattner 43cecd7c26 inline SROA::ConvertToScalar, no functionality change.
llvm-svn: 63544
2009-02-02 20:44:45 +00:00
Chris Lattner 18eba4f211 Fix a bug which caused us to miscompile a couple of Ada
tests.  Thanks for the beautiful reduced testcase Duncan!

llvm-svn: 63529
2009-02-02 18:02:59 +00:00
Duncan Sands 6f361ff345 Fix a comment (bytes -> bits), reformat a comment
and remove trailing whitespace.  No functionality
change.

llvm-svn: 63511
2009-02-02 10:06:20 +00:00
Duncan Sands 33d6e97e33 Fix an obvious thinko.
llvm-svn: 63510
2009-02-02 09:53:14 +00:00
Chris Lattner ec99c46d44 Simplify and generalize the SROA "convert to scalar" transformation to
be able to handle *ANY* alloca that is poked by loads and stores of 
bitcasts and GEPs with constant offsets.  Before the code had a number
of annoying limitations and caused it to miss cases such as storing into
holes in structs and complex casts (as in bitfield-sroa) where we had
unions of bitfields etc.  This also handles a number of important cases
that are exposed due to the ABI lowering stuff we do to pass stuff by
value.

One case that is pretty great is that we compile 
2006-11-07-InvalidArrayPromote.ll into:

define i32 @func(<4 x float> %v0, <4 x float> %v1) nounwind {
	%tmp10 = call <4 x i32> @llvm.x86.sse2.cvttps2dq(<4 x float> %v1)
	%tmp105 = bitcast <4 x i32> %tmp10 to i128
	%tmp1056 = zext i128 %tmp105 to i256	
	%tmp.upgrd.43 = lshr i256 %tmp1056, 96
	%tmp.upgrd.44 = trunc i256 %tmp.upgrd.43 to i32	
	ret i32 %tmp.upgrd.44
}

which turns into:

_func:
	subl	$28, %esp
	cvttps2dq	%xmm1, %xmm0
	movaps	%xmm0, (%esp)
	movl	12(%esp), %eax
	addl	$28, %esp
	ret

Which is pretty good code all things considering :).

One effect of this is that SROA will start generating arbitrary bitwidth 
integers that are a multiple of 8 bits.  In the case above, we got a 
256 bit integer, but the codegen guys assure me that it can handle the 
simple and/or/shift/zext stuff that we're doing on these operations.

This addresses rdar://6532315

llvm-svn: 63469
2009-01-31 02:28:54 +00:00
Chris Lattner df17987c19 Fix some issues with volatility, move "CanConvertToScalar" check
after the others.

llvm-svn: 63227
2009-01-28 20:16:43 +00:00
Duncan Sands dc020f9c3c Rename getABITypeSize to getTypePaddedSize, as
suggested by Chris.

llvm-svn: 62099
2009-01-12 20:38:59 +00:00
Chris Lattner ae0e857b98 Fix PR3304
llvm-svn: 61995
2009-01-09 18:18:43 +00:00
Chris Lattner c518dfd11b This implements the second half of the fix for PR3290, handling
loads from allocas that cover the entire aggregate.  This handles
some memcpy/byval cases that are produced by llvm-gcc.  This triggers
a few times in kc++ (with std::pair<std::_Rb_tree_const_iterator
<kc::impl_abstract_phylum*>,bool>) and once in 176.gcc (with %struct..0anon).

llvm-svn: 61915
2009-01-08 05:42:05 +00:00
Chris Lattner f2b8c82ad1 Implement the first half of PR3290: if there is a store of an
integer to a (transitive) bitcast the alloca and if that integer
has the full size of the alloca, then it clobbers the whole thing.
Handle this by extracting pieces out of the stored integer and 
filing them away in the SROA'd elements.

This triggers fairly frequently because the CFE uses integers to
pass small structs by value and the inliner exposes these.  For 
example, in kimwitu++, I see a bunch of these with i64 stores to
"%struct.std::pair<std::_Rb_tree_const_iterator<kc::impl_abstract_phylum*>,bool>"

In 176.gcc I see a few i32 stores to "%struct..0anon".

In the testcase, this is a difference between compiling test1 to:

_test1:
	subl	$12, %esp
	movl	20(%esp), %eax
	movl	%eax, 4(%esp)
	movl	16(%esp), %eax
	movl	%eax, (%esp)
	movl	(%esp), %eax
	addl	4(%esp), %eax
	addl	$12, %esp
	ret

vs:

_test1:
	movl	8(%esp), %eax
	addl	4(%esp), %eax
	ret

The second half of this will be to handle loads of the same form.

llvm-svn: 61853
2009-01-07 08:11:13 +00:00
Chris Lattner 9a2de65fd6 Factor a bunch of code out into a helper method.
llvm-svn: 61852
2009-01-07 07:18:45 +00:00
Chris Lattner db561146aa use continue to simplify code and reduce nesting, no functionality
change.

llvm-svn: 61851
2009-01-07 06:39:58 +00:00
Chris Lattner 938b54f383 Get TargetData once up front and cache as an ivar instead of
requerying it all over the place.

llvm-svn: 61850
2009-01-07 06:34:28 +00:00
Chris Lattner a63dba9e6c Use the hasAllZeroIndices predicate to simplify some
code, no functionality change.

llvm-svn: 61849
2009-01-07 06:25:07 +00:00
Dale Johannesen 0a7b4f5800 Allow SROA of vectors. Removing this caused a
huge performance regression in something we care
about.  This may not be final fix.

llvm-svn: 58718
2008-11-04 20:54:03 +00:00
Matthijs Kooijman cbe5e16eb5 Allow scalarrepl to treat an all-zero GEP just as bitcast.
This includes not marking a GEP involving a vector as unsafe, but only when it
has all zero indices. This allows scalarrepl to work in a few more cases.

llvm-svn: 57177
2008-10-06 16:23:31 +00:00
Dan Gohman a79db30d28 Tidy up several unbeseeming casts from pointer to intptr_t.
llvm-svn: 55779
2008-09-04 17:05:41 +00:00
Chris Lattner 3f972c9150 Fix PR2423 by checking all indices for out of range access, not only
indices that start with an array subscript.  x->field[10000] is just 
as bad as (*X)[14][10000].

llvm-svn: 55226
2008-08-23 05:21:06 +00:00
Chris Lattner 4d754bc97b minor tidying of comments.
llvm-svn: 52630
2008-06-23 17:11:23 +00:00
Chris Lattner 6ff85681e4 Fix PR2369 by making scalarrepl more careful about promoting
structures.  Its default threshold is to promote things that are
smaller than 128 bytes, which is sane.  However, it is not sane
to do this for things that turn into 128 *registers*.  Add a cap
on the number of registers introduced, defaulting to 128/4=32.

llvm-svn: 52611
2008-06-22 17:46:21 +00:00
Matthijs Kooijman 812989b147 Learn ScalarReplAggregrates how stores and loads of first class aggregrates
work and how to replace them into individual values. Also, when trying to
replace an aggregrate that is used by load or store with a single (large)
integer, don't crash (but don't replace the aggregrate either).

Also adds a testcase for both structs and arrays.

llvm-svn: 51997
2008-06-05 12:51:53 +00:00
Duncan Sands fc3c489b52 Change packed struct layout so that field sizes
are the same as in unpacked structs, only field
positions differ.  This only matters for structs
containing x86 long double or an apint; it may
cause backwards compatibility problems if someone
has bitcode containing a packed struct with a
field of one of those types.
The issue is that only 10 bytes are needed to
hold an x86 long double: the store size is 10
bytes, but the ABI size is 12 or 16 bytes (linux/
darwin) which comes from rounding the store size
up by the alignment.  Because it seemed silly not
to pack an x86 long double into 10 bytes in a
packed struct, this is what was done.  I now
think this was a mistake.  Reserving the ABI size
for an x86 long double field even in a packed
struct makes things more uniform: the ABI size is
now always used when reserving space for a type.
This means that developers are less likely to
make mistakes.  It also makes life easier for the
CBE which otherwise could not represent all LLVM
packed structs (PR2402).
Front-end people might need to adjust the way
they create LLVM structs - see following change
to llvm-gcc.

llvm-svn: 51928
2008-06-04 08:21:45 +00:00
Dan Gohman 7a0566b9cd Use isSingleValueType instead of isFirstClassType to
exclude struct and array types.

llvm-svn: 51456
2008-05-23 00:12:03 +00:00
Gabor Greif e1f6e4b21d API change for {BinaryOperator|CmpInst|CastInst}::create*() --> Create. Legacy interfaces will be in place for some time. (Merge from use-diet branch.)
llvm-svn: 51200
2008-05-16 19:29:10 +00:00
Dan Gohman d78c400b5b Clean up the use of static and anonymous namespaces. This turned up
several things that were neither in an anonymous namespace nor static
but not intended to be global.

llvm-svn: 51017
2008-05-13 00:00:25 +00:00