911 Commits

Author SHA1 Message Date
Chris Lattner 9052c35479 fix some vector extractions to return properly zero extended values
(instead of sign extending) to match ICC.  GCC is changing this in 
a series of their own PRs (e.g. 41323).

llvm-svn: 111637
2010-08-20 16:08:33 +00:00
Anton Yartsev 583a1cf7b5 support for predicates with bool/pixel arguments
llvm-svn: 111515
2010-08-19 11:57:49 +00:00
Anton Yartsev fc83c60755 support for the rest of AltiVec functions with bool/pixel arguments and return values (except predicates)
llvm-svn: 111511
2010-08-19 03:21:36 +00:00
Anton Yartsev 9e96898032 support for vec_perm and all dependent functions (vec_mergeh, vec_mergel, vec_pack, vec_sld, vec_splat) with bool/pixel arguments and return values
llvm-svn: 111509
2010-08-19 03:00:09 +00:00
Anton Yartsev 2cc136d4e3 support for vec_add, vec_adds, vec_and, vec_andc with bool arguments
llvm-svn: 111141
2010-08-16 16:22:12 +00:00
Fariborz Jahanian f7f020bb2a Make use of __func__ in a block actually refer to
block's helper function. Fixes radar 7860965.

llvm-svn: 110988
2010-08-13 00:19:55 +00:00
Devang Patel a3025fcd45 update test to reflect r110876 change.
llvm-svn: 110884
2010-08-12 00:00:41 +00:00
John McCall 5996699834 Revise r110163: don't mark weak functions nounwind, because the optimizer
treats that as a contract to be fulfilled by any replacements.

llvm-svn: 110864
2010-08-11 22:38:33 +00:00
Bruno Cardoso Lopes 762e401911 Remove rsqrtps_nr256 and sqrtps_nr256 builtins, at least until we need them
llvm-svn: 110844
2010-08-11 19:18:36 +00:00
Daniel Dunbar 9034aa36c7 ARM: Recognize single precision float register names.
- We don't recognize double or NEON register names yet -- we don't have the
   infrastructure to generate the right clobbers for them.

llvm-svn: 110775
2010-08-11 02:17:20 +00:00
Daniel Dunbar 256e1f3ad0 ARM: Swap which registers we consider real / aliases to match LLVM and llvm-gcc.
llvm-svn: 110774
2010-08-11 02:17:11 +00:00
Bruno Cardoso Lopes 65954ffc69 Remove 256-bit cast built-ins and make the AVX intrinsic call llvm __builtin_shufflevector with the appropriate arguments
llvm-svn: 110771
2010-08-11 02:14:38 +00:00
Bruno Cardoso Lopes a4f1930b75 Remove 256-bit unpack built-ins and make the AVX intrinsic call llvm __builtin_shufflevector with the appropriate arguments
llvm-svn: 110768
2010-08-11 01:43:24 +00:00
Bruno Cardoso Lopes e712a135b7 Remove 256-bit shuffle built-ins and make the AVX intrinsic call llvm __builtin_shufflevector with the appropriate arguments
llvm-svn: 110766
2010-08-11 01:17:34 +00:00
John Thompson 307c2729fd Something's wrong with this test on other platforms. I'll probably need to simplify it later. For now revert.
llvm-svn: 110738
2010-08-10 22:04:00 +00:00
John Thompson a5c7d706b8 Slightly revised handling of mult-alt constraints, to avoid an assert, until we have the full fix.
llvm-svn: 110706
2010-08-10 19:20:14 +00:00
Devang Patel 76e3b53541 Do not use DIGlobalVariable to emit debugging information for enums.
llvm-svn: 110697
2010-08-10 18:27:15 +00:00
Devang Patel e03edfd3e7 Even if a constant's evaluated value is used, emit debug info for the constant variable.
llvm-svn: 110660
2010-08-10 07:24:25 +00:00
Bruno Cardoso Lopes 3d3fc1d075 Make replicate intrinsics use shufflevector instead of dup builtins, also remove the dup builtins
llvm-svn: 110646
2010-08-10 02:23:54 +00:00
Devang Patel 2210aa2eca There is no need to publish a file static variable's name. Do not rely on this code gen bug to check whether debug info is generated for such variables or not.
llvm-svn: 110640
2010-08-10 01:36:24 +00:00
Eric Christopher 6ff7161d51 Thread local variables aren't considered common linkage.
llvm-svn: 110530
2010-08-08 01:37:14 +00:00
Chris Lattner 8139c98cf9 Correct -ftrapv to trap on errors, instead of calling the
__overflow_handler entrypoint that David Chisnall made up.
Calling __overflow_handler is not part of the contract of
-ftrapv provided by GCC, and should never have been checked
in in the first place.

According to:
http://permalink.gmane.org/gmane.comp.compilers.clang.devel/8699

David is using this for some arbitrary precision integer stuff
or something, which is not an appropriate thing to implement on
this.

llvm-svn: 110490
2010-08-07 00:20:46 +00:00
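The contract described in the message above (trap on signed overflow rather than call a handler) can be imitated by hand. The following is a minimal sketch, not a description of how Clang lowers -ftrapv internally. The helper names `add_overflows`, `trapping_add`, and `check_trapping_add` are hypothetical, and `__builtin_add_overflow` and `__builtin_trap` are assumed to be available, as they are in current GCC and Clang releases (they postdate this commit).

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: reports whether a + b overflows a signed int. */
bool add_overflows(int a, int b) {
    int unused;
    return __builtin_add_overflow(a, b, &unused);
}

/* Hypothetical helper: returns a + b, or traps when the sum overflows.
   No handler function is called; the process is simply aborted. */
int trapping_add(int a, int b) {
    int sum;
    if (__builtin_add_overflow(a, b, &sum))
        __builtin_trap();
    return sum;
}

/* Exercises only the non-trapping paths. */
void check_trapping_add(void) {
    assert(!add_overflows(2, 3));
    assert(add_overflows(INT_MAX, 1));
    assert(trapping_add(2, 3) == 5);
}
```

Calling `trapping_add(INT_MAX, 1)` would terminate the program, which is why the check function avoids it.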
Chandler Carruth 66ce9651f1 Prevent these tests from dirtying the tree with output files that aren't even
used for the test.

llvm-svn: 110431
2010-08-06 05:29:57 +00:00
Bruno Cardoso Lopes e2538c4ecf We don't want to support built-ins which aren't needed by the intrinsics. Remove them
llvm-svn: 110399
2010-08-05 23:47:43 +00:00
John McCall a9731a4179 Fix a major bug with -ftrapv and ++/--. Patch by David Keaton!
llvm-svn: 110347
2010-08-05 17:39:44 +00:00
Eli Friedman d986fc8b48 Tests for #pragma GCC visibility.
llvm-svn: 110316
2010-08-05 07:00:53 +00:00
Bruno Cardoso Lopes 6586724f71 Add more AVX 256-bit intrinsics and test cases for them
llvm-svn: 110178
2010-08-04 01:11:26 +00:00
John McCall f8280e723d Fix a warning on a test.
llvm-svn: 110165
2010-08-03 22:49:45 +00:00
John McCall 8601a75118 Do a very simple pass over every function we emit to infer whether we can
mark it nounwind based on whether it contains any non-nounwind calls.
<rdar://problem/8087431>

llvm-svn: 110163
2010-08-03 22:46:07 +00:00
Bruno Cardoso Lopes 1f927ccaa2 Support x86 AVX 256-bit instructions built-ins. Right now support all of them, but
as soon as we properly codegen the simple vector operations, remove the
unnecessary built-ins/intrinsics from clang and llvm. Also add tests for the new
built-ins

llvm-svn: 110096
2010-08-03 01:57:18 +00:00
John McCall a95172baa0 Only run the jump-checker if there's a branch-protected scope *and* there's
a switch or goto somewhere in the function.  Indirect gotos trigger the
jump-checker regardless, because the conditions there are slightly more
elaborate and it's too marginal a case to be worth optimizing.

Turns off the jump-checker in a lot of cases in C++.  rdar://problem/7702918

llvm-svn: 109962
2010-08-01 00:26:45 +00:00
Daniel Dunbar b8cba97cde There is no reason for this test to invoke 'llc'.
llvm-svn: 109847
2010-07-30 03:30:55 +00:00
Chris Lattner 7f4b81af7a fix rdar://8251384, another case where we could access beyond the
end of a struct.  This improves the case when the struct being passed
contains 3 floats, either due to a struct or array of 3 things.  Before
we'd generate this IR for the testcase:

define float @bar(double %X.coerce0, double %X.coerce1) nounwind {
entry:
  %X = alloca %struct.foof, align 8               ; <%struct.foof*> [#uses=2]
  %0 = bitcast %struct.foof* %X to %1*            ; <%1*> [#uses=2]
  %1 = getelementptr %1* %0, i32 0, i32 0         ; <double*> [#uses=1]
  store double %X.coerce0, double* %1
  %2 = getelementptr %1* %0, i32 0, i32 1         ; <double*> [#uses=1]
  store double %X.coerce1, double* %2
  %tmp = getelementptr inbounds %struct.foof* %X, i32 0, i32 2 ; <float*> [#uses=1]
  %tmp1 = load float* %tmp                        ; <float> [#uses=1]
  ret float %tmp1
}

which compiled (with optimization) to:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movd	%xmm1, %rax
	movd	%eax, %xmm0
	ret

Now we produce:

define float @bar(double %X.coerce0, float %X.coerce1) nounwind {
entry:
  %X = alloca %struct.foof, align 8               ; <%struct.foof*> [#uses=2]
  %0 = bitcast %struct.foof* %X to %0*            ; <%0*> [#uses=2]
  %1 = getelementptr %0* %0, i32 0, i32 0         ; <double*> [#uses=1]
  store double %X.coerce0, double* %1
  %2 = getelementptr %0* %0, i32 0, i32 1         ; <float*> [#uses=1]
  store float %X.coerce1, float* %2
  %tmp = getelementptr inbounds %struct.foof* %X, i32 0, i32 2 ; <float*> [#uses=1]
  %tmp1 = load float* %tmp                        ; <float> [#uses=1]
  ret float %tmp1
}

and:

_bar:                                   ## @bar
## BB#0:                                ## %entry
	movaps	%xmm1, %xmm0
	ret

llvm-svn: 109776
2010-07-29 18:13:09 +00:00
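The out-of-bounds concern in the message above comes down to simple arithmetic: a struct of three floats occupies 12 bytes, so only 4 bytes of its second eightbyte belong to the struct, and treating that eightbyte as a double touches 4 bytes past the end. A small sketch of the arithmetic follows. The field names of `struct foof` and the helper `bytes_in_eightbyte` are assumptions made for illustration; the message only says the struct contains three floats. A 4-byte `float` is also assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed layout: the commit message only states "3 floats". */
struct foof { float a, b, c; };

/* Hypothetical helper: how many bytes of a struct of the given size
   fall inside the 8-byte chunk that starts at `offset`. */
size_t bytes_in_eightbyte(size_t struct_size, size_t offset) {
    if (offset >= struct_size)
        return 0;
    size_t remaining = struct_size - offset;
    return remaining < 8 ? remaining : 8;
}

void check_foof_layout(void) {
    assert(sizeof(struct foof) == 3 * sizeof(float));
    /* First eightbyte is full: two floats. */
    assert(bytes_in_eightbyte(sizeof(struct foof), 0) == 8);
    /* Second eightbyte holds one float; a double-sized access overruns. */
    assert(bytes_in_eightbyte(sizeof(struct foof), 8) == sizeof(float));
}
```

This matches the change in the IR shown in the message, where the second coerced argument goes from `double` to `float`.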
Chris Lattner 3f76342cfc handle a case where we could access off the end of a function
that Eli pointed out, rdar://8249586

llvm-svn: 109762
2010-07-29 17:34:39 +00:00
Chris Lattner 44f9c3b3f1 in release mode, irbuilder doesn't add names to instructions,
this will hopefully fix the osuosl clang-i686-darwin10 builder.

llvm-svn: 109760
2010-07-29 17:14:05 +00:00
Chris Lattner 98076a25ce This is a little bit far, but optimize cases like:
struct a {
  struct c {
    double x;
    int y;
  } x[1];
};

void foo(struct a A) {
}

into:

define void @foo(double %A.coerce0, i32 %A.coerce1) nounwind {
entry:
  %A = alloca %struct.a, align 8                  ; <%struct.a*> [#uses=1]
  %0 = bitcast %struct.a* %A to %struct.c*        ; <%struct.c*> [#uses=2]
  %1 = getelementptr %struct.c* %0, i32 0, i32 0  ; <double*> [#uses=1]
  store double %A.coerce0, double* %1
  %2 = getelementptr %struct.c* %0, i32 0, i32 1  ; <i32*> [#uses=1]
  store i32 %A.coerce1, i32* %2

instead of:

define void @foo(double %A.coerce0, i64 %A.coerce1) nounwind {
entry:
  %A = alloca %struct.a, align 8                  ; <%struct.a*> [#uses=1]
  %0 = bitcast %struct.a* %A to %0*               ; <%0*> [#uses=2]
  %1 = getelementptr %0* %0, i32 0, i32 0         ; <double*> [#uses=1]
  store double %A.coerce0, double* %1
  %2 = getelementptr %0* %0, i32 0, i32 1         ; <i64*> [#uses=1]
  store i64 %A.coerce1, i64* %2

I only do this now because I never want to look at this code again :)
 

llvm-svn: 109738
2010-07-29 07:43:55 +00:00
Chris Lattner c8b7b53a1e implement a todo: pass an eight-byte that consists of a
small integer + padding as that small integer.  On code
like:

struct c { double x; int y; };
void bar(struct c C) { }

This means that we compile to:

define void @bar(double %C.coerce0, i32 %C.coerce1) nounwind {
entry:
  %C = alloca %struct.c, align 8                  ; <%struct.c*> [#uses=2]
  %0 = getelementptr %struct.c* %C, i32 0, i32 0  ; <double*> [#uses=1]
  store double %C.coerce0, double* %0
  %1 = getelementptr %struct.c* %C, i32 0, i32 1  ; <i32*> [#uses=1]
  store i32 %C.coerce1, i32* %1

instead of:

define void @bar(double %C.coerce0, i64 %C.coerce1) nounwind {
entry:
  %C = alloca %struct.c, align 8                  ; <%struct.c*> [#uses=3]
  %0 = bitcast %struct.c* %C to %0*               ; <%0*> [#uses=2]
  %1 = getelementptr %0* %0, i32 0, i32 0         ; <double*> [#uses=1]
  store double %C.coerce0, double* %1
  %2 = getelementptr %0* %0, i32 0, i32 1         ; <i64*> [#uses=1]
  store i64 %C.coerce1, i64* %2

which gives SRoA heartburn.

This implements rdar://5711709, a nice low number :)

llvm-svn: 109737
2010-07-29 07:30:00 +00:00
Chris Lattner fe34c1d53e Kill off the 'coerce' ABI passing form. Now 'direct' and 'extend' always
have a "coerce to" type which often matches the default lowering of Clang
type to LLVM IR type, but the coerce case can be handled by making them
not be the same.

This simplifies things and fixes issues where X86-64 abi lowering would 
return coerce after making preferred types exactly match up.  This caused
us to compile:

typedef float v4f32 __attribute__((__vector_size__(16)));
v4f32 foo(v4f32 X) {
  return X+X;
}

into this code at -O0:

define <4 x float> @foo(<4 x float> %X.coerce) nounwind {
entry:
  %retval = alloca <4 x float>, align 16          ; <<4 x float>*> [#uses=2]
  %coerce = alloca <4 x float>, align 16          ; <<4 x float>*> [#uses=2]
  %X.addr = alloca <4 x float>, align 16          ; <<4 x float>*> [#uses=3]
  store <4 x float> %X.coerce, <4 x float>* %coerce
  %X = load <4 x float>* %coerce                  ; <<4 x float>> [#uses=1]
  store <4 x float> %X, <4 x float>* %X.addr
  %tmp = load <4 x float>* %X.addr                ; <<4 x float>> [#uses=1]
  %tmp1 = load <4 x float>* %X.addr               ; <<4 x float>> [#uses=1]
  %add = fadd <4 x float> %tmp, %tmp1             ; <<4 x float>> [#uses=1]
  store <4 x float> %add, <4 x float>* %retval
  %0 = load <4 x float>* %retval                  ; <<4 x float>> [#uses=1]
  ret <4 x float> %0
}

Now we get:

define <4 x float> @foo(<4 x float> %X) nounwind {
entry:
  %X.addr = alloca <4 x float>, align 16          ; <<4 x float>*> [#uses=3]
  store <4 x float> %X, <4 x float>* %X.addr
  %tmp = load <4 x float>* %X.addr                ; <<4 x float>> [#uses=1]
  %tmp1 = load <4 x float>* %X.addr               ; <<4 x float>> [#uses=1]
  %add = fadd <4 x float> %tmp, %tmp1             ; <<4 x float>> [#uses=1]
  ret <4 x float> %add
}

This implements rdar://8248065

llvm-svn: 109733
2010-07-29 06:26:06 +00:00
Chris Lattner 9fa15c3608 ignore structs that wrap vectors in IR, the abstraction shouldn't add penalty.
Before we'd compile the example into something like:

  %coerce.dive2 = getelementptr %struct.v4f32wrapper* %retval, i32 0, i32 0 ; <<4 x float>*> [#uses=1]
  %1 = bitcast <4 x float>* %coerce.dive2 to <2 x double>* ; <<2 x double>*> [#uses=1]
  %2 = load <2 x double>* %1, align 1             ; <<2 x double>> [#uses=1]
  ret <2 x double> %2

Now we produce:

  %coerce.dive2 = getelementptr %struct.v4f32wrapper* %retval, i32 0, i32 0 ; <<4 x float>*> [#uses=1]
  %0 = load <4 x float>* %coerce.dive2, align 1   ; <<4 x float>> [#uses=1]
  ret <4 x float> %0

llvm-svn: 109732
2010-07-29 05:02:29 +00:00
Chris Lattner 4200fe4e50 move the 'pretty 16-byte vector' inferring code up to be shared
with return values, improving stuff that returns __m128 etc.

llvm-svn: 109731
2010-07-29 04:56:46 +00:00
Chris Lattner 3a44c7e55d now that we have CGT around, we can start using preferred types
for return values too.  Instead of compiling something like:

struct foo {
  int *X;
  float *Y;
};

struct foo test(struct foo *P) { return *P; }

to:

%1 = type { i64, i64 }

define %1 @test(%struct.foo* %P) nounwind {
entry:
  %retval = alloca %struct.foo, align 8           ; <%struct.foo*> [#uses=2]
  %P.addr = alloca %struct.foo*, align 8          ; <%struct.foo**> [#uses=2]
  store %struct.foo* %P, %struct.foo** %P.addr
  %tmp = load %struct.foo** %P.addr               ; <%struct.foo*> [#uses=1]
  %tmp1 = bitcast %struct.foo* %retval to i8*     ; <i8*> [#uses=1]
  %tmp2 = bitcast %struct.foo* %tmp to i8*        ; <i8*> [#uses=1]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 16, i32 8, i1 false)
  %0 = bitcast %struct.foo* %retval to %1*        ; <%1*> [#uses=1]
  %1 = load %1* %0, align 1                       ; <%1> [#uses=1]
  ret %1 %1
}

We now get the result more type safe, with:

define %struct.foo @test(%struct.foo* %P) nounwind {
entry:
  %retval = alloca %struct.foo, align 8           ; <%struct.foo*> [#uses=2]
  %P.addr = alloca %struct.foo*, align 8          ; <%struct.foo**> [#uses=2]
  store %struct.foo* %P, %struct.foo** %P.addr
  %tmp = load %struct.foo** %P.addr               ; <%struct.foo*> [#uses=1]
  %tmp1 = bitcast %struct.foo* %retval to i8*     ; <i8*> [#uses=1]
  %tmp2 = bitcast %struct.foo* %tmp to i8*        ; <i8*> [#uses=1]
  call void @llvm.memcpy.p0i8.p0i8.i64(i8* %tmp1, i8* %tmp2, i64 16, i32 8, i1 false)
  %0 = load %struct.foo* %retval                  ; <%struct.foo> [#uses=1]
  ret %struct.foo %0
}

That memcpy is completely terrible, but I don't know how to fix it.

llvm-svn: 109729
2010-07-29 04:46:19 +00:00
Chris Lattner f4ba08aeaf pass argument vectors in a type that corresponds to the user type if
possible.  This improves the example to pass <4 x float> instead of
<2 x double> but we still get awful code, and still don't get the
return value right.

llvm-svn: 109700
2010-07-28 23:47:21 +00:00
Chris Lattner 31faff5d58 use Get8ByteTypeAtOffset for the return value path as well so we
don't get errors similar to PR7714 on the return path.

llvm-svn: 109689
2010-07-28 23:06:14 +00:00
Chris Lattner 4c1e484f39 fix PR7714 by not referencing off the end of a struct when passed by value in
x86-64 abi.  This also improves codegen.  Some refactoring is needed of
this code.

llvm-svn: 109681
2010-07-28 22:15:08 +00:00
Fariborz Jahanian d5010898ab Fix flags in global block descriptor when
block returns structs. Fixes radar 8241648.
Executable test added to llvm test suite.

llvm-svn: 109620
2010-07-28 19:07:18 +00:00
Fariborz Jahanian 0ebca28f1d 2nd argument of __builtin_expect must be evaluated
if it has side effects, to match gcc's behaviour.
Addresses radar 8172109.

llvm-svn: 109467
2010-07-26 23:11:03 +00:00
John McCall a464ff9d15 Switch some random local-decl cleanups over to using lazy cleanups. Turn on
the block-release unwind cleanup:  we're never going to test it if we don't turn
it on.

llvm-svn: 108992
2010-07-21 06:13:08 +00:00
Chandler Carruth 3973af797a Fix a goof in my previous patch -- not all of the builtins return a value, some
have fixed return types.

llvm-svn: 108657
2010-07-18 20:54:12 +00:00
Chandler Carruth bc8cab16c5 Improve the representation of the atomic builtins in a few ways. First, we make
their call expressions synthetically have the "deduced" types based on their
first argument. We only insert conversions in the AST for arguments whose
values require conversion to match the value type expected. This keeps PR7600
closed by maintaining the return type, but avoids assertions due to unexpected
implicit casts making the type unsigned (test case added from Daniel).

The magic is moved into the codegen for the atomic builtin which inserts the
casts as needed at the IR level to raise the type to an integer suitable for
the LLVM intrinsic. This shouldn't cause any real change in functionality, but
now we can make the builtin be more truly polymorphic.

llvm-svn: 108638
2010-07-18 07:23:17 +00:00
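From the user's side, the polymorphism described above means one builtin name works across integer widths and the result has the pointee's type. A minimal sketch, assuming a GCC- or Clang-compatible compiler that provides the `__sync_fetch_and_add` builtin; the wrapper names `bump_short`, `bump_long`, and `check_atomic_builtins` are hypothetical.

```c
#include <assert.h>

/* Hypothetical wrappers: the same builtin applied to two widths.
   Each returns the value held before the increment. */
short bump_short(short *p) { return __sync_fetch_and_add(p, 1); }
long bump_long(long *p) { return __sync_fetch_and_add(p, 1L); }

void check_atomic_builtins(void) {
    short s = 41;
    long l = -5;
    assert(bump_short(&s) == 41 && s == 42);
    assert(bump_long(&l) == -5 && l == -4);
    /* The call expression is as wide as the pointee; sizeof does not
       evaluate its operand, so s is left unchanged here. */
    assert(sizeof(__sync_fetch_and_add(&s, 1)) == sizeof(short));
    assert(s == 42);
}
```

The negative `long` case is included because the message mentions assertions caused by implicit casts making the type unsigned.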
Eli Friedman eca55afea3 Fix for PR3800: make sure not to evaluate the expression for a read-write
asm operand twice.

llvm-svn: 108489
2010-07-16 00:55:21 +00:00
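The single-evaluation rule for read-write asm operands can be observed with an operand expression that has a side effect. This is a sketch under stated assumptions: it relies on GCC-style extended inline assembly, uses an empty template so it is not tied to one instruction set, and the names `next_slot` and `operand_evaluations` are hypothetical.

```c
#include <assert.h>

static int calls;  /* how many times the operand expression ran */
static int slot;   /* storage the operand refers to */

/* Hypothetical helper: yields an lvalue and counts each evaluation. */
int *next_slot(void) {
    calls++;
    return &slot;
}

/* Uses *next_slot() as a read-write ("+r") operand of an empty asm
   statement and returns how often the expression was evaluated. */
int operand_evaluations(void) {
    calls = 0;
    __asm__ volatile("" : "+r"(*next_slot()));
    return calls;
}

void check_single_evaluation(void) {
    assert(operand_evaluations() == 1);
}
```

A compiler that expanded the operand once for the input and once for the output would make `operand_evaluations` return 2.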