Commit Graph

26449 Commits

Author SHA1 Message Date
Tim Northover 9653eb5759 Make Triple's isOSBinFormatXXX functions partition triple-space.
Most users would be surprised if "isCOFF" and "isMachO" were simultaneously
true, unless they'd put the compiler in a box with a gun attached to a photon
detector.

This makes sure precisely one of the three formats is true for any triple and
simplifies some target logic based on that.

llvm-svn: 196934
2013-12-10 16:57:43 +00:00
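
The partitioning idea in the entry above can be sketched as follows. This is a minimal illustration, not LLVM's actual Triple class: `MiniTriple`, the `OS` enum, the OS-to-format mapping, and `formatsClaimed` are all hypothetical names chosen for the example. The point is only that deriving every predicate from a single format value guarantees that exactly one of them is true.

```cpp
#include <cassert>

// Hypothetical stand-ins; these are NOT LLVM's real types.
enum class ObjectFormat { ELF, COFF, MachO };
enum class OS { Linux, Win32, Darwin, Unknown };

struct MiniTriple {
  OS os;

  // Single source of truth: every triple maps to exactly one format.
  // The mapping itself is an assumption made for this sketch.
  ObjectFormat format() const {
    switch (os) {
    case OS::Win32:
      return ObjectFormat::COFF;
    case OS::Darwin:
      return ObjectFormat::MachO;
    case OS::Linux:
    case OS::Unknown:
      return ObjectFormat::ELF;
    }
    assert(false && "unhandled OS");
    return ObjectFormat::ELF;
  }

  // Each predicate is a comparison against the same value, so the three
  // predicates cannot overlap and cannot all be false.
  bool isOSBinFormatELF() const { return format() == ObjectFormat::ELF; }
  bool isOSBinFormatCOFF() const { return format() == ObjectFormat::COFF; }
  bool isOSBinFormatMachO() const { return format() == ObjectFormat::MachO; }
};

// Counts how many of the three predicates hold for a triple.
inline int formatsClaimed(const MiniTriple &t) {
  return int(t.isOSBinFormatELF()) + int(t.isOSBinFormatCOFF()) +
         int(t.isOSBinFormatMachO());
}
```

With this shape, target logic can test one predicate and treat the other two as excluded, which is the simplification the commit message describes.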
Chad Rosier 7a9bba442f [AArch64] Refactor the Neon vector/scalar floating-point convert intrinsics so
that they use float/double rather than the vector equivalents when appropriate.

llvm-svn: 196930
2013-12-10 16:11:39 +00:00
Chad Rosier fcc4c366d1 [AArch64] Refactor the Neon vector/scalar floating-point convert implementation.
Specifically, reuse the ARM intrinsics when possible.

llvm-svn: 196926
2013-12-10 15:35:33 +00:00
Andrea Di Biagio f7c33c8162 Ensure that the backend no longer emits unnecessary vector insert instructions
immediately after SSE scalar fp instructions like addss or mulss.

Added patterns to select SSE scalar fp arithmetic instructions from a scalar
fp operation followed by a blend.

For example, given the following code:
  __m128 foo(__m128 A, __m128 B) {
    A[0] += B[0];
    return A;
  }

previously we generated:
  addss %xmm0, %xmm1
  movss %xmm1, %xmm0

now we generate:
  addss %xmm1, %xmm0

llvm-svn: 196925
2013-12-10 15:22:48 +00:00
Vincent Lejeune cc0ea74c7b R600: Fix an infinite loop when trying to reorganize export/tex vector input
llvm-svn: 196923
2013-12-10 14:43:31 +00:00
Vincent Lejeune f92d64d160 R600: Fix input modifiers lost for Cayman
llvm-svn: 196922
2013-12-10 14:43:27 +00:00
Reed Kotler 0ff4001781 Next step in Mips16 prologue/epilogue cleanup.
Save S2 (reg 18) only when we are calling floating point stubs that
have a return value of float or complex. More work is needed to make
this better, but this is the first step.

llvm-svn: 196921
2013-12-10 14:29:38 +00:00
Elena Demikhovsky e382c3fdcd AVX-512: changed intrinsics for mask operations
llvm-svn: 196918
2013-12-10 13:53:10 +00:00
Elena Demikhovsky 6270b388c8 AVX-512: Changed intrinsics of VPCONFLICT to match GCC builtin form
llvm-svn: 196914
2013-12-10 11:58:35 +00:00
Daniel Sanders c309be2f1f [mips][msa] Correct sld and sldi builtins.
Summary: The result register of these instructions is also the first operand.

Reviewers: jacksprat, dsanders

Reviewed By: dsanders

Differential Revision: http://llvm-reviews.chandlerc.com/D2362
Differential Revision: http://llvm-reviews.chandlerc.com/D2363

llvm-svn: 196910
2013-12-10 11:37:00 +00:00
Richard Sandiford bef3d7af2b Add TargetLowering::prepareVolatileOrAtomicLoad
One unusual feature of the z architecture is that the result of a
previous load can be reused indefinitely for subsequent loads, even if
a cache-coherent store to that location is performed by another CPU.
A special serializing instruction must be used if you want to force
a load to be reattempted.

Since volatile loads are not supposed to be omitted in this way,
we should insert a serializing instruction before each such load.
The same goes for atomic loads.

The patch implements this at the IR->DAG boundary, in a similar way
to atomic fences.  It is a no-op for targets other than SystemZ.

llvm-svn: 196906
2013-12-10 10:49:34 +00:00
Richard Sandiford 9afe613d12 Add TargetLowering::prepareVolatileOrAtomicLoad
One unusual feature of the z architecture is that the result of a
previous load can be reused indefinitely for subsequent loads, even if
a cache-coherent store to that location is performed by another CPU.
A special serializing instruction must be used if you want to force
a load to be reattempted.

Since volatile loads are not supposed to be omitted in this way,
we should insert a serializing instruction before each such load.
The same goes for atomic loads.

The patch implements this at the IR->DAG boundary, in a similar way
to atomic fences.  It is a no-op for targets other than SystemZ.

llvm-svn: 196905
2013-12-10 10:36:34 +00:00
Kevin Qin 43385c7065 [AArch64 NEON] Replace fpimm with fpz32 for floating compare with zero.
This is a small change to be strict. It just makes the pattern safer.

llvm-svn: 196889
2013-12-10 06:51:07 +00:00
Kevin Qin 04396d1e69 [AArch64 NEON] Support poly128_t and implement relevant intrinsic.
llvm-svn: 196887
2013-12-10 06:48:35 +00:00
NAKAMURA Takumi 396d4d3c7e Add proper dependencies to LLVMBuild.txt in llvm/lib.
I'll prune redundant deps in LLVMBuild.txt, later.

llvm-svn: 196881
2013-12-10 05:39:34 +00:00
NAKAMURA Takumi e3afe2ef62 Whitespaces.
llvm-svn: 196880
2013-12-10 05:39:12 +00:00
Reid Kleckner 0a9509f080 Revert "Fix miscompile of MS inline assembly with stack realignment"
This reverts commit r196876.  Its tests failed on the bots, so I'll
figure it out tomorrow.

llvm-svn: 196879
2013-12-10 05:31:27 +00:00
Reid Kleckner 7f10a8cd45 Fix miscompile of MS inline assembly with stack realignment
For stack frames requiring realignment, three pointers may be needed:
- ebp to address incoming arguments
- esi (could be any callee-saved register) to address locals
- esp to address outgoing arguments

We would use esi unconditionally without verifying that it did not
conflict with inline assembly.

This change doesn't do the verification; it simply emits a fatal error
on functions that use stack realignment, dynamic SP adjustments, and
inline assembly.

Because stack realignment is common on Windows, we also no longer assume
that MS inline assembly clobbers esp.  Instead, we analyze the inline
instructions for implicit definitions and check if esp is there.  If so,
we require the use of a base pointer and consider it in the condition
above.

Mostly fixes PR16830, but we could try harder to find a non-conflicting
base pointer.

Reviewers: sunfish

Differential Revision: http://llvm-reviews.chandlerc.com/D1317

llvm-svn: 196876
2013-12-10 05:12:23 +00:00
Rafael Espindola 1d224bd65f Add comments documenting the ARM datalayout string.
llvm-svn: 196850
2013-12-10 00:37:37 +00:00
Rafael Espindola 74d682b443 Simplify further.
Thanks to Jim Grosbach for noticing it.

llvm-svn: 196846
2013-12-10 00:15:35 +00:00
Rafael Espindola 964bf07fb8 Refactor the construction of the DataLayout string on ARM.
llvm-svn: 196843
2013-12-09 23:56:41 +00:00
Chad Rosier 5c8bf9c3db [AArch64] Refactor the NEON scalar reduce pairwise intrinsics, so that they use
float/double rather than the vector equivalents when appropriate.

llvm-svn: 196833
2013-12-09 22:47:38 +00:00
Chad Rosier 3b0b3ee71e [AArch64] Refactor NEON scalar reduce pairwise front-end codegen to remove
unnecessary patterns in tablegen.

llvm-svn: 196832
2013-12-09 22:47:34 +00:00
Chad Rosier 397ff3945c [AArch64] Remove q and non-q intrinsic definitions in the NEON scalar reduce
pairwise implementation, using an overloaded definition instead.

llvm-svn: 196831
2013-12-09 22:47:31 +00:00
Reed Kotler b102fa5aef get rid of superfluous comment
llvm-svn: 196829
2013-12-09 22:08:32 +00:00
Reed Kotler 2e362b3b4b Delete some old code used for testing that is not needed anymore.
This is part of the mips16 epilogue/prologue cleanup.

llvm-svn: 196824
2013-12-09 21:19:51 +00:00
Rafael Espindola 1a3a22fad1 Don't add suffixes for stdcall/fastcall on 64-bit COFF.
This matches the behavior of both msvc and mingw.

llvm-svn: 196814
2013-12-09 20:44:48 +00:00
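
As a rough illustration of the entry above, the sketch below decorates a symbol name the way 32-bit Windows toolchains conventionally do (`_name@N` for stdcall, `@name@N` for fastcall) and leaves it undecorated for 64-bit targets. The function `decorate`, the `CallConv` enum, and the parameter names are hypothetical and are not LLVM's mangler interface.

```cpp
#include <cassert>
#include <string>

// Hypothetical calling-convention tag for this sketch.
enum class CallConv { C, StdCall, FastCall };

// Returns the decorated symbol name. argBytes is the total size in bytes
// of the arguments, which forms the "@N" suffix on 32-bit targets.
std::string decorate(const std::string &name, CallConv cc, unsigned argBytes,
                     bool is64Bit) {
  assert(!name.empty() && "symbol name must not be empty");
  if (is64Bit)
    return name; // 64-bit COFF: no prefix and no @N suffix
  switch (cc) {
  case CallConv::StdCall:
    return "_" + name + "@" + std::to_string(argBytes);
  case CallConv::FastCall:
    return "@" + name + "@" + std::to_string(argBytes);
  case CallConv::C:
    break;
  }
  return "_" + name; // plain C symbols only get the leading underscore
}
```

For example, `decorate("f", CallConv::StdCall, 8, false)` yields `_f@8`, while the same call with `is64Bit` set to true yields `f`.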
Rafael Espindola e2a1418e68 Don't set a variable to its default value.
llvm-svn: 196807
2013-12-09 19:36:11 +00:00
Ana Pazos bde2828ae0 Fix pattern match for movi with 0D result
Patch by Jiangning Liu.

With some test case changes:
- intrinsic test added to the existing /test/CodeGen/AArch64/neon-aba-abd.ll.
- New test cases to cover movi 1D scenario without using the intrinsic in
test/CodeGen/AArch64/neon-mov.ll.

llvm-svn: 196806
2013-12-09 19:29:14 +00:00
Daniel Sanders 3519dce968 [mips][msa] Fix invalid generated code when lowering FrameIndex involving unaligned offsets.
Summary:
The MSA ld.[bhwd] and st.[bhwd] instructions scale the immediate by the
element size before use as an offset. The offset must therefore be a
multiple of the element size to be valid in these instructions. However,
an unaligned base address is valid in MSA.

This commit causes the compiler to emit valid code when the calculated
offset is not a multiple of the element size by accounting for the offset
using addiu and using a zero offset in the load/store.

Depends on D2338

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D2339

llvm-svn: 196777
2013-12-09 12:47:12 +00:00
Daniel Sanders 26a5a7475e [mips][msa] Fix suboptimal FrameIndex lowering for ld.[hwd] and st.[hwd]
Summary:
The immediate in these instructions is scaled before use as an offset.
They therefore have a wider reach than ld.b/st.b.

Reviewers: matheusalmeida

Reviewed By: matheusalmeida

Differential Revision: http://llvm-reviews.chandlerc.com/D2338

llvm-svn: 196775
2013-12-09 11:50:16 +00:00
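
The two [mips][msa] FrameIndex entries above describe one rule seen from two sides: the immediate is scaled by the element size, so wider elements reach further, and a byte offset that is not a multiple of the element size cannot be encoded at all. A minimal sketch of that decision follows. The names `MsaAddress` and `lowerFrameOffset` are hypothetical, and the 10-bit signed immediate range is an assumption of this sketch rather than something stated in the commit messages.

```cpp
#include <cassert>
#include <cstdint>

// Result of splitting a byte offset for an MSA-style load/store.
struct MsaAddress {
  int64_t baseAdjust; // bytes to add to the base register first (e.g. addiu)
  int64_t imm;        // immediate encoded in the ld/st, in units of elements
};

// Encodes offsetBytes directly when it is a multiple of elemSize and the
// scaled value fits the immediate field; otherwise moves the whole offset
// into the base register and uses a zero immediate.
MsaAddress lowerFrameOffset(int64_t offsetBytes, int64_t elemSize) {
  assert(elemSize == 1 || elemSize == 2 || elemSize == 4 || elemSize == 8);
  const int64_t minImm = -512; // assumed 10-bit signed immediate field
  const int64_t maxImm = 511;
  if (offsetBytes % elemSize == 0) {
    const int64_t scaled = offsetBytes / elemSize;
    if (scaled >= minImm && scaled <= maxImm)
      return {0, scaled};
  }
  return {offsetBytes, 0};
}
```

Under these assumptions a byte offset of 1024 is directly encodable for word elements (immediate 256) but not for byte elements, and an offset of 6 with word elements falls back to adjusting the base register.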
Vladimir Medic 0d02be37c2 Method parseSetAssignment treats every operand with a '$' sign as a register and directs parsing to set an alias for that register. This results in errors being reported when expressions containing label references are parsed (for example, long jumps).
As we can't provide a complete solution now, it has been decided to let the .set directive handle long jump expressions. This will cause the parser to report errors when parsing integer-based register assignments, for example:
   .set r3, will be reported as an error. Still, support for expressions takes priority, as integer-based register assignments are Mips-specific and can be avoided by using register names.

llvm-svn: 196773
2013-12-09 11:03:25 +00:00
Venkatraman Govindaraju 61116e7084 [SPARCV9]: Adjust the resultant pointer of DYNAMIC_STACKALLOC with the stack BIAS on sparcV9.
llvm-svn: 196755
2013-12-09 05:13:25 +00:00
Venkatraman Govindaraju f6c8fe983b [Sparc]: Implement getSetCCResultType() in SparcTargetLowering so that umulo/smulo can be lowered on sparcv9 without an assertion error.
llvm-svn: 196751
2013-12-09 04:02:15 +00:00
Hao Liu 96a587a9f7 [AArch64]Add missing pair intrinsics such as:
int32_t vminv_s32(int32x2_t a)
which should be compiled into SMINP Vd.2S,Vn.2S,Vm.2S

llvm-svn: 196749
2013-12-09 03:51:42 +00:00
Hao Liu 868caea6d1 [AArch64]Pattern match failures for truncate store and extend load
llvm-svn: 196748
2013-12-09 03:34:08 +00:00
Venkatraman Govindaraju 72cc248524 [SparcV9]: Expand MULHU/MULHS:i64 and UMUL_LOHI/SMUL_LOHI:i64 on sparcv9.
This fixes PR18150.

llvm-svn: 196735
2013-12-08 22:06:07 +00:00
Manman Ren 2e06c8c777 Revert 196544 due to internal bot failures.
llvm-svn: 196732
2013-12-08 20:28:33 +00:00
Reed Kotler abaed9ecea Make sure we mark these registers as defined. Previously this was done
in the td file.

llvm-svn: 196731
2013-12-08 19:21:47 +00:00
Reed Kotler e0a34ee66e Clean up prologue/epilogue code for Mips16. The first step
here is to make save/restore into instructions that take a variable number of arguments.

llvm-svn: 196726
2013-12-08 16:51:52 +00:00
Tim Northover a4173715f7 ARM: fix folding of stack-adjustment (yet again).
When trying to eliminate an "sub sp, sp, #N" instruction by folding
it into an existing push/pop using dummy registers, we need to account
for the fact that this might affect precisely how "fp" gets set in the
prologue.

We were attempting this, but assuming that *whenever* we performed a
fold it would make a difference. This is false, for example, in:
    push {r4, r7, lr}
    add fp, sp, #4
    vpush {d8}
    sub sp, sp, #8

we can fold the "sub" into the "vpush", forming "vpush {d7, d8}".
However, in that case the "add fp" instruction mustn't change, which
we were getting wrong before.

Should fix PR18160.

llvm-svn: 196725
2013-12-08 15:56:50 +00:00
Rafael Espindola 080133453b Remove the notion of primitive types.
They were out of place since the introduction of arbitrary precision integer
types.

This also synchronizes the documentation to Types.h, so it refers to first class
types and single value types.

llvm-svn: 196661
2013-12-07 19:34:20 +00:00
Vincent Lejeune 92b0a64906 Add a RequireStructuredCFG Field to TargetMachine.
llvm-svn: 196634
2013-12-07 01:49:19 +00:00
Vincent Lejeune ae7e96062c R600: Remove orphaned declarations
llvm-svn: 196633
2013-12-07 01:49:10 +00:00
Ana Pazos 93a07c2185 Added support for mcpu krait
- The Krait processor is currently modeled with the same features as A9.
- Krait additionally has VFP4 (fused multiply add/sub)
and hardware division features enabled.
- Krait currently has the same schedule model as A9.
- The krait cpu flag is not recognized by the GNU assembler yet;
it is replaced with march=armv7-a to avoid a lower march
from being used.

llvm-svn: 196619
2013-12-06 22:48:17 +00:00
Weiming Zhao 43d8e6cb3b Bug 18149: [AArch32] VSel instructions have no ARMCC field
The current peephole optimization for compare instructions assumes that an
instruction that uses CPSR has an MO for the ARM condition code. However, VSEL
instructions (vseleq, vselge, vselgt, vselvs) have no such operand, nor do
they support modification of the condition code.

llvm-svn: 196588
2013-12-06 17:56:48 +00:00
Cameron McInally e3cc4aacb9 Update AVX512 vector blend intrinsic names.
llvm-svn: 196581
2013-12-06 13:35:35 +00:00
Richard Sandiford 198ddf83c1 [SystemZ] Use LOAD AND TEST for comparisons with -0
...since it is equivalent to comparison with +0.

llvm-svn: 196580
2013-12-06 09:59:12 +00:00
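
The equivalence the entry above relies on is a property of IEEE 754 comparison: negative zero and positive zero compare equal, so a comparison with -0 can be answered by a comparison with +0. The sketch below checks this on a few sample values. The function names are hypothetical, and it assumes the compiler uses IEEE 754 semantics for `double` without value-changing floating-point optimizations.

```cpp
#include <cassert>
#include <limits>

// Compares against negative zero, as in the comparison being optimized.
bool equalsNegativeZero(double x) { return x == -0.0; }

// Compares against positive zero, the form a test-against-zero
// instruction can answer directly.
bool equalsPositiveZero(double x) { return x == 0.0; }

// Asserts that both predicates agree on representative inputs, including
// both zeros, infinity, and NaN (which compares unequal to everything).
void checkPredicatesAgree() {
  const double samples[] = {0.0,
                            -0.0,
                            1.5,
                            -2.0,
                            std::numeric_limits<double>::infinity(),
                            std::numeric_limits<double>::quiet_NaN()};
  for (double x : samples)
    assert(equalsNegativeZero(x) == equalsPositiveZero(x));
}
```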
Richard Sandiford 7b4118a0fc [SystemZ] Extend the use of C(L)GFR
instcombine prefers to put extended operands first, so this patch
handles that case for C(L)GFR.

llvm-svn: 196579
2013-12-06 09:56:50 +00:00
Richard Sandiford 48ef6abddc [SystemZ] Optimize selects between 0 and -1
Since z has no setcc instruction as such, the choice of setBooleanContents
is a bit arbitrary.  Currently it's set to ZeroOrOneBooleanContent,
so we produced a branch-free form when selecting between 0 and 1,
but not when selecting between 0 and -1.  This patch handles the latter
case too.

At some point I'd like to measure whether it's better to use conditional
moves for constant selects on z196, but that's future work.

llvm-svn: 196578
2013-12-06 09:53:09 +00:00
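
To illustrate what "branch-free" means for the two select shapes in the entry above, here is a small source-level sketch. It shows only the arithmetic identity, not the instruction sequence the SystemZ backend emits, and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Selecting between 0 and 1 is just the boolean widened to an integer.
int32_t selectZeroOrOne(bool cond) { return static_cast<int32_t>(cond); }

// Selecting between 0 and -1 is the negation of that widened boolean:
// false -> 0, true -> -1 (all bits set).
int32_t selectZeroOrMinusOne(bool cond) { return -static_cast<int32_t>(cond); }

// Reference implementation with an explicit branch, for comparison.
int32_t selectWithBranch(bool cond, int32_t ifTrue, int32_t ifFalse) {
  if (cond)
    return ifTrue;
  return ifFalse;
}

// Asserts that the branch-free forms match the branching reference.
void checkBranchFreeForms() {
  const bool conds[] = {false, true};
  for (bool c : conds) {
    assert(selectZeroOrOne(c) == selectWithBranch(c, 1, 0));
    assert(selectZeroOrMinusOne(c) == selectWithBranch(c, -1, 0));
  }
}
```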