Commit Graph

105676 Commits

Author SHA1 Message Date
Hal Finkel b8e7c736fb Handle AddrSpaceCast in stripAndAccumulateInBoundsConstantOffsets
All of the other similar functions in that part of the file look through
addrspacecast in addition to bitcast, and I see no reason why
stripAndAccumulateInBoundsConstantOffsets shouldn't do so also.

llvm-svn: 213449
2014-07-19 03:32:02 +00:00
NAKAMURA Takumi ab184fb88d MergedLoadStoreMotion.cpp: Fix msc17 build. Member initializer is unavailable.
llvm-svn: 213448
2014-07-19 03:29:25 +00:00
Hal Finkel 9e440c08a9 Make Value::isDereferenceablePointer handle offsets to pointer types with dereferenceable attributes
When we have a parameter (or call site return) with a dereferenceable
attribute, it can specify the size of an array pointed to by that parameter. If
we have a value for which we can accumulate a constant offset to such a
parameter, then we can use that offset in a direct comparison with the size
specified by the dereferenceable attribute.

This enables us to handle cases like this:

  int foo(int a[static 3]) {
    return a[2]; /* this is always dereferenceable */
  }
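
As a rough illustration of the comparison described above, the check can be sketched as follows. The function name and parameters are hypothetical and are not LLVM's API; the sketch also assumes a 4-byte int, so that `int a[static 3]` corresponds to 12 dereferenceable bytes.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM function): returns true when an access of
// `accessSize` bytes at constant byte offset `offset` from a pointer lies
// entirely inside the `derefBytes` bytes known to be dereferenceable.
bool offsetAccessIsDereferenceable(int64_t offset, uint64_t accessSize,
                                   uint64_t derefBytes) {
  assert(accessSize > 0 && "an access must touch at least one byte");
  if (offset < 0)
    return false; // Bytes before the pointer are not covered.
  const uint64_t start = static_cast<uint64_t>(offset);
  if (start > derefBytes)
    return false; // The access starts past the end of the known range.
  // Written as a subtraction so that start + accessSize cannot overflow.
  return accessSize <= derefBytes - start;
}
```

With these assumptions, `a[2]` is a 4-byte access at offset 8 into a 12-byte range, so the check succeeds, while `a[3]` (offset 12) would fail.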

llvm-svn: 213447
2014-07-19 03:25:16 +00:00
Saleem Abdulrasool c4e00289a7 ARM: correct WoA __builtin_alloca handling on O0
When performing a dynamic stack adjustment without optimisations, we would mark
SP as def and R4 as kill.  This occurred as part of the expansion of a
WIN__CHKSTK SDNode which indicated the proper handling of SP and R4.  The result
would be that we would double define SP as part of an operation, which is
obviously incorrect.

Furthermore, the VTList for the chain had an incorrect parameter type of i32
instead of Other.

Correct these to permit proper lowering of __builtin_alloca at -O0.

llvm-svn: 213442
2014-07-19 01:29:51 +00:00
David Blaikie b61064ed39 Remove uses of the redundant ".reset(nullptr)" of unique_ptr, in favor of ".reset()"
It's also possible to just write "= nullptr", but there's some question
of whether that's as readable, so I leave it up to authors to pick which
they prefer for now. If we want to discuss standardizing on one or the
other, we can do that at some point in the future.
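
For reference, a minimal sketch of the three spellings side by side. The `DestructionCounter` type and the function name are made up for illustration; the point is only that all three forms destroy the owned object and leave the pointer empty.

```cpp
#include <cassert>
#include <memory>

// Illustrative type that counts how many instances have been destroyed.
struct DestructionCounter {
  explicit DestructionCounter(int *count) : count(count) {}
  ~DestructionCounter() { ++*count; }
  int *count;
};

// Clears three unique_ptrs using the three equivalent spellings and returns
// the number of objects destroyed.
int clearThreeWays() {
  int destroyed = 0;
  std::unique_ptr<DestructionCounter> a(new DestructionCounter(&destroyed));
  std::unique_ptr<DestructionCounter> b(new DestructionCounter(&destroyed));
  std::unique_ptr<DestructionCounter> c(new DestructionCounter(&destroyed));

  a.reset(nullptr); // The redundant form this commit removes.
  b.reset();        // The form this commit switches to.
  c = nullptr;      // The alternative mentioned in the message.

  assert(!a && !b && !c && "all three pointers should now be empty");
  return destroyed;
}
```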

llvm-svn: 213438
2014-07-19 01:05:11 +00:00
Lang Hames 3fda7d81c7 [MCJIT] Add a 'decodeAddend' method to RuntimeDyldMachO and teach
getBasicRelocationEntry to use this rather than 'memcpy' to get the
relocation addend. Targets with non-trivial addend encodings (E.g. AArch64) can
override decodeAddend to handle immediates with interesting encodings.

No functional change.
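
A minimal sketch of the idea, using hypothetical class names and a simplified signature rather than the actual RuntimeDyldMachO interface. The base class copies the raw bytes as before, and a target-specific subclass overrides the hook to unpack an encoded immediate. The override here decodes a 26-bit branch immediate scaled by 4, and is meant only as an example of a non-trivial encoding.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical base class: the default decoding is a plain byte copy.
struct BasicAddendDecoder {
  virtual ~BasicAddendDecoder() = default;

  virtual int64_t decodeAddend(const uint8_t *loc, unsigned numBytes) const {
    assert(numBytes <= sizeof(uint64_t));
    uint64_t raw = 0;
    std::memcpy(&raw, loc, numBytes); // Host byte order, as memcpy implies.
    return static_cast<int64_t>(raw);
  }
};

// Hypothetical target override: the addend is packed into the low 26 bits of
// a little-endian 32-bit instruction word, as a signed count of 4-byte units.
struct BranchImmediateDecoder : BasicAddendDecoder {
  int64_t decodeAddend(const uint8_t *loc, unsigned numBytes) const override {
    assert(numBytes == 4 && "expected one 32-bit instruction");
    const uint32_t insn = static_cast<uint32_t>(loc[0]) |
                          (static_cast<uint32_t>(loc[1]) << 8) |
                          (static_cast<uint32_t>(loc[2]) << 16) |
                          (static_cast<uint32_t>(loc[3]) << 24);
    const uint32_t imm26 = insn & 0x03FFFFFFu;
    int64_t value = imm26;
    if (imm26 & 0x02000000u)
      value -= 0x04000000; // Sign-extend from 26 bits.
    return value * 4;      // Convert instruction units to bytes.
  }
};
```

Callers that hold a reference to the base type pick up the target's decoding through the virtual call, which is what lets the generic relocation code stay unchanged.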

llvm-svn: 213435
2014-07-19 00:19:17 +00:00
Eric Christopher cfd17dd2be Revert "Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself.""""
After a successful build it seems to have come back on a later build.

This reverts commit r213391.

llvm-svn: 213432
2014-07-18 23:57:20 +00:00
Eric Christopher 4e7d1e7e7b Fundamentally change the MipsSubtarget replacement machinery:
a) Move the replacement level decision to the target machine.
b) Create additional subtargets at the TargetMachine level to
   cache and make replacement easy.
c) Make the mips16 features obvious.
d) Remove the override logic as it no longer does anything.
e) Have MipsModuleDAGToDAGISel take only the target machine.
f) Have the constant islands pass grab the current subtarget
   from the MachineFunction (via the TargetMachine) instead
   of caching it.
g) Unconditionally initialize TLOF.
h) Remove the old complicated subtarget based resetting and
   replace it with simple conditionals.

llvm-svn: 213430
2014-07-18 23:41:32 +00:00
Eric Christopher e54f10ee77 FrameLowering depends only upon the Subtarget, so only take a subtarget
during initialization.

llvm-svn: 213429
2014-07-18 23:33:47 +00:00
Hal Finkel 3ee2af7d1c [PowerPC] 32-bit ELF PIC support
This adds initial support for PPC32 ELF PIC (Position Independent Code; the
-fPIC variety), thus rectifying a long-standing deficiency in the PowerPC
backend.

Patch by Justin Hibbits!

llvm-svn: 213427
2014-07-18 23:29:49 +00:00
Eric Christopher 8924d27c02 In preparation for replacing the whole subtarget on the target machine,
have target lowering take the subtarget explicitly.

llvm-svn: 213426
2014-07-18 23:25:04 +00:00
Eric Christopher 675cb4dab8 Make InstrInfo depend only upon the Subtarget getting passed in
rather than the TargetMachine.

llvm-svn: 213425
2014-07-18 23:25:00 +00:00
Eric Christopher 1c29a657c7 The subtarget in MipsTargetLowering isn't going to change and
so doesn't need to be a pointer, but a reference.

llvm-svn: 213422
2014-07-18 22:55:25 +00:00
Eric Christopher f74faf42fe Avoid caching the relocation model on the subtarget, this is for
two reasons:

a) we're already caching the target machine which contains it,
b) which relocation model you get is dependent upon whether or
not you ask before MCCodeGenInfo is constructed on the target
machine, so avoid any latent issues there.

llvm-svn: 213420
2014-07-18 22:34:20 +00:00
Eric Christopher 396a649014 Remove commented out code.
llvm-svn: 213419
2014-07-18 22:34:18 +00:00
Eric Christopher 66b7069cf8 Clean up some style and formatting issues.
llvm-svn: 213418
2014-07-18 22:34:14 +00:00
David Blaikie db5371b3bb DebugInfo: Assert that all abstract scopes are subprograms, rather than conditionalizing.
There's nothing else these should ever be...

llvm-svn: 213417
2014-07-18 22:26:59 +00:00
Mark Heffernan f3764da8ec Fix build breakage introduced with r213412.
llvm-svn: 213414
2014-07-18 21:29:41 +00:00
Mark Heffernan 053a68688a Remove unroll pragma metadata after it is used.
llvm-svn: 213412
2014-07-18 21:04:33 +00:00
Eric Christopher 754d54fcf8 Fix a couple of formatting and style issues.
llvm-svn: 213409
2014-07-18 20:35:49 +00:00
Lang Hames 76774a57d8 [MCJIT] [AArch64] Make sure to propagate ARM64_RELOC_ADDEND values into the
RelocationEntry.

No test case yet, as this primarily hits GOT entries, which RuntimeDyldChecker
can't examine yet. I'm actively working on features that will enable us to
test this.

llvm-svn: 213408
2014-07-18 20:29:36 +00:00
Eric Christopher a08db01b35 Make non-module passes unconditionally added in the pass
manager for mips, and early exit if we don't want to do
anything because of the current subtarget.

llvm-svn: 213407
2014-07-18 20:29:02 +00:00
Eli Bendersky f4f1cff4ba Add tests for atomic adds on floats.
llvm-svn: 213406
2014-07-18 20:11:26 +00:00
Tyler Nowicki 55454c6c5a Rename DiagnosticInfoOptimizationWarning to DiagnosticInfoOptimizationFailure
so the severity of the message is not part of the type name.

Reviewed by Alp Toker

llvm-svn: 213399
2014-07-18 19:36:04 +00:00
Eli Bendersky fc42738a75 Use CHECK-LABEL where appropriate in this test.
llvm-svn: 213398
2014-07-18 19:32:09 +00:00
Mark Heffernan 893752af3a Add loop unrolling metadata descriptions to docs/LangRef.rst.
llvm-svn: 213397
2014-07-18 19:24:51 +00:00
Gerolf Hoflehner f27ae6cdcf MergedLoadStoreMotion pass
Merges equivalent loads on both sides of a hammock/diamond
and hoists them into the header.
Merges equivalent stores on both sides of a hammock/diamond
and sinks them to the footer.
Can enable if-conversion and better tolerate load misses
and store operand latencies.
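
A source-level sketch of the store-sinking half of this, written as two hand-made C++ functions. The pass itself works on LLVM IR, so this is only an analogy for the shape of the transformation, and the function names are invented for illustration.

```cpp
#include <cassert>

// Before: both arms of the diamond store to the same address.
void storeInBothArms(bool cond, int a, int b, int *out) {
  assert(out != nullptr);
  if (cond) {
    *out = a + 1;
  } else {
    *out = b * 2;
  }
}

// After (conceptually): each arm only computes its value, and a single
// merged store sits in the footer where the two paths rejoin.
void storeSunkToFooter(bool cond, int a, int b, int *out) {
  assert(out != nullptr);
  int value;
  if (cond) {
    value = a + 1;
  } else {
    value = b * 2;
  }
  *out = value; // The one remaining store.
}
```

Once the arms contain no memory operations, the remaining branch is a plain value selection, which is the form if-conversion can turn into a select.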

llvm-svn: 213396
2014-07-18 19:13:09 +00:00
David Blaikie 5450240219 Reapply "DebugInfo: Ensure that all debug location scope chains from instructions within a function, lead to the function itself."""
Recommits r212776 which was reverted in r212793. This has been committed
and recommitted a few times as I try to test it harder and find/fix more
issues. The most recent revert was due to an asan bot failure which I
can't seem to reproduce locally, though I believe I'm following all the
steps the buildbot does.

So I'm going to recommit this in the hopes of investigating the failure
on the buildbot itself... apologies in advance for the bot noise. If
anyone sees failures with this /please/ provide me with any
reproductions, etc.

llvm-svn: 213391
2014-07-18 17:49:10 +00:00
David Peixotto b0b3e66ed4 Fix build failure on windows
Add explicit constructor to struct instead of using brace initialization.

llvm-svn: 213389
2014-07-18 16:41:58 +00:00
David Peixotto ae5ba76221 MC: support different sized constants in constant pools
On AArch64 the pseudo instruction ldr <reg>, =... supports both
32-bit and 64-bit constants. Add support for 64 bit constants for
the pools to support the pseudo instruction fully.
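
To make the mixed-size layout concrete, here is a small sketch of a pool that accepts both 4-byte and 8-byte entries. The class and its natural-alignment policy are assumptions for illustration, not the MC implementation.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical pool entry: a constant, its size in bytes, and its offset
// from the start of the pool.
struct PoolEntry {
  uint64_t value;
  unsigned size;
  uint64_t offset;
};

// Hypothetical constant pool that places each entry at the next offset
// aligned to the entry's own size.
class ConstantPoolSketch {
public:
  // Adds a constant and returns the byte offset it was placed at.
  uint64_t add(uint64_t value, unsigned size) {
    assert((size == 4 || size == 8) && "only 32-bit and 64-bit constants");
    const uint64_t offset = (nextOffset + size - 1) & ~uint64_t(size - 1);
    entries.push_back(PoolEntry{value, size, offset});
    nextOffset = offset + size;
    return offset;
  }

  uint64_t sizeInBytes() const { return nextOffset; }

private:
  std::vector<PoolEntry> entries;
  uint64_t nextOffset = 0;
};
```

Adding a 32-bit constant, then a 64-bit one, then another 32-bit one gives offsets 0, 8, and 16 under this policy, with 4 bytes of padding before the 64-bit entry.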

Changes the AArch64 ldr-pseudo tests to use 32-bit registers and
adds tests with 64-bit registers.

Patch by Janne Grunau!

Differential Revision: http://reviews.llvm.org/D4279

llvm-svn: 213387
2014-07-18 16:05:14 +00:00
Hal Finkel b0407ba071 Add a dereferenceable attribute
This attribute indicates that the parameter or return pointer is
dereferenceable. Practically speaking, loads from such a pointer within the
associated byte range are safe to speculatively execute. Such pointer
parameters are common in source languages (C++ references, for example).
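
A small sketch of what "safe to speculatively execute" permits, using two hypothetical functions. The first loads only on the path that needs the value. The second hoists the load above the condition, which is only valid because the pointer is assumed to be dereferenceable on every path. A C++ reference parameter carries that guarantee implicitly; a raw pointer is used here so the assumption can be stated in an assert.

```cpp
#include <cassert>

// Loads only when the value is actually wanted.
int loadOnDemand(const int *p, bool wanted) {
  assert(p != nullptr && "caller guarantees a dereferenceable pointer");
  return wanted ? *p : 0;
}

// Loads unconditionally, then selects. This is equivalent to the version
// above only if *p can be read even when `wanted` is false.
int loadSpeculatively(const int *p, bool wanted) {
  assert(p != nullptr && "caller guarantees a dereferenceable pointer");
  const int value = *p; // Speculative load, hoisted above the condition.
  return wanted ? value : 0;
}
```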

llvm-svn: 213385
2014-07-18 15:51:28 +00:00
Daniel Sanders dc6a941350 Add MIPS Technologies to the vendors in llvm::Triple.
This is a prerequisite for checking for 'mti' and 'img' in a consistent way in
clang. Previously 'img' could use Triple::getVendor() but 'mti' could only use
Triple::getVendorName().
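
The difference between the two styles of check can be sketched as below. The enum and both functions are stand-ins written for this example rather than the llvm::Triple interface: one returns the raw vendor string, the other maps known vendor strings to an enumerator so that 'mti' and 'img' can be tested the same way.

```cpp
#include <cassert>
#include <string>

// Stand-in for a parsed vendor enumeration.
enum class Vendor { Unknown, ImaginationTechnologies, MipsTechnologies };

// Returns the second dash-separated component of a triple string, or an
// empty string if there is none.
std::string vendorName(const std::string &triple) {
  const std::string::size_type first = triple.find('-');
  if (first == std::string::npos)
    return std::string();
  const std::string::size_type second = triple.find('-', first + 1);
  const std::string name =
      second == std::string::npos
          ? triple.substr(first + 1)
          : triple.substr(first + 1, second - first - 1);
  assert(name.find('-') == std::string::npos);
  return name;
}

// Maps the vendor component to an enumerator.
Vendor parseVendor(const std::string &triple) {
  const std::string name = vendorName(triple);
  if (name == "img")
    return Vendor::ImaginationTechnologies;
  if (name == "mti")
    return Vendor::MipsTechnologies;
  return Vendor::Unknown;
}
```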

llvm-svn: 213381
2014-07-18 14:28:19 +00:00
Tim Northover f8bfe21fad AArch64: implement efficient f16 bitcasts
Because i16 is illegal, there's no native DAG method to
represent a bitcast to or from an f16 type. This meant LLVM was
inserting a stack store/load pair which is really not ideal.

llvm-svn: 213378
2014-07-18 13:07:05 +00:00
Tim Northover 9e108a0e3a NVPTX: support fpext/fptrunc to and from f16.
llvm-svn: 213377
2014-07-18 13:01:43 +00:00
Tim Northover 00fdbbbf60 R600: support fpext/fptrunc operations to and from f16.
llvm-svn: 213376
2014-07-18 13:01:37 +00:00
Tim Northover b94f0859e5 AArch64: support f16 extend/trunc operations.
llvm-svn: 213375
2014-07-18 13:01:31 +00:00
Tim Northover 871de902af X86: support fpext/fptrunc operations to and from 16-bit floats.
llvm-svn: 213374
2014-07-18 13:01:25 +00:00
Tim Northover 4e80b584fe ARM: support legalisation of "fptrunc ... to half" operations.
llvm-svn: 213373
2014-07-18 13:01:19 +00:00
Tim Northover 20bd0ced30 CodeGen: soften f16 type by default instead of marking legal.
Actual support for softening f16 operations is still limited, and can be added
when it's needed.  But Soften is much closer to being a useful thing to try
than keeping it Legal when no registers can actually hold such values.

Longer term, we probably want something between Soften and Promote semantics
for most targets: it'll be more efficient to promote the 4 basic operations to
f32 than to libcall them.
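
For a sense of what handling f16 without native registers involves, here is a sketch of the extension step done in software on the raw 16-bit pattern, followed by an addition carried out in f32. The function names are invented for illustration and this is not the runtime library routine a libcall would reach. It assumes IEEE 754 binary16 and binary32 layouts and a 32-bit float, and it leaves out the truncation back to f16.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

// Extends an IEEE 754 binary16 bit pattern to a float. The conversion is
// exact because every half value is representable in single precision.
float halfBitsToFloat(uint16_t h) {
  const uint32_t sign = static_cast<uint32_t>(h & 0x8000u) << 16;
  const uint32_t exponent = (h >> 10) & 0x1Fu;
  uint32_t mantissa = h & 0x3FFu;
  uint32_t bits;
  if (exponent == 0x1Fu) {
    bits = sign | 0x7F800000u | (mantissa << 13); // Infinity or NaN.
  } else if (exponent != 0) {
    // Normal number: rebias the exponent from 15 to 127.
    bits = sign | ((exponent + 112u) << 23) | (mantissa << 13);
  } else if (mantissa == 0) {
    bits = sign; // Signed zero.
  } else {
    // Subnormal half: normalize so the leading 1 becomes implicit.
    assert(mantissa != 0);
    uint32_t shift = 0;
    while ((mantissa & 0x400u) == 0) {
      mantissa <<= 1;
      ++shift;
    }
    mantissa &= 0x3FFu;
    bits = sign | ((113u - shift) << 23) | (mantissa << 13);
  }
  float result;
  std::memcpy(&result, &bits, sizeof(result));
  return result;
}

// "Promoted" addition: extend both operands and add in f32. Rounding the
// result back to f16 is omitted from this sketch.
float addHalvesInFloat(uint16_t a, uint16_t b) {
  return halfBitsToFloat(a) + halfBitsToFloat(b);
}
```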

llvm-svn: 213372
2014-07-18 12:41:46 +00:00
Renato Golin e48d9dc15e Suppress 'not handled in switch' warning
llvm-svn: 213371
2014-07-18 12:13:04 +00:00
Tilmann Scheller 0fc933d6b8 [ARM] Add earlyclobber constraint to pre/post-indexed ARM STR instructions.
The post-indexed instructions were missing the constraint, causing unpredictable STR instructions to be emitted.

The earlyclobber constraint on the pre-indexed STR instructions is not strictly necessary, because instruction selection for pre-indexed STR instructions goes through an additional layer of pseudo instructions which have the constraint defined. However, it doesn't hurt to specify the constraint directly on the pre-indexed instructions as well: at some point someone might create instances of them programmatically, and then the constraint is definitely needed.

This fixes PR20323.

llvm-svn: 213369
2014-07-18 12:05:49 +00:00
Renato Golin c17a07b36a Refactor ARM subarchitecture parsing
Re-commit of a patch to rework the triple parsing on ARM to a more sane
model.

Patch by Gabor Ballabas.

llvm-svn: 213367
2014-07-18 12:00:48 +00:00
Artyom Skrobov 78d5daf8ce extracting swapStruct into include/llvm/Support/MachO.h (no functional change)
llvm-svn: 213361
2014-07-18 09:26:16 +00:00
Tim Northover 12817862f1 R600: rename misleading fp16 test.
This test is actually going in the opposite direction to what the
filename and function name suggested.

llvm-svn: 213358
2014-07-18 08:43:30 +00:00
Tim Northover f861de3d7b R600: support f16 -> f64 conversion intrinsic.
Unfortunately, we don't seem to have a direct truncation, but the
extension can be legally split into two operations so we should
support that.

llvm-svn: 213357
2014-07-18 08:43:24 +00:00
Tim Northover 5e54fe14a4 NVPTX: support direct f16 <-> f64 conversions via intrinsics.
Clang may well start emitting these soon, and while it may not be
directly relevant for OpenCL or GLSL, the instructions were just
sitting there waiting to be used.

llvm-svn: 213356
2014-07-18 08:30:10 +00:00
Hal Finkel e15442c8aa Rename AlignAttribute to IntAttribute
Currently the only kind of integer IR attributes that we have are alignment
attributes, and so the attribute kind that takes an integer parameter is called
AlignAttr, but that will change (we'll soon be adding a dereferenceable
attribute that also takes an integer value). Accordingly, rename AlignAttribute
to IntAttribute (class names, enums, etc.).

No functionality change intended.

llvm-svn: 213352
2014-07-18 06:51:55 +00:00
Matt Arsenault 3dd43fc75d R600: Implement TTI:getPopcntSupport
The test is just copied from X86, and I don't know of a better
way to test it.

llvm-svn: 213351
2014-07-18 06:07:13 +00:00
Jim Grosbach b6535c32f5 X86: Constant fold converting vector setcc results to float.
Since the result of a SETCC for X86 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}

Before this change, the SSE code is generated as:
LCPI0_0:
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .long 1                       ## 0x1
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _foo
  .align  4, 0x90
_foo:                                   ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  cvtdq2ps  %xmm0, %xmm0
  retq

After, the code is improved to:
LCPI0_0:
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .long 1065353216              ## float 1.000000e+00
  .section  __TEXT,__text,regular,pure_instructions
  .globl  _foo
  .align  4, 0x90
_foo:                                   ## @foo
  cmpeqps %xmm1, %xmm0
  andps LCPI0_0(%rip), %xmm0
  retq

The cvtdq2ps has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via the ModRM operand of andps.
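
The identity behind the fold can be checked one lane at a time. The sketch below models a single 32-bit lane with hypothetical helper names and is not the DAG combine itself. It relies on the lane mask being either all zeros or all ones, and on the integer 0 converting to a float whose bit pattern is also all zeros, so masking before or after the conversion gives the same result.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == 4, "sketch assumes a 32-bit float");

uint32_t floatToBits(float f) {
  uint32_t bits;
  std::memcpy(&bits, &f, sizeof(bits));
  return bits;
}

float bitsToFloat(uint32_t bits) {
  float f;
  std::memcpy(&f, &bits, sizeof(f));
  return f;
}

// Unfolded form: sint_to_fp(AND(mask, c)).
float convertAfterMask(uint32_t laneMask, int32_t c) {
  assert(laneMask == 0u || laneMask == 0xFFFFFFFFu);
  assert(c >= 0 && "sketch restricts itself to non-negative constants");
  const uint32_t masked = laneMask & static_cast<uint32_t>(c);
  return static_cast<float>(static_cast<int32_t>(masked));
}

// Folded form: AND(mask, constant2), where constant2 = sint_to_fp(c) is
// computed ahead of time and the AND is applied to its bit pattern.
float maskFoldedConstant(uint32_t laneMask, int32_t c) {
  assert(laneMask == 0u || laneMask == 0xFFFFFFFFu);
  const uint32_t constant2 = floatToBits(static_cast<float>(c));
  return bitsToFloat(laneMask & constant2);
}
```

For c = 1 the folded constant has the bit pattern 1065353216, which matches the `.long 1065353216` entries in the improved assembly above.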

llvm-svn: 213342
2014-07-18 00:40:56 +00:00
Jim Grosbach f7502c4884 AArch64: Constant fold converting vector setcc results to float.
Since the result of a SETCC for AArch64 is 0 or -1 in each lane, we can
move unary operations, in this case [su]int_to_fp through the mask
operation and constant fold the operation away. Generally speaking:
  UNARYOP(AND(VECTOR_CMP(x,y), constant))
      --> AND(VECTOR_CMP(x,y), constant2)
where constant2 is UNARYOP(constant).

This implements the transform where UNARYOP is [su]int_to_fp.

For example, consider the simple function:
define <4 x float> @foo(<4 x float> %val, <4 x float> %test) nounwind {
  %cmp = fcmp oeq <4 x float> %val, %test
  %ext = zext <4 x i1> %cmp to <4 x i32>
  %result = sitofp <4 x i32> %ext to <4 x float>
  ret <4 x float> %result
}

Before this change, the code is generated as:
  fcmeq.4s  v0, v0, v1
  movi.4s v1, #0x1        // Integer splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  scvtf.4s  v0, v0        // Convert each lane to f32.
  ret

After, the code is improved to:
  fcmeq.4s  v0, v0, v1
  fmov.4s v1, #1.00000000 // f32 splat value.
  and.16b v0, v0, v1      // Mask lanes based on the comparison.
  ret

The scvtf.4s has been constant folded away and the floating point 1.0f
vector lanes are materialized directly via fmov.4s.

Rather than do the folding manually in the target code, teach getNode()
in the generic SelectionDAG to handle folding constant operands of
vector [su]int_to_fp nodes. It is reasonable (as noted in a FIXME) to do
additional constant folding there as well, but I don't have test cases
for those operations, so leaving them for another time when it becomes
appropriate.
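
The kind of folding described for getNode() can be pictured as follows. The types and function are hypothetical and much simpler than SelectionDAG's: a vector conversion is folded only when every lane of its operand is a known constant, in which case each lane is converted ahead of time.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical operand lane: either a known constant or an unknown value.
struct LaneOperand {
  bool isConstant;
  int32_t value; // Meaningful only when isConstant is true.
};

// Attempts to fold a signed int-to-float conversion over a vector operand.
// Returns true and fills `folded` if every lane is constant; otherwise
// returns false and leaves `folded` untouched.
bool tryFoldSIntToFP(const std::vector<LaneOperand> &lanes,
                     std::vector<float> *folded) {
  assert(folded != nullptr);
  std::vector<float> result;
  result.reserve(lanes.size());
  for (const LaneOperand &lane : lanes) {
    if (!lane.isConstant)
      return false; // Cannot fold a partially unknown vector.
    result.push_back(static_cast<float>(lane.value));
  }
  folded->swap(result);
  return true;
}
```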

rdar://17693791

llvm-svn: 213341
2014-07-18 00:40:52 +00:00