Commit Graph

55155 Commits

Author SHA1 Message Date
Rafael Espindola efab16d43b Handle implicit_defs in the register coalescer. I am still trying to produce
a reduced testcase, but this fixes pr13209.

llvm-svn: 159479
2012-06-30 01:45:55 +00:00
Nuno Lopes 7b12b87096 revert r159440. As Duncan pointed out, the test for invoke is not needed at this point
llvm-svn: 159471
2012-06-29 22:10:10 +00:00
Manman Ren b1b3db6802 ARM: Clean up optimizeCompare in peephole, no functional change.
Use getUniqueVRegDef.
Replace a loop with existing interfaces: modifiesRegister and readsRegister.
Factor out code into inline functions and simplify the code.

llvm-svn: 159470
2012-06-29 22:06:19 +00:00
Manman Ren 6fa76dc0e0 Add SrcReg2 to analyzeCompare and optimizeCompareInstr to handle Compare
instructions with two register operands.

llvm-svn: 159465
2012-06-29 21:33:59 +00:00
Jakob Stoklund Olesen 3e3cdecf98 Clear kill flags in InstrEmitter::EmitSubregNode().
When a local virtual register is made global, make sure to clear any
existing kill flags.

llvm-svn: 159461
2012-06-29 21:00:03 +00:00
Jakob Stoklund Olesen da9ea1d6bc Check for extra kill flags on live-out virtual registers.
This would previously get reported as the misleading "Virtual register
def doesn't dominate all uses."

llvm-svn: 159460
2012-06-29 21:00:00 +00:00
Benjamin Kramer 396b3adc10 CodeGenPrepare: Don't crash when TLI is not available.
This happens when codegenprepare is invoked via opt.

llvm-svn: 159457
2012-06-29 19:58:21 +00:00
Manman Ren c146589aa4 Add getUniqueVRegDef to MachineRegisterInfo.
This comes in handy during peephole optimization.

llvm-svn: 159453
2012-06-29 19:16:05 +00:00
Duncan Sands 9838286d9e Rework this to clarify where the removal of nodes from the queue is
really happening.  No intended functionality change.

llvm-svn: 159451
2012-06-29 19:03:05 +00:00
Nuno Lopes 674acc12d0 RefreshCallGraph: ignore 'invoke intrinsic'. IntrinsicInst doesnt not recognize invoke, and shouldnt at this point, since the rest of LLVM codebase doesnt expect invoke of intrinsics
llvm-svn: 159441
2012-06-29 17:49:32 +00:00
Nuno Lopes b37ef71ce1 ignore 'invoke new' in isInstructionTriviallyDead, since most callers are not ready to handle invokes. instcombine will take care of this.
llvm-svn: 159440
2012-06-29 17:37:07 +00:00
Alexey Samsonov 6e7e6b646b Cleanup in DwarfDebug - fix a typo and remove two unused functions
llvm-svn: 159433
2012-06-29 16:04:14 +00:00
Duncan Sands 369c6d270b Fix a reassociate crash on sozefx when compiling with dragonegg+gcc-4.7 due to
the optimizers producing a multiply expression with more multiplications than
the original (!).

llvm-svn: 159426
2012-06-29 13:25:06 +00:00
Chandler Carruth aafe0918bc Move llvm/Support/IRBuilder.h -> llvm/IRBuilder.h
This was always part of the VMCore library out of necessity -- it deals
entirely in the IR. The .cpp file in fact was already part of the VMCore
library. This is just a mechanical move.

I've tried to go through and re-apply the coding standard's preferred
header sort, but at 40-ish files, I may have gotten some wrong. Please
let me know if so.

I'll be committing the corresponding updates to Clang and Polly, and
Duncan has DragonEgg.

Thanks to Bill and Eric for giving the green light for this bit of cleanup.

llvm-svn: 159421
2012-06-29 12:38:19 +00:00
Bill Wendling 098d906dbb Update the CMake files.
llvm-svn: 159417
2012-06-29 09:01:47 +00:00
Bill Wendling f799efdedc The DIBuilder class is just a wrapper around debug info creation
(a.k.a. MDNodes). The module doesn't belong in Analysis. Move it to the VMCore
instead.

llvm-svn: 159414
2012-06-29 08:32:07 +00:00
Andrew Trick 51a8cf77b8 Revert "Make NumMicroOps a variable in the subtarget's instruction itinerary."
This reverts commit r159406. I noticed a performance regression so I'll back out for now.

llvm-svn: 159411
2012-06-29 07:10:41 +00:00
Rafael Espindola efdfb1e6b2 In the initial exec mode we always do a load to find the address of a variable.
Before this patch in pic 32 bit code we would add the global base register
and not load from that address. This is a really old bug, but before the
introduction of the tls attributes we would never select initial exec for
pic code.

llvm-svn: 159409
2012-06-29 04:22:35 +00:00
Andrew Trick 8c9e6728b3 misched: avoid scheduling instructions that can't be dispatched.
llvm-svn: 159408
2012-06-29 03:23:24 +00:00
Andrew Trick ce27bb999d misched: count micro-ops toward the issue limit.
llvm-svn: 159407
2012-06-29 03:23:22 +00:00
Andrew Trick 1f50152b2d Make NumMicroOps a variable in the subtarget's instruction itinerary.
The TargetInstrInfo::getNumMicroOps API does not change, but soon it
will be used by MachineScheduler. Now each subtarget can specify the
number of micro-ops per itinerary class. For ARM, this is currently
always dynamic (-1), because it is used for load/store multiple which
depends on the number of register operands.

Zero is now a valid number of micro-ops. This can be used for
nop pseudo-instructions or instructions that the hardware can squash
during dispatch.

llvm-svn: 159406
2012-06-29 03:23:18 +00:00
Manman Ren 98a5bf24a9 X86: add more GATHER intrinsics in LLVM
Corrected type for index of llvm.x86.avx2.gather.d.pd.256
  from 256-bit to 128-bit.
Corrected types for src|dst|mask of llvm.x86.avx2.gather.q.ps.256
  from 256-bit to 128-bit.

Support the following intrinsics:
  llvm.x86.avx2.gather.d.q, llvm.x86.avx2.gather.q.q
  llvm.x86.avx2.gather.d.q.256, llvm.x86.avx2.gather.q.q.256
  llvm.x86.avx2.gather.d.d, llvm.x86.avx2.gather.q.d
  llvm.x86.avx2.gather.d.d.256, llvm.x86.avx2.gather.q.d.256

llvm-svn: 159402
2012-06-29 00:54:20 +00:00
Nick Lewycky 474112d82c If the step value is a constant zero, the loop isn't going to terminate. Fixes
the assert reported in PR13228!

llvm-svn: 159393
2012-06-28 23:44:57 +00:00
Nuno Lopes 2f49284f12 make the verifier accept @llvm.donothing as the only intrinsic that can be invoked
While at it, merge 2 tests and FileCheckize them

llvm-svn: 159388
2012-06-28 22:57:00 +00:00
Nuno Lopes b97a4e8bc2 make simplifyCFG erase invokes to readonly/readnone functions
llvm-svn: 159385
2012-06-28 22:32:27 +00:00
Nuno Lopes 9ac4661afa make instcombine produce calls to llvm.donothing instead of a random intrinsic
llvm-svn: 159384
2012-06-28 22:31:24 +00:00
Nuno Lopes ec9653b363 add a new @llvm.donothing intrinsic that, well, does nothing, and teach CodeGen to ignore calls to it
llvm-svn: 159383
2012-06-28 22:30:12 +00:00
Jim Grosbach e0c10d8b86 'Promote' vector [su]int_to_fp should widen elements.
Teach vector legalization how to honor Promote for int to float
conversions. The code checking whether to promote the operation knew
to look at the operand, but the actual promotion code didn't. This
fixes that. The operand is promoted up via [zs]ext.

rdar://11762659

llvm-svn: 159378
2012-06-28 21:03:44 +00:00
Jack Carter 27747b57f9 Changed the formatting sequence of a curly brace to
the comment per code review feedback.

llvm-svn: 159376
2012-06-28 20:46:26 +00:00
Bill Wendling b2f11986f8 Remove layering violation #include.
llvm-svn: 159372
2012-06-28 20:17:05 +00:00
Benjamin Kramer ae3c300625 Enable automatic GCC<->LLVM intrinsic translation for mips.
llvm-svn: 159367
2012-06-28 19:09:53 +00:00
Nuno Lopes 181d67ecb1 MemoryBuiltins:
- recognize C++ new(std::nothrow) friends
 - ignore ExtractElement and ExtractValue instructions in size/offset analysis (all easy cases are probably folded away before we get here)
 - also recognize realloc as noalias

llvm-svn: 159356
2012-06-28 16:34:03 +00:00
Nuno Lopes 8650fb8e0e make LazyValueInfo analyze the default case of switch statements (we know that in the default branch the value cannot be any of the switch cases)
llvm-svn: 159353
2012-06-28 16:13:37 +00:00
Nuno Lopes 5020db2a8c add ConstantRange::difference (to perform set difference/relative complement)
llvm-svn: 159352
2012-06-28 16:10:13 +00:00
Benjamin Kramer 92658b8149 Devirtualize DIScope and subclasses.
Nothing in here makes use of the virtuality.

llvm-svn: 159349
2012-06-28 14:25:45 +00:00
Kostya Serebryany c387ca7bab [asan] set a hard limit on the number of instructions instrumented pear each BB. This is (hopefully temporary) workaround for PR13225
llvm-svn: 159344
2012-06-28 09:34:41 +00:00
Hal Finkel 918ca2b8b7 Precompute SCEV pointer analysis prior to instruction fusion in BBVectorize.
When both a load/store and its address computation are being vectorized, it can
happen that the address-computation vectorization destroys SCEV's ability
to analyize the relative pointer offsets. As a result (like with the aliasing
analysis info), we need to precompute the necessary information prior to
instruction fusing.

This was found during stress testing (running through the test suite with a very
low required chain length); unfortunately, I don't have a small test case.

llvm-svn: 159332
2012-06-28 05:42:45 +00:00
Hal Finkel 0873d73cbf Remove a useless check in BBVectorize.
A shuffle mask will always be a constant, but I did not realize that
when I originally wrote the code.

llvm-svn: 159331
2012-06-28 05:42:43 +00:00
Hal Finkel f2dcb9a9c4 Allow BBVectorize to form non-2^n-length vectors.
The original algorithm only used recursive pair fusion of equal-length
types. This is now extended to allow pairing of any types that share
the same underlying scalar type. Because we would still generally
prefer the 2^n-length types, those are formed first. Then a second
set of iterations form the non-2^n-length types.

Also, a call to SimplifyInstructionsInBlock has been added after each
pairing iteration. This takes care of DCE (and a few other things)
that make the following iterations execute somewhat faster. For the
same reason, some of the simple shuffle-combination cases are now
handled internally.

There is some additional refactoring work to be done, but I've had
many requests for this feature, so additional refactoring will come
soon in future commits (as will additional test cases).

llvm-svn: 159330
2012-06-28 05:42:42 +00:00
Hal Finkel 74e5225c92 Refactor operation equivalence checking in BBVectorize by extending Instruction::isSameOperationAs.
Maintaining this kind of checking in different places is dangerous, extending
Instruction::isSameOperationAs consolidates this logic into one place. Here
I've added an optional flags parameter and two flags that are important for
vectorization: CompareIgnoringAlignment and CompareUsingScalarTypes.

llvm-svn: 159329
2012-06-28 05:42:26 +00:00
Bill Wendling a2ccbf0f85 Only print out the tag if it's there.
llvm-svn: 159328
2012-06-28 02:17:58 +00:00
Bill Wendling 74ac023cf6 Don't output an empty string.
llvm-svn: 159327
2012-06-28 02:12:20 +00:00
Jack Carter 6c0bc0b378 The Mips specific inline asm operand modifier 'z' has the
following description in the gnu sources:

    Print $0 if operand is zero otherwise print the op normally.

llvm-svn: 159324
2012-06-28 01:33:40 +00:00
Nuno Lopes e6e049020b make LVI::getEdgeValue() always intersect the constraints of the edge with the range of the block. Previously it was only performing the intersection for a few cases, thus losing precision
llvm-svn: 159320
2012-06-28 01:16:18 +00:00
Nuno Lopes ebb0c94e1b fix a off-by-one bug in intersectWith(), and add a bunch of tests
llvm-svn: 159319
2012-06-28 00:59:33 +00:00
Bill Wendling 5cb50c5bd5 Use the interface through DIDescriptor to get the tag/version for a debug info
MDNode.

llvm-svn: 159317
2012-06-28 00:41:44 +00:00
Bill Wendling 3b2ab9eaaa Fix cmake failure from moving files around.
llvm-svn: 159314
2012-06-28 00:18:12 +00:00
Bill Wendling e38859dc8e Move lib/Analysis/DebugInfo.cpp to lib/VMCore/DebugInfo.cpp and
include/llvm/Analysis/DebugInfo.h to include/llvm/DebugInfo.h.

The reasoning is because the DebugInfo module is simply an interface to the
debug info MDNodes and has nothing to do with analysis.

llvm-svn: 159312
2012-06-28 00:05:13 +00:00
Jack Carter ef40238a0e This allows hello world to be compiled for Mips 64 direct object.
It takes advantage of r159299 which introduces relocation support for N64. 
elf-dump needed to be upgraded to support N64 relocations as well.

This passes make check.

Jack

llvm-svn: 159302
2012-06-27 23:13:42 +00:00
Jack Carter b9f9de93df This allows hello world to be compiled for Mips 64 direct object.
It takes advantage of r159299 which introduces relocation support for N64. 
elf-dump needed to be upgraded to support N64 relocations as well.

This passes make check.

Jack

llvm-svn: 159301
2012-06-27 22:48:25 +00:00
Chad Rosier 51afe6397b Whitespace.
llvm-svn: 159300
2012-06-27 22:34:28 +00:00
Jack Carter 8ad0c272af The ELF relocation record format is different for N64
which many Mips 64 ABIs use than for O64 which many 
if not all other target ABIs use.

Most architectures have the following 64 bit relocation record format:

  typedef struct
  {
    Elf64_Addr   r_offset; /* Address of reference */
    Elf64_Xword  r_info;   /* Symbol index and type of relocation */
  } Elf64_Rel;

  typedef struct
  {
    Elf64_Addr    r_offset;
    Elf64_Xword   r_info;
    Elf64_Sxword  r_addend;
  } Elf64_Rela;

Whereas N64 has the following format:

  typedef struct
  {
    Elf64_Addr    r_offset;/* Address of reference */
    Elf64_Word  r_sym;     /* Symbol index */
    Elf64_Byte  r_ssym;    /* Special symbol */
    Elf64_Byte  r_type3;   /* Relocation type */
    Elf64_Byte  r_type2;   /* Relocation type */
    Elf64_Byte  r_type;    /* Relocation type */
  } Elf64_Rel;

  typedef struct
  {
    Elf64_Addr    r_offset;/* Address of reference */
    Elf64_Word  r_sym;     /* Symbol index */
    Elf64_Byte  r_ssym;    /* Special symbol */
    Elf64_Byte  r_type3;   /* Relocation type */
    Elf64_Byte  r_type2;   /* Relocation type */
    Elf64_Byte  r_type;    /* Relocation type */
    Elf64_Sxword  r_addend;
  } Elf64_Rela;

The structure is the same size, but the r_info data element 
is now 5 separate elements. Besides the content aspects, 
endian byte reordering will be different for the area with 
each element being endianized separately.

I treat this as generic and continue to pass r_type as 
an integer masking and unmasking the byte sized N64 
values for N64 mode. I've implemented this and it causes no 
affect on other current targets.

This passes make check.

Jack

llvm-svn: 159299
2012-06-27 22:28:30 +00:00
Matt Beaumont-Gay a58862310c Revert r159136 due to PR13124.
Original commit message:

If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 159272
2012-06-27 17:10:33 +00:00
Duncan Sands 514db117bd Some reassociate optimizations create new instructions, which they insert just
before the expression root.  Any existing operators that are changed to use one
of them needs to be moved between it and the expression root, and recursively
for the operators using that one.  When I rewrote RewriteExprTree I accidentally
inverted the logic, resulting in the compacting going down from operators to
operands rather than up from operands to the operators using them, oops.  Fix
this, resolving PR12963.

llvm-svn: 159265
2012-06-27 14:19:00 +00:00
Richard Barton 57b7d16e34 Teach assembler to handle capitalised operation values for DSB instructions
llvm-svn: 159259
2012-06-27 09:48:23 +00:00
Richard Barton 4b7558ef9a Prevent ARM Assembler crashing on unrecognised assembly format for DSB instruction
llvm-svn: 159257
2012-06-27 09:36:19 +00:00
Akira Hatanaka d030738b8f Silence uninitialized variable warning in MipsISelDAGToDAG.cpp.
llvm-svn: 159243
2012-06-27 00:49:46 +00:00
Akira Hatanaka 62871a3460 Fix bug in computation of stack size in MipsFrameLowering.cpp.
llvm-svn: 159240
2012-06-27 00:20:39 +00:00
Bill Wendling 3b70d784a2 Reduce indentation in function. Rearrange some methods. No functionality change.
llvm-svn: 159239
2012-06-26 23:22:18 +00:00
Bill Wendling e02a1f8cf2 Revamp how debugging information is emitted for debug info objects.
It's not necessary for each DI class to have its own copy of `print' and
`dump'. Instead, just give DIDescriptor those methods and have it call the
appropriate debugging printing routine based on the type of the debug
information.

llvm-svn: 159237
2012-06-26 22:57:33 +00:00
Evan Cheng a75127871c Add a missing check to avoid dereference null. No sensible test case possible. Sorry. rdar://11745134
llvm-svn: 159236
2012-06-26 22:54:59 +00:00
Evan Cheng 319be53a1f Remove a instcombine transform that (no longer?) makes sense:
// C - zext(bool) -> bool ? C - 1 : C
    if (ZExtInst *ZI = dyn_cast<ZExtInst>(Op1))
      if (ZI->getSrcTy()->isIntegerTy(1))
        return SelectInst::Create(ZI->getOperand(0), SubOne(C), C);

This ends up forming sext i1 instructions that codegen to terrible code. e.g.
int blah(_Bool x, _Bool y) {
  return (x - y) + 1;
}
=>
        movzbl  %dil, %eax
        movzbl  %sil, %ecx
        shll    $31, %ecx
        sarl    $31, %ecx
        leal    1(%rax,%rcx), %eax
        ret


Without the rule, llvm now generates:
        movzbl  %sil, %ecx
        movzbl  %dil, %eax
        incl    %eax
        subl    %ecx, %eax
        ret

It also helps with ARM (and pretty much any target that doesn't have a sext i1 :-).

The transformation was done as part of Eli's r75531. He has given the ok to
remove it.

rdar://11748024

llvm-svn: 159230
2012-06-26 22:03:13 +00:00
Benjamin Kramer efe4028693 Implement getHostCPUName for ARM/linux. This will be used to implement -march=native in clang.
The cpuid registers are only available in privileged mode so we don't have
an OS-independent way of implementing this. ARM doesn't provide a list of
processor IDs so the list is somewhat incomplete.

llvm-svn: 159228
2012-06-26 21:36:32 +00:00
Manman Ren a09820414a X86: add GATHER intrinsics (AVX2) in LLVM
Support the following intrinsics:
llvm.x86.avx2.gather.d.pd, llvm.x86.avx2.gather.q.pd
llvm.x86.avx2.gather.d.pd.256, llvm.x86.avx2.gather.q.pd.256
llvm.x86.avx2.gather.d.ps, llvm.x86.avx2.gather.q.ps
llvm.x86.avx2.gather.d.ps.256, llvm.x86.avx2.gather.q.ps.256

Modified Disassembler to handle VSIB addressing mode.

llvm-svn: 159221
2012-06-26 19:47:59 +00:00
Argyrios Kyrtzidis 46785f9461 Fix ThreadLocalImpl::getInstance for --disable-threads.
PR13114.

llvm-svn: 159210
2012-06-26 17:13:58 +00:00
Jakob Stoklund Olesen 59a0d3243b Allow targets to inject passes before the virtual register rewriter.
Such passes can be used to tweak the register assignments in a
target-dependent way, for example to avoid write-after-write
dependencies.

llvm-svn: 159209
2012-06-26 17:09:29 +00:00
Jack Carter 5e69cffed5 There are a number of generic inline asm operand modifiers that
up to r158925 were handled as processor specific. Making them 
generic and putting tests for these modifiers in the CodeGen/Generic
directory caused a number of targets to fail. 

This commit addresses that problem by having the targets call 
the generic routine for generic modifiers that they don't currently
have explicit code for.

For now only generic print operands 'c' and 'n' are supported.vi


Affected files:

    test/CodeGen/Generic/asm-large-immediate.ll
    lib/Target/PowerPC/PPCAsmPrinter.cpp
    lib/Target/NVPTX/NVPTXAsmPrinter.cpp
    lib/Target/ARM/ARMAsmPrinter.cpp
    lib/Target/XCore/XCoreAsmPrinter.cpp
    lib/Target/X86/X86AsmPrinter.cpp
    lib/Target/Hexagon/HexagonAsmPrinter.cpp
    lib/Target/CellSPU/SPUAsmPrinter.cpp
    lib/Target/Sparc/SparcAsmPrinter.cpp
    lib/Target/MBlaze/MBlazeAsmPrinter.cpp
    lib/Target/Mips/MipsAsmPrinter.cpp
    
MSP430 isn't represented because it did not even run with
the long existing 'c' modifier and it was not apparent what
needs to be done to get it inline asm ready.

Contributer: Jack Carter
llvm-svn: 159203
2012-06-26 13:49:27 +00:00
Duncan Sands 8bc764aeca Replacing zero-sized alloca's with a null pointer is too aggressive, instead
merge all zero-sized alloca's into one, fixing c43204g from the Ada ACATS
conformance testsuite.  What happened there was that a variable sized object
was being allocated on the stack, "alloca i8, i32 %size".  It was then being
passed to another function, which tested that the address was not null (raising
an exception if it was) then manipulated %size bytes in it (load and/or store).
The optimizers cleverly managed to deduce that %size was zero (congratulations
to them, as it isn't at all obvious), which made the alloca zero size, causing
the optimizers to replace it with null, which then caused the check mentioned
above to fail, and the exception to be raised, wrongly.  Note that no loads
and stores were actually being done to the alloca (the loop that does them is
executed %size times, i.e. is not executed), only the not-null address check.

llvm-svn: 159202
2012-06-26 13:39:21 +00:00
Elena Demikhovsky 863d2d3235 Removed unused variable
llvm-svn: 159197
2012-06-26 10:50:07 +00:00
Bill Wendling 8ed44466c2 Rename to match other X86_64* names.
llvm-svn: 159196
2012-06-26 10:05:06 +00:00
Elena Demikhovsky 26088d2e24 Shuffle optimization for AVX/AVX2.
The current patch optimizes frequently used shuffle patterns and gives these instruction sequence reduction.
Before:
      vshufps $-35, %xmm1, %xmm0, %xmm2 ## xmm2 = xmm0[1,3],xmm1[1,3]
       vpermilps       $-40, %xmm2, %xmm2 ## xmm2 = xmm2[0,2,1,3]
       vextractf128    $1, %ymm1, %xmm1
       vextractf128    $1, %ymm0, %xmm0
       vshufps $-35, %xmm1, %xmm0, %xmm0 ## xmm0 = xmm0[1,3],xmm1[1,3]
       vpermilps       $-40, %xmm0, %xmm0 ## xmm0 = xmm0[0,2,1,3]
       vinsertf128     $1, %xmm0, %ymm2, %ymm0
After:
      vshufps $13, %ymm0, %ymm1, %ymm1 ## ymm1 = ymm1[1,3],ymm0[0,0],ymm1[5,7],ymm0[4,4]
      vshufps $13, %ymm0, %ymm0, %ymm0 ## ymm0 = ymm0[1,3,0,0,5,7,4,4]
      vunpcklps       %ymm1, %ymm0, %ymm0 ## ymm0 = ymm0[0],ymm1[0],ymm0[1],ymm1[1],ymm0[4],ymm1[4],ymm0[5],ymm1[5]

llvm-svn: 159188
2012-06-26 08:04:10 +00:00
Chandler Carruth 9139f44d23 Update a bunch of stale comments that dated from when this folled the
very first (and worst) placement algorithm. These should now more
accurately reflect the reality of the pass.

llvm-svn: 159185
2012-06-26 05:16:37 +00:00
Craig Topper 94bf0f3855 Remove some duplicate instructions that exist only to given different mnemonics for the assembler. Use InstAlias instead.
llvm-svn: 159184
2012-06-26 04:12:49 +00:00
Andrew Trick fb2ba3e1cb Enable the new LoopInfo algorithm by default.
The primary advantage is that loop optimizations will be applied in a
stable order. This helps debugging and unit test creation. It is also
a better overall implementation without pathologically bad performance
on deep functions.

On large functions (llvm-stress --size=200000 | opt -loops)
Before: 0.1263s
After:  0.0225s

On deep functions (after tweaking llvm-stress, thanks Nadav):
Before: 0.2281s
After:  0.0227s

See r158790 for more comments.

The loop tree is now consistently generated in forward order, but loop
passes are applied in reverse order over the program. If we have a
loop optimization that prefers forward order, that can easily be
achieved by adding a different type of LoopPassManager.

llvm-svn: 159183
2012-06-26 04:11:38 +00:00
Andrew Trick fecf937938 Remove unnecessary FIXME
llvm-svn: 159182
2012-06-26 04:11:34 +00:00
Evan Cheng 4c6f917d34 Make sure type is not extended or untyped before create a constant of the type. No test case. Found by inspection.
llvm-svn: 159179
2012-06-26 01:19:33 +00:00
Eli Friedman bbcd09cc00 Make some ugly hacks for inline asm operands which name a specific register a bit more thorough. PR13196.
llvm-svn: 159176
2012-06-25 23:42:33 +00:00
Nuno Lopes 31b54a5379 revert my previous commit (r159173), since as Eli pointed out, it's perfectly ok to mark realloc as noalias
llvm-svn: 159175
2012-06-25 23:26:10 +00:00
Nuno Lopes 75eaa72de9 do not set realloc() as NotAlias, since it can return the same pointer. This whole thing should be upgraded to use the MemoryBuiltin interface anyway..
llvm-svn: 159173
2012-06-25 22:55:50 +00:00
Manman Ren 606953fbe7 ARM: update peephole optimization.
More condition codes are included when deciding whether to remove cmp after
a sub instruction. Specifically, we extend from GE|LT|GT|LE to 
GE|LT|GT|LE|HS|LS|HI|LO|EQ|NE. If we have "sub a, b; cmp b, a; movhs", we
should be able to replace with "sub a, b; movls".

rdar: 11725965
llvm-svn: 159166
2012-06-25 21:49:38 +00:00
Dan Gohman 5f725cd196 Fix the objc_autoreleasedReturnValue optimization code to locate
the call correctly even in the case where it is an invoke. This
fixes rdar://11714057.

llvm-svn: 159157
2012-06-25 19:47:37 +00:00
Jakob Stoklund Olesen a57fc12ec9 Enforce stricter liveness rules for PHIs.
Verify that all paths from the entry block to a virtual register read
pass through a def. Enable this check even when MRI->isSSA() is false.

Verify that the live range of a virtual register is live out of all
predecessor blocks, even for PHI-values.

This requires that PHIElimination sometimes inserts IMPLICIT_DEF
instruction in predecessor blocks.

llvm-svn: 159150
2012-06-25 18:18:27 +00:00
Jakob Stoklund Olesen eb49566447 Run ProcessImplicitDefs on SSA form where it can be much simpler.
Implicitly defined virtual registers can simply have the <undef> bit set
on all uses, and copies can be turned into implicit defs recursively.

Physical registers are a bit trickier. We handle the common case where a
physreg def is used by a nearby instruction in the same basic block. For
more complicated cases, just leave the IMPLICIT_DEF instruction in.

llvm-svn: 159149
2012-06-25 18:12:18 +00:00
Nuno Lopes 07594cba7c improve optimization of invoke instructions:
- simplifycfg:  invoke undef/null -> unreachable
 - instcombine:  invoke new  -> invoke expect(0, 0)  (an arbitrary NOOP intrinsic;  only done if the allocated memory is unused, of course)
 - verifier:  allow invoke of intrinsics  (to make the previous step work)

llvm-svn: 159146
2012-06-25 17:11:47 +00:00
Nuno Lopes 9ecc8761bc check for the NoAlias attribute through CallSite
llvm-svn: 159145
2012-06-25 16:17:54 +00:00
Meador Inge fc2fb711e8 PR13013: ELF Type identification fails for MSB type ELF files.
Fix 'sys::IdentifyFileType' to work with big and little endian byte orderings
when reading the ELF object file type.

Initial patch by Stefan Hepp.

llvm-svn: 159138
2012-06-25 14:48:43 +00:00
Rafael Espindola 540c3d23df If a constant or a function has linkonce_odr linkage and unnamed_addr, mark it
hidden. Being linkonce_odr guarantees that it is available in every dso that
needs it. Being a constant/function with unnamed_addr guarantees that the
copies don't have to be merged.

llvm-svn: 159136
2012-06-25 14:30:31 +00:00
Eli Bendersky f0ad3606c7 The name (and comment describing) of llvm::GetFirstDebuigLocInBasicBlock no longer represents what the function does. Therefore, the function is removed and its functionality is folded into the only place in the code-base where it was being used.
llvm-svn: 159133
2012-06-25 10:13:14 +00:00
Craig Topper 357de815b4 Add SSE2 predicate to CVTPS2PD instructions. Doesn't matter much because there are no patterns in the instruction.
llvm-svn: 159127
2012-06-25 06:51:42 +00:00
Craig Topper b6eb513c68 Remove codegen only instruction in favor of one that has the same definition. Make some pattern operands more explicit about types.
llvm-svn: 159126
2012-06-25 06:16:00 +00:00
Jakob Stoklund Olesen 70ed924e18 Teach PHIElimination to handle <undef> operands.
When a PHI use is <undef>, don't emit a copy in the predecessor block,
but insert an IMPLICIT_DEF instruction instead. This ensures that
virtual register uses are always jointly dominated by defs, even if some
of them are IMPLICIT_DEF.

llvm-svn: 159121
2012-06-25 03:36:12 +00:00
Jakob Stoklund Olesen 6b556f824d Handle <undef> operands in TwoAddressInstructionPass.
When the source register to a 2-addr instruction is undefined, there is
no need to attempt any transformations - simply replace the source
register with the destination register.

This also comes up when lowering IMPLICIT_DEF instructions - make sure
the <undef> flag is moved to the new partial register def operand:

  %vreg8<def> = INSERT_SUBREG %vreg9<undef>, %vreg0<kill>, sub_16bit
rewrite undef:
  %vreg8<def> = INSERT_SUBREG %vreg8<undef>, %vreg0<kill>, sub_16bit
convert to:
  %vreg8:sub_16bit<def,read-undef> = COPY %vreg0<kill>

llvm-svn: 159120
2012-06-25 03:27:12 +00:00
Jakob Stoklund Olesen 2e22e6a361 %RCX is not a function live-out in eh.return functions.
The function live-out registers must be live at all function returns,
and %RCX is only used by eh.return. When a function also has a normal
return, only %RAX holds a return value.

This fixes PR13188.

llvm-svn: 159116
2012-06-24 15:53:01 +00:00
NAKAMURA Takumi 704de074b8 llvm/lib: [CMake] Add explicit dependency to intrinsics_gen.
llvm-svn: 159112
2012-06-24 13:32:01 +00:00
Hal Finkel 3099ce9489 Allow controlling vectorization of boolean values separately from other integer types.
These are used as the result of comparisons, and often handled differently from larger integer types.

llvm-svn: 159111
2012-06-24 13:28:01 +00:00
Nick Lewycky 0a045bbe4e Remove dyn_cast + dereference pattern by replacing it with a cast and changing
the safety check to look for the same type we're going to actually cast to.
Fixes PR13180!

llvm-svn: 159110
2012-06-24 10:15:42 +00:00
Craig Topper fd5e6e7db1 Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns.
llvm-svn: 159109
2012-06-24 07:07:16 +00:00
Craig Topper b925230fb1 Remove intrinsic specific instructions for (V)CVTPS2DQ and replace with patterns.
llvm-svn: 159108
2012-06-24 06:55:37 +00:00
Craig Topper f48ec7a708 Fix build failures from r159106.
llvm-svn: 159107
2012-06-24 06:08:31 +00:00
Craig Topper bab2b89944 Remove intrinsic specific instructions for CVTPD2PS and replace with just patterns.
llvm-svn: 159106
2012-06-24 05:44:31 +00:00