10412 Commits

Author SHA1 Message Date
Nate Begeman 53afc8f06a Implement a vectorized algorithm for <16 x i8> << <16 x i8>
This is about 4x faster and smaller than the existing scalarization.

llvm-svn: 109566
2010-07-28 00:21:48 +00:00
Stuart Hastings a7f1d4a2ba Testcase for r109556. Radar 8198362.
llvm-svn: 109557
2010-07-27 23:15:25 +00:00
Nate Begeman 269a6da023 ~40% faster vector shl <4 x i32> on SSE 4.1. Larger improvements for smaller types coming in future patches.
For:

define <2 x i64> @shl(<4 x i32> %r, <4 x i32> %a) nounwind readnone ssp {
entry:
  %shl = shl <4 x i32> %r, %a                     ; <<4 x i32>> [#uses=1]
  %tmp2 = bitcast <4 x i32> %shl to <2 x i64>     ; <<2 x i64>> [#uses=1]
  ret <2 x i64> %tmp2
}

We get:

_shl:                                   ## @shl
	pslld	$23, %xmm1
	paddd	LCPI0_0, %xmm1
	cvttps2dq	%xmm1, %xmm1
	pmulld	%xmm1, %xmm0
	ret

Instead of:

_shl:                                   ## @shl
	pshufd	$3, %xmm0, %xmm2
	movd	%xmm2, %eax
	pshufd	$3, %xmm1, %xmm2
	movd	%xmm2, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	pshufd	$1, %xmm0, %xmm3
	movd	%xmm3, %eax
	pshufd	$1, %xmm1, %xmm3
	movd	%xmm3, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm3
	punpckldq	%xmm2, %xmm3
	movd	%xmm0, %eax
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm2
	movhlps	%xmm0, %xmm0
	movd	%xmm0, %eax
	movhlps	%xmm1, %xmm1
	movd	%xmm1, %ecx
	shll	%cl, %eax
	movd	%eax, %xmm0
	punpckldq	%xmm0, %xmm2
	movdqa	%xmm2, %xmm0
	punpckldq	%xmm3, %xmm0
	ret

llvm-svn: 109549
2010-07-27 22:37:06 +00:00
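Note on r109549: the four-instruction sequence appears to build 2^a per lane by writing the shift amount into a float's exponent field, then multiplies by it. The sketch below emulates that per lane in scalar Python. It assumes that LCPI0_0 holds the bit pattern of 1.0f (0x3F800000) in every lane, which the listing does not show, and that shift amounts are in the range 0..31. The function names are made up for illustration.

```python
import struct

MASK32 = 0xFFFFFFFF
ONE_F32_BITS = 0x3F800000  # assumed contents of LCPI0_0: bits of 1.0f


def pow2_via_float(a: int) -> int:
    """Return 2**a for 0 <= a <= 31 using the float exponent field."""
    # pslld $23: move the shift amount into the exponent position.
    # paddd LCPI0_0: add the exponent bias, giving the bits of float(2**a).
    bits = ((a << 23) + ONE_F32_BITS) & MASK32
    # Reinterpret the 32 bits as an IEEE-754 single.
    (f,) = struct.unpack("<f", struct.pack("<I", bits))
    # cvttps2dq: truncate to an integer; keep the low 32 bits.
    return int(f) & MASK32


def vector_shl(r, a):
    """Lane-wise r << a, computed as r * 2**a (pmulld keeps the low 32 bits)."""
    return [(x * pow2_via_float(s)) & MASK32 for x, s in zip(r, a)]
```

For example, `vector_shl([1, 2, 3, 0xFFFFFFFF], [0, 1, 31, 4])` matches a plain per-lane `(x << s) & 0xFFFFFFFF`.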
Devang Patel bd32256e25 Update tests to not rely on input file's absolute path.
llvm-svn: 109521
2010-07-27 18:13:53 +00:00
Nate Begeman 317b969ac5 Fix a crash in the dag combiner caused by ConstantFoldBIT_CONVERTofBUILD_VECTOR calling itself
recursively and returning a SCALAR_TO_VECTOR node, but assuming the input was always a BUILD_VECTOR.

llvm-svn: 109519
2010-07-27 18:02:18 +00:00
Tobias Grosser 731b079edb Make coff-dump.py executable and add python as executable for this script.
This fixes the MC/COFF/basic-coff.ll test case.

llvm-svn: 109497
2010-07-27 09:01:26 +00:00
Michael J. Spencer f8270bdb2d Make MC use Windows COFF on Windows and add tests.
llvm-svn: 109494
2010-07-27 06:46:15 +00:00
Anton Korobeynikov 6bcea068db Currently EH lowering code expects typeinfo to be global only.
This assumption is not satisfied due to global merging.
Work around the issue by temporarily disabling merging of const globals.
Also, ignore LLVM "special" globals. This fixes PR7716.

llvm-svn: 109423
2010-07-26 18:45:39 +00:00
Owen Anderson bb4c4b59a4 Fix a test with malformed IR. Not sure why this didn't fail before.
llvm-svn: 109422
2010-07-26 18:44:56 +00:00
Dan Gohman cd83870faf Fix SCEVExpander::visitAddRecExpr so that it remembers the induction variable
it inserted rather than using LoopInfo::getCanonicalInductionVariable to
rediscover it, since that doesn't work on non-canonical loops. This fixes
infinite recursion on such loops; PR7562.

llvm-svn: 109419
2010-07-26 18:28:14 +00:00
Dan Gohman b0961f2443 Avoid depending on LCSSA implicitly pulling in LoopSimplify.
llvm-svn: 109410
2010-07-26 18:00:43 +00:00
Bruno Cardoso Lopes 306a1f9721 Support x86 "eiz" and "riz" pseudo index registers in the assembler.
llvm-svn: 109295
2010-07-24 00:06:39 +00:00
Matt Fleming fbd7f65248 Consolidate the ELF section directive tests into a single file as
suggested by Chris Lattner.

llvm-svn: 109290
2010-07-23 23:40:41 +00:00
Evan Cheng df907f4594 - Allow the target to specify when register pressure is "too high". In most cases,
  it's too late to start backing off aggressive latency scheduling when most
  of the registers are in use so the threshold should be a bit tighter.
- Correctly handle live-outs, extract_subreg, etc.
- Enable register pressure aware scheduling by default for hybrid scheduler.
  For ARM, this is almost always a win on # of instructions. It's runtime
  neutral for most of the tests. But for some kernels with high register
  pressure it can be a huge win. e.g. 464.h264ref reduced number of spills by
  54 and sped up by 20%.

llvm-svn: 109279
2010-07-23 22:39:59 +00:00
Bruno Cardoso Lopes 6f38011196 Move AVX encoding tests to different files
llvm-svn: 109269
2010-07-23 21:25:26 +00:00
Dan Gohman 55e244698a Use the proper type for shift counts. This fixes a bootstrap error.
llvm-svn: 109265
2010-07-23 21:08:12 +00:00
Stuart Hastings caf8e3a2db Test case to ensure that a template function declaration refers to the correct filename. Radar 8063111.
llvm-svn: 109258
2010-07-23 20:15:49 +00:00
Bruno Cardoso Lopes ea0e05a3ce Add AVX version of CLMUL instructions
llvm-svn: 109248
2010-07-23 18:41:12 +00:00
Dan Gohman 0818684a70 DAGCombine (shl (anyext x, c)) to (anyext (shl x, c)) if the high bits
are not demanded. This often allows the anyext to be folded away.

llvm-svn: 109242
2010-07-23 18:03:30 +00:00
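Note on r109242: the fold relies on the fact that the low bits of a left shift depend only on the low bits of the input. A minimal sketch, modelling an 8-bit value any-extended to 16 bits, where the extended high bits are arbitrary. The function names and bit widths are chosen for illustration and are not taken from the patch.

```python
def shl_of_anyext(x8: int, junk_high: int, c: int) -> int:
    """(shl (anyext x), c) at 16 bits; anyext leaves the high byte unspecified."""
    wide = ((junk_high & 0xFF) << 8) | (x8 & 0xFF)
    return (wide << c) & 0xFFFF


def shl_narrow(x8: int, c: int) -> int:
    """(shl x, c) performed at the original 8-bit width."""
    return (x8 << c) & 0xFF


def low_bits_agree(x8: int, junk_high: int, c: int) -> bool:
    """True if the two forms agree on the low 8 bits, the only bits demanded."""
    return (shl_of_anyext(x8, junk_high, c) & 0xFF) == shl_narrow(x8, c)
```

When only the low 8 bits are demanded, the two forms agree for every shift amount below 8, whatever the any-extended high bits happen to be. The high bits can differ, which is why the fold is restricted to the case where they are not demanded.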
Bruno Cardoso Lopes acd9230b1b Add complete assembler support for FMA3 instructions, with descriptions and encodings taken from the AVX manual
llvm-svn: 109204
2010-07-23 00:54:35 +00:00
Bruno Cardoso Lopes 0710c74f29 Add remaining AVX instructions (most of them dealing with GR64 destinations). This completes the assembler support for the general AVX ISA. But we are still missing instructions from the FMA3 and CLMUL specific feature flags, which are now the next step
llvm-svn: 109168
2010-07-22 21:18:49 +00:00
Tobias Grosser 336734aca6 Add new RegionInfo pass.
The RegionInfo pass detects single entry single exit regions in a function,
where a region is defined as any subgraph that is connected to the remaining
graph at only two spots.
Furthermore, a hierarchical region tree is built.
Use it by calling "opt -regions analyze" or "opt -view-regions".

llvm-svn: 109089
2010-07-22 07:46:31 +00:00
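Note on r109089: the "connected at only two spots" definition can be illustrated with a direct edge check on a small control-flow graph. This is a simplified sketch of the definition only. It makes no claim about how the RegionInfo pass finds regions, and the function name and graph encoding are hypothetical.

```python
def is_sese_region(succ, nodes, entry, exit_node):
    """Check whether `nodes` forms a single entry single exit region.

    succ maps each block name to a list of its successor block names.
    The exit block is treated as lying outside the region.
    """
    nodes = set(nodes)
    if entry not in nodes or exit_node in nodes:
        return False
    for src, dsts in succ.items():
        for dst in dsts:
            # Every edge entering the region must target the entry block.
            if src not in nodes and dst in nodes and dst != entry:
                return False
            # Every edge leaving the region must target the exit block.
            if src in nodes and dst not in nodes and dst != exit_node:
                return False
    return True


# A diamond: a -> b, b -> {c, d}, c -> e, d -> e, e -> f
CFG = {"a": ["b"], "b": ["c", "d"], "c": ["e"], "d": ["e"], "e": ["f"], "f": []}
```

With this graph, `is_sese_region(CFG, {"b", "c", "d"}, "b", "e")` holds, while `{"b", "c"}` with the same entry and exit does not, because the edge b -> d leaves the set without going to the exit.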
Eric Christopher 9a77382685 Custom lower the memory barrier instructions and add support
for lowering without sse2.  Add a couple of new testcases.

Fixes a few libgomp tests and latent bugs.  Remove a few todos.

llvm-svn: 109078
2010-07-22 02:48:34 +00:00
Evan Cheng 285903853f More register pressure aware scheduling work.
llvm-svn: 109064
2010-07-21 23:53:58 +00:00
Bruno Cardoso Lopes e3acfd4d58 Add more 256-bit forms for a bunch of regular AVX instructions
Add 64-bit (GR64) versions of some instructions (which are not
described in their SSE forms, but are described in AVX)

llvm-svn: 109063
2010-07-21 23:53:50 +00:00
Eric Christopher 84bdfd80df Baby steps towards ARM fast-isel.
llvm-svn: 109047
2010-07-21 22:26:11 +00:00
Bruno Cardoso Lopes 6238c1d102 Add missing AVX convert instructions. Those instructions are not described in their SSE forms (although they exist), but add the AVX forms anyway, so the assembler can benefit from it
llvm-svn: 109039
2010-07-21 21:37:59 +00:00
Dan Gohman 093cb79d4b Disallow null as a named metadata operand.
Make MDNode::destroy private.
Fix the one thing that used MDNode::destroy, outside of MDNode itself.

One should never delete or destroy an MDNode explicitly. MDNodes
implicitly go away when there are no references to them (implementation
details aside).

llvm-svn: 109028
2010-07-21 18:54:18 +00:00
Rafael Espindola 4277e14dc4 Fix calling convention on ARM if vfp2+ is enabled.
llvm-svn: 109009
2010-07-21 11:38:30 +00:00
Bruno Cardoso Lopes cdbec62510 Add AVX only vzeroall and vzeroupper instructions
llvm-svn: 109002
2010-07-21 08:56:24 +00:00
Eric Christopher 690aa72437 Turn this test on again after the llvm-gcc change in r108986.
llvm-svn: 108987
2010-07-21 04:54:06 +00:00
Eric Christopher 8d95d26eb1 Update this to use a "valid" alignment.
llvm-svn: 108985
2010-07-21 04:51:24 +00:00
Bruno Cardoso Lopes 3499934da6 Add new AVX vpermilps, vpermilpd and vperm2f128 instructions
llvm-svn: 108984
2010-07-21 03:07:42 +00:00
Bruno Cardoso Lopes 3ceaf7a0a2 Add new AVX vmaskmov instructions, and also fix the VEX encoding bits to support it
llvm-svn: 108983
2010-07-21 02:46:58 +00:00
Bruno Cardoso Lopes e706501975 Add new AVX vextractf128 instructions
llvm-svn: 108964
2010-07-20 23:19:02 +00:00
Matt Fleming c3eb5e3d4b Include some tests for the recently committed ELF section directive
handlers.

llvm-svn: 108938
2010-07-20 21:37:30 +00:00
Eric Christopher 3f696ff489 Testcase for llvm-gcc commit r108910.
llvm-svn: 108918
2010-07-20 20:32:47 +00:00
Bruno Cardoso Lopes 3b505848fd Add new AVX instruction vinsertf128
llvm-svn: 108892
2010-07-20 19:44:51 +00:00
Dan Gohman 625fd2292d Fix SCEV denormalization of expressions where the exit value from
one loop is involved in the increment of an addrec for another
loop. This fixes rdar://8168938.

llvm-svn: 108863
2010-07-20 17:06:20 +00:00
Jim Grosbach badf087e45 update tests for smarter BIC usage
llvm-svn: 108846
2010-07-20 16:16:48 +00:00
Duncan Sands 2e839de377 The same problem was being tracked in PR7652.
llvm-svn: 108843
2010-07-20 15:52:32 +00:00
Bruno Cardoso Lopes 160695fecb Fix PR7174, a couple of Mips fixes:
- Fix a typo for PIC check during jmp table lowering
- Also fix the "first jump table basic block is not
considered only reachable by fall through" problem, use this
ad-hoc solution until I come up with something better.

Patch by stetorvs@gmail.com

llvm-svn: 108820
2010-07-20 08:37:04 +00:00
Bruno Cardoso Lopes ea7863647b Fix Mips PR7473. Patch by stetorvs@gmail.com
llvm-svn: 108816
2010-07-20 07:58:51 +00:00
Bruno Cardoso Lopes 6c8041ea34 x86_32 tests for vbroadcast
llvm-svn: 108789
2010-07-20 00:11:50 +00:00
Bruno Cardoso Lopes 14c5fd437c Add new AVX vbroadcast instruction
llvm-svn: 108788
2010-07-20 00:11:13 +00:00
Bruno Cardoso Lopes 9de0ca73d4 Add 256-bit vaddsub, vhadd, vhsub, vblend and vdpp instructions!
llvm-svn: 108769
2010-07-19 23:32:44 +00:00
Dan Gohman b5e918dc05 After a custom inserter, in a block which has constant instructions,
update the current basic block in addition to the current insert
position, so that they remain consistent. This fixes rdar://8204072.

llvm-svn: 108765
2010-07-19 22:48:56 +00:00
Daniel Dunbar 9db7d0addd X86: Mark JMP{32,64}[mr] as requiring 32-bit/64-bit mode. They are the same
instruction, we only want to allow the one for the current subtarget.
 - This also fixes suffix matching for jmp instructions, because it eliminates
   the ambiguity between 'jmpl' and 'jmpq'.

llvm-svn: 108746
2010-07-19 20:44:16 +00:00
Dale Johannesen d4e389441d Testcase for 108732 (8195660).
llvm-svn: 108733
2010-07-19 18:22:40 +00:00
Devang Patel 18efced1a2 Fix PR 7662.
Do not try to insert local variable info to a DIE used for function declaration.

llvm-svn: 108731
2010-07-19 17:53:55 +00:00