hanchenye-llvm-project/llvm
Sanjay Patel ae945e7927 [InstCombine] transform more extract/insert pairs into shuffles (PR2109)
This is an extension of the shuffle combining from r203229:
http://reviews.llvm.org/rL203229

The idea is to widen a short input vector with undef elements so the
existing shuffle transform for extract/insert can kick in.

The motivation is to finally solve PR2109:
https://llvm.org/bugs/show_bug.cgi?id=2109

For that example, the IR becomes:

%1 = bitcast <2 x i32>* %P to <2 x float>*
%ld1 = load <2 x float>, <2 x float>* %1, align 8
%2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
%i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
ret <4 x float> %i2
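
The widening step can be checked with a small model of shufflevector semantics. This is a minimal Python sketch, not the InstCombine code: the helper names are hypothetical, and the insert lanes (2 and 3) are inferred from the final mask <0, 1, 4, 5> above rather than quoted from PR2109.

```python
UNDEF = None  # stands in for an undef lane or an undef mask index

def shufflevector(a, b, mask):
    """Model of shufflevector: index i < len(a) picks a[i], otherwise b[i - len(a)]."""
    assert len(a) == len(b), "both inputs must have the same length"
    joined = list(a) + list(b)
    return [UNDEF if i is UNDEF else joined[i] for i in mask]

def insert_extract_pairs(vec_a, short):
    """Scalar form: extract both lanes of the short vector, insert into lanes 2 and 3."""
    out = list(vec_a)
    out[2] = short[0]  # extractelement 0 -> insertelement 2
    out[3] = short[1]  # extractelement 1 -> insertelement 3
    return out

def widen_then_shuffle(vec_a, short):
    """Shuffle form: pad the short vector with undef lanes, then use one shuffle."""
    undef_short = [UNDEF] * len(short)
    widened = shufflevector(short, undef_short, [0, 1, UNDEF, UNDEF])
    return shufflevector(vec_a, widened, [0, 1, 4, 5])

a = [1.0, 2.0, 3.0, 4.0]
p = [9.0, 8.0]
print(insert_extract_pairs(a, p) == widen_then_shuffle(a, p))
```

The undef lanes of the widened vector are never selected by the second mask, which is why padding with undef is safe here.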

And x86 SSE output improves from:

movq	(%rdi), %xmm1           ## xmm1 = mem[0],zero
movdqa	%xmm1, %xmm2
shufps	$229, %xmm2, %xmm2      ## xmm2 = xmm2[1,1,2,3]
shufps	$48, %xmm0, %xmm1       ## xmm1 = xmm1[0,0],xmm0[3,0]
shufps	$132, %xmm1, %xmm0      ## xmm0 = xmm0[0,1],xmm1[0,2]
shufps	$32, %xmm0, %xmm2       ## xmm2 = xmm2[0,0],xmm0[2,0]
shufps	$36, %xmm2, %xmm0       ## xmm0 = xmm0[0,1],xmm2[2,0]
retq

To the almost optimal:

movhpd	(%rdi), %xmm0
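
The lane comments on the shufps lines in the longer sequence can be reproduced by decoding each immediate. This is a small sketch assuming the standard SHUFPS encoding, in which each 2-bit field of the immediate selects one lane: the two low result lanes come from the destination register and the two high lanes from the source. The function name decode_shufps is hypothetical.

```python
def decode_shufps(imm):
    """Return the four 2-bit lane selectors of a shufps immediate, lowest lane first."""
    return [(imm >> (2 * lane)) & 0b11 for lane in range(4)]

# Immediates taken from the shufps instructions in the longer sequence.
for imm in (229, 48, 132, 32, 36):
    sel = decode_shufps(imm)
    # First two selectors index the destination register, last two the source.
    print(imm, "-> dst[%d,%d],src[%d,%d]" % tuple(sel))
```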

Note: There's a tension in the existing transform related to generating
arbitrary shufflevector masks. We avoid that in other places in InstCombine
because we're scared that codegen can't handle strange masks, but it looks
like we're ok with producing those here. I purposely chose weird insert/extract
indexes for the regression tests to see the effect in these cases. 
For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or
better for these examples.

Differential Revision: http://reviews.llvm.org/D15096

llvm-svn: 256394
2015-12-24 21:17:56 +00:00
autoconf [OPENMP] Make -fopenmp to turn on OpenMP support by default. 2015-12-10 05:45:58 +00:00
bindings Deprecate a few C APIs. 2015-12-18 23:46:42 +00:00
cmake win: Pass /W4 in front of all the -wd flags. 2015-12-23 02:38:31 +00:00
docs Add advice on choosing reviewers 2015-12-22 18:59:02 +00:00
examples [Orc] Fix Kaleidoscope example for change in r254693. 2015-12-04 02:32:32 +00:00
include Fix signed/unsigned warning in Line.h. 2015-12-24 19:17:54 +00:00
lib [InstCombine] transform more extract/insert pairs into shuffles (PR2109) 2015-12-24 21:17:56 +00:00
projects
resources
test [InstCombine] transform more extract/insert pairs into shuffles (PR2109) 2015-12-24 21:17:56 +00:00
tools llvm-dwarfdump: Add support for dumping .dSYM bundles. 2015-12-23 21:51:13 +00:00
unittests [Function] Properly remove use when clearing personality 2015-12-23 18:27:23 +00:00
utils [X86][PKU] Add {RD,WR}PKRU encoding 2015-12-24 08:25:00 +00:00
.arcconfig
.clang-format
.clang-tidy adding readability-identifier-naming to llvm clang-tidy configuration. 2015-12-08 17:44:51 +00:00
.gitignore
CMakeLists.txt Generate a clang CompilationDatabase when running CMake 2015-12-16 18:17:45 +00:00
CODE_OWNERS.TXT The PS4 baton passes. 2015-12-19 20:04:03 +00:00
CREDITS.TXT
LICENSE.TXT
LLVMBuild.txt
Makefile
Makefile.common
Makefile.config.in
Makefile.rules Create Makefile variables for 'share' and 'libexec' 2015-11-09 16:10:00 +00:00
README.txt Revert previous test commit. 2015-12-11 07:40:25 +00:00
configure [OPENMP] Make -fopenmp to turn on OpenMP support by default. 2015-12-10 05:45:58 +00:00
llvm.spec.in

README.txt

Low Level Virtual Machine (LLVM)
================================

This directory and its subdirectories contain source code for LLVM,
a toolkit for the construction of highly optimized compilers,
optimizers, and runtime environments.

LLVM is open source software. You may freely distribute it under the terms of
the license agreement found in LICENSE.TXT.

Please see the documentation provided in docs/ for further
assistance with LLVM, and in particular docs/GettingStarted.rst for getting
started with LLVM and docs/README.txt for an overview of LLVM's
documentation setup.

If you are writing a package for LLVM, see docs/Packaging.rst for our
suggestions.