5 Commits

Author SHA1 Message Date
Justin Lebar b8f7a3b8b1 [CUDA] Rename keywords used in macro so they don't conflict with MSVC.
Summary:
MSVC seems to use "__in" and "__out" for its own purposes, so we have to
pick different names in this macro.

Reviewers: tra

Subscribers: cfe-commits

Differential Revision: https://reviews.llvm.org/D28325

llvm-svn: 291138
2017-01-05 16:54:11 +00:00
Justin Bogner 2f8de9fb4f NVPTX: Rename __builtin_ptx_shfl -> __nvvm_shfl
To match "NVPTX: Make the llvm.nvvm.shfl intrinsics and builtin names
consistent" in LLVM.

llvm-svn: 274663
2016-07-06 19:52:32 +00:00
Justin Lebar 4fb5711751 [CUDA] Implement __shfl* intrinsics in clang headers.
Summary: Clang changes to make use of the LLVM intrinsics added in D21160.

Reviewers: tra

Subscribers: jholewinski, cfe-commits

Differential Revision: http://reviews.llvm.org/D21162

llvm-svn: 272299
2016-06-09 20:04:57 +00:00
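
For readers unfamiliar with the __shfl* family, here is a minimal host-side sketch of the documented __shfl_down semantics and the warp reduction it is typically used for. It is a plain C++ model, not the code from the clang headers; the names kWarpSize, Warp, shfl_down_model and warp_sum_model are hypothetical and exist only for this illustration.

```cpp
#include <array>

constexpr int kWarpSize = 32;             // assumed warp size for the model
using Warp = std::array<int, kWarpSize>;  // one value per lane

// Host-side model of __shfl_down: lane i reads the value held by lane
// i + delta. If that source lane falls outside the lane's width-sized
// segment, the lane keeps its own value. Assumes delta >= 0 and that
// width is a power of two no larger than kWarpSize.
Warp shfl_down_model(const Warp& vals, int delta, int width = kWarpSize) {
  Warp out{};
  for (int lane = 0; lane < kWarpSize; ++lane) {
    const int segment_start = (lane / width) * width;
    const int src = lane + delta;
    out[lane] = (src < segment_start + width) ? vals[src] : vals[lane];
  }
  return out;
}

// Tree reduction built on the model above: the shuffle distance halves
// each step, and lane 0 ends up holding the sum of all 32 inputs.
// Values in the other lanes are not meaningful after the loop.
int warp_sum_model(Warp vals) {
  for (int delta = kWarpSize / 2; delta > 0; delta /= 2) {
    const Warp shifted = shfl_down_model(vals, delta);
    for (int lane = 0; lane < kWarpSize; ++lane) {
      vals[lane] += shifted[lane];
    }
  }
  return vals[0];
}
```

On a real device each lane is a thread and the exchange happens in registers; the arrays here only stand in for the per-lane values.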
Justin Lebar 720f8da33a [CUDA] Fix order of vectorized ldg intrinsics' elements.
Summary: The order is [x, y, z, w], not [w, x, y, z].

Subscribers: cfe-commits, tra

Differential Revision: http://reviews.llvm.org/D20794

llvm-svn: 271215
2016-05-30 17:12:55 +00:00
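
The ordering this fix describes can be sketched on the host as follows. This is an illustration only, not the header code: Int4 and to_int4 are hypothetical names, and std::array stands in for the four-element vector returned by the vectorized load.

```cpp
#include <array>

// Hypothetical stand-in for a CUDA int4-style struct.
struct Int4 {
  int x, y, z, w;
};

// Element 0 of the loaded vector is x, element 1 is y, and so on:
// the order is [x, y, z, w]. Treating the vector as [w, x, y, z]
// would rotate every field by one position.
Int4 to_int4(const std::array<int, 4>& v) {
  return Int4{v[0], v[1], v[2], v[3]};
}
```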
Justin Lebar 2e4ecfdebe [CUDA] Implement __ldg using intrinsics.
Summary:
Previously it was implemented as inline asm in the CUDA headers.

This change allows us to use the [addr+imm] addressing mode when
executing ld.global.nc instructions.  This translates into a 1.3x
speedup on some benchmarks that call this instruction from within an
unrolled loop.

Reviewers: tra, rsmith

Subscribers: jhen, cfe-commits, jholewinski

Differential Revision: http://reviews.llvm.org/D19990

llvm-svn: 270150
2016-05-19 22:49:13 +00:00
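
The kind of unrolled loop the summary refers to might look like the host-side sketch below. It is only an illustration of the access pattern: load_readonly is a hypothetical stand-in for __ldg, and sum_unrolled4 is an invented example, not one of the benchmarks mentioned in the commit.

```cpp
#include <cstddef>

// Hypothetical host-side stand-in for __ldg (a read-only global load).
inline float load_readonly(const float* p) { return *p; }

// Loop unrolled by four. Every load in the body addresses the same base
// pointer plus a small compile-time constant, which is the shape that an
// [addr+imm] addressing mode can encode directly. With an inline-asm
// load, each address would first have to be placed in its own register.
float sum_unrolled4(const float* data, std::size_t n) {
  float acc = 0.0f;
  std::size_t i = 0;
  for (; i + 4 <= n; i += 4) {
    const float* base = data + i;
    acc += load_readonly(base + 0);
    acc += load_readonly(base + 1);
    acc += load_readonly(base + 2);
    acc += load_readonly(base + 3);
  }
  // Remainder loop for lengths that are not a multiple of four.
  for (; i < n; ++i) {
    acc += load_readonly(data + i);
  }
  return acc;
}
```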