Commit Graph

764 Commits

Author SHA1 Message Date
Richard Smith 66a7385e27 <float.h>: do not define DECIMAL_DIG in -std=c89 mode; this macro was added in C99.
Patch by Jorge Teixeira!

llvm-svn: 260639
2016-02-12 01:15:33 +00:00
Eric Christopher 0466c7ce23 Use __ before argument names in provided headers.
llvm-svn: 260631
2016-02-12 00:32:23 +00:00
Richard Smith b473e1e473 In C11, provide macros FLT_DECIMAL_DIG, DBL_DECIMAL_DIG, and LDBL_DECIMAL_DIG in <float.h>.
Patch by Jorge Teixeira!

llvm-svn: 260577
2016-02-11 19:57:37 +00:00
Ekaterina Romanova a61946d551 This patch adds doxygen comments for all the intrinsincs in the header file f16cintrin.h. The doxygen comments are automatically generated based on Sony's intrinsics document.
Differential Revision: http://reviews.llvm.org/D17021

llvm-svn: 260333
2016-02-10 00:12:24 +00:00
Ekaterina Romanova d416747803 This patch adds doxygen comments for all the intrinsincs in the header file pmmintrin.h. The doxygen comments are automatically generated based on Sony's intrinsics document.
Differential Revision: http://reviews.llvm.org/D16913

llvm-svn: 260160
2016-02-08 22:35:09 +00:00
Igor Breger 9c2a0bfa13 AVX512: Change builtin function name for scalar intrinsics. Add "mask" to function name to reflect the function behavior.
Differential Revision: http://reviews.llvm.org/D16957

llvm-svn: 260088
2016-02-08 12:36:48 +00:00
Artem Belevich 2aad2b3500 [CUDA] Bug 26497 : Remove wrappers for variants provided by CUDA headers.
... and pull global-scope ones into std namespace with using-declaration.

Differential Revision: http://reviews.llvm.org/D16932

llvm-svn: 259944
2016-02-05 22:54:05 +00:00
Artem Belevich 7b660e2604 [CUDA] added declarations for device-side system calls
...and std:: wrappers for free/malloc.

llvm-svn: 259690
2016-02-03 20:53:58 +00:00
Ekaterina Romanova 0e19cf2dd8 This patch adds doxygen comments for the intrinsincs in the header file __wmmintrin_aes.h.
The doxygen comments are automatically generated based on Sony's intrinsics document.

Differential Revision: http://reviews.llvm.org/D16562

llvm-svn: 259275
2016-01-29 23:59:00 +00:00
Ekaterina Romanova deec50a3d2 This patch adds doxygen comments for the intrinsincs in the header file __wmmintrin_pclmul.h. The doxygen comments are automatically generated based on Sony's intrinsics document.
Differential Revision: http://reviews.llvm.org/D15999

llvm-svn: 259239
2016-01-29 20:37:14 +00:00
Artem Belevich c5f41a34e5 [CUDA] Implemented device-side support functions in <cmath>.
CUDA expects math functions in std:: namespace to work on device side.
In order to make it work with clang without allowing device-side code
generation for functions w/o appropriate target attributes, this patch
provides device-side implementations for <cmath> functions. Most of
them call global-scope math functions provided by CUDA headers. In few
cases we use clang builtins.

Tested out-of tree by compiling and running thrust's unit_tests.
https://github.com/thrust/thrust/tree/master/testing

Differential Revision: http://reviews.llvm.org/D16593

llvm-svn: 258880
2016-01-26 23:37:29 +00:00
Chris Bieneman 2bf68c6c1c Remove autoconf support
Summary:
This patch is provided in preparation for removing autoconf on 1/26. The proposal to remove autoconf on 1/26 was discussed on the llvm-dev thread here: http://lists.llvm.org/pipermail/llvm-dev/2016-January/093875.html

    "This is the way [autoconf] ends
    Not with a bang but a whimper."
    -T.S. Eliot

Reviewers: chandlerc, grosbach, bob.wilson, echristo

Subscribers: klimek, cfe-commits

Differential Revision: http://reviews.llvm.org/D16472

llvm-svn: 258862
2016-01-26 21:30:40 +00:00
Justin Lebar 3039a593db [CUDA] Make printf work.
Summary:
The code in CGCUDACall is largely based on a patch written by Eli
Bendersky:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20140324/210218.html

That patch implemented an LLVM pass lowering printf to vprintf; this
one does something similar, but in Clang codegen.

Reviewers: echristo

Subscribers: cfe-commits, jhen, tra, majnemer

Differential Revision: http://reviews.llvm.org/D16372

llvm-svn: 258642
2016-01-23 21:28:14 +00:00
Ekaterina Romanova 08d1f2431d 2 missing intrinsics _cvtss_sh and _mm_cvtps_ph were added to the intrinsics header f16intrin.h
Differential Revision: http://reviews.llvm.org/D16177

llvm-svn: 258492
2016-01-22 06:50:50 +00:00
Adam Nemet e708747129 [AVX512] Fix typo in r226298
Hal noticed that the double/float got mixed up on the parameters for
these.

llvm-svn: 258108
2016-01-19 02:02:25 +00:00
Kyle Butt 436ff85b63 [PPC] Add long long/double support for vec_cts, vec_ctu and vec_ctf
Add long long/double support for vec_cts, vec_ctu and vec_ctf.

Similar to this change in GCC:
https://gcc.gnu.org/ml/gcc-patches/2014-08/msg02653.html

Patch by Tim Shen.

llvm-svn: 257135
2016-01-08 02:00:48 +00:00
David Majnemer 30f9bfd574 Reimplement __readeflags and __writeeflags on top of intrinsics
Lean on LLVM to provide this functionality now that it provides the
necessary intrinsics.

llvm-svn: 256686
2016-01-01 06:50:08 +00:00
Asaf Badouh a9d1e18f48 [X86][PKU] add clang intrinsic for {RD|WR}PKRU
Differential Revision: http://reviews.llvm.org/D15837

llvm-svn: 256672
2015-12-31 14:14:07 +00:00
Eric Christopher 7f7d9bea6f Fix up comment in header.
llvm-svn: 256508
2015-12-28 19:07:46 +00:00
Michael Kuperstein 591278c08d [X86] Add missing m64/int64 conversions
Define the 64-bit equivalents of _m_to_int and _m_from_int.

Differential Revision: http://reviews.llvm.org/D15572

llvm-svn: 256122
2015-12-20 12:37:18 +00:00
Michael Kuperstein beae026738 [X86] Add signed aliases for popcnt intrinsics
The Intel manual documents both an unsigned form (_mm_popcnt_u32)
and a signed form (_popcnt32) of the intrinsic. Add the missing signed form.

Differential Revision: http://reviews.llvm.org/D15568

llvm-svn: 256121
2015-12-20 12:35:35 +00:00
Artem Belevich 8e9ba042a6 [CUDA] runtime wrapper header tweaks
* Pull in host-only implementations of few CUDA-specific math functions.
* #nclude <cmath> early to prevent its inclusion from CUDA headers after
  they've messed with __THROW macro.

llvm-svn: 255933
2015-12-17 22:25:22 +00:00
Artem Belevich 7fda3c9ff3 [CUDA] renamed cuda_runtime.h wrapper to __cuda_runtime.h
Currently it's easy to break CUDA compilation by passing
"-isystem /path/to/cuda/include" to compiler which leads to
compiler including real cuda_runtime.h from there instead
of the wrapper we need.

Renaming the wrapper ensures that we can include the wrapper
regardless of user-specified include paths and files.

Differential Revision: http://reviews.llvm.org/D15534

llvm-svn: 255802
2015-12-16 18:51:59 +00:00
Asaf Badouh 5e4248b4e0 [x86][avx512] more changes in intrinsics to be align with gcc format
Differential Revision: http://reviews.llvm.org/D15328

llvm-svn: 255012
2015-12-08 12:34:38 +00:00
Asaf Badouh 3e5111e313 [avx512] rename gcc intrinsics to be align with gcc format
rename the gcc intrinsics suffix : _mask ->_round

Differential Revision: http://reviews.llvm.org/D15284

llvm-svn: 254906
2015-12-07 13:14:22 +00:00
Paul Robinson 941bc91518 Move _mm256_cvtps_ph and _mm256_cvtph_ps to immintrin.h.
This more closely matches their locations as described by Intel
documentation, and lets us remove a pair of redundant typedefs.

Differential Revision: http://reviews.llvm.org/D15127

llvm-svn: 254528
2015-12-02 18:41:52 +00:00
Craig Topper 5ec97a7b9b [X86] Improve codegen for AVX2 gather with an all 1s mask.
Use undefined instead of setzero as the pass through input since its going to be fully overwritten. Use cmpeq of two zero vectors to produce the all 1s vector. Casting -1 to a double and vectorizing causes a constant load of a -1.0 floating point value.

llvm-svn: 254389
2015-12-01 07:12:59 +00:00
Craig Topper e20b8c68ed [X86] _mm256_permutevar8x32_ps should take an integer vector for its shuffle index input.
llvm-svn: 254270
2015-11-29 22:53:32 +00:00
Craig Topper 3a71f35a67 [X86] Remove temporary variables from intrinsic macros. NFC
llvm-svn: 254247
2015-11-29 06:50:33 +00:00
Argyrios Kyrtzidis dcb5653516 [CMake] Add a specific 'install-clang-headers' target.
llvm-svn: 253636
2015-11-20 02:24:03 +00:00
Artem Belevich c29db84419 [CUDA] Added a wrapper header for inclusion of stock CUDA headers.
Header files that come with CUDA are assuming split host/device
compilation and are not usable by clang out of the box.
With a bit of preprocessor magic it's possible to twist them
into something clang can use.

This wrapper always includes CUDA headers exactly the same way during
host and device compilation passes and produces identical preprocessed
content during host and device side compilation for sm_35 GPUs. Device
compilation passes for older GPUs will see a smaller subset of device
functions supported by particular GPU.

The wrapper assumes specific contents of CUDA header files and works
only with CUDA 7.0 and 7.5.

Differential Revision: http://reviews.llvm.org/D13171

llvm-svn: 253388
2015-11-17 22:28:52 +00:00
Hans Wennborg 1acf955a6a bmiintrin.h: Allow using the tzcnt intrinsics for non-BMI targets
The tzcnt intrinsics are used non non-BMI targets by code (e.g. ffmpeg)
that uses it as a potentially faster BSF.

The TZCNT instruction is special in that it's encoded in a
backward-compatible way and behaves as BSF on non-BMI targets.

Differential Revision: http://reviews.llvm.org/D14748

llvm-svn: 253358
2015-11-17 18:46:48 +00:00
Oliver Stannard 7aa90f5735 [ARM,AArch64] Fix __rev16l and __rev16ll intrinsics
These two intrinsics are defined in arm_acle.h.

__rev16l needs to rotate by 16 bits, bit it was actually rotating by 2 bits.
For AArch64, where long is 64 bits, this would still be wrong.

__rev16ll was incorrect, it reversed the bytes in each 32-bit word, rather than
each 16-bit halfword. The correct implementation is to apply __rev16 to the top
and bottom words of the 64-bit value.

For AArch32 targets, these get compiled down to the hardware rev16 instruction
at -O1 and above. For AArch64 targets, the 64-bit ones get compiled to two
32-bit rev16 instructions, because there is not currently a pattern for the
64-bit rev16 instruction.

Differential Revision: http://reviews.llvm.org/D14609

llvm-svn: 253211
2015-11-16 14:58:50 +00:00
Craig Topper fb79b5f273 [X86] Add 'pause' builtin that's already in llvm and use it instead of inline assembly to implement _mm_pause.
llvm-svn: 252712
2015-11-11 08:13:33 +00:00
Craig Topper a5455524c2 [X86] Use __builtin_ia32_paddq and __builtin_ia32_psubq to implement a couple intrinsics that were supposed to operate on MMX registers. Otherwise we end up operating on GPRs. Throw in a test for _mm_mul_su32 while I was there.
llvm-svn: 252711
2015-11-11 08:00:41 +00:00
Craig Topper 880f60b7b3 [X86] Header formatting fixes. NFC
llvm-svn: 252710
2015-11-11 08:00:39 +00:00
Craig Topper d619eaaae4 [X86] Add missing typecasts in intrinsic macros. This should make them more robust against inputs that aren't already the right type.
llvm-svn: 252700
2015-11-11 03:47:10 +00:00
Craig Topper 19744ee6ad [X86] Change pointer type in AVX2 gather builtins to be the scalar type instead of the vector type. This matches gcc and removes extras casts.
llvm-svn: 252697
2015-11-11 02:51:18 +00:00
Craig Topper fd778eebac [X86] Use setzero instead of set1(0) in a few places in intrinsic headers.
llvm-svn: 252587
2015-11-10 05:08:08 +00:00
Craig Topper 7148166785 [X86] Remove temporary variables from macros in x86 intrinsic headers. Prevents duplicate names appearing from multiple macro expansions. NFC
llvm-svn: 252586
2015-11-10 05:08:05 +00:00
Craig Topper 166f8b20a3 [X86] Fix bad intrinsic header comment. NFC.
llvm-svn: 252585
2015-11-10 05:08:00 +00:00
Craig Topper 991d499457 Fix a couple intrinsic header comments. NFC
llvm-svn: 251900
2015-11-03 06:16:31 +00:00
Eric Christopher 99af5b2ea7 Handle target builtin options that are all required rather than
only one of a group of possibilities.

This changes the syntax in the builtin files to represent:

, as the and operator
| as the or operator

The former syntax matches how the backend tablegen files represent
multiple subtarget features being required.

Updated the builtin and intrinsic headers accordingly for the new
syntax.

llvm-svn: 251388
2015-10-27 06:11:03 +00:00
Andrea Di Biagio 8bb12d0a77 [x86] Fix maskload/store intrinsic definitions in avxintrin.h
According to the Intel documentation, the mask operand of a maskload and
maskstore intrinsics is always a vector of packed integer/long integer values.
This patch introduces the following two changes:
 1. It fixes the avx maskload/store intrinsic definitions in avxintrin.h.
 2. It changes BuiltinsX86.def to match the correct gcc definitions for avx
    maskload/store (see D13861 for more details).

Differential Revision: http://reviews.llvm.org/D13861

llvm-svn: 250816
2015-10-20 11:19:54 +00:00
Craig Topper e33f51fa91 [X86] Add fxsr feature name for fxsave/fxrestore builtins.
llvm-svn: 250498
2015-10-16 06:22:36 +00:00
Peter Collingbourne e919b0f9ad Headers: Switch some headers to LF line endings for consistency.
llvm-svn: 250388
2015-10-15 10:33:27 +00:00
Hans Wennborg 4ca00afd7c Intrin.h: implement __emul and __emulu
llvm-svn: 250301
2015-10-14 16:24:28 +00:00
Eric Christopher 525334cf6c Add subtarget feature support for 3dnowa to the 3dnowa intrinsics.
llvm-svn: 250202
2015-10-13 18:40:17 +00:00
Amjad Aboud 2b9b8a5921 [X86] Add XSAVE intrinsic family
Add intrinsics for the
  XSAVE instructions (XSAVE/XSAVE64/XRSTOR/XRSTOR64)
  XSAVEOPT instructions (XSAVEOPT/XSAVEOPT64)
  XSAVEC instructions (XSAVEC/XSAVEC64)
  XSAVES instructions (XSAVES/XSAVES64/XRSTORS/XRSTORS64)

Differential Revision: http://reviews.llvm.org/D13014

llvm-svn: 250158
2015-10-13 12:29:35 +00:00
Ahmed Bougacha 7dfaaf3891 [Headers][X86] Fix stream_load (movntdqa) to accept const*.
Per Intel intrinsics guide:
- _mm256_stream_load_si256 takes `__m256i const *'
- _mm_stream_load_si128 takes `__m128i *', for no good reason.

Let's accept const* for both.

llvm-svn: 249213
2015-10-02 23:29:26 +00:00