332 Commits

Author SHA1 Message Date
Jonathan Peyton 99ef4d0433 [ITTNOTIFY] Correct barrier imbalance time in case of tasks
ittnotify fix for barrier imbalance time when tasks exist. In the current
implementation, task execution time is included in the time aggregated on a
barrier. This fix measures the task execution time and corrects the arrive time
by subtracting the task execution time.

Because __kmp_invoke_task() can be called from places other than a barrier, the
field th.th_bar_arrive_time is used to check whether the function was called at
a barrier (th.th_bar_arrive_time != 0). To make this check work,
th_bar_arrive_time is set to zero right after its value is used at the barrier.
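
A minimal sketch of the correction described above, not the runtime's code. Only
the field name th_bar_arrive_time comes from the commit message; the struct, the
function names, and the explicit timestamp arguments are hypothetical. The sketch
assumes the correction is applied by moving the arrive timestamp forward by the
task's duration, which removes that duration from the measured wait.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for per-thread state; only the field name is
   taken from the commit message. */
typedef struct {
    uint64_t th_bar_arrive_time; /* 0 means "not currently at a barrier" */
} sketch_thread_t;

/* Record the time at which the thread reached the barrier. */
static void sketch_barrier_arrive(sketch_thread_t *th, uint64_t now) {
    assert(now != 0); /* 0 is reserved as the "not at a barrier" marker */
    th->th_bar_arrive_time = now;
}

/* Called after a task ran from task_start to task_end. Only when the
   thread is at a barrier is the arrive time shifted by the task duration. */
static void sketch_task_done(sketch_thread_t *th, uint64_t task_start,
                             uint64_t task_end) {
    if (th->th_bar_arrive_time != 0)
        th->th_bar_arrive_time += task_end - task_start;
}

/* Use the arrive time at the barrier, then reset it to zero so later
   task invocations outside a barrier are not corrected. */
static uint64_t sketch_barrier_wait_time(sketch_thread_t *th, uint64_t now) {
    uint64_t waited = now - th->th_bar_arrive_time;
    th->th_bar_arrive_time = 0;
    return waited;
}
```

For a thread that arrives at time 100, runs a task from 110 to 140, and leaves the
barrier at 200, the sketch reports 70 ticks of waiting instead of 100.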

Differential Revision: http://reviews.llvm.org/D19030

llvm-svn: 266332
2016-04-14 16:06:49 +00:00
Jonathan Peyton 377aa40d84 Exponential back off logic for test-and-set lock
This change adds back off logic in the test and set lock for better contended
lock performance. It uses a simple truncated binary exponential back off
function. The default back off parameters are tuned for x86.

The main back off logic has a two loop structure where each is controlled by a
user-level parameter:
max_backoff - limits the outer loop number of iterations.
    This parameter should be a power of 2.
min_ticks - the inner spin wait loop number of "ticks" which is system
    dependent and should be tuned for your system if you so choose.
    The "ticks" on x86 correspond to the time stamp counter,
    but on other architectures ticks is a timestamp derived
    from gettimeofday().

The user can modify these via the environment variable:
KMP_SPIN_BACKOFF_PARAMS=max_backoff[,min_ticks]
Currently, since the default user lock is a queuing lock,
one would have to also specify KMP_LOCK_KIND=tas to use the test-and-set locks.
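
A minimal sketch of a truncated binary exponential back off around a test-and-set
lock, under stated assumptions. The parameter names max_backoff and min_ticks come
from the commit message; every sketch_* name is hypothetical, and a counted busy
loop stands in for the time stamp counter or gettimeofday() based wait. It is an
illustration of the technique, not the runtime's implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

typedef struct {
    uint32_t step;        /* current number of outer-loop iterations */
    uint32_t max_backoff; /* truncation point; must be a power of 2 */
    uint32_t min_ticks;   /* length of one inner spin, in abstract ticks */
} sketch_backoff_t;

/* One back off episode: spin for step * min_ticks ticks, then grow step. */
static void sketch_spin_backoff(sketch_backoff_t *b) {
    assert(b->max_backoff != 0 && (b->max_backoff & (b->max_backoff - 1)) == 0);
    for (uint32_t i = b->step; i > 0; i--) {
        /* Inner loop: a counted spin stands in for a timestamp-based wait. */
        for (volatile uint32_t t = 0; t < b->min_ticks; t++) {
        }
    }
    /* Binary exponential growth (1, 3, 7, ...) truncated below max_backoff;
       the mask is why max_backoff should be a power of 2. */
    b->step = ((b->step << 1) | 1) & (b->max_backoff - 1);
}

/* Acquire: retry the test-and-set, backing off after every failure.
   params is passed by value so each acquire starts from the initial step. */
static void sketch_tas_acquire(atomic_flag *lock, sketch_backoff_t params) {
    while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
        sketch_spin_backoff(&params);
}

static void sketch_tas_release(atomic_flag *lock) {
    atomic_flag_clear_explicit(lock, memory_order_release);
}
```

With max_backoff set to 8, the step in this sketch grows 1, 3, 7 and then stays
at 7.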

Differential Revision: http://reviews.llvm.org/D19020

llvm-svn: 266329
2016-04-14 16:00:37 +00:00
Jonathan Peyton 2e379fc767 Add declarations of OpenMP 4.5 target/offload routines to headers
All these routines are implemented in the offload library.

llvm-svn: 266120
2016-04-12 20:37:18 +00:00
Jonathan Peyton 072772bf05 [STATS] Remove trailing whitespace in stats source files
llvm-svn: 265437
2016-04-05 18:48:48 +00:00
Jonathan Peyton 50e8f18b52 OMP_WAIT_POLICY changes
With this change, OMP_WAIT_POLICY=active means that threads busy-wait in spin
loops and virtually never go to sleep, and OMP_WAIT_POLICY=passive means that
threads go to sleep immediately inside a spin loop. KMP_BLOCKTIME was the
previous mechanism to specify this behavior, via KMP_BLOCKTIME=0 or
KMP_BLOCKTIME=infinite, but the standard OpenMP environment variable should
also be able to specify it.

Differential Revision: http://reviews.llvm.org/D18577

llvm-svn: 265339
2016-04-04 19:38:32 +00:00
Jonathan Peyton 1d46d979a9 Fix bug when KMP_USE_ADAPTIVE_LOCKS is 0
#endif was one line too low.  If KMP_USE_ADAPTIVE_LOCKS is 0,
then queuing locks would incorrectly use drdpa lock mechanism.
This is a fix for https://llvm.org/bugs/show_bug.cgi?id=26649

llvm-svn: 264934
2016-03-30 21:50:59 +00:00
Jonathan Peyton 4cfe93c599 Fix comment in kmp_wait_release.h
Removed reference to "ref ct" in a comment, as ref_ct no longer exists. Also
moved the comment to where the task_team is about to be tested if NULL.

llvm-svn: 264786
2016-03-29 21:08:29 +00:00
Jonathan Peyton ee2f96c79b Fix incorrect indentation in kmp_alloc.c
llvm-svn: 264777
2016-03-29 20:10:00 +00:00
Jonathan Peyton a58563d8c9 Remove dead KMP_USE_POOLED_ALLOC code
llvm-svn: 264776
2016-03-29 20:05:27 +00:00
Jonathan Peyton 316af8de48 [STATS] Missing check for MIC in config-ix.cmake
llvm-svn: 264616
2016-03-28 18:53:10 +00:00
Hal Finkel 01bb2406a3 Fixing the non-x86 build by removing dependence on kmp_cpuid_t
The problem is that the definition of kmp_cpuinfo_t contains:

  char       name [3*sizeof (kmp_cpuid_t)]; // CPUID(0x80000002,0x80000003,0x80000004)

and kmp_cpuid_t is only defined when compiling for x86.

Differential Revision: http://reviews.llvm.org/D18245

llvm-svn: 264535
2016-03-27 13:24:09 +00:00
Jonas Hahnfeld e46a494a50 [OMPT] Fix parallel_id and task_id in loop_end with schedule static
For serialized parallel regions, wrong ids were reported. Now the same code is
used as in kmp_dispatch.cpp which emits the correct ids.

Differential Revision: http://reviews.llvm.org/D18348

llvm-svn: 264266
2016-03-24 12:52:20 +00:00
Jonas Hahnfeld 801fe9bbe2 [OMPT] Test ids reported by ompt_get_{parallel,task}_id
llvm-svn: 264265
2016-03-24 12:52:11 +00:00
Jonas Hahnfeld 1c1c71776a [OMPT] Fix duplicate implicit_task_end events for master thread with GCC
For non-serialized parallel regions the master thread issued two callbacks:
The first one in kmp_gsupport.c and the second in __kmp_join_call. Therefore
only trigger the callback in kmp_gsupport.c for serialized parallel regions.

Differential Revision: http://reviews.llvm.org/D16716

llvm-svn: 264264
2016-03-24 12:52:04 +00:00
Jonathan Peyton b7d30cbc7e Fix Visual Studio builds
Have Visual Studio use MemoryBarrier() instead of _mm_mfence() and remove
__declspec align attribute from function parameters in kmp_atomic.h

llvm-svn: 264166
2016-03-23 16:27:25 +00:00
Jonas Hahnfeld b1cad2954b [OMPT] Make tests require OMPT_BLAME
ompt_event_barrier_{begin,end} are optional blame events.
Overall, it doesn't make sense to test partially built OMPT support.

llvm-svn: 264031
2016-03-22 08:23:24 +00:00
Jonas Hahnfeld c804301113 [OMPT] Create infrastructure and add first tests for OMPT
Some basic checks next to the implementation should further lower the
possibility of introducing regressions. (Note that this would have caught
the ordering issue fixed in rL258866 and pointed to rL263940.)

The tests are implementation dependent in one point because they assume that
thread ids are assigned in ascending order. This is not defined by the standard
but currently ensured in libomp. We have to think about another way of ordering
the threads should this ever be subject to change...

Note that this isn't aiming at replacing the implementation independent
test-suite at https://github.com/OpenMPToolsInterface/ompt-test-suite!

Differential Revision: http://reviews.llvm.org/D16715

llvm-svn: 264027
2016-03-22 07:22:49 +00:00
Jonathan Peyton 93a879ce78 [STATS] Add OMP_critical and OMP_critical_wait timers
OMP_critical - time spent in critical section
OMP_critical_wait - time spent waiting to enter a critical section

llvm-svn: 263967
2016-03-21 18:32:26 +00:00
Jonathan Peyton 97cbb42d90 [STATS] separate noTotal bit flag from onlyInMaster and noUnits
This change logically separates the stats_flags_e::noTotal bit flag from the
stats_flags_e::onlyInMaster and stats_flags_e::noUnits bit flags. If no
TOTAL_foo output is wanted for a particular statistic, the flag must be
explicitly included in that statistic's flags.
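
A minimal sketch of what independent bit flags look like after such a separation.
The flag names follow the commit message, but the bit values and the sketch_
helper are assumptions made for illustration only.

```c
#include <assert.h>
#include <stdbool.h>

/* Flag names follow the commit message; the bit values are assumed. */
enum sketch_stats_flags {
    SKETCH_onlyInMaster = 1 << 0,
    SKETCH_noUnits      = 1 << 1,
    SKETCH_noTotal      = 1 << 2
};

/* A TOTAL_foo line is suppressed only when noTotal itself is set;
   onlyInMaster and noUnits no longer imply it. */
static bool sketch_wants_total(unsigned flags) {
    return (flags & SKETCH_noTotal) == 0;
}
```

A statistic that should print neither units nor a total would then carry
SKETCH_noUnits | SKETCH_noTotal explicitly.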

Differential Revision: http://reviews.llvm.org/D18198

llvm-svn: 263954
2016-03-21 17:26:23 +00:00
Jonas Hahnfeld 6c250b714c [OMPT] Fix wrong parent_task_id in serialized parallel_begin with GCC
Without this patch a simple '#pragma omp parallel num_threads(1)' leads to
ompt_event_parallel_begin: parent_task_id=3, [...], parallel_id=2, [...]
ompt_event_parallel_end: parallel_id=2, task_id=4, [...]

Differential Revision: http://reviews.llvm.org/D16714

llvm-svn: 263940
2016-03-21 12:37:52 +00:00
Jonathan Peyton b5969ca42d Update www/index.html to reflect current status of OpenMP project
llvm-svn: 263788
2016-03-18 14:50:01 +00:00
Jonathan Peyton 8a46c067ed [CMake] Fix Windows build problem for CMake versions < 3.3
Building libomp using CMake versions < 3.3 caused a link time error.  These
errors occurred because when assembling z_Windows_NT-586_asm.asm, the
definitions: OMPT_SUPPORT, _M_AMD64|_M_IA32 weren't defined on the command line.
To fix the problem, the COMPILE_FLAGS property for the assembly file is appended
to instead of the COMPILE_DEFINITIONS property being set.  For whatever reason, the
COMPILE_DEFINITIONS property doesn't pick up the definitions for assembly files
for the older CMake versions.

llvm-svn: 263651
2016-03-16 18:44:18 +00:00
Jonathan Peyton 4240055ac8 Fix spelling error in comment
llvm-svn: 263586
2016-03-15 20:59:10 +00:00
Jonathan Peyton 20c1e4e69d [STATS] Print "Unknown" for frequency if it wasn't able to be parsed
llvm-svn: 263583
2016-03-15 20:55:32 +00:00
Jonathan Peyton 226dcd3243 [STATS] Fix comments in kmp_stats.h
llvm-svn: 263582
2016-03-15 20:49:01 +00:00
Jonathan Peyton 6e98d7988b [STATS] Add header information to stats print out
This change adds a header to the printout of the statistics which includes the
time, machine name, and processor info if available. This change also includes
some cosmetic changes like using enum casting for timer and counter iteration.

Differential Revision: http://reviews.llvm.org/D18153

llvm-svn: 263580
2016-03-15 20:28:47 +00:00
Samuel Antao 11e4c539f4 Initialize two variables in kmp_tasking.
Summary:
Two uninitialized local variables are causing clang to produce warnings:

```
./src/projects/openmp/runtime/src/kmp_tasking.c:3019:5: error: variable 'num_tasks' is used uninitialized whenever switch default is taken [-Werror,-Wsometimes-uninitialized]
    default:
    ^~~~~~~
./src/projects/openmp/runtime/src/kmp_tasking.c:3027:21: note: uninitialized use occurs here
    for( i = 0; i < num_tasks; ++i ) {
                    ^~~~~~~~~
./src/projects/openmp/runtime/src/kmp_tasking.c:2968:28: note: initialize the variable 'num_tasks' to silence this warning
    kmp_uint64 i, num_tasks, extras;
                           ^
                            = 0
./src/projects/openmp/runtime/src/kmp_tasking.c:3019:5: error: variable 'extras' is used uninitialized whenever switch default is taken [-Werror,-Wsometimes-uninitialized]
    default:
    ^~~~~~~
./src/projects/openmp/runtime/src/kmp_tasking.c:3022:52: note: uninitialized use occurs here
    KMP_DEBUG_ASSERT(tc == num_tasks * grainsize + extras);
                                                   ^~~~~~
./src/projects/openmp/runtime/src/kmp_debug.h:62:60: note: expanded from macro 'KMP_DEBUG_ASSERT'
        #define KMP_DEBUG_ASSERT( cond )       KMP_ASSERT( cond )
                                                           ^
./src/projects/openmp/runtime/src/kmp_debug.h:60:51: note: expanded from macro 'KMP_ASSERT'
        #define KMP_ASSERT( cond )             ( (cond) ? 0 : __kmp_debug_assert( #cond, __FILE__, __LINE__ ) )
                                                  ^
./src/projects/openmp/runtime/src/kmp_tasking.c:2968:36: note: initialize the variable 'extras' to silence this warning
    kmp_uint64 i, num_tasks, extras;
                                   ^
                                    = 0
2 errors generated.
```

This patch initializes these two variables.

Reviewers: tlwilmar, jlpeyton

Subscribers: tlwilmar, openmp-commits

Differential Revision: http://reviews.llvm.org/D17909

llvm-svn: 263316
2016-03-12 00:55:17 +00:00
Jonathan Peyton 495e153ff9 [STATS] change TASK_execution name to OMP_task
llvm-svn: 263291
2016-03-11 20:23:05 +00:00
Jonathan Peyton e2554af857 [STATS] Add a total statistics count
This change removes synthesized stats and instead has all timers print out a
total which is the aggregate statistics across threads. This is displayed as
"Total_foo" at the end of program. The stats_flags_e::synthesized flag is
removed and the printStats() function is split into two separate functions:
printTimerStats() which can display the aggregate total and printCounterStats().

Differential Revision: http://reviews.llvm.org/D17869

llvm-svn: 263290
2016-03-11 20:20:49 +00:00
Jonathan Peyton c1a7c97c1b [STATS] fix output formatting when sample count is 0
Force 0.0 to be displayed for all statistics which have sample count equal to 0

llvm-svn: 262658
2016-03-03 21:24:13 +00:00
Jonathan Peyton 30138256fa [STATS] fix master and single timers
Only the thread which executes the single/master section will update its statistics.

llvm-svn: 262656
2016-03-03 21:21:05 +00:00
Jonathan Peyton 283a215c7a Add new OpenMP 4.5 taskloop construct feature
From the standard: The taskloop construct specifies that the iterations of one
or more associated loops will be executed in parallel using OpenMP tasks. The
iterations are distributed across tasks created by the construct and scheduled
to be executed.

This initial implementation uses a simple linear task distribution algorithm.
Later we can add other algorithms to speed up generation of a huge number of
tasks (e.g., tree-like task generation should be faster).
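
A minimal sketch of a linear distribution, assuming the chunks are handed out one
after another and satisfy tc == num_tasks * grainsize + extras, the invariant
asserted in kmp_tasking.c as quoted elsewhere in this log. The sketch_* names and
the callback interface are hypothetical; a real runtime would create a task where
the callback is invoked.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical callback: receives the half-open range [lower, upper). */
typedef void (*sketch_task_fn)(uint64_t lower, uint64_t upper, void *arg);

/* Split tc iterations into num_tasks consecutive chunks. The first
   `extras` chunks get grainsize + 1 iterations, the rest get grainsize. */
static void sketch_taskloop_linear(uint64_t tc, uint64_t num_tasks,
                                   sketch_task_fn fn, void *arg) {
    assert(num_tasks > 0 && num_tasks <= tc);
    uint64_t grainsize = tc / num_tasks;
    uint64_t extras = tc % num_tasks;
    uint64_t lower = 0;
    for (uint64_t i = 0; i < num_tasks; ++i) {
        uint64_t size = grainsize + (i < extras ? 1 : 0);
        fn(lower, lower + size, arg); /* task creation would happen here */
        lower += size;
    }
    assert(lower == tc); /* every iteration was assigned exactly once */
}

/* Example callback: tallies the chunks it is given. */
typedef struct {
    uint64_t chunks, iters, largest;
} sketch_tally_t;

static void sketch_tally(uint64_t lower, uint64_t upper, void *arg) {
    sketch_tally_t *t = arg;
    t->chunks++;
    t->iters += upper - lower;
    if (upper - lower > t->largest)
        t->largest = upper - lower;
}
```

Splitting 10 iterations into 3 tasks this way yields chunks of 4, 3, and 3
iterations. The loop is O(num_tasks) on the generating thread, which is the cost
a tree-like scheme would aim to reduce.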

This needs to be put into the OpenMP runtime library in order for the
compiler team to develop the compiler side of the implementation.

Differential Revision: http://reviews.llvm.org/D17404

llvm-svn: 262535
2016-03-02 22:47:51 +00:00
Jonathan Peyton a0d7a2cd3f Forgot to add test files for doacross and task priority.
llvm-svn: 262533
2016-03-02 22:43:14 +00:00
Jonathan Peyton 71909c57ca Add new OpenMP 4.5 doacross loop nest feature
From the standard: A doacross loop nest is a loop nest that has cross-iteration
dependence. An iteration is dependent on one or more lexicographically earlier
iterations. The ordered clause parameter on a loop directive identifies the
loop(s) associated with the doacross loop nest.

The init/fini routines allocate/free doacross buffer(s) for each loop for each
thread.  The wait routine waits for a flag designated by the dependence vector.
The post routine sets the flag designated by the current iteration vector.  We
use a similar technique of shared buffer indices that covers up to 7 nowait
loops executed simultaneously by different threads (the number 7 has no real
meaning; it is just a heuristic value).  Also, the sizes of structures are kept
intact by shrinking dummy arrays.

This needs to be put into the OpenMP runtime library in order for the compiler
team to develop the compiler side of the implementation.

Differential Revision: http://reviews.llvm.org/D17399

llvm-svn: 262532
2016-03-02 22:42:06 +00:00
Jonathan Peyton 2f7c077b5a Add new OpenMP 4.5 affinity API
This change introduces the new OpenMP 4.5 affinity api surrounding
OpenMP Places. There are six new entry points:

Typically called in serial region:
 * omp_get_num_places - returns the number of places available to the execution
       environment in the place list.
 * omp_get_place_num_procs - returns the number of processors available to the
       execution environment in the specified place.
 * omp_get_place_proc_ids - returns the numerical identifiers of the processors
       available to the execution environment in the specified place.

Typically called inside parallel region:
 * omp_get_place_num - returns the place number of the place to which the
       encountering thread is bound.
 * omp_get_partition_num_places - returns the number of places in the place
       partition of the innermost implicit task.
 * omp_get_partition_place_nums - returns the list of place numbers
       corresponding to the places in the place-var ICV of the innermost
       implicit task.

Differential Revision: http://reviews.llvm.org/D17417

llvm-svn: 261915
2016-02-25 18:49:52 +00:00
Jonathan Peyton 2851072d69 Add initial support for OpenMP 4.5 task priority feature
The maximum task priority value is read from the environment variable
OMP_MAX_TASK_PRIORITY. As of now, nothing is done with it.  We just handle the
environment variable and add the new API omp_get_max_task_priority(), which
returns that value or zero if it is not set.
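
A minimal sketch of reading such a variable with standard C, not the runtime's
parser. OMP_MAX_TASK_PRIORITY is the name from the commit message; the sketch_*
functions are hypothetical, and returning zero for malformed or negative text is
an assumption of this sketch.

```c
#include <assert.h>
#include <limits.h>
#include <stdlib.h>

/* Parse the text of the variable. Unset, empty, malformed, negative, or
   out-of-range input yields 0 (the malformed cases are an assumption). */
static int sketch_parse_max_task_priority(const char *text) {
    if (text == NULL || *text == '\0')
        return 0;
    char *end = NULL;
    long v = strtol(text, &end, 10);
    if (*end != '\0' || v < 0 || v > INT_MAX)
        return 0;
    return (int)v;
}

/* Hypothetical counterpart of the query routine: value or zero if unset. */
static int sketch_get_max_task_priority(void) {
    return sketch_parse_max_task_priority(getenv("OMP_MAX_TASK_PRIORITY"));
}
```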

Differential Revision: http://reviews.llvm.org/D17411

llvm-svn: 261908
2016-02-25 18:04:09 +00:00
Jonathan Peyton ea0fe1dfeb Add new OpenMP 4.5 schedule clause modifiers (monotonic/non-monotonic) feature
The monotonic/non-monotonic flags are sent to the runtime via the sched_type by
setting the 30th (non-monotonic) or 29th (monotonic) bit in the sched_type.
Macros are added to probe if monotonic or non-monotonic is specified
(SCHEDULE_HAS_[NON]MONOTONIC & SCHEDULE_HAS_NO_MODIFIERS)
and also to get the base sched_type (SCHEDULE_WITHOUT_MODIFIERS).

Currently, nothing is done with the modifiers.
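
A minimal sketch of how modifier bits can be packed into and stripped from a
schedule value. It assumes the bit numbers in the message are zero-based shift
amounts (1 << 29 and 1 << 30). The SKETCH_* macros are hypothetical stand-ins
that mirror the roles of the macros named above, not their actual definitions.

```c
#include <assert.h>

/* Assumed bit positions: 29 for monotonic, 30 for non-monotonic. */
#define SKETCH_MOD_MONOTONIC    (1u << 29)
#define SKETCH_MOD_NONMONOTONIC (1u << 30)
#define SKETCH_MOD_MASK (SKETCH_MOD_MONOTONIC | SKETCH_MOD_NONMONOTONIC)

/* Probes for the individual modifiers and for their absence. */
#define SKETCH_HAS_MONOTONIC(s)    (((s) & SKETCH_MOD_MONOTONIC) != 0)
#define SKETCH_HAS_NONMONOTONIC(s) (((s) & SKETCH_MOD_NONMONOTONIC) != 0)
#define SKETCH_HAS_NO_MODIFIERS(s) (((s) & SKETCH_MOD_MASK) == 0)

/* Recover the base schedule by clearing both modifier bits. */
#define SKETCH_WITHOUT_MODIFIERS(s) ((s) & ~SKETCH_MOD_MASK)
```

Because the modifiers live in high bits, code that switches on the schedule kind
needs to strip them first, for example with SKETCH_WITHOUT_MODIFIERS(s).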

Also, this patch adds some comments on the use of the enumerations in at least
one place where it is subtle.

Differential Revision: http://reviews.llvm.org/D17406

llvm-svn: 261906
2016-02-25 17:55:50 +00:00
Jonathan Peyton 95c95c350e Remove unnecessary semicolons after braces
llvm-svn: 261249
2016-02-18 19:38:25 +00:00
Jonas Hahnfeld 867aa20b1e [OMPT] Frame information for openmp taskwait
For pragma omp taskwait the runtime is called from the task context.
Therefore, the reentry frame information should be updated.

The information should be available for both taskwait event calls; therefore,
set before the first event and reset after the last event.

Patch by Joachim Protze
Differential Revision: http://reviews.llvm.org/D17145

llvm-svn: 260674
2016-02-12 12:19:59 +00:00
Jonathan Peyton 134f90d59f Fix incorrect task_team in __kmp_give_task
When a target task finishes and it tries to access the th_task_team from the
threads in the team where it was created, th_task_team can be NULL or point to
a different place when that thread started a nested region that is still
running. Finding the exact task_team that the threads were using is difficult
because it would require unwinding the task_state_memo_stack. So a new field was
added to the taskdata structure to point to the task_team that was active when
the task was created.

llvm-svn: 260615
2016-02-11 23:07:30 +00:00
Jonathan Peyton ff684e4b9e Fix a couple of typos in comments
llvm-svn: 260613
2016-02-11 22:58:29 +00:00
Jonathan Peyton d3f2b94d97 Proxy task fix: task_state stack push condition on fork
The problem is that the master's thread state was not saved before entering a
parallel region so it does not remember tasks when it returns.

llvm-svn: 260306
2016-02-09 22:32:41 +00:00
Jonathan Peyton 89d9b333b0 Have Mac builds use @rpath when supported in CMake
The -install_name linker flag will use "@rpath/" when supported in CMake
which is the recommended usage for dynamic libraries on Mac OSX.

llvm-svn: 260300
2016-02-09 22:15:30 +00:00
Jonas Hahnfeld 9dffeff894 [GCC] GOMP_task: Change argument type of if_cond from int to bool
(libgomp has bool as well)

This was causing a test failure in omp_test_if.c when building with GCC in
Debug mode. I have verified that GCC versions 4.9.2 and 5.3.0 now work and
compile-tested this change with clang 3.7.1 and Intel Compiler 16.0.

Differential Revision: http://reviews.llvm.org/D16921

llvm-svn: 260204
2016-02-09 07:07:30 +00:00
Jonas Hahnfeld 66594990b1 [CMake] Introduce OPENMP_LLVM_TOOLS_DIR
This will be used in a later patch to find additional LLVM tools for tests and
enables reusability for libomptarget that is currently under review.

Differential Revision: http://reviews.llvm.org/D16713

llvm-svn: 259876
2016-02-05 07:00:13 +00:00
Jonathan Peyton fd74f90072 Add LIBOMP_ENABLE_SHARED option for CMake
When building executables for Cray supercomputers, statically-linked executables
are preferred. This patch makes it possible to build the OpenMP runtime as an
archive for building statically-linked executables.  The patch adds the flag
LIBOMP_ENABLE_SHARED, which defaults to true. When true, a build of the OpenMP
runtime yields dynamic libraries. When false, a build of the OpenMP runtime
yields static libraries. There is no setting that allows both kinds of libraries
to be built.

Patch by John Mellor-Crummey

Differential Revision: http://reviews.llvm.org/D16525

llvm-svn: 259817
2016-02-04 19:29:35 +00:00
Jonathan Peyton 7d45451a0d Fix task dependency performance problem
In: http://lists.llvm.org/pipermail/openmp-dev/2015-August/000858.html, a
performance issue was found with libomp's task dependencies.  The task
dependencies hash table has an issue with collisions. The current table size is
a power of two. This combined with the current hash function causes a large
number of collisions to occur. Also, the current size (64) is too small for
larger applications so the table size is increased.

This patch creates a two level hash table approach for task dependencies. The
implicit task is considered the "master" or "top-level" task which has a large
static sized hash table (997), and nested tasks will have smaller hash
tables (97). Prime numbers were chosen to help reduce collisions.
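
A minimal sketch of why a prime table size can help with regularly spaced
addresses. The hash below is a deliberately simple stand-in and is not libomp's
hash function; all sketch_* names, the stride, and the base address are
hypothetical. Only the table sizes 64, 97, and 997 come from the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in hash: drop the low alignment bits, reduce modulo the table size. */
static size_t sketch_hash(uintptr_t addr, size_t table_size) {
    return (size_t)((addr >> 3) % table_size);
}

/* Count the distinct buckets hit by `count` addresses that are `stride`
   bytes apart, starting at `base`. */
static size_t sketch_buckets_used(uintptr_t base, size_t stride, size_t count,
                                  size_t table_size) {
    unsigned char used[1024] = {0};
    size_t distinct = 0;
    assert(table_size > 0 && table_size <= sizeof used);
    for (size_t i = 0; i < count; ++i) {
        size_t h = sketch_hash(base + i * stride, table_size);
        if (!used[h]) {
            used[h] = 1;
            ++distinct;
        }
    }
    return distinct;
}
```

With 64 addresses spaced 256 bytes apart, this stand-in hash uses only 2 buckets
of a 64-entry table, because the stride shares factors with the table size. With
97 or 997 entries, all 64 addresses land in different buckets.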

Differential Revision: http://reviews.llvm.org/D16640

llvm-svn: 259113
2016-01-28 23:10:44 +00:00
Jonas Hahnfeld 39b6862482 [OMPT] Add support for ompt_event_task_dependences and ompt_event_task_dependence_pair
The attached patch adds support for ompt_event_task_dependences and
ompt_event_task_dependence_pair events from the OMPT specification [1]. These
events only apply to OpenMP 4.0 and 4.1 (aka 4.5) because task dependencies
were introduced in 4.0.

With respect to the changes:

ompt_event_task_dependences
According to the specification, this event is raised after the task has been
created, therefore this event needs to be raised after ompt_event_task_begin
(in __kmp_task_start). However, the dependencies are known at
__kmpc_omp_task_with_deps, which occurs before __kmp_task_start. My
modifications extend the ompt_task_info_t struct in order to store the
dependencies of the task when __kmpc_omp_task_with_deps occurs; they are then
emitted in __kmp_task_start just after raising ompt_event_task_begin. The deps
field is allocated and valid until the event is raised, and it is freed and set
to null afterwards.

ompt_event_task_dependence_pair
The processing of the dependences (i.e. checking whether a dependence is
already satisfied) is done within __kmp_process_deps. That function checks
every dependence and calls the __kmp_track_dependence routine, which gives some
support for graphical output. I used that routine to emit the dependence pair,
but I also needed to know the sink_task. Although the code within
KMP_SUPPORT_GRAPH_OUTPUT refers to task_sink, it may be null because of
sink->dn.task (there's a comment regarding this). In fact, it does not hold a
proper pointer value, because the value is set by node->dn.task = task; only
after the __kmp_process_deps calls in __kmp_check_deps. I have extended the
__kmp_process_deps and __kmp_track_dependence parameter lists to receive the
sink_task.

[1] https://github.com/OpenMPToolsInterface/OMPT-Technical-Report/blob/target/ompt-tr.pdf

Patch by Harald Servat
Differential Revision: http://reviews.llvm.org/D14746

llvm-svn: 259038
2016-01-28 10:39:52 +00:00
Jonas Hahnfeld dbf627dbd4 [OMPT] Avoid SEGV when a worker thread needs its parallel id behind the barrier
When the code behind the barrier is executed, the master thread may have
already resumed execution. That's why we cannot safely assume that *pteam
is not yet freed.

This has been introduced by r258866.

llvm-svn: 259037
2016-01-28 10:39:45 +00:00
Jonas Hahnfeld bba248c368 [OMPT] Workaround clang failing with 'declare target'
Current clang trunk reports _OPENMP to be 201307 = OpenMP 4.0. It doesn't
recognize '#pragma omp declare target' though (patch still pending) and
therefore fails compilation.

Differential Revision: http://reviews.llvm.org/D16631

llvm-svn: 259026
2016-01-28 07:14:44 +00:00