Commit Graph

17 Commits

Author SHA1 Message Date
Sebastian Pop 9070500633 improve performance of string::find
string::find used to call the generic algorithm ::find.  The patch special
case string::find such that it ultimately gets converted to calls to memchr
and memcmp.

The patch improves the performance of the string::find routine by about 20x.

Without the patch, the performance on an x86_64-linux 3400 MHz machine is:

Benchmark                           Time           CPU Iterations
-----------------------------------------------------------------
BM_StringFindNoMatch/10             4 ns          4 ns  166421326
BM_StringFindNoMatch/64            37 ns         37 ns   18754392
BM_StringFindNoMatch/512          268 ns        268 ns    2586060
BM_StringFindNoMatch/4k          2143 ns       2144 ns     328342
BM_StringFindNoMatch/32k        16910 ns      16917 ns      40623
BM_StringFindNoMatch/128k       67577 ns      67602 ns      10138
BM_StringFindAllMatch/1             3 ns          3 ns  265163471
BM_StringFindAllMatch/8             6 ns          6 ns  112582467
BM_StringFindAllMatch/64           36 ns         36 ns   19566457
BM_StringFindAllMatch/512         209 ns        209 ns    3318893
BM_StringFindAllMatch/4k         1618 ns       1618 ns     432963
BM_StringFindAllMatch/32k       12909 ns      12914 ns      54317
BM_StringFindAllMatch/128k      48342 ns      48361 ns      13922
BM_StringFindMatch1/1           33777 ns      33790 ns      20698
BM_StringFindMatch1/8           33940 ns      33953 ns      20619
BM_StringFindMatch1/64          34038 ns      34051 ns      20571
BM_StringFindMatch1/512         34217 ns      34230 ns      20480
BM_StringFindMatch1/4k          35510 ns      35524 ns      19752
BM_StringFindMatch1/32k         46438 ns      46456 ns      15030
BM_StringFindMatch2/1           33839 ns      33852 ns      20648
BM_StringFindMatch2/8           33950 ns      33963 ns      20594
BM_StringFindMatch2/64          33846 ns      33859 ns      20668
BM_StringFindMatch2/512         34023 ns      34036 ns      20279
BM_StringFindMatch2/4k          35422 ns      35436 ns      19716
BM_StringFindMatch2/32k         46570 ns      46588 ns      15027

With the patch applied

Benchmark                           Time           CPU Iterations
-----------------------------------------------------------------
BM_StringFindNoMatch/10             5 ns          5 ns  133724346
BM_StringFindNoMatch/64             6 ns          6 ns  119312184
BM_StringFindNoMatch/512           13 ns         13 ns   51539628
BM_StringFindNoMatch/4k            77 ns         77 ns    8935934
BM_StringFindNoMatch/32k          551 ns        551 ns    1222808
BM_StringFindNoMatch/128k        2684 ns       2685 ns     259957
BM_StringFindAllMatch/1             7 ns          7 ns   98017959
BM_StringFindAllMatch/8             7 ns          7 ns   91466911
BM_StringFindAllMatch/64            8 ns          8 ns   85707392
BM_StringFindAllMatch/512          20 ns         20 ns   34490895
BM_StringFindAllMatch/4k           93 ns         93 ns    7360375
BM_StringFindAllMatch/32k         827 ns        828 ns     829944
BM_StringFindAllMatch/128k       3593 ns       3594 ns     195815
BM_StringFindMatch1/1            1332 ns       1332 ns     516354
BM_StringFindMatch1/8            1336 ns       1336 ns     495876
BM_StringFindMatch1/64           1338 ns       1339 ns     516656
BM_StringFindMatch1/512          1357 ns       1357 ns     510717
BM_StringFindMatch1/4k           1485 ns       1486 ns     461228
BM_StringFindMatch1/32k          2235 ns       2236 ns     318253
BM_StringFindMatch2/1            1335 ns       1335 ns     517105
BM_StringFindMatch2/8            1336 ns       1337 ns     518004
BM_StringFindMatch2/64           1344 ns       1345 ns     511751
BM_StringFindMatch2/512          1361 ns       1361 ns     508150
BM_StringFindMatch2/4k           1611 ns       1611 ns     463388
BM_StringFindMatch2/32k          2187 ns       2187 ns     317532

Patch written by Aditya Kumar and Sebastian Pop.

Differential Revision: https://reviews.llvm.org/D27068

llvm-svn: 290761
2016-12-30 18:01:36 +00:00
Eric Fiselier a55333003d Optimize filesystem::path by providing weaker exception guarantees.
path uses string::append to construct, append, and concatenate paths. Unfortunatly
string::append has a strong exception safety guaranteed and if it can't prove
that the iterator operations don't throw then it will allocate a temporary
string copy to append to. However this extra allocation and copy is very
undesirable for path which doesn't have the same exception guarantees.

To work around this this patch adds string::__append_forward_unsafe which exposes
the std::string::append interface for forward iterators without enforcing
that the iterator is noexcept.

llvm-svn: 285532
2016-10-31 02:46:25 +00:00
Eric Fiselier ef915d3ef4 Improve performance of constructing filesystem::path from strings.
This patch fixes a performance bug when constructing or appending to a path
from a string or c-string. Previously we called 'push_back' to append every
single character. This caused multiple re-allocation and copies when at most
one reallocation is necessary. The new behavior is to simply call
`string::append` so it can correctly handle reallocation.

For large strings this change is a ~4x improvement. This also makes our path
faster to construct than libstdc++'s.

llvm-svn: 285530
2016-10-30 23:53:50 +00:00
Eric Fiselier 1467a197e5 Rewrite std::filesystem::path iterators and parser
This patch entirely rewrites the parsing logic for paths. Unlike the previous
implementation this one stores information about the current state; For example
if we are in a trailing separator or a root separator. This avoids the need for
extra lookahead (and extra work) when incrementing or decrementing an iterator.
Roughly this gives us a 15% speedup over the previous implementation.

Unfortunately this implementation is still a lot slower than libstdc++'s.
Because libstdc++ pre-parses and splits the path upon construction their
iterators are trivial to increment/decrement. This makes libc++ lazy parsing
100x slower than libstdc++. However the pre-parsing libstdc++ causes a ton
of extra and unneeded allocations when constructing the string. For example
`path("/foo/bar/")` would require at least 5 allocations with libstdc++
whereas libc++ uses only one. The non-allocating behavior is much preferable
when you consider filesystem usages like 'exists("/foo/bar/")'.

Even then libc++'s path seems to be twice as slow to simply construct compared
to libstdc++. More investigation is needed about this.

llvm-svn: 285526
2016-10-30 23:30:38 +00:00
Eric Fiselier 3aa5478e21 Add start of filesystem benchmarks
llvm-svn: 285524
2016-10-30 22:53:00 +00:00
Sebastian Pop 622c26389f remove warnings from google-benchmarks in libcxx
Differential Revision: https://reviews.llvm.org/D25522

Patch written by Aditya Kumar.

llvm-svn: 284179
2016-10-14 00:07:57 +00:00
Eric Fiselier ed84f4abd4 Cleanup CMake status output
llvm-svn: 283721
2016-10-10 06:31:00 +00:00
Eric Fiselier e9e3377141 Improve CMake output when registering benchmarks
llvm-svn: 280771
2016-09-07 00:57:26 +00:00
Eric Fiselier 8b4a30584a Turn On -DLIBCXX_ENABLE_BENCHMARKS by default.
This patch enables the `cxx-benchmarks` target by default. Note that the target
still has to be manually invoked since it isn't included in the default 'make'
rule.

This patch also gets the benchmarks building w/ GCC. The build previously
required the '-stdlib=libc++' flag but upstream patches to Google Benchmark
now allow the library to build w/ libc++ and GCC.

These changes should make the benchmarks easier to build and test.

llvm-svn: 279999
2016-08-29 19:50:49 +00:00
Eric Fiselier f6e09e537b Update in-tree Google Benchmark to current ToT.
I've put some work into the Google Benchmark library in order to make it easier
to benchmark libc++. These changes have already been upstreamed into
Google Benchmark and this patch applies the changes to the in-tree version.

The main improvement in the addition of a 'compare_bench.py' script which
makes it very easy to compare benchmarks. For example to compare the native
STL to libc++ you would run:

`$ compare_bench.py ./util_smartptr.native.out ./util_smartptr.libcxx.out`

And the output would look like:

RUNNING: ./util_smartptr.native.out
Benchmark                          Time           CPU Iterations
----------------------------------------------------------------
BM_SharedPtrCreateDestroy         62 ns         62 ns   10937500
BM_SharedPtrIncDecRef             31 ns         31 ns   23972603
BM_WeakPtrIncDecRef               28 ns         28 ns   23648649
RUNNING: ./util_smartptr.libcxx.out
Benchmark                          Time           CPU Iterations
----------------------------------------------------------------
BM_SharedPtrCreateDestroy         46 ns         46 ns   14957265
BM_SharedPtrIncDecRef             31 ns         31 ns   22435897
BM_WeakPtrIncDecRef               34 ns         34 ns   21084337
Comparing ./util_smartptr.native.out to ./util_smartptr.libcxx.out
Benchmark                          Time           CPU
-----------------------------------------------------
BM_SharedPtrCreateDestroy         -0.26         -0.26
BM_SharedPtrIncDecRef             +0.00         +0.00
BM_WeakPtrIncDecRef               +0.21         +0.21

llvm-svn: 278147
2016-08-09 18:56:48 +00:00
Eric Fiselier 84c557ad3e Pass compilers when configuring Google Benchmark.
llvm-svn: 277512
2016-08-02 20:21:07 +00:00
Ben Craig adb6d28b0f Adding smart_ptr benchmark
Initial draft here:
https://reviews.llvm.org/D22470
... though this is Eric Fiselier's rewrite to fit in with Google
Benchmark.

llvm-svn: 277373
2016-08-01 19:56:39 +00:00
Eric Fiselier 0e100998fc Start adding benchmarks for vector
llvm-svn: 276552
2016-07-24 06:51:55 +00:00
Eric Fiselier 1a06fe5f7e Skip chash computation in insert/emplace if the unconstrained hash matches.
llvm-svn: 276549
2016-07-24 06:22:25 +00:00
Eric Fiselier b08d8b189c [libcxx] Add support for benchmark tests using Google Benchmark.
Summary:
This patch does the following:

1. Checks in a copy of the Google Benchmark library into the libc++ repo under `utils/google-benchmark`.
2. Teaches libc++ how to build Google Benchmark against both (A) in-tree libc++ and (B) the platforms native STL.
3. Allows performance benchmarks to be built as part of the libc++ build.

Building the benchmarks (and Google Benchmark) is off by default. It must be enabled using the CMake option `-DLIBCXX_INCLUDE_BENCHMARKS=ON`. When this option is enabled the tests under `libcxx/benchmarks`  can be built using the `libcxx-benchmarks` target.

On Linux platforms where libstdc++ is the default STL the CMake option `-DLIBCXX_BUILD_BENCHMARKS_NATIVE_STDLIB=ON` can be used to build each benchmark test against libstdc++ as well. This is useful for comparing performance between standard libraries.

Support for benchmarks is currently very minimal. They must be manually run by the user and there is no mechanism for detecting performance regressions.

Known Issues:

* `-DLIBCXX_INCLUDE_BENCHMARKS=ON` is only supported for Clang, and not GCC, since the `-stdlib=libc++` option is needed to build Google Benchmark.








Reviewers: danalbert, dberlin, chandlerc, mclow.lists, jroelofs

Subscribers: chandlerc, dberlin, tberghammer, danalbert, srhines, hfinkel

Differential Revision: https://reviews.llvm.org/D22240

llvm-svn: 276049
2016-07-19 23:07:03 +00:00
Eric Fiselier c835c12f6c Add unordered_map::insert benchmark test and rename file
llvm-svn: 274424
2016-07-02 05:30:54 +00:00
Eric Fiselier f977598bb6 Improve performance of unordered_set<uint32_t>::find by 45%. Add benchmarks.
This patch improves the performance of unordered_set's find by 45% when
the value exists within the set. __hash_tables find method
needs to check if it's reached the end of the bucket by constraining the
hash of the current node and checking it against the bucket index. However
constraining the hash is an expensive operations and it can be avoided if the
two unconstrained hashes are equal. This patch applies that optimization.

This patch also adds a top level directory called benchmarks. 'benchmarks/'
is intended to store any/all benchmarks written for the standard library.
Currently nothing is done with files under 'benchmarks/' but I would like
to move towards introducing a formal format and test runner.

llvm-svn: 274423
2016-07-02 05:19:59 +00:00