Motivation:
Fixes the bug described in #1279
Modifications:
Moved state change in NIOHTTPClientUpgradeHandler.performUpgrade to after firing buffered messages.
Result:
- The upgradeState will now only change to upgradeComplete after buffered reads are handeld.
- Fixes#1279
* Add NIOSingleStepByteToMessageDecoder and NIOSingleStepByteToMessageProcessor
for processing byte streams outside of the channel pipeline.
Motivation:
ByteToMessageHandler works well when you can embed it in your pipeline, but if
you have a wrapped protocol (e.g. sending a protocol over HTTP), you can't use
it without losing the context of the outer protocol (e.g. HTTP headers/trailers).
Modifications:
Added NIOSingleStepByteToMessageDecoder and NIOSingleStepByteToMessageProcessor.
Result:
New functionality above.
Motivation:
The code-of-conduct email address is out-of-date.
Modifications:
Update code-of-conduct email address to swift-server-conduct@group.apple.com
Result:
Motivation:
The tooling around the allocation counter tests is pretty opaque and
also not easy to use.
Modifications:
Document what interested folks should know.
Result:
More people will be able to debug allocations in SwiftNIO.
Co-Authored-By: Cory Benfield <lukasa@apple.com>
Motivation:
So far we've been checking the heap property of our heaps _twice per
heap operation_. I think that's a bit much.
Modifications:
Remove the heap operation checks.
Result:
Runs much faster in debug mode.
Motivation:
The existing Atomic class holds an UnsafeEmbeddedAtomic which holds an OpaquePointer to raw memory for the C atomic. This results in multiple allocations on init. The new NIOAtomic class uses ManagedBufferPointer which tail allocates memory for the C atomic as part of the NIOAtomic class allocation.
Modifications:
Created NIOAtomic class that uses ManagedBufferPointer to allocate memory for the C atomic. Created version of catmc_atomic_* C functions that expect memory to be allocated/deallocated by caller. Created NIOAtomicPrimitive protocol for the new version of catmc_atomic_* functions.
Result:
Purely additive. NIOAtomic is available as a replacement for Atomic which results in fewer memory allocations.
Motivation:
In some cases it may be possible to write the (usually small) web socket
frame header to the buffer provided by the user. If we do this we can
avoid an extra pipeline traversal for the small write. This is only
going to be a performance win in cases where we can avoid a
copy-on-write operation on the buffer, but if we can then it's a useful
small win to achieve.
Modifications:
- Refactored the WebSocketFrameEncoder to potentially prepend the frame
header.
- Added a missing @inlinable attribute.
- Tests.
Result:
Potentially improved performance.
Motivation:
In Swift, writing
var something: T?
init() {
self.something = someValue
}
means that the compiler will first set `self.something` to `nil` and
then in the init override it with `self.someValue`
(https://bugs.swift.org/browse/SR-11777). Unfortunately, because of
https://bugs.swift.org/browse/SR-11768 , stored property initialisation
cannot be made `@inlinable` (short of using `@frozen` which isn't
available in Swift 5.0).
The combination of SR-11768 and SR-11777 leads to `var something: T?`
having much worse code than `var something: Optional<T>` iff the `init`
is `public` and `@inlinable`.
Modifications:
Change all `var something: T?` to `var something: Optional<T>`
Result:
Faster code, sad NIO developers.
Motivation:
We should join the threads that we spawned which has the extra benefit
of TSan being able to detect our thread leaks.
Modifications:
Join the SelectableEventLoop's thread after it's shut down.
Result:
We join all the threads.
Motivation:
It is useful/interesting to have an overview about total bytes
allocated, not only the number of malloc calls.
This is not used to fail tests though, more as an informational thing.
Modifications:
Introduce and hit new counter with allocated size whenever allocations
happen.
Result:
It is possible to inspect the total allocated bytes.
Motivation:
It's important to also test deprecated functionliaty. One way of
achieving this without warnings is to also deprecate the tests that test
this deprecated functionality. Unfortunately, on Linux we need to
generate lists of tests which would then reference deprecated tests
(which gives us a warning).
Modifications:
Deprecate test suites and the main test runner all the way to the top so
never get warnings.
Result:
Possible to test deprecated functionlity without warnings.
Motivation:
ByteBufferView isn't a mutable collection, but it probably should be.
Modifications:
- Add `copyBytes(at:to:length)` to `ByteBuffer` to copy bytes from a
readable region of a buffer to another part of the buffer
- Conform `ByteBufferView` to `MutableCollection`, `RangeReplaceableCollection`
and `MutableDataProtocol`
- Add an allocation counting test
Result:
`ByteBufferView` is now mutable.
Motivation:
Jazzy generates Dash docsets for us, but currently they don't work. This
is because the URLs they generate aren't matching the URL scheme we
have. That makes them next to useless.
Modifications:
Add a trailing '/' to the root URL.
Result:
Calls to Ruby's URI.join won't blow away the final path component in the
root URL, which means that the docsets will have correct URLs going
forward.
Motivation:
`public extension`s are very dangerous because the make it so easy to
slip in public API by accident.
Modifications:
Don't use `public extension`.
Result:
Fewer chances to accidentally make stuff public.
Motivation:
The ExpressibleBy*Literal conformances require `@usableFromInline` inits
which make it harder to make NIO compatible with library evolution
because we can no longer just remove all `@inlinable`s and call it a
day.
Modifications:
Remove the ExpressibleBy*Literal conformances for internal types.
Result:
NIO easy to convert into a library evolution mode.
Motivation:
For better codegen for TimeAmount struct.
After this change overflow will not be checked at each multiplication stage.
Modifications:
- Added parentheses in static constructors of `TimeAmount`
Result:
No change logically except better codegen
Motivation:
make-single-file-spm has a weird bug where it would chop off trailing
`n` characters in all lines due to a badly set `IFS` bash variable.
Modifications:
Set `IFS=""` which is correct.
Result:
make-single-file-spm will not chop off trailing `n`s.
Motivation:
Using the single test argument with the run-nio-alloc-counter-tests.sh
does not work as expected as the given test would just replace the first
test in the list.
Modifications:
Replace the tests to run with a single-element array containing the test
provided as an argument.
Result:
- A single test can be run when specified as a command line arg.
Motivation:
The HTTP1TestServer forgot to actually close the accepted channel on
`stop`.
Modifications:
Close the accepted Channel on stop.
Result:
Fewer bugs.
Motivation:
Very new Swift compilers warn about generating empty OptionSets when
using `rawValue: 0`.
Modifications:
Replace the deliberately empty OptionSets by `[]`.
Result:
No warnings.
Motivation:
We should print the same metric that we compare with.
Modifications:
Switch the printed metric to the one selected in `::METRIC`.
Result:
More correct printouts.
Motivation:
There were at least 2 tests in our test suite that were racing EventLoop
shutdown.
Modifications:
Don't race EL shutdown.
Result:
More reliable test suite.
Motivation:
make-single-file-spm is a simple bash script that can turn a single file
into a SwiftPM project with multiple modules (all defined in a single
file with special markers such as `// MODULE: Foo`. This is mostly
useful for performance evaluation of Swift becuase in almost all cases,
you'll need multiple modules which is quite a bit of work to set up.
Modifications:
Add make-single-file-spm
Result:
More useful scripts in the NIO repo.
Motivation:
It's faster and contiguous storage is trivially available
Modifications:
- Implement WebSocketMaskingKey.withContiguousStorageIfAvailable
- Make it @inlinable, requiring promoting WebSocketMaskingKey.key to internal
so renamed to _key.
Result:
Writing WebSocketMaskingKey into ByteBuffer is about 2x faster.
Motivation:
The Swift compiler actively tries to make our live hard and puts all
initialisations of variables that are initialised using the shorthand
syntax
@usableFromInline var myVar: MyVarType = .init()
into a _non-inlinable_ function
variable initialization expression of MyModule.MyType.myVar : MyVarType
This means that the following pattern
public struct MyType {
@usableFromInline var myVar: MyVarType = .init()
@inlinable public init() {
}
}
leads to code for the `init` that looks like
CALL variable_initialization_expression_...
instead of also inlining the the code for the variable initialiser.
Modifications:
Don't use `@usableFromInline var myVar: MyVarType = ...` anywhere,
instead put it explicitly into the @inlinable `init`s which makes
actually inlinable.
I even removed that syntax from places where it doesn't matter
(eg. non-@inlinable inits) for consistency and in case someone were to
make the init inlinable later.
Result:
Minor performance wins working around https://bugs.swift.org/browse/SR-11768
Motivation:
Zero EventLoop threads makes no sense and leads to a crash (see #1223)
Modifications:
Add a precondition
Result:
Initializing EventLoop with zero threads will fail precondition rather than crash later.
Motivation:
We want to have some extra WebSocket decoder benchmarks, for
a WebSocketFrame with a maskingKey, a WebSocketFrame with a 16-bit
length and a WebSocketFrame with a 64-bit length
Modifications:
Changed WebSocketFrameDecoderBenchmark to have ability for dynamic
WebSocketFrame length through the initializer parameter dataSize.
Added ability for WebSocketFrameDecoderBenchmark to benchmark
a WebSocketFrame with a maskingKey by passing withMaskKey parameter to
the initializer.
Added new benchmarks in the main for WebSocketFrame decoding with a
16-bit length, WebSocketFrame decoding with a 64-bit length and with
a maskingKey.
Result:
Better WebSocket decoder benchmarks
Motivation:
When calling ByteBuffer setBytes/writeBytes with CircularBuffer<Uint8> we take the slow path. Before we make any changes we want a baseline for current performance.
Modifications:
- This gives a very significant performance increase for CircularBuffer
and likely other types that use the slow path.
- Adds ByteBuferBenchmark.swift and tests circular_buffer_into_byte_buffer_1kb, circular_buffer_into_byte_buffer_1mb.
- Make _setSlowPath @inlinable
- Implement == for CircularBuffer.Index to make it @inlinable
Result:
- Extra performance tests
- Improve performance
Motivation:
Docs give unexpected results since behavior was changed in NIO 2
to only see readable bytes. Fixes#1232.
Modifications:
Documentation updated.
Result:
Documentation gives expected results with no errors.
Motivation:
Certain `#if os` uses in SwiftNIO are unnecessary, they should be
removed.
Modifications:
Removed unnecessary `#if os` in SwiftNIO.
Result:
Nicer codebase.
Motivation:
Previously, NIOPerformanceTester would use Foundation.Date twice to
calculate the time spend in a given function. This is less precise as it
can be but more importantly, it's not resilient against the user
changing the clock or the computer going to sleep.
Modifications:
Use DispatchTime.now().uptimeNanoseconds
Result:
More precise and resilient benchmark measurements.
Motivation:
Every EventLoopFuture was allocating an atomic boolean (which needs to
heap allocate because Swift) literally just to make the leakage checker
work. But the leakage checker is only active in debug mode, the
allocation however was _always_ done.
Modifications:
Remove the allocation in release mode (by making the atomic optional).
Result:
Fewer allocations
Motivation:
ByteBuffer.readJSONDecodable just ignored the JSONDecoder that got
passed in.
Modifications:
Pass through the JSONDecoder that was specified.
Result:
Respect the user's choices.
Motivation:
WebSocketFrameDecoder makes an allocation due to WebSocketFrame not fitting in an
existential. We should reduce this overhead.
Modifications:
Moved data and extensionData into a cow-box and reorder fields to more efficiently
pack them.
Result:
WebSocketFrame is now 14 bytes, down from 55.
* Allow single test to be specified as an argument to run-nio-alloc-counter-tests.sh.
Motivation:
run-nio-alloc-counter-tests.sh currently runs all tests. When writing or
debugging a test, you commonly want to run a single test. At the moment the
common practice is to edit the script to hardcode a test to run, which is
annoying and error-prone.
Modifications:
Add an optional argument to the script to specify the test to run.
Result:
$ ./run-nio-alloc-counter-tests.sh
runs all tests.
$ ./run-nio-alloc-counter-tests.sh test_decode_1000_ws_frames.swift
runs only test_decode_1000_ws_frames.swift.
Motivation:
JSONDecoder settings should be respected but were not.
Modifications:
This allows actually configuring the `JSONDecoder` instead of being stuck with the defaults.
Result:
Settings respected.
Motivation:
In complex ByteBuffer parsing code a surprising amount of cost comes
from retain/release operations that are emitted by the compiler around
calls to readerIndex/writerIndex, readableBytes/writableBytes, and
ByteBufferSlice._lowerBound. These are all straightforward computed
properties based on usableFromInline operations, and so they can be made
inlinable.
Modifications:
Joannis' sample benchmark runtime is halved by eliminating
retains/releases.
Result:
Better, faster parsing code.
Motivation:
Certain POSIX error numbers mean the NIO program is broken beyond repair and should crash. One such value is EBADF, which indicates that the file descriptor that NIO passed to the kernel is invalid for some reason. SR-11557 was discovered in part because debug builds of NIO crash for this.
I have promoted these checks from an assertion to a precondition adding a function preconditionIsNotBlacklistedErrno. Issue was raised in issue #1156
Modifications:
- Add function preconditionIsNotBlacklistedErrno to enable option to stop program execution in both debug mode and release mode if the errno is blacklisted.
- Update assertIsNotBlacklistedErrno to all preconditionIsNotBlacklistedErrno when the program is in debug mode
- Update wrapSyscallMayBlock, wrapSyscall, wrapErrorIsNullReturnCall, and close to call preconditionIsNotBlacklistedErrno in stead of assertIsNotBlacklistedErrno
Result:
Functions wrapSyscallMayBlock, wrapSyscall, wrapErrorIsNullReturnCall, and close will stop the execution in both debug mode and release mode if the errno is blacklisted.
Motivation:
ByteBuffer.clear doesn't reset the ByteBuffer._slice. Imagine having a byte buffer slice (ByteBuffer.getSlice). Calling ByteBuffer.clear on the slice will reset the reader and writer, but the ByteBuffer.capacity will return the old slice size, not the full storage capacity. After this change ByteBuffer.clear will also reset the ByteBuffer._slice to regain the whole ByteBuffer._storage.
Modifications:
Just setting the full storage slice in the ByteBuffer.clear method.
Result:
After that change ByteBuffer.clear will reset byte buffer slice to full capacity of underlying storage.
Motivation:
It's important to know how websocket frame decoder perform.
Modification:
- Wrote simple benchmark for WebSocket frame decoding.
Result:
Better insight
* Make SocketAddress.port mutable
* Add linux test
* Precondition for setting non-nil port to unix socket
* Update Sources/NIO/SocketAddresses.swift
Co-Authored-By: George Barnett <gbrntt@gmail.com>
Motivation:
For some reason, we never added `socketpair` to the POSIX wrappers but
it should be there.
Modifications:
Add socketpair to the POSIX wrappers and make use of it.
Result:
Nicer code base.