Motivation:
We recently added a synchronous view of the `ChannelPipeline` so that
callers can avoid allocating futures when they know they're on the right
event loop. We also offer convenience APIs to configure the pipeline for
particular use cases, like an HTTP/1 server, but we don't have
synchronous versions of these APIs yet. We should have parity
between synchronous and asynchronous APIs where feasible.
Modifications:
- Add synchronous helpers to configure HTTP1 client and server pipelines
Result:
Callers can synchronously configure HTTP1 client and server pipelines.
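A minimal sketch of what this enables, assuming the helper is exposed on the
pipeline's `syncOperations` view as `configureHTTPServerPipeline()` and that
`EmbeddedChannel` is available via `import NIO` (newer releases may require
different module imports):

```swift
import NIO
import NIOHTTP1

// EmbeddedChannel runs on the calling thread, so we are always on its
// event loop and may use the synchronous view of the pipeline.
let channel = EmbeddedChannel()

// No future is allocated; failures are thrown instead.
try channel.pipeline.syncOperations.configureHTTPServerPipeline()

// Handlers installed by the helper can be looked up synchronously as well.
let encoder = try channel.pipeline.syncOperations.handler(type: HTTPResponseEncoder.self)
_ = encoder

_ = try channel.finish()
```

On a real socket channel the same call is only valid from the channel's event
loop, for example from inside a channel initializer.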
Motivation:
Verifying the contents of inbound request parts for `NIOHTTP1TestServer`
can be quite tedious and whenever I use `NIOHTTP1TestServer` in
different projects I find myself writing similar extensions to help
verify the inbound request parts are as expected.
Modifications:
- Add verification methods to `NIOHTTP1TestServer` which check for the
expected part and optionally verify it further in a callback
Result:
It's easier to test with NIOHTTP1TestServer
Motivation:
Some of the output from 'malloc-aggregation.d' trips up the
'stackdiff-dtrace.py' script -- especially when used with the output
from recent 5.4 builds. The regex for numbers includes hex which trips
us up when we try to convert that match to an int. Since we only use
this pattern for matching allocation counts, where we don't expect hex,
we can just modify the pattern.
Also the input parsing is a little tricky to follow without context so I
tidied it up a little.
Modifications:
- relax the number regex (we don't expect counts to be in hex)
- drop the stack regex (it wasn't used)
- aggregate the current stack in a list (instead of a string)
- filter out non-stack lines (which were erroneously used to key the
first stack!)
- a few comments
Result:
stackdiff-dtrace.py doesn't blow up on output from newer builds
Motivation:
We added synchronous pipeline operations to allow the caller to save
allocations when they know they are already on the correct event loop.
However, we missed a trick! In some cases the caller cannot guarantee
they are on the correct event loop and must use an asynchronous method
instead. If that method returns a void future and is called on the event
loop, then we can perform the operation synchronously and return a
cached void future.
Modifications:
- Add API to `EventLoop` for creating a 'completed' future with a
`Result` (similar to `EventLoopPromise.completeWith`)
- Add an equivalent for making completed void futures
- Use these when asynchronously adding handlers and the caller is
already on the right event loop.
Result:
- Fewer allocations on the happiest of happy paths when adding handlers
asynchronously to a pipeline.
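A small sketch of the new API, assuming it is spelled `makeCompletedFuture(_:)`
on `EventLoop` and using `EmbeddedEventLoop` purely for illustration:

```swift
import NIO

let loop = EmbeddedEventLoop()

// Build an already-completed future directly from a `Result`;
// no promise has to be created and fulfilled by the caller.
let answer: EventLoopFuture<Int> = loop.makeCompletedFuture(.success(42))

// The `Void` flavour. On success an event loop may hand back a cached future.
let done: EventLoopFuture<Void> = loop.makeCompletedFuture(.success(()))

let value = try answer.wait()
try done.wait()
_ = value
```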
Motivation:
The HTTP server upgrade tests had unnecessarily wide and also verbose
test assertions.
Modifications:
Rewrote them to be much simpler and narrowed which states are allowed.
Result:
- Neater code
- Works around https://bugs.swift.org/browse/SR-14253
* Add synchronous channel options
Motivation:
The functions for getting and setting channel options are currently
asynchronous. This ensures that options are set and retrieved safely.
However, in some cases the caller knows they are on the correct event
loop but still has to pay the cost of allocating a future to either get
or set an option.
Modifications:
- Add a 'NIOSynchronousChannelOptions' protocol for getting and setting
options
- Add a customisation point to 'Channel' to return 'NIOSynchronousChannelOptions'.
- Default implementation returns nil so as to not break API.
- Add implementations for 'EmbeddedChannel' and 'BaseSocketChannel'
- Allocation tests for getting and setting autoRead
Result:
Options can be retrieved and set synchronously.
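For illustration, a sketch of how a caller might use this, assuming the
customisation point is exposed as an optional `syncOptions` property on
`Channel` with `getOption` and `setOption(_:value:)` methods:

```swift
import NIO

let channel = EmbeddedChannel()
var autoRead: Bool? = nil

// `syncOptions` is optional because the default implementation returns nil.
// The caller must already be on the channel's event loop.
if let options = channel.syncOptions {
    try options.setOption(ChannelOptions.autoRead, value: true)
    autoRead = try options.getOption(ChannelOptions.autoRead)
}

_ = try channel.finish()
```

Callers that cannot rely on `syncOptions` being non-nil can fall back to the
asynchronous `getOption` and `setOption` methods on `Channel`.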
Motivation:
The SAL (Syscall Abstraction Layer) virtualises pretty much all system
resources that NIO uses, except time and threads. That means that unless
we explicitly synchronise them, the EL thread and the test thread run at
their own pace.
The synchronisation point is whenever the EL thread makes a
(virtualised) syscall which the test thread then sees (and asserts).
The problem in #1748 was that we were assuming that the EL thread would
park itself very fast. So when the test driver thread came along to
execute something on the EL, it just assumed that the EL had already
parked (i.e. was waiting for a wakeup). Seemingly that works most of the time
but not always.
Modifications:
- Always actually wait for the EL to park itself.
- Delete duplicated code in `salWait` to use `runSAL`
- Fixes #1748
Result:
Fewer races.
Motivation:
`ChannelPipeline` is explicitly thread-safe: any of the operations may
be called from outside of the channel's event loop. However, there are
often cases where it is known that the caller is on the right event
loop, and an asynchronous API is unnecessary.
In some cases -- such as when a pipeline is configured dynamically and
handlers are added from the 'channelRead' implementation of one handler
-- it forces the caller to write code that they might not actually need:
such as buffering events which may happen before the future completes.
This is unnecessary complexity when the caller knows that they must
already be on an event loop.
Modifications:
- Add a 'SynchronousOperations' view to the 'ChannelPipeline' which is
available to callers via 'syncOperations'.
- Supported operations include: adding a handler, adding multiple
handlers, retrieving a context via various predicates and retrieving a
handler of a given type.
- Some of the operations in 'ChannelPipeline' were refactored to have an
explicitly synchronous version; the asynchronous versions complete their
promise based on the result of these calls.
- Various minor documentation fixes and addition of 'self' where it was
not used explicitly.
Result:
Users can perform synchronous operations on the 'ChannelPipeline' if
they know they are on the right event loop.
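A short sketch of the synchronous view in use. The handler type and its name
are hypothetical, and `EmbeddedChannel` stands in for a channel whose event
loop we are known to be on:

```swift
import NIO

// Hypothetical handler used only for this example; the default
// implementations simply forward every event.
final class NoOpHandler: ChannelInboundHandler {
    typealias InboundIn = ByteBuffer
}

let channel = EmbeddedChannel()
let ops = channel.pipeline.syncOperations

// Add a handler without allocating a future; errors are thrown.
try ops.addHandler(NoOpHandler(), name: "no-op")

// Retrieve the context by name and the handler by type.
let context = try ops.context(name: "no-op")
let handler = try ops.handler(type: NoOpHandler.self)
_ = handler

_ = try channel.finish()
```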
Motivation:
The alloc counters need to store the pointer to the original libc
implementations of the hooked functions somewhere. Previously, we would
store them in thread locals. That works fine but creates quite some
overhead (and allocations) per thread (to do dlsym on every thread).
Modifications:
Instead, we now store the libc function pointers in (atomic) globals so
we need to only resolve each function once, no matter how many threads
we use.
Result:
Faster, and more accurate.
* Host header is required for HTTP/1.1
* Move bootstrap to allow injection of the connection target
* Pass original connect target in the Host header
* Define ConnectTo before using it
Motivation:
Sometimes, advanced performance analysis is required and we didn't have
documents describing this.
Modifications:
Add a document describing advanced performance analysis with `perf`.
Result:
More guides.
Motivation:
B2MD called out to the decoder's `shouldReclaimBytes` after every
parsing attempt, even if the parser said `.continue`.
That's quite pointless because we won't add any bytes into the buffer
before we try the parser again.
Modifications:
Only ask the decoder if we should reclaim bytes if the decoder actually
says `.needMoreData`.
Result:
Faster, better, more sensible.
Motivation:
After chatting to the Swift perf team, they thought it may be a good
idea to align all functions in our performance tests which may make the
micro benchmarks more stable.
Modifications:
Align all functions/blocks.
Result:
Hopefully more stable micro benchmarks.
Motivation:
I'm sick of typing `.init(major: 1, minor: 1)`.
Modifications:
- Added static vars for common HTTP versions.
Result:
Maybe I'll never type `.init(major: 1, minor: 1)` ever again.
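For example, assuming the HTTP/1.1 constant is spelled `http1_1`:

```swift
import NIOHTTP1

// Before: HTTPRequestHead(version: .init(major: 1, minor: 1), ...)
let head = HTTPRequestHead(version: .http1_1, method: .GET, uri: "/")

// The static var is just a named spelling of the same value.
let sameVersion = head.version == HTTPVersion(major: 1, minor: 1)
```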
Motivation:
Succeeded `EventLoopFuture<Void>`s are quite important in SwiftNIO, they
happen all over the place. Unfortunately, we usually allocate each time,
unnecessarily.
Modifications:
Offer `EventLoop`s the option to cache succeeded void futures.
Result:
Fewer allocations.
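A sketch of the call site, assuming the entry point is
`makeSucceededVoidFuture()`. Whether a given `EventLoop` implementation
actually caches the future is up to that implementation:

```swift
import NIO

let loop = EmbeddedEventLoop()

// Both calls return an already-succeeded future. An event loop that opts
// in to caching can return the same instance instead of allocating twice.
let first: EventLoopFuture<Void> = loop.makeSucceededVoidFuture()
let second: EventLoopFuture<Void> = loop.makeSucceededVoidFuture()

try first.wait()
try second.wait()
```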
Motivation:
When you make a change that affects many performance tests, it's often
easier to just copy the results from CI. Unfortunately, that makes the
diff hard to read because the order is arbitrary.
Modifications:
Sort the list so you can always easily get to the same order as the
docker file by using `| sort` or `:sort` in vim.
Result:
Easier to update perf tests.
Motivation:
Many other systems don't like non-ASCII characters.
Modifications:
Change a letter 'a' to a letter 'a' but with a more normal encoding.
Result:
Test names are now in ASCII.
Motivation:
PR #1710 introduced a typo into the code. This typo was missed because
the argument name was shadowing a variable from parent scope, which was
inadvertently closed over.
Best to avoid that confusion.
Modifications:
- Fixed the typo
- Renamed the argument to reduce the risk of this happening again
Result:
Better code organisation.
Motivation:
The handlerAdded and handlerRemoved functions are not throwing.
I can't spot any way an exception could actually happen,
so there is no point in catering for the case when one does.
Modifications:
Remove possibility that an exception was thrown when adding or
removing handlers as none of the called functions are throwing.
Result:
Fewer lines of code, less complexity.
* Add allocation test for adding multiple handlers
Motivation:
I believe there is at least 1 avoidable allocation in this area.
Even if there isn't, making sure we don't increase allocations is good.
Modifications:
Add a test of allocations when adding multiple handlers.
Set limits for docker images.
Result:
Allocations when adding multiple handlers are now checked.
* Remove an allocation from addHandlers
Motivation:
Fewer allocations should improve performance.
Modifications:
Split out a sub function from addHandlers.
I originally thought I'd have to change the part of this
function which reads `var handlers = handlers` as there was
a surprising allocation at the beginning of this function.
It seems that breaking out some of the logic is sufficient
to remove an allocation.
Result:
1 fewer allocation.
* Fix up alloc tests.
Motivation:
Some of the generic EventLoopFuture functions weren't inlinable.
Modifications:
- Make them @inlinable
- Fix typo in name of 'DispathQueue+WithFuture.swift'
Result:
More specialization.
Co-authored-by: Cory Benfield <lukasa@apple.com>
This reverts commit 4853e910e8.
While we have been able to observe the effect that this change was
trying to work around, the change seems to interact poorly with a
different issue in Big Sur that can cause EPROTOTYPE to be consistently
emitted during socket writes on otherwise connected sockets. This would
change a connection-terminating error into a 100% CPU spin that rendered
the event loop entirely useless: a substantial regression.
For this reason, we should back this out until the issue is better
characterised.
Motivation:
When writing to a network socket on Apple platforms it is possible to
see EPROTOTYPE returned as an error. This is an undocumented and
special-case error code that appears to be associated with socket
shutdown, and so can fire when writing to a socket that is being shut
down by the other side. This should not be fired into the pipeline but
instead should be retried.
Modifications:
- Retry EPROTOTYPE errors on socket write methods.
- Add an (unfortunately) probabilistic test bed.
Result:
Should avoid weird error cases.
Motivation:
Some users are including NIO in Xcode workspaces
rather than using SPM. When this is done, if all
imports of CNIOLinux are conditional they can
remove this project from their workspace.
Modifications:
Make all imports of CNIOLinux conditional on Linux,
Android or FreeBSD
Result:
No user visible change.
Motivation:
In all contemporary Swift versions, the `class` and `AnyObject`
protocol constraints are equivalent. `class` is also deprecated, which
produces warnings on newer Swift compilers.
Modifications:
Replace `class` with `AnyObject`.
Result:
No warnings on newer Swift compilers.
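To illustrate the spelling change, with hypothetical type names that are not
taken from the NIO codebase:

```swift
// Old spelling (deprecated): protocol Delegate: class { ... }
// New spelling: constrain the protocol to class types with `AnyObject`.
protocol Delegate: AnyObject {
    func didFinish()
}

final class Owner {
    // `weak` is only permitted because `Delegate` is class-constrained.
    weak var delegate: Delegate?
}

final class Listener: Delegate {
    var finished = false
    func didFinish() { self.finished = true }
}

let owner = Owner()
let listener = Listener()
owner.delegate = listener
owner.delegate?.didFinish()
```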
Motivation:
When running load through EmbeddedChannel we spend an enormous amount of
time screwing around with removing things from Arrays. Arrays are not a
natural data type for `removeFirst()`, and in fact that method is
linear-time on Array due to the need for Array to be zero-indexed. Let's
stop using (and indeed misusing) Array on EmbeddedChannel.
While we're here, if we add some judicious @inlinable annotations we can
also save additional work generating results that users don't need.
Modifications:
- Replace arrays with circular buffers (including marked versions).
- Avoid CoWs and extra allocations on flush.
- Make some API methods inlinable to make them cheaper.
Result:
- Much cheaper EmbeddedChannel for benchmark purposes.
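A sketch of the data-structure swap using NIO's `CircularBuffer`. The element
type and values are arbitrary:

```swift
import NIO

// Array.removeFirst() shifts every remaining element down, so it is O(n).
// CircularBuffer only moves its head index, so removing from the front is cheap.
var queue = CircularBuffer<Int>(initialCapacity: 4)
queue.append(1)
queue.append(2)
queue.append(3)

let first = queue.removeFirst()
let remaining = queue.count
```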
Motivation:
As outlined in https://bugs.swift.org/browse/SR-13923,
removeAll(keepingCapacity:) on Array has particularly negative
performance when that Array is not uniquely referenced. In this case, in
EmbeddedChannel, we _arrange_ to multiply reference it. This makes it
swamp our HTTP/2 microbenchmarks, spending more cycles copying this
buffer around than doing anything else.
Modifications:
- Just allocate a new buffer instead.
Result:
Much less copying.
Motivation:
A new init was added to `SocketAddress` in #1692, but the casing of
`packedIpAddress` is incorrect, it should be `packedIPAddress`. This
hasn't been released yet so let's fix it while we still can!
Modifications:
- s/packedIpAddress/packedIPAddress
Result:
More consistent API
Motivation:
I recently discovered that UnsafeRawBufferPointer.init(rebasing:) is
surprisingly expensive, with 7 traps and 11 branches. A simple
replacement can make it a lot cheaper, down to two traps and four
branches. This ends up having pretty drastic effects on
ByteBuffer-heavy NIO code, which often outlines the call to that
initializer and loses the ability to make a bunch of site-local
optimisations.
While this has been potentially fixed upstream with
https://github.com/apple/swift/pull/34879, there is no good reason to
wait until Swift 5.4 for this improvement.
Due to the niche use-case, I didn't bother doing this for _every_
rebasing in the program. In particular, there is at least one
UnsafeBufferPointer(rebasing:) that I didn't do this with, and there are
uses in both NIOTLS and NIOHTTP1 that I didn't change. While we can fix
those if we really need to, it would be nice to avoid this helper
proliferating too far through our codebase.
Modifications:
- Replaced the use of URBP.init(rebasing:) with a custom hand-rolled
version that avoids Slice.count.
Result:
Cheaper code. One NIOHTTP2 benchmark sees a 2.9% speedup from this
change alone.
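A sketch of the idea, not necessarily the exact helper that was committed.
The initializer label `cheapRebase` is hypothetical. It computes the count
from the slice's indices instead of going through `Slice.count`:

```swift
extension UnsafeRawBufferPointer {
    /// Hypothetical stand-in for `init(rebasing:)` that avoids `Slice.count`.
    init(cheapRebase slice: Slice<UnsafeRawBufferPointer>) {
        // Slice indices are offsets into the base buffer.
        let base = slice.base.baseAddress?.advanced(by: slice.startIndex)
        self.init(start: base, count: slice.endIndex &- slice.startIndex)
    }
}

let bytes: [UInt8] = [1, 2, 3, 4, 5]
let sum = bytes.withUnsafeBytes { raw -> Int in
    // Rebase the middle three bytes so that indexing starts at zero again.
    let rebased = UnsafeRawBufferPointer(cheapRebase: raw[1..<4])
    return rebased.reduce(0) { $0 + Int($1) }
}
```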
Motivation:
Be able to run and test swift-nio on Android.
Modifications:
- Remove the custom ifaddrs and use the one from the Android NDK instead.
- Enable a bunch of conditionally-compiled code for Android.
- Add a handful of constants and other Android declarations.
- Cast some types because of mismatches specific to Android.
Result:
Most tests pass on Android AArch64 and ARMv7.
Co-authored-by: Cory Benfield <lukasa@apple.com>
Motivation:
A user should be able to create a SocketAddress from a packed byte representation.
Modifications:
I added a new SocketAddress initializer which takes the IP address in ByteBuffer form. I have also added tests that ensure the initializer works properly.
Result:
We have a new way to initialize a SocketAddress from a ByteBuffer.
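A usage sketch, assuming the initializer label is spelled `packedIPAddress`
and that four bytes are interpreted as an IPv4 address:

```swift
import NIO

// 127.0.0.1 in packed (network byte order) form.
let packedBytes: [UInt8] = [127, 0, 0, 1]
let packed = ByteBuffer(bytes: packedBytes)

let address = try SocketAddress(packedIPAddress: packed, port: 8080)
let ip = address.ipAddress
let port = address.port
```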
Co-authored-by: Cory Benfield <lukasa@apple.com>
Conforms TimeAmount to AdditiveArithmetic.
Motivation:
TimeAmount does not support -=, +=. Sometimes it is useful to manipulate time amounts when building up a delay and if that iteration fails, we would want to delay += .milliseconds(5) to add 5 milliseconds to our delay and try again.
Modifications:
Conformed TimeAmount to AdditiveArithmetic: added a static zero property and required operators.
Result:
TimeAmount conforms to AdditiveArithmetic.
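The retry scenario from the motivation might look like this. The specific
amounts are arbitrary:

```swift
import NIO

// Start from the additive identity and grow the delay after each failure.
var delay: TimeAmount = .zero
delay += .milliseconds(5)
delay += .milliseconds(5)

// Subtraction is available as well.
delay -= .milliseconds(2)
```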
Co-authored-by: Josh <jrtkski@icloud.com>
Co-authored-by: Cory Benfield <lukasa@apple.com>
Upon the addition of `Result` in the Swift standard library, apple/swift-nio#734
updated `EventLoopFuture.whenComplete(_:)` to pass a `Result<T, Error>` to
its `callback`, but the documentation still confusingly states:
> Unlike its friends… `whenComplete` does not receive the result of the
> `EventLoopFuture`.
This patch fixes that by removing the (now inaccurate) lines.
Partially implement network interface enumeration. This is sufficient
to build up the structures for basic operations. Although the interface
information is incomplete, this provides enough structure to continue
porting the rest of the NIO interfaces.
Co-authored-by: Cory Benfield <lukasa@apple.com>
Motivation:
Currently adding multiple channel handlers makes a call to the
async version of addHandler for each handler resulting in
n+1 futures. It feels better to use just one future and add
all the handlers synchronously.
Modifications:
Change sync functions which take a promise to instead return a Result.
Feed this back until reaching addHandlers.
Result:
Multiple handlers can be added more quickly.
Motivation:
We support watchOS 6+ with SwiftNIO Transport Services; as such we should
include watchOS as a deployment target for our CocoaPods.
Modifications:
- Add a watchOS deployment target to `build_podspecs.sh`
- Update docs
Result:
Users can deploy to watchOS 6+ with CocoaPods.
`WSASendMsg` and `WSARecvMsg` are not directly accessible to use.
Instead, one must perform an IOCTL on the socket to retrieve the
extension point and then use that function pointer to perform the
operation. Use this to implement the functionality on Windows.