Martin Evans
d87d654a34
Merge pull request #348 from martindevans/new_object_based_sampling_pipeline
...
Custom Sampling Pipelines
2023-12-11 21:40:53 +00:00
dependabot[bot]
85dc43dde0
build(deps): bump xunit from 2.6.2 to 2.6.3
...
Bumps [xunit](https://github.com/xunit/xunit) from 2.6.2 to 2.6.3.
- [Commits](https://github.com/xunit/xunit/compare/2.6.2...2.6.3)
---
updated-dependencies:
- dependency-name: xunit
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-12-11 14:26:14 +00:00
dependabot[bot]
8fb4476813
build(deps): bump xunit.runner.visualstudio from 2.5.4 to 2.5.5
...
Bumps [xunit.runner.visualstudio](https://github.com/xunit/visualstudio.xunit) from 2.5.4 to 2.5.5.
- [Release notes](https://github.com/xunit/visualstudio.xunit/releases)
- [Commits](https://github.com/xunit/visualstudio.xunit/compare/2.5.4...2.5.5)
---
updated-dependencies:
- dependency-name: xunit.runner.visualstudio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-12-11 06:20:24 +00:00
Martin Evans
835958398c
- Removed the object wrappers and configurable pipeline, they can be better written in code.
...
- Added BaseSamplingPipeline which provides a base impl of `ISamplingPipeline`
- Added `DefaultSamplingPipeline` which mimics normal llama.cpp sampling
2023-12-08 16:25:13 +00:00
Martin Evans
3afc007499
- Added "protected" logits, instead of the awkward save/load mechanism
...
- Added an example usage to one of the tests
2023-12-08 01:17:24 +00:00
dependabot[bot]
6d86219d71
build(deps): bump xunit.runner.visualstudio from 2.5.3 to 2.5.4
...
Bumps [xunit.runner.visualstudio](https://github.com/xunit/visualstudio.xunit) from 2.5.3 to 2.5.4.
- [Release notes](https://github.com/xunit/visualstudio.xunit/releases)
- [Commits](https://github.com/xunit/visualstudio.xunit/compare/2.5.3...2.5.4)
---
updated-dependencies:
- dependency-name: xunit.runner.visualstudio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-11-27 06:17:47 +00:00
Rinne
cf4edeac55
Merge pull request #315 from futzy314/fix-ai-request-settings
...
Added a converter similar to the Open AI one
2023-11-24 22:12:55 +08:00
Martin Evans
597188c236
Merge pull request #316 from martindevans/update_binaries_nov
...
November Binary Update
2023-11-20 19:45:07 +00:00
Ian Foutz
b2bf59d8d5
Unit tests added
2023-11-20 09:55:20 -06:00
dependabot[bot]
41292b4b32
build(deps): bump xunit from 2.6.1 to 2.6.2
...
Bumps [xunit](https://github.com/xunit/xunit) from 2.6.1 to 2.6.2.
- [Commits](https://github.com/xunit/xunit/compare/2.6.1...2.6.2)
---
updated-dependencies:
- dependency-name: xunit
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-11-20 06:49:35 +00:00
Martin Evans
77003d763e
Added new symbols from llama.h
2023-11-19 17:11:01 +00:00
Martin Evans
48c5039054
Improved test coverage. Discovered some issues:
...
FixedSizeQueue:
- Enqueue would always stop one short of filling the capacity
- Fill would only _replace_ existing items. It was only used in a place where there were no existing items! Removed the method entirely.
LLamaGrammarElement:
- Converted into a `record` struct, removed all of the (now unnecessary) equality stuff.
2023-11-18 02:40:36 +00:00
dependabot[bot]
f68aa777f1
build(deps): bump Microsoft.NET.Test.Sdk from 17.7.2 to 17.8.0
...
Bumps [Microsoft.NET.Test.Sdk](https://github.com/microsoft/vstest) from 17.7.2 to 17.8.0.
- [Release notes](https://github.com/microsoft/vstest/releases)
- [Changelog](https://github.com/microsoft/vstest/blob/main/docs/releases.md)
- [Commits](https://github.com/microsoft/vstest/compare/v17.7.2...v17.8.0)
---
updated-dependencies:
- dependency-name: Microsoft.NET.Test.Sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-11-13 06:38:31 +00:00
dependabot[bot]
b7893317f5
build(deps): bump xunit.runner.visualstudio from 2.5.0 to 2.5.3
...
Bumps [xunit.runner.visualstudio](https://github.com/xunit/visualstudio.xunit) from 2.5.0 to 2.5.3.
- [Release notes](https://github.com/xunit/visualstudio.xunit/releases)
- [Commits](https://github.com/xunit/visualstudio.xunit/compare/2.5.0...2.5.3)
---
updated-dependencies:
- dependency-name: xunit.runner.visualstudio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-11-04 16:54:42 +00:00
dependabot[bot]
b20c3ecda5
build(deps): bump xunit from 2.5.0 to 2.6.1
...
Bumps [xunit](https://github.com/xunit/xunit) from 2.5.0 to 2.6.1.
- [Commits](https://github.com/xunit/xunit/compare/2.5.0...2.6.1)
---
updated-dependencies:
- dependency-name: xunit
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2023-11-03 10:48:48 +00:00
Martin Evans
09bc688b3c
Skipped slow test again
2023-10-28 23:19:07 +01:00
Martin Evans
cdf20d3c7a
Added timing to stateless test
2023-10-28 22:17:34 +01:00
Martin Evans
7e3cde4c13
Moved helper methods into `LLamaBatchSafeHandle`
2023-10-28 22:09:09 +01:00
Martin Evans
ccb8afae46
Cleaned up stateless executor as preparation for changing it to use the new batched decoding system.
2023-10-28 21:50:48 +01:00
Martin Evans
321d0b58c4
Merge pull request #202 from martindevans/multi_gpu
...
Multi GPU
2023-10-26 14:40:49 +01:00
Martin Evans
d5874a279c
Updated test runner to latest version
2023-10-25 16:57:19 +01:00
Martin Evans
51d4411a58
Added two new classes for detokenization tasks:
...
- `AntipromptProcessor` accepts chunks of text and returns a value indicating if any antiprompt has been detected.
- `StreamingTokenDecoder` decodes tokens into text, maintaining some internal state to handle single characters which are encoded as multiple tokens.
Added tests for these classes and updated StatelessExecutor to use them.
Removed most DeTokenize methods, marked the rest as obsolete (should always use a `StreamingTokenDecoder`).
2023-10-23 00:33:50 +01:00
Martin Evans
efdf3d630c
- Removed all `TokenToString` methods (it's never correct to use them, because sometimes one single character may be represented by multiple tokens).
...
- Built a new (hacky) `Detokenize` method which handles this
2023-10-22 21:43:36 +01:00
Martin Evans
1d0620e634
Created a test that "roundtrips" strings through tokenization. This reveals some flaws with certain characters
2023-10-22 15:28:36 +01:00
Martin Evans
b4e7f64e76
Added System.Text.Json serialization for `TensorSplitsCollectionConverter`
2023-10-20 14:55:01 +01:00
Martin Evans
e89ca5cc17
Fixed a few minor warnings
2023-10-19 00:43:50 +01:00
Martin Evans
d8434ea9d6
Merge pull request #185 from martindevans/wip_major_api_change
...
Major llama.cpp API Change
2023-10-18 20:50:32 +01:00
Martin Evans
1f8c94e386
Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538)
2023-10-17 23:55:46 +01:00
Martin Evans
efb0664df0
- Added new binaries
...
- Fixed stateless executor out-of-context handling
- Fixed token tests
2023-10-17 23:39:41 +01:00
Martin Evans
b8f0eff080
- Added `GetCharCountImpl` tests, fixed handling of empty strings
...
- Added ifdef to remove `Deconstruct` extension on everything except `NETSTANDARD2_0`
2023-10-14 00:04:12 +01:00
Martin Evans
45118520fa
- Improved coverage of `GBNFGrammarParser` up to 96%
...
- Covered text transforms
- Removed unnecessary non-async transforms
2023-10-13 23:54:01 +01:00
Martin Evans
9f694c584c
Further improved grammar parser test coverage (up to 92%)
2023-10-13 02:08:12 +01:00
Martin Evans
bff41eef37
Added some more coverage of `GrammarRule`, checking that invalid rules are rejected
2023-10-13 01:36:48 +01:00
Martin Evans
2a38808bca
- Added threads to context params, replaced all thread args with `uint?`
...
- Replaced all binaries
2023-10-12 18:49:41 +01:00
Martin Evans
669ae47ef7
- Split parameters into two interfaces
...
- params contains a list of loras, instead of just one
2023-09-30 16:21:18 +01:00
Martin Evans
9a0a0ae9fe
Removed cloning support
2023-09-30 15:48:26 +01:00
Martin Evans
0d40338692
Fixed out-of-context handling in stateless executor
2023-09-29 23:53:07 +01:00
Martin Evans
ce1fc51163
Added some more native methods
2023-09-29 16:05:19 +01:00
Martin Evans
bca55eace0
Initial changes to match the llama.cpp changes
2023-09-29 01:18:21 +01:00
Martin Evans
fe54f6764f
- Added unit tests for extension methods
...
- Removed unused `AddRangeSpan` extension
2023-09-22 16:29:50 +01:00
Martin Evans
3f80190f85
Minimal changes required to remove non-async inference.
2023-09-14 21:04:14 +01:00
Martin Evans
daf09eae64
Skipping tokenization of empty strings (saves allocating an empty array every time)
2023-09-12 01:03:27 +01:00
Martin Evans
bba801f4b7
Added a property to get the KV cache size from a context
2023-09-11 00:10:08 +01:00
Martin Evans
d3b8ee988c
Beam Search (#155)
...
* Added the low level bindings to beam search.
2023-09-07 19:26:51 +01:00
Martin Evans
d0e57a8c92
sealed test class
2023-09-06 20:11:31 +01:00
Martin Evans
3f082c6f2c
Fixed naming in tests
2023-09-06 20:09:41 +01:00
Martin Evans
614ba40948
- Added a `TokensEndsWithAnyString` extension to `IReadOnlyList<int>` which efficiently checks if a set of tokens ends with one of a set of strings.
...
  - Minimal amount of characters converted
  - Allocation free
- Added `TokensToSpan` to `SafeLlamaModelHandle` which converts as many tokens as possible into a character span
  - Allocation free
2023-09-06 19:44:19 +01:00
Martin Evans
821d7f615e
Swapped to llama-7b-chat
2023-09-04 21:26:02 +01:00
Martin Evans
21cbecb82d
Disable test parallelism to fix CI
2023-09-03 23:35:53 +01:00
Rinne
4e83e48ad1
Merge pull request #122 from martindevans/gguf
...
Add GGUF support
2023-09-02 11:54:50 +08:00