Commit Graph

96 Commits

Author SHA1 Message Date
Martin Evans d87d654a34
Merge pull request #348 from martindevans/new_object_based_sampling_pipeline
Custom Sampling Pipelines
2023-12-11 21:40:53 +00:00
dependabot[bot] 85dc43dde0
build(deps): bump xunit from 2.6.2 to 2.6.3
Bumps [xunit](https://github.com/xunit/xunit) from 2.6.2 to 2.6.3.
- [Commits](https://github.com/xunit/xunit/compare/2.6.2...2.6.3)

---
updated-dependencies:
- dependency-name: xunit
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-12-11 14:26:14 +00:00
dependabot[bot] 8fb4476813
build(deps): bump xunit.runner.visualstudio from 2.5.4 to 2.5.5
Bumps [xunit.runner.visualstudio](https://github.com/xunit/visualstudio.xunit) from 2.5.4 to 2.5.5.
- [Release notes](https://github.com/xunit/visualstudio.xunit/releases)
- [Commits](https://github.com/xunit/visualstudio.xunit/compare/2.5.4...2.5.5)

---
updated-dependencies:
- dependency-name: xunit.runner.visualstudio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-12-11 06:20:24 +00:00
Martin Evans 835958398c - Removed the object wrappers and configurable pipeline; they can be better written in code.
- Added `BaseSamplingPipeline`, which provides a base implementation of `ISamplingPipeline`
- Added `DefaultSamplingPipeline`, which mimics normal llama.cpp sampling
2023-12-08 16:25:13 +00:00
Martin Evans 3afc007499 - Added "protected" logits, instead of the awkward save/load mechanism
- Added an example usage to one of the tests
2023-12-08 01:17:24 +00:00
dependabot[bot] 6d86219d71
build(deps): bump xunit.runner.visualstudio from 2.5.3 to 2.5.4
Bumps [xunit.runner.visualstudio](https://github.com/xunit/visualstudio.xunit) from 2.5.3 to 2.5.4.
- [Release notes](https://github.com/xunit/visualstudio.xunit/releases)
- [Commits](https://github.com/xunit/visualstudio.xunit/compare/2.5.3...2.5.4)

---
updated-dependencies:
- dependency-name: xunit.runner.visualstudio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-27 06:17:47 +00:00
Rinne cf4edeac55
Merge pull request #315 from futzy314/fix-ai-request-settings
Added a converter similar to the OpenAI one
2023-11-24 22:12:55 +08:00
Martin Evans 597188c236
Merge pull request #316 from martindevans/update_binaries_nov
November Binary Update
2023-11-20 19:45:07 +00:00
Ian Foutz b2bf59d8d5 Unit tests added 2023-11-20 09:55:20 -06:00
dependabot[bot] 41292b4b32
build(deps): bump xunit from 2.6.1 to 2.6.2
Bumps [xunit](https://github.com/xunit/xunit) from 2.6.1 to 2.6.2.
- [Commits](https://github.com/xunit/xunit/compare/2.6.1...2.6.2)

---
updated-dependencies:
- dependency-name: xunit
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-20 06:49:35 +00:00
Martin Evans 77003d763e Added new symbols from llama.h 2023-11-19 17:11:01 +00:00
Martin Evans 48c5039054 Improved test coverage. Discovered some issues:
FixedSizeQueue:
 - Enqueue would always stop one short of filling the capacity
 - Fill would only _replace_ existing items, but it was only used in a place where there were no existing items! Removed the method entirely.

LLamaGrammarElement:
 - Converted into a `record` struct, removed all of the (now unnecessary) equality stuff.
2023-11-18 02:40:36 +00:00
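The `FixedSizeQueue` fix above concerns a boundary condition in a bounded FIFO. The commit message does not show the faulty code, so the following is only a minimal Python sketch of the intended behaviour, assuming eviction should happen once the queue exceeds its capacity; the class and method names mirror the commit message, but the body is hypothetical and is not the project's C# implementation.

```python
from collections import deque


class FixedSizeQueue:
    """Hypothetical bounded FIFO that drops the oldest item when full."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = deque()

    def enqueue(self, item):
        self._items.append(item)
        # Evict only when the queue is *over* capacity. Comparing with
        # `>=` here would cap the queue at capacity - 1 items, which is
        # the kind of off-by-one the commit message describes.
        while len(self._items) > self.capacity:
            self._items.popleft()

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


queue = FixedSizeQueue(3)
for value in range(1, 6):
    queue.enqueue(value)
print(list(queue))  # the three most recent values
```

A test that fills the queue exactly to capacity and checks its length is enough to catch this class of bug.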
dependabot[bot] f68aa777f1
build(deps): bump Microsoft.NET.Test.Sdk from 17.7.2 to 17.8.0
Bumps [Microsoft.NET.Test.Sdk](https://github.com/microsoft/vstest) from 17.7.2 to 17.8.0.
- [Release notes](https://github.com/microsoft/vstest/releases)
- [Changelog](https://github.com/microsoft/vstest/blob/main/docs/releases.md)
- [Commits](https://github.com/microsoft/vstest/compare/v17.7.2...v17.8.0)

---
updated-dependencies:
- dependency-name: Microsoft.NET.Test.Sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-13 06:38:31 +00:00
dependabot[bot] b7893317f5
build(deps): bump xunit.runner.visualstudio from 2.5.0 to 2.5.3
Bumps [xunit.runner.visualstudio](https://github.com/xunit/visualstudio.xunit) from 2.5.0 to 2.5.3.
- [Release notes](https://github.com/xunit/visualstudio.xunit/releases)
- [Commits](https://github.com/xunit/visualstudio.xunit/compare/2.5.0...2.5.3)

---
updated-dependencies:
- dependency-name: xunit.runner.visualstudio
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-04 16:54:42 +00:00
dependabot[bot] b20c3ecda5
build(deps): bump xunit from 2.5.0 to 2.6.1
Bumps [xunit](https://github.com/xunit/xunit) from 2.5.0 to 2.6.1.
- [Commits](https://github.com/xunit/xunit/compare/2.5.0...2.6.1)

---
updated-dependencies:
- dependency-name: xunit
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2023-11-03 10:48:48 +00:00
Martin Evans 09bc688b3c Skipped slow test again 2023-10-28 23:19:07 +01:00
Martin Evans cdf20d3c7a Added timing to stateless test 2023-10-28 22:17:34 +01:00
Martin Evans 7e3cde4c13 Moved helper methods into `LLamaBatchSafeHandle` 2023-10-28 22:09:09 +01:00
Martin Evans ccb8afae46 Cleaned up stateless executor as preparation for changing it to use the new batched decoding system. 2023-10-28 21:50:48 +01:00
Martin Evans 321d0b58c4
Merge pull request #202 from martindevans/multi_gpu
Multi GPU
2023-10-26 14:40:49 +01:00
Martin Evans d5874a279c Updated test runner to latest version 2023-10-25 16:57:19 +01:00
Martin Evans 51d4411a58 Added two new classes for detokenization tasks:
- `AntipromptProcessor` accepts chunks of text and returns a value indicating if any antiprompt has been detected.
- `StreamingTokenDecoder` decodes tokens into text, maintaining some internal state to handle single characters which are encoded as multiple tokens.

Added tests for these classes and updated StatelessExecutor to use them.

Removed most DeTokenize methods, marked the rest as obsolete (should always use a `StreamingTokenDecoder`).
2023-10-23 00:33:50 +01:00
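The two classes described above can be illustrated with a short Python sketch. It assumes each token maps to a chunk of UTF-8 bytes and that an antiprompt counts as detected when it appears anywhere in the text seen so far, including across chunk boundaries. The class names are borrowed from the commit message, but the method names and bodies are hypothetical and do not reproduce the project's C# API.

```python
import codecs


class StreamingTokenDecoder:
    """Hypothetical decoder that buffers incomplete UTF-8 sequences."""

    def __init__(self):
        # The incremental decoder keeps trailing bytes that do not yet
        # form a complete character and emits them once they do.
        self._decoder = codecs.getincrementaldecoder("utf-8")()

    def add(self, token_bytes):
        """Feed the bytes of one token; return any completed text."""
        return self._decoder.decode(token_bytes, final=False)


class AntipromptProcessor:
    """Hypothetical detector for antiprompts arriving in text chunks."""

    def __init__(self, antiprompts):
        self._antiprompts = [a for a in antiprompts if a]
        self._max_len = max((len(a) for a in self._antiprompts), default=0)
        self._tail = ""

    def add(self, text):
        """Return True if any antiprompt occurs in the text seen so far."""
        window = self._tail + text
        found = any(a in window for a in self._antiprompts)
        keep = self._max_len - 1
        # Keep just enough trailing text to catch a match that is split
        # across two chunks; start fresh after a detection.
        self._tail = window[-keep:] if keep > 0 and not found else ""
        return found


decoder = StreamingTokenDecoder()
detector = AntipromptProcessor(["User:"])
for piece in [b"caf", b"\xc3", b"\xa9 Us", b"er:"]:
    text = decoder.add(piece)
    if detector.add(text):
        print("antiprompt detected")
```

Decoding each token separately would fail on the lone `\xc3` byte, which is why the commit marks the per-token methods as obsolete in favour of a stateful decoder.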
Martin Evans efdf3d630c - Removed all `TokenToString` methods (it's never correct to use them, because sometimes one single character may be represented by multiple tokens).
- Built a new (hacky) `Detokenize` method which handles this
2023-10-22 21:43:36 +01:00
Martin Evans 1d0620e634 Created a test that "roundtrips" strings through tokenization. This reveals some flaws with certain characters 2023-10-22 15:28:36 +01:00
Martin Evans b4e7f64e76 Added System.Text.Json serialization for `TensorSplitsCollectionConverter` 2023-10-20 14:55:01 +01:00
Martin Evans e89ca5cc17 Fixed a few minor warnings 2023-10-19 00:43:50 +01:00
Martin Evans d8434ea9d6
Merge pull request #185 from martindevans/wip_major_api_change
Major llama.cpp API Change
2023-10-18 20:50:32 +01:00
Martin Evans 1f8c94e386 Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538) 2023-10-17 23:55:46 +01:00
Martin Evans efb0664df0 - Added new binaries
- Fixed stateless executor out-of-context handling
- Fixed token tests
2023-10-17 23:39:41 +01:00
Martin Evans b8f0eff080 - Added `GetCharCountImpl` tests, fixed handling of empty strings
- Added ifdef to remove `Deconstruct` extension on everything except `NETSTANDARD2_0`
2023-10-14 00:04:12 +01:00
Martin Evans 45118520fa - Improved coverage of `GBNFGrammarParser` up to 96%
- Covered text transforms
- Removed unnecessary non-async transforms
2023-10-13 23:54:01 +01:00
Martin Evans 9f694c584c Further improved grammar parser test coverage (up to 92%) 2023-10-13 02:08:12 +01:00
Martin Evans bff41eef37 Added some more coverage of `GrammarRule`, checking that invalid rules are rejected 2023-10-13 01:36:48 +01:00
Martin Evans 2a38808bca - Added threads to context params, replaced all thread args with `uint?`
- Replaced all binaries
2023-10-12 18:49:41 +01:00
Martin Evans 669ae47ef7 - Split parameters into two interfaces
- params contains a list of loras, instead of just one
2023-09-30 16:21:18 +01:00
Martin Evans 9a0a0ae9fe Removed cloning support 2023-09-30 15:48:26 +01:00
Martin Evans 0d40338692 Fixed out-of-context handling in stateless executor 2023-09-29 23:53:07 +01:00
Martin Evans ce1fc51163 Added some more native methods 2023-09-29 16:05:19 +01:00
Martin Evans bca55eace0 Initial changes to match the llama.cpp changes 2023-09-29 01:18:21 +01:00
Martin Evans fe54f6764f - Added unit tests for extension methods
- Removed unused `AddRangeSpan` extension
2023-09-22 16:29:50 +01:00
Martin Evans 3f80190f85 Minimal changes required to remove non-async inference. 2023-09-14 21:04:14 +01:00
Martin Evans daf09eae64 Skipping tokenization of empty strings (saves allocating an empty array every time) 2023-09-12 01:03:27 +01:00
Martin Evans bba801f4b7 Added a property to get the KV cache size from a context 2023-09-11 00:10:08 +01:00
Martin Evans d3b8ee988c
Beam Search (#155)
* Added the low level bindings to beam search.
2023-09-07 19:26:51 +01:00
Martin Evans d0e57a8c92 sealed test class 2023-09-06 20:11:31 +01:00
Martin Evans 3f082c6f2c Fixed naming in tests 2023-09-06 20:09:41 +01:00
Martin Evans 614ba40948 - Added a `TokensEndsWithAnyString` extension to `IReadOnlyList<int>` which efficiently checks if a set of tokens ends with one of a set of strings.
  - Minimal amount of characters converted
  - Allocation free
- Added `TokensToSpan` to `SafeLlamaModelHandle` which converts as many tokens as possible into a character span
  - Allocation free
2023-09-06 19:44:19 +01:00
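The idea behind `TokensEndsWithAnyString`, converting only as many trailing tokens as the comparison needs, can be sketched in Python as follows. This is an illustration under stated assumptions: `pieces` is a hypothetical mapping from token id to that token's UTF-8 bytes, the function name is invented for the example, and the sketch makes no attempt to be allocation free like the C# extension.

```python
def tokens_end_with_any(tokens, pieces, queries):
    """Check whether the decoded token sequence ends with any query.

    `pieces` is a hypothetical vocabulary: token id -> UTF-8 bytes.
    """
    encoded = [q.encode("utf-8") for q in queries if q]
    if not encoded:
        return False
    longest = max(len(e) for e in encoded)

    # Walk backwards and stop as soon as enough bytes are collected to
    # compare against the longest query; earlier tokens are never
    # converted at all.
    tail = b""
    for token in reversed(tokens):
        tail = pieces[token] + tail
        if len(tail) >= longest:
            break

    # Comparing bytes avoids decoding a character that is split
    # across several tokens.
    return any(tail.endswith(e) for e in encoded)


vocab = {0: b"Hello", 1: b" Us", 2: b"er", 3: b":"}
print(tokens_end_with_any([0, 1, 2, 3], vocab, ["User:", "Assistant:"]))
```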
Martin Evans 821d7f615e Swapped to llama-7b-chat 2023-09-04 21:26:02 +01:00
Martin Evans 21cbecb82d Disabled test parallelism to fix CI 2023-09-03 23:35:53 +01:00
Rinne 4e83e48ad1
Merge pull request #122 from martindevans/gguf
Add GGUF support
2023-09-02 11:54:50 +08:00