1074 Commits

Author SHA1 Message Date
Martin Evans c7d0dc915a Assorted small changes to clean up some code warnings 2024-02-17 23:07:10 +00:00
Martin Evans 9bc129e252
Merge pull request #512 from martindevans/updated_version
Release 0.10.0
2024-02-15 14:41:27 +00:00
Martin Evans 174f21a385 0.10.0 2024-02-15 14:40:56 +00:00
Martin Evans 633727bb73
Merge pull request #511 from martindevans/fixed_release_minor
Fixed Minor Release Script
2024-02-15 14:40:29 +00:00
Martin Evans 69a74bb053 Commented back in the line that pushes to nuget 2024-02-15 14:38:00 +00:00
Martin Evans d03c1a9201
Merge pull request #503 from martindevans/batched_executor_again
Introduced a new `BatchedExecutor`
2024-02-15 14:26:57 +00:00
Martin Evans 968e1e464a
Merge pull request #507 from martindevans/normalize_embeddings
Normalize Embeddings
2024-02-13 14:08:00 +00:00
Martin Evans d47b6afe4d Normalizing embeddings in `LLamaEmbedder`. As is done in llama.cpp: 2891c8aa9a/examples/embedding/embedding.cpp (L92) 2024-02-13 02:09:35 +00:00
Martin Evans a5eba9463f
Merge pull request #505 from SciSharp/dependabot/nuget/Microsoft.NET.Test.Sdk-17.9.0
build(deps): bump Microsoft.NET.Test.Sdk from 17.8.0 to 17.9.0
2024-02-12 23:54:15 +00:00
Martin Evans e9d9042576 Added `Divide` to `KvAccessor` 2024-02-12 15:54:13 +00:00
Martin Evans 1cc463b9b7 Added a finalizer to `BatchedExecutor` 2024-02-12 15:34:52 +00:00
dependabot[bot] 58b6b927c6
build(deps): bump Microsoft.NET.Test.Sdk from 17.8.0 to 17.9.0
Bumps [Microsoft.NET.Test.Sdk](https://github.com/microsoft/vstest) from 17.8.0 to 17.9.0.
- [Release notes](https://github.com/microsoft/vstest/releases)
- [Changelog](https://github.com/microsoft/vstest/blob/main/docs/releases.md)
- [Commits](https://github.com/microsoft/vstest/compare/v17.8.0...v17.9.0)

---
updated-dependencies:
- dependency-name: Microsoft.NET.Test.Sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-02-12 06:11:00 +00:00
Martin Evans 0c2cff0e1c Added a Finalizer for `Conversation` in case it is not correctly disposed. 2024-02-12 02:58:35 +00:00
Martin Evans 949861a581 - Added a `Modify` method to `Conversation`. This grants **temporary** access to directly modify the KV cache.
- Re-implemented `Rewind` as an extension method using `Modify` internally
- Implemented `ShiftLeft`, which shifts everything over except for some starting tokens. This is the same as the `StatelessExecutor` out-of-context handling.
- Starting batch at epoch 1, this ensures that conversations (starting at zero) are below the current epoch. It also means `0` can always be used as a value guaranteed to be below the current epoch.
2024-02-11 23:20:05 +00:00
Martin Evans ea12ff4e07
Merge pull request #502 from vikramvee/Examples
Updated Examples
2024-02-10 00:03:56 +00:00
Martin Evans b0acecf080 Created a new `BatchedExecutor` which processes multiple "Conversations" in one single inference batch. This is faster, even when the conversations are unrelated, and is much faster if the conversations share some overlap (e.g. a common system prompt prefix).
Conversations can be "forked", to create a copy of a conversation at a given point. This allows e.g. prompting a conversation with a system prefix just once and then forking it again and again for each individual conversation. Conversations can also be "rewound" to an earlier state.

Added two new examples, demonstrating forking and rewinding.
2024-02-09 23:57:03 +00:00
vikramvee ebd853fede Updated Examples 2024-02-09 22:46:28 +05:30
Martin Evans 859160d6f7
Merge pull request #501 from martindevans/LLamaPos_inc_dec
Added increment and decrement operators to `LLamaPos`
2024-02-07 19:02:57 +00:00
Martin Evans 90915c5a99 Added increment and decrement operators to `LLamaPos` 2024-02-07 17:04:57 +00:00
Martin Evans 82c471eac4
Merge pull request #500 from martindevans/improved_kv_cache_methods
Small KV Cache Handling Improvements
2024-02-07 16:54:32 +00:00
Martin Evans a8f9262b7f
Merge pull request #499 from martindevans/better_batch_processing
Using `AddRange` in `LLamaEmbedder`
2024-02-07 16:35:50 +00:00
Martin Evans c5146bac23 - Exposed KV debug view through `SafeLLamaContextHandle`
- Added `KvCacheSequenceDivide`
- Moved count tokens/cells methods to `SafeLLamaContextHandle`
2024-02-07 16:35:39 +00:00
Martin Evans 744758f110 Using `AddRange` in `LLamaEmbedder` 2024-02-07 16:19:36 +00:00
Martin Evans 5d80a56d11
Merge pull request #496 from martindevans/smaller_unit_test_model
Smaller Unit Test Model
2024-02-07 03:20:05 +00:00
Martin Evans 418345cbaf limited parallelism of CI jobs 2024-02-07 03:07:23 +00:00
Martin Evans df38d73c79 Switched to `Q3_K_S` for unit test model, instead of `Q4`. This is almost 1 GB smaller, and _may_ make the macOS tests less flaky. 2024-02-07 02:36:25 +00:00
Martin Evans ac7faa0f93
Merge pull request #495 from martindevans/quantise_new_formats
Added new file types to quantisation
2024-02-07 01:58:09 +00:00
Martin Evans c7103e86e4 Added new file types to quantisation 2024-02-06 18:06:10 +00:00
Martin Evans 17385e12b6
Merge pull request #479 from martindevans/update_binaries_feb_2024
Update binaries feb 2024
2024-02-06 01:08:09 +00:00
Martin Evans 21bdecd049 Merge branch 'update_binaries_feb_2024' of github.com:martindevans/LLamaSharp into update_binaries_feb_2024 2024-02-06 00:27:44 +00:00
Martin Evans bac40a3b7a Added new binaries, from this run: https://github.com/SciSharp/LLamaSharp/actions/runs/7792319886 2024-02-06 00:27:32 +00:00
Martin Evans 0e2521cff4
Merge pull request #493 from jasoncouture/remove_sha256_check
Disable SHA256 check
2024-02-05 23:35:51 +00:00
Jason Couture a101224c34 Disable SHA256 check 2024-02-05 17:02:45 -05:00
Martin Evans 0592164dc3
Merge pull request #489 from jasoncouture/clblast_nuget
Create nuspec for OpenCL
2024-02-05 19:56:57 +00:00
Jason Couture 1f45bae2cf
Update compile.yml
Fix SHA256 hash path
2024-02-05 13:35:40 -05:00
Jason Couture c963b051e2 Add nuspec for OpenCL (CLBLAST) 2024-02-05 12:21:07 -05:00
Martin Evans d468df08d6
Merge pull request #487 from jasoncouture/clblast_linux
CLBlast for linux
2024-02-05 16:17:50 +00:00
Martin Evans d4f3f642c3
Merge pull request #488 from jasoncouture/concurrency_limit
Only allow one build in parallel per ref
2024-02-05 14:33:53 +00:00
Jason Couture bfe3ad50aa Only allow one build in parallel per ref 2024-02-05 06:07:46 -05:00
Jason Couture f7a6eaa49f Cancel previous builds when a new build is started, due to how expensive this build is. 2024-02-05 06:06:11 -05:00
Jason Couture 277175af4d CLBlast for linux
This builds CLBLAST support for linux, and makes sure to copy the
clblast shared library on both windows and linux to the artifacts.
2024-02-05 06:06:11 -05:00
Martin Evans 7dbaed2d3b
Update README.md 2024-02-05 00:25:40 +00:00
Martin Evans dfac029dde
Merge pull request #485 from zsogitbe/master
KernelMemory EmbeddingMode bug correction
2024-02-04 18:26:28 +00:00
Zoli Somogyi f578fcafa3 KernelMemory EmbeddingMode bug correction 2024-02-04 08:37:08 +01:00
Martin Evans 765c697f77 Fixed number type 2024-02-01 19:40:34 +00:00
Martin Evans b2e815d51e Updated all binaries (from this run: https://github.com/SciSharp/LLamaSharp/actions/runs/7746303349) 2024-02-01 19:34:37 +00:00
Martin Evans 2323988cc7
Merge pull request #478 from martindevans/fixed_artifact_paths
Updated download-artifact to v4
2024-02-01 19:01:27 +00:00
Martin Evans b5674ead97 Updated download-artifact to v4 2024-02-01 19:00:54 +00:00
Martin Evans 15a98b36d8 Updated everything to work with llama.cpp ce32060198b7e2d6a13a9b8e1e1369e3c295ae2a 2024-02-01 16:35:05 +00:00
Martin Evans 48798837fb
Merge pull request #477 from martindevans/updated_cuda_toolkit
Updated compile.yml to use `Jimver/cuda-toolkit@v0.2.14`
2024-02-01 14:41:16 +00:00