Martin Evans
9bc129e252
Merge pull request #512 from martindevans/updated_version
...
Release 0.10.0
2024-02-15 14:41:27 +00:00
Martin Evans
174f21a385
0.10.0
2024-02-15 14:40:56 +00:00
Martin Evans
633727bb73
Merge pull request #511 from martindevans/fixed_release_minor
...
Fixed Minor Release Script
2024-02-15 14:40:29 +00:00
Martin Evans
69a74bb053
Commented back in the line that pushes to nuget
2024-02-15 14:38:00 +00:00
Martin Evans
d03c1a9201
Merge pull request #503 from martindevans/batched_executor_again
...
Introduced a new `BatchedExecutor`
2024-02-15 14:26:57 +00:00
Martin Evans
968e1e464a
Merge pull request #507 from martindevans/normalize_embeddings
...
Normalize Embeddings
2024-02-13 14:08:00 +00:00
Martin Evans
d47b6afe4d
Normalizing embeddings in `LLamaEmbedder`, as is done in llama.cpp: 2891c8aa9a/examples/embedding/embedding.cpp (L92)
2024-02-13 02:09:35 +00:00
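For readers unfamiliar with the change in d47b6afe4d, the following is a minimal sketch of embedding normalization. It assumes L2 (Euclidean) normalization and is written in Python for illustration only; the function name `normalize_embedding` is hypothetical and is not part of LLamaSharp or llama.cpp.

```python
import math


def normalize_embedding(values):
    """Return a unit-length copy of `values` (assumed L2 normalization).

    Hypothetical helper for illustration; not the LLamaSharp implementation.
    """
    # Euclidean length of the vector.
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0.0:
        # A zero vector has no direction; return it unchanged
        # rather than dividing by zero.
        return list(values)
    return [v / norm for v in values]
```

Unit-length embeddings make cosine similarity equal to a plain dot product, which is a common reason to normalize at the source.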
Martin Evans
a5eba9463f
Merge pull request #505 from SciSharp/dependabot/nuget/Microsoft.NET.Test.Sdk-17.9.0
...
build(deps): bump Microsoft.NET.Test.Sdk from 17.8.0 to 17.9.0
2024-02-12 23:54:15 +00:00
Martin Evans
e9d9042576
Added `Divide` to `KvAccessor`
2024-02-12 15:54:13 +00:00
Martin Evans
1cc463b9b7
Added a finalizer to `BatchedExecutor`
2024-02-12 15:34:52 +00:00
dependabot[bot]
58b6b927c6
build(deps): bump Microsoft.NET.Test.Sdk from 17.8.0 to 17.9.0
...
Bumps [Microsoft.NET.Test.Sdk](https://github.com/microsoft/vstest) from 17.8.0 to 17.9.0.
- [Release notes](https://github.com/microsoft/vstest/releases)
- [Changelog](https://github.com/microsoft/vstest/blob/main/docs/releases.md)
- [Commits](https://github.com/microsoft/vstest/compare/v17.8.0...v17.9.0)
---
updated-dependencies:
- dependency-name: Microsoft.NET.Test.Sdk
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com>
2024-02-12 06:11:00 +00:00
Martin Evans
0c2cff0e1c
Added a Finalizer for `Conversation` in case it is not correctly disposed.
2024-02-12 02:58:35 +00:00
Martin Evans
949861a581
- Added a `Modify` method to `Conversation`. This grants **temporary** access to directly modify the KV cache.
...
- Re-implemented `Rewind` as an extension method using `Modify` internally
- Implemented `ShiftLeft`, which shifts everything over except for some starting tokens. This is the same as the `StatelessExecutor` out-of-context handling.
- Starting the batch at epoch 1 ensures that conversations (starting at zero) are below the current epoch. It also means `0` can always be used as a value guaranteed to be below the current epoch.
2024-02-11 23:20:05 +00:00
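The `ShiftLeft` behaviour described in 949861a581 can be pictured with a plain list standing in for one sequence of the KV cache. This is a sketch under stated assumptions, not the LLamaSharp code: the function name `shift_left` and its `keep` and `discard` parameters are hypothetical, and it assumes the discarded block is the one immediately after the kept prefix.

```python
def shift_left(tokens, keep, discard):
    """Drop `discard` tokens that follow the first `keep` tokens.

    Hypothetical list-based model: the kept prefix stays in place and
    everything after the discarded block moves left by `discard` positions.
    """
    if keep < 0 or discard < 0 or keep + discard > len(tokens):
        raise ValueError("keep + discard must fit within the sequence")
    # Prefix (e.g. a system prompt) is preserved; the tail slides over.
    return tokens[:keep] + tokens[keep + discard:]
```

For example, `shift_left([0, 1, 2, 3, 4, 5, 6, 7], keep=2, discard=3)` keeps tokens `0` and `1`, removes `2` to `4`, and leaves `5` to `7` at positions 2 to 4, freeing room at the end of the context.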
Martin Evans
ea12ff4e07
Merge pull request #502 from vikramvee/Examples
...
Updated Examples
2024-02-10 00:03:56 +00:00
Martin Evans
b0acecf080
Created a new `BatchedExecutor` which processes multiple "Conversations" in one single inference batch. This is faster, even when the conversations are unrelated, and is much faster if the conversations share some overlap (e.g. a common system prompt prefix).
...
Conversations can be "forked", to create a copy of a conversation at a given point. This allows e.g. prompting a conversation with a system prefix just once and then forking it again and again for each individual conversation. Conversations can also be "rewound" to an earlier state.
Added two new examples, demonstrating forking and rewinding.
2024-02-09 23:57:03 +00:00
vikramvee
ebd853fede
Updated Examples
2024-02-09 22:46:28 +05:30
Martin Evans
859160d6f7
Merge pull request #501 from martindevans/LLamaPos_inc_dec
...
Added increment and decrement operators to `LLamaPos`
2024-02-07 19:02:57 +00:00
Martin Evans
90915c5a99
Added increment and decrement operators to `LLamaPos`
2024-02-07 17:04:57 +00:00
Martin Evans
82c471eac4
Merge pull request #500 from martindevans/improved_kv_cache_methods
...
Small KV Cache Handling Improvements
2024-02-07 16:54:32 +00:00
Martin Evans
a8f9262b7f
Merge pull request #499 from martindevans/better_batch_processing
...
Using `AddRange` in `LLamaEmbedder`
2024-02-07 16:35:50 +00:00
Martin Evans
c5146bac23
- Exposed KV debug view through `SafeLLamaContextHandle`
...
- Added `KvCacheSequenceDivide`
- Moved count tokens/cells methods to `SafeLLamaContextHandle`
2024-02-07 16:35:39 +00:00
Martin Evans
744758f110
Using `AddRange` in `LLamaEmbedder`
2024-02-07 16:19:36 +00:00
Martin Evans
5d80a56d11
Merge pull request #496 from martindevans/smaller_unit_test_model
...
Smaller Unit Test Model
2024-02-07 03:20:05 +00:00
Martin Evans
418345cbaf
Limited parallelism of CI jobs
2024-02-07 03:07:23 +00:00
Martin Evans
df38d73c79
Switched to `Q3_K_S` for unit test model, instead of `Q4`. This is almost 1 GB smaller, and _may_ make the macOS tests less flaky.
2024-02-07 02:36:25 +00:00
Martin Evans
ac7faa0f93
Merge pull request #495 from martindevans/quantise_new_formats
...
Added new file types to quantisation
2024-02-07 01:58:09 +00:00
Martin Evans
c7103e86e4
Added new file types to quantisation
2024-02-06 18:06:10 +00:00
Martin Evans
17385e12b6
Merge pull request #479 from martindevans/update_binaries_feb_2024
...
Update binaries feb 2024
2024-02-06 01:08:09 +00:00
Martin Evans
21bdecd049
Merge branch 'update_binaries_feb_2024' of github.com:martindevans/LLamaSharp into update_binaries_feb_2024
2024-02-06 00:27:44 +00:00
Martin Evans
bac40a3b7a
Added new binaries, from this run: https://github.com/SciSharp/LLamaSharp/actions/runs/7792319886
2024-02-06 00:27:32 +00:00
Martin Evans
0e2521cff4
Merge pull request #493 from jasoncouture/remove_sha256_check
...
Disable SHA256 check
2024-02-05 23:35:51 +00:00
Jason Couture
a101224c34
Disable SHA256 check
2024-02-05 17:02:45 -05:00
Martin Evans
0592164dc3
Merge pull request #489 from jasoncouture/clblast_nuget
...
Create nuspec for OpenCL
2024-02-05 19:56:57 +00:00
Jason Couture
1f45bae2cf
Update compile.yml
...
Fix SHA256 hash path
2024-02-05 13:35:40 -05:00
Jason Couture
c963b051e2
Add nuspec for OpenCL (CLBLAST)
2024-02-05 12:21:07 -05:00
Martin Evans
d468df08d6
Merge pull request #487 from jasoncouture/clblast_linux
...
CLBlast for linux
2024-02-05 16:17:50 +00:00
Martin Evans
d4f3f642c3
Merge pull request #488 from jasoncouture/concurrency_limit
...
Only allow one build in parallel per ref
2024-02-05 14:33:53 +00:00
Jason Couture
bfe3ad50aa
Only allow one build in parallel per ref
2024-02-05 06:07:46 -05:00
Jason Couture
f7a6eaa49f
Cancel previous builds when a new build is started, due to how expensive this build is.
2024-02-05 06:06:11 -05:00
Jason Couture
277175af4d
CLBlast for linux
...
This builds CLBlast support for Linux, and makes sure to copy the
CLBlast shared library on both Windows and Linux to the artifacts.
2024-02-05 06:06:11 -05:00
Martin Evans
7dbaed2d3b
Update README.md
2024-02-05 00:25:40 +00:00
Martin Evans
dfac029dde
Merge pull request #485 from zsogitbe/master
...
KernelMemory EmbeddingMode bug correction
2024-02-04 18:26:28 +00:00
Zoli Somogyi
f578fcafa3
KernelMemory EmbeddingMode bug correction
2024-02-04 08:37:08 +01:00
Martin Evans
765c697f77
Fixed number type
2024-02-01 19:40:34 +00:00
Martin Evans
b2e815d51e
Updated all binaries (from this run: https://github.com/SciSharp/LLamaSharp/actions/runs/7746303349)
2024-02-01 19:34:37 +00:00
Martin Evans
2323988cc7
Merge pull request #478 from martindevans/fixed_artifact_paths
...
Updated download-artifact to v4
2024-02-01 19:01:27 +00:00
Martin Evans
b5674ead97
Updated download-artifact to v4
2024-02-01 19:00:54 +00:00
Martin Evans
15a98b36d8
Updated everything to work with llama.cpp ce32060198b7e2d6a13a9b8e1e1369e3c295ae2a
2024-02-01 16:35:05 +00:00
Martin Evans
48798837fb
Merge pull request #477 from martindevans/updated_cuda_toolkit
...
Updated compile.yml to use `Jimver/cuda-toolkit@v0.2.14`
2024-02-01 14:41:16 +00:00
Martin Evans
2df7e35c81
Updated compile.yml to use `Jimver/cuda-toolkit@v0.2.14`
2024-02-01 14:40:49 +00:00