1118 Commits

Author SHA1 Message Date
Martin Evans c9c8cd0d62 - Swapped embeddings generator to use `llama_decode`
- Modified `GetEmbeddings` method to be async
2024-01-31 20:28:53 +00:00
Martin Evans 3b08874bec
Merge pull request #468 from jasoncouture/clblast
Add CLBLAST native library to native libraries build
2024-01-31 20:05:47 +00:00
Martin Evans 22aba9a671
Merge pull request #473 from martindevans/base_handle_removed_constructor
Removed `SafeLLamaHandleBase` Constructor
2024-01-31 18:58:12 +00:00
Martin Evans 2488f74bbd
Merge pull request #472 from martindevans/remove_params_interface_set
Removed `IModelParams` and `IContextParams` setters.
2024-01-31 18:57:49 +00:00
Martin Evans 5da2a2f64b - Removed one of the constructors of `SafeLLamaHandleBase`, which implicitly states that memory is owned. Better to be explicit about this kind of thing!
- Also fixed `ToString()` in `SafeLLamaHandleBase`
2024-01-31 18:01:03 +00:00
Martin Evans 9b995510d6 Removed all setters in `IModelParams` and `IContextParams`, allowing implementations to be immutable. 2024-01-31 17:51:50 +00:00
Martin Evans f9a9aaabca
Merge pull request #471 from jasoncouture/master
Fix incorrect event input variable name
2024-01-30 22:24:06 +00:00
Martin Evans 96d7d37f80
Merge pull request #469 from jasoncouture/library_name_fix
Fix missing library name prefix for cuda
2024-01-30 22:22:27 +00:00
Jason Couture 52a85c35e6 Add missing CMAKE prefix 2024-01-30 13:59:26 -05:00
Jason Couture 689ddf0d08 Add missing T to defines for CLBLAST 2024-01-30 13:59:26 -05:00
Jason Couture face505588 Fix syntax error in CLBLAST if statement 2024-01-30 13:59:26 -05:00
Jason Couture ce5fbf0658 Copy clblast dependencies 2024-01-30 13:59:26 -05:00
Jason Couture ad3f895eb3 Make build-deps depend on compile-clblast 2024-01-30 13:59:25 -05:00
Jason Couture 2347a7aa09 Add build job for CLBLAST 2024-01-30 13:59:25 -05:00
Jason Couture 34ca5ff6eb Simplify directory creation in build-deps
Using a bash brace expansion combined with `mkdir -p` (`--parents`) allows us to create all of the directories at once.

`deps/{avx,avx2,avx512,osx-arm64,osx-x64,cu11.7.1,cu12.1.0,clblast}`
expands to: deps/avx deps/avx2 deps/avx512 deps/osx-arm64 deps/osx-x64 deps/cu11.7.1 deps/cu12.1.0 deps/clblast
2024-01-30 13:59:25 -05:00
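The brace expansion described in the commit above can be tried in isolation. The following is a minimal sketch, not the project's actual workflow step: it assumes bash (brace expansion is not part of plain POSIX sh) and works inside a throwaway directory from `mktemp -d`, held in the hypothetical variable `workdir`, so it does not touch a real checkout.

```shell
#!/usr/bin/env bash
# Sketch only: `workdir` is a hypothetical scratch location, not a path used by the repository.
set -euo pipefail

workdir="$(mktemp -d)"

# Bash expands the braces into eight separate arguments before mkdir runs;
# -p (--parents) creates the shared parent "deps" and does not fail if a directory already exists.
mkdir -p "$workdir"/deps/{avx,avx2,avx512,osx-arm64,osx-x64,cu11.7.1,cu12.1.0,clblast}

# List the directories that were created.
ls -1 "$workdir/deps"
```

Because `-p` tolerates existing directories, the same line can be rerun safely, which is presumably why it suits a build step that may execute more than once.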
Jason Couture 9cfbd22499 Fix github variable name
I am not on my game today 🤦
2024-01-30 13:35:54 -05:00
Jason Couture 30e448d2d5 Use event input directly
GHA doesn't seem to load the value into the environment variable first, so the workflow always runs on master.
2024-01-30 13:35:54 -05:00
Martin Evans afa6cc0ec4
Merge pull request #470 from jasoncouture/specific_commit
Checkout specific ref for llamacpp when building native libs
2024-01-30 18:13:02 +00:00
Jason Couture 64cb697bbf Checkout specific ref for llamacpp when building native libs 2024-01-30 13:10:13 -05:00
Jason Couture ec59c5bf9e Fix missing library name prefix for cuda 2024-01-30 12:41:23 -05:00
Martin Evans 0f9742c6d0
Merge pull request #465 from jasoncouture/lib_naming
Use llama instead of libllama in `[DllImport]`
2024-01-30 16:35:24 +00:00
Jason Couture 443ce4fff4 While the dllimport changes work, manual path searching needed to be updated 2024-01-30 11:10:51 -05:00
Jason Couture db7e1e88f8 Use llama instead of libllama in `[DllImport]`
This results in Windows users not needing to rename the DLL. This allows native llama builds to be dropped in, even on Windows.

I also took the time to update the documentation, removing references to renaming the files, since the names now match.

Fixes #463
2024-01-30 02:40:13 -05:00
Martin Evans 4cfdf064b8
Merge pull request #462 from SciSharp/dependabot/nuget/System.Text.Json-8.0.1
build(deps): bump System.Text.Json from 8.0.0 to 8.0.1
2024-01-29 13:18:25 +00:00
dependabot[bot] d8eb817bf5
build(deps): bump System.Text.Json from 8.0.0 to 8.0.1
Bumps [System.Text.Json](https://github.com/dotnet/runtime) from 8.0.0 to 8.0.1.
- [Release notes](https://github.com/dotnet/runtime/releases)
- [Commits](https://github.com/dotnet/runtime/compare/v8.0.0...v8.0.1)

---
updated-dependencies:
- dependency-name: System.Text.Json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-29 06:45:09 +00:00
Martin Evans 5cf481dc8e
Merge pull request #454 from martindevans/kv_cache_instance_methods
kv_cache_instance_methods
2024-01-25 15:26:47 +00:00
Martin Evans 92b9bbe779 Added methods to `SafeLLamaContextHandle` for KV cache manipulation 2024-01-23 16:16:02 +00:00
Martin Evans 8dfd07f67b
Merge pull request #453 from martindevans/fix_bad_merge_nseqmax
Fix Master Build Fail
2024-01-23 15:19:17 +00:00
Martin Evans a690db5d3e Fixed build error caused by extra unnecessary parameter 2024-01-23 15:09:20 +00:00
Martin Evans 96c26c25f5
Merge pull request #445 from martindevans/stateless_executor_llama_decode
Swapped `StatelessExecutor` to use `llama_decode`!
2024-01-23 03:02:51 +00:00
Martin Evans 1bc61472a8
Merge pull request #449 from SciSharp/dependabot/nuget/xunit-2.6.6
build(deps): bump xunit from 2.6.5 to 2.6.6
2024-01-22 14:21:22 +00:00
Martin Evans d5b3650235
Merge pull request #451 from xbotter/deps/sk_1.1.0
bump sk & km
2024-01-22 14:21:03 +00:00
xbotter 90815ae7d8
bump sk & km
- bump semantic kernel to 1.1.0
- bump kernel memory to 0.26
2024-01-22 19:03:28 +08:00
dependabot[bot] 3d4c3c5509
build(deps): bump xunit from 2.6.5 to 2.6.6
Bumps [xunit](https://github.com/xunit/xunit) from 2.6.5 to 2.6.6.
- [Commits](https://github.com/xunit/xunit/compare/2.6.5...2.6.6)

---
updated-dependencies:
- dependency-name: xunit
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-01-22 06:10:13 +00:00
Martin Evans 0074320a31
Merge pull request #447 from martindevans/grow_nseqmax_batch
LLamaBatch Grow n_seq_max automatically
2024-01-21 01:07:12 +00:00
Martin Evans 9fe878ae1f - Fixed example
- Growing more than double, if necessary
2024-01-21 01:00:24 +00:00
Martin Evans 9ede1bedc2 Automatically growing batch n_seq_max when exceeded. This means no parameters need to be picked when the batch is created. 2024-01-21 00:55:14 +00:00
Martin Evans a2e29d393c Swapped `StatelessExecutor` to use `llama_decode`!
- Added `logits_i` argument to `Context.ApplyPenalty`
- Added a new exception type for `llama_decode` return code
2024-01-20 21:18:35 +00:00
Martin Evans 892e841da3
Merge pull request #444 from martindevans/batched_sampling_example_cleanup
Improved the BatchedDecoding demo
2024-01-20 17:47:19 +00:00
Martin Evans 5b6e82a594 Improved the BatchedDecoding demo:
- Using less `NativeHandle`
- Using `StreamingTokenDecoder` instead of obsolete detokenize method
2024-01-20 17:39:50 +00:00
Martin Evans 250c20bd56
Merge pull request #443 from martindevans/llama_batch_self_grow
LLamaBatch Automatically Grow Capacity
2024-01-20 14:43:24 +00:00
Martin Evans 99969e538e - Removed some unused `eval` methods.
- Added a `DecodeAsync` overload which runs the work in a task
- Replaced some `NativeHandle` usage in `BatchedDecoding` with higher level equivalents.
- Made the `LLamaBatch` grow when token capacity is exceeded, removing the need to manage token capacity externally.
2024-01-20 02:38:45 +00:00
Martin Evans a0be27d32b
Merge pull request #442 from martindevans/managed_llama_batch
Managed `LLamaBatch`
2024-01-20 00:05:43 +00:00
Martin Evans 36a9335588 Removed `LLamaBatchSafeHandle` (using unmanaged memory, created by llama.cpp) and replaced it with a fully managed `LLamaBatch`. Modified the `BatchedDecoding` example to use new managed batch. 2024-01-19 23:26:36 +00:00
Martin Evans 4b11feddef
Merge pull request #436 from SciSharp/dependabot/nuget/Microsoft.AspNetCore.OpenApi-8.0.1
build(deps): bump Microsoft.AspNetCore.OpenApi from 8.0.0 to 8.0.1
2024-01-16 15:32:23 +00:00
Martin Evans 1cb9bcd55c
Merge pull request #440 from martindevans/additional_special_string_tokenizer_tests
Extra Tokenization Tests
2024-01-16 15:10:07 +00:00
Martin Evans 1472704e12 Added a test with examples of troublesome strings from 0.9.1 2024-01-16 15:02:23 +00:00
Martin Evans 73172bbaba
Merge pull request #438 from martindevans/cleanup_model_unnecessary_unsafe
Model Metadata Loading Cleanup
2024-01-15 16:31:21 +00:00
Martin Evans ce1d302e7e Moved some native methods into `SafeLlamaModelHandle`. These methods are all wrapped in safer accessors with no extra cost, so there is no need to expose them. 2024-01-15 16:10:47 +00:00
Martin Evans 4ef618012e
Merge pull request #437 from martindevans/check_model_path_exists
Check Model Path Exists
2024-01-15 15:41:52 +00:00