Commit Graph

1290 Commits

Author SHA1 Message Date
Martin Evans e2705be6c8
Fixed off-by-one error in LLamaBatch sampling position (#626) 2024-03-25 22:56:26 +00:00
Martin Evans a2d3a847dd
Disabled LLava tests, they're too slow and are crashing CI (#625) 2024-03-25 22:10:41 +00:00
Martin Evans 91d72e7465
Keeping track of positions where logits will be generated in a batch and what sequence those logits are associated with. (#624) 2024-03-25 21:02:48 +00:00
eublefar b8cd5b7ee5 loadTransforms flag for LoadSession methods 2024-03-21 12:18:38 +01:00
eublefar 9440f153da Make process message method more flexible 2024-03-21 12:14:15 +01:00
Kenneth Tang e4c2f57e43
Merge branch 'SciSharp:master' into master 2024-03-21 13:11:11 +08:00
Martin Evans 268f3a6b07
BatchedExecutor Fixed Forking (#621)
* Previously, when a conversation was forked, the parent and the child shared exactly the same logits. Since sampling is allowed to modify logits, this could lead to issues in sampling (e.g. one conversation is sampled and overwrites the logits to all zeros; the second conversation is then sampled and generates nonsense). Fixed by setting a "forked" flag: logits are copied if this flag is set. The flag is cleared the next time the conversation is prompted, so this extra copying only happens once after a fork occurs.

* Removed finalizer from `BatchedExecutor`. This class does not directly own any unmanaged resources so it is not necessary.
2024-03-20 16:36:01 +00:00
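The copy-on-fork fix described above can be sketched as follows (a language-agnostic Python sketch; the class and method names are illustrative, not LLamaSharp's actual API):

```python
class Conversation:
    """Sketch of copy-on-fork logits: after a fork, a conversation copies
    its logits before sampling, so one conversation's in-place mutations
    cannot corrupt the other's."""

    def __init__(self, logits, forked=False):
        self._logits = logits
        self._forked = forked

    def fork(self):
        # Parent and child initially share the same logits buffer;
        # both are flagged so the next sampling access makes a copy.
        self._forked = True
        return Conversation(self._logits, forked=True)

    def sample_logits(self):
        if self._forked:
            # Copy once, so a sampler mutating this span in place
            # cannot affect the other conversation.
            self._logits = list(self._logits)
            self._forked = False  # cleared: the extra copy happens only once
        return self._logits
```

In the real fix the flag is cleared when the conversation is next prompted; clearing it at first access, as here, illustrates the same "pay the copy cost exactly once per fork" idea.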
Kenneth Tang 9e4109f774 Unable to load the model onto multiple GPUs (#617) 2024-03-20 07:03:12 +00:00
Kenneth Tang 6216197196
Merge branch 'SciSharp:master' into master 2024-03-20 14:39:36 +08:00
Martin Evans ad682fbebd
`BatchedExecutor.Create()` method (#613)
Replaced `BatchedExecutor.Prompt(string)` method with `BatchedExecutor.Create()` method. This improves the API in two ways:
 - A conversation can be created, without immediately prompting it
 - Other prompting overloads (e.g. prompt with token list) can be used without duplicating all the overloads onto `BatchedExecutor`

Added `BatchSize` property to `LLamaContext`
2024-03-20 02:20:35 +00:00
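The API split described above (create a conversation first, prompt it later, with all prompt overloads living on the conversation) can be sketched like this; all names are illustrative stand-ins, not the LLamaSharp signatures:

```python
class Conversation:
    """A conversation that exists independently of any prompt."""

    def __init__(self):
        self.tokens = []

    def prompt(self, tokens):
        # Every prompting overload (token list, pre-tokenized string, ...)
        # lives here, so the executor doesn't have to duplicate them.
        self.tokens.extend(tokens)


class BatchedExecutor:
    """Sketch: Create() returns an un-prompted conversation."""

    def create(self):
        return Conversation()
```

Usage: `conv = BatchedExecutor().create()` followed by `conv.prompt([...])` — creation and prompting are now two separate steps.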
Kenneth Tang 3fda708eaa Fix System.ArgumentException: EmbeddingMode must be true 2024-03-19 06:53:16 +00:00
Rinne e3ecc318ff
Merge pull request #612 from xbotter/deps/sk-1.6.2
Update Semantic Kernel & Kernel Memory Package
2024-03-19 09:34:28 +08:00
xbotter a019b5cc24
📝 Update LLamaSharpChatCompletion and LLama.Unittest
- Updated LLamaSharpChatCompletion class in LLama.SemanticKernel/ChatCompletion/LLamaSharpChatCompletion.cs
  - Changed the type of the "_model" field from "StatelessExecutor" to "ILLamaExecutor"
  - Updated the constructor to accept an "ILLamaExecutor" parameter instead of a "StatelessExecutor" parameter
- Updated LLamaSharpChatCompletion class in LLama.SemanticKernel/LLamaSharp.SemanticKernel.csproj

- Updated LLama.Unittest project in LLama.Unittest/LLama.Unittest.csproj
  - Added a "PackageReference" for "Moq" version 4.20.70
- Added ExtensionMethodsTests class in LLama.Unittest/SemanticKernel/ExtensionMethodsTests.cs
  - Added tests for the "ToLLamaSharpChatHistory" and "ToLLamaSharpInferenceParams" extension methods
- Added LLamaSharpChatCompletionTests class in LLama.Unittest/SemanticKernel/LLamaSharpChatCompletionTests.cs
  - Added tests for the LLamaSharpChatCompletion class

ℹ️ The LLamaSharpChatCompletion class in the LLama.SemanticKernel project now uses the ILLamaExecutor interface instead of the StatelessExecutor class, allowing better abstraction and more flexibility in its implementation. LLamaSharpChatCompletion provides the chat completion functionality in LLamaSharp. The LLama.Unittest project has been updated with tests for LLamaSharpChatCompletion and for the extension methods it uses.
2024-03-18 21:49:52 +08:00
Martin Evans 024787225b
`SetDllImportResolver` based loading (#603)
- Modified library loading to be based on `SetDllImportResolver`. This replaces the built in loading system and ensures there can't be two libraries loaded at once.
 - llava and llama are loaded separately, as needed.
 - All the previous loading logic is still used, within the `SetDllImportResolver`
 - Split out CUDA, AVX and MacOS paths to separate helper methods.
 - `Description` now specifies if it is for `llama` or `llava`
2024-03-17 19:54:20 +00:00
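The resolver pattern described above (one callback decides which native binary backs a library name, and the same handle is always reused) can be sketched language-agnostically; the real implementation uses .NET's `NativeLibrary.SetDllImportResolver`, and the names below are illustrative:

```python
# Sketch of resolver-based native-library loading: a single resolver
# maps a library name ("llama" or "llava") to a concrete path once,
# caching the handle so two copies can never be loaded at the same time.
_loaded = {}


def resolve(name, pick_path):
    """Return a cached handle for `name`, loading it at most once.

    `pick_path` stands in for the selection logic (CUDA vs AVX vs MacOS
    paths) that the commit above splits into separate helper methods.
    """
    if name not in _loaded:
        path = pick_path(name)
        _loaded[name] = "handle:" + path  # stand-in for dlopen/LoadLibrary
    return _loaded[name]
```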
eublefar d88f9e1199 Return null executor state if it's serialized in an old way 2024-03-17 16:22:25 +01:00
eublefar 00c873a197 Avoid saving empty context state in binary format, it somehow messes with llama.cpp 2024-03-17 15:55:35 +01:00
eublefar a31391edd7 Polymorphic serialization for executor state and transforms 2024-03-17 15:34:36 +01:00
eublefar 6f76d77350 Make text transform interfaces have explicit copy operation 2024-03-17 12:37:02 +01:00
eublefar 5f3803d23c Make state editable by the user, add deepcopy to fields that require it 2024-03-17 12:21:52 +01:00
eublefar 87fe982f10 Change method signature as suggested 2024-03-17 12:11:19 +01:00
eublefar af796fc3e9 Change List types in executor state to arrays to enforce copy on get/set operations 2024-03-17 11:58:26 +01:00
Rinne 6ddd45baa3
Merge pull request #602 from AsakusaRinne/add_submodule
build: add llama.cpp as submodule.
2024-03-17 18:00:53 +08:00
Rinne e17c8df992
Merge pull request #604 from AsakusaRinne/fix_readme_example
docs: update the example in readme.
2024-03-16 12:08:11 +08:00
Rinne a0dac6293f
docs: update the example in readme. 2024-03-16 12:06:06 +08:00
Rinne 5da0b4616a
build: add llama.cpp as submodule. 2024-03-14 23:27:38 +08:00
xbotter 3f2e5c27ff
🔧 Update package references
- Update Microsoft.KernelMemory.Core to version 0.34.240313.1
- Update Microsoft.SemanticKernel to version 1.6.2
- Update Microsoft.SemanticKernel.Plugins.Memory to version 1.6.2-alpha
- Update Microsoft.KernelMemory.Abstractions to version 0.34.240313.1
- Update Microsoft.SemanticKernel.Abstractions to version 1.6.2
2024-03-14 22:17:59 +08:00
jlsantiago 3b2836eac4
Llava api (#563)
* Add llava_binaries, update all binaries to make the test

* Llava API + LlavaTest

Preliminary

* First prototype of Load + Unit Test

* Temporarily run tests on branch LlavaAPI

* Disable Embed test to review the rest of the test

* Restore Embedding test

* Use BatchThread to eval image embeddings

Test Threads default value to ensure it doesn't produce problems.

* Rename test file

* Update action versions

* Test only one method, no release embeddings

* Revert "Test only one method, no release embeddings"

This reverts commit 264e176dccc9cd0be318b800ae5e102a4635d01c.

* Correct API call

* Only test llava related functionality

* CUDA and CLBlast binaries

* Restore build policy

* Changes related with code review

* Add SafeHandles

* Set overwrite to upload-artifact@v4

* Revert to upload-artifact@v3

* revert to upload-artifact@v3
2024-03-13 22:10:44 +00:00
Martin Evans ce4de7d607
llama_decode lock (#595)
* Added a lock object to `SafeLlamaModelHandle` which all calls to `llama_decode` (in the `SafeLLamaContextHandle`) acquire first. This prevents two contexts from running inference on the same model at the same time, which seems to be unsafe in llama.cpp.

* Modified the lock to be global over _all_ inferences. This seems to be necessary (at least with the CUDA backend).
2024-03-13 00:33:16 +00:00
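The final form of the fix above (one lock global over all inferences) amounts to serializing every decode call on a single process-wide lock; a minimal sketch, with illustrative names:

```python
import threading

# Global over _all_ inferences: every decode, on every context,
# serializes on this one lock, since concurrent decodes appear to be
# unsafe in llama.cpp (at least with the CUDA backend).
_GLOBAL_DECODE_LOCK = threading.Lock()


def llama_decode(context, batch):
    """Stand-in for the native decode call, wrapped in the global lock."""
    with _GLOBAL_DECODE_LOCK:
        # the actual native llama.cpp call would go here
        return "decoded %d tokens in %s" % (len(batch), context)
```

The trade-off is throughput: two contexts can no longer decode in parallel, but inference becomes safe under concurrency.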
Valentin Arthur Thomas 9deb50a2f4
update readme.md backends (#587) 2024-03-11 23:42:01 +00:00
Clovis Henrique Ribeiro d0f79814e9
Added conditional compilation code to progress_callback (in LlamaModelParams struct) so the struct plays nice with legacy NET Framework 4.8 (#593) 2024-03-11 14:36:50 +00:00
Rinne 884641f751
ci: add dependabot.yml 2024-03-11 21:39:40 +08:00
Martin Evans f0b0bbcbb7
Mutable Logits (#586)
Modified LLamaBatch to not share tokens with other sequences if logits is true. This ensures that the logit span at the end is used by exactly one sequence - therefore it's safe to mutate. This removes the need for copying _very_ large arrays (vocab size) and simplifies sampling pipelines.
2024-03-10 13:56:11 +00:00
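The batching rule described above can be sketched as follows (illustrative names; the real LLamaBatch works over native buffers): a token slot is shared between sequences only when its logits are not requested, so any slot with logits belongs to exactly one sequence and its logit span may be mutated in place.

```python
class LLamaBatch:
    """Sketch of the mutable-logits batching rule."""

    def __init__(self):
        # each slot: [token, position, wants_logits, sequence_ids]
        self.slots = []

    def add(self, token, position, sequence, logits):
        if not logits:
            # Safe to share: no sampler will ever mutate this slot's logits.
            for slot in self.slots:
                if slot[0] == token and slot[1] == position and not slot[2]:
                    slot[3].append(sequence)
                    return
        # logits requested: always a fresh slot, owned by one sequence
        self.slots.append([token, position, logits, [sequence]])
```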
Martin Evans a8ba9f05b3
March Binary Update (#565)
* Updated binaries to llama.cpp `3ab8b3a92ede46df88bc5a2dfca3777de4a2b2b6` (build run: https://github.com/SciSharp/LLamaSharp/actions/runs/8118890586)

* Added abort callback

* Added properties to get/set thread count on `LLamaContext`

* Fixed LLamaLogLevel numbering
2024-03-06 15:19:42 +00:00
dependabot[bot] 6f03d5ac5c
build(deps): bump Microsoft.SemanticKernel and Microsoft.SemanticKernel.Abstractions (#572)
Bumps [Microsoft.SemanticKernel](https://github.com/microsoft/semantic-kernel) and [Microsoft.SemanticKernel.Abstractions](https://github.com/microsoft/semantic-kernel). These dependencies needed to be updated together.

Updates `Microsoft.SemanticKernel` from 1.4.0 to 1.5.0
- [Release notes](https://github.com/microsoft/semantic-kernel/releases)
- [Commits](https://github.com/microsoft/semantic-kernel/compare/dotnet-1.4.0...dotnet-1.5.0)

Updates `Microsoft.SemanticKernel.Abstractions` from 1.4.0 to 1.5.0
- [Release notes](https://github.com/microsoft/semantic-kernel/releases)
- [Commits](https://github.com/microsoft/semantic-kernel/compare/dotnet-1.4.0...dotnet-1.5.0)

---
updated-dependencies:
- dependency-name: Microsoft.SemanticKernel
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: Microsoft.SemanticKernel.Abstractions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-03-05 00:58:43 +00:00
Rinne aa9a0a776b
Merge pull request #571 from SciSharp/dependabot/nuget/System.Text.Json-8.0.2
build(deps): bump System.Text.Json from 8.0.1 to 8.0.2
2024-03-04 17:37:37 +08:00
dependabot[bot] 4068a6f03b
build(deps): bump System.Text.Json from 8.0.1 to 8.0.2
Bumps [System.Text.Json](https://github.com/dotnet/runtime) from 8.0.1 to 8.0.2.
- [Release notes](https://github.com/dotnet/runtime/releases)
- [Commits](https://github.com/dotnet/runtime/compare/v8.0.1...v8.0.2)

---
updated-dependencies:
- dependency-name: System.Text.Json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-03-04 07:04:42 +00:00
Martin Evans defac000ad
Added a `%(RecursiveDir)` element to the props file, this causes files to be copied along with the folder structure rather than dumped into the root. (#561) 2024-03-03 17:58:50 +00:00
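The change can be illustrated with a minimal MSBuild item (the `Include` path here is illustrative, not the actual props file content):

```xml
<ItemGroup>
  <!-- %(RecursiveDir) preserves the folder structure below the ** wildcard,
       instead of flattening every file into the output root -->
  <None Include="runtimes\**\*">
    <CopyToOutputDirectory>PreserveNewest</CopyToOutputDirectory>
    <Link>runtimes\%(RecursiveDir)%(Filename)%(Extension)</Link>
  </None>
</ItemGroup>
```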
eublefar e05d5d4e14 Remove resetting-state ops and make SessionState.ExecutorState and SessionState.ContextState non-nullable 2024-03-02 20:07:17 +01:00
eublefar 0763f307ec Example chat session with preprocessing of chat history, and a reset operation that restores the chat to the original point in history without extra processing 2024-03-02 17:27:18 +01:00
eublefar b2f7dbb39b AddPromptAsync method for stateful executors; chat session initialize-from-history and process-system-message methods for pre-processing prompts. Executor state is serialized to JSON, to prevent saved states from being updated by reference. 2024-03-02 17:26:06 +01:00
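The JSON-serialization trick mentioned in the commit above is a standard way to decouple a saved snapshot from live state: round-tripping through JSON produces a deep copy, so later mutations of the live object cannot reach the snapshot. A minimal sketch (function names are illustrative):

```python
import json


def save_state(state):
    """Snapshot state through JSON so the saved copy cannot be
    mutated later via shared references."""
    return json.dumps(state)


def load_state(snapshot):
    """Restore an independent deep copy of the snapshotted state."""
    return json.loads(snapshot)
```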
eublefar 35153a77dd Chat session Get/Load in-memory state operations, reset state ops for stateful executors and context 2024-03-02 14:51:03 +01:00
jlsantiago ef24196444
Build Llava binaries (#556)
* Include llava binaries in the build

* Temporary comment

* Temporary do not remove artifacts

* Update upload version

* Remove artifacts
2024-03-01 13:41:09 +00:00
Martin Evans 8ac1634233
Removed `llama_eval`. It is going to be completely removed in the next version of llama.cpp (#553) 2024-02-28 21:41:39 +00:00
Martin Evans f0e7e7cc0a
Removed `SamplingApi`. it has been marked as Obsolete for a while, replaced by instance methods on `LLamaTokenDataArray` (#552) 2024-02-28 19:30:53 +00:00
Martin Evans a0731db944
Added tests checking that memory is freed properly (#551) 2024-02-28 17:12:24 +00:00
Martin Evans 7d84625a67
Classifier Free Guidance (#536)
* Added a `Guidance` method to `LLamaTokenDataArray` which applies classifier free guidance

* Factored out a safer `llama_sample_apply_guidance` method based on spans

* Created a guided sampling demo using the batched executor

* fixed comment, "classifier free" not "context free"

* Rebased onto master and fixed breakage due to changes in `BaseSamplingPipeline`

* Asking user for guidance weight

* Progress bar in batched fork demo

* Improved fork example (using tree display)

* Added proper disposal of resources in batched examples

* Added some more comments in BatchedExecutorGuidance
2024-02-26 15:41:57 +00:00
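The guidance operation added above can be sketched over plain logit lists. This follows the common classifier-free-guidance formulation (push the conditioned logits away from the guidance logits by a scale factor); the actual LLamaSharp/llama.cpp signatures and exact arithmetic may differ:

```python
def apply_guidance(logits, guidance_logits, scale):
    """Sketch of classifier-free guidance over logit spans.

    scale == 1.0 leaves the logits unchanged; scale > 1.0 amplifies the
    difference between the guided and unguided distributions.
    """
    return [g + scale * (l - g) for l, g in zip(logits, guidance_logits)]
```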
dependabot[bot] 364259aabe
build(deps): bump Microsoft.SemanticKernel from 1.1.0 to 1.4.0 (#544)
Bumps [Microsoft.SemanticKernel](https://github.com/microsoft/semantic-kernel) from 1.1.0 to 1.4.0.
- [Release notes](https://github.com/microsoft/semantic-kernel/releases)
- [Commits](https://github.com/microsoft/semantic-kernel/compare/dotnet-1.1.0...dotnet-1.4.0)

---
updated-dependencies:
- dependency-name: Microsoft.SemanticKernel
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-26 15:21:22 +00:00
dependabot[bot] e50f30d740
build(deps): bump Microsoft.KernelMemory.Core, System.Text.Json and Microsoft.KernelMemory.Abstractions (#546)
Bumps [Microsoft.KernelMemory.Core](https://github.com/microsoft/kernel-memory), [System.Text.Json](https://github.com/dotnet/runtime) and [Microsoft.KernelMemory.Abstractions](https://github.com/microsoft/kernel-memory). These dependencies needed to be updated together.

Updates `Microsoft.KernelMemory.Core` from 0.26.240121.1 to 0.29.240219.2
- [Release notes](https://github.com/microsoft/kernel-memory/releases)
- [Commits](https://github.com/microsoft/kernel-memory/compare/packages-0.26.240121.1...packages-0.29.240219.2)

Updates `System.Text.Json` from 8.0.1 to 8.0.2
- [Release notes](https://github.com/dotnet/runtime/releases)
- [Commits](https://github.com/dotnet/runtime/compare/v8.0.1...v8.0.2)

Updates `Microsoft.KernelMemory.Abstractions` from 0.26.240104.1 to 0.29.240219.3
- [Release notes](https://github.com/microsoft/kernel-memory/releases)
- [Commits](https://github.com/microsoft/kernel-memory/compare/0.26.240104.1...abstractions-0.29.240219.3)

---
updated-dependencies:
- dependency-name: Microsoft.KernelMemory.Core
  dependency-type: direct:production
  update-type: version-update:semver-minor
- dependency-name: System.Text.Json
  dependency-type: direct:production
  update-type: version-update:semver-patch
- dependency-name: Microsoft.KernelMemory.Abstractions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-26 14:58:15 +00:00
dependabot[bot] 33827a1ba8
build(deps): bump Microsoft.SemanticKernel.Abstractions (#542)
Bumps [Microsoft.SemanticKernel.Abstractions](https://github.com/microsoft/semantic-kernel) from 1.1.0 to 1.4.0.
- [Release notes](https://github.com/microsoft/semantic-kernel/releases)
- [Commits](https://github.com/microsoft/semantic-kernel/compare/dotnet-1.1.0...dotnet-1.4.0)

---
updated-dependencies:
- dependency-name: Microsoft.SemanticKernel.Abstractions
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-26 14:09:11 +00:00
dependabot[bot] 070969b23f
build(deps): bump coverlet.collector from 6.0.0 to 6.0.1 (#540)
Bumps [coverlet.collector](https://github.com/coverlet-coverage/coverlet) from 6.0.0 to 6.0.1.
- [Release notes](https://github.com/coverlet-coverage/coverlet/releases)
- [Commits](https://github.com/coverlet-coverage/coverlet/compare/v6.0.0...v6.0.1)

---
updated-dependencies:
- dependency-name: coverlet.collector
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-02-26 14:08:27 +00:00