Commit Graph

579 Commits

Author SHA1 Message Date
Martin Evans 05100184f4
Merge pull request #719 from martindevans/remove-batched-conversation-prompt-with-string
Remove `Conversation.Prompt(String)`
2024-05-06 16:16:02 +01:00
Martin Evans 3ba49754b1 Removed (marked as obsolete) prompting with a string for `Conversation`. Tokenization requires extra parameters (e.g. addBos, special) which require special consideration. For now it's better to tokenize using other tools and pass the tokens directly. 2024-05-06 15:53:21 +01:00
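With the string overload gone, callers tokenize first and then prompt with tokens. A minimal sketch of that pattern, assuming the executor's context exposes `Tokenize(text, addBos, special)` and `Conversation` keeps a token-based `Prompt` overload (exact signatures may differ):

```csharp
using LLama;
using LLama.Batched;

// Hypothetical helper showing the tokenize-then-prompt pattern.
static void PromptText(BatchedExecutor executor, Conversation conversation, string text)
{
    // addBos/special become explicit, deliberate choices instead of
    // hidden defaults inside a Prompt(string) overload.
    var tokens = executor.Context.Tokenize(text, addBos: true, special: false);
    conversation.Prompt(tokens);
}
```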
ksanchez 0bbbf171ed Refactor executors 2024-05-02 23:30:16 -06:00
ksanchez 46a9d603f4 Add method to get BOS token. 2024-05-02 23:29:33 -06:00
ksanchez 61d143d8d8 Implement context shifting in executor base 2024-05-01 22:39:12 -06:00
Norne9 ad9bf1cbba
InitializeSessionFromHistoryAsync changed
ChatSession.InitializeSessionFromHistoryAsync now accepts IHistoryTransform as an optional parameter.
2024-04-30 02:32:14 +03:00
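A hedged sketch of the changed signature; the executor and the built-in `LLamaTransforms.DefaultHistoryTransform` used here are illustrative assumptions, not taken from the commit:

```csharp
using LLama;
using LLama.Common;

// 'executor' is any ILLamaExecutor created elsewhere; it is assumed here.
var history = new ChatHistory();
history.AddMessage(AuthorRole.System, "You are a helpful assistant.");

// The IHistoryTransform argument is optional; omit it to keep the
// previous default behaviour.
var session = await ChatSession.InitializeSessionFromHistoryAsync(
    executor,
    history,
    new LLamaTransforms.DefaultHistoryTransform());
```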
Rinne 495177fd0f fix: typos. 2024-04-29 18:19:20 +08:00
Rinne 98909dc2af
Merge pull request #708 from AsakusaRinne/llama3_support
Add LLaMA3 chat session example.
2024-04-29 10:36:19 +08:00
Rinne 175b25d4f7
Add LLaMA3 chat session example. 2024-04-29 04:12:19 +08:00
Martin Evans 377ebf3664 - Added `LoadFromFileAsync` method for `LLavaWeights`
- Fixed checking for invalid handles in `clip_model_load`
2024-04-27 23:31:07 +01:00
Martin Evans 1ec0fee5ba Added optional `IProgress` parameter to `LoadFromFileAsync` 2024-04-27 15:04:54 +01:00
Martin Evans 9867b4c85d Only setting callback if the token can be cancelled. 2024-04-27 02:55:35 +01:00
Martin Evans 00df7c1516 - Added `LLamaWeights.LoadFromFileAsync`.
- Async loading supports cancellation through a `CancellationToken`. If loading is cancelled an `OperationCanceledException` is thrown. If it fails for another reason a `LoadWeightsFailedException` is thrown.
 - Updated examples to use `LoadFromFileAsync`
2024-04-27 02:52:41 +01:00
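Combined with the optional `IProgress` parameter from the entry above, async loading looks roughly like this (a sketch assuming a `LoadFromFileAsync(params, token, progress)` ordering, which may differ):

```csharp
// 'modelParams' is a model-parameters instance assumed to exist already.
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));
var progress = new Progress<float>(p => Console.WriteLine($"loading: {p:P0}"));

try
{
    using var weights = await LLamaWeights.LoadFromFileAsync(modelParams, cts.Token, progress);
    // ... build a context/executor from the loaded weights
}
catch (OperationCanceledException)
{
    // The CancellationToken fired before loading finished.
}
catch (LoadWeightsFailedException)
{
    // Loading failed for some reason other than cancellation.
}
```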
Martin Evans 18586cc43b
Merge pull request #696 from martindevans/safe_handle_constructor_refactor
Removed Unnecessary Constructor From Safe Handles
2024-04-26 16:14:42 +01:00
Martin Evans e9fd7f96e0
Merge pull request #691 from martindevans/empty_batch_check
Empty batch check
2024-04-26 16:14:28 +01:00
Martin Evans a2f8573831
Merge pull request #698 from martindevans/slightly_safer_quantize_params
Slightly Safer Quantize Params
2024-04-26 13:53:55 +01:00
Martin Evans d4f793a7eb Using `is` check instead of `== null` 2024-04-26 13:53:04 +01:00
Martin Evans ecb359c9e7
- Using more specific `LoadWeightsFailedException` when a llava model fails to load (#697)
- Passing model path, instead of a message, to `LoadWeightsFailedException` constructor
2024-04-26 13:39:09 +01:00
Martin Evans 58ec798bff Modified `llama_model_quantize` to accept argument by `ref` instead of pointer. 2024-04-26 01:35:13 +01:00
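Passing by `ref` lets the marshaller take the struct's address, so callers no longer juggle an unsafe pointer. A hedged sketch of the P/Invoke shape this implies (the entry point follows llama.cpp conventions; the struct name is an assumption, not copied from the diff):

```csharp
using System.Runtime.InteropServices;

// Passing the struct by ref lets the marshaller take its address, so the
// caller no longer builds and pins an unsafe pointer.
[DllImport("llama", CallingConvention = CallingConvention.Cdecl)]
static extern uint llama_model_quantize(
    [MarshalAs(UnmanagedType.LPStr)] string fname_inp,
    [MarshalAs(UnmanagedType.LPStr)] string fname_out,
    ref LLamaModelQuantizeParams param);
```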
Martin Evans 54dab273cd - Removed unnecessary constructors from safe handles
- Returning SafeLLamaGrammarHandle directly from `llama_grammar_init` and `llama_grammar_copy`
2024-04-26 01:03:26 +01:00
Martin Evans 25812762c9 Added checks in `Decode` to skip doing anything if the batch is empty. 2024-04-24 14:54:02 +01:00
Martin Evans ccc49eb1e0
BatchedExecutor Save/Load (#681)
* Added the ability to save and load individual conversations in a batched executor.
 - New example
 - Added `BatchedExecutor.Load(filepath)` method
 - Added `Conversation.Save(filepath)` method
 - Added new (currently internal) `SaveState`/`LoadState` methods in LLamaContext which can stash some extra binary data in the header

* Added ability to save/load a `Conversation` to an in-memory state, instead of to file.

* Moved the new save/load methods out to an extension class specifically for the batched executor.

* Removed unnecessary spaces
2024-04-23 15:46:56 +01:00
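A minimal sketch of the round trip these methods enable, assuming `Save` persists one conversation's state and `Load` rehydrates it as a fresh `Conversation` on an executor (the return type is inferred from the description above, not checked):

```csharp
// 'executor', 'conversation' and 'moreTokens' are assumed to exist already.
// Persist a single conversation out of the batch...
conversation.Save("conversation.state");

// ...then later, possibly after a restart, bring it back and continue.
Conversation restored = executor.Load("conversation.state");
restored.Prompt(moreTokens);
```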
Lyrcaxis f01c13ee54
Made special tokens included in prompts tokenize as intended (#677) 2024-04-20 15:23:55 +01:00
Martin Evans 3c76440957 - Added tests for generating embeddings with generative model and embedding model
- Rewritten native API methods for embeddings to return pointers; null is a valid value for these methods to return, so `Span` is not appropriate
2024-04-19 16:30:32 +01:00
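The pointer-returning shape exists because null is meaningful; a hedged caller-side sketch (the entry point name follows llama.cpp, and `embeddingCount` is a placeholder for however the caller knows the embedding size):

```csharp
using LLama.Native;

// Hypothetical caller illustrating why these methods return pointers:
// null is a valid result, which a Span<float> cannot represent.
static unsafe float[]? TryGetEmbeddings(SafeLLamaContextHandle ctx, int embeddingCount)
{
    float* embeddings = NativeApi.llama_get_embeddings(ctx);
    if (embeddings == null)
        return null;

    return new Span<float>(embeddings, embeddingCount).ToArray();
}
```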
Zoli Somogyi 89217f73ca
Embeddings correction (#674)
* Embeddings correction
2024-04-19 16:23:44 +01:00
Martin Evans c325ac9127
April 2024 Binary Update (#662)
* Updated binaries, using [this build](https://github.com/SciSharp/LLamaSharp/actions/runs/8654672719/job/23733195669) for llama.cpp commit `f7001ccc5aa359fcf41bba19d1c99c3d25c9bcc7`.

 - Added all new functions.
 - Moved some functions (e.g. `SafeLlamaModelHandle` specific functions) into `SafeLlamaModelHandle.cs`
 - Exposed tokens on `SafeLlamaModelHandle` and `LLamaWeights` through a `Tokens` property. As new special tokens are added in the future they can be added here.
 - Changed all token properties to return nullable tokens, to handle some models not having some tokens.
 - Fixed `DefaultSamplingPipeline` to handle no newline token in some models.

* Moved native methods to more specific locations.

 - Context-specific things have been moved into `SafeLLamaContextHandle.cs` and made private; they're exposed through C# properties and methods already.
 - Checking that GPU layer count is zero if GPU offload is not supported.
 - Moved methods for creating default structs (`llama_model_quantize_default_params` and `llama_context_default_params`) into relevant structs.

* Removed exception if `GpuLayerCount > 0` when GPU is not supported.

* - Added low level wrapper methods for new per-sequence state load/save in `SafeLLamaContextHandle`
 - Added high level wrapper methods (save/load with `State` object or memory mapped file) in `LLamaContext`
 - Moved native methods for per-sequence state load/save into `SafeLLamaContextHandle`

* Added update and defrag methods for KV cache in `SafeLLamaContextHandle`

* Updated submodule to `f7001ccc5aa359fcf41bba19d1c99c3d25c9bcc7`

* Passing the sequence ID when saving a single sequence state
2024-04-16 23:19:47 +01:00
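A hedged sketch of the per-sequence wrappers named in the last two bullets; the member and parameter names follow the commit text and may not match the final API exactly:

```csharp
// Save a single sequence's state to a file, then restore it into the
// same (or another) context later; 'context' is an existing LLamaContext.
context.SaveState("seq0.state", sequence: (LLamaSeqId)0);
context.LoadState("seq0.state", sequence: (LLamaSeqId)0);
```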
SignalRT 168f697db6 Clean up and align documentation with the changes in the interface 2024-04-13 16:34:32 +02:00
SignalRT d6890e4ec4 Initial approach to clear images 2024-04-13 11:33:41 +02:00
Zoli Somogyi f4fad825c7 Simplifying image handling 2024-04-08 16:10:54 +02:00
Zoli Somogyi 44a82b0f3f Download image implementation 2024-04-08 10:06:04 +02:00
Zoli Somogyi e991e631f9 Standardizing Image Data implementation 2024-04-07 19:47:39 +02:00
Zoli Somogyi d3c5a42040 Extend LLava with in-memory images 2024-04-06 17:22:29 +02:00
Rinne 544a38d3bd Merge branch 'master' of github.com:AsakusaRinne/LLamaSharp into release_0.11.2 2024-04-06 14:22:00 +08:00
Rinne 4640c6af04 release: update release info of packages. 2024-04-06 14:20:36 +08:00
Rinne 045850819e
Merge pull request #647 from AsakusaRinne/fix_llava_backend
fix: add cuda llava native libraries.
2024-04-06 14:19:51 +08:00
Martin Evans 58107bb5b9
Logging interceptor (#649)
* - Added `NativeLogConfig` which allows overriding the llama.cpp log callback
 - Delaying binding of this into llama.cpp until after `NativeLibraryConfig` has loaded

* Using the log callback to show log messages produced during loading.

* Registering log callbacks before any calls to llama.cpp except `llama_empty_call`; this method is specifically selected because it does nothing and exists just to trigger DLL loading.

* - Removed much of the complexity of logging from `NativeApi.Load`. It always calls whatever log callbacks you have registered.
 - Removed the alternative path for `ILogger` in NativeLibraryConfig; instead it redirects by wrapping the logger in a delegate.

* Saving a GC handle to keep the log callback alive

* Removed the prefix; the logger should already do that.

* Buffering up messages until a newline is encountered before passing the log message to ILogger.

* - Added trailing `\n` to log messages from loading.
 - Using `ThreadLocal<StringBuilder>` to ensure messages from separate threads don't get mixed together.
2024-04-05 16:42:27 +01:00
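A sketch of wiring the interceptor to `Microsoft.Extensions.Logging`, assuming `NativeLogConfig` exposes an `ILogger` overload as the bullets above describe (the method name mirrors the native `llama_log_set` and is an assumption here):

```csharp
using LLama.Native;
using Microsoft.Extensions.Logging;

using var factory = LoggerFactory.Create(builder => builder.AddConsole());

// Route llama.cpp log output through ILogger; register before any other
// native call so early loading messages are not lost.
NativeLogConfig.llama_log_set(factory.CreateLogger("llama.cpp"));
```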
Rinne ec8f832365
fix: add cuda llava native libraries. 2024-04-04 00:47:33 +08:00
liuyaohui.lyh f7bd458341 fix: llava backend ignores avx and cuda. 2024-04-02 10:46:49 +08:00
Rinne 4038a39843
Merge pull request #637 from SciSharp/dependabot/nuget/Microsoft.Extensions.Logging.Abstractions-8.0.1
build(deps): bump Microsoft.Extensions.Logging.Abstractions from 8.0.0 to 8.0.1
2024-04-01 16:16:45 +08:00
dependabot[bot] 1bfb900fbe
build(deps): bump Microsoft.Extensions.Logging.Abstractions
Bumps [Microsoft.Extensions.Logging.Abstractions](https://github.com/dotnet/runtime) from 8.0.0 to 8.0.1.
- [Release notes](https://github.com/dotnet/runtime/releases)
- [Commits](https://github.com/dotnet/runtime/compare/v8.0.0...v8.0.1)

---
updated-dependencies:
- dependency-name: Microsoft.Extensions.Logging.Abstractions
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-01 06:38:51 +00:00
dependabot[bot] 1d163352a0
build(deps): bump System.Text.Json from 8.0.2 to 8.0.3
Bumps [System.Text.Json](https://github.com/dotnet/runtime) from 8.0.2 to 8.0.3.
- [Release notes](https://github.com/dotnet/runtime/releases)
- [Commits](https://github.com/dotnet/runtime/compare/v8.0.2...v8.0.3)

---
updated-dependencies:
- dependency-name: System.Text.Json
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2024-04-01 06:33:06 +00:00
Rinne 3bc952cf60
Merge pull request #633 from AsakusaRinne/doc_ci
fix: missing llava_shared library.
2024-04-01 04:44:15 +08:00
Rinne b941540aaf
fix errors in nuspecs. 2024-04-01 04:41:51 +08:00
Rinne d104d9a85b
fix the missing llava_shared library. 2024-04-01 04:29:00 +08:00
Rinne b4317eebbe
Merge pull request #632 from AsakusaRinne/master
Release version 0.11.0
2024-04-01 02:59:44 +08:00
Rinne d67658a0d6
docs: update the information to v0.11.0. 2024-04-01 01:38:40 +08:00
evolcano 353412923f Merge branch 'master' of https://github.com/SciSharp/LLamaSharp 2024-03-30 10:55:42 +08:00
evolcano 9d091c0316 Add path to find llama.dll for MAUI
This commit was originally made by lcarrere in https://github.com/SciSharp/LLamaSharp/issues/180.

I have confirmed this modification works on my Windows 11 laptop and made this commit at the request of AsakusaRinne.
2024-03-30 10:54:44 +08:00
SignalRT 43677c511c Change interface to support multiple images and add the capability to render the image in the console 2024-03-26 23:19:32 +01:00
SignalRT 2d9a114f66 Include comments and add some checks 2024-03-26 23:19:32 +01:00