Commit Graph

1331 Commits

Author SHA1 Message Date
Martin Evans 3d76ef7b6a
Rewrote some example docs, explaining what these examples show instead of just showing the source code. (#728) 2024-05-11 11:00:45 +08:00
Martin Evans 3b0b2ab224
Merge pull request #721 from martindevans/kv_cache_view
Make `LLamaKvCacheView` Safe
2024-05-10 15:19:36 +01:00
Martin Evans 44bd5b311e
Merge pull request #715 from martindevans/llama-templater
Llama Text Templater
2024-05-10 15:10:22 +01:00
Martin Evans b326624ade Split template out to a field, so it can be changed more easily. 2024-05-10 00:23:57 +01:00
Martin Evans b25f93b86d
Merge pull request #725 from martindevans/fix_cublas_git_ref
Fix cublas build action
2024-05-09 14:07:50 +01:00
Martin Evans c585eb5b25 Fixed cublas action always compiling `master` instead of the specific commit 2024-05-09 14:05:54 +01:00
Rinne d509105da7
ci: add windows benchmark test. (#723)
* ci: add windows benchmark test.
2024-05-09 03:31:54 +08:00
Rinne 6f9097f25b
ci: add benchmark test. (#720)
* ci: add benchmark test.
2024-05-08 23:39:49 +08:00
Martin Evans 2117287da9 Expanded the `LLamaKvCacheView` to make it usable without unsafe.
- Checking indices
- Returning span of correct length
- Hiding native methods
- Hiding native types
2024-05-07 13:16:13 +01:00
Martin Evans 4332ab3813 Changes based on review feedback:
- Returning template for chaining method calls
- Returning a `TextMessage` object instead of a tuple
2024-05-06 23:43:45 +01:00
Martin Evans a0335f67a4 - Added `LLamaTemplate` which efficiently formats a series of messages according to the model template.
- Fixed `llama_chat_apply_template` method (wrong entrypoint, couldn't handle null model)
2024-05-06 23:43:45 +01:00
Martin Evans 05100184f4
Merge pull request #719 from martindevans/remove-batched-conversation-prompt-with-string
Remove `Conversation.Prompt(String)`
2024-05-06 16:16:02 +01:00
Martin Evans 3ba49754b1 Removed (marked as obsolete) prompting with a string for `Conversation`. Tokenization requires extra parameters (e.g. addBos, special) which require special consideration. For now it's better to tokenize using other tools and pass the tokens directly. 2024-05-06 15:53:21 +01:00
Martin Evans 9906871f84
Merge pull request #714 from ksanman/infinite-context
Implement context shifting in executor base
2024-05-04 22:17:15 +01:00
ksanchez 0bbbf171ed Refactor executors 2024-05-02 23:30:16 -06:00
ksanchez 46a9d603f4 Add method to get BOS token. 2024-05-02 23:29:33 -06:00
ksanchez 61d143d8d8 Implement context shifting in executor base 2024-05-01 22:39:12 -06:00
Rinne 6bf010d719
Merge pull request #689 from zsogitbe/master
SemanticKernel: Correcting non-standard way of working with PromptExecutionSettings
2024-05-01 01:52:43 +08:00
Zoli Somogyi 54c01d4c2c Making old code obsolete - SemanticKernel: Correcting non-standard way of working with PromptExecutionSettings 2024-04-30 19:28:31 +02:00
Rinne 0c770a528e
Merge pull request #671 from kidkych/feature/interactive-sk-chatcompletion
Optimize Semantic Kernel LLamaSharpChatCompletion when running with StatefulExecutorBase models
2024-05-01 01:02:25 +08:00
Rinne 16141adcb0
Merge pull request #711 from Norne9/master
Optional IHistoryTransform added to ChatSession.InitializeSessionFromHistoryAsync
2024-05-01 01:00:02 +08:00
Rinne 7b03e735bb
Merge pull request #709 from AsakusaRinne/format_check_ci 2024-04-30 12:03:42 +08:00
Norne9 5c60e6d4ca
Merge pull request #1 from Norne9/Norne9-patch-chat-session
Optional IHistoryTransform added to ChatSession.InitializeSessionFromHistoryAsync
2024-04-30 02:39:07 +03:00
Norne9 ad9bf1cbba
InitializeSessionFromHistoryAsync changed
ChatSession.InitializeSessionFromHistoryAsync now accepts IHistoryTransform as an optional parameter.
2024-04-30 02:32:14 +03:00
Rinne 33d5677c0e Add editorconfig file for code format. 2024-04-30 00:00:35 +08:00
Rinne f44c8846f5
Merge pull request #710 from AsakusaRinne/typo_check_ci
ci: add workflow to check the spelling.
2024-04-29 23:31:52 +08:00
Rinne 495177fd0f fix: typos. 2024-04-29 18:19:20 +08:00
Rinne de31a06a4a ci: add workflow to check the spelling. 2024-04-29 18:07:13 +08:00
Rinne 98909dc2af
Merge pull request #708 from AsakusaRinne/llama3_support
Add LLaMA3 chat session example.
2024-04-29 10:36:19 +08:00
Martin Evans 4c078a757c
Merge pull request #703 from martindevans/llava_async_load
LLava Async Loading
2024-04-28 22:38:21 +01:00
Rinne 175b25d4f7
Add LLaMA3 chat session example. 2024-04-29 04:12:19 +08:00
Martin Evans 377ebf3664 - Added `LoadFromFileAsync` method for `LLavaWeights`
- Fixed checking for invalid handles in `clip_model_load`
2024-04-27 23:31:07 +01:00
Martin Evans 84bb5a36ab
Merge pull request #702 from martindevans/interruptible_async_model_load
Interruptible Async Model Loading With Progress Monitoring
2024-04-27 16:06:40 +01:00
Martin Evans 1ec0fee5ba Added optional `IProgress` parameter to `LoadFromFileAsync` 2024-04-27 15:04:54 +01:00
Zoli Somogyi 2aa96b206f Adding Response Format - Correcting non-standard way of working with PromptExecutionSettings
The response format can be used downstream to post-process the messages based on the requested format.
2024-04-27 09:39:40 +02:00
Martin Evans 9867b4c85d Only setting callback if the token can be cancelled. 2024-04-27 02:55:35 +01:00
Martin Evans 00df7c1516 - Added `LLamaWeights.LoadFromFileAsync`.
- Async loading supports cancellation through a `CancellationToken`. If loading is cancelled an `OperationCanceledException` is thrown. If it fails for another reason a `LoadWeightsFailedException` is thrown.
- Updated examples to use `LoadFromFileAsync`
2024-04-27 02:52:41 +01:00
Rinne b47ed9258f
Merge pull request #701 from AsakusaRinne/add_issue_template
Fix typo in issue templates.
2024-04-27 03:59:27 +08:00
Rinne bcf3ef1e40
Fix typo in issue templates. 2024-04-27 03:58:45 +08:00
Rinne c6565c3aaf
Merge pull request #700 from AsakusaRinne/add_issue_template
Add issue templates.
2024-04-27 03:56:58 +08:00
Rinne d56eb1a5ad
Add issue templates. 2024-04-27 03:38:20 +08:00
Martin Evans 18586cc43b
Merge pull request #696 from martindevans/safe_handle_constructor_refactor
Removed Unnecessary Constructor From Safe Handles
2024-04-26 16:14:42 +01:00
Martin Evans e9fd7f96e0
Merge pull request #691 from martindevans/empty_batch_check
Empty batch check
2024-04-26 16:14:28 +01:00
Martin Evans a2f8573831
Merge pull request #698 from martindevans/slightly_safer_quantize_params
Slightly Safer Quantize Params
2024-04-26 13:53:55 +01:00
Martin Evans d4f793a7eb Using `is` check instead of `== null` 2024-04-26 13:53:04 +01:00
Martin Evans ecb359c9e7
- Using more specific `LoadWeightsFailedException` when a llava model fails to load (#697)
- Passing model path, instead of a message, to `LoadWeightsFailedException` constructor
2024-04-26 13:39:09 +01:00
Martin Evans 58ec798bff Modified `llama_model_quantize` to accept argument by `ref` instead of pointer. 2024-04-26 01:35:13 +01:00
Martin Evans 54dab273cd - Removed unnecessary constructors from safe handles
- Returning SafeLLamaGrammarHandle directly from `llama_grammar_init` and `llama_grammar_copy`
2024-04-26 01:03:26 +01:00
Martin Evans 25812762c9 Added checks in `Decode` to skip doing anything if the batch is empty. 2024-04-24 14:54:02 +01:00
Zoli Somogyi 59a0afdb77 Renaming files to correspond to class names 2024-04-24 08:24:02 +02:00