Martin Evans
3d76ef7b6a
Rewrote some example docs, explaining what the examples show instead of just showing the source code. (#728)
2024-05-11 11:00:45 +08:00
Martin Evans
3b0b2ab224
Merge pull request #721 from martindevans/kv_cache_view
...
Make `LLamaKvCacheView` Safe
2024-05-10 15:19:36 +01:00
Martin Evans
44bd5b311e
Merge pull request #715 from martindevans/llama-templater
...
Llama Text Templater
2024-05-10 15:10:22 +01:00
Martin Evans
b326624ade
Split template out to a field, so it can be changed more easily.
2024-05-10 00:23:57 +01:00
Martin Evans
b25f93b86d
Merge pull request #725 from martindevans/fix_cublas_git_ref
...
Fix cublas build action
2024-05-09 14:07:50 +01:00
Martin Evans
c585eb5b25
Fixed cublas action always compiling `master` instead of the specific commit
2024-05-09 14:05:54 +01:00
Rinne
d509105da7
ci: add Windows benchmark test. (#723)
...
* ci: add Windows benchmark test.
2024-05-09 03:31:54 +08:00
Rinne
6f9097f25b
ci: add benchmark test. (#720)
...
* ci: add benchmark test.
2024-05-08 23:39:49 +08:00
Martin Evans
2117287da9
Expanded the `LLamaKvCacheView` to make it usable without unsafe.
...
- Checking indices
- Returning span of correct length
- Hiding native methods
- Hiding native types
2024-05-07 13:16:13 +01:00
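The entry above lists the pattern used to make `LLamaKvCacheView` safe: index checking, spans trimmed to the valid length, and hidden native methods and types. A minimal illustrative sketch of that wrapping pattern (not the actual LLamaSharp implementation; the type and member names here are invented for illustration):

```csharp
using System;

// Illustrative only: shows how a native buffer can be wrapped so callers
// never need `unsafe` code, mirroring the bullet points in the commit.
public sealed class SafeViewSketch
{
    private readonly int[] _cells;  // stands in for the native buffer
    private readonly int _count;    // number of valid entries in the buffer

    public SafeViewSketch(int[] cells, int count) => (_cells, _count) = (cells, count);

    // Checking indices before touching the underlying storage.
    public int this[int i] => i >= 0 && i < _count
        ? _cells[i]
        : throw new ArgumentOutOfRangeException(nameof(i));

    // Returning a span of the correct length, never the raw buffer.
    public ReadOnlySpan<int> Cells => _cells.AsSpan(0, _count);
}
```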
Martin Evans
4332ab3813
Changes based on review feedback:
...
- Returning template for chaining method calls
- Returning a `TextMessage` object instead of a tuple
2024-05-06 23:43:45 +01:00
Martin Evans
a0335f67a4
- Added `LLamaTemplate` which efficiently formats a series of messages according to the model template.
...
- Fixed `llama_chat_apply_template` method (wrong entrypoint, couldn't handle null model)
2024-05-06 23:43:45 +01:00
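The two entries above describe the `LLamaTemplate` API: messages are added and then formatted according to the model's chat template, with methods returning the template for chaining and messages represented as `TextMessage` objects. A rough usage sketch based only on those descriptions (the exact method names, such as `Add` and the final formatting call, are assumptions rather than confirmed API):

```csharp
using LLama;

// Hypothetical sketch: `model` is assumed to be already-loaded weights.
// `Add` returns the template so calls can be chained, per the review
// feedback noted in the commit above.
var template = new LLamaTemplate(model)
    .Add("system", "You are a helpful assistant.")
    .Add("user", "Hello!");

// Format the accumulated messages according to the model's template.
var prompt = template.ToString();
```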
Martin Evans
05100184f4
Merge pull request #719 from martindevans/remove-batched-conversation-prompt-with-string
...
Remove `Conversation.Prompt(String)`
2024-05-06 16:16:02 +01:00
Martin Evans
3ba49754b1
Removed (marked as obsolete) prompting with a string for `Conversation`. Tokenization requires extra parameters (e.g. addBos, special) which require special consideration. For now it's better to tokenize using other tools and pass the tokens directly.
2024-05-06 15:53:21 +01:00
Martin Evans
9906871f84
Merge pull request #714 from ksanman/infinite-context
...
Implement context shifting in executor base
2024-05-04 22:17:15 +01:00
ksanchez
0bbbf171ed
Refactor executors
2024-05-02 23:30:16 -06:00
ksanchez
46a9d603f4
Add method to get BOS token.
2024-05-02 23:29:33 -06:00
ksanchez
61d143d8d8
Implement context shifting in executor base
2024-05-01 22:39:12 -06:00
Rinne
6bf010d719
Merge pull request #689 from zsogitbe/master
...
SemanticKernel: Correcting non-standard way of working with PromptExecutionSettings
2024-05-01 01:52:43 +08:00
Zoli Somogyi
54c01d4c2c
Making old code obsolete - SemanticKernel: correcting the non-standard way of working with PromptExecutionSettings
2024-04-30 19:28:31 +02:00
Rinne
0c770a528e
Merge pull request #671 from kidkych/feature/interactive-sk-chatcompletion
...
Optimize Semantic Kernel LLamaSharpChatCompletion when running with StatefulExecutorBase models
2024-05-01 01:02:25 +08:00
Rinne
16141adcb0
Merge pull request #711 from Norne9/master
...
Optional IHistoryTransform added to ChatSession.InitializeSessionFromHistoryAsync
2024-05-01 01:00:02 +08:00
Rinne
7b03e735bb
Merge pull request #709 from AsakusaRinne/format_check_ci
2024-04-30 12:03:42 +08:00
Norne9
5c60e6d4ca
Merge pull request #1 from Norne9/Norne9-patch-chat-session
...
Optional IHistoryTransform added to ChatSession.InitializeSessionFromHistoryAsync
2024-04-30 02:39:07 +03:00
Norne9
ad9bf1cbba
InitializeSessionFromHistoryAsync changed
...
ChatSession.InitializeSessionFromHistoryAsync now accepts IHistoryTransform as an optional parameter.
2024-04-30 02:32:14 +03:00
Rinne
33d5677c0e
Add editorconfig file for code format.
2024-04-30 00:00:35 +08:00
Rinne
f44c8846f5
Merge pull request #710 from AsakusaRinne/typo_check_ci
...
ci: add workflow to check the spelling.
2024-04-29 23:31:52 +08:00
Rinne
495177fd0f
fix: typos.
2024-04-29 18:19:20 +08:00
Rinne
de31a06a4a
ci: add workflow to check the spelling.
2024-04-29 18:07:13 +08:00
Rinne
98909dc2af
Merge pull request #708 from AsakusaRinne/llama3_support
...
Add LLaMA3 chat session example.
2024-04-29 10:36:19 +08:00
Martin Evans
4c078a757c
Merge pull request #703 from martindevans/llava_async_load
...
LLava Async Loading
2024-04-28 22:38:21 +01:00
Rinne
175b25d4f7
Add LLaMA3 chat session example.
2024-04-29 04:12:19 +08:00
Martin Evans
377ebf3664
- Added `LoadFromFileAsync` method for `LLavaWeights`
...
- Fixed checking for invalid handles in `clip_model_load`
2024-04-27 23:31:07 +01:00
Martin Evans
84bb5a36ab
Merge pull request #702 from martindevans/interruptible_async_model_load
...
Interruptible Async Model Loading With Progress Monitoring
2024-04-27 16:06:40 +01:00
Martin Evans
1ec0fee5ba
Added optional `IProgress` parameter to `LoadFromFileAsync`
2024-04-27 15:04:54 +01:00
Zoli Somogyi
2aa96b206f
Adding Response Format - Correcting non-standard way of working with PromptExecutionSettings
...
The response format can be used downstream to post-process the messages based on the requested format.
2024-04-27 09:39:40 +02:00
Martin Evans
9867b4c85d
Only setting the callback if the token can be cancelled.
2024-04-27 02:55:35 +01:00
Martin Evans
00df7c1516
- Added `LLamaWeights.LoadFromFileAsync`.
...
- Async loading supports cancellation through a `CancellationToken`. If loading is cancelled an `OperationCanceledException` is thrown. If it fails for another reason a `LoadWeightsFailedException` is thrown.
- Updated examples to use `LoadFromFileAsync`
2024-04-27 02:52:41 +01:00
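The entry above spells out the async loading contract: cancellation via a `CancellationToken` throws `OperationCanceledException`, any other failure throws `LoadWeightsFailedException`, and an optional `IProgress` parameter (added in the earlier commit) reports progress. A hedged usage sketch, assuming the parameter order and a `ModelParams`-style argument (both unconfirmed here):

```csharp
using System;
using System.Threading;
using LLama;
using LLama.Common;

// Sketch only: parameter order and types are assumptions based on the
// commit descriptions, not a confirmed signature.
var cts = new CancellationTokenSource(TimeSpan.FromMinutes(5));
var progress = new Progress<float>(p => Console.WriteLine($"Loading: {p:P0}"));

try
{
    var weights = await LLamaWeights.LoadFromFileAsync(
        new ModelParams("model.gguf"), cts.Token, progress);
}
catch (OperationCanceledException)
{
    // Loading was cancelled through the token, per the commit above.
}
catch (LoadWeightsFailedException)
{
    // Loading failed for some other reason.
}
```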
Rinne
b47ed9258f
Merge pull request #701 from AsakusaRinne/add_issue_template
...
Fix typo in issue templates.
2024-04-27 03:59:27 +08:00
Rinne
bcf3ef1e40
Fix typo in issue templates.
2024-04-27 03:58:45 +08:00
Rinne
c6565c3aaf
Merge pull request #700 from AsakusaRinne/add_issue_template
...
Add issue templates.
2024-04-27 03:56:58 +08:00
Rinne
d56eb1a5ad
Add issue templates.
2024-04-27 03:38:20 +08:00
Martin Evans
18586cc43b
Merge pull request #696 from martindevans/safe_handle_constructor_refactor
...
Removed Unnecessary Constructor From Safe Handles
2024-04-26 16:14:42 +01:00
Martin Evans
e9fd7f96e0
Merge pull request #691 from martindevans/empty_batch_check
...
Empty batch check
2024-04-26 16:14:28 +01:00
Martin Evans
a2f8573831
Merge pull request #698 from martindevans/slightly_safer_quantize_params
...
Slightly Safer Quantize Params
2024-04-26 13:53:55 +01:00
Martin Evans
d4f793a7eb
Using `is` check instead of `== null`
2024-04-26 13:53:04 +01:00
Martin Evans
ecb359c9e7
- Using more specific `LoadWeightsFailedException` when a llava model fails to load (#697)
...
- Passing model path, instead of a message, to `LoadWeightsFailedException` constructor
2024-04-26 13:39:09 +01:00
Martin Evans
58ec798bff
Modified `llama_model_quantize` to accept argument by `ref` instead of pointer.
2024-04-26 01:35:13 +01:00
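The change above swaps a raw pointer parameter for a `ref` parameter in the `llama_model_quantize` binding. A sketch of what that kind of change looks like in a P/Invoke declaration (the marshalling details and struct name here are illustrative assumptions, not the exact LLamaSharp code):

```csharp
using System.Runtime.InteropServices;

// Before: struct passed as a raw pointer, forcing callers into `unsafe`.
[DllImport("llama")]
static extern unsafe uint llama_model_quantize_unsafe(
    string fname_inp, string fname_out, LLamaModelQuantizeParams* param);

// After: the same native entrypoint taking the struct by `ref`, so the
// runtime pins and passes the address and callers stay in safe code.
[DllImport("llama", EntryPoint = "llama_model_quantize")]
static extern uint llama_model_quantize(
    string fname_inp, string fname_out, ref LLamaModelQuantizeParams param);
```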
Martin Evans
54dab273cd
- Removed unnecessary constructors from safe handles
...
- Returning SafeLLamaGrammarHandle directly from `llama_grammar_init` and `llama_grammar_copy`
2024-04-26 01:03:26 +01:00
Martin Evans
25812762c9
Added checks in `Decode` to skip doing anything if the batch is empty.
2024-04-24 14:54:02 +01:00
Zoli Somogyi
59a0afdb77
Renaming files to correspond to class names
2024-04-24 08:24:02 +02:00