Commit Graph

535 Commits

Author SHA1 Message Date
SignalRT e64b9057d7 Merge branch 'RuntimeDetection' of https://github.com/SignalRT/LLamaSharp into RuntimeDetection 2023-11-06 23:03:58 +01:00
SignalRT d1244332ed MacOS Runtime detection and classification
Create different paths for different MacOS platforms.
Dynamically load the right library
2023-11-06 23:03:50 +01:00
Martin Evans 04ee64a6be Exposed YaRN scaling parameters in IContextParams 2023-11-06 21:59:18 +00:00
Udayshankar Ravikumar 1dad1ff834 Enhance framework compatibility 2023-11-07 03:22:05 +05:30
SignalRT e1a89a8b0a Added all binaries from this run: https://github.com/SciSharp/LLamaSharp/actions/runs/6762323560
Add the MacOS binary from the same run
2023-11-05 17:57:21 +01:00
Martin Evans 11d8c55db7 Added all binaries from this run: https://github.com/SciSharp/LLamaSharp/actions/runs/6762323560 (132d25b8a62ea084447e0014a0112c1b371fb3f8) 2023-11-05 16:32:38 +00:00
SignalRT 46fb472d42 Align with llama.cpp b1488 2023-11-05 16:16:29 +01:00
Martin Evans a03fdc4818 Using a reference to an array instead of pointer arithmetic. This means it will benefit from bounds checking on the array. 2023-11-04 16:17:32 +00:00
Martin Evans 08c29d52c5 Slightly refactored `SafeLLamaGrammarHandle.Create` to solve CodeQL warning about pointer arithmetic. 2023-11-04 16:02:33 +00:00
Yaohui Liu 0e139d4ee2
fix: add arm binaries to cpu nuspec. 2023-11-03 23:41:22 +08:00
Yaohui Liu 7ee27d2f99
fix: binary not copied on MAC platform. 2023-11-03 23:28:42 +08:00
Martin Evans db8f3980ea New binaries from this commit: 207b51900e
Should fix the extreme speed loss.
2023-10-30 23:28:29 +00:00
Martin Evans b6d242193e Debugging slowdown by removing some things:
- Removed all `record struct` uses in native code
 - Removed usage of `readonly` in native structs

Minor fix:
 - Added sequential layout to `LLamaModelQuantizeParams`
2023-10-30 21:35:46 +00:00
Martin Evans 529b06b35b - Fixed rope frequency/base to use the values in the model by default, instead of always overriding them by default! 2023-10-29 23:59:46 +00:00
Martin Evans dcc82e582e Fixed `Eval` on platforms < dotnet 5 2023-10-29 15:12:41 +00:00
Martin Evans 51c292ebd8 Added a safe method for `llama_get_logits_ith` 2023-10-28 23:15:45 +01:00
Martin Evans 7e3cde4c13 Moved helper methods into `LLamaBatchSafeHandle` 2023-10-28 22:09:09 +01:00
Martin Evans ccb8afae46 Cleaned up stateless executor as preparation for changing it to use the new batched decoding system. 2023-10-28 21:50:48 +01:00
Martin Evans c786fb0ec8 Using `IReadOnlyList` instead of `IEnumerable` in `IInferenceParams` 2023-10-28 21:32:23 +01:00
Martin Evans c7fdb9712c Added binaries, built from `6961c4bd0b` 2023-10-28 21:32:22 +01:00
Martin Evans e81b3023d5 Rewritten sampling API to be accessed through the `LLamaTokenDataArray` object 2023-10-28 21:32:21 +01:00
Martin Evans 3c5547b2b7 Reduced some uses of `NativeApi` in `BatchedDecoding` by adding some helper methods 2023-10-28 21:32:21 +01:00
Martin Evans b38e3f6fe2 binaries (avx512) 2023-10-28 21:32:21 +01:00
Martin Evans a024d2242e It works!
had to update binary to `b1426`
2023-10-28 21:32:21 +01:00
Martin Evans 8cd81251b4 initial setup 2023-10-28 21:32:21 +01:00
Martin Evans 321d0b58c4
Merge pull request #202 from martindevans/multi_gpu
Multi GPU
2023-10-26 14:40:49 +01:00
Martin Evans f6a472ae86 Setting the default seed to `0xFFFFFFFF` (no seed, randomised) 2023-10-25 20:40:41 +01:00
Martin Evans 36c71abcfb Fixed `LLama.StreamingTokenDecoderLLamaLLama.StreamingTokenDecoderLLamaLLama.StreamingTokenDecoderLLama` spam in all executors except Stateless. 2023-10-25 13:57:00 +01:00
Martin Evans 5b6408b072
Merge pull request #205 from martindevans/roundtrip_tokenization_investigation
RoundTrip Tokenization Errors
2023-10-24 20:46:48 +01:00
Martin Evans a03fe003de Fixed decoding of text "accumulating" over time (never properly clearing buffer) 2023-10-23 16:42:38 +01:00
Martin Evans 51d4411a58 Added two new classes for detokenization tasks:
- `AntipromptProcessor` accepts chunks of text and returns a value indicating if any antiprompt has been detected.
 - `StreamingTokenDecoder` decodes tokens into text, maintaining some internal state to handle single characters which are encoded as multiple tokens.

Added tests for these classes and updated StatelessExecutor to use them.

Removed most DeTokenize methods, marked the rest as obsolete (should always use a `StreamingTokenDecoder`).
2023-10-23 00:33:50 +01:00
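The antiprompt detection this commit describes (feed in chunks of decoded text, report when any antiprompt appears) can be sketched roughly as below. This is a simplified illustration of the idea; the class name and details are hypothetical, not the actual `AntipromptProcessor` implementation.

```csharp
using System;
using System.Collections.Generic;

// Sketch: accumulate decoded text chunks, keep only enough trailing
// characters to match the longest antiprompt, and report whether any
// antiprompt has been seen at the end of the stream so far.
public class AntipromptDetector
{
    private readonly List<string> _antiprompts;
    private readonly int _maxLength;
    private string _buffer = "";

    public AntipromptDetector(IEnumerable<string> antiprompts)
    {
        _antiprompts = new List<string>(antiprompts);
        foreach (var a in _antiprompts)
            _maxLength = Math.Max(_maxLength, a.Length);
    }

    // Returns true if the text seen so far ends with any antiprompt.
    public bool Add(string chunk)
    {
        _buffer += chunk;

        // Trim the buffer so it never grows beyond the longest antiprompt.
        if (_buffer.Length > _maxLength)
            _buffer = _buffer.Substring(_buffer.Length - _maxLength);

        foreach (var a in _antiprompts)
            if (_buffer.EndsWith(a, StringComparison.Ordinal))
                return true;
        return false;
    }
}
```

Because only a bounded suffix is retained, the check stays allocation-light and works even when an antiprompt is split across two chunks.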
Martin Evans efdf3d630c - Removed all `TokenToString` methods (it's never correct to use them, because sometimes one single character may be represented by multiple tokens).
- Built a new (hacky) `Detokenize` method which handles this
2023-10-22 21:43:36 +01:00
Rinne 231efe06f2
Update LLama/runtimes/build/LLamaSharp.Backend.Cpu.nuspec
Co-authored-by: Martin Evans <martindevans@gmail.com>
2023-10-23 00:01:09 +08:00
Rinne ecf852c4e2
Update LLama/runtimes/build/LLamaSharp.Backend.MacMetal.nuspec
Co-authored-by: Martin Evans <martindevans@gmail.com>
2023-10-23 00:00:46 +08:00
Rinne 95669c2ea3
Update LLama/runtimes/build/LLamaSharp.Backend.Cuda12.nuspec
Co-authored-by: Martin Evans <martindevans@gmail.com>
2023-10-23 00:00:40 +08:00
Rinne 5eaebd68ba
Update LLama/runtimes/build/LLamaSharp.Backend.Cuda11.nuspec
Co-authored-by: Martin Evans <martindevans@gmail.com>
2023-10-23 00:00:34 +08:00
Rinne 6724b39713
Update LLama/runtimes/build/LLamaSharp.Backend.Cpu.nuspec
Co-authored-by: Martin Evans <martindevans@gmail.com>
2023-10-23 00:00:27 +08:00
Martin Evans 1d0620e634 Created a test that "roundtrips" strings through tokenization. This reveals some flaws with certain characters 2023-10-22 15:28:36 +01:00
Yaohui Liu b7a7dc00b6 ci: fix typos. 2023-10-22 17:50:37 +08:00
Yaohui Liu 252992ec6e ci: fix icon and typos. 2023-10-22 17:45:07 +08:00
Yaohui Liu 53eedf1428
ci: fix error. 2023-10-22 15:23:10 +08:00
Yaohui Liu f9a98c6e23
ci: add auto release workflow. 2023-10-22 11:52:56 +08:00
Martin Evans f621ec67e8 Fixed serialization 2023-10-20 15:04:18 +01:00
Martin Evans 768747c652 spelling 2023-10-20 14:57:55 +01:00
Martin Evans b4e7f64e76 Added System.Text.Json serialization for `TensorSplitsCollectionConverter` 2023-10-20 14:55:01 +01:00
Martin Evans 281e58f059 Fixed default value 2023-10-20 14:35:06 +01:00
Martin Evans 04acbf8c42 Improved doc comment on `tensor_split` 2023-10-20 14:13:46 +01:00
Martin Evans 6a4cd506bd Added a safe `TensorSplitsCollection` to the params which prevents incorrectly setting the `tensor_splits` collection 2023-10-20 14:10:20 +01:00
Martin Evans 15db194c17 Added multi GPU support 2023-10-20 13:43:46 +01:00
Martin Evans 328022b13d Fixed merge conflicts 2023-10-19 21:14:45 +01:00
Martin Evans 7ec318aab5 Added logging to embedder too 2023-10-19 21:09:44 +01:00
Martin Evans f1e5a8f995 - Passing the `ILogger` through to every call of `CreateContext`
- Passing `ILogger` into executors
2023-10-19 21:09:44 +01:00
sa_ddam213 4ec9aed47a
Revert LLamaSharp project changes 2023-10-20 08:29:26 +13:00
sa_ddam213 b4b4000342
Merge branch 'master' into upstream_master
# Conflicts:
#	LLama.Web/Common/ModelOptions.cs
#	LLama.Web/Services/ConnectionSessionService.cs
#	LLama/LLamaStatelessExecutor.cs
#	LLama/LLamaWeights.cs
2023-10-20 08:02:27 +13:00
Martin Evans e89ca5cc17 Fixed a few minor warnings 2023-10-19 00:43:50 +01:00
Martin Evans 9daf586ba8 Assorted cleanup leftover after the huge change in the last PR (comments, syntax style, etc) 2023-10-19 00:26:30 +01:00
Martin Evans d8434ea9d6
Merge pull request #185 from martindevans/wip_major_api_change
Major llama.cpp API Change
2023-10-18 20:50:32 +01:00
Martin Evans 1f8c94e386 Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538) 2023-10-17 23:55:46 +01:00
Martin Evans efb0664df0 - Added new binaries
- Fixed stateless executor out-of-context handling
 - Fixed token tests
2023-10-17 23:39:41 +01:00
Martin Evans b8f0eff080 - Added `GetCharCountImpl` tests, fixed handling of empty strings
- Added ifdef to remove `Deconstruct` extension on everything except `NETSTANDARD2_0`
2023-10-14 00:04:12 +01:00
Martin Evans 45118520fa - Improved coverage of `GBNFGrammarParser` up to 96%
- Covered text transforms
 - Removed unnecessary non-async transforms
2023-10-13 23:54:01 +01:00
Martin Evans 2a38808bca - Added threads to context params, replaced all thread args with `uint?`
- Replaced all binaries
2023-10-12 18:49:41 +01:00
Martin Evans 4e9b1f8cdc - Split extension methods into separate files 2023-10-12 15:38:26 +01:00
sa_ddam213 9b8de007dc Propagate ILogger 2023-10-04 13:47:08 +13:00
Martin Evans 669ae47ef7 - Split parameters into two interfaces
- params contains a list of loras, instead of just one
2023-09-30 16:21:18 +01:00
Martin Evans 9a0a0ae9fe Removed cloning support 2023-09-30 15:48:26 +01:00
Martin Evans 0d40338692 Fixed out-of-context handling in stateless executor 2023-09-29 23:53:07 +01:00
Martin Evans b306ac23dd Added `Decode` method to `SafeLLamaContextHandle` 2023-09-29 22:24:44 +01:00
Martin Evans 9e958e896b safe handle for batch 2023-09-29 22:18:23 +01:00
Martin Evans ce1fc51163 Added some more native methods 2023-09-29 16:05:19 +01:00
Martin Evans bca55eace0 Initial changes to match the llama.cpp changes 2023-09-29 01:18:21 +01:00
Martin Evans d58fcbbd13 Fixed antiprompt checking 2023-09-24 14:26:43 +01:00
Martin Evans 08f1615e60 - Converted LLamaStatelessExecutor to run `Exec` calls inside an awaited task. This unblocks async callers while the model is being evaluated.
- Added a "spinner" to the `StatelessModeExecute` demo, which spins while waiting for the next token (demonstrating that it's not blocked).
2023-09-23 15:22:57 +01:00
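The pattern this commit describes, wrapping a blocking evaluation in an awaited task so async callers are not blocked while the model runs, can be sketched as follows. The names here are illustrative, not the actual executor code.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class AsyncExecSketch
{
    // Stand-in for the blocking native evaluation call.
    private static int BlockingEval(int input)
    {
        Thread.Sleep(10); // simulate expensive model evaluation
        return input * 2;
    }

    // Run the blocking call on the thread pool so the caller's thread
    // (e.g. a UI or request thread) stays free while it executes.
    public static Task<int> EvalAsync(int input)
        => Task.Run(() => BlockingEval(input));
}
```

A caller can `await AsyncExecSketch.EvalAsync(x)` and keep doing other work (such as spinning a progress indicator) until the result arrives.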
Martin Evans fe54f6764f - Added unit tests for extension methods
- Removed unused `AddRangeSpan` extension
2023-09-22 16:29:50 +01:00
Haiping 79fa74d59c
Merge pull request #177 from redthing1/fix/context-getstate
fix opaque GetState (fixes #176)
2023-09-19 06:51:36 -05:00
redthing1 b78044347c
fix opaque GetState (fixes #176) 2023-09-18 20:56:14 -07:00
Haiping e1af7a96da
Merge pull request #175 from redthing1/feat/inferenceparams_record
make InferenceParams a record so we can use `with`
2023-09-18 19:44:18 -05:00
redthing1 296ba607de
make InferenceParams a record so we can use with 2023-09-18 13:27:26 -07:00
Haiping 10678a83d6
Merge pull request #65 from martindevans/alternative_dependency_loading
CPU Feature Detection
2023-09-17 10:21:37 -05:00
Haiping f134c5af59
Merge pull request #163 from SignalRT/DefaultMetal
MacOS default build now is metal llama.cpp #2901
2023-09-17 10:20:47 -05:00
Martin Evans 3f80190f85 Minimal changes required to remove non-async inference. 2023-09-14 21:04:14 +01:00
Martin Evans b1e9d8240d
Merge pull request #149 from martindevans/removed_unused_inference_params
Removed unused properties of `InferenceParams` & `ModelParams`
2023-09-13 01:48:15 +01:00
Martin Evans daf09eae64 Skipping tokenization of empty strings (saves allocating an empty array every time) 2023-09-12 01:03:27 +01:00
Martin Evans 466722dcff
Merge pull request #165 from martindevans/better_instruct_antiprompt_checking
better_instruct_antiprompt_checking
2023-09-11 00:32:43 +01:00
Martin Evans d08a125020 Using the `TokensEndsWithAnyString` extensions for antiprompt checking in instruct executor. Simpler and more efficient. 2023-09-11 00:22:17 +01:00
Martin Evans bba801f4b7 Added a property to get the KV cache size from a context 2023-09-11 00:10:08 +01:00
SignalRT c41e448d0e ggml-metal.metal MUST be copied to output folder
Metal depends on this file to execute, and the MacOS llama.cpp default is now Metal.
2023-09-10 20:42:15 +02:00
SignalRT 096293a026 MacOS: remove the Metal flag, as it is now the default
See "on Mac OS enable Metal by default" #2901
2023-09-10 20:42:15 +02:00
Martin Evans b47977300a Removed one more unused parameter 2023-09-09 14:57:47 +01:00
Martin Evans a1b0349561 Removed `ModelAlias` property (unused) 2023-09-09 14:18:50 +01:00
Martin Evans 4dac142bd5
Merge pull request #160 from martindevans/GetState_fix
`GetState()` fix
2023-09-09 01:44:08 +01:00
Martin Evans 832bf7dbe0 Simplified implementation of `GetState` and fixed a memory leak (`bigMemory` was never freed) 2023-09-09 01:30:35 +01:00
Martin Evans 4f7b6ffdcc Removed `GenerateResult` method that was only used in one place 2023-09-09 01:09:27 +01:00
sa_ddam213 09d8f434f2
Extract LLamaLogLevel, Remove Logger class 2023-09-09 10:25:05 +12:00
sa_ddam213 949b0cde16
Replace ILLamaLogger for ILogger 2023-09-09 10:13:07 +12:00
sa_ddam213 70b36f8996
Add Microsoft.Extensions.Logging.Abstractions, update any required deps 2023-09-09 09:52:11 +12:00
Martin Evans d3b8ee988c
Beam Search (#155)
* Added the low level bindings to beam search.
2023-09-07 19:26:51 +01:00
Martin Evans a09aa86324
Merge pull request #153 from martindevans/fix_savestate_OpenOrCreate
Changed `OpenOrCreate` to `Create`
2023-09-06 23:03:24 +01:00
Martin Evans f366aa3abe Changed `OpenOrCreate` to `Create` to fix #151 2023-09-06 22:35:41 +01:00
Martin Evans 77bd090150 Simplified `LLamaInteractExecutor` antiprompt matching by using new extension method 2023-09-06 22:26:36 +01:00
Martin Evans 614ba40948 - Added a `TokensEndsWithAnyString` extension to `IReadOnlyList<int>` which efficiently checks if a set of tokens ends with one of a set of strings.
- Minimal amount of characters converted
   - Allocation free
 - Added `TokensToSpan` to `SafeLlamaModelHandle` which converts as many tokens as possible into a character span
   - Allocation free
2023-09-06 19:44:19 +01:00
Martin Evans d79a6556a1 Removed 3 unused properties of `InferenceParams` 2023-09-06 01:20:36 +01:00
Martin Evans 6a842014ac Removed duplicate `llama_sample_classifier_free_guidance` method 2023-09-04 00:48:27 +01:00
Martin Evans 4a53cdc56b
Merge pull request #142 from SciSharp/rinne-dev
refactor: remove old version files.
2023-09-03 23:36:28 +01:00
Martin Evans 33035c82bf - Removed `LLamaNewlineTokens` from `InteractiveExecutorState`. This is always set in the constructor from the context, so there's no point serializing it. 2023-09-03 18:22:39 +01:00
Yaohui Liu 18294a725e
refactor: remove old version files. 2023-09-02 22:24:07 +08:00
Martin Evans 8f58a40fb9 Added Linux dependency loading 2023-09-02 14:21:06 +01:00
Martin Evans dd4957471f Changed paths to match what the GitHub build action produces 2023-09-02 14:10:18 +01:00
Martin Evans 756a1ad0ba Added a new way to load dependencies, performing CPU feature detection 2023-09-02 14:03:37 +01:00
Martin Evans 025741a73e
Fixed My Name
The D is for my middle name 😄
2023-09-02 13:45:06 +01:00
Yaohui Liu 20b5363601
fix: remove the history commit of embedding length property. 2023-09-02 12:56:02 +08:00
Yaohui Liu 3a847623ab
docs: update the docs to follow new version. 2023-09-02 12:51:51 +08:00
Yaohui Liu ca6624edb3
Merge branch 'master' of github.com:SciSharp/LLamaSharp into rinne-dev 2023-09-02 12:03:35 +08:00
Rinne 4e83e48ad1
Merge pull request #122 from martindevans/gguf
Add GGUF support
2023-09-02 11:54:50 +08:00
Martin Evans 97349d93be Merge branch 'gguf' of github.com:martindevans/LLamaSharp into gguf 2023-09-02 02:22:18 +01:00
Martin Evans bcf06e2652 Added some comments on various native methods 2023-09-02 02:22:11 +01:00
Martin Evans af680ac2d7 Created a hierarchy of exceptions for grammar format issues. This allows the base catch-all exception to be caught for general handling, or more specific exceptions to be caught for more specific handling. 2023-09-02 02:04:11 +01:00
Rinne 1533ee7dbf
Merge pull request #138 from drasticactions/semantic-kernel
Enable Semantic kernel support
2023-09-01 20:50:46 +08:00
Tim Miller 326c802be7 Have weights generate context 2023-08-31 22:19:29 +09:00
Tim Miller 3bca3b632e New line 2023-08-31 17:31:13 +09:00
Tim Miller 9a1d6f99f2 Add Semantic Kernel support 2023-08-31 17:24:44 +09:00
Martin Evans a70c7170dd - Created a higher level `Grammar` class which is immutable and contains a list of grammar rules. This is the main "entry point" to the grammar system.
- Made all the mechanics of grammar parsing (GBNFGrammarParser, ParseState) internal. Just call `Grammar.Parse("whatever")`.
 - Added a `GrammarRule` class which validates elements on construction (this allows constructing grammar without parsing GBNF).
   - It should be impossible for a `GrammarRule` to represent an invalid rule.
2023-08-31 00:02:50 +01:00
SignalRT fb007e5921 Changes to compile in VS Mac + change model to llama2
This commit includes changes to compile in VS Mac + changes to use llama2 instead of codellama.

It includes MacOS binaries in memory and metal
2023-08-30 22:08:29 +02:00
Mihai 24d3e1bfa8 Address PR review comment 2023-08-30 21:59:28 +03:00
Mihai 60790c5aac Address code review comments (create custom exception, move printing to the ParseState class, rethrow error). 2023-08-30 21:06:45 +03:00
Mihai 2ae1891c13 Bug fixes after running tests.
SymbolIds is now a SortedDictionary (although I'm not sure it really needs to be) because the test was failing due to the expected value being in a different order. The C++ data structure of SymbolIds is std::map<std::string, uint32_t>, so the items are ordered by key.
2023-08-30 16:18:05 +03:00
Mihai 0bd495276b Add initial tests + fix bugs. Still WIP since the test is failing. 2023-08-30 14:10:56 +03:00
Mihai 0f373fcc6d Finish grammar_parser translation from C++ to C# 2023-08-30 12:20:45 +03:00
Mihai 3c919b56fe Use ReadOnlySpan everywhere instead of ReadOnlyMemory and, instead of returning a tuple, reference the ReadOnlySpan. 2023-08-30 11:23:55 +03:00
Mihai 8b4ec6d973 Address PR change requests 2023-08-30 09:24:08 +03:00
Mihai 7f31276bdf [WIP] Translating the GrammarParser 2023-08-29 22:50:54 +03:00
Martin Evans c9d08b943e Added binaries for CUDA+Linux 2023-08-29 15:05:09 +01:00
Martin Evans 6711a59d0f Included Linux deps 2023-08-28 20:02:59 +01:00
Martin Evans ba49ea2991 Removed hardcoded paths from projects, modified Runtime.targets to exclude missing binaries 2023-08-28 19:53:34 +01:00
Martin Evans 2022b82947 Added binaries generated by this action: https://github.com/SciSharp/LLamaSharp/actions/runs/6002797872/job/16279896150
Based on this version: 6b73ef1201
2023-08-28 19:48:31 +01:00
sa_ddam213 a5d742b72c
Fix Tokenize of new line, Remove space inserts 2023-08-28 11:57:50 +12:00
Martin Evans 31287b5e6e Rewritten TokenToSpan/TokenToString to better fit the new way it's done in llama.cpp with a few different options:
- Just convert it to a `string`, nice and simple
 - Write the bytes to a `Span<byte>`, no allocations
 - Write the chars to a `StringBuilder`, potentially no allocations
2023-08-27 00:15:56 +01:00
Martin Evans 0c98ae1955 Passing ctx to `llama_token_nl(_ctx)` 2023-08-27 00:15:55 +01:00
Martin Evans 6ffa28f964 Removed `LLAMA_MAX_DEVICES` (not used) 2023-08-27 00:14:40 +01:00
Martin Evans 2056078aef Initial changes required for GGUF support 2023-08-27 00:14:40 +01:00
Martin Evans 826c6aaec3 cleaned up higher level code using the sampling API:
- Fixed multiple enumeration
 - Fixed newline penalisation
2023-08-26 21:47:41 +01:00
Martin Evans cf4754db44 Removed unnecessary parameters from some low level sampler methods 2023-08-26 21:38:24 +01:00
Martin Evans f70525fec2 Two small improvements to the native sampling API:
 - Modified `llama_sample_token_mirostat` and `llama_sample_token_mirostat_v2` to take `ref float` instead of a `float*`. Fewer pointers is always good.
 - Modified `llama_sample_repetition_penalty` and `llama_sample_frequency_and_presence_penalties` to take pointers instead of arrays. This allows the use of non-allocating types (e.g. Span) instead of arrays
 - Modified higher level API to accept `Memory<int>` instead of `int[]`, which can be used to reduce allocations at call sites
2023-08-26 01:25:48 +01:00
Martin Evans a911b77dec Various minor changes, resolving about 100 ReSharper code quality warnings 2023-08-24 23:15:53 +01:00
Martin Evans 5a6c6de0dc
Merge pull request #115 from martindevans/model_params_record
ModelsParams record class
2023-08-24 22:54:23 +01:00
Martin Evans 70be6c7368 Removed `virtual` method in newly sealed class 2023-08-24 17:08:01 +01:00
Martin Evans ebacdb666d - Moved the lower level state get/set methods onto SafeLLamaContextHandle
- Used those methods to add a `Clone` method to SafeLLamaContextHandle
 - Simplified `LLamaContext` by using the new methods
 - Sealed `LLamaContext` and `LLamaEmbedder`
2023-08-24 17:03:27 +01:00
Martin Evans 77aa5fa0d0 Added `JsonConverter` attribute, so System.Text.Json serialization is seamless 2023-08-24 16:17:49 +01:00
Martin Evans df80ec9161
Merge pull request #97 from martindevans/embedder_tests
Embedder Test
2023-08-24 02:08:39 +01:00
Martin Evans 058c4e84b1 Rewritten LLamaEmbedder to use `LLamaContext` instead of the lower level handles 2023-08-24 01:14:12 +01:00
Martin Evans 829f32b27d - Added `Obsolete` attributes to the entire `OldVersion` namespace, so they can be removed in the future
- Minor changes to cleanup some of the compiler warnings
2023-08-24 00:59:32 +01:00
Martin Evans ee772a2921 added `using` statement instead of full qualification 2023-08-24 00:24:16 +01:00
Martin Evans 93f24f8a51 Switched to properly typed `Encoding` property 2023-08-24 00:09:00 +01:00
zombieguy 45b01d5a78 Improved type conversion
Type conversion is now done in the property rather than the utils class and uses the System.Convert class to ensure consistency.
2023-08-23 19:36:14 +01:00
Martin Evans 29df14cd9c Converted ModelParams into a `record` class. This has several advantages:
- Equality, hashing etc all implemented automatically
 - Default values are defined in just one place (the properties) instead of the constructor as well
 - Added test to ensure that serialization works properly
2023-08-23 00:58:25 +01:00
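The advantages of a record class listed above can be illustrated with a small sketch. The parameters below are hypothetical and trimmed down; the real ModelParams defines many more properties.

```csharp
// A record class gives value equality, hashing and `with`-mutation for free,
// and default values are defined in exactly one place: the property initialisers.
public record ModelParamsSketch
{
    public int ContextSize { get; init; } = 512;
    public int GpuLayerCount { get; init; } = 0;
    public string ModelPath { get; init; } = "";
}
```

Two instances with identical property values compare equal, and `var b = a with { ContextSize = 1024 };` produces a modified copy without touching `a`, which also makes round-trip serialization tests straightforward.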
Martin Evans 2830e5755c - Applied a lot of minor R# code quality suggestions. Lots of unnecessary imports removed.
- Deleted `NativeInfo` (internal class, not used anywhere)
2023-08-22 23:20:13 +01:00
Martin Evans 854532c08e
Merge pull request #112 from martindevans/classifier_free_guidance
Added native symbol for CFG
2023-08-22 18:35:13 +01:00
Martin Evans 4b7d718551 Added native symbol for CFG 2023-08-22 17:11:49 +01:00
Erin Loy 8f0b52eb09 Re-renaming some arguments to allow for easy deserialization from appsettings.json. 2023-08-22 09:09:22 -07:00
Martin Evans 9fc17f3136 Fixed unit tests 2023-08-22 14:16:20 +01:00
Martin Evans 759ae26f36
Merge branch 'master' into grammar_basics 2023-08-22 14:06:57 +01:00
Martin Evans a9e6f21ab8 - Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring!
- Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around
2023-08-22 01:30:13 +01:00
Martin Evans e7b217f462 Fixed out of context logic 2023-08-22 01:28:28 +01:00
Martin Evans 4738c26299 - Reduced context size of test, to speed it up
- Removed some unnecessary `ToArray` calls
 - Initial pass on LLamaStatelessExecutor, the context overflow management is broken but I think I found where it's ported from
2023-08-22 01:28:28 +01:00
Martin Evans ae8ef17a4a - Added various convenience overloads to `LLamaContext.Eval`
- Converted `SafeLLamaContextHandle` to take a `ReadOnlySpan` for Eval, narrower type better represents what's really needed
2023-08-22 01:28:28 +01:00
Erin Loy 592a80840b Renamed some arguments in the ModelParams constructor so that the class can be serialized easily 2023-08-19 15:55:19 -07:00
Martin Evans 64416ca23c - Created a slightly nicer way to create grammar (from `IReadOnlyList<IReadOnlyList<LLamaGrammarElement>>`)
- Integrated grammar into sampling
 - Added a test for the grammar sampling
2023-08-17 19:29:15 +01:00
Martin Evans 0294bb1303 Some of the basics of the grammar API 2023-08-17 19:28:17 +01:00
Rinne 62331852bc
Merge pull request #90 from martindevans/proposal_multi_context
Multi Context
2023-08-17 21:59:05 +08:00
zombieguy 10f88ebd0e
Potential fix for .Net Framework issues (#103)
* Added a bool to sbyte Utils convertor

As an attempt to avoid using any MarshalAs attribute for .Net Framework support, this Utils method takes in a bool value and returns an sbyte: 1 for true, 0 for false.

* Changed all bool "MarshalAs" types to sbytes

Changed all previous BOOL types with "MarshalAs" attributes to SBYTEs and changed all of their setters to use the Utils.BoolToSignedByte() convertor method.

* Fixed Utils bool convertor & added sbyte to bool

Improved the Utils bool convertor by just casting the sbyte value, getting rid of the unneeded sbyte array, and added an sbyte-to-bool convertor to convert back to a C# bool, assuming any value above 0 is true and no bools are packed in the single-byte integer.

* bool to & from sbyte conversions via properties

All 1-byte bools are now handled where they "sit", via public properties which perform the conversions, keeping all external data able to communicate as it did before.
2023-08-16 00:09:52 +01:00
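The property-based bool-to-sbyte handling the PR above describes can be sketched like this. The struct and field names are illustrative, not the actual LLamaSharp native structs.

```csharp
using System.Runtime.InteropServices;

// Sketch: the native side stores the flag as a single byte. The managed
// struct keeps the raw sbyte field where it "sits" in the sequential
// layout and exposes a bool property that converts on access, so no
// MarshalAs attribute is needed (which helps .Net Framework support).
[StructLayout(LayoutKind.Sequential)]
public struct NativeParamsSketch
{
    private sbyte _useMmap; // 1-byte flag, exactly as the native code sees it

    public bool UseMmap
    {
        get => _useMmap != 0;                    // any non-zero value is true
        set => _useMmap = value ? (sbyte)1 : (sbyte)0;
    }
}
```

External code keeps reading and writing a normal C# `bool`, while the struct layout stays byte-for-byte compatible with the native definition.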
Martin Evans 7ebff89f68
Merge pull request #101 from martindevans/llama_sample_classifier_free_guidance
llama_sample_classifier_free_guidance
2023-08-13 23:21:21 +01:00
Martin Evans 6c84accce8 Added `llama_sample_classifier_free_guidance` method from native API 2023-08-13 23:14:53 +01:00
Martin Evans afe559ef1c Added comments to `Logger` and fixed some nullability warnings 2023-08-13 01:29:33 +01:00
Martin Evans 6473f8d5e5 Temporarily added a `Console.WriteLine` into the test, to print the embedding vector for "cat" in CI 2023-08-13 01:10:09 +01:00
Martin Evans 1b35be2e0c Added some additional basic tests 2023-08-13 01:10:09 +01:00
Martin Evans f5a260926f Renamed `EmbeddingCount` to `EmbeddingSize` in higher level class 2023-08-13 01:10:09 +01:00
Martin Evans 479ff57853 Renamed `EmbeddingCount` to `EmbeddingSize` 2023-08-13 01:10:09 +01:00
Martin Evans d0a7a8fcd6 - Cleaned up disposal in LLamaContext
- sealed some classes not intended to be extended
2023-08-13 01:10:08 +01:00
Martin Evans 4d741d24f2 Marked old `LLamaContext` constructor obsolete 2023-08-13 01:10:08 +01:00
Martin Evans 20bdc2ec6f - Apply LoRA in `LLamaWeights.LoadFromFile`
- Sanity checking that weights are not disposed when creating a context from them
 - Further simplified `Utils.InitLLamaContextFromModelParams`
2023-08-13 01:10:08 +01:00
Martin Evans e2fe08a9a2 Added a higher level `LLamaWeights` wrapper around `SafeLlamaModelHandle` 2023-08-13 01:10:08 +01:00
Martin Evans fda7e1c038 Fixed mirostat/mirostate 2023-08-13 01:10:08 +01:00
Martin Evans f3511e390f WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving.
The biggest single change is renaming `LLamaModel` to `LLamaContext`
2023-08-13 01:10:08 +01:00
Martin Evans d7f971fc22 Improved `NativeApi` file a bit:
- Added some more comments
 - Modified `llama_tokenize` to not allocate
 - Modified `llama_tokenize_native` to take a pointer instead of an array, allowing use with no allocations
 - Removed GgmlInitParams (not used)
2023-08-12 00:45:23 +01:00
Martin Evans 841cf88e3b
Merge pull request #96 from martindevans/minor_quantizer_improvements
Minor quantizer improvements
2023-08-10 18:01:40 +01:00
Martin Evans ce325b49c7 Rewritten comments 2023-08-10 17:00:54 +01:00
Martin Evans b69f4bc40e - Expanded range of supported types in quantizer to match llama.cpp
- Rewritten `LLamaFtype` parsing to support any substring which uniquely matches a single enum variant
2023-08-10 16:58:00 +01:00
sa_ddam213 a67ea36dd9 Typo and formatting 2023-08-11 00:37:33 +12:00
sa_ddam213 726987b761 Add native logging output 2023-08-10 23:01:50 +12:00
Martin Evans acd91341e6 Added lots of comments to all the LLamaFtype variants 2023-08-10 02:14:21 +01:00
Yaohui Liu ee2a5f064e
Merge branch 'master' of github.com:SciSharp/LLamaSharp into rinne-dev 2023-08-08 21:41:48 +08:00
Yaohui Liu 3a1daa98a3
feat: add the api to get the embedding length of the model. 2023-08-08 21:41:33 +08:00
Martin Evans 270c6d55ef
Merge pull request #88 from martindevans/fix_serialization_nan
Fix serialization error due to NaN
2023-08-08 14:04:18 +01:00
Martin Evans 91bcefc852 comment on IModelParamsExtensions 2023-08-07 23:46:19 +01:00
Martin Evans 9cdc72aa67 Fixed `ToLlamaContextParams` using the wrong parameter for `use_mmap` 2023-08-07 23:45:05 +01:00
Martin Evans bab3b46f0c
Merge pull request #82 from martindevans/tokenization_cleanup
Utils Cleanup
2023-08-07 23:20:24 +01:00
Martin Evans b5de3ee5aa Fixed some final mentions of "mirostate" instead of "mirostat" 2023-08-07 21:12:56 +01:00
Martin Evans be52737488 Using a nullable float instead of NaN, this should fix the serialization issue reported in #85 2023-08-07 21:09:18 +01:00
sa_ddam213 2d1269cae9 Access to IModelParamsExtensions 2023-08-08 07:54:40 +12:00
Martin Evans 1fceeaf352 Applied fix from #84 (antiprompt does not work in stateless executor) 2023-08-07 19:00:59 +01:00
Yaohui Liu d609b0e1d5
Merge branch 'master' of github.com:SciSharp/LLamaSharp into rinne-dev 2023-08-08 00:16:38 +08:00
Yaohui Liu b60c8bd285
fix: antiprompt does not work in stateless executor. 2023-08-08 00:16:23 +08:00
Martin Evans 2b2d3af26b Moved `Eval` out of `Utils` and into `SafeLLamaContextHandle` 2023-08-07 15:15:34 +01:00
Martin Evans 7fabcc1849 One last `TokenToString` case 2023-08-07 15:15:34 +01:00
Martin Evans 0e5e00e300 Moved `TokenToString` from Utils into `SafeLLamaContextHandle` (thin wrappers around the same method in `SafeLlamaModelHandle`) 2023-08-07 15:15:34 +01:00
Martin Evans 2d811b2603 - Moved `GetLogits` into `SafeLLamaContextHandle`
- Added disposal check into `SafeLLamaContextHandle`
2023-08-07 15:13:24 +01:00
Martin Evans cd3cf2b77d - Moved tokenization from `Utils.Tokenize` into `SafeLLamaContextHandle.Tokenize`, one less thing in `Utils`.
- Also refactored it to return an `int[]` instead of an `IEnumerable<int>`, solving the "multiple enumeration" problems at the source!
2023-08-07 15:13:24 +01:00
Martin Evans 73882de591
Merge pull request #81 from martindevans/tensor_splits_array
Improved Tensor Splits
2023-08-07 13:36:38 +01:00
Martin Evans bd3d8d3dc4 Cleaned up multiple enumeration in FixedSizeQueue 2023-08-07 02:23:46 +01:00
Martin Evans f2499371ea Pulled conversion of a `IModelParams` into a `LLamaContextParams` out into an extension method which can be used in other places. 2023-08-07 01:55:36 +01:00
Martin Evans f1111a9f8b Using a pin instead of a `fixed` block 2023-08-07 01:20:34 +01:00
Martin Evans 685eb3b9c2 Replaced `nint` with `float[]?` in Model params, which is much more user friendly! 2023-08-06 20:29:38 +01:00
sa_ddam213 e02d0c3617 Merge branch 'master' of https://github.com/SciSharp/LLamaSharp into upstream_master 2023-08-07 03:34:37 +12:00
Rinne bfe9cc8961
Merge pull request #78 from SciSharp/rinne-dev
feat: update the llama backends.
2023-08-06 20:59:24 +08:00
sa_ddam213 e46646b8db Merge branch 'master' of https://github.com/SciSharp/LLamaSharp into upstream_master 2023-08-07 00:01:37 +12:00
Yaohui Liu bb46a990d0
fix: add bug info for native api. 2023-08-06 14:46:23 +08:00
Yaohui Liu 5fe13bd9f7
fix: update the dlls. 2023-08-06 13:46:57 +08:00
sa_ddam213 372894e1d4 Expose some native classes 2023-08-06 14:44:46 +12:00
sa_ddam213 bac9cba01a InferenceParams abstractions 2023-08-06 11:03:45 +12:00
sa_ddam213 2a04e31b7d ModelParams abstraction 2023-08-06 10:44:54 +12:00
Yaohui Liu 546ba28a68
fix: ci error caused by branch merge. 2023-08-06 01:48:31 +08:00
Yaohui Liu fc17e91d1a
feat: add backend for MACOS. 2023-08-06 01:30:56 +08:00
Yaohui Liu 9fcbd16b74
Merge branch 'master' of github.com:SciSharp/LLamaSharp into rinne-dev 2023-08-06 01:30:03 +08:00
Yaohui Liu 2968125daf
feat: update the llama backends. 2023-08-06 01:22:24 +08:00
Martin Evans fe3bd11dfa
Merge branch 'master' into master 2023-08-05 16:56:18 +01:00
Martin Evans 7ef07104e7 Added queue fix, so that CI can pass 2023-08-05 14:38:47 +01:00
SignalRT 348f2c7d72 Update llama.cpp binaries to 5f631c2 and align the context to that version
It solves the problem with netstandard2 (is netstandard2 really a thing right now?)
Change context to solve problems.

5f631c26794b6371fcf2660e8d0c53494a5575f7
2023-08-05 12:45:34 +02:00
Rinne 075b785a4d
Merge branch 'master' into fixed_mirostate_mu 2023-08-05 08:59:47 +08:00
Rinne c641dbdb83
Merge pull request #69 from martindevans/fixed_mirostat_spelling
Fixed Spelling Mirostate -> Mirostat
2023-08-05 08:56:52 +08:00
Rinne 8d37abd787
Merge pull request #68 from martindevans/sampling_improvements
Fixed Memory pinning in Sampling API
2023-08-05 08:55:12 +08:00
Rinne 1d29b240b2
Merge pull request #64 from martindevans/new_llama_state_loading_mechanism
Low level new loading system
2023-08-05 08:47:28 +08:00
Martin Evans add3d5528b Removed `MarshalAs` on array 2023-08-03 14:16:41 +01:00
Martin Evans 2245b84906
Update LLamaContextParams.cs 2023-08-02 23:13:07 +01:00
Martin Evans c64507cb41 Correctly passing through mu value to mirostate instead of resetting it every time. 2023-07-30 00:15:52 +01:00
Rinne cd015055a8
Merge branch 'master' into more_multi_enumeration_fixes 2023-07-30 00:45:38 +08:00
sa_ddam213 3e252c81f6 LLamaContextParams epsilon and tensor split changes 2023-07-28 19:15:19 +12:00
Martin Evans 36735f7908 Fixed spelling of "mirostat" instead of "mirostate" 2023-07-27 23:11:25 +01:00
Martin Evans ec49bdd6eb - Most importantly: Fixed issue in `SamplingApi`, `Memory` was pinned, but never unpinned!
- Moved repeated code to convert `LLamaTokenDataArray` into a `LLamaTokenDataArrayNative` into a helper method.
   - Modified all call sites to dispose the `MemoryHandle`
 - Saved one copy of the `List<LLamaTokenData>` into a `LLamaTokenData[]` in `LlamaModel`
2023-07-27 20:45:59 +01:00
Martin Evans 6985d3ab60 Added comments on two properties 2023-07-27 18:58:29 +01:00
Martin Evans c974c8429e Removed leftover `using` 2023-07-25 20:30:10 +01:00
Martin Evans afb9d24f3a Added model `Tokenize` method 2023-07-25 20:29:35 +01:00
Martin Evans 369c915afe Added TokenToString conversion on model handle 2023-07-25 16:55:04 +01:00
Martin Evans b721072aa5 Exposed some extra model properties on safe handle 2023-07-25 16:41:17 +01:00
Martin Evans 44b1e93609 Moved LoRA loading into `SafeLlamaModelHandle` 2023-07-25 16:35:24 +01:00
Martin Evans c95b14d8b3 - Fixed null check
- Additional comments
2023-07-25 16:23:25 +01:00
Martin Evans f16aa58e12 Updated to use the new loading system in llama (llama_state). This new system has split model weights and contexts into two separate things, allowing one set of weights to be shared between many contexts.
This change _only_ implements the low level API and makes no effort to update the LlamaSharp higher level abstraction.

It is built upon llama `b3f138d`, necessary DLLs are **not** included in this commit.
2023-07-25 01:18:12 +01:00
Martin Evans 8848fc6e3d Fixed 2 more "multi enumeration" issues 2023-07-25 00:19:30 +01:00
Martin Evans ad28a5acdb
Merge branch 'master' into fix_multiple_enumeration 2023-07-24 22:13:49 +01:00
Rinne 4d7d4f2bfe
Merge pull request #59 from saddam213/master
Instruct & Stateless web example implemented
2023-07-24 23:28:04 +08:00
Rinne 66d6b00b49
Merge pull request #57 from martindevans/larger_states
Larger states
2023-07-24 23:10:39 +08:00