SignalRT
e64b9057d7
Merge branch 'RuntimeDetection' of https://github.com/SignalRT/LLamaSharp into RuntimeDetection
2023-11-06 23:03:58 +01:00
SignalRT
d1244332ed
MacOS Runtime detection and classification
...
Create different paths for different MacOS platforms.
Dynamically load the right library
2023-11-06 23:03:50 +01:00
Martin Evans
04ee64a6be
Exposed YaRN scaling parameters in IContextParams
2023-11-06 21:59:18 +00:00
Udayshankar Ravikumar
1dad1ff834
Enhance framework compatibility
2023-11-07 03:22:05 +05:30
SignalRT
e1a89a8b0a
Added all binaries from this run: https://github.com/SciSharp/LLamaSharp/actions/runs/6762323560
...
Add the MacOS binary from the same run
2023-11-05 17:57:21 +01:00
Martin Evans
11d8c55db7
Added all binaries from this run: https://github.com/SciSharp/LLamaSharp/actions/runs/6762323560 (132d25b8a62ea084447e0014a0112c1b371fb3f8)
2023-11-05 16:32:38 +00:00
SignalRT
46fb472d42
Align with llama.cpp b1488
2023-11-05 16:16:29 +01:00
Martin Evans
a03fdc4818
Using a reference to an array instead of pointer arithmetic. This means it will benefit from bounds checking on the array.
2023-11-04 16:17:32 +00:00
Martin Evans
08c29d52c5
Slightly refactored `SafeLLamaGrammarHandle.Create` to solve CodeQL warning about pointer arithmetic.
2023-11-04 16:02:33 +00:00
Yaohui Liu
0e139d4ee2
fix: add arm binaries to cpu nuspec.
2023-11-03 23:41:22 +08:00
Yaohui Liu
7ee27d2f99
fix: binary not copied on MAC platform.
2023-11-03 23:28:42 +08:00
Martin Evans
db8f3980ea
New binaries from this commit: 207b51900e
...
Should fix the extreme speed loss.
2023-10-30 23:28:29 +00:00
Martin Evans
b6d242193e
Debugging slowdown by removing some things:
...
- Removed all `record struct` uses in native code
- Removed usage of `readonly` in native structs
Minor fix:
- Added sequential layout to `LLamaModelQuantizeParams`
2023-10-30 21:35:46 +00:00
Martin Evans
529b06b35b
- Fixed rope frequency/base to use the values in the model by default, instead of always overriding them by default!
2023-10-29 23:59:46 +00:00
Martin Evans
dcc82e582e
Fixed `Eval` on platforms < dotnet 5
2023-10-29 15:12:41 +00:00
Martin Evans
51c292ebd8
Added a safe method for `llama_get_logits_ith`
2023-10-28 23:15:45 +01:00
Martin Evans
7e3cde4c13
Moved helper methods into `LLamaBatchSafeHandle`
2023-10-28 22:09:09 +01:00
Martin Evans
ccb8afae46
Cleaned up stateless executor as preparation for changing it to use the new batched decoding system.
2023-10-28 21:50:48 +01:00
Martin Evans
c786fb0ec8
Using `IReadOnlyList` instead of `IEnumerable` in `IInferenceParams`
2023-10-28 21:32:23 +01:00
Martin Evans
c7fdb9712c
Added binaries, built from `6961c4bd0b`
2023-10-28 21:32:22 +01:00
Martin Evans
e81b3023d5
Rewritten sampling API to be accessed through the `LLamaTokenDataArray` object
2023-10-28 21:32:21 +01:00
Martin Evans
3c5547b2b7
Reduced some uses of `NativeApi` in `BatchedDecoding` by adding some helper methods
2023-10-28 21:32:21 +01:00
Martin Evans
b38e3f6fe2
binaries (avx512)
2023-10-28 21:32:21 +01:00
Martin Evans
a024d2242e
It works!
...
had to update binary to `b1426`
2023-10-28 21:32:21 +01:00
Martin Evans
8cd81251b4
initial setup
2023-10-28 21:32:21 +01:00
Martin Evans
321d0b58c4
Merge pull request #202 from martindevans/multi_gpu
...
Multi GPU
2023-10-26 14:40:49 +01:00
Martin Evans
f6a472ae86
Setting the default seed to `0xFFFFFFFF` (no seed, randomised)
2023-10-25 20:40:41 +01:00
Martin Evans
36c71abcfb
Fixed `LLama.StreamingTokenDecoderLLamaLLama.StreamingTokenDecoderLLamaLLama.StreamingTokenDecoderLLama` spam in all executors except Stateless.
2023-10-25 13:57:00 +01:00
Martin Evans
5b6408b072
Merge pull request #205 from martindevans/roundtrip_tokenization_investigation
...
RoundTrip Tokenization Errors
2023-10-24 20:46:48 +01:00
Martin Evans
a03fe003de
Fixed decoding of text "accumulating" over time (never properly clearing buffer)
2023-10-23 16:42:38 +01:00
Martin Evans
51d4411a58
Added two new classes for detokenization tasks:
...
- `AntipromptProcessor` accepts chunks of text and returns a value indicating if any antiprompt has been detected.
- `StreamingTokenDecoder` decodes tokens into text, maintaining some internal state to handle single characters which are encoded as multiple tokens.
Added tests for these classes and updated StatelessExecutor to use them.
Removed most DeTokenize methods, marked the rest as obsolete (should always use a `StreamingTokenDecoder`).
2023-10-23 00:33:50 +01:00
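The buffering behaviour described above (a single character may be split across several tokens, so raw bytes must be held until they form a complete character) can be sketched with Python's incremental UTF-8 decoder. This is an illustration of the concept only; the actual `StreamingTokenDecoder` is C# and operates on llama.cpp token ids:

```python
import codecs

# Each token's text arrives as raw bytes; an incremental decoder buffers
# incomplete multi-byte sequences instead of emitting garbage for them.
decoder = codecs.getincrementaldecoder("utf-8")()

# "€" (U+20AC) is three bytes in UTF-8; pretend it was split across two tokens.
first = decoder.decode(b"\xe2\x82")   # incomplete sequence: yields no text yet
second = decoder.decode(b"\xac!")     # completes the character: yields "€!"

print(repr(first), repr(second))
```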
Martin Evans
efdf3d630c
- Removed all `TokenToString` methods (it's never correct to use them, because sometimes one single character may be represented by multiple tokens).
...
- Built a new (hacky) `Detokenize` method which handles this
2023-10-22 21:43:36 +01:00
Rinne
231efe06f2
Update LLama/runtimes/build/LLamaSharp.Backend.Cpu.nuspec
...
Co-authored-by: Martin Evans <martindevans@gmail.com>
2023-10-23 00:01:09 +08:00
Rinne
ecf852c4e2
Update LLama/runtimes/build/LLamaSharp.Backend.MacMetal.nuspec
...
Co-authored-by: Martin Evans <martindevans@gmail.com>
2023-10-23 00:00:46 +08:00
Rinne
95669c2ea3
Update LLama/runtimes/build/LLamaSharp.Backend.Cuda12.nuspec
...
Co-authored-by: Martin Evans <martindevans@gmail.com>
2023-10-23 00:00:40 +08:00
Rinne
5eaebd68ba
Update LLama/runtimes/build/LLamaSharp.Backend.Cuda11.nuspec
...
Co-authored-by: Martin Evans <martindevans@gmail.com>
2023-10-23 00:00:34 +08:00
Rinne
6724b39713
Update LLama/runtimes/build/LLamaSharp.Backend.Cpu.nuspec
...
Co-authored-by: Martin Evans <martindevans@gmail.com>
2023-10-23 00:00:27 +08:00
Martin Evans
1d0620e634
Created a test that "roundtrips" strings through tokenization. This reveals some flaws with certain characters
2023-10-22 15:28:36 +01:00
Yaohui Liu
b7a7dc00b6
ci: fix typos.
2023-10-22 17:50:37 +08:00
Yaohui Liu
252992ec6e
ci: fix icon and typos.
2023-10-22 17:45:07 +08:00
Yaohui Liu
53eedf1428
ci: fix error.
2023-10-22 15:23:10 +08:00
Yaohui Liu
f9a98c6e23
ci: add auto release workflow.
2023-10-22 11:52:56 +08:00
Martin Evans
f621ec67e8
Fixed serialization
2023-10-20 15:04:18 +01:00
Martin Evans
768747c652
spelling
2023-10-20 14:57:55 +01:00
Martin Evans
b4e7f64e76
Added System.Text.Json serialization for `TensorSplitsCollectionConverter`
2023-10-20 14:55:01 +01:00
Martin Evans
281e58f059
Fixed default value
2023-10-20 14:35:06 +01:00
Martin Evans
04acbf8c42
Improved doc comment on `tensor_split`
2023-10-20 14:13:46 +01:00
Martin Evans
6a4cd506bd
Added a safe `TensorSplitsCollection` to the params which prevents incorrectly setting the `tensor_splits` collection
2023-10-20 14:10:20 +01:00
Martin Evans
15db194c17
Added multi GPU support
2023-10-20 13:43:46 +01:00
Martin Evans
328022b13d
Fixed merge conflicts
2023-10-19 21:14:45 +01:00
Martin Evans
7ec318aab5
Added logging to embedder too
2023-10-19 21:09:44 +01:00
Martin Evans
f1e5a8f995
- Passing the `ILogger` through to every call of `CreateContext`
...
- Passing `ILogger` into executors
2023-10-19 21:09:44 +01:00
sa_ddam213
4ec9aed47a
Revert LLamaSharp project changes
2023-10-20 08:29:26 +13:00
sa_ddam213
b4b4000342
Merge branch 'master' into upstream_master
...
# Conflicts:
# LLama.Web/Common/ModelOptions.cs
# LLama.Web/Services/ConnectionSessionService.cs
# LLama/LLamaStatelessExecutor.cs
# LLama/LLamaWeights.cs
2023-10-20 08:02:27 +13:00
Martin Evans
e89ca5cc17
Fixed a few minor warnings
2023-10-19 00:43:50 +01:00
Martin Evans
9daf586ba8
Assorted cleanup leftover after the huge change in the last PR (comments, syntax style, etc)
2023-10-19 00:26:30 +01:00
Martin Evans
d8434ea9d6
Merge pull request #185 from martindevans/wip_major_api_change
...
Major llama.cpp API Change
2023-10-18 20:50:32 +01:00
Martin Evans
1f8c94e386
Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538 )
2023-10-17 23:55:46 +01:00
Martin Evans
efb0664df0
- Added new binaries
...
- Fixed stateless executor out-of-context handling
- Fixed token tests
2023-10-17 23:39:41 +01:00
Martin Evans
b8f0eff080
- Added `GetCharCountImpl` tests, fixed handling of empty strings
...
- Added ifdef to remove `Deconstruct` extension on everything except `NETSTANDARD2_0`
2023-10-14 00:04:12 +01:00
Martin Evans
45118520fa
- Improved coverage of `GBNFGrammarParser` up to 96%
...
- Covered text transforms
- Removed unnecessary non-async transforms
2023-10-13 23:54:01 +01:00
Martin Evans
2a38808bca
- Added threads to context params, replaced all thread args with `uint?`
...
- Replaced all binaries
2023-10-12 18:49:41 +01:00
Martin Evans
4e9b1f8cdc
- Split extension methods into separate files
2023-10-12 15:38:26 +01:00
sa_ddam213
9b8de007dc
Propagate ILogger
2023-10-04 13:47:08 +13:00
Martin Evans
669ae47ef7
- Split parameters into two interfaces
...
- params contains a list of loras, instead of just one
2023-09-30 16:21:18 +01:00
Martin Evans
9a0a0ae9fe
Removed cloning support
2023-09-30 15:48:26 +01:00
Martin Evans
0d40338692
Fixed out-of-context handling in stateless executor
2023-09-29 23:53:07 +01:00
Martin Evans
b306ac23dd
Added `Decode` method to `SafeLLamaContextHandle`
2023-09-29 22:24:44 +01:00
Martin Evans
9e958e896b
safe handle for batch
2023-09-29 22:18:23 +01:00
Martin Evans
ce1fc51163
Added some more native methods
2023-09-29 16:05:19 +01:00
Martin Evans
bca55eace0
Initial changes to match the llama.cpp changes
2023-09-29 01:18:21 +01:00
Martin Evans
d58fcbbd13
Fixed antiprompt checking
2023-09-24 14:26:43 +01:00
Martin Evans
08f1615e60
- Converted LLamaStatelessExecutor to run `Exec` calls inside an awaited task. This unblocks async callers while the model is being evaluated.
...
- Added a "spinner" to the `StatelessModeExecute` demo, which spins while waiting for the next token (demonstrating that it's not blocked).
2023-09-23 15:22:57 +01:00
Martin Evans
fe54f6764f
- Added unit tests for extension methods
...
- Removed unused `AddRangeSpan` extension
2023-09-22 16:29:50 +01:00
Haiping
79fa74d59c
Merge pull request #177 from redthing1/fix/context-getstate
...
fix opaque GetState (fixes #176)
2023-09-19 06:51:36 -05:00
redthing1
b78044347c
fix opaque GetState (fixes #176)
2023-09-18 20:56:14 -07:00
Haiping
e1af7a96da
Merge pull request #175 from redthing1/feat/inferenceparams_record
...
make InferenceParams a record so we can use `with`
2023-09-18 19:44:18 -05:00
redthing1
296ba607de
make InferenceParams a record so we can use with
2023-09-18 13:27:26 -07:00
Haiping
10678a83d6
Merge pull request #65 from martindevans/alternative_dependency_loading
...
CPU Feature Detection
2023-09-17 10:21:37 -05:00
Haiping
f134c5af59
Merge pull request #163 from SignalRT/DefaultMetal
...
MacOS default build is now Metal (llama.cpp #2901)
2023-09-17 10:20:47 -05:00
Martin Evans
3f80190f85
Minimal changes required to remove non-async inference.
2023-09-14 21:04:14 +01:00
Martin Evans
b1e9d8240d
Merge pull request #149 from martindevans/removed_unused_inference_params
...
Removed unused properties of `InferenceParams` & `ModelParams`
2023-09-13 01:48:15 +01:00
Martin Evans
daf09eae64
Skipping tokenization of empty strings (saves allocating an empty array every time)
2023-09-12 01:03:27 +01:00
Martin Evans
466722dcff
Merge pull request #165 from martindevans/better_instruct_antiprompt_checking
...
better_instruct_antiprompt_checking
2023-09-11 00:32:43 +01:00
Martin Evans
d08a125020
Using the `TokensEndsWithAnyString` extensions for antiprompt checking in instruct executor. Simpler and more efficient.
2023-09-11 00:22:17 +01:00
Martin Evans
bba801f4b7
Added a property to get the KV cache size from a context
2023-09-11 00:10:08 +01:00
SignalRT
c41e448d0e
ggml-metal.metal MUST be copied to output folder
...
Metal depends on this file to execute, and the MacOS llama.cpp default is now Metal.
2023-09-10 20:42:15 +02:00
SignalRT
096293a026
MacOS: remove explicit Metal flag, as it is now the default
...
See "on macOS enable Metal by default" (#2901)
2023-09-10 20:42:15 +02:00
Martin Evans
b47977300a
Removed one more unused parameter
2023-09-09 14:57:47 +01:00
Martin Evans
a1b0349561
Removed `ModelAlias` property (unused)
2023-09-09 14:18:50 +01:00
Martin Evans
4dac142bd5
Merge pull request #160 from martindevans/GetState_fix
...
`GetState()` fix
2023-09-09 01:44:08 +01:00
Martin Evans
832bf7dbe0
Simplified implementation of `GetState` and fixed a memory leak (`bigMemory` was never freed)
2023-09-09 01:30:35 +01:00
Martin Evans
4f7b6ffdcc
Removed `GenerateResult` method that was only used in one place
2023-09-09 01:09:27 +01:00
sa_ddam213
09d8f434f2
Extract LLamaLogLevel, Remove Logger class
2023-09-09 10:25:05 +12:00
sa_ddam213
949b0cde16
Replace ILLamaLogger for ILogger
2023-09-09 10:13:07 +12:00
sa_ddam213
70b36f8996
Add Microsoft.Extensions.Logging.Abstractions, update any required deps
2023-09-09 09:52:11 +12:00
Martin Evans
d3b8ee988c
Beam Search (#155)
...
* Added the low level bindings to beam search.
2023-09-07 19:26:51 +01:00
Martin Evans
a09aa86324
Merge pull request #153 from martindevans/fix_savestate_OpenOrCreate
...
Changed `OpenOrCreate` to `Create`
2023-09-06 23:03:24 +01:00
Martin Evans
f366aa3abe
Changed `OpenOrCreate` to `Create` to fix #151
2023-09-06 22:35:41 +01:00
Martin Evans
77bd090150
Simplified `LLamaInteractExecutor` antiprompt matching by using new extension method
2023-09-06 22:26:36 +01:00
Martin Evans
614ba40948
- Added a `TokensEndsWithAnyString` extension to `IReadOnlyList<int>` which efficiently checks if a set of tokens ends with one of a set of strings.
...
- Minimal amount of characters converted
- Allocation free
- Added `TokensToSpan` to `SafeLlamaModelHandle` which converts as many tokens as possible into a character span
- Allocation free
2023-09-06 19:44:19 +01:00
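The idea behind the extension, converting only the minimal number of trailing tokens needed to compare against the longest antiprompt, can be illustrated with a small Python sketch (the function name and shapes here are illustrative, not the C# API):

```python
def tokens_end_with_any(token_texts, antiprompts):
    """Check whether the concatenated token texts end with any antiprompt.

    Walks backwards through the tokens and stops as soon as enough trailing
    characters have been gathered to match the longest antiprompt — the same
    "minimal amount of characters converted" idea as the C# extension.
    """
    longest = max((len(a) for a in antiprompts), default=0)
    tail = ""
    for text in reversed(token_texts):
        tail = text + tail
        if len(tail) >= longest:
            break
    return any(tail.endswith(a) for a in antiprompts)

# Example: the last few tokens spell out "User:" across token boundaries.
print(tokens_end_with_any(["Hello", " wor", "ld. User", ":"], ["User:"]))
```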
Martin Evans
d79a6556a1
Removed 3 unused properties of `InferenceParams`
2023-09-06 01:20:36 +01:00
Martin Evans
6a842014ac
Removed duplicate `llama_sample_classifier_free_guidance` method
2023-09-04 00:48:27 +01:00
Martin Evans
4a53cdc56b
Merge pull request #142 from SciSharp/rinne-dev
...
refactor: remove old version files.
2023-09-03 23:36:28 +01:00
Martin Evans
33035c82bf
- Removed `LLamaNewlineTokens` from `InteractiveExecutorState`. This is always set in the constructor from the context, so there's no point serializing it.
2023-09-03 18:22:39 +01:00
Yaohui Liu
18294a725e
refactor: remove old version files.
2023-09-02 22:24:07 +08:00
Martin Evans
8f58a40fb9
Added Linux dependency loading
2023-09-02 14:21:06 +01:00
Martin Evans
dd4957471f
Changed paths to match what the GitHub build action produces
2023-09-02 14:10:18 +01:00
Martin Evans
756a1ad0ba
Added a new way to load dependencies, performing CPU feature detection
2023-09-02 14:03:37 +01:00
Martin Evans
025741a73e
Fixed My Name
...
The D is for my middle name 😄
2023-09-02 13:45:06 +01:00
Yaohui Liu
20b5363601
fix: remove the history commit of embedding length property.
2023-09-02 12:56:02 +08:00
Yaohui Liu
3a847623ab
docs: update the docs to follow new version.
2023-09-02 12:51:51 +08:00
Yaohui Liu
ca6624edb3
Merge branch 'master' of github.com:SciSharp/LLamaSharp into rinne-dev
2023-09-02 12:03:35 +08:00
Rinne
4e83e48ad1
Merge pull request #122 from martindevans/gguf
...
Add GGUF support
2023-09-02 11:54:50 +08:00
Martin Evans
97349d93be
Merge branch 'gguf' of github.com:martindevans/LLamaSharp into gguf
2023-09-02 02:22:18 +01:00
Martin Evans
bcf06e2652
Added some comments on various native methods
2023-09-02 02:22:11 +01:00
Martin Evans
af680ac2d7
Created a hierarchy of exceptions for grammar format issues. This allows the base catch-all exception to be caught for general handling, or more specific exceptions to be caught for more specific handling.
2023-09-02 02:04:11 +01:00
Rinne
1533ee7dbf
Merge pull request #138 from drasticactions/semantic-kernel
...
Enable Semantic kernel support
2023-09-01 20:50:46 +08:00
Tim Miller
326c802be7
Have weights generate context
2023-08-31 22:19:29 +09:00
Tim Miller
3bca3b632e
New line
2023-08-31 17:31:13 +09:00
Tim Miller
9a1d6f99f2
Add Semantic Kernel support
2023-08-31 17:24:44 +09:00
Martin Evans
a70c7170dd
- Created a higher level `Grammar` class which is immutable and contains a list of grammar rules. This is the main "entry point" to the grammar system.
...
- Made all the mechanics of grammar parsing (GBNFGrammarParser, ParseState) internal. Just call `Grammar.Parse("whatever")`.
- Added a `GrammarRule` class which validates elements on construction (this allows constructing grammar without parsing GBNF).
- It should be impossible for a `GrammarRule` to represent an invalid rule.
2023-08-31 00:02:50 +01:00
SignalRT
fb007e5921
Changes to compile in VS Mac + change model to llama2
...
This commit includes changes to compile in VS Mac + changes to use llama2 instead of codellama.
It includes MacOS binaries in memory and metal
2023-08-30 22:08:29 +02:00
Mihai
24d3e1bfa8
Address PR review comment
2023-08-30 21:59:28 +03:00
Mihai
60790c5aac
Address code review comments (create custom exception, move printing to the ParseState class, rethrow error).
2023-08-30 21:06:45 +03:00
Mihai
2ae1891c13
Bug fixes after running tests.
...
SymbolIds is now a SortedDictionary (although I'm not sure it really needs to be) because the test was failing due to the expected values being in another order. The C++ data structure of SymbolIds is std::map<std::string, uint32_t>, so the items are ordered by key.
2023-08-30 16:18:05 +03:00
Mihai
0bd495276b
Add initial tests + fix bugs. Still WIP since the test is failing.
2023-08-30 14:10:56 +03:00
Mihai
0f373fcc6d
Finish grammar_parser translation from C++ to C#
2023-08-30 12:20:45 +03:00
Mihai
3c919b56fe
Use ReadOnlySpan everywhere instead of ReadOnlyMemory, and instead of returning a tuple, reference the ReadOnlySpan.
2023-08-30 11:23:55 +03:00
Mihai
8b4ec6d973
Address PR change requests
2023-08-30 09:24:08 +03:00
Mihai
7f31276bdf
[WIP] Translating the GrammarParser
2023-08-29 22:50:54 +03:00
Martin Evans
c9d08b943e
Added binaries for CUDA+Linux
2023-08-29 15:05:09 +01:00
Martin Evans
6711a59d0f
Included Linux deps
2023-08-28 20:02:59 +01:00
Martin Evans
ba49ea2991
Removed hardcoded paths from projects, modified Runtime.targets to exclude missing binaries
2023-08-28 19:53:34 +01:00
Martin Evans
2022b82947
Added binaries generated by this action: https://github.com/SciSharp/LLamaSharp/actions/runs/6002797872/job/16279896150
...
Based on this version: 6b73ef1201
2023-08-28 19:48:31 +01:00
sa_ddam213
a5d742b72c
Fix Tokenize of new line, Remove space inserts
2023-08-28 11:57:50 +12:00
Martin Evans
31287b5e6e
Rewritten TokenToSpan/TokenToString to better fit the new way it's done in llama.cpp with a few different options:
...
- Just convert it to a `string`, nice and simple
- Write the bytes to a `Span<byte>` no allocations
- Write the chars to a `StringBuilder` potentially no allocations
2023-08-27 00:15:56 +01:00
Martin Evans
0c98ae1955
Passing ctx to `llama_token_nl(_ctx)`
2023-08-27 00:15:55 +01:00
Martin Evans
6ffa28f964
Removed `LLAMA_MAX_DEVICES` (not used)
2023-08-27 00:14:40 +01:00
Martin Evans
2056078aef
Initial changes required for GGUF support
2023-08-27 00:14:40 +01:00
Martin Evans
826c6aaec3
cleaned up higher level code using the sampling API:
...
- Fixed multiple enumeration
- Fixed newline penalisation
2023-08-26 21:47:41 +01:00
Martin Evans
cf4754db44
Removed unnecessary parameters from some low level sampler methods
2023-08-26 21:38:24 +01:00
Martin Evans
f70525fec2
Two small improvements to the native sampling API:
...
- Modified `llama_sample_token_mirostat` and `llama_sample_token_mirostat_v2` to take `ref float` instead of as a `float*`. Less pointers is always good.
- Modified `llama_sample_repetition_penalty` and `llama_sample_frequency_and_presence_penalties` to take pointers instead of arrays. This allows the use of non-allocating types (e.g. Span) instead of arrays
- Modified higher level API to accept `Memory<int>` instead of `int[]`, which can be used to reduce allocations at call sites
2023-08-26 01:25:48 +01:00
Martin Evans
a911b77dec
Various minor changes, resolving about 100 ReSharper code quality warnings
2023-08-24 23:15:53 +01:00
Martin Evans
5a6c6de0dc
Merge pull request #115 from martindevans/model_params_record
...
ModelsParams record class
2023-08-24 22:54:23 +01:00
Martin Evans
70be6c7368
Removed `virtual` method in newly sealed class
2023-08-24 17:08:01 +01:00
Martin Evans
ebacdb666d
- Moved the lower level state get/set methods onto SafeLLamaContextHandle
...
- Used those methods to add a `Clone` method to SafeLLamaContextHandle
- Simplified `LLamaContext` by using the new methods
- Sealed `LLamaContext` and `LLamaEmbedder`
2023-08-24 17:03:27 +01:00
Martin Evans
77aa5fa0d0
Added `JsonConverter` attribute, so System.Text.Json serialization is seamless
2023-08-24 16:17:49 +01:00
Martin Evans
df80ec9161
Merge pull request #97 from martindevans/embedder_tests
...
Embedder Test
2023-08-24 02:08:39 +01:00
Martin Evans
058c4e84b1
Rewritten LLamaEmbedder to use `LLamaContext` instead of the lower level handles
2023-08-24 01:14:12 +01:00
Martin Evans
829f32b27d
- Added `Obsolete` attributes to the entire `OldVersion` namespace, so they can be removed in the future
...
- Minor changes to cleanup some of the compiler warnings
2023-08-24 00:59:32 +01:00
Martin Evans
ee772a2921
added `using` statement instead of full qualification
2023-08-24 00:24:16 +01:00
Martin Evans
93f24f8a51
Switched to properly typed `Encoding` property
2023-08-24 00:09:00 +01:00
zombieguy
45b01d5a78
Improved type conversion
...
Type conversion is now done in the property rather than the utils class and uses the System.Convert class to ensure consistency.
2023-08-23 19:36:14 +01:00
Martin Evans
29df14cd9c
Converted ModelParams into a `record` class. This has several advantages:
...
- Equality, hashing etc all implemented automatically
- Default values are defined in just one place (the properties) instead of the constructor as well
- Added test to ensure that serialization works properly
2023-08-23 00:58:25 +01:00
Martin Evans
2830e5755c
- Applied a lot of minor R# code quality suggestions. Lots of unnecessary imports removed.
...
- Deleted `NativeInfo` (internal class, not used anywhere)
2023-08-22 23:20:13 +01:00
Martin Evans
854532c08e
Merge pull request #112 from martindevans/classifier_free_guidance
...
Added native symbol for CFG
2023-08-22 18:35:13 +01:00
Martin Evans
4b7d718551
Added native symbol for CFG
2023-08-22 17:11:49 +01:00
Erin Loy
8f0b52eb09
Re-renaming some arguments to allow for easy deserialization from appsettings.json.
2023-08-22 09:09:22 -07:00
Martin Evans
9fc17f3136
Fixed unit tests
2023-08-22 14:16:20 +01:00
Martin Evans
759ae26f36
Merge branch 'master' into grammar_basics
2023-08-22 14:06:57 +01:00
Martin Evans
a9e6f21ab8
- Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring!
...
- Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around
2023-08-22 01:30:13 +01:00
Martin Evans
e7b217f462
Fixed out of context logic
2023-08-22 01:28:28 +01:00
Martin Evans
4738c26299
- Reduced context size of test, to speed it up
...
- Removed some unnecessary `ToArray` calls
- Initial pass on LLamaStatelessExecutor, the context overflow management is broken but I think I found where it's ported from
2023-08-22 01:28:28 +01:00
Martin Evans
ae8ef17a4a
- Added various convenience overloads to `LLamaContext.Eval`
...
- Converted `SafeLLamaContextHandle` to take a `ReadOnlySpan` for Eval, narrower type better represents what's really needed
2023-08-22 01:28:28 +01:00
Erin Loy
592a80840b
renamed some arguments in the ModelParams constructor so that the class can be serialized easily
2023-08-19 15:55:19 -07:00
Martin Evans
64416ca23c
- Created a slightly nicer way to create grammar (from `IReadOnlyList<IReadOnlyList<LLamaGrammarElement>>`)
...
- Integrated grammar into sampling
- Added a test for the grammar sampling
2023-08-17 19:29:15 +01:00
Martin Evans
0294bb1303
Some of the basics of the grammar API
2023-08-17 19:28:17 +01:00
Rinne
62331852bc
Merge pull request #90 from martindevans/proposal_multi_context
...
Multi Context
2023-08-17 21:59:05 +08:00
zombieguy
10f88ebd0e
Potential fix for .Net Framework issues (#103)
...
* Added a bool to sbyte Utils convertor
To avoid using any MarshalAs attribute (for .Net Framework support), this Utils method takes in a bool value and returns an sbyte: 1 for true, 0 for false.
* Changed all bool "MarshalAs" types to sbytes
Changed all previous BOOL types with "MarshalAs" attributes to sbytes, and changed all of their setters to use the Utils.BoolToSignedByte() convertor method.
* Fixed Utils bool convertor & added sbyte to bool
Improved the Utils bool convertor by just casting the sbyte value (getting rid of the unneeded sbyte array), and added an sbyte-to-bool convertor for the way back to a C# bool, treating any value above 0 as true.
* bool to & from sbyte conversions via properties
All 1-byte bools are now handled where they "sit", via public properties which perform the conversions, so all external data communicates as it did before.
2023-08-16 00:09:52 +01:00
Martin Evans
7ebff89f68
Merge pull request #101 from martindevans/llama_sample_classifier_free_guidance
...
llama_sample_classifier_free_guidance
2023-08-13 23:21:21 +01:00
Martin Evans
6c84accce8
Added `llama_sample_classifier_free_guidance` method from native API
2023-08-13 23:14:53 +01:00
Martin Evans
afe559ef1c
Added comments to `Logger` and fixed some nullability warnings
2023-08-13 01:29:33 +01:00
Martin Evans
6473f8d5e5
Temporarily added a `Console.WriteLine` into the test, to print the embedding vector for "cat" in CI
2023-08-13 01:10:09 +01:00
Martin Evans
1b35be2e0c
Added some additional basic tests
2023-08-13 01:10:09 +01:00
Martin Evans
f5a260926f
Renamed `EmbeddingCount` to `EmbeddingSize` in higher level class
2023-08-13 01:10:09 +01:00
Martin Evans
479ff57853
Renamed `EmbeddingCount` to `EmbeddingSize`
2023-08-13 01:10:09 +01:00
Martin Evans
d0a7a8fcd6
- Cleaned up disposal in LLamaContext
...
- sealed some classes not intended to be extended
2023-08-13 01:10:08 +01:00
Martin Evans
4d741d24f2
Marked old `LLamaContext` constructor obsolete
2023-08-13 01:10:08 +01:00
Martin Evans
20bdc2ec6f
- Apply LoRA in `LLamaWeights.LoadFromFile`
...
- Sanity checking that weights are not disposed when creating a context from them
- Further simplified `Utils.InitLLamaContextFromModelParams`
2023-08-13 01:10:08 +01:00
Martin Evans
e2fe08a9a2
Added a higher level `LLamaWeights` wrapper around `SafeLlamaModelHandle`
2023-08-13 01:10:08 +01:00
Martin Evans
fda7e1c038
Fixed mirostat/mirostate
2023-08-13 01:10:08 +01:00
Martin Evans
f3511e390f
WIP demonstrating changes to support multi-context. You can see this in use in `TalkToYourself`, along with notes on what still needs improving.
...
The biggest single change is renaming `LLamaModel` to `LLamaContext`
2023-08-13 01:10:08 +01:00
Martin Evans
d7f971fc22
Improved `NativeApi` file a bit:
...
- Added some more comments
- Modified `llama_tokenize` to not allocate
- Modified `llama_tokenize_native` to take a pointer instead of an array, allowing use with no allocations
- Removed GgmlInitParams (not used)
2023-08-12 00:45:23 +01:00
Martin Evans
841cf88e3b
Merge pull request #96 from martindevans/minor_quantizer_improvements
...
Minor quantizer improvements
2023-08-10 18:01:40 +01:00
Martin Evans
ce325b49c7
Rewritten comments
2023-08-10 17:00:54 +01:00
Martin Evans
b69f4bc40e
- Expanded range of supported types in quantizer to match llama.cpp
...
- Rewritten `LLamaFtype` parsing to support any substring which uniquely matches a single enum variant
2023-08-10 16:58:00 +01:00
sa_ddam213
a67ea36dd9
Typo and formatting
2023-08-11 00:37:33 +12:00
sa_ddam213
726987b761
Add native logging output
2023-08-10 23:01:50 +12:00
Martin Evans
acd91341e6
Added lots of comments to all the LLamaFtype variants
2023-08-10 02:14:21 +01:00
Yaohui Liu
ee2a5f064e
Merge branch 'master' of github.com:SciSharp/LLamaSharp into rinne-dev
2023-08-08 21:41:48 +08:00
Yaohui Liu
3a1daa98a3
feat: add the api to get the embedding length of the model.
2023-08-08 21:41:33 +08:00
Martin Evans
270c6d55ef
Merge pull request #88 from martindevans/fix_serialization_nan
...
Fix serialization error due to NaN
2023-08-08 14:04:18 +01:00
Martin Evans
91bcefc852
comment on IModelParamsExtensions
2023-08-07 23:46:19 +01:00
Martin Evans
9cdc72aa67
Fixed `ToLlamaContextParams` using the wrong parameter for `use_mmap`
2023-08-07 23:45:05 +01:00
Martin Evans
bab3b46f0c
Merge pull request #82 from martindevans/tokenization_cleanup
...
Utils Cleanup
2023-08-07 23:20:24 +01:00
Martin Evans
b5de3ee5aa
Fixed some final mentions of "mirostate" instead of "mirostat"
2023-08-07 21:12:56 +01:00
Martin Evans
be52737488
Using a nullable float instead of NaN, this should fix the serialization issue reported in #85
2023-08-07 21:09:18 +01:00
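The underlying issue is that JSON has no representation for NaN, so strict serializers reject it; modelling "unset" as a nullable value serializes cleanly. A Python sketch of the same trade-off (the field name here is illustrative, not the actual LLamaSharp property):

```python
import json

# Strict JSON encoders reject NaN outright, since it is not valid JSON...
try:
    json.dumps({"mirostat_mu": float("nan")}, allow_nan=False)
    raise AssertionError("NaN should not have serialized")
except ValueError:
    pass

# ...whereas a nullable "unset" value round-trips without trouble.
serialized = json.dumps({"mirostat_mu": None})
restored = json.loads(serialized)
assert restored["mirostat_mu"] is None
print(serialized)
```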
sa_ddam213
2d1269cae9
Access to IModelParamsExtensions
2023-08-08 07:54:40 +12:00
Martin Evans
1fceeaf352
Applied fix from #84 (antiprompt does not work in stateless executor)
2023-08-07 19:00:59 +01:00
Yaohui Liu
d609b0e1d5
Merge branch 'master' of github.com:SciSharp/LLamaSharp into rinne-dev
2023-08-08 00:16:38 +08:00
Yaohui Liu
b60c8bd285
fix: antiprompt does not work in stateless executor.
2023-08-08 00:16:23 +08:00
Martin Evans
2b2d3af26b
Moved `Eval` out of `Utils` and into `SafeLLamaContextHandle`
2023-08-07 15:15:34 +01:00
Martin Evans
7fabcc1849
One last `TokenToString` case
2023-08-07 15:15:34 +01:00
Martin Evans
0e5e00e300
Moved `TokenToString` from Utils into `SafeLLamaContextHandle` (thin wrappers around the same method in `SafeLlamaModelHandle`)
2023-08-07 15:15:34 +01:00
Martin Evans
2d811b2603
- Moved `GetLogits` into `SafeLLamaContextHandle`
...
- Added disposal check into `SafeLLamaContextHandle`
2023-08-07 15:13:24 +01:00
Martin Evans
cd3cf2b77d
- Moved tokenization from `Utils.Tokenize` into `SafeLLamaContextHandle.Tokenize`, one less thing in `Utils`.
...
- Also refactored it to return an `int[]` instead of an `IEnumerable<int>`, solving the "multiple enumeration" problems at the source!
2023-08-07 15:13:24 +01:00
Martin Evans
73882de591
Merge pull request #81 from martindevans/tensor_splits_array
...
Improved Tensor Splits
2023-08-07 13:36:38 +01:00
Martin Evans
bd3d8d3dc4
Cleaned up multiple enumeration in FixedSizeQueue
2023-08-07 02:23:46 +01:00
Martin Evans
f2499371ea
Pulled conversion of a `IModelParams` into a `LLamaContextParams` out into an extension method which can be used in other places.
2023-08-07 01:55:36 +01:00
Martin Evans
f1111a9f8b
Using a pin instead of a `fixed` block
2023-08-07 01:20:34 +01:00
Martin Evans
685eb3b9c2
Replaced `nint` with `float[]?` in Model params, which is much more user friendly!
2023-08-06 20:29:38 +01:00
sa_ddam213
e02d0c3617
Merge branch 'master' of https://github.com/SciSharp/LLamaSharp into upstream_master
2023-08-07 03:34:37 +12:00
Rinne
bfe9cc8961
Merge pull request #78 from SciSharp/rinne-dev
...
feat: update the llama backends.
2023-08-06 20:59:24 +08:00
sa_ddam213
e46646b8db
Merge branch 'master' of https://github.com/SciSharp/LLamaSharp into upstream_master
2023-08-07 00:01:37 +12:00
Yaohui Liu
bb46a990d0
fix: add bug info for native api.
2023-08-06 14:46:23 +08:00
Yaohui Liu
5fe13bd9f7
fix: update the dlls.
2023-08-06 13:46:57 +08:00
sa_ddam213
372894e1d4
Expose some native classes
2023-08-06 14:44:46 +12:00
sa_ddam213
bac9cba01a
InferenceParams abstractions
2023-08-06 11:03:45 +12:00
sa_ddam213
2a04e31b7d
ModelParams abstraction
2023-08-06 10:44:54 +12:00
Yaohui Liu
546ba28a68
fix: ci error caused by branch merge.
2023-08-06 01:48:31 +08:00
Yaohui Liu
fc17e91d1a
feat: add backend for MACOS.
2023-08-06 01:30:56 +08:00
Yaohui Liu
9fcbd16b74
Merge branch 'master' of github.com:SciSharp/LLamaSharp into rinne-dev
2023-08-06 01:30:03 +08:00
Yaohui Liu
2968125daf
feat: update the llama backends.
2023-08-06 01:22:24 +08:00
Martin Evans
fe3bd11dfa
Merge branch 'master' into master
2023-08-05 16:56:18 +01:00
Martin Evans
7ef07104e7
Added queue fix, so that CI can pass
2023-08-05 14:38:47 +01:00
SignalRT
348f2c7d72
Update llama.cpp binaries to 5f631c2 and align the context to that version
...
It solves the problem with netstandard2 (is netstandard2 really still a thing right now?)
Change context to solve problems.
5f631c26794b6371fcf2660e8d0c53494a5575f7
2023-08-05 12:45:34 +02:00
Rinne
075b785a4d
Merge branch 'master' into fixed_mirostate_mu
2023-08-05 08:59:47 +08:00
Rinne
c641dbdb83
Merge pull request #69 from martindevans/fixed_mirostat_spelling
...
Fixed Spelling Mirostate -> Mirostat
2023-08-05 08:56:52 +08:00
Rinne
8d37abd787
Merge pull request #68 from martindevans/sampling_improvements
...
Fixed Memory pinning in Sampling API
2023-08-05 08:55:12 +08:00
Rinne
1d29b240b2
Merge pull request #64 from martindevans/new_llama_state_loading_mechanism
...
Low level new loading system
2023-08-05 08:47:28 +08:00
Martin Evans
add3d5528b
Removed `MarshalAs` on array
2023-08-03 14:16:41 +01:00
Martin Evans
2245b84906
Update LLamaContextParams.cs
2023-08-02 23:13:07 +01:00
Martin Evans
c64507cb41
Correctly passing through mu value to mirostat instead of resetting it every time.
2023-07-30 00:15:52 +01:00
Rinne
cd015055a8
Merge branch 'master' into more_multi_enumeration_fixes
2023-07-30 00:45:38 +08:00
sa_ddam213
3e252c81f6
LLamaContextParams epsilon and tensor split changes
2023-07-28 19:15:19 +12:00
Martin Evans
36735f7908
Fixed spelling of "mirostat" instead of "mirostate"
2023-07-27 23:11:25 +01:00
Martin Evans
ec49bdd6eb
- Most importantly: fixed an issue in `SamplingApi` where `Memory` was pinned, but never unpinned!
...
- Moved repeated code to convert `LLamaTokenDataArray` into a `LLamaTokenDataArrayNative` into a helper method.
- Modified all call sites to dispose the `MemoryHandle`
- Saved one copy of the `List<LLamaTokenData>` into a `LLamaTokenData[]` in `LlamaModel`
2023-07-27 20:45:59 +01:00
Martin Evans
6985d3ab60
Added comments on two properties
2023-07-27 18:58:29 +01:00
Martin Evans
c974c8429e
Removed leftover `using`
2023-07-25 20:30:10 +01:00
Martin Evans
afb9d24f3a
Added model `Tokenize` method
2023-07-25 20:29:35 +01:00
Martin Evans
369c915afe
Added TokenToString conversion on model handle
2023-07-25 16:55:04 +01:00
Martin Evans
b721072aa5
Exposed some extra model properties on safe handle
2023-07-25 16:41:17 +01:00
Martin Evans
44b1e93609
Moved LoRA loading into `SafeLlamaModelHandle`
2023-07-25 16:35:24 +01:00
Martin Evans
c95b14d8b3
- Fixed null check
...
- Additional comments
2023-07-25 16:23:25 +01:00
Martin Evans
f16aa58e12
Updated to use the new loading system in llama (llama_state). This new system has split model weights and contexts into two separate things, allowing one set of weights to be shared between many contexts.
...
This change _only_ implements the low level API and makes no effort to update the LlamaSharp higher level abstraction.
It is built upon llama `b3f138d`, necessary DLLs are **not** included in this commit.
2023-07-25 01:18:12 +01:00
Martin Evans
8848fc6e3d
Fixed 2 more "multi enumeration" issues
2023-07-25 00:19:30 +01:00
Martin Evans
ad28a5acdb
Merge branch 'master' into fix_multiple_enumeration
2023-07-24 22:13:49 +01:00
Rinne
4d7d4f2bfe
Merge pull request #59 from saddam213/master
...
Instruct & Stateless web example implemented
2023-07-24 23:28:04 +08:00
Rinne
66d6b00b49
Merge pull request #57 from martindevans/larger_states
...
Larger states
2023-07-24 23:10:39 +08:00