Commit Graph

13 Commits

Author SHA1 Message Date
Martin Evans 1f8c94e386 Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538) 2023-10-17 23:55:46 +01:00
Martin Evans efb0664df0 - Added new binaries
- Fixed stateless executor out-of-context handling
- Fixed token tests
2023-10-17 23:39:41 +01:00
Martin Evans 669ae47ef7 - Split parameters into two interfaces
- params contains a list of loras, instead of just one
2023-09-30 16:21:18 +01:00
Martin Evans ce1fc51163 Added some more native methods 2023-09-29 16:05:19 +01:00
Martin Evans bca55eace0 Initial changes to match the llama.cpp changes 2023-09-29 01:18:21 +01:00
Martin Evans daf09eae64 Skipping tokenization of empty strings (saves allocating an empty array every time) 2023-09-12 01:03:27 +01:00
Martin Evans bba801f4b7 Added a property to get the KV cache size from a context 2023-09-11 00:10:08 +01:00
SignalRT fb007e5921 Changes to compile in VS Mac + change model to llama2
This commit includes changes to compile in VS Mac and changes to use llama2 instead of codellama.

It includes MacOS binaries in memory and metal
2023-08-30 22:08:29 +02:00
Martin Evans 95dc12dd76 Switched to codellama-7b.gguf in tests (probably temporarily) 2023-08-27 00:15:56 +01:00
Martin Evans 0c98ae1955 Passing ctx to `llama_token_nl(_ctx)` 2023-08-27 00:15:55 +01:00
Martin Evans 2830e5755c - Applied a lot of minor R# code quality suggestions. Lots of unnecessary imports removed.
- Deleted `NativeInfo` (internal class, not used anywhere)
2023-08-22 23:20:13 +01:00
Martin Evans a9e6f21ab8 - Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring!
- Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around
2023-08-22 01:30:13 +01:00
Martin Evans 1b35be2e0c Added some additional basic tests 2023-08-13 01:10:09 +01:00