Martin Evans
|
1f8c94e386
|
Added in the `special` parameter to the tokenizer (introduced in https://github.com/ggerganov/llama.cpp/pull/3538)
|
2023-10-17 23:55:46 +01:00 |
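The `special` flag referenced above controls whether special-token strings (such as `<s>` or `</s>`) appearing in the input text are parsed as single atomic token ids or tokenized as ordinary text. A minimal toy sketch of that behavior, assuming a hypothetical character-level tokenizer (this is not the llama.cpp API):

```python
# Toy illustration of a `special` tokenizer flag: with special=True,
# known special-token strings map to single atomic ids; with
# special=False they are tokenized as plain text.
# All names here are hypothetical, for illustration only.

SPECIAL_TOKENS = {"<s>": 1, "</s>": 2}

def tokenize(text, special=False):
    """Tiny character-level tokenizer used only to show the flag's effect."""
    tokens = []
    i = 0
    while i < len(text):
        if special:
            # Try to match a special token at the current position.
            match = next((s for s in SPECIAL_TOKENS if text.startswith(s, i)), None)
            if match is not None:
                tokens.append(SPECIAL_TOKENS[match])
                i += len(match)
                continue
        # Fall back to per-character ids (offset past the special ids).
        tokens.append(100 + ord(text[i]))
        i += 1
    return tokens
```

With `special=True`, `"<s>hi"` becomes three tokens (one for the marker, two for the text); with `special=False` the marker is tokenized character by character like any other string.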
Martin Evans
|
efb0664df0
|
- Added new binaries
- Fixed stateless executor out-of-context handling
- Fixed token tests
|
2023-10-17 23:39:41 +01:00 |
Martin Evans
|
669ae47ef7
|
- Split parameters into two interfaces
- params contains a list of loras, instead of just one
|
2023-09-30 16:21:18 +01:00 |
Martin Evans
|
ce1fc51163
|
Added some more native methods
|
2023-09-29 16:05:19 +01:00 |
Martin Evans
|
bca55eace0
|
Initial changes to match the llama.cpp changes
|
2023-09-29 01:18:21 +01:00 |
Martin Evans
|
daf09eae64
|
Skipping tokenization of empty strings (saves allocating an empty array every time)
|
2023-09-12 01:03:27 +01:00 |
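The empty-string fast path mentioned above can be sketched as follows: return a single shared empty result instead of calling into the tokenizer (and allocating a fresh array) on every empty input. The native call is stubbed and all names are hypothetical:

```python
# Sketch of the empty-string tokenization shortcut: an empty input
# short-circuits to a shared, immutable empty result, avoiding both
# the native call and a per-call array allocation.
# Names are hypothetical, for illustration only.

EMPTY = ()  # shared immutable empty result, allocated once

def native_tokenize(text):
    # Stand-in for the real native tokenizer call.
    return [ord(c) for c in text]

def tokenize(text):
    if not text:
        return EMPTY  # no allocation, no native call
    return native_tokenize(text)
```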
Martin Evans
|
bba801f4b7
|
Added a property to get the KV cache size from a context
|
2023-09-11 00:10:08 +01:00 |
SignalRT
|
fb007e5921
|
Changes to compile in VS Mac + change model to llama2
This commit includes changes to compile in VS Mac + changes to use llama2 instead of codellama.
It includes macOS binaries in memory and metal
|
2023-08-30 22:08:29 +02:00 |
Martin Evans
|
95dc12dd76
|
Switched to codellama-7b.gguf in tests (probably temporarily)
|
2023-08-27 00:15:56 +01:00 |
Martin Evans
|
0c98ae1955
|
Passing ctx to `llama_token_nl(_ctx)`
|
2023-08-27 00:15:55 +01:00 |
Martin Evans
|
2830e5755c
|
- Applied a lot of minor R# code quality suggestions. Lots of unnecessary imports removed.
- Deleted `NativeInfo` (internal class, not used anywhere)
|
2023-08-22 23:20:13 +01:00 |
Martin Evans
|
a9e6f21ab8
|
- Creating and destroying contexts in the stateless executor, saving memory. It now uses zero memory when not inferring!
- Passing encoding in the `IModelParams`, which reduces how often encoding needs to be passed around
|
2023-08-22 01:30:13 +01:00 |
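The stateless-executor change above can be sketched as a lifecycle pattern: rather than holding a context for the executor's whole lifetime, create one per inference call and dispose of it immediately afterwards, so an idle executor retains only the shared model. Class and method names below are hypothetical, for illustration only:

```python
# Sketch of the stateless-executor pattern: a context is created on
# demand for each call and freed before returning, so no context
# memory is held while the executor is idle.
# All names are hypothetical, for illustration only.

class FakeContext:
    live = 0  # count of live contexts, to make the lifecycle visible

    def __init__(self, model):
        self.model = model
        FakeContext.live += 1

    def infer(self, prompt):
        return prompt.upper()  # stand-in for real inference

    def dispose(self):
        FakeContext.live -= 1

class StatelessExecutor:
    def __init__(self, model):
        self.model = model  # only the shared model is retained

    def infer(self, prompt):
        ctx = FakeContext(self.model)  # created on demand...
        try:
            return ctx.infer(prompt)
        finally:
            ctx.dispose()  # ...and freed before the call returns
```

The `try`/`finally` mirrors the design choice in the commit: the per-call context is guaranteed to be released even if inference fails, which is what lets the executor use zero context memory between calls.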
Martin Evans
|
1b35be2e0c
|
Added some additional basic tests
|
2023-08-13 01:10:09 +01:00 |