SignalRT
348f2c7d72
Update llama.cpp binaries to 5f631c2 and align the context to that version
...
It solves the problem with netstandard2 (is netstandard2 really still a thing right now?)
Changed the context to solve those problems.
5f631c26794b6371fcf2660e8d0c53494a5575f7
2023-08-05 12:45:34 +02:00
Rinne
8d37abd787
Merge pull request #68 from martindevans/sampling_improvements
...
Fixed Memory pinning in Sampling API
2023-08-05 08:55:12 +08:00
Martin Evans
add3d5528b
Removed `MarshalAs` on array
2023-08-03 14:16:41 +01:00
Martin Evans
2245b84906
Update LLamaContextParams.cs
2023-08-02 23:13:07 +01:00
sa_ddam213
3e252c81f6
LLamaContextParams epsilon and tensor split changes
2023-07-28 19:15:19 +12:00
Martin Evans
ec49bdd6eb
- Most importantly: fixed an issue in `SamplingApi` where `Memory` was pinned but never unpinned!
...
- Moved the repeated code that converts a `LLamaTokenDataArray` into a `LLamaTokenDataArrayNative` into a helper method.
- Modified all call sites to dispose the `MemoryHandle`
- Saved one copy of the `List<LLamaTokenData>` into a `LLamaTokenData[]` in `LlamaModel`
2023-07-27 20:45:59 +01:00
Martin Evans
6985d3ab60
Added comments on two properties
2023-07-27 18:58:29 +01:00
Martin Evans
c974c8429e
Removed leftover `using`
2023-07-25 20:30:10 +01:00
Martin Evans
afb9d24f3a
Added model `Tokenize` method
2023-07-25 20:29:35 +01:00
Martin Evans
369c915afe
Added TokenToString conversion on model handle
2023-07-25 16:55:04 +01:00
Martin Evans
b721072aa5
Exposed some extra model properties on safe handle
2023-07-25 16:41:17 +01:00
Martin Evans
44b1e93609
Moved LoRA loading into `SafeLlamaModelHandle`
2023-07-25 16:35:24 +01:00
Martin Evans
c95b14d8b3
- Fixed null check
...
- Additional comments
2023-07-25 16:23:25 +01:00
Martin Evans
f16aa58e12
Updated to use the new loading system in llama (llama_state). This new system has split model weights and contexts into two separate things, allowing one set of weights to be shared between many contexts.
...
This change _only_ implements the low level API and makes no effort to update the LlamaSharp higher level abstraction.
It is built upon llama `b3f138d`, necessary DLLs are **not** included in this commit.
2023-07-25 01:18:12 +01:00
Rinne
c5e8b3eba2
Merge pull request #56 from martindevans/memory_mapped_save_loading_and_saving
...
Memory Mapped LoadState/SaveState
2023-07-24 22:49:00 +08:00
Rinne
d17fa991cc
Merge pull request #53 from martindevans/xml_docs_fixes
...
XML docs fixes
2023-07-24 22:31:51 +08:00
Rinne
1b0523f630
Merge branch 'master' into master
2023-07-22 23:27:50 +08:00
Martin Evans
4d72420a04
Replaced `SaveState` and `LoadState` implementations. These new implementations map the file into memory and then pass the pointer directly into the native API. This improves things in two ways:
...
- A C# array cannot exceed 2,147,483,591 bytes. In my own use of LlamaSharp I encountered this limit.
- This saves an extra copy of the entire state data into a C# `byte[]`, so it should be faster.
This does _not_ fix some other places where `GetStateData` is used. I'll look at those in a separate PR.
2023-07-21 18:54:31 +01:00
Martin Evans
2e76b79af6
Various minor XML docs fixes
2023-07-20 16:07:53 +01:00
SignalRT
56a37a0d7d
Update to latest llama.cpp
...
Adapted to the interface change in llama_backend_init
2023-07-15 11:42:19 +02:00
unknown
dba866ffcf
Update API method name
2023-07-13 22:39:26 -07:00
Yaohui Liu
1062fe1a7e
feat: upgrade the native libraries.
2023-06-21 15:21:27 +08:00
Yaohui Liu
9850417a12
feat: update quantize native params.
2023-06-20 23:32:58 +08:00
Yaohui Liu
3bf74ec9b9
feat: add chat session for refactored code.
2023-06-12 02:47:25 +08:00
Yaohui Liu
264fb9a706
refactor: LLamaModel and LLamaExecutor.
2023-06-10 18:37:58 +08:00
Yaohui Liu
3a62f087fe
fix: encoding error when using other languages.
2023-06-03 18:51:20 +08:00
Yaohui Liu
18c2ff2395
refactor: instruct mode and examples.
2023-05-21 20:36:49 +08:00
Yaohui Liu
55d5a8ae51
fix: quantization error with fp16.
2023-05-20 23:51:22 +08:00
Yaohui Liu
19979f664a
feat: support loading and saving state.
2023-05-20 14:01:20 +08:00
Yaohui Liu
00d91cf99e
refactor: some parts of code of LLamaModel.
2023-05-18 03:59:55 +08:00
Yaohui Liu
1fca06dc7f
fix: n_gpu_layers missing in llama context.
2023-05-17 04:22:54 +08:00
Yaohui Liu
4314f64b9c
feat: add check for backend package.
2023-05-17 03:40:45 +08:00
Yaohui Liu
6ffcb5306b
refactor: use official api of quantization instead.
2023-05-13 15:02:19 +08:00
Yaohui Liu
0958bbac2c
feat: add get-embedding api to LLamaModel.
2023-05-13 02:08:03 +08:00
Yaohui Liu
33067f990f
feat: run quantization in csharp.
2023-05-11 17:38:28 +08:00
Yaohui Liu
118d410d52
build: revise build information.
2023-05-11 13:57:57 +08:00
Yaohui Liu
856d6549de
build: add linux support.
2023-05-11 04:20:56 +08:00
Yaohui Liu
02524ae4eb
build: add package information.
2023-05-11 04:07:02 +08:00
Yaohui Liu
d6a7997e46
feat: add gpt model.
2023-05-10 20:48:16 +08:00
Yaohui Liu
5a79edeb51
feat: add the framework and basic usages.
2023-05-10 02:13:41 +08:00