Commit Graph

240 Commits

Author SHA1 Message Date
SignalRT 348f2c7d72 Update llama.cpp binaries to 5f631c2 and align the context to that version
This solves the problem with netstandard2 (is netstandard2 really still a thing?)
Changed the context to solve problems.

5f631c26794b6371fcf2660e8d0c53494a5575f7
2023-08-05 12:45:34 +02:00
Rinne 8d37abd787
Merge pull request #68 from martindevans/sampling_improvements
Fixed Memory pinning in Sampling API
2023-08-05 08:55:12 +08:00
Martin Evans add3d5528b Removed `MarshalAs` on array 2023-08-03 14:16:41 +01:00
Martin Evans 2245b84906
Update LLamaContextParams.cs 2023-08-02 23:13:07 +01:00
sa_ddam213 3e252c81f6 LLamaContextParams epsilon and tensor split changes 2023-07-28 19:15:19 +12:00
Martin Evans ec49bdd6eb - Most importantly: fixed an issue in `SamplingApi` where `Memory` was pinned but never unpinned!
- Moved the repeated code that converts a `LLamaTokenDataArray` into a `LLamaTokenDataArrayNative` into a helper method.
  - Modified all call sites to dispose the `MemoryHandle`.
- Saved one copy of the `List<LLamaTokenData>` into a `LLamaTokenData[]` in `LlamaModel`.
2023-07-27 20:45:59 +01:00
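The pinning bug described above follows a general .NET pattern: `Memory<T>.Pin()` returns a `MemoryHandle` that keeps the buffer pinned until it is disposed. A minimal sketch of that pattern (standalone illustration, not the actual `SamplingApi` code):

```csharp
using System;
using System.Buffers;

class PinningExample
{
    static void Main()
    {
        Memory<int> memory = new int[] { 1, 2, 3 };

        // Pin() returns a MemoryHandle; until it is disposed, the GC
        // cannot move or reclaim the underlying array. The bug was
        // pinning without ever disposing; wrapping the handle in a
        // `using` guarantees every call site unpins it.
        using (MemoryHandle handle = memory.Pin())
        {
            // While pinned, handle.Pointer can be passed to native code.
        }
        // Handle disposed here: the memory is unpinned again.
        Console.WriteLine("unpinned");
    }
}
```

Forgetting `Dispose` on the handle is a quiet leak: the array stays pinned for the process lifetime, fragmenting the GC heap.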
Martin Evans 6985d3ab60 Added comments on two properties 2023-07-27 18:58:29 +01:00
Martin Evans c974c8429e Removed leftover `using` 2023-07-25 20:30:10 +01:00
Martin Evans afb9d24f3a Added model `Tokenize` method 2023-07-25 20:29:35 +01:00
Martin Evans 369c915afe Added TokenToString conversion on model handle 2023-07-25 16:55:04 +01:00
Martin Evans b721072aa5 Exposed some extra model properties on safe handle 2023-07-25 16:41:17 +01:00
Martin Evans 44b1e93609 Moved LoRA loading into `SafeLlamaModelHandle` 2023-07-25 16:35:24 +01:00
Martin Evans c95b14d8b3 - Fixed null check
- Additional comments
2023-07-25 16:23:25 +01:00
Martin Evans f16aa58e12 Updated to use the new loading system in llama (llama_state). This new system has split model weights and contexts into two separate things, allowing one set of weights to be shared between many contexts.
This change _only_ implements the low level API and makes no effort to update the LlamaSharp higher level abstraction.

It is built upon llama `b3f138d`, necessary DLLs are **not** included in this commit.
2023-07-25 01:18:12 +01:00
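The weights/context split described in this commit can be illustrated with a hypothetical sketch (the type names below are illustrative, not the real `SafeLlamaModelHandle` API): the expensive weights are loaded once, and any number of cheap contexts share them while keeping independent evaluation state.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-ins for model weights and a per-conversation context.
class Weights
{
    public string Path;
    public Weights(string path) { Path = path; }
}

class Context
{
    public Weights Shared;                       // shared, loaded once
    public List<int> Tokens = new List<int>();   // independent per context
    public Context(Weights weights) { Shared = weights; }
}

class Program
{
    static void Main()
    {
        // Load the (large) weights a single time...
        var weights = new Weights("model.bin");

        // ...then create several contexts over the same weights.
        var a = new Context(weights);
        var b = new Context(weights);
        a.Tokens.Add(1);
        b.Tokens.Add(2);

        // Both contexts reference the exact same weights object.
        Console.WriteLine(ReferenceEquals(a.Shared, b.Shared)); // prints True
    }
}
```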
Rinne c5e8b3eba2
Merge pull request #56 from martindevans/memory_mapped_save_loading_and_saving
Memory Mapped LoadState/SaveState
2023-07-24 22:49:00 +08:00
Rinne d17fa991cc
Merge pull request #53 from martindevans/xml_docs_fixes
XML docs fixes
2023-07-24 22:31:51 +08:00
Rinne 1b0523f630
Merge branch 'master' into master 2023-07-22 23:27:50 +08:00
Martin Evans 4d72420a04 Replaced `SaveState` and `LoadState` implementations. These new implementations map the file into memory and then pass the pointer directly into the native API. This improves things in two ways:
- A C# array cannot exceed 2,147,483,591 bytes. In my own use of LlamaSharp I encountered this limit.
- This saves an extra copy of the entire state data into a C# `byte[]`, so it should be faster.

This does _not_ fix some other places where `GetStateData` is used. I'll look at those in a separate PR.
2023-07-21 18:54:31 +01:00
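The memory-mapped approach described above can be sketched with the standard `MemoryMappedFile` API (a simplified standalone illustration, not the actual `LoadState` code, which passes the mapped pointer directly into the native API):

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;

class MmapStateExample
{
    static void Main()
    {
        // A small temp file standing in for a saved llama state.
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 42, 0, 0, 0 });

        long size = new FileInfo(path).Length;
        using (var mmf = MemoryMappedFile.CreateFromFile(path, FileMode.Open))
        using (var view = mmf.CreateViewAccessor(0, size, MemoryMappedFileAccess.Read))
        {
            // The OS pages the file in on demand: no managed byte[] is
            // allocated, so the 2,147,483,591-byte array limit never
            // applies, and no extra copy of the state is made.
            byte first = view.ReadByte(0);
            Console.WriteLine(first); // prints 42
        }
        File.Delete(path);
    }
}
```

For files larger than 2 GB this is the only practical option in C#, since a single managed array simply cannot hold the data.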
Martin Evans 2e76b79af6 Various minor XML docs fixes 2023-07-20 16:07:53 +01:00
SignalRT 56a37a0d7d Update to latest llama.cpp
Adapt to the interface change in llama_backend_init
2023-07-15 11:42:19 +02:00
unknown dba866ffcf Update API method name 2023-07-13 22:39:26 -07:00
Yaohui Liu 1062fe1a7e
feat: upgrade the native libraries. 2023-06-21 15:21:27 +08:00
Yaohui Liu 9850417a12
feat: update quantize native params. 2023-06-20 23:32:58 +08:00
Yaohui Liu 3bf74ec9b9
feat: add chat session for refactored code. 2023-06-12 02:47:25 +08:00
Yaohui Liu 264fb9a706
refactor: LLamaModel and LLamaExecutor. 2023-06-10 18:37:58 +08:00
Yaohui Liu 3a62f087fe
fix: encoding error when using other languages. 2023-06-03 18:51:20 +08:00
Yaohui Liu 18c2ff2395
refactor: instruct mode and examples. 2023-05-21 20:36:49 +08:00
Yaohui Liu 55d5a8ae51
fix: quantization error with fp16. 2023-05-20 23:51:22 +08:00
Yaohui Liu 19979f664a
feat: support loading and saving state. 2023-05-20 14:01:20 +08:00
Yaohui Liu 00d91cf99e
refactor: some parts of code of LLamaModel. 2023-05-18 03:59:55 +08:00
Yaohui Liu 1fca06dc7f
fix: n_gpu_layers miss in llama context. 2023-05-17 04:22:54 +08:00
Yaohui Liu 4314f64b9c
feat: add check for backend package. 2023-05-17 03:40:45 +08:00
Yaohui Liu 6ffcb5306b
refactor: use official api of quantization instead. 2023-05-13 15:02:19 +08:00
Yaohui Liu 0958bbac2c
feat: add get-embedding api to LLamaModel. 2023-05-13 02:08:03 +08:00
Yaohui Liu 33067f990f
feat: run quantization in csharp. 2023-05-11 17:38:28 +08:00
Yaohui Liu 118d410d52
build: revise build information. 2023-05-11 13:57:57 +08:00
Yaohui Liu 856d6549de build: add linux support. 2023-05-11 04:20:56 +08:00
Yaohui Liu 02524ae4eb
build: add package informations. 2023-05-11 04:07:02 +08:00
Yaohui Liu d6a7997e46
feat: add gpt model. 2023-05-10 20:48:16 +08:00
Yaohui Liu 5a79edeb51
feat: add the framework and basic usages. 2023-05-10 02:13:41 +08:00