Merge pull request #141 from SciSharp/rinne-dev

docs: update the docs to follow new version.
This commit is contained in:
Rinne 2023-09-02 17:57:23 +08:00 committed by GitHub
commit b82e9f8fb0
77 changed files with 4970 additions and 596 deletions


@ -7,8 +7,8 @@
<Platforms>AnyCPU;x64;Arm64</Platforms>
<AllowUnsafeBlocks>True</AllowUnsafeBlocks>
<Version>0.4.2</Version>
<Authors>Yaohui Liu, Haiping Chen</Authors>
<Version>0.5.0</Version>
<Authors>Yaohui Liu, Martin Devans, Haiping Chen</Authors>
<Company>SciSharp STACK</Company>
<GeneratePackageOnBuild>true</GeneratePackageOnBuild>
<Copyright>MIT, SciSharp STACK $([System.DateTime]::UtcNow.ToString(yyyy))</Copyright>
@ -21,7 +21,7 @@
weights to run, please go to https://github.com/SciSharp/LLamaSharp for more information.
</Description>
<PackageReleaseNotes>
LLamaSharp 0.4.1 followed up the master branch of llama.cpp. (commit id: aacdbd4)
LLamaSharp 0.5.0 adds support for GGUF, grammar and integration with semantic-kernel.
</PackageReleaseNotes>
<PackageLicenseExpression>MIT</PackageLicenseExpression>
<PackageOutputPath>packages</PackageOutputPath>


@ -4,9 +4,9 @@
The figure below shows the core framework structure, which is separated into four levels.
- **LLamaModel**: The holder of a model, which directly interacts with the native library and provides some basic APIs such as tokenization and embedding. Currently it includes three classes: `LLamaModel`, `LLamaEmbedder` and `LLamaQuantizer`.
- **LLamaContext**: The holder of a model, which directly interacts with the native library and provides some basic APIs such as tokenization and embedding. Currently it includes three classes: `LLamaContext`, `LLamaEmbedder` and `LLamaQuantizer`.
- **LLamaExecutors**: Executors which define the way to run the LLama model. They provide text-to-text APIs that are easy to use. Currently we provide three kinds of executors: `InteractiveExecutor`, `InstructExecutor` and `StatelessExecutor`.
- **ChatSession**: A wrapper around `InteractiveExecutor` and `LLamaModel`, which supports interactive tasks and saving/re-loading sessions. It also provides a flexible way to customize text processing via `IHistoryTransform`, `ITextTransform` and `ITextStreamTransform`.
- **ChatSession**: A wrapper around `InteractiveExecutor` and `LLamaContext`, which supports interactive tasks and saving/re-loading sessions. It also provides a flexible way to customize text processing via `IHistoryTransform`, `ITextTransform` and `ITextStreamTransform`.
- **High-level Applications**: Applications that provide higher-level integration. For example, [BotSharp](https://github.com/SciSharp/BotSharp) provides integration for vector search, Chatbot UI and Web APIs. [semantic-kernel](https://github.com/microsoft/semantic-kernel) provides various APIs for working with LLMs. If you've made an integration, please tell us and add it to the doc!
@ -14,7 +14,7 @@ The figure below shows the core framework structure, which is separated to four
## Recommended Use
Since `LLamaModel` interacts with the native library, it's not recommended to use its methods directly unless you know what you are doing. The same applies to `NativeApi`, which is not included in the architecture figure above.
Since `LLamaContext` interacts with the native library, it's not recommended to use its methods directly unless you know what you are doing. The same applies to `NativeApi`, which is not included in the architecture figure above.
`ChatSession` is recommended when you want to build an application similar to ChatGPT or a chatbot, because it works best with `InteractiveExecutor`. Though other executors may also be passed as a parameter to initialize a `ChatSession`, this is not encouraged if you are new to LLamaSharp and LLMs.
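As a sketch of this recommended path (the model path and sampling values below are illustrative assumptions, and factory method names may differ slightly between versions), a minimal chat loop looks roughly like:

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using LLama;
using LLama.Common;

// Hypothetical GGUF model path -- replace with your own file.
var parameters = new ModelParams("model.gguf") { ContextSize = 1024 };

using var model = LLamaWeights.LoadFromFile(parameters);
using var context = model.CreateContext(parameters);
var session = new ChatSession(new InteractiveExecutor(context));

var inferenceParams = new InferenceParams
{
    Temperature = 0.6f,
    AntiPrompts = new List<string> { "User:" }  // stop generating when the model emits "User:"
};

// The response is streamed piece by piece as it is generated.
foreach (var text in session.Chat("User: Hello!\nBot: ", inferenceParams, CancellationToken.None))
    Console.Write(text);
```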


@ -0,0 +1,3 @@
# The Usage of semantic-kernel Integration
Please see [this doc](../../LLama.SemanticKernel/README.md)


@ -9,6 +9,7 @@ LLamaSharp is the C#/.NET binding of [llama.cpp](https://github.com/ggerganov/ll
- Model inference
- Model quantization
- Generating embeddings
- Grammar parsing
- Interactive/Instruct/Stateless executor mode
- Chat session APIs
- Save/load the state

(Binary files changed; contents not shown. An image shrank from 161 KiB to 158 KiB.)


@ -8,26 +8,32 @@
[InteractiveExecutor](./llama.interactiveexecutor.md)
[LLamaEmbedder](./llama.llamaembedder.md)
[LLamaContext](./llama.llamacontext.md)
[LLamaModel](./llama.llamamodel.md)
[LLamaEmbedder](./llama.llamaembedder.md)
[LLamaQuantizer](./llama.llamaquantizer.md)
[LLamaTransforms](./llama.llamatransforms.md)
[ResettableLLamaModel](./llama.resettablellamamodel.md)
[LLamaWeights](./llama.llamaweights.md)
[StatefulExecutorBase](./llama.statefulexecutorbase.md)
[StatelessExecutor](./llama.statelessexecutor.md)
[Utils](./llama.utils.md)
## LLama.Abstractions
[IHistoryTransform](./llama.abstractions.ihistorytransform.md)
[IInferenceParams](./llama.abstractions.iinferenceparams.md)
[ILLamaExecutor](./llama.abstractions.illamaexecutor.md)
[IModelParams](./llama.abstractions.imodelparams.md)
[ITextStreamTransform](./llama.abstractions.itextstreamtransform.md)
[ITextTransform](./llama.abstractions.itexttransform.md)
@ -46,17 +52,45 @@
[LLamaDefaultLogger](./llama.common.llamadefaultlogger.md)
[MiroStateType](./llama.common.mirostatetype.md)
[MirostatType](./llama.common.mirostattype.md)
[ModelParams](./llama.common.modelparams.md)
## LLama.Exceptions
[GrammarExpectedName](./llama.exceptions.grammarexpectedname.md)
[GrammarExpectedNext](./llama.exceptions.grammarexpectednext.md)
[GrammarExpectedPrevious](./llama.exceptions.grammarexpectedprevious.md)
[GrammarFormatException](./llama.exceptions.grammarformatexception.md)
[GrammarUnexpectedCharAltElement](./llama.exceptions.grammarunexpectedcharaltelement.md)
[GrammarUnexpectedCharRngElement](./llama.exceptions.grammarunexpectedcharrngelement.md)
[GrammarUnexpectedEndElement](./llama.exceptions.grammarunexpectedendelement.md)
[GrammarUnexpectedEndOfInput](./llama.exceptions.grammarunexpectedendofinput.md)
[GrammarUnexpectedHexCharsCount](./llama.exceptions.grammarunexpectedhexcharscount.md)
[GrammarUnknownEscapeCharacter](./llama.exceptions.grammarunknownescapecharacter.md)
[RuntimeError](./llama.exceptions.runtimeerror.md)
## LLama.Extensions
[DictionaryExtension](./llama.extensions.dictionaryextension.md)
[IModelParamsExtensions](./llama.extensions.imodelparamsextensions.md)
[KeyValuePairExtensions](./llama.extensions.keyvaluepairextensions.md)
## LLama.Grammars
[Grammar](./llama.grammars.grammar.md)
[GrammarRule](./llama.grammars.grammarrule.md)
## LLama.Native
@ -64,6 +98,12 @@
[LLamaFtype](./llama.native.llamaftype.md)
[LLamaGrammarElement](./llama.native.llamagrammarelement.md)
[LLamaGrammarElementType](./llama.native.llamagrammarelementtype.md)
[LLamaModelQuantizeParams](./llama.native.llamamodelquantizeparams.md)
[LLamaTokenData](./llama.native.llamatokendata.md)
[LLamaTokenDataArray](./llama.native.llamatokendataarray.md)
@ -74,8 +114,14 @@
[SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)
[SafeLLamaGrammarHandle](./llama.native.safellamagrammarhandle.md)
[SafeLLamaHandleBase](./llama.native.safellamahandlebase.md)
[SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)
[SamplingApi](./llama.native.samplingapi.md)
## LLama.OldVersion
[ChatCompletion](./llama.oldversion.chatcompletion.md)


@ -0,0 +1,268 @@
# IInferenceParams
Namespace: LLama.Abstractions
The parameters used for inference.
```csharp
public interface IInferenceParams
```
## Properties
### **TokensKeep**
number of tokens to keep from initial prompt
```csharp
public abstract int TokensKeep { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **MaxTokens**
how many new tokens to predict (n_predict); set to -1 to generate responses infinitely
until completion.
```csharp
public abstract int MaxTokens { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **LogitBias**
logit bias for specific tokens
```csharp
public abstract Dictionary<int, float> LogitBias { get; set; }
```
#### Property Value
[Dictionary&lt;Int32, Single&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.dictionary-2)<br>
### **AntiPrompts**
Sequences where the model will stop generating further tokens.
```csharp
public abstract IEnumerable<string> AntiPrompts { get; set; }
```
#### Property Value
[IEnumerable&lt;String&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ienumerable-1)<br>
### **PathSession**
path to file for saving/loading model eval state
```csharp
public abstract string PathSession { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **InputSuffix**
string to suffix user inputs with
```csharp
public abstract string InputSuffix { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **InputPrefix**
string to prefix user inputs with
```csharp
public abstract string InputPrefix { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **TopK**
0 or lower to use vocab size
```csharp
public abstract int TopK { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **TopP**
1.0 = disabled
```csharp
public abstract float TopP { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **TfsZ**
1.0 = disabled
```csharp
public abstract float TfsZ { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **TypicalP**
1.0 = disabled
```csharp
public abstract float TypicalP { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **Temperature**
1.0 = disabled
```csharp
public abstract float Temperature { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **RepeatPenalty**
1.0 = disabled
```csharp
public abstract float RepeatPenalty { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **RepeatLastTokensCount**
last n tokens to penalize (0 = disable penalty, -1 = context size) (repeat_last_n)
```csharp
public abstract int RepeatLastTokensCount { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **FrequencyPenalty**
frequency penalty coefficient
0.0 = disabled
```csharp
public abstract float FrequencyPenalty { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **PresencePenalty**
presence penalty coefficient
0.0 = disabled
```csharp
public abstract float PresencePenalty { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **Mirostat**
Whether to use Mirostat sampling, the algorithm described in the paper https://arxiv.org/abs/2007.14966. Mirostat uses tokens instead of words.
0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0
```csharp
public abstract MirostatType Mirostat { get; set; }
```
#### Property Value
[MirostatType](./llama.common.mirostattype.md)<br>
### **MirostatTau**
target entropy
```csharp
public abstract float MirostatTau { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **MirostatEta**
learning rate
```csharp
public abstract float MirostatEta { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **PenalizeNL**
consider newlines as a repeatable token (penalize_nl)
```csharp
public abstract bool PenalizeNL { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **Grammar**
Grammar to constrain possible tokens
```csharp
public abstract SafeLLamaGrammarHandle Grammar { get; set; }
```
#### Property Value
[SafeLLamaGrammarHandle](./llama.native.safellamagrammarhandle.md)<br>
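Taken together, these properties are usually set through the concrete `InferenceParams` class rather than implemented by hand; a sketch (the values are illustrative, not recommendations):

```csharp
using System.Collections.Generic;
using LLama.Common;

var inferenceParams = new InferenceParams
{
    MaxTokens = 256,        // stop after 256 new tokens (-1 = generate until completion)
    Temperature = 0.8f,
    TopK = 40,
    TopP = 0.95f,
    RepeatPenalty = 1.1f,
    AntiPrompts = new List<string> { "User:" }  // stop sequences
};
```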


@ -10,26 +10,26 @@ public interface ILLamaExecutor
## Properties
### **Model**
### **Context**
The loaded model for this executor.
The loaded context for this executor.
```csharp
public abstract LLamaModel Model { get; }
public abstract LLamaContext Context { get; }
```
#### Property Value
[LLamaModel](./llama.llamamodel.md)<br>
[LLamaContext](./llama.llamacontext.md)<br>
## Methods
### **Infer(String, InferenceParams, CancellationToken)**
### **Infer(String, IInferenceParams, CancellationToken)**
Infers a response from the model.
```csharp
IEnumerable<string> Infer(string text, InferenceParams inferenceParams, CancellationToken token)
IEnumerable<string> Infer(string text, IInferenceParams inferenceParams, CancellationToken token)
```
#### Parameters
@ -37,7 +37,7 @@ IEnumerable<string> Infer(string text, InferenceParams inferenceParams, Cancella
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
Your prompt
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
Any additional parameters
`token` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>
@ -47,19 +47,24 @@ A cancellation token.
[IEnumerable&lt;String&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ienumerable-1)<br>
### **InferAsync(String, InferenceParams, CancellationToken)**
### **InferAsync(String, IInferenceParams, CancellationToken)**
Asynchronously infers a response from the model.
```csharp
IAsyncEnumerable<string> InferAsync(string text, InferenceParams inferenceParams, CancellationToken token)
IAsyncEnumerable<string> InferAsync(string text, IInferenceParams inferenceParams, CancellationToken token)
```
#### Parameters
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
Your prompt
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
Any additional parameters
`token` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>
A cancellation token.
#### Returns

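Because all executors (`InteractiveExecutor`, `InstructExecutor`, `StatelessExecutor`) implement this interface, code can be written against it directly; a hedged sketch, assuming an executor has already been constructed elsewhere:

```csharp
using System.Collections.Generic;
using System.Threading;
using LLama.Abstractions;
using LLama.Common;

static IEnumerable<string> Ask(ILLamaExecutor executor, string prompt)
{
    // The response is streamed as it is generated, per the Infer signature above.
    return executor.Infer(prompt, new InferenceParams { MaxTokens = 64 }, CancellationToken.None);
}
```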

@ -0,0 +1,276 @@
# IModelParams
Namespace: LLama.Abstractions
The parameters for initializing a LLama model.
```csharp
public interface IModelParams
```
## Properties
### **ContextSize**
Model context size (n_ctx)
```csharp
public abstract int ContextSize { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **MainGpu**
the GPU that is used for scratch and small tensors
```csharp
public abstract int MainGpu { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **LowVram**
if true, reduce VRAM usage at the cost of performance
```csharp
public abstract bool LowVram { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **GpuLayerCount**
Number of layers to run in VRAM / GPU memory (n_gpu_layers)
```csharp
public abstract int GpuLayerCount { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **Seed**
Seed for the random number generator (seed)
```csharp
public abstract int Seed { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **UseFp16Memory**
Use f16 instead of f32 for memory kv (memory_f16)
```csharp
public abstract bool UseFp16Memory { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **UseMemorymap**
Use mmap for faster loads (use_mmap)
```csharp
public abstract bool UseMemorymap { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **UseMemoryLock**
Use mlock to keep model in memory (use_mlock)
```csharp
public abstract bool UseMemoryLock { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **Perplexity**
Compute perplexity over the prompt (perplexity)
```csharp
public abstract bool Perplexity { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **ModelPath**
Model path (model)
```csharp
public abstract string ModelPath { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **ModelAlias**
model alias
```csharp
public abstract string ModelAlias { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **LoraAdapter**
lora adapter path (lora_adapter)
```csharp
public abstract string LoraAdapter { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **LoraBase**
base model path for the lora adapter (lora_base)
```csharp
public abstract string LoraBase { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Threads**
Number of threads (-1 = autodetect) (n_threads)
```csharp
public abstract int Threads { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **BatchSize**
batch size for prompt processing (must be &gt;=32 to use BLAS) (n_batch)
```csharp
public abstract int BatchSize { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **ConvertEosToNewLine**
Whether to convert eos to newline during the inference.
```csharp
public abstract bool ConvertEosToNewLine { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **EmbeddingMode**
Whether to use embedding mode (embedding). Note that if this is set to true,
the LLamaModel won't produce text responses anymore.
```csharp
public abstract bool EmbeddingMode { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **TensorSplits**
how split tensors should be distributed across GPUs
```csharp
public abstract Single[] TensorSplits { get; set; }
```
#### Property Value
[Single[]](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **RopeFrequencyBase**
RoPE base frequency
```csharp
public abstract float RopeFrequencyBase { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **RopeFrequencyScale**
RoPE frequency scaling factor
```csharp
public abstract float RopeFrequencyScale { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **MulMatQ**
Use experimental mul_mat_q kernels
```csharp
public abstract bool MulMatQ { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **Encoding**
The encoding to use for models
```csharp
public abstract Encoding Encoding { get; set; }
```
#### Property Value
[Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
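The concrete `ModelParams` class in `LLama.Common` implements this interface; a minimal configuration might look like this (the path and values are illustrative):

```csharp
using LLama.Common;

var modelParams = new ModelParams("model.gguf")  // hypothetical path to a GGUF file
{
    ContextSize = 2048,   // n_ctx
    GpuLayerCount = 20,   // n_gpu_layers; set to 0 for CPU-only
    Seed = 1337,
    UseMemorymap = true   // use mmap for faster loads
};
```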


@ -161,19 +161,19 @@ public void LoadSession(string path)
`path` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The directory name to load the session.
### **Chat(ChatHistory, InferenceParams, CancellationToken)**
### **Chat(ChatHistory, IInferenceParams, CancellationToken)**
Get the response from the LLama model with chat histories.
```csharp
public IEnumerable<string> Chat(ChatHistory history, InferenceParams inferenceParams, CancellationToken cancellationToken)
public IEnumerable<string> Chat(ChatHistory history, IInferenceParams inferenceParams, CancellationToken cancellationToken)
```
#### Parameters
`history` [ChatHistory](./llama.common.chathistory.md)<br>
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`cancellationToken` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>
@ -181,20 +181,20 @@ public IEnumerable<string> Chat(ChatHistory history, InferenceParams inferencePa
[IEnumerable&lt;String&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ienumerable-1)<br>
### **Chat(String, InferenceParams, CancellationToken)**
### **Chat(String, IInferenceParams, CancellationToken)**
Get the response from the LLama model. Note that the prompt is not limited to the preset words;
it can also be the question you want to ask.
```csharp
public IEnumerable<string> Chat(string prompt, InferenceParams inferenceParams, CancellationToken cancellationToken)
public IEnumerable<string> Chat(string prompt, IInferenceParams inferenceParams, CancellationToken cancellationToken)
```
#### Parameters
`prompt` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`cancellationToken` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>
@ -202,19 +202,19 @@ public IEnumerable<string> Chat(string prompt, InferenceParams inferenceParams,
[IEnumerable&lt;String&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ienumerable-1)<br>
### **ChatAsync(ChatHistory, InferenceParams, CancellationToken)**
### **ChatAsync(ChatHistory, IInferenceParams, CancellationToken)**
Get the response from the LLama model with chat histories.
```csharp
public IAsyncEnumerable<string> ChatAsync(ChatHistory history, InferenceParams inferenceParams, CancellationToken cancellationToken)
public IAsyncEnumerable<string> ChatAsync(ChatHistory history, IInferenceParams inferenceParams, CancellationToken cancellationToken)
```
#### Parameters
`history` [ChatHistory](./llama.common.chathistory.md)<br>
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`cancellationToken` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>
@ -222,19 +222,19 @@ public IAsyncEnumerable<string> ChatAsync(ChatHistory history, InferenceParams i
[IAsyncEnumerable&lt;String&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.iasyncenumerable-1)<br>
### **ChatAsync(String, InferenceParams, CancellationToken)**
### **ChatAsync(String, IInferenceParams, CancellationToken)**
Get the response from the LLama model with chat histories asynchronously.
```csharp
public IAsyncEnumerable<string> ChatAsync(string prompt, InferenceParams inferenceParams, CancellationToken cancellationToken)
public IAsyncEnumerable<string> ChatAsync(string prompt, IInferenceParams inferenceParams, CancellationToken cancellationToken)
```
#### Parameters
`prompt` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`cancellationToken` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>


@ -2,6 +2,8 @@
Namespace: LLama.Common
Role of the message author, e.g. user/assistant/system
```csharp
public enum AuthorRole
```
@ -13,3 +15,7 @@ Implements [IComparable](https://docs.microsoft.com/en-us/dotnet/api/system.icom
| Name | Value | Description |
| --- | --: | --- |
| Unknown | -1 | Role is unknown |
| System | 0 | Message comes from a "system" prompt, not written by a user or language model |
| User | 1 | Message comes from the user |
| Assistant | 2 | Message was generated by the language model |
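These roles are used when assembling a `ChatHistory` to pass to the `ChatSession` APIs; for example (a sketch):

```csharp
using LLama.Common;

var history = new ChatHistory();
history.AddMessage(AuthorRole.System, "You are a concise assistant.");
history.AddMessage(AuthorRole.User, "What is LLamaSharp?");
// history can now be passed to ChatSession.Chat / ChatAsync.
```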


@ -20,6 +20,8 @@ Implements IEnumerable&lt;T&gt;, [IEnumerable](https://docs.microsoft.com/en-us/
### **Count**
Number of items in this queue
```csharp
public int Count { get; }
```
@ -30,6 +32,8 @@ public int Count { get; }
### **Capacity**
Maximum number of items allowed in this queue
```csharp
public int Capacity { get; }
```
@ -42,6 +46,8 @@ public int Capacity { get; }
### **FixedSizeQueue(Int32)**
Create a new queue
```csharp
public FixedSizeQueue(int size)
```
@ -49,9 +55,12 @@ public FixedSizeQueue(int size)
#### Parameters
`size` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
the maximum number of items to store in this queue
### **FixedSizeQueue(Int32, IEnumerable&lt;T&gt;)**
Fill the queue with the data. Please ensure that data.Count &lt;= size
```csharp
public FixedSizeQueue(int size, IEnumerable<T> data)
```
@ -66,6 +75,8 @@ public FixedSizeQueue(int size, IEnumerable<T> data)
### **FillWith(T)**
Replace every item in the queue with the given value
```csharp
public FixedSizeQueue<T> FillWith(T value)
```
@ -73,10 +84,12 @@ public FixedSizeQueue<T> FillWith(T value)
#### Parameters
`value` T<br>
The value to replace all items with
#### Returns
[FixedSizeQueue&lt;T&gt;](./llama.common.fixedsizequeue-1.md)<br>
returns this
### **Enqueue(T)**
@ -90,16 +103,6 @@ public void Enqueue(T item)
`item` T<br>
### **ToArray()**
```csharp
public T[] ToArray()
```
#### Returns
T[]<br>
### **GetEnumerator()**
```csharp


@ -2,6 +2,8 @@
Namespace: LLama.Common
receives log messages from LLamaSharp
```csharp
public interface ILLamaLogger
```
@ -10,7 +12,7 @@ public interface ILLamaLogger
### **Log(String, String, LogLevel)**
Write the log in cosutomized way
Write the log in customized way
```csharp
void Log(string source, string message, LogLevel level)


@ -2,11 +2,14 @@
Namespace: LLama.Common
The parameters used for inference.
```csharp
public class InferenceParams
public class InferenceParams : LLama.Abstractions.IInferenceParams
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [InferenceParams](./llama.common.inferenceparams.md)
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [InferenceParams](./llama.common.inferenceparams.md)<br>
Implements [IInferenceParams](./llama.abstractions.iinferenceparams.md)
## Properties
@ -212,12 +215,12 @@ Mirostat uses tokens instead of words.
0 = disabled, 1 = mirostat, 2 = mirostat 2.0
```csharp
public MiroStateType Mirostat { get; set; }
public MirostatType Mirostat { get; set; }
```
#### Property Value
[MiroStateType](./llama.common.mirostatetype.md)<br>
[MirostatType](./llama.common.mirostattype.md)<br>
### **MirostatTau**
@ -255,6 +258,18 @@ public bool PenalizeNL { get; set; }
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **Grammar**
A grammar to constrain the possible tokens
```csharp
public SafeLLamaGrammarHandle Grammar { get; set; }
```
#### Property Value
[SafeLLamaGrammarHandle](./llama.native.safellamagrammarhandle.md)<br>
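A `SafeLLamaGrammarHandle` is typically obtained by parsing a GBNF grammar via the `Grammar` class in `LLama.Grammars` (a sketch under that assumption; the grammar below constrains output to "yes" or "no"):

```csharp
using LLama.Common;
using LLama.Grammars;

const string gbnf = "root ::= (\"yes\" | \"no\")";
var grammar = Grammar.Parse(gbnf, "root");

using var handle = grammar.CreateInstance();
var inferenceParams = new InferenceParams
{
    Grammar = handle  // sampling is now constrained to the grammar
};
```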
## Constructors
### **InferenceParams()**


@ -2,8 +2,8 @@
Namespace: LLama.Common
The default logger of LLamaSharp. On default it write to console. User methods of `LLamaLogger.Default` to change the behavior.
It's more recommended to inherit `ILLamaLogger` to cosutomize the behavior.
The default logger of LLamaSharp. By default it writes to the console. Use the methods of `LLamaLogger.Default` to change the behavior.
It's recommended to inherit `ILLamaLogger` to customize the behavior.
```csharp
public sealed class LLamaDefaultLogger : ILLamaLogger
@ -16,6 +16,8 @@ Implements [ILLamaLogger](./llama.common.illamalogger.md)
### **Default**
Get the default logger instance
```csharp
public static LLamaDefaultLogger Default { get; }
```
@ -26,8 +28,22 @@ public static LLamaDefaultLogger Default { get; }
## Methods
### **EnableNative()**
Enable logging output from llama.cpp
```csharp
public LLamaDefaultLogger EnableNative()
```
#### Returns
[LLamaDefaultLogger](./llama.common.llamadefaultlogger.md)<br>
### **EnableConsole()**
Enable writing log messages to console
```csharp
public LLamaDefaultLogger EnableConsole()
```
@ -38,6 +54,8 @@ public LLamaDefaultLogger EnableConsole()
### **DisableConsole()**
Disable writing messages to console
```csharp
public LLamaDefaultLogger DisableConsole()
```
@ -48,6 +66,8 @@ public LLamaDefaultLogger DisableConsole()
### **EnableFile(String, FileMode)**
Enable writing log messages to file
```csharp
public LLamaDefaultLogger EnableFile(string filename, FileMode mode)
```
@ -64,6 +84,14 @@ public LLamaDefaultLogger EnableFile(string filename, FileMode mode)
### **DisableFile(String)**
#### Caution
Use DisableFile method without 'filename' parameter
---
Disable writing log messages to file
```csharp
public LLamaDefaultLogger DisableFile(string filename)
```
@ -71,6 +99,19 @@ public LLamaDefaultLogger DisableFile(string filename)
#### Parameters
`filename` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
unused!
#### Returns
[LLamaDefaultLogger](./llama.common.llamadefaultlogger.md)<br>
### **DisableFile()**
Disable writing log messages to file
```csharp
public LLamaDefaultLogger DisableFile()
```
#### Returns
@ -78,6 +119,8 @@ public LLamaDefaultLogger DisableFile(string filename)
### **Log(String, String, LogLevel)**
Log a message
```csharp
public void Log(string source, string message, LogLevel level)
```
@ -85,13 +128,18 @@ public void Log(string source, string message, LogLevel level)
#### Parameters
`source` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The source of this message (e.g. class name)
`message` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The message to log
`level` [LogLevel](./llama.common.illamalogger.loglevel.md)<br>
Severity level of this message
### **Info(String)**
Write a log message with "Info" severity
```csharp
public void Info(string message)
```
@ -102,6 +150,8 @@ public void Info(string message)
### **Warn(String)**
Write a log message with "Warn" severity
```csharp
public void Warn(string message)
```
@ -112,6 +162,8 @@ public void Warn(string message)
### **Error(String)**
Write a log message with "Error" severity
```csharp
public void Error(string message)
```
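When the built-in behavior is not enough, the docs above recommend implementing `ILLamaLogger` instead; a minimal sketch (the `LogLevel` member names are assumed from the enum referenced earlier):

```csharp
using System;
using LLama.Common;

// A hypothetical logger that only surfaces warnings and errors.
public sealed class FilteringLogger : ILLamaLogger
{
    public void Log(string source, string message, ILLamaLogger.LogLevel level)
    {
        if (level == ILLamaLogger.LogLevel.Warning || level == ILLamaLogger.LogLevel.Error)
            Console.WriteLine($"[{level}] {source}: {message}");
    }
}
```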


@ -1,15 +1,21 @@
# MiroStateType
# MirostatType
Namespace: LLama.Common
Type of "mirostat" sampling to use.
https://github.com/basusourya/mirostat
```csharp
public enum MiroStateType
public enum MirostatType
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ValueType](https://docs.microsoft.com/en-us/dotnet/api/system.valuetype) → [Enum](https://docs.microsoft.com/en-us/dotnet/api/system.enum) → [MiroStateType](./llama.common.mirostatetype.md)<br>
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ValueType](https://docs.microsoft.com/en-us/dotnet/api/system.valuetype) → [Enum](https://docs.microsoft.com/en-us/dotnet/api/system.enum) → [MirostatType](./llama.common.mirostattype.md)<br>
Implements [IComparable](https://docs.microsoft.com/en-us/dotnet/api/system.icomparable), [IFormattable](https://docs.microsoft.com/en-us/dotnet/api/system.iformattable), [IConvertible](https://docs.microsoft.com/en-us/dotnet/api/system.iconvertible)
## Fields
| Name | Value | Description |
| --- | --: | --- |
| Disable | 0 | Disable Mirostat sampling |
| Mirostat | 1 | Original mirostat algorithm |
| Mirostat2 | 2 | Mirostat 2.0 algorithm |


@ -2,11 +2,14 @@
Namespace: LLama.Common
The parameters for initializing a LLama model.
```csharp
public class ModelParams
public class ModelParams : LLama.Abstractions.IModelParams, System.IEquatable`1[[LLama.Common.ModelParams, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ModelParams](./llama.common.modelparams.md)
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ModelParams](./llama.common.modelparams.md)<br>
Implements [IModelParams](./llama.abstractions.imodelparams.md), [IEquatable&lt;ModelParams&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.iequatable-1)
## Properties
@@ -22,6 +25,30 @@ public int ContextSize { get; set; }
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **MainGpu**
the GPU that is used for scratch and small tensors
```csharp
public int MainGpu { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **LowVram**
if true, reduce VRAM usage at the cost of performance
```csharp
public bool LowVram { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **GpuLayerCount**
Number of layers to run in VRAM / GPU memory (n_gpu_layers)
@@ -106,6 +133,18 @@ public string ModelPath { get; set; }
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **ModelAlias**
model alias
```csharp
public string ModelAlias { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **LoraAdapter**
lora adapter path (lora_adapter)
@@ -179,14 +218,93 @@ public bool EmbeddingMode { get; set; }
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **TensorSplits**
how split tensors should be distributed across GPUs
```csharp
public Single[] TensorSplits { get; set; }
```
#### Property Value
[Single[]](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **RopeFrequencyBase**
RoPE base frequency
```csharp
public float RopeFrequencyBase { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **RopeFrequencyScale**
RoPE frequency scaling factor
```csharp
public float RopeFrequencyScale { get; set; }
```
#### Property Value
[Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **MulMatQ**
Use experimental mul_mat_q kernels
```csharp
public bool MulMatQ { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **Encoding**
The encoding to use to convert text for the model
```csharp
public Encoding Encoding { get; set; }
```
#### Property Value
[Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
## Constructors
### **ModelParams(String, Int32, Int32, Int32, Boolean, Boolean, Boolean, Boolean, String, String, Int32, Int32, Boolean, Boolean)**
### **ModelParams(String)**
```csharp
public ModelParams(string modelPath, int contextSize, int gpuLayerCount, int seed, bool useFp16Memory, bool useMemorymap, bool useMemoryLock, bool perplexity, string loraAdapter, string loraBase, int threads, int batchSize, bool convertEosToNewLine, bool embeddingMode)
public ModelParams(string modelPath)
```
#### Parameters
`modelPath` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The model path.
### **ModelParams(String, Int32, Int32, Int32, Boolean, Boolean, Boolean, Boolean, String, String, Int32, Int32, Boolean, Boolean, Single, Single, Boolean, String)**
#### Caution
Use object initializer to set all optional parameters
---
```csharp
public ModelParams(string modelPath, int contextSize, int gpuLayerCount, int seed, bool useFp16Memory, bool useMemorymap, bool useMemoryLock, bool perplexity, string loraAdapter, string loraBase, int threads, int batchSize, bool convertEosToNewLine, bool embeddingMode, float ropeFrequencyBase, float ropeFrequencyScale, bool mulMatQ, string encoding)
```
#### Parameters
@@ -232,3 +350,89 @@ Whether to convert eos to newline during the inference.
`embeddingMode` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
Whether to use embedding mode. (embedding) Note that if this is set to true, the LLamaModel won't produce text responses anymore.
`ropeFrequencyBase` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
RoPE base frequency.
`ropeFrequencyScale` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
RoPE frequency scaling factor
`mulMatQ` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
Use experimental mul_mat_q kernels
`encoding` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The encoding to use to convert text for the model
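Since the long constructor carries a deprecation caution in favour of object initializers, a minimal sketch of the recommended style, using properties documented on this page (the model path and values are illustrative):

```csharp
using System.Text;
using LLama.Common;

// Hedged sketch: use the single-argument constructor and set optional
// values via an object initializer. The path and numbers are placeholders.
var parameters = new ModelParams("models/model.gguf")
{
    ContextSize = 2048,
    GpuLayerCount = 20,
    RopeFrequencyBase = 10000f,
    RopeFrequencyScale = 1.0f,
    MulMatQ = true,
    Encoding = Encoding.UTF8
};
```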
## Methods
### **ToString()**
```csharp
public string ToString()
```
#### Returns
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **PrintMembers(StringBuilder)**
```csharp
protected bool PrintMembers(StringBuilder builder)
```
#### Parameters
`builder` [StringBuilder](https://docs.microsoft.com/en-us/dotnet/api/system.text.stringbuilder)<br>
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **GetHashCode()**
```csharp
public int GetHashCode()
```
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **Equals(Object)**
```csharp
public bool Equals(object obj)
```
#### Parameters
`obj` [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object)<br>
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **Equals(ModelParams)**
```csharp
public bool Equals(ModelParams other)
```
#### Parameters
`other` [ModelParams](./llama.common.modelparams.md)<br>
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **&lt;Clone&gt;$()**
```csharp
public ModelParams <Clone>$()
```
#### Returns
[ModelParams](./llama.common.modelparams.md)<br>

@@ -0,0 +1,94 @@
# GrammarExpectedName
Namespace: LLama.Exceptions
Failed to parse a "name" element when one was expected
```csharp
public class GrammarExpectedName : GrammarFormatException, System.Runtime.Serialization.ISerializable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception) → [GrammarFormatException](./llama.exceptions.grammarformatexception.md) → [GrammarExpectedName](./llama.exceptions.grammarexpectedname.md)<br>
Implements [ISerializable](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.serialization.iserializable)
## Properties
### **TargetSite**
```csharp
public MethodBase TargetSite { get; }
```
#### Property Value
[MethodBase](https://docs.microsoft.com/en-us/dotnet/api/system.reflection.methodbase)<br>
### **Message**
```csharp
public string Message { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Data**
```csharp
public IDictionary Data { get; }
```
#### Property Value
[IDictionary](https://docs.microsoft.com/en-us/dotnet/api/system.collections.idictionary)<br>
### **InnerException**
```csharp
public Exception InnerException { get; }
```
#### Property Value
[Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception)<br>
### **HelpLink**
```csharp
public string HelpLink { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Source**
```csharp
public string Source { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **HResult**
```csharp
public int HResult { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **StackTrace**
```csharp
public string StackTrace { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

@@ -0,0 +1,94 @@
# GrammarExpectedNext
Namespace: LLama.Exceptions
A specified string was expected when parsing
```csharp
public class GrammarExpectedNext : GrammarFormatException, System.Runtime.Serialization.ISerializable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception) → [GrammarFormatException](./llama.exceptions.grammarformatexception.md) → [GrammarExpectedNext](./llama.exceptions.grammarexpectednext.md)<br>
Implements [ISerializable](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.serialization.iserializable)
## Properties
### **TargetSite**
```csharp
public MethodBase TargetSite { get; }
```
#### Property Value
[MethodBase](https://docs.microsoft.com/en-us/dotnet/api/system.reflection.methodbase)<br>
### **Message**
```csharp
public string Message { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Data**
```csharp
public IDictionary Data { get; }
```
#### Property Value
[IDictionary](https://docs.microsoft.com/en-us/dotnet/api/system.collections.idictionary)<br>
### **InnerException**
```csharp
public Exception InnerException { get; }
```
#### Property Value
[Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception)<br>
### **HelpLink**
```csharp
public string HelpLink { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Source**
```csharp
public string Source { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **HResult**
```csharp
public int HResult { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **StackTrace**
```csharp
public string StackTrace { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

@@ -0,0 +1,94 @@
# GrammarExpectedPrevious
Namespace: LLama.Exceptions
A specified character was expected to precede another when parsing
```csharp
public class GrammarExpectedPrevious : GrammarFormatException, System.Runtime.Serialization.ISerializable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception) → [GrammarFormatException](./llama.exceptions.grammarformatexception.md) → [GrammarExpectedPrevious](./llama.exceptions.grammarexpectedprevious.md)<br>
Implements [ISerializable](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.serialization.iserializable)
## Properties
### **TargetSite**
```csharp
public MethodBase TargetSite { get; }
```
#### Property Value
[MethodBase](https://docs.microsoft.com/en-us/dotnet/api/system.reflection.methodbase)<br>
### **Message**
```csharp
public string Message { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Data**
```csharp
public IDictionary Data { get; }
```
#### Property Value
[IDictionary](https://docs.microsoft.com/en-us/dotnet/api/system.collections.idictionary)<br>
### **InnerException**
```csharp
public Exception InnerException { get; }
```
#### Property Value
[Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception)<br>
### **HelpLink**
```csharp
public string HelpLink { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Source**
```csharp
public string Source { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **HResult**
```csharp
public int HResult { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **StackTrace**
```csharp
public string StackTrace { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

@@ -0,0 +1,94 @@
# GrammarFormatException
Namespace: LLama.Exceptions
Base class for all grammar exceptions
```csharp
public abstract class GrammarFormatException : System.Exception, System.Runtime.Serialization.ISerializable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception) → [GrammarFormatException](./llama.exceptions.grammarformatexception.md)<br>
Implements [ISerializable](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.serialization.iserializable)
## Properties
### **TargetSite**
```csharp
public MethodBase TargetSite { get; }
```
#### Property Value
[MethodBase](https://docs.microsoft.com/en-us/dotnet/api/system.reflection.methodbase)<br>
### **Message**
```csharp
public string Message { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Data**
```csharp
public IDictionary Data { get; }
```
#### Property Value
[IDictionary](https://docs.microsoft.com/en-us/dotnet/api/system.collections.idictionary)<br>
### **InnerException**
```csharp
public Exception InnerException { get; }
```
#### Property Value
[Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception)<br>
### **HelpLink**
```csharp
public string HelpLink { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Source**
```csharp
public string Source { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **HResult**
```csharp
public int HResult { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **StackTrace**
```csharp
public string StackTrace { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
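Because every grammar parsing error derives from this base class, callers can catch it in one place around `Grammar.Parse`. A hedged sketch (the deliberately malformed GBNF text is illustrative):

```csharp
using System;
using LLama.Exceptions;
using LLama.Grammars;

try
{
    // Malformed on purpose: the rule body is missing.
    var grammar = Grammar.Parse("root ::=", "root");
}
catch (GrammarFormatException ex)
{
    // Catches any of the derived exceptions documented in this namespace.
    Console.WriteLine($"Invalid grammar: {ex.Message}");
}
```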

@@ -0,0 +1,94 @@
# GrammarUnexpectedCharAltElement
Namespace: LLama.Exceptions
A CHAR_ALT was created without a preceding CHAR element
```csharp
public class GrammarUnexpectedCharAltElement : GrammarFormatException, System.Runtime.Serialization.ISerializable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception) → [GrammarFormatException](./llama.exceptions.grammarformatexception.md) → [GrammarUnexpectedCharAltElement](./llama.exceptions.grammarunexpectedcharaltelement.md)<br>
Implements [ISerializable](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.serialization.iserializable)
## Properties
### **TargetSite**
```csharp
public MethodBase TargetSite { get; }
```
#### Property Value
[MethodBase](https://docs.microsoft.com/en-us/dotnet/api/system.reflection.methodbase)<br>
### **Message**
```csharp
public string Message { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Data**
```csharp
public IDictionary Data { get; }
```
#### Property Value
[IDictionary](https://docs.microsoft.com/en-us/dotnet/api/system.collections.idictionary)<br>
### **InnerException**
```csharp
public Exception InnerException { get; }
```
#### Property Value
[Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception)<br>
### **HelpLink**
```csharp
public string HelpLink { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Source**
```csharp
public string Source { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **HResult**
```csharp
public int HResult { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **StackTrace**
```csharp
public string StackTrace { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

@@ -0,0 +1,94 @@
# GrammarUnexpectedCharRngElement
Namespace: LLama.Exceptions
A CHAR_RNG was created without a preceding CHAR element
```csharp
public class GrammarUnexpectedCharRngElement : GrammarFormatException, System.Runtime.Serialization.ISerializable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception) → [GrammarFormatException](./llama.exceptions.grammarformatexception.md) → [GrammarUnexpectedCharRngElement](./llama.exceptions.grammarunexpectedcharrngelement.md)<br>
Implements [ISerializable](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.serialization.iserializable)
## Properties
### **TargetSite**
```csharp
public MethodBase TargetSite { get; }
```
#### Property Value
[MethodBase](https://docs.microsoft.com/en-us/dotnet/api/system.reflection.methodbase)<br>
### **Message**
```csharp
public string Message { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Data**
```csharp
public IDictionary Data { get; }
```
#### Property Value
[IDictionary](https://docs.microsoft.com/en-us/dotnet/api/system.collections.idictionary)<br>
### **InnerException**
```csharp
public Exception InnerException { get; }
```
#### Property Value
[Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception)<br>
### **HelpLink**
```csharp
public string HelpLink { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Source**
```csharp
public string Source { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **HResult**
```csharp
public int HResult { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **StackTrace**
```csharp
public string StackTrace { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

@@ -0,0 +1,94 @@
# GrammarUnexpectedEndElement
Namespace: LLama.Exceptions
An END was encountered before the last element
```csharp
public class GrammarUnexpectedEndElement : GrammarFormatException, System.Runtime.Serialization.ISerializable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception) → [GrammarFormatException](./llama.exceptions.grammarformatexception.md) → [GrammarUnexpectedEndElement](./llama.exceptions.grammarunexpectedendelement.md)<br>
Implements [ISerializable](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.serialization.iserializable)
## Properties
### **TargetSite**
```csharp
public MethodBase TargetSite { get; }
```
#### Property Value
[MethodBase](https://docs.microsoft.com/en-us/dotnet/api/system.reflection.methodbase)<br>
### **Message**
```csharp
public string Message { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Data**
```csharp
public IDictionary Data { get; }
```
#### Property Value
[IDictionary](https://docs.microsoft.com/en-us/dotnet/api/system.collections.idictionary)<br>
### **InnerException**
```csharp
public Exception InnerException { get; }
```
#### Property Value
[Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception)<br>
### **HelpLink**
```csharp
public string HelpLink { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Source**
```csharp
public string Source { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **HResult**
```csharp
public int HResult { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **StackTrace**
```csharp
public string StackTrace { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

@@ -0,0 +1,94 @@
# GrammarUnexpectedEndOfInput
Namespace: LLama.Exceptions
End-of-file was encountered while parsing
```csharp
public class GrammarUnexpectedEndOfInput : GrammarFormatException, System.Runtime.Serialization.ISerializable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception) → [GrammarFormatException](./llama.exceptions.grammarformatexception.md) → [GrammarUnexpectedEndOfInput](./llama.exceptions.grammarunexpectedendofinput.md)<br>
Implements [ISerializable](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.serialization.iserializable)
## Properties
### **TargetSite**
```csharp
public MethodBase TargetSite { get; }
```
#### Property Value
[MethodBase](https://docs.microsoft.com/en-us/dotnet/api/system.reflection.methodbase)<br>
### **Message**
```csharp
public string Message { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Data**
```csharp
public IDictionary Data { get; }
```
#### Property Value
[IDictionary](https://docs.microsoft.com/en-us/dotnet/api/system.collections.idictionary)<br>
### **InnerException**
```csharp
public Exception InnerException { get; }
```
#### Property Value
[Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception)<br>
### **HelpLink**
```csharp
public string HelpLink { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Source**
```csharp
public string Source { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **HResult**
```csharp
public int HResult { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **StackTrace**
```csharp
public string StackTrace { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

@@ -0,0 +1,94 @@
# GrammarUnexpectedHexCharsCount
Namespace: LLama.Exceptions
An incorrect number of characters were encountered while parsing a hex literal
```csharp
public class GrammarUnexpectedHexCharsCount : GrammarFormatException, System.Runtime.Serialization.ISerializable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception) → [GrammarFormatException](./llama.exceptions.grammarformatexception.md) → [GrammarUnexpectedHexCharsCount](./llama.exceptions.grammarunexpectedhexcharscount.md)<br>
Implements [ISerializable](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.serialization.iserializable)
## Properties
### **TargetSite**
```csharp
public MethodBase TargetSite { get; }
```
#### Property Value
[MethodBase](https://docs.microsoft.com/en-us/dotnet/api/system.reflection.methodbase)<br>
### **Message**
```csharp
public string Message { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Data**
```csharp
public IDictionary Data { get; }
```
#### Property Value
[IDictionary](https://docs.microsoft.com/en-us/dotnet/api/system.collections.idictionary)<br>
### **InnerException**
```csharp
public Exception InnerException { get; }
```
#### Property Value
[Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception)<br>
### **HelpLink**
```csharp
public string HelpLink { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Source**
```csharp
public string Source { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **HResult**
```csharp
public int HResult { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **StackTrace**
```csharp
public string StackTrace { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

@@ -0,0 +1,94 @@
# GrammarUnknownEscapeCharacter
Namespace: LLama.Exceptions
An unexpected character was encountered after an escape sequence
```csharp
public class GrammarUnknownEscapeCharacter : GrammarFormatException, System.Runtime.Serialization.ISerializable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception) → [GrammarFormatException](./llama.exceptions.grammarformatexception.md) → [GrammarUnknownEscapeCharacter](./llama.exceptions.grammarunknownescapecharacter.md)<br>
Implements [ISerializable](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.serialization.iserializable)
## Properties
### **TargetSite**
```csharp
public MethodBase TargetSite { get; }
```
#### Property Value
[MethodBase](https://docs.microsoft.com/en-us/dotnet/api/system.reflection.methodbase)<br>
### **Message**
```csharp
public string Message { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Data**
```csharp
public IDictionary Data { get; }
```
#### Property Value
[IDictionary](https://docs.microsoft.com/en-us/dotnet/api/system.collections.idictionary)<br>
### **InnerException**
```csharp
public Exception InnerException { get; }
```
#### Property Value
[Exception](https://docs.microsoft.com/en-us/dotnet/api/system.exception)<br>
### **HelpLink**
```csharp
public string HelpLink { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Source**
```csharp
public string Source { get; set; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **HResult**
```csharp
public int HResult { get; set; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **StackTrace**
```csharp
public string StackTrace { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

@@ -1,73 +0,0 @@
# DictionaryExtension
Namespace: LLama.Extensions
```csharp
public static class DictionaryExtension
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [DictionaryExtension](./llama.extensions.dictionaryextension.md)
## Methods
### **Deconstruct&lt;T1, T2&gt;(KeyValuePair&lt;T1, T2&gt;, T1&, T2&)**
```csharp
public static void Deconstruct<T1, T2>(KeyValuePair<T1, T2> pair, T1& first, T2& second)
```
#### Type Parameters
`T1`<br>
`T2`<br>
#### Parameters
`pair` KeyValuePair&lt;T1, T2&gt;<br>
`first` T1&<br>
`second` T2&<br>
### **Update&lt;T1, T2&gt;(Dictionary&lt;T1, T2&gt;, IDictionary&lt;T1, T2&gt;)**
```csharp
public static void Update<T1, T2>(Dictionary<T1, T2> dic, IDictionary<T1, T2> other)
```
#### Type Parameters
`T1`<br>
`T2`<br>
#### Parameters
`dic` Dictionary&lt;T1, T2&gt;<br>
`other` IDictionary&lt;T1, T2&gt;<br>
### **GetOrDefault&lt;T1, T2&gt;(Dictionary&lt;T1, T2&gt;, T1, T2)**
```csharp
public static T2 GetOrDefault<T1, T2>(Dictionary<T1, T2> dic, T1 key, T2 defaultValue)
```
#### Type Parameters
`T1`<br>
`T2`<br>
#### Parameters
`dic` Dictionary&lt;T1, T2&gt;<br>
`key` T1<br>
`defaultValue` T2<br>
#### Returns
T2<br>

@@ -0,0 +1,37 @@
# IModelParamsExtensions
Namespace: LLama.Extensions
Extension methods for the IModelParams interface
```csharp
public static class IModelParamsExtensions
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [IModelParamsExtensions](./llama.extensions.imodelparamsextensions.md)
## Methods
### **ToLlamaContextParams(IModelParams, LLamaContextParams&)**
Convert the given `IModelParams` into a `LLamaContextParams`
```csharp
public static MemoryHandle ToLlamaContextParams(IModelParams params, LLamaContextParams& result)
```
#### Parameters
`params` [IModelParams](./llama.abstractions.imodelparams.md)<br>
`result` [LLamaContextParams&](./llama.native.llamacontextparams&.md)<br>
#### Returns
[MemoryHandle](https://docs.microsoft.com/en-us/dotnet/api/system.buffers.memoryhandle)<br>
#### Exceptions
[FileNotFoundException](https://docs.microsoft.com/en-us/dotnet/api/system.io.filenotfoundexception)<br>
[ArgumentException](https://docs.microsoft.com/en-us/dotnet/api/system.argumentexception)<br>
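The `MemoryHandle` return value suggests the native struct pins managed memory, so it should be kept alive while `result` is in use. A hedged usage sketch (the `out` calling convention is an assumption based on the `LLamaContextParams&` signature above; `modelParams` is any `IModelParams` instance):

```csharp
// Hedged sketch: keep the MemoryHandle alive for as long as the native
// parameter struct is used. Names besides the method are illustrative.
using (modelParams.ToLlamaContextParams(out var contextParams))
{
    // Pass contextParams to native llama.cpp calls here.
}
```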

@@ -0,0 +1,40 @@
# KeyValuePairExtensions
Namespace: LLama.Extensions
Extensions to the KeyValuePair struct
```csharp
public static class KeyValuePairExtensions
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [KeyValuePairExtensions](./llama.extensions.keyvaluepairextensions.md)
## Methods
### **Deconstruct&lt;TKey, TValue&gt;(KeyValuePair&lt;TKey, TValue&gt;, TKey&, TValue&)**
Deconstruct a KeyValuePair into its constituent parts.
```csharp
public static void Deconstruct<TKey, TValue>(KeyValuePair<TKey, TValue> pair, TKey& first, TValue& second)
```
#### Type Parameters
`TKey`<br>
Type of the Key
`TValue`<br>
Type of the Value
#### Parameters
`pair` KeyValuePair&lt;TKey, TValue&gt;<br>
The KeyValuePair to deconstruct
`first` TKey&<br>
First element, the Key
`second` TValue&<br>
Second element, the Value
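This enables tuple-style deconstruction when iterating a dictionary; a minimal sketch:

```csharp
using System;
using System.Collections.Generic;

var ages = new Dictionary<string, int> { ["alice"] = 30, ["bob"] = 25 };

// Deconstruct each KeyValuePair into (key, value) via the extension above.
foreach (var (name, age) in ages)
    Console.WriteLine($"{name} is {age}");
```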

@@ -0,0 +1,110 @@
# Grammar
Namespace: LLama.Grammars
A grammar is a set of [GrammarRule](./llama.grammars.grammarrule.md)s for deciding which characters are valid next. It can be used to constrain
output to certain formats, e.g. forcing the model to output JSON
```csharp
public sealed class Grammar
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Grammar](./llama.grammars.grammar.md)
## Properties
### **StartRuleIndex**
Index of the initial rule to start from
```csharp
public ulong StartRuleIndex { get; set; }
```
#### Property Value
[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
### **Rules**
The rules which make up this grammar
```csharp
public IReadOnlyList<GrammarRule> Rules { get; }
```
#### Property Value
[IReadOnlyList&lt;GrammarRule&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ireadonlylist-1)<br>
## Constructors
### **Grammar(IReadOnlyList&lt;GrammarRule&gt;, UInt64)**
Create a new grammar from a set of rules
```csharp
public Grammar(IReadOnlyList<GrammarRule> rules, ulong startRuleIndex)
```
#### Parameters
`rules` [IReadOnlyList&lt;GrammarRule&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ireadonlylist-1)<br>
The rules which make up this grammar
`startRuleIndex` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
Index of the initial rule to start from
#### Exceptions
[ArgumentOutOfRangeException](https://docs.microsoft.com/en-us/dotnet/api/system.argumentoutofrangeexception)<br>
## Methods
### **CreateInstance()**
Create a `SafeLLamaGrammarHandle` instance to use for parsing
```csharp
public SafeLLamaGrammarHandle CreateInstance()
```
#### Returns
[SafeLLamaGrammarHandle](./llama.native.safellamagrammarhandle.md)<br>
### **Parse(String, String)**
Parse a string of GGML BNF into a Grammar
```csharp
public static Grammar Parse(string gbnf, string startRule)
```
#### Parameters
`gbnf` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The string to parse
`startRule` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
Name of the start rule of this grammar
#### Returns
[Grammar](./llama.grammars.grammar.md)<br>
A Grammar which can be converted into a SafeLLamaGrammarHandle for sampling
#### Exceptions
[GrammarFormatException](./llama.exceptions.grammarformatexception.md)<br>
Thrown if input is malformed
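A hedged end-to-end sketch of parsing GBNF and creating a native handle for sampling (only `Parse` and `CreateInstance` come from this page; the grammar text is illustrative):

```csharp
using LLama.Grammars;

// Constrain output to "yes" or "no".
var grammar = Grammar.Parse("root ::= \"yes\" | \"no\"", "root");
using var handle = grammar.CreateInstance();
// handle can now be supplied to the sampling pipeline.
```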
### **ToString()**
```csharp
public string ToString()
```
#### Returns
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>

@@ -0,0 +1,118 @@
# GrammarRule
Namespace: LLama.Grammars
A single rule in a [Grammar](./llama.grammars.grammar.md)
```csharp
public sealed class GrammarRule : System.IEquatable<LLama.Grammars.GrammarRule>
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [GrammarRule](./llama.grammars.grammarrule.md)<br>
Implements [IEquatable&lt;GrammarRule&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.iequatable-1)
## Properties
### **Name**
Name of this rule
```csharp
public string Name { get; }
```
#### Property Value
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Elements**
The elements of this grammar rule
```csharp
public IReadOnlyList<LLamaGrammarElement> Elements { get; }
```
#### Property Value
[IReadOnlyList&lt;LLamaGrammarElement&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ireadonlylist-1)<br>
## Constructors
### **GrammarRule(String, IReadOnlyList&lt;LLamaGrammarElement&gt;)**
Create a new GrammarRule containing the given elements
```csharp
public GrammarRule(string name, IReadOnlyList<LLamaGrammarElement> elements)
```
#### Parameters
`name` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`elements` [IReadOnlyList&lt;LLamaGrammarElement&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ireadonlylist-1)<br>
#### Exceptions
[ArgumentException](https://docs.microsoft.com/en-us/dotnet/api/system.argumentexception)<br>
## Methods
### **ToString()**
```csharp
public string ToString()
```
#### Returns
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **GetHashCode()**
```csharp
public int GetHashCode()
```
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **Equals(Object)**
```csharp
public bool Equals(object obj)
```
#### Parameters
`obj` [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object)<br>
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **Equals(GrammarRule)**
```csharp
public bool Equals(GrammarRule other)
```
#### Parameters
`other` [GrammarRule](./llama.grammars.grammarrule.md)<br>
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **&lt;Clone&gt;$()**
```csharp
public GrammarRule <Clone>$()
```
#### Returns
[GrammarRule](./llama.grammars.grammarrule.md)<br>

@ -13,31 +13,31 @@ Implements [ILLamaExecutor](./llama.abstractions.illamaexecutor.md)
## Properties
### **Model**
### **Context**
The mode used by the executor.
The context used by the executor.
```csharp
public LLamaModel Model { get; }
public LLamaContext Context { get; }
```
#### Property Value
[LLamaModel](./llama.llamamodel.md)<br>
[LLamaContext](./llama.llamacontext.md)<br>
## Constructors
### **InstructExecutor(LLamaModel, String, String)**
### **InstructExecutor(LLamaContext, String, String)**
```csharp
public InstructExecutor(LLamaModel model, string instructionPrefix, string instructionSuffix)
public InstructExecutor(LLamaContext context, string instructionPrefix, string instructionSuffix)
```
#### Parameters
`model` [LLamaModel](./llama.llamamodel.md)<br>
`context` [LLamaContext](./llama.llamacontext.md)<br>
`instructionPrefix` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
@ -111,15 +111,15 @@ protected void PreprocessInputs(string text, InferStateArgs args)
`args` [InferStateArgs](./llama.statefulexecutorbase.inferstateargs.md)<br>
### **PostProcess(InferenceParams, InferStateArgs, IEnumerable`1&)**
### **PostProcess(IInferenceParams, InferStateArgs, IEnumerable`1&)**
```csharp
protected bool PostProcess(InferenceParams inferenceParams, InferStateArgs args, IEnumerable`1& extraOutputs)
protected bool PostProcess(IInferenceParams inferenceParams, InferStateArgs args, IEnumerable`1& extraOutputs)
```
#### Parameters
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`args` [InferStateArgs](./llama.statefulexecutorbase.inferstateargs.md)<br>
@ -129,14 +129,14 @@ protected bool PostProcess(InferenceParams inferenceParams, InferStateArgs args,
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **InferInternal(InferenceParams, InferStateArgs)**
### **InferInternal(IInferenceParams, InferStateArgs)**
```csharp
protected void InferInternal(InferenceParams inferenceParams, InferStateArgs args)
protected void InferInternal(IInferenceParams inferenceParams, InferStateArgs args)
```
#### Parameters
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`args` [InferStateArgs](./llama.statefulexecutorbase.inferstateargs.md)<br>
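With the `Model` property renamed to `Context`, constructing the executor now goes through a `LLamaContext`. A hedged sketch of the new flow (the model path and the instruction prefix/suffix strings are placeholders):

```csharp
using LLama;
using LLama.Common;

var parameters = new ModelParams("path/to/model.gguf"); // hypothetical path
using var weights = LLamaWeights.LoadFromFile(parameters);
using var context = weights.CreateContext(parameters);

// The prefix/suffix wrap each instruction, Alpaca-style (illustrative values).
var executor = new InstructExecutor(context, "### Instruction:\n", "\n### Response:\n");
```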

@ -13,31 +13,31 @@ Implements [ILLamaExecutor](./llama.abstractions.illamaexecutor.md)
## Properties
### **Model**
### **Context**
The mode used by the executor.
The context used by the executor.
```csharp
public LLamaModel Model { get; }
public LLamaContext Context { get; }
```
#### Property Value
[LLamaModel](./llama.llamamodel.md)<br>
[LLamaContext](./llama.llamacontext.md)<br>
## Constructors
### **InteractiveExecutor(LLamaModel)**
### **InteractiveExecutor(LLamaContext)**
```csharp
public InteractiveExecutor(LLamaModel model)
public InteractiveExecutor(LLamaContext context)
```
#### Parameters
`model` [LLamaModel](./llama.llamamodel.md)<br>
`context` [LLamaContext](./llama.llamacontext.md)<br>
## Methods
@ -109,17 +109,17 @@ protected void PreprocessInputs(string text, InferStateArgs args)
`args` [InferStateArgs](./llama.statefulexecutorbase.inferstateargs.md)<br>
### **PostProcess(InferenceParams, InferStateArgs, IEnumerable`1&)**
### **PostProcess(IInferenceParams, InferStateArgs, IEnumerable`1&)**
Return whether to break the generation.
```csharp
protected bool PostProcess(InferenceParams inferenceParams, InferStateArgs args, IEnumerable`1& extraOutputs)
protected bool PostProcess(IInferenceParams inferenceParams, InferStateArgs args, IEnumerable`1& extraOutputs)
```
#### Parameters
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`args` [InferStateArgs](./llama.statefulexecutorbase.inferstateargs.md)<br>
@ -129,14 +129,14 @@ protected bool PostProcess(InferenceParams inferenceParams, InferStateArgs args,
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **InferInternal(InferenceParams, InferStateArgs)**
### **InferInternal(IInferenceParams, InferStateArgs)**
```csharp
protected void InferInternal(InferenceParams inferenceParams, InferStateArgs args)
protected void InferInternal(IInferenceParams inferenceParams, InferStateArgs args)
```
#### Parameters
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`args` [InferStateArgs](./llama.statefulexecutorbase.inferstateargs.md)<br>

@ -1,21 +1,33 @@
# LLamaModel
# LLamaContext
Namespace: LLama
The abstraction of a LLama model, which holds the context in the native library.
A llama_context, which holds all the context required to interact with a model
```csharp
public class LLamaModel : System.IDisposable
public sealed class LLamaContext : System.IDisposable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [LLamaModel](./llama.llamamodel.md)<br>
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [LLamaContext](./llama.llamacontext.md)<br>
Implements [IDisposable](https://docs.microsoft.com/en-us/dotnet/api/system.idisposable)
## Properties
### **VocabCount**
Total number of tokens in vocabulary of this model
```csharp
public int VocabCount { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **ContextSize**
The context size.
Total number of tokens in the context
```csharp
public int ContextSize { get; }
@ -25,22 +37,33 @@ public int ContextSize { get; }
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **EmbeddingSize**
Dimension of embedding vectors
```csharp
public int EmbeddingSize { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **Params**
The model params set for this model.
```csharp
public ModelParams Params { get; set; }
public IModelParams Params { get; set; }
```
#### Property Value
[ModelParams](./llama.common.modelparams.md)<br>
[IModelParams](./llama.abstractions.imodelparams.md)<br>
### **NativeHandle**
The native handle, which is used to be passed to the native APIs. Please avoid using it
unless you know what is the usage of the Native API.
The native handle, which is passed to the native APIs
```csharp
public SafeLLamaContextHandle NativeHandle { get; }
@ -50,6 +73,10 @@ public SafeLLamaContextHandle NativeHandle { get; }
[SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
**Remarks:**
Be careful how you use this!
### **Encoding**
The encoding set for this model to deal with text input.
@ -62,35 +89,82 @@ public Encoding Encoding { get; }
[Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
### **EmbeddingLength**
The embedding length of the model, also known as `n_embed`
```csharp
public int EmbeddingLength { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
## Constructors
### **LLamaModel(ModelParams, String, ILLamaLogger)**
### **LLamaContext(IModelParams, ILLamaLogger)**
#### Caution
Use the LLamaWeights.CreateContext instead
---
```csharp
public LLamaModel(ModelParams Params, string encoding, ILLamaLogger logger)
public LLamaContext(IModelParams params, ILLamaLogger logger)
```
#### Parameters
`Params` [ModelParams](./llama.common.modelparams.md)<br>
`params` [IModelParams](./llama.abstractions.imodelparams.md)<br>
Model params.
`encoding` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
Encoding to deal with text input.
`logger` [ILLamaLogger](./llama.common.illamalogger.md)<br>
The logger.
### **LLamaContext(LLamaWeights, IModelParams, ILLamaLogger)**
Create a new LLamaContext for the given LLamaWeights
```csharp
public LLamaContext(LLamaWeights model, IModelParams params, ILLamaLogger logger)
```
#### Parameters
`model` [LLamaWeights](./llama.llamaweights.md)<br>
`params` [IModelParams](./llama.abstractions.imodelparams.md)<br>
`logger` [ILLamaLogger](./llama.common.illamalogger.md)<br>
#### Exceptions
[ObjectDisposedException](https://docs.microsoft.com/en-us/dotnet/api/system.objectdisposedexception)<br>
## Methods
### **Clone()**
Create a copy of the current state of this context
```csharp
public LLamaContext Clone()
```
#### Returns
[LLamaContext](./llama.llamacontext.md)<br>
### **Tokenize(String, Boolean)**
Tokenize a string.
```csharp
public IEnumerable<int> Tokenize(string text, bool addBos)
public Int32[] Tokenize(string text, bool addBos)
```
#### Parameters
@ -102,7 +176,7 @@ Whether to add a bos to the text.
#### Returns
[IEnumerable&lt;Int32&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ienumerable-1)<br>
[Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **DeTokenize(IEnumerable&lt;Int32&gt;)**
@ -134,6 +208,12 @@ public void SaveState(string filename)
### **GetStateData()**
#### Caution
Use `GetState` instead, this supports larger states (over 2GB)
---
Get the state data as a byte array.
```csharp
@ -144,6 +224,18 @@ public Byte[] GetStateData()
[Byte[]](https://docs.microsoft.com/en-us/dotnet/api/system.byte)<br>
### **GetState()**
Get the state data as an opaque handle
```csharp
public State GetState()
```
#### Returns
[State](./llama.llamacontext.state.md)<br>
### **LoadState(String)**
Load the state from specified path.
@ -176,21 +268,39 @@ public void LoadState(Byte[] stateData)
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
### **Sample(LLamaTokenDataArray, Single, MiroStateType, Single, Single, Int32, Single, Single, Single)**
### **LoadState(State)**
Load the state from memory.
```csharp
public void LoadState(State state)
```
#### Parameters
`state` [State](./llama.llamacontext.state.md)<br>
#### Exceptions
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
### **Sample(LLamaTokenDataArray, Nullable`1&, Single, MirostatType, Single, Single, Int32, Single, Single, Single, SafeLLamaGrammarHandle)**
Perform the sampling. Please don't use it unless you fully know what it does.
```csharp
public int Sample(LLamaTokenDataArray candidates, float temperature, MiroStateType mirostat, float mirostatTau, float mirostatEta, int topK, float topP, float tfsZ, float typicalP)
public int Sample(LLamaTokenDataArray candidates, Nullable`1& mirostat_mu, float temperature, MirostatType mirostat, float mirostatTau, float mirostatEta, int topK, float topP, float tfsZ, float typicalP, SafeLLamaGrammarHandle grammar)
```
#### Parameters
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
`mirostat_mu` [Nullable`1&](https://docs.microsoft.com/en-us/dotnet/api/system.nullable-1)<br>
`temperature` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
`mirostat` [MiroStateType](./llama.common.mirostatetype.md)<br>
`mirostat` [MirostatType](./llama.common.mirostattype.md)<br>
`mirostatTau` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
@ -204,6 +314,8 @@ public int Sample(LLamaTokenDataArray candidates, float temperature, MiroStateTy
`typicalP` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
`grammar` [SafeLLamaGrammarHandle](./llama.native.safellamagrammarhandle.md)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
@ -259,6 +371,75 @@ The updated `pastTokensCount`.
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
### **Eval(List&lt;Int32&gt;, Int32)**
```csharp
public int Eval(List<int> tokens, int pastTokensCount)
```
#### Parameters
`tokens` [List&lt;Int32&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.list-1)<br>
`pastTokensCount` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
The updated `pastTokensCount`.
#### Exceptions
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
### **Eval(ReadOnlyMemory&lt;Int32&gt;, Int32)**
```csharp
public int Eval(ReadOnlyMemory<int> tokens, int pastTokensCount)
```
#### Parameters
`tokens` [ReadOnlyMemory&lt;Int32&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.readonlymemory-1)<br>
`pastTokensCount` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
The updated `pastTokensCount`.
#### Exceptions
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
### **Eval(ReadOnlySpan&lt;Int32&gt;, Int32)**
```csharp
public int Eval(ReadOnlySpan<int> tokens, int pastTokensCount)
```
#### Parameters
`tokens` [ReadOnlySpan&lt;Int32&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.readonlyspan-1)<br>
`pastTokensCount` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
The updated `pastTokensCount`.
#### Exceptions
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
### **GenerateResult(IEnumerable&lt;Int32&gt;)**
```csharp
@ -273,10 +454,24 @@ internal IEnumerable<string> GenerateResult(IEnumerable<int> ids)
[IEnumerable&lt;String&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ienumerable-1)<br>
### **TokenToString(Int32)**
Convert a token into a string
```csharp
public string TokenToString(int token)
```
#### Parameters
`token` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
#### Returns
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Dispose()**
```csharp
public void Dispose()
```
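A rough usage sketch of the `LLamaContext` APIs documented above (the model path is a placeholder; note that `Tokenize` now returns `Int32[]` rather than `IEnumerable<int>`):

```csharp
using LLama;
using LLama.Common;

var parameters = new ModelParams("path/to/model.gguf"); // hypothetical path
using var weights = LLamaWeights.LoadFromFile(parameters);
using var context = weights.CreateContext(parameters);

// Tokenize a prompt, then round-trip it back to text with DeTokenize.
int[] tokens = context.Tokenize("Hello, world!", addBos: true);
string roundTrip = context.DeTokenize(tokens);
```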

@ -5,30 +5,62 @@ Namespace: LLama
The embedder for LLama, which supports getting embeddings from text.
```csharp
public class LLamaEmbedder : System.IDisposable
public sealed class LLamaEmbedder : System.IDisposable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [LLamaEmbedder](./llama.llamaembedder.md)<br>
Implements [IDisposable](https://docs.microsoft.com/en-us/dotnet/api/system.idisposable)
## Properties
### **EmbeddingSize**
Dimension of embedding vectors
```csharp
public int EmbeddingSize { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
## Constructors
### **LLamaEmbedder(ModelParams)**
### **LLamaEmbedder(IModelParams)**
```csharp
public LLamaEmbedder(ModelParams params)
public LLamaEmbedder(IModelParams params)
```
#### Parameters
`params` [ModelParams](./llama.common.modelparams.md)<br>
`params` [IModelParams](./llama.abstractions.imodelparams.md)<br>
### **LLamaEmbedder(LLamaWeights, IModelParams)**
```csharp
public LLamaEmbedder(LLamaWeights weights, IModelParams params)
```
#### Parameters
`weights` [LLamaWeights](./llama.llamaweights.md)<br>
`params` [IModelParams](./llama.abstractions.imodelparams.md)<br>
## Methods
### **GetEmbeddings(String, Int32, Boolean, String)**
#### Caution
'threads' and 'encoding' parameters are no longer used
---
Get the embeddings of the text.
```csharp
@ -40,12 +72,56 @@ public Single[] GetEmbeddings(string text, int threads, bool addBos, string enco
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`threads` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Threads used for inference.
unused
`addBos` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
Add bos to the text.
`encoding` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
unused
#### Returns
[Single[]](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
#### Exceptions
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
### **GetEmbeddings(String)**
Get the embeddings of the text.
```csharp
public Single[] GetEmbeddings(string text)
```
#### Parameters
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
#### Returns
[Single[]](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
#### Exceptions
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
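The simplified `GetEmbeddings(String)` overload is usually all that is needed. A sketch under stated assumptions: the path is a placeholder, and setting `EmbeddingMode = true` on the params is assumed to be required to enable the native embedding mode.

```csharp
using LLama;
using LLama.Common;

// EmbeddingMode maps to the native `embedding` flag (assumption: required here).
var parameters = new ModelParams("path/to/model.gguf") { EmbeddingMode = true };
using var embedder = new LLamaEmbedder(parameters);

float[] embedding = embedder.GetEmbeddings("Hello LLamaSharp");
```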
### **GetEmbeddings(String, Boolean)**
Get the embeddings of the text.
```csharp
public Single[] GetEmbeddings(string text, bool addBos)
```
#### Parameters
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`addBos` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
Add bos to the text.
#### Returns

@ -12,12 +12,12 @@ Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object)
## Methods
### **Quantize(String, String, LLamaFtype, Int32)**
### **Quantize(String, String, LLamaFtype, Int32, Boolean, Boolean)**
Quantize the model.
```csharp
public static bool Quantize(string srcFileName, string dstFilename, LLamaFtype ftype, int nthread)
public static bool Quantize(string srcFileName, string dstFilename, LLamaFtype ftype, int nthread, bool allowRequantize, bool quantizeOutputTensor)
```
#### Parameters
@ -34,6 +34,10 @@ The type of quantization.
`nthread` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Thread to be used during the quantization. By default it's the physical core number.
`allowRequantize` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
`quantizeOutputTensor` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
@ -43,12 +47,12 @@ Whether the quantization is successful.
[ArgumentException](https://docs.microsoft.com/en-us/dotnet/api/system.argumentexception)<br>
### **Quantize(String, String, String, Int32)**
### **Quantize(String, String, String, Int32, Boolean, Boolean)**
Quantize the model.
```csharp
public static bool Quantize(string srcFileName, string dstFilename, string ftype, int nthread)
public static bool Quantize(string srcFileName, string dstFilename, string ftype, int nthread, bool allowRequantize, bool quantizeOutputTensor)
```
#### Parameters
@ -65,6 +69,10 @@ The type of quantization.
`nthread` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Thread to be used during the quantization. By default it's the physical core number.
`allowRequantize` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
`quantizeOutputTensor` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
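A sketch of the extended overload (both file names are hypothetical; per the quantizer parameter docs, a non-positive `nthread` falls back to the hardware concurrency):

```csharp
using LLama;
using LLama.Native;

bool ok = LLamaQuantizer.Quantize(
    "model-f16.gguf",          // hypothetical source file
    "model-q4_k_m.gguf",       // hypothetical destination file
    LLamaFtype.LLAMA_FTYPE_MOSTLY_Q4_K_M,
    nthread: 0,                // <= 0: use the physical core count
    allowRequantize: false,
    quantizeOutputTensor: false);
```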

@ -0,0 +1,118 @@
# LLamaWeights
Namespace: LLama
A set of model weights, loaded into memory.
```csharp
public sealed class LLamaWeights : System.IDisposable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [LLamaWeights](./llama.llamaweights.md)<br>
Implements [IDisposable](https://docs.microsoft.com/en-us/dotnet/api/system.idisposable)
## Properties
### **NativeHandle**
The native handle, which is used in the native APIs
```csharp
public SafeLlamaModelHandle NativeHandle { get; }
```
#### Property Value
[SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>
**Remarks:**
Be careful how you use this!
### **Encoding**
Encoding to use to convert text into bytes for the model
```csharp
public Encoding Encoding { get; }
```
#### Property Value
[Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
### **VocabCount**
Total number of tokens in vocabulary of this model
```csharp
public int VocabCount { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **ContextSize**
Total number of tokens in the context
```csharp
public int ContextSize { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **EmbeddingSize**
Dimension of embedding vectors
```csharp
public int EmbeddingSize { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
## Methods
### **LoadFromFile(IModelParams)**
Load weights into memory
```csharp
public static LLamaWeights LoadFromFile(IModelParams params)
```
#### Parameters
`params` [IModelParams](./llama.abstractions.imodelparams.md)<br>
#### Returns
[LLamaWeights](./llama.llamaweights.md)<br>
### **Dispose()**
```csharp
public void Dispose()
```
### **CreateContext(IModelParams)**
Create a llama_context using this model
```csharp
public LLamaContext CreateContext(IModelParams params)
```
#### Parameters
`params` [IModelParams](./llama.abstractions.imodelparams.md)<br>
#### Returns
[LLamaContext](./llama.llamacontext.md)<br>
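The intended flow is to load the weights once with `LoadFromFile` and then create one or more contexts from them. A minimal sketch (the model path is a placeholder):

```csharp
using LLama;
using LLama.Common;

var parameters = new ModelParams("path/to/model.gguf"); // hypothetical path
using var weights = LLamaWeights.LoadFromFile(parameters);

// Inspect the model-level properties documented above.
Console.WriteLine($"vocab={weights.VocabCount}, n_ctx={weights.ContextSize}, n_embd={weights.EmbeddingSize}");

using var context = weights.CreateContext(parameters);
```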

@ -2,6 +2,8 @@
Namespace: LLama.Native
A C# representation of the llama.cpp `llama_context_params` struct
```csharp
public struct LLamaContextParams
```
@ -10,6 +12,14 @@ Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object)
## Fields
### **seed**
RNG seed, -1 for random
```csharp
public int seed;
```
### **n_ctx**
text context
@ -18,6 +28,14 @@ text context
public int n_ctx;
```
### **n_batch**
prompt processing batch size
```csharp
public int n_batch;
```
### **n_gpu_layers**
number of layers to store in VRAM
@ -26,60 +44,38 @@ number of layers to store in VRAM
public int n_gpu_layers;
```
### **seed**
### **main_gpu**
RNG seed, -1 for random
the GPU that is used for scratch and small tensors
```csharp
public int seed;
public int main_gpu;
```
### **f16_kv**
### **tensor_split**
use fp16 for KV cache
how to split layers across multiple GPUs
```csharp
public bool f16_kv;
public IntPtr tensor_split;
```
### **logits_all**
### **rope_freq_base**
the llama_eval() call computes all logits, not just the last one
ref: https://github.com/ggerganov/llama.cpp/pull/2054
RoPE base frequency
```csharp
public bool logits_all;
public float rope_freq_base;
```
### **vocab_only**
### **rope_freq_scale**
only load the vocabulary, no weights
ref: https://github.com/ggerganov/llama.cpp/pull/2054
RoPE frequency scaling factor
```csharp
public bool vocab_only;
```
### **use_mmap**
use mmap if possible
```csharp
public bool use_mmap;
```
### **use_mlock**
force system to keep model in RAM
```csharp
public bool use_mlock;
```
### **embedding**
embedding mode only
```csharp
public bool embedding;
public float rope_freq_scale;
```
### **progress_callback**
@ -97,3 +93,101 @@ context pointer passed to the progress callback
```csharp
public IntPtr progress_callback_user_data;
```
## Properties
### **low_vram**
if true, reduce VRAM usage at the cost of performance
```csharp
public bool low_vram { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **mul_mat_q**
if true, use experimental mul_mat_q kernels
```csharp
public bool mul_mat_q { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **f16_kv**
use fp16 for KV cache
```csharp
public bool f16_kv { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **logits_all**
the llama_eval() call computes all logits, not just the last one
```csharp
public bool logits_all { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **vocab_only**
only load the vocabulary, no weights
```csharp
public bool vocab_only { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **use_mmap**
use mmap if possible
```csharp
public bool use_mmap { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **use_mlock**
force system to keep model in RAM
```csharp
public bool use_mlock { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **embedding**
embedding mode only
```csharp
public bool embedding { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>

@ -2,6 +2,8 @@
Namespace: LLama.Native
Supported model file types
```csharp
public enum LLamaFtype
```
@ -13,3 +15,21 @@ Implements [IComparable](https://docs.microsoft.com/en-us/dotnet/api/system.icom
| Name | Value | Description |
| --- | --: | --- |
| LLAMA_FTYPE_ALL_F32 | 0 | All f32 |
| LLAMA_FTYPE_MOSTLY_F16 | 1 | Mostly f16 |
| LLAMA_FTYPE_MOSTLY_Q8_0 | 7 | Mostly 8 bit |
| LLAMA_FTYPE_MOSTLY_Q4_0 | 2 | Mostly 4 bit |
| LLAMA_FTYPE_MOSTLY_Q4_1 | 3 | Mostly 4 bit |
| LLAMA_FTYPE_MOSTLY_Q4_1_SOME_F16 | 4 | Mostly 4 bit, tok_embeddings.weight and output.weight are f16 |
| LLAMA_FTYPE_MOSTLY_Q5_0 | 8 | Mostly 5 bit |
| LLAMA_FTYPE_MOSTLY_Q5_1 | 9 | Mostly 5 bit |
| LLAMA_FTYPE_MOSTLY_Q2_K | 10 | K-Quant 2 bit |
| LLAMA_FTYPE_MOSTLY_Q3_K_S | 11 | K-Quant 3 bit (Small) |
| LLAMA_FTYPE_MOSTLY_Q3_K_M | 12 | K-Quant 3 bit (Medium) |
| LLAMA_FTYPE_MOSTLY_Q3_K_L | 13 | K-Quant 3 bit (Large) |
| LLAMA_FTYPE_MOSTLY_Q4_K_S | 14 | K-Quant 4 bit (Small) |
| LLAMA_FTYPE_MOSTLY_Q4_K_M | 15 | K-Quant 4 bit (Medium) |
| LLAMA_FTYPE_MOSTLY_Q5_K_S | 16 | K-Quant 5 bit (Small) |
| LLAMA_FTYPE_MOSTLY_Q5_K_M | 17 | K-Quant 5 bit (Medium) |
| LLAMA_FTYPE_MOSTLY_Q6_K | 18 | K-Quant 6 bit |
| LLAMA_FTYPE_GUESSED | 1024 | File type was not specified |

@ -0,0 +1,96 @@
# LLamaGrammarElement
Namespace: LLama.Native
An element of a grammar
```csharp
public struct LLamaGrammarElement
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ValueType](https://docs.microsoft.com/en-us/dotnet/api/system.valuetype) → [LLamaGrammarElement](./llama.native.llamagrammarelement.md)<br>
Implements [IEquatable&lt;LLamaGrammarElement&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.iequatable-1)
## Fields
### **Type**
The type of this element
```csharp
public LLamaGrammarElementType Type;
```
### **Value**
Unicode code point or rule ID
```csharp
public uint Value;
```
## Constructors
### **LLamaGrammarElement(LLamaGrammarElementType, UInt32)**
Construct a new LLamaGrammarElement
```csharp
LLamaGrammarElement(LLamaGrammarElementType type, uint value)
```
#### Parameters
`type` [LLamaGrammarElementType](./llama.native.llamagrammarelementtype.md)<br>
`value` [UInt32](https://docs.microsoft.com/en-us/dotnet/api/system.uint32)<br>
## Methods
### **Equals(LLamaGrammarElement)**
```csharp
bool Equals(LLamaGrammarElement other)
```
#### Parameters
`other` [LLamaGrammarElement](./llama.native.llamagrammarelement.md)<br>
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **Equals(Object)**
```csharp
bool Equals(object obj)
```
#### Parameters
`obj` [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object)<br>
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **GetHashCode()**
```csharp
int GetHashCode()
```
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **IsCharElement()**
```csharp
bool IsCharElement()
```
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>

@ -0,0 +1,24 @@
# LLamaGrammarElementType
Namespace: LLama.Native
grammar element type
```csharp
public enum LLamaGrammarElementType
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ValueType](https://docs.microsoft.com/en-us/dotnet/api/system.valuetype) → [Enum](https://docs.microsoft.com/en-us/dotnet/api/system.enum) → [LLamaGrammarElementType](./llama.native.llamagrammarelementtype.md)<br>
Implements [IComparable](https://docs.microsoft.com/en-us/dotnet/api/system.icomparable), [IFormattable](https://docs.microsoft.com/en-us/dotnet/api/system.iformattable), [IConvertible](https://docs.microsoft.com/en-us/dotnet/api/system.iconvertible)
## Fields
| Name | Value | Description |
| --- | --: | --- |
| END | 0 | end of rule definition |
| ALT | 1 | start of alternate definition for rule |
| RULE_REF | 2 | non-terminal element: reference to rule |
| CHAR | 3 | terminal element: character (code point) |
| CHAR_NOT | 4 | inverse char(s) ([^a], [^a-b] [^abc]) |
| CHAR_RNG_UPPER | 5 | modifies a preceding CHAR or CHAR_ALT to be an inclusive range ([a-z]) |
| CHAR_ALT | 6 | modifies a preceding CHAR or CHAR_RNG_UPPER to add an alternate char to match ([ab], [a-zA]) |

@ -0,0 +1,55 @@
# LLamaModelQuantizeParams
Namespace: LLama.Native
Quantizer parameters used in the native API
```csharp
public struct LLamaModelQuantizeParams
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ValueType](https://docs.microsoft.com/en-us/dotnet/api/system.valuetype) → [LLamaModelQuantizeParams](./llama.native.llamamodelquantizeparams.md)
## Fields
### **nthread**
number of threads to use for quantizing, if &lt;=0 will use std::thread::hardware_concurrency()
```csharp
public int nthread;
```
### **ftype**
quantize to this llama_ftype
```csharp
public LLamaFtype ftype;
```
## Properties
### **allow_requantize**
allow quantizing non-f32/f16 tensors
```csharp
public bool allow_requantize { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **quantize_output_tensor**
quantize output.weight
```csharp
public bool quantize_output_tensor { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>


@@ -2,6 +2,8 @@
Namespace: LLama.Native
Contains an array of LLamaTokenData, potentially sorted.
```csharp
public struct LLamaTokenDataArray
```
@@ -12,34 +14,50 @@ Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object)
### **data**
The LLamaTokenData
```csharp
public Memory<LLamaTokenData> data;
```
### **size**
```csharp
public ulong size;
```
### **sorted**
Indicates if `data` is sorted by logits in descending order. If this is false the token data is in _no particular order_.
```csharp
public bool sorted;
```
## Constructors
### **LLamaTokenDataArray(Memory&lt;LLamaTokenData&gt;, Boolean)**
Create a new LLamaTokenDataArray
```csharp
LLamaTokenDataArray(Memory<LLamaTokenData> tokens, bool isSorted)
```
#### Parameters
`tokens` [Memory&lt;LLamaTokenData&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.memory-1)<br>
`isSorted` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
## Methods
### **Create(ReadOnlySpan&lt;Single&gt;)**
Create a new LLamaTokenDataArray, copying the data from the given logits
```csharp
LLamaTokenDataArray Create(ReadOnlySpan<float> logits)
```
#### Parameters
`logits` [ReadOnlySpan&lt;Single&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.readonlyspan-1)<br>
#### Returns
[LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
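`Create` is typically fed the logits from the last evaluation — a sketch, assuming `ctx` is an existing `SafeLLamaContextHandle`:
```csharp
// Build a candidate array from the current logits.
Span<float> logits = ctx.GetLogits();
LLamaTokenDataArray candidates = LLamaTokenDataArray.Create(logits);
// candidates.data holds one LLamaTokenData per vocabulary entry;
// candidates.sorted is false until a sampling call sorts it.
```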


@@ -2,6 +2,8 @@
Namespace: LLama.Native
Contains a pointer to an array of LLamaTokenData which is pinned in memory.
```csharp
public struct LLamaTokenDataArrayNative
```
@@ -12,18 +14,57 @@ Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object)
### **data**
A pointer to an array of LlamaTokenData
```csharp
public IntPtr data;
```
**Remarks:**
Memory must be pinned in place for all the time this LLamaTokenDataArrayNative is in use
### **size**
Number of LLamaTokenData in the array
```csharp
public ulong size;
```
## Properties
### **sorted**
Indicates if the items in the array are sorted
```csharp
public bool sorted { get; set; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
## Methods
### **Create(LLamaTokenDataArray, LLamaTokenDataArrayNative&)**
Create a new LLamaTokenDataArrayNative around the data in the LLamaTokenDataArray
```csharp
MemoryHandle Create(LLamaTokenDataArray array, LLamaTokenDataArrayNative& native)
```
#### Parameters
`array` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
Data source
`native` [LLamaTokenDataArrayNative&](./llama.native.llamatokendataarraynative&.md)<br>
Created native array
#### Returns
[MemoryHandle](https://docs.microsoft.com/en-us/dotnet/api/system.buffers.memoryhandle)<br>
A memory handle, pinning the data in place until disposed
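A sketch of how the pinning handle is used; the `&` parameter is assumed to surface in C# as an `out` argument, and `candidates` is an existing `LLamaTokenDataArray`:
```csharp
// Pin the managed token data so a native sampling call can read it.
using (MemoryHandle pin = LLamaTokenDataArrayNative.Create(candidates, out LLamaTokenDataArrayNative native))
{
    // native.data points at the pinned array and native.size gives its length;
    // the memory stays pinned until the MemoryHandle is disposed.
}
```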

File diff suppressed because it is too large


@@ -2,8 +2,10 @@
Namespace: LLama.Native
A safe wrapper around a llama_context
```csharp
public sealed class SafeLLamaContextHandle : SafeLLamaHandleBase, System.IDisposable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [CriticalFinalizerObject](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.constrainedexecution.criticalfinalizerobject) → [SafeHandle](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.safehandle) → [SafeLLamaHandleBase](./llama.native.safellamahandlebase.md) → [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
@@ -11,6 +13,54 @@ Implements [IDisposable](https://docs.microsoft.com/en-us/dotnet/api/system.idis
## Properties
### **VocabCount**
Total number of tokens in vocabulary of this model
```csharp
public int VocabCount { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **ContextSize**
Total number of tokens in the context
```csharp
public int ContextSize { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **EmbeddingSize**
Dimension of embedding vectors
```csharp
public int EmbeddingSize { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **ModelHandle**
Get the model which this context is using
```csharp
public SafeLlamaModelHandle ModelHandle { get; }
```
#### Property Value
[SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>
### **IsInvalid**
```csharp
@@ -33,15 +83,21 @@ public bool IsClosed { get; }
## Constructors
### **SafeLLamaContextHandle(IntPtr, SafeLlamaModelHandle)**
Create a new SafeLLamaContextHandle
```csharp
public SafeLLamaContextHandle(IntPtr handle, SafeLlamaModelHandle model)
```
```
#### Parameters
`handle` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
pointer to an allocated llama_context
`model` [SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>
the model which this context was created from
## Methods
@@ -54,3 +110,265 @@ protected bool ReleaseHandle()
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **Create(SafeLlamaModelHandle, LLamaContextParams)**
Create a new llama_state for the given model
```csharp
public static SafeLLamaContextHandle Create(SafeLlamaModelHandle model, LLamaContextParams lparams)
```
#### Parameters
`model` [SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>
`lparams` [LLamaContextParams](./llama.native.llamacontextparams.md)<br>
#### Returns
[SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
#### Exceptions
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
### **Clone(LLamaContextParams)**
Create a new llama context with a clone of the current llama context state
```csharp
public SafeLLamaContextHandle Clone(LLamaContextParams lparams)
```
#### Parameters
`lparams` [LLamaContextParams](./llama.native.llamacontextparams.md)<br>
#### Returns
[SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
### **Tokenize(String, Boolean, Encoding)**
Convert the given text into tokens
```csharp
public Int32[] Tokenize(string text, bool add_bos, Encoding encoding)
```
#### Parameters
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
The text to tokenize
`add_bos` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
Whether the "BOS" token should be added
`encoding` [Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
Encoding to use for the text
#### Returns
[Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
#### Exceptions
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
### **GetLogits()**
Token logits obtained from the last call to llama_eval()
The logits for the last token are stored in the last row
Can be mutated in order to change the probabilities of the next token.<br>
Rows: n_tokens<br>
Cols: n_vocab
```csharp
public Span<float> GetLogits()
```
#### Returns
[Span&lt;Single&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.span-1)<br>
### **TokenToString(Int32, Encoding)**
Convert a token into a string
```csharp
public string TokenToString(int token, Encoding encoding)
```
#### Parameters
`token` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Token to decode into a string
`encoding` [Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
#### Returns
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **TokenToString(Int32, Encoding, StringBuilder)**
Append a single llama token to a string builder
```csharp
public void TokenToString(int token, Encoding encoding, StringBuilder dest)
```
#### Parameters
`token` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Token to decode
`encoding` [Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
`dest` [StringBuilder](https://docs.microsoft.com/en-us/dotnet/api/system.text.stringbuilder)<br>
string builder to append the result to
### **TokenToSpan(Int32, Span&lt;Byte&gt;)**
Convert a single llama token into bytes
```csharp
public int TokenToSpan(int token, Span<byte> dest)
```
#### Parameters
`token` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Token to decode
`dest` [Span&lt;Byte&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.span-1)<br>
A span to attempt to write into. If this is too small nothing will be written
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
The size of this token. **nothing will be written** if this is larger than `dest`
### **Eval(ReadOnlySpan&lt;Int32&gt;, Int32, Int32)**
Run the llama inference to obtain the logits and probabilities for the next token.
```csharp
public bool Eval(ReadOnlySpan<int> tokens, int n_past, int n_threads)
```
#### Parameters
`tokens` [ReadOnlySpan&lt;Int32&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.readonlyspan-1)<br>
The provided batch of new tokens to process
`n_past` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
the number of tokens to use from previous eval calls
`n_threads` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
Returns true on success
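`Tokenize`, `Eval` and `GetLogits` combine into a basic evaluation step — a sketch, assuming `ctx` is an existing `SafeLLamaContextHandle` and the thread count is illustrative:
```csharp
// Tokenize a prompt and feed it to the model.
int[] tokens = ctx.Tokenize("Hello, world!", add_bos: true, Encoding.UTF8);
int n_past = 0;
bool ok = ctx.Eval(tokens, n_past, n_threads: 4);
if (ok)
{
    n_past += tokens.Length;              // later calls reuse these tokens as context
    Span<float> logits = ctx.GetLogits(); // logits for the next-token prediction
}
```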
### **GetStateSize()**
Get the size of the state, when saved as bytes
```csharp
public ulong GetStateSize()
```
#### Returns
[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
### **GetState(Byte*, UInt64)**
Get the raw state of this context, encoded as bytes. Data is written into the `dest` pointer.
```csharp
public ulong GetState(Byte* dest, ulong size)
```
#### Parameters
`dest` [Byte*](https://docs.microsoft.com/en-us/dotnet/api/system.byte*)<br>
Destination to write to
`size` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
Number of bytes available to write to in dest (check required size with `GetStateSize()`)
#### Returns
[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
The number of bytes written to dest
#### Exceptions
[ArgumentOutOfRangeException](https://docs.microsoft.com/en-us/dotnet/api/system.argumentoutofrangeexception)<br>
Thrown if dest is too small
### **GetState(IntPtr, UInt64)**
Get the raw state of this context, encoded as bytes. Data is written into the `dest` pointer.
```csharp
public ulong GetState(IntPtr dest, ulong size)
```
#### Parameters
`dest` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
Destination to write to
`size` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
Number of bytes available to write to in dest (check required size with `GetStateSize()`)
#### Returns
[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
The number of bytes written to dest
#### Exceptions
[ArgumentOutOfRangeException](https://docs.microsoft.com/en-us/dotnet/api/system.argumentoutofrangeexception)<br>
Thrown if dest is too small
### **SetState(Byte*)**
Set the raw state of this context
```csharp
public ulong SetState(Byte* src)
```
#### Parameters
`src` [Byte*](https://docs.microsoft.com/en-us/dotnet/api/system.byte*)<br>
The pointer to read the state from
#### Returns
[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
Number of bytes read from the src pointer
### **SetState(IntPtr)**
Set the raw state of this context
```csharp
public ulong SetState(IntPtr src)
```
#### Parameters
`src` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
The pointer to read the state from
#### Returns
[UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
Number of bytes read from the src pointer
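The state methods above can round-trip a context through a managed buffer — a sketch (requires an `unsafe` context; `ctx` is assumed to exist):
```csharp
// Save the raw context state and later restore it.
ulong stateSize = ctx.GetStateSize();
byte[] buffer = new byte[stateSize];
unsafe
{
    fixed (byte* ptr = buffer)
    {
        ulong written = ctx.GetState(ptr, stateSize); // save into the buffer
        // ... later, restore the saved state:
        ulong read = ctx.SetState(ptr);
    }
}
```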


@@ -0,0 +1,97 @@
# SafeLLamaGrammarHandle
Namespace: LLama.Native
A safe reference to a `llama_grammar`
```csharp
public class SafeLLamaGrammarHandle : SafeLLamaHandleBase, System.IDisposable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [CriticalFinalizerObject](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.constrainedexecution.criticalfinalizerobject) → [SafeHandle](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.safehandle) → [SafeLLamaHandleBase](./llama.native.safellamahandlebase.md) → [SafeLLamaGrammarHandle](./llama.native.safellamagrammarhandle.md)<br>
Implements [IDisposable](https://docs.microsoft.com/en-us/dotnet/api/system.idisposable)
## Properties
### **IsInvalid**
```csharp
public bool IsInvalid { get; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **IsClosed**
```csharp
public bool IsClosed { get; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
## Methods
### **ReleaseHandle()**
```csharp
protected bool ReleaseHandle()
```
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **Create(IReadOnlyList&lt;GrammarRule&gt;, UInt64)**
Create a new llama_grammar
```csharp
public static SafeLLamaGrammarHandle Create(IReadOnlyList<GrammarRule> rules, ulong start_rule_index)
```
#### Parameters
`rules` [IReadOnlyList&lt;GrammarRule&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ireadonlylist-1)<br>
A list of lists of elements; each inner list makes up one grammar rule
`start_rule_index` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
The index (in the outer list) of the start rule
#### Returns
[SafeLLamaGrammarHandle](./llama.native.safellamagrammarhandle.md)<br>
#### Exceptions
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
### **Create(LLamaGrammarElement**, UInt64, UInt64)**
Create a new llama_grammar
```csharp
public static SafeLLamaGrammarHandle Create(LLamaGrammarElement** rules, ulong nrules, ulong start_rule_index)
```
#### Parameters
`rules` [LLamaGrammarElement**](./llama.native.llamagrammarelement**.md)<br>
rules list, each rule is a list of rule elements (terminated by a LLamaGrammarElementType.END element)
`nrules` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
total number of rules
`start_rule_index` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
index of the start rule of the grammar
#### Returns
[SafeLLamaGrammarHandle](./llama.native.safellamagrammarhandle.md)<br>
#### Exceptions
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
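A sketch of the managed overload: a one-rule grammar whose start rule matches the single character `a`. The `GrammarRule` and `LLamaGrammarElement` constructors are not documented on this page, so the shapes used below are assumptions:
```csharp
var rules = new List<GrammarRule>
{
    // rule 0 ("root"): CHAR 'a' followed by the END terminator
    new GrammarRule("root", new[]
    {
        new LLamaGrammarElement(LLamaGrammarElementType.CHAR, 'a'),
        new LLamaGrammarElement(LLamaGrammarElementType.END, 0),
    }),
};
using SafeLLamaGrammarHandle grammar = SafeLLamaGrammarHandle.Create(rules, start_rule_index: 0);
```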


@@ -2,6 +2,8 @@
Namespace: LLama.Native
Base class for all llama handles to native resources
```csharp
public abstract class SafeLLamaHandleBase : System.Runtime.InteropServices.SafeHandle, System.IDisposable
```


@@ -0,0 +1,220 @@
# SafeLlamaModelHandle
Namespace: LLama.Native
A reference to a set of llama model weights
```csharp
public sealed class SafeLlamaModelHandle : SafeLLamaHandleBase, System.IDisposable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [CriticalFinalizerObject](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.constrainedexecution.criticalfinalizerobject) → [SafeHandle](https://docs.microsoft.com/en-us/dotnet/api/system.runtime.interopservices.safehandle) → [SafeLLamaHandleBase](./llama.native.safellamahandlebase.md) → [SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>
Implements [IDisposable](https://docs.microsoft.com/en-us/dotnet/api/system.idisposable)
## Properties
### **VocabCount**
Total number of tokens in vocabulary of this model
```csharp
public int VocabCount { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **ContextSize**
Total number of tokens in the context
```csharp
public int ContextSize { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **EmbeddingSize**
Dimension of embedding vectors
```csharp
public int EmbeddingSize { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **IsInvalid**
```csharp
public bool IsInvalid { get; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **IsClosed**
```csharp
public bool IsClosed { get; }
```
#### Property Value
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
## Methods
### **ReleaseHandle()**
```csharp
protected bool ReleaseHandle()
```
#### Returns
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **LoadFromFile(String, LLamaContextParams)**
Load a model from the given file path into memory
```csharp
public static SafeLlamaModelHandle LoadFromFile(string modelPath, LLamaContextParams lparams)
```
#### Parameters
`modelPath` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`lparams` [LLamaContextParams](./llama.native.llamacontextparams.md)<br>
#### Returns
[SafeLlamaModelHandle](./llama.native.safellamamodelhandle.md)<br>
#### Exceptions
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
### **ApplyLoraFromFile(String, String, Int32)**
Apply a LoRA adapter to a loaded model
```csharp
public void ApplyLoraFromFile(string lora, string modelBase, int threads)
```
#### Parameters
`lora` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`modelBase` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
A path to a higher quality model to use as a base for the layers modified by the
adapter. Can be NULL to use the current loaded model.
`threads` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
#### Exceptions
[RuntimeError](./llama.exceptions.runtimeerror.md)<br>
### **TokenToSpan(Int32, Span&lt;Byte&gt;)**
Convert a single llama token into bytes
```csharp
public int TokenToSpan(int llama_token, Span<byte> dest)
```
#### Parameters
`llama_token` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Token to decode
`dest` [Span&lt;Byte&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.span-1)<br>
A span to attempt to write into. If this is too small nothing will be written
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
The size of this token. **nothing will be written** if this is larger than `dest`
### **TokenToString(Int32, Encoding)**
Convert a single llama token into a string
```csharp
public string TokenToString(int llama_token, Encoding encoding)
```
#### Parameters
`llama_token` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`encoding` [Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
Encoding to use to decode the bytes into a string
#### Returns
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **TokenToString(Int32, Encoding, StringBuilder)**
Append a single llama token to a string builder
```csharp
public void TokenToString(int llama_token, Encoding encoding, StringBuilder dest)
```
#### Parameters
`llama_token` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
Token to decode
`encoding` [Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
`dest` [StringBuilder](https://docs.microsoft.com/en-us/dotnet/api/system.text.stringbuilder)<br>
string builder to append the result to
### **Tokenize(String, Boolean, Encoding)**
Convert a string of text into tokens
```csharp
public Int32[] Tokenize(string text, bool add_bos, Encoding encoding)
```
#### Parameters
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`add_bos` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
`encoding` [Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
#### Returns
[Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **CreateContext(LLamaContextParams)**
Create a new context for this model
```csharp
public SafeLLamaContextHandle CreateContext(LLamaContextParams params)
```
#### Parameters
`params` [LLamaContextParams](./llama.native.llamacontextparams.md)<br>
#### Returns
[SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
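Putting the model-handle methods together — a sketch, assuming `llama_context_default_params()` is exposed on `NativeApi` and `"./model.gguf"` is a placeholder path:
```csharp
// Load weights once, then derive a context and tokenize some text.
LLamaContextParams lparams = NativeApi.llama_context_default_params();
using SafeLlamaModelHandle model = SafeLlamaModelHandle.LoadFromFile("./model.gguf", lparams);
using SafeLLamaContextHandle ctx = model.CreateContext(lparams);
int[] tokens = model.Tokenize("Hello", add_bos: true, Encoding.UTF8);
```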


@@ -0,0 +1,338 @@
# SamplingApi
Namespace: LLama.Native
Direct translation of the llama.cpp sampling API
```csharp
public class SamplingApi
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [SamplingApi](./llama.native.samplingapi.md)
## Constructors
### **SamplingApi()**
```csharp
public SamplingApi()
```
## Methods
### **llama_sample_grammar(SafeLLamaContextHandle, LLamaTokenDataArray, SafeLLamaGrammarHandle)**
Apply grammar rules to candidate tokens
```csharp
public static void llama_sample_grammar(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, SafeLLamaGrammarHandle grammar)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
`grammar` [SafeLLamaGrammarHandle](./llama.native.safellamagrammarhandle.md)<br>
### **llama_sample_repetition_penalty(SafeLLamaContextHandle, LLamaTokenDataArray, Memory&lt;Int32&gt;, UInt64, Single)**
#### Caution
last_tokens_size parameter is no longer needed
---
Repetition penalty described in CTRL academic paper https://arxiv.org/abs/1909.05858, with negative logit fix.
```csharp
public static void llama_sample_repetition_penalty(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, Memory<int> last_tokens, ulong last_tokens_size, float penalty)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
Pointer to LLamaTokenDataArray
`last_tokens` [Memory&lt;Int32&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.memory-1)<br>
`last_tokens_size` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
`penalty` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **llama_sample_repetition_penalty(SafeLLamaContextHandle, LLamaTokenDataArray, Memory&lt;Int32&gt;, Single)**
Repetition penalty described in CTRL academic paper https://arxiv.org/abs/1909.05858, with negative logit fix.
```csharp
public static void llama_sample_repetition_penalty(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, Memory<int> last_tokens, float penalty)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
Pointer to LLamaTokenDataArray
`last_tokens` [Memory&lt;Int32&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.memory-1)<br>
`penalty` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **llama_sample_frequency_and_presence_penalties(SafeLLamaContextHandle, LLamaTokenDataArray, Memory&lt;Int32&gt;, UInt64, Single, Single)**
#### Caution
last_tokens_size parameter is no longer needed
---
Frequency and presence penalties described in OpenAI API https://platform.openai.com/docs/api-reference/parameter-details.
```csharp
public static void llama_sample_frequency_and_presence_penalties(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, Memory<int> last_tokens, ulong last_tokens_size, float alpha_frequency, float alpha_presence)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
Pointer to LLamaTokenDataArray
`last_tokens` [Memory&lt;Int32&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.memory-1)<br>
`last_tokens_size` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
`alpha_frequency` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
`alpha_presence` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **llama_sample_frequency_and_presence_penalties(SafeLLamaContextHandle, LLamaTokenDataArray, Memory&lt;Int32&gt;, Single, Single)**
Frequency and presence penalties described in OpenAI API https://platform.openai.com/docs/api-reference/parameter-details.
```csharp
public static void llama_sample_frequency_and_presence_penalties(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, Memory<int> last_tokens, float alpha_frequency, float alpha_presence)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
Pointer to LLamaTokenDataArray
`last_tokens` [Memory&lt;Int32&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.memory-1)<br>
`alpha_frequency` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
`alpha_presence` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **llama_sample_softmax(SafeLLamaContextHandle, LLamaTokenDataArray)**
Sorts candidate tokens by their logits in descending order and calculates probabilities based on the logits.
```csharp
public static void llama_sample_softmax(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
Pointer to LLamaTokenDataArray
### **llama_sample_top_k(SafeLLamaContextHandle, LLamaTokenDataArray, Int32, UInt64)**
Top-K sampling described in academic paper "The Curious Case of Neural Text Degeneration" https://arxiv.org/abs/1904.09751
```csharp
public static void llama_sample_top_k(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, int k, ulong min_keep)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
Pointer to LLamaTokenDataArray
`k` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`min_keep` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
### **llama_sample_top_p(SafeLLamaContextHandle, LLamaTokenDataArray, Single, UInt64)**
Nucleus sampling described in academic paper "The Curious Case of Neural Text Degeneration" https://arxiv.org/abs/1904.09751
```csharp
public static void llama_sample_top_p(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, float p, ulong min_keep)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
Pointer to LLamaTokenDataArray
`p` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
`min_keep` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
### **llama_sample_tail_free(SafeLLamaContextHandle, LLamaTokenDataArray, Single, UInt64)**
Tail Free Sampling described in https://www.trentonbricken.com/Tail-Free-Sampling/.
```csharp
public static void llama_sample_tail_free(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, float z, ulong min_keep)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
Pointer to LLamaTokenDataArray
`z` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
`min_keep` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
### **llama_sample_typical(SafeLLamaContextHandle, LLamaTokenDataArray, Single, UInt64)**
Locally Typical Sampling implementation described in the paper https://arxiv.org/abs/2202.00666.
```csharp
public static void llama_sample_typical(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, float p, ulong min_keep)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
Pointer to LLamaTokenDataArray
`p` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
`min_keep` [UInt64](https://docs.microsoft.com/en-us/dotnet/api/system.uint64)<br>
### **llama_sample_temperature(SafeLLamaContextHandle, LLamaTokenDataArray, Single)**
Sample with temperature.
As temperature increases, the prediction becomes diverse but also vulnerable to hallucinations -- generating tokens that are sensible but not factual
```csharp
public static void llama_sample_temperature(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, float temp)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
`temp` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
### **llama_sample_token_mirostat(SafeLLamaContextHandle, LLamaTokenDataArray, Single, Single, Int32, Single&)**
Mirostat 1.0 algorithm described in the paper https://arxiv.org/abs/2007.14966. Uses tokens instead of words.
```csharp
public static int llama_sample_token_mirostat(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, float tau, float eta, int m, Single& mu)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
A vector of `LLamaTokenData` containing the candidate tokens, their probabilities (p), and log-odds (logit) for the current position in the generated text.
`tau` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
The target cross-entropy (or surprise) value you want to achieve for the generated text. A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.
`eta` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
The learning rate used to update `mu` based on the error between the target and observed surprisal of the sampled word. A larger learning rate will cause `mu` to be updated more quickly, while a smaller learning rate will result in slower updates.
`m` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
The number of tokens considered in the estimation of `s_hat`. This is an arbitrary value that is used to calculate `s_hat`, which in turn helps to calculate the value of `k`. In the paper, they use `m = 100`, but you can experiment with different values to see how it affects the performance of the algorithm.
`mu` [Single&](https://docs.microsoft.com/en-us/dotnet/api/system.single&)<br>
Maximum cross-entropy. This value is initialized to be twice the target cross-entropy (`2 * tau`) and is updated in the algorithm based on the error between the target and observed surprisal.
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **llama_sample_token_mirostat_v2(SafeLLamaContextHandle, LLamaTokenDataArray, Single, Single, Single&)**
Mirostat 2.0 algorithm described in the paper https://arxiv.org/abs/2007.14966. Uses tokens instead of words.
```csharp
public static int llama_sample_token_mirostat_v2(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates, float tau, float eta, Single& mu)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
A vector of `LLamaTokenData` containing the candidate tokens, their probabilities (p), and log-odds (logit) for the current position in the generated text.
`tau` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
The target cross-entropy (or surprise) value you want to achieve for the generated text. A higher value corresponds to more surprising or less predictable text, while a lower value corresponds to less surprising or more predictable text.
`eta` [Single](https://docs.microsoft.com/en-us/dotnet/api/system.single)<br>
The learning rate used to update `mu` based on the error between the target and observed surprisal of the sampled word. A larger learning rate will cause `mu` to be updated more quickly, while a smaller learning rate will result in slower updates.
`mu` [Single&](https://docs.microsoft.com/en-us/dotnet/api/system.single&)<br>
Maximum cross-entropy. This value is initialized to be twice the target cross-entropy (`2 * tau`) and is updated in the algorithm based on the error between the target and observed surprisal.
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
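A minimal sketch of how these parameters fit together. It assumes `ctx` is a valid `SafeLLamaContextHandle`, `candidates` has already been filled from the current logits, and that this static lives on `NativeApi` like the other functions on this page; everything outside those signatures is illustrative only.

```csharp
// Hypothetical usage sketch for Mirostat 2.0 sampling.
float tau = 5.0f;        // target cross-entropy (surprise)
float eta = 0.1f;        // learning rate for the mu update
float mu  = 2.0f * tau;  // maximum cross-entropy, initialised to 2 * tau

// mu is passed by reference and updated on every call based on the
// error between the target and observed surprisal.
int token = NativeApi.llama_sample_token_mirostat_v2(ctx, candidates, tau, eta, ref mu);
```

Because `mu` carries state between calls, the same variable should be reused across the whole generation loop rather than re-initialised per token.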
### **llama_sample_token_greedy(SafeLLamaContextHandle, LLamaTokenDataArray)**
Selects the token with the highest probability.
```csharp
public static int llama_sample_token_greedy(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
Pointer to LLamaTokenDataArray
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **llama_sample_token(SafeLLamaContextHandle, LLamaTokenDataArray)**
Randomly selects a token from the candidates based on their probabilities.
```csharp
public static int llama_sample_token(SafeLLamaContextHandle ctx, LLamaTokenDataArray candidates)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`candidates` [LLamaTokenDataArray](./llama.native.llamatokendataarray.md)<br>
Pointer to LLamaTokenDataArray
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
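The two selection functions above are alternatives for the final step of a sampling pipeline. A hedged sketch, assuming `ctx` and `candidates` are prepared as elsewhere on this page (any earlier filtering such as top-k/top-p is not shown):

```csharp
// Hypothetical sketch: final token selection.
bool greedy = false;
int token = greedy
    ? NativeApi.llama_sample_token_greedy(ctx, candidates)  // argmax over probabilities
    : NativeApi.llama_sample_token(ctx, candidates);        // draw proportionally to p
```

Greedy selection is deterministic for a given context, while `llama_sample_token` draws randomly according to the candidates' probabilities.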

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class ChatCompletion : System.IEquatable`1[[LLama.OldVersion.ChatCompletion, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class ChatCompletion : System.IEquatable`1[[LLama.OldVersion.ChatCompletion, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ChatCompletion](./llama.oldversion.chatcompletion.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class ChatCompletionChoice : System.IEquatable`1[[LLama.OldVersion.ChatCompletionChoice, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class ChatCompletionChoice : System.IEquatable`1[[LLama.OldVersion.ChatCompletionChoice, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ChatCompletionChoice](./llama.oldversion.chatcompletionchoice.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class ChatCompletionChunk : System.IEquatable`1[[LLama.OldVersion.ChatCompletionChunk, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class ChatCompletionChunk : System.IEquatable`1[[LLama.OldVersion.ChatCompletionChunk, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ChatCompletionChunk](./llama.oldversion.chatcompletionchunk.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class ChatCompletionChunkChoice : System.IEquatable`1[[LLama.OldVersion.ChatCompletionChunkChoice, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class ChatCompletionChunkChoice : System.IEquatable`1[[LLama.OldVersion.ChatCompletionChunkChoice, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ChatCompletionChunkChoice](./llama.oldversion.chatcompletionchunkchoice.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class ChatCompletionChunkDelta : System.IEquatable`1[[LLama.OldVersion.ChatCompletionChunkDelta, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class ChatCompletionChunkDelta : System.IEquatable`1[[LLama.OldVersion.ChatCompletionChunkDelta, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ChatCompletionChunkDelta](./llama.oldversion.chatcompletionchunkdelta.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class ChatCompletionMessage : System.IEquatable`1[[LLama.OldVersion.ChatCompletionMessage, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class ChatCompletionMessage : System.IEquatable`1[[LLama.OldVersion.ChatCompletionMessage, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ChatCompletionMessage](./llama.oldversion.chatcompletionmessage.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class ChatMessageRecord : System.IEquatable`1[[LLama.OldVersion.ChatMessageRecord, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class ChatMessageRecord : System.IEquatable`1[[LLama.OldVersion.ChatMessageRecord, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [ChatMessageRecord](./llama.oldversion.chatmessagerecord.md)<br>

View File

@ -2,6 +2,12 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class ChatSession<T>
```
@ -78,7 +84,7 @@ public ChatSession<T> WithPromptFile(string promptFilename, string encoding)
### **WithAntiprompt(String[])**
Set the keyword to split the return value of chat AI.
Set the keywords used to split the return value of the chat AI.
```csharp
public ChatSession<T> WithAntiprompt(String[] antiprompt)

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class Completion : System.IEquatable`1[[LLama.OldVersion.Completion, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class Completion : System.IEquatable`1[[LLama.OldVersion.Completion, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Completion](./llama.oldversion.completion.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class CompletionChoice : System.IEquatable`1[[LLama.OldVersion.CompletionChoice, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class CompletionChoice : System.IEquatable`1[[LLama.OldVersion.CompletionChoice, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [CompletionChoice](./llama.oldversion.completionchoice.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class CompletionChunk : System.IEquatable`1[[LLama.OldVersion.CompletionChunk, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class CompletionChunk : System.IEquatable`1[[LLama.OldVersion.CompletionChunk, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [CompletionChunk](./llama.oldversion.completionchunk.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class CompletionLogprobs : System.IEquatable`1[[LLama.OldVersion.CompletionLogprobs, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class CompletionLogprobs : System.IEquatable`1[[LLama.OldVersion.CompletionLogprobs, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [CompletionLogprobs](./llama.oldversion.completionlogprobs.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class CompletionUsage : System.IEquatable`1[[LLama.OldVersion.CompletionUsage, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class CompletionUsage : System.IEquatable`1[[LLama.OldVersion.CompletionUsage, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [CompletionUsage](./llama.oldversion.completionusage.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class Embedding : System.IEquatable`1[[LLama.OldVersion.Embedding, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class Embedding : System.IEquatable`1[[LLama.OldVersion.Embedding, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Embedding](./llama.oldversion.embedding.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class EmbeddingData : System.IEquatable`1[[LLama.OldVersion.EmbeddingData, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class EmbeddingData : System.IEquatable`1[[LLama.OldVersion.EmbeddingData, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [EmbeddingData](./llama.oldversion.embeddingdata.md)<br>

View File

@ -2,8 +2,14 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class EmbeddingUsage : System.IEquatable`1[[LLama.OldVersion.EmbeddingUsage, LLamaSharp, Version=0.4.0.0, Culture=neutral, PublicKeyToken=null]]
public class EmbeddingUsage : System.IEquatable`1[[LLama.OldVersion.EmbeddingUsage, LLamaSharp, Version=0.5.0.0, Culture=neutral, PublicKeyToken=null]]
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [EmbeddingUsage](./llama.oldversion.embeddingusage.md)<br>

View File

@ -2,6 +2,12 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public interface IChatModel
```

View File

@ -2,6 +2,12 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class LLamaEmbedder : System.IDisposable
```

View File

@ -2,6 +2,12 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public class LLamaModel : IChatModel, System.IDisposable
```

View File

@ -2,6 +2,12 @@
Namespace: LLama.OldVersion
#### Caution
The entire LLama.OldVersion namespace will be removed
---
```csharp
public struct LLamaParams
```

View File

@ -1,101 +0,0 @@
# ResettableLLamaModel
Namespace: LLama
A LLamaModel that can be reset. Note that using this class consumes about 10% more memory.
```csharp
public class ResettableLLamaModel : LLamaModel, System.IDisposable
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [LLamaModel](./llama.llamamodel.md) → [ResettableLLamaModel](./llama.resettablellamamodel.md)<br>
Implements [IDisposable](https://docs.microsoft.com/en-us/dotnet/api/system.idisposable)
## Properties
### **OriginalState**
The initial state of the model
```csharp
public Byte[] OriginalState { get; set; }
```
#### Property Value
[Byte[]](https://docs.microsoft.com/en-us/dotnet/api/system.byte)<br>
### **ContextSize**
The context size.
```csharp
public int ContextSize { get; }
```
#### Property Value
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **Params**
The model params set for this model.
```csharp
public ModelParams Params { get; set; }
```
#### Property Value
[ModelParams](./llama.common.modelparams.md)<br>
### **NativeHandle**
The native handle, which is passed to the native APIs. Avoid using it
unless you understand how the native APIs are used.
```csharp
public SafeLLamaContextHandle NativeHandle { get; }
```
#### Property Value
[SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
### **Encoding**
The encoding set for this model to deal with text input.
```csharp
public Encoding Encoding { get; }
```
#### Property Value
[Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
## Constructors
### **ResettableLLamaModel(ModelParams, String)**
```csharp
public ResettableLLamaModel(ModelParams Params, string encoding)
```
#### Parameters
`Params` [ModelParams](./llama.common.modelparams.md)<br>
`encoding` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
## Methods
### **Reset()**
Reset the state to the initial state.
```csharp
public void Reset()
```

View File

@ -13,17 +13,17 @@ Implements [ILLamaExecutor](./llama.abstractions.illamaexecutor.md)
## Properties
### **Model**
### **Context**
The mode used by the executor.
The context used by the executor.
```csharp
public LLamaModel Model { get; }
public LLamaContext Context { get; }
```
#### Property Value
[LLamaModel](./llama.llamamodel.md)<br>
[LLamaContext](./llama.llamacontext.md)<br>
## Methods
@ -111,17 +111,17 @@ protected abstract void PreprocessInputs(string text, InferStateArgs args)
`args` [InferStateArgs](./llama.statefulexecutorbase.inferstateargs.md)<br>
### **PostProcess(InferenceParams, InferStateArgs, IEnumerable`1&)**
### **PostProcess(IInferenceParams, InferStateArgs, IEnumerable`1&)**
Do some post processing after the inference.
```csharp
protected abstract bool PostProcess(InferenceParams inferenceParams, InferStateArgs args, IEnumerable`1& extraOutputs)
protected abstract bool PostProcess(IInferenceParams inferenceParams, InferStateArgs args, IEnumerable`1& extraOutputs)
```
#### Parameters
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`args` [InferStateArgs](./llama.statefulexecutorbase.inferstateargs.md)<br>
@ -131,17 +131,17 @@ protected abstract bool PostProcess(InferenceParams inferenceParams, InferStateA
[Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
### **InferInternal(InferenceParams, InferStateArgs)**
### **InferInternal(IInferenceParams, InferStateArgs)**
The core inference logic.
```csharp
protected abstract void InferInternal(InferenceParams inferenceParams, InferStateArgs args)
protected abstract void InferInternal(IInferenceParams inferenceParams, InferStateArgs args)
```
#### Parameters
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`args` [InferStateArgs](./llama.statefulexecutorbase.inferstateargs.md)<br>
@ -193,19 +193,19 @@ public abstract void LoadState(string filename)
`filename` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **Infer(String, InferenceParams, CancellationToken)**
### **Infer(String, IInferenceParams, CancellationToken)**
Execute the inference.
```csharp
public IEnumerable<string> Infer(string text, InferenceParams inferenceParams, CancellationToken cancellationToken)
public IEnumerable<string> Infer(string text, IInferenceParams inferenceParams, CancellationToken cancellationToken)
```
#### Parameters
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`cancellationToken` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>
@ -213,19 +213,19 @@ public IEnumerable<string> Infer(string text, InferenceParams inferenceParams, C
[IEnumerable&lt;String&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ienumerable-1)<br>
### **InferAsync(String, InferenceParams, CancellationToken)**
### **InferAsync(String, IInferenceParams, CancellationToken)**
Execute the inference asynchronously.
```csharp
public IAsyncEnumerable<string> InferAsync(string text, InferenceParams inferenceParams, CancellationToken cancellationToken)
public IAsyncEnumerable<string> InferAsync(string text, IInferenceParams inferenceParams, CancellationToken cancellationToken)
```
#### Parameters
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`cancellationToken` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>
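A sketch of driving a `StatefulExecutorBase` subclass through the public `Infer` API. It assumes an `InteractiveExecutor` can be constructed from a `LLamaContext` (its constructor is not shown on this page) and that `context` and `inferenceParams` were created elsewhere:

```csharp
// Hypothetical usage sketch for a stateful executor.
var executor = new InteractiveExecutor(context);

// Infer yields the generated text piece by piece.
foreach (var piece in executor.Infer("Hello, ", inferenceParams, CancellationToken.None))
{
    Console.Write(piece);
}
```

Because the executor is stateful, subsequent `Infer` calls continue the same conversation rather than starting from scratch.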

View File

@ -14,46 +14,65 @@ Implements [ILLamaExecutor](./llama.abstractions.illamaexecutor.md)
## Properties
### **Model**
### **Context**
The mode used by the executor when running the inference.
The context used by the executor when running the inference.
```csharp
public LLamaModel Model { get; }
public LLamaContext Context { get; private set; }
```
#### Property Value
[LLamaModel](./llama.llamamodel.md)<br>
[LLamaContext](./llama.llamacontext.md)<br>
## Constructors
### **StatelessExecutor(LLamaModel)**
### **StatelessExecutor(LLamaWeights, IModelParams)**
Create a new stateless executor which will use the given model
```csharp
public StatelessExecutor(LLamaModel model)
public StatelessExecutor(LLamaWeights weights, IModelParams params)
```
#### Parameters
`model` [LLamaModel](./llama.llamamodel.md)<br>
The LLama model.
`weights` [LLamaWeights](./llama.llamaweights.md)<br>
`params` [IModelParams](./llama.abstractions.imodelparams.md)<br>
### **StatelessExecutor(LLamaContext)**
#### Caution
Use the constructor which automatically creates contexts using the LLamaWeights
---
Create a new stateless executor which will use the model used to create the given context
```csharp
public StatelessExecutor(LLamaContext context)
```
#### Parameters
`context` [LLamaContext](./llama.llamacontext.md)<br>
## Methods
### **Infer(String, InferenceParams, CancellationToken)**
### **Infer(String, IInferenceParams, CancellationToken)**
```csharp
public IEnumerable<string> Infer(string text, InferenceParams inferenceParams, CancellationToken cancellationToken)
public IEnumerable<string> Infer(string text, IInferenceParams inferenceParams, CancellationToken cancellationToken)
```
#### Parameters
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`cancellationToken` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>
@ -61,19 +80,19 @@ public IEnumerable<string> Infer(string text, InferenceParams inferenceParams, C
[IEnumerable&lt;String&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ienumerable-1)<br>
### **InferAsync(String, InferenceParams, CancellationToken)**
### **InferAsync(String, IInferenceParams, CancellationToken)**
```csharp
public IAsyncEnumerable<string> InferAsync(string text, InferenceParams inferenceParams, CancellationToken token)
public IAsyncEnumerable<string> InferAsync(string text, IInferenceParams inferenceParams, CancellationToken cancellationToken)
```
#### Parameters
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`inferenceParams` [InferenceParams](./llama.common.inferenceparams.md)<br>
`inferenceParams` [IInferenceParams](./llama.abstractions.iinferenceparams.md)<br>
`token` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>
`cancellationToken` [CancellationToken](https://docs.microsoft.com/en-us/dotnet/api/system.threading.cancellationtoken)<br>
#### Returns
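A sketch of the new (0.5.0) construction path, following the `LLamaWeights` constructor above. The model path is a placeholder, and the `ModelParams`/`InferenceParams` defaults are assumed to be usable as-is:

```csharp
// Hypothetical usage sketch for StatelessExecutor (requires an async context).
var modelParams = new ModelParams("<path-to-model>.gguf");
using var weights = LLamaWeights.LoadFromFile(modelParams);
var executor = new StatelessExecutor(weights, modelParams);

await foreach (var piece in executor.InferAsync(
    "Question: what is a llama?\nAnswer: ",
    new InferenceParams(),
    CancellationToken.None))
{
    Console.Write(piece);
}
```

Unlike the stateful executors, each `Infer`/`InferAsync` call here starts from a fresh context, so no conversation state is carried between calls.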


docs/xmldocs/llama.utils.md Normal file
View File

@ -0,0 +1,157 @@
# Utils
Namespace: LLama
Assorted llama utilities
```csharp
public static class Utils
```
Inheritance [Object](https://docs.microsoft.com/en-us/dotnet/api/system.object) → [Utils](./llama.utils.md)
## Methods
### **InitLLamaContextFromModelParams(IModelParams)**
#### Caution
Use LLamaWeights.LoadFromFile and LLamaWeights.CreateContext instead
---
```csharp
public static SafeLLamaContextHandle InitLLamaContextFromModelParams(IModelParams params)
```
#### Parameters
`params` [IModelParams](./llama.abstractions.imodelparams.md)<br>
#### Returns
[SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
### **Tokenize(SafeLLamaContextHandle, String, Boolean, Encoding)**
#### Caution
Use SafeLLamaContextHandle Tokenize method instead
---
```csharp
public static IEnumerable<int> Tokenize(SafeLLamaContextHandle ctx, string text, bool add_bos, Encoding encoding)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`text` [String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
`add_bos` [Boolean](https://docs.microsoft.com/en-us/dotnet/api/system.boolean)<br>
`encoding` [Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
#### Returns
[IEnumerable&lt;Int32&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.collections.generic.ienumerable-1)<br>
### **GetLogits(SafeLLamaContextHandle, Int32)**
#### Caution
Use SafeLLamaContextHandle GetLogits method instead
---
```csharp
public static Span<float> GetLogits(SafeLLamaContextHandle ctx, int length)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`length` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
#### Returns
[Span&lt;Single&gt;](https://docs.microsoft.com/en-us/dotnet/api/system.span-1)<br>
### **Eval(SafeLLamaContextHandle, Int32[], Int32, Int32, Int32, Int32)**
#### Caution
Use SafeLLamaContextHandle Eval method instead
---
```csharp
public static int Eval(SafeLLamaContextHandle ctx, Int32[] tokens, int startIndex, int n_tokens, int n_past, int n_threads)
```
#### Parameters
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`tokens` [Int32[]](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`startIndex` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_tokens` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_past` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`n_threads` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
#### Returns
[Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
### **TokenToString(Int32, SafeLLamaContextHandle, Encoding)**
#### Caution
Use SafeLLamaContextHandle TokenToString method instead
---
```csharp
public static string TokenToString(int token, SafeLLamaContextHandle ctx, Encoding encoding)
```
#### Parameters
`token` [Int32](https://docs.microsoft.com/en-us/dotnet/api/system.int32)<br>
`ctx` [SafeLLamaContextHandle](./llama.native.safellamacontexthandle.md)<br>
`encoding` [Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
#### Returns
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
### **PtrToString(IntPtr, Encoding)**
#### Caution
No longer used internally by LlamaSharp
---
```csharp
public static string PtrToString(IntPtr ptr, Encoding encoding)
```
#### Parameters
`ptr` [IntPtr](https://docs.microsoft.com/en-us/dotnet/api/system.intptr)<br>
`encoding` [Encoding](https://docs.microsoft.com/en-us/dotnet/api/system.text.encoding)<br>
#### Returns
[String](https://docs.microsoft.com/en-us/dotnet/api/system.string)<br>
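All of the `Utils` members above are deprecated, each Caution note naming its replacement. A migration sketch for `Tokenize`; the exact signature of the replacement instance method on `SafeLLamaContextHandle` is not shown on this page, so the new call shape is an assumption:

```csharp
// Old (deprecated) static helper:
IEnumerable<int> tokens = Utils.Tokenize(ctx, "Hello world", add_bos: true, Encoding.UTF8);

// New: the equivalent instance method on SafeLLamaContextHandle
// (hypothetical call shape, per the Caution note above).
var tokens2 = ctx.Tokenize("Hello world", true, Encoding.UTF8);
```

The other members follow the same pattern: `GetLogits` and `Eval` move onto `SafeLLamaContextHandle`, and context creation moves to `LLamaWeights.LoadFromFile` plus `LLamaWeights.CreateContext`.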

View File

@ -5,12 +5,12 @@ nav:
- Architecture: Architecture.md
- Tricks for FAQ: Tricks.md
- Contributing Guide: ContributingGuide.md
- LLamaModel:
- Model Parameters: LLamaModel/parameters.md
- Tokenization: LLamaModel/tokenization.md
- Get Embeddings: LLamaModel/embeddings.md
- Quantization: LLamaModel/quantization.md
- Save/Load State: LLamaModel/save-load-state.md
- LLamaContext:
- Context Parameters: LLamaContext/parameters.md
- Tokenization: LLamaContext/tokenization.md
- Get Embeddings: LLamaContext/embeddings.md
- Quantization: LLamaContext/quantization.md
- Save/Load State: LLamaContext/save-load-state.md
- LLamaExecutors:
- Inference Parameters: LLamaExecutors/parameters.md
- Text-to-Text APIs: LLamaExecutors/text-to-text-apis.md
@ -24,6 +24,7 @@ nav:
- Chinese: NonEnglishUsage/Chinese.md
- High-level Applications:
- BotSharp: HighLevelApps/bot-sharp.md
- semantic-kernel: HighLevelApps/semantic-kernel.md
- More:
- Logger: More/log.md
- Examples:
@ -39,7 +40,9 @@ nav:
- API Reference:
- index: ./xmldocs/index.md
- llama.abstractions.ihistorytransform: ./xmldocs/llama.abstractions.ihistorytransform.md
- llama.abstractions.iinferenceparams: ./xmldocs/llama.abstractions.iinferenceparams.md
- llama.abstractions.illamaexecutor: ./xmldocs/llama.abstractions.illamaexecutor.md
- llama.abstractions.imodelparams: ./xmldocs/llama.abstractions.imodelparams.md
- llama.abstractions.itextstreamtransform: ./xmldocs/llama.abstractions.itextstreamtransform.md
- llama.abstractions.itexttransform: ./xmldocs/llama.abstractions.itexttransform.md
- llama.chatsession: ./xmldocs/llama.chatsession.md
@ -49,24 +52,44 @@ nav:
- llama.common.illamalogger: ./xmldocs/llama.common.illamalogger.md
- llama.common.inferenceparams: ./xmldocs/llama.common.inferenceparams.md
- llama.common.llamadefaultlogger: ./xmldocs/llama.common.llamadefaultlogger.md
- llama.common.mirostatetype: ./xmldocs/llama.common.mirostatetype.md
- llama.common.mirostattype: ./xmldocs/llama.common.mirostattype.md
- llama.common.modelparams: ./xmldocs/llama.common.modelparams.md
- llama.exceptions.grammarexpectedname: ./xmldocs/llama.exceptions.grammarexpectedname.md
- llama.exceptions.grammarexpectednext: ./xmldocs/llama.exceptions.grammarexpectednext.md
- llama.exceptions.grammarexpectedprevious: ./xmldocs/llama.exceptions.grammarexpectedprevious.md
- llama.exceptions.grammarformatexception: ./xmldocs/llama.exceptions.grammarformatexception.md
- llama.exceptions.grammarunexpectedcharaltelement: ./xmldocs/llama.exceptions.grammarunexpectedcharaltelement.md
- llama.exceptions.grammarunexpectedcharrngelement: ./xmldocs/llama.exceptions.grammarunexpectedcharrngelement.md
- llama.exceptions.grammarunexpectedendelement: ./xmldocs/llama.exceptions.grammarunexpectedendelement.md
- llama.exceptions.grammarunexpectedendofinput: ./xmldocs/llama.exceptions.grammarunexpectedendofinput.md
- llama.exceptions.grammarunexpectedhexcharscount: ./xmldocs/llama.exceptions.grammarunexpectedhexcharscount.md
- llama.exceptions.grammarunknownescapecharacter: ./xmldocs/llama.exceptions.grammarunknownescapecharacter.md
- llama.exceptions.runtimeerror: ./xmldocs/llama.exceptions.runtimeerror.md
- llama.extensions.dictionaryextension: ./xmldocs/llama.extensions.dictionaryextension.md
- llama.extensions.imodelparamsextensions: ./xmldocs/llama.extensions.imodelparamsextensions.md
- llama.extensions.keyvaluepairextensions: ./xmldocs/llama.extensions.keyvaluepairextensions.md
- llama.grammars.grammar: ./xmldocs/llama.grammars.grammar.md
- llama.grammars.grammarrule: ./xmldocs/llama.grammars.grammarrule.md
- llama.instructexecutor: ./xmldocs/llama.instructexecutor.md
- llama.interactiveexecutor: ./xmldocs/llama.interactiveexecutor.md
- llama.llamacontext: ./xmldocs/llama.llamacontext.md
- llama.llamaembedder: ./xmldocs/llama.llamaembedder.md
- llama.llamamodel: ./xmldocs/llama.llamamodel.md
- llama.llamaquantizer: ./xmldocs/llama.llamaquantizer.md
- llama.llamatransforms: ./xmldocs/llama.llamatransforms.md
- llama.llamaweights: ./xmldocs/llama.llamaweights.md
- llama.native.llamacontextparams: ./xmldocs/llama.native.llamacontextparams.md
- llama.native.llamaftype: ./xmldocs/llama.native.llamaftype.md
- llama.native.llamagrammarelement: ./xmldocs/llama.native.llamagrammarelement.md
- llama.native.llamagrammarelementtype: ./xmldocs/llama.native.llamagrammarelementtype.md
- llama.native.llamamodelquantizeparams: ./xmldocs/llama.native.llamamodelquantizeparams.md
- llama.native.llamatokendata: ./xmldocs/llama.native.llamatokendata.md
- llama.native.llamatokendataarray: ./xmldocs/llama.native.llamatokendataarray.md
- llama.native.llamatokendataarraynative: ./xmldocs/llama.native.llamatokendataarraynative.md
- llama.native.nativeapi: ./xmldocs/llama.native.nativeapi.md
- llama.native.safellamacontexthandle: ./xmldocs/llama.native.safellamacontexthandle.md
- llama.native.safellamagrammarhandle: ./xmldocs/llama.native.safellamagrammarhandle.md
- llama.native.safellamahandlebase: ./xmldocs/llama.native.safellamahandlebase.md
- llama.native.safellamamodelhandle: ./xmldocs/llama.native.safellamamodelhandle.md
- llama.native.samplingapi: ./xmldocs/llama.native.samplingapi.md
- llama.oldversion.chatcompletion: ./xmldocs/llama.oldversion.chatcompletion.md
- llama.oldversion.chatcompletionchoice: ./xmldocs/llama.oldversion.chatcompletionchoice.md
- llama.oldversion.chatcompletionchunk: ./xmldocs/llama.oldversion.chatcompletionchunk.md
@ -88,9 +111,9 @@ nav:
- llama.oldversion.llamaembedder: ./xmldocs/llama.oldversion.llamaembedder.md
- llama.oldversion.llamamodel: ./xmldocs/llama.oldversion.llamamodel.md
- llama.oldversion.llamaparams: ./xmldocs/llama.oldversion.llamaparams.md
- llama.resettablellamamodel: ./xmldocs/llama.resettablellamamodel.md
- llama.statefulexecutorbase: ./xmldocs/llama.statefulexecutorbase.md
- llama.statelessexecutor: ./xmldocs/llama.statelessexecutor.md
- llama.utils: ./xmldocs/llama.utils.md
theme:
name: material