* Create basic tokenizer playground app
* Default to no display when user adding large body of text
* Optimize BPE algorithm
  - Use map instead of object for `bpe_ranks`
  - Replace reduction in BPE algorithm with for loop
  - Avoid conversions between sets and arrays
* Use for loop to avoid stack issues with `.push(...items)`
* Fix `mergeArrays` typing
* Remove unnecessary try-catch block in BPE
* Add Llama, T5, and BERT tokenizers to the playground
* Improve how BERT/T5 tokens are displayed
* Improve how token margins are displayed
* Use `Map` for cache
* Add efficient heap-based priority queue implementation
* Add more unit tests for LlamaTokenizer

  Selected from https://github.com/belladoreai/llama-tokenizer-js/blob/master/llama-tokenizer.js#L381-L452
* Implement priority-queue-based BPE algorithm
* Remove old code
* Update `bpe` docstring
* Add `data-structures` page to docs
* Update JSDoc for data-structures.js
* Update data-structures.js
* Move `TokenLattice` and `CharTrie` to data-structures module
* Minor refactoring
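
The heap-based priority queue mentioned in the changes above can be illustrated with a minimal sketch. It is not the code from this commit: the class name `MinHeap`, its methods, and the sample merge candidates are all hypothetical, chosen only to show how a binary heap returns the lowest-ranked item first, which is the property a priority-queue-based BPE merge loop relies on.

```javascript
// Illustrative binary min-heap; names here are assumptions, not the library's API.
class MinHeap {
  constructor(compare = (a, b) => a - b) {
    this.items = [];
    this.compare = compare; // negative result means `a` has higher priority
  }

  get size() {
    return this.items.length;
  }

  // Append the value, then restore heap order by moving it up.
  push(value) {
    this.items.push(value);
    this._siftUp(this.items.length - 1);
  }

  // Remove and return the highest-priority item (undefined when empty).
  pop() {
    if (this.items.length === 0) return undefined;
    const top = this.items[0];
    const last = this.items.pop();
    if (this.items.length > 0) {
      this.items[0] = last;
      this._siftDown(0);
    }
    return top;
  }

  _siftUp(i) {
    while (i > 0) {
      const parent = (i - 1) >> 1;
      if (this.compare(this.items[i], this.items[parent]) >= 0) break;
      [this.items[i], this.items[parent]] = [this.items[parent], this.items[i]];
      i = parent;
    }
  }

  _siftDown(i) {
    const n = this.items.length;
    while (true) {
      const left = 2 * i + 1;
      const right = left + 1;
      let smallest = i;
      if (left < n && this.compare(this.items[left], this.items[smallest]) < 0) smallest = left;
      if (right < n && this.compare(this.items[right], this.items[smallest]) < 0) smallest = right;
      if (smallest === i) break;
      [this.items[i], this.items[smallest]] = [this.items[smallest], this.items[i]];
      i = smallest;
    }
  }
}

// Hypothetical merge candidates: a lower rank means the pair should merge earlier.
const queue = new MinHeap((a, b) => a.rank - b.rank);
for (const candidate of [
  { pair: 'l o', rank: 7 },
  { pair: 'h e', rank: 2 },
  { pair: 'l l', rank: 5 },
]) {
  queue.push(candidate);
}

// Drain the queue to see the order in which pairs would be considered.
const order = [];
while (queue.size > 0) {
  order.push(queue.pop().pair);
}
```

Both `push` and `pop` take O(log n) time, so picking the next best merge does not require rescanning every candidate pair on each step.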

| Name |
| --- |
| `vite.svg` |