Commit Graph

980 Commits

Author SHA1 Message Date
Joshua Lochner 5d924dca69 Update JSDoc 2023-12-26 02:55:40 +02:00
Joshua Lochner d327d31284 Add support for CLIPSeg models 2023-12-26 02:54:01 +02:00
Joshua Lochner 7636a1c416
Add spaces template link to README (#467) 2023-12-23 19:58:00 +02:00
Lian LF 4112429d38
Update Next.js Dockerfile HOSTNAME (#461)
See https://github.com/vercel/next.js/pull/54342 and https://github.com/vercel/next.js/issues/54093
2023-12-20 01:53:45 +02:00
Joshua Lochner 0bf6e6712f [version] Update to 2.12.1 2023-12-18 23:25:00 +02:00
Joshua Lochner 1427125dc3
Update jinja dependency (#459)
* Make `@huggingface/jinja` a dependency

* Update package-lock.json

* Update JSDoc
2023-12-18 23:22:24 +02:00
Joshua Lochner 61cb4f5c3a
Include `@huggingface/jinja` in exported webpack build (#458)
In future, it would probably be better to import it dynamically; however, doing so currently affects importing from CDNs.
2023-12-18 20:23:40 +02:00
Joshua Lochner 81aab022ff [version] Update to 2.12.0 2023-12-18 17:04:41 +02:00
Joshua Lochner d4f7cd5024
Add support for chat templates (#408)
* Add basic support for chat templates

* Cleanup

* JSDoc improvements

* Support conversion of user-defined functions

* Cleanup

* Fix function creation

* Add unit tests for templates

* Cleanup

* Improve JSDoc

* Add missing return types

* Add chat templates docs to table of contents

* Add support for logical negation

* Fix nested logical negation

* Add unit tests for logical operators

* Add loop variables

* Add support for `RuntimeValue` built-in functions

* Add unit tests for string instance methods

* Fix conversion of normal function to `FunctionValue`

* Update object method unit tests

* Save chat template to tokenizer_config.json during conversion

* Fix `raise_exception` error

* Add `!=` operator for booleans

* Remember to increment loop index

* Cleanup for loop evaluator

* Use `is` helper function

* Add support for text nodes

i.e., non-Jinja statements/expressions

* Add auto-generated templating tests

* Update unit tests

* Remove unused function

* Add default chat templates

* Use repo with up-to-date tokenizer config

* Temporarily disable zephyr test

* Delete templates.test.js

* Move Jinja functionality to `@huggingface/jinja`

* Fix template cache type

* Update chat template unit tests

* Update `@huggingface/jinja` version

* Fix default llama2 system prompt usage

* Add unit test for llama2 w/o chat template set

* Update jinja version

* Update jinja version

* Add unit test for user-defined chat templates

Example from https://discuss.huggingface.co/t/issue-with-llama-2-chat-template-and-out-of-date-documentation/61645/3

* Add `AddedToken` for improved tokenization

* Add example usage for chat templates

* Add 'first' Metaspace pretokenizer prepend scheme

* Formatting

* Update wav2vec2 converter special tokens whitespace split

* Fix Metaspace pretokenizer split criteria

* Update inputs of `PreTokenizerSequence`

* Improve Metaspace pretokenizer

* Update llama tokenizer tests

* Improve handling of legacy llama tokenizer

* Re-enable SPM tests

* Add static tokenizer test cases

* Add llama2 static tests

* Allow user to override legacy tokenizer behaviour in `.from_pretrained`

* Add legacy tokenizer unit tests

* Bump jinja version to 0.1.0
2023-12-18 17:00:50 +02:00
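
A minimal sketch of the chat-template support added in this PR (the tokenizer id and messages below are illustrative):

```js
import { AutoTokenizer } from '@xenova/transformers';

// Load a tokenizer whose tokenizer_config.json defines a chat template
// (model id is illustrative).
const tokenizer = await AutoTokenizer.from_pretrained('Xenova/mistral-tokenizer-v1');

const messages = [
  { role: 'user', content: 'Hello!' },
  { role: 'assistant', content: 'Hi! How can I help you today?' },
  { role: 'user', content: 'Tell me a joke.' },
];

// Render the conversation with the model's Jinja chat template
// (powered by @huggingface/jinja); returns a string when tokenize is false.
const text = tokenizer.apply_chat_template(messages, {
  tokenize: false,
  add_generation_prompt: true,
});
```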
Joshua Lochner 6129e45b2b [version] Update to 2.11.0 2023-12-13 15:19:12 +02:00
Joshua Lochner 2de085b6e5
Add support for ChineseCLIP models (#455)
* Update `VitMatteImageProcessor` test comment

* Add support for ChineseCLIP models

* Add chinese-clip to list of supported models

* Sort zero-shot-image-classification results by score (desc)

* Update expected zero-shot image classification output
2023-12-13 14:47:17 +02:00
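
A hedged usage sketch for the new ChineseCLIP support via the `zero-shot-image-classification` pipeline (model id, image URL, and labels are illustrative):

```js
import { pipeline } from '@xenova/transformers';

// Model id is illustrative; any converted ChineseCLIP checkpoint should work similarly.
const classifier = await pipeline(
  'zero-shot-image-classification',
  'Xenova/chinese-clip-vit-base-patch16',
);

const url = 'https://example.com/cat.jpg';
const output = await classifier(url, ['猫', '狗', '鸟']);
// Results are now sorted by score in descending order,
// e.g. [{ label: '猫', score: 0.97 }, ...]
```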
Joshua Lochner b978ff8ce4
Add support for ViTMatte models (#448)
* Add support for `VitMatte` models

* Add `VitMatteImageProcessor`

* Add `VitMatteImageProcessor` unit test

* Fix typo

* Add example code for `VitMatteForImageMatting`

* Fix JSDoc

* Fix typo
2023-12-13 02:18:36 +02:00
Joshua Lochner 80d22dae7b
Add support for ESM models (#447)
* Add support for ESM models

* Add ESM tokenizer conversion methods

* Add special test cases for ESM tokenizer

* Add special tokens in conversion script

* Do not save decoder

* Add special tokens tokenizer test

* Join tokens with space if decoder is null

* Treat all tokens as added tokens

* Use `WhitespaceSplit` pretokenizer

* `<eos>` and `<bos>` are not special tokens

* Update more supported ESM models

* Add `--tokenizer_id` to conversion script

* Add supported models comments
2023-12-13 02:10:27 +02:00
Joshua Lochner 0d2f05def5
Add support for ELECTRA models (#446) 2023-12-12 18:56:36 +02:00
Joshua Lochner 0ffdc8d9ca
Add support for Hubert models (#449) 2023-12-12 18:32:16 +02:00
Joshua Lochner 47b1a873a2
Add support for ConvBERT models (#445)
* Add support for `ConvBERT` models

* Fix `ConvBertTokenizer`

* Fix tokenizer
2023-12-12 17:54:27 +02:00
Joshua Lochner 9308f880c5
Add support for DINOv2 models (#444)
* Add dinov2 models

* Add `BitImageProcessor`

* Update list of supported models
2023-12-12 17:42:48 +02:00
Joshua Lochner 09c760e817
Add support for Phi models (#443) 2023-12-12 17:18:58 +02:00
Joshua Lochner 8c465a95be
Fix tensor inheritance (#451)
* Do not extend from ONNX tensor (fix #437)

* Fix typing issues

* Typing improvements

* Apply suggestions

* Update tensor import type
2023-12-12 17:17:13 +02:00
Joshua Lochner 2cd2997d57
Add CLS pooling option to `feature-extraction` pipeline (#450) 2023-12-12 00:19:59 +02:00
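
A minimal sketch of the new CLS pooling option (model id and input text are illustrative):

```js
import { pipeline } from '@xenova/transformers';

// Model id is illustrative.
const extractor = await pipeline('feature-extraction', 'Xenova/bert-base-uncased');

// Use the new `pooling: 'cls'` option (alongside the existing 'mean' and 'none').
const output = await extractor('This is a sentence.', {
  pooling: 'cls',
  normalize: true,
});
// `output` is a Tensor of shape [1, hidden_size]
```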
Joshua Lochner 8e49e5e638
Create CLAP demo (#442) 2023-12-08 19:43:10 +02:00
Joshua Lochner ff3019fc05
Add example usage for `SpeechT5ForSpeechToText` (#438) 2023-12-06 22:17:11 +02:00
Joshua Lochner cb8a5961df [version] Update to 2.10.1 2023-12-06 18:48:41 +02:00
Joshua Lochner 374c1052a7
Standardize `HF_ACCESS_TOKEN` -> `HF_TOKEN` (#431) 2023-12-06 18:33:50 +02:00
Joshua Lochner ceb75dccf9
Update vite version for example applications (#435) 2023-12-06 18:33:15 +02:00
Joshua Lochner d318b1f243
Fix zero-shot-object-detection `percentage` option (#434) 2023-12-06 18:23:53 +02:00
Joshua Lochner c0c746b056 Update issue templates 2023-12-06 17:15:29 +02:00
Joshua Lochner 9a8c664c2c
Documentation improvements (#299)
* Add link to optimum docs for supported architectures

Closes #288

* Refactor `SUPPORTED_MODELS` dict to include task

* Update example model id

* Update list of supported models

* Update generate_tests.py

* Remove requirement of `output_attentions` revision

* Add demo site to examples section (closes #233)

* Fix typo

* Include examples in docs index

* Update github issue templates

* Create config.yml

* Order supported models

* Cleanup

* Update 4_feature-request.yml
2023-12-06 17:01:36 +02:00
Joshua Lochner 57487744e7 [version] Update to 2.10.0 2023-12-05 15:34:53 +02:00
Joshua Lochner ac466e6198 Remove old workflow files 2023-12-05 12:43:26 +02:00
Joshua Lochner b1a4b58e86 Update processor unit test's max execution time 2023-12-05 12:37:51 +02:00
Joshua Lochner c5ed1d70ca
Add support for CLAP (`zero-shot-audio-classification`) and Audio Spectrogram Transformer (`audio-classification`) (#427)
* Add FFT unit tests

* Refactor maths.js and audio.js

* Refactor audio processors

* Add support for AST models

* Add another audio-classification example

* Add audio processing unit tests

* Implement `log_mel='dB'` in `spectrogram` function

* Add `ClapFeatureExtractor`

* Implement `ClapFeatureExtractor` unit tests

* Add support for `CLAP`

* Add `ZeroShotAudioClassificationPipeline`

* Add listed support for `zero-shot-audio-classification` pipeline tag

* Cleanup

* `let` -> `const`

* Update `mel_filter_bank` unit test

* Add `'Xenova/tiny-random-ClapModel'`

* Add `ClapAudioModelWithProjection` and `ClapTextModelWithProjection`

* Move audio validation to helper function

* Optimize `mel_filter_bank` computation

-30ms

* Update mel filters unit test

* Cleanup

* Optimizations

* Fix jsdoc

* Optimizations

* Add WIP conversion scripts

Will be updated once https://github.com/huggingface/optimum/pull/1552 is merged
2023-12-05 12:17:42 +02:00
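
A hedged sketch of the new `zero-shot-audio-classification` pipeline (model id, audio URL, and labels are illustrative):

```js
import { pipeline } from '@xenova/transformers';

// Model id is illustrative (a converted CLAP checkpoint).
const classifier = await pipeline(
  'zero-shot-audio-classification',
  'Xenova/clap-htsat-unfused',
);

const audio = 'https://example.com/dog_barking.wav';
const output = await classifier(audio, ['dog barking', 'vacuum cleaner']);
// e.g. [{ label: 'dog barking', score: 0.99 }, { label: 'vacuum cleaner', score: 0.01 }]
```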
Joshua Lochner 6f05572854
Add support for ConvNeXT (V1+V2) models (#428)
* Add support for `convnext` and `convnextv2` models

* Fix typo
2023-12-02 17:33:21 +02:00
Joshua Lochner 3da3841811
Support decoding of tensors (#416)
* Support decoding of tensors (Closes #362)

* Remove debug line
2023-12-02 16:17:57 +02:00
Joshua Lochner 768a2e26d7 [version] Update to 2.9.0 2023-11-21 14:51:11 +02:00
Joshua Lochner 83dfa4718e
Add `depth-estimation` w/ DPT and GLPN (#389)
* Add `size` getter to `RawImage`

* Add `DPTFeatureExtractor`

* Add depth-estimation w/ DPT models

* Add GLPN models for depth estimation

* Add missing import in example

* Add `DPTFeatureExtractor` processor test

* Add unit test for GLPN processor

* Add support for `GLPNFeatureExtractor`

Uses `size_divisor` to determine resize width and height

* Add `GLPNForDepthEstimation` example code

* Add DPT to list of supported models

* Add GLPN to list of supported models

* Add `DepthEstimationPipeline`

* Add listed support for depth estimation pipeline

* Add depth estimation pipeline unit tests

* Fix formatting

* Update `pipeline` JSDoc

* Fix typo from merge
2023-11-20 15:43:45 +02:00
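
A minimal sketch of the new `depth-estimation` pipeline (model id and image URL are illustrative; a GLPN checkpoint works the same way):

```js
import { pipeline } from '@xenova/transformers';

// Model id is illustrative (a converted DPT checkpoint).
const depth_estimator = await pipeline('depth-estimation', 'Xenova/dpt-hybrid-midas');

const url = 'https://example.com/room.jpg';
const { predicted_depth, depth } = await depth_estimator(url);
// `predicted_depth` is the raw Tensor; `depth` is a RawImage depth map
// that can be saved or drawn to a canvas.
```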
Joshua Lochner 5ddc4722f3
Add support for nougat models (`image-to-text`) (#391)
* Add `NougatTokenizer`

* Add nougat unit tests

* Add support for `NougatImageProcessor`

* Add `crop` function to `RawImage`

* Fix `RawImage` save function

OffscreenCanvas does not have a `toDataURL` function

* Add listed support for nougat models

* Fix `min`/`max` function typing

* Add unknown token to tokenizer class

* Implement `NoBadWordsLogitsProcessor`

* Use `NoBadWordsLogitsProcessor` in `generate`

* Fix regex group substitutions

Python uses \1, \2, etc. for group substitutions, but JavaScript uses $1, $2, etc.

* Create `regexSplit` helper function to split but keep delimiter

* Fix splitting for String pattern types

* Fix docstring
2023-11-20 15:14:11 +02:00
Joshua Lochner 7cf8a2c442
Add `zero-shot-object-detection` w/ OwlViT (#392)
* Set `batch_size=1` for owlvit exports

* Add support for owlvit models

* Update default quantization settings

* Add list of supported models

* Revert update of owlvit quantization settings

* Add `OwlViTProcessor`

* Move `get_bounding_box` to utils

* Add `ZeroShotObjectDetectionPipeline`

* Add unit tests

* Add owlvit processor test

* Add listed support for `zero-shot-object-detection`

* Add OWL-ViT to list of supported models

* Update README.md

* Fix typo from merge
2023-11-20 14:34:56 +02:00
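
A hedged sketch of the new `zero-shot-object-detection` pipeline (model id, image URL, and labels are illustrative):

```js
import { pipeline } from '@xenova/transformers';

// Model id is illustrative (a converted OWL-ViT checkpoint).
const detector = await pipeline('zero-shot-object-detection', 'Xenova/owlvit-base-patch32');

const url = 'https://example.com/street.jpg';
const candidate_labels = ['human face', 'car', 'traffic light'];
const output = await detector(url, candidate_labels, { threshold: 0.1, topk: 4 });
// e.g. [{ score: 0.83, label: 'car', box: { xmin, ymin, xmax, ymax } }, ...]
```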
Hermann Rolfes b5ef835c7d
Fix NaNs when using ORT proxy (#404)
* Move tensor clone for Worker ownership NaN issue

* Update src/models.js - Use conditional operator

Co-authored-by: Joshua Lochner <admin@xenova.com>

* Update src/models.js - Object.create(null)

Co-authored-by: Joshua Lochner <admin@xenova.com>

* tensor.js: remove "Object" type to fix types (since ONNX exports correct type now)

* models.js / validateInputs(): Remove promise/await because it is not needed
Use "tensor instanceof Tensor" check because otherwise validateInputs() thinks it has an input even if it doesn't

* Fix JSDoc

* Update JSDoc

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
2023-11-20 14:32:15 +02:00
Dominik Weckmüller ac0096e33d
Add default `token_type_ids` for multilingual-e5-* models (#403)
* Fix #267 & #324

Add default token_type_ids. Fix for multilingual-e5-* family.

* Add add_token_types import

* export `add_token_types`

* Improvements

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
2023-11-20 00:44:35 +02:00
Joshua Lochner b8719b12dd
Ensure WASM fallback does not crash in GH actions (#402)
* Ensure WASM fallback does not crash in GH actions

* Add unit test for WordPiece `max_input_chars_per_word`

* Cleanup

* Set max test concurrency to 1
2023-11-19 08:06:49 +02:00
Joshua Lochner 19daf2d3c1
Add jsDelivr stats to README (#395) 2023-11-18 12:59:15 +02:00
Joshua Lochner 6fc268cb23
Update sharp dependency version (#400) 2023-11-18 12:58:21 +02:00
Sam L'Huillier c8bbdd41f4
Implement max character check for WordPiece tokenizer (#398)
* Implement max character check per token

* Update maxInputCharsPerWord to max_input_chars_per_word

Co-authored-by: Joshua Lochner <admin@xenova.com>

* Update maxInputCharsPerWord to max_input_chars_per_word

Co-authored-by: Joshua Lochner <admin@xenova.com>

* Update to `??` (nullish coalescing)

Co-authored-by: Joshua Lochner <admin@xenova.com>

---------

Co-authored-by: Joshua Lochner <admin@xenova.com>
2023-11-17 21:48:55 +02:00
Victor Nogueira 4e4148cb5c
Add support for Grouped Query Attention on Llama Model (#393)
Resolves #388
2023-11-15 17:51:33 +02:00
Joshua Lochner 35d61f5cc9
Add `CLIPFeatureExtractor` (and tests) (#387) 2023-11-15 16:28:29 +02:00
Joshua Lochner c98073042f [version] Update to 2.8.0 2023-11-09 18:00:58 +02:00
Joshua Lochner 73a99ba0af
Add image-to-image task w/ Swin2SR (for super-resolution) (#381)
* Add `Swin2SRImageProcessor`

* Add `RawImage.fromTensor` helper function

* Add clamp tensor function

* Add support for `.to` data type conversion

* Add `round` tensor function

* Add support for `mul` tensor function

* Fix image padding

* Only perform padding if it will affect size

* Create basic processors unit test suite

* Add SamProcessor test case

* Move `CONTENT_TYPE_MAP` outside `RawImage` class

* Perform reflective padding for swin2sr models

* Add swin2sr models for image super-resolution

* Add listed support for Swin2SR models

* Add image-to-image pipeline

* Add listed support for image-to-image task

* Add image-to-image unit tests

* Add `add` tensor functions

* Generalize `pad_image` helper function

* Add more unit tests for image processors

* Fix typo
2023-11-09 17:57:32 +02:00
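
A minimal sketch of the new `image-to-image` pipeline for super-resolution (model id and image URL are illustrative):

```js
import { pipeline } from '@xenova/transformers';

// Model id is illustrative (a converted 2x Swin2SR checkpoint).
const upscaler = await pipeline('image-to-image', 'Xenova/swin2SR-classical-sr-x2-64');

const url = 'https://example.com/low_res.jpg';
const upscaled = await upscaler(url);
// `upscaled` is a RawImage roughly twice the width and height of the input.
```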
Joshua Lochner 3e8d227247
Add support for Falcon and Mistral models (#379)
* By default, do not add special tokens in text-generation

See 147e8ce4ae/src/transformers/pipelines/text_generation.py (L106)

* Add support for mistral models

* Add support for Falcon models

* Replace `batch_size` with variable

* Add Falcon to list of supported models

* Fix typing issue with bigint literals
2023-11-09 16:43:51 +02:00
Joshua Lochner 96c5dd4ccf
Fix `text2text-generation` pipeline output inconsistency w/ python library (#384)
* Fix `text2text-generation` pipeline inconsistency

See https://huggingface.co/docs/transformers/v4.35.0/en/main_classes/pipelines#transformers.Text2TextGenerationPipeline

* Fix `text2text-generation` example in docs

* Improve text2text-generation output in docs
2023-11-09 16:08:27 +02:00
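
A hedged sketch of the corrected `text2text-generation` output format (model id and prompt are illustrative):

```js
import { pipeline } from '@xenova/transformers';

// Model id is illustrative.
const generator = await pipeline('text2text-generation', 'Xenova/LaMini-Flan-T5-783M');

const output = await generator('How many continents are there in the world?', {
  max_new_tokens: 50,
});
// After this fix, the output matches the Python library:
// [{ generated_text: 'There are seven continents in the world.' }]
```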