transformers.js/scripts/extra
Joshua Lochner c5ed1d70ca
Add support for CLAP (`zero-shot-audio-classification`) and Audio Spectrogram Transformer (`audio-classification`) (#427)
* Add FFT unit tests

* Refactor maths.js and audio.js

* Refactor audio processors

* Add support for AST models

* Add another audio-classification example

* Add audio processing unit tests

* Implement `log_mel='dB'` in `spectrogram` function

* Add `ClapFeatureExtractor`

* Implement `ClapFeatureExtractor` unit tests

* Add support for `CLAP`

* Add `ZeroShotAudioClassificationPipeline`

* Add listed support for `zero-shot-audio-classification` pipeline tag

* Cleanup

* `let` -> `const`

* Update `mel_filter_bank` unit test

* Add `'Xenova/tiny-random-ClapModel'`

* Add `ClapAudioModelWithProjection` and `ClapTextModelWithProjection`

* Move audio validation to helper function

* Optimize `mel_filter_bank` computation

-30ms

* Update mel filters unit test

* Cleanup

* Optimizations

* Fix jsdoc

* Optimizations

* Add WIP conversion scripts

Will be updated once https://github.com/huggingface/optimum/pull/1552 is merged
2023-12-05 12:17:42 +02:00
clap.py Add support for CLAP (`zero-shot-audio-classification`) and Audio Spectrogram Transformer (`audio-classification`) (#427) 2023-12-05 12:17:42 +02:00
clip.py Add support for computing CLIP image and text embeddings separately (Closes #148) (#227) 2023-08-01 14:01:04 +02:00
marian.py New models and refactoring (#276) 2023-09-08 15:17:05 +02:00
speecht5.py Add support for `text-to-speech` (w/ Speecht5) (#345) 2023-10-23 16:31:46 +02:00
wav2vec2.py [WIP] Add MMS and Wav2Vec2 models (Closes #209) (#220) 2023-08-14 22:18:44 +02:00
whisper.py Fix `CustomWhisperOnnxConfig` 2023-09-01 16:14:49 +02:00