* Support outputting attentions in generate function
* Add unit tests for concatenating tensors
* Implement `cat` for `dim>0`
* Add `cat` unit tests for > 2 tensors
* Allow for negative indexing + bounds checking
* Add test case for `cat` with negative indexing
* Clean up `safeIndex` helper function
* Allow indexing error message to include dimension
* Reuse `safeIndex` helper function for `normalize_`
* Optimize `cat` indexing
* Implement `stack` tensor operation
+ add unit tests
* Add TODOs
* Implement `mean` tensor operation
* Implement `std_mean` tensor ops
* Fix order of `std_mean` returns
* Implement median filter
* Implement dynamic time warping
* Implement `neg` tensor op
* Throw error if audio sent to processor is not a `Float32Array`
* Add `round` helper function
* [WIP] Implement basic version of word-level timestamps
Known issues:
- timestamps not correct for index > 0
- punctuation not the same as in the Python version
* Fix typo
* Fix timestamps
* Round to 2 decimals
* Fix punctuation
* Fix typing
* Remove debug statements
* Cleanup code
* Cleanup
* Remove debug statements
* Update JSDoc for extract token timestamps function
* Add return type for `std_mean` tensor function
* Improve typing of private whisper tokenizer functions
* Indicate method is private
* Allow whisper feature extractor to be called with Float64Array input
* Fix typo
* Throw error if `cross_attentions` are not present in model output when extracting token timestamps
* Throw error during generate function
* Allow whisper models to be exported with `output_attentions=True`
* Add alignment heads to generation config
* Remove print statement
* Update versions
* Override protobufjs version
* Update package-lock.json
* Require onnx==1.13.1 for conversion
Will update once onnxruntime-web supports onnx IR version 9
* Add unit test for word-level timestamps
* Extract add attentions function out of `generate`
* Fix `findLongestCommonSequence` return types
* Downgrade back to onnxruntime 1.14.0
1.15.1 is a little too unstable right now.
* Cleanup
- use `.map`
- rename variables
* Update comments
* Add examples for how to transcribe w/ word-level timestamps
* Add example for transcribing/translating audio longer than 30 seconds
* Make example more compact
* Only run encoder with required inputs
* Add basic whisper unit tests
* Add newline after heading for docs
* Add unit test for transcribing english with timestamps
* Add multilingual test case
The latest onnxruntime release has a few issues, particularly with WebGPU, and also uses .wasm files that are incompatible with previous versions.
So, while those issues are sorted out, it's best to freeze the packages to the latest stable version.
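
For reference, the negative indexing and bounds checking listed above (the `safeIndex` helper reused by `cat` and `normalize_`) could look roughly like the sketch below. This is an illustration only: the signature, parameter names, and error message are assumptions, not the helper as committed.

```javascript
/**
 * Hypothetical sketch of a bounds-checked index helper.
 * Assumption: the real `safeIndex` may use a different signature or message.
 * @param {number} index The (possibly negative) index to resolve.
 * @param {number} size The size of the dimension being indexed.
 * @param {number|null} dimension Optional dimension number, used only in the error message.
 * @returns {number} The equivalent non-negative index.
 */
function safeIndex(index, size, dimension = null) {
    // Valid indices lie in [-size, size - 1], as in Python-style indexing.
    if (index < -size || index >= size) {
        const where = dimension === null ? '' : ` ${dimension}`;
        throw new Error(
            `IndexError: index ${index} is out of bounds for dimension${where} with size ${size}`
        );
    }
    // Map a negative index onto its non-negative counterpart.
    return index < 0 ? index + size : index;
}
```

With this sketch, `safeIndex(-1, 3)` resolves to the last position of a dimension of size 3, while `safeIndex(3, 3, 1)` throws and names dimension 1 in the message.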