transformers/tests/models/seamless_m4t
Arthur 15cfe38942
[`Core tokenization`] `add_dummy_prefix_space` option to help with latest issues (#28010)
* add add_dummy_prefix_space option to slow
* checking kwargs might be better; should be there for all spm tokenizers IMO
* nits
* fix copies
* more copies
* nits
* add prefix space
* nit
* nits
* Update src/transformers/convert_slow_tokenizer.py
* fix init
* revert wrong styling
* fix
* nits
* style
* updates
* make sure we use the slow tokenizer for conversion instead of looking for the decoder
* support llama as well
* update llama tokenizer fast
* nits
* nits nits nits
* update the doc
* update
* update to fix tests
* skip unrelated failing test
* Update src/transformers/convert_slow_tokenizer.py
* add proper testing
* test decode as well
* more testing
* format
* fix llama test
* Apply suggestions from code review
2024-02-20 12:50:31 +01:00
__init__.py Add Seamless M4T model (#25693) 2023-10-23 14:49:48 +02:00
test_feature_extraction_seamless_m4t.py Fix error in M4T feature extractor (#28340) 2024-01-04 16:40:53 +00:00
test_modeling_seamless_m4t.py disable test_retain_grad_hidden_states_attentions on SeamlessM4TModelWithTextInputTest (#28169) 2023-12-21 08:39:44 +01:00
test_processor_seamless_m4t.py Add Seamless M4T model (#25693) 2023-10-23 14:49:48 +02:00
test_tokenization_seamless_m4t.py [`Core tokenization`] `add_dummy_prefix_space` option to help with latest issues (#28010) 2024-02-20 12:50:31 +01:00