c9837a0d27
* Conversion from slow to fast for BPE spm vocabs contained an error.

  - There is currently only one test (tokenizers + slow) that exercised the modified path, Reformer, which does not modify any ids, so the bug was silent until now.
  - The real issue is that the `vocab` variable was overloaded by `SentencePieceExtractor`, causing the slow-tokenizer-specific vocab oddities to be completely ignored.
  - The bug was reported here: https://github.com/huggingface/transformers/issues/9518
  - Ran the complete tokenization test suite with slow tokenizers without error (`RUN_SLOW=1 pytest -sv tests/test_tokenization_*`).

* Remove rebase error.

* Adding the fixture.
tests_samples/
    dummy-config.json
    empty.txt
    input.txt
    sample_text.txt
    sample_text_no_unicode.txt
    spiece.model
    test_sentencepiece.model
    test_sentencepiece_bpe.model
    test_sentencepiece_no_bos.model
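The "overloaded vocab variable" failure mode described above can be sketched in isolation. This is a hypothetical illustration, not the actual transformers internals: `SpExtractor`, `extract`, and the converter functions below are made-up stand-ins for the real conversion code, showing how rebinding `vocab` to the extractor's output silently discards the slow tokenizer's id modifications, and how merging with slow-side precedence avoids it.

```python
class SpExtractor:
    """Stand-in for a SentencePiece vocab extractor (illustrative only)."""

    def extract(self):
        # Pretend the spm model assigns default ids to each piece.
        return {"<unk>": 0, "a": 1, "b": 2}, [("a", "b")]


def convert_buggy(slow_vocab, extractor):
    vocab = slow_vocab                      # slow tokenizer's vocab, with its oddities
    vocab, merges = extractor.extract()     # BUG: rebinding drops slow_vocab entirely
    return vocab, merges


def convert_fixed(slow_vocab, extractor):
    sp_vocab, merges = extractor.extract()
    # Keep every SentencePiece piece, but let slow-specific ids take precedence.
    vocab = {**sp_vocab, **slow_vocab}
    return vocab, merges


if __name__ == "__main__":
    # The slow tokenizer remapped "a" to a non-default id.
    slow_vocab = {"a": 7}
    buggy, _ = convert_buggy(slow_vocab, SpExtractor())
    fixed, _ = convert_fixed(slow_vocab, SpExtractor())
    print(buggy["a"])  # 1 — slow-specific id silently lost
    print(fixed["a"])  # 7 — slow-specific id preserved
```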