transformers

Matthijs Hollemans e4bacf6614 [WIP] add SpeechT5 model (#18922)		2023-02-03 12:43:46 -05:00

* make SpeechT5 model by copying Wav2Vec2
* add paper to docs
* whoops added docs in wrong file
* remove SpeechT5Tokenizer + put CTC back in the name
* remove deprecated class
* remove unused docstring
* delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead
* remove classes we don't need right now
* initial stab at speech encoder prenet
* add more speech encoder prenet stuff
* improve SpeechEncoderPrenet
* add encoder (not finished yet)
* add relative position bias to self-attention
* add encoder CTC layers
* fix formatting
* add decoder from BART, doesn't work yet
* make it work with generate loop
* wrap the encoder into a speech encoder class
* wrap the decoder in a text decoder class
* changed my mind
* changed my mind again ;-)
* load decoder weights, make it work
* add weights for text decoder postnet
* add SpeechT5ForCTC model that uses only the encoder
* clean up EncoderLayer and DecoderLayer
* implement _init_weights in SpeechT5PreTrainedModel
* cleanup config + Encoder and Decoder
* add head + cross attention masks
* improve doc comments
* fixup
* more cleanup
* more fixup
* TextDecoderPrenet works now, thanks Kendall
* add CTC loss
* add placeholders for other pre/postnets
* add type annotation
* fix freeze_feature_encoder
* set padding tokens to 0 in decoder attention mask
* encoder attention mask downsampling
* remove features_pen calculation
* disable the padding tokens thing again
* fixup
* more fixup
* code review fixes
* rename encoder/decoder wrapper classes
* allow checkpoints to be loaded into SpeechT5Model
* put encoder into wrapper for CTC model
* clean up conversion script
* add encoder for TTS model
* add speech decoder prenet
* add speech decoder post-net
* attempt to reconstruct the generation loop
* add speech generation loop
* clean up generate_speech
* small tweaks
* fix forward pass
* enable always dropout on speech decoder prenet
* sort declaration
* rename models
* fixup
* fix copies
* more fixup
* make consistency checker happy
* add Seq2SeqSpectrogramOutput class
* doc comments
* quick note about loss and labels
* add HiFi-GAN implementation (from Speech2Speech PR)
* rename file
* add vocoder to TTS model
* improve vocoder
* working on tokenizer
* more better tokenizer
* add CTC tokenizer
* fix decode and batch_code in CTC tokenizer
* fix processor
* two processors and feature extractors
* use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2
* cleanup
* more cleanup
* even more fixup
* notebooks
* fix log-mel spectrograms
* support reduction factor
* fixup
* shift spectrograms to right to create decoder inputs
* return correct labels
* add labels for stop token prediction
* fix doc comments
* fixup
* remove SpeechT5ForPreTraining
* more fixup
* update copyright headers
* add usage examples
* add SpeechT5ProcessorForCTC
* fixup
* push unofficial checkpoints to hub
* initial version of tokenizer unit tests
* add slow test
* fix failing tests
* tests for CTC tokenizer
* finish CTC tokenizer tests
* processor tests
* initial test for feature extractors
* tests for spectrogram feature extractor
* fixup
* more fixup
* add decorators
* require speech for tests
* modeling tests
* more tests for ASR model
* fix imports
* add fake tests for the other models
* fixup
* remove jupyter notebooks
* add missing SpeechT5Model tests
* add missing tests for SpeechT5ForCTC
* add missing tests for SpeechT5ForTextToSpeech
* sort tests by name
* fix Hi-Fi GAN tests
* fixup
* add speech-to-speech model
* refactor duplicate speech generation code
* add processor for SpeechToSpeech model
* add usage example
* add tests for speech-to-speech model
* fixup
* enable gradient checkpointing for SpeechT5FeatureEncoder
* code review
* push_to_hub now takes repo_id
* improve doc comments for HiFi-GAN config
* add missing test
* add integration tests
* make number of layers in speech decoder prenet configurable
* rename variable
* rename variables
* add auto classes for TTS and S2S
* REMOVE CTC!!!
* S2S processor does not support save/load_pretrained
* fixup
* these models are now in an auto mapping
* fix doc links
* rename HiFiGAN to HifiGan, remove separate config file
* REMOVE auto classes
* there can be only one
* fixup
* replace assert
* reformat
* feature extractor can process input and target at same time
* update checkpoint names
* fix commit hash

..
tests_samples	Fix img classification tests (#13456)	2021-09-07 05:58:45 -04:00
add_distilbert_like_config.json	Add model like (#14992)	2022-01-24 15:25:10 -05:00
dummy-config.json	AutoConfig + other Auto classes honor model_type	2020-01-11 02:46:17 +00:00
dummy_feature_extractor_config.json	[AutoProcessor] Add Wav2Vec2WithLM & small fix (#14675)	2021-12-08 15:51:28 +01:00
empty.txt	GPU text generation: mMoved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
input.txt	GPU text generation: mMoved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
merges.txt	[AutoTokenizer] Allow creation of tokenizers by tokenizer type (#13668)	2021-09-22 00:29:38 +02:00
preprocessor_config.json	Auto processor (#14465)	2021-11-22 12:17:38 -05:00
sample_text.txt	GPU text generation: mMoved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
sample_text_no_unicode.txt	[Dependencies|tokenizers] Make both SentencePiece and Tokenizers optional dependencies (#7659)	2020-10-18 20:51:24 +02:00
spiece.model	GPU text generation: mMoved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
test_entity_vocab.json	Feature/fix slow test in mluke (#14749)	2021-12-22 06:35:59 -05:00
test_sentencepiece.model	GPU text generation: mMoved the encoded_prompt to correct device	2020-01-06 15:11:12 +01:00
test_sentencepiece_bpe.model	Conversion from slow to fast for BPE spm vocabs contained an error. (#10120)	2021-02-13 08:24:53 -05:00
test_sentencepiece_bpe_char.model	[WIP] add SpeechT5 model (#18922)	2023-02-03 12:43:46 -05:00
test_sentencepiece_no_bos.model	[pegasus] Faster tokenizer tests (#7672)	2020-10-09 11:10:32 -04:00
test_sentencepiece_with_bytefallback.model	add a warning in `SpmConverter` for sentencepiece's model using the byte fallback feature (#16629)	2022-04-11 11:06:10 +02:00
vocab.json	[AutoTokenizer] Allow creation of tokenizers by tokenizer type (#13668)	2021-09-22 00:29:38 +02:00
vocab.txt	[AutoTokenizer] Allow creation of tokenizers by tokenizer type (#13668)	2021-09-22 00:29:38 +02:00