* start - docs, SpeechT5 copy and rename
* add relevant code from FastSpeech2 draft, have tests pass
* make it an actual conformer, demo ex.
* matching inference with original repo, includes debug code
* refactor nn.Sequentials, start more desc. var names
* more renaming
* more renaming
* vocoder scratchwork
* matching vocoder outputs
* hifigan vocoder conversion script
* convert model script, rename some config vars
* replace postnet with speecht5's implementation
* passing common tests, file cleanup
* expand testing, add output hidden states and attention
* tokenizer + passing tokenizer tests
* variety of updates and tests
* g2p_en pckg setup
* import structure edits
* docstrings and cleanup
* repo consistency
* deps
* small cleanup
* forward signature param order
* address comments except for masks and labels
* address comments on attention_mask and labels
* address second round of comments
* remove old unneeded line
* address comments part 1
* address comments pt 2
* rename auto mapping
* fixes for failing tests
* address comments part 3 (bart-like, train loss)
* make style
* pass config where possible
* add forward method + tests to WithHifiGan model
* make style
* address arg passing and generate_speech comments
* address Arthur comments
* address Arthur comments pt2
* lint changes
* Sanchit comment
* add g2p-en to doctest deps
* move up self.encoder
* onnx compatible tensor method
* fix is symbolic
* fix paper url
* move models to espnet org
* make style
* make fix-copies
* update docstring
* Arthur comments
* update docstring w/ new updates
* add model architecture images
* header size
* md wording update
* make style