Commit Graph

7175 Commits

Author SHA1 Message Date
Jaimeen Ahn 0661abc545
Variable Correction for Consistency in Distillation Example (#11444)
The error comes from an inconsistency in the variable holding the number of GPUs: it is defined as 'gpus' in the parser but used as 'n_gpu' in the train.py script. Making the two consistent lets the example work.
2021-04-26 13:30:48 -04:00
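
A minimal sketch of the mismatch this commit fixes, with the surrounding script assumed: the argument is defined under one name in the parser but read under another in train.py, so aligning the two names makes the example run.

```python
import argparse

# Per the commit message, the parser defined the option as `gpus` while
# train.py read `n_gpu`; this sketch assumes the fix aligns both on `n_gpu`.
parser = argparse.ArgumentParser()
parser.add_argument("--n_gpu", type=int, default=1, help="number of GPUs to use")
args = parser.parse_args([])

n_gpu = args.n_gpu  # before the fix this lookup failed: the parser only defined --gpus
print(n_gpu)
```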
Bhadresh Savani 1d30ec95c7
[Examples] Fixes inconsistency around eval vs val and predict vs test (#11380)
* added changes for uniformity

* modified files

* corrected typo

* fixed qa scripts

* fix typos

* fixed predict typo in qa no trainer

* fixed test file

* reverted trainer changes

* reverted trainer changes in custom examples

* updated readme

* added changes in deepspeed test

* added changes for predict and eval
2021-04-26 09:24:31 -07:00
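
A hedged illustration of the uniform naming this commit enforces across the example scripts (exact metric keys assumed, not taken from the repo): evaluation metrics carry an `eval_` prefix and prediction/test metrics a `predict_` prefix, instead of the previous mix of `val`/`eval` and `test`/`predict`.

```python
# Metric dictionaries as the example scripts log them after this change
# (values are illustrative).
eval_metrics = {"eval_loss": 0.31, "eval_accuracy": 0.87}
predict_metrics = {"predict_loss": 0.35, "predict_accuracy": 0.85}

for name, value in {**eval_metrics, **predict_metrics}.items():
    print(f"{name} = {value}")
```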
Sylvain Gugger 7959d83599
Give each test a different repo name (#11453) 2021-04-26 11:52:23 -04:00
Sylvain Gugger b03b2a653d Style 2021-04-26 11:45:04 -04:00
Stas Bekman ce11318e7e
make sure to test against the local checkout (#11437) 2021-04-26 08:42:43 -07:00
Stas Bekman a753cafdc0
[docs] fix invalid class name (#11438)
* fix invalid class name

* proper ref

* proper ref
2021-04-26 08:37:32 -07:00
Kostas Stathoulopoulos 6715e3b6a1
Clarify description of the is_split_into_words argument (#11449)
* Improve documentation for is_split_into_words argument

* Change description wording
2021-04-26 11:29:36 -04:00
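
A short example of the argument being documented, assuming a standard checkpoint: `is_split_into_words=True` tells the tokenizer that its input is already split into words (e.g., on whitespace), not a raw string and not already subword-tokenized; each word may still be broken into several subword tokens.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# The input is a list of words, so the tokenizer skips its own word splitting
# but still applies the subword algorithm to each word.
enc = tokenizer(["Hugging", "Face", "Transformers"], is_split_into_words=True)
print(tokenizer.convert_ids_to_tokens(enc["input_ids"]))
```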
Sylvain Gugger ab2cabb964
Pass along seed to DistributedSampler (#11406)
* Pass along seed to DistributedSampler

* Add seed to DistributedLengthGroupedSampler
2021-04-26 10:26:52 -04:00
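
A runnable sketch of what forwarding the seed buys, using a toy dataset and explicit `num_replicas`/`rank` so no process group is needed: with the same seed and epoch, every process shuffles identically, which keeps the shards disjoint and the runs reproducible.

```python
import torch
from torch.utils.data import TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.arange(10))

# The Trainer now passes its training seed here instead of leaving the default.
sampler = DistributedSampler(dataset, num_replicas=2, rank=0, shuffle=True, seed=42)
sampler.set_epoch(0)  # the epoch is mixed into the shuffle along with the seed
print(list(sampler))  # this rank's deterministic shard of indices
```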
LSinev b24ead87e1
fix some typos in docs, comments, logging/errors (#11432) 2021-04-26 09:14:25 -04:00
Amine Abdaoui e3e70f9551
docs(examples): fix link to TPU launcher script (#11427) 2021-04-26 09:08:43 -04:00
Sylvain Gugger d7633a4e46
Add basic support for FP16 in SageMaker model parallelism (#11407)
* Add FP16 support for SageMaker MP

* Add print debugs

* Squeeze

* Remove debug statements

* Add defensive check

* Typo
2021-04-26 08:55:14 -04:00
Daniel Stancl 38a716cd41
TF BART models - Add `cross_attentions` to model output and fix cross-attention head masking (#10699)
* Add cross_attn_head_mask to BART

* Fix cross_attentions in TFBart-like models

* This commit enables returning of `cross_attentions`
for TFBart-like models

* It also fixes attention head masking in the cross-attention module

* Update TF model templates

* Fix missing , in TF model templates

* Fix typo: congig -> config
2021-04-26 14:16:21 +02:00
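
A hedged sketch of the new output (checkpoint name assumed): with `output_attentions=True`, TFBart-like models now also return `cross_attentions`, one tensor per decoder layer.

```python
from transformers import BartTokenizer, TFBartModel

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = TFBartModel.from_pretrained("facebook/bart-base")

inputs = tokenizer("Hello world", return_tensors="tf")
outputs = model(**inputs, output_attentions=True)

# One cross-attention tensor per decoder layer, newly exposed by this change.
print(len(outputs.cross_attentions), outputs.cross_attentions[0].shape)
```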
Sylvain Gugger 4bd6b54fa4 Pin black to 21.4b0 2021-04-26 08:12:54 -04:00
Sylvain Gugger c1625b3261 With style 2021-04-26 08:07:29 -04:00
Sylvain Gugger 4b72cfd958 Pin black to 20.8.b1 2021-04-26 08:06:50 -04:00
Patrick von Platen 32dbb2d954
make style (#11442) 2021-04-26 13:50:34 +02:00
Vasudev Gupta 04ab2ca639
add pooling layer support (#11439) 2021-04-26 09:05:53 +02:00
abiolaTresor 30f065890e
updating the checkpoint for GPT2ForSequenceClassification to one with a classification head (#11434) 2021-04-26 10:28:51 +05:30
cronoik 35cd8eed88
EncoderDecoderConfigs should not create new objects (#11300)
* removes the creation of separate config objects and uses the existing ones instead + overwrites resize_token_embeddings from the parent class because it does not work for the EncoderDecoderModel

* rollback to current version of the huggingface master branch

* reworked version that ties the encoder and decoder configs of the parent EncoderDecoder instance

* overwrite of resize_token_embeddings throws an error now

* review comment suggestion

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* implemented a warning for the case where an EncoderDecoderModel is created with an EncoderDecoderConfig whose sub-configs diverge from the decoder config or encoder config

* added test to avoid diverging configs of wrapper class and wrapped classes

* Update src/transformers/models/encoder_decoder/modeling_encoder_decoder.py

* make style

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-04-25 11:45:46 +02:00
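
A minimal check of the behavior described above, assuming that "ties" means the config objects are shared rather than copied: the wrapper's encoder/decoder configs and the wrapped models' own configs are the same objects, so they cannot silently diverge.

```python
from transformers import EncoderDecoderModel

model = EncoderDecoderModel.from_encoder_decoder_pretrained(
    "bert-base-uncased", "bert-base-uncased"
)

# Shared objects, not copies: a change on one side is visible on the other.
assert model.config.encoder is model.encoder.config
assert model.config.decoder is model.decoder.config
```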
Daniel Stancl f45cb66bf6
Add head_mask, decoder_head_mask, cross_head_mask to ProphetNet (#9964)
* Add head_mask & decoder_head_mask + some corrections

* Fix head masking for N-grams

* Enable test_headmasking for encoder and decoder

* Fix one typo in modeling_prophetnet.py

* Enable test_headmasking for ProphetNetStandaloneDecoderModelTest
and ProphetNetStandaloneEncoderModelTest in test_modeling_prophetnet.py

* make style

* Fix cross_head_mask

* Fix attention head mask naming

* `cross_head_mask` -> `cross_attn_head_mask`

* `cross_layer_head_mask` -> `cross_attn_layer_head_mask`

* Still need to merge #10605 to master to pass the tests
2021-04-25 11:06:16 +02:00
Sylvain Gugger 52166f672e Style 2021-04-23 20:40:17 -04:00
cronoik 9cac4fab07
documentation linked to the parent class PreTrainedTokenizerFast, but it should link to the slow tokenizer (#11410) 2021-04-23 20:19:15 -04:00
Sylvain Gugger b7fc043fce Merge branch 'master' of github.com:huggingface/transformers 2021-04-23 18:47:55 -04:00
Sylvain Gugger 81a6c7cd39 Use 3 workers for torch tests 2021-04-23 18:47:46 -04:00
Philip May 195bfd118a
Enable option for subword regularization in `XLMRobertaTokenizer` (#11149)
* enable subword regularization.

* fix tokenizer storage

* fix docstring formatting

* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py

Co-authored-by: Stefan Schweter <stefan@schweter.it>

* fix docstring formatting

* add test for subword regularization tokenizer

* improve comments of test

* add sp_model_kwargs

* reformat docstring to match the style

* add some more documentation

* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* improve docstring

* empty commit to trigger CI

* Update src/transformers/models/xlm_roberta/tokenization_xlm_roberta.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix docstring formatting for sphinx

Co-authored-by: Stefan Schweter <stefan@schweter.it>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-04-23 17:52:31 -04:00
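
A hedged example of the new option: `sp_model_kwargs` is forwarded to the underlying SentencePiece processor, and the sampling-related keys shown here (standard SentencePiece arguments) turn on subword regularization, so repeated calls may segment the same text differently during training.

```python
from transformers import XLMRobertaTokenizer

tokenizer = XLMRobertaTokenizer.from_pretrained(
    "xlm-roberta-base",
    sp_model_kwargs={"enable_sampling": True, "nbest_size": -1, "alpha": 0.1},
)

# With sampling enabled, two calls can yield different segmentations.
print(tokenizer.tokenize("subword regularization"))
print(tokenizer.tokenize("subword regularization"))
```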
Sylvain Gugger 1ef152eb48
Default to accuracy metric (#11405) 2021-04-23 14:49:59 -04:00
Daniel Stancl e3ff165aa5
Fix cross-attention head mask for Torch encoder-decoder models (#10605)
* Fix cross-attention head mask for Torch BART models

* Fix head masking for cross-attention module for the following
models: BART, Blenderbot, Blenderbot_small, M2M_100, Marian, MBart,
Pegasus

* Enable test_headmasking for M2M_100 model

* Fix cross_head_mask for FSMT, LED and T5

* This commit fixes `head_mask` for cross-attention modules
in the following models: FSMT, LED, T5

* It also contains some smaller doc changes so that
it is perfectly clear that the shape of `cross_head_mask`
is the same as that of `decoder_head_mask`

* Update template

* Fix template for BartForCausalLM

* Fix cross_head_mask for Speech2Text models

* Fix cross_head_mask in templates

* Fix args order in BartForCausalLM template

* Fix doc in BART templates

* Make more explicit naming

* `cross_head_mask` -> `cross_attn_head_mask`

* `cross_layer_head_mask` -> `cross_attn_layer_head_mask`

* Fix doc

* make style quality

* Fix speech2text docstring
2021-04-23 18:58:06 +02:00
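
A hedged illustration of the arguments these head-masking commits align: each mask has shape `(num_layers, num_heads)` with `1.0` keeping a head and `0.0` masking it, and as the commit notes, `cross_attn_head_mask` has the same shape as `decoder_head_mask` (layer and head counts here are arbitrary).

```python
import torch

num_decoder_layers, num_heads = 6, 8

decoder_head_mask = torch.ones(num_decoder_layers, num_heads)
cross_attn_head_mask = torch.ones(num_decoder_layers, num_heads)
cross_attn_head_mask[0, 3] = 0.0  # silence head 3 in layer 0's cross-attention

# Passed alongside the usual inputs, e.g.:
# model(..., decoder_head_mask=decoder_head_mask,
#       cross_attn_head_mask=cross_attn_head_mask)
```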
Sylvain Gugger ca6b80cadb Wrong branch Sylvain... 2021-04-23 12:46:54 -04:00
Sylvain Gugger 3951fc55ee Try to trigger failure more 2021-04-23 12:44:54 -04:00
Sylvain Gugger bd41a0f74d Style 2021-04-23 12:32:37 -04:00
Nicola De Cao 1811883e80
Fixing bug in generation (#11297)
When passing `inputs_embeds` with `input_ids=None`, the generation function fails because `input_ids` is created inside the function when it should not be.
2021-04-23 18:24:26 +02:00
Kiran R 5c00918681
added support for exporting T5 to ONNX with past_key_values (#10651) 2021-04-23 18:14:20 +02:00
Patrick von Platen 50f4539b82
push (#11400) 2021-04-23 15:36:27 +02:00
Sylvain Gugger bf2e0cf70b
Trainer push to hub (#11328)
* Initial support for upload to hub

* push -> upload

* Fixes + examples

* Fix torchhub test

* Torchhub test I hate you

* push_model_to_hub -> push_to_hub

* Apply mixin to other pretrained models

* Remove ABC inheritance

* Add tests

* Typo

* Run tests

* Install git-lfs

* Change approach

* Add push_to_hub to all

* Staging test suite

* Typo

* Maybe like this?

* More deps

* Cache

* Adapt name

* Quality

* MOAR tests

* Put it in testing_utils

* Docs + torchhub last hope

* Styling

* Wrong method

* Typos

* Update src/transformers/file_utils.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Address review comments

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Julien Chaumond <julien@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2021-04-23 09:17:37 -04:00
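
A hedged sketch of the API this commit introduces (repo name hypothetical; needs the `huggingface_hub` dependency, git-lfs, and an authenticated user): the mixin adds `push_to_hub` to pretrained models and tokenizers.

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained("bert-base-uncased")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

model.push_to_hub("my-finetuned-bert")      # uploads weights and config
tokenizer.push_to_hub("my-finetuned-bert")  # uploads the tokenizer files
```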
Teven 7bc86bea68
Fixed trainer total_flos reloading in distributed mode (#11383)
* Fixed trainer total_flos reloading in distributed mode

* logging flos at the end of training
2021-04-23 07:53:33 -04:00
Patrick von Platen 74e84f1fa6
make blenderbot test slow (#11395) 2021-04-23 07:49:09 -04:00
Yoshitomo Matsubara c3d6f33918
fixed typos (#11391) 2021-04-23 07:48:42 -04:00
Max Del a90d3f1862
Fix typo in text (#11396) 2021-04-23 07:37:19 -04:00
Patrick von Platen 2dc2d79ac7
correct conversion (#11394) 2021-04-23 11:59:34 +02:00
Patrick von Platen b48cf7124c
correct typo (#11393) 2021-04-23 11:34:59 +02:00
Patrick von Platen 8c9b5fcbaf
[Flax] Big FlaxBert Refactor (#11364)
* improve flax

* refactor

* typos

* Update src/transformers/modeling_flax_utils.py

* Apply suggestions from code review

* Update src/transformers/modeling_flax_utils.py

* fix typo

* improve error tolerance

* typo

* correct nasty saving bug

* fix from pretrained

* correct tree map

* add note

* correct weight tying
2021-04-23 09:53:09 +02:00
Sylvain Gugger 3ed5e97ba0
Fix Trainer with remove_unused_columns=False (#11382)
* Fix Trainer with remove_unused_columns=False

* Typo
2021-04-22 11:16:24 -04:00
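
For context, a minimal snippet of the option being fixed: by default the Trainer drops dataset columns that the model's `forward()` does not accept, and `remove_unused_columns=False` keeps them, e.g. for a custom data collator.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="out",
    remove_unused_columns=False,  # keep every dataset column in each batch
)
```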
PenutChen 0f3ad1507e
Fix typo (#11369) 2021-04-22 10:10:16 -04:00
Matt 2617396094
Correctly cast num_train_epochs to int (#11379) 2021-04-22 13:49:59 +01:00
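
A minimal sketch of the bug class (surrounding code assumed): `num_train_epochs` is stored as a float, and `range()` rejects floats, so it must be cast before iterating over epochs.

```python
num_train_epochs = 3.0  # TrainingArguments keeps this as a float

for epoch in range(int(num_train_epochs)):  # the missing cast this commit adds
    print(f"epoch {epoch}")
```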
Takuya Makino 881945c0b5
Add space (#11373) 2021-04-22 17:48:58 +05:30
johnson7788 5b5e4ca366
[run_translation.py] fix typo (#11372)
fix typo

Co-authored-by: johnson <johnson@github.com>
2021-04-22 17:47:11 +05:30
Patrick von Platen 58d8795d74
[Flax] Correct typo (#11374)
* finish

* fix copy
2021-04-22 13:11:44 +02:00
Patrick von Platen 880154d2e1
[Wav2Vec2] Fix special tokens for Wav2Vec2 tokenizer (#11349)
* fix wav2vec2 tok

* up
2021-04-22 12:23:08 +02:00
Sylvain Gugger 6f14eab50b Add in torchhub 2021-04-21 19:17:29 -04:00
Sylvain Gugger ff26f8ee3a Add huggingface_hub dep for #11328 2021-04-21 19:12:58 -04:00