Lysandre Debut
641b873c13
XLNet PLM Readme ( #6121 )
2020-07-29 11:38:15 -04:00
Sam Shleifer
92f8ce2ed6
Fix deebert tests ( #6102 )
2020-07-28 18:30:16 -04:00
Sam Shleifer
dafa296c95
[s2s] Delete useless method, log tokens_per_batch ( #6081 )
2020-07-28 11:24:23 -04:00
Stas Bekman
f0c70085c2
link to README.md ( #6068 )
...
* add a link to README.md
* Update README.md
2020-07-28 20:34:58 +08:00
Sam Shleifer
3c7fbf35a6
MBART: support summarization tasks where max_src_len > max_tgt_len ( #6003 )
...
* MBART: support summarization tasks
* fix test
* Style
* add tokenizer test
2020-07-28 08:18:11 -04:00
Sam Shleifer
7a68d40138
[s2s] Don't mention packed data in README ( #6079 )
2020-07-27 20:07:21 -04:00
Sam Shleifer
1e00ef681d
[s2s] don't document packing because it hurts performance ( #6077 )
2020-07-27 18:26:00 -04:00
Sam Shleifer
11792d7826
CL util to convert models to fp16 before upload ( #5953 )
2020-07-27 12:21:25 -04:00
Sam Shleifer
4302ace5bd
[pack_dataset] don't sort before packing, only pack train ( #5954 )
2020-07-27 12:14:23 -04:00
Suraj Patil
d1d15d6f2d
[examples (seq2seq)] fix preparing decoder_input_ids for T5 ( #5994 )
2020-07-27 10:10:43 -04:00
Sam Shleifer
c69ea5efc4
[CI] Don't test apex ( #6021 )
2020-07-24 15:34:16 -04:00
Sam Shleifer
c3206eef44
[test] partial coverage for train_mbart_enro_cc25.sh ( #5976 )
2020-07-22 14:34:49 -04:00
Sam Shleifer
feeb956a19
[docs] Add integration test example to copy pasta template ( #5961 )
...
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-07-22 12:48:38 -04:00
Sam Shleifer
9dab39feea
seq2seq/run_eval.py can take decoder_start_token_id ( #5949 )
2020-07-21 16:58:45 -04:00
Sam Shleifer
5b193b39b0
[examples/seq2seq]: add --label_smoothing option ( #5919 )
2020-07-21 16:51:39 -04:00
Sam Shleifer
95d1962b9c
[Doc] explaining romanian postprocessing for MBART BLEU hacking ( #5943 )
2020-07-21 14:12:48 -04:00
Aditya Soni
ccbf74a685
typos in seq2seq/readme ( #5937 )
2020-07-21 09:44:59 -04:00
Qingqing Cao
8e0bcb56ec
DataParallel fix: multi gpu evaluation ( #5926 )
...
The DataParallel training was fixed in https://github.com/huggingface/transformers/pull/5733 , this commit also fixes the evaluation. It's more convenient when the user enables both `do_train` and `do_eval`.
2020-07-20 17:54:08 -04:00
Sam Shleifer
f1a4e06f1f
[Fix] seq2seq pack_dataset.py actually packs ( #5913 )
...
Huge MT speedup!
2020-07-20 15:18:26 -04:00
Stas Bekman
35cb101eae
DataParallel fixes ( #5733 )
...
* DataParallel fixes:
1. switched to a more precise check
- if self.args.n_gpu > 1:
+ if isinstance(model, nn.DataParallel):
2. fix tests - require the same fixup under DataParallel as the training module
* another fix
2020-07-20 09:29:12 -04:00
Sam Shleifer
09a2f40684
Seq2SeqDataset uses linecache to save memory by @Pradhy729 ( #5792 )
...
Co-authored-by: Pradhy729 <49659913+Pradhy729@users.noreply.github.com>
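A minimal sketch of the idea named in the title, not the repository's actual Seq2SeqDataset: the standard-library `linecache` module fetches a line by its number, so examples can be looked up on demand instead of being preprocessed and stored up front. The class name `LazyLineDataset` and the one-example-per-line file layout are assumptions made for illustration.

```python
import linecache
import os
import tempfile


class LazyLineDataset:
    """Hypothetical dataset that returns one raw text line per index."""

    def __init__(self, path):
        self.path = path
        # Count lines once so that __len__ is cheap afterwards.
        with open(path, encoding="utf-8") as f:
            self.num_lines = sum(1 for _ in f)

    def __len__(self):
        return self.num_lines

    def __getitem__(self, index):
        if not 0 <= index < self.num_lines:
            raise IndexError(index)
        # linecache line numbers start at 1, dataset indices at 0.
        return linecache.getline(self.path, index + 1).rstrip("\n")


# Usage: write a tiny source file and read single lines back.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "train.source")
    with open(src, "w", encoding="utf-8") as f:
        f.write("first example\nsecond example\nthird example\n")
    dataset = LazyLineDataset(src)
    n_examples = len(dataset)
    second = dataset[1]
```

In a sketch like this, tokenization would happen per item (or per batch) after the line is fetched, which is where any memory saving would come from.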
2020-07-18 13:57:33 -04:00
Sam Shleifer
dad5e12e54
[seq2seq] distillation.py accepts trainer arguments ( #5865 )
2020-07-18 07:43:57 -04:00
Sam Shleifer
ba2400189b
[seq2seq] MAX_LEN env var for MT commands ( #5837 )
2020-07-17 22:51:31 -04:00
Nathan Raw
529850ae7b
Lightning Updates for v0.8.5 ( #5798 )
...
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-07-17 22:43:06 -04:00
Sam Shleifer
e238e3d55a
[seq2seq] Don't copy self.source in sortishsampler ( #5818 )
2020-07-17 01:53:25 -04:00
Sam Shleifer
283500ff9f
[seq2seq] pack_dataset.py rewrites dataset in max_tokens format ( #5819 )
2020-07-16 14:06:49 -04:00
Sam Shleifer
1a647abf0b
[fix] check code quality ( #5772 )
2020-07-15 14:59:38 -04:00
Sam Shleifer
d0486c8bc2
[cleanup] T5 test, warnings ( #5761 )
2020-07-15 08:23:22 -04:00
Boris Dayma
4d5a8d6557
docs(wandb): explain how to use W&B integration ( #5607 )
...
* docs(wandb): explain how to use W&B integration
fix #5262
* Also mention TensorBoard
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-07-14 05:12:33 -04:00
Julien Chaumond
201d23f285
Update The Big Table of Tasks
...
Co-Authored-By: Suraj Patil <surajp815@gmail.com>
Co-Authored-By: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-07-10 18:07:29 +02:00
Lysandre Debut
0533cf4706
Test XLA examples ( #5583 )
...
* Test XLA examples
* Style
* Using `require_torch_tpu`
* Style
* No need for pytest
2020-07-09 09:19:19 -04:00
Ji Xin
cfbb982974
Add DeeBERT (entropy-based early exiting for *BERT) ( #5477 )
...
* Add deebert code
* Add readme of deebert
* Add test for deebert
Update test for Deebert
* Update DeeBert (README, class names, function refactoring); remove requirements.txt
* Format update
* Update test
* Update readme and model init methods
2020-07-08 08:17:59 +08:00
Patrick von Platen
fde217c679
readme for benchmark ( #5363 )
2020-07-07 23:21:23 +02:00
Sam Shleifer
353b8f1e7a
Add mbart-large-cc25, support translation finetuning ( #5129 )
...
improve unittests for finetuning, especially w.r.t testing frozen parameters
fix freeze_embeds for T5
add streamlit setup.cfg
2020-07-07 13:23:01 -04:00
Patrick von Platen
4dc65591b5
[Almost all TF models] TF clean up: add missing CLM / MLM loss; fix T5 naming and keras compile ( #5395 )
...
* add first version of clm tf
* make style
* add more tests for bert
* update tf clm loss
* fix tests
* correct tf ner script
* add mlm loss
* delete bogus file
* clean tf auto model + add tests
* finish adding clm loss everywhere
* fix training in distilbert
* fix flake8
* save intermediate
* fix tf t5 naming
* remove prints
* finish up
* up
* fix tf gpt2
* fix new test utils import
* fix flake8
* keep backward compatibility
* Update src/transformers/modeling_tf_albert.py
* Update src/transformers/modeling_tf_auto.py
* Update src/transformers/modeling_tf_electra.py
* Update src/transformers/modeling_tf_roberta.py
* Update src/transformers/modeling_tf_mobilebert.py
* Update src/transformers/modeling_tf_auto.py
* Update src/transformers/modeling_tf_bert.py
* Update src/transformers/modeling_tf_distilbert.py
* apply Sylvain's suggestions
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-07-07 18:15:53 +02:00
Suraj Patil
e49393c361
[examples] Add trainer support for question-answering ( #4829 )
...
* add SquadDataset
* add DataCollatorForQuestionAnswering
* update __init__
* add run_squad with trainer
* add DataCollatorForQuestionAnswering in __init__
* pass data_collator to trainer
* doc tweak
* Update run_squad_trainer.py
* Update __init__.py
* Update __init__.py
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-07-07 08:57:08 -04:00
Shashank Gupta
3dcb748e31
Added data collator for permutation (XLNet) language modeling and related calls ( #5522 )
...
* Added data collator for XLNet language modeling and related calls
Added DataCollatorForXLNetLanguageModeling in data/data_collator.py
to generate necessary inputs for language modeling training with
XLNetLMHeadModel. Also added related arguments, logic and calls in
examples/language-modeling/run_language_modeling.py.
Resolves : #4739 , #2008 (partially)
* Changed name to `DataCollatorForPermutationLanguageModeling`
Changed the name of `DataCollatorForXLNetLanguageModeling` to the more general `DataCollatorForPermutationLanguageModeling`.
Removed the `--mlm` flag requirement for the new collator and defined a separate `--plm_probability` flag for its use.
CTRL uses a CLM loss just like GPT and GPT-2, so it should work out of the box with this script (provided `past` is handled similarly to `mems` for XLNet).
Changed calls and imports appropriately.
* Added detailed comments, changed variable names
Added more detailed comments to `DataCollatorForPermutationLanguageModeling` in `data/data_collator.py` to explain how it works. Also cleaned up variable names and made them more informative.
* Added tests for new data collator
Added tests in `tests/test_trainer.py` for DataCollatorForPermutationLanguageModeling based on those in DataCollatorForLanguageModeling. A specific test has been added to check for odd-length sequences.
* Fixed styling issues
2020-07-07 10:17:37 +02:00
Lysandre Debut
9d9b872b66
The `add_space_before_punct_symbol` is only for TransfoXL ( #5549 )
2020-07-06 12:17:05 -04:00
Sylvain Gugger
734a28a767
Clean up diffs in Trainer/TFTrainer ( #5417 )
...
* Cleanup and unify Trainer/TFTrainer
* Forgot to adapt TFTrainingArgs
* In tf scripts n_gpu -> n_replicas
* Update src/transformers/training_args.py
* Address review comments
* Formatting
* Fix typo
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-07-01 11:00:20 -04:00
Sam Shleifer
13deb95a40
Move tests/utils.py -> transformers/testing_utils.py ( #5350 )
2020-07-01 10:31:17 -04:00
Sylvain Gugger
4ade7491f4
Fix examples titles and optimization doc page ( #5408 )
2020-07-01 08:11:25 -04:00
Hong Xu
501040fd30
In the run_ner.py example, give the optional label arg a default value ( #5326 )
...
Otherwise, if label is not specified, the following error occurs:
Traceback (most recent call last):
  File "run_ner.py", line 303, in <module>
    main()
  File "run_ner.py", line 101, in main
    model_args, data_args, training_args = parser.parse_json_file(json_file=os.path.abspath(sys.argv[1]))
  File "/home/user/anaconda3/envs/bert/lib/python3.7/site-packages/transformers/hf_argparser.py", line 159, in parse_json_file
    obj = dtype(**inputs)
TypeError: __init__() missing 1 required positional argument: 'labels'
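The failure above can be reproduced with a plain dataclass. This is a sketch of the general mechanism, not the actual code in run_ner.py: the class names `ArgsWithoutDefault` and `ArgsWithDefault` and the `data_dir` field are hypothetical. A field with no default becomes a required `__init__` argument, so building the object from a JSON dictionary that lacks the key raises `TypeError`; giving the field a default of `None` makes it optional.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ArgsWithoutDefault:
    # Hypothetical stand-in for the argument class before the fix.
    data_dir: str
    labels: str


@dataclass
class ArgsWithDefault:
    # Hypothetical stand-in after the fix: `labels` may be omitted.
    data_dir: str
    labels: Optional[str] = field(default=None)


# Stands in for a parsed JSON config that has no "labels" key.
inputs = {"data_dir": "./data"}

try:
    ArgsWithoutDefault(**inputs)
    error_message = None
except TypeError as exc:
    error_message = str(exc)

# With a default value the same inputs are accepted.
args = ArgsWithDefault(**inputs)
```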
2020-06-30 19:45:35 -04:00
Sam Shleifer
27a7fe7a8d
examples/seq2seq: never override $WANDB_PROJECT ( #5407 )
2020-06-30 15:29:13 -04:00
Kevin Canwen Xu
331d8d2936
Upload DistilBART artwork ( #5394 )
2020-06-30 18:11:11 +08:00
MichaelJanz
9a473f1e43
Update Bertabs example to work again ( #5355 )
...
* Fix the bug 'Attempted relative import with no known parent package' when using the bertabs example. Also change the model used from bertabs-finetuned-cnndm, since it no longer seems to be accessible.
* Update run_summarization.py
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
2020-06-30 14:05:01 +08:00
Sam Shleifer
a316a6aaa8
[seq2seq docs] Move evaluation down, fix typo ( #5365 )
2020-06-29 10:36:04 -04:00
Patrick von Platen
4bcc35cd69
[Docs] Benchmark docs ( #5360 )
...
* first doc version
* add benchmark docs
* fix typos
* improve README
* Update docs/source/benchmarks.rst
* fix naming and docs
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-06-29 16:08:57 +02:00
Sam Shleifer
45e26125de
save_pretrained: mkdir(exist_ok=True) ( #5258 )
...
* all save_pretrained methods mkdir if not os.path.exists
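A minimal sketch of the pattern in the title, assuming the standard-library `os.makedirs` call; the helper `save_text` is hypothetical and is not the library's `save_pretrained`. With `exist_ok=True` the directory is created when missing and left alone when present, so no separate `os.path.exists` check is needed.

```python
import os
import tempfile


def save_text(save_directory, filename, text):
    """Hypothetical helper: create the target directory if needed, then write."""
    # exist_ok=True makes this a no-op when the directory already exists;
    # missing intermediate directories are created as well.
    os.makedirs(save_directory, exist_ok=True)
    path = os.path.join(save_directory, filename)
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
    return path


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "checkpoints", "run-1")
    first = save_text(target, "config.json", "{}")
    # Saving again into the same directory does not raise.
    second = save_text(target, "config.json", "{}")
    saved_ok = os.path.isfile(second)
```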
2020-06-28 14:53:47 -04:00
Suraj Patil
12dfbd4f7a
[examples] fix example links ( #5344 )
2020-06-28 12:54:54 -04:00
Sam Shleifer
393b8dc09a
examples/seq2seq/run_eval.py fixes and docs ( #5322 )
2020-06-26 19:20:43 -04:00