Commit Graph

5250 Commits

Author SHA1 Message Date
Patrick von Platen 8abd7f69fc
fix warning for position ids (#6884) 2020-09-02 06:44:51 -04:00
Parthe Pandit 7cb0572c64
Update modeling_bert.py (#6897)
outptus -> outputs in example of BertForPreTraining
2020-09-02 06:39:01 -04:00
David Mark Nemeskey e3c55ceb8d
Model card for huBERT (#6893)
* Create README.md

Model card for huBERT.

* Update README.md

lowercase h

* Update model_cards/SZTAKI-HLT/hubert-base-cc/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-09-02 04:50:10 -04:00
Patrick von Platen 1889e96c8c
fix QA example for PT (#6890) 2020-09-02 09:53:09 +02:00
Julien Chaumond d822ab636b [model_cards] Fix file path for flexudy/t5-base-multi-sentence-doctor 2020-09-02 00:02:40 +02:00
Rohan Rajpal ad5fb33c9a
Create README.md (#6598) 2020-09-01 17:59:15 -04:00
Rohan Rajpal f9dadcd85b
Create README.md (#6602) 2020-09-01 17:58:43 -04:00
Igli Manaj f5d69c75f7
Update multilingual passage reranking model card (#6788)
Fix range of possible scores, add inference.
2020-09-01 17:56:19 -04:00
Tom Grek 5d820f3ca6
Model card for primer/BART-Squad2 (#6801) 2020-09-01 17:52:32 -04:00
zolekode 8b884dadc6
added model card for flexudys t5 model (#6759)
Co-authored-by: zolekode <pascal.zoleko@fau.de>
2020-09-01 17:38:55 -04:00
hakan bff6d517cd
loodos turkish model cards added (#6840) 2020-09-01 17:35:24 -04:00
Manuel Romero 502d194b95
Create README.md (#6887)
Add language meta attribute
2020-09-01 17:09:10 -04:00
Manuel Romero d082edf216
Create README.md (#6888)
Add language meta attribute
2020-09-01 17:09:02 -04:00
Abed khooli dacbee9a50
Create README.md (#6886)
* Create README.md

model card for akhooli/xlm-r-large-arabic-sent

* Update model_cards/akhooli/xlm-r-large-arabic-sent/README.md

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-09-01 17:06:15 -04:00
Abed khooli e2971e61bd
Create README.md (#6885) 2020-09-01 16:57:48 -04:00
Patrick von Platen 4d1a3ffde8
[EncoderDecoder] Add xlm-roberta to encoder decoder (#6878)
* finish xlm-roberta

* finish docs

* expose XLMRobertaForCausalLM
2020-09-01 21:56:39 +02:00
Patrick von Platen 311992630c
Create README.md (#6883)
* Create README.md

* Update README.md
2020-09-01 19:24:45 +02:00
Jin Young (Daniel) Sohn 21d719238c
Add cache_dir to save features TextDataset (#6879)
* Add cache_dir to save features TextDataset

This is in case the dataset is on a read-only filesystem, which is the case
in tests (GKE TPU tests).

* style
2020-09-01 11:42:17 -04:00
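
A minimal sketch of the idea behind the `cache_dir` change above, assuming cached features are written next to the input file unless a separate writable directory is given. The function name `cached_features_path` and the `cached_lm_` prefix are illustrative assumptions, not the library's API.

```python
import os
from typing import Optional


def cached_features_path(file_path: str, cache_dir: Optional[str] = None) -> str:
    """Return where cached features would be stored (hypothetical helper)."""
    directory, filename = os.path.split(file_path)
    # Fall back to the dataset's own directory only when no cache_dir is given;
    # on a read-only filesystem the caller would pass a writable cache_dir.
    target_dir = cache_dir if cache_dir is not None else directory
    return os.path.join(target_dir, "cached_lm_" + filename)
```

With a read-only dataset directory, a caller would pass something like a temporary directory as `cache_dir` so the cache file lands in a writable location.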
Lysandre Debut 1461aac8d7
Update docs stable version 2020-09-01 11:02:24 -04:00
Lysandre 3726754a6c v3.1.0 documentation 2020-09-01 14:39:07 +02:00
Lysandre 4b3ee9cbc5 Release: v3.1.0 2020-09-01 14:27:52 +02:00
Patrick von Platen afc4ece462
[Generate] Facilitate PyTorch generate using `ModelOutputs` (#6735)
* fix generate for GPT2 Double Head

* fix gpt2 double head model

* fix bart / t5

* also add for no beam search

* fix no beam search

* fix encoder decoder

* simplify t5

* simplify t5

* fix t5 tests

* fix BART

* fix transfo-xl

* fix conflict

* integrating Sylvain's and Sam's comments

* fix tf past_decoder_key_values

* fix enc dec test
2020-09-01 12:38:25 +02:00
Funtowicz Morgan 397f819615
Restore PaddingStrategy.MAX_LENGTH on QAPipeline while no v2. (#6875)
Signed-off-by: Morgan Funtowicz <funtowiczmo@gmail.com>
2020-09-01 05:35:35 -04:00
Sam Shleifer a32d85f0d4
delete reinit (#6862) 2020-09-01 03:43:27 -04:00
Sylvain Gugger d5f1ffa0d8
Logging doc (#6852)
* Add logging doc

* Formatting

* Update docs/source/main_classes/logging.rst

* Update src/transformers/utils/logging.py

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-01 03:16:34 -04:00
Stas Bekman 59a6a32a61
add a final report to all pytest jobs (#6861)
we had it added for one job, please add it to all pytest jobs - we need the output of what tests were run to debug the codecov issue. thank you!
2020-08-31 22:47:23 -04:00
Sam Shleifer 431ab19d7a
[fix] typo in available in helper function (#6859) 2020-08-31 17:59:34 -04:00
Sam Shleifer 367235ee52
Bart can make decoder_input_ids from labels (#6758) 2020-08-31 16:16:47 -04:00
Sam Shleifer b9772897ec
[s2s] command line args for faster val steps (#6833) 2020-08-31 16:16:10 -04:00
Sam Shleifer 8af1970e45
Fix marian slow test (#6854) 2020-08-31 16:10:43 -04:00
Funtowicz Morgan bbdba0a76d
Update ONNX notebook to include section on quantization. (#6831)
* Update ONNX notebook to include section on quantization.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* Addressing ONNX team comments
2020-08-31 21:28:00 +02:00
Sylvain Gugger a59bcefbb1
Split hp search methods (#6857)
* Split the run_hp_search by backend

* Unused import
2020-08-31 15:16:39 -04:00
krfricke 23f9611c16
Add checkpointing to Ray Tune HPO (#6747)
* Introduce HPO checkpointing for PBT

* Moved checkpoint saving

* Fixed checkpoint subdir pass

* Fixed style

* Enable/disable checkpointing, check conditions for various tune schedulers incl. PBT

* Adjust number of GPUs to number of jobs

* Avoid model pickling in ray

* Move hp search to integrations
2020-08-31 14:38:46 -04:00
Sam Shleifer 61b7ba93f5
Marian distill scripts + integration test (#6799) 2020-08-31 13:48:26 -04:00
Jin Young (Daniel) Sohn 02d09c8fcc
Only access loss tensor every logging_steps (#6802)
* Only access loss tensor every logging_steps

* tensor.item() was being called every step. This must not be done
for XLA:TPU tensors as it's terrible for performance causing TPU<>CPU
communication at each step. On RoBERTa MLM for example, it reduces step
time by 30%, should be larger for smaller step time models/tasks.
* Train batch size was not correct in case a user uses the
`per_gpu_train_batch_size` flag
* Avg reduce loss across eval shards

* Fix style (#6803)

* t5 model should make decoder_attention_mask (#6800)

* [s2s] Test hub configs in self-scheduled CI (#6809)

* [s2s] round runtime in run_eval (#6798)

* Pegasus finetune script: add --adafactor (#6811)

* [bart] rename self-attention -> attention (#6708)

* [tests] fix typos in inputs (#6818)

* Fixed open in colab link (#6825)

* Add model card for singbert lite. Update widget for singbert and singbert-large. (#6827)

* BR_BERTo model card (#6793)

* clearly indicate shuffle=False (#6312)

* Clarify shuffle

* clarify shuffle

Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>

* [s2s README] Add more dataset download instructions (#6737)

* Style

* Patch logging issue

* Set default logging level to `WARNING` instead of `INFO`

* TF Flaubert w/ pre-norm (#6841)

* Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (#6644)

* add datacollator and dataset for next sentence prediction task

* bug fix (numbers of special tokens & truncate sequences)

* bug fix (+ dict inputs support for data collator)

* add padding for nsp data collator; renamed cached files to avoid conflict.

* add test for nsp data collator

* Style

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>

* Fix in Adafactor docstrings (#6845)

* Fix resuming training for Windows (#6847)

* Only access loss tensor every logging_steps

* tensor.item() was being called every step. This must not be done
for XLA:TPU tensors as it's terrible for performance causing TPU<>CPU
communication at each step. On RoBERTa MLM for example, it reduces step
time by 30%, should be larger for smaller step time models/tasks.
* Train batch size was not correct in case a user uses the
`per_gpu_train_batch_size` flag
* Avg reduce loss across eval shards

* comments

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Thomas Ashish Cherian <6967017+PandaWhoCodes@users.noreply.github.com>
Co-authored-by: Zane Lim <zyuanlim@gmail.com>
Co-authored-by: Rodolfo De Nadai <rdenadai@gmail.com>
Co-authored-by: xujiaze13 <37360975+xujiaze13@users.noreply.github.com>
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Huang Lianzhe <hlz@pku.edu.cn>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-08-31 11:35:51 -04:00
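
The reasoning behind "Only access loss tensor every logging_steps" can be sketched in plain Python. The `FakeDeviceTensor` class and the `train` function are hypothetical stand-ins, not part of transformers or PyTorch. They only count how often a value is pulled back to the host, assuming each `.item()` call forces one device-to-host transfer.

```python
class FakeDeviceTensor:
    """Hypothetical stand-in for a device tensor; counts host transfers."""

    host_syncs = 0

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # Addition stays "on device": no host transfer is counted.
        return FakeDeviceTensor(self.value + other.value)

    def item(self):
        # Reading the Python value is what forces device-to-host communication.
        FakeDeviceTensor.host_syncs += 1
        return self.value


def train(step_losses, logging_steps):
    """Accumulate the loss on device and read it only every `logging_steps`."""
    FakeDeviceTensor.host_syncs = 0
    running = FakeDeviceTensor(0.0)
    logged = []
    for step, loss in enumerate(step_losses, start=1):
        running = running + FakeDeviceTensor(loss)
        if step % logging_steps == 0:
            # One host transfer per logging window instead of one per step.
            logged.append(running.item() / step)
    return logged, FakeDeviceTensor.host_syncs


logged, syncs = train([1.0] * 100, logging_steps=25)
print(syncs)  # 4 host transfers instead of 100
```

Calling `.item()` inside the loop on every step would raise the transfer count to one per step, which is the pattern the commit message describes as costly on XLA:TPU.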
Sylvain Gugger c48546c7f7
Fix resuming training for Windows (#6847) 2020-08-31 11:02:30 -04:00
Sylvain Gugger d2f9cb838e
Fix in Adafactor docstrings (#6845) 2020-08-31 10:52:47 -04:00
Huang Lianzhe 2de7ee0385
Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (#6644)
* add datacollator and dataset for next sentence prediction task

* bug fix (numbers of special tokens & truncate sequences)

* bug fix (+ dict inputs support for data collator)

* add padding for nsp data collator; renamed cached files to avoid conflict.

* add test for nsp data collator

* Style

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-08-31 08:25:00 -04:00
Lysandre Debut 895d394669
TF Flaubert w/ pre-norm (#6841) 2020-08-31 04:53:20 -04:00
Lysandre 4561f05c5f Set default logging level to `WARNING` instead of `INFO` 2020-08-31 09:56:25 +02:00
Lysandre 05c3214153 Patch logging issue 2020-08-31 09:37:08 +02:00
Sam Shleifer dfa10a41ba
[s2s README] Add more dataset download instructions (#6737) 2020-08-30 16:29:24 -04:00
xujiaze13 32fe44086c
clearly indicate shuffle=False (#6312)
* Clarify shuffle

* clarify shuffle

Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
2020-08-30 19:26:10 +08:00
Rodolfo De Nadai 0eecaceac7
BR_BERTo model card (#6793) 2020-08-30 19:02:46 +08:00
Zane Lim d176aaad7f
Add model card for singbert lite. Update widget for singbert and singbert-large. (#6827) 2020-08-30 18:21:49 +08:00
Thomas Ashish Cherian a5847619e3
Fixed open in colab link (#6825) 2020-08-30 18:21:00 +08:00
Stas Bekman 563485bf95
[tests] fix typos in inputs (#6818) 2020-08-30 18:19:57 +08:00
Sam Shleifer 22933e661f
[bart] rename self-attention -> attention (#6708) 2020-08-29 18:03:08 -04:00
Sam Shleifer 0f58903bb6
Pegasus finetune script: add --adafactor (#6811) 2020-08-29 17:43:32 -04:00
Sam Shleifer ac47458a02
[s2s] round runtime in run_eval (#6798) 2020-08-29 17:36:31 -04:00