Commit Graph

1087 Commits

Author SHA1 Message Date
Andrey Kulagin b1ff0b2ae7 Fix bug in examples: double wrap into DataParallel during eval 2020-04-20 19:37:44 -04:00
Jared T Nielsen c79b550dd0
Add `qas_id` to SquadResult and SquadExample (#3745)
* Add qas_id

* Fix incorrect name in squad.py

* Make output files optional for squad eval
2020-04-20 16:08:57 -04:00
Sam Shleifer a504cb49ec
[examples] fix summarization do_predict (#3866) 2020-04-20 10:49:56 -04:00
Thomas Wolf 827d6d6ef0
Cleanup fast tokenizers integration (#3706)
* First pass on utility classes and python tokenizers

* finishing cleanup pass

* style and quality

* Fix tests

* Updating following @mfuntowicz comment

* style and quality

* Fix Roberta

* fix batch_size/seq_length inBatchEncoding

* add alignement methods + tests

* Fix OpenAI and Transfo-XL tokenizers

* adding trim_offsets=True default for GPT2 et RoBERTa

* style and quality

* fix tests

* add_prefix_space in roberta

* bump up tokenizers to rc7

* style

* unfortunately tensorfow does like these - removing shape/seq_len for now

* Update src/transformers/tokenization_utils.py

Co-Authored-By: Stefan Schweter <stefan@schweter.it>

* Adding doc and docstrings

* making flake8 happy

Co-authored-by: Stefan Schweter <stefan@schweter.it>
2020-04-18 13:43:57 +02:00
Sam Shleifer f0c96fafd1
[examples] summarization/bart/finetune.py supports t5 (#3824)
renames `run_bart_sum.py` to `finetune.py`
2020-04-16 15:15:19 -04:00
Patrick von Platen 80a1694514
[Examples, T5] Change newstest2013 to newstest2014 and clean up (#3817)
* Refactored use of newstest2013 to newstest2014. Fixed bug where argparse consumed first command line argument as model_size argument rather than using default model_size by forcing explicit --model_size flag inclusion

* More pythonic file handling through 'with' context

* COSMETIC - ran Black and isort

* Fixed reference to number of lines in newstest2014

* Fixed failing test. More pythonic file handling

* finish PR from tholiao

* remove outcommented lines

* make style

* make isort happy

Co-authored-by: Thomas Liao <tholiao@gmail.com>
2020-04-16 20:00:41 +02:00
Davide Fiocco b1e2368b32
Typo fix (#3821) 2020-04-16 11:04:32 -04:00
Sam Shleifer c59b1e682d
[examples] unit test for run_bart_sum (#3544)
- adds pytorch-lightning dependency
2020-04-15 18:35:01 -04:00
Patrick von Platen 01c37dcdb5
[Config, Caching] Remove `output_past` everywhere and replace by `use_cache` argument (#3734)
* remove output_past from pt

* make style

* add optional input length for gpt2

* add use cache to prepare input

* save memory in gpt2

* correct gpt2 test inputs

* make past input optional for gpt2

* finish use_cache for all models

* make style

* delete modeling_gpt2 change in test file

* correct docstring

* correct is true statements for gpt2
2020-04-14 14:40:28 -04:00
elk-cloner 5ebd898953
fix dataset shuffling for Distributed training (#huggingface#3721) (#3766) 2020-04-13 10:11:18 -04:00
Jin Young Sohn 700ccf6e35
Fix glue_convert_examples_to_features API breakage (#3742) 2020-04-10 16:03:27 -04:00
Jin Young Sohn 551b450527
Add `run_glue_tpu.py` that trains models on TPUs (#3702)
* Initial commit to get BERT + run_glue.py on TPU

* Add README section for TPU and address comments.

* Cleanup TPU bits from run_glue.py (#3)

TPU runner is currently implemented in:
https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py.

We plan to upstream this directly into `huggingface/transformers`
(either `master` or `tpu`) branch once it's been more thoroughly tested.

* Cleanup TPU bits from run_glue.py

TPU runner is currently implemented in:
https://github.com/pytorch-tpu/transformers/blob/tpu/examples/run_glue_tpu.py.

We plan to upstream this directly into `huggingface/transformers`
(either `master` or `tpu`) branch once it's been more thoroughly tested.

* No need to call `xm.mark_step()` explicitly (#4)

Since for gradient accumulation we're accumulating on batches from
`ParallelLoader` instance which on next() marks the step itself.

* Resolve R/W conflicts from multiprocessing (#5)

* Add XLNet in list of models for `run_glue_tpu.py` (#6)

* Add RoBERTa to list of models in TPU GLUE (#7)

* Add RoBERTa and DistilBert to list of models in TPU GLUE (#8)

* Use barriers to reduce duplicate work/resources (#9)

* Shard eval dataset and aggregate eval metrics (#10)

* Shard eval dataset and aggregate eval metrics

Also, instead of calling `eval_loss.item()` every time do summation with
tensors on device.

* Change defaultdict to float

* Reduce the pred, label tensors instead of metrics

As brought up during review some metrics like f1 cannot be aggregated
via averaging. GLUE task metrics depends largely on the dataset, so
instead we sync the prediction and label tensors so that the metrics can
be computed accurately on those instead.

* Only use tb_writer from master (#11)

* Apply huggingface black code formatting

* Style

* Remove `--do_lower_case` as example uses cased

* Add option to specify tensorboard logdir

This is needed for our testing framework which checks regressions
against key metrics writtern by the summary writer.

* Using configuration for `xla_device`

* Prefix TPU specific comments.

* num_cores clarification and namespace eval metrics

* Cache features file under `args.cache_dir`

Instead of under `args.data_dir`. This is needed as our test infra uses
data_dir with a read-only filesystem.

* Rename `run_glue_tpu` to `run_tpu_glue`

Co-authored-by: LysandreJik <lysandre.debut@reseau.eseo.fr>
2020-04-10 12:53:54 -04:00
Julien Chaumond cbad305ce6
[docs] The use of `do_lower_case` in scripts is on its way to deprecation (#3738) 2020-04-10 12:34:04 -04:00
Julien Chaumond b169ac9c2b
[examples] Generate argparsers from type hints on dataclasses (#3669)
* [examples] Generate argparsers from type hints on dataclasses

* [HfArgumentParser] way simpler API

* Restore run_language_modeling.py for easier diff

* [HfArgumentParser] final tweaks from code review
2020-04-10 12:21:58 -04:00
Julien Chaumond f98d0ef2a2
Big cleanup of `glue_convert_examples_to_features` (#3688)
* Big cleanup of `glue_convert_examples_to_features`

* Use batch_encode_plus

* Cleaner wrapping of glue_convert_examples_to_features for TF

@lysandrejik

* Cleanup syntax, thanks to @mfuntowicz

* Raise explicit error in case of user error
2020-04-10 10:20:18 -04:00
Sam Shleifer 715aa5b135
[Bart] Replace config.output_past with use_cache kwarg (#3632) 2020-04-07 19:08:26 -04:00
Sam Shleifer e344e3d402
[examples] SummarizationDataset cleanup (#3451) 2020-04-07 19:05:58 -04:00
Patrick von Platen 80fa0f7812
[Examples, Benchmark] Improve benchmark utils (#3674)
* improve and add features to benchmark utils

* update benchmark style

* remove output files
2020-04-07 16:25:57 -04:00
Ethan Perez e52d1258e0
Fix RoBERTa/XLNet Pad Token in run_multiple_choice.py (#3631)
* Fix RoBERTa/XLNet Pad Token in run_multiple_choice.py

`convert_examples_to_fes atures` sets `pad_token=0` by default, which is correct for BERT but incorrect for RoBERTa (`pad_token=1`) and XLNet (`pad_token=5`). I think the other arguments to `convert_examples_to_features` are correct, but it might be helpful if someone checked who is more familiar with this part of the codebase.

* Simplifying change to match recent commits
2020-04-06 16:52:22 -04:00
Nicolas c50aa67bff
Resizing embedding matrix before sending it to the optimizer. (#3532)
* Resizing embedding matrix after sending it to the optimizer prevents from updating the newly resized matrix.

* Remove space for style matter
2020-04-02 15:00:05 -04:00
Mark Kockerbeck 1b10159950
Adding should_continue check for retraining (#3509) 2020-04-02 14:07:08 -04:00
Patrick von Platen ab5d06a094
[T5, examples] replace heavy t5 models with tiny random models (#3556)
* replace heavy t5 models with tiny random models as was done by sshleifer

* fix isort
2020-04-02 12:34:05 +02:00
Julien Chaumond 50e15c825c
Tokenizers: Start cleaning examples a little (#3455)
* Start cleaning examples

* Fixup
2020-04-01 07:13:40 -04:00
Patrick von Platen ae6834e028
[Examples] Clean summarization and translation example testing files for T5 and Bart (#3514)
* fix conflicts

* add model size argument to summarization

* correct wrong import

* fix isort

* correct imports

* other isort make style

* make style
2020-03-31 17:54:13 +02:00
Ethan Perez e5c393dceb
[Bug fix] Using loaded checkpoint with --do_predict (instead of… (#3437)
* Using loaded checkpoint with --do_predict

Without this fix, I'm getting near-random validation performance for a trained model, and the validation performance differs per validation run. I think this happens since the `model` variable isn't set with the loaded checkpoint, so I'm using a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (but they don't with this fix).

* Update checkpoint loading

* Fixing model loading
2020-03-30 17:06:08 -04:00
Sam Shleifer 8deff3acf2
[bart-tiny-random] Put a 5MB model on S3 to allow faster exampl… (#3488) 2020-03-30 12:28:27 -04:00
Julien Plu d38bbb225f
Update the NER TF script (#3511)
* Update the NER TF script to remove the softmax and make the pad token label id to -1

* Reformat the quality and style

Co-authored-by: Julien Plu <julien.plu@adevinta.com>
2020-03-30 09:50:12 -04:00
Sam Shleifer 33ef7002e1
[Docs] examples/summarization/bart: Simplify CNN/DM preprocessi… (#3516) 2020-03-29 13:25:42 -04:00
Patrick von Platen 17dceae7a1
Fix circle ci flaky fail of wmt example (#3485)
* force bleu

* fix wrong file name

* rename file

* different filenames for each example test

* test files should clean up after themselves

* test files should clean up after themselves

* do not force bleu

* correct typo

* fix isort
2020-03-27 13:01:28 -04:00
Funtowicz Morgan b08259a120
run_ner.py / bert-base-multilingual-cased can output empty tokens (#2991)
* Use tokenizer.num_added_tokens to count number of added special_tokens instead of hardcoded numbers.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>

* run_ner.py - Do not add a label to the labels_ids if word_tokens is empty.

This can happen when using bert-base-multilingual-cased with an input containing an unique space.
In this case, the tokenizer will output just an empty word_tokens thus leading to an non-consistent behavior
over the labels_ids tokens adding one more tokens than tokens vector.

Signed-off-by: Morgan Funtowicz <morgan@huggingface.co>
2020-03-27 10:59:55 -04:00
Patrick von Platen f4f4946836
Rename `t5-large` to `t5-base` in README.md 2020-03-27 15:57:58 +01:00
Lysandre Debut ff80b73157
Add option to choose T5 model size. (#3480)
T5-small in test


isort
2020-03-27 15:56:59 +01:00
Patrick von Platen 5ad2ea06af
Add wmt translation example (#3428)
* add translation example

* make style

* adapt docstring

* add gpu device as input for example

* small renaming

* better README
2020-03-26 19:07:59 +01:00
Patrick von Platen e703e923ca
Add t5 summarization example (#3411)
* rebase to master

* change tf to pytorch

* change to pytorch

* small fix

* renaming

* add gpu training possibility

* renaming

* improve README

* incoorporate collins feedback

* better Readme

* better README.md
2020-03-26 18:17:55 +01:00
Lysandre Debut ffcffebe85
Force the return of token type IDs (#3439) 2020-03-26 09:41:36 +01:00
Andre Carrera 3d76df3a12
BART for summarization training with CNN/DM using pytorch-lightning 2020-03-24 21:00:24 -04:00
Julien Chaumond eaabaaf750 [run_language_modeling] Fix: initialize a new model from a config object 2020-03-24 17:56:40 -04:00
Julien Chaumond f8823bad9a Expose missing mappings (see #3415) 2020-03-24 17:46:25 -04:00
Julien Chaumond a8e3336a85 [examples] Use AutoModels in more examples 2020-03-23 20:11:14 -04:00
Julien Chaumond f7dcf8fcea [BertAbs] Move files around for more consistent naming 2020-03-23 13:58:49 -04:00
Julien Chaumond cf72479bf1 One last reorder of {scheduler,optimizer}.step() 2020-03-20 18:05:50 -04:00
Elijah Rippeth 634bf6cf7e fixes lr_scheduler warning
For more details, see https://pytorch.org/docs/stable/optim.html#how-to-adjust-learning-rate
2020-03-20 18:03:50 -04:00
Patrick von Platen 95e00d0808
Clean special token init in modeling_....py (#3264)
* make style

* fix conflicts
2020-03-20 21:41:04 +01:00
Nitish Shirish Keskar 8becb73293
removing torch.cuda.empty_cache() from TF function (#3267)
torch.cuda.empty_cache() was being called from a TF function (even when torch is unavailable)
not sure any replacement is needed if TF OOMs
2020-03-19 23:25:30 +01:00
Julien Chaumond 656e1386a2 Fix #3305: run_ner only possible on ModelForTokenClassification models 2020-03-19 16:41:28 -04:00
mataney c44a17db1b
[FIX] not training when epoch is small (#3006)
* solving bug where for small epochs and large gradient_accumulation_steps we never train

* black formatting

* no need to change these files
2020-03-19 11:21:21 -04:00
J.P Lee 2b60a26b46
Update examples/ner/run_ner.py to use AutoModel (#3305)
* Update examples/ner/run_ner.py to use AutoModel

* Fix missing code and apply `make style` command
2020-03-17 12:30:10 -04:00
Nathan Raw 930c9412b4
[WIP] Lightning glue example (#3290)
*  Alter base pl transformer to use automodels

* 🐛 Add batch size env variable to function call

* 💄 Apply black code style from Makefile

* 🚚 Move lightning base out of ner directory

*  Add lightning glue example

* 💄 self

* move _feature_file to base class

*  Move eval logging to custom callback

* 💄 Apply black code style

* 🐛 Add parent to pythonpath, remove copy command

* 🐛 Add missing max_length kwarg
2020-03-17 11:46:42 -04:00
Patrick von Platen e8f44af5bf
[generate] do_sample default back to False (#3298)
* change do_samples back

* None better default as boolean

* adapt do_sample to True in test example

* make style
2020-03-17 10:52:37 -04:00
Thomas Wolf 2187c49f5c
CPU/GPU memory benchmarking utilities - Remove support for python 3.5 (now only 3.6+) (#3186)
* memory benchmark rss

* have both forward pass and line-by-line mem tracing

* cleaned up tracing

* refactored and cleaning up API

* no f-strings yet...

* add GPU mem logging

* fix GPU memory monitoring

* style and quality

* clean up and doc

* update with comments

* Switching to python 3.6+

* fix quality
2020-03-17 10:17:11 -04:00
Sam Shleifer 5ea8ba67b4
[BART] Remove unused kwargs (#3279)
* Remove unused kwargs
* dont call forward in tests
2020-03-15 23:00:44 -04:00
Thomas Wolf 3814e167d9
Merge pull request #3225 from patrickvonplaten/finalize_merge_bart_generate_into_default_generate
Complete merge Seq-2-Seq generation into default generation
2020-03-14 15:08:59 +01:00
Patrick von Platen 4f75d380a4 make style 2020-03-13 16:35:52 +01:00
Patrick von Platen c2ee3840ae update file to new starting token logic 2020-03-13 16:34:44 +01:00
dependabot[bot] afea70c01c Bump psutil from 5.6.3 to 5.6.6 in /examples/distillation
Bumps [psutil](https://github.com/giampaolo/psutil) from 5.6.3 to 5.6.6.
- [Release notes](https://github.com/giampaolo/psutil/releases)
- [Changelog](https://github.com/giampaolo/psutil/blob/master/HISTORY.rst)
- [Commits](https://github.com/giampaolo/psutil/compare/release-5.6.3...release-5.6.6)

Signed-off-by: dependabot[bot] <support@github.com>
2020-03-12 21:14:56 -04:00
Sam Shleifer 2e81b9d8d7
Bart: update example for #3140 compatibility (#3233)
* Update bart example docs
2020-03-12 10:36:37 -04:00
Patrick von Platen 5b3000d933 renamed min_len to min_length 2020-03-11 11:06:56 +01:00
Shubham Agarwal 5ca356a464
NER - pl example (#3180)
* 1. seqeval required by ner pl example. install from examples/requirements. 2. unrecognized arguments: save_steps

* pl checkpoint callback filenotfound error: make directory and pass

* #3159 pl checkpoint path difference

* 1. Updated Readme for pl 2. pl script now also correct displays logs 3. pass gpu ids compared to number of gpus

* Updated results in readme

* 1. updated readme 2. removing deprecated pl methods 3. finalizing scripts

* comment length check

* using deprecated validation_end for stable results

* style related changes
2020-03-09 20:43:38 -04:00
Sam Shleifer 3aca02efb3
Bart example: model.to(device) (#3194) 2020-03-09 15:09:35 -04:00
Lysandre eb3e6cb04f cased -> uncased in BERT SQuAD example
closes #3183
2020-03-09 10:54:18 -04:00
Sam Shleifer 857e0a0d3b
Rename BartForMaskedLM -> BartForConditionalGeneration (#3114)
* improved documentation
2020-03-05 17:41:18 -05:00
Sam Shleifer 5b396457e5
Summarization Examples: add Bart CNN Evaluation (#3082)
* Rename and improve example

* Add test

* slightly faster test

* style

* This breaks remy prolly

* shorter test string

* no slow

* newdir structure

* New tree

* Style

* shorter

* docs

* clean

* Attempt future import

* more import hax
2020-03-03 15:29:59 -05:00
Davide Fiocco c0c7ec3458
Don't crash if fine-tuned model doesn't end with a number (#3099)
That's the same fix applied in https://github.com/huggingface/transformers/issues/2258 , but for the GLUE example
2020-03-03 08:59:47 -05:00
Victor SANH 6b1ff25084
fix n_gpu count when no_cuda flag is activated (#3077)
* fix n_gpu count when no_cuda flag is activated

* someone was left behind
2020-03-02 10:20:21 -05:00
Julien Chaumond 298bed16a8 make style 2020-03-01 14:08:01 -05:00
VictorSanh 852e032ca6 include roberta in run_squad_w_distillation - cc @graviraja 2020-03-01 01:56:50 +00:00
VictorSanh b5509abb36 --do_lower_case will always trick me... 2020-03-01 01:39:24 +00:00
srush 908fa43b54
Changes to NER examples for PLT and TPU (#3053)
* changes to allow for tpu training

* black

* tpu

* tpu
2020-02-27 16:45:32 -05:00
Lysandre Debut 8bcb37bfb8
NER support for Albert in run_ner.py and NerPipeline (#2983)
* * Added support for Albert when fine-tuning for NER

* Added support for Albert in NER pipeline

* Added command-line options to examples/ner/run_ner.py to better control tokenization

* Added class AlbertForTokenClassification

* Changed output for NerPipeline to use .convert_ids_to_tokens(...) instead of .decode(...) to better reflect tokens

* Added ,

* Now passes style guide enforcement

* Changes from reviews.

* Code now passes style enforcement

* Added test for AlbertForTokenClassification

* Added test for AlbertForTokenClassification
2020-02-27 10:22:55 -05:00
Martin Malmsten d762d4289c Code now passes style enforcement 2020-02-26 23:50:40 +01:00
Martin Malmsten 9495d38b0d Changes from reviews. 2020-02-26 23:36:39 +01:00
Andrew Walker 5bc99e7f33
fix several typos in Distil* readme (#3034) 2020-02-26 12:39:54 -05:00
Jhuo IH 7a7ee28cb9
missing ner link (#2967) 2020-02-25 14:06:57 -05:00
Patrick von Platen 65d74c4965
Add preprocessing step for transfo-xl tokenization to avoid tokenizing words followed by punction to <unk> (#2987)
* add preprocessing to add space before punctuation for transfo_xl

* improve warning messages

* make style

* compile regex at instantination of tokenizer object
2020-02-24 15:11:10 -05:00
Martin Malmsten 105dcb4162 Now passes style guide enforcement 2020-02-23 21:47:59 +01:00
Martin Malmsten 33eb8a165d Added , 2020-02-23 21:43:31 +01:00
Martin Malmsten 869b66f6b3 * Added support for Albert when fine-tuning for NER
* Added support for Albert in NER pipeline

* Added command-line options to examples/ner/run_ner.py to better control tokenization

* Added class AlbertForTokenClassification

* Changed output for NerPipeline to use .convert_ids_to_tokens(...) instead of .decode(...) to better reflect tokens
2020-02-23 21:13:03 +01:00
saippuakauppias cafc4dfc7c fix hardcoded path in examples readme 2020-02-22 11:12:38 -05:00
Patrick von Platen fc38d4c86f
Improve special_token_id logic in run_generation.py and add tests (#2885)
* improving generation

* finalized special token behaviour for no_beam_search generation

* solved modeling_utils merge conflict

* solve merge conflicts in modeling_utils.py

* add run_generation improvements from PR #2749

* adapted language generation to not use hardcoded -1 if no padding token is available

* remove the -1 removal as hard coded -1`s are not necessary anymore

* add lightweight language generation testing for randomely initialized models - just checking whether no errors are thrown

* add slow language generation tests for pretrained models using hardcoded output with pytorch seed

* delete ipdb

* check that all generated tokens are valid

* renaming

* renaming Generation -> Generate

* make style

* updated so that generate_beam_search has same token behavior than generate_no_beam_search

* consistent return format for run_generation.py

* deleted pretrain lm generate tests -> will be added in another PR

* cleaning of unused if statements and renaming

* run_generate will always return an iterable

* make style

* consistent renaming

* improve naming, make sure generate function always returns the same tensor, add docstring

* add slow tests for all lmhead models

* make style and improve example comments modeling_utils

* better naming and refactoring in modeling_utils

* improving generation

* finalized special token behaviour for no_beam_search generation

* solved modeling_utils merge conflict

* solve merge conflicts in modeling_utils.py

* add run_generation improvements from PR #2749

* adapted language generation to not use hardcoded -1 if no padding token is available

* remove the -1 removal as hard coded -1`s are not necessary anymore

* add lightweight language generation testing for randomely initialized models - just checking whether no errors are thrown

* add slow language generation tests for pretrained models using hardcoded output with pytorch seed

* delete ipdb

* check that all generated tokens are valid

* renaming

* renaming Generation -> Generate

* make style

* updated so that generate_beam_search has same token behavior than generate_no_beam_search

* consistent return format for run_generation.py

* deleted pretrain lm generate tests -> will be added in another PR

* cleaning of unused if statements and renaming

* run_generate will always return an iterable

* make style

* consistent renaming

* improve naming, make sure generate function always returns the same tensor, add docstring

* add slow tests for all lmhead models

* make style and improve example comments modeling_utils

* better naming and refactoring in modeling_utils

* changed fast random lm generation testing design to more general one

* delete in old testing design in gpt2

* correct old variable name

* temporary fix for encoder_decoder lm generation tests - has to be updated when t5 is fixed

* adapted all fast random generate tests to new design

* better warning description in modeling_utils

* better comment

* better comment and error message

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-02-21 12:09:59 -05:00
maximeilluin c749a543fa
Added CamembertForQuestionAnswering (#2746)
* Added CamembertForQuestionAnswering

* fixed camembert tokenizer case
2020-02-21 12:01:02 -05:00
Martin Malmsten 4452b44b90
Labels are now added to model config under id2label and label2id (#2945) 2020-02-21 08:53:05 -05:00
Sam Shleifer 53ce3854a1
New BartModel (#2745)
* Results same as fairseq
* Wrote a ton of tests
* Struggled with api signatures
* added some docs
2020-02-20 18:11:13 -05:00
srush 889d3bfdbb
default arg fix (#2937) 2020-02-20 15:31:17 -05:00
srush b662f0e625
Support for torch-lightning in NER examples (#2890)
* initial pytorch lightning commit

* tested multigpu

* Fix learning rate schedule

* black formatting

* fix flake8

* isort

* isort

* .

Co-authored-by: Check your git settings! <chris@chris-laptop>
2020-02-20 11:50:05 -05:00
VictorSanh 2ae98336d1 fix vocab size in binarized_data (distil): int16 vs int32 2020-02-18 16:17:35 +00:00
VictorSanh 0dbddba6d2 fix typo in hans example call 2020-02-17 20:19:57 +00:00
Manuel Romero 4e597c8e4d Fix typo 2020-02-14 09:07:42 -05:00
Julien Chaumond 4d36472b96 [run_ner] Don't crash if fine-tuning local model that doesn't end with digit 2020-02-14 03:25:29 +00:00
Lysandre f54a5bd37f Raise error when using an mlm flag for a clm model + correct TextDataset 2020-02-12 13:23:14 -05:00
Lysandre 569897ce2c Fix a few issues regarding the language modeling script 2020-02-12 13:23:14 -05:00
VictorSanh ee5a6856ca distilbert-base-cased weights + Readmes + omissions 2020-02-07 15:28:13 -05:00
Julien Chaumond 42f08e596f [examples] rename run_lm_finetuning to run_language_modeling 2020-02-07 09:15:28 -05:00
Julien Chaumond 4f7bdb0958 [examples] Fix broken markdown 2020-02-07 09:15:28 -05:00
Peter Izsak 6fc3d34abd Fix multi-gpu evaluation in run_glue.py 2020-02-06 16:38:55 -05:00
Julien Chaumond ada24def22 [run_lm_finetuning] Tweak fix for non-long tensor, close #2728
see 1ebfeb7946 and #2728

Co-Authored-By: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2020-02-05 12:49:18 -05:00
Yuval Pinter d1ab1fab1b
pass langs parameter to certain XLM models (#2734)
* pass langs parameter to certain XLM models

Adding an argument that specifies the language the SQuAD dataset is in so language-sensitive XLMs (e.g. `xlm-mlm-tlm-xnli15-1024`) don't default to language `0`.
Allows resolution of issue #1799 .

* fixing from `make style`

* fixing style (again)
2020-02-04 17:12:42 -05:00
Lysandre 3bf5417258 Revert erroneous fix 2020-02-04 16:31:07 -05:00
Lysandre 1ebfeb7946 Cast to long when masking tokens 2020-02-04 15:56:16 -05:00
Lysandre 239dd23f64 [Follow up 213]
Masked indices should have -1 and not -100. Updating documentation + scripts that were forgotten
2020-02-03 16:08:05 -05:00
Antonio Carlos Falcão Petri 2ba147ecff Fix typo in examples/utils_ner.py
"%s-%d".format() -> "{}-{}".format()
2020-02-01 11:10:57 -05:00
Lysandre d18d47be67 run_generation style 2020-01-31 12:05:48 -05:00
Lysandre 7365f01d43 do_sample should be set to True in run_generation.py 2020-01-31 11:49:32 -05:00
Jared Nielsen 71a382319f Correct documentation 2020-01-30 18:41:24 -05:00
Hang Le f0a4fc6cd6 Add Flaubert 2020-01-30 10:04:18 -05:00
Jared Nielsen adb8c93134 Remove lines causing a KeyError 2020-01-29 14:01:16 -05:00
Lysandre 335dd5e68a Default save steps 50 to 500 in all scripts 2020-01-28 09:42:11 -05:00
Julien Chaumond 6b4c3ee234 [run_lm_finetuning] GPT2 tokenizer doesn't have a pad_token
ping @lysandrejik
2020-01-27 20:14:02 -05:00
VictorSanh 1ce3fb5cc7 update correct eval metrics (distilbert & co) 2020-01-24 11:45:22 -05:00
Julien Chaumond 1a8e87be4e Line-by-line text dataset (including padding) 2020-01-21 16:57:38 -05:00
Julien Chaumond b94cf7faac change order 2020-01-21 16:57:38 -05:00
Julien Chaumond 2eaa8b6e56 Easier to not support this, as it could be confusing
cc @lysandrejik
2020-01-21 16:57:38 -05:00
Julien Chaumond 801aaa5508 make style 2020-01-21 16:57:38 -05:00
Julien Chaumond 56d4ba8ddb [run_lm_finetuning] Train from scratch 2020-01-21 16:57:38 -05:00
jiyeon_baek 6d5049a24d Fix typo in examples/run_squad.py
Rul -> Run
2020-01-17 11:22:51 -05:00
Lysandre 6e2c28a14a Run SQuAD warning when the doc stride may be too high 2020-01-16 13:59:26 -05:00
thomwolf 258ed2eaa8 adding details in readme 2020-01-16 13:21:30 +01:00
thomwolf 50ee59578d update formating - make flake8 happy 2020-01-16 13:21:30 +01:00
thomwolf 1c9333584a formating 2020-01-16 13:21:30 +01:00
thomwolf e25b6fe354 updating readme 2020-01-16 13:21:30 +01:00
thomwolf 27c7b99015 adding details in readme - moving file 2020-01-16 13:21:30 +01:00
Nafise Sadat Moosavi 99d4515572 HANS evaluation 2020-01-16 13:21:30 +01:00
Julien Chaumond 83a41d39b3 💄 super 2020-01-15 18:33:50 -05:00
Julien Chaumond 715fa638a7 Merge branch 'master' into from_scratch_training 2020-01-14 18:58:21 +00:00
Julien Chaumond b803b067bf Config to Model mapping 2020-01-13 20:05:20 +00:00
IWillPull a3085020ed Added repetition penalty to PPLM example (#2436)
* Added repetition penalty

* Default PPLM repetition_penalty to neutral

* Minor modifications to comply with reviewer's suggestions. (j -> token_idx)

* Formatted code with `make style`
2020-01-10 23:00:07 -05:00
VictorSanh e83d9f1c1d cleaning - change ' to " (black requirements) 2020-01-10 19:34:25 -05:00
VictorSanh ebba9e929d minor spring cleaning - missing configs + processing 2020-01-10 19:14:58 -05:00
Victor SANH 331065e62d missing import 2020-01-10 11:42:53 +01:00
Victor SANH 414e9e7122 indents test 2020-01-10 11:42:53 +01:00
Victor SANH 3cdb38a7c0 indents 2020-01-10 11:42:53 +01:00
Victor SANH ebd45980a0 Align with `run_squad` + fix some errors 2020-01-10 11:42:53 +01:00
Victor SANH 45634f87f8 fix Sampler in distributed training - evaluation 2020-01-10 11:42:53 +01:00
Victor SANH af1ee9e648 Move `torch.nn.utils.clip_grad_norm_` 2020-01-10 11:42:53 +01:00
Lysandre 164c794eb3 New SQuAD API for distillation script 2020-01-10 11:42:53 +01:00
Lysandre 16ce15ed4b DistilBERT token type ids removed from inputs in run_squad 2020-01-08 13:18:30 +01:00
Lysandre Debut f24232cd1b Fix error with global step in run_squad.py 2020-01-08 11:39:00 +01:00
Oren Amsalem 43114b89ba spelling correction (#2434) 2020-01-07 17:25:25 +01:00
Lysandre Debut 27c1b656cc Fix error with global step in run_lm_finetuning.py 2020-01-07 16:16:12 +01:00
Simone Primarosa 176d3b3079 Add support for Albert and XLMRoberta for the Glue example (#2403)
* Add support for Albert and XLMRoberta for the Glue example
2020-01-07 14:55:55 +01:00
alberduris 81d6841b4b GPU text generation: mMoved the encoded_prompt to correct device 2020-01-06 15:11:12 +01:00
alberduris dd4df80f0b Moved the encoded_prompts to correct device 2020-01-06 15:11:12 +01:00
karajan1001 f01b3e6680 fix #2399 an ImportError in official example (#2400)
* fix #2399 an ImportError in official example

* style

Co-authored-by: Julien Chaumond <chaumond@gmail.com>
2020-01-05 12:50:20 -05:00
Julien Chaumond 629b22adcf [run_lm_finetuning] mask_tokens: document types 2020-01-01 12:55:10 -05:00
Thomas Wolf 0412f3d929
Merge pull request #2291 from aaugustin/fix-flake8-F841
Fix F841 flake8 warning
2019-12-25 22:37:42 +01:00
Aymeric Augustin a8d34e534e Remove [--editable] in install instructions.
Use -e only in docs targeted at contributors.

If a user copy-pastes  command line with [--editable], they will hit
an error. If they don't know the --editable option, we're giving them
a choice to make before they can move forwards, but this isn't a choice
they need to make right now.
2019-12-24 08:46:08 +01:00
Aymeric Augustin 81422c4e6d Remove unused variables in examples. 2019-12-23 22:29:02 +01:00
Aymeric Augustin c3783399db Remove redundant requirements with transformers. 2019-12-23 19:17:27 +01:00
Aymeric Augustin 9fc8dcb2a0 Standardize import.
Every other file uses this pattern.
2019-12-23 18:45:42 +01:00
Aymeric Augustin 1c62e87b34 Use built-in open().
On Python 3, `open is io.open`.
2019-12-22 18:38:56 +01:00
Aymeric Augustin d6eaf4e6d2 Update comments mentioning Python 2. 2019-12-22 18:38:56 +01:00
Aymeric Augustin 75a23d24af Remove import fallbacks. 2019-12-22 18:38:56 +01:00
Aymeric Augustin 798b3b3899 Remove sys.version_info[0] == 2 or 3. 2019-12-22 18:38:42 +01:00
Aymeric Augustin 6b2200fc88 Remove u-prefixes. 2019-12-22 17:47:54 +01:00
Aymeric Augustin c824d15aa1 Remove __future__ imports. 2019-12-22 17:47:54 +01:00
Aymeric Augustin 7e98e211f0 Remove unittest.main() in test modules.
This construct isn't used anymore these days.

Running python tests/test_foo.py puts the tests/ directory on
PYTHONPATH, which isn't representative of how we run tests.

Use python -m unittest tests/test_foo.py instead.
2019-12-22 14:42:03 +01:00
Aymeric Augustin ced0a94204 Switch test files to the standard test_*.py scheme. 2019-12-22 14:15:13 +01:00
Aymeric Augustin c11b3e2926 Sort imports for optional third-party libraries.
These libraries aren't always installed in the virtual environment where
isort is running. Declaring them properly avoids mixing these
third-party imports with local imports.
2019-12-22 11:19:13 +01:00
Aymeric Augustin 939148b050 Fix F401 flake8 warning (x28).
Do manually what autoflake couldn't manage.
2019-12-22 10:59:08 +01:00
Aymeric Augustin 783a616999 Fix F401 flake8 warning (x88 / 116).
This change is mostly autogenerated with:

    $ python -m autoflake --in-place --recursive --remove-all-unused-imports --ignore-init-module-imports examples templates transformers utils hubconf.py setup.py

I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin 80327a13ea Fix F401 flake8 warning (x152 / 268).
This change is mostly autogenerated with:

    $ python -m autoflake --in-place --recursive examples templates transformers utils hubconf.py setup.py

I made minor changes in the generated diff.
2019-12-22 10:59:08 +01:00
Aymeric Augustin fa2ccbc081 Fix E266 flake8 warning (x90). 2019-12-22 10:59:08 +01:00
Aymeric Augustin 2ab78325f0 Fix F821 flake8 warning (x47).
Ignore warnings related to Python 2, because it's going away soon.
2019-12-22 10:59:07 +01:00
Aymeric Augustin 631be27078 Fix E722 flake8 warnings (x26). 2019-12-22 10:59:07 +01:00
Aymeric Augustin b0f7db73cd Fix E741 flake8 warning (x14). 2019-12-22 10:59:07 +01:00
Aymeric Augustin fd2f17a7a1 Fix E714 flake8 warning (x8). 2019-12-22 10:59:07 +01:00
Aymeric Augustin 5eab3cf6bc Fix W605 flake8 warning (x5). 2019-12-22 10:59:07 +01:00
Aymeric Augustin 7dce8dc7ac Fix E731 flake8 warning (x3). 2019-12-22 10:59:07 +01:00
Aymeric Augustin 357db7098c Fix E712 flake8 warning (x1). 2019-12-22 10:59:07 +01:00
Aymeric Augustin f9c5317db2 Fix E265 flake8 warning (x1). 2019-12-22 10:59:07 +01:00
Aymeric Augustin 28e608a2c2 Remove trailing whitespace from all Python files.
Fixes flake8 warning W291 (x224).
2019-12-22 10:59:07 +01:00
Aymeric Augustin 158e82e061 Sort imports with isort.
This is the result of:

    $ isort --recursive examples templates transformers utils hubconf.py setup.py
2019-12-22 10:57:46 +01:00
Aymeric Augustin fa84ae26d6 Reformat source code with black.
This is the result of:

    $ black --line-length 119 examples templates transformers utils hubconf.py setup.py

There's a lot of fairly long lines in the project. As a consequence, I'm
picking the longest widely accepted line length, 119 characters.

This is also Thomas' preference, because it allows for explicit variable
names, to make the code easier to understand.
2019-12-21 17:52:29 +01:00
Thomas Wolf 73f6e9817c
Merge pull request #2115 from suvrat96/add_mmbt_model
[WIP] Add MMBT Model to Transformers Repo
2019-12-21 15:26:08 +01:00
thomwolf 344126fe58 move example to mm-imdb folder 2019-12-21 15:06:52 +01:00
Thomas Wolf 5b7fb6a4a1
Merge pull request #2134 from bkkaggle/saving-and-resuming
closes #1960 Add saving and resuming functionality for remaining examples
2019-12-21 15:03:53 +01:00
Thomas Wolf 6f68d559ab
Merge pull request #2130 from huggingface/ignored-index-coherence
[BREAKING CHANGE] Setting all ignored index to the PyTorch standard
2019-12-21 14:55:40 +01:00
thomwolf 1ab25c49d3 Merge branch 'master' into pr/2115 2019-12-21 14:54:30 +01:00
thomwolf b03872aae0 fix merge 2019-12-21 14:49:54 +01:00
Thomas Wolf 518ba748e0
Merge branch 'master' into saving-and-resuming 2019-12-21 14:41:39 +01:00
Thomas Wolf 18601c3b6e
Merge pull request #2173 from erenup/master
run_squad with roberta
2019-12-21 14:33:16 +01:00
Thomas Wolf eeb70cdd77
Merge branch 'master' into saving-and-resuming 2019-12-21 14:29:59 +01:00
Thomas Wolf ed9b84816e
Merge pull request #1840 from huggingface/generation_sampler
[WIP] Sampling sequence generator for transformers
2019-12-21 14:27:35 +01:00
thomwolf cfa0380515 Merge branch 'master' into generation_sampler 2019-12-21 14:12:52 +01:00
thomwolf 300ec3003c fixing run_generation example - using torch.no_grad 2019-12-21 14:02:19 +01:00
thomwolf 1c37746892 fixing run_generation 2019-12-21 13:52:49 +01:00
thomwolf 8a2be93b4e fix merge 2019-12-21 13:31:28 +01:00
Thomas Wolf 562f864038
Merge branch 'master' into fix-xlnet-squad2.0 2019-12-21 12:48:10 +01:00
Thomas Wolf 59941c5d1f
Merge pull request #2189 from stefan-it/xlmr
Add support for XLM-RoBERTa
2019-12-20 13:26:38 +01:00
Julien Chaumond a5a06a851e [doc] Param name consistency 2019-12-19 16:24:20 -05:00
Aidan Kierans 1718fb9e74 Minor/basic text fixes (#2229)
* Small clarification

Matches line 431 to line 435 for additional clarity and consistency.

* Fixed minor typo

The letter "s" was previously omitted from the word "docstrings".
2019-12-19 16:23:18 -05:00
Francesco 62c1fc3c1e Removed duplicate XLMConfig, XLMForQuestionAnswering and XLMTokenizer from import statement of run_squad.py script 2019-12-19 09:50:56 -05:00
Ejar 284572efc0 Updated typo on the link
Updated documentation due to typo
2019-12-19 09:36:43 -05:00
Stefan Schweter a26ce4dee1 examples: add XLM-RoBERTa to glue script 2019-12-19 02:23:01 +01:00
thomwolf 3d2096f516 further cleanup 2019-12-18 11:50:54 +01:00
thomwolf 83bc5235cf Merge branch 'master' into pr/2189 2019-12-17 11:47:32 +01:00
Thomas Wolf f061606277
Merge pull request #2164 from huggingface/cleanup-configs
[SMALL BREAKING CHANGE] Cleaning up configuration classes - Adding Model Cards
2019-12-17 09:10:16 +01:00
Lysandre 18a879f475 fix #2180 2019-12-16 16:44:29 -05:00
Lysandre d803409215 Fix run squad evaluate during training 2019-12-16 16:31:38 -05:00
Stefan Schweter 71b4750517 examples: add support for XLM-RoBERTa to run_ner script 2019-12-16 16:37:27 +01:00
thomwolf dc667ce1a7 double check cc @LysandreJik 2019-12-14 09:56:27 +01:00