Commit Graph

5202 Commits

Author SHA1 Message Date
Stas Bekman 51c4adf54c
[model cards] fix dataset yaml (#7216) 2020-09-17 15:29:39 -04:00
Sam Shleifer a5638b2b3a
[s2s] dynamic batch size with --max_tokens_per_batch (#7030) 2020-09-17 15:19:34 -04:00
Stas Bekman efeab6a3f1
[s2s] run_eval/run_eval_search tweaks (#7192)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-17 14:26:38 -04:00
Stas Bekman 9c5bcab5b0
[model cards] fix yaml in cards (#7207) 2020-09-17 14:11:17 -04:00
Sohee Yang e643a29722
Change to use relative imports in some files & Add python prompt symbols to example codes (#7202)
* Move 'from transformers' statements to relative imports in some files

* Add python prompt symbols in front of the example codes

* Reformat the code

* Add one missing space

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-17 12:30:45 -04:00
Stas Bekman 0fe6e435b6
[model cards] ported allenai Deep Encoder, Shallow Decoder models (#7153)
* [model cards] ported allenai Deep Encoder, Shallow Decoder models

* typo

* fix references

* add allenai/wmt19-de-en-6-6 model cards

* fill-in the missing info for the build script as provided by the searcher.
2020-09-17 17:58:49 +02:00
Stas Bekman 1eeb206bef
[ported model] FSMT (FairSeq MachineTranslation) (#6940)
* ready for PR

* cleanup

* correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST

* fix

* perfectionism

* revert change from another PR

* odd, already committed this one

* non-interactive upload workaround

* backup the failed experiment

* store langs in config

* workaround for localizing model path

* doc clean up as in https://github.com/huggingface/transformers/pull/6956

* style

* back out debug mode

* document: run_eval.py --num_beams 10

* remove unneeded constant

* typo

* re-use bart's Attention

* re-use EncoderLayer, DecoderLayer from bart

* refactor

* send to cuda and fp16

* cleanup

* revert (moved to another PR)

* better error message

* document run_eval --num_beams

* solve the problem of tokenizer finding the right files when model is local

* polish, remove hardcoded config

* add a note that the file is autogenerated to avoid losing changes

* prep for org change, remove unneeded code

* switch to model4.pt, update scores

* s/python/bash/

* missing init (but doesn't impact the finetuned model)

* cleanup

* major refactor (reuse-bart)

* new model, new expected weights

* cleanup

* cleanup

* full link

* fix model type

* merge porting notes

* style

* cleanup

* have to create a DecoderConfig object to handle vocab_size properly

* doc fix

* add note (not a public class)

* parametrize

* - add bleu scores integration tests

* skip test if sacrebleu is not installed

* cache heavy models/tokenizers

* some tweaks

* remove tokens that aren't used

* more purging

* simplify code

* switch to using decoder_start_token_id

* add doc

* Revert "major refactor (reuse-bart)"

This reverts commit 226dad15ca.

* decouple from bart

* remove unused code #1

* remove unused code #2

* remove unused code #3

* update instructions

* clean up

* move bleu eval to examples

* check import only once

* move data+gen script into files

* reuse via import

* take less space

* add prepare_seq2seq_batch (auto-tested)

* cleanup

* recode test to use json instead of yaml

* ignore keys not needed

* use the new -y in transformers-cli upload -y

* [xlm tok] config dict: fix str into int to match definition (#7034)

* [s2s] --eval_max_generate_length (#7018)

* Fix CI with change of name of nlp (#7054)

* nlp -> datasets

* More nlp -> datasets

* Woopsie

* More nlp -> datasets

* One last

* extending to support allen_nlp wmt models

- allow a specific checkpoint file to be passed
- more arg settings
- scripts for allen_nlp models

* sync with changes

* s/fsmt-wmt/wmt/ in model names

* s/fsmt-wmt/wmt/ in model names (p2)

* s/fsmt-wmt/wmt/ in model names (p3)

* switch to a better checkpoint

* typo

* make non-optional args such - adjust tests where possible or skip when there is no other choice

* consistency

* style

* adjust header

* cards moved (model rename)

* use best custom hparams

* update info

* remove old cards

* cleanup

* s/stas/facebook/

* update scores

* s/allen_nlp/allenai/

* url maps aren't needed

* typo

* move all the doc / build /eval generators to their own scripts

* cleanup

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* fix indent

* duplicated line

* style

* use the correct add_start_docstrings

* oops

* resizing can't be done with the core approach, due to 2 dicts

* check that the arg is a list

* style

* style

Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2020-09-17 11:31:29 -04:00
Sylvain Gugger 492bb6aa48
Trainer multi label (#7191)
* Trainer accep multiple labels

* Missing import

* Fix dosctrings
2020-09-17 08:15:37 -04:00
RafaelWO 709745927b
Transformer-XL: Remove unused parameters (#7087)
* Removed 'tgt_len' and 'ext_len' from Transfomer-XL

 * Some changes are still to be done

* Removed 'tgt_len' and 'ext_len' from Transfomer-XL (2)

 * Removed comments
 * Fixed quality

* Changed warning to info
2020-09-17 06:10:34 -04:00
Dhaval Taunk c183d81e27
added multilabel text classification notebook using distilbert to community notebooks (#7201)
* added multilabel classification using distilbert notebook to community notebooks

* added multilabel classification using distilbert notebook to community notebooks
2020-09-17 05:58:57 -04:00
Stas Bekman 79111b77d2
remove deprecated flag (#7171)
```
/home/circleci/.local/lib/python3.6/site-packages/isort/main.py:915: UserWarning: W0501: The following deprecated CLI flags were used and ignored: --recursive!
  "W0501: The following deprecated CLI flags were used and ignored: "
```
2020-09-17 05:52:12 -04:00
Stas Bekman 0cdafbf7ec
remove duplicated code (#7173) 2020-09-17 05:51:40 -04:00
Sam Shleifer 45b0b1ff2f
[s2s] fix kwarg typo (#7196) 2020-09-16 21:58:57 -04:00
Sam Shleifer 0203ad43bc
[s2s] distributed eval cleanup (#7186) 2020-09-16 15:38:37 -04:00
sgugger 3babef815c Formatting 2020-09-16 14:57:09 -04:00
Stas Bekman 42049b8e12
use the correct add_start_docstrings (#7174) 2020-09-16 14:40:35 -04:00
Stas Bekman fdaf8ab349
[s2s run_eval] new features (#7109)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
2020-09-16 13:59:57 -04:00
Antoine Louis df165065c3
[model_cards] antoiloui/belgpt2 🇧🇪 (#7166)
* Create README.md

* Update README.md
2020-09-16 12:16:01 -04:00
Sylvain Gugger 108c9aefcc
Update README (#7133)
* Rewrite and update README

* Typo and migration guide

* Apply suggestions from code review

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>

* Address Clem's comments

Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>
2020-09-16 12:12:12 -04:00
Donna Choi 9e376e156a
Add condition (#7161) 2020-09-16 09:15:10 -04:00
Stas Bekman f8590c56e6
[doc] improve/expand the Parametrization section (#7156) 2020-09-16 08:45:50 -04:00
Stas Bekman d3391c87fe
build/eval/gen-card scripts for fsmt (#7155)
* build/eval/gen-card scripts for fsmt

* adjust for model renames
2020-09-16 08:41:26 -04:00
Xi Ye 08bfc1718a
fix the warning message of overflowed sequence (#7151) 2020-09-16 07:40:57 -04:00
Julien Plu af8425b749
Refactoring the TF activations functions (#7150)
* Refactoring the activations functions into a common file

* Apply style

* remove unused import

* fix tests

* Fix tests.
2020-09-16 07:03:47 -04:00
Stas Bekman b00cafbde5
[docs] add testing documentation (#7101)
* [docs] add testing documentation

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* tweaks as suggested

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* tweaks

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docs/source/testing.rst

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* more tweaks

* suggestions from @LysandreJik

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2020-09-15 19:25:25 -04:00
Patrick von Platen 85ffda96fc
fix encoder decoder kwargs (#7131) 2020-09-15 21:10:07 +02:00
Yih-Dar 4c62c6021a
fix ZeroDivisionError and epoch counting (#7125)
* fix ZeroDivisionError and epoch counting

* Add test for num_train_epochs calculation in trainer.py

* Remove @require_non_multigpu for test_num_train_epochs_in_training
2020-09-15 11:51:50 -04:00
Patrick von Platen 7af2791d77
Create README.md 2020-09-15 16:47:36 +02:00
Sylvain Gugger 153ec2f154
Funnel model cards (#7147) 2020-09-15 10:40:57 -04:00
Sylvain Gugger 7186ca6240
Multi predictions trainer (#7126)
* Allow multiple outputs

* Formatting

* Move the unwrapping before metrics

* Fix typo

* Add test for non-supported config options
2020-09-15 10:27:24 -04:00
Pedro Lima 52d250f6aa
[model_cards] pvl/labse_bert model card
From **Language-Agnostic BERT Sentence Embedding**

https://ai.googleblog.com/2020/08/language-agnostic-bert-sentence.html
2020-09-15 08:54:12 -04:00
tuner007 84d64805b0
Create README.md (#7097)
Model card for PEGASUS finetuned for paraphrasing task
2020-09-15 08:48:25 -04:00
Philip May 52bb7ccce5
German electra model card v3 update (#7089)
* changed eval table model order

* Update install

* update mc
2020-09-15 08:48:13 -04:00
Siddharth Jain 1a85299a5e
Tiny typo fix (#7143) 2020-09-15 08:18:42 -04:00
Paul O'Leary McCann e29c3f1b11
Add quotes to paths in MeCab arguments (#7142)
Without quotes directories with spaces in them will fail to be processed
correctly.
2020-09-15 19:04:50 +08:00
Yih-Dar cb061e78e1
Fix TF Trainer loss calculation (#6998)
* create branch for issue #6968

* First attempt to fix incorrect tf trainer loss calculation

* Fix training loss in metric

* fix tf trainer evaluation loss

* apply count_instances_in_batch() for eval and test datasets

* prototype of using a new argument in trainer_tf.py to fix loss issue

* some renaming and fix, in particular for evaluation methods

* fix bugs to have a running version

* change to @staticmethod

* apply style
2020-09-15 05:41:00 -04:00
Stas Bekman b0cbcdb05b
[logging] remove no longer needed verbosity override (#7100) 2020-09-15 04:01:14 -04:00
Sylvain Gugger 2bf70e2150
Fix reproducible tests in Trainer (#7119)
* Fix reproducible tests in Trainer

* Deal with multiple GPUs
2020-09-15 03:32:44 -04:00
Sam Shleifer 9e89390ce1
[QOL] add signature for prepare_seq2seq_batch (#7108) 2020-09-14 20:33:08 -04:00
Sam Shleifer 33d479d2b2
[s2s] distributed eval in one command (#7124) 2020-09-14 15:57:56 -04:00
sgugger 206b78d485 Pin version of TF and torch 2020-09-14 14:08:51 -04:00
Kevin Canwen Xu 90cde2e938
Add Mirror Option for Downloads (#6679)
* Add Tuna Mirror for Downloads from China

* format fix

* Use preset instead of hardcoding URL

* Fix

* make style

* update the mirror option doc

* update the mirror
2020-09-14 23:50:22 +08:00
Antonio V Mendoza e0e0675ac7
Demoing LXMERT with raw images by incorporating the FRCNN model for roi-pooled extraction and bounding-box predction on the GQA answer set. (#6986)
* adding demo

* Update examples/lxmert/requirements.txt

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* Update examples/lxmert/checkpoint.sh

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>

* added user input for .py demo

* updated model loading, data extrtaction, checkpoints, and lots of other automation

* adding normalizing for bounding boxes

* Update requirements.txt

* some optimizations for extracting data

* added data extracting file

* added data extraction file

* minor fixes to reqs and readme

* Style

* remove options

Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2020-09-14 10:07:04 -04:00
sgugger 5636cbb25d Extra ) 2020-09-14 09:37:55 -04:00
Sylvain Gugger ccc8e30c8a
Clean up autoclass doc (#7081) 2020-09-14 09:26:41 -04:00
Stas Bekman 3ca1874ca4
[examples testing] restore code (#7099)
For some reason https://github.com/huggingface/transformers/pull/5512 re-added temp dir creation code that was removed by
https://github.com/huggingface/transformers/pull/6494 defeating the purpose of that PR for those tests.
2020-09-14 08:54:23 -04:00
Stas Bekman 4d39148419
fix deprecation warnings (#7033)
* fix deprecation warnings

* remove tests/test_tokenization_common.py's test_padding_to_max_length

* revert test_padding_to_max_length
2020-09-14 07:51:19 -04:00
Stas Bekman 576eec98e0
ignore FutureWarning in tests (#7079) 2020-09-14 07:50:51 -04:00
Bartosz Telenczuk 15d18e0307
fix link to paper (#7116) 2020-09-14 07:43:40 -04:00
Lysandre Debut bb3106f741
Temporarily skip failing tests due to dependency change (#7118)
* Temporarily skip failing tests due to dependency change

* Remove trace
2020-09-14 07:42:13 -04:00