Commit Graph

3673 Commits

Author SHA1 Message Date
ktrapeznikov ac40eed1a5 Create README.md
adding readme for 
ktrapeznikov/albert-xlarge-v2-squad-v2
2020-04-04 15:18:54 -04:00
ktrapeznikov fd9995ebc5 Create README.md 2020-04-04 15:18:31 -04:00
Julien Chaumond 5d912e7ed4 Tweak typing for #3566 2020-04-04 15:04:03 -04:00
Julien Chaumond 94eb68d742 weigths → weights 2020-04-04 15:03:26 -04:00
Manuel Romero 243e687be6 Create model card 2020-04-04 08:20:34 -04:00
Julien Chaumond 3e4b4dd190 [model_cards] Link to ExBERT visualisation
Hat/tip @bhoov @HendrikStrobelt @sebastianGehrmann

Also cc @srush and @thomwolf
2020-04-03 20:03:29 -04:00
Max Ryabinin c6acd246ec
Speed up GELU computation with torch.jit (#2988)
* Compile gelu_new with torchscript

* Compile _gelu_python with torchscript

* Wrap gelu_new with torch.jit for torch>=1.4
2020-04-03 15:20:21 -04:00
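The commit above compiles the activation with TorchScript. A minimal sketch of the idea, using the standard GPT-2 tanh approximation of GELU ("gelu_new") and `torch.jit.script` to let PyTorch fuse the elementwise ops:

```python
import math

import torch


def gelu_new(x):
    # The tanh approximation of GELU used by GPT-2 ("gelu_new").
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * torch.pow(x, 3.0))))


# Scripting compiles the function so the elementwise ops can be fused
# into fewer kernels -- the speedup the commit refers to.
gelu_new_scripted = torch.jit.script(gelu_new)

x = torch.randn(8)
out = gelu_new_scripted(x)
```

The scripted function is a drop-in replacement: it takes and returns the same tensors as the eager version.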
Lysandre Debut d5d7d88612
ELECTRA (#3257)
* Electra wip

* helpers

* Electra wip

* Electra v1

* ELECTRA may be saved/loaded

* Generator & Discriminator

* Embedding size instead of halving the hidden size

* ELECTRA Tokenizer

* Revert BERT helpers

* ELECTRA Conversion script

* Archive maps

* PyTorch tests

* Start fixing tests

* Tests pass

* Same configuration for both models

* Compatible with base + large

* Simplification + weight tying

* Archives

* Auto + Renaming to standard names

* ELECTRA is uncased

* Tests

* Slight API changes

* Update tests

* wip

* ElectraForTokenClassification

* temp

* Simpler arch + tests

Removed ElectraForPreTraining which will be in a script

* Conversion script

* Auto model

* Update links to S3

* Split ElectraForPreTraining and ElectraForTokenClassification

* Actually test PreTraining model

* Remove num_labels from configuration

* wip

* wip

* From discriminator and generator to electra

* Slight API changes

* Better naming

* TensorFlow ELECTRA tests

* Accurate conversion script

* Added to conversion script

* Fast ELECTRA tokenizer

* Style

* Add ELECTRA to README

* Modeling Pytorch Doc + Real style

* TF Docs

* Docs

* Correct links

* Correct model initialized

* random fixes

* style

* Addressing Patrick's and Sam's comments

* Correct links in docs
2020-04-03 14:10:54 -04:00
Yohei Tamura 8594dd80dd
BertJapaneseTokenizer accept options for mecab (#3566)
* BertJapaneseTokenizer accept options for mecab

* black

* fix mecab_option to Optional[str]
2020-04-03 11:12:19 -04:00
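The PR above threads a raw option string through to the underlying MeCab tagger. A hypothetical sketch of that constructor pattern (`MecabTokenizerSketch` is illustrative, not the transformers class):

```python
from typing import Optional


class MecabTokenizerSketch:
    """Illustrative sketch: accept an optional raw option string and
    pass it on to MeCab, as the PR does for BertJapaneseTokenizer."""

    def __init__(self, mecab_option: Optional[str] = None):
        # e.g. "-d /path/to/dict" to point MeCab at a custom dictionary
        self.mecab_option = mecab_option or ""


tok = MecabTokenizerSketch(mecab_option="-r /dev/null")
```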
HUSEIN ZOLKEPLI 216e167ce6
Added albert-base-bahasa-cased README and fixed tiny-bert-bahasa-cased README (#3613)
* add bert bahasa readme

* update readme

* update readme

* added xlnet

* added tiny-bert and fix xlnet readme

* added albert base
2020-04-03 09:28:43 -04:00
ahotrod 1ac6a246d8
Update README.md (#3604)
Update AutoModel & AutoTokenizer loading.
2020-04-03 09:28:25 -04:00
ahotrod e91692f4a3
Update README.md (#3603) 2020-04-03 09:27:57 -04:00
HenrykBorzymowski 8e287d507d
corrected mistake in polish model cards (#3611)
* added model_cards for polish squad models

* corrected mistake in polish model cards

Co-authored-by: Henryk Borzymowski <henryk.borzymowski@pwc.com>
2020-04-03 09:07:15 -04:00
redewiedergabe 81484b447b
Create README.md (#3568)
* Create README.md

* added meta block (language: german)

* Added additional information about test data
2020-04-02 21:48:31 -04:00
ahotrod 9f6349aba9 Create README.md 2020-04-02 21:43:12 -04:00
Henryk Borzymowski ddb1ce7418 added model_cards for polish squad models 2020-04-02 21:40:16 -04:00
Patrick von Platen f68d22850c
delete bogus print statement (#3595) 2020-04-02 21:49:34 +02:00
Nicolas c50aa67bff
Resizing embedding matrix before sending it to the optimizer. (#3532)
* Resizing the embedding matrix after sending it to the optimizer prevents the newly resized matrix from being updated.

* Remove space for style matter
2020-04-02 15:00:05 -04:00
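The order matters because the optimizer holds direct references to the parameter tensors it was given. A minimal PyTorch sketch of the fix (a bare `nn.Embedding` stands in for a model's token embeddings):

```python
import torch
from torch import nn

embeddings = nn.Embedding(10, 4)  # stand-in for a model's token embeddings

# Resize FIRST (e.g. after adding new tokens to the tokenizer)...
resized = nn.Embedding(12, 4)
resized.weight.data[:10] = embeddings.weight.data
embeddings = resized

# ...and only THEN build the optimizer, so it references the new weight
# tensor. Done the other way round, the optimizer keeps updating the
# old, replaced tensor and the new rows never train.
optimizer = torch.optim.SGD(embeddings.parameters(), lr=0.1)
```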
Mark Kockerbeck 1b10159950
Adding should_continue check for retraining (#3509) 2020-04-02 14:07:08 -04:00
Patrick von Platen 390c128592
[Encoder-Decoder] Force models outputs to always have batch_size as their first dim (#3536)
* solve conflicts

* improve comments
2020-04-02 15:18:33 +02:00
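The PR above standardizes on the batch-first convention. A minimal sketch of normalizing a time-major tensor, the shape some decoders emit, to batch-first:

```python
import torch

# A decoder emitting time-major outputs of shape (seq_len, batch, hidden)...
time_major = torch.randn(7, 3, 16)

# ...can be converted to the batch-first layout (batch, seq_len, hidden):
batch_first = time_major.transpose(0, 1)
```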
Patrick von Platen ab5d06a094
[T5, examples] replace heavy t5 models with tiny random models (#3556)
* replace heavy t5 models with tiny random models as was done by sshleifer

* fix isort
2020-04-02 12:34:05 +02:00
Patrick von Platen a4ee4da18a
[T5, TF 2.2] change tf t5 argument naming (#3547)
* change tf t5 argument naming for TF 2.2

* correct bug in testing
2020-04-01 22:04:20 +02:00
Patrick von Platen 06dd597552
fix bug in warnings T5 pipelines (#3545) 2020-04-01 21:59:12 +02:00
Anirudh Srinivasan 9de9ceb6c5
Correct output shape for Bert NSP models in docs (#3482) 2020-04-01 15:04:38 -04:00
Patrick von Platen b815edf69f
[T5, Tests] Add extensive hard-coded integration tests and make sure PT and TF give equal results (#3550)
* add some t5 integration tests

* finish summarization and translation integration tests for T5 - results look good

* add tf test

* fix == vs is bug

* fix tf beam search error and make tf t5 tests pass
2020-04-01 18:01:33 +02:00
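The "== vs is" bug fixed above is a common Python pitfall; a minimal illustration of the difference:

```python
# `==` compares values; `is` compares object identity.
a = [1, 2]
b = [1, 2]
equal = a == b        # True: same contents
identical = a is b    # False: distinct objects

# Identity is only the right test for singletons such as None:
x = None
none_check = x is None
```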
HUSEIN ZOLKEPLI 8538ce9044
Add tiny-bert-bahasa-cased model card (#3567)
* add bert bahasa readme

* update readme

* update readme

* added xlnet

* added tiny-bert and fix xlnet readme
2020-04-01 07:15:00 -04:00
Manuel Romero c1a6252be1
Create model card (#3557)
Create model card for: distilbert-multi-finetuned-for-xqua-on-tydiqa
2020-04-01 07:14:23 -04:00
Julien Chaumond 50e15c825c
Tokenizers: Start cleaning examples a little (#3455)
* Start cleaning examples

* Fixup
2020-04-01 07:13:40 -04:00
Patrick von Platen b38d552a92
[Generate] Add bad words list argument to the generate function (#3367)
* add bad words list

* make style

* add bad_words_tokens

* make style

* better naming

* make style

* fix typo
2020-03-31 18:42:31 +02:00
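The mechanism behind a bad-words argument is to push banned tokens' scores to negative infinity before sampling. A minimal single-token sketch of that idea (the real implementation also bans multi-token words by matching them against the generated prefix):

```python
NEG_INF = float("-inf")


def ban_single_token_words(scores, bad_words_ids):
    """Set the score of every banned single-token word to -inf so it
    can never be chosen. Sketch only: multi-token bans are omitted."""
    scores = list(scores)
    for word in bad_words_ids:
        if len(word) == 1:
            scores[word[0]] = NEG_INF
    return scores


scores = [0.1, 0.9, 0.3, 0.7]
filtered = ban_single_token_words(scores, bad_words_ids=[[1]])
```

Token 1 had the highest score but is banned, so greedy selection now falls through to token 3.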
Patrick von Platen ae6834e028
[Examples] Clean summarization and translation example testing files for T5 and Bart (#3514)
* fix conflicts

* add model size argument to summarization

* correct wrong import

* fix isort

* correct imports

* other isort make style

* make style
2020-03-31 17:54:13 +02:00
Manuel Romero 0373b60c4c
Update README.md (#3552)
- Show that the last uploaded version was trained on more data (custom_license files)
2020-03-31 10:40:34 -04:00
Patrick von Platen 83d1fbcff6
[Docs] Add usage examples for translation and summarization (#3538) 2020-03-31 09:36:03 -04:00
Patrick von Platen 55bcae7f25
remove useless and confusing lm_labels line (#3531) 2020-03-31 09:32:25 -04:00
Patrick von Platen 42e1e3c67f
Update usage doc regarding generate fn (#3504) 2020-03-31 09:31:46 -04:00
Patrick von Platen 57b0fab692
Add better explanation to check `docs` locally. (#3459) 2020-03-31 09:30:17 -04:00
Manuel Romero a8d4dff0a1
Update README.md (#3470)
Fix typo
2020-03-31 08:01:09 -04:00
Manuel Romero 4a5663568f
Create card for the model: GPT-2-finetuned-covid-bio-medrxiv (#3453) 2020-03-31 08:01:03 -04:00
Branden Chan bbedb59675
Create README.md (#3393)
* Create README.md

* Update README.md
2020-03-31 08:00:35 -04:00
Manuel Romero c2cf192943
Add link to 16 POS tags model (#3465) 2020-03-31 08:00:00 -04:00
Gabriele Sarti c82ef72158
Added CovidBERT-NLI model card (#3477) 2020-03-31 07:59:49 -04:00
Manuel Romero b48a1f08c1
Add text shown in example of usage (#3464) 2020-03-31 07:59:36 -04:00
Manuel Romero 99833a9cbf
Create model card (#3487) 2020-03-31 07:59:22 -04:00
Sho Arora ebceeeacda
Add electra and alectra model cards (#3524) 2020-03-31 07:58:48 -04:00
Leandro von Werra a6c4ee27fd
Add model cards (#3537)
* feat: add model card bert-imdb

* feat: add model card gpt2-imdb-pos

* feat: add model card gpt2-imdb
2020-03-31 07:54:45 -04:00
Ethan Perez e5c393dceb
[Bug fix] Using loaded checkpoint with --do_predict (instead of… (#3437)
* Using loaded checkpoint with --do_predict

Without this fix, I'm getting near-random validation performance for a trained model, and the validation performance differs per validation run. I think this happens since the `model` variable isn't set with the loaded checkpoint, so I'm using a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (but they don't with this fix).

* Update checkpoint loading

* Fixing model loading
2020-03-30 17:06:08 -04:00
Sam Shleifer 8deff3acf2
[bart-tiny-random] Put a 5MB model on S3 to allow faster exampl… (#3488) 2020-03-30 12:28:27 -04:00
dougian 1f72865726
[BART] Update encoder and decoder on set_input_embedding (#3501)
Co-authored-by: Ioannis Douratsos <ioannisd@amazon.com>
2020-03-30 12:20:37 -04:00
Julien Chaumond cc598b312b [InputExample] Unfreeze for now, cf. #3423 2020-03-30 10:41:49 -04:00
Julien Plu d38bbb225f
Update the NER TF script (#3511)
* Update the NER TF script to remove the softmax and set the pad token label id to -1

* Reformat the quality and style

Co-authored-by: Julien Plu <julien.plu@adevinta.com>
2020-03-30 09:50:12 -04:00
LysandreJik eff757f2e3 Re-pin isort version 2020-03-30 09:00:47 -04:00