ktrapeznikov
ac40eed1a5
Create README.md
...
adding readme for ktrapeznikov/albert-xlarge-v2-squad-v2
2020-04-04 15:18:54 -04:00
ktrapeznikov
fd9995ebc5
Create README.md
2020-04-04 15:18:31 -04:00
Julien Chaumond
5d912e7ed4
Tweak typing for #3566
2020-04-04 15:04:03 -04:00
Julien Chaumond
94eb68d742
weigths*weights
2020-04-04 15:03:26 -04:00
Manuel Romero
243e687be6
Create model card
2020-04-04 08:20:34 -04:00
Julien Chaumond
3e4b4dd190
[model_cards] Link to ExBERT visualisation
...
Hat/tip @bhoov @HendrikStrobelt @sebastianGehrmann
Also cc @srush and @thomwolf
2020-04-03 20:03:29 -04:00
Max Ryabinin
c6acd246ec
Speed up GELU computation with torch.jit ( #2988 )
...
* Compile gelu_new with torchscript
* Compile _gelu_python with torchscript
* Wrap gelu_new with torch.jit for torch>=1.4
2020-04-03 15:20:21 -04:00
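For readers unfamiliar with the two activations named in #2988, here is a minimal pure-Python sketch of the formulas usually associated with them: the exact erf-based GELU and its tanh approximation. It does not use torch or torch.jit, and the function names `gelu_exact` and `gelu_tanh` are illustrative assumptions, not the library's identifiers.

```python
import math


def gelu_exact(x: float) -> float:
    """Exact GELU: x * Phi(x), with Phi the standard normal CDF."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh approximation of GELU (the form commonly called "gelu_new")."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


if __name__ == "__main__":
    # Compare the two variants on a few sample inputs.
    for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(f"{x:+.1f}  exact={gelu_exact(x):+.6f}  tanh={gelu_tanh(x):+.6f}")
```

Both functions are chains of elementwise operations, which is the kind of code that a compiler such as TorchScript can typically fuse.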
Lysandre Debut
d5d7d88612
ELECTRA ( #3257 )
...
* Electra wip
* helpers
* Electra wip
* Electra v1
* ELECTRA may be saved/loaded
* Generator & Discriminator
* Embedding size instead of halving the hidden size
* ELECTRA Tokenizer
* Revert BERT helpers
* ELECTRA Conversion script
* Archive maps
* PyTorch tests
* Start fixing tests
* Tests pass
* Same configuration for both models
* Compatible with base + large
* Simplification + weight tying
* Archives
* Auto + Renaming to standard names
* ELECTRA is uncased
* Tests
* Slight API changes
* Update tests
* wip
* ElectraForTokenClassification
* temp
* Simpler arch + tests
Removed ElectraForPreTraining which will be in a script
* Conversion script
* Auto model
* Update links to S3
* Split ElectraForPreTraining and ElectraForTokenClassification
* Actually test PreTraining model
* Remove num_labels from configuration
* wip
* wip
* From discriminator and generator to electra
* Slight API changes
* Better naming
* TensorFlow ELECTRA tests
* Accurate conversion script
* Added to conversion script
* Fast ELECTRA tokenizer
* Style
* Add ELECTRA to README
* Modeling Pytorch Doc + Real style
* TF Docs
* Docs
* Correct links
* Correct model initialized
* random fixes
* style
* Addressing Patrick's and Sam's comments
* Correct links in docs
2020-04-03 14:10:54 -04:00
Yohei Tamura
8594dd80dd
BertJapaneseTokenizer accept options for mecab ( #3566 )
...
* BertJapaneseTokenizer accept options for mecab
* black
* fix mecab_option to Option[str]
2020-04-03 11:12:19 -04:00
HUSEIN ZOLKEPLI
216e167ce6
Added albert-base-bahasa-cased README and fixed tiny-bert-bahasa-cased README ( #3613 )
...
* add bert bahasa readme
* update readme
* update readme
* added xlnet
* added tiny-bert and fix xlnet readme
* added albert base
2020-04-03 09:28:43 -04:00
ahotrod
1ac6a246d8
Update README.md ( #3604 )
...
Update AutoModel & AutoTokenizer loading.
2020-04-03 09:28:25 -04:00
ahotrod
e91692f4a3
Update README.md ( #3603 )
2020-04-03 09:27:57 -04:00
HenrykBorzymowski
8e287d507d
corrected mistake in polish model cards ( #3611 )
...
* added model_cards for polish squad models
* corrected mistake in polish design cards
Co-authored-by: Henryk Borzymowski <henryk.borzymowski@pwc.com>
2020-04-03 09:07:15 -04:00
redewiedergabe
81484b447b
Create README.md ( #3568 )
...
* Create README.md
* added meta block (language: german)
* Added additional information about test data
2020-04-02 21:48:31 -04:00
ahotrod
9f6349aba9
Create README.md
2020-04-02 21:43:12 -04:00
Henryk Borzymowski
ddb1ce7418
added model_cards for polish squad models
2020-04-02 21:40:16 -04:00
Patrick von Platen
f68d22850c
delete bogus print statement ( #3595 )
2020-04-02 21:49:34 +02:00
Nicolas
c50aa67bff
Resizing embedding matrix before sending it to the optimizer. ( #3532 )
...
* Resizing the embedding matrix after sending it to the optimizer prevents the optimizer from updating the newly resized matrix.
* Remove space for style matter
2020-04-02 15:00:05 -04:00
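The ordering issue described in #3532 can be illustrated with a toy sketch. Everything here (`ToyModel`, `ToyOptimizer`, the dict-based parameters) is hypothetical and stands in for a real model and optimizer. The only assumption carried over from the commit message is that resizing creates a new parameter object, so an optimizer built earlier keeps updating the old one.

```python
class ToyOptimizer:
    """Holds references to the parameter objects passed at construction."""

    def __init__(self, params):
        self.params = list(params)

    def step(self, lr=0.1):
        # Plain gradient descent on every tracked parameter.
        for p in self.params:
            p["value"] = [v - lr * g for v, g in zip(p["value"], p["grad"])]


class ToyModel:
    def __init__(self, vocab_size):
        self.embedding = {"value": [1.0] * vocab_size, "grad": [1.0] * vocab_size}

    def parameters(self):
        return [self.embedding]

    def resize_embeddings(self, new_size):
        # Resizing allocates a NEW parameter object (assumption of this sketch).
        old = self.embedding["value"]
        padded = old + [1.0] * (new_size - len(old))
        self.embedding = {"value": padded, "grad": [1.0] * new_size}


def resize_after_optimizer():
    model = ToyModel(3)
    optimizer = ToyOptimizer(model.parameters())  # tracks the old matrix
    model.resize_embeddings(5)
    optimizer.step()
    return model.embedding["value"]  # the new matrix was never updated


def resize_before_optimizer():
    model = ToyModel(3)
    model.resize_embeddings(5)
    optimizer = ToyOptimizer(model.parameters())  # tracks the resized matrix
    optimizer.step()
    return model.embedding["value"]
```

In the first function the optimizer silently trains a matrix the model no longer uses. In the second, every entry of the resized matrix receives the update.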
Mark Kockerbeck
1b10159950
Adding should_continue check for retraining ( #3509 )
2020-04-02 14:07:08 -04:00
Patrick von Platen
390c128592
[Encoder-Decoder] Force models outputs to always have batch_size as their first dim ( #3536 )
...
* solve conflicts
* improve comments
2020-04-02 15:18:33 +02:00
Patrick von Platen
ab5d06a094
[T5, examples] replace heavy t5 models with tiny random models ( #3556 )
...
* replace heavy t5 models with tiny random models as was done by sshleifer
* fix isort
2020-04-02 12:34:05 +02:00
Patrick von Platen
a4ee4da18a
[T5, TF 2.2] change tf t5 argument naming ( #3547 )
...
* change tf t5 argument naming for TF 2.2
* correct bug in testing
2020-04-01 22:04:20 +02:00
Patrick von Platen
06dd597552
fix bug in warnings T5 pipelines ( #3545 )
2020-04-01 21:59:12 +02:00
Anirudh Srinivasan
9de9ceb6c5
Correct output shape for Bert NSP models in docs ( #3482 )
2020-04-01 15:04:38 -04:00
Patrick von Platen
b815edf69f
[T5, Tests] Add extensive hard-coded integration tests and make sure PT and TF give equal results ( #3550 )
...
* add some t5 integration tests
* finish summarization and translation integration tests for T5 - results look good
* add tf test
* fix == vs is bug
* fix tf beam search error and make tf t5 tests pass
2020-04-01 18:01:33 +02:00
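The "== vs is" item in #3550 refers to a common Python pitfall. The commit message does not say which comparison was affected, so the sketch below is only a generic illustration: `==` compares values, while `is` compares object identity.

```python
def compare(a, b):
    """Return (equal by value, same object) for two Python objects."""
    return a == b, a is b


original = [1, 2, 3]
copy = list(original)   # same contents, different object
alias = original        # same object under another name

print(compare(original, copy))   # value-equal, but not identical
print(compare(original, alias))  # value-equal and identical
```

Using `is` where `==` is intended can appear to work for small cached values and then fail for others, which makes such bugs easy to miss in tests.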
HUSEIN ZOLKEPLI
8538ce9044
Add tiny-bert-bahasa-cased model card ( #3567 )
...
* add bert bahasa readme
* update readme
* update readme
* added xlnet
* added tiny-bert and fix xlnet readme
2020-04-01 07:15:00 -04:00
Manuel Romero
c1a6252be1
Create model card ( #3557 )
...
Create model card for: distilbert-multi-finetuned-for-xqua-on-tydiqa
2020-04-01 07:14:23 -04:00
Julien Chaumond
50e15c825c
Tokenizers: Start cleaning examples a little ( #3455 )
...
* Start cleaning examples
* Fixup
2020-04-01 07:13:40 -04:00
Patrick von Platen
b38d552a92
[Generate] Add bad words list argument to the generate function ( #3367 )
...
* add bad words list
* make style
* add bad_words_tokens
* make style
* better naming
* make style
* fix typo
2020-03-31 18:42:31 +02:00
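One way a bad-words constraint such as the one added in #3367 can work is to mask banned tokens before the next token is chosen. The sketch below is a simplified illustration under that assumption, not the library's implementation. The helper names `banned_next_tokens` and `mask_scores` are hypothetical, and token ids are plain integers.

```python
import math


def banned_next_tokens(prev_tokens, bad_words):
    """Return token ids that would complete one of the banned sequences.

    bad_words is a list of token-id lists. A single-token entry is always
    banned. A multi-token entry bans its last token only when the tokens
    generated so far end with the rest of the sequence.
    """
    banned = []
    for word in bad_words:
        prefix, last = word[:-1], word[-1]
        if len(prefix) == 0 or prev_tokens[-len(prefix):] == prefix:
            banned.append(last)
    return banned


def mask_scores(scores, banned):
    """Set the score of every banned token id to -inf so it is never picked."""
    return [-math.inf if i in banned else s for i, s in enumerate(scores)]


if __name__ == "__main__":
    generated = [5, 7]
    bad = [[7, 9], [3], [1, 2]]
    banned = banned_next_tokens(generated, bad)
    print(banned)
    print(mask_scores([0.1] * 10, banned))
```

Masking scores rather than filtering the output afterwards keeps the constraint compatible with both greedy decoding and sampling, since a banned token then has zero probability.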
Patrick von Platen
ae6834e028
[Examples] Clean summarization and translation example testing files for T5 and Bart ( #3514 )
...
* fix conflicts
* add model size argument to summarization
* correct wrong import
* fix isort
* correct imports
* other isort make style
* make style
2020-03-31 17:54:13 +02:00
Manuel Romero
0373b60c4c
Update README.md ( #3552 )
...
- Show that the last uploaded version was trained on more data (custom_license files)
2020-03-31 10:40:34 -04:00
Patrick von Platen
83d1fbcff6
[Docs] Add usage examples for translation and summarization ( #3538 )
2020-03-31 09:36:03 -04:00
Patrick von Platen
55bcae7f25
remove useless and confusing lm_labels line ( #3531 )
2020-03-31 09:32:25 -04:00
Patrick von Platen
42e1e3c67f
Update usage doc regarding generate fn ( #3504 )
2020-03-31 09:31:46 -04:00
Patrick von Platen
57b0fab692
Add better explanation to check `docs` locally. ( #3459 )
2020-03-31 09:30:17 -04:00
Manuel Romero
a8d4dff0a1
Update README.md ( #3470 )
...
Fix typo
2020-03-31 08:01:09 -04:00
Manuel Romero
4a5663568f
Create card for the model: GPT-2-finetuned-covid-bio-medrxiv ( #3453 )
2020-03-31 08:01:03 -04:00
Branden Chan
bbedb59675
Create README.md ( #3393 )
...
* Create README.md
* Update README.md
2020-03-31 08:00:35 -04:00
Manuel Romero
c2cf192943
Add link to 16 POS tags model ( #3465 )
2020-03-31 08:00:00 -04:00
Gabriele Sarti
c82ef72158
Added CovidBERT-NLI model card ( #3477 )
2020-03-31 07:59:49 -04:00
Manuel Romero
b48a1f08c1
Add text shown in example of usage ( #3464 )
2020-03-31 07:59:36 -04:00
Manuel Romero
99833a9cbf
Create model card ( #3487 )
2020-03-31 07:59:22 -04:00
Sho Arora
ebceeeacda
Add electra and alectra model cards ( #3524 )
2020-03-31 07:58:48 -04:00
Leandro von Werra
a6c4ee27fd
Add model cards ( #3537 )
...
* feat: add model card bert-imdb
* feat: add model card gpt2-imdb-pos
* feat: add model card gpt2-imdb
2020-03-31 07:54:45 -04:00
Ethan Perez
e5c393dceb
[Bug fix] Using loaded checkpoint with --do_predict (instead of… ( #3437 )
...
* Using loaded checkpoint with --do_predict
Without this fix, I'm getting near-random validation performance for a trained model, and the validation performance differs per validation run. I think this happens since the `model` variable isn't set with the loaded checkpoint, so I'm using a randomly initialized model. Looking at the model activations, they differ each time I run evaluation (but they don't with this fix).
* Update checkpoint loading
* Fixing model loading
2020-03-30 17:06:08 -04:00
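The failure mode described in #3437 can be reproduced in miniature. This is a toy sketch with hypothetical names (`ToyModel`, `predict_buggy`, `predict_fixed`). It only assumes what the commit message states: the checkpoint was loaded, but prediction ran on a model variable that was still randomly initialized.

```python
import random


class ToyModel:
    def __init__(self):
        # Random initialization: a different weight on every construction.
        self.weight = random.random()

    def load_state(self, state):
        self.weight = state["weight"]

    def predict(self, x):
        return self.weight * x


def predict_buggy(checkpoint, x):
    loaded = ToyModel()
    loaded.load_state(checkpoint)  # checkpoint is loaded here...
    model = ToyModel()             # ...but a fresh random model is used
    return model.predict(x)


def predict_fixed(checkpoint, x):
    model = ToyModel()
    model.load_state(checkpoint)   # predict with the loaded weights
    return model.predict(x)
```

With the buggy version, repeated evaluation runs give different, near-random outputs. With the fixed version, the output is determined by the checkpoint.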
Sam Shleifer
8deff3acf2
[bart-tiny-random] Put a 5MB model on S3 to allow faster exampl… ( #3488 )
2020-03-30 12:28:27 -04:00
dougian
1f72865726
[BART] Update encoder and decoder on set_input_embedding ( #3501 )
...
Co-authored-by: Ioannis Douratsos <ioannisd@amazon.com>
2020-03-30 12:20:37 -04:00
Julien Chaumond
cc598b312b
[InputExample] Unfreeze for now, cf. #3423
2020-03-30 10:41:49 -04:00
Julien Plu
d38bbb225f
Update the NER TF script ( #3511 )
...
* Update the NER TF script to remove the softmax and set the pad token label id to -1
* Reformat the quality and style
Co-authored-by: Julien Plu <julien.plu@adevinta.com>
2020-03-30 09:50:12 -04:00
LysandreJik
eff757f2e3
Re-pin isort version
2020-03-30 09:00:47 -04:00