64e0adda81 | thomwolf | 2019-06-18 10:51:31 +02:00 | better error message
382e2d1e50 | thomwolf | 2019-06-18 10:37:16 +02:00 | spliting config and weight files for bert also
a6f2511811 | Thomas Wolf | 2019-06-17 16:27:25 +02:00 | Merge pull request #694 from huggingface/release_0.6.3 (Release 0.6.3)
4447f270b2 | thomwolf | 2019-06-17 16:21:28 +02:00 | updating hub
33d3db5c43 | thomwolf | 2019-06-17 15:51:28 +02:00 | updating head masking, readme and docstrings
965f172de6 | thomwolf | 2019-06-17 14:34:12 +02:00 | output all hidden layers states in GPT/GPT-2
f12007e421 | thomwolf | 2019-06-17 14:19:40 +02:00 | add head masking and pruning to openai GPT
b860e47cf5 | thomwolf | 2019-06-17 14:12:10 +02:00 | add head masking and pruning to gpt-2
7220d47a1c | thomwolf | 2019-06-17 13:20:45 +02:00 | adding head pruning and tests
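The commits above (f12007e421, b860e47cf5, 7220d47a1c) add attention head masking and pruning to GPT, GPT-2 and BERT. As an illustrative sketch only, not the repository's actual API: head masking amounts to zeroing the attention probabilities of selected heads before the weighted sum over values, so masked heads contribute nothing to the output.

```python
import numpy as np

def masked_attention(q, k, v, head_mask):
    """Per-head scaled dot-product attention with a 0/1 head mask.

    q, k, v: arrays of shape (num_heads, seq_len, head_dim)
    head_mask: array of shape (num_heads,), 1.0 keeps a head, 0.0 masks it
    """
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    # softmax over the last axis (numerically stabilized)
    probs = np.exp(scores - scores.max(axis=-1, keepdims=True))
    probs /= probs.sum(axis=-1, keepdims=True)
    # zero out the attention probabilities of masked heads
    probs = probs * head_mask[:, None, None]
    return probs @ v  # masked heads produce all-zero outputs

rng = np.random.default_rng(0)
q = rng.standard_normal((2, 4, 8))
k = rng.standard_normal((2, 4, 8))
v = rng.standard_normal((2, 4, 8))
out = masked_attention(q, k, v, np.array([1.0, 0.0]))  # mask the second head
```

Pruning goes one step further: instead of multiplying by zero at run time, the rows of the query/key/value projections belonging to a masked head are removed from the weight matrices, shrinking the model.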
8415a38b23 | thomwolf | 2019-06-17 13:03:48 +02:00 | better error messages
96c4d3d988 | thomwolf | 2019-06-17 12:17:26 +02:00 | add head masking tests
34858ae1d9 | thomwolf | 2019-06-17 11:02:39 +02:00 | adding bert whole words, bertgerman and gpt-2 medium models, head masking
80684f6f86 | Thomas Wolf | 2019-06-15 23:14:10 +02:00 | Merge pull request #690 from shashwath94/projadpsftmax_fix (Transformer XL ProjectedAdaptiveLogSoftmax output fix)
9e363703d6 | Thomas Wolf | 2019-06-15 23:13:41 +02:00 | Merge pull request #688 from deepset-ai/german_bert (Add German Bert model to code, update readme)
cc6cd430f7 | Thomas Wolf | 2019-06-15 23:12:55 +02:00 | Merge pull request #691 from vanche/master (import class "GPT2MultipleChoiceHead")
8289646d4e | vanche | 2019-06-15 22:19:30 +09:00 | import class "GPT2MultipleChoiceHead"
5076a5daa7 | Shashwath H A | 2019-06-14 22:03:21 -04:00 | Fix proj adp softmax output return when n_clusters=0
16af9ff7b0 | timoeller | 2019-06-14 17:42:46 +02:00 | Add German Bert model to code, update readme
b3f9e9451b | Thomas Wolf | 2019-06-14 17:23:45 +02:00 | Merge pull request #687 from huggingface/tests_and_doc (Updating tests and doc)
44e9ddd7fe | thomwolf | 2019-06-14 17:17:43 +02:00 | fix num_special_tokens in GPT 2 test
cad88e19de | Thomas Wolf | 2019-06-14 17:02:47 +02:00 | Merge pull request #672 from oliverguhr/master (Add vocabulary and model config to the finetune output)
c6de625229 | Thomas Wolf | 2019-06-14 17:02:08 +02:00 | Merge pull request #655 from huggingface/finish_torchhub_interfaces (Finish torchhub interfaces)
ff276fc00c | Thomas Wolf | 2019-06-14 16:59:07 +02:00 | Merge branch 'master' into finish_torchhub_interfaces
a64736dc23 | Thomas Wolf | 2019-06-14 16:57:45 +02:00 | Merge pull request #646 from Colanim/patch-1 (Fix link in README)
460d9afd45 | Thomas Wolf | 2019-06-14 16:57:02 +02:00 | Merge pull request #640 from Barqawiz/master (Support latest multi language bert fine tune)
277c77f1c5 | Thomas Wolf | 2019-06-14 16:56:26 +02:00 | Merge pull request #630 from tguens/master (Update run_squad.py)
659af2cbd0 | Thomas Wolf | 2019-06-14 16:49:24 +02:00 | Merge pull request #604 from samuelbroscheit/master (Fixing issue "Training beyond specified 't_total' steps with schedule 'warmup_linear'" reported in #556)
2d6a53490d | Thomas Wolf | 2019-06-14 16:47:32 +02:00 | Merge pull request #597 from huggingface/attention (GPT-2 (medium size model, special_tokens, fine-tuning, attention) + repo code coverage metric)
35e6baab37 | Thomas Wolf | 2019-06-14 16:41:56 +02:00 | Merge branch 'master' into attention
5e1207b8ad | thomwolf | 2019-06-14 16:28:25 +02:00 | add attention to all bert models and add test
bcc9e93e6f | thomwolf | 2019-06-14 15:38:20 +02:00 | fix test
f9cde97b31 | Thomas Wolf | 2019-06-12 10:01:21 +02:00 | Merge pull request #675 from meetshah1995/patch-1 ([hotfix] Fix frozen pooler parameters in SWAG example.)
e02ce4dc79 | Meet Pragnesh Shah | 2019-06-11 15:13:53 -07:00 | [hotfix] Fix frozen pooler parameters in SWAG example.
5c08c8c273 | Oliver Guhr | 2019-06-11 13:46:33 +02:00 | adds the tokenizer + model config to the output
784c0ed89a | Thomas Wolf | 2019-06-11 11:29:10 +02:00 | Merge pull request #668 from jeonsworld/patch-2 (apply Whole Word Masking technique)
a3a604cefb | jeonsworld | 2019-06-10 12:17:23 +09:00 | Update pregenerate_training_data.py: apply Whole Word Masking technique, referred to [create_pretraining_data.py](https://github.com/google-research/bert/blob/master/create_pretraining_data.py)
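Commit a3a604cefb above applies Whole Word Masking, following Google's BERT reference implementation. The idea: WordPiece splits a word into a leading piece plus `##`-prefixed continuation pieces, and whole-word masking selects masking candidates at the word level so all pieces of a word are masked together. A minimal sketch with hypothetical helpers (not the actual code in pregenerate_training_data.py):

```python
import random

def whole_word_candidates(tokens):
    """Group WordPiece token indices into word-level spans."""
    spans = []
    for i, tok in enumerate(tokens):
        if spans and tok.startswith("##"):
            spans[-1].append(i)  # continuation piece joins the previous word
        else:
            spans.append([i])    # a new word starts here
    return spans

def mask_whole_words(tokens, num_to_mask, rng=random):
    """Mask randomly chosen words; each word is masked as a unit."""
    spans = whole_word_candidates(tokens)
    rng.shuffle(spans)
    masked = list(tokens)
    covered = set()
    for span in spans:
        if len(covered) >= num_to_mask:
            break
        for i in span:
            covered.add(i)
            masked[i] = "[MASK]"
    return masked

tokens = ["un", "##divide", "##d", "attention", "is", "all"]
masked = mask_whole_words(tokens, num_to_mask=1)
# the three pieces of "un ##divide ##d" are masked together or not at all
```

The reference script additionally caps the total number of masked pieces and keeps the usual 80/10/10 mask/random/keep split; those details are omitted here.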
ee0308f79d | VictorSanh | 2019-06-06 17:30:49 +02:00 | fix typo
2d07f945ad | VictorSanh | 2019-06-06 17:10:24 +02:00 | fix error with torch.no_grad and loss computation
6b8d227092 | VictorSanh | 2019-06-06 17:07:03 +02:00 | some cleaning
122d5c52ac | VictorSanh | 2019-06-06 17:02:51 +02:00 | distinguish was is not trained
2647ac3294 | VictorSanh | 2019-06-06 16:57:40 +02:00 | forgot bertForPreTraining
cf44d98392 | VictorSanh | 2019-06-06 16:36:02 +02:00 | Add more examples to BERT models for torchhub
a3274ac40b | thomwolf | 2019-06-03 16:11:45 -05:00 | adding attention outputs in bert
826496580b | VictorSanh | 2019-06-03 17:10:25 -04:00 | Revert "add output_attentions for BertModel" (reverts commit de5e5682a1)
de5e5682a1 | VictorSanh | 2019-06-03 17:05:24 -04:00 | add output_attentions for BertModel
312fdd7752 | VictorSanh | 2019-06-01 17:43:26 -04:00 | fix doc error
cdf0f2fec3 | VictorSanh | 2019-06-01 17:42:00 -04:00 | fix typo/presentation
8f97f6c57f | VictorSanh | 2019-06-01 17:29:07 -04:00 | fix typo (cc @thomwolf)
466a96543a | VictorSanh | 2019-06-01 17:28:56 -04:00 | fix bug/typos
c198ff5f1f | VictorSanh | 2019-06-01 16:28:42 -04:00 | fix typos/bugs