seanBE
6dc6c716c5
fix pytorch-transformers migration description in README
2019-10-07 09:59:54 +01:00
Christopher Goh
904158ac4d
Rephrase forward method to reduce ambiguity
2019-10-06 23:40:52 -04:00
Christopher Goh
0f65d8cbbe
Fix some typos in README
2019-10-06 23:40:52 -04:00
Santiago Castro
1dea291a02
Remove unnecessary use of FusedLayerNorm in XLNet
2019-10-06 13:35:01 -04:00
LysandreJik
f3e0218fbb
Correct device assignment in run_generation
2019-10-05 21:05:16 -04:00
thomwolf
78ef1a9930
fixes
2019-10-04 17:59:44 -04:00
thomwolf
6c1d0bc066
update encode_plus - add truncation strategies
2019-10-04 17:38:38 -04:00
VictorSanh
0820bb0555
unnecessary carriage return
2019-10-04 17:23:15 -04:00
VictorSanh
f5891c3821
run_squad --> run_squad_w_distillation
2019-10-04 17:23:15 -04:00
VictorSanh
764a7923ec
add distillation+finetuning option in run_squad
2019-10-04 17:23:15 -04:00
Lysandre Debut
bb464289ce
New model addition issue template
2019-10-04 16:41:26 -04:00
thomwolf
92c0f2fb90
Merge remote-tracking branch 'origin/julien_multiple-choice' into encoding-qol
2019-10-04 15:48:06 -04:00
Julien Chaumond
9e136ff57c
Honor args.overwrite_cache (h/t @erenup)
2019-10-04 15:00:56 -04:00
LysandreJik
7bddb45a6f
Decode documentation
2019-10-04 14:27:38 -04:00
keskarnitish
dbed1c5d94
Adding CTRL (squashed commit)
...
adding conversion script
adding first draft of modeling & tokenization
adding placeholder for test files
bunch of changes
registering the tokenizer/model/etc
tests
change link; something is very VERY wrong here
weird end-of-word thingy going on
i think the tokenization works now ; wrote the unit tests
overall structure works;load w next
the monster is alive!
works after some cleanup as well
adding emacs autosave to gitignore
currently only supporting the 48 layer one; seems to infer fine on my macbook
cleanup
fixing some documentation
fixing some documentation
tests passing?
now works on CUDA also
adding greedy?
adding greedy sampling
works well
2019-10-03 22:29:03 -07:00
Thomas Wolf
b3cfd97946
Merge pull request #1373 from TimYagan/fix-css
...
Fixed critical css font-family issues
2019-10-03 19:04:02 -04:00
Lysandre Debut
81a1e12469
Merge pull request #1313 from enzoampil/master
...
Add option to use a 'stop token'
2019-10-03 22:43:57 +00:00
Lysandre Debut
d3f24dfad7
Merge branch 'master' into master
2019-10-03 22:43:09 +00:00
LysandreJik
ecc4f1bdfa
XLM use_lang_embedding flag in run_generation
2019-10-03 17:42:16 -04:00
LysandreJik
c2c2ca0fdb
Added XLM to run_generation, with prompt language selection.
2019-10-03 17:18:48 -04:00
Thomas Wolf
1569610f2d
Merge pull request #1296 from danai-antoniou/add-duplicate-tokens-error
...
Added ValueError for duplicates in list of added tokens
2019-10-03 17:06:17 -04:00
drc10723
e1b2949ae6
DistilBert Documentation Code Example fixes
2019-10-03 15:51:33 -04:00
Simon Layton
899883644f
Fix test fails and warnings
...
Attention output was in bnij ordering instead of ijbn, which everything
else expects. This was an oversight on my part, and keeps the
attention inputs/outputs identical to the original code.
Also moved back from tensor slicing to index_select in rel_shift_bnij to
make the tracer happy.
2019-10-03 12:05:15 -04:00
VictorSanh
e2ae9c0b73
fix links in doc index
2019-10-03 11:42:21 -04:00
LysandreJik
aebd83230f
Update naming + remove f string in run_lm_finetuning example
2019-10-03 11:31:36 -04:00
LysandreJik
651bfb7ad5
always_truncate by default
2019-10-03 11:31:36 -04:00
LysandreJik
5ed50a93fb
LM finetuning won't mask special tokens anymore
2019-10-03 11:31:36 -04:00
LysandreJik
cc412edd42
Supports already existing special tokens
2019-10-03 11:31:36 -04:00
LysandreJik
2f259b228e
Sequence IDS
2019-10-03 11:31:36 -04:00
LysandreJik
7c789c337d
Always truncate argument in the encode method
2019-10-03 11:31:36 -04:00
Brian Ma
7af0777910
Update run_glue.py
...
add DistilBert model shortcut into ALL_MODELS
2019-10-03 15:31:11 +00:00
VictorSanh
c1689ac301
fix name
2019-10-03 10:56:39 -04:00
VictorSanh
4a790c40b1
update doc for distil*
2019-10-03 10:54:02 -04:00
VictorSanh
6be46a6e64
update links to new weights
2019-10-03 10:27:11 -04:00
VictorSanh
5f07d8f11a
prepare release
2019-10-03 10:27:11 -04:00
VictorSanh
35071007cb
incoming release 🔥 update links to arxiv preprint
2019-10-03 10:27:11 -04:00
VictorSanh
f1f23ad171
fix bug in convert_pt_chkpt_to_tf2
2019-10-03 10:27:11 -04:00
VictorSanh
2a91f6071f
update README - TODO update link to paper
2019-10-03 10:27:11 -04:00
VictorSanh
c51e533a5f
update train.py
2019-10-03 10:27:11 -04:00
VictorSanh
a76c3f9cb0
update requirements
2019-10-03 10:27:11 -04:00
VictorSanh
bb9c5ead54
update distiller
2019-10-03 10:27:11 -04:00
VictorSanh
a12ab0a8db
update binarized_data
2019-10-03 10:27:11 -04:00
VictorSanh
4d6dfbd376
update extract
2019-10-03 10:27:11 -04:00
VictorSanh
23edebc079
update extract_distilbert
2019-10-03 10:27:11 -04:00
VictorSanh
cbfcfce205
update token_counts
2019-10-03 10:27:11 -04:00
VictorSanh
19e4ebbe3f
grouped_batch_sampler
2019-10-03 10:27:11 -04:00
VictorSanh
594202a934
lm_seqs_dataset
2019-10-03 10:27:11 -04:00
VictorSanh
38084507c4
add distillation_configs
2019-10-03 10:27:11 -04:00
Simon Layton
9ffda216ec
Fix missed head transpose
2019-10-03 09:23:16 -04:00
Brian Ma
2195c0d5f9
Evaluation result.txt path changing #1286
2019-10-03 12:49:12 +08:00