620 Commits

Author SHA1 Message Date
thomwolf 361aff6de5 typos 2019-03-27 11:54:59 +01:00
thomwolf cea8ba1d59 adjusted formating and some wording in the readme 2019-03-27 11:53:44 +01:00
Matthew Carrigan 24e67fbf75 Minor README update 2019-03-25 12:33:30 +00:00
Matthew Carrigan 8d1d1ffde2 Corrected the displayed loss when gradient_accumulation_steps > 1 2019-03-25 12:15:19 +00:00
Matthew Carrigan abb7d1ff6d Added proper context management to ensure cleanup happens in the right order. 2019-03-21 17:50:03 +00:00
Matthew Carrigan 06a30cfdf3 Added a --reduce_memory option to the training script to keep training data on disc as a memmap rather than in memory 2019-03-21 17:04:12 +00:00
Matthew Carrigan 7d1ae644ef Added a --reduce_memory option to the training script to keep training data on disc as a memmap rather than in memory 2019-03-21 17:02:18 +00:00
Matthew Carrigan 2bba7f810e Added a --reduce_memory option to shelve docs to disc instead of keeping them in memory. 2019-03-21 16:50:16 +00:00
Matthew Carrigan 8733ffcb5e Removing a couple of other old unnecessary comments 2019-03-21 14:09:57 +00:00
Matthew Carrigan 8a861048dd Fixed up the notes on a possible future low-memory path 2019-03-21 14:08:39 +00:00
Matthew Carrigan a8a577ba93 Reduced memory usage for pregenerating the data a lot by writing it out on the fly without shuffling - the Sampler in the finetuning script will shuffle for us. 2019-03-21 14:05:52 +00:00
Matthew Carrigan 0ae59e662d Reduced memory usage for pregenerating the data a lot by writing it out on the fly without shuffling - the Sampler in the finetuning script will shuffle for us. 2019-03-21 14:04:17 +00:00
Matthew Carrigan 6a9038ba53 Removed an old irrelevant comment 2019-03-21 13:36:41 +00:00
Matthew Carrigan 29a392fbcf Small README changes 2019-03-20 17:35:17 +00:00
Matthew Carrigan 832b2b0058 Adding README 2019-03-20 17:31:49 +00:00
Matthew Carrigan 934d3f4d2f Syncing up argument names between the scripts 2019-03-20 17:23:23 +00:00
Matthew Carrigan f19ba35b2b Move old finetuning script into the new folder 2019-03-20 16:47:06 +00:00
Matthew Carrigan 7de5c6aa5e PEP8 and formatting cleanups 2019-03-20 16:44:04 +00:00
Matthew Carrigan 1798e98e5a Added final TODOs 2019-03-20 16:42:37 +00:00
Matthew Carrigan c64c2fc4c2 Fixed embarrassing indentation problem 2019-03-20 15:42:57 +00:00
Matthew Carrigan 0540d360f2 Fixed logging 2019-03-20 15:36:51 +00:00
Matthew Carrigan 976554a472 First commit of the new LM finetuning 2019-03-20 14:23:51 +00:00
Thomas Wolf f3e5404880 Merge pull request #381 from tseretelitornike/master: Added missing imports. 2019-03-15 12:54:40 +01:00
tseretelitornike 83857ffeaa Added missing imports. 2019-03-15 12:45:48 +01:00
Thomas Wolf d5c037c3ed Merge pull request #380 from yongbowin/patch-3: typo in annotation 2019-03-14 15:56:40 +01:00
Yongbo Wang d1e4fa98a9 typo in annotation: modify `heruistic` to `heuristic` in line 660, `charcter` to `character` in line 661. 2019-03-14 17:32:15 +08:00
Thomas Wolf 59e2bdd086 Merge pull request #379 from yongbowin/patch-2: typo 2019-03-14 10:17:18 +01:00
Yongbo Wang 3d6452163d typo: modify `mull` to `null` in line 474 annotation. 2019-03-14 17:03:38 +08:00
Thomas Wolf 76906372b0 Merge pull request #378 from huggingface/absolute_imports: Add absolute imports to GPT, GPT-2, Transfo-XL and and fix empty nbest_predictions.json 2019-03-14 10:00:47 +01:00
thomwolf a98dfe4ced fixing #377 (empty nbest_predictions.json) 2019-03-14 09:57:06 +01:00
thomwolf e5f2d9122c adding absolute imports to gpt2, openai and transfo-xl 2019-03-14 09:55:01 +01:00
Thomas Wolf eecaaa734a Merge pull request #371 from yongbowin/patch-1: Simplify code, delete redundancy line 2019-03-14 09:03:32 +01:00
Yongbo Wang 22a465a91f Simplify code, delete redundancy line: delete redundancy line `if args.train`, simplify code. 2019-03-13 09:42:06 +08:00
Thomas Wolf 9b03d67b83 Merge pull request #362 from Bharat123rox/patch-1: Make the hyperlink of NVIDIA Apex clickable 2019-03-11 09:08:51 +01:00
Thomas Wolf 8435d78f0c Merge pull request #361 from junjieqian/jqian/updateReadme: Correct line number in README for classes 2019-03-11 09:08:27 +01:00
Thomas Wolf 80790705e0 Merge pull request #359 from elonmuskceo/fix-typo: Update run_gpt2.py 2019-03-11 09:07:56 +01:00
Thomas Wolf 13aa13dbc0 Merge pull request #358 from cdjhz/patch-1: add 'padding_idx=0' for BertEmbeddings 2019-03-11 09:06:55 +01:00
Thomas Wolf c0660df5dd Merge pull request #357 from pglock/feature/354-use-dropout-layer-gpt: Use Dropout Layer in OpenAIGPTMultipleChoiceHead 2019-03-11 09:06:27 +01:00
Bharat Raghunathan f91ce0b803 Make the hyperlink of NVIDIA Apex clickable 2019-03-09 20:05:39 +05:30
Junjie Qian d648a02203 Correct line number in README for classes 2019-03-08 16:28:03 -08:00
Elon Musk 66d8206809 Update run_gpt2.py 2019-03-08 11:59:08 -05:00
Haozhe Ji 72fa8d03a7 add 'padding_idx=0' for BertEmbeddings 2019-03-07 20:02:55 +08:00
Philipp Glock 6190e8ce4c Fix: use dropout layer 2019-03-07 10:12:45 +01:00
thomwolf 7cc35c3104 fix openai gpt example and updating readme 2019-03-06 11:43:21 +01:00
thomwolf 906b638efa updating readme 2019-03-06 10:24:19 +01:00
thomwolf 994d86609b fixing PYTORCH_PRETRAINED_BERT_CACHE use in examples 2019-03-06 10:21:24 +01:00
thomwolf 2dd8f524f5 removing test for long sequences error following #337 2019-03-06 10:10:41 +01:00
thomwolf 5c85fc3977 fix typo - logger info 2019-03-06 10:05:21 +01:00
Thomas Wolf 8e36da7acb Merge pull request #347 from jplehmann/feature/sst2-processor: Processor for SST-2 task 2019-03-06 09:48:27 +01:00
Thomas Wolf 21c88a07b7 Merge pull request #341 from potatochip/patch-1: catch exception if pathlib not install 2019-03-06 09:48:01 +01:00