Fix duplicate call to save_checkpoint when using deepspeed (#14946)
* Fix duplicate call to save_checkpoint when using deepspeed / stage3_gather_fp16_weights_on_model_save
* Revert "Fix duplicate call to save_checkpoint when using deepspeed / stage3_gather_fp16_weights_on_model_save"
This reverts commit 6a3dec0397.
* Delete correct duplicate invocation of deepspeed save_checkpoint
parent 03885a3f50
commit c1138273d4
@@ -1999,9 +1999,6 @@ class Trainer:
                 # This must be called on all ranks
                 self.deepspeed.save_fp16_model(output_dir, WEIGHTS_NAME)
 
-            # save a deepspeed checkpoint as well (this is very fast)
-            self.deepspeed.save_checkpoint(output_dir)
-
         elif self.args.should_save:
             self._save(output_dir)
 
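The effect of the removed call can be illustrated with a small, self-contained sketch. It is not the Trainer code: FakeDeepSpeedEngine, save_model, save_training_checkpoint and count_checkpoint_saves are hypothetical names, the WEIGHTS_NAME value is a placeholder, and the assumption that a checkpoint-saving caller already invokes save_checkpoint itself is inferred only from the word "duplicate" in the commit title.

```python
from typing import List

WEIGHTS_NAME = "weights.bin"  # placeholder value, used only in this sketch


class FakeDeepSpeedEngine:
    """Hypothetical stand-in for a DeepSpeed engine; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[str] = []

    def save_fp16_model(self, output_dir: str, name: str) -> None:
        self.calls.append(f"save_fp16_model:{output_dir}/{name}")

    def save_checkpoint(self, output_dir: str) -> None:
        self.calls.append(f"save_checkpoint:{output_dir}")


def save_model(engine: FakeDeepSpeedEngine, output_dir: str, duplicate: bool) -> None:
    # Mirrors the context lines of the hunk: the weights are always written.
    engine.save_fp16_model(output_dir, WEIGHTS_NAME)
    if duplicate:
        # Stands for the call that this commit deletes.
        engine.save_checkpoint(output_dir)


def save_training_checkpoint(
    engine: FakeDeepSpeedEngine, output_dir: str, duplicate: bool
) -> None:
    # Assumed caller: it saves the model, then writes the checkpoint itself.
    save_model(engine, output_dir, duplicate)
    engine.save_checkpoint(output_dir)


def count_checkpoint_saves(duplicate: bool) -> int:
    # Run one save with a fresh fake engine and count save_checkpoint calls.
    engine = FakeDeepSpeedEngine()
    save_training_checkpoint(engine, "out", duplicate)
    return sum(call.startswith("save_checkpoint") for call in engine.calls)
```

Under these assumptions, count_checkpoint_saves(True) models the behaviour before the commit (the checkpoint is written twice per save) and count_checkpoint_saves(False) models the behaviour after it (written once).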