[docs] post-PR merge fix (#15355)

* [docs] post-PR merge fix

* Update docs/source/main_classes/deepspeed.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Stas Bekman 2022-01-26 11:23:32 -08:00 committed by GitHub
parent 99a2771189
commit fc8fc400e3
1 changed file with 2 additions and 2 deletions


@@ -31,7 +31,7 @@ won't be possible on a single GPU.
🤗 Transformers integrates [DeepSpeed](https://github.com/microsoft/DeepSpeed) via 2 options:
-1. Integration of the core DeepSpeed features via [`Trainer`]. This is everything done for your type
+1. Integration of the core DeepSpeed features via [`Trainer`]. This is an everything-done-for-you type
of integration - just supply your custom config file or use our template and you have nothing else to do. Most of
this document is focused on this feature.
2. If you don't use [`Trainer`] and want to use your own Trainer where you integrated DeepSpeed
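The hunk above describes option 1, where the only DeepSpeed-specific step is pointing [`Trainer`] at a config file. A minimal sketch (not part of this diff), assuming a config file named `ds_config_zero2.json` and the usual model/dataset setup elsewhere:

```python
from transformers import TrainingArguments

# Assumption: "ds_config_zero2.json" is your DeepSpeed config file; the name is
# illustrative. Passing it via `deepspeed=` is all the Trainer integration needs;
# the rest of the training setup (model, datasets, Trainer(...)) stays unchanged.
training_args = TrainingArguments(
    output_dir="output",
    deepspeed="ds_config_zero2.json",
)
```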
@@ -604,7 +604,7 @@ The following is an example of configuration for ZeRO stage 2:
**Performance tuning:**
- enabling `offload_optimizer` should reduce GPU RAM usage (it requires `"stage": 2`)
- `"overlap_comm": true` trade offs increased GPU RAM usage to lower all-reduce latency. `overlap_comm` uses 4.5x
- `"overlap_comm": true` trades off increased GPU RAM usage to lower all-reduce latency. `overlap_comm` uses 4.5x
the `allgather_bucket_size` and `reduce_bucket_size` values. So if they are set to 5e8, this requires a 9GB
footprint (`5e8 x 2Bytes x 2 x 4.5`). Therefore, if you have a GPU with 8GB or less RAM, to avoid getting
OOM-errors you will need to reduce those parameters to about `2e8`, which would require 3.6GB. You will want to do
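The tuning notes in this hunk map directly onto the ZeRO stage 2 section of the config. A sketch (not part of this diff) of what that might look like for a GPU with 8GB or less, with both bucket sizes lowered to `2e8` as the text suggests; the values are illustrative, not benchmarked:

```python
from transformers import TrainingArguments

# Illustrative ZeRO stage-2 settings following the tuning notes above:
# `offload_optimizer` to free GPU RAM (requires stage 2), `overlap_comm` enabled,
# and both bucket sizes lowered to 2e8 (~3.6GB comm footprint instead of ~9GB at 5e8).
ds_config = {
    "zero_optimization": {
        "stage": 2,
        "offload_optimizer": {"device": "cpu"},
        "overlap_comm": True,
        "allgather_bucket_size": 2e8,
        "reduce_bucket_size": 2e8,
    },
}

# The Trainer integration accepts either a path to a json file or a nested dict like this.
training_args = TrainingArguments(output_dir="output", deepspeed=ds_config)
```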