Sagemaker test docs update for framework upgrade (#11206)
* increased train_runtime for model parallelism
* added documentation for framework upgrade
parent 74d7c24d8d
commit f243a5ec0d
@@ -66,8 +66,7 @@ images:
 ```
-2. In the PR comment describe what test, we ran and with which package versions. Here you can copy the table from [Current Tests](#current-tests).

-TODO: Add a screenshot of PR + Text template to make it easy to open.

+2. In the PR comment describe what test we ran and with which framework versions. Here you can copy the table from [Current Tests](#current-tests). You can take a look at this [PR](https://github.com/aws/deep-learning-containers/pull/1016), which information are needed.
 ## Test Case 2: Releasing a New AWS Framework DLC

@@ -92,7 +91,6 @@ AWS_PROFILE=<enter-your-profile> make test-sagemaker
 ```
 These tests take around 10-15 minutes to finish. Preferably make a screenshot of the successfully ran tests.


 ### After successful Tests:

 After we have successfully run tests for the new framework version we need to create a PR at the [Deep Learning Container Repository](https://github.com/aws/deep-learning-containers).
@@ -136,7 +134,7 @@ images:
 docker_file: !join [ docker/, *SHORT_VERSION, /, *DOCKER_PYTHON_VERSION, /,
 *CUDA_VERSION, /Dockerfile., *DEVICE_TYPE ]
 ```
-2. In the PR comment describe what test we ran and with which framework versions. Here you can copy the table from [Current Tests](#current-tests). You can take a look at this [PR](https://github.com/aws/deep-learning-containers/pull/1016), which information are needed.
+2. In the PR comment describe what test we ran and with which framework versions. Here you can copy the table from [Current Tests](#current-tests). You can take a look at this [PR](https://github.com/aws/deep-learning-containers/pull/1025), which information are needed.

 ## Current Tests

@@ -28,14 +28,14 @@ if is_sagemaker_available():
 "script": "run_glue_model_parallelism.py",
 "model_name_or_path": "roberta-large",
 "instance_type": "ml.p3dn.24xlarge",
-"results": {"train_runtime": 1500, "eval_accuracy": 0.3, "eval_loss": 1.2},
+"results": {"train_runtime": 1600, "eval_accuracy": 0.3, "eval_loss": 1.2},
 },
 {
 "framework": "pytorch",
 "script": "run_glue.py",
 "model_name_or_path": "roberta-large",
 "instance_type": "ml.p3dn.24xlarge",
-"results": {"train_runtime": 1500, "eval_accuracy": 0.3, "eval_loss": 1.2},
+"results": {"train_runtime": 1600, "eval_accuracy": 0.3, "eval_loss": 1.2},
 },
 ]
 )
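
Note on the `results` change above: the commit only shows the threshold values, not how they are evaluated. The following is a minimal sketch of how such thresholds might be checked, assuming `train_runtime` and `eval_loss` are upper bounds and `eval_accuracy` is a lower bound. The helper `find_threshold_violations` and the sample measurement are hypothetical and not taken from the commit.

```python
# Hypothetical helper, not part of the commit. The comparison directions
# (runtime/loss as upper bounds, accuracy as lower bound) are assumptions.
def find_threshold_violations(measured, expected):
    """Return the names of the metrics that miss their expected threshold."""
    violations = []
    if measured["train_runtime"] > expected["train_runtime"]:
        violations.append("train_runtime")
    if measured["eval_accuracy"] < expected["eval_accuracy"]:
        violations.append("eval_accuracy")
    if measured["eval_loss"] > expected["eval_loss"]:
        violations.append("eval_loss")
    return violations


# Threshold values before and after this commit, copied from the diff.
old_expected = {"train_runtime": 1500, "eval_accuracy": 0.3, "eval_loss": 1.2}
new_expected = {"train_runtime": 1600, "eval_accuracy": 0.3, "eval_loss": 1.2}

# Made-up measurement, for illustration only.
sample_run = {"train_runtime": 1550, "eval_accuracy": 0.85, "eval_loss": 0.4}

print(find_threshold_violations(sample_run, old_expected))  # ['train_runtime']
print(find_threshold_violations(sample_run, new_expected))  # []
```

Under these assumptions, a run that takes 1550 seconds would fail against the old limit of 1500 and pass against the new limit of 1600, which is the kind of slack the commit message describes as "increased train_runtime".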