* Port core files + ESM (because ESM code is odd)
* Search-replace in modelling code
* Fix up transfo_xl as well
* Fix other core files + tests (still need to add correct import to tests)
* Fix cookiecutter
* make fixup, fix imports in some more core files
* Auto-add imports to tests
* Cleanup, add imports to sagemaker tests
* Use correct exception for importing tf_keras
* Fixes in modeling_tf_utils
* make fixup
* Correct version parsing code
* Ensure the pipeline tests correctly revert to float32 after each test
* Ensure the pipeline tests correctly revert to float32 after each test
* More tf.keras -> keras
* Add dtype cast
* Better imports of tf_keras
* Add a cast for tf.assign, just in case
* Fix callback imports
* Result of black 23.1
* Update target to Python 3.7
* Switch flake8 to ruff
* Configure isort
* Configure isort
* Apply isort with line limit
* Put the right black version
* adapt black in check copies
* Fix copies
* Updates the default branch from master to main
* Links from `master` to `main`
* Typo
* Update examples/flax/README.md
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Barrier -> barrier
* added logger for metrics
* removed stream handler in trainer
* moved handler
* removed streamhandler from trainer
* updated test image and instance type added datasets version to test
* Update tests/sagemaker/scripts/pytorch/requirements.txt
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Replace is_sagemaker_distributed_available
* Merge SageMakerTrainer into Trainer
* Test with shorter condition
* Put back deleted line
* Deprecate SageMakerTrainer and SageMakerTrainingArguments
* Apply suggestions from code review
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
Co-authored-by: Philipp Schmid <32632186+philschmid@users.noreply.github.com>
* init
* first working test
* added todo for setup.py
* working test for single node multi node ddp and smd
* added tensorflow single node test
* added directory for pytorch and tensorflow due to different requirements.txt
* added directory for pytorch and tensorflow
* added comment for run_glue until it is available
* added output_dir to it
* smaller dataset to make test running faster
* adjust HP and script
* adjusted parameter for tensorflow
* refactored test scripts
* adjusted make file
* init
* first working test
* added todo for setup.py
* working test for single node multi node ddp and smd
* added tensorflow single node test
* added directory for pytorch and tensorflow due to different requirements.txt
* added directory for pytorch and tensorflow
* added comment for run_glue until it is available
* added output_dir to it
* smaller dataset to make test running faster
* adjust HP and script
* adjusted parameter for tensorflow
* refactored test scripts
* adjusted make file
* updated dlc container
* commented in all tests
* added both ecr images
* added new master branches
* debug
* added new datasets version
* init
* strange rebase bug
* removed changes
* changed min version for tests to work
* updated DLC
* added model parallel test
* removed test files
* removed test files
* tested with ned dlc
* added correct sagemaker sdk version
* adjust DLCs for official one
* reworked tests
* quality
* removed default profile added documentation to it
* added step in release for sagemaker tests
* reverted version for example script removed duplicated script and added install from master to requirements.txt
* removed mistaken .DS_Stores from mac
* fixed tests
* added Sylvains feedback
* make style
* added lysandre's feedback