transformers

Younes Belkada f6261d7d81 FEAT / Optim: Add GaLore optimizer (#29588)	2024-03-19 11:40:23 +01:00

* add galore v1
* add import
* add tests and doc
* fix doctest
* forward contrib credits from discussions
* forward contrib credits from discussions
* Apply suggestions from code review
  Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* fix failing tests'
* switch to `optim_target_modules` and clarify docs
* more clarification
* enhance lookup logic
* update a test to add peak memory
* add regex, all-linear and single string support
* add layer-wise optimization through DummyOptimizers and LRSchedulers
* forward contrib credits from discussions and original idea
* add a section about DDP not supported in layerwise
* Update src/transformers/trainer.py
  Co-authored-by: Zach Mueller <muellerzr@gmail.com>
* fix self
* check only if layer_wise
* Update src/transformers/training_args.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* oops
* make use of intervals
* clarify comment
* add matching tests
* GaLoRe -> GaLore
* move to `get_scheduler`
* add note on docs
* add a warning
* adapt a bit the docs
* update docstring
* support original API
* Update docs/source/en/trainer.md
* slightly refactor
* Update docs/source/en/trainer.md
  Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
* Update src/transformers/training_args.py
  Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix args parsing and add tests
* remove warning for regex
* fix type hint
* add note about extra args
* make `is_regex` return optional

---------

Co-authored-by: Maxime <maximegmd@users.noreply.github.com>
Co-authored-by: Wing Lian <winglian@users.noreply.github.com>
Co-authored-by: Zach Mueller <muellerzr@gmail.com>
Co-authored-by: hiyouga <hiyouga@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Matthew Douglas <38992547+matthewdouglas@users.noreply.github.com>
__init__.py	[Test refactor 1/5] Per-folder tests reorganization (#15725)	2022-02-23 15:46:28 -05:00
test_data_collator.py	handle numpy inputs in whole word mask data collator (#22032)	2023-03-10 10:50:29 -05:00
test_trainer.py	FEAT / Optim: Add GaLore optimizer (#29588)	2024-03-19 11:40:23 +01:00
test_trainer_callback.py	Apply ruff flake8-comprehensions (#21694)	2023-02-22 09:14:54 +01:00
test_trainer_distributed.py	🚨 Fully revert atomic checkpointing 🚨 (#29370)	2024-03-04 06:17:42 -05:00
test_trainer_seq2seq.py	Trainer: fail early in the presence of an unsavable `generation_config` (#29675)	2024-03-15 12:59:10 +00:00
test_trainer_tpu.py	[Test refactor 1/5] Per-folder tests reorganization (#15725)	2022-02-23 15:46:28 -05:00
test_trainer_utils.py	Apply ruff flake8-comprehensions (#21694)	2023-02-22 09:14:54 +01:00