Commit Graph

12 Commits

Author SHA1 Message Date
Younes Belkada 4a51075a96
`bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901)
* first commit

* correct replace function

* add final changes

- works like charm!
- cannot implement tests yet
- tested

* clean up a bit

* add bitsandbytes dependencies

* working version

- added import function
- added bitsandbytes utils file

* small fix

* small fix

- fix import issue

* fix import issues

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit

- move bitsandbytes utils to utils
- change comments on functions

* reformat docstring

- reformat docstring on init_empty_weights_8bit

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* revert bad formatting

* change to bitsandbytes

* refactor a bit

- remove init8bit since it is useless

* more refactoring

- fixed init empty weights issue
- added threshold param

* small hack to make it work

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

* remove the small hack

* modify utils file

* make style + refactor a bit

* create correctly device map

* add correct dtype for device map creation

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

- remove with torch.grad
- do not rely on Python bool magic!

* add docstring

- add docstring for new kwargs

* add docstring

- comment `replace_8bit_linear` function
- fix weird formatting

* - added more documentation
- added new utility function for memory footprint tracking
- colab demo to add

* few modifs

- typo doc
- force cast into float16 when load_in_8bit is enabled

* added colab link

* add test architecture + docstring a bit

* refactor a bit testing class

* make style + refactor a bit

* enhance checks

- add more checks
- start writing saving test

* clean up a bit

* make style

* add more details on doc

* add more tests

- still needs to fix 2 tests

* replace by "or"

- could not fix it from GitHub GUI

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit testing code + add readme

* make style

* fix import issue

* Update src/transformers/modeling_utils.py

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* add few comments

* add more docstring + make style

* more docstring

* raise error when loaded in 8bit

* make style

* add warning if loaded on CPU

* add small sanity check

* fix small comment

* add bitsandbytes on dockerfile

* Improve documentation

- improve documentation from comments

* add few comments

* slow tests pass on the VM but not on the CI VM

* Fix merge conflict

* make style

* another test should pass on a multi gpu setup

* fix bad import in testing file

* Fix slow tests

- remove dummy batches
- no more CUDA illegal memory errors

* modify dockerfile

* Update docs/source/en/main_classes/model.mdx

* Update Dockerfile

* Update model.mdx

* Update Dockerfile

* Apply suggestions from code review

* few modifications

- lm head can stay on disk/cpu
- change model name so that test pass

* change test value

- change test value to the correct output
- torch bmm changed to baddmm in bloom modeling when merging

* modify installation guidelines

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* replace `n` by `name`

* merge `load_in_8bit` and `low_cpu_mem_usage`

* first try - keep the lm head in full precision

* better check

- check the attribute `base_model_prefix` instead of computing the number of parameters

* added more tests

* Update src/transformers/utils/bitsandbytes.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit

* improve documentation

- fix typos for installation
- change title in the documentation

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
2022-08-10 09:13:36 +02:00
NielsRogge 82bb682643
[VideoMAE] Add model to doc tests (#18523)
* Add videomae to doc tests

* Add pip install decord

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-08 19:28:51 +02:00
Yih-Dar b089cca347
PyTorch 1.12.0 for scheduled CI (#17949)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-29 19:32:19 +02:00
Yih-Dar ca169dbdf1
Enable PyTorch nightly build CI (#17335)
* nightly build pytorch CI

* fix working dir

* change time and event name

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-17 16:42:27 +02:00
Yih-Dar 264128cb9d
Explicit versions in docker files (#17586)
* Update docker file

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-08 15:04:22 +02:00
Joao Gante 78c695eb62
CLI: add stricter automatic checks to `pt-to-tf` (#17588)
* Stricter pt-to-tf checks; Update docker image for related tests

* check all attributes in the output

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-06-08 10:45:10 +01:00
Yih-Dar 9aa230aa2f
Use latest stable PyTorch/DeepSpeed for Push & Scheduled CI (#17417)
* update versions

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-06-07 11:53:05 +02:00
Yih-Dar 7198b63362
install dev. version of accelerate (#17243)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-13 13:47:09 -04:00
Sylvain Gugger 867f3950fa
Rename master to main for notebooks links and leftovers (#16397)
2022-03-25 09:12:23 -04:00
Lysandre Debut c1000e703b
Docker images runtime -> devel (#16141)
* Runtime -> Devel

* Torch before DeepSpeed
2022-03-14 12:37:20 -04:00
Lysandre Debut 7ff9d450cd
Scatter should run on CUDA (#15872)
2022-03-01 11:47:17 -05:00
Lysandre Debut a0e3480699
[Test refactor 5/5] Build docker images (#15729)
2022-02-23 15:48:19 -05:00