Commit Graph

10417 Commits

Author SHA1 Message Date
Sylvain Gugger d7e2d7b40b
Preserve hub-related kwargs in AutoModel.from_pretrained (#18545)
* Preserve hub-related kwargs in AutoModel.from_pretrained

* Fix tests

* Remove debug statement
2022-08-10 08:00:18 -04:00
Joao Gante 34aad0dac0
TF: XLA-trainable DeBERTa v2 (#18546)
* fix deberta issues

* add different code paths for gpu and tpu

* shorter gpu take along axis

* Stable Dropout without tf cond

* variable must be float
2022-08-10 12:57:21 +01:00
Younes Belkada 4a51075a96
`bitsandbytes` - `Linear8bitLt` integration into `transformers` models (#17901)
* first commit

* correct replace function

* add final changes

- works like charm!
- cannot implement tests yet
- tested

* clean up a bit

* add bitsandbytes dependencies

* working version

- added import function
- added bitsandbytes utils file

* small fix

* small fix

- fix import issue

* fix import issues

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit

- move bitsandbytes utils to utils
- change comments on functions

* reformat docstring

- reformat docstring on init_empty_weights_8bit

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* revert bad formatting

* change to bitsandbytes

* refactor a bit

- remove init8bit since it is useless

* more refactoring

- fixed init empty weights issue
- added threshold param

* small hack to make it work

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

* remove the small hack

* modify utils file

* make style + refactor a bit

* create correctly device map

* add correct dtype for device map creation

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply suggestions

- remove with torch.grad
- do not rely on Python bool magic!

* add docstring

- add docstring for new kwargs

* add docstring

- comment `replace_8bit_linear` function
- fix weird formatting

* - added more documentation
- added new utility function for memory footprint tracking
- colab demo to add

* few modifs

- typo doc
- force cast into float16 when load_in_8bit is enabled

* added colab link

* add test architecture + docstring a bit

* refactor a bit testing class

* make style + refactor a bit

* enhance checks

- add more checks
- start writing saving test

* clean up a bit

* make style

* add more details on doc

* add more tests

- still needs to fix 2 tests

* replace by "or"

- could not fix it from GitHub GUI

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* refactor a bit testing code + add readme

* make style

* fix import issue

* Update src/transformers/modeling_utils.py

Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>

* add few comments

* add more docstring + make style

* more docstring

* raise error when loaded in 8bit

* make style

* add warning if loaded on CPU

* add small sanity check

* fix small comment

* add bitsandbytes on dockerfile

* Improve documentation

- improve documentation from comments

* add few comments

* slow tests pass on the VM but not on the CI VM

* Fix merge conflict

* make style

* another test should pass on a multi gpu setup

* fix bad import in testing file

* Fix slow tests

- remove dummy batches
- no more CUDA illegal memory errors

* modify dockerfile

* Update docs/source/en/main_classes/model.mdx

* Update Dockerfile

* Update model.mdx

* Update Dockerfile

* Apply suggestions from code review

* few modifications

- lm head can stay on disk/cpu
- change model name so that test pass

* change test value

- change test value to the correct output
- torch bmm changed to baddbmm in bloom modeling when merging

* modify installation guidelines

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* replace `n` by `name`

* merge `load_in_8bit` and `low_cpu_mem_usage`

* first try - keep the lm head in full precision

* better check

- check the attribute `base_model_prefix` instead of computing the number of parameters

* added more tests

* Update src/transformers/utils/bitsandbytes.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Merge branch 'integration-8bit' of https://github.com/younesbelkada/transformers into integration-8bit

* improve documentation

- fix typos for installation
- change title in the documentation

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
2022-08-10 09:13:36 +02:00
Steven Liu 8cf4a6f0a6
📝 update documentation build section (#18548) 2022-08-09 18:22:55 -05:00
Sylvain Gugger 38a674599c Clean up comment 2022-08-09 15:15:01 -04:00
YouJiacheng 5e2f373705
Restore _init_weights value in no_init_weights (#18504)
* Recover _init_weights value in no_init_weights

For potential nested use. 
In addition, users might modify private no_init_weights as well.

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Remove private variable change check

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-08-09 14:23:30 -04:00
Steven Liu 0c183cc2f4
📝 update metric with evaluate (#18535) 2022-08-09 11:58:11 -05:00
Nicolas Patry 9f5fe63548
Adding a new `align_to_words` param to qa pipeline. (#18010)
* Adding a new `align_to_words` param to qa pipeline.

* Update src/transformers/pipelines/question_answering.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Import protection.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-08-09 18:50:02 +02:00
Younes Belkada ab2006e3d6
BART - Fix attention mask device issue on copied models (#18540)
* attempt to fix attn mask device

* fix bart `_prepare_decoder_attention_mask`

- add correct device
- run `make fix-copies` to propagate the fix
2022-08-09 14:47:18 +02:00
Yih-Dar 6bea7b8178
Minor update of `run_call_with_unpacked_inputs` (#18541)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-09 14:33:41 +02:00
Thomas Chaigneau 8cb5ecd912
Add mt5 onnx config (#18394)
* update features

* MT5OnnxConfig added, with updated tests and docs

* fix imports

* fix onnx_config_cls for mt5

Co-authored-by: Thomas Chaigneau <thomas.deeptools.ai>
2022-08-09 03:46:53 -04:00
Niklas Hansson fe785730dc
fix: data2vec-vision Onnx ready-made configuration. (#18427)
* feat: add the data2vec conf that are missing https://huggingface.co/docs/transformers/serialization

* fix: wrong config
2022-08-09 03:35:05 -04:00
Younes Belkada ab62a23d8c
Let's not cast them all (#18471)
* add correct dtypes when checking for params dtype

* forward contrib credits

* Update src/transformers/modeling_utils.py

Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>

* more comments

- added more comments on why we cast only floating point parameters

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: sgugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Thomas Wang <24695242+thomasw21@users.noreply.github.com>
2022-08-08 23:48:49 +02:00
AguilaCudicio 499450ed75
Spanish translation of summarization.mdx (#15947) (#18477)
* Add Spanish translation of summarization.mdx

* Apply suggestions from code review

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>

Co-authored-by: Omar U. Espejel <espejelomar@gmail.com>
2022-08-08 15:54:11 -04:00
Ian Castillo ed70f24291
Add Spanish translation of converting_tensorflow_models.mdx (#18512)
* Add file in spanish docs to be translated

* Finish translation to Spanish

* Improve Spanish wording

* Add suggested changes from review
2022-08-08 15:53:43 -04:00
Rasmus Arpe Fogh Jensen a765b68aa6
Update no_trainer.py scripts to include accelerate gradient accumulation wrapper (#18473)
* Added accelerate gradient accumulation wrapper to run_image_classification_no_trainer.py example script

* make fixup changes

* PR comments

* changed input to Accelerator based on PR comment, ran make fixup

* Added comment explaining the sync_gradients statement

* Fixed lr scheduler max steps

* Changed run_clm_no_trainer.py script to use accelerate gradient accum wrapper

* Fixed all scripts except wav2vec2 pretraining to use accelerate gradient accum wrapper

* Added accelerate gradient accum wrapper for wav2vec2_pretraining_no_trainer.py script

* make fixup and lr_scheduler step inserted back into run_qa_beam_search_no_trainer.py

* removed changes to run_wav2vec2_pretraining_no_trainer.py script and fixed using wrong constant in qa_beam_search_no_trainer.py script
2022-08-08 15:52:47 -04:00
Mishig Davaadorj f1f5de31ed
Update perf_train_gpu_one.mdx (#18532) 2022-08-08 20:33:34 +02:00
NielsRogge 82bb682643
[VideoMAE] Add model to doc tests (#18523)
* Add videomae to doc tests

* Add pip install decord

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-08 19:28:51 +02:00
Steven Liu 3632531ec6
Add example of multimodal usage to pipeline tutorial (#18498)
* 📝 add example of multimodal usage to pipeline tutorial

* 🖍 apply feedbacks

* 🖍 apply niels feedback
2022-08-08 11:31:31 -05:00
Steven Liu 36b37990af
update to use interlibrary links instead of Markdown (#18500) 2022-08-08 10:53:52 -05:00
Yih-Dar ec8d26248f
unpin resampy (#18527)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-08 17:44:10 +02:00
Sylvain Gugger 47e1676255
New cache fixes: add safeguard before looking in folders (#18522) 2022-08-08 10:22:27 -04:00
Ankur Goyal 7495924007
Specify en in doc-builder README example (#18526)
Co-authored-by: Ankur Goyal <ankur@impira.com>
2022-08-08 10:22:17 -04:00
Sylvain Gugger aff5117f46 Remove debug statement 2022-08-08 09:54:10 -04:00
Sylvain Gugger 70b0d4e193
Fix compatibility with 1.12 (#17925)
* Fix compatibility with 1.12

* Remove pin from examples requirements

* Update torch scatter version

* Fix compatibility with 1.12

* Remove pin from examples requirements

* Update torch scatter version

* fix torch.onnx.symbolic_opset12 import

* Reject bad version

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-08 09:53:08 -04:00
Sourab Mangrulkar 2fecde742d
update fsdp docs (#18521)
* updating fsdp documentation

* typo fix
2022-08-08 18:56:51 +05:30
Sylvain Gugger 377cdded7a
Clean up hub (#18497)
* Clean up utils.hub

* Remove imports

* More fixes

* Last fix
2022-08-08 08:48:10 -04:00
Nicolas Patry a4562552eb
[DX fix] Fixing QA pipeline streaming a dataset. (#18516)
* [DX fix] Fixing QA pipeline streaming a dataset.

QuestionAnsweringArgumentHandler would iterate over the whole dataset
effectively killing all properties of the pipeline.
This restores nice properties when using `Dataset` or `Generator` since
those are meant to be consumed lazily.

* Handling TF better.
2022-08-08 14:25:56 +02:00
regisss 88a0ce57bb
Add seed setting to image classification example (#18519) 2022-08-08 08:08:11 -04:00
Julien Chaumond 9129fd0377
`transformers-cli login` => `huggingface-cli login` (#18490)
* zero chance anyone's using that constant no?

* `transformers-cli login` => `huggingface-cli login`

* `transformers-cli repo create` => `huggingface-cli repo create`

* `make style`
2022-08-06 09:42:55 +02:00
Julien Chaumond 8d1f9039d0
Just re-reading the whole doc every couple of months 😬 (#18489)
* Delete valohai.yaml

* NLP => ML

* typo

* website supports https

* datasets

* 60k + modalities

* unrelated link fixing for accelerate

* Ok those links were actually broken

* Fix link

* Make `AutoTokenizer` auto-link

* wording tweak

* add at least one non-nlp task
2022-08-06 09:38:55 +02:00
Julien Chaumond b8c247b6d0
Typo reported by Joel Grus on TWTR (#18493) 2022-08-05 13:29:38 -04:00
Yih-Dar 38d656041b
disable Onnx test for google/long-t5-tglobal-base (#18454)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-05 19:27:19 +02:00
Sylvain Gugger 56a55d3ce4 Forgot one new_ for cache migration 2022-08-05 13:24:53 -04:00
Yih-Dar 9d64f7f00c
Update some expected values in `quicktour.mdx` for `resampy 0.3.0` (#18484)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-05 19:17:51 +02:00
Sylvain Gugger faacdf007b
Move cache folder to huggingface/hub for consistency with hf_hub (#18492)
* Move cache folder to just huggingface

* Thank you VsCode for this needless import

* Move to hub

* Forgot one
2022-08-05 13:14:00 -04:00
Yih-Dar 280db2e39c
Fix `test_dbmdz_english` by updating expected values (#18482)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-05 16:49:54 +02:00
Sylvain Gugger 5cd4032368
Use new huggingface_hub tools for download models (#18438)
* Draft new cached_file

* Initial draft for config and model

* Small fixes

* Fix first batch of tests

* Look in cache when internet is down

* Fix last tests

* Bad black, not fixing all quality errors

* Make diff less

* Implement change for TF and Flax models

* Add tokenizer and feature extractor

* For compatibility with main

* Add utils to move the cache and auto-do it at first use.

* Quality

* Deal with empty commit shas

* Deal with empty etag

* Address review comments
2022-08-05 10:12:40 -04:00
Sylvain Gugger 70fa1a8d26
Fix pipeline tests (#18487)
* Fix pipeline tests

* Make sure all pipelines tests run with init changes
2022-08-05 09:14:51 -04:00
Sylvain Gugger c7849d9efc
Remove py.typed (#18485) 2022-08-05 09:12:19 -04:00
Yih-Dar 893122f666
Add TF prefix to TF-Res test class (#18481)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-05 13:59:55 +02:00
Seunghwan Hong bf174f916b
Refactor `TFSwinLayer` to increase serving compatibility (#18352)
* Refactor `TFSwinLayer` to increase serving compatibility

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Fix missed parameters while refactoring

Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>

* Fix window_reverse to calculate batch size

Signed-off-by: Seunghwan Hong <harrydrippin@gmail.com>
Co-Authored-By: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2022-08-05 07:40:14 -04:00
Seunghwan Hong 575aa6ef1a
Fix TFSwinSelfAttention to have relative position index as non-trainable weight (#18226)
Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>
2022-08-05 07:39:40 -04:00
Nicolas Patry 586dcf6b21
Fixing issue where generic model types wouldn't load properly with the pipeline (#18392)
* Adding a better error message when the model is improperly configured

within transformers.

* Update src/transformers/pipelines/__init__.py

* Black version.

* Overriding task aliases so that tokenizer+feature_extractor

values are correct.

* Fixing task aliases by overriding their names early

* X.

* Fixing feature-extraction.

* black again.

* Normalizing `translation` too.

* Fixing last few corner cases.

translation needs to use its non-normalized name (translation_XX_to_YY,
so that the task_specific_params are correctly overloaded).
This can be removed and cleaned up in a later PR.

`speech-encode-decoder` actually REQUIRES to pass a `tokenizer` manually
so the error needs to be discarded when the `tokenizer` is already
there.

* doc-builder fix.

* Fixing the real issue.

* Removing dead code.

* Do not import the actual config classes.
2022-08-05 08:45:07 +02:00
Yih-Dar 14928921e2
Add `TF_MODEL_FOR_SEMANTIC_SEGMENTATION_MAPPING` (#18469)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-04 20:41:15 +02:00
Kian Sierra McGettigan 0bf1e1aca4
Update no trainer examples for QA and Semantic Segmentation (#18474)
* swag_no_trainer updated with gather_for_metrics

* Removed unused variable samples_seen

* updated examples with gather_for_metrics
2022-08-04 13:22:19 -04:00
Yih-Dar d2704c4143
Add machine type in the artifact of Examples directory job (#18459)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-04 18:52:01 +02:00
NielsRogge f9a0008d2d
Add VideoMAE (#17821)
* First draft

* Add VideoMAEForVideoClassification

* Improve conversion script

* Add VideoMAEForPreTraining

* Add VideoMAEFeatureExtractor

* Improve VideoMAEFeatureExtractor

* Improve docs

* Add first draft of model tests

* Improve VideoMAEForPreTraining

* Fix base_model_prefix

* Make model take pixel_values of shape (B, T, C, H, W)

* Add loss computation of VideoMAEForPreTraining

* Improve tests

* Improve model tests

* Make all tests pass

* Add VideoMAE to main README

* Add tests for VideoMAEFeatureExtractor

* Add integration test

* Improve conversion script

* Rename patch embedding class

* Remove VideoMAELayer from init

* Update design of patch embeddings

* Improve comments

* Improve conversion script

* Improve conversion script

* Add conversion of pretrained model

* Add loss verification of pretrained model

* Add loss verification of unnormalized targets

* Add integration test for pretraining model

* Apply suggestions from code review

* Fix bug to make feature extractor resize only shorter edge

* Address more comments

* Improve normalization of videos

* Add doc examples

* Move constants to dedicated script

* Remove scripts

* Transfer checkpoints, fix docs

* Update script

* Update image mean and std

* Fix doc tests

* Set return_tensors to NumPy by default

* Revert the previous change

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-04 18:02:55 +02:00
Thomas Wang 672b66262a
Add FX support for torch.baddbmm and torch.Tensor.baddbmm (#18363) 2022-08-04 16:02:16 +02:00
Sylvain Gugger df28de0581
Fix load of model checkpoints in the Trainer (#18470) 2022-08-04 08:22:25 -04:00