Commit Graph

10336 Commits

Author SHA1 Message Date
NielsRogge 7b9e995b70
Fix docs (#18399)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-01 17:02:51 +02:00
Sylvain Gugger e0bc4c73e8
Add balanced strategies for device_map in from_pretrained (#18349)
* Add balanced strategies for device_map in from_pretrained

* Add safeguards for Accelerate version

* Update src/transformers/modeling_utils.py

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>

* Style

Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>
2022-08-01 10:28:26 -04:00
NielsRogge 39e76d76fd
Fix doc tests (#18397)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-08-01 15:56:10 +02:00
Arthur 1141371103
Fix OPT doc tests (#18365) 2022-08-01 15:19:45 +02:00
Sylvain Gugger af1e6b4d87
Add evaluate to test dependencies (#18396) 2022-08-01 08:55:44 -04:00
Yih-Dar bd6d1b4300
Add a check regarding the number of occurrences of ``` (#18389)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-01 14:23:02 +02:00
YouJiacheng 1cd7c6f154
Fix from_pretrained kwargs passing (#18387)
Fix #18385
I don't know whether `use_auth_token`, `cache_dir` and `local_files_only` should be passed to `(cls.slow_tokenizer_class)._from_pretrained`, but I guess it should.
2022-08-01 08:16:24 -04:00
amyeroberts 96b5d7db9c
Remove pt-like calls on tf tensor (#18393) 2022-08-01 13:06:30 +01:00
Ogundepo Odunayo 679d68a11b
Correct the spelling of bleu metric (#18375) 2022-08-01 07:51:27 -04:00
atturaioe 1f84399171
Migrate metric to Evaluate in Pytorch examples (#18369)
* Migrate metric to Evaluate in pytorch examples

* Remove unused imports
2022-08-01 07:40:25 -04:00
dependabot[bot] 25ec12eaf7
Bump mistune from 0.8.4 to 2.0.3 in /examples/research_projects/lxmert (#18370)
Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3.
- [Release notes](https://github.com/lepture/mistune/releases)
- [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst)
- [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3)

---
updated-dependencies:
- dependency-name: mistune
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-01 04:46:57 -04:00
dependabot[bot] a7360385f4
Bump mistune in /examples/research_projects/visual_bert (#18371)
Bumps [mistune](https://github.com/lepture/mistune) from 0.8.4 to 2.0.3.
- [Release notes](https://github.com/lepture/mistune/releases)
- [Changelog](https://github.com/lepture/mistune/blob/master/docs/changes.rst)
- [Commits](https://github.com/lepture/mistune/compare/v0.8.4...v2.0.3)

---
updated-dependencies:
- dependency-name: mistune
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-08-01 04:46:31 -04:00
Sourab Mangrulkar b2e4b091f0
fix FSDP ShardedGradScaler (#18358)
renaming it
2022-07-30 10:07:56 +05:30
Yih-Dar 51227e26ab
Fix TFSegformerForSemanticSegmentation doctest (#18362)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-29 16:30:59 +02:00
Michael Benayoun 4e2f4a92dd
[FX] Symbolic trace for Bloom (#18356)
* Bloom model can now be traced

* Bloom traced model can be torch scripted and serialized

* Bloom can be traced with variable keyword arguments

* Enable XLNet support

* Disable XLNet for now
2022-07-29 16:12:27 +02:00
Yih-Dar 1763770bd9
Fix some doctests (#18359)
* Fix some doctests

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-29 14:13:28 +02:00
Sylvain Gugger 986526a0e4
Replace `as_target` context managers by direct calls (#18325)
* Preliminary work on tokenizers

* Quality + fix tests

* Treat processors

* Fix pad

* Remove all uses of  in tests, docs and examples

* Replace all as_target_tokenizer

* Fix tests

* Fix quality

* Update examples/flax/image-captioning/run_image_captioning_flax.py

Co-authored-by: amyeroberts <amy@huggingface.co>

* Style

Co-authored-by: amyeroberts <amy@huggingface.co>
2022-07-29 08:09:09 -04:00
Yih-Dar a64bcb564d
Fix OwlViT torchscript tests (#18347)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-29 10:36:04 +02:00
Sanchit Gandhi a4ee463d95
[Docs] Fix Speech Encoder Decoder doc sample (#18346)
* [Docs] Fix Speech Encoder Decoder doc sample

* improve pre-processing comment

* make style
2022-07-29 09:11:28 +01:00
Vijay S Kalmath da503ea02f
Migrate metrics used in flax examples to Evaluate (#18348)
Currently, tensorflow examples use the `load_metric` function from
Datasets library, commit migrates function call to `load` function
from Evaluate library.
2022-07-28 15:06:23 -04:00
Vijay S Kalmath a2586795e5
Migrate metric to Evaluate library for tensorflow examples (#18327)
* Migrate metric to Evaluate library in tf examples

Currently tensorflow examples use `load_metric` function from Datasets
library , commit migrates function call to `load` function to
Evaluate library.

Fix for #18306

* Migrate metric to Evaluate library in tf examples

Currently tensorflow examples use `load_metric` function from Datasets
library , commit migrates function call to `load` function to
Evaluate library.

Fix for #18306

* Migrate `metric` to Evaluate for all tf examples

Currently tensorflow examples use `load_metric` function from Datasets
library , commit migrates function call to `load` function to
Evaluate library.
2022-07-28 14:24:27 -04:00
Thomas Wang 7b0908769b
[BLOOM] Deprecate `position_ids` (#18342) 2022-07-28 20:21:43 +02:00
Ankur Goyal 9c336657a9
Include tensorflow-aarch64 as a candidate (#18345)
Co-authored-by: Ankur Goyal <ankur@impira.com>
2022-07-28 12:45:02 -04:00
Yih-Dar b53dab601c
Remove Flax OPT from doctest for now (#18338)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-07-28 11:50:44 -04:00
Loubna Ben Allal 286a18fa00
Fix codeparrot deduplication - ignore whitespaces (#18023)
* ignore whitspaces for hash

* reformat code

* Update README.md
2022-07-28 15:58:26 +02:00
bhuang 5d1fed0740
Update automatic_speech_recognition.py (#18339) 2022-07-28 09:53:03 -04:00
Nicola Procopio 985c7e3ac9
Updated _toctree.yml (#18337) 2022-07-28 09:04:32 -04:00
Edoardo Federici a8e279579b
updated translation (#18333)
Left the term fine-tuning since there is no correct translation into Italian and the English term is generally used. The same was done with some terms like "learning rate"
2022-07-28 08:14:15 -04:00
Edoardo Federici 1e380c7dcb
fixed typo (#18331) 2022-07-28 06:14:56 -04:00
Steven Liu 96be1b7f49
Update feature extractor docs (#18324)
As pointed out by @NielsRogge, a feature extractor is used to prepare inputs for a model with a single modality rather than multimodal models.
2022-07-27 15:32:57 -05:00
Wang, Yi 2b81f72be9
start from 1.12, torch_ccl is renamed as oneccl_bindings_for_pytorch … (#18229)
* start from 1.12, torch_ccl is renamed as oneccl_bindings_for_pytorch and should import it before use

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* add doc for perf_train_cpu_many

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

* update doc

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2022-07-27 11:15:41 -04:00
Ritik Nandwal e87ac9d18b
Add swin transformer v2 (#17469)
* Add files generated using transformer-cli add-new-model-like command

* Add changes for swinv2 attention and forward method

* Add fixes

* Add modifications for weight conversion and remaining args in swin model

* Add changes for patchmerging

* Add changes for SwinV2selfattention

* Update conversion script

* Add final fixes for the swin_v2 model

* Add changes for conversion script for pretrained window size case

* Add pretrained window size value from config in SwinV2Encoder class

* Make fixup

* Add swinv2 to models_not_in_readme to utils/check_copies.py

* Modify Swinv2v2 to Swin Transformer V2

* Remove copied from, to run make fixup command

* Add updates to swinv2tf from main branch

* Add pretrained_window_size to config, to make tests pass

* Add modified weights from nandwalritik profile for swinv2

* Update model weights from swinv2 from nandwalritik profile

* Add fix for build_pr_documentation CI fix

* Add fixes for weight conversion

* Add change to make input with padding work

* Add fixes for test cases

* Add few changes from swin to swinv2 to pass test cases

* Remove tests for tensorflow as swinv2 for TF is not added yet

* Overide test_pt_tf_model_equivalence function as TF implementation for swinv2 is not added yet

* Add modeling_tf_swinv2 to _ignore_modules as test file is removed for this one right now.

* Update docs url for swinv2 in README.md

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Undo changes for check_repo

* Update url in readme.md

* Remove overrided function to test pt_tf_model_equivalence

* Remove TF model imports for Swinv2 as its not implemented in this PR

* Add changes for index.mdx

* Add swinv2 papers link,abstract and contributors details

* Rename cpb_mlp to continous_position_bias_mlp

* Add tips for swinv2 model

* Update src/transformers/models/swinv2/configuration_swinv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/swinv2/configuration_swinv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Fix indentation for docstring example in src/transformers/models/swinv2/configuration_swinv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update import order in src/transformers/models/swinv2/configuration_swinv2.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add copyright statements in weights conversion script.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Remove Swinv2 from models_not_in_readme

* Reformat code

* Remove TF implementation file for swinv2

* Update start docstring.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add changes for docstring

* Update orgname for weights to microsoft

* Remove to_2tuple function

* Add copied from statements wherever applicable

* Add copied from to Swinv2ForMaskedImageModelling class

* Reformat code.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add unittest.skip(with reason.) for test_inputs_embeds test case.

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add updates for test_modeling_swinv2.py

* Add @unittest.skip() annotation for clarity to create_and_test_config_common_properties function

* Add continuous_position_bias_mlp parameter to conversion script

* Add test for testing masked_image_modelling for swinv2

* Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Add suggested changes

* Add copied from to forward methods of Swinv2Stage and Swinv2Encoder

* Add push_to_hub flag to weight conversion script

* Change order or Swinv2DropPath class

* Add id2label mapping for imagenet 21k

* Add updated url for SwinV2 functions and classes used in implementation

* Update input_feature dimensions format, mentioned in comments.

Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>

* Add suggested changes for modeling_swin2.py

* Update docs

* Remove create_and_test_config_common_properties function, as test_model_common_attributes is sufficient.

* Fix indentation.

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add changes for making Nit objects in code style

* Add suggested changes

* Add suggested changes for test_modelling_swinv2

* make fix-copies

* Update docs/source/en/model_doc/swinv2.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-27 11:14:47 -04:00
Lysandre c89a592e87 Dev version 2022-07-27 17:13:57 +02:00
Sanchit Gandhi 7490a97cac
[Flax] Fix incomplete batches in example scripts (#17863)
* [Flax] Fix incomplete batches in example scripts

* fix dataloader batching

* convert jnp batch idxs to np array

* add missing `pad_shard_unpad` to final prediction generate step

* only `pad_shard_unpad` at inference time

* merge conflicts

* remove incomplete batch step from eval

* fix run_qa.py

* add `pad_shard_unpad` to run_flax_ner.py

* add `pad_shard_unpad` to run_flax_glue.py

* add `pad_shard_unpad` to run_image_classification.py

* make style

* fix mlm flax eval batches

* remove redundant imports
2022-07-27 15:50:47 +01:00
Alara Dirik 9caf68a638
Owlvit test fixes (#18303)
* fix owlvit test assertion errors

* fix gpu test error

* remove redundant lines

* fix styling
2022-07-27 17:26:27 +03:00
Sylvain Gugger 0077360d67
Fix sacremoses sof dependency for Transformers XL (#18321)
* Fix sacremoses sof dependency for Transofmers XL

* Add function to the submodule init
2022-07-27 09:37:02 -04:00
Lysandre Debut 5c5676cdf9
sentencepiece shouldn't be required for the fast LayoutXLM tokenizer (#18320) 2022-07-27 09:09:32 -04:00
Sylvain Gugger cf32b2ee42
Remove all uses of six (#18318)
* Remove all uses of six

* fix quality
2022-07-27 08:39:09 -04:00
Duong A. Nguyen 170fcaa604
Generalize decay_mask_fn to apply mask to all LayerNorm params (#18273)
* generalize decay_mask_fn to find all layernorm params

* fixup

* generalising decay_mask_fn
2022-07-27 12:23:57 +01:00
Nouamane Tazi 83d2d74509
fix loading from pretrained for sharded model with `torch_dtype="auto" (#18061) 2022-07-27 07:20:35 -04:00
Younes Belkada 7996ef74dd
fix module order (#18312)
- put gelu before 4h to h
2022-07-27 07:06:01 -04:00
Mikkel Denker 70e7d1d656
Fixes torch jit tracing for LayoutLMv2 model (re-open) (#18313)
* Fixes torch jit tracing for LayoutLMv2 model.
Pytorch seems to reuse memory for input_shape which caused a mismatch in shapes later in the forward pass.

* Fixed code quality

* avoid unneeded allocation of vector for shape
2022-07-27 06:38:40 -04:00
Loubna Ben Allal 1d71ad8905
Update CodeParrot readme to include training in Megatron (#17798)
* add info about megatron training

* upload models and datasets from CodeParrot organization

* upload models and datasets from CodeParrot organization

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* Update examples/research_projects/codeparrot/README.md

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>

* fix typo and add comment about codeparrot vs megatron

Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
2022-07-27 11:59:08 +02:00
Yanming Wang d5610b53fa
[XLA] Improve t5 model performance (#18288) 2022-07-27 10:44:14 +02:00
Seunghwan Hong e318cda9ee
Apply type correction to `TFSwinModelOutput` (#18295)
Signed-off-by: Seunghwan Hong <seunghwan@scatterlab.co.kr>
2022-07-27 04:35:56 -04:00
NielsRogge ccd4180f8a
[EncoderDecoder] Improve docs (#18271)
* Improve docs

* Improve docs of speech one as well

* Apply suggestions from code review

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-07-27 10:08:59 +02:00
Manuel R. Ciosici 5dfec704da
Remove duplicated line (#18310)
Removes a duplicated instantiation of device. I removed the second instance of the line to maintain code alignment with the GPT-J implementation of forward.
2022-07-27 04:00:47 -04:00
NielsRogge 47c2af0951
[DETR] Improve code examples (#18262)
* Improve doc test

* Improve code example of segmentation model

* Apply suggestion

* Update src/transformers/models/detr/modeling_detr.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-27 09:54:41 +02:00
Carolyn Wang ee67e7ad4f
patch for smddp import (#18244)
* add import

* format
2022-07-26 16:00:24 -04:00
Matt 68097dcce0
Fix Sylvain's nits on the original KerasMetricCallback PR (#18300)
* Fix Sylvain's nits on the original PR

* Update src/transformers/keras_callbacks.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Re-add "optional" to docstring

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-07-26 17:08:16 +01:00