Fix#18385
I don't know whether `use_auth_token`, `cache_dir` and `local_files_only` should be passed to `(cls.slow_tokenizer_class)._from_pretrained`, but I guess it should.
* Bloom model can now be traced
* Bloom traced model can be torch scripted and serialized
* Bloom can be traced with variable keyword arguments
* Enable XLNet support
* Disable XLNet for now
Currently, tensorflow examples use the `load_metric` function from
Datasets library, commit migrates function call to `load` function
from Evaluate library.
* Migrate metric to Evaluate library in tf examples
Currently tensorflow examples use `load_metric` function from Datasets
library , commit migrates function call to `load` function to
Evaluate library.
Fix for #18306
* Migrate metric to Evaluate library in tf examples
Currently tensorflow examples use `load_metric` function from Datasets
library , commit migrates function call to `load` function to
Evaluate library.
Fix for #18306
* Migrate `metric` to Evaluate for all tf examples
Currently tensorflow examples use `load_metric` function from Datasets
library , commit migrates function call to `load` function to
Evaluate library.
Left the term fine-tuning since there is no correct translation into Italian and the English term is generally used. The same was done with some terms like "learning rate"
* start from 1.12, torch_ccl is renamed as oneccl_bindings_for_pytorch and should import it before use
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* add doc for perf_train_cpu_many
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* update doc
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
* Add files generated using transformer-cli add-new-model-like command
* Add changes for swinv2 attention and forward method
* Add fixes
* Add modifications for weight conversion and remaining args in swin model
* Add changes for patchmerging
* Add changes for SwinV2selfattention
* Update conversion script
* Add final fixes for the swin_v2 model
* Add changes for conversion script for pretrained window size case
* Add pretrained window size value from config in SwinV2Encoder class
* Make fixup
* Add swinv2 to models_not_in_readme to utils/check_copies.py
* Modify Swinv2v2 to Swin Transformer V2
* Remove copied from, to run make fixup command
* Add updates to swinv2tf from main branch
* Add pretrained_window_size to config, to make tests pass
* Add modified weights from nandwalritik profile for swinv2
* Update model weights from swinv2 from nandwalritik profile
* Add fix for build_pr_documentation CI fix
* Add fixes for weight conversion
* Add change to make input with padding work
* Add fixes for test cases
* Add few changes from swin to swinv2 to pass test cases
* Remove tests for tensorflow as swinv2 for TF is not added yet
* Overide test_pt_tf_model_equivalence function as TF implementation for swinv2 is not added yet
* Add modeling_tf_swinv2 to _ignore_modules as test file is removed for this one right now.
* Update docs url for swinv2 in README.md
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Undo changes for check_repo
* Update url in readme.md
* Remove overrided function to test pt_tf_model_equivalence
* Remove TF model imports for Swinv2 as its not implemented in this PR
* Add changes for index.mdx
* Add swinv2 papers link,abstract and contributors details
* Rename cpb_mlp to continous_position_bias_mlp
* Add tips for swinv2 model
* Update src/transformers/models/swinv2/configuration_swinv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update src/transformers/models/swinv2/configuration_swinv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Fix indentation for docstring example in src/transformers/models/swinv2/configuration_swinv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update import order in src/transformers/models/swinv2/configuration_swinv2.py
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Add copyright statements in weights conversion script.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Remove Swinv2 from models_not_in_readme
* Reformat code
* Remove TF implementation file for swinv2
* Update start docstring.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Add changes for docstring
* Update orgname for weights to microsoft
* Remove to_2tuple function
* Add copied from statements wherever applicable
* Add copied from to Swinv2ForMaskedImageModelling class
* Reformat code.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Add unittest.skip(with reason.) for test_inputs_embeds test case.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Add updates for test_modeling_swinv2.py
* Add @unittest.skip() annotation for clarity to create_and_test_config_common_properties function
* Add continuous_position_bias_mlp parameter to conversion script
* Add test for testing masked_image_modelling for swinv2
* Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update Swinv2 to Swin Transformer v2 in docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Add suggested changes
* Add copied from to forward methods of Swinv2Stage and Swinv2Encoder
* Add push_to_hub flag to weight conversion script
* Change order or Swinv2DropPath class
* Add id2label mapping for imagenet 21k
* Add updated url for SwinV2 functions and classes used in implementation
* Update input_feature dimensions format, mentioned in comments.
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
* Add suggested changes for modeling_swin2.py
* Update docs
* Remove create_and_test_config_common_properties function, as test_model_common_attributes is sufficient.
* Fix indentation.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Add changes for making Nit objects in code style
* Add suggested changes
* Add suggested changes for test_modelling_swinv2
* make fix-copies
* Update docs/source/en/model_doc/swinv2.mdx
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Alara Dirik <8944735+alaradirik@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Fixes torch jit tracing for LayoutLMv2 model.
Pytorch seems to reuse memory for input_shape which caused a mismatch in shapes later in the forward pass.
* Fixed code quality
* avoid unneeded allocation of vector for shape
* add info about megatron training
* upload models and datasets from CodeParrot organization
* upload models and datasets from CodeParrot organization
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Update examples/research_projects/codeparrot/README.md
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* fix typo and add comment about codeparrot vs megatron
Co-authored-by: Leandro von Werra <lvwerra@users.noreply.github.com>
* Improve docs
* Improve docs of speech one as well
* Apply suggestions from code review
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Removes a duplicated instantiation of device. I removed the second instance of the line to maintain code alignment with the GPT-J implementation of forward.