11615 Commits

Author SHA1 Message Date
Yih-Dar 7032e02032
Install `sentencepiece` in `DeepSpeed` CI image (#20795)
* Install sentencepiece in DS CI image

* update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-16 18:23:46 +01:00
NielsRogge 26dd041c6e
Add Swin2SR (#19784)
* First draft

* Add more improvements

* Improve forward pass

* Fix layernorm

* Add upscaler

* More improvements

* More improvements

* More improvements

* Improve conversion script

* Add preprocessing

* Make output match original implementation

* Add additional attributes

* Add support for more models

* Support more models

* Add support for real world sr

* Add initial Swin2SRFeatureExtractor

* Add ImageSuperResolutionOutput

* Make more tests pass

* Use BaseModelOutput

* Fix one more test

* Fix more tests

* Fix another test

* Fix all tests

* Rename to Swin2SRImageProcessor

* Fix toctree

* Fix toctree

* Fix rebase

* Improve Swin2SRImageProcessor

* Remove feature extractor file

* Improve model

* Improve conversion script

* Fix integration test

* Fix init

* Fix conversion script

* Address comments

* Improve upsampler

* Add NearestConvUpsampler

* Improve pixel shuffle upsampler

* Improve auxiliary upsampler

* Improve conversion script

* Rename conv_last to final_convolution

* Fix rebase

* Improve upsample module

* Add padding to image processor

* Fix bug

* Update padding

* Remove print statement and fix integration test

* Improve docs

* Add image processor tests

* Convert all checkpoints, fix tests

* Remove print statements

* Fix import

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-12-16 16:24:01 +01:00
NielsRogge 7f99861218
Add Universal Segmentation class + mapping (#20766)
* Add mapping

* Add mapping to pipeline

* Apply suggestions

* Fix feature extractor tests

* Use ForInstance, add model to universal mapping

* More fixes

* Remove model from deprecated objects

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-12-16 14:22:46 +01:00
Matt e65445b4d6
Stop calling expand_1d on newer TF versions (#20786) 2022-12-16 13:10:07 +00:00
Nicolas Patry 3ee958207a
Fix object detection2 (#20798)
* Revert "Fixing object detection with `layoutlm` (#20776)"

This reverts commit fca66abe2a.

* Better fix for layoutlm object detection.

* Style.
2022-12-16 13:25:36 +01:00
Younes Belkada 4341f4e224
[Pipeline] skip feature extraction test if in `IMAGE_PROCESSOR_MAPPING` (#20790)
skip feature extraction test if in `IMAGE_PROCESSOR_MAPPING`
2022-12-16 12:46:58 +01:00
Yih-Dar 1543cee7c8
Recompile `apex` in `DeepSpeed` CI image (#20788)
Recompile apex in DeepSpeed CI image

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-15 21:35:27 +01:00
amyeroberts 491e951875
Move convert_to_rgb to image_transforms module (#20784)
* Move convert_to_rgb to image_transforms module

* Fix tests
2022-12-15 18:47:04 +00:00
Joao Gante 4bc723f87d
Generate: use `GenerationConfig` as the basis for `.generate()` parametrization (#20388)
* generate from config mvp

* fix failing tests

* max_time test

* Load default gen config at model load time; Update docs

* further documentation; add tests

* adapt rag to the new structure

* handle models not instantiated with from_pretrained (like in tests)

* better default generation config

* add can_generate fn

* handle legacy use case of ad hoc model config changes

* initialize gen config from config in individual methods, if gen config is none

* fix _get_decoder_start_token_id when called outside GenerationMixin

* correct model config load order (set attr > model config > decoder config)

* update rag to match latest changes

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* load gen config from model config in model.from_pretrained

* fix can_generate fn

* handle generate calls without a previous from_pretrained (e.g. tests)

* add legacy behavior (and a warning)

* lower logger severity

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-15 18:27:20 +00:00
Yih-Dar b1706f6908
Install video dependency for pipeline CI (#20777)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-15 18:47:05 +01:00
Nicolas Patry fca66abe2a
Fixing object detection with `layoutlm` (#20776)
* Fixing object detection with layoutlm.

* Fixup.
2022-12-15 18:46:43 +01:00
Younes Belkada 8891193e83
[Pipeline] fix failing bloom `pipeline` test (#20778)
fix failing `pipeline` test
2022-12-15 18:46:00 +01:00
Lars Mennen b9b70b0e66
Patch for FlanT5-XXL 8bit support (#20760)
* Workaround for #20287: FlanT5-XXL 8bit support

* Make fix-copies

* revert unrelated change

* Don't apply to longt5 and switch transformers
2022-12-15 12:26:58 -05:00
Yih-Dar fe9152f67c
Install vision for TF pipeline tests (#20771)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-15 11:16:37 +01:00
Nicolas Patry a9912d2fca
Even more validation. (#20762)
* Even more validation.

* Fixing order.
2022-12-15 10:05:54 +01:00
NielsRogge 67acb07e9e
Add Swin backbone (#20769)
* Add Swin backbone

* Remove line

* Add code example

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-12-14 19:35:28 +01:00
Yih-Dar 94f8e21c70
Install `torch-tensorrt 1.3.0` for DeepSpeed CI (#20764)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-14 17:30:36 +01:00
amyeroberts 7b23a582b9
Replaces xxx_required with requires_backends (#20715)
* Replaces xxx_required with requires_backends

* Fixup
2022-12-14 14:38:44 +00:00
Arthur 7c9e2f248c
[CI-Test] Fixes but also skips the mT5 tests (#20755)
* weight -> weights

* model embedding resize does not work with both v2 and normal

* remove useless test
2022-12-14 15:36:04 +01:00
casuallyName dfd818420d
Fix attribute error problem (#20765)
fix: Trainer could not use the use_legacy_prediction_loop argument

Resolves the AttributeError: 'PredictionOutput' object has no attribute 'num_samples' raised when the use_legacy_prediction_loop argument makes the predict stage run predictions through prediction_loop.

Co-authored-by: ZhouHang <zhouhang@idataway.com>
2022-12-14 09:26:06 -05:00
NielsRogge 11745b4e45
[Tests] Improve test_attention_outputs (#20701)
* Improve tests

* Improve TF tests

* Apply suggestion

* Fix test

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-12-14 14:41:40 +01:00
Yih-Dar 722bf7efcc
Fix missing `()` in some usage of `is_flaky` (#20749)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-14 11:37:29 +01:00
amyeroberts 9bafedc0fa
Remove image_transforms functions from init (#20704) 2022-12-14 10:17:11 +00:00
Yih-Dar d994473b05
Uninstall `torch_tensorrt` in `DeepSpeed` CI image for now (#20758)
Uninstall torch_tensorrt for now

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-13 22:25:47 +01:00
Nicolas Patry ba9da49aa2
Fixing the pipeline tutorial test (#20746)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-13 19:08:30 +01:00
Hazrul Akmal f28c918c7e
Add docs xlm roberta (#20742)
* added model resources for xlm-roberta

* added model resources for xlm-roberta

* resolve suggested changes

* add resources to xlm-roberta
2022-12-13 09:25:55 -08:00
NielsRogge 6ef42587ae
[NAT, DiNAT] Add backbone class (#20654)
* Add first draft

* Add out_features attribute to config

* Add corresponding test

* Add Dinat backbone

* Add BackboneMixin

* Add Backbone mixin, improve tests

* Fix embeddings

* Fix bug

* Improve backbones

* Fix Nat backbone tests

* Fix Dinat backbone tests

* Apply suggestions

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-12-13 17:06:59 +01:00
dhansmair 30d8919ab1
in the resize() function in image_transforms.py, the line 267: (#20728)
`image = to_channel_dimension_format(image, ChannelDimension.LAST)`
is redundant as this same conversion is also applied in to_pil_image().

This redundant call actually makes the training fail in rare cases.
The problem can be reproduced with the following code snippet:
```
import torch

from transformers.models.clip import CLIPFeatureExtractor
vision_processor = CLIPFeatureExtractor.from_pretrained('openai/clip-vit-large-patch14')
images = [
    torch.rand(size=(3, 2, 10), dtype=torch.float),
    torch.rand(size=(3, 10, 1), dtype=torch.float),
    torch.rand(size=(3, 1, 10), dtype=torch.float)
]
for image in images:
    processed_image = vision_processor(images=image, return_tensors="pt")['pixel_values']
    print(processed_image.shape)
    assert processed_image.shape == torch.Size([1, 3, 224, 224])
```

The last image has a height of 1 pixel.
The second call to to_channel_dimension_format() will transpose the image, and the height
dimension is wrongly treated as the channels dimension afterwards.
Because of this, the following normalize() step will result in an
exception.
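
The failure mode described above can be sketched without the library. This is a minimal illustration only: `infer_channel_axis` and `to_channels_last` are hypothetical helpers that assume a shape-based heuristic (an axis of size 1 or 3 is taken to be the channel axis, checking the first axis before the last), not the actual transformers implementation.

```python
import numpy as np


def infer_channel_axis(shape):
    """Hypothetical heuristic: guess the channel axis of a 3-D image shape."""
    if shape[0] in (1, 3):
        return 0  # looks like channels-first
    if shape[-1] in (1, 3):
        return 2  # looks like channels-last
    raise ValueError(f"Cannot infer channel axis from shape {shape}")


def to_channels_last(image):
    """Move the (guessed) channel axis to the end; no-op if already last."""
    if infer_channel_axis(image.shape) == 0:
        return np.transpose(image, (1, 2, 0))
    return image


# Channels-first image with 3 channels, height 1, width 10.
image = np.zeros((3, 1, 10))

once = to_channels_last(image)   # correct: (height, width, channels)
twice = to_channels_last(once)   # redundant call: height 1 is mistaken for channels

print(once.shape)   # (1, 10, 3)
print(twice.shape)  # (10, 3, 1)
```

With a height greater than 1 that is not 3, the second call is a harmless no-op, which is consistent with the failure only appearing in rare cases.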
2022-12-13 08:55:08 -05:00
Matt 4f1788b34d
Fix AdamWeightDecay for TF 2.11 (#20735)
* Fix AdamWeightDecay for TF

* Fix AdamWeightDecay for TF

* make fixup
2022-12-13 12:51:07 +00:00
Yih-Dar a12c5cbcd8
Change a logic in pipeline test regarding TF (#20710)
* Fix the pipeline test regarding TF

* Fix the pipeline test regarding TF

* update comment

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-13 13:42:36 +01:00
Younes Belkada 1af4bee896
Add `keep_in_fp32_modules` support (#20683)
* add `keep_in_fp32_modules` support

* pass it as class attribute

* few modifs

- make tests `slow`
- fix logic

* better logic

* fix failing test

* `bfloat16` support

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix

* simplify tests

* simplify tests

* fix test

* modify message

* more checks

* fix failing tests

* add more conditions

- add `is_accelerate_available`
- fixes pipeline tests that failed

* add suggestions

* Update src/transformers/modeling_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix failing `bnb` test

* add last safety checker

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-13 11:59:57 +01:00
Yih-Dar d4bf9ee1ff
Update CI to torch 1.13.0 (#20687)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-12 20:04:56 +01:00
Yih-Dar f41a11a16f
rename `layoutlm_job` to `exotic_models_job` (#20736)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-12 20:02:16 +01:00
amyeroberts 1416b5d9d8
Add decorator for flaky Donut tests (#20739)
* Add decorator for flaky tests

* Fix up
2022-12-12 18:25:27 +00:00
Sylvain Gugger a450789d9a
Disambiguate test for required_input in tokenization base file. (#20731)
* Disambiguate test for required_input in tokenization base file.

* Add test for size
2022-12-12 13:13:09 -05:00
Sylvain Gugger 29ff8716a2
Add a progress bar for large model loading (#20713) 2022-12-12 13:12:56 -05:00
Ariel Ekgren 5f94855dc3
Add gpt-sw3 model to transformers (#20209)
* Add templates for gpt-sw3

* Add templates for gpt-sw3

* Added sentencepiece tokenizer

* intermediate commit with many changes

* fixed conflicts

* Init commit for tokenization port

* Tokenization progress

* Remove fast tokenizer

* Clean up and rename spm.model -> spiece.model

* Remove TF -> PT conversion script template, Clean up Megatron -> PT script

* Optimize encode & decode performance

* added new attention

* added new attention

* attention for gpt-sw3 working

* attention good

* Cache is now working

* fixed attention mask so that it works with causal attention

* fixed baddbmm bug for cpu and caching

* updated config with correct parameters

* Refactor and leave optimizations as separate functions to avoid breaking expected functionality

* Fix special tokens mapping for both tokenizers

* cleaning up of code and comments

* HF compatible attention outputs

* Tokenizer now passing tests, add documentation

* Update documentation

* reverted back to base implementation after checking that it is identical to pretrained model

* updated gpt-sw3 config

* updated conversion script

* aligned parameters with gpt-sw3 config

* changed default scale_attn_by_inverse_layer_idx to true

* removed flag from conversion script

* added temporary model path

* reverted back to functioning convert script

* small changes to default config

* updated tests for gpt-sw3

* make style, make quality, minor cleanup

* Change local paths to testing online repository

* Change name: GptSw3 -> GPTSw3

* Remove GPTSw3TokenizerFast references

* Use official model repository and add more model sizes

* Added reference to 6.7b model

* Add GPTSw3DoubleHeadsModel to IGNORE_NON_AUTO_CONFIGURED, like GPT2DoubleHeadsModel

* Remove pointers to non-existing TFGPTSw3

* Add GPTSw3 to docs/_toctree.yml

* Remove TF artifacts from GPTSw3 in __init__ files

* Update READMEs with 'make fix-copies'

* Add 20b model to archive list

* Add documentation for GPT-Sw3

* Fix typo in documentation for GPT-Sw3

* Do 'make fix-copies' again after having updated docs

* Fix some typos in docs

* Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/gpt_sw3/test_tokenization_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Resolve comments from PR feedback

* Resolve more comments from PR feedback, also set use_cache=True in convert script

* Add '# Copied from' comments for GPTSw3 modeling

* Set 'is_parallelizable = False'

* Remove '# Copied from' where code was modified and add 'with x->y' when appropriate

* Remove parallelize in mdx

* make style, make quality

* Update GPTSw3Config default values and corresponding documentation

* Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Clean up and protect GPTSw3Tokenizer imports with is_sentencepiece_available

* Make style, make quality

* Add dummy object for GPTSw3Tokenizer via 'make fix-copies'

* make fix-copies

* Remove GPTSw3 modeling classes

* make style, make quality

* Add GPTSw3 auto-mappings for other GPT2 heads

* Update docs/source/en/model_doc/gpt-sw3.mdx

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove old TODO-comment

* Add example usage to GPTSw3Tokenizer docstring

* make style, make quality

* Add implementation details and example usage to gpt-sw3.mdx

Co-authored-by: JoeyOhman <joeyoh@kth.se>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-12 13:12:13 -05:00
amyeroberts b58beebe72
Add vision requirement to image transforms (#20712)
* Add require_vision decorator

* Fixup

* Use requires_backends

* Add requires_backend to utils functions
2022-12-12 17:43:45 +00:00
Steven Liu fd2bed7f9f
Clarify return_tensor and return_text parameters (#20662)
* clarify docstring

* make style
2022-12-12 09:16:13 -08:00
Matt c1b9a11dd4
Convert tokenizer outputs for Keras in doc example (#20732)
* Convert tokenizer outputs for Keras in doc example

* Also fix the German example
2022-12-12 16:14:04 +00:00
Juanjo do Olmo 0ba94aceb6
Spanish translation of the file debugging.mdx (#20566)
* Create and translate to Spanish debugging.mdx

* solved typo error in a header

* Update debugging.mdx

* Update debugging.mdx

* Update docs/source/es/debugging.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/debugging.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/debugging.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/debugging.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/debugging.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update _toctree.yml

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-12 10:38:56 -05:00
Sourab Mangrulkar a413c725d4
fsdp fix (#20719) 2022-12-12 20:37:52 +05:30
stanleycai95 17c742bbf5
Very small edit to change name to OpenAI GPT (#20722) 2022-12-12 09:43:43 -05:00
Ian C 8f1f59ce86
Add type hints for Whisper models (#20396)
* Initial commit

* Add type hints for two major classes

* Run make fixup

* Fix output type for Whisper

* Run isort to fix imports
2022-12-12 14:39:21 +00:00
Nicolas Patry 53357e8196
Adding ValueError when incompatible parameters are used. (#20729) 2022-12-12 15:39:13 +01:00
Yih-Dar 5ba2dbd9b1
Fix `AutoModelTest.test_model_from_pretrained` (#20730)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-12-12 15:37:43 +01:00
Peter a3345c1f13
Add `accelerate` support for LongT5 models (#20341)
* add accelerate support for LongT5 models

Signed-off-by: peter szemraj <peterszemraj@gmail.com>

* fix `accelerate` tests

* Trigger CI test

Signed-off-by: peter szemraj <peterszemraj@gmail.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
2022-12-12 09:25:52 -05:00
Alberto Mario Ceballos-Arroyo 8286af6f54
Spanish translation of asr.mdx and add_new_pipeline.mdx (#20569)
* Fix minor typo in question_answering.mdx

* Fixes minor typo in the english version of tasks/asr.mdx

* Update _toctree.yml

* Translate add_new_pipeline.mdx into Spanish

* Fixes some typos in the English version of add_new_pipeline.mdx

* Translate asr.mdx into Spanish

* Fixes small typos in add_new_pipeline.mdx

* Update docs/source/es/add_new_pipeline.mdx

Suggestion by @osanseviero

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/add_new_pipeline.mdx

Suggestion by @osanseviero: use "biblioteca" instead of "librería."

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/tasks/asr.mdx

Suggestion by @osanseviero.

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/add_new_pipeline.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/add_new_pipeline.mdx

Suggestion by @osanseviero.

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/add_new_pipeline.mdx

Suggestion by @osanseviero.

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/add_new_pipeline.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/tasks/asr.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/tasks/asr.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update docs/source/es/tasks/asr.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>

* Update asr.mdx

Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>
2022-12-12 09:23:23 -05:00
Salvo Cavallaro 8d2fca07e8
Made LUKE Tokenizer independent from RoBERTa (#20720) 2022-12-12 09:22:08 -05:00
Sylvain Gugger 799cea64ac
Fix rendering issue in quicktour (#20708)
* Fix rendering issue in quicktour

* Separate in two blocks
2022-12-09 13:51:35 -05:00