13776 Commits

Author SHA1 Message Date
Younes Belkada 3170af71e1
[`Detr`] Fix detr BatchNorm replacement issue (#25230)
* fix detr weird issue

* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix copies

* fix copies

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-08-01 12:21:48 +02:00
Younes Belkada 05ebb0264e
[`MPT`] Add `require_bitsandbytes` on MPT integration tests (#25201)
* add  `require_bitsandbytes` on MPT integration tests

* add it on mpt as well
2023-08-01 12:20:34 +02:00
Younes Belkada 972fdcc778
[`Docs`/`quantization`] Clearer explanation on how things works under the hood. + remove outdated info (#25216)
* clearer explanation on how things works under the hood.

* Update docs/source/en/main_classes/quantization.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/main_classes/quantization.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add `load_in_4bit` in `from_pretrained`

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-08-01 10:56:52 +02:00
Younes Belkada 77c3973e8f
[`Pix2Struct`] Fix pix2struct cross attention (#25200)
* fix pix2struct cross attention

* fix torchscript slow test
2023-08-01 10:56:37 +02:00
Wang, Yi 4033ea7167
make build_mpt_alibi_tensor a method of MptModel so that deepspeed co… (#25193)
make build_mpt_alibi_tensor a method of MptModel so that deepspeed could override it to make autoTP work

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-08-01 01:35:49 -04:00
Yih-Dar 0fd8d2aa2c
Fix docker image build failure (#25214)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-31 20:13:15 +02:00
Yih-Dar 1b4f6199c6
Update tiny model info. and pipeline testing (#25213)
* update tiny_model_summary.json

* update

* update

* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-31 19:35:33 +02:00
Younes Belkada e0c50b274a
[`pipeline`] revisit device check for pipeline (#25207)
* revisit device check for pipeline

* let's raise an error.
2023-07-31 18:43:21 +02:00
Stas Bekman 5220606607
[quantization.md] fix (#25190)
Update quantization.md
2023-07-31 09:37:29 -07:00
Yih-Dar 9ca3aa0156
Fix `all_model_classes` in `FlaxBloomGenerationTest` (#25211)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-31 17:32:05 +02:00
Younes Belkada 59dcea3fe4
[`PreTrainedModel`] Wrap `cuda` and `to` method correctly (#25206)
wrap `cuda` and `to` method correctly
2023-07-31 17:25:09 +02:00
Yih-Dar 67b85f24de
Better error message in `_prepare_output_docstrings` (#25202)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-31 16:15:02 +02:00
Joao Gante 4a564490e1
Musicgen: CFG is manually added (#25173)
2023-07-31 11:21:11 +01:00
amyeroberts 05cda5df34
🚨🚨🚨 Fix rescale ViVit Efficientnet (#25174)
* Fix rescaling bug

* Add tests

* Update integration tests

* Fix up

* Update src/transformers/image_transforms.py

* Update test - new possible order in list
2023-07-28 19:52:51 +01:00
Sanchit Gandhi 03f98f9683
[MusicGen] Fix integration tests (#25169)
* move to device

* update with cuda values

* fix fp16

* more rigorous
2023-07-28 18:50:15 +01:00
Yoni Gottesman c90e14fb0f
Fix beam search to sample at least 1 non eos token (#25103) (#25115)
2023-07-28 13:20:24 -04:00
Sohyun Sim 31f137c04f
🌐 [i18n-KO] Translated `transformers_agents.md` to Korean (#24881)
* docs: ko: transformers_agents.md

* docs: ko: transformers_agents.md

* feat: deepl draft

* fix: manual edits

* fix: resolve suggestions

Co-authored-by: Juntae <79131091+sronger@users.noreply.github.com>
Co-authored-by: Injin Paek <71638597+eenzeenee@users.noreply.github.com>

---------

Co-authored-by: Juntae <79131091+sronger@users.noreply.github.com>
Co-authored-by: Injin Paek <71638597+eenzeenee@users.noreply.github.com>
2023-07-28 13:06:37 -04:00
Younes Belkada dd9d45b6ec
[`InstructBlip`] Fix instructblip slow test (#25171)
* fix instruct blip slow test

* Update tests/models/instructblip/test_modeling_instructblip.py
2023-07-28 17:00:10 +02:00
Younes Belkada add0895dd9
[`Mpt`] Fix mpt slow test (#25170)
fix mpt slow test
2023-07-28 16:45:09 +02:00
Yih-Dar d53b8ad780
Update `use_auth_token` -> `token` in example scripts (#25167)
* pytorch examples

* tensorflow examples

* flax examples

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-28 15:33:45 +02:00
Alexander Markov 3cbc560d03
added compiled model support for inference (#25124)
* added compiled model support for inference

* linter

* Fix tests

* linter

* linter

* remove inference mode from pipelines

* Linter

---------

Co-authored-by: amarkov <alexander@inworld.ai>
2023-07-28 08:28:04 -04:00
Alan Ji afa96fffdf
make run_generation more generic for other devices (#25133)
* make run_generation more generic for other devices

* use Accelerate to support any device type it supports.

* make style

* fix error usage of accelerator.prepare_model

* use `PartialState` to make sure everything is running on the right device

---------

Co-authored-by: statelesshz <jihuazhong1@huawei.com>
2023-07-28 08:20:10 -04:00
jiqing-feng d23d2c27c2
Represent query_length in a different way to solve jit issue (#25164)
Fix jit trace
2023-07-28 08:19:10 -04:00
YQ 2a78720104
override .cuda() to check if model is already quantized (#25166)
2023-07-28 08:17:24 -04:00
Lucain c1dba1111b
Add test when downloading from gated repo (#25039)
2023-07-28 08:14:27 -04:00
Lucain 6232c380f2
Fix `.push_to_hub` and cleanup `get_full_repo_name` usage (#25120)
* Fix .push_to_hub and cleanup get_full_repo_name usage

* Do not rely on Python bool conversion magic

* request changes
2023-07-28 11:40:08 +02:00
Sylvain Gugger 400e76ef11
Add new model in doc table of content (#25148)
2023-07-27 13:41:50 -04:00
Sanchit Gandhi e93103632b
Add bloom flax (#25094)
* First commit

* step 1 working

* add alibi

* placeholder for `scan`

* add matrix mult alibi

* beta scaling factor for bmm

* working v1 - simple forward pass

* move layer_number from attribute to arg in call

* partial functioning scan

* hacky working scan

* add more modifs

* add test

* update scan for new kwarg order

* fix position_ids problem

* fix bug in attention layer

* small fix

- do the alibi broadcasting only once

* prelim refactor

* finish refactor

* alibi shifting

* incorporate dropout_add to attention module

* make style

* make padding work again

* update

* remove bogus file

* up

* get generation to work

* clean code a bit

* added small tests

* adding alibi test

* make CI tests pass:

- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work

* fix few nits

* fix nit onnx

* fix onnx nit

* add missing dtype args to nn.Modules

* remove debugging statements

* fix scan generate

* Update modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* Update test_modeling_flax_bloom.py

* fix small test issue + make style

* clean up

* Update tests/models/bloom/test_modeling_flax_bloom.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* fix function name

* small fix test

* forward contrib credits from PR17761

* Fix failing test

* fix small typo documentation

* fix non passing test

- remove device from build alibi

* refactor call

- refactor `FlaxBloomBlockCollection` module

* make style

* upcast to fp32

* cleaner way to upcast

* remove unused args

* remove layer number

* fix scan test

* make style

* fix i4 casting

* fix slow test

* Update src/transformers/models/bloom/modeling_flax_bloom.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* remove `layer_past`

* refactor a bit

* fix `scan` slow test

* remove useless import

* major changes

- remove unused code
- refactor a bit
- revert import `torch`

* major refactoring

- change build alibi

* remove scan

* fix tests

* make style

* clean-up alibi

* add integration tests

* up

* fix batch norm conversion

* style

* style

* update pt-fx cross tests

* update copyright

* Update src/transformers/modeling_flax_pytorch_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* per-weight check

* style

* line formats

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-27 18:24:56 +01:00
Yih-Dar 0c790ddbd1
More `token` things (#25146)
* fix

* fix

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-27 17:42:07 +02:00
Yoach Lacombe 0b92ae3489
Add offload support to Bark (#25037)
* initial Bark offload proposal

* use hooks instead of manually offloading

* add test of bark offload to cpu feature

* Apply nit suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update docstrings of offload

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* remove unnecessary set_seed in Bark tests

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-07-27 15:35:17 +01:00
Arthur 9cea3e7b80
[`MptConfig`] support from pretrained args (#25116)
* support from pretrained args

* draft addition of tests

* update test

* use parent assert true

* Update src/transformers/models/mpt/configuration_mpt.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-07-27 16:24:52 +02:00
Zach Mueller a1c4954d25
🚨🚨🚨Change default from `adamw_hf` to `adamw_torch` 🚨🚨🚨 (#25109)
* Change defaults

* Sylvain's comments
2023-07-27 09:11:28 -04:00
Bram Vanroy 9a220ce30c
Clarify 4/8 bit loading log message (#25134)
* clarify 4/8 bit loading log message

* make style
2023-07-27 09:09:27 -04:00
Arthur 9429642e2d
[`T5/LlamaTokenizer`] default legacy to `None` to not always warn (#25131)
default legacy to None
2023-07-27 14:43:18 +02:00
Pbihao de9e3b5945
fix delete all checkpoints when save_total_limit is set to 1 (#25136)
2023-07-27 08:34:02 -04:00
Sourab Mangrulkar a004237926
fix deepspeed load best model at end when the model gets sharded (#25057)
2023-07-27 07:11:43 +05:30
amyeroberts 1689aea733
Move center_crop to BaseImageProcessor (#25122)
2023-07-26 18:30:38 +01:00
amyeroberts 659829b6ae
MaskFormer - enable return_dict in order to compile (#25052)
* Enable return_dict in order to compile

* Update tests
2023-07-26 16:23:30 +01:00
Eric Bezzam b914ec9847
Fix ViT docstring regarding default dropout values. (#25118)
Fix docstring for dropout.
2023-07-26 11:08:57 -04:00
amyeroberts 1486d2aec2
Move common image processing methods to BaseImageProcessor (#25089)
Move out common methods
2023-07-26 15:09:17 +01:00
Yih-Dar d30cf3d02f
Fix past CI after #24334 (#25113)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-26 15:34:42 +02:00
Yih-Dar 224da5df69
update `use_auth_token` -> `token` (#25083)
* update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-26 15:09:59 +02:00
Leo c53c8e490c
fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is … (#24772)
fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor."

Co-authored-by: 刘长伟 <hzliuchw@corp.netease.com>
2023-07-26 09:07:21 -04:00
David Reguera 04a5c859b0
Add descriptive docstring to TemperatureLogitsWarper (#24892)
* Add descriptive docstring to TemperatureLogitsWarper

It addresses https://github.com/huggingface/transformers/issues/24783

* Remove niche features

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Commit suggestion

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Refactor the examples to simpler ones

* Add a missing comma

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Make args description more compact

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Remove extra text after making description more compact

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fix linter

---------

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-07-26 08:58:26 -04:00
Yih-Dar 31acba5697
Fix `PvtModelIntegrationTest::test_inference_fp16` (#25106)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-26 14:57:44 +02:00
Kihoon Son ee63520a7b
🌐[i18n-KO] Translated pipeline_webserver.md to Korean (#24828)
* translated pipeline_webserver.md

Co-Authored-By: Hyeonseo Yun <0525yhs@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>

* Update pipeline_webserver.md

* Apply suggestions from code review

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sangam Lee <74291999+augustinLib@users.noreply.github.com>
Co-authored-by: Kim haewon <ehdvkf02@naver.com>

---------

Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Sangam Lee <74291999+augustinLib@users.noreply.github.com>
Co-authored-by: Kim haewon <ehdvkf02@naver.com>
2023-07-26 08:40:37 -04:00
Shauray Singh 277d3aed0a
documentation for llama2 models (#25102)
* fix documentation

* changes
2023-07-26 08:30:33 -04:00
Marc Sun a5cc30d72a
fix tied_params for meta tensor (#25101)
* fix tied_params for meta tensor

* remove duplicate
2023-07-25 18:08:45 -04:00
dependabot[bot] f1deb21fce
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/visual_bert (#25097)
Bump certifi in /examples/research_projects/visual_bert

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-25 17:25:14 -04:00
dependabot[bot] 45bde362d2
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/decision_transformer (#25098)
Bump certifi in /examples/research_projects/decision_transformer

Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)

---
updated-dependencies:
- dependency-name: certifi
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-25 17:25:05 -04:00