Younes Belkada
3170af71e1
[`Detr`] Fix detr BatchNorm replacement issue ( #25230 )
...
* fix detr weird issue
* Update src/transformers/models/conditional_detr/modeling_conditional_detr.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fix copies
* fix copies
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-08-01 12:21:48 +02:00
Younes Belkada
05ebb0264e
[`MPT`] Add `require_bitsandbytes` on MPT integration tests ( #25201 )
...
* add `require_bitsandbytes` on MPT integration tests
* add it on mpt as well
2023-08-01 12:20:34 +02:00
Younes Belkada
972fdcc778
[`Docs`/`quantization`] Clearer explanation on how things work under the hood. + remove outdated info ( #25216 )
...
* clearer explanation on how things work under the hood.
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/main_classes/quantization.md
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* add `load_in_4bit` in `from_pretrained`
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-08-01 10:56:52 +02:00
Younes Belkada
77c3973e8f
[`Pix2Struct`] Fix pix2struct cross attention ( #25200 )
...
* fix pix2struct cross attention
* fix torchscript slow test
2023-08-01 10:56:37 +02:00
Wang, Yi
4033ea7167
make build_mpt_alibi_tensor a method of MptModel so that deepspeed co… ( #25193 )
...
make build_mpt_alibi_tensor a method of MptModel so that deepspeed could override it to make autoTP work
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-08-01 01:35:49 -04:00
Yih-Dar
0fd8d2aa2c
Fix docker image build failure ( #25214 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-31 20:13:15 +02:00
Yih-Dar
1b4f6199c6
Update tiny model info. and pipeline testing ( #25213 )
...
* update tiny_model_summary.json
* update
* update
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-31 19:35:33 +02:00
Younes Belkada
e0c50b274a
[`pipeline`] revisit device check for pipeline ( #25207 )
...
* revisit device check for pipeline
* let's raise an error.
2023-07-31 18:43:21 +02:00
Stas Bekman
5220606607
[quantization.md] fix ( #25190 )
...
Update quantization.md
2023-07-31 09:37:29 -07:00
Yih-Dar
9ca3aa0156
Fix `all_model_classes` in `FlaxBloomGenerationTest` ( #25211 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-31 17:32:05 +02:00
Younes Belkada
59dcea3fe4
[`PreTrainedModel`] Wrap `cuda` and `to` method correctly ( #25206 )
...
wrap `cuda` and `to` method correctly
2023-07-31 17:25:09 +02:00
Yih-Dar
67b85f24de
Better error message in `_prepare_output_docstrings` ( #25202 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-31 16:15:02 +02:00
Joao Gante
4a564490e1
Musicgen: CFG is manually added ( #25173 )
2023-07-31 11:21:11 +01:00
amyeroberts
05cda5df34
🚨 🚨 🚨 Fix rescale ViVit Efficientnet ( #25174 )
...
* Fix rescaling bug
* Add tests
* Update integration tests
* Fix up
* Update src/transformers/image_transforms.py
* Update test - new possible order in list
2023-07-28 19:52:51 +01:00
Sanchit Gandhi
03f98f9683
[MusicGen] Fix integration tests ( #25169 )
...
* move to device
* update with cuda values
* fix fp16
* more rigorous
2023-07-28 18:50:15 +01:00
Yoni Gottesman
c90e14fb0f
Fix beam search to sample at least 1 non eos token ( #25103 ) ( #25115 )
2023-07-28 13:20:24 -04:00
Sohyun Sim
31f137c04f
🌐 [i18n-KO] Translated `transformers_agents.md` to Korean ( #24881 )
...
* docs: ko: transformers_agents.md
* docs: ko: transformers_agents.md
* feat: deepl draft
* fix: manual edits
* fix: resolve suggestions
Co-authored-by: Juntae <79131091+sronger@users.noreply.github.com>
Co-authored-by: Injin Paek <71638597+eenzeenee@users.noreply.github.com>
---------
Co-authored-by: Juntae <79131091+sronger@users.noreply.github.com>
Co-authored-by: Injin Paek <71638597+eenzeenee@users.noreply.github.com>
2023-07-28 13:06:37 -04:00
Younes Belkada
dd9d45b6ec
[`InstructBlip`] Fix instructblip slow test ( #25171 )
...
* fix instruct blip slow test
* Update tests/models/instructblip/test_modeling_instructblip.py
2023-07-28 17:00:10 +02:00
Younes Belkada
add0895dd9
[`Mpt`] Fix mpt slow test ( #25170 )
...
fix mpt slow test
2023-07-28 16:45:09 +02:00
Yih-Dar
d53b8ad780
Update `use_auth_token` -> `token` in example scripts ( #25167 )
...
* pytorch examples
* tensorflow examples
* flax examples
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-28 15:33:45 +02:00
Alexander Markov
3cbc560d03
added compiled model support for inference ( #25124 )
...
* added compiled model support for inference
* linter
* Fix tests
* linter
* linter
* remove inference mode from pipelines
* Linter
---------
Co-authored-by: amarkov <alexander@inworld.ai>
2023-07-28 08:28:04 -04:00
Alan Ji
afa96fffdf
make run_generation more generic for other devices ( #25133 )
...
* make run_generation more generic for other devices
* use Accelerate to support any device type it supports.
* make style
* fix error usage of accelerator.prepare_model
* use `PartialState` to make sure everything is running on the right device
---------
Co-authored-by: statelesshz <jihuazhong1@huawei.com>
2023-07-28 08:20:10 -04:00
jiqing-feng
d23d2c27c2
Represent query_length in a different way to solve jit issue ( #25164 )
...
Fix jit trace
2023-07-28 08:19:10 -04:00
YQ
2a78720104
override .cuda() to check if model is already quantized ( #25166 )
2023-07-28 08:17:24 -04:00
Lucain
c1dba1111b
Add test when downloading from gated repo ( #25039 )
2023-07-28 08:14:27 -04:00
Lucain
6232c380f2
Fix `.push_to_hub` and cleanup `get_full_repo_name` usage ( #25120 )
...
* Fix .push_to_hub and cleanup get_full_repo_name usage
* Do not rely on Python bool conversion magic
* request changes
2023-07-28 11:40:08 +02:00
Sylvain Gugger
400e76ef11
Add new model in doc table of content ( #25148 )
2023-07-27 13:41:50 -04:00
Sanchit Gandhi
e93103632b
Add bloom flax ( #25094 )
...
* First commit
* step 1 working
* add alibi
* placeholder for `scan`
* add matrix mult alibi
* beta scaling factor for bmm
* working v1 - simple forward pass
* move layer_number from attribute to arg in call
* partial functioning scan
* hacky working scan
* add more modifs
* add test
* update scan for new kwarg order
* fix position_ids problem
* fix bug in attention layer
* small fix
- do the alibi broadcasting only once
* prelim refactor
* finish refactor
* alibi shifting
* incorporate dropout_add to attention module
* make style
* make padding work again
* update
* remove bogus file
* up
* get generation to work
* clean code a bit
* added small tests
* adding alibi test
* make CI tests pass:
- change init weight
- add correct tuple for output attention
- add scan test
- make CI tests work
* fix few nits
* fix nit onnx
* fix onnx nit
* add missing dtype args to nn.Modules
* remove debugging statements
* fix scan generate
* Update modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* Update test_modeling_flax_bloom.py
* fix small test issue + make style
* clean up
* Update tests/models/bloom/test_modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* fix function name
* small fix test
* forward contrib credits from PR17761
* Fix failing test
* fix small typo documentation
* fix non passing test
- remove device from build alibi
* refactor call
- refactor `FlaxBloomBlockCollection` module
* make style
* upcast to fp32
* cleaner way to upcast
* remove unused args
* remove layer number
* fix scan test
* make style
* fix i4 casting
* fix slow test
* Update src/transformers/models/bloom/modeling_flax_bloom.py
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove `layer_past`
* refactor a bit
* fix `scan` slow test
* remove useless import
* major changes
- remove unused code
- refactor a bit
- revert import `torch`
* major refactoring
- change build alibi
* remove scan
* fix tests
* make style
* clean-up alibi
* add integration tests
* up
* fix batch norm conversion
* style
* style
* update pt-fx cross tests
* update copyright
* Update src/transformers/modeling_flax_pytorch_utils.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* per-weight check
* style
* line formats
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: haileyschoelkopf <haileyschoelkopf@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-27 18:24:56 +01:00
Yih-Dar
0c790ddbd1
More `token` things ( #25146 )
...
* fix
* fix
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-27 17:42:07 +02:00
Yoach Lacombe
0b92ae3489
Add offload support to Bark ( #25037 )
...
* initial Bark offload proposal
* use hooks instead of manually offloading
* add test of bark offload to cpu feature
* Apply nit suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Update docstrings of offload
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* remove unnecessary set_seed in Bark tests
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-07-27 15:35:17 +01:00
Arthur
9cea3e7b80
[`MptConfig`] support from pretrained args ( #25116 )
...
* support from pretrained args
* draft addition of tests
* update test
* use parent assert true
* Update src/transformers/models/mpt/configuration_mpt.py
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-07-27 16:24:52 +02:00
Zach Mueller
a1c4954d25
🚨 🚨 🚨 Change default from `adamw_hf` to `adamw_torch` 🚨 🚨 🚨 ( #25109 )
...
* Change defaults
* Sylvain's comments
2023-07-27 09:11:28 -04:00
Bram Vanroy
9a220ce30c
Clarify 4/8 bit loading log message ( #25134 )
...
* clarify 4/8 bit loading log message
* make style
2023-07-27 09:09:27 -04:00
Arthur
9429642e2d
[`T5/LlamaTokenizer`] default legacy to `None` to not always warn ( #25131 )
...
default legacy to None
2023-07-27 14:43:18 +02:00
Pbihao
de9e3b5945
fix delete all checkpoints when save_total_limit is set to 1 ( #25136 )
2023-07-27 08:34:02 -04:00
Sourab Mangrulkar
a004237926
fix deepspeed load best model at end when the model gets sharded ( #25057 )
2023-07-27 07:11:43 +05:30
amyeroberts
1689aea733
Move center_crop to BaseImageProcessor ( #25122 )
2023-07-26 18:30:38 +01:00
amyeroberts
659829b6ae
MaskFormer - enable return_dict in order to compile ( #25052 )
...
* Enable return_dict in order to compile
* Update tests
2023-07-26 16:23:30 +01:00
Eric Bezzam
b914ec9847
Fix ViT docstring regarding default dropout values. ( #25118 )
...
Fix docstring for dropout.
2023-07-26 11:08:57 -04:00
amyeroberts
1486d2aec2
Move common image processing methods to BaseImageProcessor ( #25089 )
...
Move out common methods
2023-07-26 15:09:17 +01:00
Yih-Dar
d30cf3d02f
Fix past CI after #24334 ( #25113 )
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-26 15:34:42 +02:00
Yih-Dar
224da5df69
update `use_auth_token` -> `token` ( #25083 )
...
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-26 15:09:59 +02:00
Leo
c53c8e490c
fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is … ( #24772 )
...
fix "UserWarning: Creating a tensor from a list of numpy.ndarrays is extremely slow. Please consider converting the list to a single numpy.ndarray with numpy.array() before converting to a tensor."
Co-authored-by: 刘长伟 <hzliuchw@corp.netease.com>
2023-07-26 09:07:21 -04:00
David Reguera
04a5c859b0
Add descriptive docstring to TemperatureLogitsWarper ( #24892 )
...
* Add descriptive docstring to TemperatureLogitsWarper
It addresses https://github.com/huggingface/transformers/issues/24783
* Remove niche features
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Commit suggestion
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Refactor the examples to simpler ones
* Add a missing comma
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Make args description more compact
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Remove extra text after making description more compact
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* Fix linter
---------
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-07-26 08:58:26 -04:00
Yih-Dar
31acba5697
Fix `PvtModelIntegrationTest::test_inference_fp16` ( #25106 )
...
update
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-26 14:57:44 +02:00
Kihoon Son
ee63520a7b
🌐 [i18n-KO] Translated pipeline_webserver.md to Korean ( #24828 )
...
* translated pipeline_webserver.md
Co-Authored-By: Hyeonseo Yun <0525yhs@gmail.com>
Co-Authored-By: Wonhyeong Seo <wonhseo@kakao.com>
Co-Authored-By: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-Authored-By: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-Authored-By: Nayeon Han <nayeon2.han@gmail.com>
Co-Authored-By: Jungnerd <46880056+jungnerd@users.noreply.github.com>
* Update pipeline_webserver.md
* Apply suggestions from code review
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Sangam Lee <74291999+augustinLib@users.noreply.github.com>
Co-authored-by: Kim haewon <ehdvkf02@naver.com>
---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
Co-authored-by: Sohyun Sim <96299403+sim-so@users.noreply.github.com>
Co-authored-by: Gabriel Yang <gabrielwithhappy@gmail.com>
Co-authored-by: Nayeon Han <nayeon2.han@gmail.com>
Co-authored-by: Jungnerd <46880056+jungnerd@users.noreply.github.com>
Co-authored-by: Sangam Lee <74291999+augustinLib@users.noreply.github.com>
Co-authored-by: Kim haewon <ehdvkf02@naver.com>
2023-07-26 08:40:37 -04:00
Shauray Singh
277d3aed0a
documentation for llama2 models ( #25102 )
...
* fix documentation
* changes
2023-07-26 08:30:33 -04:00
Marc Sun
a5cc30d72a
fix tied_params for meta tensor ( #25101 )
...
* fix tied_params for meta tensor
* remove duplicate
2023-07-25 18:08:45 -04:00
dependabot[bot]
f1deb21fce
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/visual_bert ( #25097 )
...
Bump certifi in /examples/research_projects/visual_bert
Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)
---
updated-dependencies:
- dependency-name: certifi
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-25 17:25:14 -04:00
dependabot[bot]
45bde362d2
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/decision_transformer ( #25098 )
...
Bump certifi in /examples/research_projects/decision_transformer
Bumps [certifi](https://github.com/certifi/python-certifi) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22)
---
updated-dependencies:
- dependency-name: certifi
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-25 17:25:05 -04:00