Commit Graph

15060 Commits

Author SHA1 Message Date
Younes Belkada 1faeff85ce
Fix Vip-llava docs (#28085)
* Update vipllava.md

* Update modeling_vipllava.py
2023-12-15 20:16:47 +01:00
Ligeng Zhu ffa04def0e
Fix wrong examples in llava usage. (#28020)
* Fix wrong examples in llava usage.

* Update modeling_llava.py
2023-12-15 17:09:50 +00:00
Kotaro Tanahashi 29a1c1b472
Fix `low_cpu_mem_usage` Flag Conflict with DeepSpeed Zero 3 in `from_pretrained` for Models with `keep_in_fp32_modules` (#27762)
Fix `from_pretrained` Logic
for `low_cpu_mem_usage` with DeepSpeed Zero3
2023-12-15 17:03:41 +00:00
Quentin Lhoest 26ea725bc0
Update fixtures-image-utils (#28080)
* fix hf-internal-testing/fixtures_image_utils

* fix test

* comments
2023-12-15 16:58:36 +00:00
dumpmemory 1c286be508
Fix bug for checkpoint saving on multi node training setting (#28078)
* add multi-node training setting

* fix style
2023-12-15 16:18:56 +00:00
Julien Chaumond dec84b3211
make torch.load a bit safer (#27282)
* make torch.load a bit safer

* Fixes

---------

Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
2023-12-15 16:01:18 +01:00
Ke Wen 74cae670ce
Make GPT2 traceable in meta state (#28054)
* Put device in tensor constructor instead of to()

* Fix copy
2023-12-15 15:45:31 +01:00
Adilzhan Ismailov e2b6df7971
[LLaVa] Add past_key_values to _skip_keys_device_placement to fix multi-GPU dispatch (#28051)
Add past_key_values to _skip_keys_device_placement  for LLaVa
2023-12-15 14:05:20 +00:00
Yoach Lacombe deb72cb6d9
Skip M4T `test_retain_grad_hidden_states_attentions` (#28060)
* skip test from SpeechInput

* refine description of skip
2023-12-15 13:39:16 +00:00
Younes Belkada d269c4b2d7
[`Mixtral`] update conversion script to reflect new changes (#28068)
* Update convert_mixtral_weights_to_hf.py

* forward contrib credits from original fix

---------

Co-authored-by: thomasw21 <thomasw21@users.noreply.github.com>
2023-12-15 14:05:20 +01:00
Cylis 70a127a37a
doc: Correct spelling mistake (#28064) 2023-12-15 13:01:39 +00:00
Yoach Lacombe c817c17dbe
Remove SpeechT5 deprecated argument (#28062) 2023-12-15 12:15:06 +00:00
Sanchit Gandhi 6af3ce7757
[Flax LLaMA] Fix attn dropout (#28059) 2023-12-15 10:57:36 +00:00
Sanchit Gandhi 7e876dca54
[Flax BERT] Update deprecated 'split' method (#28012)
* [Flax BERT] Update deprecated 'split' method

* fix copies
2023-12-15 10:57:18 +00:00
Younes Belkada e737446ee6
[`Modeling` / `Mixtral`] Fix GC + PEFT issues with Mixtral (#28061)
fix for mistral
2023-12-15 11:34:42 +01:00
Younes Belkada 1e20931765
[`FA-2`] Fix fa-2 issue when passing `config` to `from_pretrained` (#28043)
* fix fa-2 issue

* fix test

* Update src/transformers/modeling_utils.py

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>

* cleaner fix

* up

* add more robust tests

* Update src/transformers/modeling_utils.py

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>

* fixup

* Update src/transformers/modeling_utils.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* pop

* add test

---------

Co-authored-by: fxmarty <9808326+fxmarty@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-12-15 11:08:27 +01:00
amyeroberts 1a585c1222
Remove warning when Annotion enum is created (#28048)
Remove warning when enum is created
2023-12-14 19:50:20 +00:00
Matt 3060899be5
Replace build() with build_in_name_scope() for some TF tests (#28046)
Replace build() with build_in_name_scope() for some tests
2023-12-14 17:42:25 +00:00
Matt 050e0b44f6
Proper build() methods for TF (#27794)
* Add a convenience method for building in your own name scope

* Second attempt at auto layer building

* Revert "Second attempt at auto layer building"

This reverts commit e03a3aaecf9ec41a805582b83cbdfe3290a631be.

* Attempt #3

* Revert "Attempt #3"

This reverts commit b9df7a0857560d29b5abbed6127d9e9eca77cf47.

* Add missing attributes that we're going to need later

* Add some attributes we're going to need later

* A fourth attempt! Feel the power flow through you!

* Revert "A fourth attempt! Feel the power flow through you!"

This reverts commit 6bf4aaf3875d6f28485f50187617a4c616c8aff7.

* Add more values we'll need later

* TF refactor that we'll need later

* Revert "TF refactor that we'll need later"

This reverts commit ca07202fb5b7b7436b893baa8d688b4f348ea7b9.

* Revert "Revert "TF refactor that we'll need later""

This reverts commit 1beb0f39f293ed9c27594575e1c849aadeb15c13.

* make fixup

* Attempt five!

* Revert "Attempt five!"

This reverts commit 3302207958dfd0374b0447a51c06eea51a506044.

* Attempt six - this time don't add empty methods

* Revert "Attempt six - this time don't add empty methods"

This reverts commit 67d60129be75416b6beb8f47c7d38d77b18d79bb.

* Attempt seven - better base model class detection!

* Revert "Attempt seven - better base model class detection!"

This reverts commit 5f14845e92ea0e87c598da933bfbfee10f553bc9.

* Another attribute we'll need later

* Try again with the missing attribute!

* Revert "Try again with the missing attribute!"

This reverts commit 760c6f30c5dffb3e04b0e73c34a77d1882a0fef7.

* This is the attempt that will pierce the heavens!

* Revert "This is the attempt that will pierce the heavens!"

This reverts commit c868bb657de057aca7a5260350a3f831fc4dfee6.

* Attempt seven - snag list is steadily decreasing

* Revert "Attempt seven - snag list is steadily decreasing"

This reverts commit 46fbd975deda64429bfb3e5fac4fc0370c00d316.

* Attempt eight - will an empty snag list do it?

* Revert "Attempt eight - will an empty snag list do it?"

This reverts commit 7c8a3c2b083253649569e9877e02054ae5cec67b.

* Fixes to Hubert issues that cause problems later

* Trying again with Conv1D/SeparableConv fixes

* Revert "Trying again with Conv1D/SeparableConv fixes"

This reverts commit 55092bca952bc0f750aa1ffe246a640bf1e2036e.

* Apply the build shape fixes to Wav2Vec2 as well

* One more attempt!

* Revert "One more attempt!"

This reverts commit 5ac3e4cb01b9458cc93312873725f9444ae7261c.

* Another attempt!

* Revert "Another attempt!"

This reverts commit ea16d890e019d7de8792a3b8e72f3b1c02adae50.

* Let's see how many failures we get without the internal build method

* Fix OpenAI

* Fix MobileBERT

* (Mostly) fix GroupVIT

* Fix BLIP

* One more BLIP fix

* One more BLIP fix!

* Fix Regnet

* Finally fully fix GroupViT

* Fix Data2Vec and add the new AdaptivePool

* Fix Segformer

* Fix Albert

* Fix Deberta/DebertaV2

* Fix XLM

* Actually fix XLM

* Fix Flaubert

* Fix lxmert

* Fix Resnet

* Fix ConvBERT

* Fix ESM

* Fix Convnext / ConvnextV2

* Fix SAM

* Fix Efficientformer

* Fix LayoutLMv3

* Fix speech_to_text

* Fix mpnet and mobilevit

* Fix Swin

* Fix CTRL

* Fix CVT

* Fix DPR

* Fix Wav2Vec2

* Fix T5

* Fix Hubert

* Fix GPT2

* Fix Whisper

* Fix DeiT

* Fix the encoder-decoder / dual-encoder classes

* make fix-copies

* build in name scope

* Fix summarization test

* Fix tied weight names for BART + Blenderbot

* Fix tied weight name building

* Fix to TFESM weight building

* Update TF SAM

* Expand all the shapes out into Big Boy Shapes
2023-12-14 15:17:30 +00:00
Sanchit Gandhi 52c37882fb
[Seamless] Fix links in docs (#27905)
* [Seamless] Fix links in docs

* apply suggestions from code review
2023-12-14 15:14:13 +00:00
Joao Gante 388fd314d8
Generate: Mistral/Mixtral FA2 cache fix when going beyond the context window (#28037) 2023-12-14 14:52:45 +00:00
James E. Dobson 0ede762636
Fixed spelling error in T5 tokenizer warning message (s/thouroughly/t… (#28014)
Fixed spelling error in T5 tokenizer warning message (s/thouroughly/thoroughly)
2023-12-14 14:52:03 +00:00
Yoach Lacombe bb1d0d0d9e
Fix languages covered by M4Tv2 (#28019)
* correct language assessment  + add tests

* Update src/transformers/models/seamless_m4t_v2/modeling_seamless_m4t_v2.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* make style + simplify and enrich test

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-12-14 14:43:44 +00:00
Joao Gante e2b16485f3
SeamlessM4T: `test_retain_grad_hidden_states_attentions` is flaky (#28035) 2023-12-14 13:56:03 +00:00
Joao Gante 9e5c28c573
Generate: assisted decoding now uses `generate` for the assistant (#28030)
generate refactor
2023-12-14 13:31:13 +00:00
Yih-Dar dde6c427a1
Fix AMD push CI not triggered (#28029)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-12-14 12:44:00 +01:00
Younes Belkada 73de5108e1
[`core` / `modeling`] Fix training bug with PEFT + GC (#28031)
fix training bug
2023-12-14 12:19:45 +01:00
Arthur 2788f8d8d5
[`SeamlessM4TTokenizer`] Safe import (#28026)
safe import
2023-12-14 08:46:10 +01:00
Arthur 131a528be0
well well well (#28011) 2023-12-14 06:51:04 +01:00
Marc Sun 17506d1256
add `modules_in_block_to_quantize` arg in GPTQconfig (#27956)
* add inside_layer_modules arg

* fix

* change to modules_to_quantize_inside_block

* fix

* rename again

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* better docstring

* fix again with less explanation

* Update src/transformers/utils/quantization_config.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* style

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-12-13 14:13:44 -05:00
Rockerz fe44b1f1a9
Add model_docs from cpmant.md to deformable_detr.md (#27884)
* upfaste

* Update

* Update docs/source/ja/model_doc/deformable_detr.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/model_doc/data2vec.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/model_doc/cvt.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add suggestions

* Toctree update

* remove git references

* Update docs/source/ja/_toctree.yml

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ja/model_doc/decision_transformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-12-13 10:02:29 -08:00
Lysandre 3ed3e3190c
Dev version 2023-12-13 18:29:31 +01:00
Aaron Jimenez 815ea8e8a2
[Doc] Spanish translation of glossary.md (#27958)
* Add glossary to es/_toctree.yml

* Add glossary.md to es/

* A section translated

* B and C section translated

* Fix typo in en/glossary.md C section

* D section translated | Add an extra line in en/glossary.md

* E and F section translated | Fix typo in en/glossary.md

* Fix words preentrenado

* H and I section translated | Fix typo in en/glossary.md

* L section translated

* M and N section translated

* P section translated

* R section translated

* S section translated

* T section translated

* U and Z section translated | Fix TensorParallel link in both files

* Fix word
2023-12-13 09:21:59 -08:00
Zach Mueller 93766251cb
Fix bug with rotating checkpoints (#28009)
* Fix bug

* Write test

* Keep back old modification for grad accum steps

* Whitespace...

* Whitespace again

* Race condition

* Wait for everyone
2023-12-13 12:17:30 -05:00
Arthur ec43d6870a
[`CI slow`] Fix expected values (#27999)
* fix expected values

* style

* test is slow
2023-12-13 13:37:10 +01:00
Arindam Jati 749f94e460
Fix PatchTSMixer slow tests (#27997)
* fix slow tests

* revert formatting

---------

Co-authored-by: Arindam Jati <arindam.jati@ibm.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2023-12-13 13:34:25 +01:00
Younes Belkada c7f076a00e
Adds VIP-llava to transformers (#27932)
* v1

* add-new-model-like

* revert

* fix forward and conversion script

* revert

* fix copies

* fixup

* fix

* Update docs/source/en/index.md

* Apply suggestions from code review

* push

* fix

* fixes here and there

* up

* fixup and fix tests

* Apply suggestions from code review

* add docs

* fixup

* fixes

* docstring

* add docstring

* fixup

* docstring

* fixup

* nit

* docs

* more copies

* fix copies

* nit

* update test
2023-12-13 10:42:24 +01:00
Arthur 371fb0b7dc
[`Whisper`] raise better errors (#27971)
* [`Whisper`] raise better errors
fixes #27893

* update torch as well
2023-12-13 09:13:01 +01:00
Arthur 230ac352d8
[`Tokenizer Serialization`] Fix the broken serialisation (#27099)
* nits

* nits

* actual fix

* style

* ze fix

* fix fix fix style
2023-12-13 09:11:34 +01:00
Dave Berenbaum f4db565b69
fix typo in dvclive callback (#27983) 2023-12-12 16:29:58 -05:00
Stas Bekman 9936143014
[doc] fix typo (#27981) 2023-12-12 20:32:42 +00:00
fxmarty 78172dcdb7
Fix SDPA correctness following torch==2.1.2 regression (#27973)
* fix sdpa with non-contiguous inputs for gpt_bigcode

* fix other archs

* add currently comment

* format
2023-12-13 00:33:46 +09:00
Matt 5e4ef0a0f6
Better key error for AutoConfig (#27976)
* Improve the error printed when loading an unrecognized architecture

* Improve the error printed when loading an unrecognized architecture

* Raise a ValueError instead because KeyError prints weirdly

* make fixup
2023-12-12 14:41:55 +00:00
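Note on "KeyError prints weirdly" in the commit above: in Python, `str()` of a `KeyError` with a single argument is the `repr()` of that argument, so a multi-line message shows up quoted, with its newlines escaped. A minimal sketch of the difference follows. The message text is hypothetical and is not the wording used in the actual change.

```python
# Compare how KeyError and ValueError render the same multi-line message.
# NOTE: the message below is a made-up example, not the library's real text.
message = "Unrecognized model type.\nCheck that the architecture is supported."

# KeyError stringifies to repr(message): quoted, with the newline escaped.
key_error_text = str(KeyError(message))

# ValueError stringifies to the message itself, with the real line break.
value_error_text = str(ValueError(message))

print(key_error_text)
print(value_error_text)
```

The first `print` emits a single quoted line containing a literal backslash-n, while the second emits two readable lines, which is a plausible reason to prefer `ValueError` for a long, user-facing error.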
saswatmeher a49f4acab3
Fix link in README.md of Image Captioning (#27969)
Update the link for vision encoder decoder doc used by
FlaxVisionEncoderDecoderModel link.
2023-12-12 08:07:15 -05:00
Arthur 680c610f97
Hot-fix-mixstral-loss (#27948)
* fix loss computation

* compute on GPU if possible
2023-12-12 12:20:28 +01:00
Joao Gante 4b759da8be
Generate: `assisted_decoding` now accepts arbitrary candidate generators (#27750)
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-12-12 09:25:57 +00:00
Anthony Susevski e660424717
fixed typos (issue 27919) (#27920)
* fixed typos (issue 27919)

* Update docs/source/en/tasks/knowledge_distillation_for_image_classification.md

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-12-11 18:44:23 -05:00
dancingpipi e5079b0b2a
Support PeftModel signature inspect (#27865)
* Support PeftModel signature inspect

* Use get_base_model() to get the base model

---------

Co-authored-by: shujunhua1 <shujunhua1@jd.com>
2023-12-11 19:30:11 +00:00
Steven Liu 35478182ce
[docs] Fused AWQ modules (#27896)
streamline
2023-12-11 10:41:33 -08:00
NielsRogge 67b1335cb9
Update bounding box format everywhere (#27944)
Update formats
2023-12-11 18:03:42 +00:00