Pavel Iakubovskii
66f675eb65
Fix W&B run name ( #30462 )
...
* Remove comparison to output_dir
* Update docs for `run_name`
* Add warning
2024-05-03 12:04:15 +01:00
Mayank Mishra
425e1a0426
add mlp bias for llama models ( #30031 )
...
* add bias
* fix quality
2024-05-03 11:02:17 +02:00
Raushan Turganbay
a0e77a1f6b
Fix CI after #30410 ( #30612 )
...
* Fix CI after #30410
* [run-slow] blenderbot
2024-05-03 01:18:48 +05:00
mobicham
59952994c4
Add HQQ quantization support ( #29637 )
...
* update HQQ transformers integration
* push import_utils.py
* add force_hooks check in modeling_utils.py
* fix | with Optional
* force bias as param
* check bias is Tensor
* force forward for multi-gpu
* review fixes pass
* remove torch grad()
* if any key in linear_tags fix
* add cpu/disk check
* isinstance return
* add multigpu test + refactor tests
* clean hqq_utils imports in hqq.py
* clean hqq_utils imports in quantizer_hqq.py
* delete hqq_utils.py
* Delete src/transformers/utils/hqq_utils.py
* ruff init
* remove torch.float16 from __init__ in test
* refactor test
* isinstance -> type in quantizer_hqq.py
* cpu/disk device_map check in quantizer_hqq.py
* remove type(module) nn.linear check in quantizer_hqq.py
* add BaseQuantizeConfig import inside HqqConfig init
* remove hqq import in hqq.py
* remove accelerate import from test_hqq.py
* quant config.py doc update
* add hqqconfig to main_classes doc
* make style
* __init__ fix
* ruff __init__
* skip_modules list
* hqqconfig format fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* hqqconfig doc fix
* test_hqq.py remove mistral comment
* remove self.using_multi_gpu is False
* torch_dtype default val set and logger.info
* hqq.py isinstance fix
* remove torch=None
* torch_device test_hqq
* rename test_hqq
* MODEL_ID in test_hqq
* quantizer_hqq setattr fix
* quantizer_hqq typo fix
* imports quantizer_hqq.py
* isinstance quantizer_hqq
* hqq_layer.bias reformat quantizer_hqq
* Step 2 as comment in quantizer_hqq
* prepare_for_hqq_linear() comment
* keep_in_fp32_modules fix
* HqqHfQuantizer reformat
* quantization.md hqqconfig
* quantization.md model example reformat
* quantization.md # space
* quantization.md space })
* quantization.md space })
* quantization_config fix doc
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* axis value check in quantization_config
* format
* dynamic config explanation
* quant config method in quantization.md
* remove shard-level progress
* .cuda fix modeling_utils
* test_hqq fixes
* make fix-copies
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-02 17:51:49 +01:00
Jonghwan Hyeon
4c940934da
Output `None` as attention when layer is skipped ( #30597 )
...
* Output `None` as attention when layer is skipped
* Add test for output_attentions
2024-05-02 17:25:19 +01:00
Michael Benayoun
39359e5b5f
Fix FX tracing issues for Llama ( #30619 )
2024-05-02 17:03:10 +02:00
Joao Gante
9719202d37
Generate: fix `SinkCache` on Llama models ( #30581 )
2024-05-02 15:24:33 +01:00
Joao Gante
66abe13951
Docs: add missing `StoppingCriteria` autodocs ( #30617 )
...
* add missing docstrings to docs
* Update src/transformers/generation/stopping_criteria.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-02 15:20:04 +01:00
Joao Gante
aa55ff44a2
Docs: fix `generate`-related rendering issues ( #30600 )
...
* does this work?
* like this?
* fix the other generate links
* missing these
2024-05-02 14:42:25 +01:00
amitportnoy
801894e08c
phi3 chat_template does not support system role ( #30606 )
...
* phi3 chat_template does not support system role
* fix doc test error
2024-05-02 15:30:21 +02:00
Yih-Dar
f57f014936
Use `contiguous()` in clip checkpoint conversion script ( #30613 )
...
* fix
* fix
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-02 13:59:40 +02:00
Zhan Lu
a65da83d75
fix:missing `output_router_logits` in SwitchTransformers ( #30573 )
...
* fix:missing `output_router_logits` in SwitchTransformers
* fix whitespace in blank line
2024-05-02 13:47:00 +02:00
amyeroberts
4ad5adaf1d
Fix copies for DBRX - neuron fix ( #30610 )
2024-05-02 11:00:26 +01:00
Richard Brown
f95302584b
🚨 Update image_processing_vitmatte.py ( #30566 )
...
* Update image_processing_vitmatte.py
* add test
* [run-slow]vitmatte
2024-05-02 11:00:07 +01:00
Bai Li
12c5544dca
Fix memory leak with CTC training script on Chinese languages ( #30358 )
...
* Fix memory leak with CTC training script on Chinese languages
* Fix lint
2024-05-02 09:33:36 +01:00
Michael Benayoun
fbabd6746f
Fix for Neuron ( #30259 )
2024-05-02 10:24:47 +02:00
Raushan Turganbay
5cf3e6bf05
Fix: failing CI after #30568 ( #30599 )
...
* failing CI
* no, let's keep it until full deprecation in v4.42
2024-05-02 12:15:17 +05:00
dependabot[bot]
c681b58b06
Bump torch from 1.9.0+cpu to 1.13.1 in /examples/flax/vision ( #21168 )
...
Bumps [torch](https://github.com/pytorch/pytorch) from 1.9.0+cpu to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/commits/v1.13.1)
---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-01 20:14:57 +01:00
dependabot[bot]
3a36597a5f
Bump pillow from 10.0.1 to 10.2.0 in /examples/research_projects/decision_transformer ( #28655 )
...
Bump pillow in /examples/research_projects/decision_transformer
Bumps [pillow](https://github.com/python-pillow/Pillow) from 10.0.1 to 10.2.0.
- [Release notes](https://github.com/python-pillow/Pillow/releases)
- [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst)
- [Commits](https://github.com/python-pillow/Pillow/compare/10.0.1...10.2.0)
---
updated-dependencies:
- dependency-name: pillow
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 19:58:34 +01:00
dependabot[bot]
4f3c7af489
Bump torch from 1.9.0+cpu to 1.13.1 in /examples/research_projects/jax-projects/hybrid_clip ( #21167 )
...
Bump torch in /examples/research_projects/jax-projects/hybrid_clip
Bumps [torch](https://github.com/pytorch/pytorch) from 1.9.0+cpu to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/commits/v1.13.1)
---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 18:37:55 +01:00
dependabot[bot]
6f465d45d9
Bump torch from 1.11.0 to 1.13.1 in /examples/research_projects/decision_transformer ( #21171 )
...
Bump torch in /examples/research_projects/decision_transformer
Bumps [torch](https://github.com/pytorch/pytorch) from 1.11.0 to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.11.0...v1.13.1)
---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 18:16:25 +01:00
Fraser Mince
5090ea3f68
Fix llava half precision and autocast issues ( #29721 )
...
* Ensure input_embeds and image_features are the same dtype in autocast
* Fix nans in half precision llava-next and fix autocasting behavior.
* Fix styling issues.
* fix randn newline instantiation
* fix broken slow llava test
* Fix llava next init.
* fix styling issues
* [run-slow]llava,llava_next
* fix styling issues
2024-05-01 17:49:44 +01:00
Joao Gante
d57ffb487f
Generate: remove deprecated public decoding functions and streamline logic 🧼 ( #29956 )
2024-05-01 17:38:44 +01:00
NielsRogge
dc401d3a4e
Improve object detection task guideline ( #29967 )
...
* Add improvements
* Address comment
2024-05-01 17:58:01 +02:00
amyeroberts
d2feb54591
Fix image segmentation example - don't reopen image ( #30481 )
...
Fix image segmentation example - don't reopen image
2024-05-01 16:52:57 +01:00
dependabot[bot]
6e0cba3cec
Bump torch from 1.6.0 to 1.13.1 in /examples/research_projects/visual_bert ( #21172 )
...
Bump torch in /examples/research_projects/visual_bert
Bumps [torch](https://github.com/pytorch/pytorch) from 1.6.0 to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.6.0...v1.13.1)
---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:40:54 +01:00
dependabot[bot]
ce66c0e989
Bump torch from 1.11.0 to 1.13.1 in /examples/research_projects/codeparrot ( #21170 )
...
Bump torch in /examples/research_projects/codeparrot
Bumps [torch](https://github.com/pytorch/pytorch) from 1.11.0 to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.11.0...v1.13.1)
---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:40:19 +01:00
dependabot[bot]
7a29c577e8
Bump torch from 1.6.0 to 1.13.1 in /examples/research_projects/lxmert ( #21174 )
...
Bumps [torch](https://github.com/pytorch/pytorch) from 1.6.0 to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.6.0...v1.13.1)
---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:39:55 +01:00
dependabot[bot]
b33f01fe6b
Bump pyarrow from 1.0.1 to 15.0.0 in /examples/research_projects/lxmert ( #30584 )
...
Bumps [pyarrow](https://github.com/apache/arrow) from 1.0.1 to 15.0.0.
- [Commits](https://github.com/apache/arrow/compare/apache-arrow-1.0.1...go/v15.0.0)
---
updated-dependencies:
- dependency-name: pyarrow
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:38:07 +01:00
dependabot[bot]
0ec3003ae9
Bump pyarrow from 1.0.1 to 15.0.0 in /examples/research_projects/visual_bert ( #30583 )
...
Bump pyarrow in /examples/research_projects/visual_bert
Bumps [pyarrow](https://github.com/apache/arrow) from 1.0.1 to 15.0.0.
- [Commits](https://github.com/apache/arrow/compare/apache-arrow-1.0.1...go/v15.0.0)
---
updated-dependencies:
- dependency-name: pyarrow
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:37:54 +01:00
dependabot[bot]
aefbdfe8cf
Bump pyarrow from 7.0.0 to 15.0.0 in /examples/research_projects/decision_transformer ( #30582 )
...
Bump pyarrow in /examples/research_projects/decision_transformer
Bumps [pyarrow](https://github.com/apache/arrow) from 7.0.0 to 15.0.0.
- [Commits](https://github.com/apache/arrow/compare/go/v7.0.0...go/v15.0.0)
---
updated-dependencies:
- dependency-name: pyarrow
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:37:40 +01:00
dependabot[bot]
7164171212
Bump gitpython from 3.1.32 to 3.1.41 in /examples/research_projects/distillation ( #30586 )
...
Bump gitpython in /examples/research_projects/distillation
Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.32 to 3.1.41.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases)
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES)
- [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.32...3.1.41)
---
updated-dependencies:
- dependency-name: gitpython
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:36:57 +01:00
dependabot[bot]
ff8f624542
Bump grpcio from 1.44.0 to 1.53.2 in /examples/research_projects/decision_transformer ( #30585 )
...
Bump grpcio in /examples/research_projects/decision_transformer
Bumps [grpcio](https://github.com/grpc/grpc) from 1.44.0 to 1.53.2.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md)
- [Commits](https://github.com/grpc/grpc/compare/v1.44.0...v1.53.2)
---
updated-dependencies:
- dependency-name: grpcio
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:35:52 +01:00
dependabot[bot]
b71f512823
Bump gitpython from 3.1.32 to 3.1.41 in /examples/research_projects/decision_transformer ( #30587 )
...
Bump gitpython in /examples/research_projects/decision_transformer
Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.32 to 3.1.41.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases)
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES)
- [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.32...3.1.41)
---
updated-dependencies:
- dependency-name: gitpython
  dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:30:24 +01:00
Pedro Cuenca
f4f18afde8
Gemma: update activation warning ( #29995 )
...
* Gemma: only display act. warning when necessary
This is a nit PR, but I was confused. I got the warning even after I
had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I
was using the "legacy" `gelu_pytorch_tanh`.
Another option is to keep the warning but change the message to say
something like "`hidden_act` is ignored, please use `hidden_activation`
instead. Setting Gemma's activation function to `gelu_pytorch_tanh`".
* Change message, and set `config.hidden_activation`
2024-05-01 17:23:38 +02:00
amyeroberts
bbaa8ceff6
Fix canonical model --model_type in examples ( #30480 )
...
Fix --model_type in examples
2024-05-01 15:47:05 +01:00
Arthur
3c69d81eeb
remove jax example ( #30498 )
...
remove example
2024-05-01 16:34:57 +02:00
Matt
1e05671d21
Fix QA example ( #30580 )
...
* Handle cases when CLS token is absent
* Use BOS token as a fallback
2024-05-01 08:43:02 +01:00
Matt
4b4da18f53
Refactor default chat template warnings ( #30551 )
...
* Temporarily silence warnings in apply_chat_template until we can properly deprecate default chat templates
* make fixup
* Move the default chat template warning into apply_chat_template itself
* make fixup
2024-05-01 08:42:11 +01:00
Raushan Turganbay
4bc9cb36b7
Fix Marian model conversion ( #30173 )
...
* fix marian model conversion
* uncomment that line
* remove unnecessary code
* revert tie_weights, doesn't hurt
2024-05-01 12:33:12 +05:00
Raushan Turganbay
38a4bf79ad
Encoder-decoder models: move embedding scale to nn.Module ( #30410 )
...
* move scaling to nn.Module
* let the test be here for now (need to fix)
* failing tests
* last failing models
* Revert commit 4c14817f38
* clean-up
* oops forgot
* codestyle
* raise NotImplemented when possible
* Update tests/test_modeling_common.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* skip tests in respective modeling files
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-01 12:33:00 +05:00
Raushan Turganbay
9d31b32e9d
Use text config's vocab size in testing models ( #30568 )
...
use text config's vocab size
2024-05-01 12:32:45 +05:00
Yih-Dar
78fdd64dcf
Remove `use_square_size` after loading ( #30567 )
...
* fix
* add test
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-04-30 21:11:37 +02:00
Yih-Dar
87927b248e
General PR slow CI ( #30540 )
...
* More general PR slow CI
* Update utils/pr_slow_ci_models.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-04-30 21:05:09 +02:00
Raushan Turganbay
b8ac4d035c
Fix generation doctests ( #30263 )
...
* fix doctest
* fix torch doctest
* make CI happy
* raise error
* make fixup
2024-04-30 21:02:26 +02:00
DarshanDeshpande
2ecefc3959
Add chat templating support for KeyDataset in text-generation pipeline ( #30558 )
...
* added chat templating support for keydataset in generation pipeline
* fixed and improved test
* fix formatting test failures
* Fix tests
* Fix tests
2024-04-30 19:51:41 +01:00
Jiarui Xu
0cdb6b3f92
BlipModel: get_multimodal_features method ( #30438 )
...
* add_blip_get_multimodal_features
* Fix docstring error
* reimplement get_multimodal_features
* fix error
* recheck code quality
* add new necessary tests
2024-04-30 19:01:01 +01:00
Anton Vlasjuk
9112520b15
Fix seq2seq collator padding ( #30556 )
...
* fix seq2seq data collator to respect the given padding strategy
further added tests for the seq2seq data collator in the style of the `data_collator_for_token_classification` (pt, tf, np)
* formatting and change bool equals "==" to "is"
* add missed return types in tests
* update numpy test as it can handle unequal shapes, not like pt or tf
2024-04-30 18:32:30 +01:00
Joao Gante
78a57c5e1a
DBRX: make fixup ( #30578 )
2024-04-30 18:30:23 +01:00
Joao Gante
1bff6a0b58
Generate: update links on LLM tutorial doc ( #30550 )
2024-04-30 18:14:12 +01:00