Commit Graph

15957 Commits

Author SHA1 Message Date
Clara Pohland e076953079
Trainer._load_from_checkpoint - support loading multiple Peft adapters (#30505)
* Trainer: load checkpoint model with multiple adapters

* Trainer._load_from_checkpoint support multiple active adapters

* PeftModel.set_adapter does not support multiple adapters yet

* Trainer._load_from_checkpoint test multiple adapters

---------

Co-authored-by: Clara Luise Pohland <clara-luise.pohland@telekom.de>
2024-05-06 08:22:52 -04:00
Marc Sun aa64f086a2
Fix llava next tie_word_embeddings config (#30640)
* fix llava next embedding

* add docstring

* Update src/transformers/models/llava_next/configuration_llava_next.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

---------

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2024-05-06 14:01:26 +02:00
Younes Belkada 9c772ac888
Quantization / HQQ: Fix HQQ tests on our runner (#30668)
Update test_hqq.py
2024-05-06 11:33:52 +02:00
Arthur a45c514899
Hotfix-change-ci (#30669)
* dummy change

* fix

* revert change
2024-05-06 11:26:04 +02:00
jiaqianjing 09edd77f64
Check if the current compiled version of pytorch supports MPS (#30664)
2024-05-06 10:32:19 +02:00
Arthur 307f632bb2
[`CI update`] Try to use dockers and no cache (#29202)
* change cis

* nits

* update

* minor updates

* [push-ci-image]

* nit [push-ci-image]

* nitsssss

* [build-ci-image]

* [push-ci-image]

* [push-ci-image]

* both

* [push-ci-image]

* this?

* [push-ci-image]

* pypi-kenlm needs g++

* [push-ci-image]

* nit

* more nits [push-ci-image]

* nits [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* add vision

* [push-ci-image]

* [push-ci-image]

* add new dummy file but will need to update them [push-ci-image]

* [push-ci-image]

* show package size as well

* [push-ci-image]

* potentially ignore failures

* workflow updates

* nits [push-ci-image]

* [push-ci-image]

* fix consistency

* clean nvidia triton

* also show big packages [push-ci-image]

* nit

* update

* another one

* line escape?

* add accelerate [push-ci-image]

* updates [push-ci-image]

* nits to run tests, no push-ci

* try to parse skip reason to make sure nothing is skipped that should not be skipped

* nit?

* always show skipped reasons

* nits

* better parsing of the test outputs

* action="store_true",

* failure on failed

* show matched

* debug

* update short summary with skipped, failed and errors

* nits

* nits

* cool updates

* remove docbuilder

* fix

* always run checks

* oups

* nits

* don't error out on library printing

* non zero exit codes

* no warning

* nit

* WAT?

* format nit

* [push-ci-image]

* fail if fail is needed

* [push-ci-image]

* sound file for torch light?

* [push-ci-image]

* order is important [push-ci-image]

* [push-ci-image] reduce even further

* [push-ci-image]

* use pytest rich !

* yes [push-ci-image]

* oupsy

* bring back the full traceback, but pytest rich should help

* nit

* [push-ci-image]

* re run

* nit

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* empty push to trigger

* [push-ci-image]

* nit? [push-ci-image]

* empty

* try to install timm with no deps

* [push-ci-image]

* oups [push-ci-image]

* [push-ci-image]

* [push-ci-image] ?

* [push-ci-image] open ssh client for git checkout fast

* empty for torch light

* updates [push-ci-image]

* nit

* @v4 for checkout

* [push-ci-image]

* [push-ci-image]

* fix fetch tests with parallelism

* [push-ci-image]

* more parallelism

* nit

* more nits

* empty to re-trigger

* empty to re-trigger

* split by timing

* did not work with previous commit

* junit.xml

* no path?

* mmm this?

* junitxml format

* split by timing

* nit

* fix junit family

* now we can test if the xunit1 is compatible!

* this?

* fully list tests

* update

* update

* oups

* finally

* use classname

* remove working directory to make sure the path does not interfere

* okay now junit should have the correct path

* name split?

* sort by classname is what make most sense

* some testing

* name

* oups

* test something fun

* autodetect

* 18?

* nit

* file size?

* uip

* 4 is best

* update to see versions

* better print

* [push-ci-image]

* [push-ci-image]

* please install the correct keras version

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* uv is fucking me up

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* nits

* [push-ci-image]

* [push-ci-image]

* install issues and pins

* tapas as well

* nits

* more parallelism

* short tb

* soundfile

* soundfile

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* oups

* [push-ci-image]

* fix some things

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* use torch-light for hub

* small git lfs for hub job

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* fix tf tapas

* [push-ci-image]

* nits

* [push-ci-image]

* don't update the test

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* no use them

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* update tf proba

* [push-ci-image]

* [push-ci-image]

* woops

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* test with built dockers

* [push-ci-image]

* skip annoying tests

* revert fix copy

* update test values

* update

* last skip and fixup

* nit

* ALL GOOOD

* quality

* Update tests/models/layoutlmv2/test_image_processing_layoutlmv2.py

* Update docker/quality.dockerfile

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Update src/transformers/models/tapas/modeling_tf_tapas.py

Co-authored-by: Lysandre Debut <hi@lysand.re>

* Apply suggestions from code review

Co-authored-by: Lysandre Debut <hi@lysand.re>

* use torch-speed

* updates

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* [push-ci-image]

* fuck ken-lm [push-ci-image]

* [push-ci-image]

* [push-ci-image]

---------

Co-authored-by: Lysandre Debut <hi@lysand.re>
2024-05-06 10:10:32 +02:00
Yih-Dar 91d155ea92
Avoid duplication in PR slow CI model list (#30634)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-03 18:19:30 +02:00
Yen Ting deb7605a2a
Prevent `TextGenerationPipeline._sanitize_parameters` from overriding previously provided parameters (#30362)
* Fixed TextGenerationPipeline._sanitize_parameters default params

* removed empty spaces

---------

Co-authored-by: Ng, Yen Ting <yen.ting.ng@intel.com>
2024-05-03 17:49:28 +02:00
Younes Belkada d0c72c15c2
HQQ: PEFT support for HQQ (#30632)
Update quantizer_hqq.py
2024-05-03 16:01:15 +02:00
Pavel Iakubovskii 66f675eb65
Fix W&B run name (#30462)
* Remove comparison to output_dir

* Update docs for `run_name`

* Add warning
2024-05-03 12:04:15 +01:00
Mayank Mishra 425e1a0426
add mlp bias for llama models (#30031)
* add bias

* fix quality
2024-05-03 11:02:17 +02:00
Raushan Turganbay a0e77a1f6b
Fix CI after #30410 (#30612)
* Fix CI after #30410

* [run-slow] blenderbot
2024-05-03 01:18:48 +05:00
mobicham 59952994c4
Add HQQ quantization support (#29637)
* update HQQ transformers integration

* push import_utils.py

* add force_hooks check in modeling_utils.py

* fix | with Optional

* force bias as param

* check bias is Tensor

* force forward for multi-gpu

* review fixes pass

* remove torch grad()

* if any key in linear_tags fix

* add cpu/disk check

* isinstance return

* add multigpu test + refactor tests

* clean hqq_utils imports in hqq.py

* clean hqq_utils imports in quantizer_hqq.py

* delete hqq_utils.py

* Delete src/transformers/utils/hqq_utils.py

* ruff init

* remove torch.float16 from __init__ in test

* refactor test

* isinstance -> type in quantizer_hqq.py

* cpu/disk device_map check in quantizer_hqq.py

* remove type(module) nn.linear check in quantizer_hqq.py

* add BaseQuantizeConfig import inside HqqConfig init

* remove hqq import in hqq.py

* remove accelerate import from test_hqq.py

* quant config.py doc update

* add hqqconfig to main_classes doc

* make style

* __init__ fix

* ruff __init__

* skip_modules list

* hqqconfig format fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* hqqconfig doc fix

* test_hqq.py remove mistral comment

* remove self.using_multi_gpu is False

* torch_dtype default val set and logger.info

* hqq.py isinstance fix

* remove torch=None

* torch_device test_hqq

* rename test_hqq

* MODEL_ID in test_hqq

* quantizer_hqq setattr fix

* quantizer_hqq typo fix

* imports quantizer_hqq.py

* isinstance quantizer_hqq

* hqq_layer.bias reformat quantizer_hqq

* Step 2 as comment in quantizer_hqq

* prepare_for_hqq_linear() comment

* keep_in_fp32_modules fix

* HqqHfQuantizer reformat

* quantization.md hqqconfig

* quantization.md model example reformat

* quantization.md # space

* quantization.md space   })

* quantization.md space   })

* quantization_config fix doc

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* axis value check in quantization_config

* format

* dynamic config explanation

* quant config method in quantization.md

* remove shard-level progress

* .cuda fix modeling_utils

* test_hqq fixes

* make fix-copies

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-02 17:51:49 +01:00
Jonghwan Hyeon 4c940934da
Output `None` as attention when layer is skipped (#30597)
* Output `None` as attention when layer is skipped

* Add test for output_attentions
2024-05-02 17:25:19 +01:00
Michael Benayoun 39359e5b5f
Fix FX tracing issues for Llama (#30619)
2024-05-02 17:03:10 +02:00
Joao Gante 9719202d37
Generate: fix `SinkCache` on Llama models (#30581)
2024-05-02 15:24:33 +01:00
Joao Gante 66abe13951
Docs: add missing `StoppingCriteria` autodocs (#30617)
* add missing docstrings to docs

* Update src/transformers/generation/stopping_criteria.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-02 15:20:04 +01:00
Joao Gante aa55ff44a2
Docs: fix `generate`-related rendering issues (#30600)
* does this work?

* like this?

* fix the other generate links

* missing these
2024-05-02 14:42:25 +01:00
amitportnoy 801894e08c
phi3 chat_template does not support system role (#30606)
* phi3 chat_template does not support system role

* fix doc test error
2024-05-02 15:30:21 +02:00
Yih-Dar f57f014936
Use `contiguous()` in clip checkpoint conversion script (#30613)
* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2024-05-02 13:59:40 +02:00
Zhan Lu a65da83d75
fix:missing `output_router_logits` in SwitchTransformers (#30573)
* fix:missing `output_router_logits` in SwitchTransformers

* fix whitespace in blank line
2024-05-02 13:47:00 +02:00
amyeroberts 4ad5adaf1d
Fix copies for DBRX - neuron fix (#30610)
2024-05-02 11:00:26 +01:00
Richard Brown f95302584b
🚨 Update image_processing_vitmatte.py (#30566)
* Update image_processing_vitmatte.py

* add test

* [run-slow]vitmatte
2024-05-02 11:00:07 +01:00
Bai Li 12c5544dca
Fix memory leak with CTC training script on Chinese languages (#30358)
* Fix memory leak with CTC training script on Chinese languages

* Fix lint
2024-05-02 09:33:36 +01:00
Michael Benayoun fbabd6746f
Fix for Neuron (#30259)
2024-05-02 10:24:47 +02:00
Raushan Turganbay 5cf3e6bf05
Fix: failing CI after #30568 (#30599)
* failing CI

* no, let's keep it until full deprecation in v4.42
2024-05-02 12:15:17 +05:00
dependabot[bot] c681b58b06
Bump torch from 1.9.0+cpu to 1.13.1 in /examples/flax/vision (#21168)
Bumps [torch](https://github.com/pytorch/pytorch) from 1.9.0+cpu to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/commits/v1.13.1)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-01 20:14:57 +01:00
dependabot[bot] 3a36597a5f
Bump pillow from 10.0.1 to 10.2.0 in /examples/research_projects/decision_transformer (#28655)
Bump pillow in /examples/research_projects/decision_transformer

Bumps [pillow](https://github.com/python-pillow/Pillow) from 10.0.1 to 10.2.0.
- [Release notes](https://github.com/python-pillow/Pillow/releases)
- [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst)
- [Commits](https://github.com/python-pillow/Pillow/compare/10.0.1...10.2.0)

---
updated-dependencies:
- dependency-name: pillow
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 19:58:34 +01:00
dependabot[bot] 4f3c7af489
Bump torch from 1.9.0+cpu to 1.13.1 in /examples/research_projects/jax-projects/hybrid_clip (#21167)
Bump torch in /examples/research_projects/jax-projects/hybrid_clip

Bumps [torch](https://github.com/pytorch/pytorch) from 1.9.0+cpu to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/commits/v1.13.1)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 18:37:55 +01:00
dependabot[bot] 6f465d45d9
Bump torch from 1.11.0 to 1.13.1 in /examples/research_projects/decision_transformer (#21171)
Bump torch in /examples/research_projects/decision_transformer

Bumps [torch](https://github.com/pytorch/pytorch) from 1.11.0 to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.11.0...v1.13.1)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 18:16:25 +01:00
Fraser Mince 5090ea3f68
Fix llava half precision and autocast issues (#29721)
* Ensure input_embeds and image_features are the same dtype in autocast

* Fix nans in half precision llava-next and fix autocasting behavior.

* Fix styling issues.

* fix randn newline instantiation

* fix broken slow llava test

* Fix llava next init.

* fix styling issues

* [run-slow]llava,llava_next

* fix styling issues
2024-05-01 17:49:44 +01:00
Joao Gante d57ffb487f
Generate: remove deprecated public decoding functions and streamline logic 🧼 (#29956)
2024-05-01 17:38:44 +01:00
NielsRogge dc401d3a4e
Improve object detection task guideline (#29967)
* Add improvements

* Address comment
2024-05-01 17:58:01 +02:00
amyeroberts d2feb54591
Fix image segmentation example - don't reopen image (#30481)
Fix image segmentation example - don't reopen image
2024-05-01 16:52:57 +01:00
dependabot[bot] 6e0cba3cec
Bump torch from 1.6.0 to 1.13.1 in /examples/research_projects/visual_bert (#21172)
Bump torch in /examples/research_projects/visual_bert

Bumps [torch](https://github.com/pytorch/pytorch) from 1.6.0 to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.6.0...v1.13.1)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:40:54 +01:00
dependabot[bot] ce66c0e989
Bump torch from 1.11.0 to 1.13.1 in /examples/research_projects/codeparrot (#21170)
Bump torch in /examples/research_projects/codeparrot

Bumps [torch](https://github.com/pytorch/pytorch) from 1.11.0 to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.11.0...v1.13.1)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:40:19 +01:00
dependabot[bot] 7a29c577e8
Bump torch from 1.6.0 to 1.13.1 in /examples/research_projects/lxmert (#21174)
Bumps [torch](https://github.com/pytorch/pytorch) from 1.6.0 to 1.13.1.
- [Release notes](https://github.com/pytorch/pytorch/releases)
- [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md)
- [Commits](https://github.com/pytorch/pytorch/compare/v1.6.0...v1.13.1)

---
updated-dependencies:
- dependency-name: torch
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:39:55 +01:00
dependabot[bot] b33f01fe6b
Bump pyarrow from 1.0.1 to 15.0.0 in /examples/research_projects/lxmert (#30584)
Bumps [pyarrow](https://github.com/apache/arrow) from 1.0.1 to 15.0.0.
- [Commits](https://github.com/apache/arrow/compare/apache-arrow-1.0.1...go/v15.0.0)

---
updated-dependencies:
- dependency-name: pyarrow
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:38:07 +01:00
dependabot[bot] 0ec3003ae9
Bump pyarrow from 1.0.1 to 15.0.0 in /examples/research_projects/visual_bert (#30583)
Bump pyarrow in /examples/research_projects/visual_bert

Bumps [pyarrow](https://github.com/apache/arrow) from 1.0.1 to 15.0.0.
- [Commits](https://github.com/apache/arrow/compare/apache-arrow-1.0.1...go/v15.0.0)

---
updated-dependencies:
- dependency-name: pyarrow
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:37:54 +01:00
dependabot[bot] aefbdfe8cf
Bump pyarrow from 7.0.0 to 15.0.0 in /examples/research_projects/decision_transformer (#30582)
Bump pyarrow in /examples/research_projects/decision_transformer

Bumps [pyarrow](https://github.com/apache/arrow) from 7.0.0 to 15.0.0.
- [Commits](https://github.com/apache/arrow/compare/go/v7.0.0...go/v15.0.0)

---
updated-dependencies:
- dependency-name: pyarrow
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:37:40 +01:00
dependabot[bot] 7164171212
Bump gitpython from 3.1.32 to 3.1.41 in /examples/research_projects/distillation (#30586)
Bump gitpython in /examples/research_projects/distillation

Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.32 to 3.1.41.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases)
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES)
- [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.32...3.1.41)

---
updated-dependencies:
- dependency-name: gitpython
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:36:57 +01:00
dependabot[bot] ff8f624542
Bump grpcio from 1.44.0 to 1.53.2 in /examples/research_projects/decision_transformer (#30585)
Bump grpcio in /examples/research_projects/decision_transformer

Bumps [grpcio](https://github.com/grpc/grpc) from 1.44.0 to 1.53.2.
- [Release notes](https://github.com/grpc/grpc/releases)
- [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md)
- [Commits](https://github.com/grpc/grpc/compare/v1.44.0...v1.53.2)

---
updated-dependencies:
- dependency-name: grpcio
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:35:52 +01:00
dependabot[bot] b71f512823
Bump gitpython from 3.1.32 to 3.1.41 in /examples/research_projects/decision_transformer (#30587)
Bump gitpython in /examples/research_projects/decision_transformer

Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.32 to 3.1.41.
- [Release notes](https://github.com/gitpython-developers/GitPython/releases)
- [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES)
- [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.32...3.1.41)

---
updated-dependencies:
- dependency-name: gitpython
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-01 16:30:24 +01:00
Pedro Cuenca f4f18afde8
Gemma: update activation warning (#29995)
* Gemma: only display act. warning when necessary

This is a nit PR, but I was confused. I got the warning even after I
had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I
was using the "legacy" `gelu_pytorch_tanh`.

Another option is to keep the warning but change the message to say
something like "`hidden_act` is ignored, please use `hidden_activation`
instead. Setting Gemma's activation function to `gelu_pytorch_tanh`".

* Change message, and set `config.hidden_activation`
2024-05-01 17:23:38 +02:00
amyeroberts bbaa8ceff6
Fix canonical model --model_type in examples (#30480)
Fix --model_type in examples
2024-05-01 15:47:05 +01:00
Arthur 3c69d81eeb
remove jax example (#30498)
remove example
2024-05-01 16:34:57 +02:00
Matt 1e05671d21
Fix QA example (#30580)
* Handle cases when CLS token is absent

* Use BOS token as a fallback
2024-05-01 08:43:02 +01:00
Matt 4b4da18f53
Refactor default chat template warnings (#30551)
* Temporarily silence warnings in apply_chat_template until we can properly deprecate default chat templates

* make fixup

* Move the default chat template warning into apply_chat_template itself

* make fixup
2024-05-01 08:42:11 +01:00
Raushan Turganbay 4bc9cb36b7
Fix Marian model conversion (#30173)
* fix marian model conversion

* uncomment that line

* remove unnecessary code

* revert tie_weights, doesn't hurt
2024-05-01 12:33:12 +05:00
Raushan Turganbay 38a4bf79ad
Encoder-decoder models: move embedding scale to nn.Module (#30410)
* move scaling to nn.Module

* let the test be here for now (need to fix)

* failing tests

* last failing models

* Revert commit 4c14817f38

* clean-up

* oops forgot

* codestyle

* raise NotImplemented when possible

* Update tests/test_modeling_common.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* skip tests in respective modeling files

---------

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2024-05-01 12:33:00 +05:00