transformers

Commit Graph

Author	SHA1	Message	Date
Clara Pohland	e076953079	Trainer._load_from_checkpoint - support loading multiple Peft adapters (#30505 ) * Trainer: load checkpoint model with multiple adapters * Trainer._load_from_checkpoint support multiple active adapters * PeftModel.set_adapter does not support multiple adapters yet * Trainer._load_from_checkpoint test multiple adapters --------- Co-authored-by: Clara Luise Pohland <clara-luise.pohland@telekom.de>	2024-05-06 08:22:52 -04:00
Marc Sun	aa64f086a2	Fix llava next tie_word_embeddings config (#30640 ) * fix llava next embedding * add docstring * Update src/transformers/models/llava_next/configuration_llava_next.py Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com> --------- Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>	2024-05-06 14:01:26 +02:00
Younes Belkada	9c772ac888	Quantization / HQQ: Fix HQQ tests on our runner (#30668 ) Update test_hqq.py	2024-05-06 11:33:52 +02:00
Arthur	a45c514899	Hotfix-change-ci (#30669 ) * dmmy change * fiux * revert change	2024-05-06 11:26:04 +02:00
jiaqianjing	09edd77f64	Check if the current compiled version of pytorch supports MPS (#30664 )	2024-05-06 10:32:19 +02:00
Arthur	307f632bb2	[`CI update`] Try to use dockers and no cache (#29202 ) * change cis * nits * update * minor updates * [push-ci-image] * nit [push-ci-image] * nitsssss * [build-ci-image] * [push-ci-image] * [push-ci-image] * both * [push-ci-image] * this? * [push-ci-image] * pypi-kenlm needs g++ * [push-ci-image] * nit * more nits [push-ci-image] * nits [push-ci-image] * [push-ci-image] * [push-ci-image] * [push-ci-image] * add vision * [push-ci-image] * [push-ci-image] * add new dummy file but will need to update them [push-ci-image] * [push-ci-image] * show package size as well * [push-ci-image] * potentially ignore failures * workflow updates * nits [push-ci-image] * [push-ci-image] * fix consistency * clean nciida triton * also show big packages [push-ci-image] * nit * update * another one * line escape? * add accelerate [push-ci-image] * updates [push-ci-image] * nits to run tests, no push-ci * try to parse skip reason to make sure nothing is skipped that should no be skippped * nit? * always show skipped reasons * nits * better parsing of the test outputs * action="store_true", * failure on failed * show matched * debug * update short summary with skipped, failed and errors * nits * nits * coolu pdates * remove docbuilder * fix * always run checks * oups * nits * don't error out on library printing * non zero exi codes * no warning * nit * WAT? * format nit * [push-ci-image] * fail if fail is needed * [push-ci-image] * sound file for torch light? * [push-ci-image] * order is important [push-ci-image] * [push-ci-image] reduce even further * [push-ci-image] * use pytest rich ! * yes [push-ci-image] * oupsy * bring back the full traceback, but pytest rich should help * nit * [push-ci-image] * re run * nit * [push-ci-image] * [push-ci-image] * [push-ci-image] * empty push to trigger * [push-ci-image] * nit? [push-ci-image] * empty * try to install timm with no deps * [push-ci-image] * oups [push-ci-image] * [push-ci-image] * [push-ci-image] ? 
* [push-ci-image] open ssh client for git checkout fast * empty for torch light * updates [push-ci-image] * nit * @v4 for checkout * [push-ci-image] * [push-ci-image] * fix fetch tests with parallelism * [push-ci-image] * more parallelism * nit * more nits * empty to re-trigger * empty to re-trigger * split by timing * did not work with previous commit * junit.xml * no path? * mmm this? * junitxml format * split by timing * nit * fix junit family * now we can test if the xunit1 is compatible! * this? * fully list tests * update * update * oups * finally * use classname * remove working directory to make sure the path does not interfere * okay no juni should have the correct path * name split? * sort by classname is what make most sense * some testing * naem * oups * test something fun * autodetect * 18? * nit * file size? * uip * 4 is best * update to see versions * better print * [push-ci-image] * [push-ci-image] * please install the correct keras version * [push-ci-image] * [push-ci-image] * [push-ci-image] * [push-ci-image] * [push-ci-image] * uv is fucking me up * [push-ci-image] * [push-ci-image] * [push-ci-image] * nits * [push-ci-image] * [push-ci-image] * install issues an pins * tapas as well * nits * more paralellism * short tb * soundfile * soundfile * [push-ci-image] * [push-ci-image] * [push-ci-image] * oups * [push-ci-image] * fix some things * [push-ci-image] * [push-ci-image] * [push-ci-image] * [push-ci-image] * use torch-light for hub * small git lfs for hub job * [push-ci-image] * [push-ci-image] * [push-ci-image] * [push-ci-image] * fix tf tapas * [push-ci-image] * nits * [push-ci-image] * don't update the test * [push-ci-image] * [push-ci-image] * [push-ci-image] * no use them * [push-ci-image] * [push-ci-image] * [push-ci-image] * [push-ci-image] * update tf proba * [push-ci-image] * [push-ci-image] * woops * [push-ci-image] * [push-ci-image] * [push-ci-image] * [push-ci-image] * [push-ci-image] * [push-ci-image] * test with built dockers * 
[push-ci-image] * skip annoying tests * revert fix copy * update test values * update * last skip and fixup * nit * ALL GOOOD * quality * Update tests/models/layoutlmv2/test_image_processing_layoutlmv2.py * Update docker/quality.dockerfile Co-authored-by: Lysandre Debut <hi@lysand.re> * Update src/transformers/models/tapas/modeling_tf_tapas.py Co-authored-by: Lysandre Debut <hi@lysand.re> * Apply suggestions from code review Co-authored-by: Lysandre Debut <hi@lysand.re> * use torch-speed * updates * [push-ci-image] * [push-ci-image] * [push-ci-image] * [push-ci-image] * fuck ken-lm [push-ci-image] * [push-ci-image] * [push-ci-image] --------- Co-authored-by: Lysandre Debut <hi@lysand.re>	2024-05-06 10:10:32 +02:00
Yih-Dar	91d155ea92	Avoid duplication in PR slow CI model list (#30634 ) update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-05-03 18:19:30 +02:00
Yen Ting	deb7605a2a	Prevent `TextGenerationPipeline._sanitize_parameters` from overriding previously provided parameters (#30362 ) * Fixed TextGenerationPipeline._sanitize_parameters default params * removed empty spaces --------- Co-authored-by: Ng, Yen Ting <yen.ting.ng@intel.com>	2024-05-03 17:49:28 +02:00
Younes Belkada	d0c72c15c2	HQQ: PEFT support for HQQ (#30632 ) Update quantizer_hqq.py	2024-05-03 16:01:15 +02:00
Pavel Iakubovskii	66f675eb65	Fix W&B run name (#30462 ) * Remove comparison to output_dir * Update docs for `run_name` * Add warning	2024-05-03 12:04:15 +01:00
Mayank Mishra	425e1a0426	add mlp bias for llama models (#30031 ) * add bias * fix quality	2024-05-03 11:02:17 +02:00
Raushan Turganbay	a0e77a1f6b	Fix CI after #30410 (#30612 ) * Fix CI after #30410 * [run-slow] blenderbot	2024-05-03 01:18:48 +05:00
mobicham	59952994c4	Add HQQ quantization support (#29637 ) * update HQQ transformers integration * push import_utils.py * add force_hooks check in modeling_utils.py * fix \| with Optional * force bias as param * check bias is Tensor * force forward for multi-gpu * review fixes pass * remove torch grad() * if any key in linear_tags fix * add cpu/disk check * isinstance return * add multigpu test + refactor tests * clean hqq_utils imports in hqq.py * clean hqq_utils imports in quantizer_hqq.py * delete hqq_utils.py * Delete src/transformers/utils/hqq_utils.py * ruff init * remove torch.float16 from __init__ in test * refactor test * isinstance -> type in quantizer_hqq.py * cpu/disk device_map check in quantizer_hqq.py * remove type(module) nn.linear check in quantizer_hqq.py * add BaseQuantizeConfig import inside HqqConfig init * remove hqq import in hqq.py * remove accelerate import from test_hqq.py * quant config.py doc update * add hqqconfig to main_classes doc * make style * __init__ fix * ruff __init__ * skip_modules list * hqqconfig format fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * test_hqq.py remove mistral comment * remove self.using_multi_gpu is False * torch_dtype default val set and logger.info * hqq.py isinstance fix * remove torch=None * torch_device test_hqq * rename test_hqq * MODEL_ID in test_hqq * quantizer_hqq setattr fix * quantizer_hqq typo fix * imports quantizer_hqq.py * isinstance quantizer_hqq * hqq_layer.bias reformat quantizer_hqq * Step 2 as comment in quantizer_hqq * prepare_for_hqq_linear() comment * keep_in_fp32_modules fix * HqqHfQuantizer reformat * quantization.md hqqconfig * quantization.md model example reformat * quantization.md # space * quantization.md space }) * quantization.md space }) * quantization_config fix doc Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * axis value check in quantization_config * format * dynamic config explanation * quant config method in quantization.md * remove shard-level progress * .cuda fix modeling_utils * test_hqq fixes * make fix-copies --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-02 17:51:49 +01:00
Jonghwan Hyeon	4c940934da	Output `None` as attention when layer is skipped (#30597 ) * Output `None` as attention when layer is skipped * Add test for output_attentions	2024-05-02 17:25:19 +01:00
Michael Benayoun	39359e5b5f	Fix FX tracing issues for Llama (#30619 )	2024-05-02 17:03:10 +02:00
Joao Gante	9719202d37	Generate: fix `SinkCache` on Llama models (#30581 )	2024-05-02 15:24:33 +01:00
Joao Gante	66abe13951	Docs: add missing `StoppingCriteria` autodocs (#30617 ) * add missing docstrings to docs * Update src/transformers/generation/stopping_criteria.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-02 15:20:04 +01:00
Joao Gante	aa55ff44a2	Docs: fix `generate`-related rendering issues (#30600 ) * does this work? * like this? * fix the other generate links * missing these	2024-05-02 14:42:25 +01:00
amitportnoy	801894e08c	phi3 chat_template does not support system role (#30606 ) * phi3 chat_template does not support system role * fix doc test error	2024-05-02 15:30:21 +02:00
Yih-Dar	f57f014936	Use `contiguous()` in clip checkpoint conversion script (#30613 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-05-02 13:59:40 +02:00
Zhan Lu	a65da83d75	fix:missing `output_router_logits` in SwitchTransformers (#30573 ) * fix:missing `output_router_logits` in SwitchTransformers * fix whitespace in blank line	2024-05-02 13:47:00 +02:00
amyeroberts	4ad5adaf1d	Fix copies for DBRX - neuron fix (#30610 )	2024-05-02 11:00:26 +01:00
Richard Brown	f95302584b	🚨 Update image_processing_vitmatte.py (#30566 ) * Update image_processing_vitmatte.py * add test * [run-slow]vitmatte	2024-05-02 11:00:07 +01:00
Bai Li	12c5544dca	Fix memory leak with CTC training script on Chinese languages (#30358 ) * Fix memory leak with CTC training script on Chinese languages * Fix lint	2024-05-02 09:33:36 +01:00
Michael Benayoun	fbabd6746f	Fix for Neuron (#30259 )	2024-05-02 10:24:47 +02:00
Raushan Turganbay	5cf3e6bf05	Fix: failing CI after #30568 (#30599 ) * failiing CI * no let's keep it intil full deprecation in v4.42	2024-05-02 12:15:17 +05:00
dependabot[bot]	c681b58b06	Bump torch from 1.9.0+cpu to 1.13.1 in /examples/flax/vision (#21168 ) Bumps [torch](https://github.com/pytorch/pytorch) from 1.9.0+cpu to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/commits/v1.13.1) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-01 20:14:57 +01:00
dependabot[bot]	3a36597a5f	Bump pillow from 10.0.1 to 10.2.0 in /examples/research_projects/decision_transformer (#28655 ) Bump pillow in /examples/research_projects/decision_transformer Bumps [pillow](https://github.com/python-pillow/Pillow) from 10.0.1 to 10.2.0. - [Release notes](https://github.com/python-pillow/Pillow/releases) - [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst) - [Commits](https://github.com/python-pillow/Pillow/compare/10.0.1...10.2.0) --- updated-dependencies: - dependency-name: pillow dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 19:58:34 +01:00
dependabot[bot]	4f3c7af489	Bump torch from 1.9.0+cpu to 1.13.1 in /examples/research_projects/jax-projects/hybrid_clip (#21167 ) Bump torch in /examples/research_projects/jax-projects/hybrid_clip Bumps [torch](https://github.com/pytorch/pytorch) from 1.9.0+cpu to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/commits/v1.13.1) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 18:37:55 +01:00
dependabot[bot]	6f465d45d9	Bump torch from 1.11.0 to 1.13.1 in /examples/research_projects/decision_transformer (#21171 ) Bump torch in /examples/research_projects/decision_transformer Bumps [torch](https://github.com/pytorch/pytorch) from 1.11.0 to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.11.0...v1.13.1) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 18:16:25 +01:00
Fraser Mince	5090ea3f68	Fix llava half precision and autocast issues (#29721 ) * Ensure input_embeds and image_features are the same dtype in autocast * Fix nans in half precision llava-next and fix autocasting behavior. * Fix styling issues. * fix randn newline instantiation * fix broken slow llava test * Fix llava next init. * fix styling issues * [run-slow]llava,llava_next * fix styling issues	2024-05-01 17:49:44 +01:00
Joao Gante	d57ffb487f	Generate: remove deprecated public decoding functions and streamline logic 🧼 (#29956 )	2024-05-01 17:38:44 +01:00
NielsRogge	dc401d3a4e	Improve object detection task guideline (#29967 ) * Add improvements * Address comment	2024-05-01 17:58:01 +02:00
amyeroberts	d2feb54591	Fix image segmentation example - don't reopen image (#30481 ) Fix image segmentation example - don't repoen image	2024-05-01 16:52:57 +01:00
dependabot[bot]	6e0cba3cec	Bump torch from 1.6.0 to 1.13.1 in /examples/research_projects/visual_bert (#21172 ) Bump torch in /examples/research_projects/visual_bert Bumps [torch](https://github.com/pytorch/pytorch) from 1.6.0 to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.6.0...v1.13.1) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:40:54 +01:00
dependabot[bot]	ce66c0e989	Bump torch from 1.11.0 to 1.13.1 in /examples/research_projects/codeparrot (#21170 ) Bump torch in /examples/research_projects/codeparrot Bumps [torch](https://github.com/pytorch/pytorch) from 1.11.0 to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.11.0...v1.13.1) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:40:19 +01:00
dependabot[bot]	7a29c577e8	Bump torch from 1.6.0 to 1.13.1 in /examples/research_projects/lxmert (#21174 ) Bumps [torch](https://github.com/pytorch/pytorch) from 1.6.0 to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.6.0...v1.13.1) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:39:55 +01:00
dependabot[bot]	b33f01fe6b	Bump pyarrow from 1.0.1 to 15.0.0 in /examples/research_projects/lxmert (#30584 ) Bumps [pyarrow](https://github.com/apache/arrow) from 1.0.1 to 15.0.0. - [Commits](https://github.com/apache/arrow/compare/apache-arrow-1.0.1...go/v15.0.0) --- updated-dependencies: - dependency-name: pyarrow dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:38:07 +01:00
dependabot[bot]	0ec3003ae9	Bump pyarrow from 1.0.1 to 15.0.0 in /examples/research_projects/visual_bert (#30583 ) Bump pyarrow in /examples/research_projects/visual_bert Bumps [pyarrow](https://github.com/apache/arrow) from 1.0.1 to 15.0.0. - [Commits](https://github.com/apache/arrow/compare/apache-arrow-1.0.1...go/v15.0.0) --- updated-dependencies: - dependency-name: pyarrow dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:37:54 +01:00
dependabot[bot]	aefbdfe8cf	Bump pyarrow from 7.0.0 to 15.0.0 in /examples/research_projects/decision_transformer (#30582 ) Bump pyarrow in /examples/research_projects/decision_transformer Bumps [pyarrow](https://github.com/apache/arrow) from 7.0.0 to 15.0.0. - [Commits](https://github.com/apache/arrow/compare/go/v7.0.0...go/v15.0.0) --- updated-dependencies: - dependency-name: pyarrow dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:37:40 +01:00
dependabot[bot]	7164171212	Bump gitpython from 3.1.32 to 3.1.41 in /examples/research_projects/distillation (#30586 ) Bump gitpython in /examples/research_projects/distillation Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.32 to 3.1.41. - [Release notes](https://github.com/gitpython-developers/GitPython/releases) - [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES) - [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.32...3.1.41) --- updated-dependencies: - dependency-name: gitpython dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:36:57 +01:00
dependabot[bot]	ff8f624542	Bump grpcio from 1.44.0 to 1.53.2 in /examples/research_projects/decision_transformer (#30585 ) Bump grpcio in /examples/research_projects/decision_transformer Bumps [grpcio](https://github.com/grpc/grpc) from 1.44.0 to 1.53.2. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.44.0...v1.53.2) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:35:52 +01:00
dependabot[bot]	b71f512823	Bump gitpython from 3.1.32 to 3.1.41 in /examples/research_projects/decision_transformer (#30587 ) Bump gitpython in /examples/research_projects/decision_transformer Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.32 to 3.1.41. - [Release notes](https://github.com/gitpython-developers/GitPython/releases) - [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES) - [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.32...3.1.41) --- updated-dependencies: - dependency-name: gitpython dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:30:24 +01:00
Pedro Cuenca	f4f18afde8	Gemma: update activation warning (#29995 ) * Gemma: only display act. warning when necessary This is a nit PR, but I was confused. I got the warning even after I had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I was using the "legacy" `gelu_pytorch_tanh`. Another option is to keep the warning but change the message to say something like "`hidden_act` is ignored, please use `hidden_activation` instead. Setting Gemma's activation function to `gelu_pytorch_tanh`". * Change message, and set `config.hidden_activation`	2024-05-01 17:23:38 +02:00
amyeroberts	bbaa8ceff6	Fix canonical model --model_type in examples (#30480 ) Fix --model_type in examples	2024-05-01 15:47:05 +01:00
Arthur	3c69d81eeb	remove jax example (#30498 ) remove example	2024-05-01 16:34:57 +02:00
Matt	1e05671d21	Fix QA example (#30580 ) * Handle cases when CLS token is absent * Use BOS token as a fallback	2024-05-01 08:43:02 +01:00
Matt	4b4da18f53	Refactor default chat template warnings (#30551 ) * Temporarily silence warnings in apply_chat_template until we can properly deprecate default chat templates * make fixup * Move the default chat template warning into apply_chat_template itself * make fixup	2024-05-01 08:42:11 +01:00
Raushan Turganbay	4bc9cb36b7	Fix Marian model conversion (#30173 ) * fix marian model coversion * uncomment that line * remove unnecessary code * revert tie_weights, doesn't hurt	2024-05-01 12:33:12 +05:00
Raushan Turganbay	38a4bf79ad	Encoder-decoder models: move embedding scale to nn.Module (#30410 ) * move scaling to nn.Module * let the test be here for now (need to fix) * failing tests * last failing models * Revert commit `4c14817f38` * clean-up * oops forgot * codestyle * raise NotImplemented when possible * Update tests/test_modeling_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * skip tests in respective modeling files --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-01 12:33:00 +05:00

15957 Commits

All Branches