transformers

Commit Graph

Author	SHA1	Message	Date
Pavel Iakubovskii	66f675eb65	Fix W&B run name (#30462 ) * Remove comparison to output_dir * Update docs for `run_name` * Add warning	2024-05-03 12:04:15 +01:00
Mayank Mishra	425e1a0426	add mlp bias for llama models (#30031 ) * add bias * fix quality	2024-05-03 11:02:17 +02:00
Raushan Turganbay	a0e77a1f6b	Fix CI after #30410 (#30612 ) * Fix CI after #30410 * [run-slow] blenderbot	2024-05-03 01:18:48 +05:00
mobicham	59952994c4	Add HQQ quantization support (#29637 ) * update HQQ transformers integration * push import_utils.py * add force_hooks check in modeling_utils.py * fix | with Optional * force bias as param * check bias is Tensor * force forward for multi-gpu * review fixes pass * remove torch grad() * if any key in linear_tags fix * add cpu/disk check * isinstance return * add multigpu test + refactor tests * clean hqq_utils imports in hqq.py * clean hqq_utils imports in quantizer_hqq.py * delete hqq_utils.py * Delete src/transformers/utils/hqq_utils.py * ruff init * remove torch.float16 from __init__ in test * refactor test * isinstance -> type in quantizer_hqq.py * cpu/disk device_map check in quantizer_hqq.py * remove type(module) nn.linear check in quantizer_hqq.py * add BaseQuantizeConfig import inside HqqConfig init * remove hqq import in hqq.py * remove accelerate import from test_hqq.py * quant config.py doc update * add hqqconfig to main_classes doc * make style * __init__ fix * ruff __init__ * skip_modules list * hqqconfig format fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * hqqconfig doc fix * test_hqq.py remove mistral comment * remove self.using_multi_gpu is False * torch_dtype default val set and logger.info * hqq.py isinstance fix * remove torch=None * torch_device test_hqq * rename test_hqq * MODEL_ID in test_hqq * quantizer_hqq setattr fix * quantizer_hqq typo fix * imports quantizer_hqq.py * isinstance quantizer_hqq * hqq_layer.bias reformat quantizer_hqq * Step 2 as comment in quantizer_hqq * prepare_for_hqq_linear() comment * keep_in_fp32_modules fix * HqqHfQuantizer reformat * quantization.md hqqconfig * quantization.md model example reformat * quantization.md # space * quantization.md space }) * quantization.md space }) * quantization_config fix doc Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * axis value check in quantization_config * format * dynamic config explanation * quant config method in quantization.md * remove shard-level progress * .cuda fix modeling_utils * test_hqq fixes * make fix-copies --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-02 17:51:49 +01:00
Jonghwan Hyeon	4c940934da	Output `None` as attention when layer is skipped (#30597 ) * Output `None` as attention when layer is skipped * Add test for output_attentions	2024-05-02 17:25:19 +01:00
Michael Benayoun	39359e5b5f	Fix FX tracing issues for Llama (#30619 )	2024-05-02 17:03:10 +02:00
Joao Gante	9719202d37	Generate: fix `SinkCache` on Llama models (#30581 )	2024-05-02 15:24:33 +01:00
Joao Gante	66abe13951	Docs: add missing `StoppingCriteria` autodocs (#30617 ) * add missing docstrings to docs * Update src/transformers/generation/stopping_criteria.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-02 15:20:04 +01:00
Joao Gante	aa55ff44a2	Docs: fix `generate`-related rendering issues (#30600 ) * does this work? * like this? * fix the other generate links * missing these	2024-05-02 14:42:25 +01:00
amitportnoy	801894e08c	phi3 chat_template does not support system role (#30606 ) * phi3 chat_template does not support system role * fix doc test error	2024-05-02 15:30:21 +02:00
Yih-Dar	f57f014936	Use `contiguous()` in clip checkpoint conversion script (#30613 ) * fix * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-05-02 13:59:40 +02:00
Zhan Lu	a65da83d75	fix:missing `output_router_logits` in SwitchTransformers (#30573 ) * fix:missing `output_router_logits` in SwitchTransformers * fix whitespace in blank line	2024-05-02 13:47:00 +02:00
amyeroberts	4ad5adaf1d	Fix copies for DBRX - neuron fix (#30610 )	2024-05-02 11:00:26 +01:00
Richard Brown	f95302584b	🚨 Update image_processing_vitmatte.py (#30566 ) * Update image_processing_vitmatte.py * add test * [run-slow]vitmatte	2024-05-02 11:00:07 +01:00
Bai Li	12c5544dca	Fix memory leak with CTC training script on Chinese languages (#30358 ) * Fix memory leak with CTC training script on Chinese languages * Fix lint	2024-05-02 09:33:36 +01:00
Michael Benayoun	fbabd6746f	Fix for Neuron (#30259 )	2024-05-02 10:24:47 +02:00
Raushan Turganbay	5cf3e6bf05	Fix: failing CI after #30568 (#30599 ) * failiing CI * no let's keep it intil full deprecation in v4.42	2024-05-02 12:15:17 +05:00
dependabot[bot]	c681b58b06	Bump torch from 1.9.0+cpu to 1.13.1 in /examples/flax/vision (#21168 ) Bumps [torch](https://github.com/pytorch/pytorch) from 1.9.0+cpu to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/commits/v1.13.1) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-01 20:14:57 +01:00
dependabot[bot]	3a36597a5f	Bump pillow from 10.0.1 to 10.2.0 in /examples/research_projects/decision_transformer (#28655 ) Bump pillow in /examples/research_projects/decision_transformer Bumps [pillow](https://github.com/python-pillow/Pillow) from 10.0.1 to 10.2.0. - [Release notes](https://github.com/python-pillow/Pillow/releases) - [Changelog](https://github.com/python-pillow/Pillow/blob/main/CHANGES.rst) - [Commits](https://github.com/python-pillow/Pillow/compare/10.0.1...10.2.0) --- updated-dependencies: - dependency-name: pillow dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 19:58:34 +01:00
dependabot[bot]	4f3c7af489	Bump torch from 1.9.0+cpu to 1.13.1 in /examples/research_projects/jax-projects/hybrid_clip (#21167 ) Bump torch in /examples/research_projects/jax-projects/hybrid_clip Bumps [torch](https://github.com/pytorch/pytorch) from 1.9.0+cpu to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/commits/v1.13.1) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 18:37:55 +01:00
dependabot[bot]	6f465d45d9	Bump torch from 1.11.0 to 1.13.1 in /examples/research_projects/decision_transformer (#21171 ) Bump torch in /examples/research_projects/decision_transformer Bumps [torch](https://github.com/pytorch/pytorch) from 1.11.0 to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.11.0...v1.13.1) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 18:16:25 +01:00
Fraser Mince	5090ea3f68	Fix llava half precision and autocast issues (#29721 ) * Ensure input_embeds and image_features are the same dtype in autocast * Fix nans in half precision llava-next and fix autocasting behavior. * Fix styling issues. * fix randn newline instantiation * fix broken slow llava test * Fix llava next init. * fix styling issues * [run-slow]llava,llava_next * fix styling issues	2024-05-01 17:49:44 +01:00
Joao Gante	d57ffb487f	Generate: remove deprecated public decoding functions and streamline logic 🧼 (#29956 )	2024-05-01 17:38:44 +01:00
NielsRogge	dc401d3a4e	Improve object detection task guideline (#29967 ) * Add improvements * Address comment	2024-05-01 17:58:01 +02:00
amyeroberts	d2feb54591	Fix image segmentation example - don't reopen image (#30481 ) Fix image segmentation example - don't repoen image	2024-05-01 16:52:57 +01:00
dependabot[bot]	6e0cba3cec	Bump torch from 1.6.0 to 1.13.1 in /examples/research_projects/visual_bert (#21172 ) Bump torch in /examples/research_projects/visual_bert Bumps [torch](https://github.com/pytorch/pytorch) from 1.6.0 to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.6.0...v1.13.1) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:40:54 +01:00
dependabot[bot]	ce66c0e989	Bump torch from 1.11.0 to 1.13.1 in /examples/research_projects/codeparrot (#21170 ) Bump torch in /examples/research_projects/codeparrot Bumps [torch](https://github.com/pytorch/pytorch) from 1.11.0 to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.11.0...v1.13.1) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:40:19 +01:00
dependabot[bot]	7a29c577e8	Bump torch from 1.6.0 to 1.13.1 in /examples/research_projects/lxmert (#21174 ) Bumps [torch](https://github.com/pytorch/pytorch) from 1.6.0 to 1.13.1. - [Release notes](https://github.com/pytorch/pytorch/releases) - [Changelog](https://github.com/pytorch/pytorch/blob/master/RELEASE.md) - [Commits](https://github.com/pytorch/pytorch/compare/v1.6.0...v1.13.1) --- updated-dependencies: - dependency-name: torch dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:39:55 +01:00
dependabot[bot]	b33f01fe6b	Bump pyarrow from 1.0.1 to 15.0.0 in /examples/research_projects/lxmert (#30584 ) Bumps [pyarrow](https://github.com/apache/arrow) from 1.0.1 to 15.0.0. - [Commits](https://github.com/apache/arrow/compare/apache-arrow-1.0.1...go/v15.0.0) --- updated-dependencies: - dependency-name: pyarrow dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:38:07 +01:00
dependabot[bot]	0ec3003ae9	Bump pyarrow from 1.0.1 to 15.0.0 in /examples/research_projects/visual_bert (#30583 ) Bump pyarrow in /examples/research_projects/visual_bert Bumps [pyarrow](https://github.com/apache/arrow) from 1.0.1 to 15.0.0. - [Commits](https://github.com/apache/arrow/compare/apache-arrow-1.0.1...go/v15.0.0) --- updated-dependencies: - dependency-name: pyarrow dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:37:54 +01:00
dependabot[bot]	aefbdfe8cf	Bump pyarrow from 7.0.0 to 15.0.0 in /examples/research_projects/decision_transformer (#30582 ) Bump pyarrow in /examples/research_projects/decision_transformer Bumps [pyarrow](https://github.com/apache/arrow) from 7.0.0 to 15.0.0. - [Commits](https://github.com/apache/arrow/compare/go/v7.0.0...go/v15.0.0) --- updated-dependencies: - dependency-name: pyarrow dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:37:40 +01:00
dependabot[bot]	7164171212	Bump gitpython from 3.1.32 to 3.1.41 in /examples/research_projects/distillation (#30586 ) Bump gitpython in /examples/research_projects/distillation Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.32 to 3.1.41. - [Release notes](https://github.com/gitpython-developers/GitPython/releases) - [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES) - [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.32...3.1.41) --- updated-dependencies: - dependency-name: gitpython dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:36:57 +01:00
dependabot[bot]	ff8f624542	Bump grpcio from 1.44.0 to 1.53.2 in /examples/research_projects/decision_transformer (#30585 ) Bump grpcio in /examples/research_projects/decision_transformer Bumps [grpcio](https://github.com/grpc/grpc) from 1.44.0 to 1.53.2. - [Release notes](https://github.com/grpc/grpc/releases) - [Changelog](https://github.com/grpc/grpc/blob/master/doc/grpc_release_schedule.md) - [Commits](https://github.com/grpc/grpc/compare/v1.44.0...v1.53.2) --- updated-dependencies: - dependency-name: grpcio dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:35:52 +01:00
dependabot[bot]	b71f512823	Bump gitpython from 3.1.32 to 3.1.41 in /examples/research_projects/decision_transformer (#30587 ) Bump gitpython in /examples/research_projects/decision_transformer Bumps [gitpython](https://github.com/gitpython-developers/GitPython) from 3.1.32 to 3.1.41. - [Release notes](https://github.com/gitpython-developers/GitPython/releases) - [Changelog](https://github.com/gitpython-developers/GitPython/blob/main/CHANGES) - [Commits](https://github.com/gitpython-developers/GitPython/compare/3.1.32...3.1.41) --- updated-dependencies: - dependency-name: gitpython dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-05-01 16:30:24 +01:00
Pedro Cuenca	f4f18afde8	Gemma: update activation warning (#29995 ) * Gemma: only display act. warning when necessary This is a nit PR, but I was confused. I got the warning even after I had changed `hidden_act` to `gelu_pytorch_tanh`, telling me that I was using the "legacy" `gelu_pytorch_tanh`. Another option is to keep the warning but change the message to say something like "`hidden_act` is ignored, please use `hidden_activation` instead. Setting Gemma's activation function to `gelu_pytorch_tanh`". * Change message, and set `config.hidden_activation`	2024-05-01 17:23:38 +02:00
amyeroberts	bbaa8ceff6	Fix canonical model --model_type in examples (#30480 ) Fix --model_type in examples	2024-05-01 15:47:05 +01:00
Arthur	3c69d81eeb	remove jax example (#30498 ) remove example	2024-05-01 16:34:57 +02:00
Matt	1e05671d21	Fix QA example (#30580 ) * Handle cases when CLS token is absent * Use BOS token as a fallback	2024-05-01 08:43:02 +01:00
Matt	4b4da18f53	Refactor default chat template warnings (#30551 ) * Temporarily silence warnings in apply_chat_template until we can properly deprecate default chat templates * make fixup * Move the default chat template warning into apply_chat_template itself * make fixup	2024-05-01 08:42:11 +01:00
Raushan Turganbay	4bc9cb36b7	Fix Marian model conversion (#30173 ) * fix marian model coversion * uncomment that line * remove unnecessary code * revert tie_weights, doesn't hurt	2024-05-01 12:33:12 +05:00
Raushan Turganbay	38a4bf79ad	Encoder-decoder models: move embedding scale to nn.Module (#30410 ) * move scaling to nn.Module * let the test be here for now (need to fix) * failing tests * last failing models * Revert commit `4c14817f38` * clean-up * oops forgot * codestyle * raise NotImplemented when possible * Update tests/test_modeling_common.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * skip tests in respective modeling files --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-05-01 12:33:00 +05:00
Raushan Turganbay	9d31b32e9d	Use text config's vocab size in testing models (#30568 ) use text config's vocab size	2024-05-01 12:32:45 +05:00
Yih-Dar	78fdd64dcf	Remove `use_square_size` after loading (#30567 ) * fix * add test --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-04-30 21:11:37 +02:00
Yih-Dar	87927b248e	General PR slow CI (#30540 ) * More general PR slow CI * Update utils/pr_slow_ci_models.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-30 21:05:09 +02:00
Raushan Turganbay	b8ac4d035c	Fix generation doctests (#30263 ) * fix doctest * fix torch doctest * make CI happy * raise error * make fixup	2024-04-30 21:02:26 +02:00
DarshanDeshpande	2ecefc3959	Add chat templating support for KeyDataset in text-generation pipeline (#30558 ) * added chat templating support for keydataset in generation pipeline * fixed and improved test * fix formatting test failures * Fix tests * Fix tests	2024-04-30 19:51:41 +01:00
Jiarui Xu	0cdb6b3f92	BlipModel: get_multimodal_features method (#30438 ) * add_blip_get_multimodal_feautres * Fix docstring error * reimplement get_multimodal_features * fix error * recheck code quality * add new necessary tests	2024-04-30 19:01:01 +01:00
Anton Vlasjuk	9112520b15	Fix seq2seq collator padding (#30556 ) * fix seq2seq data collator to respect the given padding strategy further added tests for the seq2seq data collator in the style of the `data_collator_for_token_classification` (pt, tf, np) * formatting and change bool equals "==" to "is" * add missed return types in tests * update numpy test as it can handle unequal shapes, not like pt or tf	2024-04-30 18:32:30 +01:00
Joao Gante	78a57c5e1a	DBRX: make fixup (#30578 )	2024-04-30 18:30:23 +01:00
Joao Gante	1bff6a0b58	Generate: update links on LLM tutorial doc (#30550 )	2024-04-30 18:14:12 +01:00

15848 Commits