transformers

Commit Graph

Author	SHA1	Message	Date
Younes Belkada	d703eaaeff	[`bnb`] Fix bnb slow tests (#28788 ) fix bnb slow tests	2024-01-31 01:31:20 +01:00
Matt	74c9cfeaa7	Pin Torch to <2.2.0 (#28785 ) * Pin torch to <2.2.0 * Pin torchvision and torchaudio as well * Playing around with versions to see if this helps * twiddle something to restart the CI * twiddle it back * Try changing the natten version * make fixup * Revert "Try changing the natten version" This reverts commit `de0d6592c3`. * make fixup * fix fix fix * fix fix fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-30 23:01:12 +01:00
Matt	415e9a0980	Add tf_keras imports to prepare for Keras 3 (#28588 ) * Port core files + ESM (because ESM code is odd) * Search-replace in modelling code * Fix up transfo_xl as well * Fix other core files + tests (still need to add correct import to tests) * Fix cookiecutter * make fixup, fix imports in some more core files * Auto-add imports to tests * Cleanup, add imports to sagemaker tests * Use correct exception for importing tf_keras * Fixes in modeling_tf_utils * make fixup * Correct version parsing code * Ensure the pipeline tests correctly revert to float32 after each test * Ensure the pipeline tests correctly revert to float32 after each test * More tf.keras -> keras * Add dtype cast * Better imports of tf_keras * Add a cast for tf.assign, just in case * Fix callback imports	2024-01-30 17:26:36 +00:00
amyeroberts	1d489b3e61	Task-specific pipeline init args (#28439 ) * Abstract out pipeline init args * Address PR comments * Reword * BC PIPELINE_INIT_ARGS * Remove old arguments * Small fix	2024-01-30 16:54:57 +00:00
amyeroberts	2fa1c808ae	[`Backbone`] Use `load_backbone` instead of `AutoBackbone.from_config` (#28661 ) * Enable instantiating model with pretrained backbone weights * Remove doc updates until changes made in modeling code * Use load_backbone instead * Add use_timm_backbone to the model configs * Add missing imports and arguments * Update docstrings * Make sure test is properly configured * Include recent DPT updates	2024-01-30 16:54:09 +00:00
Yih-Dar	c24c52454a	Further pin pytest version (in a temporary way) (#28780 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-30 17:48:49 +01:00
fxmarty	6f7d5db58c	Fix transformers.utils.fx compatibility with torch<2.0 (#28774 ) guard sdpa on torch>=2.0	2024-01-30 14:54:42 +01:00
Thien Tran	5c8d941d66	Use Conv1d for TDNN (#25728 ) * use conv for tdnn * run make fixup * update TDNN * add PEFT LoRA check * propagate tdnn warnings to others * add missing imports * update TDNN in wav2vec2_bert * add missing imports	2024-01-30 09:33:55 +01:00
Younes Belkada	866253f85e	[`HfQuantizer`] Move it to "Developper guides" (#28768 ) Update _toctree.yml	2024-01-30 07:20:20 +01:00
Poedator	d78e78a0e4	`HfQuantizer` class for quantization-related stuff in `modeling_utils.py` (#26610 ) * squashed earlier commits for easier rebase * rm rebase leftovers * 4bit save enabled @quantizers * TMP gptq test use exllama * fix AwqConfigTest::test_wrong_backend for A100 * quantizers AWQ fixes * _load_pretrained_model low_cpu_mem_usage branch * quantizers style * remove require_low_cpu_mem_usage attr * rm dtype arg from process_model_before_weight_loading * rm config_origin from Q-config * rm inspect from q_config * fixed docstrings in QuantizationConfigParser * logger.warning fix * mv is_loaded_in_4(8)bit to BnbHFQuantizer * is_accelerate_available error msg fix in quantizer * split is_model_trainable in bnb quantizer class * rm llm_int8_skip_modules as separate var in Q * Q rm todo * fwd ref to HFQuantizer in type hint * rm note re optimum.gptq.GPTQQuantizer * quantization_config in __init__ simplified * replaced NonImplemented with create_quantized_param * rm load_in_4/8_bit deprecation warning * QuantizationConfigParser refactoring * awq-related minor changes * awq-related changes * awq config.modules_to_not_convert * raise error if no q-method in q-config in args * minor cleanup * awq quantizer docstring * combine common parts in bnb process_model_before_weight_loading * revert test_gptq * .process_model_ cleanup * restore dict config warning * removed typevars in quantizers.py * cleanup post-rebase 16 jan * QuantizationConfigParser classmethod refactor * rework of handling of unexpected aux elements of bnb weights * moved q-related stuff from save_pretrained to quantizers * refactor v1 * more changes * fix some tests * remove it from main init * ooops * Apply suggestions from code review Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> * fix awq issues * fix * fix * fix * fix * fix * fix * add docs * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/hf_quantizer.md * address comments * fix * fixup * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * address final comment * update * Update src/transformers/quantizers/base.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/quantizers/auto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix * add kwargs update * fixup * add `optimum_quantizer` attribute * oops * rm unneeded file * fix doctests --------- Co-authored-by: younesbelkada <younesbelkada@gmail.com> Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-30 02:48:25 +01:00
Zhan Ling	1f5590d32e	Move CLIP _no_split_modules to CLIPPreTrainedModel (#27841 ) Add _no_split_modules to CLIPModel	2024-01-30 02:15:58 +01:00
Omar Sanseviero	a989c6c6eb	Don't allow passing `load_in_8bit` and `load_in_4bit` at the same time (#28266 ) * Update quantization_config.py * Style * Protect from setting directly * add tests * Update tests/quantization/bnb/test_4bit.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-01-30 01:43:40 +01:00
ThibaultLengagne	cd2eb8cb2b	Add French translation: french README.md (#28696 ) * doc: french README Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add Depth Anything Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add french link in other docs Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> * doc: Add missing links in fr docs * doc: fix several mistakes in translation Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> --------- Signed-off-by: ThibaultLengagne <thibaultl@padok.fr> Co-authored-by: Sarapuce <alexandreh@padok.fr>	2024-01-29 10:07:49 -08:00
Ajay Patel	a055d09e11	Support saving only PEFT adapter in checkpoints when using PEFT + FSDP (#28297 ) * Update trainer.py * Revert "Update trainer.py" This reverts commit 0557e2cc9effa3a41304322032239a3874b948a7. * Make trainer.py use adapter_only=True when using FSDP + PEFT * Support load_best_model with adapter_only=True * Ruff format * Inspect function args for save_ load_ fsdp utility functions and only pass adapter_only=True if they support it	2024-01-29 17:10:15 +00:00
Sanchit Gandhi	da3c79b245	[Whisper] Make tokenizer normalization public (#28136 ) * [Whisper] Make tokenizer normalization public * add to docs	2024-01-29 16:07:35 +00:00
xkszltl	e694e985d7	Fix typo of `Block`. (#28727 )	2024-01-29 15:25:00 +00:00
amyeroberts	9e8f35fa28	Mark test_constrained_beam_search_generate as flaky (#28757 ) * Make test_constrained_beam_search_generate as flaky * Update tests/generation/test_utils.py	2024-01-29 15:22:25 +00:00
amyeroberts	0f8d015a41	Pin pytest version <8.0.0 (#28758 ) * Pin pytest version <8.0.0 * Update setup.py * make deps_table_update	2024-01-29 15:22:14 +00:00
Julien Chaumond	26aa03a252	small doc update for CamemBERT (#28644 )	2024-01-29 15:46:32 +01:00
Nate Cibik	0548af54cc	Enable Gradient Checkpointing in Deformable DETR (#28686 ) * Enabled gradient checkpointing in Deformable DETR * Enabled gradient checkpointing in Deformable DETR encoder * Removed # Copied from headers in modeling_deta.py to break dependence on Deformable DETR code	2024-01-29 10:10:40 +00:00
Wesley Gifford	f72c7c22d9	PatchtTST and PatchTSMixer fixes (#28083 ) * 🐛 fix .max bug * remove prediction_length from regression output dimensions * fix parameter names, fix output names, update tests * ensure shape for PatchTST * ensure output shape for PatchTSMixer * update model, batch, and expected for regression distribution test * update test expected Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtst/test_modeling_patchtst.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update src/transformers/models/patchtsmixer/modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * standardize on patch_length Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Update tests/models/patchtsmixer/test_modeling_patchtsmixer.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * Make arguments more explicit Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> * adjust prepared inputs Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> --------- Signed-off-by: Wesley M. Gifford <wmgifford@us.ibm.com> Co-authored-by: Wesley M. Gifford <wmgifford@us.ibm.com> Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-01-29 10:09:26 +00:00
Vinyzu	3a08cc485f	[Docs] Fix Typo in English & Japanese CLIP Model Documentation (TMBD -> TMDB) (#28751 ) * [Docs] Fix Typo in English CLIP model_doc * [Docs] Fix Typo in Japanese CLIP model_doc	2024-01-29 10:06:51 +00:00
Klaus Hipp	39fa400969	Fix input data file extension in examples (#28741 )	2024-01-29 10:06:31 +00:00
Yih-Dar	5649c0cbb8	Fix `DepthEstimationPipeline`'s docstring (#28733 ) * fix * fix * Fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-29 10:42:55 +01:00
Angela Yi	243e186efb	Add serialization logic to pytree types (#27871 ) * Add serialized type name to pytrees * Modify context * add serde test	2024-01-29 10:41:20 +01:00
amyeroberts	f1cc615721	[`Siglip`] protect from imports if sentencepiece not installed (#28737 ) [Siglip] protect from imports if sentencepiece not installed	2024-01-28 15:10:14 +00:00
Joao Gante	03cc17775b	Generate: deprecate old src imports (#28607 )	2024-01-27 15:54:19 +00:00
Joao Gante	a28a76996c	Falcon: removed unused function (#28605 )	2024-01-27 15:52:59 +00:00
Sanchit Gandhi	de13a951b3	[Flax] Update no init test for Flax v0.7.1 (#28735 )	2024-01-26 18:20:39 +00:00
Steven Liu	abe0289e6d	[docs] Fix datasets in guides (#28715 ) * change datasets * fix	2024-01-26 09:29:07 -08:00
Yih-Dar	f8b7c4345a	Unpin pydantic (#28728 ) * try pydantic v2 * try pydantic v2 --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-26 17:39:33 +01:00
Scruel Tao	3aea38ce61	fix: suppress `GatedRepoError` to use cache file (fix #28558 ). (#28566 ) * fix: suppress `GatedRepoError` to use cache file (fix #28558). * move condition_to_return parameter back to outside.	2024-01-26 16:25:08 +00:00
Matt	708b19eb09	Stop confusing the TF compiler with ModelOutput objects (#28712 ) * Stop confusing the TF compiler with ModelOutput objects * Stop confusing the TF compiler with ModelOutput objects	2024-01-26 12:22:29 +00:00
Yih-Dar	a638de1987	Fix `weights_only` (#28725 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-26 13:00:49 +01:00
Shukant Pal	d6ac8f4ad2	Initialize _tqdm_active with hf_hub_utils.are_progress_bars_disabled(… (#28717 ) Initialize _tqdm_active with hf_hub_utils.are_progress_bars_disabled() to respect HF_HUB_DISABLE_PROGRESS_BARS It seems like enable_progress_bar() and disable_progress_bar() sync up with huggingface_hub, but the initial value is always True. This changes will make sure the user's preference is respected implicity on initialization.	2024-01-26 11:59:34 +00:00
D	3a46e30dd1	[`docs`] Update preprocessing.md (#28719 ) * Update preprocessing.md adjust ImageProcessor link to working target (same as in lower section of file) * Update preprocessing.md	2024-01-26 11:58:57 +00:00
Turetskii Mikhail	1f47a24aa1	fix: corrected misleading log message in save_pretrained function (#28699 )	2024-01-26 11:52:53 +00:00
Facico	bbe30c6968	support PeftMixedModel signature inspect (#28321 ) * support PeftMixedModel signature inspect * import PeftMixedModel only peft>=0.7.0 * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * fix styling * Update src/transformers/trainer.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> * Update src/transformers/trainer.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * style fixup * fix note --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-01-26 12:05:01 +01:00
fxmarty	8eb74c1c89	Fix duplicate & unnecessary flash attention warnings (#28557 ) * fix duplicate & unnecessary flash warnings * trigger ci * warning_once * if/else order --------- Co-authored-by: Your Name <you@example.com>	2024-01-26 09:37:04 +01:00
Yih-Dar	142ce68389	Don't fail when `LocalEntryNotFoundError` during `processor_config.json` loading (#28709 ) * fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-26 09:02:32 +01:00
Peter Götz	2875195887	[`docs`] Improve visualization for vertical parallelism (#28583 ) The documentation says "We refer to this Model parallelism as “Vertical” because of how models are typically visualized.", but then visualizes the model horizontally. This change visualizes the model indeed vertically.	2024-01-25 17:55:11 +00:00
Fanli Lin	4cbd876e42	[`Vilt`] align input and model dtype in the ViltPatchEmbeddings forward pass (#28633 ) align dtype	2024-01-25 15:03:20 +00:00
Yusuf	24f1a00e4c	Update question_answering.md (#28694 ) fix typo: from: "model = TFAutoModelForQuestionAnswering("distilbert-base-uncased")" to: model = TFAutoModelForQuestionAnswering.from_pretrained("distilbert-base-uncased")	2024-01-25 14:06:38 +00:00
Merve Noyan	2000095666	Improve Backbone API docs (#28666 ) Update backbones.md	2024-01-25 11:51:58 +00:00
Tom Aarsen	7fa4b36eba	[`chore`] Add missing space in warning (#28695 ) Add missing space in warning	2024-01-25 09:34:52 +00:00
NielsRogge	963db81a5a	Add Depth Anything (#28654 ) * First draft * More improvements * More improvements * More improvements * More improvements * Add docs * Remove file * Add copied from * Address comments * Address comments * Address comments * Fix style * Update docs * Convert all checkpoints, add integration test * Rename checkpoints * Add pretrained backbone attributes * Fix default config * Address comment * Add figure to docs * Fix bug thanks to @xenova * Update conversion script * Fix integration test	2024-01-25 09:34:50 +01:00
Steven Liu	f40b87de0c	[docs] Fix doc format (#28684 ) * fix hfoptions * revert changes to other files * fix	2024-01-24 11:18:59 -08:00
Fanli Lin	8278b1538e	improve efficient training on CPU documentation (#28646 ) * update doc * revert * typo fix * refine * add dtypes * Update docs/source/en/perf_train_cpu.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_train_cpu.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/en/perf_train_cpu.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * no comma * use avx512-vnni --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-01-24 09:07:13 -08:00
nakranivaibhav	5d29530ea2	Improved type hinting for all attention parameters (#28479 ) * Changed type hinting for all attention inputs to 'Optional[Tuple[torch.FloatTensor,...]] = None' * Fixed the ruff formatting issue * fixed type hinting for all hidden_states to 'Optional[Tuple[torch.FloatTensor, ...]] = None' * Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py * test fail update * fixed type hinting for these 15 scripts modeling_xlnet.py,modeling_tf_xlnet.py,modeling_led.py,modeling_tf_led.py,modleing_rwkv.py,modeling_dpt.py,modeling_tf_cvt.py,modeling_clip.py,modeling_flax_clip.py,modeling_tf_clip.py,modeling_longformer.py,modeling_tf_longformer.py,modeling_siglip.py,modeling_clap.py,modeling_git.py * Changed type hinting in these 12 scripts modeling_dpr.py,modeling_nat.py,idefics/vision.py,modeling_tf_dpr.py,modeling_luke.py,modeling_swin.py,modeling_tf_swin.py,modeling_blip.py,modeling_tf_blip.py,modeling_donut_swin.py,modeling_dinat.py,modeling_swinv2.py * test fail update * Removed the myvenv file * Fixed type hinting for these 8 scripts modeling_tvlt.py,modeling_sam.py,modeling_tf_sam.py,modeling_tvp.py,modeling_rag.py,modeling_tf_rag.py,modeling_tf_xlm.py,modeling_xlm.py	2024-01-24 16:47:34 +00:00
Steven Liu	738ec75c90	[docs] DeepSpeed (#28542 ) * config * optim * pre deploy * deploy * save weights, memory, troubleshoot, non-Trainer * done	2024-01-24 08:31:28 -08:00


15018 Commits, All Branches
