transformers

Commit Graph

Author	SHA1	Message	Date
Klaus Hipp	1ea0bbd73c	[Docs] Update project names and links in awesome-transformers (#28878 ) Update project names and repository links in awesome-transformers	2024-02-06 04:06:29 +01:00
dependabot[bot]	e83227d76e	Bump cryptography from 41.0.2 to 42.0.0 in /examples/research_projects/decision_transformer (#28879 ) Bump cryptography in /examples/research_projects/decision_transformer Bumps [cryptography](https://github.com/pyca/cryptography) from 41.0.2 to 42.0.0. - [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst) - [Commits](https://github.com/pyca/cryptography/compare/41.0.2...42.0.0) --- updated-dependencies: - dependency-name: cryptography dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-02-06 03:53:08 +01:00
nakranivaibhav	2e7c942c81	Adds LlamaForQuestionAnswering class in modeling_llama.py along with AutoModel Support (#28777 ) * This is a test commit * testing commit * final commit with some changes * Removed copy statement * Fixed formatting issues * Fixed error added past_key_values in the forward method * Fixed a trailing whitespace. Damn the formatting rules are strict * Added the copy statement	2024-02-06 03:41:42 +01:00
xkszltl	ac51e59e47	Do not use mtime for checkpoint rotation. (#28862 ) Resolve https://github.com/huggingface/transformers/issues/26961	2024-02-06 03:21:50 +01:00
eajechiloae	06901162b5	ClearMLCallback enhancements: support multiple runs and handle logging better (#28559 ) * add clearml tracker * support multiple train runs * remove bad code * add UI entries for config/hparams overrides * handle models in different tasks * run ruff format * tidy code based on code review --------- Co-authored-by: Eugen Ajechiloae <eugenajechiloae@gmail.com>	2024-02-05 20:04:17 +00:00
amyeroberts	ba3264b4e8	Image Feature Extraction pipeline (#28216 ) * Draft pipeline * Fixup * Fix docstrings * Update doctest * Update pipeline_model_mapping * Update docstring * Update tests * Update src/transformers/pipelines/image_feature_extraction.py Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Fix docstrings - review comments * Remove pipeline mapping for composite vision models * Add to pipeline tests * Remove for flava (multimodal) * safe pil import * Add requirements for pipeline run * Account for super slow efficientnet * Review comments * Fix tests * Swap order of kwargs * Use build_pipeline_init_args * Add back FE pipeline for Vilt * Include image_processor_kwargs in docstring * Mark test as flaky * Update TODO * Update tests/pipelines/test_pipelines_image_feature_extraction.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Add license header --------- Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-05 14:50:07 +00:00
Yoach Lacombe	7addc9346c	Correct wav2vec2-bert inputs_to_logits_ratio (#28821 ) * Correct wav2vec2-bert inputs_to_logits_ratio * correct ratio * correct ratio, clean asr pipeline * refactor on one line	2024-02-05 13:14:47 +00:00
Arthur	3f9f749325	[`Doc`] update contribution guidelines (#28858 ) update guidelines	2024-02-05 21:19:21 +09:00
Nicolas Patry	2da28c4b41	[WIP] Hard error when ignoring tensors. (#27484 ) * [WIP] Hard error when ignoring tensors. * Better selection/error when saving a checkpoint. - Find all names we should normally drop (those are in the transformers config) - Find all disjoint tensors (for those we can safely trigger a copy to get rid of the sharing before saving) - Clone those disjoint tensors getting rid of the issue - Find all identical names (those should be declared in the config but we try to find them all anyway.) - For all identical names: - If they are in the config, just ignore them everything is fine - If they are not, warn about them. - For all remainder tensors which are shared yet neither identical NOR disjoint. raise a hard error. * Adding a failing test on `main` that passes here. * We don't need to keep the subfolder logic in this test. * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-02-05 09:17:24 +01:00
w4ffl35	0466fd5ca2	Ability to override clean_code_for_run (#28783 ) * Add clean_code_for_run function * Call clean_code_for_run from agent method	2024-02-05 03:48:41 +01:00
Zizhao Chen	c430d6eaee	[Docs] Fix bad doc: replace save with logging (#28855 ) Fix bad doc: replace save with logging	2024-02-05 03:38:08 +01:00
Ziyang	7b702836af	Support custom scheduler in deepspeed training (#26831 ) Reuse trainer.create_scheduler to create scheduler for deepspeed	2024-02-05 03:33:55 +01:00
dependabot[bot]	ca8944c4e3	Bump dash from 2.3.0 to 2.15.0 in /examples/research_projects/decision_transformer (#28845 ) Bump dash in /examples/research_projects/decision_transformer Bumps [dash](https://github.com/plotly/dash) from 2.3.0 to 2.15.0. - [Release notes](https://github.com/plotly/dash/releases) - [Changelog](https://github.com/plotly/dash/blob/dev/CHANGELOG.md) - [Commits](https://github.com/plotly/dash/compare/v2.3.0...v2.15.0) --- updated-dependencies: - dependency-name: dash dependency-type: direct:production ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>	2024-02-05 03:12:30 +01:00
amyeroberts	3d2900e829	Mark `test_encoder_decoder_model_generate` for `vision_encoder_deocder` as flaky (#28842 ) Mark test as flaky	2024-02-02 16:57:08 +00:00
Sourab Mangrulkar	80d50076c8	Reduce GPU memory usage when using FSDP+PEFT (#28830 ) support FSDP+PEFT	2024-02-02 21:18:01 +05:30
Yih-Dar	f497795948	Use `-v` for `pytest` on CircleCI (#28840 ) use -v in pytest Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-02 16:44:13 +01:00
Yih-Dar	a7cb92aa03	fix / skip (for now) some tests before switch to torch 2.2 (#28838 ) * fix / skip some tests before we can switch to torch 2.2 * style --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-02 14:11:50 +01:00
Yih-Dar	0e75aeefaf	Fix issues caused by natten (#28834 ) try Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-02 21:11:48 +09:00
Juri Ganitkevitch	ec29d25d9f	Add missing None check for hf_quantizer (#28804 ) * Add missing None check for hf_quantizer * Add test, fix logic. * make style * Switch test model to Mistral * Comment * Update tests/test_modeling_utils.py --------- Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>	2024-02-02 09:34:12 +01:00
skumar951	1efb21c764	Explicitly check if token ID's are None in TFBertTokenizer constructor (#28824 ) Add an explicit none-check, since token ids can be 0	2024-02-02 09:13:36 +01:00
Klaus Hipp	721ee783ca	[Docs] Fix spelling and grammar mistakes (#28825 ) * Fix typos and grammar mistakes in docs and examples * Fix typos in docstrings and comments * Fix spelling of `tokenizer` in model tests * Remove erroneous spaces in decorators * Remove extra spaces in Markdown link texts	2024-02-02 08:45:00 +01:00
Steven Liu	2418c64a1c	[docs] HfQuantizer (#28820 ) * tidy * fix path	2024-02-02 08:22:18 +01:00
Steven Liu	abbffc4525	[docs] Backbone (#28739 ) * backbones * fix path * fix paths * fix code snippet * fix links	2024-02-01 09:16:16 -08:00
Rockerz	23ea6743f2	Add models from deit (#28302 ) * Add modelss * Add 2 more models * add models to tocrree * Add modles * Update docs/source/ja/model_doc/detr.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/model_doc/deit.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update docs/source/ja/model_doc/deplot.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * fix bugs --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-02-01 09:15:55 -08:00
zspo	d98591a12b	[docs] fix some bugs about parameter description (#28806 ) Co-authored-by: p_spozzhang <p_spozzhang@tencent.com>	2024-02-01 16:59:29 +00:00
Sangbum Daniel Choi	e19c12e094	enable graident checkpointing in DetaObjectDetection and add tests in Swin/Donut_Swin (#28615 ) * enable graident checkpointing in DetaObjectDetection * fix missing part in original DETA * make style * make fix-copies * Revert "make fix-copies" This reverts commit 4041c86c29248f1673e8173b677c20b5a4511358. * remove fix-copies of DetaDecoder * enable swin gradient checkpointing * fix gradient checkpointing in donut_swin * add tests for deta/swin/donut * Revert "fix gradient checkpointing in donut_swin" This reverts commit 1cf345e34d3cc0e09eb800d9895805b1dd9b474d. * change supports_gradient_checkpointing pipeline to PreTrainedModel * Revert "add tests for deta/swin/donut" This reverts commit 6056ffbb1eddc3cb3a99e4ebb231ae3edf295f5b. * Revert "Revert "fix gradient checkpointing in donut_swin"" This reverts commit 24e25d0a14891241de58a0d86f817d0b5d2a341f. * Simple revert * enable deformable detr gradient checkpointing * add gradient in encoder	2024-02-01 15:07:44 +00:00
Matt	7bc6d76396	Add tip on setting tokenizer attributes (#28764 ) * Add tip on setting tokenizer attributes * Grammar * Remove the bit that was causing doc builds to fail	2024-02-01 14:44:58 +00:00
fxmarty	709dc43239	Fix symbolic_trace with kv cache (#28724 ) * fix symbolic_trace with kv cache * comment & better test	2024-02-01 09:45:02 +01:00
Yih-Dar	eb8e7a005f	Make `is_torch_bf16_available_on_device` more strict (#28796 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-02-01 09:03:53 +01:00
JB (Don)	0d26abdd3a	Adding [T5/MT5/UMT5]ForTokenClassification (#28443 ) * Adding [T5/MT5/UMT5]ForTokenClassification * Add auto mappings for T5ForTokenClassification and variants * Adding ForTokenClassification to the list of models * Adding attention_mask param to the T5ForTokenClassification test * Remove outdated comment in test * Adding EncoderOnly and Token Classification tests for MT5 and UMT5 * Fix typo in umt5 string * Add tests for all the existing MT5 models * Fix wrong comment in dependency_versions_table * Reverting change to common test for _keys_to_ignore_on_load_missing The test is correctly picking up redundant keys in _keys_to_ignore_on_load_missing. * Removing _keys_to_ignore_on_missing from MT5 since the key is not used in the model * Add fix-copies to MT5ModelTest	2024-02-01 03:53:49 +01:00
Shichao Song	7b2bd1fbbd	[docs] Correct the statement in the docstirng of compute_transition_scores in generation/utils.py (#28786 )	2024-01-31 17:07:30 +00:00
Yih-Dar	4735866141	Split daily CI using 2 level matrix (#28773 ) * update / add new workflow files * Add comment * Use env.NUM_SLICES * use scripts * use scripts * use scripts * Fix * using one script * Fix * remove unused file * update * fail-fast: false * remove unused file * fix * fix * use matrix * inputs * style * update * fix * fix * no model name * add doc * allow args * style * pass argument --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-31 18:04:43 +01:00
Yih-Dar	95346e9dcd	Add artifact name in job step to maintain job / artifact correspondence (#28682 ) * avoid using job name * apply to other files --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-31 15:58:17 +01:00
Joao Gante	beb2a09687	DeepSpeed: hardcode `torch.arange` dtype on `float` usage to avoid incorrect initialization (#28760 )	2024-01-31 14:39:07 +00:00
Kian Sierra McGettigan	f7076cd346	Flax mistral (#26943 ) * direct copy from llama work * mistral modules forward pass working * flax mistral forward pass with sliding window * added tests * added layer collection approach * Revert "added layer collection approach" This reverts commit `0e2905bf22`. * Revert "Revert "added layer collection approach"" This reverts commit `fb17b6187a`. * fixed attention outputs * added mistral to init and auto * fixed import name * fixed layernorm weight dtype * freeze initialized weights * make sure conversion consideres bfloat16 * added backend * added docstrings * added cache * fixed sliding window causal mask * passes cache tests * passed all tests * applied make style * removed commented out code * applied fix-copies ignored other model changes * applied make fix-copies * removed unused functions * passed generation integration test * slow tests pass * fixed slow tests * changed default dtype from jax.numpy.float32 to float32 for docstring check * skip cache test for FlaxMistralForSequenceClassification since if pad_token_id in input_ids it doesn't score previous input_ids * updated checkpoint since from_pt not included * applied black style * removed unused args * Applied styling and fixup * changed checkpoint for doc back * fixed rf after adding it to hf hub * Add dummy ckpt * applied styling * added tokenizer to new ckpt * fixed slice format * fix init and slice * changed ref for placeholder TODO * added copies from Llama * applied styling * applied fix-copies * fixed docs * update weight dtype reconversion for sharded weights * removed Nullable input ids * Removed unnecessary output attentions in Module * added embedding weight initialziation * removed unused past_key_values * fixed deterministic * Fixed RMS Norm and added copied from * removed input_embeds * applied make style * removed nullable input ids from sequence classification model * added copied from GPTJ * added copied from Llama on FlaxMistralDecoderLayer * added copied from to FlaxMistralPreTrainedModel methods * fix test deprecation warning * freeze gpt neox random_params and fix copies * applied make style * fixed doc issue * skipped docstring test to allign # copied from * applied make style * removed FlaxMistralForSequenceClassification * removed unused padding_idx * removed more sequence classification * removed sequence classification * applied styling and consistency * added copied from in tests * removed sequence classification test logic * applied styling * applied make style * removed freeze and fixed copies * undo test change * changed repeat_kv to tile * fixed to key value groups * updated copyright year * split casual_mask * empty to rerun failed pt_flax_equivalence test FlaxWav2Vec2ModelTest * went back to 2023 for tests_pr_documentation_tests * went back to 2024 * changed tile to repeat * applied make style * empty for retry on Wav2Vec2	2024-01-31 14:19:02 +01:00
Matt	7a4961007a	Wrap Keras methods to support BatchEncoding (#28734 ) * Shim the Keras methods to support BatchEncoding * Extract everything to a convert_batch_encoding function * Convert BatchFeature too (thanks Amy) * tf.keras -> keras	2024-01-31 13:18:42 +00:00
Julien Chaumond	721e2d94df	canonical repos moves (#28795 ) * canonical repos moves * Style --------- Co-authored-by: Lysandre <lysandre@huggingface.co>	2024-01-31 14:18:31 +01:00
Hieu Lam	bebeeee012	Resolve DeepSpeed cannot resume training with PeftModel (#28746 ) * fix: resolve deepspeed resume peft model issues * chore: update something * chore: update model instance pass into is peft model checks * chore: remove hard code value to tests * fix: format code	2024-01-31 13:58:26 +01:00
Patrick von Platen	65a926e82b	[Whisper] Refactor forced_decoder_ids & prompt ids (#28687 ) * up * Fix more * Correct more * Fix more tests * fix fast tests * Fix more * fix more * push all files * finish all * make style * Fix timestamp wrap * make style * make style * up * up * up * Fix lang detection behavior * Fix lang detection behavior * Add lang detection test * Fix lang detection behavior * make style * Update src/transformers/models/whisper/generation_whisper.py Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com> * better error message * make style tests * add warning --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2024-01-31 14:02:07 +02:00
Younes Belkada	f9f1f2ac5e	[`HFQuantizer`] Remove `check_packages_compatibility` logic (#28789 ) remove `check_packages_compatibility` logic	2024-01-31 03:21:27 +01:00
tom-p-reichel	ae0c27adfa	don't initialize the output embeddings if we're going to tie them to input embeddings (#28192 ) * test that tied output embeddings aren't initialized on load * don't initialize the output embeddings if we're going to tie them to the input embeddings	2024-01-31 02:19:18 +01:00
Alessio Serra	a937425e94	Prevent MLflow exception from disrupting training (#28779 ) Modified MLflow logging metrics from synchronous to asynchronous Co-authored-by: codiceSpaghetti <alessio.ser@hotmail.it>	2024-01-31 02:10:44 +01:00
Younes Belkada	d703eaaeff	[`bnb`] Fix bnb slow tests (#28788 ) fix bnb slow tests	2024-01-31 01:31:20 +01:00
Matt	74c9cfeaa7	Pin Torch to <2.2.0 (#28785 ) * Pin torch to <2.2.0 * Pin torchvision and torchaudio as well * Playing around with versions to see if this helps * twiddle something to restart the CI * twiddle it back * Try changing the natten version * make fixup * Revert "Try changing the natten version" This reverts commit `de0d6592c3`. * make fixup * fix fix fix * fix fix fix --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-30 23:01:12 +01:00
Matt	415e9a0980	Add tf_keras imports to prepare for Keras 3 (#28588 ) * Port core files + ESM (because ESM code is odd) * Search-replace in modelling code * Fix up transfo_xl as well * Fix other core files + tests (still need to add correct import to tests) * Fix cookiecutter * make fixup, fix imports in some more core files * Auto-add imports to tests * Cleanup, add imports to sagemaker tests * Use correct exception for importing tf_keras * Fixes in modeling_tf_utils * make fixup * Correct version parsing code * Ensure the pipeline tests correctly revert to float32 after each test * Ensure the pipeline tests correctly revert to float32 after each test * More tf.keras -> keras * Add dtype cast * Better imports of tf_keras * Add a cast for tf.assign, just in case * Fix callback imports	2024-01-30 17:26:36 +00:00
amyeroberts	1d489b3e61	Task-specific pipeline init args (#28439 ) * Abstract out pipeline init args * Address PR comments * Reword * BC PIPELINE_INIT_ARGS * Remove old arguments * Small fix	2024-01-30 16:54:57 +00:00
amyeroberts	2fa1c808ae	[`Backbone`] Use `load_backbone` instead of `AutoBackbone.from_config` (#28661 ) * Enable instantiating model with pretrained backbone weights * Remove doc updates until changes made in modeling code * Use load_backbone instead * Add use_timm_backbone to the model configs * Add missing imports and arguments * Update docstrings * Make sure test is properly configured * Include recent DPT updates	2024-01-30 16:54:09 +00:00
Yih-Dar	c24c52454a	Further pin pytest version (in a temporary way) (#28780 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2024-01-30 17:48:49 +01:00
fxmarty	6f7d5db58c	Fix transformers.utils.fx compatibility with torch<2.0 (#28774 ) guard sdpa on torch>=2.0	2024-01-30 14:54:42 +01:00
Thien Tran	5c8d941d66	Use Conv1d for TDNN (#25728 ) * use conv for tdnn * run make fixup * update TDNN * add PEFT LoRA check * propagate tdnn warnings to others * add missing imports * update TDNN in wav2vec2_bert * add missing imports	2024-01-30 09:33:55 +01:00

15060 Commits, All Branches