transformers

Commit Graph

Author	SHA1	Message	Date
Lysandre Debut	d4e92f1a21	Remove add-new-model in favor of add-new-model-like (#30424 ) * Remove add-new-model in favor of add-new-model-like * nits	2024-04-24 09:38:18 +02:00
Abhi Venigalla	005b957fb8	Add DBRX Model (#29921 ) * wip * fix __init__.py * add docs * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * address comments 1 * work on make fixup * pass configs down * add sdpa attention * remove DbrxBlock * add to configuration_auto * docstring now passes formatting test * fix style * update READMEs * add dbrx to modeling_auto * make fix-copies generated this * add DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP * config docstring passes formatting test * rename moe_loss_weight to router_aux_loss_coef * add to flash-attn documentation * fix model-path in tests * Explicitly make `"suli"` the default `ffn_act_fn` Co-authored-by: Wing Lian <wing.lian@gmail.com> * default to using router_aux_loss_coef over ffn_config[moe_loss_weight] * fix _flash_attn_uses_top_left_mask and is_causal * fix tests path * don't use token type IDs * follow Llama and remove token_type_ids from test * init ConfigTester differently so tests pass * remove multiple choice test * remove question + answer test * remove sequence classification test * remove token classification test * copy Llama tests and remove token_type_ids from test inputs * do not test pruning or headmasking; style code * add _tied_weights_keys parameter to pass test * add type hints * fix type check * update config tester * remove masked_lm test * remove encoder tests * initialize DbrxModelTester with correct params * style * torch_dtype does not rely on torch * run make fixup, fix-copies * use https://huggingface.co/v2ray/dbrx-base-fixed/blob/main/modeling_dbrx.py * add copyright info * fix imports and DbrxRotaryEmbedding * update DbrxModel docstring * use copies * change model path in docstring * use config in DbrxFFN * fix flashattention2, sdpaattention * input config to DbrXAttention, DbrxNormAttentionNorm * more fixes * fix * fix again! * add informative comment * fix ruff? * remove print statement + style * change doc-test * fix doc-test * fix docstring * delete commented out text * make defaults match dbrx-instruct * replace `router_aux_loss_coef` with `moe_loss_weight` * is_decoder=True * remove is_decoder from configtester * implement sdpa properly * make is_decoder pass tests * start on the GenerationTesterMixin tests * add dbrx to sdpa documentation * skip weight typing test * style * initialize smaller model Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Add DBRX to toctree * skip test_new_cache_format * make config defaults smaller again * add pad_token_id * remove pad_token_id from config * Remove all references to DBRX_PRETRAINED_CONFIG_ARCHIVE_MAP * Update src/transformers/models/dbrx/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/dbrx/modeling_dbrx.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/model_doc/dbrx.md Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * Update src/transformers/models/dbrx/configuration_dbrx.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update docs/source/en/model_doc/dbrx.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix typo * Apply suggestions from code review Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * update docs, fix configuration_auto.py * address pr comments * remove is_decoder flag * slice * fix requires grad * remove grad * disconnect differently * remove grad * enable grads * patch * detach expert * nissan al ghaib * Update modeling_dbrx.py * Update src/transformers/models/dbrx/modeling_dbrx.py Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> * replace "Gemma" with "Dbrx" * remove # type: ignore * don't hardcode vocab_size * remove ToDo * Re-add removed idefics2 line * Update test to use tiny-random! * Remove TODO * Remove one more case of loading the entire dbrx-instruct in the tests * Update src/transformers/models/dbrx/modeling_dbrx.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * address some comments * small model * add dbrx to tokenization_auto * More docstrings with add_start_docstrings * Dbrx for now * add PipelineTesterMixin * Update src/transformers/models/dbrx/configuration_dbrx.py Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * remove flash-attn2 import error * fix docstring Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * add useage example * put on one line Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * fix ffn_act_fn Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * change "dbrx" to "DBRX" for display purposes. * fix __init__.py? * fix __init__.py * fix README * return the aux_loss * remove extra spaces * fix configuration_auto.py * fix format in tokenization_auto * remove new line * add more useage examples --------- Co-authored-by: Abhi Venigalla <abhi.venigalla@databricks.com> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Eitan Turok <eitan.turok@databricks.com> Co-authored-by: Eitan Turok <150733043+eitanturok@users.noreply.github.com> Co-authored-by: Wing Lian <wing.lian@gmail.com> Co-authored-by: Eitan Turok <eitanturok@gmail.com> Co-authored-by: Matt <Rocketknight1@users.noreply.github.com> Co-authored-by: Matt <rocketknight1@gmail.com> Co-authored-by: Your Name <you@example.com> Co-authored-by: Mihir Patel <mihir.v.patel7@gmail.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2024-04-18 15:18:52 +02:00
Lysandre Debut	39114c0383	Remove static pretrained maps from the library's internals (#29112 ) * [test_all] Remove static pretrained maps from the library's internals * Deprecate archive maps instead of removing them * Revert init changes * [test_all] Deprecate instead of removing * [test_all] PVT v2 support * [test_all] Tests should all pass * [test_all] Style * Address review comments * Update src/transformers/models/deprecated/_archive_maps.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/deprecated/_archive_maps.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * [test_all] trigger tests * [test_all] LLAVA * [test_all] Bad rebase --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2024-03-25 10:33:38 +01:00
Klaus Hipp	fe3df9d5b3	[Docs] Add language identifiers to fenced code blocks (#28955 ) Add language identifiers to code blocks	2024-02-12 10:48:31 -08:00
Klaus Hipp	721ee783ca	[Docs] Fix spelling and grammar mistakes (#28825 ) * Fix typos and grammar mistakes in docs and examples * Fix typos in docstrings and comments * Fix spelling of `tokenizer` in model tests * Remove erroneous spaces in decorators * Remove extra spaces in Markdown link texts	2024-02-02 08:45:00 +01:00
Matt	415e9a0980	Add tf_keras imports to prepare for Keras 3 (#28588 ) * Port core files + ESM (because ESM code is odd) * Search-replace in modelling code * Fix up transfo_xl as well * Fix other core files + tests (still need to add correct import to tests) * Fix cookiecutter * make fixup, fix imports in some more core files * Auto-add imports to tests * Cleanup, add imports to sagemaker tests * Use correct exception for importing tf_keras * Fixes in modeling_tf_utils * make fixup * Correct version parsing code * Ensure the pipeline tests correctly revert to float32 after each test * Ensure the pipeline tests correctly revert to float32 after each test * More tf.keras -> keras * Add dtype cast * Better imports of tf_keras * Add a cast for tf.assign, just in case * Fix callback imports	2024-01-30 17:26:36 +00:00
Klaus Hipp	39fa400969	Fix input data file extension in examples (#28741 )	2024-01-29 10:06:31 +00:00
Joao Gante	7e93ce40c5	Fix `input_embeds` docstring in encoder-decoder architectures (#28168 )	2023-12-21 11:01:54 +00:00
Matt	050e0b44f6	Proper build() methods for TF (#27794 ) * Add a convenience method for building in your own name scope * Second attempt at auto layer building * Revert "Second attempt at auto layer building" This reverts commit e03a3aaecf9ec41a805582b83cbdfe3290a631be. * Attempt #3 * Revert "Attempt #3" This reverts commit b9df7a0857560d29b5abbed6127d9e9eca77cf47. * Add missing attributes that we're going to need later * Add some attributes we're going to need later * A fourth attempt! Feel the power flow through you! * Revert "A fourth attempt! Feel the power flow through you!" This reverts commit 6bf4aaf3875d6f28485f50187617a4c616c8aff7. * Add more values we'll need later * TF refactor that we'll need later * Revert "TF refactor that we'll need later" This reverts commit ca07202fb5b7b7436b893baa8d688b4f348ea7b9. * Revert "Revert "TF refactor that we'll need later"" This reverts commit 1beb0f39f293ed9c27594575e1c849aadeb15c13. * make fixup * Attempt five! * Revert "Attempt five!" This reverts commit 3302207958dfd0374b0447a51c06eea51a506044. * Attempt six - this time don't add empty methods * Revert "Attempt six - this time don't add empty methods" This reverts commit 67d60129be75416b6beb8f47c7d38d77b18d79bb. * Attempt seven - better base model class detection! * Revert "Attempt seven - better base model class detection!" This reverts commit 5f14845e92ea0e87c598da933bfbfee10f553bc9. * Another attribute we'll need later * Try again with the missing attribute! * Revert "Try again with the missing attribute!" This reverts commit 760c6f30c5dffb3e04b0e73c34a77d1882a0fef7. * This is the attempt that will pierce the heavens! * Revert "This is the attempt that will pierce the heavens!" This reverts commit c868bb657de057aca7a5260350a3f831fc4dfee6. * Attempt seven - snag list is steadily decreasing * Revert "Attempt seven - snag list is steadily decreasing" This reverts commit 46fbd975deda64429bfb3e5fac4fc0370c00d316. * Attempt eight - will an empty snag list do it? * Revert "Attempt eight - will an empty snag list do it?" This reverts commit 7c8a3c2b083253649569e9877e02054ae5cec67b. * Fixes to Hubert issues that cause problems later * Trying again with Conv1D/SeparableConv fixes * Revert "Trying again with Conv1D/SeparableConv fixes" This reverts commit 55092bca952bc0f750aa1ffe246a640bf1e2036e. * Apply the build shape fixes to Wav2Vec2 as well * One more attempt! * Revert "One more attempt!" This reverts commit 5ac3e4cb01b9458cc93312873725f9444ae7261c. * Another attempt! * Revert "Another attempt!" This reverts commit ea16d890e019d7de8792a3b8e72f3b1c02adae50. * Let's see how many failures we get without the internal build method * Fix OpenAI * Fix MobileBERT * (Mostly) fix GroupVIT * Fix BLIP * One more BLIP fix * One more BLIP fix! * Fix Regnet * Finally fully fix GroupViT * Fix Data2Vec and add the new AdaptivePool * Fix Segformer * Fix Albert * Fix Deberta/DebertaV2 * Fix XLM * Actually fix XLM * Fix Flaubert * Fix lxmert * Fix Resnet * Fix ConvBERT * Fix ESM * Fix Convnext / ConvnextV2 * Fix SAM * Fix Efficientformer * Fix LayoutLMv3 * Fix speech_to_text * Fix mpnet and mobilevit * Fix Swin * Fix CTRL * Fix CVT * Fix DPR * Fix Wav2Vec2 * Fix T5 * Fix Hubert * Fix GPT2 * Fix Whisper * Fix DeiT * Fix the encoder-decoder / dual-encoder classes * make fix-copies * build in name scope * Fix summarization test * Fix tied weight names for BART + Blenderbot * Fix tied weight name building * Fix to TFESM weight building * Update TF SAM * Expand all the shapes out into Big Boy Shapes	2023-12-14 15:17:30 +00:00
V.Prasanna kumar	ffbcfc0166	Broken links fixed related to datasets docs (#27569 ) fixed the broken links belogs to dataset library of transformers	2023-11-17 13:44:09 -08:00
Arthur	651408a077	[`Styling`] stylify using ruff (#27144 ) * try to stylify using ruff * might need to remove these changes? * use ruf format andruff check * use isinstance instead of type comparision * use # fmt: skip * use # fmt: skip * nits * soem styling changes * update ci job * nits isinstance * more files update * nits * more nits * small nits * check and format * revert wrong changes * actually use formatter instead of checker * nits * well docbuilder is overwriting this commit * revert notebook changes * try to nuke docbuilder * style * fix feature exrtaction test * remve `indent-width = 4` * fixup * more nits * update the ruff version that we use * style * nuke docbuilder styling * leve the print for detected changes * nits * Remove file I/O Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com> * style * nits * revert notebook changes * Add # fmt skip when possible * Add # fmt skip when possible * Fix * More ` # fmt: skip` usage * More ` # fmt: skip` usage * More ` # fmt: skip` usage * NIts * more fixes * fix tapas * Another way to skip * Recommended way * Fix two more fiels * Remove asynch Remove asynch --------- Co-authored-by: charliermarsh <charlie.r.marsh@gmail.com>	2023-11-16 17:43:19 +01:00
Maria Khalusova	9beb2737d7	[docs] fixed links with 404 (#27327 ) * fixed links with 404 * make style	2023-11-06 19:45:03 +00:00
Patrick von Platen	ac5893756b	[Attention Mask] Refactor all encoder-decoder attention mask (#27086 ) * [FA2 Bart] Add FA2 to all Bart-like * better * Refactor attention mask * remove all customized atteniton logic * format * mass rename * replace _expand_mask * replace _expand_mask * mass rename * add pt files * mass replace & rename * mass replace & rename * mass replace & rename * mass replace & rename * Update src/transformers/models/idefics/modeling_idefics.py * fix more * clean more * fix more * make style * fix again * finish * finish * finish * finish * finish * finish * finish * finish * finish * finish * Apply suggestions from code review * Apply suggestions from code review Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> * small fix mistral * finish * finish * finish * finish --------- Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-10-27 16:42:01 +02:00
Younes Belkada	ffff9e70ab	[`core`/ `gradient_checkpointing`] Refactor GC - part 2 (#27073 ) * fix * more fixes * fix other models * fix long t5 * use `gradient_checkpointing_func` instead * fix copies * set `gradient_checkpointing_func` as a private attribute and retrieve previous behaviour * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * replace it with `is_gradient_checkpointing_set` * remove default * Update src/transformers/modeling_utils.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fixup --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>	2023-10-27 16:15:22 +02:00
Younes Belkada	06e782da4e	[`core`] Refactor of `gradient_checkpointing` (#27020 ) * v1 * fix * remove `create_custom_forward` * fixup * fixup * add test and fix all failing GC tests * remove all remaining `create_custom_forward` methods * fix idefics bug * fixup * replace with `__call__` * add comment * quality	2023-10-25 12:16:15 +02:00
Tom Aarsen	40ea9ab2a1	Add many missing spaces in adjacent strings (#26751 ) Add missing spaces in adjacent strings	2023-10-12 10:28:40 +02:00
Roy Hvaara	fc63914399	[JAX] Replace uses of `jnp.array` in types with `jnp.ndarray`. (#26703 ) `jnp.array` is a function, not a type: https://jax.readthedocs.io/en/latest/_autosummary/jax.numpy.array.html so it never makes sense to use `jnp.array` in a type annotation. Presumably the intent was to write `jnp.ndarray` aka `jax.Array`. Co-authored-by: Peter Hawkins <phawkins@google.com>	2023-10-10 21:35:16 +02:00
Dong-Yong Lee	8881f38a4f	Fix beam search when using model parallel (#24969 ) * Fix GPTNeoX beam search when using parallelize * Fix beam search idx device when using model parallel * remove onnx related stuff Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * fix: move test_beam_search_on_multi_gpu to GenerationTesterMixin * fix: add right item to _no_split_modules of MegaPreTrainedModel * fix: add num_beams within parallelized beam_search test Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com> --------- Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>	2023-09-14 11:00:52 -04:00
Yih-Dar	5744482abc	Fix `token` in example template (#25351 ) fix Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-08-08 12:00:31 +02:00
JB (Don)	5ea2595ecd	Add warning for missing attention mask when pad tokens are detected (#25345 ) * Add attention mask and pad token warning to many of the models * Remove changes under examples/research_projects These files are not maintained by HG. * Skip the warning check during torch.fx or JIT tracing * Switch ordering for the warning and input shape assignment This ordering is a little cleaner for some of the cases. * Add missing line break in one of the files	2023-08-08 10:49:21 +02:00
Jackmin801	145109382a	Allow `trust_remote_code` in example scripts (#25248 ) * pytorch examples * pytorch mim no trainer * cookiecutter * flax examples * missed line in pytorch run_glue * tensorflow examples * tensorflow run_clip * tensorflow run_mlm * tensorflow run_ner * tensorflow run_clm * pytorch example from_configs * pytorch no trainer examples * Revert "tensorflow run_clip" This reverts commit `261f86ac1f`. * fix: duplicated argument	2023-08-07 16:32:25 +02:00
mariecwhite	a6e6b1c622	Remove jnp.DeviceArray since it is deprecated. (#24875 ) * Remove jnp.DeviceArray since it is deprecated. * Replace all instances of jnp.DeviceArray with jax.Array * Update src/transformers/models/bert/modeling_flax_bert.py --------- Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>	2023-08-04 18:36:57 +01:00
Yih-Dar	224da5df69	update `use_auth_token` -> `token` (#25083 ) * update --------- Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2023-07-26 15:09:59 +02:00
Sylvain Gugger	87fba947a5	Move template doc file to md (#25004 )	2023-07-21 16:49:44 -04:00
Liyang90	1f6f32c243	Removing unnecessary `device=device` in modeling_llama.py (#24696 ) * Update modeling_llama.py Removing unnecessary `device=device` * fix in all occurrences of _make_causal_mask	2023-07-13 10:30:22 +01:00
Joao Gante	a074a5d34d	Docs: change some `input_ids` doc reference from `BertTokenizer` to `AutoTokenizer` (#24730 )	2023-07-10 17:57:26 +01:00
MS Kim(tony9402)	232c898f9f	Fix annotations (#24582 ) * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations * fix annotations	2023-06-29 14:17:35 -04:00
Bowen Bao	a28325e25e	Replace python random with torch.rand to enable dynamo.export (#24434 ) * Replace python random with torch.rand to enable dynamo.export * revert changes to flax model code * Remove unused random import * Fix torch template * Move torch.manual_seed(0) to right location	2023-06-23 08:17:21 -04:00
Younes Belkada	3ce3385c47	Revert "Fix gradient checkpointing + fp16 autocast for most models" (#24420 ) Revert "Fix gradient checkpointing + fp16 autocast for most models (#24247)" This reverts commit `285a48011d`.	2023-06-22 16:11:27 +02:00
Younes Belkada	285a48011d	Fix gradient checkpointing + fp16 autocast for most models (#24247 ) * fix gc bug * continue PoC on OPT * fixes * 🤯 * fix tests * remove pytest.mark * fixup * forward contrib credits from discussions * forward contrib credits from discussions * reverting changes on untouched files. --------- Co-authored-by: zhaoqf123 <zhaoqf123@users.noreply.github.com> Co-authored-by: 7eu7d7 <7eu7d7@users.noreply.github.com>	2023-06-21 17:04:59 +02:00
Sylvain Gugger	eb849f6604	Migrate doc files to Markdown. (#24376 ) * Rename index.mdx to index.md * With saved modifs * Address review comment * Treat all files * .mdx -> .md * Remove special char * Update utils/tests_fetcher.py Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr> --------- Co-authored-by: Lysandre Debut <lysandre.debut@reseau.eseo.fr>	2023-06-20 18:07:47 -04:00
Matt	3403712958	Big TF test cleanup (#24282 ) * Fix one BLIP arg not being optional, remove misspelled arg * Remove the lxmert test overrides and just use the base test_saved_model_creation * saved_model_creation fixes and re-enabling tests across the board * Remove unnecessary skip * Stop caching sinusoidal embeddings in speech_to_text * Fix transfo_xl compilation * Fix transfo_xl compilation * Fix the conditionals in xglm * Set the save spec only when building * Clarify comment * Move comment correctly * Correct embeddings generation for speech2text * Mark RAG generation tests as @slow * Remove redundant else: * Add comment to clarify the save_spec line in build() * Fix size tests for XGLM at last! * make fixup * Remove one band_part operation * Mark test_keras_fit as @slow	2023-06-16 15:40:49 +01:00
Matt	3bd1fe4315	Stop storing references to bound methods via tf.function (#24146 ) * Stop storing references to bound methods in tf.functions * Remove the gc.collect calls now that we resolved the underlying problem * Remove the default signature from model.serving entirely, big cleanup * Remove _prune_signature as self.input_signature can prune itself * Restore serving docstring * Update int support test to check the input signature * Make sure other tests also use model.input_signature and not serving.input_signature * Restore _prune_signature * Remove the doctest GC now it's no longer needed * Correct core tests to use the pruned sig * order lines correctly in core tests * Add eager_serving back with a deprecation warning	2023-06-13 19:04:22 +01:00
Joao Gante	7bb6933b9d	TF: standardize `test_model_common_attributes` for language models (#23457 )	2023-06-13 17:51:37 +01:00
Matt	e45e756d22	Remove the last few TF serving sigs (#23738 ) Remove some more serving methods that (I think?) turned up while this PR was open	2023-05-24 21:19:44 +01:00
Matt	814de8fac7	Overhaul TF serving signatures + dummy inputs (#23234 ) * Let's try autodetecting serving sigs * Don't clobber existing sigs * Change shapes for multiplechoice models * Make default dummy inputs smarter too * Fix missing f-string * Let's YOLO a serving output too * Read __class__.__name__ properly * Don't just pass naked lists in there and expect it to be okay * Code cleanup * Update default serving sig * Clearer error messages * Further updates to the default serving output * make fixup * Update the serving output a bit more * Cleanups and renames, raise errors appropriately when we can't infer inputs * More renames * we're building in a functional context again, yolo * import DUMMY_INPUTS from the right place * import DUMMY_INPUTS from the right place * Support cross-attention in the dummies * Support cross-attention in the dummies * Complete removal of dummy/serving overrides in BERT * Complete removal of dummy/serving overrides in RoBERTa * Obliterate lots and lots of serving sig and dummy overrides * merge type hint changes * Fix for token_type_ids with vocab_size 1 * Add missing property decorator * Fix T5 and hopefully some models that take conv inputs * More signature pruning * Fix T5's signature * Fix Wav2Vec2 signature * Fix LongformerForMultipleChoice input signature * Fix BLIP and LED * Better default serving output error handling * Fix BART dummies * Fix dummies for cross-attention, esp encoder-decoder models * Fix visionencoderdecoder signature * Fix BLIP serving output * Small tweak to BART dummies * Cleanup the ugly parameter inspection line that I used in a few places * committed a breakpoint again * Move the text_dims check * Remove blip_text serving_output * Add decoder_input_ids to the default input sig * Remove all the manual overrides for encoder-decoder model signatures * Tweak longformer/led input sigs * Tweak default serving output * output.keys() -> output * make fixup	2023-05-24 17:03:24 +01:00
Matt	f8b2574416	Better TF docstring types (#23477 ) * Rework TF type hints to use \| None instead of Optional[] for tf.Tensor * Rework TF type hints to use \| None instead of Optional[] for tf.Tensor * Don't forget the imports * Add the imports to tests too * make fixup * Refactor tests that depended on get_type_hints * Better test refactor * Fix an old hidden bug in the test_keras_fit input creation code * Fix for the Deit tests	2023-05-24 13:52:52 +01:00
zspo	003a0cf8cc	Fix some docs what layerdrop does (#23691 ) * Fix some docs what layerdrop does * Update src/transformers/models/data2vec/configuration_data2vec_audio.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fix more docs --------- Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2023-05-23 14:50:40 -04:00
Joao Gante	cf9e7cb079	TF: embeddings out of bounds check factored into function (#23427 )	2023-05-17 17:04:51 +01:00
Connor Henderson	188a8bfccc	docs: Fix broken link in 'How to add a model...' (#23216 ) fix link	2023-05-08 14:56:42 -04:00
Sylvain Gugger	28c19ab58d	Make it easier to develop without a dev install (#22697 ) * Make it easier to develop without a dev install * Remove ugly hack that doesn't work anyway	2023-04-11 08:41:53 -04:00
Joao Gante	440f39754b	Generate - update cookie cutters to not initialize cache with training and gradient checkpointing (#21759 )	2023-02-24 11:21:00 +00:00
Sylvain Gugger	68b21b37ea	Final cleanup of TOKENIZER_FOR_DOC (#21565 ) FInal cleanup of TOKENIZER_FOR_DOC	2023-02-14 09:47:32 -05:00
Sylvain Gugger	67d074874d	Cleanup quality (#21493 ) * Remove mentions of flake8/isort * Clean up inits * Deall with all other inits * Last special rule for dummy files	2023-02-07 12:27:31 -05:00
Arthur	12eb528b5a	[CI ] Remove `past` in favor of `pat_key_values` (#21443 ) * fix past renamed to past_key_value * update more `past`that were ski^êd * fixup * remove changes made to rag * refactor `_reorder_cache` to use `past_key_values` * fix git `prepare_inputs_for_generation` to pass tests when false is needed in use_cache	2023-02-07 09:51:35 +01:00
Joao Gante	cbaaa2f6ac	Flax dtype-dependent numerical masking (#21197 )	2023-01-19 16:43:42 +00:00
Arthur	e3ecbaa4ab	Patch-past-refactor (#21050 ) * small patches, forgot a line * refactor PT * the actual fix	2023-01-09 18:12:13 +01:00
Arthur	f0577df6de	Replace `past` with `past_key_values` (#20944 ) * start cleanup * more updates * more models are affected * more updates * update generation utils * style * revert change that removed reorder cachce * update generation utils * style * style * remove reorder cache	2023-01-08 10:21:40 +01:00
Joao Gante	4fd89e4978	Generate: delete unused TF `_reorder_cache` (#20964 )	2023-01-03 10:54:56 +00:00
Eli Simhayev	e35bc46af6	fix docs typos in "add_new_model" (#20900 ) fix Jupyter typos	2022-12-27 02:49:15 -05:00

1 2 3 4 5 ...

336 Commits