transformers

Alazar 94306352f4 Port IDEFICS to tensorflow (#26870 )	2024-05-13 15:59:46 +01:00

* Initial commit
* Just a copy of modeling_idefics.py that will be ported to TF
* - Prepend TF to the name of all classes
  - Convert pytorch ops to TF (not all operations are converted yet)
* Add TF imports
* Add autotranslated files
* Add TF classes to model_tf_auto.py
* Add the TF classes in model_doc
* include auto-translated code
* Adopted from auto-translated version
* Add a forgotten super().build
* Add test code for TF version.
* Fix indentation and load pytorch weights for now
* Some fixes. Many tests are still failing but some are passing now.
  - I have added TODO's for some of the hacks I made to unblock me and I will address them soon
  - I have the processing_idefics.py hacked in my view to support TF temporarily
* Add ALL_LAYERNORM_LAYERS to match pytorch
* Revert "Add ALL_LAYERNORM_LAYERS to match pytorch"
  This reverts commit 7e0a35119b4d7a6284d04d8c543fba1b29e573c9 as it is not needed in the tf implementation.
* Fix freeze_relevant_params()
* Some more fixes
* Fix test_attention_outputs
* Add tf stuff to processing_idefics.py
  processing_idefics.py supports both pytorch and tf now. test_processor_idefics.py for pytorch is passing, so i didn't break anything but still some issues with tf. I also need to add tf tests in test_processor_idefics.py.
* Pass return_tensors to image processing code and fix test
* Pass return_tensors to the image processor __init__
* Fix several test cases
  - Make input to some of the forward pass of type `TFModelInputType`
  - Decorate main layer forward pass with `@unpack_inputs`
  - Decorate main layer with `@keras_serializable`
  - Pass `inputs` to TFIdeficsModel
* Some more fixes forgotten in last commit
* Fix processing code and vision_tf.py
* Fix perceiver bug
* Import from
* Auto-add build() methods + style pass
* Fix build() errors due to `None` being passed as shape to some layers
* Change name in TFIdeficsForVisionText2Text to attribute in IdeficsForVisionText2Text
* Fix pytorch weights load for tf2
  There were a lot of `name=` missing in weight initialization code.
* Attempt to fix CI
* Add back accidentally removed line
* Remove torch-specific stuff from the TF test file
* make fix-copies, make style, remove autotranslated files
* Fixes to imports/docstrings
* Let's try the from future import in desperation
* Fix the core random_attention_mask fn to match the torch/flax behaviour
* Clean random_attention_mask up correctly
* Remove torch-only test
* Fix loss shape, couple of nits
* make style
* Don't test for OOB embeddings because IDEFICS uses those deliberately
* Fix loss computation to handle masking
* Fix test failures when flattening
* Fix some test failures
  - Add cross attention gate which was missing and wasn't being passed around
  - Fix overwriting of image_attention_mask due to hack I had for dummy inputs
* Add a proper stateless scaled_dot_product_attention
* make style
* Adding missing attribute from the PyTorch version
* Small cleanups to decoupledlinearlayer in case that helps
* Pass epsilon to LayerNormalization
* Attempt to fix pytorch weight cross-loading for TFIdeficsEmbedding
* Fix a bug in TFIdeficsGatedCrossAttentionLayer
* Patching up build() methods
* Constant self.inv_freq
* Constant self.inv_freq
* First working version
  The TF implementation works now, there was a bug in the TFIdeficsDecoupledLinear where the weights were mis-initialized (in_features,out_features) when it should be: (out_features, in_features)
  I have tested this so far with tiny-random and idefics-9b-instruct and gives correct output. I also dumped the final outputs for both pytorch and TF and they are identical.
* Fix some test failures
* remove print statement
* Fix return_tensors
* Fix CI test failure check_code_quality
* Attempt to fix CI failures by running `make fixup`
  The hardcoded IDs in test_modeling_tf_idefics.py are for the integration test and makes that file unreadable and should probably be moved to a separate file.
* Attempt to fix tests_pr_documentation_tests
* Fix a test failure in test_image_processing_idefics.py
* Fix test test_pt_tf_model_equivalence
* Fix a few failures
* Tiny fix
* Some minor fixes
* Remove a duplicate test
* Override a few test failures for IDEFICS
  - `test_keras_save_load` is passing now
  - `test_compile_tf_model` is still failing
* Fix processing_idefics.py after rebase
* Guard import keras with is_tf_available
* fix check code quality
* fix check code quality
* Minor fixes
* Skip test_save_load temporarily
  This test passed on my local box but fails on the CI, skipping for now to see if there are other remaining failures on the CI.
* Run `ruff format tests src utils`
* Fix last failing test, `test_compile_tf_model`
* Add fixes for vision_tf.py
  I forgot to add this file in last commit.
* Minor fixes
* Replace "<<<" with "<<" for doc tests
  IDEFICS-9B is too big for doctest runner, so don't run it there
* Make code more readable
* Fix bug after code review
  I added a layer_norm_eps to IdeficsConfig but I don't even need it since the vision config has a layer_norm_eps.
* Fix after code review
  Use original code tokenizer.convert_tokens_to_ids
* Keep PyTorch as the default return_tensors
* Fixes to modeling_tf after code review
* Fixes from code review
  - Remove all references of `TF_IDEFICS_PRETRAINED_MODEL_ARCHIVE_LIST`
  - Pass 1e-5 to LayerNormalization in perceiver
* Run ruff
* Undo a change
* Refactor processing code after Matt's suggestion
* Remove TODO's that aren't needed anymore
* For pytorch, Use original pytorch processing code from main
  Since this PR is a TF port it shouldn't make any modifications to pytorch IDEFICS code. This changes undo's the pytorch processing modifications I made and uses original code from main.
* Update tests/models/idefics/test_modeling_idefics.py
* Update tests/models/idefics/test_modeling_tf_idefics.py
* Add missing imports for is_pt_tf_cross_test
* [DO NOT MERGE]: This is a commit for debugging and will be reverted
  The cross test `test_pt_tf_model_equivalence` passes locally but fails when running on the CI. This commit is to help debug that and will be reverted.
* Revert "[DO NOT MERGE]: This is a commit for debugging and will be reverted"
  This reverts commit 8f0d709ec5bd46685fb0b4259d914ffee794875b.
* [DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted
* [DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted
* Revert "[DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted"
  This reverts commit 998cc38b8c3d313bf5e5eb55a7f5b7b881897b89.
* Revert "[DO NOT MERGE]: This commit is for debugging a CI failure and will be reverted"
  This reverts commit 1c695ac4219c4ae4d39b330b01744dc27deb7dd4.
* Don't skip test_save_load
  IIRC test_save_load was also failing on the CI but not on my local box, it might be easier to debug that on the CI first than the cross tests
* Debugging commit, will be reverted
* Revert "Debugging commit, will be reverted"
  This reverts commit 8eafc8e41e20c4e95a3a90834f06a6e9f445e2d5.
* Override `test_save_load` and push model to save
  Maybe this will help me repro this weird bug
* pass my repo_id
* add endpoint
* Pass a temp (write) token just for this CI
* Undo last few commits, still pushing to hub for model debugging
  The issue seems to be with save_pretrained(), when I looked at the model saved from the CI test failure it is basically empty and has no weights. `self.save_weights(..)` seems to be failing in save_pretrained but needs more debugging
* Add logging to modeling tf utils, will be reverted just for debugging
* Debugging, will revert
* Revert "Debugging, will revert"
  This reverts commit 9d0d3075fb7c82d8cde3a5c76bc8f3876c5c55d3.
* Revert "Add logging to modeling tf utils, will be reverted just for debugging"
  This reverts commit 774b6b7b1c17b3ce5d7634ade768f2f686cee617.
* Remove `test_save_load`
  The CI failures are gone after my latest rebase, no idea why but I was still saving the model to my hub on HF and the tf_model.h5 file now has everything.
* Run make fix-copies
* Run ruff format tests src utils
* Debugging commit, will be reverted
* Run ruff, also trigger CI run
* Run ruff again
* Undo debugging commit

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

..
albert	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
align	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
altclip	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
audio_spectrogram_transformer	fix: Replace deprecated `assertEquals` with `assertEqual` (#30241 )	2024-04-15 09:36:06 +01:00
auto	Fix auto tests (#30067 )	2024-04-05 17:49:46 +02:00
autoformer	Add tests for batching support (#29297 )	2024-03-12 17:46:19 +00:00
bark	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
bart	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
barthez	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
bartpho	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
beit	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
bert	[`BERT`] Add support for sdpa (#28802 )	2024-04-26 16:23:44 +01:00
bert_generation	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
bert_japanese	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
bertweet	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
big_bird	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
bigbird_pegasus	Do not remove half seq length in generation tests (#30016 )	2024-04-19 17:32:52 +01:00
biogpt	[`generate`] fix breaking change for patch (#29976 )	2024-04-02 09:51:45 +02:00
bit	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
blenderbot	Generate: left-padding test, revisited (#29515 )	2024-03-08 10:06:46 +00:00
blenderbot_small	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
blip	Blip dynamic input resolution (#30722 )	2024-05-13 12:20:16 +01:00
blip_2	Blip dynamic input resolution (#30722 )	2024-05-13 12:20:16 +01:00
bloom	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
bridgetower	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
bros	[tests] add the missing `require_torch_multi_gpu` flag (#30250 )	2024-04-15 16:30:52 +01:00
byt5	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
camembert	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
canine	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
chinese_clip	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
clap	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
clip	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
clipseg	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
clvp	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
code_llama	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
codegen	Add token type ids to CodeGenTokenizer (#29265 )	2024-04-17 12:19:18 +02:00
cohere	Cache: models return input cache type (#30716 )	2024-05-08 18:26:34 +01:00
conditional_detr	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
convbert	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
convnext	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
convnextv2	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
cpm	Fix PipelineTests skip conditions (#22320 )	2023-03-22 20:02:24 +01:00
cpmant	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
ctrl	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
cvt	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
data2vec	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
dbrx	Cache: models return input cache type (#30716 )	2024-05-08 18:26:34 +01:00
deberta	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
deberta_v2	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
decision_transformer	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
deformable_detr	Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True (#29024 )	2024-05-07 11:12:21 +02:00
deit	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
depth_anything	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
deta	Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True (#29024 )	2024-05-07 11:12:21 +02:00
detr	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
dinat	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
dinov2	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
distilbert	Fix FA2 tests (#29909 )	2024-04-01 07:51:00 +00:00
dit	Update old existing feature extractor references (#24552 )	2023-06-29 10:17:36 +01:00
donut	Removal of deprecated maps (#30576 )	2024-05-09 14:15:56 +02:00
dpr	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
dpt	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
efficientformer	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
efficientnet	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
electra	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
encodec	Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True (#29024 )	2024-05-07 11:12:21 +02:00
encoder_decoder	Generate: missing generation config eos token setting in encoder-decoder tests (#29146 )	2024-02-20 16:17:51 +00:00
ernie	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
ernie_m	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
esm	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
falcon	Support for Falcon2-11B (#30771 )	2024-05-13 13:32:43 +02:00
fastspeech2_conformer	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
flaubert	Fix 29807 sinusoidal positional encodings in Flaubert, Informer and XLM (#29904 )	2024-04-02 10:27:26 +02:00
flava	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
fnet	Tests: replace `torch.testing.assert_allclose` by `torch.testing.assert_close` (#29915 )	2024-03-28 09:53:31 +00:00
focalnet	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
fsmt	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
funnel	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
fuyu	Update tiny model summary file (#27388 )	2023-11-23 21:00:39 +01:00
gemma	Cache: models return input cache type (#30716 )	2024-05-08 18:26:34 +01:00
git	fix: Replace deprecated `assertEquals` with `assertEqual` (#30241 )	2024-04-15 09:36:06 +01:00
glpn	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
gpt2	Adding Flash Attention 2 Support for GPT2 (#29226 )	2024-03-28 09:31:24 +00:00
gpt_bigcode	CI with `num_hidden_layers=2` 🚀🚀🚀 (#25266 )	2023-08-02 20:22:36 +02:00
gpt_neo	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
gpt_neox	RoPE models: add numerical sanity-check test for RoPE scaling (#29808 )	2024-03-28 11:25:50 +00:00
gpt_neox_japanese	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
gpt_sw3	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
gptj	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
gptsan_japanese	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
graphormer	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
grounding_dino	[Grounding DINO] Add support for cross-attention in GroundingDinoMultiHeadAttention (#30364 )	2024-04-23 09:56:14 +01:00
groupvit	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
herbert	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
hubert	Fix failing tests on `main` due to torch 2.1 (#26607 )	2023-10-05 10:27:05 +02:00
ibert	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
idefics	Port IDEFICS to tensorflow (#26870 )	2024-05-13 15:59:46 +01:00
idefics2	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
imagegpt	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
informer	Fix 29807 sinusoidal positional encodings in Flaubert, Informer and XLM (#29904 )	2024-04-02 10:27:26 +02:00
instructblip	Blip dynamic input resolution (#30722 )	2024-05-13 12:20:16 +01:00
jamba	Jamba: fix left-padding test (#30389 )	2024-04-22 17:02:55 +01:00
jukebox	Tests: replace `torch.testing.assert_allclose` by `torch.testing.assert_close` (#29915 )	2024-03-28 09:53:31 +00:00
kosmos2	Remove `use_square_size` after loading (#30567 )	2024-04-30 21:11:37 +02:00
layoutlm	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
layoutlmv2	[`CI update`] Try to use dockers and no cache (#29202 )	2024-05-06 10:10:32 +02:00
layoutlmv3	[`CI update`] Try to use dockers and no cache (#29202 )	2024-05-06 10:10:32 +02:00
layoutxlm	Add correct batched handling for apply_chat_template (#29222 )	2024-03-20 15:50:22 +00:00
led	Do not remove half seq length in generation tests (#30016 )	2024-04-19 17:32:52 +01:00
levit	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
lilt	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
llama	Llama: fix custom 4D masks, v2 (#30348 )	2024-05-13 13:46:06 +02:00
llava	Fix llava half precision and autocast issues (#29721 )	2024-05-01 17:49:44 +01:00
llava_next	Fix llava half precision and autocast issues (#29721 )	2024-05-01 17:49:44 +01:00
longformer	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
longt5	Do not remove half seq length in generation tests (#30016 )	2024-04-19 17:32:52 +01:00
luke	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
lxmert	Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True (#29024 )	2024-05-07 11:12:21 +02:00
m2m_100	Add Flash Attention 2 to M2M100 model (#30256 )	2024-04-18 10:27:58 +02:00
mamba	Mamba `slow_forward` gradient fix (#29563 )	2024-03-27 04:52:12 +01:00
marian	Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True (#29024 )	2024-05-07 11:12:21 +02:00
markuplm	Add correct batched handling for apply_chat_template (#29222 )	2024-03-20 15:50:22 +00:00
mask2former	fix: Replace deprecated `assertEquals` with `assertEqual` (#30241 )	2024-04-15 09:36:06 +01:00
maskformer	fix: Replace deprecated `assertEquals` with `assertEqual` (#30241 )	2024-04-15 09:36:06 +01:00
mbart	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
mbart50	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
mega	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
megatron_bert	CI with `num_hidden_layers=2` 🚀🚀🚀 (#25266 )	2023-08-02 20:22:36 +02:00
megatron_gpt2	Move test model folders (#17034 )	2022-05-03 14:42:02 +02:00
mgp_str	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
mistral	Llama: fix custom 4D masks, v2 (#30348 )	2024-05-13 13:46:06 +02:00
mixtral	Enable fx tracing for Mistral (#30209 )	2024-04-17 14:38:48 +05:00
mluke	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
mobilebert	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
mobilenet_v1	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
mobilenet_v2	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
mobilevit	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
mobilevitv2	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
mpnet	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
mpt	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
mra	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
mt5	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
musicgen	Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True (#29024 )	2024-05-07 11:12:21 +02:00
musicgen_melody	Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True (#29024 )	2024-05-07 11:12:21 +02:00
mvp	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
nat	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
nezha	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
nllb	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
nllb_moe	Tests: replace `torch.testing.assert_allclose` by `torch.testing.assert_close` (#29915 )	2024-03-28 09:53:31 +00:00
nougat	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
nystromformer	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
olmo	Cache: models return input cache type (#30716 )	2024-05-08 18:26:34 +01:00
oneformer	Fix OneFormer `post_process_instance_segmentation` for panoptic tasks (#29304 )	2024-03-04 11:04:49 +00:00
openai	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
opt	Proper build() methods for TF (#27794 )	2023-12-14 15:17:30 +00:00
owlv2	Fix image post-processing for OWLv2 (#30686 )	2024-05-09 17:02:03 +01:00
owlvit	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
patchtsmixer	PatchtTST and PatchTSMixer fixes (#28083 )	2024-01-29 10:09:26 +00:00
patchtst	PatchtTST and PatchTSMixer fixes (#28083 )	2024-01-29 10:09:26 +00:00
pegasus	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
pegasus_x	device agnostic models testing (#27146 )	2023-10-31 18:12:14 +01:00
perceiver	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
persimmon	[tests] add `require_bitsandbytes` marker (#30116 )	2024-04-08 12:49:31 +01:00
phi	RoPE models: add numerical sanity-check test for RoPE scaling (#29808 )	2024-03-28 11:25:50 +00:00
phi3	Phi-3 (#30423 )	2024-04-24 17:32:09 +02:00
phobert	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
pix2struct	BLIP - fix pt-tf equivalence test (#30258 )	2024-04-16 17:46:53 +01:00
plbart	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
poolformer	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
pop2piano	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
prophetnet	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
pvt	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
pvt_v2	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
qdqbert	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
qwen2	Enable fx tracing for Mistral (#30209 )	2024-04-17 14:38:48 +05:00
qwen2_moe	Enable fx tracing for Mistral (#30209 )	2024-04-17 14:38:48 +05:00
rag	Add `dataset_revision` argument to `RagConfig` (#29610 )	2024-03-14 16:48:11 +01:00
realm	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
recurrent_gemma	Cache: models return input cache type (#30716 )	2024-05-08 18:26:34 +01:00
reformer	Do not remove half seq length in generation tests (#30016 )	2024-04-19 17:32:52 +01:00
regnet	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
rembert	fix: Replace deprecated `assertEquals` with `assertEqual` (#30241 )	2024-04-15 09:36:06 +01:00
resnet	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
roberta	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
roberta_prelayernorm	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
roc_bert	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
roformer	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
rwkv	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
sam	Tests: replace `torch.testing.assert_allclose` by `torch.testing.assert_close` (#29915 )	2024-03-28 09:53:31 +00:00
seamless_m4t	Generate: consistently handle special tokens as tensors (#30624 )	2024-05-09 18:01:57 +01:00
seamless_m4t_v2	Generate: consistently handle special tokens as tensors (#30624 )	2024-05-09 18:01:57 +01:00
segformer	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
seggpt	[SegGPT] Fix seggpt image processor (#29550 )	2024-04-26 19:40:12 +01:00
sew	Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True (#29024 )	2024-05-07 11:12:21 +02:00
sew_d	Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True (#29024 )	2024-05-07 11:12:21 +02:00
siglip	Add dynamic resolution input/interpolate position embedding to SigLIP (#30719 )	2024-05-09 11:10:38 +01:00
speech_encoder_decoder	Generate: missing generation config eos token setting in encoder-decoder tests (#29146 )	2024-02-20 16:17:51 +00:00
speech_to_text	Generate: consistently handle special tokens as tensors (#30624 )	2024-05-09 18:01:57 +01:00
speech_to_text_2	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
speecht5	fix: Replace deprecated `assertEquals` with `assertEqual` (#30241 )	2024-04-15 09:36:06 +01:00
splinter	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
squeezebert	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
stablelm	[`StableLm`] Add QK normalization and Parallel Residual Support (#29745 )	2024-04-08 23:51:58 +02:00
starcoder2	Fix FA2 tests (#29909 )	2024-04-01 07:51:00 +00:00
superpoint	Removal of deprecated maps (#30576 )	2024-05-09 14:15:56 +02:00
swiftformer	Removal of deprecated maps (#30576 )	2024-05-09 14:15:56 +02:00
swin	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
swin2sr	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
swinv2	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
switch_transformers	Tests: replace `torch.testing.assert_allclose` by `torch.testing.assert_close` (#29915 )	2024-03-28 09:53:31 +00:00
t5	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
table_transformer	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
tapas	[`CI update`] Try to use dockers and no cache (#29202 )	2024-05-06 10:10:32 +02:00
time_series_transformer	Add tests for batching support (#29297 )	2024-03-12 17:46:19 +00:00
timesformer	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
timm_backbone	Adding _tie_weights() to prediction heads to support low_cpu_mem_usage=True (#29024 )	2024-05-07 11:12:21 +02:00
trocr	CI with `num_hidden_layers=2` 🚀🚀🚀 (#25266 )	2023-08-02 20:22:36 +02:00
tvlt	fix: Replace deprecated `assertEquals` with `assertEqual` (#30241 )	2024-04-15 09:36:06 +01:00
tvp	Enable instantiating model with pretrained backbone weights (#28214 )	2024-01-23 11:01:50 +00:00
udop	[UDOP] Add special tokens to tokenizer (#29594 )	2024-04-19 09:06:01 +02:00
umt5	Tests: replace `torch.testing.assert_allclose` by `torch.testing.assert_close` (#29915 )	2024-03-28 09:53:31 +00:00
unispeech	[Wav2Vec2 and Co] Update init tests for PT 2.1 (#26494 )	2023-10-03 10:52:34 +02:00
unispeech_sat	Byebye torch 1.10 (#28207 )	2024-01-11 16:18:27 +01:00
univnet	Add tests for batching support (#29297 )	2024-03-12 17:46:19 +00:00
upernet	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
videomae	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
vilt	Encoder-decoder models: move embedding scale to nn.Module (#30410 )	2024-05-01 12:33:00 +05:00
vipllava	Use text config's vocab size in testing models (#30568 )	2024-05-01 12:32:45 +05:00
vision_encoder_decoder	Fix `VisionEncoderDecoder` Positional Arg (#29497 )	2024-03-07 20:45:51 +00:00
vision_text_dual_encoder	[`Styling`] stylify using ruff (#27144 )	2023-11-16 17:43:19 +01:00
visual_bert	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
vit	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
vit_hybrid	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
vit_mae	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
vit_msn	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
vitdet	mark `test_initialization` as flaky in 2 model tests (#27906 )	2023-12-08 14:54:32 +01:00
vitmatte	🚨 Update image_processing_vitmatte.py (#30566 )	2024-05-02 11:00:07 +01:00
vits	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
vivit	Enable dynamic resolution for vivit (#30630 )	2024-05-09 11:23:39 +01:00
wav2vec2	Add sdpa and fa2 the Wav2vec2 family. (#30121 )	2024-04-22 18:30:38 +01:00
wav2vec2_bert	Add new meta w2v2-conformer BERT-like model (#28165 )	2024-01-18 13:37:34 +00:00
wav2vec2_conformer	device agnostic models testing (#27146 )	2023-10-31 18:12:14 +01:00
wav2vec2_phoneme	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
wav2vec2_with_lm	Fix some tests using `"common_voice"` (#27147 )	2023-10-30 15:27:15 +01:00
wavlm	Output `None` as attention when layer is skipped (#30597 )	2024-05-02 17:25:19 +01:00
whisper	Generate: consistently handle special tokens as tensors (#30624 )	2024-05-09 18:01:57 +01:00
x_clip	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
xglm	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
xlm	Fix 29807 sinusoidal positional encodings in Flaubert, Informer and XLM (#29904 )	2024-04-02 10:27:26 +02:00
xlm_prophetnet	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
xlm_roberta	Adds pretrained IDs directly in the tests (#29534 )	2024-03-13 14:53:27 +01:00
xlm_roberta_xl	Revert low cpu mem tie weights (#29135 )	2024-02-20 12:06:46 +00:00
xlnet	Do not remove half seq length in generation tests (#30016 )	2024-04-19 17:32:52 +01:00
xmod	Update all references to canonical models (#29001 )	2024-02-16 08:16:58 +01:00
yolos	Fix YOLOS image processor resizing (#30436 )	2024-04-24 09:50:17 +01:00
yoso	Remove static pretrained maps from the library's internals (#29112 )	2024-03-25 10:33:38 +01:00
__init__.py	Move test model folders (#17034 )	2022-05-03 14:42:02 +02:00