transformers

Arthur 0fe44059ae Add recurrent gemma (#30143)	2024-04-10 16:59:13 +02:00
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Aleksandar Botev <botev@google.com>
Co-authored-by: Leonard Berrada <lberrada@users.noreply.github.com>
Co-authored-by: anushanf <anushanf@google.com>
Co-authored-by: botev <botevmg@gmail.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
..
test_module	AutoImageProcessor (#20111)	2022-11-08 19:54:41 +00:00
tf_ops	Check TF ops for ONNX compliance (#10025)	2021-02-15 07:55:10 -05:00
add_pipeline_model_mapping_to_test.py	A script to add/update `pipeline_model_mapping` systematically (#22180)	2023-04-06 18:08:14 +02:00
check_build.py	Clean up CUDA kernels (#23455)	2023-05-18 14:14:43 -04:00
check_config_attributes.py	Add recurrent gemma (#30143)	2024-04-10 16:59:13 +02:00
check_config_docstrings.py	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
check_copies.py	Fix copies main ci (#29979)	2024-04-01 12:43:58 +02:00
check_doc_toc.py	Doc checks (#25408)	2023-08-10 10:53:22 +02:00
check_docstrings.py	Add MusicGen Melody (#28819)	2024-03-18 13:06:12 +00:00
check_doctest_list.py	Avoid many failing tests in doctesting (#27262)	2023-11-03 12:47:07 +01:00
check_dummies.py	Doc checks (#25408)	2023-08-10 10:53:22 +02:00
check_inits.py	Make using safetensors files automated. (#27571)	2023-12-01 15:51:10 +01:00
check_model_tester.py	Add a new script to check model testers' config (#22063)	2023-03-13 19:11:19 +01:00
check_repo.py	Add recurrent gemma (#30143)	2024-04-10 16:59:13 +02:00
check_self_hosted_runner.py	Tiny fix for `check_self_hosted_runner.py` (#24052)	2023-06-06 18:17:41 +02:00
check_support_list.py	Fix the check of models supporting FA/SDPA not run (#28202)	2023-12-22 12:56:11 +01:00
check_table.py	Add support for fine-tuning CLIP-like models using contrastive-image-text example (#29070)	2024-02-20 12:08:31 +00:00
check_task_guides.py	More utils doc (#25457)	2023-08-17 07:58:35 +02:00
check_tf_ops.py	Check TF ops for ONNX compliance (#10025)	2021-02-15 07:55:10 -05:00
create_dummy_models.py	Update tiny model creation script (#27674)	2023-11-28 10:05:34 +01:00
custom_init_isort.py	More utils doc (#25457)	2023-08-17 07:58:35 +02:00
download_glue_data.py	Raise exceptions instead of asserts (#13907)	2021-10-07 12:44:23 +05:30
extract_warnings.py	Make Slack CI reporting stronger (#21823)	2023-02-28 17:12:44 +01:00
get_ci_error_statistics.py	Add artifact name in job step to maintain job / artifact correspondence (#28682)	2024-01-31 15:58:17 +01:00
get_github_job_time.py	Make Slack CI reporting stronger (#21823)	2023-02-28 17:12:44 +01:00
get_modified_files.py	exclude deleted files in the fixup script (#21436)	2023-02-03 12:57:02 -05:00
get_previous_daily_ci.py	Fix a minor bug in CI slack report (#22906)	2023-04-21 20:36:35 +02:00
get_test_info.py	Add an utility file to get information from test files (#21856)	2023-03-01 17:53:29 +01:00
not_doctested.txt	Add recurrent gemma (#30143)	2024-04-10 16:59:13 +02:00
notification_service.py	Fix quantization tests (#29914)	2024-04-09 17:10:29 +02:00
notification_service_doc_tests.py	Fix slack report failing for doctest (#27042)	2023-10-30 10:48:24 +01:00
notification_service_quantization.py	Fix quantization tests (#29914)	2024-04-09 17:10:29 +02:00
past_ci_versions.py	(Re-)Enable Nightly + Past CI (#22393)	2023-03-30 21:06:35 +02:00
print_env.py	Print more library versions in CI (#17384)	2022-06-02 10:24:16 +02:00
release.py	More utils doc (#25457)	2023-08-17 07:58:35 +02:00
slow_documentation_tests.txt	Add MusicGen Melody (#28819)	2024-03-18 13:06:12 +00:00
sort_auto_mappings.py	More utils doc (#25457)	2023-08-17 07:58:35 +02:00
split_model_tests.py	Split daily CI using 2 level matrix (#28773)	2024-01-31 18:04:43 +01:00
tests_fetcher.py	[test fetcher] Always include the directly related test files (#30050)	2024-04-05 14:30:36 +02:00
update_metadata.py	Add feature extraction mapping for automatic metadata update (#28944)	2024-02-26 10:35:37 +00:00
update_tiny_models.py	Update tiny model summary file for recent models (#22637)	2023-04-06 22:52:59 +02:00