* Bookmark, initial implementation. Need to test
* Clean
* Working fully, woop woop
* I think working version now, testing
* Fin!
* rm cast, could keep None
* Fix typing issue
* rm typehint
* Add test
* Add tests and make more rigid
* Update push-important-models.yml
* dummy commit
* Update modeling_bark.py
* test
* test
* test
* another test
* another test
* test
* final test
* final test
* test
* another test
* test
* test
* another test
* test llama
* revert everything
* remove echo
* Add test for parse_json_file
* Change Path to PathLike
* Fix `Import block is un-sorted or un-formatted`
* revert parse_json_file
* Fix ruff format
* Add parse_json_file test
* v1
* v1
* more changes
* more models
* add more markers
* switch to A10
* use cache
* Update .github/workflows/push-important-models.yml
* Update .github/workflows/push-important-models.yml
* Update modeling_llama.py
* test
* test
* another test
* test
* test
* attempt to fix
* fix
* try automatic tagging
* fix
* alternative approach for collecting
* fix
* fix
* fix
* test
* fix
* fix
* test
* revert some changes
* fix
* fix
* fix
* final push
* fix
* revert
* test new slack message
* oops
* Update send-slack.yml
* test
* test re-usable workflow in steps
* Update action.yml
* test
* another test
* test
* another test
* test
* another test
* another test (hopefully last one)
* attempt to fix
* allez
* removing comma
* test
* another test
* attempt
* test
* test
* test push
* test
* test
* another test
* test
* make it better
* fix commas
* valid json
* test
* another test
* test
* final push
* test
* final push
* more customizable messages
* test
* push
* oops
* another test
* another test
* missing indentation
* more tweaks
* more tweaks
* another test
* another test
* tests
* final push
* use global variables instead
* Update .github/workflows/push-important-models.yml
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* commit to test all models
* issue with arrays
* another test
* attempt to fix failing tests
* Update .github/workflows/push-important-models.yml
* add ssh
* Update .github/workflows/push-important-models.yml
* test
* test
* add install curl
* attempt to fix
* final fix
* test
* test
* test
* fix test
* another test
* add inherit secrets
* push
* revert unneeded changes
* revert
* add env variables
* add pip freeze
* revert change in gemma
* Update .github/workflows/push-important-models.yml
* fix mistral and mixtral
* add pdb
* fix mixtral test
* fix
* fix mistral ?
* add fix gemma
* fix mistral
* fix
* test
* another test
* fix
* fix
* fix mistral tests
* fix them again
* final fixes for mistral
* fix padding right
* fix whisper fa2
* fix
* fix
* fix gemma
* test
* fix llama
* fix
* fix
* fix llama gemma
* add class attribute
* fix CI
* clarify whisper
* compute_capability
* rename names in some comments
* Add # fmt: skip
* make style
* Update tests/models/mistral/test_modeling_mistral.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* update
* update
* change branch
* correct workflow
* modify file
* test
* works
* final test
* another fix
* install sudo
* final fix
* add `-y`
* set to `main`
* Update .github/actions/post-slack/action.yml
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* change title
* fixup
* add upload report
* fix
* revert to main
* add empty lines + add comment
---------
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Remove auto class
* Update ImagePointDescriptionOutput
* Update model outputs
* Rename output class
* Revert "Remove auto class"
This reverts commit ed4a8f549d.
* Address comments
* Update integration_utils.py
Add the case where a tensor with one element is logged with MLflow
* Update src/transformers/integrations/integration_utils.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update integration_utils.py add a whitespace
---------
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Fork.
* RecurrentGemma initial commit.
* Updating __init__.py.
* Minor modification to how we initialize the cache.
Changing how the config specifies the architecture.
* Reformat code to 4 spaces.
Fixed a few typos.
* Fixed the forward pass.
Still unclear on the cache?
* Fixed the RecurrentGemmaForCausalLM
* Minor comment that we might not need attention_mask and output_attention arguments.
* Now cache should work as well.
* Adding a temporary example to check whether the model generation works.
* Adding the tests and updating imports.
* Adding the example file missing in the previous commit.
* First working example.
* Removing .gitignore and reverting parts of __init__.
* Re-add .gitignore.
* Addressing comments for configuration.
* Move mask creation to `_prepare_inputs_for_generation`.
* First try at integration tests:
1. AttributeError: 'GriffinCausalLMOutput' object has no attribute 'attentions'.
2. `cache_position` not passed
* Transferring between machines.
* Running normal tests.
* Minor fix.
* More fixes.
* Addressing more comments.
* Minor fixes.
* first stab at cleanup
* more refactoring
* fix copies and else
* renaming and get init to work
* fix causal mask creation
* update
* nit
* fix a hell lot of things
* updates
* update conversion script
* make all keys importable
* nits
* add auto mappings
* properly convert ffw_up and down
* add scaling
* fix generations
* for recurrent dtype
* update
* fix going beyond window
* fixup
* add missing files
* current updates to remove last einops
* finish modeling refactor
* TADA
* fix compile
* fix most failing tests?
* update tests
* refactor and update
* update
* nits, fixup and update tests
* more fixup
* nits
* fix imports
* test format
* fixups
* nits
* tuple typing
* fix code quality
* add model card
* fix doc
* skip most generation tests
* nits
* style
* doc fixes
* fix pr and check_copies?
* last nit
* oupsy
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <hi@lysand.re>
* update
* Update src/transformers/models/recurrent_gemma/convert_recurrent_gemma_to_hf.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* Update tests/models/recurrent_gemma/test_modeling_recurrent_gemma.py
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* update based on review
* doc nit
* fix quality
* quality
* fix slow test model path
* update default dtype
* ignore attributes that can be safely ignored in check config attributes
* 0lallalala come on
* save nit
* style
* remove to dict update
* make sure we can also run in float16
* style
---------
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Aleksandar Botev <botev@google.com>
Co-authored-by: Leonard Berrada <lberrada@users.noreply.github.com>
Co-authored-by: anushanf <anushanf@google.com>
Co-authored-by: botev <botevmg@gmail.com>
Co-authored-by: Lysandre Debut <hi@lysand.re>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
* fix learning rate display issue in galore optimizer
* fix kwarg in accelerate when using versions < 0.28.0
* this was supposed to be in the other PR whoops