transformers/tests/models/mistral
Aritra Roy Gosthipaty 965e98dc54
[Port] TensorFlow implementation of Mistral (#29708)
* chore: initial commit

* chore: adding imports and inits

* chore: adding the causal and classification code

* chore: adding names to the layers

* chore: using single self attn layer

* chore: built the model and layers

* chore: start with testing

* chore: docstring change, transpose fix

* fix: rotary embedding

* chore: adding cache implementation

* remove unused torch

* chore: fixing the indexing issue

* make fix-copies

* Use modeling_tf_utils.keras

* make fixup

* chore: fixing tests

* chore: adding past key value logic

* chore: adding multi label classification test

* fix: switching on the built parameters in the layers

* fixing repo consistency

* ruff formats

* style changes

* fix: tf and pt equivalence

* removing returns from docstrings

* fix docstrings

* fix docstrings

* removing todos

* fix copies

* fix docstring

* fix docstring

* chore: using easier rotate_half

* adding integration tests

* chore: addressing review related to rotary embedding layer

* review changes

* [run-slow] mistral

* skip: test save load after resize token embedding

* style

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
2024-05-23 17:48:49 +01:00
__init__.py [Mistral] Mistral-7B-v0.1 support (#26447) 2023-09-27 18:30:46 +02:00
test_modeling_flax_mistral.py Flax mistral (#26943) 2024-01-31 14:19:02 +01:00
test_modeling_mistral.py CI: AMD MI300 tests fix (#30797) 2024-05-21 12:46:07 +01:00
test_modeling_tf_mistral.py [Port] TensorFlow implementation of Mistral (#29708) 2024-05-23 17:48:49 +01:00