transformers/tests/models/stablelm
Jonathan Tow 2f12e40822
[`StableLm`] Add QK normalization and Parallel Residual Support (#29745)
* init: add StableLm 2 support

* add integration test for parallel residual and qk layernorm

* update(modeling): match qk norm naming for consistency with phi/persimmon

* fix(tests): run forward/backward passes on the randomly initialized test model to jitter norm weights off identity

* `use_parallel_residual`: add copy pointer to `GPTNeoXLayer.forward`

* refactor: rename head states var in `StableLmLayerNormPerHead`

* tests: update test model and add generate check
2024-04-08 23:51:58 +02:00
__init__.py Add `StableLM` (#28810) 2024-02-14 07:15:18 +01:00
test_modeling_stablelm.py [`StableLm`] Add QK normalization and Parallel Residual Support (#29745) 2024-04-08 23:51:58 +02:00