transformers/tests/quantization
Ilyas Moutawwakil 4fc708f98c
Exllama kernels support for AWQ models (#28634)
* added exllama kernels support for awq models

* doc

* style

* Update src/transformers/modeling_utils.py

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* refactor

* moved exllama post init to after device dispatching

* bump autoawq version

* added exllama test

* style

* configurable exllama kernels

* copy exllama_config from gptq

* moved exllama version check to post init

* moved to quantization dockerfile

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
2024-03-05 03:22:48 +01:00
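Per the commit above, AWQ models gain configurable exllama kernels via an `exllama_config` copied from the GPTQ setup, with the exllama post-init running after device dispatching. A minimal sketch of how loading such a model might look, assuming the `transformers` `AwqConfig` API from this PR and a placeholder checkpoint id (the model name and the exact config keys are illustrative, not stated in this log):

```python
# Sketch only: assumes a transformers version containing PR #28634,
# a bumped autoawq with exllama kernels, and a CUDA device.
from transformers import AutoModelForCausalLM, AwqConfig

quantization_config = AwqConfig(
    bits=4,
    version="exllama",              # opt into exllama kernels for AWQ
    exllama_config={"version": 2},  # GPTQ-style exllama config (v1 or v2)
)

model = AutoModelForCausalLM.from_pretrained(
    "org/awq-quantized-model",      # placeholder: any AWQ-quantized checkpoint
    quantization_config=quantization_config,
    device_map="auto",              # exllama post-init happens after dispatch
)
```

Since this fragment needs a GPU, autoawq, and a real AWQ checkpoint to execute, treat it as a configuration sketch rather than a runnable snippet.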
aqlm_integration Cleaner Cache `dtype` and `device` extraction for CUDA graph generation for quantizers compatibility (#29079) 2024-02-27 09:32:39 +01:00
autoawq Exllama kernels support for AWQ models (#28634) 2024-03-05 03:22:48 +01:00
bnb FIX [`bnb` / `tests`] Propagate the changes from #29092 to 4-bit tests (#29122) 2024-02-20 11:11:15 +01:00
gptq [GPTQ] Fix test (#28018) 2024-01-15 11:22:54 -05:00