transformers/docker

Latest commit by Marc Sun (28de2f4de3): [Quantization] Quanto quantizer (#29023)
* start integration
* fix
* add and debug tests
* update tests
* make PyTorch serialization work
* compatible with device_map and offload
* fix tests
* make style
* add ref
* guard against safetensors
* add float8 and style
* fix is_serializable
* Fix shard_checkpoint compatibility with quanto
* more tests
* docs
* adjust memory
* better
* style
* pass tests
* Update src/transformers/modeling_utils.py (Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>)
* add is_safe_serialization instead
* Update src/transformers/quantizers/quantizer_quanto.py (Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>)
* add QbitsTensor tests
* fix tests
* simplify activation list
* Update docs/source/en/quantization.md (Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>)
* better comment
* Update tests/quantization/quanto_integration/test_quanto.py (Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>)
* Update tests/quantization/quanto_integration/test_quanto.py (Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>)
* find and fix edge case
* Update docs/source/en/quantization.md (Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>)
* pass weights_only_kwarg instead
* fix shard_checkpoint loading
* simplify update_missing_keys
* Update tests/quantization/quanto_integration/test_quanto.py (Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>)
* recursion to get all tensors
* block serialization
* skip serialization tests
* fix
* change to cuda:0 for now
* fix regression
* update device_map
* fix doc
* add notebook
* update torch_dtype
* update doc
* fix typo
* fix typo
* remove comment

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
2024-03-15 11:51:29 -04:00
| Directory | Latest commit | Date |
| --- | --- | --- |
| transformers-all-latest-gpu | Use torch 2.2 for daily CI (model tests) (#29208) | 2024-02-23 21:37:08 +08:00 |
| transformers-doc-builder | Use python 3.10 for docbuild (#28399) | 2024-01-11 14:39:49 +01:00 |
| transformers-gpu | TF: TF 2.10 unpin + related onnx test skips (#18995) | 2022-09-12 19:30:27 +01:00 |
| transformers-past-gpu | Byebye pytorch 1.9 (#24080) | 2023-06-16 16:38:23 +02:00 |
| transformers-pytorch-amd-gpu | Add deepspeed test to amd scheduled CI (#27633) | 2023-12-11 16:33:36 +01:00 |
| transformers-pytorch-deepspeed-amd-gpu | Add deepspeed test to amd scheduled CI (#27633) | 2023-12-11 16:33:36 +01:00 |
| transformers-pytorch-deepspeed-latest-gpu | Use torch 2.2 for deepspeed CI (#29246) | 2024-02-27 17:51:37 +08:00 |
| transformers-pytorch-deepspeed-nightly-gpu | Update CUDA versions for DeepSpeed (#27853) | 2023-12-05 16:15:21 -05:00 |
| transformers-pytorch-gpu | [SDPA] Make sure attn mask creation is always done on CPU (#28400) | 2024-01-09 11:05:19 +01:00 |
| transformers-pytorch-tpu | Rename master to main for notebooks links and leftovers (#16397) | 2022-03-25 09:12:23 -04:00 |
| transformers-quantization-latest-gpu | [Quantization] Quanto quantizer (#29023) | 2024-03-15 11:51:29 -04:00 |
| transformers-tensorflow-gpu | Update TF pin in docker image (#25343) | 2023-08-07 12:32:34 +02:00 |