transformers/docs/source/en/main_classes
Marc Sun 28de2f4de3
[Quantization] Quanto quantizer (#29023)
* start integration

* fix

* add and debug tests

* update tests

* make pytorch serialization works

* compatible with device_map and offload

* fix tests

* make style

* add ref

* guard against safetensors

* add float8 and style

* fix is_serializable

* Fix shard_checkpoint compatibility with quanto

* more tests

* docs

* adjust memory

* better

* style

* pass tests

* Update src/transformers/modeling_utils.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* add is_safe_serialization instead

* Update src/transformers/quantizers/quantizer_quanto.py

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* add QbitsTensor tests

* fix tests

* simplify activation list

* Update docs/source/en/quantization.md

Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>

* better comment

* Update tests/quantization/quanto_integration/test_quanto.py

Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>

* Update tests/quantization/quanto_integration/test_quanto.py

Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>

* find and fix edge case

* Update docs/source/en/quantization.md

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* pass weights_only_kwarg instead

* fix shard_checkpoint loading

* simplify update_missing_keys

* Update tests/quantization/quanto_integration/test_quanto.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* recursion to get all tensors

* block serialization

* skip serialization tests

* fix

* change by cuda:0 for now

* fix regression

* update device_map

* fix doc

* add noteboon

* update torch_dtype

* update doc

* typo

* typo

* remove comm

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
2024-03-15 11:51:29 -04:00
agent.md [doc] Always call it Agents for consistency (#25958) 2023-09-05 12:27:20 +01:00
backbones.md [`Doc`] Fix docbuilder - make `BackboneMixin` and `BackboneConfigMixin` importable from `utils`. (#29002) 2024-02-14 10:29:22 +00:00
callback.md Adds dvclive callback (#27352) 2023-11-09 12:19:31 +00:00
configuration.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
data_collator.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
deepspeed.md [docs] DeepSpeed (#28542) 2024-01-24 08:31:28 -08:00
feature_extractor.md Fixed typos (#26810) 2023-10-16 09:52:29 +02:00
image_processor.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
keras_callbacks.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
logging.md Warnings controlled by logger level (#26527) 2023-10-12 10:48:38 +02:00
model.md Fix typo 'submosules' (#24809) 2023-07-13 16:56:53 +01:00
onnx.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
optimizer_schedules.md Migrate doc files to Markdown. (#24376) 2023-06-20 18:07:47 -04:00
output.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
pipelines.md Update all references to canonical models (#29001) 2024-02-16 08:16:58 +01:00
processors.md [docs] fixed links with 404 (#27327) 2023-11-06 19:45:03 +00:00
quantization.md [Quantization] Quanto quantizer (#29023) 2024-03-15 11:51:29 -04:00
text_generation.md Generate: get generation mode from the generation config instance 🧼 (#29441) 2024-03-06 11:18:35 +00:00
tokenizer.md [`PretrainedTokenizer`] add some of the most important functions to the doc (#27313) 2023-11-06 15:11:00 +01:00
trainer.md [docs] Trainer docs (#28145) 2023-12-20 10:37:23 -08:00