transformers

Marc Sun 28de2f4de3 [Quantization] Quanto quantizer (#29023)	2024-03-15 11:51:29 -04:00
* start integration
* fix
* add and debug tests
* update tests
* make pytorch serialization works
* compatible with device_map and offload
* fix tests
* make style
* add ref
* guard against safetensors
* add float8 and style
* fix is_serializable
* Fix shard_checkpoint compatibility with quanto
* more tests
* docs
* adjust memory
* better
* style
* pass tests
* Update src/transformers/modeling_utils.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add is_safe_serialization instead
* Update src/transformers/quantizers/quantizer_quanto.py Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
* add QbitsTensor tests
* fix tests
* simplify activation list
* Update docs/source/en/quantization.md Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* better comment
* Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
* find and fix edge case
* Update docs/source/en/quantization.md Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* pass weights_only_kwarg instead
* fix shard_checkpoint loading
* simplify update_missing_keys
* Update tests/quantization/quanto_integration/test_quanto.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* recursion to get all tensors
* block serialization
* skip serialization tests
* fix
* change by cuda:0 for now
* fix regression
* update device_map
* fix doc
* add noteboon
* update torch_dtype
* update doc
* typo
* typo
* remove comm
---------
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: David Corvoysier <david.corvoysier@gmail.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Younes Belkada <younesbelkada@gmail.com>
agent.md	[doc] Always call it Agents for consistency (#25958)	2023-09-05 12:27:20 +01:00
backbones.md	[`Doc`] Fix docbuilder - make `BackboneMixin` and `BackboneConfigMixin` importable from `utils`. (#29002)	2024-02-14 10:29:22 +00:00
callback.md	Adds dvclive callback (#27352)	2023-11-09 12:19:31 +00:00
configuration.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
data_collator.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
deepspeed.md	[docs] DeepSpeed (#28542)	2024-01-24 08:31:28 -08:00
feature_extractor.md	Fixed typos (#26810)	2023-10-16 09:52:29 +02:00
image_processor.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
keras_callbacks.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
logging.md	Warnings controlled by logger level (#26527)	2023-10-12 10:48:38 +02:00
model.md	Fix typo 'submosules' (#24809)	2023-07-13 16:56:53 +01:00
onnx.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
optimizer_schedules.md	Migrate doc files to Markdown. (#24376)	2023-06-20 18:07:47 -04:00
output.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
pipelines.md	Update all references to canonical models (#29001)	2024-02-16 08:16:58 +01:00
processors.md	[docs] fixed links with 404 (#27327)	2023-11-06 19:45:03 +00:00
quantization.md	[Quantization] Quanto quantizer (#29023)	2024-03-15 11:51:29 -04:00
text_generation.md	Generate: get generation mode from the generation config instance 🧼 (#29441)	2024-03-06 11:18:35 +00:00
tokenizer.md	[`PretrainedTokenizer`] add some of the most important functions to the doc (#27313)	2023-11-06 15:11:00 +01:00
trainer.md	[docs] Trainer docs (#28145)	2023-12-20 10:37:23 -08:00