transformers

Commit Graph

Author	SHA1	Message	Date
Yih-Dar	7032e02032	Install `sentencepiece` in `DeepSpeed` CI image (#20795 ) * Install sentencepiece in DS CI image * update Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-16 18:23:46 +01:00
NielsRogge	26dd041c6e	Add Swin2SR (#19784 ) * First draft * Add more improvements * Improve forward pass * Fix layernorm * Add upscaler * More improvements * More improvements * More improvements * Improve conversion script * Add preprocessing * Make output match original implementation * Add additional attributes * Add support for more models * Support more models * Add support for real world sr * Add initial Swin2SRFeatureExtractor * Add ImageSuperResolutionOutput * Make more tests pass * Use BaseModelOutput * Fix one more test * Fix more tests * Fix another test * Fix all tests * Rename to Swin2SRImageProcessor * Fix toctree * Fix toctree * Fix rebase * Improve Swin2SRImageProcessor * Remove feature extractor file * Improve model * Improve conversion script * Fix integration test * Fix init * Fix conversion script * Address comments * Improve upsampler * Add NearestConvUpsampler * Improve pixel shuffle upsampler * Improve auxiliary upsampler * Improve conversion script * Rename conv_last to final_convolution * Fix rebase * Improve upsample module * Add padding to image processor * Fix bug * Update padding * Remove print statement and fix integration test * Improve docs * Add image processor tests * Convert all checkpoints, fix tests * Remove print statements * Fix import Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-12-16 16:24:01 +01:00
NielsRogge	7f99861218	Add Universal Segmentation class + mapping (#20766 ) * Add mapping * Add mapping to pipeline * Apply suggestions * Fix feature extractor tests * Use ForInstance, add model to universal mapping * More fixes * Remove model from deprecated objects Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-12-16 14:22:46 +01:00
Matt	e65445b4d6	Stop calling expand_1d on newer TF versions (#20786 )	2022-12-16 13:10:07 +00:00
Nicolas Patry	3ee958207a	Fix object detection2 (#20798 ) * Revert "Fixing object detection with `layoutlm` (#20776)" This reverts commit `fca66abe2a`. * Better fix for layoutlm object detection. * Style.	2022-12-16 13:25:36 +01:00
Younes Belkada	4341f4e224	[Pipeline] skip feature extraction test if in `IMAGE_PROCESSOR_MAPPING` (#20790 ) skip feature extraction test if in `IMAGE_PROCESSOR_MAPPING`	2022-12-16 12:46:58 +01:00
Yih-Dar	1543cee7c8	Recompile `apex` in `DeepSpeed` CI image (#20788 ) Recompile apex in DeepSpeed CI image Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-15 21:35:27 +01:00
amyeroberts	491e951875	Move convert_to_rgb to image_transforms module (#20784 ) * Move convert_to_rgb to image_transforms module * Fix tests	2022-12-15 18:47:04 +00:00
Joao Gante	4bc723f87d	Generate: use `GenerationConfig` as the basis for `.generate()` parametrization (#20388 ) * generate from config mvp * fix failing tests * max_time test * Load default gen config at model load time; Update docs * further documentation; add tests * adapt rag to the new structure * handle models not instantiated with from_pretained (like in tests) * better default generation config * add can_generate fn * handle legacy use case of ad hoc model config changes * initialize gen config from config in individual methods, if gen config is none * fix _get_decoder_start_token_id when called outside GenerationMixin * correct model config load order (set attr > model config > decoder config) * update rag to match latest changes * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * load gen config from model config in model.from_pretrained * fix can_generate fn * handle generate calls without a previous from_pretrained (e.g. tests) * add legacy behavior (and a warning) * lower logger severity Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-12-15 18:27:20 +00:00
Yih-Dar	b1706f6908	Install video dependency for pipeline CI (#20777 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-15 18:47:05 +01:00
Nicolas Patry	fca66abe2a	Fixing object detection with `layoutlm` (#20776 ) * Fixing object detection with layoutlm. * Fixup.	2022-12-15 18:46:43 +01:00
Younes Belkada	8891193e83	[Pipeline] fix failing bloom `pipeline` test (#20778 ) fix failing `pipeline` test	2022-12-15 18:46:00 +01:00
Lars Mennen	b9b70b0e66	Patch for FlanT5-XXL 8bit support (#20760 ) * Workaround for #20287: FlanT5-XXL 8bit support * Make fix-copies * revert unrelated change * Dont apply to longt5 and switch transformers	2022-12-15 12:26:58 -05:00
Yih-Dar	fe9152f67c	Install vision for TF pipeline tests (#20771 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-15 11:16:37 +01:00
Nicolas Patry	a9912d2fca	Even more validation. (#20762 ) * Even more validation. * Fixing order.	2022-12-15 10:05:54 +01:00
NielsRogge	67acb07e9e	Add Swin backbone (#20769 ) * Add Swin backbone * Remove line * Add code example Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-12-14 19:35:28 +01:00
Yih-Dar	94f8e21c70	Install `torch-tensorrt 1.3.0` for DeepSpeed CI (#20764 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-14 17:30:36 +01:00
amyeroberts	7b23a582b9	Replaces xxx_required with requires_backends (#20715 ) * Replaces xxx_required with requires_backends * Fixup	2022-12-14 14:38:44 +00:00
Arthur	7c9e2f248c	[CI-Test] Fixes but also skips the mT5 tests (#20755 ) * weight -> weights * model embedding resize does not work with both v2 and noraml * remove useless test	2022-12-14 15:36:04 +01:00
casuallyName	dfd818420d	Fix attribute error problem (#20765 ) fix: Fix the Trainer being unable to use the use_legacy_prediction_loop parameter. Resolves the AttributeError: 'PredictionOutput' object has no attribute 'num_samples' raised when use_legacy_prediction_loop is set and prediction_loop is used for prediction during the predict stage. Co-authored-by: ZhouHang <zhouhang@idataway.com>	2022-12-14 09:26:06 -05:00
NielsRogge	11745b4e45	[Tests] Improve test_attention_outputs (#20701 ) * Improve tests * Improve TF tests * Apply suggestion * Fix test Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-12-14 14:41:40 +01:00
Yih-Dar	722bf7efcc	Fix missing `()` in some usage of `is_flaky` (#20749 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-14 11:37:29 +01:00
amyeroberts	9bafedc0fa	Remove image_transforms functions from init (#20704 )	2022-12-14 10:17:11 +00:00
Yih-Dar	d994473b05	Uninstall `torch_tensorrt` in `DeepSpeed` CI image for now (#20758 ) Uninstall torch_tensorrt for now Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-13 22:25:47 +01:00
Nicolas Patry	ba9da49aa2	Fixing the pipeline tutorial test (#20746 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-13 19:08:30 +01:00
Hazrul Akmal	f28c918c7e	Add docs xlm roberta (#20742 ) * added model resources for xlm-roberta * added model resources for xlm-roberta * resolve suggested changes * add resources to xlm-roberta	2022-12-13 09:25:55 -08:00
NielsRogge	6ef42587ae	[NAT, DiNAT] Add backbone class (#20654 ) * Add first draft * Add out_features attribute to config * Add corresponding test * Add Dinat backbone * Add BackboneMixin * Add Backbone mixin, improve tests * Fix embeddings * Fix bug * Improve backbones * Fix Nat backbone tests * Fix Dinat backbone tests * Apply suggestions Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>	2022-12-13 17:06:59 +01:00
dhansmair	30d8919ab1	in the resize() function in image_transforms.py, the line 267: (#20728 ) `image = to_channel_dimension_format(image, ChannelDimension.LAST)` is redundant as this same conversion is also applied in to_pil_image(). This redundant call actually makes the training fail in rare cases. The problem can be reproduced with the following code snippet: ``` import torch from transformers.models.clip import CLIPFeatureExtractor vision_processor = CLIPFeatureExtractor.from_pretrained('openai/clip-vit-large-patch14') images = [ torch.rand(size=(3, 2, 10), dtype=torch.float), torch.rand(size=(3, 10, 1), dtype=torch.float), torch.rand(size=(3, 1, 10), dtype=torch.float) ] for image in images: processed_image = vision_processor(images=image, return_tensors="pt")['pixel_values'] print(processed_image.shape) assert processed_image.shape == torch.Size([1, 3, 224, 224]) ``` The last image has a height of 1 pixel. The second call to to_channel_dimension_format() will transpose the image, and the height dimension is wrongly treated as the channels dimension afterwards. Because of this, the following normalize() step will result in an exception.	2022-12-13 08:55:08 -05:00
Matt	4f1788b34d	Fix AdamWeightDecay for TF 2.11 (#20735 ) * Fix AdamWeightDecay for TF * Fix AdamWeightDecay for TF * make fixup	2022-12-13 12:51:07 +00:00
Yih-Dar	a12c5cbcd8	Change a logic in pipeline test regarding TF (#20710 ) * Fix the pipeline test regarding TF * Fix the pipeline test regarding TF * update comment Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-13 13:42:36 +01:00
Younes Belkada	1af4bee896	Add `keep_in_fp32_modules` support (#20683 ) * add `keep_in_fp32_modules` support * pass it as class attribute * few modifs - make tests `slow` - fix logic * better logic * fix failing test * `bfloat16` support * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix * simplify tests * simplify tests * fix test * modify message * more checks * fix failing tests * add more conditions - add `is_accelerate_available` - fixes pipleine tests that failed * add suggestions * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix failing `bnb` test * add last safety checker Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-12-13 11:59:57 +01:00
Yih-Dar	d4bf9ee1ff	Update CI to torch 1.13.0 (#20687 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-12 20:04:56 +01:00
Yih-Dar	f41a11a16f	rename `layoutlm_job` to `exotic_models_job` (#20736 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-12 20:02:16 +01:00
amyeroberts	1416b5d9d8	Add decorator for flaky Donut tests (#20739 ) * Add decorator for flaky tests * Fix up	2022-12-12 18:25:27 +00:00
Sylvain Gugger	a450789d9a	Disambiguate test for required_input in tokenization base file. (#20731 ) * Disambiguate test for required_input in tokenization base file. * Add test for size	2022-12-12 13:13:09 -05:00
Sylvain Gugger	29ff8716a2	Add a progress bar for large model loading (#20713 )	2022-12-12 13:12:56 -05:00
Ariel Ekgren	5f94855dc3	Add gpt-sw3 model to transformers (#20209 ) * Add templates for gpt-sw3 * Add templates for gpt-sw3 * Added sentencepiece tokenizer * intermediate commit with many changes * fixed conflicts * Init commit for tokenization port * Tokenization progress * Remove fast tokenizer * Clean up and rename spm.model -> spiece.model * Remove TF -> PT conversion script template, Clean up Megatron -> PT script * Optimize encode & decode performance * added new attention * added new attention * attention for gpt-sw3 working * attention good * Cache is now working * fixed attention mask so that it works with causal attention * fixed badbmm bug for cpu and caching * updated config with correct parameters * Refactor and leave optimizations as separate functions to avoid breaking expected functionality * Fix special tokens mapping for both tokenizers * cleaning up of code and comments * HF compatible attention outputs * Tokenizer now passing tests, add documentation * Update documentation * reverted back to base implementation after checking that it is identical to pretrained model * updated gpt-sw3 config * updated conversion script * aligned parameters with gpt-sw3 config * changed default scale_attn_by_inverse_layer_idx to true * removed flag from conversion script * added temporary model path * reverted back to functioning convert script * small changes to default config * updated tests for gpt-sw3 * make style, make quality, minor cleanup * Change local paths to testing online repository * Change name: GptSw3 -> GPTSw3 * Remove GPTSw3TokenizerFast references * Use official model repository and add more model sizes * Added reference to 6.7b model * Add GPTSw3DoubleHeadsModel to IGNORE_NON_AUTO_CONFIGURED, like GPT2DoubleHeadsModel * Remove pointers to non-existing TFGPTSw3 * Add GPTSw3 to docs/_toctree.yml * Remove TF artifacts from GPTSw3 in __init__ files * Update README:s with 'make fix-copies' * Add 20b model to archive list * Add documentation for GPT-Sw3 * Fix typo in documentation for GPT-Sw3 * Do 'make fix-copies' again after having updated docs * Fix some typos in docs * Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/__init__.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update tests/models/gpt_sw3/test_tokenization_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Resolve comments from PR feedback * Resolve more comments from PR feedback, also set use_cache=True in convert script * Add '# Copied from' comments for GPTSw3 modeling * Set 'is_parallelizable = False' * Remove '# Copied from' where code was modified and add 'with x->y' when appropriate * Remove parallelize in mdx * make style, make quality * Update GPTSw3Config default values and corresponding documentation * Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/__init__.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Clean up and protect GPTSw3Tokenizer imports with is_sentencepiece_available * Make style, make quality * Add dummy object for GPTSw3Tokenizer via 'make fix-copies' * make fix-copies * Remove GPTSw3 modeling classes * make style, make quality * Add GPTSw3 auto-mappings for other GPT2 heads * Update docs/source/en/model_doc/gpt-sw3.mdx Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> * Remove old TODO-comment * Add example usage to GPTSw3Tokenizer docstring * make style, make quality * Add implementation details and example usage to gpt-sw3.mdx Co-authored-by: JoeyOhman <joeyoh@kth.se> Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-12-12 13:12:13 -05:00
amyeroberts	b58beebe72	Add vision requirement to image transforms (#20712 ) * Add require_vision decorator * Fixup * Use requires_backends * Add requires_backend to utils functions	2022-12-12 17:43:45 +00:00
Steven Liu	fd2bed7f9f	Clarify return_tensor and return_text parameters (#20662 ) * clarify docstring * make style	2022-12-12 09:16:13 -08:00
Matt	c1b9a11dd4	Convert tokenizer outputs for Keras in doc example (#20732 ) * Convert tokenizer outputs for Keras in doc example * Also fix the German example	2022-12-12 16:14:04 +00:00
Juanjo do Olmo	0ba94aceb6	Spanish translation of the file debugging.mdx (#20566 ) * Create and translate to Spanish debugging.mdx * solved typo error in a header * Update debugging.mdx * Update debugging.mdx * Update docs/source/es/debugging.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/debugging.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/debugging.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/debugging.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/debugging.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update _toctree.yml Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2022-12-12 10:38:56 -05:00
Sourab Mangrulkar	a413c725d4	fsdp fix (#20719 )	2022-12-12 20:37:52 +05:30
stanleycai95	17c742bbf5	Very small edit to change name to OpenAI GPT (#20722 )	2022-12-12 09:43:43 -05:00
Ian C	8f1f59ce86	Add type hints for Whisper models (#20396 ) * Initial commit * Add type hints for two major classes * Run make fixup * Fix output type for Whisper * Run isort to fix imports	2022-12-12 14:39:21 +00:00
Nicolas Patry	53357e8196	Adding ValueError when imcompatible parameters are used. (#20729 )	2022-12-12 15:39:13 +01:00
Yih-Dar	5ba2dbd9b1	Fix `AutoModelTest.test_model_from_pretrained` (#20730 ) Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>	2022-12-12 15:37:43 +01:00
Peter	a3345c1f13	Add `accelerate` support for LongT5 models (#20341 ) * ✨ add accelerate support for LongT5 models Signed-off-by: peter szemraj <peterszemraj@gmail.com> * fix `accelerate` tests * Trigger CI test Signed-off-by: peter szemraj <peterszemraj@gmail.com> Co-authored-by: younesbelkada <younesbelkada@gmail.com>	2022-12-12 09:25:52 -05:00
Alberto Mario Ceballos-Arroyo	8286af6f54	Spanish translation of asr.mdx and add_new_pipeline.mdx (#20569 ) * Fix minor typo in question_answering.mdx * Fixes minor typo in the english version of tasks/asr.mdx * Update _toctree.yml * Translate add_new_pipeline.mdx into Spanish * Fixes some typos in the English version of add_new_pipeline.mdx * Translate asr.mdx into Spanish * Fixes small typos in add_new_pipeline.mdx * Update docs/source/es/add_new_pipeline.mdx Suggestion by @osanseviero Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/add_new_pipeline.mdx Suggestion by @osanseviero: use "biblioteca" instead of "librería." Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/tasks/asr.mdx Suggestion by @osanseviero. Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/add_new_pipeline.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/add_new_pipeline.mdx Suggestion by @osanseviero. Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/add_new_pipeline.mdx Suggestion by @osanseviero. Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/add_new_pipeline.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/tasks/asr.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/tasks/asr.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update docs/source/es/tasks/asr.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com> * Update asr.mdx Co-authored-by: Omar Sanseviero <osanseviero@gmail.com>	2022-12-12 09:23:23 -05:00
Salvo Cavallaro	8d2fca07e8	Made LUKE Tokenizer independent from RoBERTa (#20720 )	2022-12-12 09:22:08 -05:00
Sylvain Gugger	799cea64ac	Fix rendering issue in quicktour (#20708 ) * Fix rendering issue in quicktour * Separate in two blocks	2022-12-09 13:51:35 -05:00
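
Note on commit 30d8919ab1 (#20728): the failure described in that message can be illustrated at the level of shapes alone. The sketch below is not the library's code. It assumes a simple heuristic in which an axis of size 1 or 3 is taken to be the channel axis, with the first axis checked before the last; `infer_channel_axis` and `to_channels_last` are hypothetical names introduced only for this illustration.

```python
def infer_channel_axis(shape):
    """Hypothetical heuristic: guess which axis of a 3-D shape holds the channels.

    Assumption: an axis of size 1 or 3 is a channel axis, and the first
    axis is checked before the last one.
    """
    if shape[0] in (1, 3):
        return "first"
    if shape[-1] in (1, 3):
        return "last"
    raise ValueError(f"cannot infer channel axis from shape {shape}")


def to_channels_last(shape):
    """Return the shape after moving the inferred channel axis to the end."""
    if infer_channel_axis(shape) == "first":
        channels, height, width = shape
        return (height, width, channels)
    return shape


# A channels-first image that is 2 pixels high: converting twice is harmless.
normal_once = to_channels_last((3, 2, 10))
normal_twice = to_channels_last(normal_once)

# A channels-first image that is 1 pixel high: after the first conversion the
# leading axis has size 1, so the second conversion mistakes height for channels.
thin_once = to_channels_last((3, 1, 10))
thin_twice = to_channels_last(thin_once)

print(normal_once, normal_twice)
print(thin_once, thin_twice)
```

Under these assumptions the conversion is idempotent for ordinary images but not for an image whose height is 1, which is consistent with the commit's reasoning that the second, redundant call should be dropped.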


11615 Commits All Branches Search
