10591 Commits

Author SHA1 Message Date
Ankur Goyal 2ef7742117
Add DocumentQuestionAnswering pipeline (#18414)
* [WIP] Skeleton of VisualQuestionAnsweringPipeline extended to support LayoutLM-like models

* Fixup

* Use the full encoding

* Basic refactoring to DocumentQuestionAnsweringPipeline

* Cleanup

* Improve args, docs, and implement preprocessing

* Integrate OCR

* Refactor question_answering pipeline

* Use refactored QA code in the document qa pipeline

* Fix tests

* Some small cleanups

* Use a string type annotation for Image.Image

* Update encoding with image features

* Wire through the basic docs

* Handle invalid response

* Handle empty word_boxes properly

* Docstring fix

* Integrate Donut model

* Fixup

* Incorporate comments

* Address comments

* Initial incorporation of tests

* Address Comments

* Change assert to ValueError

* Comments

* Wrap `score` in float to make it JSON serializable

* Incorporate AutoModelForDocumentQuestionAnswering changes

* Fixup

* Rename postprocess function

* Fix auto import

* Applying comments

* Improve docs

* Remove extra assets and add copyright

* Address comments

Co-authored-by: Ankur Goyal <ankur@impira.com>
2022-09-07 13:38:49 -04:00
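The "Wrap `score` in float to make it JSON serializable" step above can be illustrated with a minimal sketch. It uses only the standard library, and `Decimal` is merely a stand-in (an assumption for illustration) for a framework scalar type that `json` does not know how to encode; the names `raw_score` and `payload` are hypothetical and not taken from the pipeline code.

```python
import json
from decimal import Decimal

# Stand-in for a non-native numeric type (e.g. a framework scalar).
raw_score = Decimal("0.75")

# json.dumps only encodes built-in numeric types, so this raises TypeError.
try:
    json.dumps({"score": raw_score})
    serializable = True
except TypeError:
    serializable = False

# Wrapping the value in float() yields a plain Python float that encodes fine.
payload = json.dumps({"score": float(raw_score), "answer": "42"})
print(serializable, payload)
```

Converting at the point where the result dict is built keeps callers from needing a custom JSON encoder.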
Olatunji Ruwase 3059d80d80
[DeepSpeed ZeRO3] Fix performance degradation in sharded models (#18911)
* [DeepSpeed] Fix performance degradation in sharded models

* style

* polish

Co-authored-by: Stas Bekman <stas@stason.org>
2022-09-07 07:44:20 -07:00
Yih-Dar 10c774cf60
remove `_create_and_check_torch_fx_tracing` in specific test files (#18667)
* remove _create_and_check_torch_fx_tracing defined in specific model test files

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-07 16:22:09 +02:00
Joao Gante 0eabab0998
TF: final bias as a layer in seq2seq models (replicate TFMarian fix) (#18903) 2022-09-07 14:03:02 +01:00
Matt 2b9513fdab
Update TF fine-tuning docs (#18654)
* Update TF fine-tuning docs

* Fix formatting

* Add some section headers so the right sidebar works better

* Squiggly it

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/training.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Explain things in the text, not the comments

* Make the two dataset creation methods into a list

* Move the advice about collation out of a <Tip>

* Edits for clarity

* Edits for clarity

* Edits for clarity

* Replace `to_tf_dataset` with `prepare_tf_dataset` in the fine-tuning pages

* Restructure the page a little bit

* Restructure the page a little bit

* Restructure the page a little bit

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-09-07 13:30:07 +01:00
Wang, Yi d842f2d5b9
update the train_batch_size in case HPO changes batch_size_per_device (#18918)
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>

Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2022-09-07 08:01:30 -04:00
Nicholas Broad 4f299b2446
Accelerator end training (#18910)
* add accelerator.end_training()

Some trackers need this to end their runs.

* fixup and quality

* add space

* add space again ?!?
2022-09-07 07:46:26 -04:00
Yih-Dar 7a8118947f
Add checks for more workflow jobs (#18905)
* add check for scheduled CI

* Add check to other CIs

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-07 12:51:37 +02:00
NielsRogge c25f27fa6a
[VideoMAE] Improve code examples (#18919)
* Simplify code example

* Add seed
2022-09-07 12:24:12 +02:00
Ekagra Ranjan 0a632f076d
Fix incorrect size of input for 1st strided window length in `Perplexity of fixed-length models` (#18906)
* update the PPL for stride 512

* fix 1st strided window size

* linting

* fix typo

* styling
2022-09-06 15:20:12 -04:00
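As a rough illustration of the strided-window bookkeeping this commit touches, the sketch below computes window boundaries for a sliding-window perplexity evaluation. It is one reading of the fix and not the documentation's actual code; the function name `strided_windows` and its parameters are hypothetical. The point it shows is that the first window scores its full length, while later windows score only the tokens not covered by an earlier window.

```python
def strided_windows(seq_len, max_length, stride):
    """Yield (begin, end, target_len) for each evaluation window."""
    prev_end = 0
    for begin in range(0, seq_len, stride):
        end = min(begin + max_length, seq_len)
        # Only tokens past the previous window's end are newly scored.
        target_len = end - prev_end
        yield begin, end, target_len
        prev_end = end
        if end == seq_len:
            break


windows = list(strided_windows(seq_len=10, max_length=4, stride=2))
print(windows)
```

With these toy values the target lengths add up to the sequence length, so every token is scored exactly once.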
Yih-Dar 7d5fde991d
unpin slack_sdk version (#18901)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-06 18:42:00 +02:00
Sylvain Gugger 71ff88fa4f
Further reduce the number of calls to head for cached objects (#18871)
* Further reduce the number of calls to head for cached models/tokenizers/pipelines

* Fix tests

* Address review comments
2022-09-06 12:34:37 -04:00
Alara Dirik 6678350c01
fixes bugs to handle non-dict output (#18897) 2022-09-06 16:13:34 +03:00
Yih-Dar 998a90bc7d
Fix `test_tf_encode_plus_sent_to_model` for `LayoutLMv3` (#18898)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-06 14:51:03 +02:00
Ekagra Ranjan f85acb4d73
Fix decoder_input_ids to bare T5Model and improve doc (#18791)
* use tokenizer to output tensor

* add preprocessing for decoder_input_ids for bare T5Model

* add preprocessing to tf and flax

* linting

* linting

* Update src/transformers/models/t5/modeling_flax_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/t5/modeling_tf_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/transformers/models/t5/modeling_t5.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-09-06 14:12:26 +02:00
arun99481 3b19c0317b
updating gather function with gather_for_metrics in run_wav2vec2_pretraining (#18877)
Co-authored-by: Arun Rajaram <arunrajaram@Aruns-MacBook-Pro.local>
2022-09-06 07:36:37 -04:00
Had 734b7e2a5a
Mask t5 relative position bias when heads are pruned (#17968)
* add position bias head masking if heads pruned

* fix pruning function in t5 encoder

* make style

* make fix-copies

* Revert added folder

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-09-06 10:39:31 +02:00
Joao Gante d4dbd7ca59
Generate: get the correct beam index on eos token (#18851) 2022-09-05 19:35:47 +01:00
zkep c6d3daba54
Update Chinese documentation (#18893)
* update the translation
2022-09-05 19:56:12 +02:00
Sofia Oliveira cfd623a859
Add type hints to XLM-Roberta-XL models (#18475)
* Add type hints to XLM-Roberta-XL models

* Format
2022-09-05 13:38:08 +01:00
Surya Prakash Sahu 17c634fd5b
Update perf_train_gpu_one.mdx (#18442) 2022-09-05 14:06:36 +02:00
Patrick von Platen badb9d2aaa
Correct naming pegasus x (#18896)
* add first generation tutorial

* [Pegasus X] correct naming

* [Generation] Remove
2022-09-05 11:25:00 +02:00
Lysandre Debut 591cfc6c90
Mention TF and Flax checkpoints (#18894) 2022-09-05 11:09:39 +02:00
Joao Gante 7f27e002fd
TF: TFMarianMTModel final logits bias as a layer (#18833)
* bias as a layer

* alias the bias (hah, it rhymes)

* add comment with info
2022-09-05 09:20:27 +01:00
Steven Liu 65fb71bc76
Add Trainer to quicktour (#18723)
* 📝 update quicktour

* 📝 add trainer section

* 🖍 markdown table, apply feedbacks

* make style

* add tf training section

* make style
2022-09-02 15:05:31 -05:00
Steven Liu ae32f3afef
Finetune guide for semantic segmentation (#18640)
* 📝 first draft

* oops add to toctree

* make style

* 📝 add inference section

* 🖍 make style

* 📝 add images

* 🖍 apply feedbacks

* remove num_labels and pytorch block

* apply feedbacks, add colab notebook

Co-authored-by: Steven <stevhliu@gmail.com>
2022-09-02 14:29:51 -05:00
Steven Liu bf9d506137
Update docs landing page (#18590)
* 📝 update docs landing page

* 🖍 apply feedbacks

* apply feedbacks

* apply feedbacks, use <br> for list
2022-09-02 14:29:06 -05:00
Jason Phang 53e33e6f1b
PEGASUS-X (#18551)
* PegasusX Initial commit

* rename

* pegasus X implementation

* pegx update

* pegx fix

* pegasus-x fixes

* pegx updates

* cleanup

* cleanup

* cleanup

* tests

* stylefixes

* Documentation update

* Model hub fix

* cleanup

* update

* update

* testfix

* Check fix

* tweaks for merging

* style

* style

* updates for pr

* style

* change pegasus-x repo
2022-09-02 19:54:02 +02:00
Yih-Dar ecdf9b06bc
Remove cached torch_extensions on CI runners (#18868)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-02 18:17:58 +02:00
Yih-Dar 4e29b3f884
A script to download artifacts and perform CI error statistics (#18865)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-02 17:59:26 +02:00
Joao Gante 9196f48b95
Generate: validate `model_kwargs` on TF (and catch typos in generate arguments) (#18651) 2022-09-02 16:25:26 +01:00
Stas Bekman c5be7cae59
postpone bnb load until it's needed (#18859) 2022-09-02 08:22:46 -07:00
Sylvain Gugger 9e346f7436
Fix number of examples for iterable datasets in multiprocessing (#18856)
* Fix number of examples for iterable datasets in multiprocessing

* Add stronger check
2022-09-02 10:49:39 -04:00
Yih-Dar 0ab465a5d2
pin Slack SDK to 3.18.1 to avoid failing issue (#18869)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-02 16:49:08 +02:00
Sylvain Gugger 38c3cd52fb
Clean up utils.hub using the latest from hf_hub (#18857)
* Clean up utils.hub using the latest from hf_hub

* Adapt test

* Address review comment

* Fix test
2022-09-02 10:30:06 -04:00
NielsRogge 17981faf67
Add OWL-ViT to the appropriate section (#18867)
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-09-02 15:59:25 +02:00
NielsRogge c60dd98e87
[LayoutLM] Add clarification to docs (#18716)
* Add clarification

* Add another clarification

* Apply suggestion

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-09-02 14:48:19 +02:00
OlivierDehaene 129d73294e
Fix naming issue with ImageToText pipeline (#18864)
Co-authored-by: Olivier Dehaene <olivier@huggingface.co>
2022-09-02 07:55:30 -04:00
kmckiern 9b3eb81014
if learning rate is a tensor, get item (float) (#18861) 2022-09-02 07:46:31 -04:00
Steven Liu 142e12afb4
Split docs on modality (#18205)
* update

* 🖍 add missing files

* 📝 add nested sections

* 🖍 align titles with tasks

* oops

* remove quotes from titles
2022-09-01 15:19:11 -05:00
Ankur Goyal 23fab60b67
Pin revision for LayoutLMForQuestionAnswering and TFLayoutLMForQuestionAnswering tests (#18854)
* Pin revision for tests

* Fixup

* Update revision in models

* Shorten revisions

Co-authored-by: Ankur Goyal <ankur@impira.com>
2022-09-01 12:52:33 -04:00
OlivierDehaene ddb69e5af8
Add Image To Text Generation pipeline (#18821)
* Add Image2TextGenerationPipeline to supported pipelines

* Add Flax and Tensorflow support

* Add Flax and Tensorflow small tests

* Add default model for Tensorflow

* Add docstring

* Fix doc style

* Add tiny models for pytorch and flax

* Remove flax from pipeline.
Fix tests

* Use ydshieh/vit-gpt2-coco-en as a default for both PyTorch and Tensorflow

* Fix Tensorflow support

Co-authored-by: Olivier Dehaene <olivier@huggingface.co>
2022-09-01 12:07:14 -04:00
Sylvain Gugger c61f116b63
Tie weights after preparing the model in run_clm (#18855) 2022-09-01 12:06:56 -04:00
Cody Yu 1c381f3600
Cache results of is_torch_tpu_available() (#18777)
* Cache results of is_torch_tpu_available()

* Update src/transformers/utils/import_utils.py

* Update src/transformers/utils/import_utils.py
2022-09-01 11:45:33 -04:00
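Caching the result of an availability check, as this commit describes, is commonly done by memoizing the function. The sketch below shows the general idea with `functools.lru_cache` and a generic package probe; `is_package_available` is a hypothetical helper for illustration, not the library's `is_torch_tpu_available()`, and whether the commit uses this exact mechanism is an assumption.

```python
import importlib.util
from functools import lru_cache


@lru_cache(maxsize=None)
def is_package_available(name: str) -> bool:
    """Return True if a top-level package can be found; result is memoized."""
    return importlib.util.find_spec(name) is not None


first = is_package_available("json")   # performs the lookup
second = is_package_available("json")  # served from the cache
print(first, second, is_package_available.cache_info().hits)
```

Memoizing is safe here only because the answer is not expected to change during the life of the process.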
Sayak Paul 954e18ab97
TensorFlow MobileViT (#18555)
* initial implementation.

* add: working model till image classification.

* add: initial implementation that passes intg tests.

Co-authored-by: Amy <aeroberts4444@gmail.com>

* chore: formatting.

* add: tests (still breaking because of config mismatch).

Co-authored-by: Yih <2521628+ydshieh@users.noreply.github.com>

* add: corrected tests and remaining changes.

* fix code style and repo consistency.

* address PR comments.

* address Amy's comments.

* chore: remove from_pt argument.

* chore: add full-stop.

* fix: TFLite model conversion in the doc.

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/mobilevit/modeling_tf_mobilevit.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* apply formatting.

* chore: remove comments from the example block.

* remove indentation in the example.

Co-authored-by: Amy <aeroberts4444@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-09-01 10:35:15 -04:00
Gustavo de Rosa fe58929ad6
Adds timeout argument to training_args to avoid socket timeouts in DDP (#18562)
* chore(training_args): Adds support for timeout argument.

* fix(training_args): Passes make style through changes.

* fix(training_args): Removes wrong docstring sentence.

* fix(training_args): Fixes timeout not being JSON serializable.

* fix(training_args_sm): Also updates timeout to timeout_delta.

* fix(training_args): Fixes PR according to suggestions.
2022-09-01 10:33:53 -04:00
kumapo ab663b2274
reflect max_new_tokens in `Seq2SeqTrainer` (#18786)
* reflect max_new_tokens in gen_kwargs to `trainer.generate()`

* reflect max_new_tokens in `Seq2SeqTrainer`

* remove unnecessary variable

* Trigger CI

* fix style
2022-09-01 09:12:38 -04:00
Pedro Cuenca f719c0377f
Minor typo in prose of model outputs documentation. (#18848) 2022-09-01 12:05:40 +02:00
Albert Villanova del Moral fafbb57df1
Pin rouge_score (#18247)
* Pin rouge_score

* Pin also in dependency_versions_table

* Update excluded versions

* Revert "Update excluded versions"

This reverts commit 0d0362df30.

* Revert "Revert "Update excluded versions""

This reverts commit 66c47af8a6.
2022-09-01 12:04:49 +02:00
Yih-Dar e7da38f5dc
add a script to get time info. from GA workflow jobs (#18822)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-01 12:02:52 +02:00