Tommy Chiang
a2ef9c5446
Use torch.unique_consecutive to check same element ( #13637 )
...
We use `torch.unique` here only to check whether all elements have
the same value.
Therefore, we can use `torch.unique_consecutive` instead.
This function eliminates all but the first element from every consecutive
group of equivalent elements.
For example, applying this function to `[1, 2, 2, 1]` results in
`[1, 2, 1]`.
As you can see, this is enough to check whether all elements
have the same value.
Since `torch.unique_consecutive` does less work, it is much faster.
On my computer, it is 25x faster on GPU and 15x faster on CPU.
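A minimal pure-Python sketch of the idea described above, assuming plain lists instead of tensors. The helper names `unique_consecutive` and `all_same` are hypothetical and only mimic the semantics of `torch.unique_consecutive`; this is not the code changed in the commit.

```python
from itertools import groupby


def unique_consecutive(values):
    """Keep only the first element of every run of equal neighbours."""
    return [key for key, _ in groupby(values)]


def all_same(values):
    """Return True when at most one run remains, i.e. all elements are equal.

    An empty input is treated as uniform here (an assumption of this sketch).
    """
    return len(unique_consecutive(values)) <= 1


print(unique_consecutive([1, 2, 2, 1]))  # [1, 2, 1]
print(all_same([3, 3, 3]))               # True
print(all_same([1, 2, 2, 1]))            # False
```

No sorting or global deduplication is needed: if any two elements differ, at least two runs survive, so a single pass over the input is sufficient.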
2021-09-24 10:31:23 +02:00
Patrick von Platen
95f888fd6a
Update README.md
2021-09-24 09:53:37 +02:00
Josh Devins
678bb248d0
Make assertions only if actually chunking forward ( #13598 )
...
This moves the assertion that checks input dimensions into a block that runs only if the function is actually going to chunk the forward pass. This is often not the case at inference time, and tracing a model with PyTorch while this assertion is present leads to a tracing warning:
TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
input_tensor.shape[chunk_dim] == tensor_shape for input_tensor in input_tensors
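A simplified sketch of the control flow described above, assuming plain Python lists rather than tensors. The function `apply_chunking` and its parameters are hypothetical stand-ins, not the library's actual implementation; the point is only that the shape assertion sits inside the chunking branch.

```python
def apply_chunking(forward_fn, chunk_size, *inputs):
    """Apply forward_fn to inputs, optionally in chunks of chunk_size."""
    if chunk_size > 0:
        # The dimension check runs only when chunking actually happens.
        length = len(inputs[0])
        assert all(len(x) == length for x in inputs), "All inputs must have the same length"
        assert length % chunk_size == 0, "Length must be a multiple of chunk_size"

        outputs = []
        for start in range(0, length, chunk_size):
            chunk = [x[start:start + chunk_size] for x in inputs]
            outputs.extend(forward_fn(*chunk))
        return outputs

    # No chunking: call straight through, with no assertion on this path.
    return forward_fn(*inputs)


def add(a, b):
    return [x + y for x, y in zip(a, b)]


print(apply_chunking(add, 2, [1, 2, 3, 4], [10, 20, 30, 40]))  # [11, 22, 33, 44]
print(apply_chunking(add, 0, [1, 2, 3, 4], [10, 20, 30, 40]))  # [11, 22, 33, 44]
```

With `chunk_size` set to 0 the check is never evaluated, which mirrors why the traced inference path no longer triggers the warning.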
2021-09-24 08:52:15 +02:00
Patrick von Platen
4a320f6c9a
[ASR] Add official ASR CTC example to `examples/pytorch/speech-recognition` ( #13620 )
...
* up
* rename
* add asr example
* add auto feature extractor
* some more fixes
* correct layerdrop
* correct for multi-gpu dist
* clean up
* refactor
* refactor
* more fixes
* more fixes
* clean-up
* finish
* up
* Apply suggestions from code review
* fix isort
* update
* up
* add note
* apply surajs suggestions
* Apply suggestions from code review
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* isort
* small change
* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* Apply suggestions from code review
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
* add hubert
* Update examples/pytorch/speech-recognition/run_speech_recognition_ctc.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Anton Lozhkov <aglozhkov@gmail.com>
2021-09-24 07:01:11 +02:00
Lysandre Debut
41c186d2a4
Replace torch.set_grad_enabled by torch.no_grad ( #13703 )
2021-09-23 17:08:29 -04:00
Md Saiful Islam Sayef
f888e5c372
Add FSNER example in research_projects ( #13712 )
...
* Add example use of few-shot named entity recognition model in research_projects folder.
* Apply suggestions from code review
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
* Update fsner example README.md.
- Change wrong import FSNERTokenizerWrapper to FSNERTokenizerUtils in the example code
- Add a link to the model identifier
* Update examples/research_projects/fsner/src/fsner/model.py
Fix spelling mistake in the default parameter of pretrained model name.
Co-authored-by: Stefan Schweter <stefan@schweter.it>
* Run `make fixup` and fix suggested changes.
* Run `make style` for reformatting code.
* Add installation dependencies for examples/research_projects/fsner.
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Stefan Schweter <stefan@schweter.it>
2021-09-23 17:04:15 -04:00
Li-Huai (Allan) Lin
1988849bbf
Handle `UnicodeDecodeError` ( #13717 )
2021-09-23 16:56:34 -04:00
kding1
8632a60d33
Add cpu distributed fine-tuning support for transformers Trainer API ( #13574 )
...
* update trainer with cpu distributed fine-tuning support.
Signed-off-by: Ding, Ke <ke.ding@intel.com>
* Style.
* refinement on cpu dist training check.
Signed-off-by: Ding, Ke <ke.ding@intel.com>
* style.
Signed-off-by: Ding, Ke <ke.ding@intel.com>
* Test over private field not public one.
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>
Co-authored-by: Funtowicz Morgan <mfuntowicz@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-09-23 18:15:27 +02:00
kding1
6a3a197fcd
Add SigOpt HPO to transformers trainer api ( #13572 )
...
* add sigopt hpo to transformers.
Signed-off-by: Ding, Ke <ke.ding@intel.com>
* extend sigopt changes to test code and others..
Signed-off-by: Ding, Ke <ke.ding@intel.com>
* Style.
* fix style for sigopt integration.
Signed-off-by: Ding, Ke <ke.ding@intel.com>
* Add necessary information to run unittests on SigOpt.
Co-authored-by: Morgan Funtowicz <funtowiczmo@gmail.com>
2021-09-23 17:01:51 +02:00
Stas Bekman
62832c962f
1x model size CPU memory usage for `from_pretrained` ( #13466 )
...
* one possible solution
* low mem from_pretrained
* edge cases
* solve the persistent buffers
* style
* parametrize
* for later
* proper solution
* cleanup
* refactor; rework based on suggestions
* revert splitting into 2 parts, move checks into main func
2021-09-22 19:33:09 -07:00
Lysandre Debut
ca257a06cc
Fix torchscript tests ( #13701 )
2021-09-22 19:02:54 -04:00
Lysandre Debut
5b57075449
Add BlenderBot small tokenizer to the init ( #13367 )
...
* Add BlenderBot small tokenizer to the init
* Update src/transformers/__init__.py
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Style
* Bugfix
Co-authored-by: Suraj Patil <surajp815@gmail.com>
2021-09-22 19:00:47 -04:00
Gunjan Chhablani
9e0fd78051
Fix reference to tpu short seq length ( #13686 )
2021-09-22 18:36:24 -04:00
Suraj Patil
6dc41d9f8e
add a note about tokenizer ( #13696 )
2021-09-22 17:18:13 -04:00
Anton Lozhkov
7c7d2ec952
[GPT-J] Use the `float16` checkpoints in integration tests ( #13676 )
...
* Use fp16 checkpoints
* Style
* Fix outputs and disable OOM tests
* Correct another output
* Use a random smaller model for generation tests
* repo quickfix
* fix gradient checkpointing
2021-09-22 23:17:57 +03:00
Lysandre Debut
0ecdf6de03
Patch training arguments issue ( #13700 )
...
* Patch training arguments issue
* Update src/transformers/training_args.py
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2021-09-22 15:33:18 -04:00
Gunjan Chhablani
50c746eeb7
Allow only textual inputs to VisualBert ( #13687 )
2021-09-22 21:21:53 +05:30
Yih-Dar
93624bfee9
Fix non-negligible difference between GPT2 and TFGPT2 ( #13679 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2021-09-22 09:14:55 -04:00
MocktaiLEngineer
a0c08aa36c
Assertions to exceptions ( #13692 )
...
* Raise exceptions instead of using assertions for control flow #12789
* # coding=utf-8
* Raise exceptions instead of using assertions for control flow
* Raise exceptions instead of using assertions for control flow
* Update src/transformers/tokenization_utils.py
Raise exceptions instead of using assertions for control flow
Co-authored-by: Suraj Patil <surajp815@gmail.com>
* Update src/transformers/tokenization_utils.py
Raise exceptions instead of using assertions for control flow
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Raise exceptions instead of using assertions for control flow
* test
* Raise exceptions instead of using assertions for control flow
Co-authored-by: MocktaiLEngineer <kavinarasu22@gmail.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
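A minimal sketch of the pattern named in this commit, assuming a hypothetical validation helper. The names `check_vocab_size_with_assert` and `check_vocab_size` are illustrative and do not come from the changed files.

```python
def check_vocab_size_with_assert(vocab_size):
    # Before: an assertion used for control flow.
    # Assertions are skipped entirely when Python runs with the -O flag.
    assert vocab_size > 0, "vocab_size must be positive"
    return vocab_size


def check_vocab_size(vocab_size):
    # After: an explicit exception that is always raised on bad input.
    if vocab_size <= 0:
        raise ValueError(f"vocab_size must be positive, got {vocab_size}")
    return vocab_size


try:
    check_vocab_size(0)
except ValueError as err:
    print(err)  # vocab_size must be positive, got 0
```

Callers can then catch a specific exception type instead of a bare `AssertionError`.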
2021-09-22 09:14:29 -04:00
Sylvain Gugger
27d4639779
Make gradient_checkpointing a training argument ( #13657 )
...
* Make gradient_checkpointing a training argument
* Update src/transformers/modeling_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Update src/transformers/configuration_utils.py
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
* Fix tests
* Style
* document Gradient Checkpointing as a performance feature
* Small rename
* PoC for not using the config
* Adapt BC to new PoC
* Forgot to save
* Rollout changes to all other models
* Fix typo
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2021-09-22 07:51:38 -04:00
Anton Lozhkov
75f6641eaf
[Wav2Vec2FeatureExtractor] Fix `extractor.pad()` dtype backwards compatibility ( #13693 )
...
* Force dtype, add tests
* Local torch imports
* Remove unused logic (always ndarray)
2021-09-22 11:02:54 +02:00
Patrick von Platen
8e908c8c74
[AutoTokenizer] Allow creation of tokenizers by tokenizer type ( #13668 )
...
* up
* up
2021-09-22 00:29:38 +02:00
Patrick von Platen
2608944dc2
up ( #13688 )
2021-09-22 00:28:43 +02:00
Kamal Raj
8565d38f30
Update modeling_flax_wav2vec2.py ( #13680 )
...
Change conv kernel_size to a tuple.
Flax version 0.3.5 breaking change: https://github.com/google/flax/releases/tag/v0.3.5
2021-09-21 23:36:13 +02:00
Sylvain Gugger
d16bec9530
Skip FlaxWav2Vec2 test until fixed
2021-09-21 16:17:01 -04:00
Nishant Prabhu
ddd4d02f30
Layoutlm onnx support (Issue #13300 ) ( #13562 )
...
* Add support for exporting PyTorch LayoutLM to ONNX
* Added tests for converting LayoutLM to ONNX
* Add support for exporting PyTorch LayoutLM to ONNX
* Added tests for converting LayoutLM to ONNX
* cleanup
* Removed regression/ folder
* Add support for exporting PyTorch LayoutLM to ONNX
* Added tests for converting LayoutLM to ONNX
* cleanup
* Fixed import error
* Remove unnecessary import statements
* Changed max_2d_positions from class variable to instance variable of the config class
* Add support for exporting PyTorch LayoutLM to ONNX
* Added tests for converting LayoutLM to ONNX
* cleanup
* Add support for exporting PyTorch LayoutLM to ONNX
* cleanup
* Fixed import error
* Changed max_2d_positions from class variable to instance variable of the config class
* Use super class generate_dummy_inputs method
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* Add support for Masked LM, sequence classification and token classification
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
* Removed unnecessary import and method
* Fixed code styling
* Raise error if PyTorch is not installed
* Remove unnecessary import statement
Co-authored-by: Michael Benayoun <mickbenayoun@gmail.com>
2021-09-21 15:39:37 -04:00
Sylvain Gugger
b7d264be0d
Add push_to_hub to no_trainer examples ( #13659 )
...
* Add push_to_hub to no_trainer examples
* Quality
* Document integration
* Roll out to other examples
2021-09-21 13:13:30 -04:00
Stas Bekman
a722c301bf
[SinusoidalPositionalEmbedding] incorrect dtype when make_weights in forward ( #13665 )
2021-09-21 09:05:05 -07:00
Anton Lozhkov
1417978cd4
[SequenceFeatureExtractor] Rewrite padding logic from pure python to numpy ( #13650 )
...
* Test np padding
* Pass feature extraction tests
* Update type hints
* Fix flaky integration tests
* Try a more stable waveform
* Add to_numpy jax support
* int32 attention masks
* Refactor normalization tests
2021-09-21 17:10:13 +03:00
Kamal Raj
8d533e6ad6
Typo "UNKWOWN" -> "UNKNOWN" ( #13675 )
2021-09-21 09:11:26 -04:00
Kamal Raj
78807d86eb
[FLAX] Question Answering Example ( #13649 )
...
* flax qa example
* Updated README: Added Large model
* added utils_qa.py FULL_COPIES
* Updates:
1. Copyright Year updated
2. added dtype arg
3. passing seed and dtype to load model
4. Check eval flag before running eval
* updated README
* updated code comment
2021-09-21 18:34:48 +05:30
Kamal Raj
a2dec768a2
beit-flax ( #13515 )
...
* beit-flax
* updated FLAX_BEIT_MLM_DOCSTRING
* removed bool_masked_pos from classification
* updated Copyright
* code refactoring: x -> embeddings
* updated test: rm from_pt
* Update docs/source/model_doc/beit.rst
* model code dtype updates and other changes according to review
* relative_position_bias revert back to pytorch design
2021-09-21 13:34:19 +02:00
Patrick von Platen
48fa42e5d5
Add Speech AutoModels ( #13655 )
...
* upload
* correct
* correct
* correct
* finish
* up
* up
* up again
2021-09-21 08:50:33 +02:00
flozi00
ea92136597
Fix typo distilbert doc ( #13643 )
2021-09-20 15:10:33 -04:00
Lowin
28d5700aae
fix research_projects/mlm_wwm readme.md examples ( #13646 )
...
The variables in the run example are not correct
2021-09-20 15:01:35 -04:00
Sylvain Gugger
002a078aff
Dynamically load model code from the Hub ( #13467 )
...
* Dynamic model
* Use defensive flag
* Style
* Doc and arg rename
* Arg rename
* Add tests
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Address review comments
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
2021-09-20 13:59:21 -04:00
flozi00
aeb2dac04d
Change https:/ to https:// ( #13644 )
2021-09-20 12:31:46 -04:00
Stas Bekman
0af901e83f
[megatron_gpt2] checkpoint v3 ( #13508 )
...
* [megatron_gpt2] checkpoint v3
* bug fix
* fixes
* switch to default from - which is what the current megatron-lm uses
* cleanup
* back compat
2021-09-20 08:50:54 -07:00
Kamal Raj
936b3fdeaa
Update modeling_tf_deberta.py ( #13654 )
...
Fixed expand_dims axis
2021-09-20 11:11:04 -04:00
Ayaka Mikazuki
04976a32dc
Fix mT5 documentation ( #13639 )
...
* Fix MT5 documentation
The abstract is incomplete
* MT5 -> mT5
2021-09-20 07:53:31 -04:00
Chengjiang Li
fe379f856b
[Fix] Make sure the tb_writer argument passed to the TensorBoardCallback works ( #13636 )
2021-09-20 07:50:03 -04:00
Gunjan Chhablani
d8049331dc
Add FNet ( #13045 )
...
* Init FNet
* Update config
* Fix config
* Update model classes
* Update tokenizers to use sentencepiece
* Fix errors in model
* Fix defaults in config
* Remove position embedding type completely
* Fix typo and take only real numbers
* Fix type vocab size in configuration
* Add projection layer to embeddings
* Fix position ids bug in embeddings
* Add minor changes
* Add conversion script and remove CausalLM vestiges
* Fix conversion script
* Fix conversion script
* Remove CausalLM Test
* Update checkpoint names to dummy checkpoints
* Add tokenizer mapping
* Fix modeling file and corresponding tests
* Add tokenization test file
* Add PreTraining model test
* Make style and quality
* Make tokenization base tests work
* Update docs
* Add FastTokenizer tests
* Fix fast tokenizer special tokens
* Fix style and quality
* Remove load_tf_weights vestiges
* Add FNet to main README
* Fix configuration example indentation
* Comment tokenization slow test
* Fix style
* Add changes from review
* Fix style
* Remove bos and eos tokens from tokenizers
* Add tokenizer slow test, TPU transforms, NSP
* Add scipy check
* Add scipy availability check to test
* Fix tokenizer and use correct inputs
* Remove remaining TODOs
* Fix tests
* Fix tests
* Comment Fourier Test
* Uncomment Fourier Test
* Change to google checkpoint
* Add changes from review
* Fix activation function
* Fix model integration test
* Add more integration tests
* Add comparison steps to MLM integration test
* Fix style
* Add masked tokenization fix
* Improve mask tokenization fix
* Fix index docs
* Add changes from review
* Fix issue
* Fix failing import in test
* some more fixes
* correct fast tokenizer
* finalize
* make style
* Remove additional tokenization logic
* Set do_lower_case to False
* Allow keeping accents
* Fix tokenization test
* Fix FNet Tokenizer Fast
* fix tests
* make style
* Add tips to FNet docs
Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2021-09-20 13:24:30 +02:00
Suraj Patil
87d5057d86
fix typo ( #13647 )
2021-09-20 13:22:26 +05:30
calpt
b518aaf193
Fix GPT2Config parameters in GPT2ModelTester ( #13630 )
2021-09-17 15:36:23 -04:00
Lysandre Debut
300ee0c7b2
Updated tiny distilbert models ( #13631 )
2021-09-17 15:35:34 -04:00
Yih-Dar
afb07a79ab
fix some docstrings in encoder-decoder models ( #13611 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2021-09-17 17:39:35 +02:00
Alessandro Suglia
19b7acdd61
Cloned tensors after indexing in _compute_attn_output_with_global_indices ( #13613 )
...
Co-authored-by: Alessandro Suglia <asuglia@fb.com>
2021-09-17 17:05:49 +02:00
Alex Hedges
ce32c69c0b
Use `config_dict_or_path` for deepspeed.zero.Init ( #13614 )
2021-09-17 07:57:27 -07:00
Matt
0eb02871dd
Removed console spam from misfiring warnings ( #13625 )
...
* Removed misfiring warnings
* Revert "Removed misfiring warnings"
This reverts commit cea90de325056b9c1cbcda2bd2613a785c1639ce.
* Retain the warning, but only when the user actually overrides things
* Fix accidentally breaking just about every model on the hub simultaneously
* Style pass
2021-09-17 15:44:33 +01:00
Li-Huai (Allan) Lin
da8beaaf76
Fix special tokens not correctly tokenized ( #13489 )
...
* Fix special tokens not correctly tokenized
* Add testing
* Fix
* Fix
* Use user workflows instead of directly assigning variables
* Enable test of fast tokenizers
* Update test of canine tokenizer
2021-09-17 10:28:28 -04:00