Commit Graph

12128 Commits

bofeng huang c8545d2a9c
[Whisper] Add SpecAugment (#21298)
* Return and rescale attention_mask

* Add SpecAugment to Whisper modeling

* Fix test

* Update docstring

* Add SpecAug related parameters to model config

* Add the _mask_input_features function to doc

* Fix quality

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove dev comments

* Add test

* Resolve conflict

* feat: mask {feature, time} prob fast tests

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: sanchit-gandhi <sanchit@huggingface.co>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-02-24 11:07:52 +01:00
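Editor's note: the SpecAugment parameters land in the model config, so masking can be toggled per training run. A minimal sketch of what that might look like, assuming the Wav2Vec2-style parameter names (`apply_spec_augment`, `mask_time_prob`, `mask_feature_prob`) that this PR mirrors — treat the exact names as an assumption and verify against your installed version:

```python
# Hedged sketch: enabling SpecAugment-style masking through the Whisper config.
# Parameter names follow the Wav2Vec2 convention this PR mirrors; verify them
# against your installed transformers version before relying on them.
from transformers import WhisperConfig, WhisperForConditionalGeneration

config = WhisperConfig.from_pretrained(
    "openai/whisper-tiny",
    apply_spec_augment=True,   # assumed flag: turn masking on during training
    mask_time_prob=0.05,       # assumed: fraction of time steps to mask
    mask_feature_prob=0.05,    # assumed: fraction of feature (mel) bins to mask
)
model = WhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-tiny", config=config
)
model.train()  # masking is typically applied only in training mode
```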
Sanchit Gandhi 75bd49ff88
[Flax] Fix erroneous kwargs being passed to generate config (#21765) 2023-02-24 09:59:18 +01:00
Arthur 14f33205a7
Different behavior in DistilBERT when using "inputs_embeds" (#21752)
* Different behavior in DistilBERT when using "inputs_embeds"
Fixes #21089

* fix failing test
2023-02-24 09:48:07 +01:00
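Editor's note: `inputs_embeds` lets callers bypass the embedding lookup entirely; this fix aligns DistilBERT's handling with other models. A small usage sketch:

```python
# Passing precomputed embeddings to DistilBERT instead of token ids.
import torch
from transformers import AutoTokenizer, DistilBertModel

tokenizer = AutoTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertModel.from_pretrained("distilbert-base-uncased")

ids = tokenizer("hello world", return_tensors="pt").input_ids
embeds = model.get_input_embeddings()(ids)  # (1, seq_len, dim)
out = model(inputs_embeds=embeds)           # skips the embedding lookup
print(out.last_hidden_state.shape)
```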
Sanchit Gandhi 13489248fa
[Examples] Generalise run audio classification for log-mel models (#21756)
* [Examples] Generalise run audio classification for log-mel models

* batch feature extractor

* make style
2023-02-24 09:19:07 +01:00
Shubhamai f7ca656f07
[Flax] adding support for batch norm layers (#21581)
* [flax] adding support for batch norm layers

* fixing bugs related to pt+flax integration

* cleanup, batchnorm support in sharded pt to flax

* support for batchnorm tests in pt+flax integration

* simplifying checking batch norm layer
2023-02-24 08:47:33 +01:00
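Editor's note: batch norm layers carry non-trainable running statistics, which live in Flax's `batch_stats` collection — that separate collection is what the PT↔Flax weight conversion has to handle. A minimal generic flax.linen illustration (not transformers-specific code):

```python
# Minimal illustration of Flax batch norm state: running stats live in a
# separate "batch_stats" collection, distinct from trainable "params".
import jax
import jax.numpy as jnp
import flax.linen as nn

class TinyNet(nn.Module):
    @nn.compact
    def __call__(self, x, train: bool):
        x = nn.Dense(4)(x)
        # use_running_average=False updates stats; True uses the stored stats
        x = nn.BatchNorm(use_running_average=not train)(x)
        return x

model = TinyNet()
variables = model.init(jax.random.PRNGKey(0), jnp.ones((2, 8)), train=False)
print(variables.keys())  # dict_keys(['params', 'batch_stats'])
```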
Connor Henderson 279008adc3
fix: Change is_last chunk calc and add conditional break in chunk_iter (#21612)
* fix: Change is_last chunk calc and add conditional break

* format fix

* account for 0 and full stride_rights, add comment

* add new test

* make style

* update slow whisper asr test timestamps

* use nested_simplify on output and round timestamp to hundredths place
2023-02-24 08:30:32 +01:00
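Editor's note: the bug concerned when a chunk is flagged as the last one while iterating over long audio with left/right strides. A simplified, self-contained sketch of the pattern — illustrative only, not the library's exact `chunk_iter` — assuming a flat array and sample-level strides:

```python
# Simplified chunking sketch (illustrative, not transformers' exact chunk_iter):
# yield overlapping windows and stop once a window reaches the end of the input.
def chunk_iter(inputs, chunk_len, stride_left, stride_right):
    inputs_len = len(inputs)
    step = chunk_len - stride_left - stride_right
    for chunk_start in range(0, inputs_len, step):
        chunk = inputs[chunk_start : chunk_start + chunk_len]
        # is_last must account for the strides, otherwise an empty trailing
        # chunk can be produced -- the behavior this PR fixed.
        is_last = chunk_start + step >= inputs_len or len(chunk) < chunk_len
        yield chunk, is_last
        if is_last:
            break  # conditional break so iteration stops exactly here

for chunk, is_last in chunk_iter(list(range(10)), chunk_len=4, stride_left=1, stride_right=1):
    print(len(chunk), is_last)
```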
Clémentine Fourrier 4446b6b094
Graphormer fix (#21699)
* Removed useless check for backend

* fix style check for graphormer

* Reverted change and corrected requires_backend for cython

* code qual
2023-02-24 08:20:52 +01:00
Stas Bekman 633062639b
[deepspeed tests] fix issues introduced by #21700 (#21769)
* [deepspeed tests] fix issues introduced by #21700

* fix

* fix
2023-02-23 13:22:25 -08:00
Maria Khalusova 04d90ac49e
Auto api Value Error addition to Troubleshoot (#21708)
* troubleshooting guide: added an error description for missing auto-mapping

* minor polishing

* changed the example

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/troubleshooting.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-02-23 11:51:18 -05:00
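Editor's note: the troubleshooting entry covers the `ValueError` raised when a checkpoint's config has no mapping for the requested auto class. A hedged illustration of the failure mode (the exact error wording may differ by version):

```python
# Illustration of the auto-mapping ValueError the troubleshooting doc describes:
# asking an Auto class for a task the architecture has no mapping for.
from transformers import AutoModelForImageClassification

try:
    # DistilBERT is a text model; there is no image-classification mapping for it.
    AutoModelForImageClassification.from_pretrained("distilbert-base-uncased")
except ValueError as err:
    print(err)  # "Unrecognized configuration class ... for this kind of AutoModel ..."
```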
Batese2001 0ffa22f9f6
Added Type Hints for modeling_tf_encoder_decoder.py (#21673)
* Ran Black formatting

* Added imports and reformatted

* Update src/transformers/models/encoder_decoder/modeling_tf_encoder_decoder.py

---------

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2023-02-23 14:08:26 +00:00
ydshieh aa3787c8f0 Skip test_log_level for now 2023-02-23 12:11:20 +01:00
Joao Gante 1d4b797852
Generate: Fix GIT batched captioning (#21738) 2023-02-23 09:50:37 +00:00
Younes Belkada 78a93d17c0
[`GPTNeo`] Fix gradient checkpointing bug (#21733)
* fix bug

* forward contrib credits from discussions

* change logic

---------

Co-authored-by: edbeeching <edbeeching@users.noreply.github.com>
2023-02-23 09:48:19 +01:00
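Editor's note: gradient checkpointing bugs of this kind surface once checkpointing is switched on for training. For reference, the standard way to enable it (stock transformers API, not PR-specific):

```python
# Enabling gradient checkpointing on GPT-Neo -- the code path this fix touches.
from transformers import GPTNeoForCausalLM

model = GPTNeoForCausalLM.from_pretrained("EleutherAI/gpt-neo-125M")
model.gradient_checkpointing_enable()  # trades compute for memory during backprop
model.train()
```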
Yih-Dar 36a6a1adb6
Fix 2 quicktour file doctest (#21742)
* Update expected output values, as Hub repo files were updated

* Update expected output values, as librosa was bumped from 0.9.2 to 0.10.0 on the CI docker image

* fix

* update one more

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-23 09:41:28 +01:00
Yih-Dar ff143ae10e
Update doctest GH workflow file (#21744)
update

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-23 09:40:53 +01:00
Naga Sai Abhinay 448e050b0d
Make ImageProcessorMixin compatible with subfolder kwarg (#21725)
* Add subfolder support

* Add kwarg docstring

* formatting fix

* Add test
2023-02-23 09:28:18 +01:00
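Editor's note: with this change, image processors can be loaded from a repo subdirectory the same way tokenizers and models can. A sketch of the call — the repo name and folder below are hypothetical placeholders:

```python
# Loading an image processor from a subdirectory of a Hub repo.
# "my-org/multi-component-repo" and its "image_processor" folder are
# hypothetical placeholders, not a real checkpoint.
from transformers import AutoImageProcessor

processor = AutoImageProcessor.from_pretrained(
    "my-org/multi-component-repo",
    subfolder="image_processor",  # kwarg supported by this PR
)
```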
Thomas Paviot 064f374874
typos in french documentation (#21750) 2023-02-23 09:17:01 +01:00
Maria Khalusova 619d51e01f
Added "Open in Colab" to task guides (#21729)
added Open in Colab to task guides
2023-02-22 08:32:35 -05:00
Matt d913f4aa40
Fix to KerasMetricCallback when the model returns unstructured output (#21727)
* Stop doing dict-things to non-dict inputs

* Add a debug check

* Add a debug check

* Remove debug checks, looks good now!

* make fixup
2023-02-22 13:15:14 +00:00
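Editor's note: `KerasMetricCallback` feeds model predictions to a user metric function; the fix makes it tolerate models whose output is a bare tensor rather than a dict. A hedged usage sketch with a tiny stand-in dataset:

```python
# Hedged sketch of KerasMetricCallback usage; the callback now handles models
# that return a plain tensor instead of a dict of outputs.
import numpy as np
import tensorflow as tf
from transformers.keras_callbacks import KerasMetricCallback

def metric_fn(eval_predictions):
    predictions, labels = eval_predictions
    return {"accuracy": float(np.mean(np.argmax(predictions, axis=-1) == labels))}

# Tiny stand-in dataset: 8 examples, 4 features, binary labels.
eval_dataset = tf.data.Dataset.from_tensor_slices(
    (np.random.rand(8, 4).astype("float32"), np.random.randint(0, 2, size=(8,)))
).batch(4)

callback = KerasMetricCallback(metric_fn=metric_fn, eval_dataset=eval_dataset)
# pass to model.fit(..., callbacks=[callback]) on a Keras model whose output
# may be a dict, a tuple, or -- after this fix -- a bare tensor
```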
Sanchit Gandhi 82e61f3445
[SpeechT5HifiGan] Handle batched inputs (#21702)
* [SpeechT5HifiGan] Handle batched inputs

* fix docstring

* rebase and new ruff style
2023-02-22 11:16:56 +01:00
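Editor's note: after this change the vocoder accepts both a single spectrogram and a batch. A hedged shape sketch (80 mel bins is the SpeechT5 default; verify against the checkpoint config):

```python
# SpeechT5HifiGan now handles batched spectrograms as well as single ones.
import torch
from transformers import SpeechT5HifiGan

vocoder = SpeechT5HifiGan.from_pretrained("microsoft/speecht5_hifigan")

single = torch.randn(100, 80)      # (time, num_mels) -> 1D waveform
batched = torch.randn(2, 100, 80)  # (batch, time, num_mels) -> (batch, samples)
print(vocoder(single).shape, vocoder(batched).shape)
```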
Yih-Dar 09127c5713
Fix `GPTSanJapaneseModel` (#21731)
* fix

* skip test_model_parallelism

* skip test_model_parallelism

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-22 11:09:04 +01:00
Yih-Dar aff87da15b
Fix `ErnieMEmbeddings` device issue (#21726)
* remove .parameters()).device

* fix

* fix

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-22 10:57:34 +01:00
Yih-Dar 2f2b19ff40
Change doc example for `BigBirdForQuestionAnswering` (#21723)
Change doc example for BigBirdForQuestionAnswering

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-22 10:55:12 +01:00
Yih-Dar 354b338316
Remove `gptsan_japanese` from doctest list to avoid GPU OOM (#21722)
remove from doctest list to avoid GPU OOM

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-22 10:51:00 +01:00
Sylvain Gugger b19d64d852
Respect documentation on passive log level (#21700)
* Respect documentation on passive log level

* Fix test and set log level in examples

* Add doc
2023-02-22 09:39:18 +01:00
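Editor's note: "passive" means the Trainer leaves the transformers logging verbosity alone unless `log_level` is set explicitly — the behavior the documentation promised. A sketch of both sides of the contract:

```python
# log_level="passive" (the default) leaves library verbosity untouched;
# an explicit value makes the Trainer set it.
from transformers import TrainingArguments
from transformers.utils import logging

logging.set_verbosity_info()  # user-chosen verbosity

args_passive = TrainingArguments(output_dir="out")  # log_level="passive": respected
args_explicit = TrainingArguments(output_dir="out", log_level="warning")  # overrides it
```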
Sylvain Gugger ee6e71e29c
Fix quality 2023-02-22 03:36:15 -05:00
Younes Belkada 24b930ad1d
[`MBart`] Fix cross attention mask check (#21730)
fix typo
2023-02-22 09:21:25 +01:00
Aaron Gokaslan 5e8c8eb5ba
Apply ruff flake8-comprehensions (#21694) 2023-02-22 09:14:54 +01:00
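Editor's note: flake8-comprehensions (ruff's `C4` rules) rewrites needless list/dict/set constructions. A before/after pair of the sort this PR applied:

```python
# Typical flake8-comprehensions (ruff C4) rewrites applied across the codebase.
words = ["a", "b", "a"]

# before: unnecessary list comprehension inside set()
upper_before = set([w.upper() for w in words])
# after: a set comprehension, no intermediate list
upper_after = {w.upper() for w in words}

# before: dict() call fed a list of tuples
lengths_before = dict([(w, len(w)) for w in words])
# after: a dict comprehension
lengths_after = {w: len(w) for w in words}
```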
Kashif Rasul df06fb1f0b
Time series transformer: input projection and Std scaler (#21020)
* added loc and scale outputs from scalers

* fix typo

* fix tests

* fixed formatting

* initial StdScaler

* move scaling to optional str

* calculate std feature for scalers

* undid change as it does not help

* added StdScaler with weights

* added input projection layer and d_model hyperparam

* use linear proj

* add back layernorm_embedding

* add sin-cos pos embeddings

* updated scalers

* formatting

* fix type

* fixed test

* fix repeated_past_values calculation

* fix when keepdim=false

* fix default_scale

* backward compatibility of scaling config

* update integration test expected output

* fix style

* fix docs

* use the actual num_static_real_features in the feature_dim calculation

* clarified docs

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* prediction_length is not optional

* fix for reviewer

* Update src/transformers/models/time_series_transformer/configuration_time_series_transformer.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* get rid of unneeded new lines

* fix doc

* remove unneeded new lines

* fix style

* static_categorical_features and static_real_features are optional

* fix integration test

* Update src/transformers/models/time_series_transformer/modeling_time_series_transformer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixing docs for multivariate setting

* documentation for generate

---------

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-02-22 07:50:13 +01:00
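Editor's note: the `StdScaler` normalizes each series by an observation-weighted mean and standard deviation and returns the `loc`/`scale` pair for the model's distribution head. A simplified standalone sketch of the computation, not the library class itself:

```python
# Simplified, standalone sketch of a weighted std scaler (illustrative only):
# normalize each series and return loc/scale, masking out unobserved values.
import torch

def std_scale(data, observed_mask, dim=1, keepdim=True, min_scale=1e-10):
    denominator = observed_mask.sum(dim, keepdim=keepdim).clamp_min(1.0)
    loc = (data * observed_mask).sum(dim, keepdim=keepdim) / denominator
    variance = (((data - loc) * observed_mask) ** 2).sum(dim, keepdim=keepdim) / denominator
    scale = torch.sqrt(variance + min_scale)
    return (data - loc) / scale, loc, scale

x = torch.randn(2, 50)     # (batch, time)
mask = torch.ones_like(x)  # all values observed
scaled, loc, scale = std_scale(x, mask)
print(loc.shape, scale.shape)  # torch.Size([2, 1]) twice
```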
mollerup23 bb5a2f2fc3
Adding type hints to call() functions in this file (#21548)
* Adding type hints to call() functions in this file

* make fixup

* Update src/transformers/models/marian/modeling_tf_marian.py

* Update src/transformers/models/marian/modeling_tf_marian.py

* Update src/transformers/models/marian/modeling_tf_marian.py

* Update src/transformers/models/marian/modeling_tf_marian.py

* Update src/transformers/models/marian/modeling_tf_marian.py

* Update src/transformers/models/marian/modeling_tf_marian.py

* Update src/transformers/models/marian/modeling_tf_marian.py

* Update src/transformers/models/marian/modeling_tf_marian.py

---------

Co-authored-by: Matt <rocketknight1@gmail.com>
Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2023-02-21 16:28:33 +00:00
Maria Khalusova 78a53d59cb
Adding task guides to resources (#21704)
* added resources: links to task guides that support these models

* minor polishing

* conflict resolved

* link fix

* Update docs/source/en/model_doc/vision-encoder-decoder.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-02-21 10:35:11 -05:00
Yih-Dar 03aaac3502
Fix TVLT (torch device issue) (#21710)
* fix tvlt ci

* fix tvlt ci

* fix tvlt ci

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-21 11:37:49 +01:00
Yih-Dar 4c6346cc3e
Fix `get_class_in_module` (#21709)
Fix get_class_in_module

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-21 09:39:15 +01:00
Yih-Dar ed6ceb7649
Fix typo in `PROCESSOR_MAPPING_NAMES` and add tests (#21703)
* Add test

* Fix GITProcessor

* Update

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-21 09:38:26 +01:00
Arthur 4deaa534f5
remove position ids and token type ids from forward args in docstring (#21701) 2023-02-21 07:01:36 +01:00
Ishan Jindal c40e3581c7
Fix axial positional encoding calculations for reformer.mdx (#21649)
* Update reformer.mdx

Fix axial positional encoding calculations

* Update docs/source/en/model_doc/reformer.mdx

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-02-21 06:59:51 +01:00
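Editor's note: the corrected docs describe how axial position encodings factor the sequence axis. A worked example of the constraints: `axial_pos_shape` must multiply to the (padded) sequence length, and `axial_pos_embds_dim` must sum to the hidden size:

```python
# Worked example of Reformer's axial position encoding constraints.
from transformers import ReformerConfig

config = ReformerConfig(
    axial_pos_shape=(64, 64),       # 64 * 64 == 4096 == max sequence length
    axial_pos_embds_dim=(64, 192),  # 64 + 192 == 256 == hidden_size
    hidden_size=256,
    max_position_embeddings=4096,
)
# Memory saving: two factored tables of 64*64 + 64*192 = 16,384 parameters
# instead of one full 4096 * 256 = 1,048,576-entry table.
assert config.axial_pos_shape[0] * config.axial_pos_shape[1] == config.max_position_embeddings
assert sum(config.axial_pos_embds_dim) == config.hidden_size
```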
Jonatan Kłosko deafc24388
Add WhisperTokenizerFast (#21222)
* Add WhisperTokenizerFast

* Fixup

* Up

* Up

* Improve tests

* Update src/transformers/models/whisper/tokenization_whisper_fast.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Keep stride in whisper pipeline test

* Remove unknown token special case

* Reduce vocabulary size in tests

* Fix vocab size assertion

* Sync copied changes from WhisperTokenizer

* Skip pipeline tests

* Update assertion

* Remove Whisper tokenizer dependency on sentencepiece

* Format

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-02-21 06:58:54 +01:00
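Editor's note: the fast tokenizer is a drop-in replacement for `WhisperTokenizer`, backed by the tokenizers library instead of sentencepiece. Basic usage:

```python
# WhisperTokenizerFast is a drop-in, Rust-backed replacement for WhisperTokenizer.
from transformers import WhisperTokenizerFast

tokenizer = WhisperTokenizerFast.from_pretrained("openai/whisper-tiny")
ids = tokenizer("hello world").input_ids
print(tokenizer.decode(ids, skip_special_tokens=True))
```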
Sylvain Gugger 8b3db33a76
Pass along revision in dynamic code fetch (#21698) 2023-02-20 21:21:42 +01:00
Arthur 4194e5f42b
Fix-rag-finetune-project-requirement (#21697)
pin pytorch lightning requirement
2023-02-20 17:23:39 +01:00
Alara Dirik 49ab16239c
Add EfficientNet (#21563)
* Add EfficientNet to transformers
2023-02-20 16:37:11 +03:00
Younes Belkada c9a0671477
[`bnb`] fix `bnb` decoders bug (#21688)
* fix `bnb` decoders bug

* make fixup
2023-02-20 12:21:58 +00:00
tanreinama f56174ac5b
add GPTSAN model (reopen) (#21291)
* add GPTSAN-Japanese

* add GPTSAN

* add GPTSAN (update for review)

* add GPTSAN

* fix typo in comment text

* add GPTSAN

* fix document and comments

* fix class name GPTSAN->GPTSan

* fix import and test for tokenizer
Sylvain Gugger c87bbe1ff0
Fix quality 2023-02-20 03:27:09 -05:00
Morgan McGuire 011cc17a81
Fix for non-contiguous label tensors in VisionEncoderDecoder (#21582)
* add prints

* add shape

* add reshape

* clean up
2023-02-20 09:23:46 +01:00
Andy Ehrenberg 2840272c5f
add flax whisper implementation (#20479)
* add flax whisper implementation

* revert change to setup

* remove unused imports

* revert generation changes

* flax whisper docs

* docs

* import order

* import sorting

* isort

* add dummy objects

* doc formatting

* formatting

* remove trailing whitespaces

* fix flax whisper docs

* add generation logic to unlock flax whisper

* remove scans

* give credits to Flax Bart implementation

* remove unused imports

* add license

* remove assert

* more credits to Bart

* fix style

* formatting

* support left padding

* add flax whisper generation test

* remove copied from comments whenever not a full copy

* fix docstrings for logits processors

* revert change to FlaxForceTokensLogitsProcessor

* revert doc changes

* improve generation docs

* reorganize

* formatting

* cleanup docs

* add tests

* handle empty list case

* fix forced decoder ids in flax tests

* add flax whisper to inits

* update dummy objects

* docs for FlaxAutoModelForSpeechSeq2Seq

* fix decoder_position_ids computation in pretrained model decode/__call__ fns

* add Copied from statements as necessary

* compute position_ids only in __call__ and decode methods of pretrained model subclasses

* improve readability of compute positional embeddings

* check dimensionality of input_features instead of hidden_states

* copied from statement for init_cache

* formatting

* fix copies

* fix copies

* pass attention mask to encoder layers

* fix decoder module outputs

* set dtype

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* smaller flax model for whisper test

* Update src/transformers/generation/flax_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/whisper/modeling_flax_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/models/whisper/test_modeling_flax_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* cleanup

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/whisper/modeling_flax_whisper.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* bias cleanup

* doc fix

* align style for force tokens processor

* readability

* fix input shape in tests

* revert FlaxGenerationMixin docstring

* formatting

* fix tests

* fix imports

* consistent encoder hidden states

* consistent hidden states

* input shapes

* typo

* partial class trick

* partial class for input shape

* base_class with correct input shape

* partial base classes

* match by name

* set main_input_name

* compare on names

* formatting

* remove unused import

* safer position ids computation

* safer position id computation

* Update src/transformers/models/whisper/modeling_flax_whisper.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update src/transformers/models/whisper/modeling_flax_whisper.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* remove identical inherited tests

* fix prompt ids in tests

* use generation config

* use jnp array

* better var names

* more explicit bias use

* import transformers

* formatting

* test formatting

* remove unused imports

* remove unused imports

* formatting

* isort

* docs

* fix ln orders for encoder hidden states

* whisper unique generation stuff

* flake

* use finfo for attention bias

* docs

* Update src/transformers/generation/flax_utils.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* docs

* add timestamp flax test

* jit for timestamps

* formatting

* clean up timestamps processor

* formatting

* remove if_true

* cleanup

---------

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
2023-02-20 09:17:40 +01:00
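Editor's note: with the Flax port in place, Whisper runs through the usual Flax model classes. A minimal transcription sketch; `from_pt=True` assumes no native Flax weights are published for the checkpoint, and the positional `generate` argument should be verified against your version:

```python
# Minimal Flax Whisper transcription sketch (hedged: verify generate() usage
# and weight availability against your installed transformers version).
import numpy as np
from transformers import WhisperProcessor, FlaxWhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = FlaxWhisperForConditionalGeneration.from_pretrained(
    "openai/whisper-tiny", from_pt=True
)

audio = np.zeros(16000, dtype=np.float32)  # 1 s of silence as a stand-in input
inputs = processor(audio, sampling_rate=16000, return_tensors="np")
generated = model.generate(inputs.input_features)
print(processor.batch_decode(generated.sequences, skip_special_tokens=True))
```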
AlexWertheim 7735e0406f
Enable PyTorch/XLA Fully Sharded Data Parallel (FSDP) (#21406)
* Reinserted import statement accidentally removed during rebasing.

* Added auto_wrap functionality, restructured XLA FSDP logic to more closely match PyTorch FSDP logic.

* Fixed flag descriptions; changed several instances of fsdp_ to xla_fsdp_; pass in auto_wrap_policy and auto_wrapper_callable directly to avoid lambda saving.

* Moved XLA FSDP logic to be adjacent to Fairscale FSDP logic in trainer.

* Formatted changes in accordance with HF style requirements.

* Added back in warning which was accidentally removed.

* - Merged XLA FSDP training arguments into `fsdp_config`
- Added `xla` boolean flag to `fsdp_config` to specify XLA FSDP wrapping
- Merged XLA FSDP wrapping logic into FSDP wrapping logic within trainer
  class

* Cleaned up errors, moved argument to fsdp_config

- Set `xla` and `xla_fsdp_grad_ckpt` flags by default in fsdp_config
- Added missing colons following conditionals
- Moved `fsdp_transformer_layer_cls_to_wrap` to `fsdp_config`
- Modified `fsdp_transformer_layer_cls_to_wrap` to be list of strings,
  not just one string
- Changed Fairscale FSDP logic to allow for set of layer classes to wrap
- Removed unnecessary checks for `xla_fsdp`

* Corrected small errors, improved layer class flag

- Correctly set default values for `xla` and `xla_fsdp_grad_ckpt`
  arguments
- Made `fsdp_transformer_layer_cls_to_wrap` a list of strings instead of
  a single string
- Added processing to ensure that `fsdp_transformer_layer_cls_to_wrap`
  works as expected if passed as a single string
- Updated PyTorch FSDP logic to accept a list of layers to wrap, as done
  with XLA FSDP
- Replaced instances of `getattr()` with `.get()` for dictionary
  retrievals with default values, including when setting
  `fsdp_min_num_params`
- Corrected `self.fsdp is not None` to `len(self.fsdp) > 0`
- Removed extraneous `xla_fsdp` argument descriptions from outside
  `fsdp_config`

* Changed xla-fsdp-settings to be dictionary

- Modified xla-fsdp-settings to be entered directly as dictionary
  instead of loaded through JSON file
- Made small style corrections

* Reverted unintentional local_rank TPU check

* Do not block XLA FSDP if local rank is -1

* Rebased and applied automatic formatting

- Rebased
- Applied automatic formatting changes via `make style`

* Applied automatic formatting with latest version of black

* Replaced  expression with

* Reran `black examples tests src utils`, `ruff examples tests src utils --fix`, and `make autogenerate_code` after additional formatting changes

* Additional automatic formatting changes

* Remove unnecessary whitespace characters from src/transformers/training_args.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-02-20 09:06:23 +01:00
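Editor's note: per the commit messages above, the XLA FSDP switches live inside `fsdp_config` alongside the existing FSDP options. A hedged sketch of enabling XLA FSDP through `TrainingArguments` — key names are taken from the messages above, so confirm them against your transformers version:

```python
# Hedged sketch of XLA FSDP configuration via fsdp_config (key names taken
# from the commit messages above; confirm against your installed version).
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="out",
    fsdp="full_shard",
    fsdp_config={
        "xla": True,                 # opt in to XLA FSDP wrapping
        "xla_fsdp_grad_ckpt": True,  # gradient checkpointing under XLA FSDP
        # now a list of strings rather than a single string:
        "fsdp_transformer_layer_cls_to_wrap": ["GPT2Block"],
    },
)
```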
Yih-Dar 7f1cdf1895
Fix dynamic module import error (#21646)
* fix dynamic module import error

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-02-17 21:22:39 +01:00
Younes Belkada 8a4c319d33
[`BLIP`] update blip path on slow tests (#21476)
* update blip path

* Update tests/models/blip/test_modeling_blip.py
2023-02-17 18:26:36 +00:00
Younes Belkada 087fd5f368
[`ImageProcessor`] Refactor default `mean` & `std` to `OPENAI_CLIP_MEAN` & `OPENAI_CLIP_STD` (#21425)
* fix default value

* add the fix on other models
2023-02-17 18:57:05 +01:00
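Editor's note: for reference, the constants this refactor standardizes on are the CLIP normalization statistics exported from `transformers.image_utils`:

```python
# The shared CLIP normalization constants introduced as defaults by this PR.
from transformers.image_utils import OPENAI_CLIP_MEAN, OPENAI_CLIP_STD

print(OPENAI_CLIP_MEAN)  # [0.48145466, 0.4578275, 0.40821073]
print(OPENAI_CLIP_STD)   # [0.26862954, 0.26130258, 0.27577711]
```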
Joao Gante 005b515754
Generate: eta sampling numerical stability (#21676) 2023-02-17 17:09:37 +00:00