273 Commits

Author SHA1 Message Date
Julien Chaumond 9129fd0377
`transformers-cli login` => `huggingface-cli login` (#18490)
* zero chance anyone's using that constant no?

* `transformers-cli login` => `huggingface-cli login`

* `transformers-cli repo create` => `huggingface-cli repo create`

* `make style`
2022-08-06 09:42:55 +02:00
LSinev 02b176c4ce
Fix torch version comparisons (#18460)
Comparisons like
version.parse(torch.__version__) > version.parse("1.6")
are True for torch==1.6.0+cu101 or torch==1.6.0+cpu

version.parse(version.parse(torch.__version__).base_version) are preferred (and available in pytorch_utils.py)
2022-08-03 13:37:18 -04:00
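The pitfall described in the commit above can be reproduced without installing PyTorch, since it comes from how `packaging` orders versions that carry a local build tag. The following is a minimal sketch, not the code from the PR: the helper names `is_newer_than` and `base_is_newer_than` are hypothetical, and plain version strings stand in for `torch.__version__`.

```python
from packaging import version


def is_newer_than(installed: str, minimum: str) -> bool:
    """Naive check: the local build tag (e.g. +cu101) takes part in the ordering."""
    return version.parse(installed) > version.parse(minimum)


def base_is_newer_than(installed: str, minimum: str) -> bool:
    """Check on base_version, which drops the local build tag before comparing."""
    base = version.parse(version.parse(installed).base_version)
    return base > version.parse(minimum)


# Stand-in for torch.__version__ on a CUDA build of 1.6.0 (assumed value).
cuda_build = "1.6.0+cu101"

# The naive check reports 1.6.0+cu101 as strictly newer than 1.6.
print(is_newer_than(cuda_build, "1.6"))

# The base_version check treats 1.6.0+cu101 as equal to 1.6, so it is not newer.
print(base_is_newer_than(cuda_build, "1.6"))

# A genuinely newer release still passes the base_version check.
print(base_is_newer_than("1.7.1+cpu", "1.6"))
```

Comparing on `base_version` keeps the intent of "newer than release X" independent of which CPU or CUDA build happens to be installed.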
Yih-Dar bd6d1b4300
Add a check regarding the number of occurrences of ``` (#18389)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-08-01 14:23:02 +02:00
Yulv-git 95113d1365
Fix some typos. (#17560)
* Fix some typos.

Signed-off-by: Yulv-git <yulvchi@qq.com>

* Fix typo.

Signed-off-by: Yulv-git <yulvchi@qq.com>

* make fixup.
2022-07-11 05:00:13 -04:00
Sanchit Gandhi 485bbe79d5
[Flax] Add remat (gradient checkpointing) (#17843)
* [Flax] Add remat (gradient checkpointing)

* fix variable naming in test

* flip: checkpoint using a method

* fix naming

* fix class naming

* apply PVP's suggestions from code review

* make fix-copies

* fix big-bird, electra, roberta

* cookie-cutter

* fix flax big-bird

* move test to common
2022-07-01 18:33:54 +01:00
Leon Derczynski b8142753f9
Add missing comment quotes (#17379) 2022-06-29 06:16:36 -04:00
Yih-Dar d3cb28886a
Not use -1e4 as attn mask (#17306)
* Use torch.finfo(self.dtype).min

* for GPTNeoX

* for Albert

* For Splinter

* Update src/transformers/models/data2vec/modeling_data2vec_audio.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix -inf used in Bart-like models

* Fix a few remaining -inf

* more fix

* clean up

* For CLIP

* For FSMT

* clean up

* fix test

* Add dtype argument and use it for LayoutLMv3

* update FlaxLongT5Attention

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-06-20 16:16:16 +02:00
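One way to see why a fixed `-1e4` can be too weak as an additive attention mask is to compare it with the smallest finite value of the dtype. The sketch below is an illustration under assumptions, not the code from the PR: it uses NumPy's `np.finfo` as a stand-in for `torch.finfo`, and `masked_softmax` is a hypothetical helper with a deliberately extreme score on the masked position.

```python
import numpy as np


def masked_softmax(scores: np.ndarray, keep: np.ndarray, fill: float) -> np.ndarray:
    """Softmax over `scores` after adding `fill` to every position where keep is False."""
    additive = np.where(keep, 0.0, fill).astype(scores.dtype)
    shifted = scores + additive
    shifted = shifted - shifted.max()  # subtract the max for numerical stability
    weights = np.exp(shifted)
    return weights / weights.sum()


# Two positions; the second one is masked out but has a very large raw score.
scores = np.array([0.0, 20000.0], dtype=np.float32)
keep = np.array([True, False])

# With -1e4 the masked score is still 10000 above the kept one,
# so the attention weight ends up on the masked position.
weak = masked_softmax(scores, keep, -1e4)

# With the dtype's smallest finite value the masked position gets zero weight.
strong = masked_softmax(scores, keep, np.finfo(scores.dtype).min)

print(weak, strong)
```

Because the fill value is derived from the dtype, the same expression also works for lower-precision dtypes such as `float16`, where the smallest finite value is much closer to zero than in `float32`.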
Joao Gante 132402d752
TF: BART compatible with XLA generation (#17479)
* Also propagate changes to blenderbot, blenderbot_small, marian, mbart, and pegasus
2022-06-20 11:07:46 +01:00
Ayush Mangal a5282ab4bc
Fix typo in adding_a_new_model README (#17679) 2022-06-13 03:22:07 -04:00
Sylvain Gugger 3cab90279f
Add examples telemetry (#17552)
* Add examples telemetry

* Alternative approach

* Add to all other examples

* Add to templates as well

* Put framework separately

* Same for TensorFlow
2022-06-07 11:57:52 -04:00
cloudhan e86faecfd4
Fix obvious typos in flax decoder impl (#17279)
Change config.encoder_ffn_dim -> config.decoder_ffn_dim for decoder.
2022-05-16 13:08:04 +02:00
Suraj Patil 9bd67ac7bb
update BART docs (#17212) 2022-05-12 19:25:16 +01:00
Sylvain Gugger 4ad2f68e34
Fix template init (#17163) 2022-05-10 15:24:23 -04:00
Dom Miketa df735d1317
[WIP] Fix Pyright static type checking by replacing if-else imports with try-except (#16578)
* rebase and isort

* modify cookiecutter init

* fix cookiecutter auto imports

* fix clean_frameworks_in_init

* fix add_model_to_main_init

* blackify

* replace unnecessary f-strings

* update yolos imports

* fix roberta import bug

* fix yolos missing dependency

* fix add_model_like and cookiecutter bug

* fix repository consistency error

* modify cookiecutter, fix add_new_model_like

* remove stale line

Co-authored-by: Dom Miketa <dmiketa@exscientia.co.uk>
2022-05-09 11:28:53 -04:00
Pavel Belevich 39f8eafc1b
Remove device parameter from create_extended_attention_mask_for_decoder (#16894) 2022-05-03 11:06:11 -04:00
Yih-Dar 19420fd99e
Move test model folders (#17034)
* move test model folders (TODO: fix imports and others)

* fix (potentially partially) imports (in model test modules)

* fix (potentially partially) imports (in tokenization test modules)

* fix (potentially partially) imports (in feature extraction test modules)

* fix import utils.test_modeling_tf_core

* fix path ../fixtures/

* fix imports about generation.test_generation_flax_utils

* fix more imports

* fix fixture path

* fix get_test_dir

* update module_to_test_file

* fix get_tests_dir from wrong transformers.utils

* update config.yml (CircleCI)

* fix style

* remove missing imports

* update new model script

* update check_repo

* update SPECIAL_MODULE_TO_TEST_MAP

* fix style

* add __init__

* update self-scheduled

* fix add_new_model scripts

* check one way to get location back

* python setup.py build install

* fix import in test auto

* update self-scheduled.yml

* update slack notification script

* Add comments about artifact names

* fix for yolos

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-05-03 14:42:02 +02:00
Sanchit Gandhi cd9274d010
[FlaxBert] Add ForCausalLM (#16995)
* [FlaxBert] Add ForCausalLM

* make style

* fix output attentions

* Add RobertaForCausalLM

* remove comment

* fix fx-to-pt model loading

* remove comment

* add modeling tests

* add enc-dec model tests

* add big_bird

* add electra

* make style

* make repo-consistency

* add to docs

* remove roberta test

* quality

* amend cookiecutter

* fix attention_mask bug in flax bert model tester

* tighten pt-fx thresholds to 1e-5

* add 'copied from' statements

* amend 'copied from' statements

* amend 'copied from' statements

* quality
2022-05-03 11:26:19 +02:00
yujun bdd690a74d
add torch.no_grad when in eval mode (#17020)
* add torch.no_grad when in eval mode

* make style quality
2022-05-02 07:49:19 -04:00
Joao Gante e03966e404
TF: XLA stable softmax (#16892)
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-25 20:10:51 +01:00
Suraj Patil d3bd9ac728
[Flax] improve large model init and loading (#16148)
* begin do_init

* add params_shape_tree

* raise error if params are accessed when do_init is False

* don't allow do_init=False when keys are missing

* make shape tree a property

* assign self._params at the end

* add test for do_init

* add do_init arg to all flax models

* fix param setting

* disable do_init for composite models

* update test

* add do_init in FlaxBigBirdForMultipleChoice

* better names and errors

* improve test

* style

* add a warning when do_init=False

* remove extra if

* set params after _required_params

* add test for from_pretrained

* do_init => _do_init

* change warning to info

* fix typo

* add params in init_weights

* add params to gpt neo init

* add params to init_weights

* update do_init test

* Trigger CI

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update template

* trigger CI

* style

* style

* fix template

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2022-04-19 14:19:55 +02:00
Anmol Joshi a315988bae
Moved functions to pytorch_utils.py (#16625)
* Moved functions to pytorch_utils.py

* isort formatting

* Reverted tf changes

* isort, make fix-copies

* documentation fix

* Fixed Conv1D import

* Reverted research examples file

* backward compatibility for pytorch_utils

* missing import

* isort fix
2022-04-12 12:38:50 -04:00
Matt 4354005291
Adding new train_step logic to make things less confusing for users (#15994)
* Adding new train_step logic to make things less confusing for users

* DO NOT ASK WHY WE NEED THAT SUBCLASS

* Metrics now working, at least for single-output models with type annotations!

* Updates and TODOs for the new train_step

* Make fixup

* Temporary test workaround until T5 has types

* Temporary test workaround until T5 has types

* I think this actually works! Needs a lot of tests though

* Make style/quality

* Revert changes to T5 tests

* Deleting the aforementioned unmentionable subclass

* Deleting the aforementioned unmentionable subclass

* Adding a Keras API test

* Style fixes

* Removing unneeded TODO and comments

* Update test_step too

* Stop trying to compute metrics with the dummy_loss, patch up test

* Make style

* make fixup

* Docstring cleanup

* make fixup

* make fixup

* Stop expanding 1D input tensors when using dummy loss

* Adjust T5 test given the new compile()

* make fixup

* Skipping test for convnext

* Removing old T5-specific Keras test now that we have a common one

* make fixup

* make fixup

* Only skip convnext test on CPU

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/modeling_tf_utils.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Avoiding TF import issues

* make fixup

* Update compile() to support TF 2.3

* Skipping model.fit() on template classes for now

* Skipping model.fit() on template class tests for now

* Replace ad-hoc solution with find_labels

* make fixup

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-05 14:23:27 +01:00
SaulLu 02214cb3cc
add a template to add missing tokenization test (#16553)
* add a template to add missing tokenization test

* add cookiecutter setting

* improve doc

* Update templates/adding_a_missing_tokenization_test/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-04-05 10:50:22 +02:00
Joao Gante dad5ca83b2
TF: Finalize `unpack_inputs`-related changes (#16499)
* Add unpack_inputs to remaining models

* removed kwargs to `call()` in TF models

* fix TF T5 tests
2022-04-04 16:37:33 +01:00
Yih-Dar 2199382dfd
Use random_attention_mask for TF tests (#16517)
* use random_attention_mask for TF tests

* Fix for TFCLIP test (for now).

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-04-01 16:53:07 +02:00
Joao Gante c2f8eaf6bc
TF: unpack inputs on Convbert, GPTJ, LED, and templates (#16491)
* Add unpack_inputs to remaining models

* remove stray use of inputs in the templates; fix tf.debugging of attn masks
2022-03-30 17:12:27 +01:00
Sylvain Gugger 088c1880b7
Big file_utils cleanup (#16396)
* Big file_utils cleanup

* This one still needs to be treated separately
2022-03-25 07:25:20 -04:00
Sylvain Gugger 4975002df5
Reorganize file utils (#16264)
* Split file_utils in several submodules

* Fixes

* Add back more objects

* More fixes

* Who exactly decided to import that from there?

* Second suggestion to code with code review

* Revert wrong move

* Fix imports

* Adapt all imports

* Adapt all imports everywhere

* Revert this import, will fix in a separate commit
2022-03-23 10:26:33 -04:00
Lysandre Debut eca77f4719
Updates the default branch from master to main (#16326)
* Updates the default branch from master to main

* Links from `master` to `main`

* Typo

* Update examples/flax/README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-03-23 03:46:59 -04:00
Jacob Dineen ec3aace0ae
Add type annotations for Rembert/Splinter and copies (#16338)
* undo black autoformat

* minor fix to rembert forward with default

* make fix-copies, make quality

* Adding types to template model

* Removing List from the template types

* Remove `Optional` from a couple of types that don't accept `None`

Co-authored-by: matt <rocketknight1@gmail.com>
2022-03-22 20:07:48 +00:00
Robot Jelly d50f62f2de
added type hints for BART model (#16270)
* added type hints for BART model

* make fixup, adding imports to copied files

* Adding some missing types to cookiecutter

* Adding some missing types to cookiecutter

* Adding some missing types to cookiecutter

Co-authored-by: matt <rocketknight1@gmail.com>
2022-03-21 15:18:01 +00:00
Sanchit Gandhi ee27b3d7df
Replace all deprecated `jax.ops` operations with jnp's `at` (#16078)
* Replace all deprecated `jax.ops` operations with jnp's `at`

* np to jnp scores

* suggested changes
2022-03-16 09:08:55 +00:00
Joao Gante 70203b5937
TF generate refactor - past without encoder outputs (#15944)
* Remove packed past from generation_tf_utils

* update models with the new past format

* update template accordingly
2022-03-08 14:46:44 +00:00
Yih-Dar f0aacc140b
Do not change the output from tuple to list - to match PT's version (#15918)
* Do not change the output from tuple to list - to match PT's version

* Fix the same issues for 5 other models and the template

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-03-04 17:50:24 +01:00
Yih-Dar 8635407bc7
Fix tf.concatenate + test past_key_values for TF models (#15774)
* fix wrong method name tf.concatenate

* add tests related to causal LM / decoder

* make style and quality

* clean-up

* Fix TFBertModel's extended_attention_mask when past_key_values is provided

* Fix tests

* fix copies

* More tf.int8 -> tf.int32 in TF test template

* clean-up

* Update TF test template

* revert the previous commit + update the TF test template

* Fix TF template extended_attention_mask when past_key_values is provided

* Fix some styles manually

* clean-up

* Fix ValueError: too many values to unpack in the test

* Fix more: too many values to unpack in the test

* Add a comment for extended_attention_mask when there is past_key_values

* Fix TFElectra extended_attention_mask when past_key_values is provided

* Add tests to other TF models

* Fix for TF Electra test: add prepare_config_and_inputs_for_decoder

* Fix not passing training arg to lm_head in TFRobertaForCausalLM

* Fix tests (with past) for TF Roberta

* add testing for past_key_values for TFElectra model

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-25 17:11:46 +01:00
Lysandre Debut bb7949b35a
Fix model templates (#15806)
* Fix model templates

* Update paths
2022-02-23 18:27:29 -05:00
Lysandre Debut 29c10a41d0
[Test refactor 1/5] Per-folder tests reorganization (#15725)
* Per-folder tests reorganization

Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Stas Bekman <stas@stason.org>
2022-02-23 15:46:28 -05:00
Patrick von Platen 2e12b907ae
TF generate refactor - Greedy Search (#15562)
* TF generate start refactor

* Add tf tests for sample generate

* re-organize

* boom boom

* Apply suggestions from code review

* re-add

* add all code

* make random greedy pass

* make encoder-decoder random work

* further improvements

* delete bogus file

* make gpt2 and t5 tests work

* finish logits tests

* correct logits processors

* correct past / encoder_outputs drama

* refactor some methods

* another fix

* refactor shape_list

* fix more shape list

* import shape_list

* finish docs

* fix imports

* make style

* correct tf utils

* Fix TFRag as well

* Apply Lysandre's and Sylvain's suggestions

* Update tests/test_generation_tf_logits_process.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* Update src/transformers/tf_utils.py

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>

* remove cpu according to gante

* correct logit processor

Co-authored-by: Matt <Rocketknight1@users.noreply.github.com>
2022-02-15 17:54:43 +01:00
Yih-Dar 6a5472a8e1
Force use_cache to be False in PyTorch (#15385)
* use_cache = False for PT models if labels is passed

* Fix for BigBirdPegasusForConditionalGeneration

* add warning if users specify use_cache=True

* Use logger.warning instead of warnings.warn

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-08 16:20:53 +01:00
SaulLu 7b8bdd8601
fix the `tokenizer_config.json` file for the slow tokenizer when a fast version is available (#15319)
* add new test

* update test

* remove `tokenizer_file` from `additional_files_names` in `tokenization_utils_base.py`

* add `tokenizer_file` for the fast only tokenizer

* change global variables layoutxml

* remove `"tokenizer_file"` from DPR tokenizer's Global variables

* remove `tokenizer_file` from herbert slow tokenizer init

* `"tokenizer_file"` from LED tokenizer's Global variables

* remove `tokenizer_file` from mbart slow tokenizer init

* remove `tokenizer_file` from slow tokenizer template

* adapt to versioning

* adapt the `test_tokenizer_mismatch_warning` test

* clean test

* clarify `VOCAB_FILES_NAMES` in tokenization_utils_fast.py

* Revert "remove `tokenizer_file` from mbart slow tokenizer init"

This reverts commit 0dbb723fa9.

* Revert "`"tokenizer_file"` from LED tokenizer's Global variables"

This reverts commit 5a3f879bdd.

* Revert "remove `tokenizer_file` from herbert slow tokenizer init"

This reverts commit f5e10007b7.

* Revert "remove `"tokenizer_file"` from DPR tokenizer's Global variables"

This reverts commit da0895330b.

* set `tokenizer_file` in super `__init__` of mbart
2022-02-01 16:48:25 +01:00
Yih-Dar dc05dd539f
Fix TF Causal LM models' returned logits (#15256)
* Fix TF Causal LM models' returned logits

* Fix expected shape in the tests

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-02-01 11:04:07 +00:00
Yih-Dar 554d333ece
Fix loss calculation in TFXXXForTokenClassification models (#15294)
* Fix loss calculation in TFFunnelForTokenClassification

* revert the change in TFFunnelForTokenClassification

* fix FunnelForTokenClassification loss

* fix other TokenClassification loss

* fix more

* fix more

* add num_labels to ElectraForTokenClassification

* revert the change to research projects

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-01-31 11:43:08 -05:00
Sylvain Gugger 7fc6f41d91
Add doc for add-new-model-like command (#15433) 2022-01-31 11:10:45 -05:00
Yih-Dar c15bb3fe19
[Fix doc example] fix missing import jnp (#15291)
* fix missing import jnp

* Fix missing jax and k=1

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-01-24 14:54:23 +01:00
Jonas Kuball c962c2adbf
Adds missing module_specs for usages of _LazyModule (#15230)
* Add missing __spec__ for transformers.models.auto

* Moves the __spec__-test to the UnitTest class

* Adds module_spec to all instances of _LazyModule

* Refactors an old test from pytest to unittest
2022-01-21 07:30:12 -05:00
Matt 2708bfa127
Rename compute_loss in TF models (#15207)
* Rename compute_loss to hf_compute_loss to avoid conflicts with the new Keras method

* make style

* Adding deprecation warning to `compute_loss`

* Fix sneaky reference to compute_loss

* Replace logger.warning with warnings.warn

* Clarifying warning and deprecation timeline
2022-01-19 13:29:07 +00:00
Sylvain Gugger 5f3c57fc84
Check the repo consistency in model templates test (#15141)
* Check the repo consistency in model templates test

* Fix doc template

* Fix docstrings

* Fix last docstring
2022-01-14 04:52:38 -05:00
Sylvain Gugger 1a00863e95 Fix typo in doc template 2022-01-11 15:22:15 -05:00
NielsRogge 6ea6266625
Fix cookiecutter (#15100) 2022-01-11 05:57:26 -05:00
Suraj Patil 3e9fdcf019
[DOC] fix doc examples for bart-like models (#15093)
* fix doc examples

* remove double colons
2022-01-10 18:13:28 +01:00