* Add attention mask and pad token warning to many of the models
* Remove changes under examples/research_projects
These files are not maintained by HF.
* Skip the warning check during torch.fx or JIT tracing
* Switch ordering for the warning and input shape assignment
This ordering is a little cleaner for some of the cases.
* Add missing line break in one of the files
* Remove jnp.DeviceArray since it is deprecated.
* Replace all instances of jnp.DeviceArray with jax.Array
* Update src/transformers/models/bert/modeling_flax_bert.py
---------
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
* Replace python random with torch.rand to enable dynamo.export
* revert changes to flax model code
* Remove unused random import
* Fix torch template
* Move torch.manual_seed(0) to right location
* Fix one BLIP arg not being optional, remove misspelled arg
* Remove the lxmert test overrides and just use the base test_saved_model_creation
* saved_model_creation fixes and re-enabling tests across the board
* Remove unnecessary skip
* Stop caching sinusoidal embeddings in speech_to_text
* Fix transfo_xl compilation
* Fix the conditionals in xglm
* Set the save spec only when building
* Clarify comment
* Move comment correctly
* Correct embeddings generation for speech2text
* Mark RAG generation tests as @slow
* Remove redundant else:
* Add comment to clarify the save_spec line in build()
* Fix size tests for XGLM at last!
* make fixup
* Remove one band_part operation
* Mark test_keras_fit as @slow
* Stop storing references to bound methods in tf.functions
* Remove the gc.collect calls now that we resolved the underlying problem
* Remove the default signature from model.serving entirely, big cleanup
* Remove _prune_signature as self.input_signature can prune itself
* Restore serving docstring
* Update int support test to check the input signature
* Make sure other tests also use model.input_signature and not serving.input_signature
* Restore _prune_signature
* Remove the doctest GC now it's no longer needed
* Correct core tests to use the pruned sig
* order lines correctly in core tests
* Add eager_serving back with a deprecation warning
* Let's try autodetecting serving sigs
* Don't clobber existing sigs
* Change shapes for multiplechoice models
* Make default dummy inputs smarter too
* Fix missing f-string
* Let's YOLO a serving output too
* Read __class__.__name__ properly
* Don't just pass naked lists in there and expect it to be okay
* Code cleanup
* Update default serving sig
* Clearer error messages
* Further updates to the default serving output
* make fixup
* Update the serving output a bit more
* Cleanups and renames, raise errors appropriately when we can't infer inputs
* More renames
* we're building in a functional context again, yolo
* import DUMMY_INPUTS from the right place
* Support cross-attention in the dummies
* Complete removal of dummy/serving overrides in BERT
* Complete removal of dummy/serving overrides in RoBERTa
* Obliterate lots and lots of serving sig and dummy overrides
* merge type hint changes
* Fix for token_type_ids with vocab_size 1
* Add missing property decorator
* Fix T5 and hopefully some models that take conv inputs
* More signature pruning
* Fix T5's signature
* Fix Wav2Vec2 signature
* Fix LongformerForMultipleChoice input signature
* Fix BLIP and LED
* Better default serving output error handling
* Fix BART dummies
* Fix dummies for cross-attention, esp encoder-decoder models
* Fix visionencoderdecoder signature
* Fix BLIP serving output
* Small tweak to BART dummies
* Cleanup the ugly parameter inspection line that I used in a few places
* committed a breakpoint again
* Move the text_dims check
* Remove blip_text serving_output
* Add decoder_input_ids to the default input sig
* Remove all the manual overrides for encoder-decoder model signatures
* Tweak longformer/led input sigs
* Tweak default serving output
* output.keys() -> output
* make fixup
* Rework TF type hints to use | None instead of Optional[] for tf.Tensor
* Don't forget the imports
* Add the imports to tests too
* make fixup
* Refactor tests that depended on get_type_hints
* Better test refactor
* Fix an old hidden bug in the test_keras_fit input creation code
* Fix for the Deit tests
* fix past renamed to past_key_value
* update more `past` that were skipped
* fixup
* remove changes made to rag
* refactor `_reorder_cache` to use `past_key_values`
* fix GIT `prepare_inputs_for_generation` to pass tests when `use_cache` needs to be False
* Add a test to ensure int dummy inputs are int64
* Move the test into the existing int64 test and update a lot of existing dummies
* Fix remaining dummies
* Test for int64 serving sigs as well
* Update core tests to use tf.int64
* Add better messages to the assertions
* Update all serving sigs to int64
* More sneaky hiding tf.int32s
* Add an optional int32 signature in save_pretrained
* make fixup
* Add Amy's suggestions
* Switch all serving sigs back to tf.int32
* Switch all dummies to tf.int32
* Adjust tests to check for tf.int32 instead of tf.int64
* Fix base dummy_inputs dtype
* Start casting to tf.int32 in input_processing
* Change dtype for unpack_inputs test
* Add proper tf.int32 test
* Make the alternate serving signature int64
* move generation_*.py src files into generation/*.py
* populate generation.__init__ with lazy loading
* move imports and references from generation.xxx.object to generation.object
* added test
* correct embedding init
* some changes in blenderbot (incomplete)
* update blenderbot (diff to be used as reference)
* update blenderbot_small
* update LED
* update marian
* update T5 and remove TFWrappedEmbeddings
* nullcontext() -> ContextManagers()
* fix embedding init
* Override save() to use the serving signature as the default
* Replace int32 with int64 in all our serving signatures
* Remember one very important line so as not to break every test at once
* Dtype fix for TFLED
* dtype fix for shift_tokens_right in general
* Dtype fixes in mBART and RAG
* Fix dtypes for test_unpack_inputs
* More dtype fixes
* Yet more mBART + RAG dtype fixes
* Add a check that the model actually has a serving method
Comparisons like
version.parse(torch.__version__) > version.parse("1.6")
are True for torch==1.6.0+cu101 or torch==1.6.0+cpu, because the local version suffix sorts after the plain release.
Comparisons against version.parse(version.parse(torch.__version__).base_version) are preferred (and available in pytorch_utils.py).
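
A minimal sketch of the version-comparison pitfall described above. It assumes the `packaging` library is installed and uses a hypothetical string, `torch_version`, in place of `torch.__version__` so that it runs without PyTorch. It is an illustration only, not the code in pytorch_utils.py.

```python
from packaging import version

# Hypothetical stand-in for torch.__version__ on a CUDA build (assumption).
torch_version = "1.6.0+cu101"

# Naive comparison: the local segment "+cu101" sorts after the plain release,
# so this build is treated as strictly newer than 1.6.
naive_is_newer = version.parse(torch_version) > version.parse("1.6")

# Comparison on the base version, which drops the local segment first.
base = version.parse(version.parse(torch_version).base_version)
base_is_newer = base > version.parse("1.6")

print(naive_is_newer, base_is_newer)  # True False
```

With the base version, a `1.6.0+cu101` build compares equal to `1.6`, so a "newer than 1.6" guard no longer fires on it.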
* [Flax] Add remat (gradient checkpointing)
* fix variable naming in test
* flip: checkpoint using a method
* fix naming
* fix class naming
* apply PVP's suggestions from code review
* make fix-copies
* fix big-bird, electra, roberta
* cookie-cutter
* fix flax big-bird
* move test to common