Commit Graph

10784 Commits

Author SHA1 Message Date
Sylvain Gugger 3e2dd7f92d
Poc to use safetensors (#19175)
* Poc to use safetensors

* Typo

* Final version

* Add tests

* Save with the right name!

* Update tests/test_modeling_common.py

Co-authored-by: Julien Chaumond <julien@huggingface.co>

* Support for sharded checkpoints

* Test from Hub part 1

* Test from hub part 2

* Fix regular checkpoint sharding

* Bump for fixes

Co-authored-by: Julien Chaumond <julien@huggingface.co>
2022-09-30 10:58:04 -04:00
Jingya HUANG dad578e4c3
Add notebooks (#19259) 2022-09-30 10:04:36 -04:00
Karim Foda e396358104
Add stop sequence to text generation pipeline (#18444) 2022-09-30 14:26:51 +01:00
Sayak Paul 582d085bb2
Add expected output to the sample code for `ViTMSNForImageClassification` (#19183)
* chore: add expected output to the sample code.

* add: imagenet-1k labels to the model config.

* chore: apply code formatting.

* chore: change the expected output.
2022-09-30 15:25:41 +02:00
Matt 368b649af6
Rebase ESM PR and update all file formats (#19055)
* Rebase ESM PR and update all file formats

* Fix test relative imports

* Add __init__.py to the test dir

* Disable gradient checkpointing

* Remove references to TFESM... FOR NOW >:|

* Remove completed TODOs from tests

* Convert docstrings to mdx, fix-copies from BERT

* fix-copies for the README and index

* Update ESM's __init__.py to the modern format

* Add to _toctree.yml

* Ensure we correctly copy the pad_token_id from the original ESM model

* Ensure we correctly copy the pad_token_id from the original ESM model

* Tiny grammar nitpicks

* Make the layer norm after embeddings an optional flag

* Make the layer norm after embeddings an optional flag

* Update the conversion script to handle other model classes

* Remove token_type_ids entirely, fix attention_masking and add checks to convert_esm.py

* Break the copied from link from BertModel.forward to remove token_type_ids

* Remove debug array saves

* Begin ESM-2 porting

* Add a hacky workaround for the precision issue in original repo

* Code cleanup

* Remove unused checkpoint conversion code

* Remove unused checkpoint conversion code

* Fix copyright notices

* Get rid of all references to the TF weights conversion

* Remove token_type_ids from the tests

* Fix test code

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Add credit

* Remove _ args and __ kwargs in rotary embedding

* Assertively remove asserts

* Replace einsum with torch.outer()

* Fix docstring formatting

* Remove assertions in tokenization

* Add paper citation to ESMModel docstring

* Move vocab list to single line

* Remove ESMLayer from init

* Add Facebook copyrights

* Clean up RotaryEmbedding docstring

* Fix docstring formatting

* Fix docstring for config object

* Add explanation for new config methods

* make fix-copies

* Rename all the ESM- classes to Esm-

* Update conversion script to allow pushing to hub

* Update tests to point at my repo for now

* Set config properly for tests

* Remove the gross hack that forced loss of precision in inv_freq and instead copy the data from the model being converted

* make fixup

* Update expected values for slow tests

* make fixup

* Remove EsmForCausalLM for now

* Remove EsmForCausalLM for now

* Fix padding idx test

* Updated README and docs with ESM-1b and ESM-2 separately (#19221)

* Updated README and docs with ESM-1b and ESM-2 separately

* Update READMEs, longer entry with 3 citations

* make fix-copies

Co-authored-by: Your Name <you@example.com>

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Tom Sercu <tsercu@fb.com>
Co-authored-by: Your Name <you@example.com>
2022-09-30 14:16:25 +01:00
Yih-Dar 4fd32a1f49
Catch `HFValidationError` in `TrainingSummary` (#19252)
* Catch HfValidationError in TrainingSummary

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-30 13:45:56 +02:00
NielsRogge f3d2f7a6e0
Add MarkupLM (#19198)
* First draft

* Make basic test work

* Fix most tokenizer tests

* More improvements

* Make more tests pass

* Fix more tests

* Fix some code quality

* Improve truncation

* Implement feature extractor

* Improve feature extractor and add tests

* Improve feature extractor tests

* Fix pair_input test partly

* Add fast tokenizer

* Improve implementation

* Fix rebase

* Fix rebase

* Fix most of the tokenizer tests.

* propose solution for fast

* add: integration test for fasttokenizer, warning for decode, fix template in slow tokenizer

* add: modify markuplmconverter

* add: some modify on converter and tokenizerfast

* Fix style, copies

* Make fixup

* Update tokenization_markuplm.py

* Update test_tokenization_markuplm.py

* Update markuplm related

* Improve processor, add integration test

* Add processor test file

* Improve processor

* Improve processor tests

* Fix more processor tests

* Fix processor tests

* Update docstrings

* Add Copied from statements

* Add more Copied from statements

* Add code examples

* Improve code examples

* Add model to doc tests

* Adding dependency check

* Add dummy file

* Add requires_backends

* Add model to toctree

* Fix more things, disable dependency check for now

* Apply more suggestions

* Add soft dependency

* Add annotators to tests

* Fix style

* Remove from_slow=True

* Remove print statements

* Add sanity check

* Fix processor test

* Fix processor tests, add more docs

* Add doc tests for mdx file

* Add more tips

* Apply suggestions

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: lockon-n <45759388+lockon-n@users.noreply.github.com>
Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
Co-authored-by: lockon-n <dd098309@126.com>
2022-09-30 08:25:43 +02:00
rbsteinm 49d62b0178
[Wav2Vec2] Fix None loss in doc examples (#19218)
* pass sampled_negative_indices parameter to the model to avoid getting a None loss
* concerns doc examples for Wav2Vec2ForPreTraining and Wav2Vec2ConformerForPreTraining
2022-09-29 19:23:14 +02:00
Yih-Dar 1a1893e5d8
Update Past CI report script (#19228)
* Simplify the error report

* Add status placeholder

* Add job links

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-29 19:22:23 +02:00
Yih-Dar 163cd15279
Add job names in Past CI artifacts (#19235)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-29 19:18:24 +02:00
Sylvain Gugger f16bbf1475
Skip pipeline tests (#19248) 2022-09-29 12:25:15 -04:00
Matt cca6e6fea1
Cast TF generate() inputs (#19232)
* Just stick a couple of casts into generate()

* Cast decoder_input_ids too

* Don't accidentally cast floats

* Move to _generate()

* Move to after input validation

Co-authored-by: Your Name <you@example.com>
2022-09-29 16:51:08 +01:00
Alara Dirik 01eb34ab45
Improve DETR post-processing methods (#19205)
* Ensures consistent arguments and outputs with other post-processing methods
* Adds post_process_semantic_segmentation, post_process_instance_segmentation, post_process_panoptic_segmentation, post_process_object_detection methods to DetrFeatureExtractor
* Adds deprecation warnings to post_process, post_process_segmentation and post_process_panoptic
2022-09-29 17:33:13 +03:00
Sylvain Gugger 655f72a689
Fix test fetching for examples (#19237)
* Fix test fetching for examples

* Fake example modif

* Debug statements

* Typo

* You need to persist the file...

* Revert change in example

* Remove debug statements
2022-09-29 09:36:42 -04:00
atturaioe b79028f0b6
Fix TrainingArgs argument serialization (#19239) 2022-09-29 09:13:56 -04:00
Lucain 902d30b31a
Use `hf_raise_for_status` instead of deprecated `_raise_for_status` (#19244)
* Use  instead of  from huggingface_hub

* bump huggingface_hub to 0.10.0 + make deps_table_update
2022-09-29 08:58:39 -04:00
Younes Belkada 3a27ba3d18
Fix opt softmax small nit (#19243)
* fix opt softmax nit

- Use the same logic as 1eb0953755 for consistency

* Update src/transformers/models/opt/modeling_opt.py
2022-09-29 13:40:55 +02:00
mustapha ajeghrir ba9e336fa3
Fix `m2m_100.mdx` doc example missing `labels` (#19149)
The `labels` variable is not defined, the `model_inputs` already contain this information.
2022-09-29 13:27:58 +02:00
Aritra Roy Gosthipaty 0dc7b3a785
[TensorFlow] Adding GroupViT (#18020)
* chore: initial commit

* chore: adding util methods

yet to work on the nn.functional.interpolate port with align_corener=True

* chore: refactor the utils

* used tf.compat.v1.image.resize to align the F.interpolate function
* added type hints to the method signatures
* added references to the gists where one 2 one alignment of torch and tf has been shown

* chore: adding the layers

* chore: porting all the layers from torch to tf

This is the initial draft, nothing is tested yet.

* chore: aligning the layers with reference to tf clip

* chore: aligning the modules

* added demaraction comments
* added copied and adapted from comments

* chore: aligning with CLIP

* chore: wrangling the layers to keep it tf compatible

* chore: aligning the names of the layers for porting

* chore: style changes

* chore: adding docs and inits

* chore: adding tfp dependencis

the code is taken from TAPAS

* chore: initial commit for testing

* chore: aligning the vision embeddings with the vit implementatino

* chore: changing model prefix

* chore: fixing the name of the model and the layer normalization test case

* chore: every test passes but the slow ones

* chore: fix style and integration test

* chore: moving comments below decorators

* chore: make fixup and fix-copies changes

* chore: adding the Vision and Text Model to check_repo

* chore: modifying the prefix name to align it with the torch implementation

* chore: fix typo in configuration

* choer: changing the name of the model variable

* chore: adding segmentation flag

* chore: gante's review

* chore: style refactor

* chore: amy review

* chore: adding shape_list to parts that have been copied from other snippets

* chore: init batchnorm with torch defaults

* chore: adding shape_list to pass the tests

* test fix: adding seed as 0

* set seed

* chore: changing the straight through trick to fix -ve dimensinos

* chore: adding a dimension to the loss

* chore: adding reviewers and contributors names to the docs

* chore: added changes after review

* chore: code quality fixup

* chore: fixing the segmentation snippet

* chore: adding  to the layer calls

* chore: changing int32 to int64 for inputs of serving

* chore: review changes

* chore: style changes

* chore: remove from_pt=True

* fix: repo consistency

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-29 10:48:04 +01:00
Michael Benayoun bb6fa06f2d
Add a getattr method, which replaces _module_getattr in torch.fx.Tracer from PyTorch 1.13+ (#19233) 2022-09-29 11:04:49 +02:00
Gabriele Sarti 9d732fd2dd
XGLM - Fix Softmax NaNs when using FP16 (#18057)
* fix fp16 for xglm

* Removed misleading comment

* Fix undefined variable

Co-authored-by: Gabriele Sarti <gsarti@amazon.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2022-09-29 10:42:07 +02:00
Yih-Dar 99c32493e0
Fix confusing working directory in Push CI (#19234)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-29 08:36:46 +02:00
Steven Liu 6957350c2b
Focus doc around preprocessing classes (#18768)
* 📝 reframe docs around preprocessing classes

* small edits

* edits and review

* fix typo

* apply review

* clarify processor
2022-09-28 17:09:44 -07:00
Steven Liu 990936a868
Move AutoClasses under Main Classes (#19163)
* move autoclasses to main classes

* keep auto.mdx in model_doc
2022-09-28 17:09:29 -07:00
Sylvain Gugger 0fc68a7e14
Fix seq2seq QA example 2022-09-28 15:45:49 -04:00
Yih-Dar 64998a57fb
Fix cache names in CircleCI jobs (#19223)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-28 18:26:12 +02:00
Tatsuki Okada 4a0b958d61
Fix trainer seq2seq qa.py evaluate log and ft script (#19208)
* fix args option

* fix trainer eval log

* fix out of memory qa script

* do isort, black, flake

* fix tokenize target

* take it back.

* fix: comment
2022-09-28 10:55:46 -04:00
Nick Doiron 9c6aeba353
Document and validate typical_p in generation (#19128)
* Document and validate typical_p in generation
2022-09-28 15:45:05 +01:00
Yih-Dar de359c4593
Fix doctest for `TFDeiTForImageClassification` (#19173)
* Fix doctest for TFDeiTForImageClassification

* Remove unnecessary tf.random.set_seed

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-28 15:53:21 +02:00
Gabriel Luiz Freitas Almeida 22d37a9d2c
Fix deprecation warning for return_all_scores (#19217)
* Improve deprecation warning for return_all_scores

* Fix formatting
2022-09-28 08:57:43 -04:00
Joao Gante a357ed50e7
Generate: add warning when left padding should be used (#19067)
* add warning when left padding should be used

* PT: check for pad token; FLAX: can only check while not tracing
2022-09-28 13:07:08 +01:00
Ankur Goyal 942fa8ced8
Fix small use_cache typo in the docs (#19191) 2022-09-28 13:03:20 +01:00
IMvision12 2df602870b
Added tests for yaml and json parser (#19219)
* Added tests for yaml and json

* Added tests for yaml and json
2022-09-27 16:25:57 -04:00
Yih-Dar 2d95695825
Use `math.pi` instead of `torch.pi` in `MaskFormer` (#19201)
* Use math.pi

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-27 17:30:58 +02:00
Sylvain Gugger 34be08efcd
More tests for regression in cached non existence (#19216)
* More tests for regression in cached non existence

* Style
2022-09-27 09:36:34 -04:00
Nicola Procopio e3a30e2b99
translated add_new_pipeline (#19215) 2022-09-27 08:55:41 -04:00
wangxu 226b0e46d5
Add a use_parallel_residual argument to control the residual computing way (#18695)
* Add a gpt_j_residual argument to control the residual computing way

* Put duplicate code outside of the if block

* Rename parameter "gpt_j_residual" to "use_parallel_residual" and set the default value to True
2022-09-27 07:54:05 -04:00
Wang, Yi 88f597ba6a
add doc for hyperparameter search (#19192)
* add doc for hyperparameter search

* update doc
2022-09-27 07:51:51 -04:00
Arijit Mukherjee ea540a5977
add wav2vec2_alignment (#16782)
* add wav2vec2_alignment

* Update alignment.py

* Update examples/research_projects/wav2vec2/alignment.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update examples/research_projects/wav2vec2/alignment.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update examples/research_projects/wav2vec2/alignment.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update examples/research_projects/wav2vec2/alignment.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update README.md

* fix style

* fix imports

* fix multithread

* fix bash script

* [@anton-l] Style fixes and docstrings

* [@anton-l] Style fixes and docstrings

* Update alignment.py

fix blank id in backtrack

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: anton-l <aglozhkov@gmail.com>
2022-09-27 13:12:56 +02:00
Ekagra Ranjan 7132d55ca1
Remove unused `cur_len` in generation_utils.py (#18874)
* remove unused cur_len in generation_utils.py

* linting
2022-09-27 10:39:31 +02:00
Sylvain Gugger a32f97c37d
Fix cached_file in offline mode for cached non-existing files (#19206)
* Fix cached_file in offline mode for cached non-existing files

* Add tests

* Test with offline mode
2022-09-26 18:01:00 -04:00
Yih-Dar ca0886395b
Add warning for torchaudio <= 0.10 in MCTCTFeatureExtractor (#19203)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-26 23:58:02 +02:00
IMvision12 be4f269979
Updated hf_argparser.py (#19188)
* Changed json_file_parser function and added yaml parser function

* update hf_argparser

* Added allow_extra_keys argument
2022-09-26 17:02:57 -04:00
Sylvain Gugger c20b2c7e18
Use repo_type instead of deprecated datasets repo IDs (#19202)
* Use repo_type instead of deprecated datasets repo IDs

* Add missing one in doc
2022-09-26 09:50:48 -04:00
Ankur Goyal 216b2f9e80
Move the model type check (#19027)
Co-authored-by: Ankur Goyal <ankur@impira.com>
2022-09-26 09:43:34 -04:00
Yih-Dar ea75e9f10e
Use `assertAlmostEqual` in `BloomEmbeddingTest.test_logits` (#19200)
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-26 14:56:41 +02:00
dependabot[bot] 98af4f9b54
Bump protobuf in /examples/research_projects/decision_transformer (#19176)
Bumps [protobuf](https://github.com/protocolbuffers/protobuf) from 3.19.4 to 3.19.5.
- [Release notes](https://github.com/protocolbuffers/protobuf/releases)
- [Changelog](https://github.com/protocolbuffers/protobuf/blob/main/generate_changelog.py)
- [Commits](https://github.com/protocolbuffers/protobuf/compare/v3.19.4...v3.19.5)

---
updated-dependencies:
- dependency-name: protobuf
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2022-09-26 14:55:16 +02:00
Ahmad Elawady 408b5e307b
Remove pos arg from Perceiver's Pre/Postprocessors (#18602)
* Remove pos arg from Perceiver's Pre/Postprocessors

* Revert the removed pos args in public methods
2022-09-26 08:50:58 -04:00
Yih-Dar 71fc331746
Separate Push CI images from Scheduled CI (#19170)
* separate images

* Fix condition

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-09-26 10:55:42 +02:00
flozi00 fa4eeb4fd3
german training, accelerate and model sharing (#19171)
* correct spelling in README

* processing

* german training

* accelerate

* german model sharing

* build doc

* ttf links

* casing
2022-09-23 14:52:09 -04:00