dependabot[bot]
6b8dbc283c
Bump certifi from 2022.12.7 to 2023.7.22 in /examples/research_projects/lxmert ( #25096 )
...
Bump certifi in /examples/research_projects/lxmert
Bumps [certifi](https://github.com/certifi/python-certifi ) from 2022.12.7 to 2023.7.22.
- [Commits](https://github.com/certifi/python-certifi/compare/2022.12.07...2023.07.22 )
---
updated-dependencies:
- dependency-name: certifi
dependency-type: direct:production
...
Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-07-25 17:24:50 -04:00
Yih-Dar
da5ff18a4a
Fix doctest ( #25031 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-25 22:10:06 +02:00
Sebastian Husch Lee
8f36ab3e22
[`T5`, `MT5`, `UMT5`] Add [T5, MT5, UMT5]ForSequenceClassification ( #24726 )
...
* Initial addition of t5forsequenceclassification
* Adding imports and adding tests
* Formatting
* Running make fix-copies
* Adding mt5forseq
* Formatting
* run make fix-copies
* Adding to docs
* Add model_parallel
* Fix bug
* Fix
* Remove TODO
* Fixing tests for T5ForSequenceClassification
* Undo changes to dependency_versions_table.py
* Change classification head to work with T5Config directly
* Change seq length to let tests pass
* PR comments for formatting
* Formatting
* Initial addition of UMT5ForSequenceClassification
* Adding to inits and formatting
* run make fix-copies
* Add doc for UMT5ForSeqClass
* Update UMT5 config
* Fix docs
* Skip torch fx test for SequenceClassification
* Formatting
* Add skip to UMT5 tests as well
* Fix umt5 tests
* Running make fix-copies
* PR comments
* Fix for change to sentence_representation
* Rename seq_len to hidden_size since that's what it is
* Use base_model to follow format of the rest of the library
* Update docs
* Extract the decoder_input_ids changes and make one liner
* Make one-liner
2023-07-25 21:02:49 +02:00
Yih-Dar
21150cb0f3
Hotfix for failing `MusicgenForConditionalGeneration` tests ( #25091 )
...
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-25 20:26:00 +02:00
Arthur
f9cc333805
[`PreTrainedTokenizerFast`] Keep properties from fast tokenizer ( #25053 )
...
* draft solution
* use `setdefault`
* nits
* add tests and fix truncation issue
* fix test
* test passes locally
* quality
* updates
* update tests
2023-07-25 18:45:01 +02:00
Connor Henderson
0779fc8eb8
Edit err message and comment in `test_model_is_small` ( #25087 )
...
* Edit err message and comment in
* put back 80M comment
2023-07-25 12:24:36 -04:00
Arthur
2fac342238
[`TF`] Also apply patch to support left padding ( #25085 )
...
* tf versions
* apply changes to other models
* 3 models slipped through the cracks
2023-07-25 11:23:09 -04:00
Arthur
f104522718
[`ForSequenceClassification`] Support `left` padding ( #24979 )
...
* support left padding
* nit
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
* Update src/transformers/models/gpt_neox/modeling_gpt_neox.py
2023-07-25 16:19:43 +02:00
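The left-padding fix above hinges on how the classification head pools the last real token: with right padding it sits before the pad run, with left padding it is simply the final position. A minimal pure-Python sketch of that pooling logic (names and pad id are hypothetical, not the actual transformers implementation):

```python
PAD_ID = 0  # hypothetical pad token id

def last_token_index(input_ids, pad_id=PAD_ID):
    """Return the index of the last real (non-pad) token.

    With right padding the last real token precedes the trailing
    pad run; with left padding it is the final position.
    """
    idx = len(input_ids) - 1
    while idx > 0 and input_ids[idx] == pad_id:
        idx -= 1
    return idx

# Right-padded: the last real token (9) is at index 2
assert last_token_index([5, 7, 9, 0, 0]) == 2
# Left-padded: the last real token (9) is at index 4
assert last_token_index([0, 0, 5, 7, 9]) == 4
```

Scanning from the end handles both padding sides with one rule, which is why a model patched this way no longer needs to assume right padding.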
Yih-Dar
1e662f0f07
Allow generic composite models to pass more kwargs ( #24927 )
...
* fix
* Update src/transformers/generation/utils.py
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
* update
---------
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
2023-07-25 16:07:00 +02:00
김준재_T3056
b51312e24d
🌐 [i18n-KO] Translated `perf_infer_cpu.md` to Korean ( #24920 )
...
* docs: ko: perf_infer_cpu.md
* feat: chatgpt draft
* fix: manual edits
* Update docs/source/ko/_toctree.yml
* Update docs/source/ko/perf_infer_cpu.md
* Update docs/source/ko/perf_infer_cpu.md
This part bothered me as well. I'll apply the change!
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
Agreed! I was sticking too closely to the original text!
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
As you said, I think I was too attached to the source text
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
Thank you for the better word choice!
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
I couldn't think of the term "cycle" at the time...
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
It reads much more naturally in context now!
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
There's no need to be bound to the original formatting!
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
* Update docs/source/ko/perf_infer_cpu.md
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
---------
Co-authored-by: Wonhyeong Seo <wonhseo@kakao.com>
2023-07-25 16:04:14 +02:00
Gema Parreño
b99f7bd4fc
[DOCS] add example NoBadWordsLogitsProcessor ( #25046 )
...
* add example NoBadWordsLogitsProcessor
* fix L764 & L767
* make style
2023-07-25 09:41:48 -04:00
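The processor documented above works by masking the scores of banned token ids before sampling. A simplified pure-Python sketch of that idea, restricted to single-token bad words (function name hypothetical — not the transformers API):

```python
import math

def ban_bad_words(scores, bad_word_ids):
    """Set the score of each banned token id to -inf so it can
    never be selected (simplified single-token variant)."""
    return [
        -math.inf if token_id in bad_word_ids else score
        for token_id, score in enumerate(scores)
    ]

scores = [0.1, 2.5, 0.3, 1.7]
banned = ban_bad_words(scores, bad_word_ids={1})
assert banned[1] == -math.inf
assert banned[0] == 0.1 and banned[3] == 1.7
```

The real `NoBadWordsLogitsProcessor` additionally handles multi-token bad words by checking the generated prefix, but the masking step is the same.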
Arthur
dcb183f4bd
[`MPT`] Add MosaicML's `MPT` model to transformers ( #24629 )
...
* draft add new model like
* some cleaning of the config
* nits
* add nested configs
* nits
* update
* update
* added layer norms + triton kernels
* consider only LPLayerNorm for now.
* update
* all keys match.
* Update
* fixing nits here and there
* working forward pass.
* removed einops dependency
* nits
* format
* add alibi
* byebye head mask
* refactor attention
* nits.
* format
* fix nits.
* nuke and updates
* nuke tokenizer test
* don't reshape query with kv heads
* added a bit of documentation.
* remove unneeded things
* nuke more stuff
* nit
* logits match - same generations
* rm unneeded methods
* 1 remaining failing CI test
* nit
* fix nits
* fix docs
* fix docs
* rm tokenizer
* fixup
* fixup
* fixup and fix tests
* fixed configuration object.
* use correct activation
* few minor fixes
* clarify docs a bit
* logits match at 1e-12
* skip and unskip a test
* added some slow tests.
* fix readme
* add more details
* Update docs/source/en/model_doc/mpt.md
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix configuration issues
* more fixes in config
* added more models
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove unneeded position ids
* fix some comments
* Apply suggestions from code review
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* revert suggestion
* mpt alibi + added batched generation
* Update src/transformers/models/mpt/__init__.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* remove init config
* Update src/transformers/models/mpt/configuration_mpt.py
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
* fix nit
* add another slow test
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* fits in one line
* some refactor because make fixup doesn't pass
* add ft notebook
* update md
* correct doc path
---------
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-25 14:32:40 +02:00
Xiaoke Huang
1dbc1440a7
Fix: repeat per sample for SAM image embeddings ( #25074 )
...
Repeat per sample for SAM image embeddings
2023-07-25 08:30:14 -04:00
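The SAM fix above is about repeating each image embedding once per prompt for that image, rather than tiling the whole batch. A pure-Python sketch of the intended "per sample" semantics (names hypothetical; the real code operates on tensors):

```python
def repeat_per_sample(image_embeddings, prompts_per_sample):
    """Repeat each image embedding once per prompt belonging to
    that image (interleaved), not the whole batch end-to-end."""
    out = []
    for embedding, n_prompts in zip(image_embeddings, prompts_per_sample):
        out.extend([embedding] * n_prompts)
    return out

# Image A has 2 prompts, image B has 3: each embedding stays
# adjacent to its own copies.
result = repeat_per_sample(["imgA", "imgB"], [2, 3])
assert result == ["imgA", "imgA", "imgB", "imgB", "imgB"]
```

In tensor terms this is the difference between `repeat_interleave` (per-sample, shown here) and a plain batch-level `repeat`, which would produce `["imgA", "imgB", "imgA", "imgB", ...]` and pair prompts with the wrong images.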
Harheem Kim
cb8abee511
🌐 [i18n-KO] Translated `hpo_train.md` to Korean ( #24968 )
...
* docs: ko: hpo_train.mdx
* feat: chatgpt draft
* fix: manual edits
* fix: resolve suggestions
2023-07-25 08:28:20 -04:00
Arthur
f2c1df93f5
[`generate`] Only warn users if the `generation_config`'s `max_length` is set to the default value ( #25030 )
...
* check max length is default
* nit
* update warning: no-longer deprecate
* comment in the configuration_utils in case max length's default gets changed in the future
2023-07-25 14:20:37 +02:00
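The change above narrows the warning to the case where `max_length` was left at its library default and no explicit length control was given. A simplified sketch of that condition (the default value and function name here are hypothetical placeholders, not the transformers internals):

```python
import warnings

DEFAULT_MAX_LENGTH = 20  # hypothetical library default

def check_max_length(max_length, max_new_tokens=None):
    """Warn about relying on `max_length` only when the user left
    it at the default and gave no explicit `max_new_tokens`."""
    if max_new_tokens is None and max_length == DEFAULT_MAX_LENGTH:
        warnings.warn(
            "Using the default `max_length` to control generation "
            "length; consider setting `max_new_tokens` explicitly."
        )
```

A user who deliberately sets `max_length=512`, or passes `max_new_tokens`, is no longer warned — only the silent reliance on the default triggers the message.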
Alan Ji
c879318cc5
replace `per_gpu_eval_batch_size` with `per_device_eval_batch_size` in readme of multiple-choice task ( #25078 )
...
replace `per_gpu_eval_batch_size` with `per_device_eval_batch_size`
in readme of multiple-choice
2023-07-25 08:11:56 -04:00
Susnato Dhar
25e443c0d4
Fix broken link in README_hd.md ( #25067 )
...
Update README_hd.md
2023-07-25 08:09:01 -04:00
Xuehai Pan
6bc61aa7af
Set `TF32` flag for PyTorch cuDNN backend ( #25075 )
2023-07-25 08:04:48 -04:00
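TF32 keeps float32's 8-bit exponent range but truncates the mantissa from 23 to 10 bits, which is the precision trade the flag above opts into. A rough pure-Python illustration of that rounding (the actual change flips a PyTorch backend flag; this bit-twiddling is only a model of the format):

```python
import struct

def tf32_round(x: float) -> float:
    """Truncate a float32 value's mantissa from 23 to 10 bits,
    mimicking the precision loss of the TF32 format."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    bits &= ~((1 << 13) - 1)  # clear the 13 low mantissa bits
    return struct.unpack(">f", struct.pack(">I", bits))[0]

# Values representable in 10 mantissa bits survive unchanged
assert tf32_round(1.5) == 1.5
# Finer-grained values lose their low-order bits
assert tf32_round(1.0000001) == 1.0
```

On Ampere-class GPUs, PyTorch exposes this via `torch.backends.cuda.matmul.allow_tf32` and `torch.backends.cudnn.allow_tf32`; the PR extends the existing matmul handling to the cuDNN flag.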
Injin Paek
5dba88b2d2
fix: add TOC anchor link ( #25066 )
2023-07-25 08:02:33 -04:00
Sylvain Gugger
f295fc8a16
Fix last models for common tests that are too big. ( #25058 )
...
* Fix last models for common tests that are too big.
* Remove print statement
2023-07-25 07:56:04 -04:00
Sangam Lee
ee1eb3b325
🌐 [i18n-KO] Translated `perf_hardware.md` to Korean ( #24966 )
...
* docs: ko: perf_hardware.md
* feat: nmt draft
* fix: manual edits
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
* fix: resolve suggestions
Co-authored-by: Haewon Kim <ehdvkf02@naver.com>
* Fix: manual edits
* fix: manual edits
* fix: manual edits
* fix: manual edits
* fix: fix rendering error of perf_hardware.md
---------
Co-authored-by: Hyeonseo Yun <0525yhs@gmail.com>
Co-authored-by: Haewon Kim <ehdvkf02@naver.com>
2023-07-25 07:44:24 -04:00
Haewon Kim
f6fe1d5514
🌐 [i18n-KO] Translated `<tf_xla>.md` to Korean ( #24904 )
...
* docs: ko: tf_xla.md
* feat: chatgpt draft
* fix: manual edits
* fix: manual edits
* fix: manual edits
* fix: resolve suggestions
2023-07-25 07:43:22 -04:00
Kashif Rasul
faf25c040d
[Docs] fix rope_scaling doc string ( #25072 )
...
fix rope_scaling doc string
2023-07-25 07:34:10 -04:00
Joao Gante
c0742b15cb
Generate - add beam indices output in constrained beam search ( #25042 )
2023-07-25 11:12:29 +01:00
Arthur
c53a6eae74
[`RWKV`] Add note in doc on `RwkvStoppingCriteria` ( #25055 )
...
* Add note in doc on `RwkvStoppingCriteria`
* give some breathing space to the code
2023-07-25 10:15:00 +02:00
Sylvain Gugger
d2295708a6
Better error message when signal is not supported on OS ( #25049 )
...
* Better error message when signal is not supported on OS
* Address review comments
2023-07-24 14:34:16 -04:00
seank021
c0d1c33022
🌐 [i18n-KO] Translated `perf_train_cpu.md` to Korean ( #24911 )
...
* docs: ko: perf_train_cpu.md
* feat: chatgpt draft
* fix: manual edits
* fix: resolve suggestions
* fix: manual edits
Co-authored-by: Haewon Kim <ehdvkf02@naver.com>
---------
Co-authored-by: Haewon Kim <ehdvkf02@naver.com>
2023-07-24 17:54:13 +02:00
Younes Belkada
b08f41e62a
[`8bit`] Fix 8bit corner case with Blip2 8bit ( #25047 )
...
fix 8bit corner case with Blip2 8bit
2023-07-24 16:58:40 +02:00
Nate Brake
3611fc90e0
Fix `compute_loss` in Trainer failing to label-shift for PEFT models when label smoothing is enabled ( #25044 )
...
* added PeftModelForCausalLM to MODEL_FOR_CAUSAL_LM_MAPPING_NAMES dict
* check for PEFT model in compute_loss section
---------
Co-authored-by: Nathan Brake <nbrake3@mmm.com>
2023-07-24 10:53:10 -04:00
Rinat
a03d13c83d
Pvt model ( #24720 )
...
* pull and push updates
* add docs
* fix modeling
* Add and run test
* make copies
* add task
* fix tests and fix small issues
* Checks on a Pull Request
* fix docs
* add desc pvt.md
2023-07-24 15:34:19 +01:00
Sylvain Gugger
afe8bfc075
Comment out print statement again
2023-07-24 10:12:20 -04:00
Sylvain Gugger
42571f6eb8
Make more test models smaller ( #25005 )
...
* Make more test models tiny
* Make more test models tiny
* More models
* More models
2023-07-24 10:08:47 -04:00
Sören Brunk
8f1f0bf50f
Fix typo in LlamaTokenizerFast docstring example ( #25018 )
2023-07-24 09:37:58 -04:00
Zach Mueller
3b734f5042
Add dispatch_batches to training arguments ( #25038 )
...
* Dispatch batches
* Copy items
2023-07-24 09:27:19 -04:00
Sunmin Cho
9d2b983ed0
🌐 [i18n-KO] Translated `testing.md` to Korean ( #24900 )
...
* docs: ko: testing.md
* feat: draft
* fix: manual edits
* fix: edit ko/_toctree.yml
* fix: manual edits
* fix: manual edits
* fix: manual edits
* fix: manual edits
* fix: resolve suggestions
2023-07-24 09:24:11 -04:00
Sangam Lee
383be1b763
🌐 [i18n-KO] Translated performance.md to Korean ( #24883 )
...
* docs: ko: performance.md
* feat: chatgpt draft
* fix: manual edits
* fix: manual edits
* Update docs/source/ko/performance.md
Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>
* Update docs/source/ko/performance.md
---------
Co-authored-by: Kihoon Son <75935546+kihoon71@users.noreply.github.com>
2023-07-24 09:23:34 -04:00
Iskren Ivov Chernev
efb2ba666d
Better handling missing SYS in llama conversation tokenizer ( #24997 )
...
* Better handling missing SYS in llama conversation tokenizer
The existing code failed to add SYS when the conversation history lacked it,
yet it still modified the passed conversation object.
Rearrange the code so that modifications to the conversation object are taken
into account when generating token ids.
* Fix formatting with black
* Avoid one-liners
* Also fix fast tokenizer
* Drop List decl
2023-07-24 09:21:10 -04:00
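The fix above makes the default system prompt insertion part of the conversation history itself, so the same text that the object now carries is what the token-id generation step sees. A simplified sketch of that idea (the default prompt, markers, and function name are hypothetical stand-ins for the Llama chat format, not the tokenizer's actual code):

```python
DEFAULT_SYSTEM_PROMPT = "You are a helpful assistant."  # hypothetical
B_SYS, E_SYS = "<<SYS>>\n", "\n<</SYS>>\n\n"

def ensure_system_prompt(dialog):
    """Prepend a default <<SYS>> block to the first user turn if
    the history carries none, mutating the dialog in place so the
    later token-id generation sees the same modified text."""
    first_turn = dialog[0]
    if B_SYS not in first_turn:
        dialog[0] = B_SYS + DEFAULT_SYSTEM_PROMPT + E_SYS + first_turn
    return dialog

dialog = ["Hello!", "Hi there.", "How are you?"]
ensure_system_prompt(dialog)
assert dialog[0].startswith(B_SYS)
assert dialog[0].endswith("Hello!")
```

The bug was doing this mutation without feeding the mutated turn back into id generation — the conversation changed, but the token ids were built from the pre-mutation text.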
Lucain
6704923107
Support GatedRepoError + use raise from ( #25034 )
...
* Support GatedRepoError + use raise from
* Apply suggestions from code review
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
* Use token instead of use_auth_token in error messages
---------
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-07-24 09:12:39 -04:00
Maria Khalusova
75317aefb3
[docs] Performance docs tidy up, part 1 ( #23963 )
...
* first pass at the single gpu doc
* overview: improved clarity and navigation
* WIP
* updated intro and deepspeed sections
* improved torch.compile section
* more improvements
* minor improvements
* make style
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* feedback addressed
* mdx -> md
* link fix
* feedback addressed
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-07-24 08:57:24 -04:00
Bharat Ramanathan
54ba8608d0
fix(integrations): store serialized `TrainingArgs` to `wandb.config` without sanitization. ( #25035 )
...
fix: store training args to wandb config without sanitization.
Allows resuming runs by reusing the wandb config.
Co-authored-by: Bharat Ramanathan <ramanathan.parameshwaran@gohuddl.com>
2023-07-24 08:42:39 -04:00
Arthur
0906d21203
[`logging.py`] set default `stderr` path if `None` ( #25033 )
...
set default logger
2023-07-24 14:31:45 +02:00
Stas Bekman
c9a82be592
[check_config_docstrings.py] improve diagnostics ( #25012 )
...
* [check_config_docstrings.py] improve diagnostics
* style
* rephrase
* fix
2023-07-23 21:17:26 -07:00
Wonhyeong Seo
b257c46a07
🌐 [i18n-KO] Updated Korean `serialization.md` ( #24686 )
...
fix: update ko/serialization.md
* chatgpt draft
2023-07-21 19:23:59 -04:00
Sylvain Gugger
87fba947a5
Move template doc file to md ( #25004 )
2023-07-21 16:49:44 -04:00
Ivan Sorokin
ea41e18cfc
Improve from_pretrained for ZeRO-3 multi-GPU mode ( #24964 )
...
* improve from_pretrained for zero3 multi gpus mode
* Add check if torch.distributed.is_initialized
* Revert torch.distributed
---------
Co-authored-by: Stas Bekman <stas@stason.org>
2023-07-21 15:39:28 -04:00
Arthur
95f96b45ff
[`Llama`] remove persistent `inv_freq` tensor ( #24998 )
...
remove persistent tensor
2023-07-21 18:11:08 +02:00
Younes Belkada
d3ce048c20
[`bnb`] Add simple check for bnb import ( #24995 )
...
add simple check for bnb
2023-07-21 17:50:52 +02:00
Yih-Dar
f1a1eb4ae1
Fix `llama` tokenization doctest ( #24990 )
...
fix
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-07-21 16:47:51 +02:00
Sylvain Gugger
a7d213189d
Use main_input_name for include_inputs_for_metrics ( #24993 )
2023-07-21 10:30:17 -04:00
Sylvain Gugger
a6484c89b9
Fix type annotation for deepspeed training arg ( #24988 )
2023-07-21 09:42:05 -04:00