Commit Graph

64 Commits

Author SHA1 Message Date
Sylvain Gugger b4d4d6fe87
Add RWKV-4 (#22797)
* First draft of RWKV-4

* Add support for generate

* Style post-rebase

* Properly use state

* Write doc

* Fix doc

* More math

* Add model to README, dummies and clean config

* Fix init

* multiple fixes:

- fix common tests
- fix configuration default values
- add CI test for checking state computation
- fix some CI tests

* correct tokenizer

* some tweaks

- fix config docstring
- fix failing tests

* fix CI tests

- add output_attention / output_hidden_states
- override test_initialization
- fix failing CIs

* fix conversion script

- fix sharded case
- add new arguments

* add slow tests + more fixes on conversion script

* add another test

* final fixes

* change single name variable

* add mock attention mask for pipeline to work

* correct eos token id

* fix nits

* add checkpoints

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add `tie_word_embeddings` in docstring

* change tensor name

* fix final nits

* Trigger CI

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
2023-05-09 13:04:10 -04:00
s-JoL c2c99dc7ef
add open-llama model with ckpt (#22795)
* update Open-Llama model

* update

* update format

* update doc

* update

* update stable embedding test

* update test case

* update format

* update readme

* fix typo

* update name

* remove tokenizer and update format

* remove convert_open_llama_weights_to_hf

* update warning and doc_string

---------

Co-authored-by: songliang.bayesian <songliang.bayesian@bytedance.com>
2023-04-28 11:01:32 -04:00
Ehsan M. Kermani a0e7332839
Fix CLAP link across all READMEs (#23032)
* Fix CLAP link across all READMEs

* Fix copy only for en
2023-04-27 18:07:02 -04:00
NielsRogge 3d3204c025
Add FocalNet (#21532)
Adds FocalNet by Microsoft to transformers

---------

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: alaradirik <alaradirik@gmail.com>
2023-04-23 20:03:05 +03:00
Younes Belkada 2da73f6302
[`SAM`] Correct arxiv link (#22886)
put correct link
2023-04-20 11:23:12 +02:00
Arthur 474bf508df
Add Segment Anything Model (SAM) (#22654)
* initial commit

* keys match

* update, fix conversion

* fixes, inference working

* fix

* more fixes

* more fixes

* clean up

* more clean up

* fix copies and add convnext copied layer norm

* stash

* pretty big update

* cleaning

* more cleaning

* fixup stuffs

* fix copies

* fix init

* update test removing tokenizer

* nits

* add pretrained

* more nits

* remove tracking of pipeline

* few fixes

* update san and conversion script

* fix mask decoder and prompt encoder conversion

* fixes

* small update

* fix order

* fix

* fix image embeddings

* nits

* few fixes

* fix logits

* clean up

* fixes boxes inference

* v1 AMG

* clean up

* some clean up

* multi points support

* amg working

* fixup

* clean up

* readme

* update toctree

* fix type hint

* multiple fixes

* fixup

* fixes

* updates

* updates

* more tests

* few fixes

* change to `SamForMaskGeneration`

* doc

* fixup

* fix more tests

* multiple fixes

* fix CI tests

* refactor processor

* renamings

* draft the pipeline

* refactor

* fix tests

* fix test

* few cleanings

* fix test

* edit pipeline support chunking

* update

* add slow tests

* fix nit

* fixup

* fix nit

* current chunk pipeline

* cast boxes in fp32

* nit

* current updates

* pipeline works

* fixup

* clean up config

* fix slow tests

* fix slow tests

* clean up

* update doc and pipeline

* adds more slow tests

* fix slow tests

* cleaning

* tests pass

* add docstring

* fix copies

* clean up

* support batch of images

* style

* dummy is needed, add tests

* fix slow tests

* fix CI

* update

* adds more tests

* fixes

* fixes

* fixup

* fixes

* few fixes

* filter

* few fixes

* some refactor

* final touches

* fix

* style

* remove pipeline files

* fixes nits

* revert pipeline changes

* fix test

* fixup

* remove automodel for automatic mask generation

* fix failing torch tests

* update mdx

* revert removal of `MODEL_FOR_AUTOMATIC_MASK_GENERATION_MAPPING`

* update sam config based on review

Co-authored-by: amyeroberts <aeroberts4444@gmail.com>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>

* update low_resolution_masks -> pred_masks
init ln with layer_norm_eps
add_decomposed_rel_pos doc
forward doc of SamForMaskGeneration

* update processor docstring

* remove image processor import empty

* update for testing

* output vision hidden states + clean recomm
also test all iou values

* fixup

* fixup

* remove unused

* Update src/transformers/models/sam/modeling_sam.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/sam/image_processing_sam.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* nits

* fix

* fix CI tests and slow tests

* replace with Amy's processor

* clearer docstring

* add `SamVisionNeck`

* refactor - all CI tests should pass

* fix broken import on Gcolab

* few fixes here and there

* fix another bug

* fix more bugs

* update and merge

* correct ckpt

* address comments

* add tips

* revert

* fix docstring

* replace with `SamModel`

* make fixup

* add support for batched images and batched points

* make fixup this time, really

* make fixup again and again

* few fixes here and there, this should be the final touch

* Update docs/source/en/model_doc/sam.mdx

* fixup

* correct checkpoints

* correct name

* rm unneeded file

* add notebook

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: amyeroberts <aeroberts4444@gmail.com>
Co-authored-by: sgugger <sylvain.gugger@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-04-19 21:01:49 +02:00
amyeroberts 5f97bbc124
Remove 'main' from doc links (#22860)
2023-04-19 15:03:57 +01:00
pioliverse 523ca4e016
add model resources for CPMAnt (new) (#20906)
* resolve conflicts

* rebase and make style

* test

* test

* test

* rebase and make style

* rebase and make style

* tests

* tests

* rewrite some functions

* rebase and make style

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* fix some bugs & docstring

* add models and tests

* solve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* tests

* resolve conflicts

* resolve conflicts

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* fix some bugs & docstring

* save resolution

* make style

* delete redefinition code

* reformat function

* reformat

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* tests

* resolve conflicts

* resolve conflicts

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* resolve conflicts

* fix load_tf_weights_in_cpmant

* reformat some unrelated files

* upgrade quality

* resolve conflicts

* make style

* fix bugs and refactor

* modify docstrings and make style

* unify import format in __init__.py

* fix import-altclp bug

* fix copies to update index.md

* fix unused config parameters

* fix unused config parameters

* fix unused config parameters

* update README_ja.md

* dummy commit for unit test

* fix attention mask

* add CPMAntTokenizer&-Fast to auto-mapping

* drop redundant changes in README_ko

* fix defaults in docstring

* fix use_cache and some docstring

* add missing args in tokenizer

* modify tester inheritance

* add is_jieba_available

* fix some bugs

* make style and fix-copies

* add doctests

* skip integration tests

* add is_jieba_available

* fix bugs in common tests

* adjust docstrings and make style

* add argument docstring

* adjust code to some specifications

* make style and fix-copies

* add fast tokenization test

* dummy commit for unit test

* dummy commit for unit test

* dummy commit for unit test

* normalize some comments and names

* Bert->CPMAnt

* camel names and drop redundant codes

* make style and fix-copies

* add CpmTokenizerFast _import_structure

* drop cpmanttokenizerfast in model_doc

* fix some problems

* fix CPMAnt tokenization for common test

* make style and fixup

* fix copies and fixup

* fix bugs in tokenization test

* dummy commit for connection failure in unittest

* fix copies

* drop trailing comma

* fix decorator in tests

* dummy commit for connection failure in unittest

---------

Co-authored-by: Gong Baitao <gongbaitao11@gmail.com>
2023-04-12 07:33:20 -04:00
Joel Lamy-Poirier e0921c6b53
Add GPTBigCode model (Optimized GPT2 with MQA from Santacoder & BigCode) (#22575)
* Add model with cli tool

* Remove unwanted stuff

* Add new code

* Remove inference runner

* Style

* Fix checks

* Test updates

* make fixup

* fix docs

* fix doc

* fix test

* hopefully fix pipeline tests

* refactor

* fix CIs

* add comment

* rename to `GPTBigCodeForCausalLM`

* correct readme

* make fixup + docs

* make fixup

* fixes

* fixes

* Remove pruning

* Remove import

* Doc updates

* More pruning removal

* Combine copies

* Single MQA implementation, remove kv cache pre-allocation and padding

* Update doc

* Revert refactor to match gpt2 style

* Merge back key and value caches, fix some type hints

* Update doc

* Fix position ids with padding (PR 21080)

* Add conversion script temporarily

* Update conversion script

* Remove checkpoint conversion

* New model

* Fix MQA test

* Fix copies

* try fix tests

* FIX TEST!!

* remove `DoubleHeadsModel`

* add MQA tests

* add slow tests

* clean up

* add CPU checker

* final fixes

* fixes

- fix GPU issue
- fixed slow tests
- skip disk offload

* fix final issue

* Simplify and comment baddbmm fix

* Remove unnecessary code

* Transpose tweaks

* Use beta=1 on cpu, improve tests

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
2023-04-10 10:57:21 +02:00
Younes Belkada 176ceff91f
Add DePlot + MatCha on `transformers` (#22528)
* add deplot + matcha on `transformers`

* more docs

* correct path

* Update docs/source/en/model_doc/deplot.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix

* use auto processor

* Update docs/source/en/model_doc/matcha.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make fixup

* Update docs/source/en/model_doc/deplot.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add correct names

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2023-04-05 17:43:48 +02:00
Arthur 19ade2426a
[WIP]`NLLB-MoE` Adds the moe model (#22024)
* Initial commit

* update modeling code

* update doc

* add functions necessary

* fix imports

* revert changes

* fixup

* more styling to get going

* remove standalone encoder

* update code

* styling

* fix config and model

* update code and some refactoring

* make more tests pass

* Adding NLLB-200 - MoE - 54.5B for no language left behind
Fixes #21300

* fix more common tests

* style

* update testing file

* update

* update

* Router2 doc

* update check config with sparse layer

* add dummy router

* update current conversion script

* create on the fly conversion script

* Fixup

* style

* style 2

* fix empty return

* fix return

* Update default config sparse layers

* easier to create sparse layers

* update

* update conversion script

* update modeling

* add to toctree

* styling

* make ruff happy

* update docstring

* update conversion script

* update, will break tests but implementing top2

* update

* local groups are supported here

* ⚠️ Support for local groups is now removed ⚠️

This is because it has to work with model parallelism that we do not support

* finish simplification

* Fix forward

* style

* fixup

* Update modelling and test, refactoring

* update tests

* remove final layer_norm as it is done in the FF

* routing works! Logits test added

* nit in test

* remove top1router

* style

* make sure sparse are tested. Had to change route_tokens a little bit

* add support for unslip models when converting

* fixup

* style

* update tests

* update test

* REFACTOR

* encoder outputs match!

* style

* update testing

* 🎉encoder and decoder logits match 🎉

* styling

* update tests

* cleanup tests

* fix router test and CIs

* cleanup

* cleanup test styling

* fix tests

* Finally the generation tests match!

* cleanup

* update test

* style testing file

* remove script

* cleanup

* more cleanup

* nits

* update

* NLLB tokenizer is wrong and will be fixed soon

* use LongTensors

* update tests

* revert some small changes

* fix second expert sampling and batch prioritized routing

* update tests

* finish last tests

* make ruff happy

* update

* ruff again

* style

* Update docs/source/en/model_doc/nllb-moe.mdx

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Updates based on review

* style and fix import issue

* nit

* more nits

* cleanup

* styling

* update test_seconde_expert_policy

* fix name

* last nit on the markdown examples

---------

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-03-27 19:42:00 +02:00
Mitch Naylor 57f25f4b7f
Add Mega: Moving Average Equipped Gated Attention (#21766)
* add mega file structure and plain pytorch version of mega source code

* added config class with old naming conventions

* filled in mega documentation

* added config class and embeddings with optional token types

* updated notes

* starting the conversion process, deleted intermediate and added use_cache back to config

* renamed config attributes in modeling_mega.py

* checkpointing before refactoring incremental decoding functions

* removed stateful incremental key/values for EMA and self-attention

* refactored MovingAverageGatedAttention to remove stateful k/v history and use unified attention mask

* MovingAverageGatedAttention works with incremental decoding + past values, added sequence length enforcement

* more comments in MovingAverageGatedAttention + checkpointing before GatedCrossAttention

* bug fix in attention mask handling in MovingAverageGatedAttention

* removed incremental state from GatedCrossAttention and removed IncrementalState class

* finished gated cross attention and got MegaLayer working

* fixed causal masking in mega decoder

* fixed how padding and causal masks are passed through MegaLayer with and without k/v caching

* finished MegaModel; tested with encoder, decoder-only, and cross-attention type inputs; started work on downstream classes; removed mentions of position_ids

* added optional dense hidden layer for masked and causal LM classes

* docstring updates in MultiHeadEMA and GatedCrossAttention, removed unnecessary inputs in cross-attention

* removed before_attn_fn in Mega class and updated docstrings and comments up to there

* bug fix in MovingAverageGatedAttention masking

* working conversion of MLM checkpoint in scratchpad script -- perfect matches

* moved arg for hidden dense layer in LM head to config; discovered issue where from_pretrained is renaming gamma and beta parameters

* renamed gamma and beta parameters to avoid HF renaming when loading from checkpoint

* finished checkpoint conversion script

* cleanup old class in mega config script

* removed 'copied from' statements and passing integration tests

* added num_attention_heads=1 to config for integration compatibility, decoder tests working, generation tests failing

* fixed tuple output of megamodel

* all common tests passing after fixing issues in decoder, gradient retention, and initialization

* added mega-specific tests, ready for more documentation and style checks

* updated docstrings; checkpoint before style fixes

* style and quality checks, fixed initialization problem in float_tensor, ready for PR

* added mega to toctree

* removed unnecessary arg in megaconfig

* removed unused arg and fixed code samples with leftover roberta models

* Apply suggestions from code review

Applied all suggestions except the one renaming a class, as I'll need to update that throughout

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixed issue where .view breaks batch dimension, conversion script fixed with absolute imports, updated readme with Mega->MEGA

* removed asserts in Mega code, renamed sequencenorm, gatedcrossattention, and NFFN, replaced get_activation_fn with ACT2FN, and added sequencenorm to layer norms

* reformatted .forward() docstrings to match style and removed unused mask input in cross-attention

* removed all reset_parameters() methods and rolled into MegaPreTrainedModel._init_weights()

* renamed all single-letter variables and improved readability in tensor size comments, Mega->MEGA in 2 documentation files

* variable names in NFFN

* manual Mega->MEGA changes in docs

* Mega->MEGA in config auto

* style and quality fixes

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* renamed parameters and variables with confusing names, added copied from statements, moved fft conv to its own method, other cleanup from PR comments

* commit before dealing with merge conflicts

* made new attention activation functions available in ACT2FN and added generation test from OPT

* style and quality in activations and tests

* documentation fixes, renaming variables in dropout and rotary positions, used built-in causal masking, encoders->layers in MegaModel, moved comments into docstrings

* style and quality fixes after latest updates, before rotary position ids

* causal mask in MegaBlock docstring + added missing device passing

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* added Mega prefixes where missing, reverted MegaSequenceNorm to if-else, other module renaming requested in PR

* style and quality fixes + readme updates pointing to main

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-03-24 08:17:27 -04:00
Younes Belkada 0f68a7f408
Add Pix2Struct (#21400)
* v1 all keys match

* clean up

* forward pass ok

* add correct image transform

* generate works, logits matching

* clean up

* more refactor

* revert

* revert

* clean up

* clean ups

* clean up

* refactor

* refactor

* fix doc

* fix tokenizer test

* fix toctree

* revert toctree

* oops

* few fixes

* replace to `pixel_embeds`

* make fixup

* test processing & feat extractor

* fix some tests

* more fixes

* make fixup

* clean up

* more clean up

* add a single slow test

* fix test

* make fixup

* fix

* fix authors

* fix toctree

* update docs

* add docstring

* revert change

* Update src/transformers/models/pix2struct/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix tokenizer

* fix processor test

* fix test

* make fixup

* refactor

* fix config

* Update src/transformers/models/pix2struct/image_processing_pix2struct.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* format

* fix

* Update src/transformers/models/pix2struct/image_processing_pix2struct.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* make fixup

* add docstring

* fix issues

* fix

* fix

* fix

* add slow test

* fix

* fix

* fix batched issue

* fix training issues

* fix ci test

* fix slow test

* fix conversion script

* remove unneeded classes

* fix slow test

* fix require backends

* fix masked fill

* revert

* fix softmax

* add large models support

* fix conditional generation

* few fixes

* add instructions

* rm unneeded file

* Update src/transformers/models/pix2struct/convert_pix2struct_original_pytorch_to_hf.py

* fix ci test

* fix ci test really

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* fix nit

* fix nits

* fix image processors nits

* docstring

* clean up

* fix nit

* fix tests

* docstring nit

* fix reshape

* Update src/transformers/models/pix2struct/image_processing_pix2struct.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix nit

* fix repetition

* refactor processor

* make patch size consistent

* refactor forward

* fix docstring

* fix max_patches issue

* update docstring

* update docstring

* fix copied from

* add skip reasons

* few fixes

* Update src/transformers/models/pix2struct/image_processing_pix2struct.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* format

* fix doctests

* refactor and fix

* fix doc build issue

* fix processor test

* small fix conversion script

* replace correct weights

* make fixup

* fix some issues

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* revert config and fixes

* Update src/transformers/models/pix2struct/image_processing_pix2struct.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* more details

* fixes

* fix processor

* fix processor test

* fix

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* make fixup

* fix processor

* Update src/transformers/models/pix2struct/modeling_pix2struct.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* add copied

* make fixup

* fix copies

* update docstring

* refactor

* fix docstring

* fix conversion script

* fix vqa issue

* replace to `flattened_patches`

* nit

* fix numpy issue

* fix image processors

* add batched vqa support

* fix vqa conversion

* make fixup

* fix conversion script

* Apply suggestions from code review

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* make fixup

* add correct docstring

* update docstring

* fix module level + channel dim

* use `make_list_of_images`

* refactor

* correct docstring

* fix authors

* remove `data_format`

* add header text test

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* make fixup

* add checkpoints

---------

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2023-03-22 16:53:52 +01:00
Jason Phang 0041be5b3d
LLaMA Implementation (#21955)
* LLaMA

* sharding and docs

* tweak

* black

* inits

* ruff

* LLAMA_PRETRAINED_CONFIG_ARCHIVE_MAP

* init

* no checkpoint

* docs

* ruff

* type_vocab_size

* tokenizer fixes

* tokenizer fixes

* Update tokenization_llama.py

* Update tokenization_llama.py

* Update configuration_llama.py

* Update modeling_llama.py

* tokenizer add_bos by default

* licenses

* remove decoder

* norms and mlp

* rope overhaul

* tweaks

* black

* mention OPT implementation

* off-by-one naming

* typo

* fix

* tokenization fix and slicing bug

* padding config

* cleanup

* black

* update tests

* undo typo

* fix vocab caching logic

* ruff

* docbuilder

* attn fix from BlackSamorez

* initial feedback

* typo

* docs

* llama case

* llama case

* load checkpoint docs

* comment about tokenizer

* tokenizer defaults

* clear past_key_values if use_cache=False

* last tweaks

* last tweaks

* last tweaks

* last tweaks

---------

Co-authored-by: Stella Biderman <stellabiderman@gmail.com>
2023-03-16 09:00:53 -04:00
Sylvain Gugger ebdb185bef
v4.28.0.dev0
2023-03-14 13:49:10 -04:00
Alara Dirik cdddfbffa1
Add ConvNeXT V2 (#21679)
* Add ConvNeXt V2 to transformers
* TF model is separated from the PR to fix issues
2023-03-14 12:08:14 +03:00
Sylvain Gugger 6cb5132a7f
Fix doc link for MGP-STR (#22138)
2023-03-13 15:26:50 +00:00
wangpeng 102b5ff4a8
add new model of MGP-STR (#21418)
* add new model of MGP-STR

* fix the check failings

* remove torch and numpy from mgp_tokenization

* remove unused import from modeling_mgp_str

* add test_processing_mgp_str

* rm test_processing_mgp_str.py

* add test_processing_mgp_str

* add test_processing_mgp_str

* add test_processing_mgp_str

* rm test_processing_mgp_str and add softmax outs to model

* rm test_processing_mgp_str and add softmax outs to model

* rewrite the code of mgp-str according to PR suggestions

* rewrite the code of mgp-str according to PR suggestions

* add new model of MGP-STR

* fix the check failings

* remove torch and numpy from mgp_tokenization

* remove unused import from modeling_mgp_str

* add test_processing_mgp_str

* rm test_processing_mgp_str.py

* add test_processing_mgp_str

* add test_processing_mgp_str

* add test_processing_mgp_str

* rm test_processing_mgp_str and add softmax outs to model

* rewrite the code of mgp-str according to PR suggestions

* rewrite the code of mgp-str according to PR suggestions

* remove representation_size from MGPSTRConfig

* reformat configuration_mgp_str.py

* format test_processor_mgp_str.py

* add test for tokenizer and complete model/processor test and model file

* rm unnecessary tuple in modeling_mgp_str

* reduce hidden_size/layers/label_size in test_model

* add integration tests and change MGPSTR to Mgpstr

* add test for logit values

* reformat test model file

---------

Co-authored-by: yue kun <yuekun.wp@alibaba-inc.com>
2023-03-13 10:11:31 +00:00
Eli Simhayev 8abe4930d3
[Time-Series] informer model (#21099)
* added informer to gitignore

* added informer to gitignore

* WIP informer2020

* added checking that instantiate works

* added config using gluonTS by kashif

* WIP config

* adding InformerConfig. need to remove FeatureEmbedder

* done InformerConfig, but need to change the names

* Done informer model init. working on enc-dec

* added things to address, after reading again enc-dec in the paper

* done modeling - checking initialization work

* added informer to gitignore

* WIP informer2020

* added checking that instantiate works

* added config using gluonTS by kashif

* WIP config

* adding InformerConfig. need to remove FeatureEmbedder

* done InformerConfig, but need to change the names

* Done informer model init. working on enc-dec

* added things to address, after reading again enc-dec in the paper

* done modeling - checking initialization work

* moved enc-dec init to InformerEncoder/Decoder init

* added 'init_std' to config, now model init works!

* WIP conversion script, and added code sources

* WIP conversion script: loading original informer pth works

* WIP conversion script: change defaults in the config

* WIP conversion script: supporting Informer input embedding

* WIP conversion script: added parameters for the informer embed

* WIP conversion script: change dim_feedforward=2048

* WIP conversion script: remove unused args for loading checkpoint

* just cleaning up

* DataEmbedding removed, after thinking with Kashif

* working on forward pass

* WIP forward pass: trying to establish working batch for forward pass

* cleaning and finalizing

* adding HF names and docs

* init after cleaning works

* WIP in tests

* added docs for the informer specific args

* fix style

* undo change

* cleaning informer, now need to work only enc-dec

* initial enc-dec classes

* added encoder and decoder

* added todo

* add todos for conv_layers

* added decoder docs from vanilla

* added encoder docs from vanilla

* remove encoder decoder from the original informer

* removed AttentionLayer from the original paper

* removed TriangularCausalMask, same as decoder_attention_mask

* initial sparse attention

* use conv_layers

* fixed test_config test

* fix parenthesis when iterating zip(layers, conv_layers)

* error found in prob attention, added sizes as comments

* fix sizes

* added proposal for q_reduce indexing, and remove unused

* WIP ProbMask, and changed factor=2 for testing

* remove unused libs for this PR for creating the env

* fix checking the attn_weights.size() after bmm

* Q_reduce: changed from torch.gather to simple slicing

* WIP calculate final attn_output

* finish adding v_aggregated, attn_output ready

* changed tgt_len to u in attention_mask, need to fix the size error

* comment attention_mask for encoder, and fix if cond for v_agg

* added ProbMask support (wip), removed old original code

* finished ProbMask 😃

* Revert "remove unused libs for this PR for creating the env"

This reverts commit 11a081e09e.

* fixes

* make style

* fix initial tests

* fix more tests

* dry

* make style

* remove unused files

* style

* added integration tests

* fix num_static_real_features

* fix header

* remove unused function

* fix example

* fix docs

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/modeling_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/informer/configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fixes for reviewer

* use prediction_length from model

* fix style

* fixed informer.mdx

* added to index

* updated readme

* undo

* make fix-copies

* typo

* fix copy

* added Informer to toctree

* in order

* fixed comments

* remove unneeded new lines in docs

* make static real and cat optional

* fix use of distil conv layers

* fixed integration test

* added checkpoint for convlayer

* make fix-copies

* updated from time series model

* make fix-copies

* copy decoder

* fix unit tests

* updated scaling config

* fix integration tests

* IGNORE_NON_TESTED

* IGNORE_NON_AUTO_CONFIGURED

* IGNORE_NON_AUTO_CONFIGURED

* updated check configs

* fix formatting

* undo change from time series

* prediction_length should not be None

* align with the blog: prettify ProbSparse and change attention_factor to sampling_factor

* make style

* make fix-copies

* niels CR: update contributed by

* niels CR: update configuration_informer.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* niels CR: update kashif -> huggingface

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* niels CR: `sampling_factor` only relevant when `attention_type`=prob

* make style

* fixed U_part: added multiplication by `L_Q`

* fixed bug: remove `is not None` from `if config.distil`

* fixed test: `decoder_seq_length` to `encoder_seq_length` in cross_attentions check

* fix integration tests

* updated model hub

* do not shift as in training

* undo

* fix make-copies

* make fix-copies

* added `if prediction_length is None`

* changed `ProbSparseAttention` to `InformerProbSparseAttention`

* changed `V_sum` -> `v_mean_dim_time`

* changed `ConvLayer` to `InformerConvLayer` and fixed `super()`

* TimeSeriesTransformer->Informer in decoder's Copied from

* more descriptive in ProbSparse

* make style

* fix copied from

* Revert "added `if prediction_length is None`"

This reverts commit b4cbddfa05.

* fixed indent

* use InformerSinusoidalPositionalEmbedding

* make fix-style

* fix from #21860

* fix name

* make fix-copies

* use time series utils

* fix dec num_heads

* docstring

* added time series util doc

* _import_structure

* formatting

* changes from review

* make style

* fix docs

* fix doc

* removed NegativeLogLikelihood

---------

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
2023-03-07 21:36:38 +01:00
Arthur 82aac00e0f
[Flan-UL2] Add-flan-ul2 (#21929)
* add doc and readme

* add model docs

* update toctree and fix copies

* update

* update doc file

* fix

* add FLAN-UL2 to configuration mapping

* fixup

* Apply suggestions from code review

* more clarification

---------

Co-authored-by: younesbelakda <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-03-03 17:57:24 +01:00
Alara Dirik 269b054939
Add ALIGN to transformers (#21741)
Adds the ALIGN model to transformers. ALIGN is introduced in "Scaling Up Visual and Vision-Language Representation Learning With Noisy Text Supervision" by Chao Jia, Yinfei Yang, Ye Xia, Yi-Ting Chen, Zarana Parekh, Hieu Pham, Quoc V. Le, Yunhsuan Sung, Zhen Li, Tom Duerig.
2023-03-01 21:23:31 +03:00
Alara Dirik 49ab16239c
Add EfficientNet (#21563)
* Add EfficientNet to transformers
2023-02-20 16:37:11 +03:00
tanreinama f56174ac5b
add GPTSAN model (reopen) (#21291)
* add GPTSAN-Japanese

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN (update for review)

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* fix typo in comment text

* add GPTSAN

* add GPTSAN

* add GPTSAN

* add GPTSAN

* fix document and comments

* fix class name GPTSAN->GPTSan

* fix import and test for tokenizer
2023-02-20 11:25:27 +01:00
Arthur c236a62172
[CLAP] Add CLAP to the library (#21370)
* add model like clip

* update

* text model ok

* clap text works

* some refactor

- `CLAPVision` to `CLAPAudio`
- refactor kwargs of audio modules

* more refactor

* more refactor

* more refactor

* correct fusion

* more refactor

* new modules

* add basic processor

* fixup

* remove whisper copied from

* audio logits match

* add doc

* correct filters mel and add maxlength

* style

* few fixes

* forward passes

* fixup

* fixup

* some clean up

* remove mels from the dictionary

* pad after the repeat

* update padding when smaller

* fix padding

* style

* use swin patch merging

* use copied from swin

* processor with any tokenizer

* more copied from

* some clean up

* more refactor

* fix mel when rand_trunc

* style

* remove unused imports

* update processing

* remove image processing tests

* add testing file

* fix modeling issues

* replace with `is_longer`

* clap in serialization

* more refactor

* `make fixup`

* make fixup

* fix feature extractor

* update test feature extractor

* `make fixup`

* clean up config

* more clean up

* more cleanup

* update tests

* refactor tests and inits

* remove CLAP vision config

* remove CLAP from image processing auto and dummy vision objects

* update inits

* style

* re order classes in modeling clap

* Use roberta tokenizer as the other weights are not open sourced

* small cleanup

* remove tokenization CLAP

* processor tokenizer is roberta

* update feature extraction doc

* remove vclap from model zero shot

* update f_min and f_max to frequency_xx

* some changes

- fix modeling keys
- add `is_longer` in the forward pass
- make fixup

* make fixup

* consistent behavior between rand_crop and fusion

* add numpy resize and bilinear and documentation

* move resizing to image utils

* clean feature extraction

* import resize from correct file

* resize in image transforms

* update

* style

* style

* nit

* remove unused arguments from the feature extractor

* style

* few fixes + make fixup

* oops

* fix more tests

* add zero shot audio classification pipeline

* update zeroshot classification pipeline

* fixup

* fix copies

* all CI tests pass

* make fixup + fix docs

* fix docs

* fix docs

* update tests pipeline

* update zero shot pipeline

* update feature extraction clap

* update tokenization auto

* use nested simplify

* update pipeline tests

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* split in two lines

* fixes

* refactor

* clean up

* add integration tests

* update config docstring

* style

* update processor

* fix processor test

* fix feat extractor tests

* update docs

* Apply suggestions from code review

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fix readmes

* fix tips

* Update src/transformers/models/auto/configuration_auto.py

* update doc and remove todo -> properly explained

* fix idx and typo

* typo

* cleanup config

* cleanup tests, styles and doc

* ignore docstyle on image transform

* add conversion script

* remove the `clap` indx in favor of `CLAP`

* update __init

* nits

* Update src/transformers/pipelines/__init__.py

* fix bug

* clarify config

* fix copy

* fix init

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix model output

* fix comment

* make fixup

* make fixup

* rename to `Clap`

* replace to `Clap`

* replace to `Clap`

* repo consistency

* again repo-consistency

* make fixup

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* add config

* changes

* update conversion

* Apply suggestions from code review

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* remove unused function

* update based on code reviews

* style

* more comments

* cleanup

* clean up

* style

* apply suggestions

* Empty commit

* pipeline will be added in a different PR

* update calls to audio utils functions

* update pipeline init

* style

* style

* styling again

* use pad

* fix repo-consistency

* update utils and add doc for audio utils

* clean up resize by using torch. update inits accordingly

* style

* CLAP's tokenizer is RoBERTa

* add audio utils to internal toctree

* update toctree

* style

* update documentation and normalize naming across audio utils and feature extraction clap

* style

* clean up

* update doc and typos

* fix doctest

* update modeling code, got rid of a lot of reshaping

* style on added doc audio utils

* update modeling clap

* style

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* docstring variables with CLAP

* rename key

* update modeling CLAP

* update audio utils docstring

* update processing clap

* fix readmes

* fix toctree

* update configuration clap

* fix init

* make fixup

* fix

* fix

* update naming

* update

* update checkpoint path

* Apply suggestions from code review

* Major refactoring

* Update src/transformers/models/clap/configuration_clap.py

* merge

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-02-16 20:59:27 +01:00
Zineng Tang a0e69a9375
Add TVLT (#20725)
* Update image_processing_tvlt.py

* Update modeling_tvlt.py

* Update

* Update modeling_tvlt.py

* Create tvlt.mdx

* Update configuration_tvlt.py

* Update modeling_tvlt.py

* Update test_modeling_tvlt.py

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Update image_processing_tvlt.py

* Update feature_extraction_tvlt.py

* Update tvlt models

* Update tests

* Update

* Update

* Update tests

* Update README_ko.md

* Update README_ja.md

* Update README_ko.md

* Update README_zh-hans.md

* Update docs/source/en/model_doc/tvlt.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/tvlt.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update tvlt.mdx

* Update modeling_tvlt.py

* Update configuration_tvlt.py

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Add files via upload

* Update model

* Update modeling_tvlt.py

* Update tvlt models

* Update src/transformers/models/tvlt/__init__.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/__init__.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add files via upload

* Add files via upload

* Delete modeling_tvlt.py

* Delete feature_extraction_tvlt.py

* Delete configuration_tvlt.py

* Delete image_processing_tvlt.py

* Delete processing_tvlt.py

* Update tvlt

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/tvlt/test_modeling_tvlt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update tests/models/tvlt/test_modeling_tvlt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update README.md

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update README_es.md

* Update README_hd.md

* Update README_ja.md

* Update README_ko.md

* Update README_zh-hans.md

* Update README_zh-hant.md

* Update index.mdx

* Update tvlt.mdx

* Update tvlt.mdx

* Update configuration_tvlt.py

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update modeling_tvlt.py

* Add files via upload

* Update tvlt.mdx

* Update modeling_auto.py

* Add files via upload

* Add files via upload

* Update dummy_pt_objects.py

* Update __init__.py

* Update feature_extraction_tvlt.py

* Update feature_extraction_tvlt.py

* Update image_processing_tvlt.py

* Update modeling_auto.py

* Update test_feature_extraction_tvlt.py

* Update test_processor_tvlt.py

* Update test_feature_extraction_tvlt.py

* Add files via upload

* Update test_image_processor_tvlt.py

* Update tests/models/tvlt/test_processor_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/tvlt/processing_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_image_processor_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update tests/models/tvlt/test_image_processor_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_image_processor_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_image_processor_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_feature_extraction_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/processing_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/tvlt.mdx

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/feature_extraction_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/feature_extraction_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/feature_extraction_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/feature_extraction_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update feature_extraction_tvlt.py

* Update feature_extraction_tvlt.py

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update image_processing_tvlt.py

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Update test_image_processor_tvlt.py

* Update tests/models/tvlt/test_modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update tests/models/tvlt/test_modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Add files via upload

* Add files via upload

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Add files via upload

* Update docs/source/en/model_doc/tvlt.mdx

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update image_processing_tvlt.py

* Add files via upload

* Add files via upload

* Update tvlt.mdx

* Update docs/source/en/model_doc/tvlt.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/tvlt.mdx

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update docs/source/en/model_doc/tvlt.mdx

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update docs/source/en/model_doc/tvlt.mdx

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>

* Add files via upload

* Add files via upload

* Add files via upload

* Add files via upload

* Update modeling_auto.py

* Update tvlt.mdx

* Update dummy_pt_objects.py

* Update feature_extraction_tvlt.py

* Update modeling_tvlt.py

* Update test_feature_extraction_tvlt.py

* Update test_image_processor_tvlt.py

* Update test_feature_extraction_tvlt.py

* Update modeling_tvlt.py

* Update dummy_pt_objects.py

* Update dummy_speech_objects.py

* Add files via upload

* Update README_hd.md

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Update test_modeling_tvlt.py

* Update src/transformers/models/tvlt/configuration_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/feature_extraction_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/image_processing_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update MAE processing

* Update modeling_tvlt.py

* Update modeling_tvlt.py

* Update modeling

* Update style

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/tvlt/modeling_tvlt.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update check_repo.py

* Update tvlt.mdx

* Update __init__.py

* Update tests

* Update tvlt models

* Update configuration_tvlt.py

* Update configuration_tvlt.py

* Update image_processing_tvlt.py

* Update dummy_pt_objects.py

* Add files via upload

* Update test_modeling_tvlt.py

* Update test_feature_extraction_tvlt.py

* Update test_feature_extraction_tvlt.py

* Update test_feature_extraction_tvlt.py

* Update test_feature_extraction_tvlt.py

* Update test_feature_extraction_tvlt.py

* Update test_feature_extraction_tvlt.py

---------

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
2023-02-15 18:10:30 +00:00
Susnato Dhar 0c9c8472e6
Add Ernie-M Model to huggingface (#21349)
* config and tokenization(fast too) changed and ErnieEncoder added

* Slow Tokenization Added

* Tokenizer(slow) is now working and Fast Tokenizer removed

* Added Config code

* Added Base Model and utils

* ErnieMModel is now working

* All added except tests

* All tests passed except ErnieUIEM

* All tests passed

* all fixes done

* all fixes done

* fixed MAP

* fixed check_code_quality

* fixed Build PR Documentation issue

* Added changes(comments) and also updated to the latest upstream/main

* Added fixup

* Added # Copied comments

* Added fixup

* Added more comments and some nits

* Added fixup

* Fixed README_hd.md

* Added more fixes

* ErnieMTokenizer (being sentencepiece) protected and other docs edited

* Added code_quality fix

* Fixed for

* Added more fix

* modified AZ

* ernie-m tokenization test added!

* attention mask part fixed(with 0->self.config.pad_token_id)

* applied make fixup
2023-02-15 09:24:56 -05:00
Jannis Vamvas b0d539ccad
Add X-MOD (#20939)
* Add X-MOD to Readme

* Add documentation for X-MOD

* Implement X-MOD

* Fix formatting of X-MOD docs

* Change signature of X-MOD forward methods to use lang_ids

* Minor changes

* Rebase with main and run make fix-copies

* Make suggested changes to docstrings

* Improve code readability

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Fix code style

* Conversion script: Remove asserts and type annotations

* Remove _TOKENIZER_FOR_DOC

* XMOD -> Xmod

* Update copyright note

* Fix doctests

* Fix docstring

* Add integration test for FillMaskPipeline

* Revert "Add integration test for FillMaskPipeline"

This reverts commit 4381eb3b1d0f5d85785f89caba83928e6efa6d1f.

* Add end-to-end integration test for mask fill

* make style

* Rebase with main and make fix-copies

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
2023-02-10 15:32:06 +01:00
NielsRogge d7f1e7c009
Add BLIP-2 (#21441)
* First draft

* More improvements

* More improvements

* Improve conversion script

* Convert all weights

* Make forward pass work

* Make logits match

* More improvements

* More improvements

* More improvements

* Use get_input_embeddings

* Improve some more

* Improve model tests

* Improve model tests

* More improvements

* Fix processor

* Update files

* Update prepare_inputs_for_generation

* More improvements

* Fix copies

* More fixes

* Make fixup

* More improvements

* Add support for seq2seq language model

* More improvements

* Fix test

* More improvements

* Improve conversion script

* Remove some todo's

* Fix README's

* Improve conversion script

* Fix generation

* Fix style and remove Blip2Model

* Fix model outputs

* More improvements

* Set eos_token_id in config

* Fix quality

* Small improvements

* Add processor tests

* More improvements

* Apply suggestions

* Apply suggestions

* Add integration test

* Update image URL

* Add integration test

* Fix model_type

* Update style

* Improve docs

* Add doc tests

* Fix copies

* Remove tests which are passing

* Improve some more

* Add tests for seq2seq language models

* Minor fix

* Convert more checkpoints

* finalize CI

* Fix blip and blip2 processors

* add `accelerate` support for `blip2`

* clean up

* make style

* Update conversion script

* Update conversion script some more

* Update organization

* revert toc file

* add blip-2 to toc file

* Some more improvements

* Fix docstring

* Improve docs

---------

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
2023-02-09 16:52:11 +01:00
Stefan Schweter 7e51a441e4
Add XLM-V to Model Doc (#21498)
* doc: introduce new section for XLM-V model

* doc: mention more details for XLM-V integration

* docs: paper abstract in italics, model identifier for base model added

* doc: mention new XLM-V support

* auto: add XLM-V mapping

* doc: run make fix-copies ;)
2023-02-07 16:43:19 -05:00
Matthijs Hollemans e4bacf6614
[WIP] add SpeechT5 model (#18922)
* make SpeechT5 model by copying Wav2Vec2

* add paper to docs

* whoops added docs in wrong file

* remove SpeechT5Tokenizer + put CTC back in the name

* remove deprecated class

* remove unused docstring

* delete SpeechT5FeatureExtractor, use Wav2Vec2FeatureExtractor instead

* remove classes we don't need right now

* initial stab at speech encoder prenet

* add more speech encoder prenet stuff

* improve SpeechEncoderPrenet

* add encoder (not finished yet)

* add relative position bias to self-attention

* add encoder CTC layers

* fix formatting

* add decoder from BART, doesn't work yet

* make it work with generate loop

* wrap the encoder into a speech encoder class

* wrap the decoder in a text decoder class

* changed my mind

* changed my mind again ;-)

* load decoder weights, make it work

* add weights for text decoder postnet

* add SpeechT5ForCTC model that uses only the encoder

* clean up EncoderLayer and DecoderLayer

* implement _init_weights in SpeechT5PreTrainedModel

* cleanup config + Encoder and Decoder

* add head + cross attention masks

* improve doc comments

* fixup

* more cleanup

* more fixup

* TextDecoderPrenet works now, thanks Kendall

* add CTC loss

* add placeholders for other pre/postnets

* add type annotation

* fix freeze_feature_encoder

* set padding tokens to 0 in decoder attention mask

* encoder attention mask downsampling

* remove features_pen calculation

* disable the padding tokens thing again

* fixup

* more fixup

* code review fixes

* rename encoder/decoder wrapper classes

* allow checkpoints to be loaded into SpeechT5Model

* put encoder into wrapper for CTC model

* clean up conversion script

* add encoder for TTS model

* add speech decoder prenet

* add speech decoder post-net

* attempt to reconstruct the generation loop

* add speech generation loop

* clean up generate_speech

* small tweaks

* fix forward pass

* enable always dropout on speech decoder prenet

* sort declaration

* rename models

* fixup

* fix copies

* more fixup

* make consistency checker happy

* add Seq2SeqSpectrogramOutput class

* doc comments

* quick note about loss and labels

* add HiFi-GAN implementation (from Speech2Speech PR)

* rename file

* add vocoder to TTS model

* improve vocoder

* working on tokenizer

* more better tokenizer

* add CTC tokenizer

* fix decode and batch_code in CTC tokenizer

* fix processor

* two processors and feature extractors

* use SpeechT5WaveformFeatureExtractor instead of Wav2Vec2

* cleanup

* more cleanup

* even more fixup

* notebooks

* fix log-mel spectrograms

* support reduction factor

* fixup

* shift spectrograms to right to create decoder inputs

* return correct labels

* add labels for stop token prediction

* fix doc comments

* fixup

* remove SpeechT5ForPreTraining

* more fixup

* update copyright headers

* add usage examples

* add SpeechT5ProcessorForCTC

* fixup

* push unofficial checkpoints to hub

* initial version of tokenizer unit tests

* add slow test

* fix failing tests

* tests for CTC tokenizer

* finish CTC tokenizer tests

* processor tests

* initial test for feature extractors

* tests for spectrogram feature extractor

* fixup

* more fixup

* add decorators

* require speech for tests

* modeling tests

* more tests for ASR model

* fix imports

* add fake tests for the other models

* fixup

* remove jupyter notebooks

* add missing SpeechT5Model tests

* add missing tests for SpeechT5ForCTC

* add missing tests for SpeechT5ForTextToSpeech

* sort tests by name

* fix Hi-Fi GAN tests

* fixup

* add speech-to-speech model

* refactor duplicate speech generation code

* add processor for SpeechToSpeech model

* add usage example

* add tests for speech-to-speech model

* fixup

* enable gradient checkpointing for SpeechT5FeatureEncoder

* code review

* push_to_hub now takes repo_id

* improve doc comments for HiFi-GAN config

* add missing test

* add integration tests

* make number of layers in speech decoder prenet configurable

* rename variable

* rename variables

* add auto classes for TTS and S2S

* REMOVE CTC!!!

* S2S processor does not support save/load_pretrained

* fixup

* these models are now in an auto mapping

* fix doc links

* rename HiFiGAN to HifiGan, remove separate config file

* REMOVE auto classes

* there can be only one

* fixup

* replace assert

* reformat

* feature extractor can process input and target at same time

* update checkpoint names

* fix commit hash
2023-02-03 12:43:46 -05:00
NielsRogge 5451f8896c
Add DETA (#20983)
* First draft

* Add initial draft of conversion script

* Convert all weights

* Fix config

* Add image processor

* Fix DetaImageProcessor

* Run make fix copies

* Remove timm dependency

* Fix dummy objects

* Improve loss function

* Remove conv_encoder attribute

* Update conversion scripts

* Improve postprocessing + docs

* Fix copied from statements

* Add tests

* Improve postprocessing

* Improve postprocessing

* Update READMEs

* More improvements

* Fix rebase

* Add is_torchvision_available

* Add torchvision dependency

* Fix typo and README

* Fix bug

* Add copied from

* Fix style

* Apply suggestions

* Fix thanks to @ydshieh

* Fix another dependency check

* Simplify image processor

* Add scipy

* Improve code

* Add threshold argument

* Fix bug

* Set default threshold

* Improve integration test

* Add another integration test

* Update setup.py

* Address review

* Improve deformable attention function

* Improve copied from

* Use relative imports

* Address review

* Replace assertions

* Address review

* Update dummies

* Remove dummies

* Address comments, update READMEs

* Remove custom kernel code

* Add image processor tests

* Add requires_backends

* Add minor comment

* Update scripts

* Update organization name

* Fix defaults, add doc tests

* Add id2label for object 365

* Fix tests

* Update task guide
2023-01-31 10:43:10 +01:00
Anahita Bhiwandiwalla 3a6e4a221c
Add BridgeTower model (#20775)
* Commit with BTModel and latest HF code

* Placeholder classes for BTForMLM and BTForITR

* Importing Bert classes from transformers

* Removed objectives.py and dist_utils.py

* Removed swin_transformer.py

* Add image normalization, BridgeTowerForImageAndTextRetrieval

* Add center_crop

* Removing bert tokenizer and LCI references

* Tested config loading from HF transformers hub

* Removed state_dict updates and added path to hub

* Enable center crop

* Getting image_size from config, renaming num_heads and num_layers

* Handling max_length in BridgeTowerProcessor

* Add BridgeTowerForMaskedLM

* Add doc string for BridgeTowerConfig

* Add doc strings for BT config, processor, image processor

* Adding docs, removed swin

* Removed convert_bridgetower_original_to_pytorch.py

* Added doc files for bridgetower, removed is_vision

* Add support attention_mask=None and BridgeTowerModelOutput

* Fix formatting

* Fixes with 'make style', 'make quality', 'make fixup'

* Remove downstream tasks from BridgeTowerModel

* Formatting fixes, add return_dict to BT models

* Clean up after doc_test

* Update BTModelOutput return type, fix todo in doc

* Remove loss_names from init

* implement tests and update tuples returned by models

* Add image reference to bridgetower.mdx

* after make fix-copies, make fixup, make style, make quality, make repo-consistency

* Rename class names with BridgeTower prefix

* Fix for image_size in BTImageProcessor

* implement feature extraction bridgetower tests

* Update image_mean and image_std to be list

* remove unused import

* Removed old comments

* Rework CLIP

* update config in tests following config update

* Formatting fixes

* Add copied from for BridgeTowerPredictionHeadTransform

* Update bridgetower.mdx

* Update test_feature_extraction_bridgetower.py

* Update bridgetower.mdx

* BridgeTowerForMaskedLM is conditioned on image too

* Add BridgeTowerForMaskedLM

* Fixes

* Call post_init to init weights

* Move freeze layers into method

* Remove BTFeatureExtractor, add BT under multimodal models

* Remove BTFeatureExtractor, add BT under multimodal models

* Code review feedback - cleanup

* Rename variables

* Formatting and style to PR review feedback

* Move center crop after resize

* Use named parameters

* Style fix for modeling_bridgetower.py

* Update docs/source/en/model_doc/bridgetower.mdx

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/bridgetower.mdx

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/bridgetower.mdx

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/bridgetower/modeling_bridgetower.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update src/transformers/models/bridgetower/modeling_bridgetower.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Update docs/source/en/model_doc/bridgetower.mdx

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Update src/transformers/models/bridgetower/modeling_bridgetower.py

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Rename config params, copy BERT classes, clean comments

* Cleanup irtr

* Replace Roberta imports, add BTTextConfig and Model

* Update docs, add visionconfig, consistent arg names

* make fixup

* Comments for forward in BTModel and make fixup

* correct tests

* Remove inconsistent roberta copied from

* Add BridgeTowerTextModel to dummy_pt_objects.py

* Add BridgeTowerTextModel to IGNORE_NON_TESTED

* Update docs for BT Text and Vision Configs

* Treat BridgeTowerTextModel as a private model

* BridgeTowerTextModel as private

* Run make fix-copies

* Adding BTTextModel to PRIVATE_MODELS

* Fix for issue with BT Text and Image configs

* make style changes

* Update README_ja.md

Add から to BridgeTower's description

* Clean up config, .mdx and arg names

* Fix init_weights. Remove nn.Sequential

* Formatting and style fixes

* Re-add tie_word_embeddings in config

* update test implementation

* update style

* remove commented out

* fix style

* Update README with abs for BridgeTower

* fix style

* fix mdx file

* Update bridgetower.mdx

* Update img src in bridgetower.mdx

* Update README.md

* Update README.md

* resolve style failed

* Update _toctree.yml

* Update README_ja.md

* Removed mlp_ratio, rename feats, rename BTCLIPModel

* Replace BTCLIP with BTVisionModel,pass in vision_config to BTVisionModel

* Add test_initialization support

* Add support for output_hidden_states

* Update support for output_hidden_states

* Add support for output_attentions

* Add docstring for output_hidden_states

* update tests

* add bridgetowervisionmodel as private model

* rerun the PR test

* Remove model_type, pass configs to classes, renames

* Change self.device to use weight device

* Remove image_size

* Style check fixes

* Add hidden_size and num_hidden_layers to BridgeTowerTransformer

* Update device setting

* cosmetic update

* trigger test again

* trigger tests again

* Update test_modeling_bridgetower.py

trigger tests again

* Update test_modeling_bridgetower.py

* minor update

* re-trigger tests

* Update docs/source/en/model_doc/bridgetower.mdx

Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>

* Remove pad, update max_text_len, doc cleanup, pass eps to LayerNorm

* Added copied to, some more review feedback

* make fixup

* Use BridgeTowerVisionEmbeddings

* Code cleanup

* Fixes for BridgeTowerVisionEmbeddings

* style checks

* re-tests

* fix embedding

* address comment on init file

* retrigger tests

* update import prepare_image_inputs

* update test_image_processing_bridgetower.py to reflect test_image_processing_common.py

* retrigger tests

Co-authored-by: Shaoyen Tseng <shao-yen.tseng@intel.com>
Co-authored-by: Tiep Le <tiep.le@intel.com>
Co-authored-by: amyeroberts <22614925+amyeroberts@users.noreply.github.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Tiep Le <97980157+tileintel@users.noreply.github.com>
2023-01-25 14:04:32 -05:00
Bartosz Szmelczynski 1b37fb5e17
Efficientformer (#20459)
- Adds EfficientFormer V1 to transformers
- PR co-authored by @novice03 and @Bearnardd

Co-authored-by: novice <pranavpulijala@gmail.com>
Co-authored-by: novice <44259234+novice03@users.noreply.github.com>
2023-01-20 11:35:42 +03:00
Clémentine Fourrier 87208a05af
Graphormer model for Graph Classification (#20968)
* [FT] First commit for graphormer architecture.

The model has no tokenizer, as it uses a collator and preprocessing function for its input management.
Architecture to be tested against original one.
The arch might need to be changed to fit the checkpoint, but a revert to the original arch will make the code less nice to read.
TODO: doc

* [FIX] removed test model

* [FIX] import error

* [FIX] black and flake

* [DOC] added paper refs

* [FIX] [DOC]

* [FIX] black

* [DOC] Updated READMEs

* [FIX] Order of imports + rm Tokenizer calls

* [FIX] Moved assert in class to prevent doc build failure

* [FIX] make fix-copies

* [Doc] update from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* [FIX] Removed Graphormer from Sequence classification model list

* [DOC] Added HF copyright to Cython file

* [DOC] Fixed comments

* [FIX] typos in class doc + removed config classes.

Todo: update doc from paper definitions

* [FIX] Removed dependency to fairseq, and replaced all asserts with Exception management

* [FIX] Homogenized initialization of weights to pretrained constructor

* [FIX] [CP] Updated multi_hop parameter to get same results as in original implementation

* [DOC] Relevant parameter description in the configuration file

* [DOC] Updated doc and comments in main graphormer file

* [FIX] make style and quality checks

* [DOC] Fix doc format

* [FIX] [WIP] Updated part of the tests, though still a wip

* [FIX] [WIP]

* [FIX] repo consistency

* [FIX] Changed input names for clarity

* [FIX] [BUG] updated num_classes params for propagation in the model

* simplified collator

* [FIX] Updated tests to follow new naming pattern

* [TESTS] Updated test suite along with model

* [FIX] rm tokenizer import

* [DOC] add link to graphormerdoc

* Changed section in doc from text model to graph model

* Apply suggestions from code review

Spacing, inits

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* [DOC] Explain algos_graphormer functions

* Cython soft import protection

* Rm call to Callable in configuration graphormer

* [FIX] replaced asserts with Exceptions

* Add org to graphormer checkpoints

* Prefixed classes with Graphormer

* Management of init functions

* format

* fixes

* fix length file

* update indent

* relaunching ci

* Errors for missing cython imports

* fix style

* fix style doc

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-01-19 13:05:59 -05:00
Jitesh Jain 5b949623c7
Add OneFormer Model (#20577)
* Add Oneformer Model

* Add OneFormer Tests

* Add UNIVERSAL_SEGMENTATION_MAPPING

* Fix config

* 🐛 Fix error encountered while writing tests

* 🔨 Fix instance segmentation post processing

* Format Files and Add Documentation

* Add Documentation mdx file

* Run make fixup

* Run make fix-copies

* Remove unnecessary code

* Format modeling_oneformer.py

* Add OneFormer to ImageSegmentationPipeline

* Format files

* Add Demo link to Readme

* Fix formatting errors

* Fix test failures

* Update Table in index.mdx

* Fix version

* Fix style

* Remove OneFormer from TF

* Fix Imports

* Fix dummy objects

* Fix tests

* Add newline

* Remove OneFormerFeatureExtractor

* Remove CUDA Kernels

* Use AutoBackbone for Swin

* Fix description

* Use Image Processor

* Fix copies

* Fix formatting

* Fix import order

* Fix flake8 errors

* Fix doc errors

* Add Hindi Readme entry

* Update supported backbones

* Update supported backbones

* Undo Changes

* Fix type of config

* Fix isort

* Fix auto.mdx

* Fix swin config

* Replace DinatBackbone with AutoBackbone

* Use SwinBackbone

* Use SwinBackbone

* Fix conversion script

* Fix arguments

* Add argument description

* Fix style

* Add OneFormerProcessor

* Fix OneFormerProcessor Tests

* Fix mapping

* Fix imports

* Fix inits

* Fix style

* Fix comment

* Fix docstring

* Move OneFormer to MultiModal

* Fix Copies

* Remove size divisor

* Fix check_repo.py

* Fix copies

* Add Processor for Testing Pipeline

* Fix padding for tokens

* Fix variables

* Fix formatting with correct black version

* Add Image Processor Test

* Apply suggestions

* Revert common modeling

* Add check for task

* Fix conversion script

* Fix initialization order

* Fix tests

* Undo Pipeline Changes

* Fix layers in MLP

* Fix copies

* Update image paths

* Fix copies

* Apply suggestions
2023-01-19 09:31:07 +01:00
Alara Dirik 2411f0e465
Add Mask2Former (#20792)
* Adds Mask2Former to transformers

Co-authored-by: Shivalika Singh <shivalikasingh95@gmail.com>
Co-authored-by: Shivalika Singh <73357305+shivalikasingh95@users.noreply.github.com>
Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2023-01-16 20:37:07 +03:00
NielsRogge 4ed89d48ab
Add UperNet (#20648)
* First draft

* More improvements

* Add convnext backbone

* Add conversion script

* Add more improvements

* Comment out to_dict

* Add to_dict method

* Add default config

* Fix config

* Fix backbone

* Fix backbone some more

* Add docs, auto mapping, tests

* Fix some tests

* Fix more tests

* Fix more tests

* Add conversion script

* Improve conversion script

* Add support for getting reshaped undownsampled hidden states

* Fix forward pass

* Add print statements

* Comment out set_shift_and_window_size

* More improvements

* Correct downsampling layers conversion

* Fix style

* First draft

* Fix conversion script

* Remove config attribute

* Fix more tests

* Update READMEs

* Update ConvNextBackbone

* Fix ConvNext tests

* Align ConvNext with Swin

* Remove files

* Fix index

* Improve docs

* Add output_attentions to model forward

* Add backbone mixin, improve tests

* More improvements

* Update init_weights

* Fix interpolation of logits

* Add UperNetImageProcessor

* Improve image processor

* Fix image processor

* Remove print statements

* Remove script

* Update import

* Add image processor tests

* Remove print statements

* Fix test

* Add integration test

* Add convnext integration test

* Update docstring

* Fix README

* Simplify config

* Apply suggestions

* Improve docs

* Rename class

* Fix test_initialization

* Fix import

* Address review

* Fix config

* Convert all checkpoints

* Fix default backbone

* Use same processor as segformer

* Apply suggestions

* Fix init_weights, update conversion scripts

* Improve config

* Use Auto API instead of creating a new image processor

* Fix docs

* Add doctests

* Remove ResNetConfig dependency

* Add always_partition argument

* Fix rebase

* Improve docs

* Convert checkpoints

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Niels Rogge <nielsrogge@Nielss-MBP.localdomain>
2023-01-16 09:39:13 +01:00
Jongjyh ce85686a1f
Add AltCLIP (#20446)
* add altclip

* update

* fix wrong title

* fix the copyright in readme

* add altclip model

* add altclip

* fix test_gradient_checkpointing_enable_disable

* code

* add return class

* add projection_state

* "fix pretrained model bug"

* delete print and fix 2 test instances.

* delete token

* rm xlmr

* one model one file.

* empty commit to trigger CI

* Fix modeling_outputs.py

* Fix __init__

* Fix quality

* Fix modeling file docstring

* Fix README.md

* Fix test file

* add vision model

* empty commit to trigger CI

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* del token in mdx file

* fix

* fix

* fix

* remove altrob from test list

* add vision test

* fix fx

* fix

* fix

* fix

* trigger CI

* fix copies

* fix tests

* fix style

* fix quality

* update

* recover import

* recover

* add ,

* recover

* fix copies

* trigger CI

* fix

* some of review

* update

* remove import

* last 2

* fix

* fix style

* fix style

* fix bug

* fix uncomment

* fix

* update

* fix

* second review

* empty commit to trigger CI

* empty commit to trigger CI

* fix position

* fix

* empty commit to trigger CI

* empty commit to trigger CI

* third comment

* Update docs/source/en/model_doc/altclip.mdx

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update docs/source/en/model_doc/altclip.mdx

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update src/transformers/__init__.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update src/transformers/models/altclip/configuration_altclip.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update src/transformers/models/altclip/modeling_altclip.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update src/transformers/models/altclip/processing_altclip.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* Update src/transformers/models/altclip/modeling_altclip.py

Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>

* fix merge

* fix copies

* update

* update

* empty commit to trigger CI

* fix code example

* empty commit to trigger CI

* fix

* empty commit to trigger CI

* empty commit to trigger CI

Co-authored-by: shunxing1234 <xw747777271@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
Co-authored-by: shunxing1234 <33774367+shunxing1234@users.noreply.github.com>
Co-authored-by: Yih-Dar <2521628+ydshieh@users.noreply.github.com>
2023-01-04 09:18:57 +01:00
NielsRogge 9c6f7485a6
Add GIT (GenerativeImage2Text) (#20295)
* First draft

* Make model instantiation work

* Fix copied from statement

* More fixes

* Add correct output head

* Improve configuration

* Add conversion script

* Improve conversion script

* Remove token_type_ids

* Fix conversion of projection layers

* Convert all weights

* Use cats image

* Make logits match

* Generate caption on cats image

* Add GITProcessor

* Update conversion script

* Add support for more checkpoints

* Fix conversion script

* Add initial tests

* Remove cross-attention

* More improvements

* Remove is_decoder

* Improve model tests

* Improve tests

* Improve model outputs

* Fix model outputs equivalence

* Fix more tests

* Remove unused code

* Use generate to generate text, no use of cache for now

* Use generate more appropriately

* Fix config tests

* Fix style

* Add support for use_cache

Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>

* Fix style

* Fix GIT vision encoder

* Update README

* Fix integration test

* Set bos and eos token ids

* Improve docs

* Improve code

* Add support for provided attention_mask

* Add copied from statement

* Fix gradient checkpointing test

* Set model_input_names

* Investigate model_input_names

* Remove script

* Fix model inputs

* Fix docstring

* Rename GIT to Git

* Support more models

* Add support for textvqa model

* Add video support

* Extend conversion script for video

* Add support for large variant

* Add support for more models

* Fix config archive map

* Update integration test

* Fix README

* Fix CLIP mean and std

* Update processor

* Fix use_cache for video, thanks @gante

* Remove print statements

* Remove assertion

* Add processor tests

* Fix model_input_names

* Use Auto API for processor

* Fix processor tests

* Fix integration test

* Fix pipeline test

* Make tests faster

* Update conversion script

* Update conversion script

* Convert more checkpoints

* Update conversion script

* Fix typo

* Update docstrings

* Improve code snippets

* Fix doc tests

* Add more code examples

* Fix doc tests

* Add integration tests

* Fix unused variable

* revert

* Add GIT to Japanese README

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: Joao Gante <joaofranciscocardosogante@gmail.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2023-01-03 14:17:18 +01:00
Younes Belkada 0d284bd574
Add BLIP (#20716)
* add new model like

* add v1

* v1

* v1

* vision encoder logits match

* v2

* fix

* add docstring

* CI tests pass

* fix tests

* make fixup

* add to `toctree`

* fix processors

* fix processors

* fix doc

* fill title

* add content doc

* remove from tokenization auto

* fix config

* change order

* add `# Copied from`

* few fixes

- add correct license on modeling text
- remove dummy argument

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* replace name

* refactor a bit

* more refactor

* remove unused arg

* make fixup + remove some `# Adapted from ...`

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* more `# Copied from`

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* now `generate` supports no prefix

* remove `FeatureExtractor`

* fix path

* correct dependency

* fix tests

* few fixes

* add integration tests

* add correct conversion script

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* add `blip` to tokenization auto

* fix docstrings

* fix test + add image

* remove processor from incorrect place

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* clean up a bit

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* clean pixel mask

* clean pixel mask

* fix `F`

* Update src/transformers/models/blip/modeling_blip.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix output

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix pad token id

* remove `token_type_ids`

* make fixup

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* make fixup

* Apply suggestions from code review

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add comments

* Update src/transformers/models/blip/modeling_blip.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* remove `token_type_ids`

* make fixup

* better name

* replace with `image_attention_mask`

* refactor

* make fixup

* better docstring

* replace `answer_xx`

* remove unused args

* add `labels`

* add `labels`

* fix processing tests

* make fixup

* make fixup

* put correct repo

* remove `pad`

* remove `crop` and `center_crop`

* Update src/transformers/models/blip/image_processing_blip.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix

* remove `size_divisor`

* fix weights `init`

* remove unneeded functions

* add suggestions

* minor changes

- change slow test output for PT 1.13
- docstring order

* replace `feature_extractor` by `image_processor`

* fix doctests

* fix weight init order + add fp16 slow test

* add `blip` to doctest

* add correct repo name and fix test

* Update src/transformers/models/blip/processing_blip.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix tests

* use `convert_to_rgb` from `image_transforms`

* make fixup

* fix large loading issue

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-21 09:39:10 +01:00
Andreas Madsen b4b613b102
Implement Roberta PreLayerNorm (#20305)
* Copy RoBERTa

* formatting

* implement RoBERTa with prelayer normalization

* update test expectations

* add documentation

* add conversion script for DinkyTrain weights

* update checkpoint repo

Unfortunately the original checkpoints assume a hacked roberta model

* add RoBERTa-PreLayerNorm docs to toc

* run utils/check_copies.py

* lint files

* remove unused import

* fix check_repo reporting wrongly a test is missing

* fix import error, caused by rebase

* run make fix-copies

* add RobertaPreLayerNormConfig to ROBERTA_EMBEDDING_ADJUSMENT_CONFIGS

* Fix documentation <Facebook> -> Facebook

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* fixup: Fix documentation <Facebook> -> Facebook

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Add missing Flax header

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* expected_slice -> EXPECTED_SLICE

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* update copies after rebase

* add missing copied from statements

* make fix-copies

* make prelayernorm explicit in code

* fix checkpoint path for the original implementation

* add flax integration tests

* improve docs

* update utils/documentation_tests.txt

* lint files

* Remove Copyright notice

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* make fix-copies

* Remove EXPECTED_SLICE calculation comments

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-19 09:30:17 +01:00
NielsRogge 26dd041c6e
Add Swin2SR (#19784)
* First draft

* Add more improvements

* Improve forward pass

* Fix layernorm

* Add upscaler

* More improvements

* More improvements

* More improvements

* Improve conversion script

* Add preprocessing

* Make output match original implementation

* Add additional attributes

* Add support for more models

* Support more models

* Add support for real world sr

* Add initial Swin2SRFeatureExtractor

* Add ImageSuperResolutionOutput

* Make more tests pass

* Use BaseModelOutput

* Fix one more test

* Fix more tests

* Fix another test

* Fix all tests

* Rename to Swin2SRImageProcessor

* Fix toctree

* Fix toctree

* Fix rebase

* Improve Swin2SRImageProcessor

* Remove feature extractor file

* Improve model

* Improve conversion script

* Fix integration test

* Fix init

* Fix conversion script

* Address comments

* Improve upsampler

* Add NearestConvUpsampler

* Improve pixel shuffle upsampler

* Improve auxiliary upsampler

* Improve conversion script

* Rename conv_last to final_convolution

* Fix rebase

* Improve upsample module

* Add padding to image processor

* Fix bug

* Update padding

* Remove print statement and fix integration test

* Improve docs

* Add image processor tests

* Convert all checkpoints, fix tests

* Remove print statements

* Fix import

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-12-16 16:24:01 +01:00
Ariel Ekgren 5f94855dc3
Add gpt-sw3 model to transformers (#20209)
* Add templates for gpt-sw3

* Add templates for gpt-sw3

* Added sentencepiece tokenizer

* intermediate commit with many changes

* fixed conflicts

* Init commit for tokenization port

* Tokenization progress

* Remove fast tokenizer

* Clean up and rename spm.model -> spiece.model

* Remove TF -> PT conversion script template, Clean up Megatron -> PT script

* Optimize encode & decode performance

* added new attention

* added new attention

* attention for gpt-sw3 working

* attention good

* Cache is now working

* fixed attention mask so that it works with causal attention

* fixed baddbmm bug for cpu and caching

* updated config with correct parameters

* Refactor and leave optimizations as separate functions to avoid breaking expected functionality

* Fix special tokens mapping for both tokenizers

* cleaning up of code and comments

* HF compatible attention outputs

* Tokenizer now passing tests, add documentation

* Update documentation

* reverted back to base implementation after checking that it is identical to pretrained model

* updated gpt-sw3 config

* updated conversion script

* aligned parameters with gpt-sw3 config

* changed default scale_attn_by_inverse_layer_idx to true

* removed flag from conversion script

* added temporary model path

* reverted back to functioning convert script

* small changes to default config

* updated tests for gpt-sw3

* make style, make quality, minor cleanup

* Change local paths to testing online repository

* Change name: GptSw3 -> GPTSw3

* Remove GPTSw3TokenizerFast references

* Use official model repository and add more model sizes

* Added reference to 6.7b model

* Add GPTSw3DoubleHeadsModel to IGNORE_NON_AUTO_CONFIGURED, like GPT2DoubleHeadsModel

* Remove pointers to non-existing TFGPTSw3

* Add GPTSw3 to docs/_toctree.yml

* Remove TF artifacts from GPTSw3 in __init__ files

* Update READMEs with 'make fix-copies'

* Add 20b model to archive list

* Add documentation for GPT-Sw3

* Fix typo in documentation for GPT-Sw3

* Do 'make fix-copies' again after having updated docs

* Fix some typos in docs

* Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/configuration_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/__init__.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update tests/models/gpt_sw3/test_tokenization_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/modeling_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Resolve comments from PR feedback

* Resolve more comments from PR feedback, also set use_cache=True in convert script

* Add '# Copied from' comments for GPTSw3 modeling

* Set 'is_parallelizable = False'

* Remove '# Copied from' where code was modified and add 'with x->y' when appropriate

* Remove parallelize in mdx

* make style, make quality

* Update GPTSw3Config default values and corresponding documentation

* Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/__init__.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Clean up and protect GPTSw3Tokenizer imports with is_sentencepiece_available

* Make style, make quality

* Add dummy object for GPTSw3Tokenizer via 'make fix-copies'

* make fix-copies

* Remove GPTSw3 modeling classes

* make style, make quality

* Add GPTSw3 auto-mappings for other GPT2 heads

* Update docs/source/en/model_doc/gpt-sw3.mdx

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/convert_megatron_to_pytorch.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Update src/transformers/models/gpt_sw3/tokenization_gpt_sw3.py

Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>

* Remove old TODO-comment

* Add example usage to GPTSw3Tokenizer docstring

* make style, make quality

* Add implementation details and example usage to gpt-sw3.mdx

Co-authored-by: JoeyOhman <joeyoh@kth.se>
Co-authored-by: Arthur <48595927+ArthurZucker@users.noreply.github.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-12 13:12:13 -05:00
NielsRogge d151a8c550
Add BiT + ViT hybrid (#20550)
* First draft

* More improvements

* Add backbone, first draft of ViT hybrid

* Add AutoBackbone

* More improvements

* Fix bug

* More improvements

* More improvements

* Convert ViT-hybrid

* More improvements

* add patch bit

* Fix style

* Improve code

* cleaned v1

* more cleaning

* more refactoring

* Improve models, add tests

* Add docs and tests

* Make more tests pass

* Improve default backbone config

* Update model_type

* Fix more tests

* Add more copied from statements

* More improvements

* Add push to hub to conversion scripts

* clean

* more cleanup

* clean

* replace to

* fix

* Update src/transformers/models/bit/configuration_bit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* fix base model prefix

* more cleaning

* get rid of stem

* clean

* replace flag

* Update src/transformers/models/bit/configuration_bit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* Update src/transformers/models/bit/configuration_bit.py

Co-authored-by: NielsRogge <48327001+NielsRogge@users.noreply.github.com>

* add check

* another check

* fix for hybrid vit

* final fix

* update config

* fix class name

* fix `make fix-copies`

* remove `use_activation`

* Update src/transformers/models/bit/configuration_bit.py

* rm unneeded file

* Add BiT image processor

* rm unneeded file

* add doc

* Add image processor to conversion script

* Add ViTHybrid image processor

* Add resources

* Move bit to correct position

* Fix auto mapping

* Rename hybrid to Hybrid

* Fix name in toctree

* Fix READMEs

* Improve config

* Simplify GroupNormActivation layer

* fix test + make style

* Improve config

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* remove comment

* remove comment

* replace

* replace

* remove all conv_layer

* refactor norm_layer

* revert x

* add copied from

* last changes + integration tests

* make fixup

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix name

* fix message

* remove assert and refactor

* refactor + make fixup

* refactor - add + safety checker

* fix docstring + checkpoint names

* fix merge issues

* fix function name

* fix copies

* Apply suggestions from code review

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* fix model checkpoint

* fix doctest output

* vit name on doc

* fix name on doc

* fix small nits

* fixed integration tests

* final changes - slow tests pass

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
2022-12-07 11:03:39 +01:00
Sourab Mangrulkar 25e10da427
Adding anchor links to Hindi README (#20606) 2022-12-06 18:06:25 +05:30
Kamal Raj Kanakarajan 13e736685a
Add BioGPT (#20420)
* biogpt initial commit

* updated init

* fix faster decoding with use_cache

* 1. fix input_ids and input_embeds with correct device
2. added _keys_to_ignore_on_load_missing
3. updated prepare_inputs_for_generation

* add activation_dropout and scale_embedding

* replace fsmt attention with bart attention

* added test

* run make fix-copies

* doc init and fix build

* updated README with proper information

* 1. added tips to docs
2. updated BioGptTokenizer func

* 1. added tokenizer test
2. refactor tokenizer

* make fixup

* add biogpt fairseq to hf converter

* updated layer names more
similar to original checkpoints

* config update doc string and set defaults

* added "#copied" from bart model and
updated doc strings

* enable model_input_names in tokenizer

* 1.  positionalembedding depending on attention_mask
2. added attention mask to prepare for generation

* added test to verify past and generation

* BioGptLMHeadModel -> BioGptForCausalLM

* fix typo

* tokenization and test
Copyright and updated assertion

* updated Copyright and
one func at time in line

* Copyright updates and
minor doc fix

* replace assertion with ValueError

* rm extra space

* added code syntax

* revert cmnt position change

* add tokenizer to auto

* updated doc string

* tokenizer doc string update

* biogpt hub model update to microsoft/biogpt

* make fixup

* rm cmnt to fix flake8 5.0.4 vs 6 error
2022-12-05 10:12:03 -05:00
fatih cc3d0e1b01
[New Model] Add TimeSformer model (#18908)
* init timesformer

* apply fix-copies

* reformat style

* revert back some incorrect style updates

* init timesformer

* apply fix-copies

* reformat style

* revert back some incorrect style updates

* update timesformer doc

* add some functions and classes

* add new config params

* implement multiple classes

* update TimeSformerLayer

* update TimeSformerModel, TimeSformerPreTrainedModel, TimeSformerEncoder

* several fixes

* reformat

* temporary update

* fix some typos

* fix weight converter

* more fixes

* fix a typo

* fix typo

* remove redundant params

* fix for latest hf-hub

* merge fix

* fix some checks

* video classification works with einops

* add paper info to docs

* merge fix

* remove redundant line

* remove redundant docstring

* update config

* fix some typos

* fix converter

* update some test constants

* refactor einops functions

* reformat

* fix a comment

* remove redundant imports

* reformat

* fix a typo

* remove comment

* remove unused imports

* remove redundant doc line

* reformat

* add missing line

* fix docs

* fix timesformer auto feat ext

* add unittests

* reformat

* fix docs

* some fixes and updates

* fix readme

* fix modeling

* fix readme

* update index

* revert _toctree.yml changes

* update timseformer.mdx

* update drop_path_prob to drop_path_rate

* add docstring for drop_path_rate

* update TimeSformerPatchEmbed naming

* remove to_2tuple

* explicit use of nn.functional

* reformat

* many updates from review comments

* fix a typo

* reformat

* remove assert, better variable name

* make variable names more explicit

* add some adapted from

* more explicit variable names

* remove redundant docstring

* fix initialization

* move permute inside embedding

* update class names

* remove unused imports

* add test for video classification

* update PretrainedModel with PreTrainedModel

* remove double permute

* update based on sylvain's review

* apply auto fix

* update image_processing_auto for timesformer

* update hub urls

* reformat

* remove duplicate import

* update doc link
2022-12-02 09:13:25 +01:00
Sylvain Gugger 60d1f31bb0
v4.26.0.dev0 2022-12-01 16:19:33 -05:00
Yang An 721764028e
Add Chinese-CLIP implementation (#20368)
* init chinese-clip model from clip

* init model tests and docs

* implement chinese-clip into hf

* implement chinese-clip into hf

* implement chinese-clip into hf

* implement chinese-clip into hf

* implement chinese-clip into hf

* update usecase example in model implementation

* fix codestyle

* fix model_type typo in readme

* add placeholder in doc

* add placeholder in doc

* update the init script

* update usecase

* fix codestyle

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* update testcase

* forward the convert_rgb

* update testcase

* update testcase

* update testcase

* merge the recent update from clip about model_input_name property

* update the doc

* update the doc

* update the doc

* update the doc

* remove unused imports

* reformat code style

* update the doc

* fix isort style

* bypass a weird failed unit test which is unrelated to my PR

* update the doc

* implement independent vision config class

* implement independent vision model class

* fix refactor bug

* fix refactor bug

* fix refactor bug

* make style

* fix refactor bug

* make style

* fix refactor bug

* fix refactor bug

* make style

* fix refactor bug

* fix refactor bug

* doc-build restyle

* implement independent text config class

* implement independent text model class

* implement independent text model class

* make style

* make fix-copies

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* fix refactor bug

* make style

* update doc

* black and isort

* update doc

* Update src/transformers/models/chinese_clip/configuration_chinese_clip.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/auto/tokenization_auto.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* modify the model type from chinese-clip to chinese_clip

* format the example comment of ChineseCLIPVisionConfig

* correct the copyright comment

* fix the tokenizer specification

* add copied from for loss function

* remove unused class

* update CHINESE_CLIP_TEXT_INPUTS_DOCSTRING

* update CHINESE_CLIP_INPUTS_DOCSTRING

* update doc

* update doc

* update code comment in config

* update copied from statement

* make style

* rename the doc file

* add copied statement

* remove unused attention_mask, causal_attention_mask in ChineseCLIPVisionEncoder

* remove ChineseCLIPTextPreTrainedModel

* fix bug

* fix bug

* fix bug

* update doc

* make style

* Update src/transformers/models/chinese_clip/configuration_chinese_clip.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* Update src/transformers/models/chinese_clip/configuration_chinese_clip.py

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>

* update ChineseCLIPImageProcessor in image_processing_auto

* fix config_class of chinesecliptextmodel

* fix the test case

* update the docs

* remove the copied from comment for ChineseCLIPTextModel, since it has diverged from BertModel with a custom config_class

* update the testcase

* final fix

Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2022-11-30 19:22:23 +01:00
NielsRogge 4973d2a04c
Add Audio Spectogram Transformer (#19981)
* First draft

* Make conversion script work

* Add id2label mapping, run code quality

* Fix copies

* Add first draft of feature extractor

* Update conversion script to use feature extractor

* Make more tests pass

* Add docs

* update input_features to input_values + pad by default to max length

* Fix doc tests

* Add feature extractor tests

* Add proper padding/truncation to feature extractor

* Add support for conversion of all audioset checkpoints

* Improve docs and extend conversion script

* Fix README

* Rename spectogram to spectrogram

* Fix copies

* Add integration test

* Remove dummy conv

* Update to ast

* Update organization

* Fix init

* Rename model to AST

* Add require_torchaudio annotator

* Move import of ASTFeatureExtractor under a is_speech_available

* Fix rebase

* Add pipeline config

* Update name of classifier head

* Rename time_dimension and frequency_dimension for clarity

* Remove print statement

* Fix pipeline test

* Fix pipeline test

* Fix index table

* Fix init

* Fix conversion script

* Rename to ForAudioClassification

* Fix index table

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
2022-11-21 18:58:54 +01:00