* mc for new cross lingual sentence model
* fat text
* url spelling fix
* more url spelling fixes
* slight thanks change
* small improvements in text
* multilingual word xchange
* change colab link
* xval fold number
* add model links
* line break in model names
* Update README.md
* Update README.md
* new examples link
* new examples link
* add evaluation dataset name
* add more about multi lingual
* typo fix
* typo
* typos
* hyperparameter typos
* hyperparameter typo
* add metadata
* add metadata
* Update README.md
* typo fix
* Small improvement
* model card German Sentence Embeddings V2
- for German RoBERTa for Sentence Embeddings V2
- marked old as outdated
* small correction
* small improvement in description
* small spelling fix
* spelling fix
* add evaluation results
* spearman explanation
* add number of trials
* HerBERT transformer model for Polish language understanding.
* HerbertTokenizerFast generated with HerbertConverter
* Herbert base and large model cards
* Herbert model cards with tags
* Herbert tensorflow models
* Herbert model tests based on Bert test suite
* src/transformers/tokenization_herbert.py edited online with Bitbucket
* src/transformers/tokenization_herbert.py edited online with Bitbucket
* docs/source/model_doc/herbert.rst edited online with Bitbucket
* Herbert tokenizer tests and bug fixes
* src/transformers/configuration_herbert.py edited online with Bitbucket
* Copyrights and tests for TFHerbertModel
* model_cards/allegro/herbert-base-cased/README.md edited online with Bitbucket
* model_cards/allegro/herbert-large-cased/README.md edited online with Bitbucket
* Bug fixes after testing
* Reformat modified_only_fixup
* Proper order of configuration
* Herbert proper documentation formatting
* Formatting with make modified_only_fixup
* Dummies fixed
* Adding missing models to documentation
* Removing HerBERT model as it is a simple extension of BERT
* Update model_cards/allegro/herbert-base-cased/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Update model_cards/allegro/herbert-large-cased/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* HerbertTokenizer deprecated configuration removed
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* model card for bert-base-NER
* add meta data up top
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* [Model card] SinhalaBERTo model.
This is the model card for keshan/SinhalaBERTo model.
* Update model_cards/keshan/SinhalaBERTo/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
* Create README.md
Model description for all LEGAL-BERT models, published as part of "LEGAL-BERT: The Muppets straight out of Law School", Chalkidis et al., 2020, in Findings of EMNLP 2020
* Update model_cards/nlpaueb/legal-bert-base-uncased/README.md
Co-authored-by: Julien Chaumond <chaumond@gmail.com>
'The class `AutoModelWithLMHead` is deprecated and will be removed in a future version. Please use `AutoModelForCausalLM` for causal language models, `AutoModelForMaskedLM` for masked language models and `AutoModelForSeq2SeqLM` for encoder-decoder models.'
I don't know how to change the 'How to use this model directly from the 🤗/transformers library:' part, since it is not part of the model card
* configuration_squeezebert.py
thin wrapper around bert tokenizer
fix typos
wip sb model code
wip modeling_squeezebert.py. Next step is to get the multi-layer-output interface working
set up squeezebert to use BertModelOutput when returning results.
squeezebert documentation
formatting
allow head mask that is an array of [None, ..., None]
docs
docs cont'd
path to vocab
docs and pointers to cloud files (WIP)
line length and indentation
squeezebert model cards
formatting of model cards
untrack modeling_squeezebert_scratchpad.py
update aws paths to vocab and config files
get rid of stub of NSP code, and advise users to pretrain with mlm only
fix rebase issues
redo rebase of modeling_auto.py
fix issues with code formatting
more code format auto-fixes
move squeezebert before bert in tokenization_auto.py and modeling_auto.py because squeezebert inherits from bert
tests for squeezebert modeling and tokenization
fix typo
move squeezebert before bert in modeling_auto.py to fix inheritance problem
disable test_head_masking, since squeezebert doesn't yet implement head masking
fix issues exposed by the test_modeling_squeezebert.py
fix an issue exposed by test_tokenization_squeezebert.py
fix issue exposed by test_modeling_squeezebert.py
auto generated code style improvement
issue that we inherited from modeling_xxx.py: SqueezeBertForMaskedLM.forward() calls self.cls(), but there is no self.cls, and I think the goal was actually to call self.lm_head()
update copyright
resolve failing 'test_hidden_states_output' and remove unused encoder_hidden_states and encoder_attention_mask
docs
add integration test. rename squeezebert-mnli --> squeezebert/squeezebert-mnli
autogenerated formatting tweaks
integrate feedback from patrickvonplaten and sgugger to programming style and documentation strings
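
Note on the "move squeezebert before bert" commits above: a minimal sketch of why the order matters, assuming the auto mappings are resolved by walking an ordered mapping and returning the first entry whose key class matches via `isinstance`. All class names and the `resolve` helper below are hypothetical stand-ins for illustration, not the actual contents of tokenization_auto.py or modeling_auto.py.

```python
from collections import OrderedDict


# Hypothetical stand-ins: the subclass relationship is the only thing that matters here.
class BertLikeConfig:
    pass


class SqueezeBertLikeConfig(BertLikeConfig):
    pass


def resolve(config, mapping):
    """Return the first mapped value whose key class matches `config`."""
    for config_cls, target in mapping.items():
        if isinstance(config, config_cls):
            return target
    raise ValueError(f"Unrecognized config class: {type(config).__name__}")


# Parent listed first: the subclass instance matches the parent entry and never reaches its own.
PARENT_FIRST = OrderedDict(
    [(BertLikeConfig, "bert tokenizer"), (SqueezeBertLikeConfig, "squeezebert tokenizer")]
)

# Subclass listed first: each config resolves to its own entry.
SUBCLASS_FIRST = OrderedDict(
    [(SqueezeBertLikeConfig, "squeezebert tokenizer"), (BertLikeConfig, "bert tokenizer")]
)

wrong = resolve(SqueezeBertLikeConfig(), PARENT_FIRST)
right = resolve(SqueezeBertLikeConfig(), SUBCLASS_FIRST)
parent = resolve(BertLikeConfig(), SUBCLASS_FIRST)
```

With the parent first, the subclass config silently resolves to the parent's entry; putting the subclass first avoids that without changing how the parent resolves.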
* tiny change to order of imports
Two new pre-trained models, "vinai/bertweet-covid19-base-cased" and "vinai/bertweet-covid19-base-uncased", result from further pre-training the pre-trained model "vinai/bertweet-base" on a corpus of 23M COVID-19 English Tweets for 40 epochs.
* Add BERTweet and PhoBERT models
* Update modeling_auto.py
Re-add `bart` to LM_MAPPING
* Update tokenization_auto.py
Re-add `from .configuration_mobilebert import MobileBertConfig`
not sure why it's replaced by `from transformers.configuration_mobilebert import MobileBertConfig`
* Add BERTweet and PhoBERT to pretrained_models.rst
* Update tokenization_auto.py
Remove BertweetTokenizer and PhobertTokenizer from tokenization_auto.py (they are currently not supported by AutoTokenizer).
* Update BertweetTokenizer - without nltk
* Update model card for BERTweet
* PhoBERT - with Auto mode - without import fastBPE
* PhoBERT - with Auto mode - without import fastBPE
* BERTweet - with Auto mode - without import fastBPE
* Add PhoBERT and BERTweet to TF modeling auto
* Improve Docstrings for PhobertTokenizer and BertweetTokenizer
* Update PhoBERT and BERTweet model cards
* Fixed a merge conflict in tokenization_auto
* Used black to reformat BERTweet- and PhoBERT-related files
* Used isort to reformat BERTweet- and PhoBERT-related files
* Reformatted BERTweet- and PhoBERT-related files based on flake8
* Updated test files
* Updated test files
* Updated tf test files
* Updated tf test files
* Updated tf test files
* Updated tf test files
* Update commits from huggingface
* Delete unnecessary files
* Add tokenizers to auto and init files
* Add test files for tokenizers
* Revised model cards
* Update save_vocabulary function in BertweetTokenizer and PhobertTokenizer and test files
* Revised test files
* Update orders of Phobert and Bertweet tokenizers in auto tokenization file
* [model cards] ported allenai Deep Encoder, Shallow Decoder models
* typo
* fix references
* add allenai/wmt19-de-en-6-6 model cards
* fill-in the missing info for the build script as provided by the searcher.
* ready for PR
* cleanup
* correct FSMT_PRETRAINED_MODEL_ARCHIVE_LIST
* fix
* perfectionism
* revert change from another PR
* odd, already committed this one
* non-interactive upload workaround
* backup the failed experiment
* store langs in config
* workaround for localizing model path
* doc clean up as in https://github.com/huggingface/transformers/pull/6956
* style
* back out debug mode
* document: run_eval.py --num_beams 10
* remove unneeded constant
* typo
* re-use bart's Attention
* re-use EncoderLayer, DecoderLayer from bart
* refactor
* send to cuda and fp16
* cleanup
* revert (moved to another PR)
* better error message
* document run_eval --num_beams
* solve the problem of tokenizer finding the right files when model is local
* polish, remove hardcoded config
* add a note that the file is autogenerated to avoid losing changes
* prep for org change, remove unneeded code
* switch to model4.pt, update scores
* s/python/bash/
* missing init (but doesn't impact the finetuned model)
* cleanup
* major refactor (reuse-bart)
* new model, new expected weights
* cleanup
* cleanup
* full link
* fix model type
* merge porting notes
* style
* cleanup
* have to create a DecoderConfig object to handle vocab_size properly
* doc fix
* add note (not a public class)
* parametrize
* add bleu scores integration tests
* skip test if sacrebleu is not installed
* cache heavy models/tokenizers
* some tweaks
* remove tokens that aren't used
* more purging
* simplify code
* switch to using decoder_start_token_id
* add doc
* Revert "major refactor (reuse-bart)"
This reverts commit 226dad15ca.
* decouple from bart
* remove unused code #1
* remove unused code #2
* remove unused code #3
* update instructions
* clean up
* move bleu eval to examples
* check import only once
* move data+gen script into files
* reuse via import
* take less space
* add prepare_seq2seq_batch (auto-tested)
* cleanup
* recode test to use json instead of yaml
* ignore keys not needed
* use the new -y in transformers-cli upload -y
* [xlm tok] config dict: fix str into int to match definition (#7034)
* [s2s] --eval_max_generate_length (#7018)
* Fix CI with change of name of nlp (#7054)
* nlp -> datasets
* More nlp -> datasets
* Woopsie
* More nlp -> datasets
* One last
* extending to support allen_nlp wmt models
- allow a specific checkpoint file to be passed
- more arg settings
- scripts for allen_nlp models
* sync with changes
* s/fsmt-wmt/wmt/ in model names
* s/fsmt-wmt/wmt/ in model names (p2)
* s/fsmt-wmt/wmt/ in model names (p3)
* switch to a better checkpoint
* typo
* make non-optional args such - adjust tests where possible or skip when there is no other choice
* consistency
* style
* adjust header
* cards moved (model rename)
* use best custom hparams
* update info
* remove old cards
* cleanup
* s/stas/facebook/
* update scores
* s/allen_nlp/allenai/
* url maps aren't needed
* typo
* move all the doc / build /eval generators to their own scripts
* cleanup
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* Apply suggestions from code review
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
* fix indent
* duplicated line
* style
* use the correct add_start_docstrings
* oops
* resizing can't be done with the core approach, due to 2 dicts
* check that the arg is a list
* style
* style
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
ParsBERT v2.0 is a fine-tuned version of ParsBERT with a reconstructed vocabulary, and it can be used in other scopes as well!
It includes these features:
- We added some unused vocabulary entries for use in summarization and other scopes.
- We fine-tuned the model on a wide range of writing styles in the Persian language.