transformers

Commit Graph

Author	SHA1	Message	Date
Patrick von Platen	afc4ece462	[Generate] Facilitate PyTorch generate using `ModelOutputs` (#6735 ) * fix generate for GPT2 Double Head * fix gpt2 double head model * fix bart / t5 * also add for no beam search * fix no beam search * fix encoder decoder * simplify t5 * simplify t5 * fix t5 tests * fix BART * fix transfo-xl * fix conflict * integrating sylvains and sams comments * fix tf past_decoder_key_values * fix enc dec test	2020-09-01 12:38:25 +02:00
Sam Shleifer	8af1970e45	Fix marian slow test (#6854 )	2020-08-31 16:10:43 -04:00
Huang Lianzhe	2de7ee0385	Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (#6644 ) * add datacollator and dataset for next sentence prediction task * bug fix (numbers of special tokens & truncate sequences) * bug fix (+ dict inputs support for data collator) * add padding for nsp data collator; renamed cached files to avoid conflict. * add test for nsp data collator * Style Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>	2020-08-31 08:25:00 -04:00
Stas Bekman	563485bf95	[tests] fix typos in inputs (#6818 )	2020-08-30 18:19:57 +08:00
Sam Shleifer	0f58903bb6	Pegasus finetune script: add --adafactor (#6811 )	2020-08-29 17:43:32 -04:00
Sam Shleifer	3cac867fac	t5 model should make decoder_attention_mask (#6800 )	2020-08-28 15:22:33 -04:00
Sam Shleifer	20f7786453	Fix style (#6803 )	2020-08-28 15:02:25 -04:00
Sam Shleifer	9336086ab5	prepare_seq2seq_batch makes labels/ decoder_input_ids made later. (#6654 ) * broken test * batch parity * tests pass * boom boom * boom boom * split out bart tokenizer tests * fix tests * boom boom * Fixed dataset bug * Fix marian * Undo extra * Get marian working * Fix t5 tok tests * Test passing * Cleanup * better assert msg * require torch * Fix mbart tests * undo extra decoder_attn_mask change * Fix import * pegasus tokenizer can ignore src_lang kwargs * unused kwarg test cov * boom boom * add todo for pegasus issue * cover one word translation edge case * Cleanup * doc	2020-08-28 11:15:17 -04:00
RafaelWO	cb276b41de	Transformer-XL: Improved tokenization with sacremoses (#6322 ) * Improved tokenization with sacremoses * The TransfoXLTokenizer is now using sacremoses for tokenization * Added tokenization of comma-separated and floating point numbers. * Removed prepare_for_tokenization() from tokenization_transfo_xl.py because punctuation is handled by sacremoses * Added corresponding tests * Removed test comapring TransfoXLTokenizer and TransfoXLTokenizerFast * Added deprecation warning to TransfoXLTokenizerFast * isort change Co-authored-by: Teven <teven.lescao@gmail.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-08-28 09:56:17 -04:00
Stas Bekman	92ac2fa7d1	[transformers-cli] fix logger getter (#6777 )	2020-08-27 20:01:17 -04:00
Lysandre	42fddacd1c	Format	2020-08-27 18:31:51 +02:00
Stas Bekman	dbfe34f2f5	[test schedulers] adjust to test the first step's reading (#6429 ) * [test schedulers] small improvement * cleanup	2020-08-27 12:23:28 -04:00
Stas Bekman	e6b811f0a7	[testing] replace hardcoded paths to allow running tests from anywhere (#6523 ) * [testing] replace hardcoded paths to allow running tests from anywhere * fix the merge conflict	2020-08-27 12:22:18 -04:00
Nikolai Yakovenko	971d1802d0	Add AdaFactor optimizer from fairseq (#6722 ) * AdaFactor optimizer ported from fairseq. Tested for T5 finetuning and MLM -- reduced memory consumption compared to ADAM. * update PR fixes, add basic test * bug -- incorrect params in test * bugfix -- import Adafactor into test * bugfix -- removed accidental T5 include * resetting T5 to master * bugfix -- include Adafactor in __init__ * longer loop for adafactor test * remove double error class declare * lint * black * isort * Update src/transformers/optimization.py Co-authored-by: Sam Shleifer <sshleifer@gmail.com> * single docstring * Cleanup docstring Co-authored-by: Nikolai Y <nikolai.yakovenko@point72.com> Co-authored-by: Sam Shleifer <sshleifer@gmail.com>	2020-08-27 04:58:13 -04:00
Julien Chaumond	3242e4d942	[model_cards] Fix tiny typos	2020-08-26 23:16:06 +02:00
Patrick von Platen	858b7d5873	[TF Longformer] Improve Speed for TF Longformer (#6447 ) * add tf graph compile tests * fix conflict * remove more tf transpose statements * fix conflicts * fix comment typos * move function to class function * fix black * fix black * make style	2020-08-26 14:55:41 -04:00
Lysandre	a75c64d80c	Black 20 release	2020-08-26 17:20:22 +02:00
Lysandre Debut	77abd1e79f	Centralize logging (#6434 ) * Logging * Style * hf_logging > utils.logging * Address @thomwolf's comments * Update test * Update src/transformers/benchmark/benchmark_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Revert bad change Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-26 11:10:36 -04:00
Sam Shleifer	624495706c	T5Tokenizer adds EOS token if not already added (#5866 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-25 14:56:08 -04:00
Sam Shleifer	e11d923bfc	Fix pegasus-xsum integration test (#6726 )	2020-08-25 14:06:28 -04:00
Sylvain Gugger	abc0202194	More tests to Trainer (#6699 ) * More tests to Trainer * Add warning in the doc	2020-08-25 07:07:36 -04:00
Sylvain Gugger	a573777901	Update repo to isort v5 (#6686 ) * Run new isort * More changes * Update CI, CONTRIBUTING and benchmarks	2020-08-24 11:03:01 -04:00
Sam Shleifer	5bf4465e6c	Regression test for pegasus bugfix (#6606 )	2020-08-20 15:34:43 -04:00
sgugger	86c07e634f	One last threshold to raise	2020-08-20 14:23:09 -04:00
Sylvain Gugger	e8af90c052	Move threshold up for flaky test with Electra (#6622 ) * Move threshold up for flaky test with Electra * Update above as well	2020-08-20 13:59:40 -04:00
Patrick von Platen	505f2d749e	[Tests] fix attention masks in Tests (#6621 ) * fix distilbert * fix typo	2020-08-20 13:23:47 -04:00
Denisa Roberts	c9454507cf	Add tests for Reformer tokenizer (#6485 )	2020-08-20 18:58:44 +02:00
Sylvain Gugger	573bdb0a5d	Add tests to Trainer (#6605 ) * Add tests to Trainer * Test if removing long breaks everything * Remove ugly hack * Fix distributed test * Use float for number of epochs	2020-08-20 11:13:50 -04:00
Suraj Patil	7581884dee	[BartTokenizerFast] add prepare_seq2seq_batch (#6543 )	2020-08-19 10:37:48 -04:00
Patrick von Platen	8bcceaceff	fix model outputs test (#6593 )	2020-08-19 16:18:51 +02:00
Pradhy729	2a7402cbd3	Feed forward chunking others (#6365 ) * Feed forward chunking for Distilbert & Albert * Added ff chunking for many other models * Change model signature * Added chunking for XLM * Cleaned up by removing some variables. * remove test_chunking flag Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>	2020-08-19 14:31:10 +02:00
Patrick von Platen	fe0b85e77a	[EncoderDecoder] Add functionality to tie encoder decoder weights (#6538 ) * start adding tie encoder to decoder functionality * finish model tying * make style * Apply suggestions from code review * fix t5 list including cross attention * apply sams suggestions * Update src/transformers/modeling_encoder_decoder.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * add max depth break point Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-19 14:23:45 +02:00
Sam Shleifer	ab42d74850	Fix bart base test (#6587 )	2020-08-18 21:28:10 -04:00
Sam Shleifer	1529bf9680	add BartConfig.force_bos_token_to_be_generated (#6526 ) Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-18 19:15:50 -04:00
Sam Shleifer	12d7624199	[marian] converter supports models from new Tatoeba project (#6342 )	2020-08-17 23:55:42 -04:00
Suraj Patil	407da12ef1	[T5Tokenizer] add prepare_seq2seq_batch method (#6122 ) * tests	2020-08-17 13:57:19 -04:00
Suraj Patil	2a77813d53	[BartTokenizer] add prepare s2s batch (#6212 ) Co-authored-by: sgugger <sylvain.gugger@gmail.com>	2020-08-17 11:44:46 -04:00
Funtowicz Morgan	b41cc0b86a	Fix flaky ONNX tests (#6531 )	2020-08-17 09:04:35 -04:00
Kevin Canwen Xu	37709b5909	Remove deprecated assertEquals (#6532 ) `assertEquals` is deprecated: https://stackoverflow.com/questions/930995/assertequals-vs-assertequal-in-python/931011 This PR replaces these deprecated methods.	2020-08-17 17:13:58 +08:00
Masatoshi Suzuki	48c6c6139f	Support additional dictionaries for BERT Japanese tokenizers (#6515 ) * Update BERT Japanese tokenizers * Update CircleCI config to download unidic * Specify to use the latest dictionary packages	2020-08-17 12:00:23 +08:00
Patrick von Platen	1d6e71e116	[EncoderDecoder] Add Cross Attention for GPT2 (#6415 ) * add cross attention layers for gpt2 * make gpt2 cross attention work * finish bert2gpt2 * add explicit comments * remove attention mask since not yet supported * revert attn mask in pipeline * Update src/transformers/modeling_gpt2.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Update src/transformers/modeling_encoder_decoder.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2020-08-14 09:43:29 +02:00
Suraj Patil	680f1337c3	MBartForConditionalGeneration (#6441 ) * add MBartForConditionalGeneration * style * rebase and fixes * add mbart test in TEST_FILES_WITH_NO_COMMON_TESTS * fix docs * don't ignore mbart * doc * fix mbart fairseq link * put mbart before bart * apply doc suggestions	2020-08-14 03:21:16 -04:00
Lysandre Debut	f7cbc13db7	Test model outputs equivalence (#6445 ) * Test model outputs equivalence * Fix failing tests * From dict to kwargs * DistilBERT * Addressing @sgugger and @patrickvonplaten's comments	2020-08-13 11:59:35 -04:00
Stas Bekman	e983da0e7d	cleanup tf unittests: part 2 (#6260 ) * cleanup torch unittests: part 2 * remove trailing comma added by isort, and which breaks flake * one more comma * revert odd balls * part 3: odd cases * more ["key"] -> .key refactoring * .numpy() is not needed * more unncessary .numpy() removed * more simplification	2020-08-13 04:29:06 -04:00
Joe Davison	bc820476a5	add targets arg to fill-mask pipeline (#6239 ) * add targets arg to fill-mask pipeline * add tests and more error handling * quality * update docstring	2020-08-12 12:48:29 -04:00
Patrick von Platen	0735def8e1	[EncoderDecoder] Add encoder-decoder for roberta/ vanilla longformer (#6411 ) * add encoder-decoder for roberta * fix headmask * apply Sylvains suggestions * fix typo * Apply suggestions from code review	2020-08-12 18:23:30 +02:00
Sylvain Gugger	e9c3031463	Fixes to make life easier with the nlp library (#6423 ) * allow using tokenizer.pad as a collate_fn in pytorch * allow using tokenizer.pad as a collate_fn in pytorch * Add documentation and tests * Make attention mask the right shape * Better test Co-authored-by: Thomas Wolf <thomwolf@users.noreply.github.com>	2020-08-12 08:00:56 -04:00
Stas Bekman	ece0903e11	lr_schedulers: add get_polynomial_decay_schedule_with_warmup (#6361 ) * [wip] add get_polynomial_decay_schedule_with_warmup * style * add assert * change lr_end to a much smaller default number * check for exact equality * [model_cards] electra-base-turkish-cased-ner (#6350) * for electra-base-turkish-cased-ner * Add metadata Co-authored-by: Julien Chaumond <chaumond@gmail.com> * Temporarily de-activate TPU CI * Update modeling_tf_utils.py (#6372) fix typo: ckeckpoint->checkpoint * the test now works again (#6371) * correct pl link in readme (#6364) * refactor almost identical tests (#6339) * refactor almost identical tests * important to add a clear assert error message * make the assert error even more descriptive than the original bt * Small docfile fixes (#6328) * Patch models (#6326) * TFAlbertFor{TokenClassification, MultipleChoice} * Patch models * BERT and TF BERT info s * Update check_repo * Ci GitHub caching (#6382) * Cache Github Actions CI * Remove useless file * Colab button (#6389) * Add colab button * Add colab link for tutorials * Fix links for open in colab (#6391) * Update src/transformers/optimization.py consistently use lr_end=1e-7 default Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * [wip] add get_polynomial_decay_schedule_with_warmup * style * add assert * change lr_end to a much smaller default number * check for exact equality * Update src/transformers/optimization.py consistently use lr_end=1e-7 default Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * remove dup (leftover from merge) * convert the test into the new refactored format * stick to using the current_step as is, without ++ Co-authored-by: M. Yusuf Sarıgöz <yusufsarigoz@gmail.com> Co-authored-by: Julien Chaumond <chaumond@gmail.com> Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr> Co-authored-by: Alexander Measure <ameasure@gmail.com> Co-authored-by: Rohit Gupta <rohitgr1998@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2020-08-11 17:56:41 -04:00
Sam Shleifer	be1520d3a3	rename prepare_translation_batch -> prepare_seq2seq_batch (#6103 )	2020-08-11 15:57:07 -04:00
Sam Shleifer	66fa8ceaea	PegasusForConditionalGeneration (torch version) (#6340 ) Co-authored-by: Jingqing Zhang <jingqing.zhang15@imperial.ac.uk>	2020-08-11 14:31:23 -04:00

1 2 3 4 5 ...

505 Commits