* Add cache_dir for saving TextDataset features
This is for the case where the dataset lives on a read-only filesystem, which is
the case in tests (GKE TPU tests).
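For reference, a minimal usage sketch, assuming the `TextDataset` signature after this change (paths here are illustrative):

```python
from transformers import GPT2Tokenizer, TextDataset

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")

# With cache_dir set, cached features are written there instead of next to
# file_path, so this works even when the dataset sits on a read-only filesystem.
dataset = TextDataset(
    tokenizer=tokenizer,
    file_path="/ro-data/train.txt",       # hypothetical read-only location
    block_size=128,
    cache_dir="/tmp/text_dataset_cache",  # hypothetical writable location
)
```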
* style
* Introduce HPO checkpointing for PBT
* Moved checkpoint saving
* Fixed passing of the checkpoint subdirectory
* Fixed style
* Enable/disable checkpointing and check the conditions for the various Tune schedulers, including PBT
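A hedged sketch of how these pieces might fit together via the Ray Tune backend (`trainer` is an existing `Trainer`; the search space and mutation values are illustrative):

```python
from ray import tune
from ray.tune.schedulers import PopulationBasedTraining

# Illustrative search space; keys must match TrainingArguments fields.
def hp_space(trial):
    return {
        "learning_rate": tune.loguniform(1e-5, 1e-3),
        "per_device_train_batch_size": tune.choice([8, 16, 32]),
    }

scheduler = PopulationBasedTraining(
    time_attr="training_iteration",
    metric="objective",  # the Ray integration reports the objective under this key
    mode="max",
    perturbation_interval=1,
    hyperparam_mutations={"learning_rate": tune.loguniform(1e-5, 1e-3)},
)

# Extra kwargs are forwarded to ray.tune.run; checkpointing is what lets
# PBT clone and perturb trials mid-training.
best_run = trainer.hyperparameter_search(
    hp_space=hp_space,
    backend="ray",
    n_trials=8,
    scheduler=scheduler,
    keep_checkpoints_num=1,
)
```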
* Adjust number of GPUs to number of jobs
* Avoid model pickling in Ray
* Move hp search to integrations
* Only access loss tensor every logging_steps
* tensor.item() was being called every step. This must not be done for XLA:TPU
tensors, as it forces TPU<->CPU communication at every step and is terrible for
performance. On RoBERTa MLM, for example, this change reduces step time by 30%;
the gain should be larger for models/tasks with shorter step times.
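A minimal, runnable sketch of the pattern (plain torch stands in for the XLA device):

```python
import torch

device = torch.device("cpu")  # stands in for xm.xla_device() on TPU
logging_steps = 10

tr_loss = torch.tensor(0.0, device=device)  # running loss stays a tensor
logging_loss_scalar = 0.0

for step in range(100):
    loss = torch.rand(1, device=device).squeeze()  # stand-in for one step's loss
    tr_loss += loss.detach()                       # tensor add: no device<->host sync

    if (step + 1) % logging_steps == 0:
        # .item() forces device<->host communication, so call it only here.
        tr_loss_scalar = tr_loss.item()
        print({"loss": (tr_loss_scalar - logging_loss_scalar) / logging_steps})
        logging_loss_scalar = tr_loss_scalar
```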
* The train batch size was not correct when a user passed the
`per_gpu_train_batch_size` flag
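The fix amounts to scaling by the device count; a sketch:

```python
# per_gpu_train_batch_size is per device, so the effective train batch size
# must be multiplied by the number of GPUs in use (1 on CPU).
per_gpu_train_batch_size = 8
n_gpu = 4
train_batch_size = per_gpu_train_batch_size * max(1, n_gpu)  # 32, not 8
```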
* Average-reduce the loss across eval shards
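A hedged sketch of that reduction with torch.distributed (the Trainer's own helper may differ; this requires an initialized process group):

```python
import torch
import torch.distributed as dist

def mean_across_shards(loss: torch.Tensor) -> torch.Tensor:
    # Sum the per-shard mean losses, then divide by the world size so every
    # process ends up holding the global average.
    loss = loss.clone()
    dist.all_reduce(loss, op=dist.ReduceOp.SUM)
    return loss / dist.get_world_size()
```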
* Fix style (#6803)
* t5 model should make decoder_attention_mask (#6800)
* [s2s] Test hub configs in self-scheduled CI (#6809)
* [s2s] round runtime in run_eval (#6798)
* Pegasus finetune script: add --adafactor (#6811)
* [bart] rename self-attention -> attention (#6708)
* [tests] fix typos in inputs (#6818)
* Fixed the "Open in Colab" link (#6825)
* Add model card for singbert lite. Update widget for singbert and singbert-large. (#6827)
* BR_BERTo model card (#6793)
* clearly indicate shuffle=False (#6312)
* Clarify shuffle
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
* [s2s README] Add more dataset download instructions (#6737)
* Style
* Patch logging issue
* Set default logging level to `WARNING` instead of `INFO`
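Users who want the old verbosity back can opt in explicitly; a minimal example, assuming the library's centralized logging utilities:

```python
from transformers import logging

# The default is now WARNING; restore the previous INFO-level output:
logging.set_verbosity_info()

# Or switch back to the new default:
logging.set_verbosity_warning()
```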
* TF Flaubert w/ pre-norm (#6841)
* Dataset and DataCollator for BERT Next Sentence Prediction (NSP) task (#6644)
* add data collator and dataset for the next sentence prediction task
* bug fix (number of special tokens & sequence truncation)
* bug fix (+ dict input support for the data collator)
* add padding to the NSP data collator; renamed cached files to avoid conflicts
* add test for the NSP data collator
* Style
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
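A hedged usage sketch of the new dataset and collator (argument defaults are approximate; the corpus path is hypothetical):

```python
from transformers import (
    BertTokenizer,
    DataCollatorForNextSentencePrediction,
    TextDatasetForNextSentencePrediction,
)

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

# Expects one sentence per line, with blank lines separating documents.
dataset = TextDatasetForNextSentencePrediction(
    tokenizer=tokenizer,
    file_path="corpus.txt",  # hypothetical path
    block_size=128,
)

# Pads each batch and builds the NSP (and, with mlm=True, the MLM) labels.
data_collator = DataCollatorForNextSentencePrediction(
    tokenizer=tokenizer,
    mlm=True,
    block_size=128,
)
```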
* Fix in Adafactor docstrings (#6845)
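For reference, a sketch of the two documented ways to configure the learning rate (treat exact defaults as approximate):

```python
import torch
from transformers.optimization import Adafactor

model = torch.nn.Linear(4, 2)  # stand-in for any model

# Time-dependent internal learning rate (relative_step=True requires lr=None):
optimizer = Adafactor(model.parameters(), lr=None, relative_step=True, warmup_init=True)

# Or a fixed external learning rate:
optimizer = Adafactor(model.parameters(), lr=1e-3, relative_step=False, scale_parameter=False)
```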
* Fix resuming training for Windows (#6847)
Co-authored-by: Sam Shleifer <sshleifer@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Thomas Ashish Cherian <6967017+PandaWhoCodes@users.noreply.github.com>
Co-authored-by: Zane Lim <zyuanlim@gmail.com>
Co-authored-by: Rodolfo De Nadai <rdenadai@gmail.com>
Co-authored-by: xujiaze13 <37360975+xujiaze13@users.noreply.github.com>
Co-authored-by: Kevin Canwen Xu <canwenxu@126.com>
Co-authored-by: Lysandre <lysandre.debut@reseau.eseo.fr>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Huang Lianzhe <hlz@pku.edu.cn>
Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>