transformers

Commit Graph

Author	SHA1	Message	Date
Patrick von Platen	9b90810558	[Flax] Dataset streaming example (#12470 ) * fix_torch_device_generate_test * remove @ * upload * finish dataset streaming * adapt readme * finish * up * up * up * up * Apply suggestions from code review * finish * make style * make style2 * finish Co-authored-by: Patrick von Platen <patrick@huggingface.co>	2021-07-05 15:13:10 +01:00
Navjot	eceb1042c1	flax.linen.apply takes state as the first param, followed by the input (#12510 )	2021-07-05 19:33:14 +05:30
Suraj Patil	f1c81d6b92	[Flax] ViT training example (#12300 ) * begin script * clean example, add readme * update readme * remove decay mask * remove masking * update readme & make flake happy	2021-07-05 18:23:03 +05:30
Akmal	e799e0f1ed	[Flax] Fix wav2vec2 pretrain arguments (#12498 )	2021-07-05 13:35:20 +01:00
sadakmed	0e1718afb6	create LxmertModelIntegrationTest Pytorch (#9989 ) * create LxmertModelIntegrationTest * implementation using numpy seeding to fix inputs params. * fix code quality * isort check	2021-07-05 05:21:25 -04:00
Suraj Patil	23ab0b6980	[examples/flax] clip style image-text training example (#12491 ) * clip style example * fix post init * add requirements * update readme, few small fixes	2021-07-05 13:26:44 +05:30
Lysandre Debut	89a8739f0c	Add `Repository` import to the FLAX example script (#12501 )	2021-07-05 03:51:11 -04:00
Patrick von Platen	2df63282e0	Update README.md	2021-07-04 13:16:29 +01:00
Omar Sanseviero	a76eebfc80	Add guide on how to build demos for the Flax sprint (#12468 )	2021-07-02 20:35:17 +02:00
Patrick von Platen	b21905e03d	Update README.md	2021-07-02 14:12:47 +01:00
Patrick von Platen	d24a523130	Update README.md	2021-07-02 13:41:14 +01:00
Patrick von Platen	e3fce2f868	Update README.md Thanks a lot @BirgerMoell	2021-07-02 12:12:54 +01:00
Lysandre Debut	b889d3f6c4	Fix TAPAS test uncovered by #12446 (#12480 )	2021-07-02 04:35:10 -04:00
Matthew LeMay	b4ecc6bef2	fixed typo in flax-projects readme (#12466 )	2021-07-02 12:27:39 +05:30
Sylvain Gugger	e52288a140	Rework notebooks and move them to the Notebooks repo (#12471 )	2021-07-02 02:29:51 -04:00
Stas Bekman	2d1d92181a	[roberta] fix lm_head.decoder.weight ignore_key handling (#12446 ) * fix lm_head.decoder.weight ignore_key handling * fix the mutable class variable * Update src/transformers/models/roberta/modeling_roberta.py Co-authored-by: Lysandre Debut <lysandre@huggingface.co> * replicate the comment * make deterministic Co-authored-by: Lysandre Debut <lysandre@huggingface.co>	2021-07-01 10:31:19 -07:00
Teven	7f0027db30	Fixing bug with param count without embeddings (#12461 ) * fixing bug with param count without embeddings * Update src/transformers/modeling_utils.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * style Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-07-01 13:25:40 -04:00
Souvic Chakraborty	d5b8fe3b90	Validation split added: custom data files @sgugger, @patil-suraj (#12407 ) * Validation split added: custom data files Validation split added in case of no validation file and loading custom data * Updated documentation with custom file usage Updated documentation with custom file usage * Update README.md * Update README.md * Update README.md * Made some suggested stylistic changes * Used logger instead of print. Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Made similar changes to add validation split In case of a missing validation file, a validation split will be used now. * max_train_samples to be used for training only max_train_samples got misplaced, now corrected so that it is applied on training data only, not whole data. * styled * changed ordering * Improved language of documentation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Improved language of documentation Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Fixed styling issue * Update run_mlm.py Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-07-01 13:22:42 -04:00
Thibault FEVRY	f929462b25	Import check_inits handling of duplicate definitions. (#12467 ) * Import fix_inits handling of duplicate definitions. * Style fix	2021-07-01 12:52:00 -04:00
Patrick von Platen	7f87bfc910	Add TPU README (#12463 ) * Add TPU README * Apply suggestions from code review * Update examples/research_projects/jax-projects/README.md * Update examples/research_projects/jax-projects/README.md Co-authored-by: Stefan Schweter <stefan@schweter.it> Co-authored-by: Stefan Schweter <stefan@schweter.it>	2021-07-01 17:11:54 +01:00
Patrick von Platen	1457839fc5	Update README.md	2021-07-01 15:52:11 +01:00
Suzana Ilić	c18af5d40c	Added talk details (#12465 )	2021-07-01 16:19:23 +02:00
Jin Young (Daniel) Sohn	6c5b20aa09	Fix training_args.py barrier for torch_xla (#12464 ) torch_xla currently has its own synchronization primitives, so use xm.rendezvous(tag) instead.	2021-07-01 10:17:38 -04:00
Lysandre Debut	2a501ac954	Comment fast GPU TF tests (#12452 )	2021-07-01 09:26:46 -04:00
Patrick von Platen	27d348f2fe	[Wav2Vec2, Hubert] Fix ctc loss test (#12458 ) * fix_torch_device_generate_test * remove @ * fix test	2021-07-01 08:59:32 -04:00
Patrick von Platen	b655f16d4e	[Flax community event] How to use hub during training (#12447 ) * fix_torch_device_generate_test * remove @ * upload * finish doc * Apply suggestions from code review Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Chaumond <chaumond@gmail.com> * finish Co-authored-by: Omar Sanseviero <osanseviero@users.noreply.github.com> Co-authored-by: Lysandre Debut <lysandre@huggingface.co> Co-authored-by: Julien Chaumond <chaumond@gmail.com>	2021-07-01 11:41:22 +01:00
SaulLu	3aa37b945e	Add test for a WordLevel tokenizer model (#12437 ) * add a test for a WordLevel tokenizer * adapt common test to new tokenizer	2021-07-01 12:37:07 +02:00
Patrick von Platen	0d1f67e651	[Flax] Add wav2vec2 (#12271 ) * fix_torch_device_generate_test * remove @ * start flax wav2vec2 * save intermediate * forward pass has correct shape * add weight norm * add files * finish ctc * make style * finish gumbel quantizer * correct docstrings * correct some more files * fix vit * finish quality * correct tests * correct docstring * correct tests * start wav2vec2 pretraining script * save intermediate * start pretraining script * finalize pretraining script * finish * finish * small typo * finish * correct * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com> * make style * push Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: Suraj Patil <surajp815@gmail.com>	2021-06-30 18:44:23 +01:00
Suraj Patil	3f36a2c064	[JAX/Flax readme] add philosophy doc (#12419 ) * add philosophy doc * fix typos * update doc * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> * address Patricks suggestions * add a training example and fix typos * jit the training step * jit train step * fix example code * typo * Apply suggestions from code review Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com> Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-30 21:40:12 +05:30
Suzana Ilić	1ad1c4a864	Add to talks section (#12442 )	2021-06-30 16:58:03 +02:00
fcakyon	42477d68fa	fix typo in mt5 configuration docstring (#12432 )	2021-06-30 15:24:06 +01:00
Lysandre	89073a95ba	Document patch release v4.8.2	2021-06-30 14:39:52 +02:00
NielsRogge	6e68597877	Add CANINE (#12024 ) * First pass * More progress * Add support for local attention * More improvements * More improvements * Conversion script working * Add CanineTokenizer * Make style & quality * First draft of integration test * Remove decoder test * Improve tests * Add documentation * Mostly docs improvements * Add CanineTokenizer tests * Fix most tests on GPU, improve upsampling projection * Address most comments by @dhgarrette * Remove decoder logic * Improve Canine tests, improve docs of CanineConfig * All tokenizer tests passing * Make fix-copies and fix tokenizer tests * Fix test_model_outputs_equivalence test * Apply suggestions from @sgugger's review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Address some more comments * Add support for hidden_states and attentions of shallow encoders * Define custom CanineModelOutputWithPooling, tests pass * First pass * More progress * Add support for local attention * More improvements * More improvements * Conversion script working * Add CanineTokenizer * Make style & quality * First draft of integration test * Remove decoder test * Improve tests * Add documentation * Mostly docs improvements * Add CanineTokenizer tests * Fix most tests on GPU, improve upsampling projection * Address most comments by @dhgarrette * Remove decoder logic * Improve Canine tests, improve docs of CanineConfig * All tokenizer tests passing * Make fix-copies and fix tokenizer tests * Fix test_model_outputs_equivalence test * Apply suggestions from @sgugger's review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * Address some more comments * Make conversion script work for Canine-c too * Fix tokenizer tests * Remove file Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-30 08:05:44 -04:00
Jabin Huang	69f570156e	Add default bos_token and eos_token for tokenizer of deberta_v2 (#12429 ) * fix ids_to_tokens naming error in tokenizer of deberta v2 * Update tokenization_deberta_v2.py Add bos_token and eos_token. * format code Co-authored-by: Jipeng Huang <jihuan@microsoft.com>	2021-06-30 08:03:58 -04:00
Sylvain Gugger	c9486fd0f5	Fix default bool in argparser (#12424 ) * Fix default bool in argparser * Add more to test	2021-06-30 07:57:05 -04:00
Suzana Ilić	90d69456eb	Added to talks section (#12433 ) Added one more confirmed speaker, zoom links and gcal event links	2021-06-30 13:14:11 +02:00
Sylvain Gugger	31a8110918	Add option to save on each training node (#12421 ) * Add option to save on each training node * Apply suggestions from code review Co-authored-by: Stas Bekman <stas00@users.noreply.github.com> * Address review comments Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>	2021-06-30 02:41:47 -04:00
Stas Bekman	990540b72d	[modelcard] fix (#12422 ) this PR is fixing an incorrect attribute - probably some tests are needed?	2021-06-29 17:59:03 -04:00
Sylvain Gugger	dc42e770b8	Easily train a new fast tokenizer from a given one (#12361 ) * [WIP] Easily train a new fast tokenizer from a given one * Fix test * Roll out to other tokenizers and add tests * Fix bug with unk id and add emoji to test * Really use something different in test * Implement special tokens map * Map special tokens in the Transformers tokenizers * Fix test * Make test more robust * Fix test for BPE * More robust map and test Co-authored-by SaulLu * Test file * Stronger tests Co-authored-by: SaulLu <lucilesaul.com@gmail.com> * Map unk token for Wordpiece and address review comment * Fix lowercase test and address review comment * Fix all tests * Simplify test * Fix tests for realsies * Easily train a new fast tokenizer from a given one - tackle the special tokens format (str or AddedToken) (#12420) * Propose change in tests regarding lower case * add new test for special tokens types * put back the test part about decoding * add feature: the AddedToken is re-build with the different mapped content * Address review comment: simplify AddedToken building Co-authored-by: sgugger <sylvain.gugger@gmail.com> * Update src/transformers/tokenization_utils_fast.py Co-authored-by: sgugger <sylvain.gugger@gmail.com> Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> Co-authored-by: SaulLu <lucilesaul.com@gmail.com> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>	2021-06-29 15:00:08 -04:00
Suzana Ilić	b440b8d1ce	Added talks (#12415 )	2021-06-29 16:01:16 +01:00
Shamane Siri	5257818e68	minor fixes in original RAG training (#12395 )	2021-06-29 13:39:48 +01:00
Jabin Huang	e3f39a2952	fix ids_to_tokens naming error in tokenizer of deberta v2 (#12412 ) Co-authored-by: Jipeng Huang <jihuan@microsoft.com>	2021-06-29 08:15:35 -04:00
Patrick von Platen	813328682e	[Flax] Example scripts - correct weight decay (#12409 ) * fix_torch_device_generate_test * remove @ * finish * finish * correct style	2021-06-29 12:01:08 +01:00
Suraj Patil	aecae53377	[example/flax] add summarization readme (#12393 ) * add readme * update readme and add requirements * Update examples/flax/summarization/README.md Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>	2021-06-29 14:02:33 +05:30
Will Rice	3886104574	Fix TFWav2Vec2 SpecAugment (#12289 ) * Fix TFWav2Vec2 SpecAugment * Invert masks * Feedback changes	2021-06-29 09:15:57 +01:00
Will Rice	bc084938f2	Add out of vocabulary error to ASR models (#12288 ) * Add OOV error to ASR models * Feedback changes	2021-06-29 08:57:46 +01:00
NielsRogge	1fc6817a30	Rename detr targets to labels (#12280 ) * Rename target to labels in DetrFeatureExtractor * Update DetrFeatureExtractor tests accordingly * Improve docs of DetrFeatureExtractor * Improve docs * Make style	2021-06-29 03:07:46 -04:00
Stas Bekman	7682e97702	[models] respect dtype of the model when instantiating it (#12316 ) * [models] respect dtype of the model when instantiating it * cleanup * cleanup * rework to handle non-float dtype * fix * switch to fp32 tiny model * improve * use dtype.is_floating_point * Apply suggestions from code review Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com> * fix the doc * recode to use explicit torch_dtype_auto_detect, torch_dtype args * docs and tweaks * docs and tweaks * docs and tweaks * merge 2 args, add docs * fix * fix * better doc * better doc Co-authored-by: Sylvain Gugger <35901082+sgugger@users.noreply.github.com>	2021-06-28 20:11:21 -07:00
Patrick von Platen	31c3e7e75b	[Flax] Add T5 pretraining script (#12355 ) * fix_torch_device_generate_test * remove @ * add length computatan * finish masking * finish * upload * fix some bugs * finish * fix dependency table * correct tensorboard * Apply suggestions from code review * correct processing * slight change init * correct some more mistakes * apply suggestions * improve readme * fix indent * Apply suggestions from code review Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com> * correct tokenizer * finish * finish * finish * finish Co-authored-by: Patrick von Platen <patrick@huggingface.co> Co-authored-by: SaulLu <55560583+SaulLu@users.noreply.github.com>	2021-06-28 20:11:29 +01:00
Stas Bekman	e277074889	pass the matching trainer log level to deepspeed (#12401 )	2021-06-28 11:43:24 -07:00

7492 Commits

All Branches