Commit Graph

18 Commits

Author SHA1 Message Date
Lysandre a75c64d80c Black 20 release 2020-08-26 17:20:22 +02:00
Sylvain Gugger a573777901
Update repo to isort v5 (#6686)
* Run new isort

* More changes

* Update CI, CONTRIBUTING and benchmarks
2020-08-24 11:03:01 -04:00
Patrick von Platen 505f2d749e
[Tests] fix attention masks in Tests (#6621)
* fix distilbert

* fix typo
2020-08-20 13:23:47 -04:00
Pradhy729 2a7402cbd3
Feed forward chunking others (#6365)
* Feed forward chunking for Distilbert & Albert

* Added ff chunking for many other models

* Change model signature

* Added chunking for XLM

* Cleaned up by removing some variables.

* remove test_chunking flag

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-08-19 14:31:10 +02:00
Pradhy729 b25cec13c5
Feed forward chunking (#6024)
* Chunked feed forward for Bert

This is an initial implementation to test applying feed forward chunking for BERT.
Will need additional modifications based on output and benchmark results.

* Black and cleanup

* Feed forward chunking in BertLayer class.

* Isort

* add chunking for all models

* fix docs

* Fix typo

Co-authored-by: patrickvonplaten <patrick.v.platen@gmail.com>
2020-08-11 03:12:45 -04:00
Patrick von Platen 7f65daa2e1
fix reformer fp16 (#6237) 2020-08-04 13:02:25 +02:00
Stas Bekman 5deed37f9f
cleanup torch unittests (#6196)
* improve unit tests

this is a sample of one test according to the request in https://github.com/huggingface/transformers/issues/5973
before I apply it to the rest

* batch 1

* batch 2

* batch 3

* batch 4

* batch 5

* style

* non-tf template

* last deletion of check_loss_output
2020-08-04 02:42:56 -04:00
Sylvain Gugger d951c14ae4
Model output test (#6155)
* Use return_dict=True in all tests

* Formatting
2020-07-31 09:44:37 -04:00
Patrick von Platen 9d37c56bab
[Reformer] - Cache hidden states and buckets to speed up inference (#5578)
* fix merge rebase

* add intermediate reformer code

* save intermediate caching results

* save intermediate

* save intermediate results

* save intermediate

* upload next step

* fix generate tests

* make tests work

* add named tuple output

* Apply suggestions from code review

* fix use_cache for False case

* fix tensor to gpu

* fix tensor to gpu

* refactor

* refactor and make style
2020-07-17 16:17:42 +02:00
as-stevens f867000f56
[Reformer classification head] Implement the reformer model classification head for text classification (#5198)
* Reformer model head classification implementation for text classification

* Reformat the reformer model classification code

* PR review comments, and test case implementation for reformer for classification head changes

* CI/CD reformer for classification head test import error fix

* CI/CD test case implementation  added ReformerForSequenceClassification to all_model_classes

* Code formatting- fixed

* Normal test cases added for reformer classification head

* Fix test cases implementation for the reformer classification head

* removed token_type_id parameter from the reformer classification head

* fixed the test case for reformer classification head

* merge conflict with master fixed

* merge conflict, changed reformer classification to accept the choice_label parameter added in latest code

* refactored the the reformer classification head test code

* reformer classification head, common transform test cases fixed

* final set of the review comment, rearranging the reformer classes and docstring add to classification forward method

* fixed the compilation error and text case fix for reformer classification head

* Apply suggestions from code review

Remove unnecessary dup

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2020-07-14 09:16:22 +02:00
Patrick von Platen 989ae326b5
[Reformer] Adapt Reformer MaskedLM Attn mask (#5560)
* fix attention mask

* fix slow test

* refactor attn masks

* fix fp16 generate test
2020-07-07 10:48:06 +02:00
Patrick von Platen d16e36c7e5
[Reformer] Add Masked LM Reformer (#5426)
* fix conflicts

* fix

* happy rebasing
2020-07-01 22:43:18 +02:00
Patrick von Platen fe81f7d12c
finish reformer qa head (#5433) 2020-07-01 12:27:14 -04:00
Sam Shleifer 13deb95a40
Move tests/utils.py -> transformers/testing_utils.py (#5350) 2020-07-01 10:31:17 -04:00
Patrick von Platen 9ca485734a
[Reformer] Improved memory if input is shorter than chunk length (#4720)
* improve handling of short inputs for reformer

* correct typo in assert statement

* fix other tests
2020-06-02 23:08:39 +02:00
Julien Chaumond d4c2cb402d
Kill model archive maps (#4636)
* Kill model archive maps

* Fixup

* Also kill model_archive_map for MaskedBertPreTrainedModel

* Unhook config_archive_map

* Tokenizers: align with model id changes

* make style && make quality

* Fix CI
2020-06-02 09:39:33 -04:00
Julien Chaumond 4c06893610
Fix nn.DataParallel compatibility in PyTorch 1.5 (#4300)
* Test case for #3936

* multigpu tests pass on pytorch 1.4.0

* Fixup

* multigpu tests pass on pytorch 1.5.0

* Update src/transformers/modeling_utils.py

* Update src/transformers/modeling_utils.py

* rename multigpu to require_multigpu

* mode doc
2020-05-18 20:34:50 -04:00
Patrick von Platen dca34695d0
Reformer (#3351)
* first copy & past commit from Bert and morgans LSH code

* add easy way to compare to trax original code

* translate most of function

* make trax lsh self attention deterministic with numpy seed + copy paste code

* add same config

* add same config

* make layer init work

* implemented hash_vectors function for lsh attention

* continue reformer translation

* hf LSHSelfAttentionLayer gives same output as trax layer

* refactor code

* refactor code

* refactor code

* refactor

* refactor + add reformer config

* delete bogus file

* split reformer attention layer into two layers

* save intermediate step

* save intermediate step

* make test work

* add complete reformer block layer

* finish reformer layer

* implement causal and self mask

* clean reformer test and refactor code

* fix merge conflicts

* fix merge conflicts

* update init

* fix device for GPU

* fix chunk length init for tests

* include morgans optimization

* improve memory a bit

* improve comment

* factorize num_buckets

* better testing parameters

* make whole model work

* make lm model work

* add t5 copy paste tokenizer

* add chunking feed forward

* clean config

* add improved assert statements

* make tokenizer work

* improve test

* correct typo

* extend config

* add complexer test

* add new axial position embeddings

* add local block attention layer

* clean tests

* refactor

* better testing

* save intermediate progress

* clean test file

* make shorter input length work for model

* allow variable input length

* refactor

* make forward pass for pretrained model work

* add generation possibility

* finish dropout and init

* make style

* refactor

* add first version of RevNet Layers

* make forward pass work and add convert file

* make uploaded model forward pass work

* make uploaded model forward pass work

* refactor code

* add namedtuples and cache buckets

* correct head masks

* refactor

* made reformer more flexible

* make style

* remove set max length

* add attention masks

* fix up tests

* fix lsh attention mask

* make random seed optional for the moment

* improve memory in reformer

* add tests

* make style

* make sure masks work correctly

* detach gradients

* save intermediate

* correct backprob through gather

* make style

* change back num hashes

* rename to labels

* fix rotation shape

* fix detach

* update

* fix trainer

* fix backward dropout

* make reformer more flexible

* fix conflict

* fix

* fix

* add tests for fixed seed in reformer layer

* fix trainer typo

* fix typo in activations

* add fp16 tests

* add fp16 training

* support fp16

* correct gradient bug in reformer

* add fast gelu

* re-add dropout for embedding dropout

* better naming

* better naming

* renaming

* finalize test branch

* finalize tests

* add more tests

* finish tests

* fix

* fix type trainer

* fix fp16 tests

* fix tests

* fix tests

* fix tests

* fix issue with dropout

* fix dropout seeds

* correct random seed on gpu

* finalize random seed for dropout

* finalize random seed for dropout

* remove duplicate line

* correct half precision bug

* make style

* refactor

* refactor

* docstring

* remove sinusoidal position encodings for reformer

* move chunking to modeling_utils

* make style

* clean config

* make style

* fix tests

* fix auto tests

* pretrained models

* fix docstring

* update conversion file

* Update pretrained_models.rst

* fix rst

* fix rst

* update copyright

* fix test path

* fix test path

* fix small issue in test

* include reformer in generation tests

* add docs for axial position encoding

* finish docs

* Update convert_reformer_trax_checkpoint_to_pytorch.py

* remove isort

* include sams comments

* remove wrong comment in utils

* correct typos

* fix typo

* Update reformer.rst

* applied morgans optimization

* make style

* make gpu compatible

* remove bogus file

* big test refactor

* add example for chunking

* fix typo

* add to README
2020-05-07 10:17:01 +02:00