transformers/tests/utils/test_activations.py

# Copyright 2020 The HuggingFace Team. All rights reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

import unittest

from transformers import is_torch_available
from transformers.testing_utils import require_torch


if is_torch_available():
    import torch

    from transformers.activations import gelu_new, gelu_python, get_activation


@require_torch
class TestActivations(unittest.TestCase):
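    def test_gelu_versions(self):
        x = torch.tensor([-100, -1, -0.1, 0, 0.1, 1.0, 100])
        torch_builtin = get_activation("gelu")
        # The pure-Python GELU should match the built-in implementation.
        self.assertTrue(torch.allclose(gelu_python(x), torch_builtin(x)))
        # gelu_new is a tanh-based approximation, so its outputs should differ.
        self.assertFalse(torch.allclose(gelu_python(x), gelu_new(x)))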

    def test_gelu_10(self):
        x = torch.tensor([-100, -1, -0.1, 0, 0.1, 1.0, 100])
        torch_builtin = get_activation("gelu")
        gelu10 = get_activation("gelu_10")

        y_gelu = torch_builtin(x)
        y_gelu_10 = gelu10(x)

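        # 1 where the gelu_10 output is below the 10.0 ceiling, 0 where it was clipped.
        clipped_mask = torch.where(y_gelu_10 < 10.0, 1, 0)

        # The largest input (100) must be clipped to exactly 10.0.
        self.assertEqual(torch.max(y_gelu_10).item(), 10.0)
        # On the unclipped entries, gelu_10 must agree with plain GELU.
        self.assertTrue(torch.allclose(y_gelu * clipped_mask, y_gelu_10 * clipped_mask))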

    def test_get_activation(self):
        # Every supported activation name should resolve without raising.
        get_activation("gelu")
        get_activation("gelu_10")
        get_activation("gelu_fast")
        get_activation("gelu_new")
        get_activation("gelu_python")
        get_activation("linear")
        get_activation("mish")
        get_activation("quick_gelu")
        get_activation("relu")
        get_activation("sigmoid")
        get_activation("silu")
        get_activation("swish")
        get_activation("tanh")
        # Unknown names, including None, are rejected with a KeyError.
        with self.assertRaises(KeyError):
            get_activation("bogus")
        with self.assertRaises(KeyError):
            get_activation(None)

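    def test_activations_are_distinct_objects(self):
        # Each call should return a fresh object, so an attribute set on one
        # instance must not be visible on another.
        act1 = get_activation("gelu")
        act1.a = 1
        act2 = get_activation("gelu")
        self.assertEqual(act1.a, 1)
        with self.assertRaises(AttributeError):
            _ = act2.a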