Testing
Let's take a look at how 🤗 Transformers models are tested and how you can write new tests and improve the existing ones.
There are 2 test suites in the repository:
1. tests -- tests for the general API
2. examples -- tests mostly for various applications that aren't part of the API
How transformers are tested
- Once a PR is submitted, it gets tested with 9 CircleCI jobs. Every new commit to that PR is re-tested. These jobs are defined in the CircleCI config file, so that, if needed, you can reproduce the same environment on your machine. These CI jobs don't run @slow tests.
- There are 3 jobs run by GitHub Actions:
  - torch hub integration: checks whether the torch hub integration works.
  - self-hosted (push): runs fast tests on GPU, only on commits on main. It only runs if a commit on main has updated the code in one of the following folders: src, tests, .github (to prevent running on added model cards, notebooks, etc.).
  - self-hosted runner: runs normal and slow tests on GPU in tests and examples:
RUN_SLOW=1 pytest tests/
RUN_SLOW=1 pytest examples/
The results can be observed [here](https://github.com/huggingface/transformers/actions).
Running tests
Choosing which tests to run
This document explains many of the details of how tests can be run. If after reading everything you still need more details, you will find them here.
Here are some of the most useful ways of running tests.
Run all of them:
pytest
or:
make test
Note that the latter is defined as:
python -m pytest -n auto --dist=loadfile -s -v ./tests/
which tells pytest to:
- run as many test processes as there are CPU cores (which could be too many if you don't have a lot of RAM!)
- ensure that all tests from the same file are run by the same test process
- not capture output
- run in verbose mode
Getting the list of all tests
All tests of the test suite:
pytest --collect-only -q
All tests of a given test file:
pytest tests/test_optimization.py --collect-only -q
Run a specific test module
To run an individual test module:
pytest tests/utils/test_logging.py
Run specific tests
Since unittest is used inside most of the tests, to run specific subtests you need to know the name of the unittest class containing those tests. For example, it could be:
pytest tests/test_optimization.py::OptimizationTest::test_adam_w
Here:
- tests/test_optimization.py - the file with the tests
- OptimizationTest - the name of the test class
- test_adam_w - the name of the specific test function
If the file contains multiple classes, you can choose to run only the tests of a given class. For example:
pytest tests/test_optimization.py::OptimizationTest
will run all the tests inside that class.
As mentioned earlier, you can see what tests are contained inside the OptimizationTest class by running:
pytest tests/test_optimization.py::OptimizationTest --collect-only -q
You can run tests by keyword expressions.
To run only tests whose name contains adam:
pytest -k adam tests/test_optimization.py
and and or can be used to indicate whether all keywords should match or either; not can be used to negate.
To run all tests except those whose name contains adam:
pytest -k "not adam" tests/test_optimization.py
And you can combine the two patterns in one:
pytest -k "ada and not adam" tests/test_optimization.py
For example, to run both test_adafactor and test_adam_w you can use:
pytest -k "test_adafactor or test_adam_w" tests/test_optimization.py
Note that we use or here, since we want either of the keywords to match in order to include both.
If you want to include only tests that include both patterns, and is to be used:
pytest -k "test and ada" tests/test_optimization.py
Run accelerate tests
Sometimes you need to run accelerate tests on your models. For that you can just add -m accelerate_tests to your command; for example, if you want to run these tests on OPT run:
RUN_SLOW=1 pytest -m accelerate_tests tests/models/opt/test_modeling_opt.py
Run documentation tests
In order to test whether the documentation examples are correct, you should check that the doctests are passing. As an example, let's use WhisperModel.forward's docstring:
r"""
Returns:
Example:
```python
>>> import torch
>>> from transformers import WhisperModel, WhisperFeatureExtractor
>>> from datasets import load_dataset
>>> model = WhisperModel.from_pretrained("openai/whisper-base")
>>> feature_extractor = WhisperFeatureExtractor.from_pretrained("openai/whisper-base")
>>> ds = load_dataset("hf-internal-testing/librispeech_asr_dummy", "clean", split="validation")
>>> inputs = feature_extractor(ds[0]["audio"]["array"], return_tensors="pt")
>>> input_features = inputs.input_features
>>> decoder_input_ids = torch.tensor([[1, 1]]) * model.config.decoder_start_token_id
>>> last_hidden_state = model(input_features, decoder_input_ids=decoder_input_ids).last_hidden_state
>>> list(last_hidden_state.shape)
[1, 2, 512]
```"""
Just run the following line to automatically test every docstring example in the desired file:
pytest --doctest-modules <path_to_file_or_dir>
If the file has a markdown extension, you should add the --doctest-glob="*.md" argument.
Run only modified tests
You can run the tests related to the unstaged files or the current branch (according to Git) by using pytest-picked. This is a great way of quickly testing that your changes didn't break anything, since it won't run the tests related to files you didn't touch.
pip install pytest-picked
pytest --picked
All tests will be run from files and folders which are modified, but not yet committed.
Automatically rerun failed tests on source modification
pytest-xdist provides a very useful feature: it can detect all failed tests and then continuously re-run those failing tests as you modify files to fix them, so you don't need to restart pytest after each fix. This is repeated until all tests pass, after which a full run is performed again.
pip install pytest-xdist
To enter the mode: pytest -f or pytest --looponfail
File changes are detected by looking at the looponfailroots root directories and all of their contents (recursively). If the default for this value does not work for you, you can change it in your project by setting a configuration option in setup.cfg:
[tool:pytest]
looponfailroots = transformers tests
ãŸã㯠pytest.ini
/tox.ini
ãã¡ã€ã«:
[pytest]
looponfailroots = transformers tests
File changes are only looked for in the specified directories, relative to the directory of the ini-file.
pytest-watch is an alternative implementation of this functionality.
Skip a test module
If you want to run all test modules except a few, you can exclude them by giving an explicit list of tests to run. For example, to run all except the test_modeling_*.py tests:
pytest $(ls -1 tests/*py | grep -v test_modeling)
Clearing state
On CI builds, or when isolation is important (over speed), the cache should be cleared:
pytest --cache-clear tests
Running tests in parallel
As mentioned earlier, make test runs the tests in parallel via the pytest-xdist plugin (the -n X argument, e.g. -n 2 runs 2 parallel jobs).
pytest-xdist's --dist= option allows one to control how the tests are grouped. --dist=loadfile puts the tests located in one file onto the same process.
Since the order of executed tests is different and unpredictable, if running the test suite with pytest-xdist produces failures (meaning there are some undetected coupled tests), use pytest-replay to replay the tests in the same order, which should then help reduce that failing sequence to a minimum.
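A rough sketch of that workflow, assuming the pytest-replay flags haven't changed since this was written (check the plugin's documentation for the exact names of the recorded files):
pytest -n 2 --dist=loadfile --replay-record-dir=replay_dir tests/
pytest --replay=replay_dir/<one-of-the-recorded-files>.txt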
Test order and repetition
It is good to repeat the tests several times, in sequence, randomly, or in sets, to detect any potential inter-dependency and state-related bugs (tear down). And straightforward multiple repetition is also good for detecting some problems that get uncovered by the randomness of DL.
Repeat tests
pip install pytest-flakefinder
And then run every test multiple times (50 by default):
pytest --flake-finder --flake-runs=5 tests/test_failing_test.py
This plugin doesn't work with the -n flag from pytest-xdist.
There is another plugin, pytest-repeat, but it doesn't work with unittest.
Run tests in a random order
pip install pytest-random-order
Important: the presence of pytest-random-order will automatically randomize tests; no configuration changes or command line options are required.
As explained earlier, this allows detection of coupled tests - where one test's state affects the state of another. When pytest-random-order is installed it will print the random seed it used for that session, e.g.:
pytest tests
[...]
Using --random-order-bucket=module
Using --random-order-seed=573663
So if the given particular sequence fails, you can reproduce it by adding that exact seed, e.g.:
pytest --random-order-seed=573663
[...]
Using --random-order-bucket=module
Using --random-order-seed=573663
It will only reproduce the exact order if you use the exact same list of tests (or no list at all). Once you start manually narrowing down the list you can no longer rely on the seed; you have to list the tests manually in the exact order they failed and tell pytest not to randomize them, using --random-order-bucket=none, e.g.:
pytest --random-order-bucket=none tests/test_a.py tests/test_c.py tests/test_b.py
To disable the shuffling for all tests:
pytest --random-order-bucket=none
By default --random-order-bucket=module is implied, which shuffles the files at the module level. It can also shuffle on class, package, global and none levels. For the complete details please see its documentation.
Another randomization alternative is pytest-randomly. This module has very similar functionality/interface, but it doesn't have the bucket modes available in pytest-random-order. It has the same problem of imposing itself once installed.
Look and feel variations
pytest-sugar
pytest-sugar is a plugin that improves the look-n-feel, adds a progress bar, and shows failed tests and the asserts instantly. It gets activated automatically upon installation.
pip install pytest-sugar
To run tests without it, run:
pytest -p no:sugar
or uninstall it.
Report each sub-test name and its progress
For a single or a group of tests via pytest (after pip install pytest-pspec):
pytest --pspec tests/test_optimization.py
Instantly shows failed tests
pytest-instafail shows failures and errors instantly, instead of waiting until the end of the test session.
pip install pytest-instafail
pytest --instafail
To GPU or not to GPU
On a GPU-enabled setup, to test in CPU-only mode add CUDA_VISIBLE_DEVICES="":
CUDA_VISIBLE_DEVICES="" pytest tests/utils/test_logging.py
or, if you have multiple GPUs, you can specify which one is to be used by pytest. For example, to use only the second GPU if you have GPUs 0 and 1, you can run:
CUDA_VISIBLE_DEVICES="1" pytest tests/utils/test_logging.py
This is handy when you want to run different tasks on different GPUs.
Some tests must be run on CPU-only, others on either CPU or GPU or TPU, and yet others on multiple GPUs. The following skip decorators are used to set the CPU/GPU/TPU requirements of tests:
- require_torch - this test will run only under torch
- require_torch_gpu - as require_torch plus requires at least 1 GPU
- require_torch_multi_gpu - as require_torch plus requires at least 2 GPUs
- require_torch_non_multi_gpu - as require_torch plus requires 0 or 1 GPUs
- require_torch_up_to_2_gpus - as require_torch plus requires 0, 1 or 2 GPUs
- require_torch_xla - as require_torch plus requires at least 1 TPU
Let's depict the GPU requirements in the following table:
| n gpus | decorator                      |
|--------+--------------------------------|
| >= 0   | @require_torch                 |
| >= 1   | @require_torch_gpu             |
| >= 2   | @require_torch_multi_gpu       |
| < 2    | @require_torch_non_multi_gpu   |
| < 3    | @require_torch_up_to_2_gpus    |
For example, here is a test that must be run only when there are 2 or more GPUs available and pytorch is installed:
@require_torch_multi_gpu
def test_example_with_multi_gpu():
If a test requires tensorflow, use the require_tf decorator. For example:
@require_tf
def test_tf_thing_with_tensorflow():
These decorators can be stacked. For example, if a test is slow and requires at least one GPU under pytorch, here is how to set it up:
@require_torch_gpu
@slow
def test_example_slow_on_gpu():
Some decorators like @parameterized rewrite test names, therefore @require_* skip decorators have to be listed last for them to work correctly. Here is an example of the correct usage:
@parameterized.expand(...)
@require_torch_multi_gpu
def test_integration_foo():
This order problem doesn't exist with @pytest.mark.parametrize: you can put it first or last and it will still work. But it only works with non-unittest tests.
Inside tests:
- How many GPUs are available:
from transformers.testing_utils import get_gpu_count
n_gpu = get_gpu_count() # works with torch and tf
Testing with a specific PyTorch backend or device
To run the test suite on a specific torch device, add TRANSFORMERS_TEST_DEVICE="$device" where $device is the target backend. For example, to test on CPU only:
TRANSFORMERS_TEST_DEVICE="cpu" pytest tests/utils/test_logging.py
This variable is useful for testing custom or less common PyTorch backends such as mps. It can also be used to achieve the same effect as CUDA_VISIBLE_DEVICES by targeting specific GPUs or testing in CPU-only mode.
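For example, on an Apple Silicon machine the same test file could be run against the mps backend with:
TRANSFORMERS_TEST_DEVICE="mps" pytest tests/utils/test_logging.py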
Certain devices require an additional import after importing torch for the first time. This can be specified using the environment variable TRANSFORMERS_TEST_BACKEND:
TRANSFORMERS_TEST_BACKEND="torch_npu" pytest tests/utils/test_logging.py
Distributed training
pytest can't deal with distributed training directly. If this is attempted, the sub-processes don't do the right thing and end up thinking they are pytest and start running the test suite in loops. It works, however, if one spawns a normal process that then spawns off multiple workers and manages the IO pipes.
Here are some tests that use it, e.g. tests/test_trainer_distributed.py.
To jump right into the execution point, search for the execute_subprocess_async call in those tests.
You will need at least 2 GPUs to see these tests in action:
CUDA_VISIBLE_DEVICES=0,1 RUN_SLOW=1 pytest -sv tests/test_trainer_distributed.py
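For orientation, here is a minimal sketch of the pattern those tests follow (the class and test names are made up for illustration; execute_subprocess_async, get_env and test_file_dir are the transformers.testing_utils helpers described elsewhere in this document):
from transformers.testing_utils import TestCasePlus, execute_subprocess_async


class MyDistributedTest(TestCasePlus):
    def test_ddp_smoke(self):
        # spawn a regular process (torchrun) that manages the workers itself;
        # pytest only waits for the child process and checks its exit status
        cmd = f"torchrun --nproc_per_node=2 {self.test_file_dir}/test_trainer_distributed.py".split()
        execute_subprocess_async(cmd, env=self.get_env())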
Output capture
During test execution any output sent to stdout and stderr is captured. If a test or a setup method fails, its corresponding captured output will usually be shown along with the failure traceback.
To disable output capturing and get stdout and stderr normally, use -s or --capture=no:
pytest -s tests/utils/test_logging.py
To send test results to JUnit format output:
py.test tests --junitxml=result.xml
Color control
To have no color (e.g., yellow text on a white background is not readable):
pytest --color=no tests/utils/test_logging.py
Sending test report to online pastebin service
Creating a URL for each test failure:
pytest --pastebin=failed tests/utils/test_logging.py
This will submit test run information to a remote Paste service and provide a URL for each failure. You may select tests as usual, or add for example -x if you only want to send one particular failure.
Creating a URL for a whole test session log:
pytest --pastebin=all tests/utils/test_logging.py
Writing tests
🤗 transformers tests are based on unittest, but run by pytest, so most of the time features from both systems can be used.
You can read here which features are supported, but the important thing to remember is that most pytest fixtures don't work. Neither does parametrization, but we use the module parameterized that works in a similar way.
Parametrization
Often there is a need to run the same test multiple times, but with different arguments. It could be done from within the test, but then there is no way of running that test for just one set of arguments.
# test_this1.py
import math
import unittest

from parameterized import parameterized


class TestMathUnitTest(unittest.TestCase):
    @parameterized.expand(
        [
            ("negative", -1.5, -2.0),
            ("integer", 1, 1.0),
            ("large fraction", 1.6, 1),
        ]
    )
    def test_floor(self, name, input, expected):
        self.assertEqual(math.floor(input), expected)
By default this test will now be run 3 times, each time with the last 3 arguments of test_floor being assigned the corresponding arguments from the parameter list.
And you could run just the negative and integer sets of params with:
pytest -k "negative and integer" tests/test_mytest.py
or all but the negative sub-tests, with:
pytest -k "not negative" tests/test_mytest.py
Besides using the -k filter just mentioned, you can find out the exact name of each sub-test and run any or all of them using their exact names.
pytest test_this1.py --collect-only -q
and it will list:
test_this1.py::TestMathUnitTest::test_floor_0_negative
test_this1.py::TestMathUnitTest::test_floor_1_integer
test_this1.py::TestMathUnitTest::test_floor_2_large_fraction
So now you can run just 2 specific sub-tests:
pytest test_this1.py::TestMathUnitTest::test_floor_0_negative test_this1.py::TestMathUnitTest::test_floor_1_integer
The module parameterized, which is already in the developer dependencies of transformers, works for both unittest and pytest tests.
If, however, the test is not a unittest, you may use pytest.mark.parametrize (or you may see it being used in some existing tests, mostly under examples).
Here is the same example, this time using pytest's parametrize marker:
# test_this2.py
import math

import pytest


@pytest.mark.parametrize(
    "name, input, expected",
    [
        ("negative", -1.5, -2.0),
        ("integer", 1, 1.0),
        ("large fraction", 1.6, 1),
    ],
)
def test_floor(name, input, expected):
    assert math.floor(input) == expected
As with parameterized, with pytest.mark.parametrize you have fine control over which sub-tests are run, if the -k filter doesn't do the job. Except, this parametrization function creates a slightly different set of names for the sub-tests. Here is what they look like:
pytest test_this2.py --collect-only -q
and it will list:
test_this2.py::test_floor[integer-1-1.0]
test_this2.py::test_floor[negative--1.5--2.0]
test_this2.py::test_floor[large fraction-1.6-1]
So now you can run just the specific tests:
pytest test_this2.py::test_floor[negative--1.5--2.0] test_this2.py::test_floor[integer-1-1.0]
just as in the previous example.
Files and directories
Tests often need to know where things are relative to the current test file, and that is not trivial, since the test could be invoked from more than one directory or could reside in sub-directories at different depths. A helper class transformers.test_utils.TestCasePlus solves this problem by sorting out all the basic paths and providing easy accessors to them:
- pathlib objects (all fully resolved):
  - test_file_path - the current test file path, i.e. __file__
  - test_file_dir - the directory containing the current test file
  - tests_dir - the directory of the tests test suite
  - examples_dir - the directory of the examples test suite
  - repo_root_dir - the directory of the repository
  - src_dir - the directory where the transformers sub-dir resides
- stringified paths - same as above, but these return paths as strings rather than pathlib objects:
  - test_file_path_str
  - test_file_dir_str
  - tests_dir_str
  - examples_dir_str
  - repo_root_dir_str
  - src_dir_str
To start using these, all you need is to make sure the test resides in a subclass of transformers.test_utils.TestCasePlus. For example:
from transformers.testing_utils import TestCasePlus


class PathExampleTest(TestCasePlus):
    def test_something_involving_local_locations(self):
        data_dir = self.tests_dir / "fixtures/tests_samples/wmt_en_ro"
If you don't need to manipulate paths via pathlib, or you just need a path as a string, you can always call str() on the pathlib object or use the accessors ending with _str. For example:
from transformers.testing_utils import TestCasePlus


class PathExampleTest(TestCasePlus):
    def test_something_involving_stringified_locations(self):
        examples_dir = self.examples_dir_str
Temporary files and directories
Using unique temporary files and directories is essential for parallel test running, so that the tests don't overwrite each other's data. We also want the temporary files and directories removed at the end of each test that created them. Therefore, using packages like tempfile, which address these needs, is essential.
However, when debugging tests, you need to be able to see what goes into the temporary file or directory, and you want to know its exact path, not one randomized on every test re-run.
The helper class transformers.test_utils.TestCasePlus is best used for such purposes. It's a subclass of unittest.TestCase, so we can easily inherit from it in the test modules.
Here is an example of its usage:
from transformers.testing_utils import TestCasePlus


class ExamplesTests(TestCasePlus):
    def test_whatever(self):
        tmp_dir = self.get_auto_remove_tmp_dir()
This code creates a unique temporary directory and sets tmp_dir to its location.
- Create a unique temporary dir:
def test_whatever(self):
    tmp_dir = self.get_auto_remove_tmp_dir()
tmp_dir will contain the path to the created temporary dir. It will be automatically removed at the end of the test.
- Create a temporary dir of my choice, ensure it's empty before the test starts, and don't empty it after the test:
def test_whatever(self):
    tmp_dir = self.get_auto_remove_tmp_dir("./xxx")
This is useful for debugging, when you want to monitor a specific directory and make sure the previous tests didn't leave any data in there.
- You can override the default behavior by directly overriding the before and after args, leading to one of the following behaviors (see the sketch after this list):
  - before=True: the temporary dir will always be cleared at the beginning of the test.
  - before=False: if the temporary dir already existed, any existing files will remain there.
  - after=True: the temporary dir will always be deleted at the end of the test.
  - after=False: the temporary dir will always be left intact at the end of the test.
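As a small sketch of how those overrides combine (the directory name is just an example):
def test_whatever(self):
    # keep a fixed, inspectable directory: wipe it before the test, leave it in place afterwards
    tmp_dir = self.get_auto_remove_tmp_dir("./xxx", before=True, after=False)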
In order to run the equivalent of rm -r safely, only subdirs of the project repository checkout are allowed if an explicit tmp_dir is used, so that by mistake no /tmp or similar important part of the filesystem gets nuked. Please always pass paths that start with ./.
Each test can register multiple temporary directories, and they will all be auto-removed, unless requested otherwise.
Temporary sys.path override
If you need to temporarily override sys.path, for example to import from another test, you can use the ExtendSysPath context manager. Example:
import os
from transformers.testing_utils import ExtendSysPath

bindir = os.path.abspath(os.path.dirname(__file__))
with ExtendSysPath(f"{bindir}/.."):
    from test_trainer import TrainerIntegrationCommon  # noqa
Skipping tests
This is useful when a bug is found and a new test is written, but the bug is not fixed yet. In order to be able to commit it to the main repository we need to make sure it's skipped during make test.
Methods:
- A skip means that you expect your test to pass only if some conditions are met, otherwise pytest should skip running the test altogether. Common examples are skipping windows-only tests on non-windows platforms, or skipping tests that depend on an external resource which is not available at the moment (for example a database).
- An xfail means that you expect a test to fail for some reason. A common example is a test for a feature not yet implemented, or a bug not yet fixed. When a test passes despite being expected to fail (marked with pytest.mark.xfail), it's an xpass and will be reported in the test summary.
One of the important differences between the two is that skip doesn't run the test, while xfail does. So if the buggy code causes some bad state that will affect other tests, do not use xfail.
Implementation
- Here is how to skip a whole test unconditionally:
@unittest.skip("this bug needs to be fixed")
def test_feature_x():
ãŸã㯠pytest çµç±:
@pytest.mark.skip(reason="this bug needs to be fixed")
ãŸã㯠xfail
ã®æ¹æ³:
@pytest.mark.xfail
def test_feature_x():
- Here is how to skip a test based on some internal check inside the test:
def test_feature_x():
    if not has_something():
        pytest.skip("unsupported configuration")
or the whole module:
import pytest

if not pytest.config.getoption("--custom-flag"):
    pytest.skip("--custom-flag is missing, skipping tests", allow_module_level=True)
ãŸã㯠xfail
ã®æ¹æ³:
def test_feature_x():
pytest.xfail("expected to fail until bug XYZ is fixed")
- Here is how to skip all tests in a module if some import is missing:
docutils = pytest.importorskip("docutils", minversion="0.3")
- Skip a test based on a condition:
@pytest.mark.skipif(sys.version_info < (3,6), reason="requires python3.6 or higher")
def test_feature_x():
or:
@unittest.skipIf(torch_device == "cpu", "Can't do half precision")
def test_feature_x():
or skip the whole module:
@pytest.mark.skipif(sys.platform == 'win32', reason="does not run on windows")
class TestClass():
    def test_feature_x(self):
More details, examples and ways are available here.
Slow tests
The library of tests is ever-growing, and some of the tests take minutes to run, therefore we can't afford to wait an hour for the test suite to complete on CI. Therefore, with some exceptions for essential tests, slow tests should be marked as in the example below:
from transformers.testing_utils import slow
@slow
def test_integration_foo():
Once a test is marked as @slow, to run such tests set the RUN_SLOW=1 environment variable, e.g.:
RUN_SLOW=1 pytest tests
Some decorators like @parameterized rewrite test names, therefore @slow and the rest of the skip decorators @require_* have to be listed last for them to work correctly. Here is an example of the correct usage:
@parameterized.expand(...)
@slow
def test_integration_foo():
As explained at the beginning of this document, slow tests are run on a scheduled basis, rather than in the PR CI checks. So it's possible that some problems will be missed during a PR submission and get merged. Such problems will get caught during the next scheduled CI job. But that also means that it's important to run the slow tests on your machine before submitting the PR.
Here is a rough decision making mechanism for choosing which tests should be marked as slow:
- If the test is focused on one of the library's internal components (e.g., modeling files, tokenization files, pipelines), then we should run that test in the non-slow test suite. If it's focused on another aspect of the library, such as the documentation or the examples, then we should run these tests in the slow test suite. And then, to refine this approach, we should have exceptions:
- All tests that need to download a heavy set of weights or a dataset larger than ~50MB (e.g., model integration tests, tokenizer integration tests, pipeline integration tests) should be set to slow. If you're adding a new model, you should create and upload to the hub a tiny version of it (with random weights) for the integration tests. This is discussed in the following paragraphs.
- All tests that need to do a training that is not specifically optimized to be fast should be set to slow.
- We can introduce exceptions if some of these should-be-non-slow tests are excruciatingly slow and set them to @slow. Auto-modeling tests, which save and load large files to disk, are a good example of tests that are marked as @slow.
- If a test completes in under 1 second on CI (including downloads, if any), it should be a normal test regardless.
Collectively, all the non-slow tests need to cover the different internals entirely, while remaining fast. For example, significant coverage can be achieved by testing with specially created tiny models with random weights. Such models have the very minimal number of layers (e.g., 2), vocab size (e.g., 1000), etc. Then the @slow tests can use large slow models to do qualitative testing. To see these in use, simply look for tiny models with:
grep tiny tests examples
Here is an example of a script that created a tiny model such as tiny-wmt19-en-de. You can easily adjust it to your specific model's architecture.
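If you need to roll your own, the general recipe is simply to instantiate a config with tiny dimensions and random weights; here is a minimal sketch (the architecture, sizes and output path are arbitrary choices for illustration):
from transformers import BertConfig, BertModel

# a deliberately tiny config: few layers, small hidden size, small vocab
config = BertConfig(
    vocab_size=1000,
    hidden_size=32,
    num_hidden_layers=2,
    num_attention_heads=2,
    intermediate_size=64,
)
model = BertModel(config)  # randomly initialized weights
model.save_pretrained("tiny-bert-for-tests")  # then upload it to the hub for the integration tests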
It's easy to measure the run-time incorrectly, for example if there is an overhead of downloading a huge model: if you test locally, the downloaded files get cached and thus the download time isn't measured. Hence check the execution speed report in the CI logs instead (the output of pytest --durations=0 tests).
That report is also useful to find slow outliers that aren't marked as slow, or which need to be re-written to be fast. If you notice that the test suite starts getting slow on CI, the top listing of this report will show the slowest tests.
Testing the stdout/stderr output
In order to test functions that write to stdout and/or stderr, the test can access those streams using pytest's capsys system. Here is how this is accomplished:
import sys


def print_to_stdout(s):
    print(s)


def print_to_stderr(s):
    sys.stderr.write(s)


def test_result_and_stdout(capsys):
    msg = "Hello"
    print_to_stdout(msg)
    print_to_stderr(msg)
    out, err = capsys.readouterr()  # consume the captured output streams
    # optional: if you want to replay the consumed streams:
    sys.stdout.write(out)
    sys.stderr.write(err)
    # test:
    assert msg in out
    assert msg in err
And, of course, most of the time stderr will come as part of an exception, so try/except has to be used in such a case:
def raise_exception(msg):
    raise ValueError(msg)


def test_something_exception():
    msg = "Not a good value"
    error = ""
    try:
        raise_exception(msg)
    except Exception as e:
        error = str(e)
        assert msg in error, f"{msg} is in the exception:\n{error}"
Another approach to capturing stdout is via contextlib.redirect_stdout:
import sys
from io import StringIO
from contextlib import redirect_stdout


def print_to_stdout(s):
    print(s)


def test_result_and_stdout():
    msg = "Hello"
    buffer = StringIO()
    with redirect_stdout(buffer):
        print_to_stdout(msg)
    out = buffer.getvalue()
    # optional: if you want to replay the consumed streams:
    sys.stdout.write(out)
    # test:
    assert msg in out
An important potential issue with capturing stdout is that it may contain \r characters, which in a normal print reset everything that has been printed so far. There is no problem with pytest, but with pytest -s these characters get included in the buffer, so to be able to run the test with and without -s, you have to do an extra cleanup of the captured output, using re.sub(r'~.*\r', '', buf, 0, re.M).
But then we have a helper context manager wrapper that automatically takes care of it all, regardless of whether it contains some \r's or not, so it's as simple as:
from transformers.testing_utils import CaptureStdout

with CaptureStdout() as cs:
    function_that_writes_to_stdout()
print(cs.out)
Here is a full test example:
from transformers.testing_utils import CaptureStdout

msg = "Secret message\r"
final = "Hello World"
with CaptureStdout() as cs:
    print(msg + final)
assert cs.out == final + "\n", f"captured: {cs.out}, expecting {final}"
If you'd like to capture stderr use the CaptureStderr class instead:
from transformers.testing_utils import CaptureStderr

with CaptureStderr() as cs:
    function_that_writes_to_stderr()
print(cs.err)
If you need to capture both streams at once, use the parent CaptureStd class:
from transformers.testing_utils import CaptureStd

with CaptureStd() as cs:
    function_that_writes_to_stdout_and_stderr()
print(cs.err, cs.out)
Also, to aid debugging test issues, by default these context managers automatically replay the captured streams on exit from the context.
Capturing logger stream
If you need to validate the output of a logger, you can use CaptureLogger:
from transformers import logging
from transformers.testing_utils import CaptureLogger

msg = "Testing 1, 2, 3"
logging.set_verbosity_info()
logger = logging.get_logger("transformers.models.bart.tokenization_bart")
with CaptureLogger(logger) as cl:
    logger.info(msg)
assert cl.out, msg + "\n"
Testing with environment variables
If you want to test the impact of environment variables for a specific test, you can use the helper decorator transformers.testing_utils.mockenv:
import os
import unittest

from transformers.testing_utils import mockenv


class HfArgumentParserTest(unittest.TestCase):
    @mockenv(TRANSFORMERS_VERBOSITY="error")
    def test_env_override(self):
        env_level_str = os.getenv("TRANSFORMERS_VERBOSITY", None)
At times an external program needs to be called from the test, which requires setting PYTHONPATH in os.environ to include multiple local paths. The helper class transformers.test_utils.TestCasePlus comes to help:
from transformers.testing_utils import TestCasePlus


class EnvExampleTest(TestCasePlus):
    def test_external_prog(self):
        env = self.get_env()
        # now call the external program, passing `env` to it
Depending on whether the test file is under the tests test suite or examples, it will correctly set up env[PYTHONPATH] to include one of these two directories, and also the src directory to ensure the testing is done against the current repository, and finally with whatever env[PYTHONPATH] was already set to before the test was called, if anything.
This helper method creates a copy of the os.environ object, so the original remains intact.
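For instance, a test could then invoke a child process with that environment; a minimal sketch (the command is purely illustrative):
import subprocess
import sys

from transformers.testing_utils import TestCasePlus


class EnvExampleTest(TestCasePlus):
    def test_external_prog(self):
        env = self.get_env()  # PYTHONPATH includes tests/ (or examples/) plus src/
        cmd = [sys.executable, "-c", "import transformers; print(transformers.__version__)"]
        subprocess.run(cmd, env=env, check=True)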
Getting reproducible results
In some situations you may want to remove randomness from your tests. To get identical reproducible results, you will need to fix the seed:
seed = 42

# python RNG
import random

random.seed(seed)

# pytorch RNGs
import torch

torch.manual_seed(seed)
torch.backends.cudnn.deterministic = True
if torch.cuda.is_available():
    torch.cuda.manual_seed_all(seed)

# numpy RNG
import numpy as np

np.random.seed(seed)

# tf RNG
import tensorflow as tf

tf.random.set_seed(seed)
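As a shortcut, transformers also ships a set_seed helper that seeds the Python, NumPy and framework RNGs in one call (check its docstring for exactly which frameworks your installed version covers):
from transformers import set_seed

set_seed(42)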
Debugging tests
To start a debugger at the point of the warning, do this:
pytest tests/utils/test_logging.py -W error::UserWarning --pdb
Working with github actions workflows
To trigger a self-push workflow CI job, you must:
- Create a new branch on the transformers origin repository (not a fork!).
- The branch name has to start with either ci_ or ci- (main triggers it too, but we can't do PRs on main). It also gets triggered only for specific paths - you can find the up-to-date definition, in case it changed since this document was written, here under push:
- Create a PR from this branch.
- Then you can see the job appear here. It may not run right away if there is a backlog.
Testing Experimental CI Features
Testing CI features can be potentially problematic as it can interfere with the normal CI functioning. Therefore, if a new CI feature is to be added, it should be done as follows:
- Create a new dedicated job that tests what needs to be tested.
- The new job must always succeed, so that it always shows a green ✓ (details below).
- Let it run for some days, to see that a variety of different PR types get to run on it (user fork branches, non-forked branches, branches originating from a direct github.com UI file edit, various forced pushes, etc. - there are many), while monitoring the experimental job's logs (not the overall job green status, since it's purposefully always green).
- When it's clear that everything is solid, merge the new changes into the existing jobs.
That way experiments on the CI functionality itself won't interfere with the normal workflow.
Now, how can we make the job always succeed while the new CI feature is being developed?
Some CIs, like TravisCI, support ignore-step-failure and will report the overall job as successful, but at the time of writing CircleCI and Github Actions don't support that.
So the following workaround can be used:
- put set +euo pipefail at the beginning of the run command to suppress most potential failures in the bash script.
- the last command must be a success: echo "done" or just true will do.
Here is an example:
- run:
    name: run CI experiment
    command: |
        set +euo pipefail
        echo "setting run-all-despite-any-errors-mode"
        this_command_will_fail
        echo "but bash continues to run"
        # emulate another failure
        false
        # but the last command must be a success
        echo "during experiment do not remove: reporting success to CI, even if there were failures"
For simple commands you could also do:
cmd_that_may_fail || true
Of course, once satisfied with the results, integrate the experimental step or job with the rest of the normal jobs, while removing set +euo pipefail or any other things you may have added, so that the experimental job doesn't interfere with the normal CI functioning.
This whole process would have been much easier if we could just set something like allow-failure for the experimental step and let it fail without impacting the overall status of PRs. But as mentioned earlier, CircleCI and Github Actions don't support that at the moment.
You can vote for this feature and see where it stands in these CI-specific threads: