transformers/utils
NielsRogge f3d2f7a6e0
Add MarkupLM (#19198)
* First draft

* Make basic test work

* Fix most tokenizer tests

* More improvements

* Make more tests pass

* Fix more tests

* Fix some code quality

* Improve truncation

* Implement feature extractor

* Improve feature extractor and add tests

* Improve feature extractor tests

* Fix pair_input test partly

* Add fast tokenizer

* Improve implementation

* Fix rebase

* Fix rebase

* Fix most of the tokenizer tests.

* propose solution for fast

* add: integration test for fasttokenizer, warning for decode, fix template in slow tokenizer

* add: modify markuplmconverter

* add: some modify on converter and tokenizerfast

* Fix style, copies

* Make fixup

* Update tokenization_markuplm.py

* Update test_tokenization_markuplm.py

* Update markuplm related

* Improve processor, add integration test

* Add processor test file

* Improve processor

* Improve processor tests

* Fix more processor tests

* Fix processor tests

* Update docstrings

* Add Copied from statements

* Add more Copied from statements

* Add code examples

* Improve code examples

* Add model to doc tests

* Adding dependency check

* Add dummy file

* Add requires_backends

* Add model to toctree

* Fix more things, disable dependency check for now

* Apply more suggestions

* Add soft dependency

* Add annotators to tests

* Fix style

* Remove from_slow=True

* Remove print statements

* Add sanity check

* Fix processor test

* Fix processor tests, add more docs

* Add doc tests for mdx file

* Add more tips

* Apply suggestions

Co-authored-by: Niels Rogge <nielsrogge@Nielss-MacBook-Pro.local>
Co-authored-by: lockon-n <45759388+lockon-n@users.noreply.github.com>
Co-authored-by: SaulLu <lucilesaul.com@gmail.com>
Co-authored-by: lockon-n <dd098309@126.com>
2022-09-30 08:25:43 +02:00
test_module Custom pipeline (#18079) 2022-07-19 12:02:35 +02:00
tf_ops Check TF ops for ONNX compliance (#10025) 2021-02-15 07:55:10 -05:00
check_config_docstrings.py Add X-CLIP (#18852) 2022-09-08 14:50:30 +02:00
check_copies.py Add Donut (#18488) 2022-08-12 16:40:58 +02:00
check_doc_toc.py Split model list on modality (#18328) 2022-08-01 11:10:20 -05:00
check_dummies.py Add some tests for check_dummies (#19146) 2022-09-21 14:54:09 -04:00
check_inits.py Make check_init script more robust and clean inits (#17408) 2022-05-25 07:23:56 -04:00
check_repo.py [TensorFlow] Adding GroupViT (#18020) 2022-09-29 10:48:04 +01:00
check_self_hosted_runner.py Add offline runners info in the Slack report (#19169) 2022-09-23 19:23:05 +02:00
check_table.py Fix some typos. (#17560) 2022-07-11 05:00:13 -04:00
check_tf_ops.py Check TF ops for ONNX compliance (#10025) 2021-02-15 07:55:10 -05:00
custom_init_isort.py explicitly set utf8 for Windows (#17664) 2022-06-13 08:05:45 -04:00
documentation_tests.txt Add MarkupLM (#19198) 2022-09-30 08:25:43 +02:00
download_glue_data.py Raise exceptions instead of asserts (#13907) 2021-10-07 12:44:23 +05:30
get_ci_error_statistics.py Update Past CI report script (#19228) 2022-09-29 19:22:23 +02:00
get_github_job_time.py add a script to get time info. from GA workflow jobs (#18822) 2022-09-01 12:02:52 +02:00
get_modified_files.py Updates the default branch from master to main (#16326) 2022-03-23 03:46:59 -04:00
notification_service.py Add offline runners info in the Slack report (#19169) 2022-09-23 19:23:05 +02:00
notification_service_doc_tests.py fix missing block when there is no failure (#18775) 2022-08-29 09:10:13 +02:00
past_ci_versions.py Add PyTorch 1.11 to past CI (#18302) 2022-07-26 15:47:23 +02:00
prepare_for_doc_test.py Add a check regarding the number of occurrences of ``` (#18389) 2022-08-01 14:23:02 +02:00
print_env.py Print more library versions in CI (#17384) 2022-06-02 10:24:16 +02:00
release.py Clean README in post release job as well. (#17519) 2022-06-02 07:44:03 -04:00
sort_auto_mappings.py Automatically sort auto mappings (#17250) 2022-05-16 13:24:20 -04:00
tests_fetcher.py Fix cached_file in offline mode for cached non-existing files (#19206) 2022-09-26 18:01:00 -04:00
update_metadata.py Automatically tag CLIP repos as zero-shot-image-classification (#19064) 2022-09-16 15:40:38 +02:00