Documentation
parent 4dde31cb76
commit cf5c5c9e1c
@@ -1,9 +1,38 @@
Transformers
================================================================================================================================================
Transformers is a library of state-of-the-art pre-trained models for Natural Language Processing (NLP).
🤗 Transformers (formerly known as `pytorch-transformers` and `pytorch-pretrained-bert`) provides general-purpose architectures
(BERT, GPT-2, RoBERTa, XLM, DistilBert, XLNet...) for Natural Language Understanding (NLU) and Natural Language Generation
(NLG) with over 32 pretrained models in 100+ languages and deep interoperability between TensorFlow 2.0 and PyTorch.
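As a quick illustration of that shared API, here is a minimal sketch (not taken from this page; it only assumes the library's standard ``BertTokenizer``/``BertModel`` classes and the ``bert-base-uncased`` shortcut name) that encodes a sentence and extracts its hidden states in PyTorch:

.. code-block:: python

    import torch
    from transformers import BertTokenizer, BertModel

    # Download (and cache) a pretrained tokenizer and model by shortcut name.
    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = BertModel.from_pretrained('bert-base-uncased')

    # Encode a sentence and run it through the model without computing gradients.
    input_ids = torch.tensor([tokenizer.encode("Hello, world!", add_special_tokens=True)])
    with torch.no_grad():
        outputs = model(input_ids)
    last_hidden_states = outputs[0]  # shape: (batch_size, sequence_length, hidden_size)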
The library currently contains PyTorch implementations, pre-trained model weights, usage scripts and conversion utilities for the following models:
Features
---------------------------------------------------
- As easy to use as pytorch-transformers
- As powerful and concise as Keras (see the fine-tuning sketch after this list)
- High performance on NLU and NLG tasks
- Low barrier to entry for educators and practitioners
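A rough sketch of what the Keras-style workflow referenced above can look like. The two-sentence dataset and the hyper-parameters are placeholders; only ``TFBertForSequenceClassification`` and the standard Keras ``compile``/``fit`` calls come from the library's TensorFlow 2.0 classes:

.. code-block:: python

    import tensorflow as tf
    from transformers import BertTokenizer, TFBertForSequenceClassification

    tokenizer = BertTokenizer.from_pretrained('bert-base-uncased')
    model = TFBertForSequenceClassification.from_pretrained('bert-base-uncased')

    # Toy classification data: encode two sentences and pad them to the same length.
    texts = ["I loved this movie!", "This film was terrible."]
    labels = [1, 0]
    encoded = [tokenizer.encode(t, add_special_tokens=True) for t in texts]
    max_len = max(len(ids) for ids in encoded)
    padded = [ids + [tokenizer.pad_token_id] * (max_len - len(ids)) for ids in encoded]
    dataset = tf.data.Dataset.from_tensor_slices((tf.constant(padded), tf.constant(labels))).batch(2)

    # The model is a tf.keras.Model, so the familiar compile/fit loop applies.
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=3e-5),
                  loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True))
    model.fit(dataset, epochs=2)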
State-of-the-art NLP for everyone
- Deep learning researchers
- Hands-on practitioners
- AI/ML/NLP teachers and educators

Lower compute costs, smaller carbon footprint
- Researchers can share trained models instead of always retraining
- Practitioners can reduce compute time and production costs
- 8 architectures with over 30 pretrained models, some in more than 100 languages

Choose the right framework for every part of a model's lifetime
- Train state-of-the-art models in 3 lines of code
- Deep interoperability between TensorFlow 2.0 and PyTorch models
- Move a single model between TF2.0/PyTorch frameworks at will (see the sketch after this list)
- Seamlessly pick the right framework for training, evaluation, production
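As a sketch of the round trip meant above (the save directories are placeholder paths; ``from_pt``/``from_tf`` are the cross-framework loading flags of the ``from_pretrained`` methods):

.. code-block:: python

    from transformers import BertModel, TFBertModel

    # Work on a model in PyTorch (e.g. fine-tune it), then save it to disk.
    pt_model = BertModel.from_pretrained('bert-base-uncased')
    pt_model.save_pretrained('./my-bert')  # placeholder output directory

    # Reload exactly the same weights as a TensorFlow 2.0 (Keras) model...
    tf_model = TFBertModel.from_pretrained('./my-bert', from_pt=True)

    # ...and go back the other way: a TF2.0 checkpoint loaded into PyTorch.
    tf_model.save_pretrained('./my-bert-tf')  # placeholder output directory
    pt_model_again = BertModel.from_pretrained('./my-bert-tf', from_tf=True)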
Contents
---------------------------------
The library currently contains PyTorch and TensorFlow implementations, pre-trained model weights, usage scripts and conversion utilities for the following models (a loading sketch follows the list):
1. `BERT <https://github.com/google-research/bert>`_ (from Google) released with the paper `BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding <https://arxiv.org/abs/1810.04805>`_ by Jacob Devlin, Ming-Wei Chang, Kenton Lee and Kristina Toutanova.
2. `GPT <https://github.com/openai/finetune-transformer-lm>`_ (from OpenAI) released with the paper `Improving Language Understanding by Generative Pre-Training <https://blog.openai.com/language-unsupervised>`_ by Alec Radford, Karthik Narasimhan, Tim Salimans and Ilya Sutskever.
@@ -14,6 +43,7 @@ The library currently contains PyTorch implementations, pre-trained model weights
7. `RoBERTa <https://github.com/pytorch/fairseq/tree/master/examples/roberta>`_ (from Facebook), released together with the paper `RoBERTa: A Robustly Optimized BERT Pretraining Approach <https://arxiv.org/abs/1907.11692>`_ by Yinhan Liu, Myle Ott, Naman Goyal, Jingfei Du, Mandar Joshi, Danqi Chen, Omer Levy, Mike Lewis, Luke Zettlemoyer and Veselin Stoyanov.
8. `DistilBERT <https://huggingface.co/transformers/model_doc/distilbert.html>`_ (from HuggingFace) released together with the blog post `Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT <https://medium.com/huggingface/distilbert-8cf3380435b5>`_ by Victor Sanh, Lysandre Debut and Thomas Wolf.
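Each of the architectures listed above can be loaded through its dedicated model and tokenizer classes or, as a convenience, through the ``AutoTokenizer``/``AutoModel`` classes, which pick the right implementation from the checkpoint name. A small sketch (the shortcut names are examples of publicly distributed checkpoints):

.. code-block:: python

    from transformers import AutoTokenizer, AutoModel

    # The same two calls work for any of the architectures above; the concrete
    # tokenizer and model classes are selected from the checkpoint's configuration.
    for checkpoint in ['bert-base-uncased', 'gpt2', 'roberta-base', 'distilbert-base-uncased']:
        tokenizer = AutoTokenizer.from_pretrained(checkpoint)
        model = AutoModel.from_pretrained(checkpoint)
        print(checkpoint, '->', model.__class__.__name__)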
.. toctree::
    :maxdepth: 2
    :caption: Notes
@@ -44,15 +44,15 @@ Here is the full list of the currently provided pretrained models together with
| +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| | ``bert-large-uncased-whole-word-masking-finetuned-squad`` | | 24-layer, 1024-hidden, 16-heads, 340M parameters. |
| | | | The ``bert-large-uncased-whole-word-masking`` model fine-tuned on SQuAD |
| | | (see details of fine-tuning in the `example section <https://github.com/huggingface/transformers/tree/master/examples>`__). |
| +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| | ``bert-large-cased-whole-word-masking-finetuned-squad`` | | 24-layer, 1024-hidden, 16-heads, 340M parameters |
| | | | The ``bert-large-cased-whole-word-masking`` model fine-tuned on SQuAD |
| | | (see `details of fine-tuning in the example section <https://huggingface.co/transformers/examples.html>`__) |
| +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| | ``bert-base-cased-finetuned-mrpc`` | | 12-layer, 768-hidden, 12-heads, 110M parameters. |
| | | | The ``bert-base-cased`` model fine-tuned on MRPC |
| | | (see `details of fine-tuning in the example section <https://huggingface.co/transformers/examples.html>`__) |
+-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
| GPT | ``openai-gpt`` | | 12-layer, 768-hidden, 12-heads, 110M parameters. |
| | | | OpenAI GPT English model |