From ac1b449cc938bb34bc9021feff599cfd3b2376ae Mon Sep 17 00:00:00 2001
From: Julien Chaumond
Date: Sat, 21 Dec 2019 00:09:01 -0500
Subject: [PATCH] [doc] move distilroberta to more appropriate place

cc @lysandrejik
---
 docs/source/pretrained_models.rst | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/docs/source/pretrained_models.rst b/docs/source/pretrained_models.rst
index a359990f5a..eb7b41ffc9 100644
--- a/docs/source/pretrained_models.rst
+++ b/docs/source/pretrained_models.rst
@@ -3,6 +3,7 @@ Pretrained models
 
 Here is the full list of the currently provided pretrained models together with a short presentation of
 each model.
+For a list that includes community-uploaded models, refer to `https://huggingface.co/models <https://huggingface.co/models>`__.
 
 +-------------------+------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
 | Architecture      | Shortcut name                                              | Details of the model                                                                                                                    |
@@ -154,6 +155,10 @@ Here is the full list of the currently provided pretrained models together with
 |                   |                                                            | | ``roberta-large`` fine-tuned on `MNLI `__.                                                                                            |
 |                   |                                                            | (see `details `__)                                                                                                                      |
 |                   +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
+|                   | ``distilroberta-base``                                     | | 6-layer, 768-hidden, 12-heads, 82M parameters                                                                                         |
+|                   |                                                            | | The DistilRoBERTa model distilled from the RoBERTa model `roberta-base` checkpoint.                                                   |
+|                   |                                                            | (see `details `__)                                                                                                                      |
+|                   +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
 |                   | ``roberta-base-openai-detector``                           | | 12-layer, 768-hidden, 12-heads, 125M parameters                                                                                       |
 |                   |                                                            | | ``roberta-base`` fine-tuned by OpenAI on the outputs of the 1.5B-parameter GPT-2 model.                                               |
 |                   |                                                            | (see `details `__)                                                                                                                      |
@@ -174,10 +179,6 @@ Here is the full list of the currently provided pretrained models together with
 |                   |                                                            | | The DistilGPT2 model distilled from the GPT2 model `gpt2` checkpoint.                                                                 |
 |                   |                                                            | (see `details `__)                                                                                                                      |
 |                   +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
-|                   | ``distilroberta-base``                                     | | 6-layer, 768-hidden, 12-heads, 82M parameters                                                                                         |
-|                   |                                                            | | The DistilRoBERTa model distilled from the RoBERTa model `roberta-base` checkpoint.                                                   |
-|                   |                                                            | (see `details `__)                                                                                                                      |
-|                   +------------------------------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------+
 |                   | ``distilbert-base-german-cased``                           | | 6-layer, 768-hidden, 12-heads, 66M parameters                                                                                         |
 |                   |                                                            | | The German DistilBERT model distilled from the German DBMDZ BERT model `bert-base-german-dbmdz-cased` checkpoint.                     |
 |                   |                                                            | (see `details `__)                                                                                                                      |