Model cards for CS224n SQuAD2.0 models (#3406)

* Model cards for CS224n SQuAD2.0 models
* consistent spacing
2020-03-24 08:28:33 -07:00 · e279a312d6
parent 7372e62b2c
commit e279a312d6
5 changed files with 370 additions and 0 deletions
--- a/model_cards/elgeish/cs224n-squad2.0-albert-base-v2/README.md
+++ b/model_cards/elgeish/cs224n-squad2.0-albert-base-v2/README.md
@@ -0,0 +1,74 @@
 ## CS224n SQuAD2.0 Project Dataset
 The goal of this model is to save CS224n students GPU time when establishing
 baselines to beat for the [Default Final Project](http://web.stanford.edu/class/cs224n/project/default-final-project-handout.pdf).
 The training set used to fine-tune this model is the same as
 the [official one](https://rajpurkar.github.io/SQuAD-explorer/); however,
 evaluation and model selection were performed using roughly half of the official
 dev set, 6078 examples, picked at random. The data files can be found at
 <https://github.com/elgeish/squad/tree/master/data> — this is the Winter 2020
 version. Given that the official SQuAD2.0 dev set contains the project's test
 set, students must make sure not to use the official SQuAD2.0 dev set in any way
 — including the use of models fine-tuned on the official SQuAD2.0, since they
 used the official SQuAD2.0 dev set for model selection.
 ## Results
 ```json
 {
  "exact": 78.94044093451794,
  "f1": 81.7724930324639,
  "total": 6078,
  "HasAns_exact": 76.28865979381443,
  "HasAns_f1": 82.20385314478195,
  "HasAns_total": 2910,
  "NoAns_exact": 81.37626262626263,
  "NoAns_f1": 81.37626262626263,
  "NoAns_total": 3168,
  "best_exact": 78.95689371503784,
  "best_exact_thresh": 0.0,
  "best_f1": 81.78894581298378,
  "best_f1_thresh": 0.0
 }
 ```
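As a sanity check on the numbers above, the overall `exact` and `f1` scores should equal the example-weighted average of the `HasAns_*` and `NoAns_*` scores. The sketch below recomputes them from the values in the Results block; the helper name `weighted_average` is illustrative and not part of any evaluation script.

```python
# Per-subset scores (percentages) and example counts copied from the Results block.
HAS_ANS = {"exact": 76.28865979381443, "f1": 82.20385314478195, "total": 2910}
NO_ANS = {"exact": 81.37626262626263, "f1": 81.37626262626263, "total": 3168}


def weighted_average(metric, subsets):
    """Combine per-subset percentages into one overall percentage,
    weighting each subset by its number of examples."""
    total = sum(subset["total"] for subset in subsets)
    return sum(subset[metric] * subset["total"] for subset in subsets) / total


overall_exact = weighted_average("exact", [HAS_ANS, NO_ANS])
overall_f1 = weighted_average("f1", [HAS_ANS, NO_ANS])

print(round(overall_exact, 4))  # 78.9404, matching "exact" above
print(round(overall_f1, 4))     # 81.7725, matching "f1" above
```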
 ## Notable Arguments
 ```json
 {
  "do_lower_case": true,
  "doc_stride": 128,
  "fp16": false,
  "fp16_opt_level": "O1",
  "gradient_accumulation_steps": 24,
  "learning_rate": 3e-05,
  "max_answer_length": 30,
  "max_grad_norm": 1,
  "max_query_length": 64,
  "max_seq_length": 384,
  "model_name_or_path": "albert-base-v2",
  "model_type": "albert",
  "num_train_epochs": 3,
  "per_gpu_train_batch_size": 8,
  "save_steps": 5000,
  "seed": 42,
  "train_batch_size": 8,
  "version_2_with_negative": true,
  "warmup_steps": 0,
  "weight_decay": 0
 }
 ```
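With gradient accumulation, the optimizer steps once every `gradient_accumulation_steps` batches, so the number of examples per parameter update is larger than `train_batch_size`. A minimal sketch of that arithmetic, assuming `train_batch_size` is the number of examples processed per forward/backward pass across all GPUs (the function name is illustrative):

```python
def effective_batch_size(train_batch_size, gradient_accumulation_steps):
    """Examples contributing to each optimizer step when gradients are
    accumulated over several forward/backward passes."""
    return train_batch_size * gradient_accumulation_steps


# Values copied from the Notable Arguments block above.
args = {"train_batch_size": 8, "gradient_accumulation_steps": 24}

examples_per_update = effective_batch_size(
    args["train_batch_size"], args["gradient_accumulation_steps"]
)
print(examples_per_update)  # 192
```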
 ## Environment Setup
 ```json
 {
  "transformers": "2.5.1",
  "pytorch": "1.4.0=py3.6_cuda10.1.243_cudnn7.6.3_0",
  "python": "3.6.5=hc3d631a_2",
  "os": "Linux 4.15.0-1060-aws #62-Ubuntu SMP Tue Feb 11 21:23:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux",
  "gpu": "Tesla V100-SXM2-16GB"
 }
 ```
 ## Related Models
 * [elgeish/cs224n-squad2.0-albert-large-v2](https://huggingface.co/elgeish/cs224n-squad2.0-albert-large-v2)
 * [elgeish/cs224n-squad2.0-albert-xxlarge-v1](https://huggingface.co/elgeish/cs224n-squad2.0-albert-xxlarge-v1)
 * [elgeish/cs224n-squad2.0-distilbert-base-uncased](https://huggingface.co/elgeish/cs224n-squad2.0-distilbert-base-uncased)
 * [elgeish/cs224n-squad2.0-roberta-base](https://huggingface.co/elgeish/cs224n-squad2.0-roberta-base)
--- a/model_cards/elgeish/cs224n-squad2.0-albert-large-v2/README.md
+++ b/model_cards/elgeish/cs224n-squad2.0-albert-large-v2/README.md
@@ -0,0 +1,74 @@
 ## CS224n SQuAD2.0 Project Dataset
 The goal of this model is to save CS224n students GPU time when establishing
 baselines to beat for the [Default Final Project](http://web.stanford.edu/class/cs224n/project/default-final-project-handout.pdf).
 The training set used to fine-tune this model is the same as
 the [official one](https://rajpurkar.github.io/SQuAD-explorer/); however,
 evaluation and model selection were performed using roughly half of the official
 dev set, 6078 examples, picked at random. The data files can be found at
 <https://github.com/elgeish/squad/tree/master/data> — this is the Winter 2020
 version. Given that the official SQuAD2.0 dev set contains the project's test
 set, students must make sure not to use the official SQuAD2.0 dev set in any way
 — including the use of models fine-tuned on the official SQuAD2.0, since they
 used the official SQuAD2.0 dev set for model selection.
 ## Results
 ```json
 {
  "exact": 79.2694965449161,
  "f1": 82.50844352970152,
  "total": 6078,
  "HasAns_exact": 74.87972508591065,
  "HasAns_f1": 81.64478342732858,
  "HasAns_total": 2910,
  "NoAns_exact": 83.30176767676768,
  "NoAns_f1": 83.30176767676768,
  "NoAns_total": 3168,
  "best_exact": 79.2694965449161,
  "best_exact_thresh": 0.0,
  "best_f1": 82.50844352970155,
  "best_f1_thresh": 0.0
 }
 ```
 ## Notable Arguments
 ```json
 {
  "do_lower_case": true,
  "doc_stride": 128,
  "fp16": false,
  "fp16_opt_level": "O1",
  "gradient_accumulation_steps": 1,
  "learning_rate": 3e-05,
  "max_answer_length": 30,
  "max_grad_norm": 1,
  "max_query_length": 64,
  "max_seq_length": 384,
  "model_name_or_path": "albert-large-v2",
  "model_type": "albert",
  "num_train_epochs": 5,
  "per_gpu_train_batch_size": 8,
  "save_steps": 5000,
  "seed": 42,
  "train_batch_size": 8,
  "version_2_with_negative": true,
  "warmup_steps": 0,
  "weight_decay": 0
 }
 ```
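The `max_seq_length`, `max_query_length`, and `doc_stride` arguments control how a context longer than one input sequence is split into overlapping windows. The sketch below is a simplified illustration of that sliding-window scheme, not the preprocessing code used for this model: it assumes three special tokens per question/context pair and that each new window starts `doc_stride` tokens after the previous one. The function name `window_spans` is hypothetical.

```python
def window_spans(num_context_tokens, query_len,
                 max_seq_length=384, max_query_length=64,
                 doc_stride=128, num_special_tokens=3):
    """Return (start, length) spans over the context tokens.

    Assumption: each input holds the (truncated) query, the special
    tokens, and as many context tokens as still fit.
    """
    query_len = min(query_len, max_query_length)
    capacity = max_seq_length - query_len - num_special_tokens
    if capacity <= 0:
        raise ValueError("no room left for context tokens")

    spans = []
    start = 0
    while True:
        length = min(capacity, num_context_tokens - start)
        spans.append((start, length))
        if start + length >= num_context_tokens:
            break
        # Never advance past the end of the current window.
        start += min(length, doc_stride)
    return spans


# A 700-token context with a 20-token question needs four windows.
spans = window_spans(num_context_tokens=700, query_len=20)
print(spans)  # [(0, 361), (128, 361), (256, 361), (384, 316)]
```

Because consecutive windows overlap, an answer that straddles a window boundary still appears whole in at least one window, provided it is shorter than the overlap.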
 ## Environment Setup
 ```json
 {
  "transformers": "2.5.1",
  "pytorch": "1.4.0=py3.6_cuda10.1.243_cudnn7.6.3_0",
  "python": "3.6.5=hc3d631a_2",
  "os": "Linux 4.15.0-1060-aws #62-Ubuntu SMP Tue Feb 11 21:23:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux",
  "gpu": "Tesla V100-SXM2-16GB"
 }
 ```
 ## Related Models
 * [elgeish/cs224n-squad2.0-albert-base-v2](https://huggingface.co/elgeish/cs224n-squad2.0-albert-base-v2)
 * [elgeish/cs224n-squad2.0-albert-xxlarge-v1](https://huggingface.co/elgeish/cs224n-squad2.0-albert-xxlarge-v1)
 * [elgeish/cs224n-squad2.0-distilbert-base-uncased](https://huggingface.co/elgeish/cs224n-squad2.0-distilbert-base-uncased)
 * [elgeish/cs224n-squad2.0-roberta-base](https://huggingface.co/elgeish/cs224n-squad2.0-roberta-base)
--- a/model_cards/elgeish/cs224n-squad2.0-albert-xxlarge-v1/README.md
+++ b/model_cards/elgeish/cs224n-squad2.0-albert-xxlarge-v1/README.md
@@ -0,0 +1,74 @@
 ## CS224n SQuAD2.0 Project Dataset
 The goal of this model is to save CS224n students GPU time when establishing
 baselines to beat for the [Default Final Project](http://web.stanford.edu/class/cs224n/project/default-final-project-handout.pdf).
 The training set used to fine-tune this model is the same as
 the [official one](https://rajpurkar.github.io/SQuAD-explorer/); however,
 evaluation and model selection were performed using roughly half of the official
 dev set, 6078 examples, picked at random. The data files can be found at
 <https://github.com/elgeish/squad/tree/master/data> — this is the Winter 2020
 version. Given that the official SQuAD2.0 dev set contains the project's test
 set, students must make sure not to use the official SQuAD2.0 dev set in any way
 — including the use of models fine-tuned on the official SQuAD2.0, since they
 used the official SQuAD2.0 dev set for model selection.
 ## Results
 ```json
 {
  "exact": 85.93287265547877,
  "f1": 88.91258331187983,
  "total": 6078,
  "HasAns_exact": 84.36426116838489,
  "HasAns_f1": 90.58786301361013,
  "HasAns_total": 2910,
  "NoAns_exact": 87.37373737373737,
  "NoAns_f1": 87.37373737373737,
  "NoAns_total": 3168,
  "best_exact": 85.93287265547877,
  "best_exact_thresh": 0.0,
  "best_f1": 88.91258331187993,
  "best_f1_thresh": 0.0
 }
 ```
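`NoAns_exact` and `NoAns_f1` are identical in the block above. This is expected for unanswerable questions: when the gold answer is empty, token-level F1 is 1 only if the prediction is also empty and 0 otherwise, which is the same as exact match. The sketch below illustrates this with a simplified scorer that lowercases and splits on whitespace; it omits the punctuation and article stripping that the official SQuAD2.0 evaluation script applies.

```python
from collections import Counter


def simple_exact(gold, pred):
    """Exact match after lowercasing and whitespace tokenization."""
    return float(gold.lower().split() == pred.lower().split())


def simple_f1(gold, pred):
    """Token-level F1 after lowercasing and whitespace tokenization."""
    gold_toks = gold.lower().split()
    pred_toks = pred.lower().split()
    # If either side is empty, the score is all-or-nothing.
    if not gold_toks or not pred_toks:
        return float(gold_toks == pred_toks)
    num_same = sum((Counter(gold_toks) & Counter(pred_toks)).values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_toks)
    recall = num_same / len(gold_toks)
    return 2 * precision * recall / (precision + recall)


# Unanswerable question (empty gold answer): F1 always equals exact match.
for prediction in ["", "in 1066"]:
    assert simple_f1("", prediction) == simple_exact("", prediction)

# Answerable question: partial overlap earns partial F1 but no exact match.
partial_f1 = simple_f1("the Norman conquest", "Norman conquest")
partial_exact = simple_exact("the Norman conquest", "Norman conquest")
```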
 ## Notable Arguments
 ```json
 {
  "do_lower_case": true,
  "doc_stride": 128,
  "fp16": false,
  "fp16_opt_level": "O1",
  "gradient_accumulation_steps": 24,
  "learning_rate": 3e-05,
  "max_answer_length": 30,
  "max_grad_norm": 1,
  "max_query_length": 64,
  "max_seq_length": 512,
  "model_name_or_path": "albert-xxlarge-v1",
  "model_type": "albert",
  "num_train_epochs": 4,
  "per_gpu_train_batch_size": 1,
  "save_steps": 1000,
  "seed": 42,
  "train_batch_size": 1,
  "version_2_with_negative": true,
  "warmup_steps": 814,
  "weight_decay": 0
 }
 ```
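Unlike the other cards in this commit, this one sets a non-zero `warmup_steps`. The sketch below shows how the learning rate would evolve, assuming a linear warmup followed by a linear decay to zero, as in `get_linear_schedule_with_warmup` from `transformers`. The total number of optimizer steps is not stated in this card, so `TOTAL_STEPS` is a hypothetical value chosen only for illustration.

```python
def linear_schedule_multiplier(step, warmup_steps, total_steps):
    """Learning-rate multiplier: rises linearly from 0 to 1 during warmup,
    then falls linearly back to 0 at total_steps."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


LEARNING_RATE = 3e-05  # from the arguments above
WARMUP_STEPS = 814     # from the arguments above
TOTAL_STEPS = 8140     # hypothetical: not given in this card


def lr_at(step):
    """Learning rate at a given optimizer step under the assumed schedule."""
    return LEARNING_RATE * linear_schedule_multiplier(
        step, WARMUP_STEPS, TOTAL_STEPS
    )


for step in (0, 407, 814, TOTAL_STEPS):
    print(step, lr_at(step))
```

Under these assumptions the rate reaches half of its peak at step 407, peaks at `learning_rate` at step 814, and returns to zero at the final step.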
 ## Environment Setup
 ```json
 {
  "transformers": "2.5.1",
  "pytorch": "1.4.0=py3.6_cuda10.1.243_cudnn7.6.3_0",
  "python": "3.6.5=hc3d631a_2",
  "os": "Linux 4.15.0-1060-aws #62-Ubuntu SMP Tue Feb 11 21:23:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux",
  "gpu": "Tesla V100-SXM2-16GB"
 }
 ```
 ## Related Models
 * [elgeish/cs224n-squad2.0-albert-base-v2](https://huggingface.co/elgeish/cs224n-squad2.0-albert-base-v2)
 * [elgeish/cs224n-squad2.0-albert-large-v2](https://huggingface.co/elgeish/cs224n-squad2.0-albert-large-v2)
 * [elgeish/cs224n-squad2.0-distilbert-base-uncased](https://huggingface.co/elgeish/cs224n-squad2.0-distilbert-base-uncased)
 * [elgeish/cs224n-squad2.0-roberta-base](https://huggingface.co/elgeish/cs224n-squad2.0-roberta-base)
--- a/model_cards/elgeish/cs224n-squad2.0-distilbert-base-uncased/README.md
+++ b/model_cards/elgeish/cs224n-squad2.0-distilbert-base-uncased/README.md
@@ -0,0 +1,74 @@
 ## CS224n SQuAD2.0 Project Dataset
 The goal of this model is to save CS224n students GPU time when establishing
 baselines to beat for the [Default Final Project](http://web.stanford.edu/class/cs224n/project/default-final-project-handout.pdf).
 The training set used to fine-tune this model is the same as
 the [official one](https://rajpurkar.github.io/SQuAD-explorer/); however,
 evaluation and model selection were performed using roughly half of the official
 dev set, 6078 examples, picked at random. The data files can be found at
 <https://github.com/elgeish/squad/tree/master/data> — this is the Winter 2020
 version. Given that the official SQuAD2.0 dev set contains the project's test
 set, students must make sure not to use the official SQuAD2.0 dev set in any way
 — including the use of models fine-tuned on the official SQuAD2.0, since they
 used the official SQuAD2.0 dev set for model selection.
 ## Results
 ```json
 {
  "exact": 65.16946363935504,
  "f1": 67.87348075352251,
  "total": 6078,
  "HasAns_exact": 69.51890034364261,
  "HasAns_f1": 75.16667217179045,
  "HasAns_total": 2910,
  "NoAns_exact": 61.17424242424242,
  "NoAns_f1": 61.17424242424242,
  "NoAns_total": 3168,
  "best_exact": 65.16946363935504,
  "best_exact_thresh": 0.0,
  "best_f1": 67.87348075352243,
  "best_f1_thresh": 0.0
 }
 ```
 ## Notable Arguments
 ```json
 {
  "do_lower_case": true,
  "doc_stride": 128,
  "fp16": false,
  "fp16_opt_level": "O1",
  "gradient_accumulation_steps": 24,
  "learning_rate": 3e-05,
  "max_answer_length": 30,
  "max_grad_norm": 1,
  "max_query_length": 64,
  "max_seq_length": 384,
  "model_name_or_path": "distilbert-base-uncased-distilled-squad",
  "model_type": "distilbert",
  "num_train_epochs": 4,
  "per_gpu_train_batch_size": 32,
  "save_steps": 5000,
  "seed": 42,
  "train_batch_size": 32,
  "version_2_with_negative": true,
  "warmup_steps": 0,
  "weight_decay": 0
 }
 ```
 ## Environment Setup
 ```json
 {
  "transformers": "2.5.1",
  "pytorch": "1.4.0=py3.6_cuda10.1.243_cudnn7.6.3_0",
  "python": "3.6.5=hc3d631a_2",
  "os": "Linux 4.15.0-1060-aws #62-Ubuntu SMP Tue Feb 11 21:23:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux",
  "gpu": "Tesla V100-SXM2-16GB"
 }
 ```
 ## Related Models
 * [elgeish/cs224n-squad2.0-albert-base-v2](https://huggingface.co/elgeish/cs224n-squad2.0-albert-base-v2)
 * [elgeish/cs224n-squad2.0-albert-large-v2](https://huggingface.co/elgeish/cs224n-squad2.0-albert-large-v2)
 * [elgeish/cs224n-squad2.0-albert-xxlarge-v1](https://huggingface.co/elgeish/cs224n-squad2.0-albert-xxlarge-v1)
 * [elgeish/cs224n-squad2.0-roberta-base](https://huggingface.co/elgeish/cs224n-squad2.0-roberta-base)
--- a/model_cards/elgeish/cs224n-squad2.0-roberta-base/README.md
+++ b/model_cards/elgeish/cs224n-squad2.0-roberta-base/README.md
@@ -0,0 +1,74 @@
 ## CS224n SQuAD2.0 Project Dataset
 The goal of this model is to save CS224n students GPU time when establishing
 baselines to beat for the [Default Final Project](http://web.stanford.edu/class/cs224n/project/default-final-project-handout.pdf).
 The training set used to fine-tune this model is the same as
 the [official one](https://rajpurkar.github.io/SQuAD-explorer/); however,
 evaluation and model selection were performed using roughly half of the official
 dev set, 6078 examples, picked at random. The data files can be found at
 <https://github.com/elgeish/squad/tree/master/data> — this is the Winter 2020
 version. Given that the official SQuAD2.0 dev set contains the project's test
 set, students must make sure not to use the official SQuAD2.0 dev set in any way
 — including the use of models fine-tuned on the official SQuAD2.0, since they
 used the official SQuAD2.0 dev set for model selection.
 ## Results
 ```json
 {
  "exact": 75.32082922013821,
  "f1": 78.66699523704254,
  "total": 6078,
  "HasAns_exact": 74.84536082474227,
  "HasAns_f1": 81.83436324767868,
  "HasAns_total": 2910,
  "NoAns_exact": 75.75757575757575,
  "NoAns_f1": 75.75757575757575,
  "NoAns_total": 3168,
  "best_exact": 75.32082922013821,
  "best_exact_thresh": 0.0,
  "best_f1": 78.66699523704266,
  "best_f1_thresh": 0.0
 }
 ```
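All five model cards in this commit report results on the same 6078-example dev split, so their scores can be compared directly. The sketch below ranks the models by F1 using the `exact` and `f1` values copied from each card's Results block; the variable names are illustrative.

```python
# (exact, f1) percentages copied from the Results block of each model card.
RESULTS = {
    "cs224n-squad2.0-albert-base-v2": (78.94044093451794, 81.7724930324639),
    "cs224n-squad2.0-albert-large-v2": (79.2694965449161, 82.50844352970152),
    "cs224n-squad2.0-albert-xxlarge-v1": (85.93287265547877, 88.91258331187983),
    "cs224n-squad2.0-distilbert-base-uncased": (65.16946363935504, 67.87348075352251),
    "cs224n-squad2.0-roberta-base": (75.32082922013821, 78.66699523704254),
}

# Sort model names from highest to lowest F1.
ranking = sorted(RESULTS, key=lambda name: RESULTS[name][1], reverse=True)

for name in ranking:
    exact, f1 = RESULTS[name]
    print(f"{name:42s} exact={exact:6.2f}  f1={f1:6.2f}")
```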
 ## Notable Arguments
 ```json
 {
  "do_lower_case": true,
  "doc_stride": 128,
  "fp16": false,
  "fp16_opt_level": "O1",
  "gradient_accumulation_steps": 24,
  "learning_rate": 3e-05,
  "max_answer_length": 30,
  "max_grad_norm": 1,
  "max_query_length": 64,
  "max_seq_length": 384,
  "model_name_or_path": "roberta-base",
  "model_type": "roberta",
  "num_train_epochs": 4,
  "per_gpu_train_batch_size": 16,
  "save_steps": 5000,
  "seed": 42,
  "train_batch_size": 16,
  "version_2_with_negative": true,
  "warmup_steps": 0,
  "weight_decay": 0
 }
 ```
 ## Environment Setup
 ```json
 {
  "transformers": "2.5.1",
  "pytorch": "1.4.0=py3.6_cuda10.1.243_cudnn7.6.3_0",
  "python": "3.6.5=hc3d631a_2",
  "os": "Linux 4.15.0-1060-aws #62-Ubuntu SMP Tue Feb 11 21:23:22 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux",
  "gpu": "Tesla V100-SXM2-16GB"
 }
 ```
 ## Related Models
 * [elgeish/cs224n-squad2.0-albert-base-v2](https://huggingface.co/elgeish/cs224n-squad2.0-albert-base-v2)
 * [elgeish/cs224n-squad2.0-albert-large-v2](https://huggingface.co/elgeish/cs224n-squad2.0-albert-large-v2)
 * [elgeish/cs224n-squad2.0-albert-xxlarge-v1](https://huggingface.co/elgeish/cs224n-squad2.0-albert-xxlarge-v1)
 * [elgeish/cs224n-squad2.0-distilbert-base-uncased](https://huggingface.co/elgeish/cs224n-squad2.0-distilbert-base-uncased)