docs: update spider leaderboard top1-miniseek-ex91.6

junewgl 2023-11-10 15:26:18 +08:00
parent 16e0e679be
commit d2106fee85
1 changed file with 15 additions and 12 deletions

@@ -14,18 +14,21 @@ Curated tutorials and resources for Large Language Models, Text2SQL, and more.
We warmly welcome contributions from everyone, whether you've found a typo, a bug, have a suggestion, or want to share a resource related to LLM+Text2SQL. For detailed guidelines on how to contribute, please see our [CONTRIBUTING.md](CONTRIBUTING.md) file.
## 🔔 Leaderboard
| | [WikiSQL](https://github.com/salesforce/WikiSQL#leaderboard) | [Spider](https://yale-lily.github.io/spider)<br/>Exact Match(EM) | [Spider](https://yale-lily.github.io/spider)<br/>Exact Execution(EX) | [BIRD](https://bird-bench.github.io/)<br/>Valid Efficiency Score (VES) | [BIRD](https://bird-bench.github.io/)<br/>Execution Accuracy (EX) |
|:---:|:-----------------------------------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------:|:---------------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------:|:----------------------------------------------------------------------------:|
| 🏆1 | **93.0** <br/>(2021/05-[SeaD+Execution-Guided Decoding](https://arxiv.org/pdf/2105.07911.pdf)) | **74.0** <br/>(2022/09-[Graphix-3B + PICARD](https://arxiv.org/pdf/2301.07507.pdf)) | **86.6** <br/>(2023/08-[DAIL-SQL + GPT-4 + Self-Consistency](https://arxiv.org/pdf/2308.15363.pdf)) | **64.22** <br/>(2023/10-SFT CodeS-15B) | **60.37** <br/>(2023/10-SFT CodeS-15B) |
| 🥈2 | 92.7 <br/>(2021/03-[SDSQL+Execution-Guided Decoding](https://arxiv.org/pdf/2103.04399.pdf)) | 73.9 <br/>(2022/09-CatSQL + GraPPa) | 86.2 <br/>(2023/08-[DAIL-SQL + GPT-4](https://arxiv.org/pdf/2308.15363.pdf)) | 63.62 <br/>(2023/10-SFT CodeS-7B) | 59.25 <br/>(2023/10-SFT CodeS-7B) |
| 🥉3 | 92.5 <br/>(2020/11-[IE-SQL+Execution-Guided Decoding](https://aclanthology.org/2020.emnlp-main.563.pdf)) | 73.1 <br/>(2022/09-[SHiP + PICARD](https://arxiv.org/pdf/2212.08785.pdf)) | 85.3 <br/>(2023/04-[DIN-SQL + GPT-4](https://arxiv.org/pdf/2304.11015.pdf)) | 60.77 <br/>(2023/07-GPT-4) | 55.90 <br/>(2023/08-[DIN-SQL + GPT-4](https://arxiv.org/pdf/2304.11015.pdf)) |
| 4 | 92.2 <br/>(2020/03-[HydraNet+Execution-Guided Decoding](https://arxiv.org/pdf/2008.04759.pdf)) | 72.9 <br/>(2022/05-[G³R + LGESQL + ELECTRA](https://aclanthology.org/2023.findings-acl.23.pdf)) | 83.9 <br/>(2023/07-Hindsight Chain of Thought with GPT-4) | 59.44 <br/>(2023/08-[DIN-SQL + GPT-4](https://arxiv.org/pdf/2304.11015.pdf)) | 54.89 <br/>(2023/07-GPT-4) |
| 5 | 91.9 <br/>(2020/12-[BRIDGE+Execution-Guided Decoding](https://arxiv.org/pdf/2012.12627.pdf)) | 72.4 <br/>(2022/08-RESDSQL+T5-1.1-lm100k-xl) | 82.3 <br/>(2023/06-[C3 + ChatGPT + Zero-Shot](https://arxiv.org/pdf/2307.07306.pdf)) | 56.99 <br/>(2023/10-SFT CodeS-15B) | 52.15 <br/>(2023/10-SFT CodeS-15B) |
| 6 | 91.8 <br/>(2019/08-[X-SQL+Execution-Guided Decoding](https://arxiv.org/pdf/1908.08113.pdf)) | 72.4 <br/>(2022/05-T5-SR) | 80.8 <br/>(2023/07-Hindsight Chain of Thought with GPT-4 and Instructions) | 56.56 <br/>(2023/03-[ChatGPT + CoT](https://arxiv.org/pdf/2305.03111.pdf)) | 50.25 <br/>(2023/10-SFT CodeS-7B) |
| 7 | 91.4 <br/>(2021/03-[SDSQL](https://arxiv.org/pdf/2103.04399.pdf)) | 72.2 <br/>(2022/12-[N-best List Rerankers + PICARD](https://arxiv.org/pdf/2210.10668.pdf)) | 79.9 <br/>(2023/02-[RESDSQL-3B + NatSQL](https://arxiv.org/pdf/2302.05965.pdf)) | 54.84 <br/>(2023/10-SFT CodeS-7B) | 49.02 <br/>(2023/07-Claude-2) |
| 8 | 91.1 <br/>(2020/12-[BRIDGE](https://arxiv.org/pdf/2012.12627.pdf)) | 72.1 <br/>(2021/09-[S²SQL + ELECTRA ](https://arxiv.org/pdf/2203.06958.pdf)) | 78.5 <br/>(2022/11-SeaD + PQL) | 51.40 <br/>(2023/03-ChatGPT) | 40.08 <br/>(2023/03-[ChatGPT + CoT](https://arxiv.org/pdf/2305.03111.pdf)) |
| 9 | 91.0 <br/>(2021/04-[Text2SQLGen + EG](https://www.semanticscholar.org/reader/b877233410484b2ff2add278105c53b6633d9d20)) | 72.0 <br/>(2023/02-[RESDSQL-3B + NatSQL](https://arxiv.org/pdf/2302.05965.pdf)) | 78.2 <br/>(2023/04-[DIN-SQL + CodeX](https://arxiv.org/pdf/2304.11015.pdf)) | 49.69 <br/>(2023/03-[ChatGPT + CoT](https://arxiv.org/pdf/2305.03111.pdf)) | 39.30 <br/>(2023/03-ChatGPT) |
| 10 | 90.5 <br/>(2020/11-[SeqGenSQL+EG](https://arxiv.org/pdf/2011.03836.pdf)) | 72.0 <br/>(2021/06-[LGESQL + ELECTRA ](https://arxiv.org/pdf/2106.01093.pdf)) | 78.0 <br/>(2023/08-[T5-3B+NatSQL+Token Preprocessing](https://arxiv.org/pdf/2305.17378.pdf)) | 41.60 <br/>(2023/02-Codex) | 36.47 <br/>(2023/02-Codex) |
| | [WikiSQL](https://github.com/salesforce/WikiSQL#leaderboard) | [Spider](https://yale-lily.github.io/spider)<br/>Exact Match(EM) | [Spider](https://yale-lily.github.io/spider)<br/>Exact Execution(EX) | [BIRD](https://bird-bench.github.io/)<br/>Valid Efficiency Score (VES) | [BIRD](https://bird-bench.github.io/)<br/>Execution Accuracy (EX) |
|:----:|:-----------------------------------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------:|:----------------------------------------------------------------------------:|:-----------------------------------------------------------------------------------------------------------------------:|
| 🏆1 | **93.0** <br/>(2021/05-[SeaD+Execution-Guided Decoding](https://arxiv.org/pdf/2105.07911.pdf)) | **81.5** <br/>(2023/11-MiniSeek) | **91.2** <br/>(2023/11-MiniSeek) | **64.22** <br/>(2023/10-SFT CodeS-15B) | **60.37** <br/>(2023/10-SFT CodeS-15B) |
| 🥈2 | 92.7 <br/>(2021/03-[SDSQL+Execution-Guided Decoding](https://arxiv.org/pdf/2103.04399.pdf)) | 74.0 <br/>(2022/09-[Graphix-3B + PICARD](https://arxiv.org/pdf/2301.07507.pdf)) | 86.6 <br/>(2023/08-[DAIL-SQL + GPT-4 + Self-Consistency](https://arxiv.org/pdf/2308.15363.pdf)) | 63.62 <br/>(2023/10-SFT CodeS-7B) | 59.25 <br/>(2023/10-SFT CodeS-7B) |
| 🥉3 | 92.5 <br/>(2020/11-[IE-SQL+Execution-Guided Decoding](https://aclanthology.org/2020.emnlp-main.563.pdf)) | 73.9 <br/>(2022/09-CatSQL + GraPPa) | 86.2 <br/>(2023/08-[DAIL-SQL + GPT-4](https://arxiv.org/pdf/2308.15363.pdf)) | 60.77 <br/>(2023/07-GPT-4) | 55.90 <br/>(2023/08-[DIN-SQL + GPT-4](https://arxiv.org/pdf/2304.11015.pdf)) |
| 4 | 92.2 <br/>(2020/03-[HydraNet+Execution-Guided Decoding](https://arxiv.org/pdf/2008.04759.pdf)) | 73.1 <br/>(2022/09-[SHiP + PICARD](https://arxiv.org/pdf/2212.08785.pdf)) | 85.6 <br/>(2023/10-DPG-SQL + GPT-4 + Self-Correction) | 59.44 <br/>(2023/08-[DIN-SQL + GPT-4](https://arxiv.org/pdf/2304.11015.pdf)) | 54.89 <br/>(2023/07-GPT-4) |
| 5 | 91.9 <br/>(2020/12-[BRIDGE+Execution-Guided Decoding](https://arxiv.org/pdf/2012.12627.pdf)) | 72.9 <br/>(2022/05-[G³R + LGESQL + ELECTRA](https://aclanthology.org/2023.findings-acl.23.pdf)) | 85.3 <br/>(2023/04-[DIN-SQL + GPT-4](https://arxiv.org/pdf/2304.11015.pdf)) | 56.99 <br/>(2023/10-SFT CodeS-15B) | 52.15 <br/>(2023/10-SFT CodeS-15B) |
| 6 | 91.8 <br/>(2019/08-[X-SQL+Execution-Guided Decoding](https://arxiv.org/pdf/1908.08113.pdf)) | 72.4 <br/>(2022/08-RESDSQL+T5-1.1-lm100k-xl) | 83.9 <br/>(2023/07-Hindsight Chain of Thought with GPT-4) | 56.56 <br/>(2023/03-[ChatGPT + CoT](https://arxiv.org/pdf/2305.03111.pdf)) | 50.25 <br/>(2023/10-SFT CodeS-7B) |
| 7 | 91.4 <br/>(2021/03-[SDSQL](https://arxiv.org/pdf/2103.04399.pdf)) | 72.4 <br/>(2022/05-T5-SR) | 82.3 <br/>(2023/06-[C3 + ChatGPT + Zero-Shot](https://arxiv.org/pdf/2307.07306.pdf)) | 54.84 <br/>(2023/10-SFT CodeS-7B) | 49.02 <br/>(2023/07-Claude-2) |
| 8 | 91.1 <br/>(2020/12-[BRIDGE](https://arxiv.org/pdf/2012.12627.pdf)) | 72.2 <br/>(2022/12-[N-best List Rerankers + PICARD](https://arxiv.org/pdf/2210.10668.pdf)) | 80.8 <br/>(2023/07-Hindsight Chain of Thought with GPT-4 and Instructions) | 51.40 <br/>(2023/03-ChatGPT) | 40.08 <br/>(2023/03-[ChatGPT + CoT](https://arxiv.org/pdf/2305.03111.pdf)) |
| 9 | 91.0 <br/>(2021/04-[Text2SQLGen + EG](https://www.semanticscholar.org/reader/b877233410484b2ff2add278105c53b6633d9d20)) | 72.1 <br/>(2021/09-[S²SQL + ELECTRA ](https://arxiv.org/pdf/2203.06958.pdf)) | 79.9 <br/>(2023/02-[RESDSQL-3B + NatSQL](https://arxiv.org/pdf/2302.05965.pdf)) | 49.69 <br/>(2023/03-[ChatGPT + CoT](https://arxiv.org/pdf/2305.03111.pdf)) | 39.30 <br/>(2023/03-ChatGPT) |
| 10 | 90.5 <br/>(2020/11-[SeqGenSQL+EG](https://arxiv.org/pdf/2011.03836.pdf)) | 72.0 <br/>(2023/02-[RESDSQL-3B + NatSQL](https://arxiv.org/pdf/2302.05965.pdf)) | 78.5 <br/>(2022/11-SeaD + PQL) | 41.60 <br/>(2023/02-Codex) | 36.47 <br/>(2023/02-Codex) |
## 📜 Contents
- [**Awesome Text2SQL**🎉🎉🎉](#awesome-text2sql)