Output: SQL, for example "SELECT * FROM t_user ORDER BY id DESC LIMIT 10"
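To make the example output concrete, here is a minimal sketch that runs the query above against an in-memory SQLite database. Only the table name `t_user` and the query itself come from the example; the columns (`id`, `name`), the sample rows, and the helper `run_generated_sql` are assumptions made up for illustration.

```python
import sqlite3


def run_generated_sql(sql: str) -> list[tuple]:
    """Execute a generated SQL string against a throwaway demo database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: the source only names the table, not its columns.
    conn.execute("CREATE TABLE t_user (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany(
        "INSERT INTO t_user (id, name) VALUES (?, ?)",
        [(i, f"user{i}") for i in range(1, 16)],  # 15 made-up rows
    )
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


rows = run_generated_sql("SELECT * FROM t_user ORDER BY id DESC LIMIT 10")
print(len(rows), rows[0], rows[-1])
```

With the 15 made-up rows, the query returns the 10 rows with the highest `id`, newest first.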
📖 Survey
(2023-International Conference on Very Large Data Bases, VLDB, CCF-A) A survey on deep learning approaches for text-to-SQL [paper]
(2022-IEEE Transactions on Knowledge and Data Engineering, TKDE, CCF-A) A Survey on Text-to-SQL Parsing: Concepts, Methods, and Future Directions [paper]
(2022-International Conference on Computational Linguistics, COLING, CCF-B) Recent Advances in Text-to-SQL: A Survey of What We Have and What We Expect [paper]
(2022-arXiv) Deep Learning Driven Natural Languages Text to SQL Query Conversion: A Survey [paper]
Awesome Text2SQL🎉🎉🎉
English | Chinese (中文版)
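This is a curated collection of tutorials and resources for large language models, Text2SQL, Text2DSL, Text2API, Text2Vis, and more.
🌱 How to Contribute
We warmly welcome contributions of every kind, whether you have spotted a typo or a bug, have a suggestion, or want to share a resource related to LLM+Text2SQL. For detailed guidance on how to contribute, see our CONTRIBUTING.md file.
🔔 Leaderboard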
| WikiSQL | Spider Exact Match (EM) | Spider Exact Execution (EX) | BIRD Valid Efficiency Score (VES) | BIRD Execution Accuracy (EX) |
| --- | --- | --- | --- | --- |
| (2021/05-SeaD+Execution-Guided Decoding) | (2023/11-MiniSeek) | (2023/11-MiniSeek) | (2024/05-ExSL + granite-20b-code) | (2024/05-ExSL + granite-20b-code) |
| (2021/03-SDSQL+Execution-Guided Decoding) | (2022/09-Graphix-3B + PICARD) | (2023/08-DAIL-SQL + GPT-4 + Self-Consistency) | (2024/01-MCS-SQL + GPT-4) | (2024/01-MCS-SQL + GPT-4) |
| (2020/11-IE-SQL+Execution-Guided Decoding) | (2022/09-CatSQL + GraPPa) | (2023/08-DAIL-SQL + GPT-4) | (2024/04-GRA-SQL) | (2024/04-OpenSearch-SQL,v1 + GPT-4) |
| (2020/03-HydraNet+Execution-Guided Decoding) | (2022/09-SHiP + PICARD) | (2023/10-DPG-SQL + GPT-4 + Self-Correction) | (2024/02-PB-SQL) | (2024/02-PB-SQL v1) |
| (2020/12-BRIDGE+Execution-Guided Decoding) | (2022/05-G³R + LGESQL + ELECTRA) | (2023/04-DIN-SQL + GPT-4) | (2024/04-OpenSearch-SQL,v1 + GPT-4) | (2024/02-SENSE 13B) |
| (2019/08-X-SQL+Execution-Guided Decoding) | (2022/08-RESDSQL+T5-1.1-lm100k-xl) | (2023/07-Hindsight Chain of Thought with GPT-4) | (2023/11-MAC-SQL + GPT-4) | (2024/04-GRA-SQL) |
| (2021/03-SDSQL) | (2022/05-T5-SR) | (2023/06-C3 + ChatGPT + Zero-Shot) | (2024/02-DTS-SQL + DeepSeek 7B) | (2024/03-Chat2Query) |
| (2020/12-BRIDGE) | (2022/12-N-best List Rerankers + PICARD) | (2023/07-Hindsight Chain of Thought with GPT-4 and Instructions) | (2023/10-SFT CodeS-15B) | (2023/11-Dubo-SQL-v1) |
| (2021/04-Text2SQLGen + EG) | (2021/09-S²SQL + ELECTRA) | (2023/02-RESDSQL-3B + NatSQL) | (2024/03-Chat2Query) | (2023/10-SFT CodeS-15B) |
| (2020/11-SeqGenSQL+EG) | (2023/02-RESDSQL-3B + NatSQL) | (2022/11-SeaD + PQL) | (2023/10-SFT CodeS-7B) | (2024/02-DTS-SQL + DeepSeek 7B) |
📜 Contents
👋 Introduction
📖 Survey
💬 Classic Models
(2023-arXiv, None) MAC-SQL: A Multi-Agent Collaborative Framework for Text-to-SQL [paper] [code]
(2023-arXiv, None) DBCᴏᴘɪʟᴏᴛ: Scaling Natural Language Querying to Massive Databases [paper] [code]
(2023-arXiv, None) Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation [paper] [code]
(2023-AAAI 2023, CCF-A) RESDSQL: Decoupling Schema Linking and Skeleton Parsing for Text-to-SQL [paper] [code]
(2023-arXiv, None) Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs [paper] [code]
(2023-arXiv, None) DIN-SQL: Decomposed In-Context Learning of Text-to-SQL with Self-Correction [paper] [code]
(2023-arXiv, None) A comprehensive evaluation of ChatGPT’s zero-shot Text-to-SQL capability [paper] [code]
(2023-ICLR, CCF-A) Binding Language Models in Symbolic Languages [paper] [code]
(2023-SIGMOD, CCF-A) Few-shot Text-to-SQL Translation using Structure and Content Prompt Learning [paper] [code]
(2023-ICASSP, CCF-B) T5-SR: A Unified Seq-to-Seq Decoding Strategy for Semantic Parsing [paper]
(2022-ACL, CCF-A) S2SQL: Injecting Syntax to Question-Schema Interaction Graph Encoder for Text-to-SQL Parsers [paper]
(2022-NAACL, CCF-B) SeaD: End-to-end Text-to-SQL Generation with Schema-aware Denoising [paper]
(2022-EMNLP, CCF-B) STAR: SQL Guided Pre-Training for Context-dependent Text-to-SQL Parsing [paper] [code]
(2022-EMNLP, CCF-B) RASAT: Integrating Relational Structures into Pretrained Seq2Seq Model for Text-to-SQL [paper] [code]
(2022-EMNLP, CCF-B) CQR-SQL: Conversational Question Reformulation Enhanced Context-Dependent Text-to-SQL Parsers [paper]
(2022-ACL, CCF-A) HIE-SQL: History Information Enhanced Network for Context-Dependent Text-to-SQL Semantic Parsing [paper]
(2022-arXiv, None) Importance of Synthesizing High-quality Data for Text-to-SQL Parsing [paper]
(2021-ACL, CCF-A) Decoupled Dialogue Modeling and Semantic Parsing for Multi-Turn Text-to-SQL [paper]
(2021-arXiv, None) Pay More Attention to History: A Context Modelling Strategy for Conversational Text-to-SQL [paper] [code]
(2021-ICLR, CCF-A) SCORE: Pre-training for Context Representation in Conversational Semantic Parsing [paper]
(2021-DASFAA, CCF-B) An Interactive NL2SQL Approach with Reuse Strategy [paper]
(2021-NAACL, CCF-B) Structure-Grounded Pretraining for Text-to-SQL [paper]
(2021-EMNLP, CCF-B) PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models [paper] [code]
(2021-ICLR, CCF-A) GraPPa: Grammar-Augmented Pre-Training for Table Semantic Parsing [paper] [code]
(2021-ACL, CCF-A) LGESQL: Line Graph Enhanced Text-to-SQL Model with Mixed Local and Non-Local Relations [paper] [code]
(2020-EMNLP, CCF-B) Bridging Textual and Tabular Data for Cross-Domain Text-to-SQL Semantic Parsing [paper] [code]
(2020-ACL, CCF-A) TaBERT: Pretraining for Joint Understanding of Textual and Tabular Data [paper] [code]
(2020-ACL, CCF-A) RAT-SQL: Relation-Aware Schema Encoding and Linking for Text-to-SQL Parsers [paper] [code]
(2020-EMNLP, CCF-B) Mention Extraction and Linking for SQL Query Generation [paper]
(2020-EMNLP, CCF-B) IGSQL: Database Schema Interaction Graph Based Neural Model for Context-Dependent Text-to-SQL Generation [paper] [code]
(2020-arXiv, None) Hybrid Ranking Network for Text-to-SQL [paper] [code]
(2019-arXiv, None) X-SQL: reinforce schema representation with context [paper]
(2019-EMNLP, CCF-B) Global Reasoning over Database Structures for Text-to-SQL Parsing [paper] [code]
(2019-EMNLP, CCF-B) Editing-Based SQL Query Generation for Cross-Domain Context-Dependent Questions [paper] [code]
(2019-ACL, CCF-A) Representing Schema Structure with Graph Neural Networks for Text-to-SQL Parsing [paper] [code]
(2019-ACL, CCF-A) Towards Complex Text-to-SQL in Cross-Domain Database with Intermediate Representation [paper] [code]
(2018-EMNLP, CCF-B) SyntaxSQLNet: Syntax Tree Networks for Complex and Cross-Domain Text-to-SQL Task [paper] [code]
(2018-NAACL, CCF-B) TypeSQL: Knowledge-based Type-Aware Neural Text-to-SQL Generation [paper] [code]
(2017-arXiv, None) SQLNet: Generating Structured Queries From Natural Language Without Reinforcement Learning [paper] [code]
🔥 Base Models
Llama [paper] [code] [model]
ChatGLM [paper] [code] [model]
Alpaca [paper] [code] [model]
Vicuna [paper] [code] [model]
WizardLM [paper] [code] [model]
Falcon [paper] [code] [model]
ChatGLM2 [paper] [code] [model]
Baichuan-7b [code] [model]
Baichuan-13b [code] [model]
InternLM [paper] [code] [model]
Llama 2 [paper] [code] [model]
Code Llama [paper] [code] [model]
Qwen [paper] [code] [model]
Baichuan 2 [paper] [code] [model]
Phi-1.5 [paper] [model]
Mistral-7B [paper] [code] [model]
Deepseek [paper] [code] [model]
MiniCPM [paper] [code] [model]
Mixtral-8x22B [paper] [code] [model]
Llama 3 [paper] [code] [model]
Qwen-1.5-110B [paper] [code] [model]
Qwen2 [paper] [code] [model]
💡 Fine-tuning
P-Tuning [paper] [code]
LoRA [paper] [code]
P-Tuning V2 [paper] [code]
RLHF [paper] [code]
RRHF [paper] [code]
QLoRA [paper] [code]
RLTF [paper] [code]
RRTF [paper]
RLAIF [paper]
💪 Datasets
WikiSQL [paper] [code] [dataset]
Spider [paper] [code] [dataset]
SParC [paper] [code] [dataset]
CSpider [paper] [code] [dataset]
CoSQL [paper] [code] [dataset]
TableQA [paper] [dataset]
DuSQL [paper] [dataset]
CHASE [paper] [code] [dataset]
BIRD-SQL [paper] [code] [dataset]
KaggleDBQA [paper] [code] [dataset]
🌈 Evaluation Metrics
Execution Accuracy (EX) [paper]
Exact Match (EM) [paper]
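The difference between the two metrics can be sketched as follows. This is a simplified illustration, not the official evaluation script of any benchmark: exact match is approximated here by comparing normalized SQL strings (the official Spider metric compares parsed SQL components instead), and execution accuracy is approximated by comparing result rows as unordered multisets. The table `t_user`, its rows, and the function names are assumptions made up for the example.

```python
import sqlite3
from collections import Counter


def normalize(sql: str) -> str:
    """Naive normalization: trim, drop a trailing ';', lowercase, collapse spaces.
    Note: lowercasing also alters string literals, which a real metric must avoid."""
    return " ".join(sql.strip().rstrip(";").lower().split())


def exact_match(pred: str, gold: str) -> bool:
    """String-level stand-in for Exact Match (EM)."""
    return normalize(pred) == normalize(gold)


def execution_match(pred: str, gold: str, conn: sqlite3.Connection) -> bool:
    """Stand-in for Execution Accuracy (EX): both queries return the same rows.
    Row order is ignored here, a simplification for ORDER BY queries."""
    try:
        pred_rows = conn.execute(pred).fetchall()
    except sqlite3.Error:
        return False  # a prediction that fails to run counts as wrong
    gold_rows = conn.execute(gold).fetchall()
    return Counter(pred_rows) == Counter(gold_rows)


# Hypothetical demo database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t_user (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO t_user VALUES (?, ?)", [(1, "ann"), (2, "bob"), (3, "cy")])

gold = "SELECT name FROM t_user WHERE id > 1"
pred = "SELECT name FROM t_user WHERE NOT id <= 1"

print("EM:", exact_match(pred, gold))
print("EX:", execution_match(pred, gold, conn))
```

The two queries are written differently but return the same rows, so the prediction fails the string-level match while passing the execution check. This is why leaderboards commonly report both numbers.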
📦 Libraries
🔧 Practical Projects
DB-GPT-Hub
sqlcoder
modal_finetune_sql
LLaMA-Efficient-Tuning
🤝 Friendly Links
eosphoros
Awesome-AIGC-Tutorials