Beaver is an enterprise text-to-SQL dataset consisting of xxx queries and xxx tables across xxx databases. Queries and databases were collected from private organizations, whereas previous text-to-SQL datasets focus on public tables. We also encourage work on the open-domain text-to-SQL problem.
We provide a unified MySQL version of our dataset. A free MySQL installation can be found here. After installation, import the MySQL dump files from Google Drive into your local MySQL server using
mysql -u root -p < xxx.sql
To execute a SQL statement, either log in to the MySQL interface or use mysql-connector-python.
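As a minimal sketch of the mysql-connector-python route, the helper below opens a connection, runs one statement, and returns the rows. The function name `run_query`, its parameters, and the `localhost` default are illustrative assumptions and not part of the dataset; the `root` user mirrors the import command above. The optional `connect` argument exists only so the helper can be exercised without a running server.

```python
from typing import Any, Callable, List, Optional


def run_query(
    sql: str,
    database: str,
    user: str = "root",
    password: str = "",
    host: str = "localhost",
    connect: Optional[Callable[..., Any]] = None,
) -> List[tuple]:
    """Run one row-returning SQL statement and return all rows.

    Hypothetical helper: intended for statements such as SELECT or
    SHOW TABLES, since fetchall() needs a result set to read from.
    """
    if connect is None:
        # Imported lazily so this module loads even if the driver
        # (pip install mysql-connector-python) is not installed yet.
        import mysql.connector

        connect = mysql.connector.connect

    conn = connect(host=host, user=user, password=password, database=database)
    try:
        cursor = conn.cursor()
        try:
            cursor.execute(sql)
            return cursor.fetchall()
        finally:
            cursor.close()
    finally:
        # Always release the connection, even if the query fails.
        conn.close()
```

A call might look like `run_query("SHOW TABLES", database="your_database", password="your_password")`, where both values are placeholders for whatever you chose when importing the dump files.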
If you want to use the Oracle version of the DW queries, you can download the free Oracle database and import the CSVs.
If you find our data or the paper helpful, please cite the paper:
@article{chen2024beaver,
title={BEAVER: an enterprise benchmark for text-to-sql},
author={Chen, Peter Baile and Wenz, Fabian and Zhang, Yi and Yang, Devin and Choi, Justin and Tatbul, Nesime and Cafarella, Michael and Demiralp, {\c{C}}a{\u{g}}atay and Stonebraker, Michael},
journal={arXiv preprint arXiv:2409.02038},
year={2024}
}
| Rank | Method | Score |
|---|---|---|