BEAVER: An Enterprise Benchmark for Text-to-SQL

BEAVER

BEAVER is an enterprise text-to-SQL dataset consisting of xxx queries and xxx tables across xxx databases. Queries and databases were collected from private organizations, in contrast to previous text-to-SQL datasets, which focus on public tables. We also encourage work on the open-domain text-to-SQL problem.

Usage

We have created a unified MySQL version of our dataset. A free MySQL installation can be found here. After installation, import the MySQL dump files from the Google Drive into your local MySQL databases using

mysql -u root -p < xxx.sql
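
If there are several dump files, the import can be scripted. The following is a minimal sketch, not part of the BEAVER tooling: the helper name `import_dumps`, its `runner` parameter, and the assumption that all dumps sit in one directory with a `.sql` extension are illustrative. It runs the same `mysql -u root -p` command once per file, so the client will prompt for the password each time.

```python
import subprocess
from pathlib import Path


def import_dumps(dump_dir, user="root", runner=subprocess.run):
    """Feed every .sql file in dump_dir to the mysql client, in name order.

    `runner` is a hypothetical hook (defaults to subprocess.run) so the
    helper can be tried without a MySQL installation.
    """
    imported = []
    for dump in sorted(Path(dump_dir).glob("*.sql")):
        with dump.open("rb") as handle:
            # Equivalent to: mysql -u <user> -p < dump.sql
            runner(["mysql", "-u", user, "-p"], stdin=handle, check=True)
        imported.append(dump.name)
    return imported
```

Calling `import_dumps("path/to/dumps")` returns the names of the files it imported; the directory path here is a placeholder.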

To execute a SQL statement, either log in to the MySQL command-line interface or use mysql-connector-python.
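
As a rough illustration of the mysql-connector-python route, the sketch below opens a connection, runs one statement, and returns all rows. The helper name `run_query` and its `connect` parameter are assumptions made for this example, not part of the dataset or the connector; the connection arguments (`host`, `user`, `password`, `database`) are the standard keyword arguments of `mysql.connector.connect`.

```python
def run_query(sql, connect=None, **conn_kwargs):
    """Run one SQL statement and return all result rows.

    `connect` is a hypothetical hook; when omitted, the function uses
    mysql.connector.connect from mysql-connector-python.
    """
    if connect is None:
        # Imported lazily so the helper can be loaded without the package.
        import mysql.connector
        connect = mysql.connector.connect

    conn = connect(**conn_kwargs)
    try:
        cursor = conn.cursor()
        try:
            cursor.execute(sql)
            return cursor.fetchall()
        finally:
            cursor.close()
    finally:
        # Always release the connection, even if the query fails.
        conn.close()
```

For example, `run_query("SHOW TABLES", host="localhost", user="root", password="your-password", database="your_database")` would list the tables of an imported database; the password and database name are placeholders to replace with your own.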

To use the Oracle version of the DW queries, download the free Oracle database and import the CSVs.

Changelog

Citation

If you find our data or the paper helpful, please cite the paper:

@article{chen2024beaver,
  title={BEAVER: an enterprise benchmark for text-to-sql},
  author={Chen, Peter Baile and Wenz, Fabian and Zhang, Yi and Yang, Devin and Choi, Justin and Tatbul, Nesime and Cafarella, Michael and Demiralp, {\c{C}}a{\u{g}}atay and Stonebraker, Michael},
  journal={arXiv preprint arXiv:2409.02038},
  year={2024}
}
