🦫 BEAVER: An Enterprise Benchmark for Text-to-SQL

¹MIT, ²AWS AI Labs, ³Technical University of Munich

BEAVER

BEAVER is an enterprise text-to-SQL dataset consisting of xxx queries and xxx tables across xxx databases. Queries and databases were collected from private organizations, unlike previous text-to-SQL datasets, which focus on public tables. We also encourage work on the open-domain text-to-SQL problem.

Usage

We provide a unified MySQL version of the dataset. A free MySQL installation can be found here. After installing MySQL, import the MySQL dump files from the Google Drive into your local MySQL server using:

mysql -u root -p < xxx.sql

To execute a SQL statement, you can either log in to the MySQL interface or use mysql-connector-python.
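As a rough illustration of the mysql-connector-python route, the sketch below opens a connection and runs one statement. The helper names `connect` and `run_query`, the `root`/`localhost` defaults, and the placeholder database, password, and table names are assumptions for illustration and are not part of the dataset; adjust them to match your local setup.

```python
from typing import Any, Sequence


def connect(database: str, user: str = "root", password: str = "",
            host: str = "localhost"):
    """Open a MySQL connection (requires: pip install mysql-connector-python).

    All argument values are placeholders for your local installation.
    """
    # Imported here so that run_query below can be used without the driver.
    import mysql.connector

    return mysql.connector.connect(
        host=host, user=user, password=password, database=database
    )


def run_query(connection, sql: str, params: Sequence[Any] = ()) -> list:
    """Execute one SQL statement and return all result rows as a list."""
    cursor = connection.cursor()
    try:
        cursor.execute(sql, params)
        return cursor.fetchall()
    finally:
        # Release the cursor even if the statement fails.
        cursor.close()


# Example usage (hypothetical database and table names):
# conn = connect(database="your_database", password="your_password")
# for row in run_query(conn, "SELECT * FROM your_table LIMIT 5"):
#     print(row)
# conn.close()
```

`run_query` relies only on the standard Python DB-API cursor interface, so the same helper should work with other DB-API drivers as well.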


If you want to use the Oracle version of the DW queries, you can download the free Oracle Database and import the CSVs.

Changelog

Citation

If you find our data or the paper helpful, please cite the paper:

@article{chen2024beaver,
  title={BEAVER: an enterprise benchmark for text-to-sql},
  author={Chen, Peter Baile and Wenz, Fabian and Zhang, Yi and Yang, Devin and Choi, Justin and Tatbul, Nesime and Cafarella, Michael and Demiralp, {\c{C}}a{\u{g}}atay and Stonebraker, Michael},
  journal={arXiv preprint arXiv:2409.02038},
  year={2024}
}

Leaderboard

Rank Method Score