Generating Tables from the Parametric Knowledge of Language Models

This repository contains WikiTabGen, a benchmark for evaluating the ability of LLMs to generate tables on demand.

The benchmark includes 100 tables curated and processed from the WikiTables Project. The tables feature a diverse set of properties: length, width, amount of numerical data, and popularity.

Leaderboard

This is our current leaderboard. It reports each LLM's ability to generate the correct data in the key columns, in the non-key columns, and overall:

Rank	LLM	Method	Keys F1	Non-Keys F1	Overall F1
1	GPT-4o	Row-by-row	53.5%	13.8%	20.8%
2	Llama3.1-70B	Full-Table	49.9%	13.1%	20.0%
3	GPT-4	Row-by-row	53.7%	12.2%	19.6%
4	Llama3.1-70B	Row-by-row	50.0%	12.2%	19.0%
5	GPT-4	Cell-by-cell	53.7%	11.1%	18.6%
6	GPT-4	Full-Table	43.8%	11.5%	17.5%
7	GPT-4o	Full-Table	40.3%	10.5%	16.3%
8	GPT-3.5	Full-Table	46.4%	9.6%	16.1%
9	GPT-3.5	Cell-by-cell	49.4%	7.6%	14.6%
10	GPT-3.5	Row-by-row	49.4%	7.2%	14.3%
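To give a rough intuition for the F1 columns above, here is a minimal sketch of a set-based F1 score over key values. It assumes exact string matching between generated and reference values, which is an assumption of this sketch; the benchmark's actual matching rules may differ (see the Metrics_calculation notebook). The function name `f1_score` and the toy country lists are hypothetical and not part of the benchmark.

```python
def f1_score(generated, reference):
    """Set-based F1 between generated and reference values (exact match assumed)."""
    gen, ref = set(generated), set(reference)
    if not gen or not ref:
        return 0.0
    true_positives = len(gen & ref)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(gen)  # share of generated values that are correct
    recall = true_positives / len(ref)     # share of reference values that were generated
    return 2 * precision * recall / (precision + recall)


# Toy data for illustration only, not taken from the benchmark.
reference_keys = ["France", "Germany", "Italy", "Spain"]
generated_keys = ["France", "Germany", "Portugal"]
score = f1_score(generated_keys, reference_keys)
print(f"Keys F1: {score:.1%}")
```

In this toy case precision is 2/3 and recall is 2/4, so the F1 score is 4/7.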

Usage

Examples for GPT-3.5 covering all prompting methods (full table, row-by-row, and cell-by-cell) are available in the example_notebooks folder. Before running a notebook, set your openai.api_key in its Imports section. After a successful run, a results folder is created containing a tables subfolder with the generated tables in CSV format and a result.json file with the logs of prompts and LLM responses.
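Once a notebook has finished, the output can be inspected with a few lines of Python. The sketch below assumes the layout described above, with result.json located directly inside the results folder next to the tables subfolder; the helper name `load_results` is hypothetical, and nothing is assumed about the internal structure of result.json beyond it being valid JSON.

```python
import csv
import json
from pathlib import Path


def load_results(results_dir="results"):
    """Load generated tables and prompt/response logs from a results folder.

    Assumed layout (adjust if your run differs):
        <results_dir>/tables/*.csv
        <results_dir>/result.json
    """
    results_path = Path(results_dir)

    # Read every generated table as a list of rows (each row is a list of strings).
    tables = {}
    for csv_file in sorted((results_path / "tables").glob("*.csv")):
        with csv_file.open(newline="", encoding="utf-8") as handle:
            tables[csv_file.stem] = list(csv.reader(handle))

    # Load the logs as-is; their structure is not assumed here.
    with (results_path / "result.json").open(encoding="utf-8") as handle:
        logs = json.load(handle)

    return tables, logs
```

Calling `tables, logs = load_results("results")` returns a dictionary that maps each CSV file name (without extension) to its rows, together with the parsed logs.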

Evaluation

To produce the evaluation metrics for your experiment, run the notebook example_notebooks/Metrics_calculation.ipynb. Before running it, set tables_folder (the path to the CSV files generated by the LLM) and result_folder (the path to the folder where the metrics report should be saved). The notebook calculates the metrics and saves the report in CSV format in result_folder.
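Before running the metrics notebook, it can help to confirm that both paths are set sensibly. The following is a minimal sketch, not part of the repository: the helper `check_metric_paths` is hypothetical, and the example paths in the usage note are assumptions about where your run wrote its output.

```python
from pathlib import Path


def check_metric_paths(tables_folder, result_folder):
    """Verify that tables_folder holds CSV files and make sure result_folder exists."""
    tables_path = Path(tables_folder)
    csv_files = sorted(tables_path.glob("*.csv"))
    if not csv_files:
        raise FileNotFoundError(f"No CSV files found in {tables_path}")

    # Create the report folder if it is missing, so the report can be saved there.
    result_path = Path(result_folder)
    result_path.mkdir(parents=True, exist_ok=True)
    return csv_files, result_path
```

For example, `check_metric_paths("results/tables", "metrics")` returns the list of generated CSV files and the (possibly newly created) report folder, or raises an error if no tables are found.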

More

If you encounter any errors or observe unexpected behavior, please report the issue to us.
