"What do you want from theory alone?" Experimenting with Tight Auditing of Differentially Private Synthetic Data Generation
This repository contains the source code for the paper "What do you want from theory alone?" Experimenting with Tight Auditing of Differentially Private Synthetic Data Generation by M.S.M.S. Annamalai, G. Ganev, E. De Cristofaro, to appear at USENIX Security 2024.
Dependencies are managed by conda.
- The required dependencies can be installed using the command `conda env create -f env.yml`; then run `conda activate synth-audit`.
- Additionally, install the (modified) versions of the synthetic data generation libraries by running `libs/install.sh`.
For simplicity, we have published a docker image at `msundarmsa/synth-audit:1.0` with the dependencies pre-installed.
- Pull and run the image using the command `docker run -it msundarmsa/synth-audit:1.0 /bin/bash`.
- Then change into the folder with `cd ~/synth-audit/audit` and activate the environment using `conda activate synth-audit`.
All commands are run from inside the audit folder.
Generative models can be fit to the raw datasets, and synthetic data generated from them, by running `python3 prep_synths.py`.
E.g., `python3 prep_synths.py --data_name adult --neighbour edit --target_idx 61 --n_synth 1000 --n_reps 100 --model DPartPB --epsilon 10.0 --n_procs 32 --out_dir exp_data/test/` prepares appropriate
The various attacks can be run using `python3 run_attack.py`.
E.g., `python3 run_attack.py --data_name adult --neighbour edit --target_idx 61 --n_shadow 60 --n_valid 20 --n_test 20 --model DPartPB --epsilon 10.0 --out_dir exp_data/test/ --attack_type bb_querybased` runs the Black-box (Querybased) attack on the generated synthetic datasets.
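The two steps above can be chained for more than one privacy budget. The following is a minimal sketch, not part of the repository: the helper names (`to_args`, `prep_cmd`, `attack_cmd`, `run_pipeline`) are hypothetical, the flag values are copied from the example commands above, and the epsilon value 1.0 is only an illustrative choice. It also assumes the scripts accept their flags in any order. By default it only prints the commands (a dry run) rather than executing them.

```python
import shlex
import subprocess

# Settings shared by both steps, copied from the example commands above.
COMMON = {
    "data_name": "adult",
    "neighbour": "edit",
    "target_idx": 61,
    "model": "DPartPB",
    "out_dir": "exp_data/test/",
}


def to_args(options):
    """Turn {"name": value} into ["--name", "value", ...]."""
    args = []
    for name, value in options.items():
        args += [f"--{name}", str(value)]
    return args


def prep_cmd(epsilon):
    """Build the prep_synths.py command for one epsilon."""
    options = {**COMMON, "epsilon": epsilon,
               "n_synth": 1000, "n_reps": 100, "n_procs": 32}
    return ["python3", "prep_synths.py"] + to_args(options)


def attack_cmd(epsilon, attack_type="bb_querybased"):
    """Build the run_attack.py command for one epsilon."""
    options = {**COMMON, "epsilon": epsilon,
               "n_shadow": 60, "n_valid": 20, "n_test": 20,
               "attack_type": attack_type}
    return ["python3", "run_attack.py"] + to_args(options)


def run_pipeline(epsilons, dry_run=True):
    """Print (and optionally execute) prep then attack for each epsilon."""
    commands = []
    for eps in epsilons:
        commands += [prep_cmd(eps), attack_cmd(eps)]
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            # check=True stops the sweep at the first failing command.
            subprocess.run(cmd, check=True)
    return commands


if __name__ == "__main__":
    # 1.0 is an illustrative value; 10.0 matches the examples above.
    run_pipeline([1.0, 10.0])
```

Call `run_pipeline([...], dry_run=False)` from inside the audit folder to actually execute the commands.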
We provide the exact scripts we used to run our experiments under the `scripts/` folder; these should expose further options that you can experiment with.
Results can then be generated using the `analyze_results.ipynb` notebook.
Lastly, results can be plotted using the `plot_results.ipynb` notebook.
For simplicity, we have renamed the models within the python library as follows:
| Model | Description |
|---|---|
| DPartPB | PrivBayes (Hazy) |
| DSynthPB | PrivBayes (DataSynthesizer) |
| NIST_MST | MST (NIST) |
| MST | MST (Smartnoise) |
| DPWGAN | DPWGAN (NIST) |
| DPWGANCity | DPWGAN (SynthCity) |
| DSynthPB_v014 | PrivBayes (DataSynthesizer v0.1.4) |
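When labelling results, it can help to translate the internal model names back to the implementations they refer to. A minimal sketch, assuming only the table above; `MODEL_DESCRIPTIONS` and `describe_model` are hypothetical names and are not part of the repository.

```python
# Internal model name -> implementation, copied from the table above.
MODEL_DESCRIPTIONS = {
    "DPartPB": "PrivBayes (Hazy)",
    "DSynthPB": "PrivBayes (DataSynthesizer)",
    "NIST_MST": "MST (NIST)",
    "MST": "MST (Smartnoise)",
    "DPWGAN": "DPWGAN (NIST)",
    "DPWGANCity": "DPWGAN (SynthCity)",
    "DSynthPB_v014": "PrivBayes (DataSynthesizer v0.1.4)",
}


def describe_model(name):
    """Return the human-readable description, or the name itself if unknown."""
    return MODEL_DESCRIPTIONS.get(name, name)


if __name__ == "__main__":
    for model in MODEL_DESCRIPTIONS:
        print(f"{model}: {describe_model(model)}")
```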