The data used to train an AI model is vital to understanding its capabilities and risks. But how can we tell whether a model W actually resulted from a dataset D?
In a new paper, we show how to verify models' training data, including the data of open-source LMs!






