data

We found that some of the malicious text documents in our dataset (obtained by attacking LLMs with malicious queries; average length 1,520 tokens per document) contain content that antivirus software may block or flag as a virus (see below). We therefore provide the datasets only upon request: send an email from an institutional account to the first author, Yu Fu (yfu093@ucr.edu). The email must include the request and a declaration that the data will be used for research purposes.

Loading the data:

import json

# path_to_file: path to one of the dataset JSON files
with open(path_to_file, "r", encoding="utf-8") as f:
    data = json.load(f)
print(data.keys())

outputs: our safety-sensitive input documents (llm-attack outputs).
models: the backbone model used by llm-attack.
goals: the original harmful prompt given to llm-attack.
questions: closed-domain questions generated by a QG model (used in the closed-domain QA task).
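
The four fields above can be read together once a file is loaded. The following is a minimal sketch, not the authors' loader: it assumes (the README does not confirm this) that each field holds a list and that the four lists are aligned by index, so that entry i of each field describes the same document. The helper name `load_records` and the demo file `demo-qa.json` are hypothetical, and the demo file contains placeholder strings only, not real dataset content.

```python
import json
import os
import tempfile

FIELDS = ("outputs", "models", "goals", "questions")


def load_records(path):
    """Load a dataset file and return one dict per document.

    Assumption (not confirmed by the README): every field is a list and
    all four lists are aligned by index.
    """
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)

    # Fail early if the file does not have the documented fields.
    missing = [k for k in FIELDS if k not in data]
    if missing:
        raise KeyError(f"missing fields: {missing}")

    # Fail early if the index-alignment assumption does not hold.
    lengths = {k: len(data[k]) for k in FIELDS}
    if len(set(lengths.values())) != 1:
        raise ValueError(f"fields are not aligned: {lengths}")

    rows = zip(*(data[k] for k in FIELDS))
    return [dict(zip(FIELDS, row)) for row in rows]


# Tiny synthetic stand-in with placeholder strings (hypothetical structure).
demo = {
    "outputs": ["<document 0>", "<document 1>"],
    "models": ["<model A>", "<model B>"],
    "goals": ["<prompt 0>", "<prompt 1>"],
    "questions": ["<question 0>", "<question 1>"],
}

with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo-qa.json")
    with open(demo_path, "w", encoding="utf-8") as f:
        json.dump(demo, f)
    records = load_records(demo_path)

print(len(records), records[0]["models"])
```

If the real files are structured differently (for example, a list of per-document objects), the length check raises an error instead of silently pairing unrelated entries, which is why the sketch validates before zipping.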

Files in this directory:
README.md
beaver_first1000-qa.json
beaver_last1000-qa.json
beaver_random1000-qa.json
blocked.png
