Yige Li comments

Results 10 comments of


                                            Yige Li

What config did you use to have the model return the activations?

Hi, please see the code [here ](https://github.com/bboylyg/NAD/blob/f17b71390f61fe24335728bfea53e4fe86ee450b/models/wresnet.py#L97) to get details about returning activations.

[BUG] Multi-Node training "os.PathLike object" and "ds_opt_adam" error.

Hi there! I encountered the same error on a single A100 80 GPU. The error occurred in the code at "deepspeed.ops.op_builder.CPUAdamBuilder().load()" and output the message "TypeError: expected str, bytes, or...

[BUG] Multi-Node training "os.PathLike object" and "ds_opt_adam" error.

> Hi there! > > I encountered the same error on a single A100 80 GPU. The error occurred in the code at "deepspeed.ops.op_builder.CPUAdamBuilder().load()" and output the message "TypeError: expected...

[BUG] Multi-Node training "os.PathLike object" and "ds_opt_adam" error.

> Can you try running the following line: > > > export CUDA_HOME=/usr/local/cuda-11.X > > Change the path with the path of your CUDA. Thanks for your reply. I have...

Potential Inconsistencies Between Repo and Model License

Thank you for your kind reminder. I will review the license and make the necessary updates to ensure consistency and compliance. I appreciate your support.

Multiple GPU usage

Thanks for the update — glad to hear that downgrading DeepSpeed to `0.15.4` resolved the issue! We'll look into this version compatibility and update the `requirements.txt` accordingly to avoid similar...

Multiple GPU usage

If you happen to try multi-GPU fine-tuning on other models and encounter similar issues, it would be great if you could help verify and share working multi-GPU configuration setups. I...

您好！请问一下，针对于评估隐藏状态攻击TA2，是如何设置触发器的，如何评估有无触发器下的攻击成功率的？

1） TA2的核心是构建activation steering向量。在训练阶段，TA2通过优化模型activation steering向量作为后门触发器。在推理阶段，在基座模型中添加这个后门activation steering向量来触发后门。具体细节可参考(activation steering的论文)[https://arxiv.org/abs/2308.10248]。需要注意，我们实验发现TA2缺乏迁移性，并且对模型的正常perfomance影响比较大。此处具有较大改进空间。 2）攻击成功率通过测试输出中攻击关键词的key-words统计，具体见代码。

About Adding Backdoor on Larger Models such as Llama-2-70b-chat

First of all, fine-tuning LLaMA-70B with an 80GB A100 GPU is entirely feasible. If you encounter memory limitations, there are two common solutions: 1. **Multi-GPU training** to distribute the model...

Two questions about the data in DPA

For Q1: We calculate the ASR (Attack Success Rate) based on keyword - `` you're stupid" - matching in the outputs. For Q2: In this binary classification task, the label...