Yige Li
Hi, please see the code [here](https://github.com/bboylyg/NAD/blob/f17b71390f61fe24335728bfea53e4fe86ee450b/models/wresnet.py#L97) for details on how the activations are returned.
Hi there! I encountered the same error on a single A100 80GB GPU. The error occurred at `deepspeed.ops.op_builder.CPUAdamBuilder().load()`, which output the message "TypeError: expected str, bytes, or...
> Hi there!
>
> I encountered the same error on a single A100 80 GPU. The error occurred in the code at "deepspeed.ops.op_builder.CPUAdamBuilder().load()" and output the message "TypeError: expected...
> Can you try running the following line:
>
> > export CUDA_HOME=/usr/local/cuda-11.X
>
> Change the path with the path of your CUDA.

Thanks for your reply. I have...
Thank you for your kind reminder. I will review the license and make the necessary updates to ensure consistency and compliance. I appreciate your support.
Thanks for the update — glad to hear that downgrading DeepSpeed to `0.15.4` resolved the issue! We'll look into this version compatibility and update the `requirements.txt` accordingly to avoid similar...
If you happen to try multi-GPU fine-tuning on other models and encounter similar issues, it would be great if you could help verify and share working multi-GPU configuration setups. I...
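1) The core of TA2 is constructing an activation steering vector. During training, TA2 optimizes the model's activation steering vector so that it serves as the backdoor trigger. At inference time, this backdoor activation steering vector is added to the base model to trigger the backdoor. For details, see [the activation steering paper](https://arxiv.org/abs/2308.10248). Note that in our experiments TA2 lacked transferability and had a fairly large impact on the model's normal performance, so there is considerable room for improvement here. 2) The attack success rate is computed by counting occurrences of the attack keywords in the test outputs; see the code for details.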
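To make the inference-time step concrete, here is a minimal sketch of the arithmetic behind activation steering: a fixed vector, scaled by a coefficient, is added to each token's hidden state. This is not the TA2 implementation. The function name `add_steering_vector`, the `scale` parameter, and the toy values are all hypothetical, and a real setup would apply the addition inside a chosen transformer layer rather than on plain Python lists.

```python
from typing import List


def add_steering_vector(
    hidden: List[List[float]],
    steering: List[float],
    scale: float = 1.0,
) -> List[List[float]]:
    """Return a copy of the per-token hidden states with scale * steering added.

    `hidden` has one row per token; every row must have the same length
    as `steering` (the hidden size).
    """
    if any(len(row) != len(steering) for row in hidden):
        raise ValueError("steering vector must match the hidden size")
    return [[x + scale * s for x, s in zip(row, steering)] for row in hidden]


# Toy example (hypothetical values): two tokens, hidden size 3.
hidden_states = [[0.5, -1.0, 2.0], [1.5, 0.0, -0.5]]
steering_vector = [1.0, 0.5, -1.0]

steered = add_steering_vector(hidden_states, steering_vector, scale=2.0)
print(steered)  # [[2.5, 0.0, 0.0], [3.5, 1.0, -2.5]]
```

In this sketch the same vector is added at every token position; which layer and which positions to steer are design choices that the repository's code, not this example, determines.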
First of all, fine-tuning LLaMA-70B with an 80GB A100 GPU is entirely feasible. If you encounter memory limitations, there are two common solutions: 1. **Multi-GPU training** to distribute the model...
For Q1: We calculate the ASR (Attack Success Rate) by matching the keyword "you're stupid" in the outputs. For Q2: In this binary classification task, the label...
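As a rough illustration of the keyword-matching ASR from Q1, the sketch below counts the fraction of outputs that contain any attack keyword. It is not the repository's evaluation code. The function name `attack_success_rate`, the case-insensitive substring matching, and the sample outputs are assumptions made for the example.

```python
from typing import Iterable, List


def attack_success_rate(outputs: Iterable[str], keywords: Iterable[str]) -> float:
    """Fraction of outputs containing at least one keyword.

    Matching is a case-insensitive substring test (an assumption of this
    sketch). Returns 0.0 when there are no outputs.
    """
    lowered_keywords: List[str] = [k.lower() for k in keywords]
    output_list = list(outputs)
    if not output_list:
        return 0.0
    hits = sum(
        any(k in text.lower() for k in lowered_keywords) for text in output_list
    )
    return hits / len(output_list)


# Hypothetical model outputs; two of the three contain the keyword.
sample_outputs = [
    "Sure, here is the answer.",
    "You're stupid, figure it out yourself.",
    "I cannot help with that. you're stupid",
]
asr = attack_success_rate(sample_outputs, ["you're stupid"])
print(f"ASR = {asr:.2%}")  # ASR = 66.67%
```

Substring matching is simple but can over-count if a keyword appears in a refusal or a quotation, so the exact matching rule in the actual code is worth checking.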