NaN errors with fp16 training on Anima. #2293

@sashasubbbb

Description

On the SD3 branch I'm getting NaN errors immediately when training an Anima LoRA in fp16. Unfortunately, my GPU doesn't support bf16. Is fp16 training for Anima not supported?
The command I'm running is:

```
accelerate launch --num_cpu_threads_per_process 1 anima_train_network.py \
  --pretrained_model_name_or_path="B:/AIimages/ComfyUI_windows_portable/ComfyUI/models/diffusion_models/anima-preview2.safetensors" \
  --qwen3="B:/AIimages/ComfyUI_windows_portable/ComfyUI/models/text_encoders/qwen_3_06b_base.safetensors" \
  --vae="B:/AIimages/ComfyUI_windows_portable/ComfyUI/models/vae/qwen_image_vae.safetensors" \
  --dataset_config="B:\AIimages\stable-diffusion-webui\models\Lora\lora\animamine\dataset.toml" \
  --output_dir="B:/AIimages/stable-diffusion-webui/models/Lora/lora/animamine/" \
  --output_name="my_anima_lora" --save_model_as=safetensors \
  --network_module=networks.lora_anima --network_dim=8 \
  --learning_rate=1e-4 --optimizer_type="AdamW8bit" --lr_scheduler="constant" \
  --timestep_sampling="sigmoid" --discrete_flow_shift=1.0 \
  --max_train_epochs=10 --save_every_n_epochs=1 \
  --mixed_precision="fp16" --gradient_checkpointing \
  --cache_latents --vae_chunk_size=64 --vae_disable_cache
```
EDIT:
Interestingly enough, with #2274 applied manually, I get normal loss, no NaN errors, and the LoRA trains successfully.
EDIT 2:
Unfortunately, even with that patch and some specific training settings I still get NaNs, just deeper into the training. The chance of a run finishing without blowing up is about 50%.
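For anyone debugging this, the usual mechanism behind fp16 NaNs is the format's narrow dynamic range: fp16 can only represent magnitudes up to 65504, so any activation or gradient above that overflows to inf, and inf arithmetic then yields NaN, which poisons the loss. A minimal sketch of that failure mode (NumPy used purely for illustration; the values are hypothetical, not taken from the trainer):

```python
import numpy as np

FP16_MAX = np.finfo(np.float16).max  # 65504.0, the largest finite fp16 value

x = np.float16(60000.0)
y = np.float16(x * x)  # ~3.6e9 is far beyond fp16 range -> overflows to inf
print(y)               # inf
print(y - y)           # nan: inf - inf is NaN, which then propagates everywhere

# A common mitigation when bf16 is unavailable: do the risky math in fp32,
# then clamp back into fp16 range before casting down.
z = np.float16(min(float(x) * float(x), float(FP16_MAX)))
print(z)               # 65504.0, still finite
```

This is also why patches that keep sensitive layers or reductions in fp32 (as #2274 apparently does in part) reduce, but don't always eliminate, the NaNs: a single overflow anywhere in the fp16 path is enough.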
