test swift for qwen3 math

yuyr 2025-06-25 17:00:47 +08:00
commit 8f168ecbef
4 changed files with 33 additions and 0 deletions

2
.gitignore vendored Normal file

@@ -0,0 +1,2 @@
output/
result/

10
README.md Normal file

@@ -0,0 +1,10 @@
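# Experiment: GRPO training of qwen3-8b with swift
- Dataset: a math dataset provided by ModelScope (`AI-MO/NuminaMath-TIR`, see `swift_client.sh`).
- Launch method: external mode (`--vllm_mode server` in `swift_client.sh`). Launching in colocate mode on A6000 GPUs fails with an out-of-memory error, so 2 GPUs run vLLM and 2 GPUs run training; all four end up almost fully utilized.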
```bash
# start server
sh swift_server.sh # wait until the vLLM server has started
# start client
sh swift_client.sh # start the training job
```
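
The manual "wait until the vLLM server is up" step can be scripted. The following is a minimal sketch, not part of swift: `wait_until` is a hypothetical helper that retries an arbitrary probe command until it succeeds or a timeout expires. The probe (`nc -z localhost 8000`), the log file name `server.log`, and the 900-second timeout are all assumptions; an open port is only a proxy for the server being ready.

```bash
#!/bin/sh
# wait_until TIMEOUT_SECONDS INTERVAL_SECONDS COMMAND [ARGS...]
# Re-run COMMAND until it exits 0 or roughly TIMEOUT_SECONDS have elapsed.
# Returns 0 on success and 1 on timeout.
# Hypothetical helper for illustration; it is not provided by swift.
wait_until() {
    timeout=$1
    interval=$2
    shift 2
    elapsed=0
    # Probe output is discarded; only the exit status matters.
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Intended use (commented out so that sourcing this file has no side effects).
# Assumes `nc` is installed and that the server listens on localhost:8000,
# as configured in swift_server.sh:
#   sh swift_server.sh > server.log 2>&1 &
#   wait_until 900 5 nc -z localhost 8000 && sh swift_client.sh
```

If the probe never succeeds, `wait_until` returns a non-zero status and the `&&` prevents the training job from starting against a server that is not listening.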

16
swift_client.sh Normal file

@@ -0,0 +1,16 @@
CUDA_VISIBLE_DEVICES=2,3 \
NPROC_PER_NODE=2 \
swift rlhf \
--rlhf_type grpo \
--model /data1/yuyr/qwen3-8b \
--dataset AI-MO/NuminaMath-TIR#5000 \
--reward_funcs accuracy cosine \
--use_vllm true \
--vllm_mode server \
--vllm_server_host localhost \
--vllm_server_port 8000 \
--per_device_train_batch_size 8 \
--per_device_eval_batch_size 8 \
--async_generate true \
--num_generations 4 \
--deepspeed zero3

5
swift_server.sh Normal file

@@ -0,0 +1,5 @@
CUDA_VISIBLE_DEVICES=0,1 \
swift rollout \
--model /data1/yuyr/qwen3-8b \
--tensor_parallel_size 2 \
--port 8000