# Evaluation
1. On lm2, deploy vLLM to serve the webrl-llama-3.1-8b model:
```bash
MODEL_PATH=/data1/yuyr/webrl-llama-3.1-8b
export HF_ENDPOINT=https://hf-mirror.com
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True CUDA_VISIBLE_DEVICES=2 python -m vllm.entrypoints.openai.api_server --served-model-name webrl-llama-3.1-8b \
--model $MODEL_PATH \
--gpu-memory-utilization 0.9 \
--max-num-seqs 32 \
--dtype half \
--port 18080
```
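Once the server is up, it exposes vLLM's standard OpenAI-compatible REST API. A minimal sketch for sanity-checking the `/v1/completions` route (the host/port and the example prompt are assumptions; adjust to where the server actually runs):

```python
import json
import urllib.request

def build_completion_request(prompt: str, model: str = "webrl-llama-3.1-8b") -> dict:
    # Payload for vLLM's OpenAI-compatible /v1/completions endpoint.
    return {"model": model, "prompt": prompt, "max_tokens": 64, "temperature": 0.0}

def query_vllm(prompt: str, base_url: str = "http://localhost:18080") -> str:
    # POST the completion request and return the generated text.
    req = urllib.request.Request(
        f"{base_url}/v1/completions",
        data=json.dumps(build_completion_request(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["text"]

# Usage against a live server (hypothetical prompt):
#   text = query_vllm("Task Instruction: click the search button.\n")
```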
2. The four sites shopping / shopping_admin / gitlab / reddit are already deployed on g14; for map, the official site is used through a SOCKS5 proxy.
3. Run `wa_parallel_run_webrl_completion.sh`. Be sure to use this *completion* variant, not the chat one; otherwise the WebRL model's success rate (SR) drops significantly.
```bash
# The vab conda environment must be set up first
tmux new -s webrl
conda activate vab
bash wa_parallel_run_webrl_completion.sh
```
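The completion-vs-chat warning above comes down to how the two routes build the final prompt. A sketch of the difference (the prompt text is a hypothetical placeholder):

```python
# /v1/completions: the prompt string reaches the model verbatim,
# matching the raw observation/action format WebRL was trained on.
completion_payload = {
    "model": "webrl-llama-3.1-8b",
    "prompt": "Task Instruction: ...\nRound 0\n...",
}

# /v1/chat/completions: the server wraps messages in the model's chat
# template (role headers, special tokens), so the model sees a different
# final prompt -- a plausible reason SR drops on the chat endpoint.
chat_payload = {
    "model": "webrl-llama-3.1-8b",
    "messages": [{"role": "user", "content": "Task Instruction: ...\nRound 0\n..."}],
}
```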
# ORM test
1. On lm2, deploy vLLM to serve the webrl-orm-llama-3.1-8b model:
```bash
MODEL_PATH=/data1/yuyr/webrl-orm-llama-3.1-8b
export HF_ENDPOINT=https://hf-mirror.com
PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True CUDA_VISIBLE_DEVICES=1 python -m vllm.entrypoints.openai.api_server --served-model-name webrl-orm-llama-3.1-8b \
--model $MODEL_PATH \
--gpu-memory-utilization 0.9 \
--max-num-seqs 32 \
--dtype half \
--port 18081
```
2. Run the analysis script:
```bash
cd result
python orm_test.py --data_dir webrl_chat_completion/
```
3. Edit the `.env` configuration to use either the local vLLM server or the aiproxy API.
4. Set the model value in `orm_test.py` to specify which model to use.
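For reference, a hypothetical `.env` pointing at the local vLLM server might look like the fragment below. The variable names follow the common OpenAI-client convention and are an assumption — check what `orm_test.py` actually reads:

```bash
# Hypothetical .env; verify the exact variable names against orm_test.py.
OPENAI_API_BASE=http://lm2:18081/v1   # local vLLM server, or the aiproxy URL instead
OPENAI_API_KEY=EMPTY                  # vLLM accepts any placeholder key
```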