add quickstart for og and mc
This commit is contained in:
parent 64b8e24e0a
commit 335c7547f7

README.md | 74
@@ -29,7 +29,7 @@ Compared to its predecessor [AgentBench](https://github.com/THUDM/AgentBench), V
- [Dataset Summary](#dataset-summary)
- [Leaderboard](#leaderboard)
- [Quick Start](#quick-start)
- [Next Steps](#next-steps)
- [Acknowledgement](#acknowledgement)
- [Citation](#citation)

## Dataset Summary
@@ -45,7 +45,77 @@ Here are the scores on test set results of VAB. All metrics are task Success Rates.

## Quick Start
-TODO
This section will guide you through using `gpt-4o-2024-05-13` as an agent to launch 4 concurrent `VAB-Minecraft` tasks.
For the framework structure, please refer to AgentBench's [Framework Introduction](https://github.com/THUDM/AgentBench/blob/main/docs/Introduction_en.md).
For more detailed configuration and launch options, please check the [Configuration Guide](docs/Config_en.md)
and the [Program Entrance Guide](docs/Entrance_en.md).

### Step 1. Prerequisites

Clone this repo and install the dependencies.
```bash
cd VisualAgentBench
conda create -n vab python=3.9
conda activate vab
pip install -r requirements.txt
```

Ensure that [Docker](https://www.docker.com/) is properly installed.
```bash
docker ps
```

For specific environments, please refer to their respective prerequisites: [VAB-OmniGibson](docs/README_setup.md#Setup-for-VAB-OmniGibson), [VAB-Minecraft](docs/README_setup.md#Setup-for-VAB-Minecraft), [VAB-CSS](docs/README_setup.md#Setup-for-VAB-CSS).
### Step 2. Configure the Agent

Fill in your OpenAI API key at the correct location in `configs/agents/openai-chat.yaml`.

You can run `python -m src.client.agent_test` to check whether your agent is configured correctly.
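For reference, the `configs/agents/openai-chat.yaml` fragment touched by this commit has the following shape; the indentation here is a sketch, and `<% PUT-YOUR-OPENAI-KEY-HERE %>` is the placeholder you replace with your actual key:

```yaml
parameters:
  url: https://api.openai.com/v1/chat/completions
  headers:
    Content-Type: application/json
    Authorization: Bearer <% PUT-YOUR-OPENAI-KEY-HERE %>
  body:
    temperature: 0
```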
### Step 3. Start the task server

Starting task workers manually is task-specific and cumbersome, so we provide an automated script.

This step assumes that ports 5000 to 5015 are available. On macOS, you may need to follow [this guide](https://stackoverflow.com/questions/69955686/why-cant-i-run-the-project-on-port-5000) to free port 5000.
```bash
python -m src.start_task -a
```

This will launch 4 task workers for `VAB-Minecraft` tasks and automatically connect them to the controller on port 5000. **After executing this command, please allow approximately 1 minute for the task setup to complete.** If the terminal shows "... 200 OK", you can open another terminal and proceed to Step 4.
### Step 4. Start the assigner

This step actually starts the tasks.

If everything is configured correctly so far, you can now initiate the task tests.

```bash
python -m src.assigner --auto-retry
```
### Next Steps

If you wish to launch more tasks or use other models, refer to the [Configuration Guide](docs/Config_en.md) and the [Program Entrance Guide](docs/Entrance_en.md).

For instance, to launch VAB-OmniGibson tasks, in Step 3:

```bash
python -m src.start_task -a -s omnigibson 2
```

In Step 4:

```bash
python -m src.assigner --auto-retry --config configs/assignments/omnigibson.yaml
```

You can modify the config files to launch other tasks or change task concurrency.
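For example, task concurrency lives in the assignment config; this sketch mirrors the `configs/assignments/minecraft.yaml` fragment changed in this commit:

```yaml
concurrency:
  task:
    minecraft: 4          # number of concurrent VAB-Minecraft workers
  agent:
    gpt-4o-2024-05-13: 4  # concurrent requests allowed for this agent
```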
## Acknowledgement

This project is heavily built upon the following repositories (to be updated):
@@ -3,7 +3,7 @@ parameters:
    url: https://api.openai.com/v1/chat/completions
    headers:
      Content-Type: application/json
-     Authorization: Bearer
+     Authorization: Bearer <% PUT-YOUR-OPENAI-KEY-HERE %>
    body:
      temperature: 0
    prompter:
@@ -2,9 +2,9 @@ import: definition.yaml

concurrency:
  task:
-   # css-std: 5
-   # omnigibson-std: 4
-   minecraft-std: 4
+   # css: 5
+   # omnigibson: 4
+   minecraft: 4
  agent:
    gpt-4o-2024-05-13: 4
@@ -12,9 +12,9 @@ assignments: # List[Assignment] | Assignment
- agent: # "task": List[str] | str , "agent": List[str] | str
    - gpt-4o-2024-05-13
  task:
-   # - css-std
-   # - omnigibson-std
-   - minecraft-std
+   # - css
+   # - omnigibson
+   - minecraft

# output: "outputs/{TIMESTAMP}"
# output: "outputs/css_test"
configs/assignments/omnigibson.yaml | 16 (new file)
@@ -0,0 +1,16 @@
import: definition.yaml

concurrency:
  task:
    omnigibson: 2
  agent:
    gpt-4o-2024-05-13: 4

assignments: # List[Assignment] | Assignment
- agent: # "task": List[str] | str , "agent": List[str] | str
    - gpt-4o-2024-05-13
  task:
    - omnigibson

# output: "outputs/{TIMESTAMP}"
output: "outputs/omnigibson"
@@ -2,6 +2,6 @@ definition:
import: tasks/task_assembly.yaml

start:
-  # css-std: 1
-  # omnigibson-std: 4
-  minecraft-std: 4
+  # css: 1
+  # omnigibson: 2
+  minecraft: 4
@@ -10,8 +10,8 @@ default:
  # name: "CSS-dev"
  # data_dir: "ddata/css_dataset"

-css-std:
+css:
  parameters:
-   name: "CSS-std"
+   name: "CSS"
    data_dir: "data/css_dataset"
    output_dir: "outputs/css_test"
@@ -1,7 +1,7 @@
-minecraft-std:
+minecraft:
  module: "src.server.tasks.minecraft.Minecraft"
  parameters:
-   name: "Minecraft-std"
+   name: "Minecraft"
    max_round: 100
    available_ports:
      - 11000
@@ -1,7 +1,7 @@
-omnigibson-std:
+omnigibson:
  module: "src.server.tasks.omnigibson.OmniGibson"
  parameters:
-   name: "OmniGibson-std"
+   name: "OmniGibson"
    max_round: 100
    available_ports:
      - 12000
docs/README_setup.md | 113 (new file)

@@ -0,0 +1,113 @@
# Setup for VAB-OmniGibson

## Installation

1. We have tested on Ubuntu. VAB-OmniGibson requires an **11 GB NVIDIA RTX GPU** and NVIDIA GPU driver version >= 450.80.02. For more detailed requirements, please refer to [Isaac Sim 2022.2.0](https://docs.omniverse.nvidia.com/isaacsim/latest/installation/requirements.html).

2. Besides [Docker](https://www.docker.com/), install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) on your machine.

3. Get the pre-built docker image.

    - If you have access to Docker Hub:

      ```bash
      docker pull tianjiezhang/vab_omnigibson:latest
      ```

    - Or you can download it from ModelScope:

      1. Make sure `git-lfs` is installed.

      2. Download from ModelScope:

         ```bash
         git lfs install
         git clone https://www.modelscope.cn/datasets/VisualAgentBench/VAB-OmniGibson-Docker.git
         ```

      3. Load the docker image from the ModelScope dataset:

         ```bash
         docker load -i VAB-OmniGibson-Docker/vab_omnigibson.tar
         ```

4. Download the OmniGibson datasets, VAB-OmniGibson test activities, and related scene files. Note that about 25 GB of data will be downloaded to `data/omnigibson`; make sure you have access to Google Drive.

    ```bash
    python scripts/omnigibson_download.py
    ```
## Get Started

1. According to your hardware, fill in `available_ports` and `available_devices` in the task configuration file `configs/tasks/omnigibson.yaml`.

    - `available_ports`: Fill in ports that are available on your machine. Each concurrent docker container requires 1 port for communication with the task server. Ensure that you provide enough ports to accommodate the expected concurrency.

    - `available_devices`: Fill in GPU IDs and their corresponding concurrency capacity. Each concurrent docker container occupies about **11 GB** of GPU memory. Ensure that you provide enough GPU memory to accommodate the expected concurrency.

2. It's recommended to increase the file change watcher limit on Linux. See the [Omniverse guide](https://docs.omniverse.nvidia.com/dev-guide/latest/linux-troubleshooting.html#to-update-the-watcher-limit) for more details.

    - View the current watcher limit: `cat /proc/sys/fs/inotify/max_user_watches`.

    - Update the watcher limit:

      1. Edit `/etc/sysctl.conf` and add the line `fs.inotify.max_user_watches=524288`.

      2. Load the new value: `sudo sysctl -p`.

**Note: If you manually shut down the task server and assigner, please ensure you also stop the OmniGibson containers to free up the ports!**
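The watcher-limit check above can be scripted; a minimal sketch (the `524288` value comes from the Omniverse guide, and the `sysctl.conf` edit itself still requires root):

```shell
# Compare the current inotify watcher limit against the recommended value.
recommended=524288
current=$(cat /proc/sys/fs/inotify/max_user_watches 2>/dev/null || echo 0)
if [ "$current" -ge "$recommended" ]; then
    echo "watcher limit OK ($current)"
else
    echo "watcher limit too low ($current); add fs.inotify.max_user_watches=$recommended to /etc/sysctl.conf and run 'sudo sysctl -p'"
fi
```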
# Setup for VAB-Minecraft

## Installation

1. We have tested on Ubuntu. VAB-Minecraft requires at least a 4 GB NVIDIA GPU and NVIDIA GPU driver version >= 530.30.02.

2. Besides [Docker](https://www.docker.com/), install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) on your machine.

3. Get the pre-built docker image.

    - If you have access to Docker Hub:

      ```bash
      docker pull tianjiezhang/vab_minecraft:latest
      ```
    - Or you can download it from ModelScope:

      1. Make sure `git-lfs` is installed.

      2. Download from ModelScope:

         ```bash
         git lfs install
         git clone https://www.modelscope.cn/datasets/VisualAgentBench/VAB-Minecraft.git
         ```

      3. Load the docker image from the ModelScope dataset:

         ```bash
         docker load -i VAB-Minecraft/vab_minecraft.tar
         ```

4. Download the weights of Steve-1 to `data/minecraft`. Please make sure you have access to Google Drive.

    ```bash
    python scripts/minecraft_download.py
    ```
## Get Started

According to your hardware, fill in `available_ports` and `available_devices` in the task configuration file `configs/tasks/minecraft.yaml`.

- `available_ports`: Fill in ports that are available on your machine. Each concurrent docker container requires 1 port for communication with the task server. Ensure that you provide enough ports to accommodate the expected concurrency.

- `available_devices`: Fill in GPU IDs and their corresponding concurrency capacity. Each concurrent docker container occupies about **3.3 GB** of GPU memory. Ensure that you provide enough GPU memory to accommodate the expected concurrency.
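Based on the `configs/tasks/minecraft.yaml` fragment changed in this commit, the ports are listed under `parameters`; the `available_devices` shape below is a hypothetical sketch (GPU ID mapped to container count), so check the shipped config for the exact schema:

```yaml
minecraft:
  module: "src.server.tasks.minecraft.Minecraft"
  parameters:
    name: "Minecraft"
    max_round: 100
    available_ports:    # one free port per concurrent container
      - 11000
      - 11001
    available_devices:  # hypothetical shape: GPU id -> container count
      "0": 2
```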
**Note: If you manually shut down the task server and assigner, please ensure you also stop the Minecraft containers to free up the ports!**

# Setup for VAB-CSS

TODO
requirements.txt | 18 (new file)

@@ -0,0 +1,18 @@
numpy~=1.23.5
pydantic~=1.10.12
requests~=2.28.1
tqdm~=4.65.0
pyyaml~=6.0
jsonlines~=3.1.0
aiohttp~=3.8.4
uvicorn~=0.22.0
fastapi~=0.101.1
urllib3~=1.26.15
docker==6.1.2
networkx~=2.8.4
anthropic~=0.4.1
fschat~=0.2.31
accelerate~=0.23.0
transformers~=4.34.0
pillow==10.4.0
gdown==5.2.0
scripts/minecraft_download.py | 47 (new file)

@@ -0,0 +1,47 @@
import os
import subprocess

# Target directories for the downloaded weights.
directories_to_create = [
    "data/minecraft/mineclip",
    "data/minecraft/steve1",
    "data/minecraft/vpt"
]

files_to_download = [
    {
        "url": "https://openaipublic.blob.core.windows.net/minecraft-rl/models/2x.model",
        "output_dir": "data/minecraft/vpt",
        "output_file": "2x.model"
    },
    {
        "url": "https://drive.google.com/uc?id=1uaZM1ZLBz2dZWcn85rZmjP7LV6Sg5PZW",
        "output_dir": "data/minecraft/mineclip",
        "output_file": "attn.pth"
    },
    {
        "url": "https://drive.google.com/uc?id=1E3fd_-H1rRZqMkUKHfiMhx-ppLLehQPI",
        "output_dir": "data/minecraft/steve1",
        "output_file": "steve1.weights"
    },
    {
        "url": "https://drive.google.com/uc?id=1OdX5wiybK8jALVfP5_dEo0CWm9BQbDES",
        "output_dir": "data/minecraft/steve1",
        "output_file": "steve1_prior.pt"
    }
]

for directory in directories_to_create:
    if not os.path.exists(directory):
        os.makedirs(directory)

# Skip files that already exist; use gdown for Google Drive links and wget otherwise.
for file_info in files_to_download:
    url = file_info["url"]
    output_dir = file_info["output_dir"]
    output_file = file_info["output_file"]
    output_path = os.path.join(output_dir, output_file)

    if not os.path.exists(output_path):
        if url.startswith("https://drive.google.com"):
            subprocess.run(["gdown", url, "-O", output_path])
        elif url.startswith("http"):
            subprocess.run(["wget", url, "-P", output_dir])
scripts/minecraft_weights.sh | 6 (Normal file → Executable file)

@@ -4,8 +4,8 @@ mkdir -p data/minecraft/vpt
wget https://openaipublic.blob.core.windows.net/minecraft-rl/models/2x.model -P data/minecraft/vpt

-curl -L -o data/minecraft/mineclip/attn.pth "https://drive.google.com/uc?export=download&id=1uaZM1ZLBz2dZWcn85rZmjP7LV6Sg5PZW"
+gdown https://drive.google.com/uc?id=1uaZM1ZLBz2dZWcn85rZmjP7LV6Sg5PZW -O data/minecraft/mineclip/attn.pth

-curl -L -o data/minecraft/steve1/steve1.weights "https://drive.google.com/uc?id=1E3fd_-H1rRZqMkUKHfiMhx-ppLLehQPI"
+gdown https://drive.google.com/uc?id=1E3fd_-H1rRZqMkUKHfiMhx-ppLLehQPI -O data/minecraft/steve1/steve1.weights

-curl -L -o data/weights/steve1/steve1_prior.pt "https://drive.google.com/uc?id=1OdX5wiybK8jALVfP5_dEo0CWm9BQbDES"
+gdown https://drive.google.com/uc?id=1OdX5wiybK8jALVfP5_dEo0CWm9BQbDES -O data/minecraft/steve1/steve1_prior.pt
@@ -4,24 +4,48 @@ from src.configs import ConfigLoader
from src.typings import InstanceFactory
from .agent import AgentClient


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--config', type=str, default='configs/agents/api_agents.yaml')
-   parser.add_argument('--agent', type=str, default='gpt-3.5-turbo-0613')
+   parser.add_argument('--agent', type=str, default='gpt-4o-2024-05-13')
    return parser.parse_args()
def interaction(agent: AgentClient):
    history = [{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Briefly describe the image."
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/png;base64,assets/cover.png"
                }
            }
        ]
    }]
    print("================= USER ====================")
    print(">>> Briefly describe the image. (image: `assets/cover.png`)")
    try:
        print("================ AGENT ====================")
        agent_response = agent.inference(history)
        print(agent_response)
        history.append({"role": "agent", "content": agent_response})
    except Exception as e:
        print(e)
        exit(0)
    try:
        history = []
        while True:
            print("================= USER ===================")
            user = input(">>> ")
            history.append({"role": "user", "content": user})
            try:
                print("================ AGENT ====================")
                agent_response = agent.inference(history)
                print(agent_response)
                history.append({"role": "agent", "content": agent_response})
            except Exception as e:
@@ -254,7 +254,7 @@ class HTTPAgent(AgentClient):
            history = replace_image_url(history, keep_path=False, throw_details=False)
        else:
            history = replace_image_url(history, keep_path=True, throw_details=True)
-       for _ in range(5):
+       for _ in range(10):
            try:
                if self.prompter_type == "role_content_dict":
                    body = self.body.copy()
@@ -288,5 +288,5 @@ class HTTPAgent(AgentClient):
                else:
                    resp = resp.json()
                return self.return_format.format(response=resp)
-           time.sleep(_ + 2)
+           time.sleep(_ + 1)
        raise Exception("Failed.")
@@ -47,7 +47,8 @@ class Minecraft(Task):
                if result.status != SampleStatus.RUNNING:
                    return result
            except Exception as e:
-               print(e)
+               import traceback
+               traceback.print_exc()
                return TaskSampleExecutionResult(status=SampleStatus.TASK_ERROR, result={"error": e})
            finally:
                try:
@@ -81,7 +82,7 @@ async def main():
    output_dir = "outputs/minecraft"
    max_round = 100
    docker_image = "tianjiezhang/vab_minecraft:latest"
-   task = Minecraft(available_ports=available_ports, available_devices=available_devices, max_round=max_round, data_dir=data_dir, output_dir=output_dir, docker_image=docker_image, name="Minecraft-std")
+   task = Minecraft(available_ports=available_ports, available_devices=available_devices, max_round=max_round, data_dir=data_dir, output_dir=output_dir, docker_image=docker_image, name="Minecraft")
    print(Container.available_devices)
    print(Container.available_ports)
    session = Session()
@@ -2,7 +2,6 @@ from __future__ import annotations
import os
-from functools import lru_cache


# disable HuggingFace warning
os.environ["TOKENIZERS_PARALLELISM"] = "false"
@@ -4,3 +4,4 @@ VPT_MODEL_PATH = Path(__file__).parent / "weights" / "vpt" / "2x.model"
VPT_WEIGHT_PATH = Path(__file__).parent / "weights" / "steve1" / "steve1.weights"
PRIOR_WEIGHT_PATH = Path(__file__).parent / "weights" / "steve1" / "steve1_prior.pt"
MINECLIP_WEIGHT_PATH = Path(__file__).parent / "weights" / "mineclip" / "attn.pth"
+# OPENAI_CLIP_PATH = Path(__file__).parent / "weights" / "clip-vit-base-patch16"
@@ -84,7 +84,7 @@ async def main():
    output_dir = "outputs/omnigibson"
    max_round = 100
    docker_image = "vab_omnigibson:latest"
-   task = OmniGibson(available_ports=available_ports, available_devices=available_devices, max_round=max_round, data_dir=data_dir, output_dir=output_dir, docker_image=docker_image, name="OmniGibson-std")
+   task = OmniGibson(available_ports=available_ports, available_devices=available_devices, max_round=max_round, data_dir=data_dir, output_dir=output_dir, docker_image=docker_image, name="OmniGibson")
    print(Container.available_devices)
    print(Container.available_ports)
    session = Session()