add quickstart for og and mc
This commit is contained in:
parent 64b8e24e0a
commit 335c7547f7

README.md | 74
@@ -29,7 +29,7 @@ Compared to its predecessor [AgentBench](https://github.com/THUDM/AgentBench), V
- [Dataset Summary](#dataset-summary)
- [Leaderboard](#leaderboard)
- [Quick Start](#quick-start)
- [Next Steps](#next-steps)
- [Acknowledgement](#acknowledgement)
- [Citation](#citation)

## Dataset Summary
@@ -45,7 +45,77 @@ Here are the scores on test set results of VAB. All metrics are task Success Rates.

## Quick Start
-TODO
This section will guide you through using `gpt-4o-2024-05-13` as an agent to launch 4 concurrent `VAB-Minecraft` tasks.
For the framework structure, please refer to AgentBench's [Framework Introduction](https://github.com/THUDM/AgentBench/blob/main/docs/Introduction_en.md).
For more detailed configuration and launch options, please check the [Configuration Guide](docs/Config_en.md)
and the [Program Entrance Guide](docs/Entrance_en.md).

### Step 1. Prerequisites

Clone this repo and install the dependencies.
```bash
cd VisualAgentBench
conda create -n vab python=3.9
conda activate vab
pip install -r requirements.txt
```

Ensure that [Docker](https://www.docker.com/) is properly installed.
```bash
docker ps
```

For specific environments, please refer to their respective prerequisites: [VAB-OmniGibson](docs/README_setup.md#Setup-for-VAB-OmniGibson), [VAB-Minecraft](docs/README_setup.md#Setup-for-VAB-Minecraft), [VAB-CSS](docs/README_setup.md#Setup-for-VAB-CSS).
### Step 2. Configure the Agent

Fill in your OpenAI API key at the correct location in `configs/agents/openai-chat.yaml`.

You can run `python -m src.client.agent_test` to check whether your agent is configured correctly.
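For reference, the `configs/agents/openai-chat.yaml` fragment touched by this commit has the following shape; the indentation here is a sketch, and `<% PUT-YOUR-OPENAI-KEY-HERE %>` is the placeholder you replace with your actual key:

```yaml
parameters:
  url: https://api.openai.com/v1/chat/completions
  headers:
    Content-Type: application/json
    Authorization: Bearer <% PUT-YOUR-OPENAI-KEY-HERE %>
  body:
    temperature: 0
```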
### Step 3. Start the task server

Starting task workers manually is task-specific and cumbersome, so we provide an automated script.

This step assumes that ports 5000 to 5015 are available. On macOS, you may need to follow [this guide](https://stackoverflow.com/questions/69955686/why-cant-i-run-the-project-on-port-5000) to free port 5000.
```bash
python -m src.start_task -a
```

This will launch 4 task workers for `VAB-Minecraft` tasks and automatically connect them to the controller on port 5000. **After executing this command, please allow approximately 1 minute for the task setup to complete.** If the terminal shows "... 200 OK", you can open another terminal and proceed to Step 4.
### Step 4. Start the assigner

This step actually starts the tasks.

If everything is configured correctly so far, you can now initiate the task tests.

```bash
python -m src.assigner --auto-retry
```
### Next Steps

If you wish to launch more tasks or use other models, refer to the [Configuration Guide](docs/Config_en.md) and the [Program Entrance Guide](docs/Entrance_en.md).

For instance, to launch VAB-OmniGibson tasks, in Step 3:

```bash
python -m src.start_task -a -s omnigibson 2
```

In Step 4:

```bash
python -m src.assigner --auto-retry --config configs/assignments/omnigibson.yaml
```

You can modify the config files to launch other tasks or change task concurrency.
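For example, task concurrency lives in the assignment config; this sketch mirrors the `configs/assignments/minecraft.yaml` fragment changed in this commit:

```yaml
concurrency:
  task:
    minecraft: 4          # number of concurrent VAB-Minecraft workers
  agent:
    gpt-4o-2024-05-13: 4  # concurrent requests allowed for this agent
```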
## Acknowledgement

This project is heavily built upon the following repositories (to be updated):
@@ -3,7 +3,7 @@ parameters:
    url: https://api.openai.com/v1/chat/completions
    headers:
      Content-Type: application/json
-     Authorization: Bearer
+     Authorization: Bearer <% PUT-YOUR-OPENAI-KEY-HERE %>
    body:
      temperature: 0
    prompter:
@@ -2,9 +2,9 @@ import: definition.yaml

concurrency:
  task:
-   # css-std: 5
-   # omnigibson-std: 4
-   minecraft-std: 4
+   # css: 5
+   # omnigibson: 4
+   minecraft: 4
  agent:
    gpt-4o-2024-05-13: 4
@@ -12,9 +12,9 @@ assignments: # List[Assignment] | Assignment
- agent: # "task": List[str] | str , "agent": List[str] | str
    - gpt-4o-2024-05-13
  task:
-   # - css-std
-   # - omnigibson-std
-   - minecraft-std
+   # - css
+   # - omnigibson
+   - minecraft

# output: "outputs/{TIMESTAMP}"
# output: "outputs/css_test"
configs/assignments/omnigibson.yaml | 16 (new file)
@@ -0,0 +1,16 @@
import: definition.yaml

concurrency:
  task:
    omnigibson: 2
  agent:
    gpt-4o-2024-05-13: 4

assignments: # List[Assignment] | Assignment
- agent: # "task": List[str] | str , "agent": List[str] | str
    - gpt-4o-2024-05-13
  task:
    - omnigibson

# output: "outputs/{TIMESTAMP}"
output: "outputs/omnigibson"
@@ -2,6 +2,6 @@ definition:
import: tasks/task_assembly.yaml

start:
-  # css-std: 1
-  # omnigibson-std: 4
-  minecraft-std: 4
+  # css: 1
+  # omnigibson: 2
+  minecraft: 4
@@ -10,8 +10,8 @@ default:
  # name: "CSS-dev"
  # data_dir: "ddata/css_dataset"

-css-std:
+css:
  parameters:
-   name: "CSS-std"
+   name: "CSS"
    data_dir: "data/css_dataset"
    output_dir: "outputs/css_test"
@@ -1,7 +1,7 @@
-minecraft-std:
+minecraft:
  module: "src.server.tasks.minecraft.Minecraft"
  parameters:
-   name: "Minecraft-std"
+   name: "Minecraft"
    max_round: 100
    available_ports:
      - 11000
@@ -1,7 +1,7 @@
-omnigibson-std:
+omnigibson:
  module: "src.server.tasks.omnigibson.OmniGibson"
  parameters:
-   name: "OmniGibson-std"
+   name: "OmniGibson"
    max_round: 100
    available_ports:
      - 12000
docs/README_setup.md | 113 (new file)

@@ -0,0 +1,113 @@
# Setup for VAB-OmniGibson

## Installation

1. We have tested on Ubuntu. VAB-OmniGibson requires an **11 GB NVIDIA RTX GPU** and NVIDIA GPU driver version >= 450.80.02. For more detailed requirements, please refer to [Isaac Sim 2022.2.0](https://docs.omniverse.nvidia.com/isaacsim/latest/installation/requirements.html).

2. Besides [Docker](https://www.docker.com/), install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) on your machine.

3. Get the pre-built docker image.

    - If you have access to Docker Hub:

      ```bash
      docker pull tianjiezhang/vab_omnigibson:latest
      ```

    - Or you can download it from ModelScope:

      1. Make sure `git-lfs` is installed.

      2. Download from ModelScope:

         ```bash
         git lfs install
         git clone https://www.modelscope.cn/datasets/VisualAgentBench/VAB-OmniGibson-Docker.git
         ```

      3. Load the docker image from the ModelScope dataset:

         ```bash
         docker load -i VAB-OmniGibson-Docker/vab_omnigibson.tar
         ```

4. Download the OmniGibson datasets, VAB-OmniGibson test activities, and related scene files. Note that about 25 GB of data will be downloaded to `data/omnigibson`; make sure you have access to Google Drive.

    ```bash
    python scripts/omnigibson_download.py
    ```
## Get Started

1. According to your hardware, fill in `available_ports` and `available_devices` in the task configuration file `configs/tasks/omnigibson.yaml`.

    - `available_ports`: Fill in ports that are available on your machine. Each concurrent docker container requires 1 port for communication with the task server. Ensure that you provide enough ports to accommodate the expected concurrency.

    - `available_devices`: Fill in GPU IDs and their corresponding concurrency capacity. Each concurrent docker container occupies about **11 GB** of GPU memory. Ensure that you provide enough GPU memory to accommodate the expected concurrency.

2. It's recommended to increase the file change watcher limit on Linux. See the [Omniverse guide](https://docs.omniverse.nvidia.com/dev-guide/latest/linux-troubleshooting.html#to-update-the-watcher-limit) for more details.

    - View the current watcher limit: `cat /proc/sys/fs/inotify/max_user_watches`.

    - Update the watcher limit:

      1. Edit `/etc/sysctl.conf` and add the line `fs.inotify.max_user_watches=524288`.

      2. Load the new value: `sudo sysctl -p`.

**Note: If you manually shut down the task server and assigner, please ensure you also stop the OmniGibson containers to free up the ports!**
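The watcher-limit check above can be scripted; a minimal sketch (the `524288` value comes from the Omniverse guide, and the `sysctl.conf` edit itself still requires root):

```shell
# Compare the current inotify watcher limit against the recommended value.
recommended=524288
current=$(cat /proc/sys/fs/inotify/max_user_watches 2>/dev/null || echo 0)
if [ "$current" -ge "$recommended" ]; then
    echo "watcher limit OK ($current)"
else
    echo "watcher limit too low ($current); add fs.inotify.max_user_watches=$recommended to /etc/sysctl.conf and run 'sudo sysctl -p'"
fi
```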
# Setup for VAB-Minecraft

## Installation

1. We have tested on Ubuntu. VAB-Minecraft requires at least a 4 GB NVIDIA GPU and NVIDIA GPU driver version >= 530.30.02.

2. Besides [Docker](https://www.docker.com/), install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html) on your machine.

3. Get the pre-built docker image.

    - If you have access to Docker Hub:

      ```bash
      docker pull tianjiezhang/vab_minecraft:latest
      ```
    - Or you can download it from ModelScope:

      1. Make sure `git-lfs` is installed.

      2. Download from ModelScope:

         ```bash
         git lfs install
         git clone https://www.modelscope.cn/datasets/VisualAgentBench/VAB-Minecraft.git
         ```

      3. Load the docker image from the ModelScope dataset:

         ```bash
         docker load -i VAB-Minecraft/vab_minecraft.tar
         ```

4. Download the weights of Steve-1 to `data/minecraft`. Please make sure you have access to Google Drive.

    ```bash
    python scripts/minecraft_download.py
    ```
## Get Started

According to your hardware, fill in `available_ports` and `available_devices` in the task configuration file `configs/tasks/minecraft.yaml`.

- `available_ports`: Fill in ports that are available on your machine. Each concurrent docker container requires 1 port for communication with the task server. Ensure that you provide enough ports to accommodate the expected concurrency.

- `available_devices`: Fill in GPU IDs and their corresponding concurrency capacity. Each concurrent docker container occupies about **3.3 GB** of GPU memory. Ensure that you provide enough GPU memory to accommodate the expected concurrency.
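Based on the `configs/tasks/minecraft.yaml` fragment changed in this commit, the ports are listed under `parameters`; the `available_devices` shape below is a hypothetical sketch (GPU ID mapped to container count), so check the shipped config for the exact schema:

```yaml
minecraft:
  module: "src.server.tasks.minecraft.Minecraft"
  parameters:
    name: "Minecraft"
    max_round: 100
    available_ports:    # one free port per concurrent container
      - 11000
      - 11001
    available_devices:  # hypothetical shape: GPU id -> container count
      "0": 2
```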
**Note: If you manually shut down the task server and assigner, please ensure you also stop the Minecraft containers to free up the ports!**

# Setup for VAB-CSS

TODO
requirements.txt | 18 (new file)

@@ -0,0 +1,18 @@
numpy~=1.23.5
pydantic~=1.10.12
requests~=2.28.1
tqdm~=4.65.0
pyyaml~=6.0
jsonlines~=3.1.0
aiohttp~=3.8.4
uvicorn~=0.22.0
fastapi~=0.101.1
urllib3~=1.26.15
docker==6.1.2
networkx~=2.8.4
anthropic~=0.4.1
fschat~=0.2.31
accelerate~=0.23.0
transformers~=4.34.0
pillow==10.4.0
gdown==5.2.0
scripts/minecraft_download.py | 47 (new file)

@@ -0,0 +1,47 @@
import os
import subprocess

# Target directories for the downloaded weights.
directories_to_create = [
    "data/minecraft/mineclip",
    "data/minecraft/steve1",
    "data/minecraft/vpt"
]

files_to_download = [
    {
        "url": "https://openaipublic.blob.core.windows.net/minecraft-rl/models/2x.model",
        "output_dir": "data/minecraft/vpt",
        "output_file": "2x.model"
    },
    {
        "url": "https://drive.google.com/uc?id=1uaZM1ZLBz2dZWcn85rZmjP7LV6Sg5PZW",
        "output_dir": "data/minecraft/mineclip",
        "output_file": "attn.pth"
    },
    {
        "url": "https://drive.google.com/uc?id=1E3fd_-H1rRZqMkUKHfiMhx-ppLLehQPI",
        "output_dir": "data/minecraft/steve1",
        "output_file": "steve1.weights"
    },
    {
        "url": "https://drive.google.com/uc?id=1OdX5wiybK8jALVfP5_dEo0CWm9BQbDES",
        "output_dir": "data/minecraft/steve1",
        "output_file": "steve1_prior.pt"
    }
]

for directory in directories_to_create:
    if not os.path.exists(directory):
        os.makedirs(directory)

# Skip files that already exist; use gdown for Google Drive links and wget otherwise.
for file_info in files_to_download:
    url = file_info["url"]
    output_dir = file_info["output_dir"]
    output_file = file_info["output_file"]
    output_path = os.path.join(output_dir, output_file)

    if not os.path.exists(output_path):
        if url.startswith("https://drive.google.com"):
            subprocess.run(["gdown", url, "-O", output_path])
        elif url.startswith("http"):
            subprocess.run(["wget", url, "-P", output_dir])
scripts/minecraft_weights.sh | 6 (Normal file → Executable file)

@@ -4,8 +4,8 @@ mkdir -p data/minecraft/vpt
wget https://openaipublic.blob.core.windows.net/minecraft-rl/models/2x.model -P data/minecraft/vpt

-curl -L -o data/minecraft/mineclip/attn.pth "https://drive.google.com/uc?export=download&id=1uaZM1ZLBz2dZWcn85rZmjP7LV6Sg5PZW"
+gdown https://drive.google.com/uc?id=1uaZM1ZLBz2dZWcn85rZmjP7LV6Sg5PZW -O data/minecraft/mineclip/attn.pth

-curl -L -o data/minecraft/steve1/steve1.weights "https://drive.google.com/uc?id=1E3fd_-H1rRZqMkUKHfiMhx-ppLLehQPI"
+gdown https://drive.google.com/uc?id=1E3fd_-H1rRZqMkUKHfiMhx-ppLLehQPI -O data/minecraft/steve1/steve1.weights

-curl -L -o data/weights/steve1/steve1_prior.pt "https://drive.google.com/uc?id=1OdX5wiybK8jALVfP5_dEo0CWm9BQbDES"
+gdown https://drive.google.com/uc?id=1OdX5wiybK8jALVfP5_dEo0CWm9BQbDES -O data/minecraft/steve1/steve1_prior.pt
@@ -4,24 +4,48 @@ from src.configs import ConfigLoader
from src.typings import InstanceFactory
from .agent import AgentClient


def parse_args():
    parser = argparse.ArgumentParser()
    parser.add_argument('--config', type=str, default='configs/agents/api_agents.yaml')
-   parser.add_argument('--agent', type=str, default='gpt-3.5-turbo-0613')
+   parser.add_argument('--agent', type=str, default='gpt-4o-2024-05-13')
    return parser.parse_args()
def interaction(agent: AgentClient):
    history = [{
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Briefly describe the image."
            },
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/png;base64,assets/cover.png"
                }
            }
        ]
    }]
    print("================= USER ====================")
    print(">>> Briefly describe the image. (image: `assets/cover.png`)")
    try:
        print("================ AGENT ====================")
        agent_response = agent.inference(history)
        print(agent_response)
        history.append({"role": "agent", "content": agent_response})
    except Exception as e:
        print(e)
        exit(0)
    try:
        history = []
        while True:
            print("================= USER ===================")
            user = input(">>> ")
            history.append({"role": "user", "content": user})
            try:
                print("================ AGENT ====================")
                agent_response = agent.inference(history)
                print(agent_response)
                history.append({"role": "agent", "content": agent_response})
            except Exception as e:
@@ -254,7 +254,7 @@ class HTTPAgent(AgentClient):
            history = replace_image_url(history, keep_path=False, throw_details=False)
        else:
            history = replace_image_url(history, keep_path=True, throw_details=True)
-       for _ in range(5):
+       for _ in range(10):
            try:
                if self.prompter_type == "role_content_dict":
                    body = self.body.copy()
@@ -288,5 +288,5 @@ class HTTPAgent(AgentClient):
                else:
                    resp = resp.json()
                return self.return_format.format(response=resp)
-           time.sleep(_ + 2)
+           time.sleep(_ + 1)
        raise Exception("Failed.")
@@ -47,7 +47,8 @@ class Minecraft(Task):
                if result.status != SampleStatus.RUNNING:
                    return result
            except Exception as e:
-               print(e)
+               import traceback
+               traceback.print_exc()
                return TaskSampleExecutionResult(status=SampleStatus.TASK_ERROR, result={"error": e})
            finally:
                try:
@@ -81,7 +82,7 @@ async def main():
    output_dir = "outputs/minecraft"
    max_round = 100
    docker_image = "tianjiezhang/vab_minecraft:latest"
-   task = Minecraft(available_ports=available_ports, available_devices=available_devices, max_round=max_round, data_dir=data_dir, output_dir=output_dir, docker_image=docker_image, name="Minecraft-std")
+   task = Minecraft(available_ports=available_ports, available_devices=available_devices, max_round=max_round, data_dir=data_dir, output_dir=output_dir, docker_image=docker_image, name="Minecraft")
    print(Container.available_devices)
    print(Container.available_ports)
    session = Session()
@@ -2,7 +2,6 @@ from __future__ import annotations
import os
-from functools import lru_cache


# disable HuggingFace warning
os.environ["TOKENIZERS_PARALLELISM"] = "false"
@@ -4,3 +4,4 @@ VPT_MODEL_PATH = Path(__file__).parent / "weights" / "vpt" / "2x.model"
VPT_WEIGHT_PATH = Path(__file__).parent / "weights" / "steve1" / "steve1.weights"
PRIOR_WEIGHT_PATH = Path(__file__).parent / "weights" / "steve1" / "steve1_prior.pt"
MINECLIP_WEIGHT_PATH = Path(__file__).parent / "weights" / "mineclip" / "attn.pth"
+# OPENAI_CLIP_PATH = Path(__file__).parent / "weights" / "clip-vit-base-patch16"
@@ -84,7 +84,7 @@ async def main():
    output_dir = "outputs/omnigibson"
    max_round = 100
    docker_image = "vab_omnigibson:latest"
-   task = OmniGibson(available_ports=available_ports, available_devices=available_devices, max_round=max_round, data_dir=data_dir, output_dir=output_dir, docker_image=docker_image, name="OmniGibson-std")
+   task = OmniGibson(available_ports=available_ports, available_devices=available_devices, max_round=max_round, data_dir=data_dir, output_dir=output_dir, docker_image=docker_image, name="OmniGibson")
    print(Container.available_devices)
    print(Container.available_ports)
    session = Session()