FireAct: Toward Language Agent Fine-tuning

This repository accompanies our publication FireAct: Toward Language Agent Fine-tuning (PDF). It contains the prompts and demo code we used, the fine-tuning data we generated, and descriptions of and pointers to the model family we fine-tuned. If you use this code or data in your work, please cite:

@misc{chen2023fireact,
      title={FireAct: Toward Language Agent Fine-tuning}, 
      author={Baian Chen and Chang Shu and Ehsan Shareghi and Nigel Collier and Karthik Narasimhan and Shunyu Yao},
      year={2023},
      eprint={2310.05915},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Overview

  • Define tools in tools/ (a rough tool sketch follows this list)
  • Define tasks in tasks/
  • Collect data & run experiments via generation.py
  • Results will be saved in trajs/
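
As a rough illustration of the first bullet, a tool is essentially a function that maps a query string to a text observation the agent can condition on. This is a hypothetical sketch; the actual interface lives in tools/ and may differ:

# Hypothetical tool sketch; see tools/ for the real interface and implementations.
def search(query: str) -> str:
    """Return a short text observation for the agent's next reasoning step."""
    # A real tool would call an external API (e.g. SerpAPI) and truncate the result.
    return f"(stub observation for: {query})"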

Data & Prompts

  • Data used to generate training data and to run experiments is in data/. We also include samples of training data in both Alpaca format and GPT format (record shapes are sketched after this list). See details here.
  • Prompts used to generate training data and to run experiments are in prompts/
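
As a rough sketch of the two record shapes (the concrete instruction text and trajectories are in the sample files under data/):

# Illustrative record shapes only; see the samples under data/ for the real contents.

# Alpaca format: one instruction/input/output object per training example.
alpaca_example = {
    "instruction": "Solve the question by interleaving Thought, Action, and Observation steps.",
    "input": "Question: ...",
    "output": "Thought: ... Action: search[...] Observation: ... Action: finish[...]",
}

# GPT (chat fine-tuning) format: a list of role/content messages per training example.
gpt_example = {
    "messages": [
        {"role": "system", "content": "You are an agent that answers questions with tools."},
        {"role": "user", "content": "Question: ..."},
        {"role": "assistant", "content": "Thought: ... Action: ... finish[...]"},
    ]
}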

Setup

Set up an OpenAI API key and store it in an environment variable (see here)

export OPENAI_API_KEY=<YOUR_KEY>

Set up a SerpAPI key and store it in an environment variable (see here)

export SERPAPI_API_KEY=<YOUR_KEY>

Create a virtual environment, for example with conda

conda create -n fireact python=3.9
conda activate fireact

Clone this repo and install dependencies

git clone https://github.com/anchen1011/FireAct.git
cd FireAct
pip install -r requirements.txt
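
A quick way to confirm both keys are visible before running the demo (a minimal check, not part of the repo):

import os

# generation.py needs both keys when using an OpenAI backend plus the search tool.
for key in ("OPENAI_API_KEY", "SERPAPI_API_KEY"):
    assert os.environ.get(key), f"{key} is not set"
print("API keys found.")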

Run Demo

Data Generation

Example:

python generation.py \
    --task hotpotqa \
    --backend gpt-4 \
    --promptpath default \
    --evaluate \
    --random \
    --task_split val \
    --temperature 0 \
    --task_end_index 5

See details with the command python generation.py -h

Set --task_end_index to a large value (in the thousands) to collect a sufficient number of good data samples. [WARNING] This is costly with gpt-4 and SerpAPI.

You need to convert the trajectories into Alpaca format or GPT format before training. See our examples here.
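
As a hedged sketch of that conversion, assuming each saved trajectory exposes the question and the full ReAct-style completion under hypothetical keys ("question", "traj"); adapt the field names to whatever generation.py actually writes under trajs/:

import json

# Hypothetical field names; adjust to the actual schema of the files under trajs/.
def to_alpaca(traj_path: str, out_path: str, instruction: str) -> None:
    with open(traj_path) as f:
        trajectories = json.load(f)
    records = [
        {
            "instruction": instruction,
            "input": t["question"],  # the task question
            "output": t["traj"],     # Thought/Action/Observation ... finish[answer]
        }
        for t in trajectories
    ]
    with open(out_path, "w") as f:
        json.dump(records, f, indent=2)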

Supervised Fine-tuning

Example:

cd finetune/llama_lora
python finetune.py \
    --base_model meta-llama/Llama-2-13b-chat-hf \
    --data_path ../../data/finetune/alpaca_format/hotpotqa.json \
    --micro_batch_size 8 \
    --num_epochs 30 \
    --output_dir ../../models/lora/fireact-llama-2-13b \
    --val_set_size 0.01 \
    --cutoff_len 512

See details here.
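
The LoRA training script follows tloen/alpaca-lora; roughly, the adapter is attached with peft along the lines below. The hyperparameter values here are illustrative, not the repo's exact settings (those are in finetune/llama_lora/finetune.py):

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustrative adapter setup in the style of alpaca-lora; values are placeholders.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-chat-hf")
config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the LoRA weights are trainable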

Inference

Example (FireAct Llama):

python generation.py \
    --task hotpotqa \
    --backend llama \
    --evaluate \
    --random \
    --task_split dev \
    --task_end_index 5 \
    --modelpath meta-llama/Llama-2-7b-chat \
    --add_lora \
    --alpaca_format \
    --peftpath forestai/fireact_llama_2_7b_lora 

Example (FireAct GPT):

python generation.py \
    --task hotpotqa \
    --backend ft:gpt-3.5-turbo-0613:<YOUR_MODEL> \
    --evaluate \
    --random \
    --task_split dev \
    --temperature 0 \
    --chatgpt_format \
    --task_end_index 5

See details with the command python generation.py -h

Set --task_end_index 500 for quantitative evaluations. See our examples here.
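
Scoring for HotpotQA-style tasks is typically exact match after light normalization; --evaluate handles this inside generation.py, but a generic sketch of the metric looks like:

import re
import string

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace (SQuAD-style)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold: str) -> bool:
    return normalize(prediction) == normalize(gold)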

Model Zoo

We release a selected set of multitask models based on the Llama family. Details can be found in their model cards.

| Base Model     | Training Method | Hugging Face                          |
| -------------- | --------------- | ------------------------------------- |
| Llama2-7B      | LoRA            | forestai/fireact_llama_2_7b_lora      |
| Llama2-13B     | LoRA            | forestai/fireact_llama_2_13b_lora     |
| CodeLlama-7B   | LoRA            | forestai/fireact_codellama_7b_lora    |
| CodeLlama-13B  | LoRA            | forestai/fireact_codellama_13b_lora   |
| CodeLlama-34B  | LoRA            | forestai/fireact_codellama_34b_lora   |
| Llama2-7B      | Full Model      | forestai/fireact_llama_2_7b           |
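
A minimal sketch of loading one of the LoRA adapters outside of generation.py with transformers and peft (the base checkpoint below is an assumption; check each adapter's model card for the intended base):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Assumed base checkpoint; confirm against the adapter's model card.
base_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Attach the released FireAct adapter on top of the base weights.
model = PeftModel.from_pretrained(base, "forestai/fireact_llama_2_7b_lora")
model.eval()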

References

  1. Our generation code is based on ysymyth/ReAct
  2. Our Llama full model training code is based on tatsu-lab/stanford_alpaca
  3. Our Llama LoRA training code is based on tloen/alpaca-lora
  4. Our GPT fine-tuning code is based on anchen1011/chatgpt-finetune-ui