FireAct: Toward Language Agent Fine-tuning

This repository accompanies our publication FireAct: Toward Language Agent Fine-tuning (PDF). It contains the prompts and demo code we used, the fine-tuning data we generated, and descriptions of and pointers to the model family we fine-tuned. If you use this code or data in your work, please cite:

@misc{chen2023fireact,
      title={FireAct: Toward Language Agent Fine-tuning}, 
      author={Baian Chen and Chang Shu and Ehsan Shareghi and Nigel Collier and Karthik Narasimhan and Shunyu Yao},
      year={2023},
      eprint={2310.05915},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}

Overview

  • Define tools in tools/ (a rough tool sketch follows this list)
  • Define tasks in tasks/
  • Collect data & run experiments via generation.py
  • Results will be saved in trajs/
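
As a rough illustration of the first bullet, a tool is essentially a function that maps a query string to a text observation the agent can condition on. This is a hypothetical sketch; the actual interface lives in tools/ and may differ:

# Hypothetical tool sketch; see tools/ for the real interface and implementations.
def search(query: str) -> str:
    """Return a short text observation for the agent's next reasoning step."""
    # A real tool would call an external API (e.g. SerpAPI) and truncate the result.
    return f"(stub observation for: {query})"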

Data & Prompts

  • Data used to generate training data and to run experiments is in data/. We also include samples of training data in both Alpaca format and GPT format (record shapes are sketched after this list). See details here.
  • Prompts used to generate training data and to run experiments are in prompts/
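
As a rough sketch of the two record shapes (the concrete instruction text and trajectories are in the sample files under data/):

# Illustrative record shapes only; see the samples under data/ for the real contents.

# Alpaca format: one instruction/input/output object per training example.
alpaca_example = {
    "instruction": "Solve the question by interleaving Thought, Action, and Observation steps.",
    "input": "Question: ...",
    "output": "Thought: ... Action: search[...] Observation: ... Action: finish[...]",
}

# GPT (chat fine-tuning) format: a list of role/content messages per training example.
gpt_example = {
    "messages": [
        {"role": "system", "content": "You are an agent that answers questions with tools."},
        {"role": "user", "content": "Question: ..."},
        {"role": "assistant", "content": "Thought: ... Action: ... finish[...]"},
    ]
}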

Setup

Set up an OpenAI API key and store it in an environment variable (see here)

export OPENAI_API_KEY=<YOUR_KEY>

Set up a SerpAPI key and store it in an environment variable (see here)

export SERPAPI_API_KEY=<YOUR_KEY>

Create a virtual environment, for example with conda

conda create -n fireact python=3.9
conda activate fireact

Clone this repo and install dependencies

git clone https://github.com/anchen1011/FireAct.git
cd FireAct
pip install -r requirements.txt
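
A quick way to confirm both keys are visible before running the demo (a minimal check, not part of the repo):

import os

# generation.py needs both keys when using an OpenAI backend plus the search tool.
for key in ("OPENAI_API_KEY", "SERPAPI_API_KEY"):
    assert os.environ.get(key), f"{key} is not set"
print("API keys found.")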

Run Demo

Data Generation

Example:

python generation.py \
    --task hotpotqa \
    --backend gpt-4 \
    --promptpath default \
    --evaluate \
    --random \
    --task_split val \
    --temperature 0 \
    --task_end_index 5

See details with the command python generation.py -h

Set --task_end_index to a large value (in the thousands) to collect a sufficient number of good data samples. [WARNING] This is costly with gpt-4 and SerpAPI.

You need to convert the trajectories into Alpaca format or GPT format before training. See our examples here.
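
As a hedged sketch of that conversion, assuming each saved trajectory exposes the question and the full ReAct-style completion under hypothetical keys ("question", "traj"); adapt the field names to whatever generation.py actually writes under trajs/:

import json

# Hypothetical field names; adjust to the actual schema of the files under trajs/.
def to_alpaca(traj_path: str, out_path: str, instruction: str) -> None:
    with open(traj_path) as f:
        trajectories = json.load(f)
    records = [
        {
            "instruction": instruction,
            "input": t["question"],  # the task question
            "output": t["traj"],     # Thought/Action/Observation ... finish[answer]
        }
        for t in trajectories
    ]
    with open(out_path, "w") as f:
        json.dump(records, f, indent=2)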

Supervised Fine-tuning

Example:

cd finetune/llama_lora
python finetune.py \
    --base_model meta-llama/Llama-2-13b-chat-hf \
    --data_path ../../data/finetune/alpaca_format/hotpotqa.json \
    --micro_batch_size 8 \
    --num_epochs 30 \
    --output_dir ../../models/lora/fireact-llama-2-13b \
    --val_set_size 0.01 \
    --cutoff_len 512

See details here.
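
The LoRA training script follows tloen/alpaca-lora; roughly, the adapter is attached with peft along the lines below. The hyperparameter values here are illustrative, not the repo's exact settings (those are in finetune/llama_lora/finetune.py):

from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Illustrative adapter setup in the style of alpaca-lora; values are placeholders.
base = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-2-13b-chat-hf")
config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the LoRA weights are trainable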

Inference

Example (FireAct Llama):

python generation.py \
    --task hotpotqa \
    --backend llama \
    --evaluate \
    --random \
    --task_split dev \
    --task_end_index 5 \
    --modelpath meta-llama/Llama-2-7b-chat \
    --add_lora \
    --alpaca_format \
    --peftpath forestai/fireact_llama_2_7b_lora 

Example (FireAct GPT):

python generation.py \
    --task hotpotqa \
    --backend ft:gpt-3.5-turbo-0613:<YOUR_MODEL> \
    --evaluate \
    --random \
    --task_split dev \
    --temperature 0 \
    --chatgpt_format \
    --task_end_index 5

See details with the command python generation.py -h

Set --task_end_index 500 for quantitative evaluations. See our examples here.
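
Scoring for HotpotQA-style tasks is typically exact match after light normalization; --evaluate handles this inside generation.py, but a generic sketch of the metric looks like:

import re
import string

def normalize(text: str) -> str:
    """Lowercase, strip punctuation and articles, collapse whitespace (SQuAD-style)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction: str, gold: str) -> bool:
    return normalize(prediction) == normalize(gold)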

Model Zoo

We release a selected set of multitask models based on the Llama family. Details can be found in their model cards.

| Base Model     | Training Method | Hugging Face                          |
| -------------- | --------------- | ------------------------------------- |
| Llama2-7B      | LoRA            | forestai/fireact_llama_2_7b_lora      |
| Llama2-13B     | LoRA            | forestai/fireact_llama_2_13b_lora     |
| CodeLlama-7B   | LoRA            | forestai/fireact_codellama_7b_lora    |
| CodeLlama-13B  | LoRA            | forestai/fireact_codellama_13b_lora   |
| CodeLlama-34B  | LoRA            | forestai/fireact_codellama_34b_lora   |
| Llama2-7B      | Full Model      | forestai/fireact_llama_2_7b           |
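
A minimal sketch of loading one of the LoRA adapters outside of generation.py with transformers and peft (the base checkpoint below is an assumption; check each adapter's model card for the intended base):

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Assumed base checkpoint; confirm against the adapter's model card.
base_id = "meta-llama/Llama-2-7b-chat-hf"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, device_map="auto")

# Attach the released FireAct adapter on top of the base weights.
model = PeftModel.from_pretrained(base, "forestai/fireact_llama_2_7b_lora")
model.eval()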

References

  1. Our generation code is based on ysymyth/ReAct
  2. Our Llama full model training code is based on tatsu-lab/stanford_alpaca
  3. Our Llama LoRA training code is based on tloen/alpaca-lora
  4. Our GPT fine-tuning code is based on anchen1011/chatgpt-finetune-ui