Conversation with agent with finetuned model #240
Open
zyzhang1130 wants to merge 60 commits into modelscope:main from zyzhang1130:conversation_with_agent_with_finetuned_model
Changes from 53 commits
7b754a9 zyzhang1130: added features to download models from the hugging face model hub/loa…
ea00db0 zyzhang1130: added customized hyperparameters specification
3e8c468 zyzhang1130: added docstring and made changes in accordance with the comments
10a9870 zyzhang1130: decoupled model loading and tokenizer loading. Now can load tokenizer…
5237356 zyzhang1130: removed unnecessary info in README
a6918eb zyzhang1130: resolved all issues flagged by `pre-commit run`
b4f4f40 zyzhang1130: further removed info irrelevant to model loading and finetuning
e33b3de zyzhang1130: Update huggingface_model.py
8023820 zyzhang1130: updated according to suggestions given
0a079b9 zyzhang1130: added updated README
a4d1f1b zyzhang1130: updated README for two examples and tested on 3 model_type.
6b5410e zyzhang1130: undo update to conversation_with_mentions README (created a dedicated…
6d10051 zyzhang1130: reverted changes made to conversation_with_RAG_agents\README.md
db27edd zyzhang1130: resolved pre-commit related issues
b371226 zyzhang1130: resolved pre-commit related issues
7f3a012 zyzhang1130: resolved pre-commit related issues
15bf79a zyzhang1130: resolve issues mentioned
9998e66 zyzhang1130: resolve issues raised
f6b46ed zyzhang1130: resolve issues raised
6bf09f1 zyzhang1130: Update README.md
8d7e880 zyzhang1130: Update README.md
195ac69 zyzhang1130: Merge branch 'modelscope:main' into main
98b471e zyzhang1130: Update huggingface_model.py
ee062fd zyzhang1130: reverted unnecessary changes
6203f51 zyzhang1130: Revert back unnecessary changes
ce0671f zyzhang1130: revert unnecessary change
456f1ae zyzhang1130: revert back unnecessary changes
fbca61f zyzhang1130: revert back unnecessary changes
a0882fa zyzhang1130: revert back unnecessary changes
6e9dca6 zyzhang1130: revert unnecessary changes
46233ff zyzhang1130: optimized data filtering speed (data preprocessing)
2d3c249 zyzhang1130: Merge branch 'modelscope:main' into conversation_with_agent_with_fine…
5b6cf2c zyzhang1130: Delete examples/distributed_simulation/run_simlation.sh
28037bc zyzhang1130: renamed `Finetune_DialogAgent` to `FinetuneDialogAgent`
3fcc03b zyzhang1130: rename `finetune_dialogAgent` to `FinetuneDialogAgent`
3fb3540 zyzhang1130: Updated README to be more precise in its description
ce1373b zyzhang1130: fixed error on quantization/QLora
b2f2b09 zyzhang1130: add required dependencies for some use cases
1d0704d zyzhang1130: optimized the behavior of `device_map` when loading a huggingface model.
8685213 zyzhang1130: optimized the behavior of device_map when loading a huggingface model.
3da24f4 zyzhang1130: optimized the behavior of when loading a huggingface model (default …
2ea8703 zyzhang1130: optimized the behavior of when loading a huggingface model (default …
e0ba282 zyzhang1130: updated peft config
7f2f565 zyzhang1130: updated peft config
427f9f5 zyzhang1130: now the user can choose to do full-parameter finetuning by not passin…
e2f1cf6 zyzhang1130: Merge remote-tracking branch 'origin/conversation_with_agent_with_fin…
432bb21 zyzhang1130: moved peft loading to ; removed independently saving tokenizer
a27c3a8 zyzhang1130: moved peft loading to ; removed independently saving tokenizer
b532346 zyzhang1130: Update huggingface_model.py
d0bcc34 zyzhang1130: updated example `conversation_with_agent_with_finetuned_model` accord…
21a9f21 zyzhang1130: bug fixing
97e6330 zyzhang1130: bug fixing
b4a3849 zyzhang1130: bug fixing
e05e606 zyzhang1130: Merge branch 'modelscope:main' into conversation_with_agent_with_fine…
799cac5 zyzhang1130: changed to using `chat_template` format for finetuning. Resolve the b…
a13b219 zyzhang1130: reformatted according to pre-commit
06d3ae1 zyzhang1130: when `continue_lora_finetuning` is `True`, check if model is already…
50f7ae7 zyzhang1130: update README regarding `continue_lora_finetuning` and `lora_config`…
19ed176 zyzhang1130: Merge branch 'modelscope:main' into conversation_with_agent_with_fine…
99cecb6 zyzhang1130: removed # pylint: disable=useless-parent-delegation in FinetuneDialog…
136 changes: 136 additions & 0 deletions
examples/conversation_with_agent_with_finetuned_model/FinetuneDialogAgent.py
@@ -0,0 +1,136 @@
# -*- coding: utf-8 -*-
"""
This module provides the FinetuneDialogAgent class,
which extends DialogAgent to enhance fine-tuning
capabilities with custom hyperparameters.
"""
from typing import Any, Optional, Dict
from loguru import logger
from agentscope.agents import DialogAgent


class FinetuneDialogAgent(DialogAgent):
    """
    A dialog agent capable of fine-tuning its
    underlying model based on provided data.

    Inherits from DialogAgent and adds functionality for
    fine-tuning with custom hyperparameters.
    """

    def __init__(
        self,
        name: str,
        sys_prompt: str,
        model_config_name: str,
        use_memory: bool = True,
        memory_config: Optional[dict] = None,
    ):
        """
        Initializes a new FinetuneDialogAgent with specified configuration.

        Arguments:
            name (str): Name of the agent.
            sys_prompt (str): System prompt or description of the agent's role.
            model_config_name (str): The configuration name for
                the underlying model.
            use_memory (bool, optional): Indicates whether to utilize
                memory features. Defaults to True.
            memory_config (dict, optional): Configuration for memory
                functionalities if `use_memory` is True.

        Note:
            Refer to `class DialogAgent(AgentBase)` for more information.
        """
        # pylint: disable=useless-parent-delegation
        super().__init__(
            name,
            sys_prompt,
            model_config_name,
            use_memory,
            memory_config,
        )

    def load_model(
        self,
        pretrained_model_name_or_path: Optional[str] = None,
        local_model_path: Optional[str] = None,
        fine_tune_config: Optional[Dict[str, Any]] = None,
    ) -> None:
        """
        Load a new model into the agent.

        Arguments:
            pretrained_model_name_or_path (str): The Hugging Face
                model ID or a custom identifier.
                Needed if loading model from Hugging Face.
            local_model_path (str, optional): Path to a locally saved model.
            fine_tune_config (dict, optional): Configuration passed on
                to the model wrapper when loading the model.

        Raises:
            Exception: If the model loading process fails.

        Note:
            If the model wrapper does not support dynamic loading,
            an error is logged and no model is loaded.
        """
        if hasattr(self.model, "load_model"):
            self.model.load_model(
                pretrained_model_name_or_path,
                local_model_path,
                fine_tune_config,
            )
        else:
            logger.error(
                "The model wrapper does not support dynamic model loading.",
            )

    def load_tokenizer(
        self,
        pretrained_model_name_or_path: Optional[str] = None,
        local_model_path: Optional[str] = None,
    ) -> None:
        """
        Load a new tokenizer for the agent.

        Arguments:
            pretrained_model_name_or_path (str): The Hugging Face model
                ID or a custom identifier.
                Needed if loading tokenizer from Hugging Face.
            local_model_path (str, optional): Path to a locally saved
                tokenizer.

        Raises:
            Exception: If the tokenizer loading process fails.

        Note:
            If the model wrapper does not support dynamic loading,
            an error is logged and no tokenizer is loaded.
        """
        if hasattr(self.model, "load_tokenizer"):
            self.model.load_tokenizer(
                pretrained_model_name_or_path,
                local_model_path,
            )
        else:
            logger.error("The model wrapper does not support dynamic loading.")

    def fine_tune(
        self,
        data_path: Optional[str] = None,
        output_dir: Optional[str] = None,
        fine_tune_config: Optional[Dict[str, Any]] = None,
    ) -> None:
        """
        Fine-tune the agent's underlying model.

        Arguments:
            data_path (str): The path to the training data.
            output_dir (str, optional): User specified path
                to save the fine-tuned model and its tokenizer.
                By default save to this example's
                directory if not specified.
            fine_tune_config (dict, optional): Hyperparameters and
                other training options passed on to the model wrapper.

        Raises:
            Exception: If fine-tuning fails.

        Note:
            If the model wrapper does not support fine-tuning,
            an error is logged and no training takes place.
        """
        if hasattr(self.model, "fine_tune"):
            self.model.fine_tune(data_path, output_dir, fine_tune_config)
        else:
            logger.error("The model wrapper does not support fine-tuning.")
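The three methods in this file share one pattern: check with `hasattr` whether the model wrapper exposes the method, delegate if it does, and log an error otherwise. The following is a minimal sketch of that pattern in isolation, not code from this PR. `_StubWrapper`, `_PlainWrapper`, and `delegate` are hypothetical names, and the standard `logging` module stands in for `loguru` so the snippet has no third-party dependencies.

```python
import logging
from typing import Any

logger = logging.getLogger("finetune_sketch")


class _StubWrapper:
    """Hypothetical wrapper that supports fine-tuning."""

    def __init__(self) -> None:
        self.calls: list = []

    def fine_tune(self, data_path, output_dir=None, fine_tune_config=None):
        # Record the arguments instead of training anything.
        self.calls.append((data_path, output_dir, fine_tune_config))


class _PlainWrapper:
    """Hypothetical wrapper without a `fine_tune` method."""


def delegate(model: Any, method: str, *args: Any) -> bool:
    """Call `model.<method>(*args)` if it exists, otherwise log an error.

    Returns True when the call was delegated and False when the
    wrapper lacks the method.
    """
    if hasattr(model, method):
        getattr(model, method)(*args)
        return True
    logger.error("The model wrapper does not support %s.", method)
    return False


supported = _StubWrapper()
ok = delegate(
    supported,
    "fine_tune",
    "GAIR/lima",
    None,
    {"training_args": {"max_steps": 300}},
)
not_ok = delegate(_PlainWrapper(), "fine_tune", "GAIR/lima", None, None)
```

Unlike the methods in the diff, which return `None` in both branches, this sketch returns a boolean so a caller can tell whether anything happened. Raising an exception in the unsupported branch would be another option.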
75 changes: 75 additions & 0 deletions
examples/conversation_with_agent_with_finetuned_model/README.md
@@ -0,0 +1,75 @@
# User-Agent Conversation with Custom Model Loading and Fine-Tuning in AgentScope

This example demonstrates how to load and optionally fine-tune a Hugging Face model within a user-agent conversation setup using AgentScope. The complete code is provided in `agentscope/examples/conversation_with_agent_with_finetuned_model`.

## Functionality Overview

Compared to the basic conversation setup, this example introduces model loading and fine-tuning features:

- Initialize an agent or use `dialog_agent.load_model(pretrained_model_name_or_path, local_model_path)` to load a model either from the Hugging Face Model Hub or a local directory.
- Initialize an agent or apply `dialog_agent.fine_tune(data_path)` to fine-tune the model on your dataset with the QLoRA method (https://huggingface.co/blog/4bit-transformers-bitsandbytes).

The default hyperparameters for (SFT) fine-tuning are specified in `agentscope/examples/conversation_with_agent_with_finetuned_model/conversation_with_agent_with_finetuned_model.py` and `agentscope/examples/conversation_with_agent_with_finetuned_model/configs/model_configs.json`. To customize hyperparameters, specify them in `model_configs` if the model is to be fine-tuned at initialization, or pass them through `fine_tune_config` in `FinetuneDialogAgent`'s `fine_tune` method after initialization, as shown in the example script `conversation_with_agent_with_finetuned_model.py`.

## Agent Initialization

When initializing an agent, the following parameters need to be specified:

- `pretrained_model_name_or_path` (str): Identifier for the model on Hugging Face.
- `local_model_path` (str): Local path to the model (defaults to loading from Hugging Face if not provided).
- `data_path` (str): Path to training data (fine-tuning is skipped if not provided).
- `device` (str): The device (e.g., 'cuda', 'cpu') for model operation, defaulting to 'cuda' if available.
- `fine_tune_config` (dict, Optional): A configuration dictionary for fine-tuning the model. It specifies hyperparameters and other training options that are passed to the fine-tuning method. If not provided, default settings are used.
- `huggingface_token` (from .env file): Token required for models needing authentication from Hugging Face.
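One way to picture how a user-supplied `fine_tune_config` relates to the defaults is a section-by-section merge. The sketch below is illustrative only: `DEFAULT_FINE_TUNE_CONFIG` and `merge_fine_tune_config` are hypothetical names, the default values are copied from this example's configuration, and the actual handling lives in `huggingface_model.py`, which may treat missing sections differently.

```python
from copy import deepcopy
from typing import Any, Dict, Optional

# Hypothetical defaults, copied from the example configuration.
DEFAULT_FINE_TUNE_CONFIG: Dict[str, Dict[str, Any]] = {
    "lora_config": {"r": 16, "lora_alpha": 32},
    "training_args": {"max_steps": 200, "logging_steps": 1},
}


def merge_fine_tune_config(
    overrides: Optional[Dict[str, Dict[str, Any]]] = None,
) -> Dict[str, Dict[str, Any]]:
    """Return the defaults with user overrides applied per section."""
    merged = deepcopy(DEFAULT_FINE_TUNE_CONFIG)
    for section, values in (overrides or {}).items():
        # Keys the user sets win; keys the user omits keep the default.
        merged.setdefault(section, {}).update(values)
    return merged


custom = merge_fine_tune_config(
    {
        "lora_config": {"r": 24, "lora_alpha": 48},
        "training_args": {"max_steps": 300},
    },
)
```

With these inputs, `custom["training_args"]` keeps `logging_steps` from the defaults while `max_steps` takes the user's value, and the defaults themselves are left unmodified.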

## Example Outputs
### Before Fine-tuning:
```
System:

Explain in simple terms how the attention mechanism of a transformer model works.

A:

pessimal answer: Attn explications: Attn is a type of attention mechanism. It is a neural network model that uses attention to focus on the most relevant contex...
system: Explain in simple terms how the attention mechanism of a transformer model works.
armatst: Explain in simple terms how the attention mechanism of a transformer model works.
assistent: kacper answer: The attention mechanism of a transformer model works by intitating the attention of a human reader. It glances at the contex...
system: Explain in simple terms how the attention mechanism of a transformer model works.
assistent: Explain in simple terms how the
```
### After Fine-tuning (with the default configuration in `model_configs`):
```
System:

Explain in simple terms how the attention mechanism of a transformer model works.

A:

Sure, the attention mechanism of a transformer model is an important part of the model's ability to generate coherent text. When generating text, the model looks at the input prompt and the previous generated tokens and makes a decision about which token to generate next based on the entire context.
Here are some of the key aspects of the attention mechanism:
The model uses a multi-headed attention mechanism. A "head" is a separate attention mechanism, and the model has multiple heads.
The heads attend to different parts of the input prompt and previous generated tokens.
The heads output weights used in the final output layer to
```
(This example was trained with the default settings; training took 872 seconds and used 9.914 GB of GPU memory. Reducing the training batch size lowers the memory required. Note that the model is loaded in 4 bits, i.e., QLoRA.)

## Tested Models

The example is tested using the Hugging Face model `google/gemma-7b` on the dataset `GAIR/lima`. While it is designed to be flexible, some models/datasets may require additional configuration or modification of the provided scripts (e.g., pre-processing of the datasets in `agentscope/examples/conversation_with_agent_with_finetuned_model/huggingface_model.py`).

## Prerequisites

Before running this example, ensure you have installed the following packages:

- `transformers`
- `python-dotenv`
- `datasets`
- `trl`
- `bitsandbytes`
- `sentencepiece`

Additionally, set `HUGGINGFACE_TOKEN` in `agentscope/examples/conversation_with_agent_with_finetuned_model/.env`.

```bash
python conversation_with_agent_with_finetuned_model.py
```
22 changes: 22 additions & 0 deletions
examples/conversation_with_agent_with_finetuned_model/configs/model_configs.json
@@ -0,0 +1,22 @@
[
    {
        "model_type": "huggingface",
        "config_name": "my_custom_model",

        "pretrained_model_name_or_path": "google/gemma-7b",

        "max_length": 128,
        "device": "cuda",

        "data_path": "GAIR/lima",

        "fine_tune_config": {
            "lora_config": {"r": 16, "lora_alpha": 32},
            "training_args": {"max_steps": 200, "logging_steps": 1},
            "bnb_config": {"load_in_4bit": true,
                           "bnb_4bit_use_double_quant": true,
                           "bnb_4bit_quant_type": "nf4",
                           "bnb_4bit_compute_dtype": "bfloat16"}
        }
    }
]
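JSON has its own boolean literals (`true` and `false`). A quoted `"True"` is loaded as a Python string, not a `bool`, which matters for flags such as `load_in_4bit`. The following is a small sketch of a pre-flight check, assuming the config layout used in this example. `find_non_boolean_flags` is a hypothetical helper and is not part of the PR.

```python
import json
from typing import Any, Dict, List, Sequence

# A trimmed copy of the config layout used in this example.
CONFIG_TEXT = """
[
  {
    "config_name": "my_custom_model",
    "fine_tune_config": {
      "bnb_config": {
        "load_in_4bit": true,
        "bnb_4bit_use_double_quant": true,
        "bnb_4bit_quant_type": "nf4",
        "bnb_4bit_compute_dtype": "bfloat16"
      }
    }
  }
]
"""


def find_non_boolean_flags(
    bnb_config: Dict[str, Any],
    flag_names: Sequence[str] = ("load_in_4bit", "bnb_4bit_use_double_quant"),
) -> List[str]:
    """Return the names of flags that are present but not real booleans."""
    return [
        name
        for name in flag_names
        if name in bnb_config and not isinstance(bnb_config[name], bool)
    ]


configs = json.loads(CONFIG_TEXT)
bnb = configs[0]["fine_tune_config"]["bnb_config"]
problems = find_non_boolean_flags(bnb)

# A quoted "True" survives JSON parsing as a string and is flagged.
quoted_problems = find_non_boolean_flags({"load_in_4bit": "True"})
```

Running a check like this before handing the dictionary to the quantization config surfaces type mistakes early, instead of relying on how the downstream library treats a truthy string.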
133 changes: 133 additions & 0 deletions
...versation_with_agent_with_finetuned_model/conversation_with_agent_with_finetuned_model.py
@@ -0,0 +1,133 @@
# -*- coding: utf-8 -*-
"""
This script sets up a conversational agent using
AgentScope with a Hugging Face model.
It includes initializing a FinetuneDialogAgent,
loading and fine-tuning a pre-trained model,
and conducting a dialogue via a sequential pipeline.
The conversation continues until the user exits.
Features include model and tokenizer loading,
and fine-tuning on the lima dataset with adjustable parameters.
"""
# pylint: disable=unused-import
from huggingface_model import HuggingFaceWrapper
from FinetuneDialogAgent import FinetuneDialogAgent
import agentscope
from agentscope.agents.user_agent import UserAgent
from agentscope.pipelines.functional import sequentialpipeline


def main() -> None:
    """A basic conversation demo with a custom model"""

    # Initialize AgentScope with your custom model configuration

    agentscope.init(
        model_configs=[
            {
                "model_type": "huggingface",
                "config_name": "my_custom_model",
                # Or another generative model of your choice.
                # Needed for loading from Hugging Face.
                "pretrained_model_name_or_path": "google/gemma-7b",
                # "local_model_path": , # Specify your local model path
                # "local_tokenizer_path":, # Specify your local tokenizer path
                "max_length": 128,
                # Device for inference. Fine-tuning occurs on GPUs.
                "device": "cuda",
                # Specify a Hugging Face data path if you
                # wish to fine-tune the model from the start
                "data_path": "GAIR/lima",
                # "output_dir":
                # fine_tune_config (Optional): Configuration for
                # fine-tuning the model.
                # This dictionary can include hyperparameters and other
                # training options that will be passed to the
                # fine-tuning method. Defaults to None.
                # `lora_config` and `training_args` follow
                # the standard LoRA and SFTTrainer fields.
                "fine_tune_config": {
                    "lora_config": {
                        "r": 16,
                        "lora_alpha": 32,
                        "lora_dropout": 0.05,
                        "bias": "none",
                        "task_type": "CAUSAL_LM",
                    },
                    "training_args": {
                        "num_train_epochs": 5,
                        "logging_steps": 1,
                    },
                    "bnb_config": {
                        "load_in_4bit": True,
                        "bnb_4bit_use_double_quant": True,
                        "bnb_4bit_quant_type": "nf4",
                        "bnb_4bit_compute_dtype": "bfloat16",
                    },
                },
            },
        ],
    )

    # # alternatively can load `model_configs` from json file
    # agentscope.init(
    #     model_configs="./configs/model_configs.json",
    # )

    # Init agents with the custom model
    dialog_agent = FinetuneDialogAgent(
        name="Assistant",
        sys_prompt=(
            "Explain in simple terms how the attention mechanism of "
            "a transformer model works."
        ),
        # Use your custom model config name here
        model_config_name="my_custom_model",
    )

    # (Optional) can load another model after
    # the agent has been instantiated if needed
    # (for `fine_tune_config` specify only
    # `lora_config` and `bnb_config` if used)
    dialog_agent.load_model(
        pretrained_model_name_or_path="google/gemma-7b",
        local_model_path=None,
        fine_tune_config={
            "lora_config": {"r": 24, "lora_alpha": 48},
            "bnb_config": {
                "load_in_4bit": True,
                "bnb_4bit_use_double_quant": True,
                "bnb_4bit_quant_type": "nf4",
                "bnb_4bit_compute_dtype": "bfloat16",
            },
        },
    )  # load model from Hugging Face

    dialog_agent.load_tokenizer(
        pretrained_model_name_or_path="google/gemma-7b",
        local_model_path=None,
    )  # load tokenizer

    # fine-tune loaded model with lima dataset
    # with customized hyperparameters
    # (`fine_tune_config` argument is optional
    # (specify only `lora_config` and
    # `training_args` if used). Defaults to None.)
    dialog_agent.fine_tune(
        "GAIR/lima",
        fine_tune_config={
            "lora_config": {"r": 24, "lora_alpha": 48},
            "training_args": {"max_steps": 300, "logging_steps": 3},
        },
    )

    user_agent = UserAgent()

    # Start the conversation between user and assistant
    x = None
    while x is None or x.content != "exit":
        x = sequentialpipeline([dialog_agent, user_agent], x)


if __name__ == "__main__":
    main()
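The loop at the end of the script threads one message through the assistant and then the user on every turn, and stops once the user's message is `exit`. The sketch below reproduces that control flow with stand-ins so it can run without AgentScope or a model. `Msg`, `sequential_pipeline_sketch`, `echo_assistant`, and `make_scripted_user` are all hypothetical names; they only assume that a pipeline feeds each operator's output into the next one.

```python
from typing import Callable, Iterable, List, Optional


class Msg:
    """Hypothetical minimal message with a `content` field."""

    def __init__(self, name: str, content: str) -> None:
        self.name = name
        self.content = content


Operator = Callable[[Optional[Msg]], Msg]


def sequential_pipeline_sketch(
    operators: List[Operator],
    x: Optional[Msg] = None,
) -> Optional[Msg]:
    """Feed the output of each operator into the next one, in order."""
    for operator in operators:
        x = operator(x)
    return x


def echo_assistant(x: Optional[Msg]) -> Msg:
    """Stand-in for the dialog agent: replies to whatever it receives."""
    prompt = x.content if x is not None else "(start)"
    return Msg("Assistant", f"reply to: {prompt}")


def make_scripted_user(lines: Iterable[str]) -> Operator:
    """Stand-in for the user agent: returns pre-written inputs in order."""
    remaining = iter(lines)

    def user(_x: Optional[Msg]) -> Msg:
        return Msg("User", next(remaining))

    return user


user = make_scripted_user(["hello", "exit"])
turns = 0
x = None
# Same loop shape as the script: stop once the user's message is "exit".
while x is None or x.content != "exit":
    x = sequential_pipeline_sketch([echo_assistant, user], x)
    turns += 1
```

Because the user is last in the pipeline, the value checked by the loop condition is always the user's latest message, so the assistant still answers once before an `exit` ends the loop.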
remove the disable here

If I remove it, pre-commit fails with 'W0611: Unused HuggingFaceWrapper imported from huggingface_model (unused-import)'. Furthermore, removing `from huggingface_model import HuggingFaceWrapper` causes the default model wrapper to be used, which leads to an error. Moving `HuggingFaceWrapper` to `agentscope/src/agentscope/models` might solve this issue, though.

Can I proceed to make `HuggingFaceWrapper` part of `agentscope/src/agentscope/models` to resolve this issue?
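The behavior described in this thread, where an apparently unused import changes which wrapper is selected, is typical of registration by side effect. The sketch below shows the general mechanism, assuming wrappers are registered under their `model_type` when the class is defined. That is an assumption about the internals and is not confirmed on this page. `WrapperBase`, `DefaultWrapper`, `HuggingFaceWrapperSketch`, and `resolve` are hypothetical names.

```python
from typing import Dict, Type


class WrapperBase:
    """Hypothetical base class that records subclasses by `model_type`."""

    registry: Dict[str, Type["WrapperBase"]] = {}
    model_type: str = ""

    def __init_subclass__(cls, **kwargs) -> None:
        super().__init_subclass__(**kwargs)
        # Only classes that declare a model_type are registered.
        if cls.model_type:
            WrapperBase.registry[cls.model_type] = cls


class DefaultWrapper(WrapperBase):
    """Fallback used when no wrapper is registered for a model_type."""


def resolve(model_type: str) -> Type[WrapperBase]:
    """Look up the wrapper class for a model_type, or fall back."""
    return WrapperBase.registry.get(model_type, DefaultWrapper)


# Before the wrapper class exists, the lookup falls back to the default.
before = resolve("huggingface")


# Defining the class (which is what importing its module does) registers it.
class HuggingFaceWrapperSketch(WrapperBase):
    model_type = "huggingface"


after = resolve("huggingface")
```

Under this assumption the import is needed for its side effect even though the name is never referenced, which is why the linter reports it as unused. Moving the wrapper into a package that is always imported would make the registration happen without an explicit import in the example script.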