-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update sft trainer to include better packing #3100
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your contribution, and sorry for the late reply. I've made some comments. Can you please also:
- include a test for
pack_example_smater
- modify the packing argument in
SFTConfig
to allow for this packing method
return knapsacks, [[tup[1]] for tup in numbers] | ||
|
||
|
||
def pack_examples_smarter(examples: dict[str, list[list]], seq_length: int) -> dict[str, list[list]]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wdyt of
def pack_examples_smarter(examples: dict[str, list[list]], seq_length: int) -> dict[str, list[list]]: | |
def pack_examples_knapsack(examples: dict[str, list[list]], seq_length: int) -> dict[str, list[list]]: |
or even
def pack_examples_smarter(examples: dict[str, list[list]], seq_length: int) -> dict[str, list[list]]: | |
def pack_examples_knapsack_greedy(examples: dict[str, list[list]], seq_length: int) -> dict[str, list[list]]: |
packing: bool, | ||
formatting_func: Optional[Callable[[dict], str]], | ||
dataset_name: str, | ||
pack_smart: bool = False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe this is better:
packing: bool, | |
formatting_func: Optional[Callable[[dict], str]], | |
dataset_name: str, | |
pack_smart: bool = False, | |
packing: Union[bool, str], | |
formatting_func: Optional[Callable[[dict], str]], | |
dataset_name: str, |
and use if packing == "knapsack":
What does this PR do?
Fixes #2466
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines.
Who can review?
@qgallouedec @shirinyamani @AIR-hl
Hi everyone, I am new to TRL, have attempted a fix at the issue 2466 here.
The greedy knapsack tries to fit as many examples as possible without overshooting, the remaining ones are chunked as before by breaking it up into smaller example size (<= max_seq_len). Also I havent made this the default packing and kept it under a flag.
Please let me know if this is close to what was expected, I can add tests then, thanks!