
AutoConfig has potential issue with composite config. #38258 solved #38672


Open — wants to merge 2 commits into main

Conversation

gspeter-max

Fixes #38258

Description

This PR resolves an issue where keyword arguments passed to from_pretrained or from_config for composite models were not being correctly routed to the respective sub-configs. This would lead to a TypeError when an argument intended for a sub-model (e.g., use_cache=True for a text model) was passed down to a child constructor that did not accept it.
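The failure mode can be reproduced in miniature with plain Python classes. These toy classes are illustrative stand-ins for the real composite configs, not the actual transformers code:

```python
# Hypothetical minimal reproduction of the bug (names are illustrative).
class TextConfig:
    def __init__(self, use_cache=False):
        self.use_cache = use_cache

class VisionConfig:
    def __init__(self, image_size=224):
        self.image_size = image_size

class CompositeConfig:
    def __init__(self, **kwargs):
        # Without routing, every kwarg is forwarded to every child constructor:
        self.text_config = TextConfig(**kwargs)
        self.vision_config = VisionConfig(**kwargs)  # use_cache -> TypeError

try:
    CompositeConfig(use_cache=True)
except TypeError as exc:
    print(exc)  # e.g. "...got an unexpected keyword argument 'use_cache'"
```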

Solution

The solution introduces a new private static method, _route_kwargs, on _BaseAutoModelClass in src/transformers/models/auto/auto_factory.py. This centralized helper method is responsible for:

  1. Iterating through the provided kwargs.
  2. Checking if a given keyword argument is a valid attribute of any of the model's sub-configs (e.g., text_config, vision_config).
  3. If a match is found, the attribute is set on the corresponding sub-config object (e.g., config.text_config.use_cache = True).
  4. The keyword argument is then removed from the main kwargs dictionary to prevent it from being passed down incorrectly.

This helper method is now called from the entry points of both the from_pretrained and from_config methods. This ensures that the argument routing is applied robustly and consistently, regardless of how a user chooses to load a composite model.

This approach fixes the underlying issue in the factory layer, providing a general solution for all current and future composite models.
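The four steps above can be sketched as a standalone function. This is an illustrative version only — the sub-config names and the SimpleNamespace stand-ins are assumptions, not the PR's actual implementation:

```python
from types import SimpleNamespace

def _route_kwargs(config, kwargs, sub_config_names=("text_config", "vision_config")):
    """Route kwargs to matching sub-configs and drop them from `kwargs`."""
    for key in list(kwargs):                                 # 1. iterate kwargs
        for name in sub_config_names:
            sub_config = getattr(config, name, None)
            if sub_config is not None and hasattr(sub_config, key):  # 2. valid attr?
                setattr(sub_config, key, kwargs[key])        # 3. set on sub-config
                kwargs.pop(key, None)                        # 4. remove from kwargs
                break
    return kwargs

cfg = SimpleNamespace(
    text_config=SimpleNamespace(use_cache=False),
    vision_config=SimpleNamespace(image_size=224),
)
remaining = _route_kwargs(cfg, {"use_cache": True, "torch_dtype": "float16"})
# use_cache is routed to text_config; torch_dtype stays in the main kwargs
```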

Testing

I have confirmed that these changes fix the issue by running the relevant tests in tests/models/auto/test_modeling_auto_composite.py. Additionally, all quality checks (make quality) pass successfully.

@gspeter-max
Author

Hi Maintainer,

The main files I contributed changes to in this pull request are:

- tests/models/auto/test_modeling_auto_composite.py (added a new test)
- src/transformers/models/auto/auto_factory.py (made code changes to support the test)

The other changes in this PR are only formatting updates, applied automatically when I ran ruff format; there are no logic changes in those files.

Thank you!

Member

@zucchini-nlp zucchini-nlp left a comment

Hmm, I don't think this is what we want. Sub-configs might share the same config keys; for example, num_attention_heads and hidden_size are common kwargs found in many configs.

I think we would have to ask users to indicate which sub-config they want to modify and then update the values there. We will assume the user's intention is to update one value and keep the sub-config's other defaults. That will be a bigger change and require more updates to each model's config file.

I don't have a draft yet, so if you want to give it a stab, feel free to make changes. You can start by playing with a few models instead of updating all at once. Once the design is approved, we can update all config files :)
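The suggestion above — having the user name the target sub-config explicitly so shared keys are unambiguous — could look something like the following sketch. The nested-dict API shape is purely hypothetical, not an agreed design:

```python
from types import SimpleNamespace

def apply_sub_config_overrides(config, overrides):
    """Apply overrides shaped like {"text_config": {...}, "vision_config": {...}},
    leaving each sub-config's other defaults untouched."""
    for sub_name, values in overrides.items():
        sub_config = getattr(config, sub_name)  # fails loudly on an unknown name
        for key, value in values.items():
            setattr(sub_config, key, value)
    return config

cfg = SimpleNamespace(
    text_config=SimpleNamespace(hidden_size=768, num_attention_heads=12),
    vision_config=SimpleNamespace(hidden_size=1024, num_attention_heads=16),
)
# hidden_size exists in both sub-configs, but the override is unambiguous:
apply_sub_config_overrides(cfg, {"text_config": {"hidden_size": 512}})
```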

Successfully merging this pull request may close these issues.

AutoConfig has potential issue with composite config.
2 participants