Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add syncing models utility to ivy #28818

Merged
merged 10 commits into from
Sep 19, 2024
Merged

feat: add syncing models utility to ivy #28818

merged 10 commits into from
Sep 19, 2024

Conversation

YushaArif99
Copy link
Contributor

@YushaArif99 YushaArif99 commented Sep 12, 2024

Wanted to have your thoughts on how to expose these utility functions? I think it makes sense to have them inside the ivy.functional.backends.tensorflow.module.py module. But this creates an issue as to how we import this helper. So there wont be any ivy.sync_models_torch_and_tf but instead we'll have to do:

from ivy.functional.backends.tensorflow.module import sync_models_torch_and_tf
sync_models_torch_and_tf(...)

What do you guys suggest we do here @Sam-Armstrong @hmahmood24

@hmahmood24
Copy link
Contributor

Wanted to have your thoughts on how to expose these utility functions? I think it makes sense to have them inside the ivy.functional.backends.tensorflow.module.py module. But this creates an issue as to how we import this helper. So there wont be any ivy.sync_models_torch_and_tf but instead we'll have to do:

from ivy.functional.backends.tensorflow.module import sync_models_torch_and_tf
sync_models_torch_and_tf(...)

What do you guys suggest we do here @Sam-Armstrong @hmahmood24

My thinking was just exposing these as util functions so something like ivy.utils.sync_models_torch_and_tf or ivy.syncing_utils.sync_models_torch_and_tf seems cleaner to me rather than having to import the backends explicitly. What do you think @YushaArif99 ?

@Sam-Armstrong
Copy link
Contributor

@YushaArif99 Couldn't we just store all helper functions like this in a new file like ivy/utils/syncing.py or something like that? Is there any need to have this in the backend?

Also, I'd suggest we could expose a general ivy.sync_models function, which will detect the relevant models types and route to the correct function - sync_models_torch_and_tf or whatever. If anything other than torch and tf models are passed, for now we can just throw a ivy.exceptions.IvyNotImplementedException. I think this would be the best UX.

@YushaArif99
Copy link
Contributor Author

Sure! This is indeed a better UX. I was only inclined to not go this route as the logic contained within these utility helpers is framework-specific. And so it seems to deviate from the general convention of ivy API being an intermediary.
Then there's also a case of having dependency imports so will probably need to have the framework-specific imports as local imports.

Otherwise, this is definitely cleaner.

Based on this, should we still go forward with having these inside ivy.utils @hmahmood24 @Sam-Armstrong ?

@YushaArif99
Copy link
Contributor Author

@YushaArif99 Couldn't we just store all helper functions like this in a new file like ivy/utils/syncing.py or something like that? Is there any need to have this in the backend?

Also, I'd suggest we could expose a general ivy.sync_models function, which will detect the relevant models types and route to the correct function - sync_models_torch_and_tf or whatever. If anything other than torch and tf models are passed, for now we can just throw a ivy.exceptions.IvyNotImplementedException. I think this would be the best UX.

Yeah that's a good idea 👍🏼

@Sam-Armstrong
Copy link
Contributor

Sam-Armstrong commented Sep 12, 2024

@YushaArif99 I don't see an issue with it, because you'd have to import torch into the tensorflow backend anyway, which is also against the general convention of ivy. But I guess if we do expose ivy.sync_models it wouldn't be super important where functions like sync_models_torch_and_tf are located anyway, as they wouldn't be used directly by the user. So I guess it's up to you really 👍

@YushaArif99
Copy link
Contributor Author

@YushaArif99 I don't see an issue with it, because you'd have to import torch into the tensorflow backend anyway, which is also against the general convention of ivy. But I guess if we do expose ivy.sync_models it wouldn't be super important where functions like sync_models_torch_and_tf are located anyway, as they wouldn't be used directly by the user. So I guess it's up to you really 👍

Yeah I think exposing ivy.sync_models as the main interface is much cleaner. This way, we can have sync_models_torch_and_tf inside the TF module.py and similarily sync_models_torch_and_jax inside the JAX module.py which I feel is more intuitive.

I'll go ahead with this approach then. Thanks for the suggestions both!

@YushaArif99
Copy link
Contributor Author

YushaArif99 commented Sep 18, 2024

Hey @hmahmood24 @Sam-Armstrong, could you guys give this a final look and confirm whether we're all okay with this approach. I have added all the syncing logic inside ivy.stateful.utilities.py and have exposed ivy.sync_models_torch as the main interface for users.

Copy link
Contributor

@Sam-Armstrong Sam-Armstrong left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! quick question though, could we generalise this to just be called ivy.sync_models, and in the longer term use this same function for syncing models from any framework? I feel like that would be a cleaner api. for now we can just throw a meaningful exception if a torch.nn.Module is not the first argument. what do you think @YushaArif99?

ivy/functional/backends/tensorflow/module.py Outdated Show resolved Hide resolved
@YushaArif99
Copy link
Contributor Author

question though, could we generalise this to just be called ivy.sync_models, and in the longer term use this same function for syncing models from any framework?

yeah we can definately do that. We can add source and target kwargs and then reroute to the appropriate implementation based on their values.

@Sam-Armstrong
Copy link
Contributor

@YushaArif99 do we need source/target kwargs, can't we just infer based on the input types?

Copy link
Contributor

@hmahmood24 hmahmood24 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@YushaArif99 LGTM. Agreed with @Sam-Armstrong's point though that keeping a single ivy.sync_models API should be cleaner if we can implement that 👍🏼

@YushaArif99
Copy link
Contributor Author

YushaArif99 commented Sep 19, 2024

@YushaArif99 do we need source/target kwargs, can't we just infer based on the input types?

@Sam-Armstrong that would require us importing all frameworks inside ivy.sync_models which is not ideal imo.

@Sam-Armstrong
Copy link
Contributor

@YushaArif99 maybe we could check if each framework is in sys.modules when doing the check? so something like:

if "torch" in sys.modules:
    import torch

    if isinstance(x, torch.nn.Module):

@hmahmood24
Copy link
Contributor

hmahmood24 commented Sep 19, 2024

@YushaArif99 do we need source/target kwargs, can't we just infer based on the input types?

@Sam-Armstrong that would require us importing all frameworks inside ivy.sync_models which is not ideal imo.

@YushaArif99 Can't we just use this which shouldn't require us importing any fws

…tances of native modules by traversing through the `mro` chain.
@YushaArif99 YushaArif99 merged commit ba475c1 into main Sep 19, 2024
2 of 5 checks passed
@YushaArif99 YushaArif99 deleted the sync_model_pt branch September 19, 2024 16:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants