Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Proposal] Make Openapi(REST) client the only entrypoint #398

Open
Al-Pragliola opened this issue Sep 17, 2024 · 3 comments
Open

[Proposal] Make Openapi(REST) client the only entrypoint #398

Al-Pragliola opened this issue Sep 17, 2024 · 3 comments

Comments

@Al-Pragliola
Copy link
Contributor

Al-Pragliola commented Sep 17, 2024

At the moment Model Registry has two client entrypoints in the codebase:

  1. pkg/openapi: the auto generated client from the openapi schema
  2. pkg/core: a grpc client that talks directly with the MLMD service

image

This creates a strong dependency on the MLMD service and makes difficult changing the underlying service without major breaking changes in the future, also we cannot trace/log/audit what users are doing to the metadata store if they bypass our proxy. My proposal is to move the package from pkg/core to internal/core and do a refactoring to make it a switchable datastore for the model registry. What I mean by switchable datastore is to change the config arguments that we take in the proxy:

  • (change) mlmd_hostname -> datastore_hostname
  • (change) mlmd_port -> datastore_port
  • (create) datastore_service(type?) = mlmd / sqlite / whatever

then in model registry we can use a factory method to use the correct datastore implementation of the interface ModelRegistryServiceAPIServicer

I did a check on which projects import pkg/core and for upstream I didn't find any (https://pkg.go.dev/github.com/kubeflow/model-registry/pkg/core?tab=importedby), but we have a breaking change in the midstream (https://pkg.go.dev/github.com/opendatahub-io/model-registry/pkg/core?tab=importedby) we should make the team that works on opendatahub-io/odh-model-controller aware of this change.

@lampajr
Copy link
Member

lampajr commented Sep 18, 2024

Hey @Al-Pragliola, you have my +1 on this proposal for all the motivation you have already highlighted here!

I think that having a single entrypoint is always the best solution, especially here where we don't know if we will change the underlying "datastore" in the future.

Moreover, as you pointed out, I think this is the right timing to do this change as we don't have many external projects relying on the internal pkg/core.

(change) mlmd_hostname -> datastore_hostname
(change) mlmd_port -> datastore_port
(create) datastore_service(type?) = mlmd / sqlite / whatever

I agree on this generalization and I also like the datastore name.

then in model registry we can use a factory method to use the correct datastore implementation of the interface ModelRegistryServiceAPIServicer

Yeah that's an option, otherwise we could also have different core services, one per datastore that we could support in the future by keeping a single ModelRegistryServiceAPIServicer implementation and changing the injected core service accordingly.

I did a check on which projects import pkg/core and for upstream I didn't find any (https://pkg.go.dev/github.com/kubeflow/model-registry/pkg/core?tab=importedby)

AFAIK I can confirm this as well

but we have a breaking change in the midstream (https://pkg.go.dev/github.com/opendatahub-io/model-registry/pkg/core?tab=importedby) we should make the team that works on opendatahub-io/odh-model-controller aware of this change.

Yeah, that's the only component that is using the pkg/core directly that I am aware of, shouldn't be a huge effort to switch to the REST client.

@tarilabs
Copy link
Member

Thank you @Al-Pragliola for this proposal, as anticipated I fully agree.

One minor suggestion in your diagram is:

  • box currently labeled "External application 2" should actually read "Model Registry plug-ins/extensions"

we indeed want to avoid external developers/users consume an API which currently "leaks" an internal implementation choice (the gRPC connection to MLMD) hence motivating this proposal

Beyond the ModelRegistryServiceAPIServicer, as mentioned the "Core API" interface was designed to meet the MR requirements, even before the REST API layer came to be, and in general I can see an option where we could leverage it but without exposing the internal implementation gRPC connection.

Any comment on this documentation text being added btw?

@rareddy
Copy link
Contributor

rareddy commented Sep 18, 2024

(create) datastore_service(type?) = mlmd / sqlite / whatever

I do like this abstraction at the ModelRegistryServiceAPIServicer to include other datastores, this way we can propose an alternative to MLMD while we phase this out (although since we are alpha there is no reason support this beyond initial release).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants