Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Proposal for a TargetNodeDirectory in Tractus-X. #1556

Closed

Conversation

bmg13
Copy link
Contributor

@bmg13 bmg13 commented Sep 16, 2024

WHAT

Decision Record for new proposal for FederatedCatalog TargetNodeDirectory with Tractus-X.

Related with #1585

@bmg13 bmg13 marked this pull request as ready for review September 16, 2024 13:45
Copy link
Contributor

This pull request is stale because it has been open for 7 days with no activity.

@github-actions github-actions bot added the stale label Sep 24, 2024
@florianrusch
Copy link

Doesn't such a solution already exists with the BPN/EDC-Discovery? How would your service be different?

@paullatzelsperger
Copy link
Contributor

Doesn't such a solution already exists with the BPN/EDC-Discovery? How would your service be different?

The TargetNodeDirectory is an internal service to the FederatedCatalog, that would source its information from the BPN/EDC-Discovery service.

@paullatzelsperger paullatzelsperger added documentation Improvements or additions to documentation and removed stale labels Sep 24, 2024

So having a service containing the data of desired BPNL's and respective Connectors' URL's that a certain partner wants, but here the partner itself has the service in their scope and users are able to input (in the API of new service) the Participants.

This solution allows the member to choose the relevant data that interests them (offers from specific participants) enabling the reduction of storage cost and less network calls. Additionally, each member has control on how to host and manage this new service, service changes do not affect other parties (unless contract changes) and is able to be independently scalable.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
This solution allows the member to choose the relevant data that interests them (offers from specific participants) enabling the reduction of storage cost and less network calls. Additionally, each member has control on how to host and manage this new service, service changes do not affect other parties (unless contract changes) and is able to be independently scalable.
This solution allows the member to choose precisely the Target Catalog Nodes that interests them reducing storage cost and network calls. Additionally, each member has control on how to host and manage this new service, service changes do not affect other parties (unless contract changes) and is able to be independently scalable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, nice touch, added here.

Even with a new service registering the Discovery Service (basically making it available in the space to other partners) the existing partners that may want to retrieve the new partners' offers must establish the need to retrieve them by defining the new BPNL somewhere.


A service containing the data of desired BPNL's and respective Connectors' URL's that a certain partner wants, but here the partner itself has the service in their scope and users are able to input (in the API of new service) the Participants. Simply, the new service would have a db/cache containing the BPNL's and Connectors' URL's of each partner this member wants the offers from. Again, **this new service works as the TargetNodeDirectory for the partner's FederatedCatalog**.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you feel the need to put something in simpler terms, why do you keep both forms and not only the simpler one?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah. Removed duplicated content in here.

- List BPNL's (value and connectors associated with it)


Regarding the service's Persistence Store, any suggestion is welcomed. Having a DB or a local Cache can suffice, but other options like persisting in a file could also be considered. Or a mixed between this options could be the best solution.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As this is a decision record document, I would avoid asking for a suggestion. That's what discussions are for. I believe you - as a contributor - have the knowledge to decide which option would suit the needs of tractusx and clear state why that decision is being taken.

The reviewer task is then to give its opinion on the decisions you made and possibly add a suggestion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the suggestion and focused only on proposed solution. Changed in here.

Copy link
Contributor

@rafaelmag110 rafaelmag110 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall this decision record seems to be much more complicated than it actually needs to be. As I said in one of the comments, if you feel the need to use simpler terms, then avoid having repeated text. This is a straight forward decision to explain:
-The FC crawls Target Catalog Nodes, which in essence are connector URLs.
-These nodes are resolved by the FC ExectionManager from the TargetNodeDirectory
-The current directory implementation is not ideal because ...
-The decision is to implement it the following way ...

@rafaelmag110
Copy link
Contributor

IMO I'd avoid making the TargetNodeDirectory extension dependent on the tractusx discovery service. It might come to a point where the association decides the DS is not longer needed and we would need to revisit this again.

A TargetCatalogNode is a connector URL, and representing it as a mapping between BPNL - Connector URL makes it unnecessarily complex.

We can allow the user of this new service API to input various TargetCatalogNodes (as in, a list of connector urls), in a single POST but how those URLs are resolved beforehand isn't the TargetNodeDirectory business.

@ndr-brt ndr-brt self-requested a review September 26, 2024 10:46
@bmg13
Copy link
Contributor Author

bmg13 commented Sep 26, 2024

IMO I'd avoid making the TargetNodeDirectory extension dependent on the tractusx discovery service. It might come to a point where the association decides the DS is not longer needed and we would need to revisit this again.

A TargetCatalogNode is a connector URL, and representing it as a mapping between BPNL - Connector URL makes it unnecessarily complex.

We can allow the user of this new service API to input various TargetCatalogNodes (as in, a list of connector urls), in a single POST but how those URLs are resolved beforehand isn't the TargetNodeDirectory business.

Yeah, I was aiming at abstracting this action from the user and trying to allow the service to do it on its own, however, after reading that comment, I have to agree with you. Removing an external dependency (with some overhead even) from this service and allowing it to work only as a TargetNodeDirectory while giving the user that control is the best approach, specially in the long term. Changed the doc to reflect it.

Copy link
Contributor

@ndr-brt ndr-brt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the reasons to deploy this service separatedly? It could be deployed together with the FederatedCatalog.

I also think it could implement a process to keep the directory in sync with the Discovery Service, and to provide an API to manage a "BPN Whitelist", that will permit to "enable" only specific BPNs.

From the [documentation](https://eclipse-edc.github.io/docs/#/submodule/FederatedCatalog/docs/developer/architecture/federated-catalog.architecture)
> The Federated Catalog requires a list of Target Catalog Nodes, so it knows which endpoints to crawl. This list is provided by the TargetNodeDirectory. During the preparation of a crawl run, the ExecutionManager queries that directory and obtains the list of TCNs.

So having a service containing the data of Connectors' URL's that a certain partner wants and allow them to host it. Users are able to input (using the API of new service) the Connectors' URL's of the connectors they want the catalogs from.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what do we mean here with "service"? a class? a runtime? an endpoint?
could we give it a name as well?

Copy link
Contributor Author

@bmg13 bmg13 Sep 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the idea would be to have a runnable jar (runtime)
following a suggestion, maybe TargetNodeDirectoryService.
Changed in here.


## Approach

The user is able to obtain the Connectors' URL's (through the Discovery Service, as an example) and store them in the new service through its API. The API will allow to save a list of Connectors' URL's in bulk and the service is responsible to store that (in memory or in database). These can later be retrieved and crawled by the Federated Catalog.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this operation meant to be done manually?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, with the goal of the removing the Discovery Service dependency from this TargetNodeDirectoryService the responsibility of obtaining and saving the the Connectors' URL's lies with the user, as sugggested here.

@bmg13
Copy link
Contributor Author

bmg13 commented Sep 26, 2024

What are the reasons to deploy this service separatedly? It could be deployed together with the FederatedCatalog.

The idea is to have this TargetNodeDirectoryService as a valid solution to any use, however if some user intends to define other TND they should not be locked at this Service, I believe.

I also think it could implement a process to keep the directory in sync with the Discovery Service, and to provide an API to manage a "BPN Whitelist", that will permit to "enable" only specific BPNs.

Initially, something similar was proposed, the user inputs the BPNs from which they want the catalogs from (working as a whitelist) and the TargetNodeDirectoryService would collect the respective Connectors' URL's from the Discovery Service and store them. However, and rightly pointed out by @rafaelmag110 in here, there is a way to ensure the user chooses the Connectors' URL's they want, whitelisting them, beforehand (manually by using the Discovery Service, per example) and not create that explicit dependency on TargetNodeDirectoryService.

@bmg13 bmg13 requested a review from ndr-brt September 26, 2024 14:26
Copy link
Contributor

@ndr-brt ndr-brt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the reasons to deploy this service separatedly? It could be deployed together with the FederatedCatalog.

The idea is to have this TargetNodeDirectoryService as a valid solution to any use, however if some user intends to define other TND they should not be locked at this Service, I believe.

I also think it could implement a process to keep the directory in sync with the Discovery Service, and to provide an API to manage a "BPN Whitelist", that will permit to "enable" only specific BPNs.

Initially, something similar was proposed, the user inputs the BPNs from which they want the catalogs from (working as a whitelist) and the TargetNodeDirectoryService would collect the respective Connectors' URL's from the Discovery Service and store them. However, and rightly pointed out by @rafaelmag110 in here, there is a way to ensure the user chooses the Connectors' URL's they want, whitelisting them, beforehand (manually by using the Discovery Service, per example) and not create that explicit dependency on TargetNodeDirectoryService.

To me looks overkill to have another separated service just for the TargetNodeDirectory, I didn't see in this document a valid reason to do so.

I also don't see why it shouldn't interact with the Discovery Service, I mean, everything in software can change, but that's what we do, we adapt to update things to make them work altogether :) (plus I don't think that it would eventually disappear from one day to another).
One downside that I see by having this process as manual is that, for example, if a participant changes the URL, there will be the need of a manual intervention to update it.
BPNs instead are more stable data (it's, in fact, the current Catena-X participant identifier), shorter and easier to read than an URL.

Machines are made for automation, manual work is boring and prone to error, let's just avoid it.

EDIT: I posted this as a comment but I deleted it and re-proposed it as review :)

@bmg13
Copy link
Contributor Author

bmg13 commented Sep 27, 2024

To me looks overkill to have another separated service just for the TargetNodeDirectory, I didn't see in this document a valid reason to do so.

Okay, possible unclear phrasing on my side, but what was intended with this proposal would be to have the TargetNodeDirectoryService as an own runtime, however Tractus-EDC Helm charts would be updated to include this service. Were you suggesting this or are you considering the TargetNodeDirectoryService to be embedded in the FederatedCatalog runtime?

@ndr-brt
Copy link
Contributor

ndr-brt commented Sep 27, 2024

To me looks overkill to have another separated service just for the TargetNodeDirectory, I didn't see in this document a valid reason to do so.

Okay, possible unclear phrasing on my side, but what was intended with this proposal would be to have the TargetNodeDirectoryService as an own runtime, however Tractus-EDC Helm charts would be updated to include this service. Were you suggesting this or are you considering the TargetNodeDirectoryService to be embedded in the FederatedCatalog runtime?

I was suggesting to keep it embedded in the federated catalog, because I don't see why it should deployed in a separated runtime.
I mean, every service could be deployed in a separated runtime, but unless there's a good reason to do so, I would avoid that.

@bmg13
Copy link
Contributor Author

bmg13 commented Sep 27, 2024

I was suggesting to keep it embedded in the federated catalog, because I don't see why it should deployed in a separated runtime.

Okay, this make sense, will rewrite to include TND embedded in the federated catalog.

I also don't see why it shouldn't interact with the Discovery Service, I mean, everything in software can change, but that's what we do, we adapt to update things to make them work altogether :) (plus I don't think that it would eventually disappear from one day to another).
One downside that I see by having this process as manual is that, for example, if a participant changes the URL, there will be the need of a manual intervention to update it.
BPNs instead are more stable data (it's, in fact, the current Catena-X participant identifier), shorter and easier to read than an URL.

I see. So we will suggest having a DiscoveryServiceRetriever (DSR) extension included in the Federated Catalog. This will allow to:

  • Have and API for a member to add/delete the BPNL's (or connector URL's) of participants from which catalogs are wanted;
  • Create a bearer token (after requesting from identity provider) to retrieve the connector url's associated with stored BPNL's;
  • Store the data retrieved.

If by any chance a member does not aim at use the Discovery Service, a new extension can be create and replace the DSR.

This will allow to use the SQL (or InMem) store already present in the Federated Catalog.
Changes added here.

@bmg13 bmg13 requested a review from ndr-brt September 27, 2024 17:00
Copy link
Contributor

@ndr-brt ndr-brt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at this point my suggestion would be to rephrase this and #1555 in a single DR, because they target the same scope, it will be easier to follow for the community.

@bmg13
Copy link
Contributor Author

bmg13 commented Sep 30, 2024

at this point my suggestion would be to rephrase this and #1555 in a single DR, because they target the same scope, it will be easier to follow for the community.

I followed the idea of it being two separated decisions (despite same scope), but will try to merge them in the meantime. Just highlighting that the DR will contain two decisions, one for FC distribution and other for the TND.

Copy link

sonarcloud bot commented Sep 30, 2024

@bmg13
Copy link
Contributor Author

bmg13 commented Sep 30, 2024

Considering the suggestion, this PR is closed and existing proposal added to this PR.

@bmg13 bmg13 closed this Sep 30, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants