Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: Proposal for FederatedCatalog Distribution and TargetNodeDirectory #1555

Open
wants to merge 18 commits into
base: main
Choose a base branch
from

Conversation

bmg13
Copy link
Contributor

@bmg13 bmg13 commented Sep 16, 2024

WHAT

Decision Record for new proposal for FederatedCatalog distribution (with Tractus-X) and the TargetNodeDirectory (TND).

The TND initial proposal was defined in this PR and continue in the current one.

Related with #1585

@bmg13 bmg13 marked this pull request as ready for review September 16, 2024 13:27
Copy link
Contributor

This pull request is stale because it has been open for 7 days with no activity.

@github-actions github-actions bot added the stale label Sep 24, 2024
Copy link
Contributor

@paullatzelsperger paullatzelsperger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, before going into details I want to clarify a few things. A decision record should document a Decision (not options) and have the following structure:

  • Decision: outline in a succinct and concise manner, in one or two sentences, what was decided.
  • Rationale: outline the thought process how that decision was reached, maybe mention one or two alternatives and why they were not chosen. Motivate your decision.
  • Approach: how the decision can be implemented. This should also outline possible limitations.

Specifically, the FC is not a "standalone service". What you probably mean is a separate runtime (docker image, runnable JAR file) and thus a separate deployment in the Tx-EDC Helm Chart, like the data plane. It is extremely important to use precise wording

@paullatzelsperger paullatzelsperger added documentation Improvements or additions to documentation and removed stale labels Sep 24, 2024
The Federated Catalog aims at providing the aggregation of catalogs from multiple participants in a centralized point to reduce latency.

Choosing a solution that decouples from Control Plane (like the one used for the Data Plane) and able to be scalable will future-proof the Federated Catalog as a feature and embraces wider usage.
Like so, having a Federated Catalog service, scalable on its own based on user needs, and not on Control Plane, allows the best resource management in the long term. It also enables adding features to the Federated Catalog without the need to change the Control Plane logic (assuming no API usage change).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the first part is already described by the previous sentence, the second one is not 100% correct, because it is actually possible to add features to the control plane without change its "logic" (depending on what do you mean with "logic", but generally speaking that's what the EDC modular architecture is about)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was bad phrasing on my side, I was aiming at mentioned the goal of having independent updates. However, in the previous paragraph was already mentioned the decoupling this approach promotes so it (like the scalability) can be seen as repetitive information.
Removed in here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in any case also the "indipendent updates" is not something we want to achieve with this decision, because we will release the FC together with CP and DP, otherwise we would put it in a separate chart.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. I wanted to highlight, as an example, that if a change in the Federated Catalog Cache or similar should be transparent for other instances (like CP or DP). And it would be, I believe, but updating to a new version when all are in tractus-edc charts then "independent update" is not accurate.
Thanks!

@bmg13 bmg13 requested a review from ndr-brt September 26, 2024 14:26

## Decision

The proposed solution is to have a Federated Catalog as own runnable service capable of crawling all the chosen catalogs and expose that data.
Copy link
Contributor

@paullatzelsperger paullatzelsperger Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this should be an unequivocal decision, not a proposal. It is a "decision record", not a "proposal record". words matter.
Also, what is a "runnable service"? We should be very explicit here, for example, in saying: "The Federated Catalog will be deployed as standalone component. For this, the Helm charts will be updated to feature a new deployment template called 'federated-catalog'".

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right, updated here.


## Rationale

The Federated Catalog aims at providing the aggregation of catalogs from multiple participants in a centralized point to reduce latency.
Copy link
Contributor

@paullatzelsperger paullatzelsperger Sep 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not needed. No need to motivate the existence of the FC, only the decision to deploy it as standalone component.
It also makes sense here to highlight some of the disadvantages (increased configuration complexity, more remote calls,...) and why they are still justified (scalability, etc.).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in here.


## Approach

The Federated Catalog running as own instance can be performed either by running respective Docker image in a container or directly running a generated `jar`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no theoretical concepts needed. Please only highlight how it is going to be deployed in the Tractus-X EDC helm charts, i.e. as separate deployment template. Doing this will bring some increased complexity, for example we need an additional Kubernetes ingress definition, as opposed to: just an additional path in an existing ingress.

Here it would also be advisable to mention, that there is going to be an additional runtime (i.e. docker image), analogous to edc-controlplane-XYZ and edc-dataplane-XYZ and to list the required variants (in-mem, postgres persistence).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, changed in here.

Regarding deployment, the Tractus-Connector Helm charts will be updated to include this service.

Relevant to highlight a potential challenge of the proposed approach.
- Having its own instance (that itself contains a cache) may use considerable computing resources (related with storage cost).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

use considerable computing resources (related with storage cost).

how so? what cost do you anticipate, that wouldn't also be there if the FC ran embedded in the ControlPlane?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. The idea was comparing a specific runtime vs embedded in control plane could incur on having more resources used, but should be a minimal difference and I do not have comparing numbers.
Regarding cost, is a non issue, since same data would be store regardless of solution.
Removed here.

@bmg13
Copy link
Contributor Author

bmg13 commented Sep 30, 2024

Considering this suggestion, the Decision Record proposals for the Distribution and the TargetNodeDirectory were merged in one since they related within same scope.

@bmg13 bmg13 changed the title docs: Proposal for FederatedCatalog distribution docs: Proposal for FederatedCatalog Distribution and TargetNodeDirectory Sep 30, 2024

To enable the Federated Catalog flow, please [see this table](https://github.com/eclipse-tractusx/tractusx-edc/blob/75bdacbad43e2cad352204ea28a359c6aac7adea/docs/development/management-domains/README.md#enable-and-configure-the-crawler-subsystem).

For its TargetNodeDirectory, the user is able to obtain the Connectors' URL's through the Discovery Service and store them in the new extension through its API. The API will allow to save a list of BPNLs (and Connectors' URL's if desired) and the `DiscoveryServiceRetrieverExtension` is responsible to retrieve the data and store it (in memory or in a database). The URL's can later be retrieved and crawled by the Federated Catalog.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

coul you please explain more here? e.g. showing how the API will be structured, which endpoint will be exposed.
Plus, how will it work by default, if no BPN is provided will it fetch all the available participants or not at all?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, thank you, changed in here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be also good to describe the api that permits to "save a list of BPNLs", and, if eventually there are other accessory methods (get, delete, ...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I understand. Added in here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we should be operating on DIDs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated here.

To enable the Federated Catalog flow, please [see this table](https://github.com/eclipse-tractusx/tractusx-edc/blob/75bdacbad43e2cad352204ea28a359c6aac7adea/docs/development/management-domains/README.md#enable-and-configure-the-crawler-subsystem).

For its TargetNodeDirectory, the user is able to obtain the Connectors' URL's through the Discovery Service and store them in the new extension through its API. The API will allow to save a list of BPNLs (and Connectors' URL's if desired) and the `DiscoveryServiceRetrieverExtension` is responsible to retrieve the data and store it (in memory or in a database). The URL's can later be retrieved and crawled by the Federated Catalog.
This solution improves on the default one of having the data in a static file since a dynamic approach would avoid downtime when a change is required.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this sentence belongs to the rationale section

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed in here.

@bmg13 bmg13 requested a review from ndr-brt October 7, 2024 14:58
Copy link
Contributor

@ndr-brt ndr-brt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please add some more details on the approach part, it is still too vague, it would be good to see paths, methods, request/response body structure and so on.


To enable the Federated Catalog flow, please [see this table](https://github.com/eclipse-tractusx/tractusx-edc/blob/75bdacbad43e2cad352204ea28a359c6aac7adea/docs/development/management-domains/README.md#enable-and-configure-the-crawler-subsystem).

For its TargetNodeDirectory, the user is able to obtain the Connectors' URL's through the Discovery Service and store them in the new extension through its API. The API will allow to save a list of BPNLs (and Connectors' URL's if desired) and the `DiscoveryServiceRetrieverExtension` is responsible to retrieve the data and store it (in memory or in a database). The URL's can later be retrieved and crawled by the Federated Catalog.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be also good to describe the api that permits to "save a list of BPNLs", and, if eventually there are other accessory methods (get, delete, ...)


The retrieval of Connector URL's through the Discovery Service is enabled by the endpoint:
```
POST: /api/administration/connectors/discovery
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any api spec document from the Discovery Service documentation that can be referenced here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not find much, been using a int environment swagger to test it, however regarding DS guideline usage it can be found https://catenax-ev.github.io/docs/standards/CX-0001-EDCDiscoveryAPI#22-api-specification. Added in the PR here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to discuss this. The Discovery Service will likely be retired in 2025. We may need to use it as a temporary solution, but then this API should remain alpha until then.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We were not aware of the 2025 timeline for the DiscoveryService. We had a proposal that would not enforce the DS querying in this extension by having the member responsible to obtain the Participant data, but since that would create more manual work, the proposal was then to query the DS (since we had no idea of it ceasing to be used).

@bmg13 bmg13 requested a review from ndr-brt October 7, 2024 17:39
@bmg13
Copy link
Contributor Author

bmg13 commented Oct 8, 2024

@paullatzelsperger can you review my address to your request changes, please?

Copy link
Contributor

@paullatzelsperger paullatzelsperger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please avoid blather and repetition, be succinct and to the point. Also, please adhere to the "Decision - Rationale - Approach" structure better.

For example, if an additional API is planned, it should be outlined in the Decision section, motivated in the Rationale section and explained in the Approach section.


## Decision

Federated Catalog will be deployable as a standalone component capable of crawling all the chosen catalogs and expose that data. The Tractus-Connector Helm charts will be updated to feature a new Federated Catalog deployment template.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Federated Catalog will be deployable as a standalone component capable of crawling all the chosen catalogs and expose that data. The Tractus-Connector Helm charts will be updated to feature a new Federated Catalog deployment template.
The Federated Catalog will be deployed as a standalone component. The Tractus-X EDC Connector Helm charts will be updated to feature a new Federated Catalog deployment template.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed in here.

## Decision

Federated Catalog will be deployable as a standalone component capable of crawling all the chosen catalogs and expose that data. The Tractus-Connector Helm charts will be updated to feature a new Federated Catalog deployment template.
Regarding the TargetNodeDirectory, a new extension in the FederatedCatalog will have a db/cache containing the BPNL's and Connectors' URL's of each partner a member wants the offers from.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is spurious. Instead, please outline how the TND is supposed to get populated and where the target node data is sourced from.

Suggested change
Regarding the TargetNodeDirectory, a new extension in the FederatedCatalog will have a db/cache containing the BPNL's and Connectors' URL's of each partner a member wants the offers from.
Regarding the `TargetNodeDirectory`, a new extension in the FederatedCatalog will have a db/cache containing the BPNL's and Connectors' URLs of each partner a member wants the offers from.

Copy link
Contributor Author

@bmg13 bmg13 Nov 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

updated in here.

Comment on lines 10 to 11
Considering the Federated Catalog distribution, choosing a solution that decouples it from the Control Plane (like the one used for the Data Plane) and able to be scalable will future-proof the Federated Catalog as a feature and embraces wider usage.
Having a specific runtime incurs on additional overhead (new Helm Chart, as example) and results in additional configuration complexity. Also, periodical crawling results in increased remote calls over time. However, being decoupled from the Control Plane, this solution permits the Federated Catalog to scale independently (based on own demand).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please be more succinct, e.g.

Suggested change
Considering the Federated Catalog distribution, choosing a solution that decouples it from the Control Plane (like the one used for the Data Plane) and able to be scalable will future-proof the Federated Catalog as a feature and embraces wider usage.
Having a specific runtime incurs on additional overhead (new Helm Chart, as example) and results in additional configuration complexity. Also, periodical crawling results in increased remote calls over time. However, being decoupled from the Control Plane, this solution permits the Federated Catalog to scale independently (based on own demand).
While a standalone component (= K8S deployment) brings a slight increase in configuration complexity, its ability to be managed and scaled independently makes up for that.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed in here.

Comment on lines 13 to 16
For TargetNodeDirectory it will be set by a new extension responsible for exposing an API, where a member can input the BPNL's of the participants from which the catalogs are wanted, and then it will retrieve and store the respective Connector URL's. This new extension would get the data from the Discovery Service, provided a BPNL, and will be named `DiscoveryServiceRetrieverExtension`. This solution allows the member to choose precisely the Target Catalog Nodes that interests them, resulting in reduced network calls and latency.
Additionally, if a Connector URL is registered (or unregistered) in the Discovery Service, the retriever will reflect it since it requests based on BPNL (which should not change) and the registered URL's will be returned.

This solution improves on the default one of having the data in a static file since a dynamic approach would avoid downtime when a change is required.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should go in the Approach section, as it outlines technical implementation.
It is enough to say something to the effect of "target node data is sourced directly from the BPNL Directory service (or whatever it's called) since that already exists in Tractus-X and provides comprehensive participant data." It might also be worth mentioning, that there currently is no way for participants not to be listed in that TND.

Apart from that, there is no need to explain the TND's functionality here. Rather, in the "Approach" section, please detail where the TND gets its data, if and how it's cached + what cache eviction strategy will be used, how new (or deleted) entries are detected etc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we using BPNs when we are planning to move away from those at the protocol layer? We should be entering the DIDs. This means we need to define an easy way for people to find the DID, which can be done using the BDRS client.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I understand it and updated the documentation to include DID input, having the BDRS resolve of those DID and call the Discovery Service to obtain the Connector URL's.
This is aligned to what was discussed in a weekly meeting and is updated here.


## Approach

Since the Federated Catalog will be a standalone runtime, the Tractus-Connector Helm charts will be updated to include the Federated Catalog as a separated deployment. The update will include the creation of a specific `deployment-federatedcatalog.yaml`, similar [to this one](https://github.com/eclipse-tractusx/tractusx-edc/blob/a263bf71a110245657131509d4b37d058a1d220d/charts/tractusx-connector-azure-vault/templates/deployment-dataplane.yaml#L47) (for `ingress` and `hpa` as well), for different scenarios (InMemory, PostreSQL, etc.). This results in added configuration complexity.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Since the Federated Catalog will be a standalone runtime, the Tractus-Connector Helm charts will be updated to include the Federated Catalog as a separated deployment. The update will include the creation of a specific `deployment-federatedcatalog.yaml`, similar [to this one](https://github.com/eclipse-tractusx/tractusx-edc/blob/a263bf71a110245657131509d4b37d058a1d220d/charts/tractusx-connector-azure-vault/templates/deployment-dataplane.yaml#L47) (for `ingress` and `hpa` as well), for different scenarios (InMemory, PostreSQL, etc.). This results in added configuration complexity.
Since the Federated Catalog will be a standalone runtime, the Tractus-X EDC Connector Helm charts will be updated to include the Federated Catalog as a separated deployment. The update will include the creation of a specific `deployment-federatedcatalog.yaml`, similar [to this one](https://github.com/eclipse-tractusx/tractusx-edc/blob/a263bf71a110245657131509d4b37d058a1d220d/charts/tractusx-connector-azure-vault/templates/deployment-dataplane.yaml#L47) (for `ingress` and `hpa` as well), for different scenarios (InMemory, PostreSQL, etc.). This results in added configuration complexity.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed in here.


To enable the Federated Catalog flow, please [see this table](https://github.com/eclipse-tractusx/tractusx-edc/blob/75bdacbad43e2cad352204ea28a359c6aac7adea/docs/development/management-domains/README.md#enable-and-configure-the-crawler-subsystem).

For its TargetNodeDirectory, the user is able to obtain the Connectors' URL's through the Discovery Service and store them in the new extension through its API. The API will allow to save a list of BPNLs (and Connectors' URL's if desired) and the `DiscoveryServiceRetrieverExtension` is responsible to retrieve the data and store it (in memory or in a database). The URL's can later be retrieved and crawled by the Federated Catalog.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

who's exposing the API? The DS? the extension?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The extension. Added in here.

Information regarding the related API can be found [here](https://catenax-ev.github.io/docs/standards/CX-0001-EDCDiscoveryAPI#22-api-specification).

Some limitations of this TargetNodeDirectory solution are:
- Each partner must have the BPNLs beforehand. If a new Partner registers and an existing partner would want their catalog, the BPNL (or Connector URL's) of the new partner must be obtained first;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how so? if a new participant registers, their BPN is added to the Discovery Service and the connectors' TND would pick that up.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes no sense to crawl all catalogs and is potentially harmful as it is the default. A wildcard should need to be explicitly configured for this.

Copy link
Contributor Author

@bmg13 bmg13 Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So if no participants are set (to obtain their catalogs) instead of having all catalogs whitelisted in the space (which is too much, I see) the solution is to "blacklist" them. I mean to only obtain catalogs if they are explicitly defined and, if none is, no crawling should happen. Correct?

Copy link
Contributor Author

@bmg13 bmg13 Oct 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how so? if a new participant registers, their BPN is added to the Discovery Service and the connectors' TND would pick that up.

the storage used by this extension will be updated to include the new BPNL and in future requests to the DiscoveryService that BPNL will also be contemplated.

Will update the doc to include these.

Some limitations of this TargetNodeDirectory solution are:
- Each partner must have the BPNLs beforehand. If a new Partner registers and an existing partner would want their catalog, the BPNL (or Connector URL's) of the new partner must be obtained first;
- Deal with the overhead an additional persistence store;
- The usage of the Discovery Service requires a technical user account to access it (must be requested).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very good point. Any idea on how to handle that? Are the technical user creds stored in a vault, are they config properties,...?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the reasons why the Discovery Service needs to be retired. If we have to use it for the time being, the code needs to be externalized from the core TX FCC codebase. We need to discuss this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Considered a vault usage since believe it to be safer. Will include in the documentation.

#### Save BPNL's
Request body would contain a list of BPNL's, allowing to store in bulk.
```
[POST] /api/target-nodes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what API is this? If it is the connector extension's API, then please explain why it is necessary. The extension's job is to autonomically query the DS, so why does it need an API?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it is the connector extension's API, then please explain why it is necessary

yes, it is. the need here is to allow the user to use the extensions new API to indicate which participants catalogs they may require (or not). The extension will only query the DS for the Participants data that are indicated through this API.
Updated the documentation to try to be more explicit here.

{
"bpn": "BPNL000000000002",
"connectorEndpoint": [
"https://connector2/api/v1/dsp",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what does this mean? I know that connectors have multiple URLs in the DiscoveryService (which to me is a design flaw), but what does this mean for the TND? will there be multiple entries per BPN? If so, what does this mean for using the BPN as primary key? Will we need update the data model for the TargetNode?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very good point.
Yes, the aim to have the BPNL as the pk will require the TargetNode targetUrl prop to be a list instead. Or consider a different pk and allowing the BPNL to not be unique.
I would prefer to have the BPNL as pk and change the TargetNode.

Comment on lines 13 to 16
For TargetNodeDirectory it will be set by a new extension responsible for exposing an API, where a member can input the BPNL's of the participants from which the catalogs are wanted, and then it will retrieve and store the respective Connector URL's. This new extension would get the data from the Discovery Service, provided a BPNL, and will be named `DiscoveryServiceRetrieverExtension`. This solution allows the member to choose precisely the Target Catalog Nodes that interests them, resulting in reduced network calls and latency.
Additionally, if a Connector URL is registered (or unregistered) in the Discovery Service, the retriever will reflect it since it requests based on BPNL (which should not change) and the registered URL's will be returned.

This solution improves on the default one of having the data in a static file since a dynamic approach would avoid downtime when a change is required.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are we using BPNs when we are planning to move away from those at the protocol layer? We should be entering the DIDs. This means we need to define an easy way for people to find the DID, which can be done using the BDRS client.


To enable the Federated Catalog flow, please [see this table](https://github.com/eclipse-tractusx/tractusx-edc/blob/75bdacbad43e2cad352204ea28a359c6aac7adea/docs/development/management-domains/README.md#enable-and-configure-the-crawler-subsystem).

For its TargetNodeDirectory, the user is able to obtain the Connectors' URL's through the Discovery Service and store them in the new extension through its API. The API will allow to save a list of BPNLs (and Connectors' URL's if desired) and the `DiscoveryServiceRetrieverExtension` is responsible to retrieve the data and store it (in memory or in a database). The URL's can later be retrieved and crawled by the Federated Catalog.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we should be operating on DIDs


The retrieval of Connector URL's through the Discovery Service is enabled by the endpoint:
```
POST: /api/administration/connectors/discovery
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to discuss this. The Discovery Service will likely be retired in 2025. We may need to use it as a temporary solution, but then this API should remain alpha until then.

Information regarding the related API can be found [here](https://catenax-ev.github.io/docs/standards/CX-0001-EDCDiscoveryAPI#22-api-specification).

Some limitations of this TargetNodeDirectory solution are:
- Each partner must have the BPNLs beforehand. If a new Partner registers and an existing partner would want their catalog, the BPNL (or Connector URL's) of the new partner must be obtained first;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes no sense to crawl all catalogs and is potentially harmful as it is the default. A wildcard should need to be explicitly configured for this.

Some limitations of this TargetNodeDirectory solution are:
- Each partner must have the BPNLs beforehand. If a new Partner registers and an existing partner would want their catalog, the BPNL (or Connector URL's) of the new partner must be obtained first;
- Deal with the overhead an additional persistence store;
- The usage of the Discovery Service requires a technical user account to access it (must be requested).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the reasons why the Discovery Service needs to be retired. If we have to use it for the time being, the code needs to be externalized from the core TX FCC codebase. We need to discuss this.

Copy link
Contributor

This pull request is stale because it has been open for 7 days with no activity.

@github-actions github-actions bot added the stale label Oct 22, 2024
… into dr_proposal_federated_catalog_distribution
Copy link
Contributor

This pull request was closed because it has been inactive for 7 days since being marked as stale.

@github-actions github-actions bot closed this Oct 29, 2024
@bmg13
Copy link
Contributor Author

bmg13 commented Oct 29, 2024

@jimmarino @ndr-brt @paullatzelsperger can someone reopen this pr as well, please?

… into dr_proposal_federated_catalog_distribution
@bmg13
Copy link
Contributor Author

bmg13 commented Oct 29, 2024

Updated the documentation to include DID input, having the BDRS resolve of those DID and call the Discovery Service to obtain the Connector URL's. This is aligned to what was discussed in a weekly meeting.

@bmg13 bmg13 requested a review from ndr-brt October 29, 2024 14:23
@github-actions github-actions bot removed the stale label Oct 30, 2024
Copy link
Contributor

github-actions bot commented Nov 7, 2024

This pull request is stale because it has been open for 7 days with no activity.

@github-actions github-actions bot added the stale label Nov 7, 2024
@bmg13
Copy link
Contributor Author

bmg13 commented Nov 7, 2024

sorry for reping, but @jimmarino @ndr-brt @paullatzelsperger @wolf4ood can someone look at this pr, please?

@jimmarino
Copy link
Contributor

sorry for reping, but @jimmarino @ndr-brt @paullatzelsperger @wolf4ood can someone look at this pr, please?

Can you resolve the outstanding comments and we can re-review?

@github-actions github-actions bot removed the stale label Nov 8, 2024
Copy link

sonarcloud bot commented Nov 8, 2024

@bmg13
Copy link
Contributor Author

bmg13 commented Nov 8, 2024

sorry for reping, but @jimmarino @ndr-brt @paullatzelsperger @wolf4ood can someone look at this pr, please?

Can you resolve the outstanding comments and we can re-review?

Sorry about it, all seems addressed now, thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
documentation Improvements or additions to documentation
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

5 participants