
High level flow diagram #4

Open
pombredanne opened this issue Feb 9, 2024 · 3 comments

pombredanne commented Feb 9, 2024

The attached diagram presents the high-level flow between the various parts: VCIO, PurlDB, and FederatedCode.

federatedcode-flow
federatedcode-flow.odp


pombredanne commented Nov 20, 2024

Here are the steps we are going through, at a high level:

Process to populate Git repos

  1. In VCIO
  • command line to export from VCIO
  • outcome: vulnerabilities and minimal package data are saved on disk as YAML, committed in the backing git repo(s), then pushed back
  2. In SCIO
  • add-on pipeline to export a project output from SCIO
  • outcome: detailed package data is saved on disk as YAML, committed in the backing git repo(s), then pushed back (see the sketch after this list)
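As a rough illustration only, here is a minimal sketch of what this export step could look like, assuming per-PURL YAML files written into a local clone of the backing repo; the file layout, helper names, and commit workflow are assumptions, not the actual VCIO/SCIO code.

```python
# Minimal sketch of the export step, assuming per-PURL YAML files committed to
# a local clone of the backing git repo. Layout and names are hypothetical.
import subprocess
from pathlib import Path

import yaml  # PyYAML


def write_package_yaml(repo_dir: Path, purl: str, data: dict) -> Path:
    # Assumed layout: pkg:pypi/django@4.2 -> pypi/django/4.2.yml
    relative = purl.removeprefix("pkg:").replace("@", "/") + ".yml"
    out_path = repo_dir / relative
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(yaml.safe_dump(data, sort_keys=True))
    return out_path


def commit_and_push(repo_dir: Path, message: str) -> None:
    # Commit whatever was written above and push it back to the backing repo.
    subprocess.run(["git", "-C", str(repo_dir), "add", "--all"], check=True)
    subprocess.run(["git", "-C", str(repo_dir), "commit", "-m", message], check=True)
    subprocess.run(["git", "-C", str(repo_dir), "push"], check=True)
```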

Process to advertise Package and Vulnerability data:

  1. In FederatedCode
  • command line to "sync" the backing git repos

    • For Vulnerabilities
    • For Packages
  • outcome: repos are cloned locally and the data is pulled into the FederatedCode database, ideally looking only at changes and data pointers, not the whole data

  2. In FederatedCode
  • command line to "federate" the data

    • For Vulnerabilities
    • For Packages
  • outcome: new and updated data is advertised in the Fediverse as the stream of events impacting each package (see the sketch after this list)
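Not the actual implementation, but a minimal sketch of what the "federate" step could look like if each package update is announced as an ActivityPub Create activity delivered to follower inboxes; the actor, inbox URL, and payload fields are assumptions.

```python
# Sketch of the "federate" step: announce a new/updated package as an
# ActivityPub Create activity pushed to a follower inbox. The URLs and payload
# fields here are assumptions for illustration.
import json

import requests


def build_package_activity(actor: str, purl: str, summary_url: str) -> dict:
    return {
        "@context": "https://www.w3.org/ns/activitystreams",
        "type": "Create",
        "actor": actor,
        "object": {
            "type": "Note",
            "name": purl,
            "content": f"Package update for {purl}",
            "url": summary_url,  # points at the YAML in the backing git repo
        },
    }


def deliver(activity: dict, inbox_url: str) -> None:
    # A real delivery would also sign the request (HTTP Signatures).
    requests.post(
        inbox_url,
        data=json.dumps(activity),
        headers={"Content-Type": "application/activity+json"},
        timeout=30,
    )
```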

Process to retrieve Package and Vulnerability data in PULL mode:

  1. In a PurlDB instance
  • command line to retrieve "federated" data from FederatedCode, possibly for a single PURL (or possibly as an on-demand API endpoint?)
  • outcome:
    • the summary data is pulled straight from the FederatedCode git repos, using the PURL as a key
    • the scan details are fetched from the FederatedCode git repos
    • the scans are imported into the DB and packages are created
  2. In a VCIO instance
  • command line to retrieve "federated" data from FederatedCode, possibly for a single PURL (or possibly as an on-demand API endpoint?)
  • outcome:
    • the summary data is pulled straight from the FederatedCode git repos, using the PURL as a key
    • the vulnerable package versions and vulnerability details are fetched from the FederatedCode git repos (and in the future also advisories?)
    • the data is imported into the DB, and vulnerabilities with packages and their relationships are created (see the sketch after this list)
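As a sketch of the PULL-mode lookup, assuming the FederatedCode repos store one YAML file per package version under a path derived from the PURL (the actual layout may differ), using the packageurl-python library:

```python
# Sketch of the PULL-mode lookup: map a PURL to a file path inside a locally
# cloned FederatedCode repo and load its YAML summary. The directory layout is
# an assumption; the real repos may be organized differently.
from pathlib import Path

import yaml
from packageurl import PackageURL  # packageurl-python


def summary_path_for_purl(repo_dir: Path, purl: str) -> Path:
    parsed = PackageURL.from_string(purl)
    parts = [parsed.type, parsed.namespace, parsed.name, parsed.version]
    relative = "/".join(part for part in parts if part)
    return repo_dir / f"{relative}.yml"


def load_summary(repo_dir: Path, purl: str) -> dict:
    return yaml.safe_load(summary_path_for_purl(repo_dir, purl).read_text())


# Example: load_summary(Path("clones/federatedcode-packages"), "pkg:pypi/django@4.2")
```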

Process for PurlDB and VCIO to obtain federated Package and Vulnerability data in PUSH mode:

We will first need a process for PurlDB and VCIO to subscribe to federated Package and Vulnerability data. Once this is done, federated messages should be processed as explained below.

  1. In a PurlDB instance
  • command line or API to subscribe to "federated" data from FederatedCode

  • command line to receive "federated" data from FederatedCode, or an endpoint that is part of the fediverse and can receive ActivityPub messages. This covers only the packages we have subscribed to in FederatedCode

  • outcome:

    • the summary ActivityPub data is received as a push from FederatedCode
    • the data is then updated as in PULL mode:

      • the summary data is pulled straight from the FederatedCode git repos, using the PURL as a key
      • the scan details are fetched from the FederatedCode git repos
      • the scans are imported into the DB and packages are created
  2. In a VCIO instance
  • command line or API to subscribe to "federated" data from FederatedCode

  • command line to receive "federated" data from FederatedCode, or an endpoint that is part of the fediverse and can receive ActivityPub messages. This covers only the packages we have subscribed to in FederatedCode

  • outcome:

    • the summary ActivityPub data is received as a push from FederatedCode
    • the data is then updated as in PULL mode:

      • the summary data is pulled straight from the FederatedCode git repos, using the PURL as a key
      • the vulnerable package versions and vulnerability details are fetched from the FederatedCode git repos (and in the future also advisories?)
      • the data is imported into the DB, and vulnerabilities with packages and their relationships are created (see the inbox sketch below)
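A minimal sketch of the receiving side, assuming a Django view acting as the ActivityPub inbox in PurlDB or VCIO; the view name, payload fields, and the queue helper are hypothetical.

```python
# Sketch of a fediverse-facing inbox endpoint in PurlDB or VCIO that receives
# pushed ActivityPub messages for subscribed packages, then triggers the same
# import path as PULL mode. Names and payload fields are hypothetical.
import json

from django.http import HttpResponse, JsonResponse
from django.views.decorators.csrf import csrf_exempt
from django.views.decorators.http import require_POST


@csrf_exempt
@require_POST
def activitypub_inbox(request):
    if request.content_type != "application/activity+json":
        return HttpResponse(status=415)
    activity = json.loads(request.body)
    # Assumes the activity carries the PURL in object.name, as in the
    # federation sketch above.
    purl = activity.get("object", {}).get("name", "")
    queue_federated_update(purl)  # hypothetical helper: fetch and import as in PULL mode
    return JsonResponse({"status": "accepted"}, status=202)
```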

Process to "curate" data:

One design option is to consider VCIO curations as advisories made by a person or org; we would then eventually have many advisories from the actual VCIO data sources plus other "federated" advisories. This will require some work-in-progress changes to the VCIO models.

  • Curate vulnerabilities in FederatedCode and VCIO
  • Curate packages in FederatedCode and PurlDB

pombredanne commented

@ziadhany fyi, we need to make this set of flows clear so we can write the doc. Let's chat


ziadhany commented Nov 21, 2024

Sure, @pombredanne, let's have a chat and finalize the flows.

Process to advertise Package and Vulnerability data:

1. In FederatedCode

* command line to "sync" the backing git repos

  * [x]  For Vulnerabilities
  * [ ]  For Packages

* outcome: repos are cloned locally and the data is pulled into the database, looking only at changes

2. In FederatedCode

* command line to "federate" the data

  * [x]  For Vulnerabilities
  * [ ]  For Packages

For Packages in VCIO: we had this before, but we changed the file structure slightly. I think we need thorough testing to catch any bugs, especially in the importer (sync). Additionally, we should ensure robust testing of the federate functionality to avoid issues when federating messages.

Process to retrieve Package and Vulnerability data in PULL mode:

1. In PurlDB

* [ ]  command line to retrieve "federated" data from FederatedCode, possibly for a single PURL (or possibly as an on-demand API endpoint?)

* outcome:
  
  * the summary data is pulled from FederatedCode
  * the scan details are fetched from the backing git repos
  * the scans are imported into the DB and packages are created


2. In VCIO

* [ ]  command line to retrieve "federated" data from FederatedCode, possibly for a single PURL (or possibly as an on-demand API endpoint?)

* outcome:
  
  * the summary data is pulled from FederatedCode
  * the vulnerable package versions and vulnerability details are fetched from the backing git repos (what about advisories?)
  * the data is imported into the DB, and vulnerabilities, packages, and their relationships are created

We have an endpoint for this (sync): /repository/{repo-id}/sync-repo/ (a POST form request triggered by clicking 'sync'). This endpoint pulls the Git repository data, then runs the importer script, which fetches the diff and processes only the unprocessed commits. It then creates the vulnerability and package relations. However, we need to create tests to catch any bugs, and we need to double-check the relations we want to store for VCIO. We also need to determine what we will store for SCIO/PurlDB. A rough sketch of the diff-based processing is below.
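This is not the actual importer, just a sketch of the diff-based part, assuming the last processed commit is stored somewhere and only the YAML files changed since then are reimported:

```python
# Sketch of the diff-based import behind the sync endpoint: pull the repo,
# list only the files changed since the last processed commit, and hand those
# to the importer. How the last commit is stored is an assumption.
import subprocess
from pathlib import Path


def pull_and_list_changes(repo_dir: Path, last_commit: str) -> list[str]:
    subprocess.run(["git", "-C", str(repo_dir), "pull", "--ff-only"], check=True)
    diff = subprocess.run(
        ["git", "-C", str(repo_dir), "diff", "--name-only", f"{last_commit}..HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return [line for line in diff.stdout.splitlines() if line.endswith(".yml")]


def current_head(repo_dir: Path) -> str:
    head = subprocess.run(
        ["git", "-C", str(repo_dir), "rev-parse", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return head.stdout.strip()  # store this as the new "last processed" commit
```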

Process for PurlDB and VCIO to obtain federated Package and Vulnerability data in PUSH mode:

We will need first a process for PurlDB and VCIO to subscribe to federated Package and Vulnerability data. Then once this is done we should get federated messages processed as explained below.

1. In PurlDB


* [ ]  command line to receive "federated" data from FederatedCode, or an endpoint that is part of the fediverse and can receive ActivityPub messages

* outcome:
  
  * the summary data is received from FederatedCode
  * the data is updated as in PULL mode


2. In VCIO


* [ ]  command line to receive "federated" data from FederatedCode, or an endpoint that is part of the fediverse and can receive ActivityPub messages

* outcome:
  
  * the summary data is received from FederatedCode
  * the data is updated as in PULL mode

I think we should have an endpoint in VCIO and PurlDB that updates the vulnerability or package after it is reviewed and accepted on FederatedCode. Then VCIO will push the changes to the Git repo, and FederatedCode will sync the repo and update the relation. A rough sketch of what that could look like is below.
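As a sketch of that proposal only, assuming a Django view in VCIO/PurlDB that applies an accepted FederatedCode curation and pushes the resulting change back to the backing repo; every name and field here is hypothetical:

```python
# Sketch of the proposed endpoint: apply a curation that was reviewed and
# accepted on FederatedCode, then push the change to the backing git repo so
# FederatedCode can sync it back. All names and fields are hypothetical.
import subprocess
from pathlib import Path

from django.http import JsonResponse
from django.views.decorators.csrf import csrf_exempt
from django.views.decorators.http import require_POST


@csrf_exempt
@require_POST
def apply_accepted_curation(request, curation_id):
    curation = load_accepted_curation(curation_id)  # hypothetical lookup
    repo_dir = Path(curation["repo_dir"])
    target = repo_dir / curation["path"]
    target.write_text(curation["curated_yaml"])
    subprocess.run(["git", "-C", str(repo_dir), "add", str(target)], check=True)
    subprocess.run(
        ["git", "-C", str(repo_dir), "commit", "-m", f"Apply curation {curation_id}"],
        check=True,
    )
    subprocess.run(["git", "-C", str(repo_dir), "push"], check=True)
    return JsonResponse({"status": "pushed"})
```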

Process to "curate" data:

* [ ]  Curate vulnerabilities in FederatedCode and VCIO

* [ ]  Curate packages in FederatedCode and PurlDB

I'm not sure about this, but it depends on many factors. Should we rely on the FederatedCode review or the GitHub repo review (pull request) and treat Git as the source of truth? I was thinking we could have both mechanisms. For example, if we create a review in FederatedCode, it could trigger one in GitHub. However, I believe this approach might lead to issues, such as message sync problems and merge conflicts. I think it might be better to rely on just one and set up a GitHub action/trigger on merge.
