Improvement to dataplane selection and configuration #4031
Labels
- breaking-change: Will require manual intervention for version update
- dpf: Feature related to the Data Plane Framework
- enhancement: New feature or request
- story: Overarching issue with linked sub-issues
Feature Request
This is a story issue that spans several subtasks and aims to improve the data plane capabilities in the aspects detailed below.
1. The transferType as key indicator
In the provider data plane, the transferType (sometimes referred to as "format") should dictate the physical destination of a data pipeline:
- PULL transfers: the DataSink is inferred, e.g. HttpData-PULL means that the data is piped into an HttpDataSink.
- PUSH transfers: the DataSink is inferred, but additional data is required, e.g. for AmazonS3-PUSH the data is pushed into an S3BucketDataSink, but the bucket name, region, etc. must be provided by the consumer.
Note that the source of the data pipeline is always determined by the DataAddress that is associated with the Asset.
From this, we can derive the following requirements:
- transferType. I.e. the provider DP must be able to infer from HttpData-PULL that it must instantiate an HttpDataSink. This must be extensible.
- transferType -> destinationType. To achieve that, the filtering based on the sourceType should be dropped, and there must be an explicit mapping instead of two disjoint lists ("allowedTransferType", "allowedDestinationTypes").
- TransferRequestMessage#dataDestination should be optional.
- DataAddress#getType could be made optional.
2. Additional filtering when building the catalog
In the current implementation, the inclusion of a Distribution for an asset in the catalog is solely based on the physical capability of a data plane (determined by its corresponding DataPlaneInstance). That means that if there is a data plane that can handle a certain format, the corresponding Distribution will be included in the catalog. In some situations, users may want to restrict the transfer of certain assets to specific data planes, for example due to security concerns. Note that this could cause some assets not to be included in the catalog at all!
To achieve that, the data plane selector is extended with a getCandidates() method, which dynamically evaluates at runtime the set of data planes that can satisfy a data offering.
3. Automatic mapping of the transferType
As stated before, it is necessary to infer the physical DataSink from the transferType. To achieve that, an extensible directory of transferType -> Class<DataSink> is to be provided.
Which Areas Would Be Affected?
Data Plane, Data Plane Selector
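To make the two affected areas more concrete, the ideas from sections 1 to 3 could be sketched roughly as follows. This is only an illustration under stated assumptions: the names SinkDirectory, DataPlane, permittedPlaneIds and the signature of getCandidates are hypothetical and not taken from the EDC code base, the DataSink types are empty stand-ins, and the issue does not say how a restriction of assets to specific data planes would be expressed.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: these types only borrow names from the issue text
// and are not the actual EDC interfaces.
final class TransferTypeSketch {

    /** Stand-in for a data sink. */
    interface DataSink { }

    static final class HttpDataSink implements DataSink { }

    static final class S3BucketDataSink implements DataSink { }

    /** Extensible directory of transferType -> Class<DataSink>. */
    static final class SinkDirectory {
        private final Map<String, Class<? extends DataSink>> sinks = new ConcurrentHashMap<>();

        /** Extensions contribute mappings here; a later call replaces an earlier one. */
        void register(String transferType, Class<? extends DataSink> sinkType) {
            sinks.put(transferType, sinkType);
        }

        /** Returns an empty Optional if no sink is known for the transfer type. */
        Optional<Class<? extends DataSink>> resolve(String transferType) {
            return Optional.ofNullable(sinks.get(transferType));
        }
    }

    /** Stand-in for a registered data plane with an explicit transferType -> destinationType mapping. */
    static final class DataPlane {
        final String id;
        final Map<String, String> destinationTypes;

        DataPlane(String id, Map<String, String> destinationTypes) {
            this.id = id;
            this.destinationTypes = destinationTypes;
        }
    }

    /**
     * Assumed shape of the candidate evaluation: keep the data planes that declare
     * the transfer type and are permitted for the asset in question.
     */
    static List<DataPlane> getCandidates(List<DataPlane> planes,
                                         String transferType,
                                         Set<String> permittedPlaneIds) {
        List<DataPlane> candidates = new ArrayList<>();
        for (DataPlane plane : planes) {
            boolean canHandle = plane.destinationTypes.containsKey(transferType);
            boolean permitted = permittedPlaneIds.contains(plane.id);
            if (canHandle && permitted) {
                candidates.add(plane);
            }
        }
        return candidates;
    }
}
```

In this sketch, an empty result from getCandidates corresponds to the case noted in section 2, where an asset would not appear in the catalog at all.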
Why Is the Feature Desired?
Compliance with DSP (transferType as sole arbiter), avoid invalid configurations in data plane registration
Solution Proposal
The following list of subtasks will realize this feature:
- transferTypes can handle #4150
- select operation into DataPlaneClientFactory implementation #4203
- data-plane-server launcher #4198