Is your feature request related to a problem?
Currently, the model interface is applied before the pre-processing function. When both pre-processing and post-processing functions are defined, the model interface doesn't accurately represent the required format for model input and output.
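To make the mismatch concrete, here is a hypothetical sketch of a register-model request (the connector, schemas, and field values are invented for illustration; the `interface` object with `input`/`output` JSON Schema strings follows the register-model API). With a pre-processing function defined on the connector, the interface today ends up describing the raw caller-facing payload, not the transformed payload the remote model actually receives:

```json
{
  "name": "my-embedding-model",
  "function_name": "remote",
  "connector_id": "<connector_id>",
  "interface": {
    "input": "{ \"type\": \"object\", \"properties\": { \"text_docs\": { \"type\": \"array\", \"items\": { \"type\": \"string\" } } } }",
    "output": "{ \"type\": \"object\", \"properties\": { \"inference_results\": { \"type\": \"array\" } } }"
  }
}
```

If the connector's pre-processing function rewrites `text_docs` into a provider-specific request body, the `input` schema above no longer reflects what the model endpoint requires, which is the gap described here.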
We need to:

1. Reposition the model interface so that it is applied after the pre-processing and post-processing functions. This will ensure that the model interface correctly describes the model input and output format.
2. Implement automatic recognition and application of the model interface for existing pre-processing and post-processing functions. This should be done for the known pre-processing functions in ml-commons/common/src/main/java/org/opensearch/ml/common/connector/MLPreProcessFunction.java (lines 28 to 33 in 5432f25) and the known post-processing functions in ml-commons/common/src/main/java/org/opensearch/ml/common/connector/MLPostProcessFunction.java (lines 22 to 29 in 5432f25).
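As a sketch of what automatic application could look like (the schema contents below are illustrative, not actual presets from the codebase): when a connector declares one of the known embedding pre-process/post-process function pairs, registration could attach a preset interface without any manual configuration, for example:

```json
{
  "interface": {
    "input": "{ \"type\": \"object\", \"properties\": { \"text_docs\": { \"type\": \"array\", \"items\": { \"type\": \"string\" } } } }",
    "output": "{ \"type\": \"object\", \"properties\": { \"inference_results\": { \"type\": \"array\" } } }"
  }
}
```

Each known function pair would map to its own preset, so the interface stays in sync with whatever transform the connector applies.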
Expected Outcome:

- The model interface will accurately represent the required format for model input and output.
- Improved consistency and ease of use for developers working with pre-processing and post-processing functions.
- Reduced manual configuration errors when setting up model interfaces.
What alternatives have you considered?
Instead of repositioning the model interface, we could create a new object called "model predict mapping" that describes the model input and output mapping after the pre-processing function. The downside is that this object would be largely redundant with the model interface.
@mingshl, in my opinion, we should deprecate the pre and post processing functionality in the connector and move this functionality to the pipeline/flow level. The connector's job is simply to provide a bridge to an external AI service endpoint. In my opinion, we should keep it simple and decouple data transform functionality.
Previously, the connectors were built for neural search, and back then the interface was limited to converting query text into text embeddings. It was easy to predefine pre- and post-processing logic.
Now that we provide the generic capability to integrate any ML model into OpenSearch (search and ingest) data flows, we need to provide flexibility and ease of use for configuring data processing within flows/pipelines. The current pre- and post-processing logic should be re-packaged as preset configurations for data transform processors that can be used with neural queries.
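One way the re-packaged transforms could surface at the pipeline level is via the existing ml_inference ingest processor, which already maps document fields to model inputs and model outputs back to document fields. A sketch (the pipeline name, field names, and mappings are illustrative; the exact shape of `input_map`/`output_map` follows the processor's documentation):

```json
PUT /_ingest/pipeline/embedding-pipeline
{
  "processors": [
    {
      "ml_inference": {
        "model_id": "<model_id>",
        "input_map": [ { "text_docs": "book_title" } ],
        "output_map": [ { "title_embedding": "data" } ]
      }
    }
  ]
}
```

Under this approach the connector stays a thin bridge to the external endpoint, and the preset pre/post transforms become reusable processor configurations like the one above.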