-
Notifications
You must be signed in to change notification settings - Fork 56
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Proposal: Add data binding support for file retrieval in FTP client #6529
Comments
Related to #4234 |
@NipunaRanasinghe did we think about the performance aspect of this. Lets say we have a large file (may be a JSON or a CSV) and we are trying to get it and data bind it to JSON (or CSV) what would be the behaviour ? Better to check the behaviour of Ballerina IO package when we try to read a large file. |
@ayeshLK Your concerns regarding large files is certainly valid. However, it's important to note that the proposed approach includes both byte array streams and, direct data types such as JSON, XML, etc as the expected type. Therefore this approach gives users the flexibility to make a conscious decision based on their application needs. (If they are dealing with relatively small files, direct data binding would be convenient and appropriate. Conversely, for larger files, they could opt for streaming to manage resource usage more effectively). Nevertheless, we can do a performance comparison between the two approaches after implementing the suggested changes. This will help us ensure that the new functionality can be used efficiently across different scenarios. |
@NipunaRanasinghe Shall we have a separate API for streaming like in IO API[1]? public isolated function getBlockAsStream(string path, int blockSize) returns stream<Block, Error?>;
# The read-only byte array that is used to read the byte content from the streams.
public type Block readonly & byte[]; |
What would the usage look like to read CSV files from the FTP server? CSV file read/write is a common usecase. |
@daneshk here the stream capability is kept under the same API, mainly to the keep the backward compatibility (so that the existing code will not break and users can optionally switch to infer the types, only if they want). If the backward compatibility is not a priority, I'm +1 to move the streaming capability into a new API. |
Here the CSV support is not prioratised (as mentioned in non-goals), as Ballerina still lack of first-class support for CSV data types (unlike json and xml). Therefore one option would be to go for a cleaner implementation once we we have a better support for CSV data type via the But if this should be prioritized, I'm +1 to consider the CSV support (with the representation of |
When considering about the return types |
As per the design discussion with @shafreenAnfar, @daneshk and @dilanSachi, we've decided to,
|
Please find the new proposal below, which includes the additional requirements mentioned above. SummaryThe current implementation of the Goals
Non-Goals
MotivationCurrently, fetching, converting, and uploading file contents using the FTP connector involves multiple steps and explicit stream handling, which can be cumbersome. By enabling direct data binding through specific APIs, we aim to provide a more streamlined and efficient way to handle file contents. DescriptionThe proposed changes will introduce new methods in the FTP connector, each tailored to fetch and bind file contents to specific data types directly. Similarly, new methods for uploading contents directly as various data types will be introduced. Proposed New APIs to replace
|
@daneshk @shafreenAnfar appreciate your reviews/feedback on the revised proposal added in #6529. |
There are a few minor queries. Overall LGTM
remote isolated function getJsonWithType(string path, typedesc<T> returnType = <>) returns returnType|Error;
|
I wonder if we should we suffix remote isolated function fileGetCsv(string path) returns string[][]|Error; |
Yes we are. There was a mistakenly added syntax issue in the proposed
Yes, updated the proposal with the suggestion.
If we require to support that scenario (in addition to retrieving the string content as a byte array), we have the option to add a new API called @daneshk Please let me know your thoughts on this and, I'll update the proposal once we finalize the API name. |
@shafreenAnfar Yes we have that option too. My only concern is whether this naming convention might deviate from the existing naming conventions we have for our other FTP APIs ( So if we are to do this, shouldn't we apply the same convention to the other APIs as well (e.g. WDYT? |
Interesting! I was thinking two types of APIs; one set as low level APIs (what we have now) and another set as high level APIs. |
Yes, the |
On second thoughts, I think we can skip the file and go ahead with what @NipunaRanasinghe suggested. |
Ack @shafreenAnfar. Will proceed with the exiting design. Thanks for your valuable feedback so far on this. |
The implementation is currently on hold due to #42830 and #42848 release blockers. |
Summary
The current implementation of the
get()
method in the Ballerina FTP connector allows fetching file contents only as a stream of bytes.This proposal suggests enhancing the
get()
method to support direct data binding to JSON, XML, and byte arrays, reducing the need for manual conversions and stream handling.Goals
get()
method to support direct data binding, allowing it to return file contents directly as JSON, XML, or byte arrays.Non-Goals
This proposal does not aim to remove the existing return type of the
get()
method, which returns file contents as a stream of bytes. All the newly introduced return types will be added as subtypes of a union type, to be backward compatible.This proposal does not aim to support additional data formats beyond JSON, XML, and byte arrays (e.g., CSV).
Motivation
Currently, fetching and converting file contents using the FTP connector involves multiple steps and explicit stream handling, which can be cumbersome and error-prone.
By enabling direct data binding, we aim to provide a more streamlined and efficient way to handle file contents.
Description
The proposed changes will enhance the
get()
method in the FTP connector to automatically infer and bind the data type based on the expected return type. This will be accomplished by overloading theget()
method to accept a type descriptor as an optional parameter that dictates the expected format of the data.Proposed Changes to the
get()
MethodExamples
Fetching content directly as JSON:
Fetching content directly as XML:
Fetching content directly as byte array:
Fetching content as a stream of byte arrays:
Compatibility and Migration
get()
method with the default behavior (stream of bytes) will continue to work without modifications.Risks and Assumptions
returnType
. Incorrect assumptions may lead to runtime errors.The text was updated successfully, but these errors were encountered: