Is your feature request related to a problem?
The current implementation in DefaultSparkSqlFunctionResponseHandle and its usage pattern lead to inefficient memory usage and unnecessary data conversions:
DefaultSparkSqlFunctionResponseHandle loads all data into an ArrayList and then creates an iterator from this ArrayList.
The consuming code (e.g., AsyncQueryExecutorServiceImpl) iterates over this iterator and copies all the data into a new ArrayList.
This leads to:
Double memory usage: The data exists in both the original ArrayList inside DefaultSparkSqlFunctionResponseHandle and the new result ArrayList in the consuming code.
Unnecessary conversion: Data is converted from ArrayList to Iterator and then back to ArrayList, without gaining any of the iterator pattern's potential benefits, such as lazy loading or memory efficiency.
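The double-copy pattern described above can be sketched as follows. This is a simplified illustration, not the plugin's actual code: the class name `ResponseHandleSketch`, the `String` row type, and the constructor are stand-ins for `DefaultSparkSqlFunctionResponseHandle` and its real row representation.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Simplified sketch of the current pattern; names and types are
// illustrative, not the plugin's actual API.
class ResponseHandleSketch {
    private final List<String> rows = new ArrayList<>(); // first full copy

    ResponseHandleSketch(List<String> source) {
        rows.addAll(source);          // handle materializes everything up front
    }

    Iterator<String> iterator() {     // iterator over the in-memory list
        return rows.iterator();
    }

    public static void main(String[] args) {
        ResponseHandleSketch handle =
            new ResponseHandleSketch(List.of("r1", "r2", "r3"));

        // Consuming code drains the iterator into a second list,
        // so both full copies are live at the same time.
        List<String> result = new ArrayList<>();
        Iterator<String> it = handle.iterator();
        while (it.hasNext()) {
            result.add(it.next());
        }
        System.out.println(result.size()); // prints 3
    }
}
```

The iterator adds no laziness here: every row already sits in the handle's list before the first `next()` call.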
What solution would you like?
I'm proposing two potential solutions:
Direct ArrayList access: If all data is typically needed at once, modify DefaultSparkSqlFunctionResponseHandle to provide a method that returns the full ArrayList directly, bypassing the iterator.
True lazy loading: For scenarios where streaming might be beneficial, implement real lazy loading in DefaultSparkSqlFunctionResponseHandle, fetching data on-demand.
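The two options could look roughly like this. The sketch below is hypothetical: `getResultList` and `lazyIterator` are illustrative names, not methods of the actual `DefaultSparkSqlFunctionResponseHandle`, and the `Supplier`-backed fetch stands in for whatever on-demand data source the real implementation would use.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;
import java.util.function.Supplier;

// Hypothetical sketch of both proposals; method names are illustrative.
class ProposedHandle {
    private final List<String> rows;

    ProposedHandle(List<String> rows) {
        this.rows = rows;
    }

    // Option 1: direct list access, no iterator round-trip and no second copy.
    List<String> getResultList() {
        return rows;
    }

    // Option 2: true lazy loading, pulling one row at a time from a
    // supplier that returns null once the source is exhausted.
    static Iterator<String> lazyIterator(Supplier<String> fetchNext) {
        return new Iterator<>() {
            private String next = fetchNext.get();

            public boolean hasNext() { return next != null; }

            public String next() {
                if (next == null) throw new NoSuchElementException();
                String current = next;
                next = fetchNext.get();   // fetch the next row on demand
                return current;
            }
        };
    }

    public static void main(String[] args) {
        ProposedHandle handle = new ProposedHandle(List.of("r1", "r2"));
        System.out.println(handle.getResultList().size()); // prints 2

        // A bounded counter stands in for a remote, page-by-page fetch.
        int[] remaining = {2};
        Iterator<String> it =
            lazyIterator(() -> remaining[0]-- > 0 ? "row" : null);
        int count = 0;
        while (it.hasNext()) { it.next(); count++; }
        System.out.println(count); // prints 2
    }
}
```

With option 2, a row is only materialized when the consumer asks for it, so memory usage tracks the consumer's pace rather than the full result size.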
What alternatives have you considered?
Keeping the current implementation but optimizing the consuming code to use the iterator directly without creating a new ArrayList.
Implementing a hybrid approach that provides both direct list access and iterator functionality, allowing for flexibility in different usage scenarios.
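The hybrid alternative might be sketched as a handle that implements `Iterable` while also exposing the backing list directly. Again this is an assumption-laden sketch: `HybridHandle` and `getResultList` are hypothetical names, not existing plugin API.

```java
import java.util.Collections;
import java.util.Iterator;
import java.util.List;

// Hypothetical hybrid handle: exposes both direct list access and the
// iterator contract, so each caller picks whichever fits its usage.
class HybridHandle implements Iterable<String> {
    private final List<String> rows;

    HybridHandle(List<String> rows) {
        this.rows = rows;
    }

    // Direct access for callers that need the whole result at once;
    // returned as an unmodifiable view so no copy is made.
    List<String> getResultList() {
        return Collections.unmodifiableList(rows);
    }

    // Iterator access for callers that process rows one at a time.
    @Override
    public Iterator<String> iterator() {
        return rows.iterator();
    }

    public static void main(String[] args) {
        HybridHandle handle = new HybridHandle(List.of("a", "b", "c"));
        System.out.println(handle.getResultList().size()); // prints 3

        int streamed = 0;
        for (String row : handle) { streamed++; }
        System.out.println(streamed); // prints 3
    }
}
```

Both access paths share the single backing list, so the hybrid avoids the double copy while keeping existing iterator-based call sites working unchanged.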